Neural Networks with PyTorch and Lightning AI Part 3: Moving Training Logic into Lightning

In the previous articles, when we optimized our neural network, we had to write quite a bit of training code ourselves.

First, we created an optimizer object that used Stochastic Gradient Descent (SGD) to optimize final_bias.

Then we wrote loops to calculate the derivatives required for gradient descent.

We trained the model for up to 100 epochs.

For each training example, we:

Ran the input through the neural network to get a prediction.
Calculated the loss.
Calculated the derivatives of the loss function.

After processing all three training points, we used:

optimizer.step()

to take a small step toward a better value for final_bias.

Then we used:

optimizer.zero_grad()

to clear the accumulated gradients before starting the next epoch.

All of this required a considerable amount of training code.
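
For reference, the manual loop described above can be sketched roughly as follows. This is a minimal reconstruction, not the exact code from the earlier article: TinyNet, its fixed weight, the three data points, the learning rate of 0.1, and the early-stopping threshold are all made up for illustration. Only the overall shape (forward pass, loss, derivatives, then optimizer.step() and optimizer.zero_grad() once per epoch) follows the description.

```python
import torch
from torch import nn
from torch.optim import SGD


class TinyNet(nn.Module):
    """Hypothetical stand-in network: one fixed weight, one trainable final_bias."""

    def __init__(self):
        super().__init__()
        # Fixed (non-trainable) weight, assumed for illustration only.
        self.weight = nn.Parameter(torch.tensor(2.0), requires_grad=False)
        # The only parameter we optimize.
        self.final_bias = nn.Parameter(torch.tensor(0.0), requires_grad=True)

    def forward(self, input):
        return self.weight * input + self.final_bias


# Three made-up training points that satisfy label = 2 * input + 1,
# so the best value for final_bias is 1.0.
inputs = torch.tensor([0.0, 0.5, 1.0])
labels = torch.tensor([1.0, 2.0, 3.0])

model = TinyNet()
optimizer = SGD([model.final_bias], lr=0.1)

for epoch in range(100):  # train for up to 100 epochs
    total_loss = 0.0

    for input_i, label_i in zip(inputs, labels):
        output_i = model(input_i)           # prediction
        loss = (output_i - label_i) ** 2    # squared residual
        loss.backward()                     # derivatives accumulate in .grad
        total_loss += float(loss)

    if total_loss < 0.0001:  # assumed stopping threshold
        break

    optimizer.step()       # small step toward a better final_bias
    optimizer.zero_grad()  # clear accumulated gradients for the next epoch

print(epoch, model.final_bias.item())
```

Every line of the loop body is bookkeeping that has nothing to do with the model itself, which is the part Lightning aims to take over.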

Let's see how Lightning helps simplify this process.

Organizing Training Logic with Lightning

Previously, we created a class to store the weights, biases, and the forward() function.

The optimization-related code was written separately outside the class.

With Lightning, we can keep all of this logic in one place.

We start by creating the class as usual, and then add a few new methods.

Configuring the Optimizer

The first method is configure_optimizers().

def configure_optimizers(self):
    return SGD(
        self.parameters(),
        lr=self.learning_rate
    )

This method tells Lightning which optimizer to use and which parameters it should update. Here, SGD receives the model's parameters through self.parameters().

The learning rate comes from the self.learning_rate attribute that we defined earlier.

Defining a Training Step

Next, we add a method called training_step().

def training_step(self, batch, batch_idx):
    input_i, label_i = batch

    output_i = self.forward(input_i)

    loss = (output_i - label_i) ** 2

    return loss

This method receives:

A batch of training data from the DataLoader.
The index of that batch.
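
To make those two arguments concrete, the sketch below builds a DataLoader by hand and loops over it. The three data points are made up, and the loop is written manually only to show what a batch and its index look like. With the default batch size of 1, each batch is a pair of one-element tensors.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Made-up training data: three inputs with their labels.
inputs = torch.tensor([0.0, 0.5, 1.0])
labels = torch.tensor([0.0, 1.0, 0.0])

dataset = TensorDataset(inputs, labels)
dataloader = DataLoader(dataset)  # defaults: batch_size=1, no shuffling

seen = []
for batch_idx, batch in enumerate(dataloader):
    # The same unpacking that training_step() performs.
    input_i, label_i = batch
    seen.append((batch_idx, tuple(input_i.shape), tuple(label_i.shape)))
    print(batch_idx, input_i, label_i)
```

Because the unpacking line is identical to the first line of training_step(), the method works unchanged on whatever the DataLoader yields.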

Inside the method, we:

Extract the input and label from the batch.
Run the input through the neural network.
Calculate the loss using the squared residual.
Return the loss.

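Notice that we only calculate and return the loss. We do not calculate the derivatives, call optimizer.step(), or call optimizer.zero_grad() ourselves; Lightning takes care of those steps during training.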
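
As a rough mental model of why returning the loss is enough, the sketch below imitates by hand what a trainer does with these two methods. It is a heavy simplification, not Lightning's actual implementation. StandIn is a plain nn.Module that merely reuses the two method names so that the example runs with PyTorch alone, and its weight, data, learning rate, and epoch count are all made up.

```python
import torch
from torch import nn
from torch.optim import SGD
from torch.utils.data import TensorDataset, DataLoader


class StandIn(nn.Module):
    """Hypothetical stand-in for a LightningModule (plain PyTorch only)."""

    def __init__(self):
        super().__init__()
        self.final_bias = nn.Parameter(torch.tensor(0.0))
        self.learning_rate = 0.1  # assumed value

    def forward(self, input):
        return 2.0 * input + self.final_bias  # fixed weight of 2.0, assumed

    def configure_optimizers(self):
        return SGD(self.parameters(), lr=self.learning_rate)

    def training_step(self, batch, batch_idx):
        input_i, label_i = batch
        output_i = self.forward(input_i)
        loss = (output_i - label_i) ** 2
        return loss


# Made-up data satisfying label = 2 * input + 1, so the best final_bias is 1.0.
inputs = torch.tensor([0.0, 0.5, 1.0])
labels = torch.tensor([1.0, 2.0, 3.0])
dataloader = DataLoader(TensorDataset(inputs, labels))

model = StandIn()
optimizer = model.configure_optimizers()

# Simplified imitation of the loop a trainer runs for us.
for epoch in range(20):
    for batch_idx, batch in enumerate(dataloader):
        loss = model.training_step(batch, batch_idx)  # our code
        optimizer.zero_grad()                         # handled for us
        loss.backward()                               # handled for us
        optimizer.step()                              # handled for us

print(model.final_bias.item())
```

One difference from the manual loop is worth noting: with default settings, the optimizer steps after every batch rather than once after all three training points.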

With these methods in place, we are ready to start training the neural network using Lightning.

In the next article, we will see how Lightning uses these methods to automatically optimize the model.
