Source: https://www.kaggle.com/ysachit/inference-and-validation-ipynb

Original title: Inference and Validation

library(rTorch)
library(reticulate)            # iterate(), py_len(), import_builtins()
builtins <- import_builtins()  # Python built-ins, used for enumerate() below

Model
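The classifiers used in this notebook come from a separate Python module imported as main, whose source is not shown in this extract. As a rough, hypothetical sketch of what the base classifier could look like, an equivalent class can be defined from R with reticulate::py_run_string() and constructed just like main$Classifier() is below:

# Hypothetical sketch only: the real Classifier lives in the notebook's own
# Python module (imported as `main`); this defines an equivalent network.
py_run_string("
import torch.nn as nn
import torch.nn.functional as F

class Classifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 256)
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 64)
        self.fc4 = nn.Linear(64, 10)

    def forward(self, x):
        x = x.view(x.shape[0], -1)          # flatten the 28x28 images
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        return F.log_softmax(self.fc4(x), dim=1)
")
# py$Classifier() would then build the network, analogous to main$Classifier()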

Most likely class

With the probabilities, we can get the most likely class using the ps$topk method. This returns the k highest values. Since we just want the most likely class, we can use ps$topk(1L). This returns a tuple of the top-k values and the top-k indices. If the highest value is the fifth element, we’ll get back 4 as the index.
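A minimal sketch of that call, assuming model and a batch of images from test_loader (extracted the same way as in the training loops further below):

ps   <- torch$exp(model(images))    # the network outputs log-probabilities
top_ <- ps$topk(1L, dim = 1L)       # top-1 value and index for each row
top_p     <- top_[0]                # highest probabilities (Python 0-based indexing)
top_class <- top_[1]                # most likely classes
print(top_class$shape)              # expected: torch.Size([64, 1])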

Compare predicted vs true labels

Now we can check if the predicted classes match the labels. This is simple to do by comparing top_class and labels with ==, but we have to be careful of the shapes. Here top_class is a 2D tensor with shape (64, 1) while labels is 1D with shape (64). To get the equality to work out the way we want, top_class and labels must have the same shape.

If we just write equals <- top_class == labels, equals will have shape (64, 64); try it yourself. What it is doing is comparing the one element in each row of top_class with each element in labels, which returns 64 True/False values for each row. Instead, we reshape labels to (64, 1) with labels$view(top_class$shape) so the comparison is element-wise.
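Continuing the sketch above (labels is the matching batch of labels):

equals <- top_class == labels$view(top_class$shape)  # element-wise comparison
print(equals$shape)                                   # expected: torch.Size([64, 1])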

Untrained model

Now we need to calculate the percentage of correct predictions. equals has binary values, either 0 or 1. This means that if we just sum up all the values and divide by the number of values, we get the percentage of correct predictions.

We’ll need to convert equals to a float tensor. Note that when we take torch$mean it returns a scalar tensor; to get the actual value as a float we’ll need to do accuracy$item().

accuracy = torch$mean(equals$type(torch$FloatTensor))
cat(sprintf("Accuracy: %f %%", accuracy$item()*100))
#> Accuracy: 9.375000 %

The network is untrained so it’s making random guesses and we should see an accuracy around 10%.

Train the model

Now let’s train our network and include our validation pass so we can measure how well the network is performing on the test set. Since we’re not updating our parameters in the validation pass, we can speed up our code by turning off gradients with torch$no_grad():

model     <- main$Classifier()
criterion <- nn$NLLLoss()
optimizer <- optim$Adam(model$parameters(), lr = 0.003)

epochs <- 5

train_losses <- vector("list")
test_losses  <- vector("list")
for (e in 1:epochs) {
  running_loss <- 0
  iter_train_dataset <- builtins$enumerate(train_loader) # reset iterator
  for (train_obj in iterate(iter_train_dataset)) {
      images <- train_obj[[2]][[1]] # extract images
      labels <- train_obj[[2]][[2]] # extract labels
    
      optimizer$zero_grad()
      
      log_ps <- model(images)
      loss <- criterion(log_ps, labels)
      loss$backward()
      optimizer$step()
      
      running_loss <- running_loss + loss$item()  
  }
  test_loss <- 0
  accuracy  <- 0
  
  with(torch$no_grad(), {
    iter_test_dataset <- builtins$enumerate(test_loader) # reset iterator
    for (test_obj in iterate(iter_test_dataset)) {
      images <- test_obj[[2]][[1]]  # extract images
      labels <- test_obj[[2]][[2]]  # extract labels
      output <- model(images)
      test_loss <- test_loss + criterion(output, labels)
      
      ps       <- torch$exp(output)               # reuse the forward pass above
      top_     <- ps$topk(1L, dim=1L)             # top-1 probability and class index
      top_p    <- top_[0]; top_class <- top_[1]   # Python 0-based indexing on the tuple
      equals   <- top_class == labels$view(top_class$shape)
      accuracy <- accuracy + torch$mean(equals$type(torch$FloatTensor))
    }
  })
  train_losses[[e]] <- running_loss / py_len(train_loader)
  test_losses[[e]]  <- test_loss$item() / py_len(test_loader)
  cat(sprintf("\n Epoch: %3d Training Loss: %8.3f Test Loss: %8.3f Test Accuracy: %8.3f", 
              e,
              running_loss / py_len(train_loader), 
              test_loss$item() / py_len(test_loader),
              accuracy$item() / py_len(test_loader)       
  ))
}
#> 
#>  Epoch:   1 Training Loss:    0.519 Test Loss:    0.458 Test Accuracy:    0.829
#>  Epoch:   2 Training Loss:    0.393 Test Loss:    0.410 Test Accuracy:    0.852
#>  Epoch:   3 Training Loss:    0.355 Test Loss:    0.387 Test Accuracy:    0.854
#>  Epoch:   4 Training Loss:    0.333 Test Loss:    0.399 Test Accuracy:    0.857
#>  Epoch:   5 Training Loss:    0.320 Test Loss:    0.377 Test Accuracy:    0.868

Overfitting

If we look at the training and validation losses as we train the network, we can see a phenomenon known as overfitting.
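For example, the per-epoch losses collected in train_losses and test_losses above can be plotted against each other (a minimal sketch; with only 5 epochs the effect is mild, and a longer run shows it more clearly):

# Minimal sketch: per-epoch training vs. validation loss from the loop above.
train_vec <- unlist(train_losses)
test_vec  <- unlist(test_losses)
plot(seq_along(train_vec), train_vec, type = "b", col = "blue",
     xlab = "Epoch", ylab = "Loss",
     ylim = range(c(train_vec, test_vec)))
lines(seq_along(test_vec), test_vec, type = "b", col = "red")
legend("topright", legend = c("Training loss", "Validation loss"),
       col = c("blue", "red"), lty = 1)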

The network learns the training set better and better, resulting in lower training losses. However, it starts having problems generalizing to data outside the training set, leading to the validation loss increasing. The ultimate goal of any deep learning model is to make predictions on new data, so we should strive to get the lowest validation loss possible. One option is to use the version of the model with the lowest validation loss; in a longer run of this network that is the one from around 8-10 training epochs. This strategy is called early stopping. In practice, you’d save the model frequently as you’re training and then later choose the model with the lowest validation loss.
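A hypothetical sketch of that save-and-pick-the-best idea, keeping a checkpoint whenever the validation loss improves (the file name here is made up):

# Hypothetical sketch: save the weights whenever the validation loss improves.
best_test_loss <- Inf

# ...inside the epoch loop, after the validation pass...
epoch_test_loss <- test_loss$item() / py_len(test_loader)
if (epoch_test_loss < best_test_loss) {
  best_test_loss <- epoch_test_loss
  torch$save(model$state_dict(), "best_checkpoint.pth")
}

# after training, restore the best version of the model
model$load_state_dict(torch$load("best_checkpoint.pth"))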

Model with dropout

The most common method to reduce overfitting (outside of early stopping) is dropout, where we randomly drop input units. This forces the network to share information between weights, increasing its ability to generalize to new data. Adding dropout in PyTorch is straightforward using the nn$Dropout module.
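The dropout classifier constructed below as main$ClassifierDO() also comes from the notebook's Python module. A hypothetical sketch, mirroring the base classifier sketched earlier but with nn.Dropout applied after every hidden layer:

# Hypothetical sketch only: an equivalent of ClassifierDO, with dropout.
py_run_string("
import torch.nn as nn
import torch.nn.functional as F

class ClassifierDO(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 256)
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 64)
        self.fc4 = nn.Linear(64, 10)
        self.dropout = nn.Dropout(p=0.2)    # drop 20% of units while training

    def forward(self, x):
        x = x.view(x.shape[0], -1)          # flatten the 28x28 images
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))
        return F.log_softmax(self.fc4(x), dim=1)
")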

Train the model with dropout

During training we want to use dropout to prevent overfitting, but during inference we want to use the entire network. So we need to turn off dropout during validation, testing, and whenever we’re using the network to make predictions. To do this, you use model$eval(). This sets the model to evaluation mode, where the dropout probability is 0. You can turn dropout back on by setting the model to train mode with model$train(). In general, the pattern for the validation loop looks like the sketch below: turn off gradients, set the model to evaluation mode, calculate the validation loss and metric, then set the model back to train mode.
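A minimal sketch of that pattern, using the same iteration idiom as the loops above (the loss and accuracy calculations are filled in for real in the full loop below):

with(torch$no_grad(), {          # 1) turn off gradients
  model$eval()                   # 2) evaluation mode: dropout probability is 0
  iter_test_dataset <- builtins$enumerate(test_loader)
  for (test_obj in iterate(iter_test_dataset)) {
    images <- test_obj[[2]][[1]]
    labels <- test_obj[[2]][[2]]
    # 3) compute the validation loss and accuracy here
  }
})
model$train()                    # 4) back to train mode (dropout on)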

modelDO     <- main$ClassifierDO()
criterion   <- nn$NLLLoss()
optimizerDO <- optim$Adam(modelDO$parameters(), lr = 0.003)

epochs <- 5

train_losses <- vector("list")
test_losses  <- vector("list")
for (e in 1:epochs) {
  running_loss <- 0
  iter_train_dataset <- builtins$enumerate(train_loader) # reset iterator
  for (train_obj in iterate(iter_train_dataset)) {
      images <- train_obj[[2]][[1]] # extract images
      labels <- train_obj[[2]][[2]] # extract labels
    
      optimizerDO$zero_grad()
      
      log_ps <- modelDO(images)
      loss <- criterion(log_ps, labels)
      loss$backward()
      optimizerDO$step()
      
      running_loss <- running_loss + loss$item()  
  }
  test_loss <- 0
  accuracy  <- 0
  
  modelDO$eval()    # evaluation mode: dropout is turned off for validation
  with(torch$no_grad(), {
    iter_test_dataset <- builtins$enumerate(test_loader)    # reset iterator
    for (test_obj in iterate(iter_test_dataset)) {
      images <- test_obj[[2]][[1]]  # extract images
      labels <- test_obj[[2]][[2]]  # extract labels
      output <- modelDO(images)
      test_loss <- test_loss + criterion(output, labels)
      
      ps       <- torch$exp(output)               # reuse the forward pass above
      top_     <- ps$topk(1L, dim=1L)             # top-1 probability and class index
      top_p    <- top_[0]; top_class <- top_[1]   # Python 0-based indexing on the tuple
      equals   <- top_class == labels$view(top_class$shape)
      accuracy <- accuracy + torch$mean(equals$type(torch$FloatTensor))
    }
  })
  modelDO$train()   # back to train mode so dropout is active for the next epoch
  train_losses[[e]] <- running_loss / py_len(train_loader)
  test_losses[[e]]  <- test_loss$item() / py_len(test_loader)
  cat(sprintf("\n Epoch: %3d Training Loss: %8.3f Test Loss: %8.3f Test Accuracy: %8.3f", 
              e,
              running_loss / py_len(train_loader), 
              test_loss$item() / py_len(test_loader),
              accuracy$item() / py_len(test_loader)       
  ))
}
#> 
#>  Epoch:   1 Training Loss:    0.607 Test Loss:    0.564 Test Accuracy:    0.810
#>  Epoch:   2 Training Loss:    0.479 Test Loss:    0.497 Test Accuracy:    0.827
#>  Epoch:   3 Training Loss:    0.449 Test Loss:    0.502 Test Accuracy:    0.830
#>  Epoch:   4 Training Loss:    0.437 Test Loss:    0.462 Test Accuracy:    0.839
#>  Epoch:   5 Training Loss:    0.419 Test Loss:    0.450 Test Accuracy:    0.843

Inference on dropout model

Now that the model is trained, we can use it for inference. We’ve done this before, but now we need to remember to set the model to inference mode with model$eval(). You’ll also want to turn off autograd with the torch$no_grad() context.
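A minimal sketch of an inference pass on a single test image, assuming modelDO, test_loader and builtins as above:

modelDO$eval()                                   # inference mode: dropout off

test_obj <- iterate(builtins$enumerate(test_loader))[[1]]  # first test batch
img <- test_obj[[2]][[1]][0]                     # first image (Python 0-based indexing)
img <- img$view(c(1L, -1L))                      # flatten into a batch of one

with(torch$no_grad(), {                          # no gradients needed for inference
  ps <- torch$exp(modelDO(img))                  # class probabilities
})
print(ps$topk(1L, dim = 1L))                     # most likely class and its probability

modelDO$train()                                  # back to train mode if training resumes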