library(rTorch)

Digit recognition on PNG images. R code

This version modifies the original tutorial as follows:

  1. Added utility functions
  2. Inspected the tensor at dim = 2 and handled the extra layers in the image tensor
  3. Converted the code from Python to R
  4. Read PNG images instead of the standard MNIST dataset (IDX format)

Objectives

  1. Read PNG images instead of the standard MNIST dataset
  2. Save the model
  3. Read an unseen digit and feed it to the model
  4. Predict the digit and calculate the accuracy

Read image and label for a data point

The size of the image tensor is torch.Size([3, 28, 28]), but it should really be torch.Size([1, 28, 28]): the PNG files are decoded as RGB, which gives three color channels, while the MNIST digits are grayscale and need only one channel.
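
The code that builds the PNG datasets and data loaders is not shown in this excerpt. Below is a minimal sketch of one way they could be created with torchvision's ImageFolder, and of how the first (image, label) pair can be retrieved; the folder paths are hypothetical, and ImageFolder expects one sub-folder per digit class (0/, 1/, ..., 9/).

library(reticulate)
torchvision <- import("torchvision")

# hypothetical folders holding the PNG digits, one sub-directory per class
train_dataset <- torchvision$datasets$ImageFolder(
    root      = "./mnist_png/training",
    transform = torchvision$transforms$ToTensor()
)
test_dataset <- torchvision$datasets$ImageFolder(
    root      = "./mnist_png/testing",
    transform = torchvision$transforms$ToTensor()
)

batch_size_train <- 100L    # assumed; matches the 60,000 / 100 arithmetic used below
train_loader <- torch$utils$data$DataLoader(dataset = train_dataset,
                                            batch_size = batch_size_train,
                                            shuffle = TRUE)
test_loader  <- torch$utils$data$DataLoader(dataset = test_dataset,
                                            batch_size = batch_size_train,
                                            shuffle = FALSE)

# first (image, label) pair of the training set
first_point <- py_get_item(train_dataset, 0L)
img   <- first_point[[1]]
label <- first_point[[2]]
img$size()    # torch.Size([3, 28, 28]): the PNG is read as RGB, hence 3 channels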

Retrieve a second data point

Reduce the number of layers at dim = 1 of the image (the channel dimension)
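
A sketch of one way to do the reduction, assuming the three channels are identical copies of the grayscale digit (which is the case when a grayscale image is decoded as RGB) and that img is the tensor retrieved above:

# keep a single channel; narrow(dim, start, length) slices along the
# channel axis (dim 0L in PyTorch's zero-based indexing)
img_1ch <- img$narrow(0L, 0L, 1L)
img_1ch$size()    # torch.Size([1, 28, 28])

# equivalent alternative: average the three channels
# img_1ch <- img$mean(dim = 0L, keepdim = TRUE)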

Build the model

We arbitrarily set 3,000 iterations here, which means the model will update its parameters 3,000 times.
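
In code, this is just a constant that is assumed to have been defined earlier in the vignette:

n_iters <- 3000L    # total number of parameter updates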

With 60,000 training images and a batch size of 100, one epoch consists of 60,000 / 100 = 600 iterations. Since we want to run 3,000 iterations, that works out to 3,000 / 600 = 5 epochs.

num_epochs = n_iters / (py_len(train_dataset) / batch_size_train)
num_epochs = as.integer(num_epochs)
num_epochs
#> [1] 5
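
The training loop below also expects a model, a loss criterion and an optimizer, none of which appear in this excerpt. Here is a minimal sketch consistent with the source tutorial (logistic regression: a single linear layer from 784 inputs to 10 classes, cross-entropy loss, plain SGD). The tutorial wraps the layer in an nn.Module subclass, but a bare nn$Linear behaves the same here:

input_dim  <- 28L * 28L    # flattened 28 x 28 image
output_dim <- 10L          # digits 0-9

# logistic regression is a single linear layer; the softmax is applied
# implicitly by CrossEntropyLoss
model     <- torch$nn$Linear(input_dim, output_dim)
criterion <- torch$nn$CrossEntropyLoss()

learning_rate <- 0.001     # the value used in the source tutorial
optimizer <- torch$optim$SGD(model$parameters(), lr = learning_rate)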

Training the model

# train the model
iter <- 0L
for (epoch in 1:num_epochs) {
    iter_train_dataset <- builtins$enumerate(train_loader) # reset iterator
    for (train_obj in iterate(iter_train_dataset)) {
        # extract images, labels
        images <- train_obj[[2]][[1]]
        labels <- train_obj[[2]][[2]]
        # Reshape images to (batch_size, 784) and enable gradient tracking
        images = images$view(-1L, 28L*28L)$requires_grad_()
        # Clear gradients w.r.t. parameters
        optimizer$zero_grad()
        # Forward pass to get output/logits
        outputs = model(images)
        # Calculate Loss: softmax --> cross entropy loss
        loss = criterion(outputs, labels)
        # Getting gradients w.r.t. parameters
        loss$backward()
        # Updating parameters
        optimizer$step()
        iter = iter + 1
        if (iter %% 500 == 0) {
            # Every 500 iterations, evaluate accuracy on the test set
            correct <- 0
            total <- 0
            # Iterate through test dataset
            iter_test_dataset <- builtins$enumerate(test_loader) # reset iterator
            for (test_obj in iterate(iter_test_dataset)) {
                # extract images, labels and reshape images to (batch_size, 784)
                images <- test_obj[[2]][[1]]
                labels <- test_obj[[2]][[2]]
                images <- images$view(-1L, 28L*28L)$requires_grad_()
                # Forward pass only to get logits/output
                outputs = model(images)
                # torch$max returns a (values, indices) tuple;
                # keep the indices, i.e. the predicted classes
                .predicted = torch$max(outputs$data, 1L)
                predicted <- .predicted[1L]
                # Total number of labels
                total = total + labels$size(0L)
                # Total correct predictions
                correct = correct + sum((predicted$numpy() == labels$numpy()))
            }
            accuracy = 100 * correct / total
            # Print Loss
            cat(sprintf('Iteration: %5d. Loss: %f. Accuracy: %8.2f \n',
                  iter, loss$item(), accuracy))
        }
    }
}
#> Iteration:   500. Loss: 1.907703. Accuracy:    65.00 
#> Iteration:  1000. Loss: 1.619042. Accuracy:    75.41 
#> Iteration:  1500. Loss: 1.399451. Accuracy:    78.92 
#> Iteration:  2000. Loss: 1.202429. Accuracy:    80.90 
#> Iteration:  2500. Loss: 1.056811. Accuracy:    82.17 
#> Iteration:  3000. Loss: 1.071682. Accuracy:    82.98
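
Objectives 2 to 4 (save the model, read an unseen digit, predict it) are not covered by the excerpt above. Here is a sketch of how they could be done with standard PyTorch calls through rTorch; the file name and the choice of test image are hypothetical:

# save the trained weights (the state_dict)
torch$save(model$state_dict(), "logistic_mnist.pt")

# later: rebuild the same architecture and load the weights back
model2 <- torch$nn$Linear(28L * 28L, 10L)
model2$load_state_dict(torch$load("logistic_mnist.pt"))
model2$eval()

# read one unseen PNG digit and predict its class
unseen <- py_get_item(test_dataset, 0L)    # hypothetical choice: first test image
img    <- unseen[[1]]                      # tensor of shape [3, 28, 28]
label  <- unseen[[2]]

# collapse to one channel and flatten to (1, 784) before the forward pass
img_flat <- img$mean(dim = 0L, keepdim = TRUE)$view(-1L, 28L*28L)
logits   <- model2(img_flat)
pred     <- torch$max(logits$data, 1L)[1L]  # predicted class index, same idiom as in the loop above
cat("predicted:", pred$item(), "- actual:", label, "\n")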

This is a modified version of the original article.

Source: https://www.deeplearningwizard.com/deep_learning/practical_pytorch/pytorch_logistic_regression/