Chapter 3 rTorch vs PyTorch

Last update: Sun Oct 25 13:00:41 2020 -0500 (265c0b3c1)

3.1 What’s different

This chapter explains the main differences between PyTorch and rTorch. Most things that work in PyTorch work directly in rTorch, but we need to be aware of some minor differences. Here is a review of the existing methods.

Let’s start by loading rTorch:
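```r
library(rTorch)
```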

3.2 Calling objects from PyTorch

We use the dollar sign $ to call a class, function, or method from an rTorch module. In this case, from the torch module:
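For example, to create a 1D tensor (a minimal sketch that reproduces the output below):

```r
torch$tensor(c(1, 2, 3))
```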

#> tensor([1., 2., 3.])

In Python, we use a dot to access the members of an object. For example:
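```python
import torch

torch.tensor([1, 2, 3])
```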

#> tensor([1, 2, 3])

3.3 Call functions from torch
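We can also call functions directly from the torch module with the same $ syntax. A minimal sketch that reproduces the output below:

```r
torch$tensor(c(1, 2, 3))
```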

#> tensor([1., 2., 3.])

The code above is equivalent to writing this code in Python:
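```python
import torch

torch.tensor([1, 2, 3])
```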

#> tensor([1, 2, 3])

Then we can proceed to extract classes, methods, and functions from the nn, transforms, and dsets objects. In this example we use the module torchvision$datasets and the function transforms$ToTensor(). For example, the train_dataset of MNIST:
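A sketch of how such a dataset object can be built; the root folder matches the output below, while download = TRUE is an assumption:

```r
dsets      <- torchvision$datasets
transforms <- torchvision$transforms

train_dataset <- dsets$MNIST(root      = "./datasets/mnist_digits",
                             train     = TRUE,
                             transform = transforms$ToTensor(),
                             download  = TRUE)
train_dataset
```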

#> Dataset MNIST
#>     Number of datapoints: 60000
#>     Root location: ./datasets/mnist_digits
#>     Split: Train
#>     StandardTransform
#> Transform: ToTensor()

3.4 Python objects

Sometimes we are interested in knowing the internal components of a class. In that case, we use the reticulate function py_list_attributes().

In this example, we want to show the attributes of train_dataset:
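```r
reticulate::py_list_attributes(train_dataset)
```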

#>  [1] "__add__"                "__class__"              "__delattr__"           
#>  [4] "__dict__"               "__dir__"                "__doc__"               
#>  [7] "__eq__"                 "__format__"             "__ge__"                
#> [10] "__getattribute__"       "__getitem__"            "__gt__"                
#> [13] "__hash__"               "__init__"               "__init_subclass__"     
#> [16] "__le__"                 "__len__"                "__lt__"                
#> [19] "__module__"             "__ne__"                 "__new__"               
#> [22] "__reduce__"             "__reduce_ex__"          "__repr__"              
#> [25] "__setattr__"            "__sizeof__"             "__str__"               
#> [28] "__subclasshook__"       "__weakref__"            "_check_exists"         
#> [31] "_format_transform_repr" "_repr_indent"           "class_to_idx"          
#> [34] "classes"                "data"                   "download"              
#> [37] "extra_repr"             "processed_folder"       "raw_folder"            
#> [40] "resources"              "root"                   "target_transform"      
#> [43] "targets"                "test_data"              "test_file"             
#> [46] "test_labels"            "train"                  "train_data"            
#> [49] "train_labels"           "training_file"          "transform"             
#> [52] "transforms"

Knowing the internal methods of a class can be useful when we want to refer to a specific property of that class. For example, from the list above, we know that the object train_dataset has an attribute __len__. We can call it like this:
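```r
train_dataset$`__len__`()
```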

#> [1] 60000

3.5 Iterating through datasets

3.5.1 Enumeration

Given the following training dataset x_train, we want to find the number of elements of the tensor. We start with a numpy array, which we then convert to a tensor with the PyTorch function from_numpy():
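A sketch that builds such a tensor from the values shown below, assuming np is the numpy module (e.g. np <- reticulate::import("numpy")):

```r
x_train_r <- matrix(c(3.3, 4.4, 5.5, 6.71, 6.93, 4.168, 9.779, 6.182,
                      7.59, 2.167, 7.042, 10.791, 5.313, 7.997, 3.1),
                    ncol = 1)

x_train <- torch$from_numpy(np$array(x_train_r, dtype = np$float32))
x_train$dtype
x_train
```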

#> torch.float32
#> tensor([[ 3.3000],
#>         [ 4.4000],
#>         [ 5.5000],
#>         [ 6.7100],
#>         [ 6.9300],
#>         [ 4.1680],
#>         [ 9.7790],
#>         [ 6.1820],
#>         [ 7.5900],
#>         [ 2.1670],
#>         [ 7.0420],
#>         [10.7910],
#>         [ 5.3130],
#>         [ 7.9970],
#>         [ 3.1000]])

The R function length() is similar to the tensor method nelement(); both return the number of elements:
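```r
length(x_train)
x_train$nelement()
```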

#> [1] 15
#> [1] 15

3.5.2 enumerate and iterate
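First we create an enumerate object over x_train with Python's built-in enumerate(); a sketch using reticulate's import_builtins():

```r
py_builtins  <- reticulate::import_builtins()
enum_x_train <- py_builtins$enumerate(x_train)
enum_x_train
length(x_train)
```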

#> <enumerate>
#> [1] 15

If we apply reticulate's iterate() directly to the enum_x_train object, we get an R list with the index and the value of each 1D tensor:
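```r
reticulate::iterate(enum_x_train)
```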

#> [[1]]
#> [[1]][[1]]
#> [1] 0
#> 
#> [[1]][[2]]
#> tensor([3.3000])
#> 
#> 
#> [[2]]
#> [[2]][[1]]
#> [1] 1
#> 
#> [[2]][[2]]
#> tensor([4.4000])
#> 
#> 
#> [[3]]
#> [[3]][[1]]
#> [1] 2
#> 
#> [[3]][[2]]
#> tensor([5.5000])
#> 
#> 
#> [[4]]
#> [[4]][[1]]
#> [1] 3
#> 
#> [[4]][[2]]
#> tensor([6.7100])
#> 
#> 
#> [[5]]
#> [[5]][[1]]
#> [1] 4
#> 
#> [[5]][[2]]
#> tensor([6.9300])
#> 
#> 
#> [[6]]
#> [[6]][[1]]
#> [1] 5
#> 
#> [[6]][[2]]
#> tensor([4.1680])
#> 
#> 
#> [[7]]
#> [[7]][[1]]
#> [1] 6
#> 
#> [[7]][[2]]
#> tensor([9.7790])
#> 
#> 
#> [[8]]
#> [[8]][[1]]
#> [1] 7
#> 
#> [[8]][[2]]
#> tensor([6.1820])
#> 
#> 
#> [[9]]
#> [[9]][[1]]
#> [1] 8
#> 
#> [[9]][[2]]
#> tensor([7.5900])
#> 
#> 
#> [[10]]
#> [[10]][[1]]
#> [1] 9
#> 
#> [[10]][[2]]
#> tensor([2.1670])
#> 
#> 
#> [[11]]
#> [[11]][[1]]
#> [1] 10
#> 
#> [[11]][[2]]
#> tensor([7.0420])
#> 
#> 
#> [[12]]
#> [[12]][[1]]
#> [1] 11
#> 
#> [[12]][[2]]
#> tensor([10.7910])
#> 
#> 
#> [[13]]
#> [[13]][[1]]
#> [1] 12
#> 
#> [[13]][[2]]
#> tensor([5.3130])
#> 
#> 
#> [[14]]
#> [[14]][[1]]
#> [1] 13
#> 
#> [[14]][[2]]
#> tensor([7.9970])
#> 
#> 
#> [[15]]
#> [[15]][[1]]
#> [1] 14
#> 
#> [[15]][[2]]
#> tensor([3.1000])

3.5.3 for-loop for iteration

Another way of iterating through a dataset, which you will see a lot in PyTorch tutorials, is looping through the length of the dataset, in this case x_train. We use cat() for the index (an integer) and print() for the tensor, since cat() does not know how to deal with tensors:
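A sketch of such a loop, using zero-based, Python-style indexing through __getitem__:

```r
n <- x_train$nelement()
for (i in 0L:(n - 1L)) {
  cat(i, "\t")
  print(x_train$`__getitem__`(i))   # each element is a 1D tensor
}
```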

#> 0    tensor([3.3000])
#> 1    tensor([4.4000])
#> 2    tensor([5.5000])
#> 3    tensor([6.7100])
#> 4    tensor([6.9300])
#> 5    tensor([4.1680])
#> 6    tensor([9.7790])
#> 7    tensor([6.1820])
#> 8    tensor([7.5900])
#> 9    tensor([2.1670])
#> 10   tensor([7.0420])
#> 11   tensor([10.7910])
#> 12   tensor([5.3130])
#> 13   tensor([7.9970])
#> 14   tensor([3.1000])

Similarly, if we want the scalar values rather than tensors, we need to use item():
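```r
for (i in 0L:(n - 1L)) {
  cat(i, "\t")
  print(x_train$`__getitem__`(i)$item())   # item() extracts the R scalar
}
```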

#> 0    [1] 3.3
#> 1    [1] 4.4
#> 2    [1] 5.5
#> 3    [1] 6.71
#> 4    [1] 6.93
#> 5    [1] 4.17
#> 6    [1] 9.78
#> 7    [1] 6.18
#> 8    [1] 7.59
#> 9    [1] 2.17
#> 10   [1] 7.04
#> 11   [1] 10.8
#> 12   [1] 5.31
#> 13   [1] 8
#> 14   [1] 3.1

We will encounter this kind of iterator very frequently when we read a dataset with torchvision. As you will find, there are several different ways to iterate through these objects.

3.6 Zero gradient

Zeroing the gradient was one of the most difficult operations to implement in R: we must pay attention to the content of the objects carrying the weights and biases. This happens when an algorithm written in PyTorch is not immediately translatable to rTorch, as the following example shows.

We use the same seed in the PyTorch and rTorch versions so that we can compare the results.

3.6.1 Code version in Python
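The chunk below is a sketch of the Python code behind this output: the well-known linear regression example with inputs (temperature, rainfall, humidity) and targets (apples, oranges). The input matrix, the seed value, the number of epochs, and the learning rate are assumptions here, so the exact loss may differ:

```python
import numpy as np
import torch

torch.manual_seed(0)                       # seed value assumed

# inputs (temp, rainfall, humidity); values assumed from the classic example
inputs = np.array([[ 73,  67, 43],
                   [ 91,  88, 64],
                   [ 87, 134, 58],
                   [102,  43, 37],
                   [ 69,  96, 70]], dtype='float32')
# targets (apples, oranges), as shown in the output above
targets = np.array([[ 56,  70],
                    [ 81, 101],
                    [119, 133],
                    [ 22,  37],
                    [103, 119]], dtype='float32')

inputs  = torch.from_numpy(inputs)
targets = torch.from_numpy(targets)

# random weights and biases
w = torch.randn(2, 3, requires_grad=True)
b = torch.randn(2, requires_grad=True)

def model(x):
    return x @ w.t() + b

def mse(t1, t2):
    diff = t1 - t2
    return torch.sum(diff * diff) / diff.numel()

# training loop: update the parameters, then zero the gradients
for epoch in range(100):                   # epoch count assumed
    preds = model(inputs)
    loss  = mse(preds, targets)
    loss.backward()
    with torch.no_grad():
        w -= w.grad * 1e-5                 # learning rate assumed
        b -= b.grad * 1e-5
        w.grad.zero_()
        b.grad.zero_()

print("Loss: ", loss)
print("\nPredictions:")
print(preds)
print("\nTargets:")
print(targets)
```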

#> <torch._C.Generator object at 0x7f42c604e250>
#> Loss:  tensor(1270.1233, grad_fn=<DivBackward0>)
#> 
#> Predictions:
#> tensor([[ 69.3122,  80.2639],
#>         [ 73.7528,  97.2381],
#>         [118.3933, 124.7628],
#>         [ 89.6111,  93.0286],
#>         [ 47.3014,  80.6467]], grad_fn=<AddBackward0>)
#> 
#> Targets:
#> tensor([[ 56.,  70.],
#>         [ 81., 101.],
#>         [119., 133.],
#>         [ 22.,  37.],
#>         [103., 119.]])

3.6.2 Code version in R
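A sketch of the rTorch translation, under the same assumptions as the Python version above:

```r
torch$manual_seed(0L)                      # seed value assumed

# inputs (temp, rainfall, humidity); values assumed from the classic example
inputs <- torch$from_numpy(np$array(matrix(
  c( 73,  67, 43,
     91,  88, 64,
     87, 134, 58,
    102,  43, 37,
     69,  96, 70), ncol = 3, byrow = TRUE), dtype = np$float32))

# targets (apples, oranges), as shown in the output above
targets <- torch$from_numpy(np$array(matrix(
  c( 56,  70,
     81, 101,
    119, 133,
     22,  37,
    103, 119), ncol = 2, byrow = TRUE), dtype = np$float32))

w <- torch$randn(2L, 3L, requires_grad = TRUE)
b <- torch$randn(2L, requires_grad = TRUE)

model <- function(x) torch$add(torch$mm(x, w$t()), b)

mse <- function(t1, t2) {
  diff <- torch$sub(t1, t2)
  torch$div(torch$sum(torch$mul(diff, diff)), diff$numel())
}

# training loop: update the parameters, then zero the gradients
for (epoch in 1:100) {                     # epoch count assumed
  preds <- model(inputs)
  loss  <- mse(preds, targets)
  loss$backward()
  with(torch$no_grad(), {
    w$data <- torch$sub(w$data, torch$mul(w$grad, torch$scalar_tensor(1e-5)))
    b$data <- torch$sub(b$data, torch$mul(b$grad, torch$scalar_tensor(1e-5)))
    w$grad$zero_()
    b$grad$zero_()
  })
}

cat("Loss: ");         print(loss)
cat("\nPredictions:\n"); print(preds)
cat("\nTargets:\n");     print(targets)
```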

#> <torch._C.Generator>
#> Loss: tensor(1270.1237, grad_fn=<DivBackward0>)
#> 
#> Predictions:
#> tensor([[ 69.3122,  80.2639],
#>         [ 73.7528,  97.2381],
#>         [118.3933, 124.7628],
#>         [ 89.6111,  93.0286],
#>         [ 47.3013,  80.6467]], grad_fn=<AddBackward0>)
#> 
#> Targets:
#> tensor([[ 56.,  70.],
#>         [ 81., 101.],
#>         [119., 133.],
#>         [ 22.,  37.],
#>         [103., 119.]])

Notice that in Python the tensor operation that updates the weights \(w\) by their gradient (\(\nabla w\)) times the learning rate \(\alpha\) is:

\[w = w - \nabla w \; \alpha\]

In Python, the code is very straightforward and clean:
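```python
with torch.no_grad():
    w -= w.grad * 1e-5
    b -= b.grad * 1e-5
    w.grad.zero_()
    b.grad.zero_()
```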

In R, without generics, it looks a little more convoluted:
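```r
with(torch$no_grad(), {
  w$data <- torch$sub(w$data, torch$mul(w$grad, torch$scalar_tensor(1e-5)))
  b$data <- torch$sub(b$data, torch$mul(b$grad, torch$scalar_tensor(1e-5)))
  w$grad$zero_()
  b$grad$zero_()
})
```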

3.7 R generic functions

This is why we simplified these common operations using R generic functions. When we use the generic methods from rTorch, the operation looks much neater.

The following two expressions are equivalent, the first being the long version, the natural way of doing it in PyTorch. The second uses the R generics for subtraction, multiplication, and scalar conversion.
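Both update w$data in the same way; a sketch, reusing the learning rate from the example above:

```r
# long version: explicit torch functions, mirroring PyTorch
w$data <- torch$sub(w$data, torch$mul(w$grad, torch$scalar_tensor(1e-5)))

# short version: rTorch's R generics handle the subtraction,
# multiplication, and scalar conversion for us
w$data <- w$data - w$grad * 1e-5
```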