One of the most commonly used frameworks for machine learning is TensorFlow. TensorFlow is an open-source linear algebra library with a focus on neural networks, published by Google in 2015. TensorFlow supports several useful features, in particular automatic differentiation, several gradient-based optimizers, and CPU and GPU parallelization.
These advantages are nicely explained in the following video:
To sum up the most important points of the video:
All operations in TensorFlow are written in C++ and are highly optimized. But don’t worry, we don’t have to use C++ to use TensorFlow because there are several bindings for other languages. TensorFlow officially supports a Python API, but there are also several community-maintained APIs for other languages:
In this course we will use TensorFlow with the https://tensorflow.rstudio.com/ binding, which was developed and published in 2017 by the RStudio team. They first developed an R package (reticulate) for calling Python from R, so we are actually using the Python TensorFlow module from within R (more about this later).
TensorFlow offers different levels of API. We could implement a neural network completely by ourselves, or we could use Keras, which is provided as a submodule by TensorFlow. Keras is a powerful module for building and training neural networks; it allows us to build and train neural networks in a few lines of code. Since the end of 2018, Keras and TensorFlow have been completely interoperable, allowing us to utilize the best of both. In this course, we will show how we can use Keras for neural networks, but also how we can use TensorFlow’s automatic differentiation for complex objective functions.
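As a small preview of automatic differentiation, here is a minimal sketch (the variable names are ours, and it assumes library(tensorflow) has already been loaded as shown below): a GradientTape records the operations performed on a variable so that gradients can be queried afterwards.

x = tf$Variable(2.0)
with(tf$GradientTape() %as% tape, {
  y = x * x # Operations inside the tape are recorded.
})
tape$gradient(y, x) # dy/dx = 2*x, i.e. a tensor holding 4.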
Useful links:
TensorFlow has two data containers (structures): constants (tf$constant), which are immutable, and variables (tf$Variable), which are mutable.
To get started with TensorFlow, we have to load the library and check if the installation worked.
library(tensorflow)
library(keras3)
# Don't worry about weird messages. TensorFlow supports additional optimizations.
exists("tf")
[1] TRUE
immutable = tf$constant(5.0)
mutable = tf$Variable(5.0)
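The difference between the two containers shows up as soon as you try to change a value: only a tf$Variable can be updated in place (a small sketch, names chosen for illustration):

v = tf$Variable(5.0)
v$assign(10.0)   # Works, Variables are mutable.
k = tf$constant(5.0)
# k$assign(10.0) # Would fail, constants cannot be changed.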
Don’t worry about weird messages (they will only appear once at the start of the session).
We can now define variables and do some math with them:
a = tf$constant(5)
b = tf$constant(10)
print(a)
tf.Tensor(5.0, shape=(), dtype=float32)
print(b)
tf.Tensor(10.0, shape=(), dtype=float32)
c = tf$add(a, b)
print(c)
tf.Tensor(15.0, shape=(), dtype=float32)
tf$print(c) # Prints to stderr. For stdout, use k_print_tensor(..., message).
Normal R methods such as print() are provided by the R package “tensorflow”.
The R package (created by the RStudio team) provides R methods for all common operations, defined roughly like this:
`+.tensorflow.tensor` = function(a, b){ return(tf$add(a,b)) }
# Mind the backticks.
(a+b)
tf.Tensor(15.0, shape=(), dtype=float32)
Their operators also automatically transform R numbers into constant tensors when attempting to add a tensor to an R number:
d = c + 5 # 5 is automatically converted to a tensor.
print(d)
tf.Tensor(20.0, shape=(), dtype=float32)
TensorFlow containers are objects, which means they are not just simple variables of type numeric (class(5)); instead, they have so-called methods. Methods can change the state of an object (which for most of our purposes here means its values). For instance, there is a method to transform the tensor object back into an R object:
class(d)
[1] "tensorflow.tensor"
[2] "tensorflow.python.framework.ops.EagerTensor"
[3] "tensorflow.python.framework.ops._EagerTensorBase"
[4] "tensorflow.python.framework.tensor.Tensor"
[5] "tensorflow.python.types.internal.NativeObject"
[6] "tensorflow.python.types.core.Symbol"
[7] "tensorflow.python.types.core.Value"
[8] "tensorflow.python.types.core.Tensor"
[9] "python.builtin.object"
class(d$numpy())
[1] "numeric"
class(as.matrix(d))
[1] "matrix" "array"
R uses dynamic typing, which means you can assign a number, character, function, or whatever else to a variable, and the type is inferred automatically. In other languages you have to state the type explicitly, e.g. in C:
int a = 5;
float a = 5.0;
char a = "a";
While TensorFlow tries to infer the type dynamically, you must often state it explicitly. Common important types are float32, float64 (R's default double precision), int32, and int64.
The reason why TensorFlow is so explicit about types is that many GPUs (e.g. NVIDIA GeForce cards) can only handle numbers with up to 32-bit precision (you do not need high precision in graphical modeling).
But let us see in practice how we have to deal with these types and how to specify them:
r_matrix = matrix(runif(10*10), 10, 10)
m = tf$constant(r_matrix, dtype = "float32")
b = tf$constant(2.0, dtype = "float64")
c = m / b # Doesn't work! We try to divide float32/float64.
So what went wrong here? We tried to divide a float32 by a float64 number, but we can only divide numbers of the same type!
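Besides re-creating the tensors with matching types (as done next), we could also cast one of them explicitly; a minimal sketch using tf$cast:

c = tf$cast(m, tf$float64) / b # Cast m from float32 to float64, then divide.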
r_matrix = matrix(runif(10*10), 10, 10)
m = tf$constant(r_matrix, dtype = "float64")
b = tf$constant(2.0, dtype = "float64")
c = m / b # Now it works.
We can also specify the type by passing a type object, e.g. tf$float64:
r_matrix = matrix(runif(10*10), 10, 10)
m = tf$constant(r_matrix, dtype = tf$float64)
In TensorFlow, arguments often require exact/explicit data types: TensorFlow often expects integers as arguments. In R, however, a whole number is normally stored as a double (numeric). Thus, we have to append an "L" to a number to tell the R interpreter that it should be treated as an integer:
is.integer(5)  # FALSE, 5 is stored as a double.
is.integer(5L) # TRUE.
matrix(t(r_matrix), 5, 20, byrow = TRUE)         # Base R equivalent.
tf$reshape(r_matrix, shape = c(5, 20))$numpy()   # Wrong: shape given as doubles (no "L").
tf$reshape(r_matrix, shape = c(5L, 20L))$numpy() # Correct: integer shape.
Skipping the “L” is one of the most common errors when using R-TensorFlow!
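If a value comes from a computation and is therefore stored as a double, as.integer() converts it on the fly (a small sketch, the variable n is ours):

n = 5
tf$reshape(r_matrix, shape = c(as.integer(n), 20L))$numpy() # as.integer() works as well.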
PyTorch is another famous library for deep learning. Like TensorFlow, Torch itself is written in C++ with an API for Python. In 2020, the RStudio team released R-Torch, and while R-TensorFlow calls the Python API in the background, the R-Torch API is built directly on the C++ Torch library!
Useful links:
To get started with Torch, we have to load the library and check if the installation worked.
library(torch)
Attaching package: 'torch'
The following object is masked from 'package:keras3':
as_iterator
Unlike TensorFlow, Torch doesn’t have two data containers for mutable and immutable variables. All variables are initialized via the torch_tensor function:
a = torch_tensor(1.)
To mark variables as mutable (and to track their operations for automatic differentiation) we have to set the argument 'requires_grad' to TRUE in the torch_tensor function:
mutable = torch_tensor(5, requires_grad = TRUE)    # tf$Variable(...)
immutable = torch_tensor(5, requires_grad = FALSE) # tf$constant(...)
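To see what requires_grad actually enables, here is a minimal sketch of automatic differentiation in Torch (the variable names are chosen for illustration):

x = torch_tensor(2, requires_grad = TRUE)
y = x * x    # Operations on x are recorded.
y$backward() # Compute the gradients.
x$grad       # dy/dx = 2*x, i.e. a tensor holding 4.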
We can now define variables and do some math with them:
a = torch_tensor(5.)
b = torch_tensor(10.)
print(a)
torch_tensor
5
[ CPUFloatType{1} ]
print(b)
torch_tensor
10
[ CPUFloatType{1} ]
c = a$add(b)
print(c)
torch_tensor
15
[ CPUFloatType{1} ]
The R-Torch package provides all common R methods (an advantage over TensorFlow).
a = torch_tensor(5.)
b = torch_tensor(10.)
print(a+b)
torch_tensor
15
[ CPUFloatType{1} ]
print(a/b)
torch_tensor
0.5000
[ CPUFloatType{1} ]
print(a*b)
torch_tensor
50
[ CPUFloatType{1} ]
Their operators also automatically transform R numbers into tensors when attempting to add a tensor to an R number:
d = a + 5 # 5 is automatically converted to a tensor.
print(d)
torch_tensor
10
[ CPUFloatType{1} ]
As for TensorFlow, we have to explicitly transform the tensors back to R:
class(d)
[1] "torch_tensor" "R7"
class(as.numeric(d))
[1] "numeric"
Similar to TensorFlow:
r_matrix = matrix(runif(10*10), 10, 10)
m = torch_tensor(r_matrix, dtype = torch_float32())
b = torch_tensor(2.0, dtype = torch_float64())
c = m / b
But here’s a difference! With TensorFlow we would get an error, but with R-Torch, m is automatically cast to a double (float64). However, this is still bad practice!
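If you prefer the cast to be explicit rather than implicit, you can convert the tensor yourself before dividing; a minimal sketch using the tensor's $to() method:

m64 = m$to(dtype = torch_float64()) # Explicit cast to float64.
c = m64 / b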
During the course we will try to provide the corresponding PyTorch code snippets for all Keras/TensorFlow examples.
::: {.callout-caution icon="false"}

#### Question: Runtime
This exercise compares the speed of R to Torch. The first task is to rewrite the following function in Torch:
do_something_R = function(x = matrix(0.0, 10L, 10L)){
  mean_per_row = apply(x, 1, mean)
  result = x - mean_per_row
  return(result)
}
Here, we provide a skeleton for the Torch function:
do_something_torch = function(x = matrix(0.0, 10L, 10L)){
  ...
}
We can compare the speed using the microbenchmark package:
test = matrix(0.0, 100L, 100L)
microbenchmark::microbenchmark(do_something_R(test), do_something_torch(test))
Try different matrix sizes for the test matrix and compare the speed.
Tip: Have a look at the torch_mean documentation and the "dim" argument.
Compare the following with different matrix sizes:
Also try the following:
microbenchmark::microbenchmark(
  torch_matmul(testTorch, testTorch$t()), # Torch style.
  test %*% t(test)                        # R style.
)
do_something_torch = function(x = matrix(0.0, 10L, 10L)){
  x = torch_tensor(x) # Remember, this is a local copy!
  mean_per_row = torch_mean(x, dim = 1)
  result = x - mean_per_row
  return(result)
}
test = matrix(0.0, 100L, 100L)
microbenchmark::microbenchmark(do_something_R(test), do_something_torch(test))
Unit: microseconds
expr min lq mean median uq max
do_something_R(test) 261.908 281.0755 295.29307 284.909 291.2230 1141.071
do_something_torch(test) 55.555 61.9510 93.54683 64.206 70.7455 1672.349
neval cld
100 a
100 b
test = matrix(0.0, 1000L, 500L)
microbenchmark::microbenchmark(do_something_R(test), do_something_torch(test))
Unit: microseconds
expr min lq mean median uq
do_something_R(test) 5509.785 5678.172 7001.044 5937.169 8000.432
do_something_torch(test) 910.897 1224.875 1328.881 1328.031 1413.065
max neval cld
12527.181 100 a
3075.697 100 b
Why is R faster (the first time)?
test = matrix(0.0, 1000L, 500L)
testTorch = torch_tensor(test)

microbenchmark::microbenchmark(
  torch_matmul(testTorch, testTorch$t()), # Torch style.
  test %*% t(test)                        # R style.
)
Unit: milliseconds
expr min lq mean
torch_matmul(testTorch, testTorch$t()) 1.005238 1.241234 1.631383
test %*% t(test) 164.589293 167.845821 190.496715
median uq max neval cld
1.437911 1.678602 6.608872 100 a
175.203742 195.195916 423.898262 100 b
:::