1  Getting Started

1.1 Organization of this book

The book is organized in five parts:

  1. ML basics

  2. Classical ML algorithms

  3. Deep learning

  4. xAI and causal ML

  5. Generative AI

1.2 Software requirements

1.2.1 R System

Make sure you have a recent version of R (>= 4.2, ideally >= 4.3) and RStudio on your computer. Mac users who already have an M1-M3 Mac should install the R-ARM version, not the x86_64 version (see here).
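To check which R you are currently running, a minimal sketch along the following lines may help. The minimum version is taken from the requirement stated above; the variable names (`min_version`, `r_ok`, `r_arch`) are only illustrative.

```r
# Minimum R version from the requirement above (>= 4.2); adjust if needed.
min_version <- "4.2.0"

# getRversion() returns the running R version as a comparable object.
r_ok <- getRversion() >= min_version

# R.version$arch reports the architecture this R build was compiled for.
# On an ARM build for Mac this is typically "aarch64", on Intel builds "x86_64".
r_arch <- R.version$arch

cat("R version:    ", as.character(getRversion()), "\n")
cat("Meets minimum:", r_ok, "\n")
cat("Architecture: ", r_arch, "\n")
```

If `Architecture` shows `x86_64` on an M1-M3 Mac, you are most likely running the Intel build and should switch to the ARM build.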

1.2.2 TensorFlow and Keras

If you want to run the code on your own computer, you need to install TensorFlow / Keras for R. The following should work for most people:

install.packages("keras3", dependencies = TRUE)
keras3::install_keras(backend="tensorflow")

This should work on most computers, particularly if all software is recent. Sometimes, however, the installation fails, most often because of problems with the Python distribution. If the installation does not work for you, we can look at it together. We will also provide virtual machines in case your computer or laptop is too old or you cannot get TensorFlow installed.
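If you are unsure whether the installation succeeded, a quick check like the following can narrow down where the problem sits. This is only a sketch: `check_tensorflow()` is a hypothetical helper written for this example, and it assumes that loading `keras3` points reticulate at the Python environment created by `install_keras()`.

```r
# Hypothetical helper: reports in one string how far the setup got.
check_tensorflow <- function() {
  # Step 1: is the R package there at all?
  if (!requireNamespace("keras3", quietly = TRUE)) {
    return("keras3 is not installed")
  }
  # Steps 2 and 3: can reticulate start Python, and does it see TensorFlow?
  # Any error during the check is caught and reported instead of stopping R.
  tryCatch({
    if (!reticulate::py_available(initialize = TRUE)) {
      "no usable Python installation found"
    } else if (!reticulate::py_module_available("tensorflow")) {
      "Python found, but the tensorflow module is missing"
    } else {
      "ok"
    }
  }, error = function(e) paste("check failed:", conditionMessage(e)))
}

print(check_tensorflow())
```

Anything other than `"ok"` tells you which of the three steps failed, which is useful information to bring along when we debug the installation together.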

1.2.3 Torch for R

We may also use Torch for R. This is an R frontend for the popular PyTorch framework. To install Torch, type in R:

install.packages("torch")
library(torch)
torch::install_torch()
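To confirm that Torch is usable after the installation, you can run a small check like the one below. It is a sketch that only relies on `torch::torch_is_installed()` and `torch::torch_tensor()`; the variable name `torch_ready` is just illustrative.

```r
# TRUE only if the R package is present AND its backend libraries were downloaded.
torch_ready <- requireNamespace("torch", quietly = TRUE) &&
  torch::torch_is_installed()

if (torch_ready) {
  # Create a small tensor and do some arithmetic on it as a smoke test.
  x <- torch::torch_tensor(c(1, 2, 3))
  print(x * 2)
} else {
  message("Torch is not ready yet; run torch::install_torch() and try again.")
}
```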

1.2.4 EcoData

We use data sets from the EcoData package. To install the package, run:

devtools::install_github(repo = "TheoreticalEcology/EcoData", 
                         dependencies = TRUE, build_vignettes = TRUE)

The default installation will install a number of packages that are useful for statistics. Especially on Linux, this may take some time. If you are in a hurry and only want the data, you can also run:

devtools::install_github(repo = "TheoreticalEcology/EcoData", 
                         dependencies = FALSE, build_vignettes = FALSE)

1.2.5 Additional Libraries

There are a number of additional libraries that we may use during the course. Grab a coffee or two (this will take a while…) and install the following libraries. Please do this in the given order unless you know what you’re doing, because there are some dependencies between the packages.

install.packages("abind")
install.packages("animation")
install.packages("ape")
install.packages("BiocManager")
BiocManager::install(c("Rgraphviz", "graph", "RBGL"))
install.packages("coro")
install.packages("cito")
install.packages("dbscan")
install.packages("dendextend")
install.packages("devtools")
install.packages("dplyr")
install.packages("e1071")
install.packages("factoextra")
install.packages("fields")
install.packages("forcats")
install.packages("glmnet")
install.packages("glmnetUtils")
install.packages("gym")
install.packages("kknn")
install.packages("knitr")
install.packages("iml")
install.packages("lavaan")
install.packages("lmtest")
install.packages("magick")
install.packages("mclust")
install.packages("Metrics")
install.packages("microbenchmark")
install.packages("missRanger")
install.packages("mlbench")
install.packages("mlr3")
install.packages("mlr3learners")
install.packages("mlr3measures")
install.packages("mlr3pipelines")
install.packages("mlr3tuning")
install.packages("paradox")
install.packages("partykit")
install.packages("pcalg")
install.packages("piecewiseSEM")
install.packages("purrr")
install.packages("randomForest")
install.packages("ranger")
install.packages("rpart")
install.packages("rpart.plot")
install.packages("scales")
install.packages("semPlot")
install.packages("stringr")
install.packages("tfprobability")
install.packages("tidyverse")
install.packages("torchvision")
install.packages("xgboost")
install.packages("tidymodels")

devtools::install_github("andrie/deepviz", dependencies = TRUE,
                         upgrade = "always")
devtools::install_github('skinner927/reprtree')
devtools::install_version("lavaanPlot", version = "0.6.0")

reticulate::conda_install("r-keras", packages = "scipy", pip = TRUE)
reticulate::conda_install("r-keras", packages = "tensorflow_probability", pip = TRUE)
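If some of the installations above fail, or you already have part of the list installed, it can help to see which CRAN packages are still missing before retrying. The following is a minimal sketch; `missing_packages()` is a hypothetical helper, and the package vector is a shortened, illustrative subset of the list above.

```r
# Hypothetical helper: returns the names in `pkgs` that are not yet installed.
missing_packages <- function(pkgs) {
  # installed.packages() returns a matrix whose row names are package names.
  setdiff(pkgs, rownames(installed.packages()))
}

# Illustrative subset of the CRAN packages listed above.
pkgs <- c("abind", "cito", "dplyr", "ranger", "xgboost")

to_install <- missing_packages(pkgs)
print(to_install)

# Uncomment to install only what is missing:
# if (length(to_install) > 0) install.packages(to_install)
```

Note that this only covers packages from CRAN; the Bioconductor and GitHub packages still need the dedicated commands shown above.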

1.3 Linux/UNIX

On Linux/UNIX systems, some additional system dependencies sometimes have to be installed.

Debian-based systems

For Debian-based systems, we need:

build-essential
gfortran
libmagick++-dev
r-base-dev

If you are new to installing packages on Debian / Ubuntu, etc., type the following:

sudo apt update && sudo apt install -y --install-recommends build-essential gfortran libmagick++-dev r-base-dev
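To see from within R whether the compilers are available, a check like this sketch can be used. It assumes that `build-essential` provides `gcc`, `g++` and `make`, and that `gfortran` comes from the package of the same name; the variable names are illustrative.

```r
# Command-line tools that R needs to compile packages from source.
tools <- c("gcc", "g++", "make", "gfortran")

# Sys.which() returns the full path of each tool, or "" if it is not on the PATH.
found <- Sys.which(tools)

toolchain <- data.frame(
  tool      = tools,
  path      = unname(found),
  available = nzchar(found)
)
print(toolchain)
```

Any row with `available = FALSE` points to a system package that still has to be installed.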

1.4 Assumed R knowledge

Basic knowledge of R is required to successfully participate in this course. In particular, you should be able to transform and subselect (slice) data. Have a look at this section of the advanced statistics course, which provides a short test as well as further links for reading up on the background.
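As a rough indication of the level we have in mind, you should be comfortable reading code like the following. It is only an illustrative sketch using the built-in `iris` data set, not an exercise from the course; the object names are arbitrary.

```r
# Work on a copy of the built-in iris data set.
dat <- iris

# Subselect: rows by a logical condition, columns by name.
setosa <- dat[dat$Species == "setosa", c("Sepal.Length", "Sepal.Width")]

# Slice: the first five rows, all columns.
first_rows <- dat[1:5, ]

# Transform: add a new column computed from existing ones.
dat$Sepal.Ratio <- dat$Sepal.Length / dat$Sepal.Width

# Summarize: mean sepal length per species.
species_means <- aggregate(Sepal.Length ~ Species, data = dat, FUN = mean)

print(head(setosa))
print(species_means)
```

If each of these lines is clear to you, you have the R background needed for the course.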