R in Jupyter notebooks

Exploit R's notebook support for blog posts

  • toc: true
  • branch: master
  • badges: true
  • hide_binder_badge: true
  • comments: true
  • author: Konrad Wölms
  • categories: [first steps, R, jupyter]

Jupyter's intentions

Jupyter was originally designed as an interactive environment for Julia, Python and R. This is even reflected in the name: Ju(lia), pyt(hon) and R, even if they snuck an extra e in there. In the Python data science community Jupyter is widely used, while the R community uses RStudio as its standard IDE, and RStudio is what most newcomers to R are introduced to first.

Here we'll give an example of how to use Jupyter as an alternative to RStudio. This should not be misunderstood: RStudio is a great tool and we don't want to downplay its importance. The main advantage of using Jupyter is access to a set of tools that are built specifically for Jupyter notebooks. Many of these tools were developed with Python in mind, but Jupyter's multi-language support allows us to use them from R. This post is itself an example, in the sense that it was written for fastPages, a notebook blogging framework.

IRkernel and code execution

In order to run R code in Jupyter, an R kernel needs to be registered with Jupyter. This requires an existing R installation and the additional R package IRkernel. This package can then register the R installation as a kernel for Jupyter. Afterwards the R kernel can be selected when running a Jupyter notebook, which allows R code to be executed in code cells.
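The registration step can be sketched as follows. This is a minimal sketch, assuming that Jupyter is already installed and available on the PATH, and that the commands are run in a plain R session rather than inside a notebook. The final check through `system2()` is an optional addition for illustration, not part of the original setup.

```r
# Run once in a plain R session (terminal or RStudio), not inside Jupyter.
install.packages("IRkernel")

# Register this R installation as a Jupyter kernel for the current user.
# Assumption: the `jupyter` command is installed and on the PATH.
IRkernel::installspec()

# Optional check: list the kernels Jupyter knows about.
# With the default settings the R kernel is registered under the name "ir".
system2("jupyter", c("kernelspec", "list"))
```

After this, the R kernel should be offered as a choice when creating a new notebook.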

The following is a simple example of loading some standard R packages, some data and creating a plot.

In [1]:
#hide_output
require('tidyverse');
data(mtcars)
Loading required package: tidyverse

-- Attaching packages ------------------------------------------------------------------------------- tidyverse 1.3.0 --

v ggplot2 3.3.3     v purrr   0.3.4
v tibble  3.0.6     v dplyr   1.0.4
v tidyr   1.1.2     v stringr 1.4.0
v readr   1.4.0     v forcats 0.5.1

-- Conflicts ---------------------------------------------------------------------------------- tidyverse_conflicts() --
x dplyr::filter() masks stats::filter()
x dplyr::lag()    masks stats::lag()

In [2]:
mtcars %>%
    arrange(cyl) %>%  # sort by the cyl column (bare name, not a string)
    ggplot(aes(x=mpg,y=cyl)) +
    geom_point()
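A detail worth knowing when sorting with dplyr: `arrange()` expects bare column names. A quoted string such as `"cyl"` is treated as a constant value rather than a column reference, so the row order stays as it was. The following small sketch, which assumes only that dplyr is installed, illustrates the difference on `mtcars`; the names `by_column` and `by_string` are chosen here for illustration.

```r
library(dplyr)
data(mtcars)

# Bare column name: rows are sorted by the number of cylinders.
by_column <- arrange(mtcars, cyl)

# Quoted string: every row gets the same sort key, so nothing is reordered.
by_string <- arrange(mtcars, "cyl")

# Compare the first few cyl values of each result.
head(by_column$cyl)
head(by_string$cyl)
```

For a scatter plot the row order does not change the picture, but it matters as soon as the sorted data is printed or used for a line plot.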

Interactive features

Nowadays there are many libraries that create interactive plots and maps driven by JavaScript. Many of these work well together with Jupyter notebooks and fastPages. An example is leaflet, a popular JavaScript library for displaying locations on maps. R has a package of the same name that wraps this library. The following is a standard example from the package website.

In [3]:
#hide
install.packages("leaflet")
Installing package into '/home/konrad/R/x86_64-pc-linux-gnu-library/3.5'
(as 'lib' is unspecified)

In [4]:
library(leaflet)

m <- leaflet() %>%
  addTiles() %>%  # Add default OpenStreetMap map tiles
  addMarkers(lng=174.768, lat=-36.852, popup="The birthplace of R")
m  # Print the map

Google Colab and Reproducibility

Another nice feature of fastPages is that it directly supports links to Google Colab and Binder. These tools allow readers of Jupyter notebooks and fastPages blog posts to execute the code themselves without needing a local Jupyter installation. Unfortunately Binder does not support R kernels at this point, but Google Colab does. Posts like this one can therefore be opened, executed and modified there. One could even use Google Colab to write blog posts in the first place. The following blog post gives a few more details on that. There are a few limitations to Google Colab, however: for instance, some interactive libraries like leaflet don't display correctly.

Conclusion

We showed that using Jupyter notebooks for R programming is possible and opens up many possibilities for the R user. Of the features this brings, the one we want to highlight most is that it allows creating blog posts with fastPages, just like this one.
