Docker has made building web applications much easier than before. For literate programming and teaching reproducibility, this means virtual environments can be built with the specific versions of software we want, allowing us to experiment with e.g. Jupyter notebooks and R Markdown in a “clean” environment, and not allowing one to claim that “it doesn’t work on my machine”.
I teach statistics and data science modules where the students need to submit assignments that are in pdf generated by R Markdown. Pdf is chosen over HTML for page limit reasons, even though the latter is better for accessibility purposes and brings about no dependency issues when students are installing required packages in RStudio. However, to be able to generate pdf, a TeX distribution is needed, and installing that sometimes messes things up even if TinyTeX is used, and the issues might be OS-dependent. Therefore, I have always wanted to create a virtual environment (VE) with Docker, for at least two purposes:
- Students can check the reproducibility of their work by uploading their Rmd to the VE;
- The VE acts as the last resort if they keep getting errors on their local machines.
Lastly, not everyone is comfortable with Posit/RStudio cloud as it requires creating another account.
Last week I attended a training course with Project TIER, in partnership with UK Reproducibility Network, and was introduced to Binder; thanks to Matt Ingram. Binder takes Dockerisation further by doing the heavy-lifting in the background. To create a VE, you just need to do the following steps:
- create an open repository; I created my on GitHub;
- create, at minimum, an
environment.yml
, specifying the dependencies (R packages) required; - add other files required; in my case it’s the
apt.txt
for the TeX distribution; - go to Binder and type in the repository name or URL, and click the launch button;
- wait for usually a few minutes for Binder to build the VE for you;
- once it’s launched, work in the VE as you would with a local piece of software; uploading and downloading files is also seamless.
I have checked that there is no problem with generating pdf using R Markdown. There might be less needed for environment.yml
and apt.txt
, but for now I will go with “if it ain’t broke, don’t fix it”.