---
title: "Explore tidyverse with liftr"
author: "Nan Xiao <>"
bibliography: liftr-tidyverse.bib
output:
  rmarkdown::html_document:
    toc: true
    toc_float: false
    toc_depth: 4
    number_sections: false
    highlight: "textmate"
    css: "custom.css"
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{Explore tidyverse with liftr}
---

```{r, include=FALSE}
knitr::opts_chunk$set(
  comment = "#>",
  collapse = TRUE
)
```

## Introduction

Creating Docker images from scratch can be time-consuming and labor-intensive. Fortunately, many pre-built and regularly updated Docker images for the R community are ready to use, which is especially convenient when creating your own containerized R Markdown documents with liftr. Sources of such pre-built Docker images include the [rocker project](https://github.com/rocker-org/rocker) and [Bioconductor Docker containers](https://bioconductor.org/help/docker/).

In this article, we will use the [tidyverse image](https://hub.docker.com/r/rocker/tidyverse/) provided by rocker. This image includes the essential tidyverse packages and the devtools environment loved by many data scientists [@wickham2014tidy]. We will demonstrate how to containerize and render your tidyverse-heavy R Markdown document using Docker in only a few minutes.

## Install Docker

If Docker has not been installed on your system, please use `install_docker()` and follow the guidelines to install it. After that, `check_docker_install()` and `check_docker_running()` help you confirm that Docker is installed and running properly.

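The installation check described above can be sketched as follows. This is only a minimal sketch, assuming liftr is installed: the `Sys.which()` lookup is an extra base R check that is not part of liftr, and `docker_on_path` is an illustrative variable name.

```r
# Extra base R check (not part of liftr): is a `docker` executable on the PATH?
docker_on_path = unname(nzchar(Sys.which("docker")))

if (!docker_on_path) {
  message("No docker executable found on the PATH; see liftr::install_docker().")
} else if (requireNamespace("liftr", quietly = TRUE)) {
  # The liftr helpers named above, called only when liftr is available
  liftr::check_docker_install()
  liftr::check_docker_running()
}
```

The guard around the liftr calls lets the sketch run on machines where liftr or Docker is not yet installed.
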
## Example document

Let's first create a new folder and copy the example R Markdown document into it:

```{r, eval = FALSE}
# Folder for this example (the trailing slash is needed by paste0() below)
path = "~/liftr-tidyverse/"
dir.create(path)
# Copy the example document shipped with liftr into the new folder
file.copy(system.file("examples/liftr-tidyverse.Rmd", package = "liftr"), path)
input = paste0(path, "liftr-tidyverse.Rmd")
```

If we open the R Markdown file, we will see that the header includes a `liftr` section, which defines the Docker system environment required to render this document. In our case, it is quite simple:

```yaml
---
title: "Explore tidyverse with liftr"
author: "Nan Xiao <>"
date: "`r Sys.Date()`"
output:
  rmarkdown::pdf_document:
    toc: true
    number_sections: true
liftr:
  from: "rocker/tidyverse:latest"
  maintainer: "Nan Xiao"
  email: "me@nanx.me"
  pandoc: false
  texlive: true
  cran:
    - nycflights13
---
```

Most of the fields are self-explanatory:

- We specified the latest `rocker/tidyverse` image as our base image, which saves us the time of creating a custom base image with all the tidyverse dependencies.
- The custom `pandoc` installation is disabled because the tidyverse image already includes `pandoc`.
- We included TeX Live since we intend to render a PDF file in the end.
- The CRAN data package `nycflights13` will be installed.

## Containerize the document

Let's containerize this document by generating a `Dockerfile` for it, using `liftr::lift()`:

```{r, eval = FALSE}
lift(input)
```

A file named `Dockerfile` will be generated in the same directory as the input R Markdown file. It contains the commands needed to build the Docker image used to render the document.

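Once `lift()` has run, the generated file can be inspected from R. The following is a minimal sketch, assuming the same `path` and `input` values as in the example above. `dockerfile` is an illustrative variable name, and the file only exists after `lift(input)` has been called.

```r
# Same locations as in the example above
path = "~/liftr-tidyverse/"
input = paste0(path, "liftr-tidyverse.Rmd")

# lift() writes `Dockerfile` next to the input document
dockerfile = file.path(dirname(input), "Dockerfile")

if (file.exists(dockerfile)) {
  # Print the generated build instructions for review
  writeLines(readLines(dockerfile))
} else {
  message("No Dockerfile found yet; run lift(input) first.")
}
```

Reading the file before rendering is a quick way to confirm that the base image and the `nycflights13` installation step match the `liftr` metadata.
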
## Render the document

We can use `render_docker()` to start the Docker container and render the document inside it:

```{r, eval = FALSE}
render_docker(input)
```

Let's view the rendered document:

```{r, eval = FALSE}
browseURL(paste0(path, "liftr-tidyverse.pdf"))
```

In the last section of the rendered PDF, we will see that the session information probably differs from that of your current system. That is because the document is generated entirely inside a newly built, isolated Linux system environment, using Docker. In this way, the R Markdown document gains a higher, system-level reproducibility, and can thus be easily replicated by other users who might not have the same system and R package environment as yours. This is a good thing for team collaboration and large-scale document orchestration. The best part is that all you need to share is still the document itself, with only a few extra metadata fields.

## Housekeeping

The Docker images stored on your system can take up a few gigabytes, and the total grows as you build more images. Let's remove the generated Docker image to save some disk space:

```{r, eval = FALSE}
prune_image(paste0(path, "liftr-tidyverse.docker.yml"))
```

If we do this, the Docker image will be rebuilt the next time you use `render_docker()`. If not, the image will stay cached on the system and be reused when compiling the document later, which saves you some time.

## References