How to Set Up an R / Python Virtual Environment Using Renv

It’s easier than you think to set up a virtual environment that can be used for both R and Python

Soon your R / Python packages will be working in perfect harmony. Photo from Wikimedia Commons.

Thanks to my coworker Miko Wieczorek for helping me with this one.

We have an agile team that uses both R and Python for machine learning projects, and having a single, reproducible environment for both has been a lifesaver.

Here’s how to set up a joint Python / R virtual environment.

Renv package

Renv is a powerful package manager built for R. renv::use_python() lets you integrate a Python virtual environment for use either with the reticulate package, or with native Python.

Renv does a good job of supporting both R and Python, so if your project needs both, it is probably the best tool to use. On the Python side, it supports both virtualenv and conda.

If you’re already familiar with using either virtualenv or conda for package management, this StackOverflow answer lists handy translations between pip, conda and renv commands.

In this situation, I was dropping in on Miko’s project from the Python side. He had been working on the project in RStudio and already used renv::snapshot() to save out the dependencies that he was using, and I wanted to do some modeling in Python with his existing virtual environment.

Snapshotting an R environment

Renv has a great tutorial on how to snapshot an environment.

From an existing R project, call
renv::snapshot()

This saves out two files: renv.lock, which stores the R dependencies, and requirements.txt, which stores the Python dependencies for a virtualenv (with conda, an environment.yml file is created instead).
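If you want to confirm that a snapshot actually captured the Python side, here is a minimal sketch. It assumes the lockfile records the Python settings under a "Python" key, and the helper name lockfile_has_python is made up for illustration; it does a loose text match rather than parsing the JSON properly.

```shell
# Hypothetical helper (not part of renv): succeed if the lockfile text
# contains a "Python" key. Assumption: renv stores Python settings under
# that key. This is a loose text match, not a real JSON parse.
lockfile_has_python() {
  lockfile="${1:-renv.lock}"
  [ -f "$lockfile" ] && grep -q '"Python"[[:space:]]*:' "$lockfile"
}

if lockfile_has_python renv.lock; then
  echo "renv.lock records a Python environment"
else
  echo "no Python entry found in renv.lock"
fi
```

If the second message appears in a project that should have Python dependencies, renv::use_python() may not have been called before the snapshot was taken.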

Of course, you can create a virtualenv from requirements.txt, but I found that I had package dependency issues when I did this, and thought it was safer to wholly duplicate the renv environment.

Restoring renv environment

It’s easiest to do this step from a terminal. You’ll need both R and Python installed.

First, make sure renv is already installed in your base R environment.
$ Rscript -e 'install.packages("renv")'

Make sure you’re in the project directory. You should see both requirements.txt and renv.lock. Use renv::restore() to load the project and follow instructions (you will be prompted to activate it if you’re setting it up for the first time).
$ Rscript -e "renv::restore()"
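Before running the restore, it can help to confirm you really are in the project directory. The sketch below is one way to do that for the virtualenv case described here (a conda project would have environment.yml rather than requirements.txt). The function name check_renv_project is hypothetical.

```shell
# Hypothetical helper: report any of the two expected files that are
# missing from a directory (default: the current one).
check_renv_project() {
  dir="${1:-.}"
  missing=0
  for f in renv.lock requirements.txt; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $f"
      missing=1
    fi
  done
  # Exit status 0 when both files exist, 1 otherwise
  return "$missing"
}

if check_renv_project .; then
  echo "ready for renv::restore()"
else
  echo "run this from the project directory"
fi
```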

At this point I ran into some package dependency issues: specifically, problems with the xml2 package and a conflict between TensorFlow and SciPy.

After this is run, you can check your Renv package status:
$ Rscript -e "renv::status()"

Activating Python environment

Running renv::restore() will create a “renv” folder in the project directory containing activate.R, which can be used for your future R needs. An .Rprofile file will also be created; it sources the activate.R script at project startup if you’re using an interactive session. In our case, we’re interested in the Python virtual environment, which will be created in renv/python/virtualenvs.

From the terminal, you can activate this Python environment by sourcing its activate script. In my case:
$ source ./renv/python/virtualenvs/renv-python-3.7.8/bin/activate
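The folder name includes the Python version, so it will differ between machines. Here is a minimal sketch for locating the activate script without hard-coding the version. It assumes the renv/python/virtualenvs/<name>/bin/activate layout shown above (macOS or Linux); find_renv_activate is a hypothetical helper name.

```shell
# Hypothetical helper: print the first virtualenv activate script found
# under a project directory (default: the current one).
find_renv_activate() {
  project="${1:-.}"
  for script in "$project"/renv/python/virtualenvs/*/bin/activate; do
    if [ -f "$script" ]; then
      echo "$script"
      return 0
    fi
  done
  return 1
}

if activate_script="$(find_renv_activate .)"; then
  # "." is the portable spelling of "source"
  . "$activate_script"
  echo "activated: $activate_script"
else
  echo "no virtualenv found under ./renv/python/virtualenvs"
fi
```

If the project has more than one virtualenv, this picks the first match in glob order, so check the printed path.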

Loading Jupyter Kernel

For this project I was using a Jupyter Notebook.

To use the virtualenv in Jupyter Notebook, you will need to install it as a kernel. After sourcing the activate script above in the terminal, run
$ python -m ipykernel install --user --name=my_virtualenv

After this, my_virtualenv will show up as a kernel you can select in the Launcher or under the “Kernel” tab. Verify and list all your kernels with the command
$ jupyter kernelspec list
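To check for one specific kernel rather than reading the whole list, something like the following could work. It is a sketch that assumes each kernel row of the output starts with the kernel name followed by its path; kernel_is_listed is a hypothetical helper name.

```shell
# Hypothetical helper: read `jupyter kernelspec list` output on stdin and
# succeed if a row starts with the given kernel name.
kernel_is_listed() {
  name="$1"
  # Assumed row shape: "  my_virtualenv    /path/to/kernels/my_virtualenv"
  awk -v n="$name" '$1 == n { found = 1 } END { exit !found }'
}

# Only run the check when jupyter is on the PATH
if command -v jupyter >/dev/null 2>&1; then
  if jupyter kernelspec list | kernel_is_listed my_virtualenv; then
    echo "kernel registered"
  else
    echo "kernel not found; rerun the ipykernel install step"
  fi
fi
```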

If you don’t have ipykernel installed, refer to this documentation: Installing the IPython kernel.

Takeaways

Managing package dependencies is always going to be a hassle, but hopefully this makes it a bit easier to manage projects with both R and Python.

Data scientist at Mayo Clinic. My views are entirely my own.