python in Rmarkdown using reticulate cannot read packages

I am using R on a MacBook. I have an Rmarkdown document and I’m trying to use reticulate in order to use python within R.

First I load the libraries:

```{r libraries, warning = FALSE, message = FALSE}

library(dplyr)
library(reticulate)

```

Next, in an R chunk, I check my working directory and then write mtcars to my desktop.

```{r chunk, warning = FALSE, message = FALSE}

getwd()

write.csv(mtcars, '/Users/name/Desktop/mtcars.csv', row.names = TRUE)

```

Then I try to use python instead to read in that csv that I just wrote to my desktop.

```{python}

import pandas as pd

mtcars = pd.read_csv('/Users/name/Desktop/mtcars.csv')

```

But I get this error:

ModuleNotFoundError: No module named 'pandas'
NameError: name 'pd' is not defined

So I went to this R documentation website and learned that Python packages have to be installed separately from R packages. So I opened Terminal and typed in

python -m pip install pandas

It seemed to install OK. But when I return to my Rmarkdown document, I still can't get the python code to run and read in the csv. I get the same error message.

I also saw a similar question on this SO post, but I'm certain that my RStudio version is newer than the version in that question, so I don't think the answer addresses exactly the same error.

Answer

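The `python` that pip ran under in Terminal is probably not the interpreter reticulate is using, so the pandas installed there is not visible from the Rmarkdown document. An option is to create a virtualenv with reticulate, install the package into it, and then specify that virtualenv to be used: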

library(reticulate)
virtualenv_create("py-proj")
py_install("pandas", envname = "py-proj")
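
To check which Python a `{python}` chunk is actually running, and whether it can see pandas, something like the following could be run inside a chunk. This is only a minimal sketch using the standard library; `describe_python` is a hypothetical helper name, not part of reticulate, and the value of `sys.executable` under an embedded interpreter may vary, which is why `sys.prefix` is printed as well. From the R side, reticulate's `py_config()` reports similar information.

```python
import importlib.util
import sys


def describe_python(module_name="pandas"):
    """Report which Python is running and whether module_name can be found."""
    # find_spec returns None when a top-level module is not installed
    # for this interpreter; it does not import the module itself.
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "module_found": spec is not None,
        "module_location": spec.origin if spec is not None else None,
    }


for key, value in describe_python("pandas").items():
    print(f"{key}: {value}")
```

If `module_found` is `False`, or the paths do not point into the `py-proj` environment, the chunk is most likely running a different interpreter from the one pandas was installed into.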

In the Rmarkdown document, we can then use

---
title: "Testing"
output:
  pdf_document: default
  html_document: default
---

```{r libraries, warning = FALSE, message = FALSE}
library(reticulate)
use_virtualenv("py-proj")
```

```{r chunk, warning = FALSE, message = FALSE}
write.csv(mtcars, "/Users/name/Desktop/mtcars.csv", row.names = TRUE)
```

```{python}
import pandas as pd
mtcars = pd.read_csv("/Users/name/Desktop/mtcars.csv")
mtcars.head(5)
```
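
One side note on the csv itself: because it is written with `row.names = TRUE`, the car names end up in a first column with an empty header, which pandas reads as `Unnamed: 0`. Passing `index_col=0` is one way to turn that column back into the index. The sketch below uses a small in-memory stand-in for the file (two rows and two columns of mtcars, in the shape `write.csv` produces) rather than the real path, so it can be run anywhere pandas is installed.

```python
import io

import pandas as pd

# Stand-in for the file written by write.csv(mtcars, ..., row.names = TRUE):
# the first header field is empty because row names have no column name.
# Only two columns and two rows of mtcars are reproduced here.
sample_csv = io.StringIO(
    '"","mpg","cyl"\n'
    '"Mazda RX4",21,6\n'
    '"Datsun 710",22.8,4\n'
)

# Default read: the row names become an ordinary column.
default_read = pd.read_csv(sample_csv)

# Rewind the buffer and read again, using the first column as the index.
sample_csv.seek(0)
indexed_read = pd.read_csv(sample_csv, index_col=0)

print(list(default_read.columns))
print(list(indexed_read.columns))
print(indexed_read.loc["Datsun 710", "mpg"])
```

With the real file, the equivalent call would be `pd.read_csv("/Users/name/Desktop/mtcars.csv", index_col=0)`.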
