I am working on a Jupyter notebook from AWS EMR.
I am able to do this:
pd.read_csv("s3://mypath/xyz.csv")
However, if I try to open a pickle file like this:
pd.read_pickle("s3://mypath/xyz.pkl")
I get this error:
[Errno 2] No such file or directory: 's3://pvarma1/users/users/candidate_users.pkl'
Traceback (most recent call last):
  File "/usr/local/lib64/python2.7/site-packages/pandas/io/pickle.py", line 179, in read_pickle
    return try_read(path)
  File "/usr/local/lib64/python2.7/site-packages/pandas/io/pickle.py", line 177, in try_read
    lambda f: pc.load(f, encoding=encoding, compat=True))
  File "/usr/local/lib64/python2.7/site-packages/pandas/io/pickle.py", line 146, in read_wrapper
    is_text=False)
  File "/usr/local/lib64/python2.7/site-packages/pandas/io/common.py", line 421, in _get_handle
    f = open(path_or_buf, mode)
IOError: [Errno 2] No such file or d
However, I can see both xyz.csv and xyz.pkl in the same path! Can anyone help?
Answer
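In the pandas version shown in the traceback, read_pickle supports only local paths, unlike read_csv: it passes the path straight to open(), which cannot resolve an s3:// URL. So you should copy the pickle file to your machine before reading it with pandas.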
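One way to do the copy step from inside the notebook is sketched below. It assumes boto3 is installed and that the EMR instance role (or your configured credentials) can read the bucket. The helper name read_pickle_from_s3 is made up for this example and is not part of pandas or boto3.

```python
import os
import tempfile

import pandas as pd


def read_pickle_from_s3(bucket, key, s3_client=None):
    """Download s3://<bucket>/<key> to a temporary local file, then unpickle it.

    Hypothetical helper for illustration; not a pandas or boto3 API.
    """
    if s3_client is None:
        # Assumption: boto3 is available and AWS credentials are configured.
        import boto3
        s3_client = boto3.client("s3")

    # Reserve a local temp path; close the descriptor so the file can be rewritten.
    fd, local_path = tempfile.mkstemp(suffix=".pkl")
    os.close(fd)
    try:
        # Copy the object from S3 to the local path.
        s3_client.download_file(bucket, key, local_path)
        # read_pickle now receives an ordinary local path.
        return pd.read_pickle(local_path)
    finally:
        # Clean up the temporary copy whether or not loading succeeded.
        os.remove(local_path)


# Example call, using the placeholder names from the question:
# df = read_pickle_from_s3("mypath", "xyz.pkl")
```

The temporary file is deleted after loading, so nothing is left behind on the cluster node. As with any pickle, only load files from a source you trust.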