
Tag: pandas

Pandas .isin on column entries containing lists

I am trying to filter a dataframe with isin() by passing in a list and comparing it against a dataframe column whose entries are also lists. This extends the question "How to implement ‘in’ and ‘not in’ for Pandas dataframe". For example, instead of having one countr…
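
A minimal sketch of one way to do this kind of filtering, assuming each cell of the column holds a Python list. The frame, the column name `tags`, and the `wanted` values are hypothetical, since the excerpt is cut off before its own example. `isin()` compares whole cell values, so the sketch tests each list for overlap with the wanted values instead:

```python
import pandas as pd

# Hypothetical data: each cell in "tags" is a list.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "tags": [["US", "UK"], ["DE"], ["FR", "US"]],
})
wanted = {"UK", "DE"}  # hypothetical values to look for

# Option 1: per-row set test. True when the row's list shares any element with `wanted`.
mask = df["tags"].apply(lambda items: not wanted.isdisjoint(items))

# Option 2: explode the lists to one element per row, use isin(), then fold back per original row.
mask_exploded = df["tags"].explode().isin(wanted).groupby(level=0).any()

matching = df[mask]       # rows whose list contains a wanted value ("in")
not_matching = df[~mask]  # rows whose list contains none of them ("not in")
```

Both masks select the same rows here. The explode variant stays within pandas operations, while the `apply` variant is easier to adapt to other conditions, such as requiring every wanted value to be present.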

Pandas read_pickle from s3 bucket

I am working in a Jupyter notebook on AWS EMR. I am able to do this: pd.read_csv("s3:\mypath\xyz.csv"). However, if I try to open a pickle file the same way, pd.read_pickle("s3:\mypath\xyz.pkl"), I get this error: However, I can see both xyz.csv and xyz.pkl in the same path! Can a…
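
The excerpt does not show the error, so the following is only a sketch of one possible workaround, not a diagnosis: download the object's bytes yourself and hand pandas a file-like buffer. Whether `pd.read_pickle` can open an S3 URL directly depends on the pandas version and the filesystem packages installed, which the excerpt does not state. The helper names, bucket, and key below are hypothetical, and the S3 call assumes `boto3` is installed and credentials are configured:

```python
import io
import pickle

import pandas as pd


def frame_from_pickle_bytes(raw: bytes) -> pd.DataFrame:
    """Unpickle a DataFrame from raw bytes via an in-memory buffer."""
    return pd.read_pickle(io.BytesIO(raw))


def read_pickle_from_s3(bucket: str, key: str) -> pd.DataFrame:
    """Hypothetical helper: fetch an object with boto3, then unpickle it."""
    import boto3  # imported lazily so the rest of the sketch runs without it

    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"]
    return frame_from_pickle_bytes(body.read())


# Local round trip that exercises the decoding step without touching S3.
original = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})
restored = frame_from_pickle_bytes(pickle.dumps(original))
```

Only unpickle data from sources you trust, since loading a pickle can execute arbitrary code.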

Python: concatenate pandas multiindex

I need to generate a pd.DataFrame whose columns are composed of a list and a MultiIndex object, and I need to do it before filling the final dataframe with data. Say the columns are [‘one’, ‘two’] and the multiindex obtained from from_product: I would like to get a list of columns whi…
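
A hedged sketch of one way to combine the two kinds of column labels, assuming the flat names should be padded with an empty string on the second level. The excerpt is truncated before it shows the desired result, so the padding choice and the level values `a`, `b`, `x`, `y` are assumptions made for illustration:

```python
import pandas as pd

flat_columns = ["one", "two"]
# Hypothetical level values; the excerpt does not show the real ones.
multi = pd.MultiIndex.from_product([["a", "b"], ["x", "y"]])

# Pad each flat name to a 2-tuple so every label has the same number of levels.
padded = [(name, "") for name in flat_columns]

# Iterating a MultiIndex yields tuples, so the two lists can simply be joined.
columns = pd.MultiIndex.from_tuples(padded + list(multi))

# Empty frame with the combined columns, ready to be filled later.
df = pd.DataFrame(columns=columns)
```

Padding keeps every label at two levels, so `df["one"]` and `df["a"]` can both be selected from the same frame.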

Fast way to cyclically wrap values in pandas dataframe

In words: I have a data frame that consists of values over a day, for multiple days per Userid. I’d like to shift all of the data for certain users by 1 period, so that the first value in their first column is a NaN, and then everything is cyclically offset, with the last value truncated or lost to spa…
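
A minimal sketch of the shift described above, assuming one row per user per day, one column per period, and rows already in chronological order within each user. The column names `p0` to `p2` and the helper `shift_user` are hypothetical. The idea is to flatten a user's rows into one long sequence, shift it by one position, and reshape it back:

```python
import numpy as np
import pandas as pd

# Hypothetical layout: one row per (Userid, day), one column per period of the day.
df = pd.DataFrame({
    "Userid": [1, 1, 2],
    "p0": [1.0, 4.0, 7.0],
    "p1": [2.0, 5.0, 8.0],
    "p2": [3.0, 6.0, 9.0],
})
value_cols = ["p0", "p1", "p2"]


def shift_user(frame, user, cols):
    """Shift one user's values by one period across row boundaries."""
    out = frame.copy()
    rows = out["Userid"] == user
    block = out.loc[rows, cols].to_numpy(dtype=float)
    flat = block.ravel()  # row-major: day 1 periods, then day 2 periods, ...
    shifted = np.concatenate(([np.nan], flat[:-1]))  # NaN enters, last value drops out
    out.loc[rows, cols] = shifted.reshape(block.shape)
    return out


result = shift_user(df, user=1, cols=value_cols)
```

To shift several users, the helper can be called once per user id. The work per user is a single NumPy reshape rather than a Python loop over cells.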