Select rows from a pandas dataframe using a set of values

I have a dataframe with a column named label_id that holds string values. I also have a set of label_id values called required_labels. I would like to select the rows of the dataframe whose label_id value is contained in the set.

I understand that I need to use df.loc for this, but when I try to generate a boolean mask that I can pass into df.loc, I get the error below:

boolean_mask = df['label_id'] in required_labels
TypeError: 'Series' objects are mutable, thus they cannot be hashed

What is the right way to do this?

Answer

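Use the Series built-in method .isin(). The in operator asks the set whether it contains the whole Series as a single element, which requires hashing the Series and raises the TypeError. .isin() instead tests each value and returns a boolean Series. The correct syntax is: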

boolean_mask = df['label_id'].isin(required_labels)
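
For completeness, here is a minimal end-to-end sketch of passing that mask into df.loc. The sample rows and the value column are made up for illustration; only label_id and required_labels come from the question.

```python
import pandas as pd

# Hypothetical sample data; the "value" column is an assumption for illustration.
df = pd.DataFrame({
    "label_id": ["a", "b", "c", "a"],
    "value": [1, 2, 3, 4],
})
required_labels = {"a", "c"}

# Element-wise membership test: one True/False per row.
boolean_mask = df["label_id"].isin(required_labels)

# Keep only the rows whose label_id is in the set.
selected = df.loc[boolean_mask]

# Invert the mask with ~ to keep the rows whose label_id is NOT in the set.
excluded = df.loc[~boolean_mask]

print(selected)
print(excluded)
```

The mask is an ordinary boolean Series aligned with df's index, so it can be combined with other conditions using & and | before being passed to df.loc.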
User contributions licensed under: CC BY-SA