I have a dataframe with a column named label_id, which holds string values. I also have a set of label_id values in required_labels. I would like to select the rows of the dataframe where the label_id value is contained in the set.
I understand that I need to use df.loc for this, but when I try to generate a boolean mask that I can pass into df.loc, I get the error below:
boolean_mask = df['label_id'] in required_labels
TypeError: 'Series' objects are mutable, thus they cannot be hashed
What is the right way to do this?
Answer
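Use the Series method .isin(). The in operator fails here because it asks the set whether the whole Series is one of its members, which requires hashing the Series. .isin() instead tests each value individually and returns a boolean Series.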
The correct syntax is:
boolean_mask = df['label_id'].isin(required_labels)
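As a minimal sketch of the full selection, here is the mask passed into df.loc. The sample data and the value column are made up for illustration; only df, label_id, required_labels and boolean_mask come from the question.

```python
import pandas as pd

# Hypothetical sample data, not from the original question.
df = pd.DataFrame({
    "label_id": ["a", "b", "c", "a", "d"],
    "value": [1, 2, 3, 4, 5],
})
required_labels = {"a", "c"}

# Element-wise membership test: one True/False per row.
boolean_mask = df["label_id"].isin(required_labels)

# Rows whose label_id is in the set.
selected = df.loc[boolean_mask]

# Negate the mask with ~ to get the rows whose label_id is NOT in the set.
excluded = df.loc[~boolean_mask]
```

If you only need the filtered rows, the mask does not have to be stored first: df.loc[df["label_id"].isin(required_labels)] should give the same result.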