I have a dataframe with a column named label_id that holds string values. I also have a set of label_id values called required_labels. I would like to select the rows of the dataframe whose label_id value is contained in the set.
I understand that I need to use df.loc for this, but when I try to build a boolean mask to pass to df.loc, I get the following error:
boolean_mask = df['label_id'] in required_labels
TypeError: 'Series' objects are mutable, thus they cannot be hashed
What is the right way to do this?
Answer
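Use the Series method .isin(). The in operator fails here because Python tries to hash the whole Series in order to look it up in the set, and a Series is not hashable. .isin() instead tests each element against the set and returns a boolean Series.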
The correct syntax is:
boolean_mask = df['label_id'].isin(required_labels)
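As a minimal sketch of how the mask can then be passed to df.loc, here is a small made-up example. The sample labels and the value column are invented for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample data; only the label_id column name comes from the question.
df = pd.DataFrame({
    "label_id": ["a", "b", "c", "a", "d"],
    "value": [1, 2, 3, 4, 5],
})
required_labels = {"a", "c"}

# Element-wise membership test: True where label_id is in the set.
boolean_mask = df["label_id"].isin(required_labels)

# Keep only the matching rows.
selected = df.loc[boolean_mask]

# Negate the mask with ~ to get the rows whose label_id is NOT in the set.
excluded = df.loc[~boolean_mask]
```

Since the mask is an ordinary boolean Series, it can also be combined with other conditions using & and | before being passed to df.loc.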