I have a list of problematic rows, each with a unique identifier (GUID), and I want to remove all of them from a dataframe.
I’ve tried to use loc to index them, as follows:
df.loc[df['GUID'] != toDel['GUID']]
where df is 5063 rows x 28 columns and toDel['GUID'] is a list of GUIDs that I want to remove from the df.
I expected this to give me a df that doesn’t include the problematic rows. However, I get a 'ValueError: Can only compare identically-labeled Series objects.' I guess this means they have to be identically sized Series, but then how do I get rid of the problematic GUIDs using this toDel['GUID'] list?
Answer
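To drop the rows whose GUID is in toDel['GUID'], build a boolean mask with isin and negate it with ~:
df.loc[~df['GUID'].isin(toDel['GUID'])]
Without the ~, the same expression keeps only the rows whose GUID is in toDel['GUID']. The != comparison fails because it compares two Series label by label, so both must have the same index; isin tests membership instead.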
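As a minimal sketch of how the isin mask behaves in both directions, here is a small example. The frames and GUID values are made up for illustration, and toDel is assumed to be a DataFrame with a GUID column, as in the question.

```python
import pandas as pd

# Hypothetical sample data; the real df has 5063 rows and 28 columns.
df = pd.DataFrame({"GUID": ["a1", "b2", "c3", "d4"], "value": [10, 20, 30, 40]})
toDel = pd.DataFrame({"GUID": ["b2", "d4"]})

# Boolean mask: True where the row's GUID appears in toDel['GUID'].
mask = df["GUID"].isin(toDel["GUID"])

kept = df.loc[~mask]     # rows whose GUID is NOT in toDel (problem rows dropped)
removed = df.loc[mask]   # rows whose GUID IS in toDel

print(kept)
print(removed)
```

Because isin only checks membership, it should not matter that df and toDel have different lengths or index labels.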