Drop rows from dataframe where problematic values are in separate list

I have a list of problematic rows where there is a unique identifier, all of which I want to remove from a dataframe.

I’ve tried to use loc to index them, as follows:

df.loc[df['GUID'] != toDel['GUID']]

where df is 5063 rows x 28 cols and toDel['GUID'] is a list of GUIDs that I want to remove from the df.

I expected this to give me a df that doesn’t include the problematic rows. However, I get a 'ValueError: Can only compare identically-labeled Series objects.' I guess this means they have to be identically sized Series, but then how do I get rid of the problematic GUIDs using this toDel['GUID'] list?

Answer

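To keep only rows where GUID is not in toDel['GUID'] (that is, to drop the problematic rows), negate the isin mask with ~:

df.loc[~df['GUID'].isin(toDel['GUID'])]

Without the ~, the same expression keeps only the rows whose GUID is in toDel['GUID'].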
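Since the question asks to remove the listed rows, here is a minimal sketch of both directions on made-up data. The column values and the shape of toDel are assumptions for illustration; only the names df, toDel and GUID come from the question. Comparing two Series with != aligns them by index label, which is why differently labeled Series raise the ValueError, whereas isin tests membership element by element and does not care about length or index.

```python
import pandas as pd

# Hypothetical sample data; the real df has 5063 rows x 28 columns.
df = pd.DataFrame({"GUID": ["a1", "b2", "c3", "d4"],
                   "value": [10, 20, 30, 40]})
toDel = pd.DataFrame({"GUID": ["b2", "d4"]})

# Boolean mask: True where the row's GUID appears in toDel['GUID'].
mask = df["GUID"].isin(toDel["GUID"])

kept = df.loc[~mask]    # rows whose GUID is NOT in toDel (problematic rows dropped)
dropped = df.loc[mask]  # rows whose GUID is in toDel

print(kept)
```

Neither line modifies df in place; assign the result back (df = df.loc[~mask]) if the filtered frame should replace the original.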
