
How to drop rows from a pandas dataframe based on a pre-made list

I have a big dataset about news reading that I'm trying to clean. I created a checklist of the cities I want to keep (the dataset contains every city). How can I drop rows based on that checklist? For example, if my checklist is a list containing all the French cities, how can I drop the rows for every other city?

To picture the data frame (I have 1.5m rows btw):

JavaScript


Answer

You can do this using isin (here pandas.Series.isin, since df['City'] is a single column). It returns a boolean Series indicating whether each element is in the list x. You can then use that boolean Series to select only the rows that return True: df[df['City'].isin(x)]. Following is my solution:

JavaScript

Output:

JavaScript
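
A minimal sketch of the isin approach described in this answer. The sample rows, the Reads column, and the contents of the checklist x are made up for illustration; only the City column name and the df[df['City'].isin(x)] pattern come from the answer itself.

```python
import pandas as pd

# Hypothetical sample data; the real dataset has ~1.5m rows.
df = pd.DataFrame({
    "City": ["Paris", "Berlin", "Lyon", "Madrid", "Paris"],
    "Reads": [10, 4, 7, 3, 8],  # illustrative column, not from the question
})

# Hypothetical checklist of cities to keep.
x = ["Paris", "Lyon", "Marseille"]

# Boolean Series: True where the row's City appears in the checklist.
mask = df["City"].isin(x)

# Keep only the rows whose City is in the checklist.
kept = df[mask]

# The inverse selection: rows whose City is NOT in the checklist.
dropped = df[~mask]

print(kept)
print(dropped)
```

The filtered frame keeps the original index labels; call kept.reset_index(drop=True) if you want them renumbered from 0.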