Filter string elements from a list using another list

Question

I have a DataFrame df with 301501 rows, where each row of one column holds a list of strings of varying length. I have also stored a list of female names in another list called f_name. I want to create another column in df that keeps only the elements found in f_name.

Accepted Answer

Assuming your item column actually contains lists of strings (and aren't just strings that look like lists, e.g. '[1, 2, 3]'), cast f_name to set and perform set intersection:

    f_name = set(f_name)
    df["item"].apply(f_name.intersection)

Demo:

    In [3]: df
    Out[3]:
                                   item
    0                      [Tom, David]
    1          [Robert, Jennifer, Jane]
    2           [Robert, Tom, Patricia]
    3  [Thomas, David, Chloe, Michelle]

    In [4]: f_name = {"Jane", "Michelle", "Patricia", "Jennifer", "Chloe"}

    In [5]: df.item.apply(f_name.intersection)
    Out[5]:
    0                   {}
    1     {Jane, Jennifer}
    2           {Patricia}
    3    {Michelle, Chloe}
    Name: item, dtype: object
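Since the question asks for a new column, here is a minimal sketch of assigning the result back to df. The column names female_set and female_list are hypothetical, chosen only for illustration. The second variant is not part of the answer above: it swaps the set intersection for a list comprehension, which may be preferable if the original order or duplicate names need to be preserved, because a set keeps neither.

```python
import pandas as pd

# Sample data mirroring the demo; the real df has many more rows.
df = pd.DataFrame({
    "item": [
        ["Tom", "David"],
        ["Robert", "Jennifer", "Jane"],
        ["Robert", "Tom", "Patricia"],
        ["Thomas", "David", "Chloe", "Michelle"],
    ]
})
f_name = ["Jane", "Michelle", "Patricia", "Jennifer", "Chloe"]

# Build the set once so each membership test is O(1) on average.
f_set = set(f_name)

# Variant 1: set intersection, as in the answer. Result is an unordered set.
df["female_set"] = df["item"].apply(f_set.intersection)

# Variant 2 (assumption, not from the answer): a list comprehension that
# keeps the original order and any duplicates within each row.
df["female_list"] = df["item"].apply(
    lambda names: [name for name in names if name in f_set]
)

print(df[["item", "female_list"]])
```

Both variants call a Python function once per row through apply, so for the roughly 300,000 rows mentioned in the question the cost should be dominated by the per-row list processing rather than by the set lookups.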

