Python/Pandas bring value based on another DF

I have two dataframes, shown below:

    Key_words   Possiblities
0   ar            NaN
1   va            NaN
2   eb            NaN
3   ne            NaN
4   ke            NaN

    id  first_name  last_name   email
0   7840    Avery   Beldon  abeldon0@cyberchimps.com
1   7840    Emilie  Anton   eanton1@hp.com
2   7840    Corine  Gabey   cgabey2@state.tx.us
3   7840    Noak    Lowdyane    nlowdyane3@dot.gov
4   9907    Yetta   Kornilov    ykornilov4@smugmug.com

I am trying to fill df["Possiblities"] with the values of df2["first_name"] that contain the corresponding value from df["Key_words"], using this code:

for i in range(0,len(df["Key_words"])):
    df["Possiblities"].loc[i]=list(df2["first_name"][df2["first_name"].str.contains(df["Key_words"].loc[i])])

It returns what I expect, but it also gives a warning:

"SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame"

What should I do instead of using a for loop? Is there a more practical or more correct way?

Answer

Use a custom lambda function with a generator expression and join to combine multiple matching values into one string. If the match should be case-insensitive, convert both values to lowercase:

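# For each keyword, join every first name that contains it (case-insensitive)
f = lambda x: ','.join(y for y in df2["first_name"] if x.lower() in y.lower())
# Assign the whole column at once instead of setting values row by row
df["Possiblities"] = df["Key_words"].apply(f)
print(df)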
  Key_words Possiblities
0        ar             
1        va             
2        eb             
3        ne       Corine
4        ke             
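
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The sample frames are rebuilt from the question with the email column left out, and `matching_names` and the extra `Possiblities_list` column are hypothetical names added only for illustration. The warning in the question most likely comes from the chained assignment `df["Possiblities"].loc[i] = ...`; assigning the whole column in one step avoids that pattern.

```python
import pandas as pd

# Sample data reconstructed from the question (email column omitted for brevity).
df = pd.DataFrame({"Key_words": ["ar", "va", "eb", "ne", "ke"]})
df2 = pd.DataFrame({
    "id": [7840, 7840, 7840, 7840, 9907],
    "first_name": ["Avery", "Emilie", "Corine", "Noak", "Yetta"],
    "last_name": ["Beldon", "Anton", "Gabey", "Lowdyane", "Kornilov"],
})


def matching_names(keyword, names):
    """Return every name that contains keyword, ignoring case (hypothetical helper)."""
    key = keyword.lower()
    return [name for name in names if key in name.lower()]


names = df2["first_name"].tolist()

# Whole-column assignment: no chained indexing, so no SettingWithCopyWarning.
df["Possiblities"] = df["Key_words"].apply(
    lambda k: ",".join(matching_names(k, names))
)

# Variant that keeps a list per row, closer to what the loop in the question built.
df["Possiblities_list"] = df["Key_words"].apply(
    lambda k: matching_names(k, names)
)

print(df)
```

The joined-string column is easier to print or export, while the list column is handier if the matches need further processing.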
User contributions licensed under: CC BY-SA