Is there any possibility to create a new column based on the keywords list?
Keywords = ["A", "B"]
I have data like this:
Location Type Ger A Ger F Ger C Ned D Ned A Ned B Aus C US B
I would like to create a new column if the keyword exists in the Type column and if 2 keywords exist then the value should both keyword with a comma. I am having a problem because I have to check also location first and then type…
Location Type NewType Ger A A Ger F A Ger C A Ned D A,B Ned A A,B Ned B A,B Aus C NaN US B B
Is there any way other than if-else
?
Advertisement
Answer
Let’s use groupby
and map
:
m = df['Type'].isin(keywords) s = df[m].groupby('Location')['Type'].agg(','.join) df['NewType'] = df['Location'].map(s)
Details:
Create boolean mask with .isin
to test for the values in Type
that exists in keywords
list:
print(m) 0 True 1 False 2 False 3 False 4 True 5 True 6 False 7 True Name: Type, dtype: bool
filter the rows using the above mask and groupby
on Location
then agg
Type
using join
:
print(s) Location Ger A Ned A,B US B Name: Type, dtype: object
.map
the values from the above aggregated frame to the original df
based on Location
print(df) Location Type NewType 0 Ger A A 1 Ger F A 2 Ger C A 3 Ned D A,B 4 Ned A A,B 5 Ned B A,B 6 Aus C NaN 7 US B B