Is there any possibility to create a new column based on the keywords list?
Keywords = ["A", "B"]
I have data like this:
Location Type Ger A Ger F Ger C Ned D Ned A Ned B Aus C US B
I would like to create a new column if the keyword exists in the Type column and if 2 keywords exist then the value should both keyword with a comma. I am having a problem because I have to check also location first and then type…
Location Type NewType Ger A A Ger F A Ger C A Ned D A,B Ned A A,B Ned B A,B Aus C NaN US B B
Is there any way other than if-else?
Advertisement
Answer
Let’s use groupby and map:
m = df['Type'].isin(keywords)
s = df[m].groupby('Location')['Type'].agg(','.join)
df['NewType'] = df['Location'].map(s)
Details:
Create boolean mask with .isin to test for the values in Type that exists in keywords list:
print(m) 0 True 1 False 2 False 3 False 4 True 5 True 6 False 7 True Name: Type, dtype: bool
filter the rows using the above mask and groupby on Location then agg Type using join:
print(s) Location Ger A Ned A,B US B Name: Type, dtype: object
.map the values from the above aggregated frame to the original df based on Location
print(df) Location Type NewType 0 Ger A A 1 Ger F A 2 Ger C A 3 Ned D A,B 4 Ned A A,B 5 Ned B A,B 6 Aus C NaN 7 US B B
