Skip to content
Advertisement

Is there any possibility to create a new column based on the keywords list

Is there any possibility to create a new column based on the keywords list?

Keywords = ["A", "B"]

I have data like this:

Location             Type          
Ger                  A          
Ger                  F
Ger                  C     
Ned                  D
Ned                  A
Ned                  B
Aus                  C
US                   B

I would like to create a new column if the keyword exists in the Type column and if 2 keywords exist then the value should both keyword with a comma. I am having a problem because I have to check also location first and then type…

Location             Type          NewType
Ger                  A             A
Ger                  F             A
Ger                  C             A
Ned                  D             A,B
Ned                  A             A,B
Ned                  B             A,B
Aus                  C             NaN
US                   B             B

Is there any way other than if-else?

Advertisement

Answer

Let’s use groupby and map:

m = df['Type'].isin(keywords)
s = df[m].groupby('Location')['Type'].agg(','.join)
df['NewType'] = df['Location'].map(s)

Details:

Create boolean mask with .isin to test for the values in Type that exists in keywords list:

print(m)

0     True
1    False
2    False
3    False
4     True
5     True
6    False
7     True
Name: Type, dtype: bool

filter the rows using the above mask and groupby on Location then agg Type using join:

print(s)

Location
Ger      A
Ned    A,B
US       B
Name: Type, dtype: object

.map the values from the above aggregated frame to the original df based on Location

print(df)

  Location Type NewType
0      Ger    A       A
1      Ger    F       A
2      Ger    C       A
3      Ned    D     A,B
4      Ned    A     A,B
5      Ned    B     A,B
6      Aus    C     NaN
7       US    B       B
User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement