Is it possible to create a new column based on a list of keywords?
    Keywords = ["A", "B"]
I have data like this:
    Location Type
    Ger      A
    Ger      F
    Ger      C
    Ned      D
    Ned      A
    Ned      B
    Aus      C
    US       B
I would like to create a new column that holds the keyword if it exists in the Type column for that location; if two keywords exist, the value should be both keywords separated by a comma. I am having a problem because I also have to check the location first and then the type…
    Location Type NewType
    Ger      A    A
    Ger      F    A
    Ger      C    A
    Ned      D    A,B
    Ned      A    A,B
    Ned      B    A,B
    Aus      C    NaN
    US       B    B
Is there any way to do this other than if-else?
Answer
Let’s use groupby and map:
    m = df['Type'].isin(keywords)
    s = df[m].groupby('Location')['Type'].agg(','.join)
    df['NewType'] = df['Location'].map(s)
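For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample data in the question, and the lowercase name keywords is assumed to hold the list from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Location": ["Ger", "Ger", "Ger", "Ned", "Ned", "Ned", "Aus", "US"],
    "Type": ["A", "F", "C", "D", "A", "B", "C", "B"],
})
keywords = ["A", "B"]  # assumed lowercase name, as used in the answer

# True where the row's Type is one of the keywords.
m = df["Type"].isin(keywords)

# For each Location, join the matching Types with a comma.
s = df[m].groupby("Location")["Type"].agg(",".join)

# Look up each row's Location in s; locations with no keyword get NaN.
df["NewType"] = df["Location"].map(s)

print(df)
```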
Details:
Create a boolean mask with .isin to test for the values in Type that exist in the keywords list:
    print(m)

    0     True
    1    False
    2    False
    3    False
    4     True
    5     True
    6    False
    7     True
    Name: Type, dtype: bool
Filter the rows using the above mask, group by Location, then aggregate Type using join:
    print(s)

    Location
    Ger      A
    Ned    A,B
    US       B
    Name: Type, dtype: object
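One thing to watch with a plain join: it keeps every matching row in row order, so a location that lists the same keyword twice would repeat it. The sample data has no such case, so the sketch below uses hypothetical data, and the deduplicate-and-sort step is an assumption about what you might want rather than part of the original answer.

```python
import pandas as pd

# Hypothetical data: "Ned" lists keyword "B" twice and out of order.
df = pd.DataFrame({
    "Location": ["Ned", "Ned", "Ned", "US"],
    "Type": ["B", "A", "B", "B"],
})
keywords = ["A", "B"]

m = df["Type"].isin(keywords)

# Plain join keeps every matching row, in row order.
plain = df[m].groupby("Location")["Type"].agg(",".join)

# Deduplicate and sort within each group before joining.
unique_sorted = df[m].groupby("Location")["Type"].agg(
    lambda types: ",".join(sorted(set(types)))
)

print(plain)
print(unique_sorted)
```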
Use .map to map the values from the above aggregated Series onto the original df based on Location:
    print(df)

      Location Type NewType
    0      Ger    A       A
    1      Ger    F       A
    2      Ger    C       A
    3      Ned    D     A,B
    4      Ned    A     A,B
    5      Ned    B     A,B
    6      Aus    C     NaN
    7       US    B       B
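The NaN for Aus comes from how .map works when it is given a Series: each value is looked up in that Series' index, and values with no match become NaN. A small sketch of just that step, with the aggregated Series rebuilt by hand for illustration; the fillna call at the end is an optional extra, not part of the original answer.

```python
import pandas as pd

# The aggregated Series from the groupby step, rebuilt by hand.
s = pd.Series({"Ger": "A", "Ned": "A,B", "US": "B"})

locations = pd.Series(["Ger", "Ned", "Aus", "US"])

# Each value is looked up in s's index; "Aus" is not there, so it becomes NaN.
mapped = locations.map(s)

# Optional: replace NaN with an empty string if a placeholder is preferred.
filled = mapped.fillna("")

print(mapped)
print(filled)
```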