
Python dataframe: count Pass/Fail per country when one column holds comma-delimited strings

How can I count, for a country that appears in several rows, how many times it passed or failed?

The data looks like this:

ID unique   Countries                       Test
1           Spain, Netherlands              Fail
2           Italy                           Pass
3           France, Netherlands             Pass
4           Belgium, France, Bulgaria       Fail
5           Belgium, United Kingdom         Pass
6           Netherlands, France             Pass
7           France, Netherlands, Belgium    Pass

The result should look like this:

             Pass   Fail
Spain           0   1
Italy           1   0
France          3   1
Netherlands     3   1
Belgium         2   1
United Kingdom  1   0

For example, Netherlands appears in 4 rows, with 3 passes and 1 fail.

Answer

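Use Series.str.split to turn each Countries string into a list, DataFrame.explode to get one row per country, and finally pd.crosstab to count the Test values per country: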

import pandas as pd
# One row per country, then a Countries x Test frequency table
df1 = df.assign(Countries=df['Countries'].str.split(', ')).explode('Countries')
df2 = pd.crosstab(df1['Countries'], df1['Test'])
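
For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the question's table and applies the same split / explode / crosstab steps. The DataFrame is reconstructed by hand from the question, and row 7 assumes the truncated "Belgiu" was meant to be "Belgium". Splitting on "," and stripping whitespace afterwards is my own variation, so that irregular spacing after the commas does not create separate country labels. Note that Bulgaria also shows up in the result (0 Pass, 1 Fail) because it appears in row 4.

```python
import pandas as pd

# Sample data reconstructed from the question
# (assumption: "Belgiu" in row 7 is a truncated "Belgium").
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7],
    "Countries": [
        "Spain, Netherlands",
        "Italy",
        "France, Netherlands",
        "Belgium, France, Bulgaria",
        "Belgium, United Kingdom",
        "Netherlands, France",
        "France, Netherlands, Belgium",
    ],
    "Test": ["Fail", "Pass", "Pass", "Fail", "Pass", "Pass", "Pass"],
})

# Split each string into a list, expand to one row per country,
# then strip stray whitespace around the country names.
exploded = (
    df.assign(Countries=df["Countries"].str.split(","))
      .explode("Countries")
      .reset_index(drop=True)
      .assign(Countries=lambda d: d["Countries"].str.strip())
)

# Frequency table: rows are countries, columns are Test outcomes.
counts = pd.crosstab(exploded["Countries"], exploded["Test"])

# Put the columns in the order the question asked for.
counts = counts.reindex(columns=["Pass", "Fail"], fill_value=0)

print(counts)
```

crosstab sorts the countries alphabetically. If the order of first appearance matters, something like counts.reindex(exploded["Countries"].unique()) should restore it.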
User contributions licensed under: CC BY-SA