I need help deleting “None”, along with the extra comma, from a language column whose values contain one or more languages.
Here is the existing CSV data, as a DataFrame:
import pandas as pd

f = pd.DataFrame({'Movie': ['name1', 'name2', 'name3', 'name4'],
                  'Year': ['1905', '1905', '1906', '1907'],
                  'Id': ['tt0283985', 'tt0283986', 'tt0284043', 'tt3402904'],
                  'language': ['Mandarin,None', 'None,Cantonese', 'Mandarin,None,Cantonese', 'None,Cantonese']})
where f now looks like:
   Movie  Year         Id                 language
0  name1  1905  tt0283985            Mandarin,None
1  name2  1905  tt0283986           None,Cantonese
2  name3  1906  tt0284043  Mandarin,None,Cantonese
3  name4  1907  tt3402904           None,Cantonese
And the result should be like this:
   Movie  Year         Id            language
0  name1  1905  tt0283985            Mandarin
1  name2  1905  tt0283986           Cantonese
2  name3  1906  tt0284043  Mandarin,Cantonese
3  name4  1907  tt3402904           Cantonese
There are also rows that have only ‘None’ in the language column, so I can’t just use the replace function in Excel, and doing that also leaves an extra “,” behind. So I may need a different approach, using pandas or something similar. Thanks in advance!
Answer
You could achieve it this way:
f["language"] = f.apply(
    lambda x: ",".join(filter(lambda y: y != "None", x.language.split(","))), axis=1
)
Or, more readably, with a list comprehension:
f["language"] = f.apply(
    lambda x: ",".join([y for y in x.language.split(",") if y != "None"]), axis=1
)
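For reference, here is a minimal, self-contained sketch of the same split, filter and join idea, applied to the column directly instead of row by row. The helper name drop_none_tokens and the extra name5 row (which holds only "None") are hypothetical, added just to show what happens in the case the question mentions. Stripping whitespace around each entry and dropping empty entries are also assumptions that go beyond the answer above.

```python
import pandas as pd


def drop_none_tokens(value: str) -> str:
    """Remove literal "None" entries from a comma-separated string."""
    # Hypothetical helper: split on commas and trim stray whitespace.
    parts = [p.strip() for p in value.split(",")]
    # Keep only non-empty entries that are not the literal text "None".
    return ",".join(p for p in parts if p and p != "None")


f = pd.DataFrame({
    "Movie": ["name1", "name2", "name3", "name4", "name5"],
    "language": [
        "Mandarin,None",
        "None,Cantonese",
        "Mandarin,None,Cantonese",
        "None,Cantonese",
        "None",  # hypothetical row containing only "None"
    ],
})

# Apply the helper to each value of the column.
f["language"] = f["language"].apply(drop_none_tokens)
print(f["language"].tolist())
# ['Mandarin', 'Cantonese', 'Mandarin,Cantonese', 'Cantonese', '']
```

With this approach a row that contains only "None" ends up as an empty string, so if you would rather have a missing value there, you would need to replace the empty strings afterwards.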