Skip to content
Advertisement

How to delete a certain value in a cell in columns of csv using pandas

I need help with deleting “None” along with extra comma in language columns that have one or more language

Here is the existing csv:

f = pd.DataFrame({'Movie': ['name1','name2','name3','name4'],
                  'Year': ['1905', '1905','1906','1907'],
                  'Id': ['tt0283985', 'tt0283986','tt0284043','tt3402904'],
                  'language':['Mandarin,None','None,Cantonese','Mandarin,None,Cantonese','None,Cantonese']})

Where f now looks like:

   Movie  Year         Id   language
0  name1  1905  tt0283985  Mandarin,None
1  name2  1905  tt0283986  None,Cantonese
2  name3  1906  tt0284043  Mandarin,None,Cantonese
3  name4  1907  tt3402904  None,Cantonese

And the result should be like this:

   Movie  Year         Id             language
0  name1  1905  tt0283985            Mandarian
1  name2  1905  tt0283986            Cantonese
2  name3  1906  tt0284043            Mandarin,Cantonese
3  name4  1907  tt3402904            Cantonese

There are also other columns that have only ‘None’ values in language column, so I can’t just use the replace function in excel, and there’s also a problem of extra “,” after doing that. So I may need help with a new way using pandas or something. Thanks in advance!

Advertisement

Answer

You could achieve it this way,

f["language"] = f.apply(
    lambda x: ",".join(filter(lambda y: y != "None", x.language.split(","))), axis=1
)

Or much better

f["language"] = f.apply(
    lambda x: ",".join([y for y in x.language.split(",") if y != "None"]), axis=1
)
Advertisement