I have created a pandas dataframe called df using this code:
import numpy as np
import pandas as pd

ds = {'col1': ['1', '3/', '4'], 'col2': ['A', '!B', '@C']}
df = pd.DataFrame(data=ds)
The dataframe looks like this:
print(df)

  col1 col2
0    1    A
1   3/   !B
2    4   @C
The columns contain some special characters (for example /, ! and @) that I need to replace with a blank space.
Now, I have a list of special characters:
listOfSpecialChars = '¬`!"£$£#/,.+*><@|"'
How can I replace any of the special characters listed in listOfSpecialChars with a blank space, wherever they occur in the dataframe, in any column?
At the moment I am dealing with a 100K-record dataframe with 560 columns, so I can’t write a piece of code for each variable.
Answer
You can use apply with str.replace:
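import re

# Escape each character so it is matched literally inside the [...] class
chars = ''.join(map(re.escape, listOfSpecialChars))

df2 = df.apply(lambda c: c.str.replace(f'[{chars}]', '', regex=True))

Alternatively, with stack/unstack:

df2 = df.stack().str.replace(f'[{chars}]', '', regex=True).unstack()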
Output:

  col1 col2
0    1    A
1    3    B
2    4    C
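
Two things to watch on a wide dataframe. First, the replacement string in the answer is '', so the characters are deleted; if you want a literal blank space as the question asks, pass ' ' instead. Second, the .str accessor raises an error on columns that do not hold strings, so the apply version may fail if some of the 560 columns are numeric. The sketch below is one possible variant, not the answer's method: it uses DataFrame.replace with regex=True, which only touches string values. The numeric column col3 and the names pattern, removed and spaced are assumptions added for illustration.

```python
import re

import pandas as pd

# Sample data from the question, plus a numeric column (col3 is an assumption,
# added only to show that non-string columns are left untouched).
df = pd.DataFrame({
    'col1': ['1', '3/', '4'],
    'col2': ['A', '!B', '@C'],
    'col3': [10, 20, 30],
})

listOfSpecialChars = '¬`!"£$£#/,.+*><@|"'

# Escape each character so it is taken literally, then wrap the result in a
# regex character class that matches any single one of them.
pattern = '[' + ''.join(map(re.escape, listOfSpecialChars)) + ']'

# DataFrame.replace with regex=True substitutes matches inside string cells
# and leaves numeric cells as they are.
removed = df.replace(pattern, '', regex=True)   # special characters deleted
spaced = df.replace(pattern, ' ', regex=True)   # replaced by a single space

print(removed)
print(spaced)
```

Whether this is faster than apply or stack/unstack on 100K rows by 560 columns is something to measure on your own data.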