I have created a pandas dataframe called df using this code:
import numpy as np
import pandas as pd

ds = {'col1': ['1', '3/', '4'], 'col2': ['A', '!B', '@C']}

df = pd.DataFrame(data=ds)
The dataframe looks like this:
print(df)

  col1 col2
0    1    A
1   3/   !B
2    4   @C
The columns contain some special characters (/, ! and @) that I need to replace with a blank space.
Now, I have a list of special characters:
listOfSpecialChars = '¬`!"£$£#/,.+*><@|"'
How can I replace any of the special characters listed in listOfSpecialChars with a blank space, wherever they occur in the dataframe, in any column?
At the moment I am dealing with a 100K-record dataframe with 560 columns, so I can’t write a piece of code for each variable.
Answer
You can use apply with str.replace:
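import re

# Escape each character so it is treated literally inside the regex character class
chars = ''.join(map(re.escape, listOfSpecialChars))

# Apply the replacement column by column; matched characters are removed
df2 = df.apply(lambda c: c.str.replace(f'[{chars}]', '', regex=True))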
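With 560 columns, some are likely to be numeric, and the .str accessor raises an error on non-string columns. Here is a minimal sketch of one way to handle that, assuming only the string columns need cleaning. The numeric column `n` and the names `pattern`, `str_cols` and `df_clean` are hypothetical and used only for illustration.

```python
import re
import pandas as pd

# Sample frame from the question, plus a hypothetical numeric column `n`
df = pd.DataFrame({'col1': ['1', '3/', '4'],
                   'col2': ['A', '!B', '@C'],
                   'n': [10, 20, 30]})
listOfSpecialChars = '¬`!"£$£#/,.+*><@|"'

# Escape each character so it is literal inside the character class
pattern = '[' + ''.join(map(re.escape, listOfSpecialChars)) + ']'

# Keep only the columns that hold strings; the .str accessor needs them
str_cols = [c for c in df.columns if pd.api.types.is_string_dtype(df[c])]

# Clean the string columns and leave every other column untouched
df_clean = df.copy()
df_clean[str_cols] = df_clean[str_cols].apply(
    lambda c: c.str.replace(pattern, '', regex=True)
)
```

Here `n` passes through unchanged, while `col1` and `col2` lose the listed characters.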
Or reshape with stack, run the replacement once on the resulting Series, and unstack:
df2 = df.stack().str.replace(f'[{chars}]', '', regex=True).unstack()
output:
  col1 col2
0    1    A
1    3    B
2    4    C
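The output above has the special characters removed, whereas the question asks for a blank space in their place. Here is a minimal sketch of that variant, using DataFrame.replace with regex=True as a swapped-in alternative to apply. It works on the whole frame at once and leaves non-string values alone. The names `pattern` and `df_spaces` are hypothetical.

```python
import re
import pandas as pd

# Sample frame and character list from the question
df = pd.DataFrame({'col1': ['1', '3/', '4'], 'col2': ['A', '!B', '@C']})
listOfSpecialChars = '¬`!"£$£#/,.+*><@|"'

# Escape each character so it is literal inside the character class
pattern = '[' + ''.join(map(re.escape, listOfSpecialChars)) + ']'

# Substitute one space for every listed character, across all columns
df_spaces = df.replace(pattern, ' ', regex=True)
```

If the added spaces are unwanted at the start or end of a value, str.strip can be applied to the string columns afterwards.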