
Remove special characters and strings from DataFrame columns in Python

My column is currently of object type and I'm trying to convert it to a numeric type. The conversion raises an error because some values contain special characters and letters.

Error:

ValueError: Unable to parse string "7`" at position 3298

Code:

data['col1']=pd.to_numeric(data.col1)

So I want to remove the special characters and letters from the columns that should contain only numbers, col1 being one of them. Any suggested solution?

Answer

Use str.replace with a regex pattern that removes every non-digit character, then pass the result to pd.to_numeric.

Example:

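import pandas as pd
df = pd.DataFrame({"col1": ["7`", "123", "AS123", "*&%3R4"]})
# [^\d] matches any character that is not a digit; regex=True is required in pandas 2.0+, where the default is a literal match
print(pd.to_numeric(df['col1'].str.replace(r"[^\d]", "", regex=True)))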

Output:

0      7
1    123
2    123
3     34
Name: col1, dtype: int64
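
Since the question mentions several columns, here is a minimal sketch of applying the same cleanup to more than one column. The helper name to_numeric_digits_only, the column col2, and the numeric_cols list are hypothetical, made up for illustration. It assumes every value in those columns is a string, and it uses errors="coerce" so that entries with no digits at all become NaN instead of raising.

```python
import pandas as pd


def to_numeric_digits_only(series: pd.Series) -> pd.Series:
    """Strip every non-digit character, then convert what is left to numbers."""
    # \D is the regex shorthand for "any character that is not a digit".
    cleaned = series.str.replace(r"\D", "", regex=True)
    # Values with no digits end up as "" and are turned into NaN.
    return pd.to_numeric(cleaned, errors="coerce")


# Hypothetical sample data; col2 is not part of the original question.
df = pd.DataFrame({
    "col1": ["7`", "123", "AS123", "*&%3R4"],
    "col2": ["$10", "n/a", "2x", "5"],
})

numeric_cols = ["col1", "col2"]  # hypothetical list of columns to clean
for col in numeric_cols:
    df[col] = to_numeric_digits_only(df[col])

print(df)
```

Note that stripping all non-digits also removes decimal points and minus signs, so "-1.5" would become 15. If the data can contain such values, the pattern would need to be adjusted.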
User contributions licensed under: CC BY-SA