Hi, I would like to remove every “.0” at the end of a string across an entire DataFrame, and it needs to be an exact match.
Let’s make an example df:
       a       b            c
      20    39.0        17-50
    34.0  .016.0  001-6784532
The desired output:
     a     b            c
    20    39        17-50
    34  .016  001-6784532
I tried using replace, but it didn’t work for some reason (I read that this may be because replace only replaces entire strings and not substrings?). Either way, if there is a way to make it work I’m interested to hear about it, because it would work for my DataFrame. I feel it’s less correct, though, in case I have values like .016.0, because then it would also replace the first two characters.
Then I tried sub and rtrim with the regex r'\.0$', but I didn’t get this to work either. I’m not sure whether that’s because of the regex or because these methods don’t work on an entire DataFrame. Using rtrim with .0 didn’t work either, because it also removes zeros that have no dot before them, so 20 becomes 2.
When trying sub and rtrim with a regex, I got an error saying that the DataFrame doesn’t have an attribute str. How is that possible?
Is there any way to do this without looping over all columns?
Thank you!
Answer
Let’s try DataFrame.replace:
    import pandas as pd

    df = pd.DataFrame({
        'a': ['20', '34.0'],
        'b': ['39.0', '.016.0'],
        'c': ['17-50', '001-6784532']
    })

    # Escape the dot so that only a literal ".0" at the end of a string matches
    df = df.replace(r'\.0$', '', regex=True)
    print(df)
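The backslash in the pattern matters: in a regex, an unescaped `.` matches any character, so a pattern without it would also hit values such as `20` and `17-50`. The sketch below compares the two patterns on the same sample data; the names `unescaped` and `escaped` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'a': ['20', '34.0'],
    'b': ['39.0', '.016.0'],
    'c': ['17-50', '001-6784532'],
})

# Unescaped dot: "." matches any character, so "20" and "17-50" are damaged too.
unescaped = df.replace(r'.0$', '', regex=True)

# Escaped dot: only a literal ".0" at the end of the string is removed.
escaped = df.replace(r'\.0$', '', regex=True)

print(unescaped)
print(escaped)
```

With the unescaped pattern, `20` becomes an empty string and `17-50` becomes `17-`, which is why the escaped form is the one to use for an exact match.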
Optionally, call DataFrame.astype first if the columns are not already str:
    df = df.astype(str).replace(r'\.0$', '', regex=True)
Before:
          a       b            c
    0    20    39.0        17-50
    1  34.0  .016.0  001-6784532
After:
        a     b            c
    0  20    39        17-50
    1  34  .016  001-6784532
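As a rough illustration of the astype variant, here is a sketch on a hypothetical frame (`numeric_df`, not part of the question) whose columns hold floats instead of strings. Regex replacement only acts on string values, so the conversion has to come first.

```python
import pandas as pd

# Hypothetical frame whose columns hold floats rather than strings.
numeric_df = pd.DataFrame({'a': [20.0, 34.0], 'b': [39.0, 0.5]})

# Convert every value to its string form, then drop a trailing ".0".
cleaned = numeric_df.astype(str).replace(r'\.0$', '', regex=True)
print(cleaned)
```

Note that the result contains strings, not numbers, so any later arithmetic would need a conversion back.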
rtrim/rstrip will not work here because they don’t parse regex; instead they take a set of characters to remove. For this reason they strip every trailing 0, with or without a dot before it, because 0 is in that set.
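A small sketch of both points from the question: how `str.rstrip` treats its argument as a set of characters, and why `.str` raised an error. The `.str` accessor belongs to a Series (a single column), not to a DataFrame, so it has to be used on one column at a time. The variable names here are only illustrative.

```python
import pandas as pd

# rstrip removes any trailing characters found in the set {'.', '0'}.
stripped = [s.rstrip('.0') for s in ['39.0', '20', '100.0']]
print(stripped)

df = pd.DataFrame({'a': ['20', '34.0'], 'b': ['39.0', '.016.0']})

# A DataFrame has no .str accessor; a single column (Series) does.
has_str = hasattr(df, 'str')
col_b = df['b'].str.replace(r'\.0$', '', regex=True)
print(has_str)
print(col_b.tolist())
```

Here `20` is cut down to `2` and `100.0` to `1`, while the per-column `.str.replace` only removes the literal `.0` suffix. DataFrame.replace with `regex=True` does the same for all columns at once.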