I read a CSV file into a pandas DataFrame and every column came back with dtype object. I need to convert the second and third columns to float.
I tried using
```python
df["Quantidade"] = pd.to_numeric(df.Quantidade, errors='coerce')
```
but got NaN.
Here's my dataframe. Do I need to use a regex on the third column to get rid of the "R$ "?
Answer
Try this:
```python
import pandas as pd

# sample dataframe
d = {'Quantidade': ['0,20939', '0,0082525', '0,009852', '0,012920', '0,0252'],
     'price': ['R$ 165.000,00', 'R$ 100.000,00', 'R$ 61.500,00', 'R$ 65.900,00', 'R$ 49.375,12']}
df = pd.DataFrame(data=d)
```
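As background on why the original `pd.to_numeric` call returned NaN: with `errors='coerce'`, any string that cannot be parsed as a number is replaced by NaN, and neither a decimal comma nor a currency prefix is parseable. A minimal sketch, using made-up values in the same format as the question's columns:

```python
import pandas as pd

# Hypothetical sample values: decimal comma, currency prefix, plain dot decimal
s = pd.Series(['0,20939', 'R$ 165.000,00', '0.20939'])

# Strings that are not valid numbers are coerced to NaN rather than raising
converted = pd.to_numeric(s, errors='coerce')
print(converted)
```

Only the last value, which already uses a dot as the decimal separator, survives the conversion, so the strings have to be cleaned first.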
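```python
# Second column: decimal comma -> dot, then cast
df["Quantidade"] = df["Quantidade"].str.replace(',', '.', regex=False).astype(float)

# Third column: strip "R$ ", drop thousands dots, decimal comma -> dot, then cast
df['price'] = (df['price'].str.replace(r'\w+\$\s+', '', regex=True)
                          .str.replace('.', '', regex=False)
                          .str.replace(',', '.', regex=False).astype(float))
```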
Output:
```
   Quantidade      price
0    0.209390  165000.00
1    0.008252  100000.00
2    0.009852   61500.00
3    0.012920   65900.00
4    0.025200   49375.12
```
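Since the data comes from a CSV file, another option may be to do the conversion while reading it. The sketch below is only an illustration under assumptions: the `;` separator, the inline CSV text, and the `parse_brl` helper are hypothetical and not part of the original question. It relies on the `decimal` and `converters` parameters of `pd.read_csv`.

```python
import io
import pandas as pd

# Hypothetical CSV content; the ';' separator and layout are assumptions
csv_text = """Quantidade;price
0,20939;R$ 165.000,00
0,0082525;R$ 100.000,00
0,0252;R$ 49.375,12
"""

def parse_brl(value: str) -> float:
    """Convert a string such as 'R$ 165.000,00' to a float (hypothetical helper)."""
    cleaned = value.replace('R$', '').strip()
    # Drop thousands dots, then turn the decimal comma into a dot
    return float(cleaned.replace('.', '').replace(',', '.'))

df_csv = pd.read_csv(
    io.StringIO(csv_text),            # replace with the real file path
    sep=';',
    decimal=',',                      # parse '0,20939' as 0.20939
    converters={'price': parse_brl},  # custom parsing for the currency column
)
print(df_csv.dtypes)
```

Whether this is preferable to cleaning the columns afterwards depends on the file: if other columns use a dot as the decimal separator, the `decimal=','` setting would not suit them.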