I read a CSV file into a pandas dataframe and every column came back with dtype object. I need to convert the second and third columns to float.
I tried using
df["Quantidade"] = pd.to_numeric(df.Quantidade, errors='coerce')
but got NaN.
Here’s my dataframe. Do I need to use a regex on the third column to get rid of the “R$ ”?
Answer
Try this:
import pandas as pd

# sample dataframe
d = {'Quantidade': ['0,20939', '0,0082525', '0,009852', '0,012920', '0,0252'],
     'price': ['R$ 165.000,00', 'R$ 100.000,00', 'R$ 61.500,00', 'R$ 65.900,00', 'R$ 49.375,12']}
df = pd.DataFrame(data=d)
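# Second column: swap the decimal comma for a dot, then cast to float
df["Quantidade"] = df["Quantidade"].str.replace(',', '.', regex=False).astype(float)

# Third column: strip the "R$ " prefix, drop the thousands dots,
# swap the decimal comma for a dot, then cast to float
df['price'] = (df['price'].str.replace(r'\w+\$\s+', '', regex=True)
                          .str.replace('.', '', regex=False)
                          .str.replace(',', '.', regex=False)
                          .astype(float))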
Output:
   Quantidade      price
0    0.209390  165000.00
1    0.008252  100000.00
2    0.009852   61500.00
3    0.012920   65900.00
4    0.025200   49375.12
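As for the NaN: pd.to_numeric cannot parse a string like '0,20939', because the comma is not a decimal point it recognizes, and errors='coerce' turns every value it cannot parse into NaN. If you control how the file is read, it may be simpler to do the conversion at load time. The sketch below is only an illustration under assumptions: the file contents, the ';' separator, and the helper name parse_brl are made up, since the original CSV is not shown. It uses read_csv's decimal parameter for the comma decimals and a converter for the currency column.

```python
import io
import pandas as pd

# Hypothetical file contents: the real CSV's separator and layout are assumptions.
raw = io.StringIO(
    "Quantidade;price\n"
    "0,20939;R$ 165.000,00\n"
    "0,0082525;R$ 100.000,00\n"
)

def parse_brl(text):
    """Turn a string such as 'R$ 165.000,00' into the float 165000.0 (hypothetical helper)."""
    cleaned = text.replace("R$", "").strip()   # drop the currency prefix
    cleaned = cleaned.replace(".", "")         # drop thousands separators
    cleaned = cleaned.replace(",", ".")        # decimal comma -> decimal point
    return float(cleaned)

# decimal="," lets the parser read "0,20939" as a float directly;
# the converter handles the price column, which carries the "R$ " prefix.
df = pd.read_csv(raw, sep=";", decimal=",", converters={"price": parse_brl})
print(df.dtypes)
```

With a real file you would pass its path in place of raw and adjust sep to whatever the file actually uses.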