Converting percentage values into numbers in a Python DataFrame

I am loading data from a Google Sheet (26 columns) into a Python DataFrame. Four columns, A, B, C and D, hold percentage values (e.g. 15.6%) and also contain some rows with N/A values. I am trying to convert these columns into numbers so that I can use them in other calculations, but I am having problems doing so. With my code, the output for every row is 1.0000, whereas I want 0.156 (from 15.6%). This is the code I am using:

df = df.replace(r'^\s*$', np.nan, regex=True)
df = df.replace(r'%', '', regex=True)
cols = ['A','B','C','D']
df[cols] = df[cols].str.astype('float').div(100)

I have been trying to find a solution for the past 3 hours, but nothing seems to work. I suspect the main problem is the presence of N/A values, which can't be converted using astype().

Solution: Thanks to @PraysonW.Daniel

df[cols] = df[cols].apply(pd.to_numeric, errors="coerce").div(100)
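A minimal sketch of why the one-liner above helps, assuming the % signs have already been stripped as in the question's second replace() call. The sample frame and its values are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for two of the Google Sheet columns.
df = pd.DataFrame({"A": ["15.6%", "N/A", "3%"], "B": ["50%", "", "N/A"]})
cols = ["A", "B"]

# Strip the percent sign; "N/A" and "" are left untouched.
stripped = df[cols].replace("%", "", regex=True)

# astype(float) raises on the leftover non-numeric strings.
try:
    stripped.astype(float)
    astype_failed = False
except (ValueError, TypeError):
    astype_failed = True

# pd.to_numeric with errors="coerce" turns them into NaN instead,
# so the division by 100 can go ahead on the remaining values.
result = stripped.apply(pd.to_numeric, errors="coerce").div(100)
print(result)
```

Rows that held N/A or an empty string come out as NaN, which later calculations can skip or fill as needed.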


Answer

These are the steps I took:

import pandas as pd

…

dataf.loc[:, ["A","B"]] = (dataf
 .loc[:, ["A","B"]]
 .replace(r"%","", regex=True)
 .apply(pd.to_numeric, errors="coerce")
 .div(100)
)

First I selected the columns we are interested in, then replaced % with an empty string, coerced the values to numeric (anything that cannot be parsed becomes NaN), and finally divided by 100.
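The same chain can be sketched one step at a time to make each intermediate result visible. The intermediate variable names (selected, no_percent, numeric, fractions) are only illustrative, and the data is the sample frame used in this answer.

```python
import pandas as pd

# Sample data from this answer.
dataf = pd.DataFrame({"id": ["lgloo"] * 3,
                      "A": ["75%", "10%", ""],
                      "B": ["20%", "0", "0"]})
cols = ["A", "B"]

# 1. Pick the columns of interest.
selected = dataf.loc[:, cols]
# 2. Remove the percent sign: "75%" becomes "75".
no_percent = selected.replace(r"%", "", regex=True)
# 3. Parse to numbers; the empty string cannot be parsed and becomes NaN.
numeric = no_percent.apply(pd.to_numeric, errors="coerce")
# 4. Scale percentages to fractions: 75 becomes 0.75.
fractions = numeric.div(100)

# Assign whole columns back so they take the new float dtype.
dataf[cols] = fractions
print(dataf.dtypes)
```

Printing the dtypes at the end is a quick way to confirm that the columns are now floats rather than strings.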

Data and Results

# generate data
dataf = pd.DataFrame({"id":["lgloo",]*3,"A":["75%", "10%",""], "B":["20%","0", "0"]})

…
dataf

    id       A      B
0   lgloo    0.75   0.2
1   lgloo    0.1    0
2   lgloo    NaN    0
