pandas convert columns to percentages of the totals

I have a dataframe with 4 columns: an ID and three categories that results fell into.

  <80% 80-90 >90
id
1   2     4    4
2   3     6    1
3   7     0    3

I would like to convert each row to percentages of its total, i.e.:

   <80% 80-90 >90
id
1   20%   40%  40%
2   30%   60%  10%
3   70%    0%  30%

This seems like it should be within pandas' capabilities, but I just can’t figure it out.

Thanks in advance!

Answer

You can do this with the basic pandas methods .div and .sum, using the axis argument to make sure the calculations happen the way you want:

cols = ['<80%', '80-90', '>90']
df[cols] = df[cols].div(df[cols].sum(axis=1), axis=0).multiply(100)
  • Calculate the sum of each row (df[cols].sum(axis=1)). axis=1 makes the summation run across the columns of each row, rather than down each column.
  • Divide the dataframe by the resulting series (df[cols].div(df[cols].sum(axis=1), axis=0)). axis=0 aligns the series with the row index, so each row is divided by its own total.
  • To finish, multiply the results by 100 so they are percentages between 0 and 100 instead of proportions between 0 and 1 (or you can skip this step and store them as proportions).
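
For reference, here is a minimal end-to-end sketch of the steps above on the sample data from the question. It assumes that id is the dataframe's index, as the printed table suggests. The names row_totals, pct and pct_display are only illustrative, and the final string-formatting step is an optional extra for display, not part of the calculation itself.

```python
import pandas as pd

# Sample data from the question; "id" is assumed to be the index.
df = pd.DataFrame(
    {"<80%": [2, 3, 7], "80-90": [4, 6, 0], ">90": [4, 1, 3]},
    index=pd.Index([1, 2, 3], name="id"),
)

cols = ["<80%", "80-90", ">90"]

# Step 1: total of each row (one value per id).
row_totals = df[cols].sum(axis=1)

# Steps 2 and 3: divide every row by its own total, then scale to 0-100.
pct = df[cols].div(row_totals, axis=0).mul(100)

# Optional, display only: whole-number strings with a percent sign.
pct_display = pct.round(0).astype(int).astype(str) + "%"
print(pct_display)
```

Keeping the numeric result (pct) separate from the formatted strings (pct_display) means the values can still be used in further calculations. Note that a row whose total is 0 would produce NaN in the division, and the integer conversion in the display step would then fail, so such rows would need handling first.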
User contributions licensed under: CC BY-SA