I’m trying to add a sum total to an already existing pandas dataframe. The problem is that when I do this my multiindices become just normal indices. How can I prevent that?
JavaScript
x
11
11
1
df = pd.DataFrame({'date':['2021-06','2021-06','2021-09','2021-08','2021-09'],'type':['t1','t1','t1','t2','t2'], 'other_col':['a','b','b','a','c']})
2
df1 = df.pivot_table(index='type', columns='date', values='other_col', aggfunc='count').fillna(0)
3
df1.index = pd.MultiIndex.from_arrays([df1.index, ['count']*len(df1)])
4
df2 = np.round(df1 / df1.sum(axis=0) * 100, 0).astype(int)
5
df2.index = pd.MultiIndex.from_arrays([df2.index.get_level_values(0), ['perc']*len(df2)])
6
df3 = pd.concat([df1, df2]).sort_index()
7
df3 = df3.assign(**{'row tot': np.where(df3.index.isin([('t1', 'count'), ('t2', 'count')]), df3.sum(axis=1).astype(int), df3.mean(axis=1).astype(int))})
8
df3 = df3.assign(**{f"{col}":np.where(df3.index.isin([('t1', 'perc'), ('t2', 'perc')]), df3[col].astype(str)+'%', df3[col]) for col in df3.columns.tolist()})
9
column_total = df3[df3.index.isin([('t1', 'count'), ('t2', 'count')])].sum(axis=0).astype(int).to_frame().T
10
column_total.index = pd.MultiIndex.from_arrays([['col tot'], ])
11
neither this
JavaScript
1
2
1
df3 = pd.concat([df3, column_total]).sort_index()
2
nor this works
JavaScript
1
2
1
df3.append(column_total)
2
Maybe there is also an easier way to add a sum?
Advertisement
Answer
You’re not really creating a MultiIndex dataframe with column_total
. You need to add a second level so that the dataframes align:
JavaScript
1
12
12
1
column_total.index = pd.MultiIndex.from_arrays([['col tot'], ['']])
2
df3 = pd.concat([df3, column_total])
3
4
print(df3)
5
date 2021-06 2021-08 2021-09 row tot
6
type
7
t1 count 2.0 0.0 1.0 3
8
perc 100.0% 0.0% 50.0% 50%
9
t2 count 0.0 1.0 1.0 2
10
perc 0.0% 100.0% 50.0% 50%
11
col tot 2 1 2 5•
12