So I have my first dataframe that has countries as headers and infected and death values as subheaders,
df Dates Antigua & Barbuda Australia Infected Dead Infected Dead 2020-01-22 0 0 0 0... 2020-01-23 0 0 0 0... ...
then I have my second dataframe,
df_indicators Dates Location indicator_1 indicator_2 ..... 2020-04-24 Afghanistan 0 0 2020-04-25 Afghanistan 0 0 ... 2020-04-24 Yemen 0 0 2020-04-25 Yemen 0 0
I want to merge the dataframes so that the indicator columns become subheaders of the countries column like in df
with the infected and dead subheaders.
What I want to produce is something like this,
df_merge Dates Antigua & Barbuda Infected Dead indicator_1 indicator_2.... 2020-04-24 0 0 0 0...
There are so many indicators that are all named something different that I don’t feel I can call them all so not sure if theres a way I can do this easily.
Thank you in advance for any help!
Advertisement
Answer
Because there are duplicates first aggregate by mean
and then reshape by Series.unstack
with DataFrame.swaplevel
:
df2 = df_indicators.groupby(['Dates','Location']).mean().unstack().swaplevel(0,1,axis=1)
Or with DataFrame.pivot_table
:
df2 = (df.pivot_table(index='Dates', columns='Location', aggfunc='mean') .swaplevel(0,1,axis=1))
And last join with sorting MultiIndex in columns
:
df = pd.concat([df, df2], axis=1).sort_index(axis=1)