How to sort MultiIndex using values from a given column

I have a DataFrame with a 2-level index and a column of numerical values. I want to sort it by both index levels so that the order of the level-0 labels is determined by the sum of the Value column within each label (descending), and the order of the level-1 labels within each group is determined by the Value column itself (also descending). This is my code:

import pandas as pd

df = pd.DataFrame()
df["Index1"] = ["A", "A", "B", "B", "C", "C"]
df["Index2"] = ["X", "Y", "X", "Y", "X", "Y"]
df["Value"] = [1, 4, 7, 3, 2, 7]
df = df.set_index(["Index1", "Index2"])
df

In the desired output, B is at the top because its sum is 10 (the largest), and within B, X comes first because 7 > 3.


Answer

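You can do this with pandas.DataFrame.sort_values: add a temporary column holding each level-0 group's sum, sort by that column and then by Value (both descending), and drop the temporary column afterwards: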

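out = (
    df
    # Sum of Value per level-0 label, repeated on every row of that group
    .assign(temp_col=df.groupby(level=0)["Value"].transform("sum"))
    # Largest group total first, then largest Value within each group
    .sort_values(by=["temp_col", "Value"], ascending=[False, False])
    # The helper column is only needed for sorting
    .drop(columns="temp_col")
)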

# Output :

print(out)

               Value
Index1 Index2       
B      X           7
       Y           3
C      Y           7
       X           2
A      Y           4
       X           1
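
One caveat: if two level-0 groups happen to have the same total, sorting by the total and then by Value can interleave their rows. The following is a minimal, self-contained sketch of one way to guard against that, assuming alphabetical order of Index1 is an acceptable tie-breaker (that choice is an assumption, not part of the question). The name group_sum is only illustrative. It also prints the intermediate per-group totals so the ordering is easier to verify.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Index1": ["A", "A", "B", "B", "C", "C"],
        "Index2": ["X", "Y", "X", "Y", "X", "Y"],
        "Value": [1, 4, 7, 3, 2, 7],
    }
).set_index(["Index1", "Index2"])

# Total of Value per level-0 label, repeated on every row of that group.
group_sum = df.groupby(level="Index1")["Value"].transform("sum")
print(group_sum.tolist())  # [5, 5, 10, 10, 9, 9]

# "Index1" in `by` refers to the index level (sort_values accepts index
# level names). It only matters when two groups share the same total,
# where it keeps each group's rows together.
out = (
    df.assign(group_sum=group_sum)
    .sort_values(
        by=["group_sum", "Index1", "Value"],
        ascending=[False, True, False],
    )
    .drop(columns="group_sum")
)
print(out)
```

For the data in the question, no two groups share a total, so this gives the same ordering as the output above.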
User contributions licensed under: CC BY-SA