Skip to content
Advertisement

How to Efficiently Perform Multiplication within MultiIndex Groupby

I am trying to use two of my second level indices to calculate a third index. However, I can’t find an idiomatic way to do this.

How can I calculate one second level index from two other second level indices? Each group has the same second level indices.

My Code

import numpy as np
import pandas as pd

# Create index
tickers = ['A', 'B', 'C']
line_items = ['sales', 'ebitda_margin', 'ebitda', 'other_field']
index = pd.MultiIndex.from_product([tickers, line_items], names=['ticker', 'line_item'])

df = pd.DataFrame([100, 0.3, np.nan, 7, 200, 0.2, np.nan, 8, 300, 0.1, np.nan, 9], 
                  index=index, 
                  columns=[0])

# Let's assume 10% sales growth for all companies
# This is included to show that I am doing this calculation for multiple years (columns)
df[1] = df[0]
df.loc[pd.IndexSlice[:, 'sales'], 1] *= 1.1

This produces the following data frame:

                          0      1
ticker line_item                  
A      sales          100.0  110.0
       ebitda_margin    0.3    0.3
       ebitda           NaN    NaN
       other_field      7.0    7.0
B      sales          200.0  220.0
       ebitda_margin    0.2    0.2
       ebitda           NaN    NaN
       other_field      8.0    8.0
C      sales          300.0  330.0
       ebitda_margin    0.1    0.1
       ebitda           NaN    NaN
       other_field      9.0    9.0

What I Have

Note that I know that I will need to do some work with indexes to get the below to work, but would rather find a better way if one exists rather than using this code.

df.apply(lambda x: x.loc[pd.IndexSlice[:, 'sales']] * x.loc[pd.IndexSlice[:, 'ebitda_margin']])

Is there a better way to do this?

Advertisement

Answer

Try xs as an alternative to pd.IndexSlice where you get to remove one level, then mul which allows level alignment when multiply:

df.loc[pd.IndexSlice[:, 'sales'],:] = (df.loc[pd.IndexSlice[:, 'sales'],:]
                                         .mul(df.xs('ebitda_margin', level='line_item'), level=0)
                                      )

Output:

                         0
ticker line_item          
A      sales          30.0
       ebitda_margin   0.3
       ebitda          NaN
       other_field     7.0
B      sales          40.0
       ebitda_margin   0.2
       ebitda          NaN
       other_field     8.0
C      sales          30.0
       ebitda_margin   0.1
       ebitda          NaN
       other_field     9.0
User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement