Skip to content
Advertisement

Python: concatenate pandas multiindex

I need to generate a pd.DataFrame with columns being composed by a list and a Multiindex object, and I need to do it before filling the final dataframe with data.

Say the columns are ['one', 'two'] and the multiindex obtained from from_product:

import pandas as pd

col_21 = ['day', 'month']
col_22 = ['a', 'b']

mult_2 =  pd.MultiIndex.from_product([ col_21, col_22 ])

I would like to get a list of columns which looks like this:

'one' | 'two' | ('day','a') | ('day','b') | ('month','a') | ('month','b')

One possible solution would be to use two different and separate Multiindex, one with a dummy column, both generate by from_product

col_11 = ['one', 'two']
col_12 = ['']
col_21 = ['day', 'month']
col_22 = ['a', 'b']

mult_1 =  pd.MultiIndex.from_product([ col_11, col_12 ])
mult_2 =  pd.MultiIndex.from_product([ col_21, col_22 ])

How could I get to this?

(one, '') | (two, '') | ('day','a') | ('day','b') | ('month','a') |  ('month','b')

I have tried several trivial solutions, but each gave me a different error or a wrong result

mult_1+mult_2 #TypeError: cannot perform __add__ with this index type: MultiIndex
pd.merge #TypeError: Can only merge Series or DataFrame objects, a <class 'list'> was passed
pd.MultiIndex.from_arrays([ mult_1, mult_2 ]) #NotImplementedError: isna is not defined for MultiIndex

Thank you for your advices

Advertisement

Answer

If possible, the best is create MultiIndex in index by columns one, two and then MultiIndex in columns by pairs – so not mixed non multiindex with multindex values:

col_21 = ['day', 'month']
col_22 = ['a', 'b']
mult_2 =  pd.MultiIndex.from_product([ col_21, col_22 ])

one = range(5)
two = list('ABCDE')
mult_3 =  pd.MultiIndex.from_arrays([ one, two], names=['one','two'])

df = pd.DataFrame(0, columns=mult_2, index=mult_3)
print (df)
        day    month   
          a  b     a  b
one two                
0   A     0  0     0  0
1   B     0  0     0  0
2   C     0  0     0  0
3   D     0  0     0  0
4   E     0  0     0  0

Use Index.append:

print (mult_1.append(mult_2))

MultiIndex([(  'one',  ''),
            (  'two',  ''),
            (  'day', 'a'),
            (  'day', 'b'),
            ('month', 'a'),
            ('month', 'b')],
           )

Or Index.union with sort=False:

print (mult_1.union(mult_2, sort=False))

MultiIndex([(  'one',  ''),
            (  'two',  ''),
            (  'day', 'a'),
            (  'day', 'b'),
            ('month', 'a'),
            ('month', 'b')],
           )
User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement