Skip to content
Advertisement

Combinations of all dataframe columns in python

I have three data frames that have the same index (Countries). I need to find all the combinations of the three data frames, create new columns with the data frames. Under each of those columns I will have the multiplication of the values from those combinations.

Envelope = pd.read_excel("Envelope.xlsx",index_col=0)
Shading = pd.read_excel("Shading.xlsx",index_col=0)
ThermalMass = pd.read_excel("ThermalMass.xlsx",index_col=0)

#Envelope dataframe
Country         Group(A)  Group(B)  Group(C)                       
France          0.4       0.4       0.2
Brussels        0.8       0.1       0.1
Germany_A       0.3       0.6       0.1
Germany_B       0.2       0.5       0.3

#Shading dataframe            
Country     YeSH  NoSH        
France      0.5   0.5
Brussels    0.6   0.4
Germany_A   0.9   0.1
Germany_B   0.4   0.6

#ThermalMass dataframe             
Country     Heavy   Light         
France       0.4    0.6
Brussels     0.5    0.5
Germany_A    0.3    0.7
Germany_B    0.5    0.5`

I tried to use the MultiIndex.from_product

all = pd.MultiIndex.from_product([Envelope,Shading,ThermalMass])

but the results is only for the titles:

print(all)
MultiIndex([('Group(A)', 'YeSH', 'Heavy'),
            ('Group(A)', 'YeSH', 'Light'),
            ('Group(A)', 'NoSH', 'Heavy'),
            ('Group(A)', 'NoSH', 'Light'),
            ('Group(B)', 'YeSH', 'Heavy'),
            ('Group(B)', 'YeSH', 'Light'),
            ('Group(B)', 'NoSH', 'Heavy'),
            ('Group(B)', 'NoSH', 'Light'),
            ('Group(C)', 'YeSH', 'Heavy'),
            ('Group(C)', 'YeSH', 'Light'),
            ('Group(C)', 'NoSH', 'Heavy'),
            ('Group(C)', 'NoSH', 'Light')],
           )

I need to have the values for each countries, so it should look like this (3 x 2x 2) = 12 combinations:

           Group(A)_YeSH_Heavy  Group(A)_YeSH_Light  Group(A)_NoSH_Heavy   Group(A)_NoSH_Light
Country                 
France       0.08                0.12                 0.08                    0.12 
Brussels     0.24                0.24                 0.16                    0.16
Germany_A    0.081               0.189                0.009                   0.6
Germany_B    0.04                 0.04                0.06                    0.06

How can I create the new columns and the combinations of the three dataframes?

Advertisement

Answer

You could do the following:

from itertools import product

# Only if country isn't the index yet
Envelope.set_index('Country', drop=True, inplace=True)
Shading.set_index('Country', drop=True, inplace=True)
ThermalMass.set_index('Country', drop=True, inplace=True)

columns = list(product(Envelope.columns, Shading.columns, ThermalMass.columns))
df = pd.concat([Envelope[col[0]] * Shading[col[1]] * ThermalMass[col[2]]
                for col in columns],
               axis='columns')
df.columns = ['_'.join(c for c in col) for col in columns]

Output:

           Group(A)_YeSH_Heavy  ...  Group(C)_NoSH_Light
Country                         ...                     
France                   0.080  ...                0.060
Brussels                 0.240  ...                0.020
Germany_A                0.081  ...                0.007
Germany_B                0.040  ...                0.090

[4 rows x 12 columns]

User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement