I have three data frames that have the same index (Countries). I need to find all the combinations of the three data frames, create new columns with the data frames. Under each of those columns I will have the multiplication of the values from those combinations.
Envelope = pd.read_excel("Envelope.xlsx",index_col=0) Shading = pd.read_excel("Shading.xlsx",index_col=0) ThermalMass = pd.read_excel("ThermalMass.xlsx",index_col=0)
#Envelope dataframe Country Group(A) Group(B) Group(C) France 0.4 0.4 0.2 Brussels 0.8 0.1 0.1 Germany_A 0.3 0.6 0.1 Germany_B 0.2 0.5 0.3 #Shading dataframe Country YeSH NoSH France 0.5 0.5 Brussels 0.6 0.4 Germany_A 0.9 0.1 Germany_B 0.4 0.6 #ThermalMass dataframe Country Heavy Light France 0.4 0.6 Brussels 0.5 0.5 Germany_A 0.3 0.7 Germany_B 0.5 0.5`
I tried to use the MultiIndex.from_product
all = pd.MultiIndex.from_product([Envelope,Shading,ThermalMass])
but the results is only for the titles:
print(all) MultiIndex([('Group(A)', 'YeSH', 'Heavy'), ('Group(A)', 'YeSH', 'Light'), ('Group(A)', 'NoSH', 'Heavy'), ('Group(A)', 'NoSH', 'Light'), ('Group(B)', 'YeSH', 'Heavy'), ('Group(B)', 'YeSH', 'Light'), ('Group(B)', 'NoSH', 'Heavy'), ('Group(B)', 'NoSH', 'Light'), ('Group(C)', 'YeSH', 'Heavy'), ('Group(C)', 'YeSH', 'Light'), ('Group(C)', 'NoSH', 'Heavy'), ('Group(C)', 'NoSH', 'Light')], )
I need to have the values for each countries, so it should look like this (3 x 2x 2) = 12 combinations:
Group(A)_YeSH_Heavy Group(A)_YeSH_Light Group(A)_NoSH_Heavy Group(A)_NoSH_Light Country France 0.08 0.12 0.08 0.12 Brussels 0.24 0.24 0.16 0.16 Germany_A 0.081 0.189 0.009 0.6 Germany_B 0.04 0.04 0.06 0.06
How can I create the new columns and the combinations of the three dataframes?
Advertisement
Answer
You could do the following:
from itertools import product # Only if country isn't the index yet Envelope.set_index('Country', drop=True, inplace=True) Shading.set_index('Country', drop=True, inplace=True) ThermalMass.set_index('Country', drop=True, inplace=True) columns = list(product(Envelope.columns, Shading.columns, ThermalMass.columns)) df = pd.concat([Envelope[col[0]] * Shading[col[1]] * ThermalMass[col[2]] for col in columns], axis='columns') df.columns = ['_'.join(c for c in col) for col in columns]
Output:
Group(A)_YeSH_Heavy ... Group(C)_NoSH_Light Country ... France 0.080 ... 0.060 Brussels 0.240 ... 0.020 Germany_A 0.081 ... 0.007 Germany_B 0.040 ... 0.090 [4 rows x 12 columns]