Is there an efficient way to concatenate Pandas columns by name, without using a loop?
My current method is very slow.
Input:

         F1        F2        F3        F4
0  0.653150 -0.877143 -1.640587 -0.571843
1  0.118184  1.499173  0.637869 -0.410608

import numpy as np
import pandas as pd

feature_map = {"F1": ["F1"], "F2": ["F2", "F3"], "F4": ["F4"]}
delta_x = pd.DataFrame(np.random.randn(2, 4), index=[0, 1], columns=["F1", "F2", "F3", "F4"])
Output:

                     F1                                          F2                     F4
0  [0.6531501163310599]  [-0.8771426082487118, -1.6405865645819901]  [-0.5718426901939191]
1  [0.1181836121394836]    [1.4991725444466424, 0.6378685281925491]  [-0.4106075515826911]

My current method:

result = pd.DataFrame([list(delta_x.loc[:, i].values) for i in feature_map.values()], index=feature_map.keys(), columns=delta_x.index).T
Answer
You could invert your dictionary so that each column name maps to its group, then use groupby + agg(list) on the transposed frame:
groups = {k: v for v, l in feature_map.items() for k in l}
# {'F1': 'F1', 'F2': 'F2', 'F3': 'F2', 'F4': 'F4'}

out = delta_x.T.groupby(delta_x.columns.map(groups)).agg(list).T
Output:

                       F1                                          F2                    F4
0   [-1.8341169478884958]    [0.543785421630868, 0.29151404233014466]  [0.6325262957339908]
1  [-0.18774374391279974]  [-0.4323328409917436, -0.8389437070428051]  [1.2530256320658806]
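
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same approach. The seeded generator (`np.random.default_rng(0)`) and the longer variable names in the comprehension are my own choices for reproducibility and readability, not part of the question; the values will therefore differ from the tables above.

```python
import numpy as np
import pandas as pd

feature_map = {"F1": ["F1"], "F2": ["F2", "F3"], "F4": ["F4"]}

# Seeded generator so repeated runs give the same frame (the seed is arbitrary).
rng = np.random.default_rng(0)
delta_x = pd.DataFrame(
    rng.standard_normal((2, 4)),
    index=[0, 1],
    columns=["F1", "F2", "F3", "F4"],
)

# Invert the mapping: source column -> name of the group it belongs to.
groups = {col: name for name, cols in feature_map.items() for col in cols}

# Transpose so the original columns become rows, group those rows by their
# target name, collect each group's values into a list, then transpose back.
out = delta_x.T.groupby(delta_x.columns.map(groups)).agg(list).T

print(out)
```

Each cell of `out` is a Python list holding the values of the grouped source columns for that row, in the original column order. Note that groupby sorts the group keys by default, so the output columns come out in sorted order rather than in the order of `feature_map`.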