How to concatenate columns by column name in pandas?

Is there an efficient way to combine pandas columns into lists, grouped by column name, without using a loop?

My current method is very slow.

Input:

         F1        F2        F3        F4
0  0.653150 -0.877143 -1.640587 -0.571843
1  0.118184  1.499173  0.637869 -0.410608

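import numpy as np
import pandas as pd

# Maps each output column to the input columns it should combine
feature_map = {"F1": ["F1"], "F2": ["F2", "F3"], "F4": ["F4"]}
delta_x = pd.DataFrame(np.random.randn(2, 4), index=[0, 1], columns=["F1", "F2", "F3", "F4"])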

Desired output:

                 F1                                          F2                     F4
0  [0.6531501163310599]  [-0.8771426082487118, -1.6405865645819901]  [-0.5718426901939191]
1  [0.1181836121394836]    [1.4991725444466424, 0.6378685281925491]  [-0.4106075515826911]


My current method:

result = pd.DataFrame([list(delta_x.loc[:, i].values) for i in feature_map.values()], index=feature_map.keys(), columns=delta_x.index).T

Answer

You could invert your dictionary so that each column name maps to its group, then use groupby + agg(list) on the transposed frame:

groups = {col: group for group, cols in feature_map.items() for col in cols}
# {'F1': 'F1', 'F2': 'F2', 'F3': 'F2', 'F4': 'F4'}

out = delta_x.T.groupby(delta_x.columns.map(groups)).agg(list).T

Output (the values differ from the question's because delta_x is generated randomly):

                       F1                                          F2                    F4
0   [-1.8341169478884958]    [0.543785421630868, 0.29151404233014466]  [0.6325262957339908]
1  [-0.18774374391279974]  [-0.4323328409917436, -0.8389437070428051]  [1.2530256320658806]
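
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same groupby approach. The fixed seed and the `rng` name are my own additions for reproducibility and are not part of the question; everything else follows the snippets above.

```python
import numpy as np
import pandas as pd

# Mapping from output column to the input columns it combines (from the question)
feature_map = {"F1": ["F1"], "F2": ["F2", "F3"], "F4": ["F4"]}

# Assumption: a seeded generator, used only so that reruns give the same frame
rng = np.random.default_rng(0)
delta_x = pd.DataFrame(
    rng.standard_normal((2, 4)),
    index=[0, 1],
    columns=["F1", "F2", "F3", "F4"],
)

# Invert the mapping so every input column points at its group name
groups = {col: group for group, cols in feature_map.items() for col in cols}

# Transpose so the columns become rows, group those rows by group name,
# collect each group's values into a list, then transpose back
out = delta_x.T.groupby(delta_x.columns.map(groups)).agg(list).T

print(out)
```

Note that groupby sorts the group keys by default, so the output columns come back in sorted order rather than in the dictionary's order. Here the two happen to coincide; passing sort=False to groupby should keep the order of first appearance instead.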
User contributions licensed under: CC BY-SA