Stacking a number of columns into one column in Python

I have a pandas dataframe of 100 rows x 7 columns like this:

[Image: the input dataframe]

Values in the source column are connected to the values in the other columns. For example, a is connected to contact_1, contact_2, ..., contact_5. In the same way, b is connected to contact_6, contact_7, ..., contact_10.

I want to stack these columns into just two columns (source and destination) so that I can build a graph from an edge list.

The expected output data format is:

[Image: the expected two-column output]

I tried df.stack() but did not get the desired result. I got the following:

[Image: the output of df.stack()]

Any suggestions?

Answer

You’re looking for pd.wide_to_long. This should do it:

pd.wide_to_long(df, stubnames='destination_', i=['source'], j='number')

The resulting destination_ column holds the values you’re looking for, indexed by source and number.

Example:

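import pandas as pd

# Toy frame: one source column plus two destination columns
d = {'source': ['a', 'b'],
     'destination_1': ['contact_1', 'contact_6'],
     'destination_2': ['contact_2', 'contact_7']}
df = pd.DataFrame(d)

# Collapse destination_1 and destination_2 into a single destination_ column
result = pd.wide_to_long(df, stubnames='destination_', i=['source'], j='number')
print(result)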

Output:

              destination_
source number             
a      1         contact_1
b      1         contact_6
a      2         contact_2
b      2         contact_7
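
The result above still keeps source and number in the index, so one more step is needed to get a plain two-column edge list. The following is a minimal sketch on the same toy frame, not a definitive recipe: the names long_df, edges and edges_melt are made up for illustration, and the sorting is only there to make the row order predictable. Note that pd.wide_to_long expects the i column(s) to uniquely identify each row, so if source contains repeated values, the melt variant shown second may be the easier route.

```python
import pandas as pd

# Same toy frame as in the example above.
df = pd.DataFrame({
    'source': ['a', 'b'],
    'destination_1': ['contact_1', 'contact_6'],
    'destination_2': ['contact_2', 'contact_7'],
})

# Reshape with wide_to_long, then move the (source, number) index
# back into ordinary columns and keep only source/destination.
long_df = pd.wide_to_long(df, stubnames='destination_', i=['source'], j='number')
edges = (
    long_df.reset_index()
           .drop(columns='number')
           .rename(columns={'destination_': 'destination'})
           .sort_values(['source', 'destination'])
           .reset_index(drop=True)
)

# Alternative sketch: melt does not need `source` to be unique.
# dropna removes rows where a source has fewer contacts than columns.
edges_melt = (
    df.melt(id_vars='source', value_name='destination')
      .drop(columns='variable')
      .dropna(subset=['destination'])
      .sort_values(['source', 'destination'])
      .reset_index(drop=True)
)

print(edges)
```

If networkx is available, a frame shaped like edges can then be passed to nx.from_pandas_edgelist(edges, 'source', 'destination') to build the graph.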
User contributions licensed under: CC BY-SA