I have a pandas dataframe of 100 rows x 7 columns like this:
Values in column source are connected to the values in the other columns. For example, a is connected to contact_1, contact_2, ..., contact_5. In the same way, b is connected to contact_6, contact_7, ..., contact_10.
I want to stack these columns into two columns only (i.e. source and destination), to help me build a graph using edgelist format.
The expected output data format is:
I tried df.stack() but did not get the desired result. Instead, I got the following:
Any suggestions?
Answer
You're looking for pd.wide_to_long. This should do:
pd.wide_to_long(df, stubnames='destination_', i=['source'], j='number')
The column destination_ will have the info you're looking for.
Example:
import pandas as pd

d = {'source': ['a', 'b'],
     'destination_1': ['contact_1', 'contact_6'],
     'destination_2': ['contact_2', 'contact_7']}
df = pd.DataFrame(d)
pd.wide_to_long(df, stubnames='destination_', i=['source'], j='number')
Output:
              destination_
source number
a      1         contact_1
b      1         contact_6
a      2         contact_2
b      2         contact_7
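The result above still keeps source and number in the index, while the question asks for a plain two-column edge list. Here is a minimal sketch of one way to get there, assuming the same toy data as in the example. The output column name destination and the final sort are my own choices, not something the original answer specifies.

```python
import pandas as pd

# Same toy data as in the example above.
df = pd.DataFrame({
    "source": ["a", "b"],
    "destination_1": ["contact_1", "contact_6"],
    "destination_2": ["contact_2", "contact_7"],
})

long_df = pd.wide_to_long(df, stubnames="destination_", i=["source"], j="number")

# Move 'source' and 'number' out of the index, drop the helper 'number'
# column, and rename the value column ('destination' is an assumed name).
edges = (
    long_df.reset_index()
    .drop(columns="number")
    .rename(columns={"destination_": "destination"})
    .sort_values(["source", "destination"])  # optional: group edges by source
    .reset_index(drop=True)
)

print(edges)
```

Note that pd.wide_to_long expects the i columns to identify each row uniquely, so if source values repeat in the real 100-row frame, you may need to add a unique id column first. The resulting frame should be usable as an edge list, for example with networkx.from_pandas_edgelist if networkx is the graph library in use.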