I have the following dataset:
Company_ID  Firm_Name
125911      Ampersand
125911      BancBoston
32679       BP Corp
74240       CORNING
32679       DIEBOLD
32679       DIEBOLD
74240       Fidelity
74240       Greylock
32679       INCO
67734       INCO
67734       Innova
32679       Kleiner
67734       Kleiner
67734       Kleiner
67734       Mayfield
32679       Pliant
67734       Pliant
67734       Sofinnova
43805       Warburg
The dataframe shows which investment firms invested in the same company during a year. I want to create a network graph of the connections between the firms (Firm_Name) only. For example, Ampersand and BancBoston have both invested in the same company and should therefore be connected. The code I have tried is:
import networkx as nx

G = nx.Graph()
G = nx.from_pandas_edgelist(df, 'Company_ID', 'Firm_Name')
nx.draw_shell(G, with_labels=True)
Which generates the following graph:
This shows the connections of both Company_ID and Firm_Name. I only want to have the Firms as nodes, where they are connected if they have invested in the same company. I have not found any similar problems or similar datasets where networkx is used. Any help is greatly appreciated!
Answer
Try a self-merge on Company_ID, which pairs up the firms that invested in the same company:
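# Self-merge: one row per pair of firms sharing a Company_ID
out = df.merge(df, on='Company_ID')
# Build the graph from the merged frame (not df), using the suffixed name columns
G = nx.from_pandas_edgelist(out, 'Firm_Name_x', 'Firm_Name_y')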
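Note that a plain self-merge also pairs each firm with itself, which shows up as self-loops in the graph, and a firm with no co-investors can still appear only through such a self-loop. A minimal end-to-end sketch of the merge approach is below, assuming the rows from the question are loaded into a dataframe named `df`. The `rows`, `unique` and `pairs` names, the deduplication step and the `Firm_Name_x < Firm_Name_y` filter are additions made for illustration, not part of the original answer.

```python
import networkx as nx
import pandas as pd

# Rows copied from the question's dataset.
rows = [
    (125911, "Ampersand"), (125911, "BancBoston"), (32679, "BP Corp"),
    (74240, "CORNING"), (32679, "DIEBOLD"), (32679, "DIEBOLD"),
    (74240, "Fidelity"), (74240, "Greylock"), (32679, "INCO"),
    (67734, "INCO"), (67734, "Innova"), (32679, "Kleiner"),
    (67734, "Kleiner"), (67734, "Kleiner"), (67734, "Mayfield"),
    (32679, "Pliant"), (67734, "Pliant"), (67734, "Sofinnova"),
    (43805, "Warburg"),
]
df = pd.DataFrame(rows, columns=["Company_ID", "Firm_Name"])

# Drop repeated (company, firm) rows so the merge does not multiply them.
unique = df.drop_duplicates()

# Self-merge: every pair of firms that share a Company_ID.
pairs = unique.merge(unique, on="Company_ID")

# Keep each unordered pair once and drop firm-with-itself rows
# (assumption: self-loops are not wanted in the firm network).
pairs = pairs[pairs["Firm_Name_x"] < pairs["Firm_Name_y"]]

# Firms become nodes; an edge means at least one shared company.
G = nx.from_pandas_edgelist(pairs, "Firm_Name_x", "Firm_Name_y")

# Firms with no co-investors produce no pair, so add them explicitly
# as isolated nodes.
G.add_nodes_from(df["Firm_Name"].unique())
```

The resulting `G` can be drawn as in the question with `nx.draw_shell(G, with_labels=True)`. If isolated firms should be left out, the final `add_nodes_from` call can simply be dropped.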