
Networkx: Network graph from pandas dataframe

I have the following dataset:

Company_ID  Firm_Name
125911      Ampersand 
125911      BancBoston 
32679       BP Corp 
74240       CORNING 
32679       DIEBOLD 
32679       DIEBOLD 
74240       Fidelity 
74240       Greylock
32679       INCO 
67734       INCO 
67734       Innova
32679       Kleiner 
67734       Kleiner 
67734       Kleiner 
67734       Mayfield
32679       Pliant 
67734       Pliant 
67734       Sofinnova 
43805       Warburg 

The dataframe shows which investment firms have invested in the same company during a year. I want to create a network graph of the connections between the Firm_Name values only. For example, Ampersand and BancBoston have both invested in the same company and should therefore be connected. The code I have tried is:

import networkx as nx

G = nx.from_pandas_edgelist(df, 'Company_ID', 'Firm_Name')
nx.draw_shell(G, with_labels=True)

This generates a graph (image not included here).

This shows the connections of both Company_ID and Firm_Name. I only want to have the Firms as nodes, where they are connected if they have invested in the same company. I have not found any similar problems or similar datasets where networkx is used. Any help is greatly appreciated!

Answer

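Try a self-merge on Company_ID. Merging the dataframe with itself pairs every firm with each firm that invested in the same company, and those pairs become the edges:

# Each row of out pairs two firms that share a Company_ID
out = df.merge(df, on='Company_ID')
# Drop rows that pair a firm with itself; they would become self-loops
out = out[out['Firm_Name_x'] != out['Firm_Name_y']]
G = nx.from_pandas_edgelist(out, 'Firm_Name_x', 'Firm_Name_y')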
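
For reference, here is a minimal end-to-end sketch of the merge approach using the rows from the question. A few things in it are my own assumptions rather than part of the original answer: the firm names are typed without trailing whitespace, duplicate (company, firm) rows are dropped before merging, each unordered pair is kept only once, and firms with no co-investor are added back as isolated nodes. Drawing is left out so the snippet only needs pandas and networkx.

```python
import networkx as nx
import pandas as pd

# Sample rows copied from the table in the question
rows = [
    (125911, "Ampersand"), (125911, "BancBoston"), (32679, "BP Corp"),
    (74240, "CORNING"), (32679, "DIEBOLD"), (32679, "DIEBOLD"),
    (74240, "Fidelity"), (74240, "Greylock"), (32679, "INCO"),
    (67734, "INCO"), (67734, "Innova"), (32679, "Kleiner"),
    (67734, "Kleiner"), (67734, "Kleiner"), (67734, "Mayfield"),
    (32679, "Pliant"), (67734, "Pliant"), (67734, "Sofinnova"),
    (43805, "Warburg"),
]
df = pd.DataFrame(rows, columns=["Company_ID", "Firm_Name"])

# Remove repeated (company, firm) rows so they do not inflate the merge
pairs = df.drop_duplicates()

# Pair every firm with every firm that shares a Company_ID
out = pairs.merge(pairs, on="Company_ID")
# Keep each unordered pair once; this also drops firm-with-itself rows
out = out[out["Firm_Name_x"] < out["Firm_Name_y"]]

G = nx.from_pandas_edgelist(out, "Firm_Name_x", "Firm_Name_y")
# Firms with no co-investor never appear in out, so add them explicitly
G.add_nodes_from(df["Firm_Name"].unique())
```

After this, `nx.draw_shell(G, with_labels=True)` should show only firm names as nodes. If the real data has stray whitespace in Firm_Name, stripping it first (for example with `df["Firm_Name"].str.strip()`) avoids treating "INCO " and "INCO" as different firms.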
User contributions licensed under: CC BY-SA