I have the following dataset:
Company_ID  Firm_Name
125911      Ampersand
125911      BancBoston
32679       BP Corp
74240       CORNING
32679       DIEBOLD
32679       DIEBOLD
74240       Fidelity
74240       Greylock
32679       INCO
67734       INCO
67734       Innova
32679       Kleiner
67734       Kleiner
67734       Kleiner
67734       Mayfield
32679       Pliant
67734       Pliant
67734       Sofinnova
43805       Warburg
The dataframe shows which investment firms have invested in the same company during a year. I want to create a network graph of the connections between the firms (Firm_Name) only. For example, Ampersand and BancBoston have both invested in the same company and should therefore be connected. The code I have tried is:
import networkx as nx

G = nx.from_pandas_edgelist(df, 'Company_ID', 'Firm_Name')
nx.draw_shell(G, with_labels=True)
Which generates the following graph:
This shows the connections of both Company_ID and Firm_Name. I only want to have the Firms as nodes, where they are connected if they have invested in the same company. I have not found any similar problems or similar datasets where networkx is used. Any help is greatly appreciated!
Answer
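Try a self-merge on Company_ID, then build the graph from the merged frame (out, not df). The merge also pairs every firm with itself, so drop those rows to avoid self-loops:

out = df.merge(df, on='Company_ID')
out = out[out['Firm_Name_x'] != out['Firm_Name_y']]  # drop self-pairs
G = nx.from_pandas_edgelist(out, 'Firm_Name_x', 'Firm_Name_y')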
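For reference, here is a self-contained sketch of the merge approach run on the rows from the question. A few steps are my own assumptions rather than part of the answer: dropping duplicate rows before the merge, keeping each unordered pair once with a `<` comparison, and adding firms that have no co-investor (Warburg in this sample) as isolated nodes.

```python
import networkx as nx
import pandas as pd

# Rows copied from the question's dataset.
rows = [
    (125911, "Ampersand"), (125911, "BancBoston"), (32679, "BP Corp"),
    (74240, "CORNING"), (32679, "DIEBOLD"), (32679, "DIEBOLD"),
    (74240, "Fidelity"), (74240, "Greylock"), (32679, "INCO"),
    (67734, "INCO"), (67734, "Innova"), (32679, "Kleiner"),
    (67734, "Kleiner"), (67734, "Kleiner"), (67734, "Mayfield"),
    (32679, "Pliant"), (67734, "Pliant"), (67734, "Sofinnova"),
    (43805, "Warburg"),
]
df = pd.DataFrame(rows, columns=["Company_ID", "Firm_Name"])

# Repeated (company, firm) rows add nothing to the graph, so drop them first.
pairs = df.drop_duplicates()

# Self-merge on Company_ID: each pair of firms sharing a company becomes a row.
out = pairs.merge(pairs, on="Company_ID")

# Keep each unordered pair once; this also removes firm-with-itself rows.
out = out[out["Firm_Name_x"] < out["Firm_Name_y"]]

# Only firm names are used as nodes; Company_ID is ignored here.
G = nx.from_pandas_edgelist(out, "Firm_Name_x", "Firm_Name_y")

# Firms without any co-investor have no edges, so add them explicitly.
G.add_nodes_from(df["Firm_Name"].unique())

# To draw it as in the question (requires matplotlib):
# nx.draw_shell(G, with_labels=True)
```

If isolated firms should not appear in the plot, the `add_nodes_from` line can simply be left out.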