Skip to content
Advertisement

Pandas DataFrame Groupby two columns and get different relation in same keys insert list

I have this table :

  Head  Relation    Tail 
0   9       1          0 
1   10      1         11  
2   9       0         23 
3   10      1         61 
4   9       0         12
5   10      0         66
6   10      0         61

I have to create a dictionary with Head key and values ​​equal to the relations but not repeated and for each value of the relations I have to insert the corresponding tail.

example:

{9 : {1: [10], 0:[23,12]}, 10 : {1:[11,61], 0:[66,61]}}

I don’t really know how to do it. Is there someone who can help me?

Result with nan.

Second Example Input:

    Head    Relation    Tail  
  0 207       1         269  
  1 207       1          61  
  2 207       0          62  
  3 208       1          269 
  4 290       0          269

the output:

{207: {0: [62], 1: [269,61]}, 208: {0: nan, 1: [269]},
 290: {0: [269], 1: nan}}

I would like to remove the nans

Advertisement

Answer

You could use pivot_table and to_dict:

(df.pivot_table(index='Head', columns='Relation', values='Tail', aggfunc=list)
   .to_dict('index')
)

Or, the other way around:

(df.pivot_table(index='Relation', columns='Head', values='Tail', aggfunc=list)
   .to_dict()
)

output:

{9: {0: [23, 12], 1: [0]}, 10: {0: [66, 61], 1: [11, 61]}}
post-processing the output to remove NaNs:
d = (df.pivot_table(columns='Head', index='Relation', values='Tail', aggfunc=list)
     .to_dict()
     )

d2 = {k: {k2:v2 for k2,v2 in v.items() if pd.isna(v2) is not True}
      for k,v in d.items()}
User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement