Skip to content
Advertisement

Pandas union with parent ids in the same dataframe

I have a pandas dataframe that looks like this:

id | folder | level0_parent_id | level1_parent_id | level2_parent_id | level3_parent_id

1     A             0                    0                   0                 0
2     B             1                    0                   0                 0
3     C             1                    2                   0                 0
4     D             1                    2                   0                 0
5     E             1                    2                   4                 0

I want to return the folder structure for each line in a separate colunm, for example

  • line 1: A is the root there is no parent all ids are 0 = A
  • line 2: B is under A, id = 1, so the path is A/B
  • line 3: C is under id A=1 for level0 and under B=2 for level1, resulting = A/B/C
  • line 4: D is under id A=1 for level0 and under B=2 for level1, resulting = A/B/D
  • line 5: E is under id A=1 for level0 and under B=2 for level1 and under D=4 for level 2, resulting = A/B/D/E

and this go on and on, I can have multiple Roots meaning level0 = 0 and if level 0 is not 0 means that is not Root so it’s related to a Root …(obvious i think )

How can I do this with Pandas?

Advertisement

Answer

One approach with melt

df['structure'] = df['folder'].map(
    df.melt(['id', 'folder'], var_name='level')
      .assign(child=lambda x: x['value'].map(dict(zip(df['id'], df['folder'])))
                                        .fillna(x['folder']))
      .drop_duplicates(['folder', 'child'])
      .groupby('folder')['child'].agg('/'.join)
)
print(df)

Output

   id folder  level0_parent_id  level1_parent_id  level2_parent_id  
0   1      A                 0                 0                 0   
1   2      B                 1                 0                 0   
2   3      C                 1                 2                 0   
3   4      D                 1                 2                 0   
4   5      E                 1                 2                 4   

   level3_parent_id   structure  
0                 0           A  
1                 0         A/B  
2                 0       A/B/C  
3                 0       A/B/D  
4                 0     A/B/D/E 
User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement