Is there an easy way to establish a hierarchy between entities using only 2 ID fields?

Tags: , , ,



I have a table with 2 fields like so:

Account_ID Parent_ID
x y
x1 y
x2 y
y z
y1 z
y2 z
z z
z a
z1 a
a a
b b

The IDs fields are both in int64 format. The first field represents accounts which could be controlled by a parent account which could be itself controlled by another parent account. For example: A controls Z who controls Y who controls X.

Not sure if this is relevant but for full context, it is possible for a parent to not actually use their own accounts for personal purpose and only use it to manage their children accounts. For example: Y falls under this scenario because it does not have a row relating it to itself like Z & A.

I am trying to create a field like in the table below that would easily highlight the structure of that hierarchy:

Account_ID Parent_ID Hierarchy
x y x/y/z/a
x1 y x1/y/z/a
x2 y x2/y/z/a
y z x/y/z/a
y1 z y1/z/a
y2 z y2/z/a
z z x/y/z/a
z a x/y/z/a
z1 a z1/a
a a x/y/z/a
b b b

The one & two tiered ones are fairly simple with a concatenate, but I was wondering if there is an elegant way to populate the multi-tiered ones like “x/y/z/a”?

Answer

This does the trick. Advantageously it only relies on pandas and a small function.

def add_hierarchy(df, s, tier):
    df['Hierarchy'] = df['Account_ID']
    for i in range(tier):
        next_tier = s.apply(lambda x: df[df['Account_ID'] == x].iloc[0])
        df['Hierarchy'] += '/' +  next_tier['Account_ID']
        s = next_tier['Parent_ID']



add_hierarchy(df, df['Parent_ID'], tier=3)

Probably you may want to add some sort of nice format afterwards. However, it returns the correct hierarchies

  Account_ID Parent_ID Hierarchy
0          x         y   x/y/z/a
1         x1         y  x1/y/z/a
2         x2         y  x2/y/z/a
3          y         z   y/z/a/a
4         y1         z  y1/z/a/a
5         y2         z  y2/z/a/a
6          z         a   z/a/a/a
7         z2         a  z2/a/a/a
8          a         a   a/a/a/a
9          b         b   b/b/b/b

(Please check your output. z has two parents given. I assumed z to have only one parent to be deterministic as is also indicated in your example output.)



Source: stackoverflow