Skip to content
Advertisement

determine chain of predecessors and successor from a list of first predecessor in python

I have a list like the following

+----+-------------------+
| id | first_predecessor |
+----+-------------------+
|  0 | 4                 |
|  1 | 5                 |
|  2 | 6                 |
|  3 | 17,18             |
|  4 | 7                 |
|  5 | 8                 |
|  6 | 9                 |
|  7 | 10,11,12          |
|  8 | 13,14,15          |
|  9 | 16                |
| 10 | Input             |
| 11 | Input             |
| 12 | Input             |
| 13 | Input             |
| 14 | Input             |
| 15 | Input             |
| 16 | Input             |
| 17 | 19                |
| 18 | 20                |
| 19 | 21                |
| 20 | 22                |
| 21 | Input             |
+----+-------------------+

One item can have multiple immediate incoming ids, like in case of id=3, which is imediately preceeded by id=17 and id=18. I need a python code to determine this result by following the chain of predecessors both ways: (it is best to read the column all_successors to understand the logic, all_predecessors is the same logic backwards)

+----+-------------------+------------------+----------------+
| id | first_predecessor | all_predecessors | all_successors |
+----+-------------------+------------------+----------------+
|  0 | 4                 | 4,7,10,11,12     |                |
|  1 | 5                 | 5,8,13,14,15     |                |
|  2 | 6                 | 6,9,16           |                |
|  3 | 17,18             | 19,21,20,22      |                |
|  4 | 7                 | 7,10,11,12       | 0              |
|  5 | 8                 | 8,13,14,15       | 1              |
|  6 | 9                 | 9,16             | 2              |
|  7 | 10,11,12          | 10,11,12         | 0,4            |
|  8 | 13,14,15          | 13,14,15         | 1,5            |
|  9 | 16                | 16               | 2,6            |
| 10 | Input             |                  | 0,4,7          |
| 11 | Input             |                  | 0,4,7          |
| 12 | Input             |                  | 0,4,7          |
| 13 | Input             |                  | 1,5,8          |
| 14 | Input             |                  | 1,5,8          |
| 15 | Input             |                  | 1,5,8          |
| 16 | Input             |                  | 2,6,9          |
| 17 | 19                | 19,21            | 3              |
| 18 | 20                | 20,22            | 3              |
| 19 | 21                | 21               | 3,17           |
| 20 | 22                | 22               | 3,18           |
| 21 | Input             |                  | 3,17,19        |
| 22 | Input             |                  | 3,18,20        |
+----+-------------------+------------------+----------------+

I need some kind of recursive solution, or should I use some graph package?

Advertisement

Answer

You can use the following functions to find all predecessors and all successors.

To run the following example, make sure you change INPUT in your id column to NaN.

df_ = df.copy()

df_['first_predecessor'] = df_['first_predecessor'].str.split(',')
df_ = df_.explode('first_predecessor')
df_['first_predecessor'] = df_['first_predecessor'].fillna(-1).astype(int)

G = nx.from_pandas_edgelist(df_, 'first_predecessor', 'id', create_using=nx.DiGraph())
G.remove_node(-1)

df['all_predecessors'] = df['id'].apply(lambda x: ','.join(map(str, sorted(nx.ancestors(G, x)))))
df['all_successors'] = df['id'].apply(lambda x: ','.join(map(str, sorted(nx.descendants(G, x)))))
print(df)

    id first_predecessor   all_predecessors all_successors
0    0                 4       4,7,10,11,12               
1    1                 5       5,8,13,14,15               
2    2                 6             6,9,16               
3    3             17,18  17,18,19,20,21,22               
4    4                 7         7,10,11,12              0
5    5                 8         8,13,14,15              1
6    6                 9               9,16              2
7    7          10,11,12           10,11,12            0,4
8    8          13,14,15           13,14,15            1,5
9    9                16                 16            2,6
10  10               NaN                             0,4,7
11  11               NaN                             0,4,7
12  12               NaN                             0,4,7
13  13               NaN                             1,5,8
14  14               NaN                             1,5,8
15  15               NaN                             1,5,8
16  16               NaN                             2,6,9
17  17                19              19,21              3
18  18                20              20,22              3
19  19                21                 21           3,17
20  20                22                 22           3,18
21  21               NaN                           3,17,19
User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement