I need to update vid, or maybe create a new column, based on the change column.
import pandas as pd
df = pd.DataFrame([{'vid': 14, 'change': 0}, {'vid': 15, 'change': 1}, {'vid': 16, 'change': 0}, {'vid': 16, 'change': 0}, {'vid': 17, 'change': 0}, {'vid': 17, 'change': 1}, {'vid': 18, 'change': 0}, {'vid': 18, 'change': 0}])
   vid  change
0   14       0
1   15       1
2   16       0
3   16       0
4   17       0
5   17       1
6   18       0
7   18       0
If change == 1, the next set of vid values should be changed to the current vid; if change == 0, the next vid should stay the same.
In my example above, vid 16 needs to be changed to 15 since 15 has change = 1; however, 17 stays, since 16 doesn’t have change = 1.
change = 1 will only occur when vid is going to change in the next row.
Expected output
   vid  change
0   14       0
1   15       1
2   15       0
3   15       0
4   17       0
5   17       1
6   17       0
7   17       0
Answer
If you don’t mind renumbering all the replaced numbers to be sequential, you can get a pretty clean version of it by subtracting the shifted cumsum of change:
In [59]: df['new_vid'] = df['vid'] - df['change'].shift(fill_value=0).cumsum()
In [60]: df
Out[60]:
   vid  change  new_vid
0   14       0       14
1   15       1       15
2   16       0       15
3   16       0       15
4   17       0       16
5   17       1       16
6   18       0       16
7   18       0       16
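To see why this works, it may help to look at the intermediate values. The sketch below rebuilds the sample data and splits the one-liner into steps; the names shifted, offset and steps are just illustrative, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "vid":    [14, 15, 16, 16, 17, 17, 18, 18],
    "change": [0, 1, 0, 0, 0, 1, 0, 0],
})

# Move each flag down one row so it lines up with the first row it affects.
shifted = df["change"].shift(fill_value=0)

# Running count of flags seen so far: how much each vid has to be lowered.
offset = shifted.cumsum()

new_vid = df["vid"] - offset

# Side-by-side view of the intermediate columns (illustrative only).
steps = pd.DataFrame({
    "vid": df["vid"],
    "shifted": shifted,
    "offset": offset,
    "new_vid": new_vid,
})
print(steps)
```

Because offset only ever grows, every later vid is pulled down by the total number of flags before it, which is why the result is renumbered sequentially rather than keeping the original ids.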
Another option with the original numbering is to use map and groupby:
In [81]: df['vid'] = df['vid'] - df['vid'].map(df.groupby("vid")['change'].max().shift(fill_value=0))
In [82]: df
Out[82]:
   vid  change
0   14       0
1   15       1
2   15       0
3   15       0
4   17       0
5   17       1
6   17       0
7   17       0
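Here is the same idea split into steps, as a rough sketch on the sample data. It writes to a new_vid column (an illustrative name) so the original vid is kept, and flag_per_vid and prev_flag are likewise names made up for this breakdown. Note that it subtracts 1 from the flagged groups, so it assumes the previous vid is always exactly one less than the current one, as it is in the example.

```python
import pandas as pd

df = pd.DataFrame({
    "vid":    [14, 15, 16, 16, 17, 17, 18, 18],
    "change": [0, 1, 0, 0, 0, 1, 0, 0],
})

# One flag per vid: 1 if any row of that vid has change == 1.
flag_per_vid = df.groupby("vid")["change"].max()

# Shift by one group so each vid sees the flag of the vid just before it.
prev_flag = flag_per_vid.shift(fill_value=0)

# Look up the per-vid value for every row and subtract it.
# Assumption: consecutive vids differ by exactly 1.
df["new_vid"] = df["vid"] - df["vid"].map(prev_flag)

print(df)
```

Since the shift happens on the per-vid series rather than on the rows, each group is lowered by at most 1, which keeps the original numbering for groups that are not replaced.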