I have two columns of data ‘Part & Qty’ where some part numbers are repeated across multiple rows. The quantity value I need to retrieve is contained in the final row before the part number changes.
My code (below) adds a True/False column to flag when the part number changes. I had thought when the flag is ‘True’ I want to retrieve the data from the previous row, however this does not work for the first and last rows.
Running my code gives the output (left) vs the data I’m trying to extract is(right):
What is the best way to achieve this?
import pandas as pd import numpy as np df = pd.DataFrame({ 'part_no_2': [22, 22, 22, 23, 23, 24, 25, 25, 25, 26], 'qty': [0, 0, 4, 44, 22, 0, 7, 16, 5, 6]}) df['part_no_change'] = np.where(df["part_no_2"].shift() != df["part_no_2"], True, False) #look for when PNo changes df
Advertisement
Answer
Try shift(-1)
:
df[df.part_no_2 != df.part_no_2.shift(-1)] part_no_2 qty 2 22 4 4 23 22 5 24 0 8 25 5 9 26 6