Skip to content
Advertisement

Pandas extract previous row on value change

I have two columns of data ‘Part & Qty’ where some part numbers are repeated across multiple rows. The quantity value I need to retrieve is contained in the final row before the part number changes.

My code (below) adds a True/False column to flag when the part number changes. I had thought when the flag is ‘True’ I want to retrieve the data from the previous row, however this does not work for the first and last rows.

Running my code gives the output (left) vs the data I’m trying to extract is(right):

output enter image description here

What is the best way to achieve this?

    import pandas as pd
    import numpy as np
    df = pd.DataFrame({
        'part_no_2': [22, 22, 22, 23, 23, 24, 25, 25, 25, 26],
        'qty': [0, 0, 4, 44, 22, 0, 7, 16, 5, 6]})

    df['part_no_change'] = np.where(df["part_no_2"].shift() != df["part_no_2"], True, False) #look for when PNo changes

    df

Advertisement

Answer

Try shift(-1):

df[df.part_no_2 != df.part_no_2.shift(-1)]

   part_no_2  qty
2         22    4
4         23   22
5         24    0
8         25    5
9         26    6
User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement