Skip to content
Advertisement

Determinate Consecutive Values (Invoices) Pandas

I have a dataset with supplier and its invoices and I need to determinate which of the invoices are consecutives marking it with a 1 or a 0.

For example:

df1 = pd.DataFrame()
df1['Supplier'] = ['ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC']
df1['Invoice'] = ['101','102','105','109','110','114','120','121','122','130','131']
   Supplier Invoice
0       ABC     101
1       ABC     102
2       ABC     105
3       ABC     109
4       ABC     110
5       ABC     114
6       ABC     120
7       ABC     121
8       ABC     122
9       ABC     130
10      ABC     131

And what I want is a third column like this:

   Supplier Invoice Consecutive
0       ABC     101           0
1       ABC     102           1
2       ABC     105           0
3       ABC     109           0
4       ABC     110           1
5       ABC     114           0
6       ABC     120           0
7       ABC     121           1
8       ABC     122           1
9       ABC     130           0
10      ABC     131           1

EDIT

Thanks for your answers, this options works great, but when I tried it in a real database I realized that I need to compare the suppliers, if the next row is a new supplier the consecutive must be 0.

For example:

   Supplier Invoice Consecutive
0       ABC     101           0
1       ABC     102           1
2       ABC     105           0
3       ABC     109           0
4       ABC     110           1
5       ABC     114           0
6       ABC     120           0
7       ABC     121           1
8       ABC     122           1
9       ABC     130           0
10      ABC     131           1
11      DEF     132           0
12      DEF     133           1
13      GHI     134           0

Thanks in advance!

Advertisement

Answer

using npwhere and diff

# take difference b/w consecutive rows using diff
# update to 0 when difference is not 1

df1['Consecutive']=(df1.groupby('Supplier')['Invoice']
                    .transform(lambda x: np.where(
                                           x.astype(int).diff() !=1, 0, 1)) )
df1
df1
    Supplier    Invoice     Consecutive
0   ABC     101     0
1   ABC     102     1
2   ABC     105     0
3   ABC     109     0
4   ABC     110     1
5   ABC     114     0
6   ABC     120     0
7   ABC     121     1
8   ABC     122     1
9   ABC     130     0
10  ABC     131     1
11  DEF     132     0
12  DEF     133     1
13  GHI     134     0
User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement