# Count number of days in each continuous period pandas

Suppose I have next df `N03_zero` (`date_code` is already `datetime`):

```item_code       date_code
8028558104973   2022-01-01
8028558104973   2022-01-02
8028558104973   2022-01-03
8028558104973   2022-01-06
8028558104973   2022-01-07
7622300443269   2022-01-01
7622300443269   2022-01-10
7622300443269   2022-01-11
513082          2022-01-01
513082          2022-01-02
513082          2022-01-03
```

Millions of rows with `date_code` assigned to some `item_code`. I am trying to get the number of days of each continuous period for each `item_code`, all other similar questions doesn’t helped me. The expected df should be:

```item_code       continuous_days
8028558104973   3
8028558104973   2
7622300443269   1
7622300443269   2
513082          3
```

Once days sequence breaks, it should count days in this sequence and then start to count again. The aim is, to able to get then the dataframe with `count`, `min`, `max`, and `mean` for each `item_code`.

Like this:

```item_code       no. periods   min   max   mean
8028558104973   2             2     3     2.5
7622300443269   2             1     2     1.5
513082          1             3     3     3
```

Any suggestions?

For consecutive days compare difference by `Series.diff` in days by `Series.dt.days` for not equal `1` by `Series.ne` with cumulative sum by `Series.cumsum` and then use `GroupBy.size`, remove second level by `DataFrame.droplevel` and create `DataFrame`:

```df['date_code'] = pd.to_datetime(df['date_code'])

df1= (df.groupby(['item_code',df['date_code'].diff().dt.days.ne(1).cumsum()], sort=False)
.size()
.droplevel(1)
.reset_index(name='continuous_days'))
print (df1)
item_code  continuous_days
0  8028558104973                3
1  8028558104973                2
2  7622300443269                1
3  7622300443269                2
4         513082                3
```

And then aggregate values by named aggregations by `GroupBy.agg`:

```df2 = (df1.groupby('item_code', sort=False, as_index=False)
.agg(**{'no. periods': ('continuous_days','size'),
'min':('continuous_days','min'),
'max':('continuous_days','max'),
'mean':('continuous_days','mean')}))
print (df2)
item_code  no. periods  min  max  mean
0  8028558104973            2    2    3   2.5
1  7622300443269            2    1    2   1.5
2         513082            1    3    3   3.0
```
User contributions licensed under: CC BY-SA
1 People found this is helpful