Pandas conditional counting by date

For each order, I want to count how many orders that customer had placed up to and including that order's date, so I know how many orders existed at the time of each order.

Input:

customer_id  order_id   order_date
27501235     163595958  2018-12-14
27501235     165810252  2019-01-05

Expected output:

[image: expected output]

My current code works but is extremely slow, taking upwards of 10 hours for 100k+ rows. There must be a better way.

Answer

Use sort_values to put the dates in ascending order, then groupby cumcount to number each customer's orders in that order. cumcount starts at 0, so add 1 if the first order should count as 1.

Complete Working Example:
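
A minimal sketch of the sort_values plus groupby cumcount approach. The first two rows come from the question; the other rows, the second customer, and the column name order_count are made up for illustration. It also assumes order_date should be parsed with pd.to_datetime so that sorting is chronological.

```python
import pandas as pd

# Sample data: two rows from the question plus hypothetical extra rows,
# deliberately out of date order.
df = pd.DataFrame({
    "customer_id": [27501235, 27501235, 11111111, 27501235, 11111111],
    "order_id": [165810252, 163595958, 100000001, 170000003, 100000002],
    "order_date": ["2019-01-05", "2018-12-14", "2019-02-01",
                   "2019-03-20", "2019-01-15"],
})
df["order_date"] = pd.to_datetime(df["order_date"])

# Sort so each customer's orders are in ascending date order.
df = df.sort_values(["customer_id", "order_date"]).reset_index(drop=True)

# cumcount numbers the rows of each group 0, 1, 2, ... in their current
# order; adding 1 turns that into a running order count per customer.
df["order_count"] = df.groupby("customer_id").cumcount() + 1

print(df)
```

Both steps are vectorized, so this should scale far better than a row-by-row loop. Note that two orders by the same customer on the same date still get different counts here.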

Edit: if orders placed on the same date should get the same value within each customer group, use groupby rank on order_date instead of cumcount.
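
A sketch of the rank variant, under some assumptions: the tie-handling method is not stated above, so two options are shown side by side. The same-date order 165810253, the order 170000003, and the column names orders_to_date and date_rank are hypothetical.

```python
import pandas as pd

# First two rows are from the question; the last two are hypothetical,
# including a second order on 2019-01-05 to create a tie.
df = pd.DataFrame({
    "customer_id": [27501235, 27501235, 27501235, 27501235],
    "order_id": [163595958, 165810252, 165810253, 170000003],
    "order_date": pd.to_datetime(
        ["2018-12-14", "2019-01-05", "2019-01-05", "2019-03-20"]
    ),
})

by_customer = df.groupby("customer_id")["order_date"]

# method="max": tied dates share the highest position in the tie, i.e. the
# number of orders placed up to and including that date.
df["orders_to_date"] = by_customer.rank(method="max").astype(int)

# method="dense": tied dates share a position and no numbers are skipped,
# i.e. the number of distinct order dates so far.
df["date_rank"] = by_customer.rank(method="dense").astype(int)

print(df)
```

Which method fits depends on whether the value should count orders or distinct dates. rank does not require sorting the frame first, so the original row order is kept.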

User contributions licensed under: CC BY-SA