This is how my data looks like:
I want to aggregate this by category, Issue and Date(weekly) to get count of record.
Date: Group, the result should be monday to sunday
Count: ADD, if two or more records have same Name and fall in a same week Date time (if falls on same interval 7 days week)
The desired output is given below:
Category | Issue | Date |
---|---|---|
Bakes | Back Brake failures | 11/28/2022 |
Machines | Oiling of the machines | 11/29/2022 |
Cars | windscreen broken | 11/30/2022 |
Cars | steering wheel is shaking | 11/30/2022 |
Bakes | The tyres are flat | 12/01/2022 |
Machines | Normal wear and tear | 12/02/2022 |
Machines | Normal wear and tear | 12/02/2022 |
Cars | warning lights are on | 12/03/2022 |
Bakes | Back Brake failures | 12/04/2022 |
Machines | Oiling of the machines | 12/05/2022 |
Cars | windscreen broken | 12/06/2022 |
Bakes | excessive emissions | 12/06/2022 |
Bakes | The tyres are flat | 12/07/2022 |
Machines | Normal wear and tear | 12/08/2022 |
Cars | warning lights are on | 12/09/2022 |
Bakes | Brake pads worn | 12/10/2022 |
Machines | Machine is consuming too much oil | 12/11/2022 |
Cars | the tyres are wearing unevenly | 12/11/2022 |
I want to have a result similar to this:
if possible the count should be sorted in descending order
Thanks in advance for your help.
Advertisement
Answer
You can use a crosstab
with a weekly period:
JavaScript
x
5
1
pd.crosstab([df['Category'], df['Issue']],
2
pd.to_datetime(df['Date']).dt.to_period('W'),
3
margins=True, margins_name='Grand Total'
4
)
5
Output:
JavaScript
1
15
15
1
Date 2022-11-28/2022-12-04 2022-12-05/2022-12-11 Grand Total
2
Category Issue
3
Bakes Back Brake failures 2 0 2
4
Brake pads worn 0 1 1
5
The tyres are flat 1 1 2
6
excessive emissions 0 1 1
7
Cars steering wheel is shaking 1 0 1
8
the tyres are wearing unevenly 0 1 1
9
warning lights are on 1 1 2
10
windscreen broken 1 1 2
11
Machines Machine is consuming too much oil 0 1 1
12
Normal wear and tear 2 1 3
13
Oiling of the machines 1 1 2
14
Grand Total 9 9 18
15