I have a dataframe made up of dummy car purchases across a year which looks like:
df =
    purchase_date  brand
    2021-02-13     BMW
    2021-02-28     BMW
    2021-03-10     Audi
    2021-03-11     BMW
What I’m looking for is to get an aggregated count of each brand of car for each month in 2021, so it would look like this:
df =
               BMW  Audi
    (2021-02)    2     0
    (2021-03)    1     1
So far I’ve used this code to group by year and month, but I can’t split it to count individual brands:
    df = df.groupby([df['purchase_date'].dt.year.rename('year'), df['purchase_date'].dt.month.rename('month')]).agg({'count'})
This returns:
               ('brand','count')
    (2021-02)  2
    (2021-03)  2
Answer
Use crosstab with month periods:
    df1 = pd.crosstab(df['purchase_date'].dt.to_period('M').rename('year'), df['brand'])
    print(df1)
    brand    Audi  BMW
    year
    2021-02     0    2
    2021-03     1    1
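For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the sample rows from the question and that `purchase_date` is already (or can be converted to) a datetime column, since the `.dt` accessor needs that; the construction of `df` is my own reconstruction, not something shown in the question.

```python
import pandas as pd

# Sample data copied from the question. purchase_date is converted to
# datetime64 so that the .dt accessor is available (an assumption about
# how the original frame was built).
df = pd.DataFrame({
    "purchase_date": pd.to_datetime(
        ["2021-02-13", "2021-02-28", "2021-03-10", "2021-03-11"]
    ),
    "brand": ["BMW", "BMW", "Audi", "BMW"],
})

# Rows: month periods, columns: brands, values: number of purchases.
df1 = pd.crosstab(
    df["purchase_date"].dt.to_period("M").rename("year"),
    df["brand"],
)
print(df1)
```

The index here is a `PeriodIndex`, so a single month can be looked up with something like `df1.loc[pd.Period("2021-02", "M")]`.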
Your solution works if you add the brand column to the grouping, aggregate with GroupBy.size, and reshape with Series.unstack:
    df2 = (df.groupby([df['purchase_date'].dt.year.rename('year'),
                       df['purchase_date'].dt.month.rename('month'), 'brand'])
             .size()
             .unstack(fill_value=0))
    print(df2)
    brand       Audi  BMW
    year month
    2021 2         0    2
         3         1    1
Alternative:
    df3 = (df.groupby([pd.Grouper(freq='MS', key='purchase_date'), 'brand'])
             .size()
             .unstack(fill_value=0))
    print(df3)
    brand          Audi  BMW
    purchase_date
    2021-02-01        0    2
    2021-03-01        1    1
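All of the approaches above only produce rows for months that actually contain a purchase. If the goal is a row for every month in 2021, one possible sketch is to reindex the result against a full range of monthly periods. The names `counts`, `all_months` and `full_year` are hypothetical, and the sample frame is rebuilt from the question's rows.

```python
import pandas as pd

# Sample data from the question, with purchase_date as datetime64.
df = pd.DataFrame({
    "purchase_date": pd.to_datetime(
        ["2021-02-13", "2021-02-28", "2021-03-10", "2021-03-11"]
    ),
    "brand": ["BMW", "BMW", "Audi", "BMW"],
})

# Month-by-brand counts, as in the crosstab solution.
counts = pd.crosstab(df["purchase_date"].dt.to_period("M"), df["brand"])

# Every month of 2021 as a PeriodIndex; months without purchases get 0.
all_months = pd.period_range("2021-01", "2021-12", freq="M")
full_year = counts.reindex(all_months, fill_value=0)
print(full_year)
```

Reindexing keeps the existing counts and fills the missing months with zeros, so the column totals are unchanged.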