What am I doing wrong?
This is all the code needed to reproduce.
import pandas as pd
g = pd.Grouper('datetime', freq='D')
Result:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In [1], line 2
      1 import pandas as pd
----> 2 g = pd.Grouper('datetime', freq='D')

TypeError: TimeGrouper.__init__() got multiple values for argument 'freq'
Pandas version 1.5.1, Python version 3.10.6.
Answer
This seems to be a bug
It looks like the weirdness is because Grouper.__new__() instantiates a TimeGrouper if you pass freq as a kwarg, but not if you pass freq as a positional argument. I don’t know why it does that, and it’s not documented, so it seems like a bug.
The reason for the error is that TimeGrouper.__init__()’s first parameter is freq, not key. (In fact it doesn’t even have a key parameter; it specifies **kwargs, which are then sent to Grouper.__init__() at the end.)
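To make the mechanism concrete, here is a minimal sketch that imitates the dispatch without importing pandas. The two classes below are simplified stand-ins that I wrote to mirror the behaviour described above; they are not the real pandas source, and the parameter lists are assumptions trimmed to the essentials.

```python
# Toy stand-ins for pandas.Grouper and TimeGrouper (hypothetical, simplified).
class Grouper:
    def __new__(cls, *args, **kwargs):
        # The dispatch only inspects keyword arguments, so a positional
        # freq never triggers the switch to TimeGrouper.
        if kwargs.get("freq") is not None:
            cls = TimeGrouper
        return super().__new__(cls)

    def __init__(self, key=None, level=None, freq=None):
        self.key = key
        self.level = level
        self.freq = freq


class TimeGrouper(Grouper):
    # The first positional parameter is freq, not key.
    def __init__(self, freq="Min", **kwargs):
        super().__init__(freq=freq, **kwargs)


# Positional key plus keyword freq: 'datetime' lands in the freq slot,
# which then collides with freq='D'.
error_message = None
try:
    Grouper("datetime", freq="D")
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# All keywords: dispatches to TimeGrouper and initialises cleanly.
print(type(Grouper(key="datetime", freq="D")).__name__)

# All positional: no dispatch happens, so this stays a plain Grouper.
print(type(Grouper("datetime", None, "D")).__name__)
```

The last two lines show why the way the arguments are passed matters: the same values produce different classes depending on whether freq is a keyword.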
Workaround
Pass key as a kwarg too:
g = pd.Grouper(key='datetime', freq='D')
All-positional syntax is also broken
In cottontail’s answer, they suggested using positional arguments, with None for level, but this doesn’t work fully. For example, you can’t specify an origin:
pd.Grouper('datetime', None, 'D', origin='epoch')
TypeError: __init__() got an unexpected keyword argument 'origin'
(I’m using Python 3.9 and Pandas 1.4.4, so the error might look a bit different, but the same error should occur on your version.)
Even worse, the resulting Grouper doesn’t actually group by day. For example:
df = pd.DataFrame({
    'datetime': pd.to_datetime([
        '2022-11-29T15', '2022-11-30T15', '2022-11-30T16']),
    'v': [1, 2, 3]})
>>> g = pd.Grouper('datetime', None, 'D')
>>> df.groupby(g).sum()
                     v
datetime
2022-11-29 15:00:00  1
2022-11-30 15:00:00  2
2022-11-30 16:00:00  3
Compared to:
>>> g1 = pd.Grouper(key='datetime', freq='D')
>>> df.groupby(g1).sum()
            v
datetime
2022-11-29  1
2022-11-30  5
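If you want to confirm which kind of object you ended up with, checking the class name is a quick test. The sketch below uses only the all-keyword form, since that is the one I would expect to behave the same across pandas versions; the variable names daily and g_origin are mine, and I have not checked every release.

```python
import pandas as pd

df = pd.DataFrame({
    'datetime': pd.to_datetime([
        '2022-11-29T15', '2022-11-30T15', '2022-11-30T16']),
    'v': [1, 2, 3]})

# Keyword arguments dispatch to the time-aware grouper.
g1 = pd.Grouper(key='datetime', freq='D')
print(type(g1).__name__)

# Rows are binned per calendar day.
daily = df.groupby(g1).sum()
print(daily)

# Time-specific options such as origin are accepted in the keyword form.
g_origin = pd.Grouper(key='datetime', freq='D', origin='epoch')
print(type(g_origin).__name__)
```

If the class name printed is Grouper rather than TimeGrouper, the freq was not picked up as a time frequency and the groupby will fall back to grouping on the raw column values.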