
Pandas groupby – Find mean of first 10 items

I have 30 items in each group.

To find the mean of all the items in each group, I use this code:

y = df[["Value", "Date"]].groupby("Date").mean()

That returns a result like this:

Date                  Value
       
2020-01-01 00:30:00   7172.36
2020-01-01 01:00:00   7171.55
2020-01-01 01:30:00   7205.90
2020-01-01 02:00:00   7210.24
2020-01-01 02:30:00   7221.50

However, I would like to find the mean of only the first 10 items in each group, not of all of them.

y1 = df[["Value", "Date"]].groupby("Date").head(10).mean()

That code returns only a single value instead of a pandas Series.

So I'm getting errors like this:

AttributeError: 'numpy.float64' object has no attribute 'shift'

What is the proper way to get a pandas Series (one mean per group) instead of a single value?

Answer

You can try applying the calculation to each group separately:

y1 = df[["Value", "Date"]].groupby("Date").apply(lambda g: g['Value'].head(10).mean())
print(y1)

Date
2020-01-01 00:30:00    7172.36
2020-01-01 01:00:00    7171.55
2020-01-01 01:30:00    7205.90
2020-01-01 02:00:00    7210.24
2020-01-01 02:30:00    7221.50
dtype: float64

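In .groupby("Date").head(10).mean(), groupby.head(10) returns an ordinary DataFrame containing the first 10 rows of every group, and the grouping is no longer attached to it. The following .mean() is therefore computed over all of those rows together rather than once per group, so you do not get one mean per Date.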
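
To make the difference concrete, here is a minimal sketch on made-up data. The two timestamps and the integer values are assumptions for illustration only, not the data from the question. It shows that head(10) flattens the groups, and two ways that should give one mean per Date: grouping again after head(10), or applying head(10).mean() inside each group.

```python
import pandas as pd

# Hypothetical data: 2 groups ("Date" values) with 30 rows each
dates = pd.to_datetime(["2020-01-01 00:30:00", "2020-01-01 01:00:00"])
df = pd.DataFrame({
    "Date": dates.repeat(30),                          # each timestamp repeated 30 times
    "Value": list(range(30)) + list(range(100, 130)),  # 0..29, then 100..129
})

# head(10) keeps the first 10 rows of each group but returns a plain DataFrame
first10 = df.groupby("Date").head(10)

# Taking the mean here averages all 20 remaining rows together: one scalar
flat_mean = first10["Value"].mean()

# Option 1: group again, so the mean is computed once per Date
y1 = first10.groupby("Date")["Value"].mean()

# Option 2: take the first 10 values and average them inside each group
y2 = df.groupby("Date")["Value"].apply(lambda s: s.head(10).mean())

print(flat_mean)
print(y1)
print(y2)
```

Both y1 and y2 are Series indexed by Date, so a later call such as y1.shift(1) works, unlike on the single float that the flat mean produces.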

User contributions licensed under: CC BY-SA