Python

I have 30 items in each group.

To find the mean of all items in each group, I use this code.

y = df[["Value", "Date"]].groupby("Date").mean()

That returns a result like this.

Date                  Value
       
2020-01-01 00:30:00   7172.36
2020-01-01 01:00:00   7171.55
2020-01-01 01:30:00   7205.90
2020-01-01 02:00:00   7210.24
2020-01-01 02:30:00   7221.50

However, I would like to find the mean of only the first 10 items in each group instead of all 30.

y1 = df[["Value", "Date"]].groupby("Date").head(10).mean()

That code returns only a single value instead of a pandas Series.

So I’m getting an error like this.

AttributeError: 'numpy.float64' object has no attribute 'shift'

What is the proper way to get a pandas Series with one mean per group instead of a single value?

Answer

You can try

y1 = df[["Value", "Date"]].groupby("Date").apply(lambda g: g['Value'].head(10).mean())

print(y1)

Date
2020-01-01 00:30:00    7172.36
2020-01-01 01:00:00    7171.55
2020-01-01 01:30:00    7205.90
2020-01-01 02:00:00    7210.24
2020-01-01 02:30:00    7221.50
dtype: float64
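
If you want to check the behaviour on data you can verify by hand, here is a minimal sketch. The frame is made up (3 rows per group instead of 30, and n = 2 standing in for 10), and it selects the Value column on the groupby before calling apply, which is a small variation on the line above rather than the exact same call.

```python
import pandas as pd

# Made-up data: two timestamps with three rows each (the real data has 30 per group).
df = pd.DataFrame({
    "Date": pd.to_datetime(["2020-01-01 00:30:00"] * 3 + ["2020-01-01 01:00:00"] * 3),
    "Value": [1.0, 3.0, 100.0, 10.0, 20.0, 300.0],
})

n = 2  # hypothetical stand-in for the 10 in the question

# For each Date group, take the first n values and average them.
y1 = df.groupby("Date")["Value"].apply(lambda s: s.head(n).mean())
print(y1)

# y1 is a Series indexed by Date, so Series methods such as shift() are available.
print(y1.shift(1))
```

The third row of each group (100.0 and 300.0) is deliberately large, so it is easy to see that it is left out of the per-group means.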

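In .groupby("Date").head(10).mean(), groupby.head() returns an ordinary DataFrame holding the first 10 rows of every group, so the grouping is gone by the time .mean() runs. The mean is then computed over that whole DataFrame rather than once per group.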
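
To see this for yourself, here is a small sketch on made-up data (3 rows per group, n = 2 in place of 10). It also shows one possible alternative to apply: group the result of head() by Date a second time. The variable names first_n, overall and per_group are only illustrative.

```python
import pandas as pd

# Made-up data: two timestamps with three rows each.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2020-01-01 00:30:00"] * 3 + ["2020-01-01 01:00:00"] * 3),
    "Value": [1.0, 3.0, 100.0, 10.0, 20.0, 300.0],
})
n = 2  # hypothetical stand-in for the 10 in the question

# head() hands back a plain DataFrame: the first n rows of each group, stacked.
first_n = df.groupby("Date").head(n)
print(type(first_n).__name__, len(first_n))

# Averaging that frame's Value column mixes all groups into one scalar.
overall = first_n["Value"].mean()
print(overall)

# Grouping the trimmed frame again restores one mean per Date.
per_group = first_n.groupby("Date")["Value"].mean()
print(per_group)
```

Either approach should give the same Series here. The second groupby avoids a Python-level lambda, which may matter on large frames.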

Pandas groupby – Find mean of first 10 items
