I have a Series, called ‘scores’, with a datetime index.
I wish to subset it by quarter
and year
pseudocode: series.loc['q2 of 2013']
Attempts so far:
s.dt.quarter
AttributeError: Can only use .dt accessor with datetimelike values
s.index.dt.quarter
AttributeError: ‘DatetimeIndex’ object has no attribute ‘dt’
This works (inspired by this answer), but I can’t believe it is the right way to do this in Pandas:
d = pd.DataFrame(s)
d['date'] = pd.to_datetime(d.index)
d.loc[(d['date'].dt.quarter == 2) & (d['date'].dt.year == 2013)]['scores']
I expect there is a way to do this without transforming into a dataset, forcing the index into datetime, and then getting a Series from it.
What am I missing, and what is the elegant way to do this on a Pandas series?
Advertisement
Answer
import numpy as np import pandas as pd index = pd.date_range('2013-01-01', freq='M', periods=12) s = pd.Series(np.random.rand(12), index=index) print(s) # 2013-01-31 0.820672 # 2013-02-28 0.994890 # 2013-03-31 0.928376 # 2013-04-30 0.848532 # 2013-05-31 0.122263 # 2013-06-30 0.305741 # 2013-07-31 0.088432 # 2013-08-31 0.647288 # 2013-09-30 0.640308 # 2013-10-31 0.737139 # 2013-11-30 0.233656 # 2013-12-31 0.245214 # Freq: M, dtype: float64 d = pd.Series(s.index, index=s.index) quarter = d.dt.quarter.astype(str) + 'Q' + d.dt.year.astype(str) print(quarter) # 2013-01-31 1Q2013 # 2013-02-28 1Q2013 # 2013-03-31 1Q2013 # 2013-04-30 2Q2013 # 2013-05-31 2Q2013 # 2013-06-30 2Q2013 # 2013-07-31 3Q2013 # 2013-08-31 3Q2013 # 2013-09-30 3Q2013 # 2013-10-31 4Q2013 # 2013-11-30 4Q2013 # 2013-12-31 4Q2013 # Freq: M, dtype: object print(s[quarter == '1Q2013']) # 2013-01-31 0.124398 # 2013-02-28 0.052828 # 2013-03-31 0.126374 # Freq: M, dtype: float64
If you don’t want to create a new Series that holds a label for each quarter (e.g., if you are subsetting just once), you could even do
print(s[(s.index.quarter == 1) & (s.index.year == 2013)]) # 2013-01-31 0.124398 # 2013-02-28 0.052828 # 2013-03-31 0.126374 # Freq: M, dtype: float64