I want to convert a datetime series to seasons. For example, months 3, 4, 5 should be replaced with 2 (spring), months 6, 7, 8 with 3 (summer), and so on.
So, I have this series
    id
    1   2011-08-20
    2   2011-08-23
    3   2011-08-27
    4   2011-09-01
    5   2011-09-05
    6   2011-09-06
    7   2011-09-08
    8   2011-09-09
    Name: timestamp, dtype: datetime64[ns]
and this is the code I have been trying to use, but to no avail.
    # Get seasons
    spring = range(3, 5)
    summer = range(6, 8)
    fall = range(9, 11)
    # winter = everything else

    month = temp2.dt.month
    season=[]

    for _ in range(len(month)):
        if any(x == spring for x in month):
            season.append(2) # spring
        elif any(x == summer for x in month):
            season.append(3) # summer
        elif any(x == fall for x in month):
            season.append(4) # fall
        else:
            season.append(1) # winter
and
    for _ in range(len(month)):
        if month[_] == 3 or month[_] == 4 or month[_] == 5:
            season.append(2) # spring
        elif month[_] == 6 or month[_] == 7 or month[_] == 8:
            season.append(3) # summer
        elif month[_] == 9 or month[_] == 10 or month[_] == 11:
            season.append(4) # fall
        else:
            season.append(1) # winter
Neither solution works. The first implementation raises an error:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
The second one produces a large list with errors. Any ideas, please? Thanks
Answer
You can use a simple mathematical formula to compress a month to a season, e.g.:
    >>> [month%12 // 3 + 1 for month in range(1, 13)]
    [1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 1]
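The formula works because `month % 12` maps December to 0, which places it next to January and February, and integer division by 3 then splits the twelve months into four blocks of three. Here is a minimal sketch in plain Python that spells this out; the function name `month_to_season` and the `SEASON_NAMES` lookup are made up for illustration and are not part of the question:

```python
from datetime import date

# Hypothetical lookup, using the question's numbering: 1 = winter ... 4 = fall.
SEASON_NAMES = {1: "winter", 2: "spring", 3: "summer", 4: "fall"}


def month_to_season(month: int) -> int:
    """Map a month number (1-12) to a season number (1-4)."""
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    # Dec -> 0, Jan -> 1, Feb -> 2, so winter becomes the first block of three.
    return month % 12 // 3 + 1


# Season number for every month of the year, January through December.
seasons = [month_to_season(m) for m in range(1, 13)]

# Season name for the first date in the question's series.
example = SEASON_NAMES[month_to_season(date(2011, 8, 20).month)]
```

The range check is an addition for safety; the one-line formula itself accepts any integer and would silently return a value for an invalid month.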
So for your use-case using vector operations (credit @DSM):
    >>> temp2.dt.month%12 // 3 + 1
    1    3
    2    3
    3    3
    4    4
    5    4
    6    4
    7    4
    8    4
    Name: id, dtype: int64
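For anyone who wants to try this end to end, the following is a rough, self-contained sketch that rebuilds the series from the question and applies the same vectorized expression. It assumes pandas is installed; the final step that maps numbers to names is an optional extra with an illustrative mapping, not something the question asked for:

```python
import pandas as pd

# Sample dates copied from the question; temp2 is the asker's variable name.
raw = pd.Series(
    [
        "2011-08-20", "2011-08-23", "2011-08-27", "2011-09-01",
        "2011-09-05", "2011-09-06", "2011-09-08", "2011-09-09",
    ],
    index=pd.Index(range(1, 9), name="id"),
    name="timestamp",
)
temp2 = pd.to_datetime(raw)

# Vectorized: the arithmetic is applied to the whole month column at once,
# so no Python-level loop or if/elif chain is needed.
season = temp2.dt.month % 12 // 3 + 1

# Optional extra (illustrative mapping): turn season numbers into labels.
season_name = season.map({1: "winter", 2: "spring", 3: "summer", 4: "fall"})
```

Because the result is itself a Series aligned on the same index, it can be assigned straight back to a DataFrame column if the dates live in one.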