I have a dataframe structured like this:
Time      Z   X   Y
01-01-18  1   20  10
02-01-18  20  4   15
03-01-18  34  16  21
04-01-18  67  38  8
05-01-18  89  10  18
06-01-18  45  40  4
07-01-18  22  10  13
08-01-18  1   46  11
...
24-12-20  56  28  9
25-12-20  6   14  22
26-12-20  9   5   40
27-12-20  56  11  10
28-12-21  78  61  35
29-12-21  33  23  29
30-12-21  2   35  12
31-12-21  0   31  7
I have data for every day of every month from 2018 to 2021, around 50k observations in total.
How can I group all the data belonging to the same calendar month (all the Januaries together, all the Februaries together, and so on) and perform a train-test split on each month's group?
Answer
Try this to pull the month out of each date string:
df['month'] = df.Time.apply(lambda x: x.split('-')[1])  # middle field of dd-mm-yy, e.g. '01' for January
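That line only adds the month column. One possible way to finish, as a rough sketch, is to group on that column and split each group separately. The 80/20 ratio, the fixed seed, the `splits` dictionary, and the February rows in the toy data are assumptions for illustration and are not from the question; it also assumes `Time` holds `dd-mm-yy` strings.

```python
import pandas as pd

# Toy frame in the question's layout. The January rows come from the question;
# the February rows are made up so that there is more than one month to split.
df = pd.DataFrame({
    "Time": ["01-01-18", "02-01-18", "03-01-18", "04-01-18", "05-01-18",
             "01-02-18", "02-02-18", "03-02-18", "04-02-18", "05-02-18"],
    "Z": [1, 20, 34, 67, 89, 45, 22, 1, 56, 6],
    "X": [20, 4, 16, 38, 10, 40, 10, 46, 28, 14],
    "Y": [10, 15, 21, 8, 18, 4, 13, 11, 9, 22],
})

# Month as a two-character string: the middle field of dd-mm-yy.
df["month"] = df["Time"].str.split("-").str[1]

# Hypothetical container: month -> (train, test) pair of DataFrames.
splits = {}
for month, group in df.groupby("month"):
    # Randomly pick 80% of this month's rows for training (assumed ratio).
    train = group.sample(frac=0.8, random_state=0)
    # The rows that were not sampled form the test set.
    test = group.drop(train.index)
    splits[month] = (train, test)
```

With this, `splits['01']` holds the train and test frames for all January rows across every year. If scikit-learn is available, `train_test_split(group, test_size=0.2)` could stand in for the `sample`/`drop` pair.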