Tag: time-series

Pandas lagged rolling average on aggregate data with multiple groups and missing dates

I’d like to calculate a lagged rolling average on a complicated time-series dataset. Consider the toy example as follows: This results in the following DataFrame: Now I’d like to add a column representing the average weight per fruit for the previous 7 days: wgt_per_frt_prev_7d. It should be defined as the sum of all the fruit weights divided by the sum
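The excerpt cuts off before the full definition, so the following is only a sketch under stated assumptions: the column names `date`, `kind`, `weight` and `n_fruit` are hypothetical, there is at most one row per group per date, and the denominator is assumed to be a fruit count. A time-based window with `closed="left"` covers the seven days before each row and excludes the row's own date, which copes with missing dates without reindexing.

```python
import pandas as pd

# Hypothetical toy data: one aggregate row per (kind, date), with gaps in the dates.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2022-01-01", "2022-01-02", "2022-01-03", "2022-01-04", "2022-01-12"]
    ),
    "kind": ["apple", "pear", "apple", "pear", "apple"],
    "weight": [10.0, 8.0, 30.0, 4.0, 5.0],
    "n_fruit": [2, 4, 3, 1, 1],
})

# Time-based rolling windows need dates in increasing order.
df = df.sort_values("date")

# Sum weight and count over the window [t - 7 days, t) within each group.
rolled = (
    df.set_index("date")
      .groupby("kind")[["weight", "n_fruit"]]
      .rolling("7D", closed="left")
      .sum()
)

# Ratio of sums; rows with no earlier data in the window come out as NaN.
ratio = (rolled["weight"] / rolled["n_fruit"]).rename("wgt_per_frt_prev_7d")

# Attach the result to the original rows via the (kind, date) keys.
out = df.join(ratio, on=["kind", "date"])
print(out)
```

Dividing the two rolling sums, rather than averaging per-row ratios, weights each day by how many fruit it contributed.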

Pivot matrix to time-series – Python

I’ve got a dataframe with the date as the first column and times as the names of the other columns.

Date        13:00  14:00  15:00  16:00  …
2022-01-01  B      R      M      M      …
2022-01-02  B      B      B      M      …
2022-01-03  R      B      B      M      …

How could I transform that matrix into a datetime time-series? My objective is something like this: Date Data
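One possible approach, sketched here on the visible part of the sample only: melt the time columns into rows, then build a timestamp from the date and time strings. The names `wide`, `long`, `Time` and `series` are illustrative, and the target layout is assumed to be a single `Data` column indexed by timestamp, since the excerpt is cut off.

```python
import pandas as pd

# The visible part of the sample matrix: one row per date, one column per time.
wide = pd.DataFrame({
    "Date": ["2022-01-01", "2022-01-02", "2022-01-03"],
    "13:00": ["B", "B", "R"],
    "14:00": ["R", "B", "B"],
    "15:00": ["M", "B", "B"],
    "16:00": ["M", "M", "M"],
})

# Turn each time column into rows: one row per (Date, Time) pair.
long = wide.melt(id_vars="Date", var_name="Time", value_name="Data")

# Combine the date and time strings into a single timestamp.
long["Date"] = pd.to_datetime(long["Date"] + " " + long["Time"])

# Keep only the timestamp and value, in chronological order.
series = long.drop(columns="Time").sort_values("Date").set_index("Date")["Data"]
print(series)
```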

Downsampling time series data in pandas

I have time-series data that looks like this: I would like to downsample my data from 15-minute frequencies to 1-hour frequencies. So, the first 4 rows above would be summed under the 00:00 timestamp, then the next 4 rows would be combined under 01:00. Is there an efficient way to make this happen? Answer: Look at pandas.DataFrame.resample would result in All you
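As a minimal sketch of the `resample` suggestion, assuming the data has a `DatetimeIndex` at 15-minute spacing (the `value` column and the numbers are made up, since the original data is not shown):

```python
import pandas as pd

# Hypothetical 15-minute data covering two hours.
idx = pd.date_range("2022-01-01 00:00", periods=8, freq="15min")
df = pd.DataFrame({"value": range(1, 9)}, index=idx)

# Each hourly bin is labelled by its start (00:00, 01:00, ...) and
# sums the four 15-minute rows that fall inside it.
hourly = df.resample("1h").sum()
print(hourly)
```

Swapping `.sum()` for `.mean()` or `.agg(...)` changes how the four rows in each hour are combined.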

Combining weeks 52 and 0 with Python Datetime

I have a Pandas DataFrame with daily data that I’m trying to group by week number to sum some columns, and I notice that when years do not begin on Sunday, the data for the week spanning the end of one year and the beginning of the next do not cleanly sum, instead being broken into two groups. My code
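The asker's code is not shown, so this is only a sketch of one way around the split, assuming Sunday-to-Saturday weeks and a `DatetimeIndex`: group by week-ending date rather than by week number. The `sales` column is hypothetical.

```python
import pandas as pd

# Two full Sunday-to-Saturday weeks, the first of which spans the year boundary.
idx = pd.date_range("2021-12-26", "2022-01-08", freq="D")
df = pd.DataFrame({"sales": 1}, index=idx)

# Week numbers split the first week: late December is "52", 1 January is "00".
week_numbers = set(idx[:7].strftime("%U"))

# Weekly bins anchored on Saturday ignore the calendar year entirely.
weekly = df.resample("W-SAT").sum()
print(weekly)
```

Each bin is labelled with the Saturday that ends it, so the week containing 1 January stays in one group.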

Error with Pipeline for fourier featurizer

When I run the above code, I get the following error: TypeError: Last step of Pipeline should be of type BaseARIMA. ‘FourierFeaturizer(k=1, m=14)’ I don’t wish to use BaseARIMA; I just wish to use FourierFeaturizer. Is it possible? Answer: Yes, it’s possible. Each FourierFeaturizer has a fit_transform method, which returns the y var and new exogenous variables. By concatenating this return value, you

Pandas Aggregate Daily Data to Monthly Timeseries

I have a time series that looks like this (below), and I want to resample it monthly, so that 2019-10 equals the average of all the values for October, 2019-11 equals the average of all the PTS values for November, etc. However, when I use the pd.resample(‘M’).mean() method, if the final day for each month does not
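The excerpt ends before the problem is fully stated, so the following is a sketch of one alternative rather than the accepted answer: grouping by monthly period labels each result with the month itself (for example `2019-10`) instead of a month-end date. The `PTS` values and dates are made up.

```python
import pandas as pd

# Hypothetical irregular daily observations across two months.
pts = pd.Series(
    [10.0, 20.0, 30.0, 50.0],
    index=pd.to_datetime(["2019-10-01", "2019-10-15", "2019-11-05", "2019-11-20"]),
    name="PTS",
)

# Group by calendar month; the resulting index holds monthly periods.
monthly = pts.groupby(pts.index.to_period("M")).mean()
print(monthly)
```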

Is there any function to get multiple timeseries with .get and create a dataframe in Pandas?

I get multiple time series in Series format with a DatetimeIndex, which I want to resample and convert to a dataframe with multiple columns, each column representing one time series. I am using separate functions to create the dataframe, for example, .get(), .resample(), pd.concat(). Since it is not following the DRY principle (Don’t Repeat Yourself) and I can be
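The data source behind `.get()` is not shown, so this sketch assumes any object with a dict-like `get(name)` that returns a Series with a `DatetimeIndex`. The helper `build_frame` and its parameters are hypothetical names; the idea is to fold the get, resample and concat steps into one loop.

```python
import pandas as pd

def build_frame(store, names, freq="1D"):
    """Fetch each named series, resample it, and combine them as columns."""
    # Dict keys become the column names when concatenating along axis=1.
    resampled = {name: store.get(name).resample(freq).mean() for name in names}
    return pd.concat(resampled, axis=1)

# A plain dict stands in for the real source of the series.
idx = pd.date_range("2022-01-01", periods=48, freq="h")
store = {
    "a": pd.Series(range(48), index=idx, dtype="float64"),
    "b": pd.Series(1.0, index=idx),
}

frame = build_frame(store, ["a", "b"])
print(frame)
```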