Skip to content
Advertisement

Calculating YoY growth for daily volumes in python pandas

sample Excel data

I have 5 years of daily volume data. I want to create a new column in pandas dataframe where it produces the values of YoY Growth for that particular day. For e.g. compares 2018-01-01 with 2017-01-01, and 2019-01-01 compares with 2018-01-01

I have 364 records for each year (except for the year 2020 there are 365 days)

How can I create the column YoY_Growth as below in pandas dataframe.

Advertisement

Answer

# It's more convenient to index the dataframe with the Date for our algorith,
df = df.set_index("Date")
is_leap_day = (df.index.month == 2) & (df.index.day == 29)

# Leap day is an edge case, since you can't find Feb 29 of the previous year.
# pandas handles this by shifting to Feb 28 of the previous year:
#   2020-02-29 -> 2019-02-28
#   2020-02-28 -> 2019-02-28
# This creates a duplicate for Feb 28. So we need to handle leap day separately.
volume_last_year = df.loc[~is_leap_day, "Volume"].shift(freq=pd.DateOffset(years=1))

# For non leap days
df["YoY_Growth"] = df["Volume"] / volume_last_year - 1

# For leap days
df.loc[is_leap_day, "YoY_Growth"] = (
    df.loc[is_leap_day, "Volume"] / volume_last_year.shift(freq=pd.DateOffset(days=1))
    - 1
)

Result (Volume was randomly generated):

df.loc[["2019-01-01", "2019-02-28", "2020-01-01", "2020-02-28", "2020-02-29"], :]

            Volume  YoY_Growth
Date                          
2019-01-01      45         NaN
2019-02-28      23         NaN
2020-01-01      10   -0.777778 # = 10 / 45 - 1
2020-02-28      34    0.478261 # = 34 / 23 - 1
2020-02-29      76    2.304348 # = 76 / 23 - 1
User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement