I have the following dataframe called df:
date        flag1  flag2  flag3  flag4…
2020-12-31
2021-01-01
2021-01-02                1
2021-01-03
2021-01-04
2021-01-05         1
2021-01-06                       1
2021-01-07
2021-01-08
2021-01-09
2021-01-10
2021-01-11  1                    1
2021-01-12
I want to do a backfill whenever a 1 appears in any column: fill backwards, decreasing the value at each step, until another number appears or, failing that, until a set limit is reached.
So let's say the limit to reduce to is 0 and the decrement is 0.1; the result should look like this:
date        flag1  flag2  flag3  flag4…
2020-12-31         0.5    0.8    0.4
2021-01-01  0.0    0.6    0.9    0.5
2021-01-02  0.1    0.7    1.0    0.6
2021-01-03  0.2    0.8           0.7
2021-01-04  0.3    0.9           0.8
2021-01-05  0.4    1.0           0.9
2021-01-06  0.5                  1.0
2021-01-07  0.6                  0.6
2021-01-08  0.7                  0.7
2021-01-09  0.8                  0.8
2021-01-10  0.9                  0.9
2021-01-11  1.0                  1.0
2021-01-12
Can this be achieved with pandas? I want to be able to set both the decrement amount and the limit; in the example above they would be 0.1 and 0.
I know that this command can increment the values backwards:
df1 = df1[::-1].fillna(method='ffill')
(df1 + (df1 == df1.shift()).cumsum()).sort_index()
But that’s not what I want.
Answer
You could also try using iloc to change the values based on the indices where the column value equals 1.0:
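import pandas as pd
import numpy as np

def process_data(c, n):
    # Visit the 1s from last to first, so an earlier 1 overwrites a later countdown.
    for idx in reversed(np.where(c==1)[0]):
        # Up to len(n) rows just above idx, nearest first, receive n[0], n[1], ...
        c.iloc[np.arange(idx)[::-1][:n.shape[0]]] = n[idx-1::-1][::-1]
        # Restore the 1 in case a later countdown overwrote it.
        c.iat[idx] = 1.0
    return c

# n runs 0.9, 0.8, ..., 0.0: a decrement of 0.1 down to a limit of 0.
df = df.apply(lambda r: process_data(r, np.linspace(1.0, 0.0, num=11)[1:]))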
            flag1  flag2  flag3  flag4
date
2020-12-31    NaN    0.5    0.8    0.4
2021-01-01    0.0    0.6    0.9    0.5
2021-01-02    0.1    0.7    1.0    0.6
2021-01-03    0.2    0.8    NaN    0.7
2021-01-04    0.3    0.9    NaN    0.8
2021-01-05    0.4    1.0    NaN    0.9
2021-01-06    0.5    NaN    NaN    1.0
2021-01-07    0.6    NaN    NaN    0.6
2021-01-08    0.7    NaN    NaN    0.7
2021-01-09    0.8    NaN    NaN    0.8
2021-01-10    0.9    NaN    NaN    0.9
2021-01-11    1.0    NaN    NaN    1.0
2021-01-12    NaN    NaN    NaN    NaN
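If you would rather pass the decrement and the limit as plain numbers, a loop-free variant might look like the sketch below. This is an alternative to the iloc approach, not the same method: it measures how many rows each cell sits above the nearest following 1 and subtracts that many decrements. The function name backfill_decay and its parameters step, floor and peak are made up for illustration, and the sketch assumes the frame holds only 1s and missing values, with date as the index.

```python
import numpy as np
import pandas as pd


def backfill_decay(df, step=0.1, floor=0.0, peak=1.0):
    """Hypothetical helper: count down from `peak` by `step` per row, backwards from each flag."""
    n_rows, n_cols = df.shape
    # Integer row position of every cell, repeated for each column.
    pos = pd.DataFrame(
        np.tile(np.arange(n_rows)[:, None], (1, n_cols)),
        index=df.index,
        columns=df.columns,
    )
    # Keep positions only where a flag sits, then pull them backwards so each
    # cell knows the position of the nearest flag at or after it.
    next_flag = pos.where(df.eq(peak)).bfill()
    # Number of rows between a cell and its flag; NaN when no flag follows.
    dist = next_flag - pos
    # Rounding hides floating-point noise from repeated decrements.
    out = (peak - step * dist).round(10)
    # Leave blank anything that would fall below the limit.
    return out.where(out >= floor)


# Rebuild the sample frame from the question (assumed layout: date index, NaN blanks).
dates = pd.date_range("2020-12-31", "2021-01-12", name="date")
df = pd.DataFrame(np.nan, index=dates, columns=["flag1", "flag2", "flag3", "flag4"])
flags = {
    "flag1": ["2021-01-11"],
    "flag2": ["2021-01-05"],
    "flag3": ["2021-01-02"],
    "flag4": ["2021-01-06", "2021-01-11"],
}
for col, days in flags.items():
    df.loc[pd.to_datetime(days), col] = 1

result = backfill_decay(df, step=0.1, floor=0.0)
print(result)
```

Changing step or floor changes how fast and how far the countdown runs, for example backfill_decay(df, step=0.25, floor=0.5). If the real data can contain numbers other than 1 that should stop the backfill, this sketch would need adjusting.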