Let’s say input was
JavaScript
x
7
1
d = {'col1': [1,2,3,4,5,6,7,8,9,10],
2
'col2': [1,2,3,4,5,6,7,8,9,10],
3
'col3': [1,2,3,4,5,6,7,8,9,10],
4
'offset': [1,2,3,1,2,3,1,2,3,1]}
5
6
df = pd.DataFrame(data=d)
7
I want to create an additional column that looks like this:
JavaScript
1
2
1
df['output'] = [1, 4, 9, 4, 10, 18, 7, 16, 27, 10]
2
Basically each number in offset
is telling you the number of columns to sum over (from col1
as ref point).
Is there a vectorized way to do this without iterating through each value in offset
?
Advertisement
Answer
You use np.select. To use it, create each of the column sum
(1, 2, 3 … as needed) as the possible choices, and create a boolean masks for each value in offset column as the possible conditons.
JavaScript
1
23
23
1
# get all possible values from offset
2
lOffset = df['offset'].unique()
3
4
# get te result with np.select
5
df['output'] = np.select(
6
# create mask for each values in offset
7
condlist=[df['offset'].eq(i) for i in lOffset],
8
# crerate the sum over the number of columns per offset value
9
choicelist=[df.iloc[:,:i].sum(axis=1) for i in lOffset]
10
)
11
print(df)
12
# col1 col2 col3 offset output
13
# 0 1 1 1 1 1
14
# 1 2 2 2 2 4
15
# 2 3 3 3 3 9
16
# 3 4 4 4 1 4
17
# 4 5 5 5 2 10
18
# 5 6 6 6 3 18
19
# 6 7 7 7 1 7
20
# 7 8 8 8 2 16
21
# 8 9 9 9 3 27
22
# 9 10 10 10 1 10
23
Note: this assumes that your offset column is the last one