Given this example:
```python
from pandas import DataFrame, isna
from numpy import nan

df = DataFrame([
    {'id': '1', 'x': 2, 'y': 3, 'z': 4},
    {'id': '5', 'x': 6, 'y': 7, 'z': 8},
    {'id': '9', 'x': 10, 'y': 11, 'z': 12}
]).set_index('id')

factors = DataFrame([
    {'id': '5', 'x': nan, 'z': 3},
    {'id': '9', 'x': 0.2, 'z': nan},
]).set_index('id')

for row_id in factors.index:
    for col in factors.columns:
        if not isna(factors[col][row_id]):
            df[col][row_id] *= factors[col][row_id]
```
Here the values in `df` are multiplied by the non-NaN values from `factors`. Is there a cleaner way to do this with `pandas` (or `numpy`, for that matter)? I had a look at `.mul()`, but that doesn’t appear to allow me to do what’s required here.
Additionally, what if `factors` contains rows with an `id` that’s not in `df`, e.g.:
```python
factors = DataFrame([
    {'id': '5', 'x': nan, 'z': 3},
    {'id': '13', 'x': 2, 'z': 4},
]).set_index('id')
```
Answer
If I understand your problem right, you can use `.update` + `.mul`:
```python
df.update(df.mul(factors))
```
Prints:
```
      x   y     z
id
1   2.0   3   4.0
5   6.0   7  24.0
9   2.0  11  12.0
```
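To see why this works, it may help to look at the intermediate product on its own. The sketch below rebuilds the frames from the question and splits the one-liner into two steps; the name `product` is only illustrative. `.mul` aligns on both index and columns, so any cell missing from either frame, or NaN in `factors`, comes out as NaN, and `.update` then overwrites `df` only where `product` is not NaN.

```python
from numpy import nan
from pandas import DataFrame

df = DataFrame([
    {'id': '1', 'x': 2, 'y': 3, 'z': 4},
    {'id': '5', 'x': 6, 'y': 7, 'z': 8},
    {'id': '9', 'x': 10, 'y': 11, 'z': 12},
]).set_index('id')

factors = DataFrame([
    {'id': '5', 'x': nan, 'z': 3},
    {'id': '9', 'x': 0.2, 'z': nan},
]).set_index('id')

# Element-wise product aligned on index and columns.
# Cells with no usable factor (missing row, missing column, or NaN) are NaN.
product = df.mul(factors)

# In-place update: only the non-NaN cells of `product` replace values in `df`.
df.update(product)
```

Whether the touched columns end up as floats or stay integers can depend on the pandas version, so the exact formatting of the printed frame may differ slightly from the output above.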
For the second example (if factors contains rows with an id that’s not in df) this prints:
```
     x   y     z
id
1    2   3   4.0
5    6   7  24.0
9   10  11  12.0
```
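If you would rather not modify `df` in place, one possible alternative (a sketch, not part of the approach above) is to reshape `factors` to match `df` and treat every missing factor as 1. The names `aligned` and `result` are illustrative. Reindexing to `df`'s index and columns also drops the extra id `'13'`.

```python
from numpy import nan
from pandas import DataFrame

df = DataFrame([
    {'id': '1', 'x': 2, 'y': 3, 'z': 4},
    {'id': '5', 'x': 6, 'y': 7, 'z': 8},
    {'id': '9', 'x': 10, 'y': 11, 'z': 12},
]).set_index('id')

# Second example from the question: id '13' does not exist in df.
factors = DataFrame([
    {'id': '5', 'x': nan, 'z': 3},
    {'id': '13', 'x': 2, 'z': 4},
]).set_index('id')

# Give `factors` the same shape as `df`; rows and columns not in `df` are
# dropped, and cells with no factor become NaN, then 1 (a neutral multiplier).
aligned = factors.reindex(index=df.index, columns=df.columns).fillna(1)

# Plain element-wise multiplication; `df` itself is left untouched.
result = df * aligned
```

One trade-off is that every column of `result` becomes floating point, because the neutral factor is stored as a float.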