Given this example:
from pandas import DataFrame, isna
from numpy import nan

df = DataFrame([
    {'id': '1', 'x': 2, 'y': 3, 'z': 4},
    {'id': '5', 'x': 6, 'y': 7, 'z': 8},
    {'id': '9', 'x': 10, 'y': 11, 'z': 12}
]).set_index('id')

factors = DataFrame([
    {'id': '5', 'x': nan, 'z': 3},
    {'id': '9', 'x': 0.2, 'z': nan},
]).set_index('id')

for row_id in factors.index:
    for col in factors.columns:
        if not isna(factors[col][row_id]):
            df[col][row_id] *= factors[col][row_id]
Where the values in df are multiplied by non-NaN values from factors, is there a cleaner way to do this with pandas (or numpy, for that matter)? I had a look at .mul(), but that doesn’t appear to allow me to do what’s required here.
Additionally, what if factors contains rows with an id that’s not in df, e.g.:
factors = DataFrame([
    {'id': '5', 'x': nan, 'z': 3},
    {'id': '13', 'x': 2, 'z': 4},
]).set_index('id')
Answer
If I understand your problem right, you can use .update + .mul:
df.update(df.mul(factors))
Prints:
      x   y     z
id
1   2.0   3   4.0
5   6.0   7  24.0
9   2.0  11  12.0
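To see why this works, it may help to split the one-liner into its two steps. The sketch below reuses the data from the question; the name product is only introduced here for illustration. The idea is that .mul aligns both frames on index and columns and leaves NaN wherever either side has no value, and .update then copies only the non-NaN cells back into df in place. The exact dtypes of the updated columns can differ between pandas versions.

```python
from numpy import nan
from pandas import DataFrame

df = DataFrame([
    {'id': '1', 'x': 2, 'y': 3, 'z': 4},
    {'id': '5', 'x': 6, 'y': 7, 'z': 8},
    {'id': '9', 'x': 10, 'y': 11, 'z': 12},
]).set_index('id')

factors = DataFrame([
    {'id': '5', 'x': nan, 'z': 3},
    {'id': '9', 'x': 0.2, 'z': nan},
]).set_index('id')

# Step 1: element-wise product, aligned on index and columns.
# Cells that are missing or NaN in either frame come out as NaN.
product = df.mul(factors)
print(product)

# Step 2: update() overwrites df in place, but only where product is not NaN,
# so every cell without a usable factor keeps its original value.
df.update(product)
print(df)
```

Only the cells ('5', 'z') and ('9', 'x') have a usable factor, so those are the only ones that change.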
For the second example (if factors contains rows with an id that’s not in df) this prints:
     x   y     z
id
1    2   3   4.0
5    6   7  24.0
9   10  11  12.0
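If modifying df in place is not wanted, one possible alternative (not the approach used in the answer above) is to reshape factors to df's index and columns, treat every missing factor as 1, and multiply. The names factors_extra, aligned and result are hypothetical and only used for this sketch. Note that all columns of the result come out as floats here, because the filled factor frame is float.

```python
from numpy import nan
from pandas import DataFrame

df = DataFrame([
    {'id': '1', 'x': 2, 'y': 3, 'z': 4},
    {'id': '5', 'x': 6, 'y': 7, 'z': 8},
    {'id': '9', 'x': 10, 'y': 11, 'z': 12},
]).set_index('id')

# Second example from the question: id '13' does not exist in df.
factors_extra = DataFrame([
    {'id': '5', 'x': nan, 'z': 3},
    {'id': '13', 'x': 2, 'z': 4},
]).set_index('id')

# Reindex to df's shape: rows not in df (id '13') are dropped, and
# rows or columns without a factor become NaN, which is then replaced by 1.
aligned = factors_extra.reindex(index=df.index, columns=df.columns).fillna(1)

# Multiplying by 1 leaves a value unchanged; df itself is not modified.
result = df * aligned
print(result)
```

This returns a new frame, which can be convenient when the original values are still needed afterwards.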