Given this example:
from pandas import DataFrame, isna
from numpy import nan
df = DataFrame([
    {'id': '1', 'x': 2, 'y': 3, 'z': 4},
    {'id': '5', 'x': 6, 'y': 7, 'z': 8},
    {'id': '9', 'x': 10, 'y': 11, 'z': 12}
]).set_index('id')
factors = DataFrame([
    {'id': '5', 'x': nan, 'z': 3},
    {'id': '9', 'x': 0.2, 'z': nan},
]).set_index('id')
for row_id in factors.index:
    for col in factors.columns:
        if not isna(factors[col][row_id]):
            df[col][row_id] *= factors[col][row_id]
Here the values in df are multiplied by the non-NaN values from factors. Is there a cleaner way to do this with pandas (or numpy, for that matter)? I had a look at .mul(), but that doesn’t appear to do what’s required here.
Additionally, what if factors contains rows with an id that’s not in df, e.g.:
factors = DataFrame([
    {'id': '5', 'x': nan, 'z': 3},
    {'id': '13', 'x': 2, 'z': 4},
]).set_index('id')
Answer
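If I understand your problem correctly, you can combine .update with .mul. df.mul(factors) aligns the two frames on index and columns and yields NaN wherever factors has no value, and .update overwrites df in place with only the non-NaN results: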
df.update(df.mul(factors))
Prints:
      x   y     z
id
1   2.0   3   4.0
5   6.0   7  24.0
9   2.0  11  12.0
For the second example (if factors contains rows with an id that’s not in df) this prints:
     x   y     z
id
1    2   3   4.0
5    6   7  24.0
9   10  11  12.0
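If you would rather get a new frame back than modify df in place, one possible alternative (a sketch, not part of the approach above) is to reshape factors to match df and treat every missing factor as 1. The names aligned and result are just illustrative:

```python
from numpy import nan
from pandas import DataFrame

df = DataFrame([
    {'id': '1', 'x': 2, 'y': 3, 'z': 4},
    {'id': '5', 'x': 6, 'y': 7, 'z': 8},
    {'id': '9', 'x': 10, 'y': 11, 'z': 12},
]).set_index('id')

factors = DataFrame([
    {'id': '5', 'x': nan, 'z': 3},
    {'id': '13', 'x': 2, 'z': 4},  # '13' does not exist in df
]).set_index('id')

# Give factors the same rows and columns as df: ids missing from df
# (here '13') are dropped, and rows/columns missing from factors become NaN.
# Filling NaN with 1 makes every missing factor a no-op for multiplication.
aligned = factors.reindex_like(df).fillna(1)

# Element-wise product; df itself is left untouched.
result = df.mul(aligned)
print(result)
```

Note that the aligned factors are all floats, so every column of result comes out as float, including y. If the original integer dtypes matter, the .update approach above leaves untouched columns as they were.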