Cleaner way to selectively multiply pandas DataFrame values

Given this example:

from pandas import DataFrame, isna
from numpy import nan


df = DataFrame([
    {'id': '1', 'x': 2, 'y': 3, 'z': 4},
    {'id': '5', 'x': 6, 'y': 7, 'z': 8},
    {'id': '9', 'x': 10, 'y': 11, 'z': 12}
]).set_index('id')

factors = DataFrame([
    {'id': '5', 'x': nan, 'z': 3},
    {'id': '9', 'x': 0.2, 'z': nan},
]).set_index('id')

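# Multiply each cell of df by the matching factor, skipping NaN factors.
for row_id in factors.index:
    for col in factors.columns:
        factor = factors.loc[row_id, col]
        if not isna(factor):
            # .loc avoids chained assignment, which may modify a copy
            df.loc[row_id, col] *= factor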

Here each value in df is multiplied by the corresponding non-NaN value from factors. Is there a cleaner way to do this with pandas (or numpy, for that matter)? I had a look at .mul(), but on its own it doesn't appear to do what's required here.

Additionally, what if factors contains rows with an id that's not in df? For example:

factors = DataFrame([
    {'id': '5', 'x': nan, 'z': 3},
    {'id': '13', 'x': 2, 'z': 4},
]).set_index('id')

Answer

If I understand your problem correctly, you can combine .update with .mul:

df.update(df.mul(factors))

Printing df afterwards gives:

      x   y     z
id               
1   2.0   3   4.0
5   6.0   7  24.0
9   2.0  11  12.0
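
To see why this works, it may help to look at the intermediate product. The following is a minimal sketch that assumes the df and factors from the question; the name product is only illustrative. df.mul(factors) aligns on both index and columns, so every cell without a usable factor (a missing row, a missing column, or a NaN factor) comes out as NaN, and .update skips NaN values when it copies into df.

```python
from numpy import nan
from pandas import DataFrame

df = DataFrame([
    {'id': '1', 'x': 2, 'y': 3, 'z': 4},
    {'id': '5', 'x': 6, 'y': 7, 'z': 8},
    {'id': '9', 'x': 10, 'y': 11, 'z': 12}
]).set_index('id')

factors = DataFrame([
    {'id': '5', 'x': nan, 'z': 3},
    {'id': '9', 'x': 0.2, 'z': nan},
]).set_index('id')

# Element-wise product, aligned on row and column labels.
# Cells with no factor (row '1', column 'y', NaN factors) become NaN.
product = df.mul(factors)
print(product)

# update() modifies df in place and only overwrites cells
# where product holds a non-NaN value.
df.update(product)
print(df)
```

Only the two cells that had a real factor (row 5, column z and row 9, column x) are changed; everything else keeps its original value.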

For the second example (where factors contains a row whose id is not in df), printing df gives:

     x   y     z
id              
1    2   3   4.0
5    6   7  24.0
9   10  11  12.0
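
If you would rather get a new DataFrame than modify df in place, one possible alternative (a sketch, not part of the approach above) is to reshape factors to df's labels and treat every missing factor as 1. The names aligned and result are hypothetical; the data is the second example from the question.

```python
from numpy import nan
from pandas import DataFrame

df = DataFrame([
    {'id': '1', 'x': 2, 'y': 3, 'z': 4},
    {'id': '5', 'x': 6, 'y': 7, 'z': 8},
    {'id': '9', 'x': 10, 'y': 11, 'z': 12}
]).set_index('id')

factors = DataFrame([
    {'id': '5', 'x': nan, 'z': 3},
    {'id': '13', 'x': 2, 'z': 4},
]).set_index('id')

# Reshape factors to df's rows and columns: ids not in df (here '13')
# are dropped, and rows/columns missing from factors are added as NaN.
aligned = factors.reindex(index=df.index, columns=df.columns)

# A missing factor means "leave the value alone", i.e. multiply by 1.
aligned = aligned.fillna(1)

# Plain element-wise multiplication; df itself is not modified.
result = df * aligned
print(result)
```

Because aligned is a floating-point frame, result will be floating point as well; cast back with .astype if integer columns matter.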
User contributions licensed under: CC BY-SA