Improve a function that redistributes shares in a pandas dataframe column (is it possible to avoid nested for loops?)

Question

Below I have a dataframe (df) of ten rows; each row has a NAME and belongs to a GROUP. Each row has a SHARE value of 0.1. I want to manipulate the distribution of shares. For example, if I increase the share value for NAME='ONE' from 0.1 to 0.175, I want a function that simultaneously decrease s…

Accepted Answer

Here is what your code outputs:

      GROUP     SHARE
NAME
ONE       A  0.175000
TWO       A  0.094375
THREE     A  0.094375
FOUR      B  0.095000
FIVE      B  0.095000
SIX       B  0.095000
SEVEN     C  0.094375
EIGHT     C  0.094375
NINE      D  0.081250
TEN       D  0.081250

I suggest a more idiomatic way to get to the same result:

# Redefined variables
name_to_change = "ONE"
share_change = 0.075
groups = ["A", "B", "C", "D"]
weights = {"A": 0.15, "B": 0.2, "C": 0.15, "D": 0.50}

def redistr_by_group(df, name_to_change, share_change, groups, weights):
    """Refactored function.
    """
    df.loc[df["NAME"] == name_to_change, "SHARE"] += share_change
    mask = df["NAME"] != name_to_change
    df.loc[mask, "COEFF"] = df.loc[mask, "GROUP"].apply(
        lambda x: df[mask].groupby("GROUP").count()["NAME"].to_dict()[x]
    )
    df.loc[mask, "WEIGHT_TEMP"] = (
        df.loc[mask, "GROUP"].apply(lambda x: weights[x]) / df.loc[mask, "COEFF"]
    )
    df.loc[mask, "SHARE"] = (
        df.loc[mask, "SHARE"] - df.loc[mask, "WEIGHT_TEMP"] * share_change
    )
    return df.drop(columns=["COEFF", "WEIGHT_TEMP"]).reindex(
        columns=["NAME", "GROUP", "SHARE"]
    )

df = redistr_by_group(df, name_to_change, share_change, groups, weights)
print(df)

# Output
    NAME GROUP     SHARE
0    ONE     A  0.175000
1    TWO     A  0.094375
2  THREE     A  0.094375
3   FOUR     B  0.095000
4   FIVE     B  0.095000
5    SIX     B  0.095000
6  SEVEN     C  0.094375
7  EIGHT     C  0.094375
8   NINE     D  0.081250
9    TEN     D  0.081250

print(df["SHARE"].sum())  # 1
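The same redistribution can probably be written without `apply` or temporary columns. The following is only a sketch, not the answer's own code: the function name `redistribute_vectorized` is hypothetical, and the input frame is an assumption reconstructed from the output shown above (ten rows, each with SHARE 0.1). It uses `groupby(...).transform("count")` to get the number of remaining rows per group and `Series.map` to look up each group's weight.

```python
import pandas as pd

# Input frame reconstructed from the output above (an assumption:
# the question's own construction code is not shown here).
df = pd.DataFrame(
    {
        "NAME": ["ONE", "TWO", "THREE", "FOUR", "FIVE",
                 "SIX", "SEVEN", "EIGHT", "NINE", "TEN"],
        "GROUP": ["A", "A", "A", "B", "B", "B", "C", "C", "D", "D"],
        "SHARE": [0.1] * 10,
    }
)
weights = {"A": 0.15, "B": 0.2, "C": 0.15, "D": 0.50}


def redistribute_vectorized(df, name_to_change, share_change, weights):
    """Hypothetical variant of redistr_by_group without apply or helper columns."""
    out = df.copy()  # leave the caller's frame untouched
    others = out["NAME"] != name_to_change
    # Number of remaining rows in each group, aligned to those rows.
    group_size = out.loc[others].groupby("GROUP")["NAME"].transform("count")
    # Each remaining row absorbs its group's weight, split evenly within the group.
    per_row_weight = out.loc[others, "GROUP"].map(weights) / group_size
    out.loc[~others, "SHARE"] += share_change
    out.loc[others, "SHARE"] -= per_row_weight * share_change
    return out


result = redistribute_vectorized(df, "ONE", 0.075, weights)
print(result)
print(round(result["SHARE"].sum(), 10))
```

As with the function above, the total stays at 1 only if the weights sum to 1 and every group still has at least one row after excluding the changed name. Unlike the function above, this sketch works on a copy, so the original `df` is not modified.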

Answer