How to set values in a dataframe according to multiple conditions and different groups?

Question

I have a complex mapping problem. I have a table with groups. Each group has a list of 7 products The product list may contain duplicate products. MY GOAL : I have to map 5 products in the list of 7 products If the value of the column rank is equal to 1, 2, 3 then Product_1, Product_2, Product_3. If

Accepted Answer

With the dataframe you provided:import pandas as pddf = pd.DataFrame(    {        "ID": ["A", "A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B", "B"],        "PRODUCT": [            "RED",            "BLUE",            "BLACK",            "GREEN",            "GREEN",            "PURPLE",            "YELLOW",            "PURPLE",            "BLACK",            "BLACK",            "GREEN",            "WHITE",            "BROWN",            "GREEN",        ],        "RANK": [1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7],    })Here is one way to do it using shuffle and choice from Python standard library random module:import randomdfs = []for col in df["ID"].unique():    # Setup    available_ranks = [4, 5]    random.shuffle(available_ranks)    temp = df.copy().loc[df["ID"] == col, :]    # First rule    temp["MAP"] = temp.apply(        lambda x: f"PRODUCT_{x['RANK']}" if x["RANK"] <= 3 else "", axis=1    )    # Second rule    temp["MAP"] = temp.apply(        lambda x: f"PRODUCT_{available_ranks.pop()}"        if x["PRODUCT"] == "YELLOW" and x["RANK"] > 3        else x["MAP"],        axis=1,    )    # Third rule    temp.loc[        (temp["MAP"] == "")        & (temp["RANK"] == random.choice(temp.loc[temp["MAP"] == "", "RANK"].values)),        "MAP",    ] = f"PRODUCT_{available_ranks.pop()}"    if available_ranks:        temp.loc[            (temp["MAP"] == "")            & (                temp["RANK"]                == random.choice(temp.loc[temp["MAP"] == "", "RANK"].values)            ),            "MAP",        ] = f"PRODUCT_{available_ranks.pop()}"    dfs.append(temp)new_df = pd.concat(dfs)And so:print(new_df)# Output   ID PRODUCT  RANK        MAP0   A     RED     1  PRODUCT_11   A    BLUE     2  PRODUCT_22   A   BLACK     3  PRODUCT_33   A   GREEN     44   A   GREEN     5  PRODUCT_55   A  PURPLE     66   A  YELLOW     7  PRODUCT_47   B  PURPLE     1  PRODUCT_18   B   BLACK     2  PRODUCT_29   B   BLACK     3  PRODUCT_310  B   GREEN     411  B   WHITE     5  PRODUCT_412  B   BROWN     6  PRODUCT_513  B   GREEN     7

ID	PRODUCT	RANK
A	RED	1
A	BLUE	2
A	BLACK	3
A	GREEN	4
A	GREEN	5
A	PURPLE	6
A	YELLOW	7
B	PURPLE	1
B	BLACK	2
B	BLACK	3
B	GREEN	4
B	WHITE	5
B	BROWN	6
B	GREEN	7

ID	PRODUCT	RANK	MAP
A	RED	1	PRODUCT_1
A	BLUE	2	PRODUCT_2
A	BLACK	3	PRODUCT_3
A	GREEN	4	PRODUCT_5
A	GREEN	5	“”
A	PURPLE	6	“”
A	YELLOW	7	PRODUCT_4
B	PURPLE	1	PRODUCT_1
B	BLACK	2	PRODUCT_2
B	YELLOW	3	PRODUCT_3
B	GREEN	4	“”
B	WHITE	5	PRODUCT_4
B	BROWN	6	PRODUCT_5
B	GREEN	7	“”

Advertisement

Answer