Map 2 df but column to value instead of value to value for each ID

Question

I have a table with top 3 reasons (Table 1) and another table with the category it belongs to for each variable (Table 2). I am trying to match the category bins into the reason table like in table 3. Answer Approach index two data frames in way that works with join() then it&#8217;s a pd.concat() of each of …

Accepted Answer

Approachindex two data frames in way that works with join()then it&#8217;s a pd.concat() of each of the reasons from output of joindf1 = pd.read_csv(io.StringIO("""      RegNo    Reason1  Reason2 Reason3        1111         v3       v2     v6        2222         v2       v3     v5        3333         v3       v2     v6        4444         v3       v6     v2        5555         v3       v2     v5"""), sep="s+")df2 = pd.read_csv(io.StringIO("""    RegNo           v2              v3        v4          v5                 v6    1111    (0.44, 0.64]    (0.0, 60.0]     missing    (-inf, 3.0]   (102.0, 175.0]    2222    (-inf, 0.33]    (0.0, 60.0]     welldone   missing       missing    3333    (0.44, 0.64]    (0.0, 60.0]     rare       missing       missing    4444    (0.44, 0.64]    (0.0, 60.0]     missing    missing       (20.0, 102.0]    5555     (0.64, inf]    (0.0, 60.0]     missing    missing       (-inf, 20.0]"""),sep="ss+", engine="python")# index main dfdf1 = df1.set_index("RegNo")# reshape and index reasonsdfm = df2.set_index("RegNo").stack()df3 = pd.concat([    # add apprpriate column into index,  don't want othert columns    (df1.set_index(c, append="True").loc[:,[]]     # now it's a straight forward join     .join(dfm.rename_axis(["RegNo",c]).to_frame())     # cleanup index and rename columns of joined DF     .droplevel(1).rename(columns={0:c}))    for c in ["Reason1","Reason2","Reason3"]], axis=1)RegNoReason1Reason2Reason31111(0.0, 60.0](0.44, 0.64](102.0, 175.0]2222(-inf, 0.33](0.0, 60.0]missing3333(0.0, 60.0](0.44, 0.64]missing4444(0.0, 60.0](20.0, 102.0](0.44, 0.64]5555(0.0, 60.0](0.64, inf]missing

RegNo	Reason1	Reason2	Reason3
1111	(0.0, 60.0]	(0.44, 0.64]	(102.0, 175.0]
2222	(-inf, 0.33]	(0.0, 60.0]	missing
3333	(0.0, 60.0]	(0.44, 0.64]	missing
4444	(0.0, 60.0]	(20.0, 102.0]	(0.44, 0.64]
5555	(0.0, 60.0]	(0.64, inf]	missing

Advertisement

Answer