I have two dataframe in which columns are different, And I need to search and the map the data in new file. I am sharing the dataframes and also desired output
DF1
    Ref.X               Ref.Y            Ref.No
    255623              145244651        XF1234
    45454544            NA               DF7886
    1300256655          3511614646       DF8948
    5546446166          NA               VX4578
    1230001556          NA               GL8947
    4548754545          4548545877       DF7896
    3652656             NA               XF4895
    7884045455          45681247         GL8597
    148428              1242154661       XF3564
DF2
    Type      STR1            STR2
    ABJH      45656644        145244651
    ABJH      844654989       1242154661
    BHJH      51654241654     3511614646
    BHJH      1230001556      255565656
    ABJH      5546446166      8946549849
    ABJH      45454544        544895655
    BHJH      3652656         565464447
Now I need to a output from above two dataframes.
Conditions
- Need to search Ref.Y in STR2, if available then pick the “Type” for output.
- if Ref.Y is “NA” then need to search Ref.X in STR1, if available then pick the “Type” for output.
- Also I need “Ref.No” in desired output correspond to the “Type” Ref.X or Ref.Y found in DF2
Desired Output
    Ref.X               Ref.Y            Ref.No     Int_Type
    255623              145244651        XF1234     ABJH
    45454544            NA               DF7886     ABJH
    1300256655          3511614646       DF8948     BHJH
    5546446166          NA               VX4578     ABJH
    1230001556          NA               GL8947     BHJH
    3652656             NA               XF4895     BHJH
    148428              1242154661       XF3564     ABJH
Thanks.
Advertisement
Answer
You can try repace NA string to <NA> type then use np.where
df1 = df1.replace({'NA': pd.NA})
df1['Int_Type'] = np.where(df1['Ref.Y'].isna(),
                       df1['Ref.X'].map(dict(zip(df2['STR1'], df2['Type']))),
                       df1['Ref.Y'].map(dict(zip(df2['STR2'], df2['Type']))))
Or you can directly compare with NA string:
df1['Int_Type'] = np.where(df1['Ref.Y'].eq('NA'),
Ref.X Ref.Y Ref.No Int_Type 0 255623 1.452447e+08 XF1234 ABJH 1 148428 1.242155e+09 XF3564 ABJH 2 1300256655 3.511615e+09 DF8948 BHJH 3 1230001556 NaN GL8947 BHJH 4 5546446166 NaN VX4578 ABJH 5 45454544 NaN DF7886 ABJH 6 3652656 NaN XF4895 BHJH 7 4548754545 4.548546e+09 DF7896 NaN 8 7884045455 4.568125e+07 GL8597 NaN
To get Ref.No correspond to the “Type” Ref.X or Ref.Y found in DF2, you can drop the NAN column in Int_Type
df1 = df1.dropna(subset='Int_Type')
print(df1)
        Ref.X         Ref.Y  Ref.No Int_Type
0      255623  1.452447e+08  XF1234     ABJH
1      148428  1.242155e+09  XF3564     ABJH
2  1300256655  3.511615e+09  DF8948     BHJH
3  1230001556           NaN  GL8947     BHJH
4  5546446166           NaN  VX4578     ABJH
5    45454544           NaN  DF7886     ABJH
6     3652656           NaN  XF4895     BHJH