I have two dataframe in which columns are different, And I need to search and the map the data in new file. I am sharing the dataframes and also desired output
DF1
JavaScript
x
11
11
1
Ref.X Ref.Y Ref.No
2
255623 145244651 XF1234
3
45454544 NA DF7886
4
1300256655 3511614646 DF8948
5
5546446166 NA VX4578
6
1230001556 NA GL8947
7
4548754545 4548545877 DF7896
8
3652656 NA XF4895
9
7884045455 45681247 GL8597
10
148428 1242154661 XF3564
11
DF2
JavaScript
1
9
1
Type STR1 STR2
2
ABJH 45656644 145244651
3
ABJH 844654989 1242154661
4
BHJH 51654241654 3511614646
5
BHJH 1230001556 255565656
6
ABJH 5546446166 8946549849
7
ABJH 45454544 544895655
8
BHJH 3652656 565464447
9
Now I need to a output from above two dataframes.
Conditions
- Need to search Ref.Y in STR2, if available then pick the “Type” for output.
- if Ref.Y is “NA” then need to search Ref.X in STR1, if available then pick the “Type” for output.
- Also I need “Ref.No” in desired output correspond to the “Type” Ref.X or Ref.Y found in DF2
Desired Output
JavaScript
1
9
1
Ref.X Ref.Y Ref.No Int_Type
2
255623 145244651 XF1234 ABJH
3
45454544 NA DF7886 ABJH
4
1300256655 3511614646 DF8948 BHJH
5
5546446166 NA VX4578 ABJH
6
1230001556 NA GL8947 BHJH
7
3652656 NA XF4895 BHJH
8
148428 1242154661 XF3564 ABJH
9
Thanks.
Advertisement
Answer
You can try repace NA
string to <NA>
type then use np.where
JavaScript
1
6
1
df1 = df1.replace({'NA': pd.NA})
2
3
df1['Int_Type'] = np.where(df1['Ref.Y'].isna(),
4
df1['Ref.X'].map(dict(zip(df2['STR1'], df2['Type']))),
5
df1['Ref.Y'].map(dict(zip(df2['STR2'], df2['Type']))))
6
Or you can directly compare with NA
string:
JavaScript
1
2
1
df1['Int_Type'] = np.where(df1['Ref.Y'].eq('NA'),
2
JavaScript
1
11
11
1
Ref.X Ref.Y Ref.No Int_Type
2
0 255623 1.452447e+08 XF1234 ABJH
3
1 148428 1.242155e+09 XF3564 ABJH
4
2 1300256655 3.511615e+09 DF8948 BHJH
5
3 1230001556 NaN GL8947 BHJH
6
4 5546446166 NaN VX4578 ABJH
7
5 45454544 NaN DF7886 ABJH
8
6 3652656 NaN XF4895 BHJH
9
7 4548754545 4.548546e+09 DF7896 NaN
10
8 7884045455 4.568125e+07 GL8597 NaN
11
To get Ref.No
correspond to the “Type” Ref.X or Ref.Y found in DF2, you can drop the NAN
column in Int_Type
JavaScript
1
2
1
df1 = df1.dropna(subset='Int_Type')
2
JavaScript
1
11
11
1
print(df1)
2
3
Ref.X Ref.Y Ref.No Int_Type
4
0 255623 1.452447e+08 XF1234 ABJH
5
1 148428 1.242155e+09 XF3564 ABJH
6
2 1300256655 3.511615e+09 DF8948 BHJH
7
3 1230001556 NaN GL8947 BHJH
8
4 5546446166 NaN VX4578 ABJH
9
5 45454544 NaN DF7886 ABJH
10
6 3652656 NaN XF4895 BHJH
11