Skip to content
Advertisement

Python. Pandas. Merge

I have written a code that merges File B into File A based on a column ‘Code’. Some of the values from File B, however, are generic (‘Color’) and I would need to do another merge with file C – but instead of creating a new column I would like to use the same column created during the first merge and whenever first merge returned value ‘Color’ only for those rows do merge with file C to get the proper value.

I went as far as merging A with B:

import pandas as pd

File_A = pd.read_excel(r'.../My Files/Python/Supplier cat testing/File A.xlsx')
File_B = pd.read_excel(r'.../My Files/Python/Supplier cat testing/File B.xlsx')
File_C = pd.read_excel(r'.../My Files/Python/Supplier cat testing/File C.xlsx')
results = pd.merge(File_A, File_B[['Code','Color']], on='Code')

results.to_excel('Output_File.xlsx', index=False)

Would anyone have any idea where do I even start, please?

enter image description here

Advertisement

Answer

Try :

dfOut = dfB.copy()
dfOut['Color'] = dfB.merge(dfC, on='Code').apply(lambda r: r.Color_x if r.Color_y == 'Color' else r.Color_y, axis=1)
print(df)

Output

   Code   Color
0     0   Green
1     1  Yellow
2     2  Orange
3     3     Red
4     4   Black
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement