I have written a code that merges File B into File A based on a column ‘Code’. Some of the values from File B, however, are generic (‘Color’) and I would need to do another merge with file C – but instead of creating a new column I would like to use the same column created during the first merge and whenever first merge returned value ‘Color’ only for those rows do merge with file C to get the proper value.
I went as far as merging A with B:
import pandas as pd File_A = pd.read_excel(r'.../My Files/Python/Supplier cat testing/File A.xlsx') File_B = pd.read_excel(r'.../My Files/Python/Supplier cat testing/File B.xlsx') File_C = pd.read_excel(r'.../My Files/Python/Supplier cat testing/File C.xlsx') results = pd.merge(File_A, File_B[['Code','Color']], on='Code') results.to_excel('Output_File.xlsx', index=False)
Would anyone have any idea where do I even start, please?
Advertisement
Answer
Try :
dfOut = dfB.copy() dfOut['Color'] = dfB.merge(dfC, on='Code').apply(lambda r: r.Color_x if r.Color_y == 'Color' else r.Color_y, axis=1) print(df)
Output
Code Color 0 0 Green 1 1 Yellow 2 2 Orange 3 3 Red 4 4 Black