I have 3 dataframes (df1, df2 & df3), the main one (df1) and two additional ones which contain 1 column amongst others that I want to bring over to the main dataframe.
Sample dfs:
df1 = {'Col_f': ['Georgia', 'Nevada', 'New York', 'Texas', 'Arizona'], 'Col_g': ['SUV', 'Coupe', 'Wagon', 'Crossover', 'Sedan']} df1 = pd.DataFrame(df1) df2 = {'Col_g': ['SUV', '4x4', 'Wagon', 'Truck', 'Sedan'], 'Objective': ['15%', '13%', '55%', '40.4%', '2.48%']} df2 = pd.DataFrame(df2) df3 = {'Col_f': ['Georgia', 'California', 'Pennsylvania', 'Texas', 'Arizona'], 'Objective': ['15%', '13%', '55%', '40.4%', '2.48%']} df3 = pd.DataFrame(df3)
I am using the following code:
df1_new = pd.merge(df1, df2, on = 'Col_g', how = 'left') Which returns the following df: df_new = {'Col_f': ['Georgia', 'California', 'Pennsylvania', 'Texas', 'Arizona'], 'Col_g': ['SUV', 'Coupe', 'Wagon', 'Crossover', 'Sedan'], 'Objective': ['15%', ' ', '55%', ' ', '2.48%']}
Then for the two empty strings for the “Objective” column I want to continue with a second merge (or an excel vlookup which I guess are the same), to fill in those empty ones.
Code that I thought would be used for the second merge:
df1_newer = pd.merge(df_new, df3, on = 'Col_f', how = 'left')
Desired final output.
df_newer = {'Col_f': ['Georgia', 'California', 'Pennsylvania', 'Texas', 'Arizona'], 'Col_g': ['SUV', '4x4', 'Wagon', 'Truck', 'Sedan'], 'Objective': ['15%', '13%', '55%', '40.4%', '2.48%']}
Any suggestions would be more than appreciated!
Advertisement
Answer
After merging the three dataframes, you can use mask
or np.where
to conditionally assign value
df1_new = pd.merge(df1, df2, on='Col_g', how='left')['Objective'] df1_newer = pd.merge(df1, df3, on='Col_f', how='left')['Objective']
print(df1_new) 0 NaN 1 NaN 2 55% 3 NaN 4 2.48% Name: Objective, dtype: object print(df1_newer) 0 15% 1 NaN 2 NaN 3 40.4% 4 2.48% Name: Objective, dtype: object
df1['Objective'] = df1_new.mask(df1_new.isna(), df1_newer) #or import numpy as np df1['Objective'] = np.where(df1_new.isna(), df1_newer, df1_new)
print(df1) Col_f Col_g Objective 0 Georgia Minivan 15% 1 Nevada Coupe NaN 2 New York Wagon 55% 3 Texas Crossover 40.4% 4 Arizona Sedan 2.48%