How to populate columns of a dataframe using a subset of another dataframe?

Question

I have two dataframes like this. I now want to populate the columns prop1 and prop2 in df2 using the values of df1. For each key, df1 has at least as many rows as df2 (in the example above: 5 times A vs 3 times A, 2 times B vs 2 times B and 3 times C vs

Accepted Answer

Because the key values contain duplicates, one possible solution is to create a new counter column in both DataFrames with GroupBy.cumcount. The missing values in df2 can then be replaced with DataFrame.fillna, which aligns on the MultiIndex built from the key and g columns:

    df1['g'] = df1.groupby('key').cumcount()
    df2['g'] = df2.groupby('key').cumcount()

    print (df1)
      key prop1 prop2  g
    0   A     x     m  0
    1   A     y     n  1
    2   A     z     b  2
    3   B     u     n  0
    4   B     u     b  1
    5   C     y     b  0
    6   C     x     n  1
    7   A     z     n  3
    8   A     z     n  4
    9   C     z     n  2

    print (df2)
      key  prop1  prop2 keep_me  g
    0   A    NaN    NaN   stuff  0
    1   B    NaN    NaN   stuff  0
    2   B    NaN    NaN   stuff  1
    3   C    NaN    NaN   stuff  0
    4   A    NaN    NaN   stuff  1
    5   A    NaN    NaN   stuff  2

    df = (df2.set_index(['key','g'])
             .fillna(df1.set_index(['key','g']))
             .reset_index(level=1, drop=True)
             .reset_index())

    print (df)
      key prop1 prop2 keep_me
    0   A     x     m   stuff
    1   B     u     n   stuff
    2   B     u     b   stuff
    3   C     y     b   stuff
    4   A     y     n   stuff
    5   A     z     b   stuff
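For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The input frames are an assumption: they are rebuilt from the printed output in the answer, since the original construction code is not shown. The variable name result is also just illustrative.

```python
import numpy as np
import pandas as pd

# Assumed inputs, reconstructed from the frames printed in the answer.
df1 = pd.DataFrame({
    "key":   list("AAABBCCAAC"),
    "prop1": list("xyzuuyxzzz"),
    "prop2": list("mnbnbbnnnn"),
})
df2 = pd.DataFrame({
    "key": list("ABBCAA"),
    "prop1": np.nan,      # empty columns to be populated from df1
    "prop2": np.nan,
    "keep_me": "stuff",   # column that must survive unchanged
})

# Number the rows within each key: the 1st A gets 0, the 2nd A gets 1, ...
df1["g"] = df1.groupby("key").cumcount()
df2["g"] = df2.groupby("key").cumcount()

# (key, g) is unique in both frames, so fillna can align on it.
result = (
    df2.set_index(["key", "g"])
       .fillna(df1.set_index(["key", "g"]))
       .reset_index(level=1, drop=True)   # drop the helper counter g
       .reset_index()                     # turn key back into a column
)

print(result)
```

Under these assumed inputs, the n-th occurrence of a key in df2 receives the values of the n-th occurrence of that key in df1, and surplus rows in df1 are ignored. The row order of df2 is kept because set_index does not sort.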

Answer