How to populate columns of a dataframe using a subset of another dataframe?

I have two dataframes like this

I now want to populate the columns prop1 and prop2 in df2 using the values from df1. For each key, df1 has at least as many rows as df2 (in the example above: 5 times A vs 3 times A, 2 times B vs 2 times B, and 3 times C vs 1 time C). For each key, I want to fill df2 using the first n rows for that key from df1, where n is the number of rows with that key in df2.

So, my expected outcome for df2 would be:

As key is not unique, I cannot simply build a dictionary and then use .map.
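To illustrate why a plain dictionary lookup is not enough, here is a minimal sketch. The two small frames are hypothetical and not the data from the question; the point is only that a dict keeps a single value per key, so every row with the same key receives the same value.

```python
import pandas as pd

# Hypothetical toy data, not the frames from the question
df1 = pd.DataFrame({"key": list("AAB"), "prop1": [1, 2, 3]})
df2 = pd.DataFrame({"key": list("AAB")})

# A dict can hold only one value per key, so the later "A" row
# overwrites the earlier one
lookup = dict(zip(df1["key"], df1["prop1"]))

# Both "A" rows in df2 now get the same value instead of 1 and 2
mapped = df2["key"].map(lookup)
print(lookup)
print(mapped.tolist())
```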

I was hoping that something along these lines would work:

but that fails with

ValueError: Shape of passed values is (5, 22), indices imply (5, 10)

as – I guess – the index contains non-unique values.

How can I get my desired output?

Answer

Because key contains duplicate values, one possible solution is to create a counter column (here called g) in both DataFrames with GroupBy.cumcount. The missing values in df2 can then be filled with DataFrame.fillna, which aligns the two frames on the MultiIndex built from the key and g columns.
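The approach can be sketched as follows. The data here is an assumption: it only matches the row counts described in the question (5/2/3 rows for A/B/C in df1, 3/2/1 in df2), the values themselves are made up, and df2 is assumed to hold NaN in prop1 and prop2.

```python
import numpy as np
import pandas as pd

# Hypothetical data: only the per-key row counts follow the question
df1 = pd.DataFrame({
    "key": list("AAAAABBCCC"),
    "prop1": range(10),
    "prop2": range(10, 20),
})
df2 = pd.DataFrame({
    "key": list("AAABBC"),
    "prop1": np.nan,  # assumed to be missing before filling
    "prop2": np.nan,
})

# Number the rows within each key (0, 1, 2, ...) so that
# the pair (key, g) is unique in both frames
df1["g"] = df1.groupby("key").cumcount()
df2["g"] = df2.groupby("key").cumcount()

# fillna with a DataFrame aligns on index and columns, so each
# (key, g) row of df2 takes its values from the same row of df1
filled = (
    df2.set_index(["key", "g"])
       .fillna(df1.set_index(["key", "g"]))
       .reset_index(level="g", drop=True)  # drop the helper counter
       .reset_index()                      # turn key back into a column
)
print(filled)
```

Because the counter starts at 0 within each key, the n rows of df2 for a given key pair up with the first n rows of df1 for that key; the surplus rows in df1 are simply never matched.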

User contributions licensed under: CC BY-SA