Create Dataframe by calling indices of df1 that are listed in df2

Tags: , , ,



I’m new to Python Pandas and struggling with the following problem for a while now.

The following dataframe df1 values show the indices that are coupled to the values of df2 that should be called

              Name1        Name2 ...        Name160          Name161
0              62            18  ...          NaN               75
1              79            46  ...          NaN               5
2               3            26  ...          NaN               0

df2 contains the values that belong to the indices that have to be called.

              Name1         Name2  ...              Name160                 Name161
0             152.0         204.0  ...                NaN                    164.0
1             175.0         308.0  ...                NaN                    571.0
2             252.0         695.0  ...                NaN                    577.0
3             379.0         722.0  ...                NaN                    655.0
4             398.0         834.0  ...                NaN                    675.0
..              ...           ...  ...                ...                      ...
213             NaN           NaN  ...                NaN                      NaN
214             NaN           NaN  ...                NaN                      NaN
215             NaN           NaN  ...                NaN                      NaN
216             NaN           NaN  ...                NaN                      NaN
217             NaN           NaN  ...                NaN                      NaN

For example, df1 shows the value ‘0’ in column ‘Name161’. Then df3 should show the value that is listed in df2 with index 0. In this case ‘164’.

Till so far, I got df3 showing the first 3 values of df2, but of course that not what I would like to achieve.

Input:

    df3 = df1*0
    for c in df1.columns:
          df3[c]= df2[c]
    print(df3)

Output:
        
                    Name1        Name2   ...              Name160                 Name161
        0           152.0         204.0  ...                NaN                    164.0
        1           175.0         308.0  ...                NaN                    571.0
        2           252.0         695.0  ...                NaN                    577.0

Any help would be much appreciated, thanks!

Answer

Use DataFrame.stack with Series.reset_index for reshape both DataFrames, then merging by DataFrame.merge with left join and last pivoting by DataFrame.pivot:

#change index values for match by sample data in df2
print (df1)
   Name1  Name2  Name160  Name161
0      2      4      NaN        4
1      0    213      NaN      216
2      3      2      NaN        0

df11 = df1.stack().reset_index(name='idx')
df22 = df2.stack().reset_index(name='val')

df = (df11.merge(df22, 
                 left_on=['idx','level_1'], 
                 right_on=['level_0','level_1'], 
                 how='left')
         .pivot('level_0_x','level_1','val')
         .reindex(df1.columns, axis=1)
         .rename_axis(None)
         )
print (df)
   Name1  Name2  Name160  Name161
0  252.0  834.0      NaN    675.0
1  152.0    NaN      NaN      NaN
2  379.0  695.0      NaN    164.0


Source: stackoverflow