Skip to content
Advertisement

pandas: append rows to another dataframe under the similar row based on multiple columns

I asked quite a similar question here but was wondering if there is a way to tackle the issue if one has to rely on multiple columns to perform the append. So the dataframes look as follows,

    import pandas as pd
d1 ={'col1': ['I ate dinner','I ate dinner', 'the play was inetresting','the play was inetresting'],
'col2': ['I ate dinner','I went to school', 'the play was inetresting for her','the gold is shining'],
'col3': ['I went out','I did not stay at home', 'the play was inetresting for her','the house is nice'],
 'col4': ['min', 'max', 'mid','min'],
 'col5': ['min', 'max', 'max','max']}

d2 ={'col1': ['I ate dinner',' the glass is shattered', 'the play was inetresting'],
'col2': ['I ate dinner',' the weather is nice', 'the gold is shining'],
'col3': ['I went out',' the house was amazing', 'the house is nice'],
     'col4': ['min', 'max', 'max'],
     'col5': ['max', 'min', 'mid']}

df1 = pd.DataFrame(d1)
df2 = pd.DataFrame(d2)

so this time, i would like to append the rows from df2 under similar rows in df1 only if the rows are similar in all col1, col2, col3. so the output is,

    col1         col2          col3       col4 col5
 0 I ate dinner  I ate dinner  I went out  min  min
 1 I ate dinner  I ate dinner  I went out  min  max
 2  the play was inetresting the gold is shining  the house is nice  min  max
 3  the play was inetresting the gold is shining  the house is nice  max  mid

so I tried the following,

df = pd.concat(df1[df1.set_index(['col1','col2','col3']).index.isin(df2.set_index(['col1','col2','col3']).index)]).sort_values(df1.set_index(['col1','col2','col3']).index, ignore_index=True)

but I get this error,

 TypeError: first argument must be an iterable of pandas objects, you passed an object of type "DataFrame"

Advertisement

Answer

Another solution is to use pd.merge and pd.wide_to_long:

out = (
    pd.wide_to_long(
        pd.merge(df1, df2, how='inner', on=['col1', 'col2', 'col3']).reset_index(),
        stubnames=['col4', 'col5'], i='index', j='val', sep='_', suffix=r'[xy]')
      .sort_index().reset_index(drop=True)[df1.columns]
)

Output:

>>> out
                       col1                 col2               col3 col4 col5
0              I ate dinner         I ate dinner         I went out  min  min
1              I ate dinner         I ate dinner         I went out  min  max
2  the play was inetresting  the gold is shining  the house is nice  min  max
3  the play was inetresting  the gold is shining  the house is nice  max  mid

Step by step

# Step 1: merge
>>> out = pd.merge(df1, df2, how='inner', on=['col1', 'col2', 'col3']).reset_index()
   index                      col1                 col2               col3 col4_x col5_x col4_y col5_y
0      0              I ate dinner         I ate dinner         I went out    min    min    min    max
1      1  the play was inetresting  the gold is shining  the house is nice    min    max    max    mid

# Step 2: wide_to_long
>>> out = pd.wide_to_long(out, stubnames=['col4', 'col5'], i='index', j='val', sep='_', suffix=r'[xy]')
                        col3                 col2                      col1 col4 col5
index val                                                                            
0     x           I went out         I ate dinner              I ate dinner  min  min
1     x    the house is nice  the gold is shining  the play was inetresting  min  max
0     y           I went out         I ate dinner              I ate dinner  min  max
1     y    the house is nice  the gold is shining  the play was inetresting  max  mid

# Step 3: reorder dataframe
>>> out = out.sort_index().reset_index(drop=True)[df1.columns]
                       col1                 col2               col3 col4 col5
0              I ate dinner         I ate dinner         I went out  min  min
1              I ate dinner         I ate dinner         I went out  min  max
2  the play was inetresting  the gold is shining  the house is nice  min  max
3  the play was inetresting  the gold is shining  the house is nice  max  mid
User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement