Join pandas dataframes based on column values

Question

I'm quite new to pandas dataframes, and I'm having some trouble joining two tables. The first df has just 3 columns: DF1: And the second has exactly the same two columns (and plenty of others): DF2: What I need is to perform an operation which, in SQL, would look as follows: And, as a result, I want to see DF2, complemented

Accepted Answer

I think you need merge with the default inner join, but there must be no duplicated combinations of values in the two key columns:

    print (df2)
       item_id  document_id col1  col2  col3
    0      337           10    s     4     7
    1     1002           11    d     5     8
    2     1003           11    f     7     0

    df = pd.merge(df1, df2, on=['document_id','item_id'])
    print (df)
       item_id  position  document_id col1  col2  col3
    0      337         2           10    s     4     7
    1     1002         2           11    d     5     8
    2     1003         3           11    f     7     0

But if you need the position column in position 3:

    df = pd.merge(df2, df1, on=['document_id','item_id'])
    cols = df.columns.tolist()
    df = df[cols[:2] + cols[-1:] + cols[2:-1]]
    print (df)
       item_id  document_id  position col1  col2  col3
    0      337           10         2    s     4     7
    1     1002           11         2    d     5     8
    2     1003           11         3    f     7     0
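
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The contents of df1 are an assumption: the original DF1 table is not shown, so the values are reconstructed from the merged output above. The `validate="one_to_one"` argument is an optional extra, not part of the answer's code; it asks pandas to raise a `MergeError` if a (document_id, item_id) pair is repeated in either frame, which is one way to enforce the "no duplicated combinations" condition.

```python
import pandas as pd

# Assumed sample data: df1 is reconstructed from the merged output,
# because the original DF1 table is not shown in the question.
df1 = pd.DataFrame({
    "item_id": [337, 1002, 1003],
    "position": [2, 2, 3],
    "document_id": [10, 11, 11],
})
df2 = pd.DataFrame({
    "item_id": [337, 1002, 1003],
    "document_id": [10, 11, 11],
    "col1": ["s", "d", "f"],
    "col2": [4, 5, 7],
    "col3": [7, 8, 0],
})

keys = ["document_id", "item_id"]

# Check up front that no key pair is repeated in either frame.
has_dupes = df1.duplicated(subset=keys).any() or df2.duplicated(subset=keys).any()

# Inner join on both key columns; validate raises MergeError on duplicate keys.
merged = pd.merge(df2, df1, on=keys, how="inner", validate="one_to_one")

# Move the last column (position) into the third slot.
cols = merged.columns.tolist()
merged = merged[cols[:2] + cols[-1:] + cols[2:-1]]
print(merged)
```

If the keys are not unique in your real data, dropping `validate` lets the merge go through, but each duplicated pair then produces multiple rows in the result.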

Answer