I am merging two data frames with pandas. I would like to avoid that, when joining, the output includes the join column of the right table.
Example:
import pandas as pd age = [['tom', 10], ['nick', 15], ['juli', 14]] df1 = pd.DataFrame(age, columns = ['Name', 'Age']) toy = [['tom', 'GIJoe'], ['nick', 'car']] df2 = pd.DataFrame(toy, columns = ['Name_child', 'Toy']) df = pd.merge(df1,df2,left_on='Name',right_on='Name_child',how='left')
df.columns
will give the output Index(['Name', 'Age', 'Name_child', 'Toy'], dtype='object')
. Is there an easy way to obtain Index(['Name', 'Age', 'Toy'], dtype='object')
instead? I can drop the column afterwards of course like this del df['Name_child']
, but I’d like my code to be as short as possible.
Advertisement
Answer
Based on @mgc comments, you don’t have to rename the columns of df2. Just you pass df2 to merge
function with renamed columns. df2 column names will remain as it is.
df = pd.merge(df1,df2.rename(columns={'Name_child': 'Name'}),on='Name', how='left') df Name Age Toy 0 tom 10 GIJoe 1 nick 15 car 2 juli 14 NaN df.columns Index(['Name', 'Age', 'Toy'], dtype='object') df2.columns Index(['Name_child', 'Toy'], dtype='object')