Skip to content
Advertisement

Pandas to lookup and return corresponding values from many dataframes

A list of names and I want to retrieve each of the correspondent information in different data frames, to form a new dataframe.

I converted the list into a 1 column dataframe, then to look up its corresponding values in different dataframes.

The idea is visualized as:

enter image description here

I have tried:

import pandas as pd

data = {'Name': ["David","Mike","Lucy"]}

data_h = {'Name': ["David","Mike","Peter", "Lucy"],
    'Hobby': ['Music','Sports','Cooking','Reading'],
    'Member': ['Yes','Yes','Yes','No']}

data_s = {'Name': ["David","Lancy", "Mike","Lucy"],
    'Speed': [56, 42, 35, 66],
    'Location': ['East','East','West','West']}

df = pd.DataFrame(data)
df_hobby = pd.DataFrame(data_h)
df_speed = pd.DataFrame(data_s)

df['Hobby'] = df.lookup(df['Name'], df_hobby['Hobby'])

print (df)

But it returns the error message as:

ValueError: Row labels must have same size as column labels

I have also tried:

df = pd.merge(df, df_hobby, on='Name')

It works but it includes unnecessary columns.

What will be the smart an efficient way to do such, especially when the number of to-be-looked-up dataframes are many?

Thank you.

Advertisement

Answer

Filter only columns for merge and columns for append like:

df = (pd.merge(df, df_hobby[['Name','Hobby']], on='Name')
        .merge(df_speed[['Name','Location']], on='Name'))
 
print(df)
    Name    Hobby Location
0  David    Music     East
1   Mike   Sports     West
2   Lucy  Reading     West

If want working with list use this solution with filtering columns:

dfList = [df,  
          df_hobby[['Name','Hobby']],
          df_speed[['Name','Location']]]

from functools import reduce
df = reduce(lambda df1,df2: pd.merge(df1,df2,on='Name'), dfList)
print (df)
    Name    Hobby Location
0  David    Music     East
1   Mike   Sports     West
2   Lucy  Reading     West
User contributions licensed under: CC BY-SA
8 People found this is helpful
Advertisement