I have a list of names, and I want to retrieve the corresponding information for each name from several different dataframes to form a new dataframe.
I converted the list into a one-column dataframe, then tried to look up its corresponding values in the other dataframes.
The idea is visualized as:
I have tried:
import pandas as pd

data = {'Name': ["David","Mike","Lucy"]}

data_h = {'Name': ["David","Mike","Peter", "Lucy"],
          'Hobby': ['Music','Sports','Cooking','Reading'],
          'Member': ['Yes','Yes','Yes','No']}

data_s = {'Name': ["David","Lancy", "Mike","Lucy"],
          'Speed': [56, 42, 35, 66],
          'Location': ['East','East','West','West']}

df = pd.DataFrame(data)
df_hobby = pd.DataFrame(data_h)
df_speed = pd.DataFrame(data_s)

df['Hobby'] = df.lookup(df['Name'], df_hobby['Hobby'])

print(df)
But it raises this error:
ValueError: Row labels must have same size as column labels
I have also tried:
df = pd.merge(df, df_hobby, on='Name')
It works but it includes unnecessary columns.
What would be a smart and efficient way to do this, especially when there are many dataframes to look up?
Thank you.
Answer
Before merging, select only the key column used for the merge and the columns you want to append, like:
df = (pd.merge(df, df_hobby[['Name','Hobby']], on='Name')
        .merge(df_speed[['Name','Location']], on='Name'))

print(df)
    Name    Hobby Location
0  David    Music     East
1   Mike   Sports     West
2   Lucy  Reading     West
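Note that `pd.merge` defaults to an inner join, so any name that is missing from one of the lookup dataframes is dropped from the result. If every name in the list should be kept, passing `how='left'` is one option. A minimal sketch, reusing the question's data plus a made-up name "Zoe" (not in the original question) that appears in neither lookup dataframe:

```python
import pandas as pd

# "Zoe" is a hypothetical extra name, added only to show the unmatched case
df = pd.DataFrame({'Name': ["David", "Mike", "Lucy", "Zoe"]})

df_hobby = pd.DataFrame({'Name': ["David", "Mike", "Peter", "Lucy"],
                         'Hobby': ['Music', 'Sports', 'Cooking', 'Reading'],
                         'Member': ['Yes', 'Yes', 'Yes', 'No']})

df_speed = pd.DataFrame({'Name': ["David", "Lancy", "Mike", "Lucy"],
                         'Speed': [56, 42, 35, 66],
                         'Location': ['East', 'East', 'West', 'West']})

# how='left' keeps every row of the left frame; unmatched names get missing values
result = (df.merge(df_hobby[['Name', 'Hobby']], on='Name', how='left')
            .merge(df_speed[['Name', 'Location']], on='Name', how='left'))

print(result)
```

Here the "Zoe" row stays in the result with missing values in `Hobby` and `Location`, whereas the default inner join would have removed it.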
If you want to work with a list of dataframes, use this solution, again filtering the columns first:
from functools import reduce

dfList = [df,
          df_hobby[['Name','Hobby']],
          df_speed[['Name','Location']]]

df = reduce(lambda df1,df2: pd.merge(df1,df2,on='Name'), dfList)
print(df)
    Name    Hobby Location
0  David    Music     East
1   Mike   Sports     West
2   Lucy  Reading     West
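If only a single column is needed from each dataframe, another possible approach is `Series.map`, which is close in spirit to the lookup attempted in the question (as far as I know, `DataFrame.lookup` was deprecated in pandas 1.2 and removed in 2.0). A minimal sketch, assuming the names are unique within each lookup dataframe; the `lookups` list and the `result` name are illustrative, not part of the original code:

```python
import pandas as pd

df = pd.DataFrame({'Name': ["David", "Mike", "Lucy"]})

df_hobby = pd.DataFrame({'Name': ["David", "Mike", "Peter", "Lucy"],
                         'Hobby': ['Music', 'Sports', 'Cooking', 'Reading'],
                         'Member': ['Yes', 'Yes', 'Yes', 'No']})

df_speed = pd.DataFrame({'Name': ["David", "Lancy", "Mike", "Lucy"],
                         'Speed': [56, 42, 35, 66],
                         'Location': ['East', 'East', 'West', 'West']})

# (lookup dataframe, column to pull) pairs; extend this list for more dataframes
lookups = [(df_hobby, 'Hobby'), (df_speed, 'Location')]

result = df.copy()
for frame, col in lookups:
    # Build a Name -> value Series, then map each name onto it.
    # Assumes 'Name' is unique in `frame`; names not found become missing values.
    result[col] = result['Name'].map(frame.set_index('Name')[col])

print(result)
```

Unlike the default inner merge, this keeps every row of `df` and leaves a missing value where a name has no match.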