I have a dataframe like the one below:
import pandas as pd

data_df = pd.DataFrame({'p_id': ['abc@gmail.com','abc@gmail.com','abc@gmail.com','ace@gmail.com','ace@gmail.com','pqr@gmail.com','pqr@gmail.com'],
                        'company': ['a','b','c','d','e','f','g'],
                        'dept_access':['a1','a1','a1','a1','a2','a2','a2']})

key_df = pd.DataFrame({'p_id': ['abc@gmail.com','xyz@gmail.com','pqr@gmail.com'],
                       'company': ['a','c','b'],
                       'location':['UK','USA','KOREA']})
I would like to do the following:
a) Attach the location column from key_df to data_df based on two fields: p_id and company
So, I tried the below:
loc = key_df.drop_duplicates(['p_id','company']).set_index(['p_id','company'])['location']
data_df['location'] = data_df[['p_id','company']].map(loc)
But this resulted in an error like the one below:
KeyError: "None of [Index(['p_id','company'], dtype='object')] are in the [columns]"
How can I map based on multiple index columns? I don’t wish to use merge.
Answer
Merge covers many lookups like this one, so let’s try it first:
data_df.merge(key_df, on=['p_id', 'company'], how="left")
            p_id company dept_access location
0  abc@gmail.com       a          a1       UK
1  abc@gmail.com       b          a1      NaN
2  abc@gmail.com       c          a1      NaN
3  ace@gmail.com       d          a1      NaN
4  ace@gmail.com       e          a2      NaN
5  pqr@gmail.com       f          a2      NaN
6  pqr@gmail.com       g          a2      NaN
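The question de-duplicates key_df before the lookup, and the same precaution can be applied to the merge so that a left join never adds rows to data_df. What follows is a minimal sketch under that assumption; the names key_unique and merged are illustrative and not part of the original answer.

```python
import pandas as pd

data_df = pd.DataFrame({
    'p_id': ['abc@gmail.com', 'abc@gmail.com', 'abc@gmail.com',
             'ace@gmail.com', 'ace@gmail.com', 'pqr@gmail.com', 'pqr@gmail.com'],
    'company': ['a', 'b', 'c', 'd', 'e', 'f', 'g'],
    'dept_access': ['a1', 'a1', 'a1', 'a1', 'a2', 'a2', 'a2'],
})
key_df = pd.DataFrame({
    'p_id': ['abc@gmail.com', 'xyz@gmail.com', 'pqr@gmail.com'],
    'company': ['a', 'c', 'b'],
    'location': ['UK', 'USA', 'KOREA'],
})

# Hypothetical helper name: one row per (p_id, company) pair in the lookup table.
key_unique = key_df.drop_duplicates(['p_id', 'company'])

merged = data_df.merge(
    key_unique,
    on=['p_id', 'company'],
    how='left',
    # Ask pandas to raise if the right-hand keys are not unique.
    validate='many_to_one',
)
print(merged)
```

With how="left" and unique keys on the right side, merged keeps one row per row of data_df, and pairs that are missing from key_df get NaN in location.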
You can also do this by mapping the index like this:
idx = ['p_id', 'company']

data_df.assign(location=data_df.set_index(idx).index.map(key_df.set_index(idx)['location']))
            p_id company dept_access location
0  abc@gmail.com       a          a1       UK
1  abc@gmail.com       b          a1      NaN
2  abc@gmail.com       c          a1      NaN
3  ace@gmail.com       d          a1      NaN
4  ace@gmail.com       e          a2      NaN
5  pqr@gmail.com       f          a2      NaN
6  pqr@gmail.com       g          a2      NaN
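The index-mapping approach can also be split into named steps, which may make it easier to see what is being mapped through what. This is only a sketch: the variable names loc, keys and result are illustrative, pd.MultiIndex.from_frame is used here as an equivalent of set_index(idx).index, and the drop_duplicates call is carried over from the question on the assumption that each (p_id, company) pair should map to a single location.

```python
import pandas as pd

data_df = pd.DataFrame({
    'p_id': ['abc@gmail.com', 'abc@gmail.com', 'abc@gmail.com',
             'ace@gmail.com', 'ace@gmail.com', 'pqr@gmail.com', 'pqr@gmail.com'],
    'company': ['a', 'b', 'c', 'd', 'e', 'f', 'g'],
    'dept_access': ['a1', 'a1', 'a1', 'a1', 'a2', 'a2', 'a2'],
})
key_df = pd.DataFrame({
    'p_id': ['abc@gmail.com', 'xyz@gmail.com', 'pqr@gmail.com'],
    'company': ['a', 'c', 'b'],
    'location': ['UK', 'USA', 'KOREA'],
})

idx = ['p_id', 'company']

# Lookup Series indexed by (p_id, company); duplicate keys are dropped first.
loc = key_df.drop_duplicates(idx).set_index(idx)['location']

# MultiIndex of (p_id, company) tuples, in the same row order as data_df.
keys = pd.MultiIndex.from_frame(data_df[idx])

# Map each tuple through the lookup; pairs absent from loc become NaN.
result = data_df.assign(location=keys.map(loc))
print(result)
```

Because assign returns a new frame, data_df itself is left unchanged; assign the result back to data_df if the column should be kept.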