I am facing the below challenge.
For instance, let the dummy dataframes be,
JavaScript
x
5
1
col1_a | A | B | C
2
a | 1 | 4 | 7
3
b | 2 | 5 | 8
4
c | 3 | 6 | 9
5
Let another dataframe be,
JavaScript
1
6
1
col1_b |col2
2
b | B
3
c | C
4
a | A
5
6
The output dataframe should be the following,
JavaScript
1
6
1
col1_b | col2 | output
2
b | B | 5
3
c | C | 9
4
a | A | 1
5
6
My train of thought was to create dictionary(s), in this case, would be,
JavaScript
1
4
1
A, B, C
2
3
({'a': 1, 'b': 2, 'c': 3}, {'a': 4, 'b': 5, 'c': 6}, {'a': 7, 'b': 8, 'c': 9})
4
followed by this function,
JavaScript
1
10
10
1
def func_test1(row):
2
if row['col2'] == 'B':
3
return test1_df.col1_b.map(B)
4
elif row['col2'] == 'C':
5
return test1_df.col1_b.map(C)
6
elif test1_df['col2'] == 'A':
7
return test1_df.col1_b.map(A)
8
9
test1_df['output'] = test1_df.apply(func_test1, axis=1)
10
I am always getting the following error,
JavaScript
1
2
1
('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', 'occurred at index 2')
2
Also I think this is not an efficient solution at all.
Are there pandas inbuilt function(s) to achieve the goal?
Advertisement
Answer
Use DataFrame.join
with unpivot df1
by DataFrame.set_index
with DataFrame.stack
and rename
for new column name:
JavaScript
1
7
1
df = df2.join(df1.set_index('col1_a').stack().rename('output'), on=['col1_b','col2'])
2
print (df)
3
col1_b col2 output
4
0 b B 5
5
1 c C 9
6
2 a A 1
7
Another idea is use DataFrame.melt
, rename
column and DataFrame.merge
:
JavaScript
1
10
10
1
df1 = (df1.melt('col1_a', var_name='col2', value_name='output')
2
.rename(columns={'col1_a':'col1_b'}))
3
4
df = df2.merge(df1, on=['col1_b','col2'], how='left')
5
print (df)
6
col1_b col2 output
7
0 b B 5
8
1 c C 9
9
2 a A 1
10