I am facing the below challenge. For instance, let the dummy dataframes be, Let another dataframe be, The output dataframe should be the following, My train of thought was to create dictionary(s), in this case, would be, followed by this function, I am always getting the following error, Also I think this is not an efficient solution at all. Are

How to perform index/match excel function equivalent using pandas?

I am facing the below challenge.

For instance, let the dummy dataframes be,

 col1_a | A | B | C
    a   | 1 | 4 | 7
    b   | 2 | 5 | 8
    c   | 3 | 6 | 9

JavaScript
​x
 
 col1_a | A | B | C
    a   | 1 | 4 | 7
    b   | 2 | 5 | 8
    c   | 3 | 6 | 9
​

Let another dataframe be,

 col1_b |col2
    b   | B
    c   | C
    a   | A

JavaScript
 
 col1_b |col2
    b   | B
    c   | C
    a   | A
​
​

The output dataframe should be the following,

 col1_b | col2 | output
    b   |  B   |  5
    c   |  C   |  9
    a   |  A   |  1

JavaScript
 
 col1_b | col2 | output
    b   |  B   |  5
    c   |  C   |  9
    a   |  A   |  1
​
​

My train of thought was to create dictionary(s), in this case, would be,

A, B, C

({'a': 1, 'b': 2, 'c': 3}, {'a': 4, 'b': 5, 'c': 6}, {'a': 7, 'b': 8, 'c': 9})

JavaScript
 
A, B, C
​
({'a': 1, 'b': 2, 'c': 3}, {'a': 4, 'b': 5, 'c': 6}, {'a': 7, 'b': 8, 'c': 9})
​

followed by this function,

def func_test1(row):
    if row['col2'] == 'B':
        return test1_df.col1_b.map(B)
    elif row['col2'] == 'C':
        return test1_df.col1_b.map(C)
    elif test1_df['col2'] == 'A':
        return test1_df.col1_b.map(A)
    
test1_df['output'] = test1_df.apply(func_test1, axis=1)

JavaScript
 
def func_test1(row):
    if row['col2'] == 'B':
        return test1_df.col1_b.map(B)
    elif row['col2'] == 'C':
        return test1_df.col1_b.map(C)
    elif test1_df['col2'] == 'A':
        return test1_df.col1_b.map(A)
    
test1_df['output'] = test1_df.apply(func_test1, axis=1)
​

I am always getting the following error,

 ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', 'occurred at index 2')

JavaScript
 
 ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', 'occurred at index 2')
​

Also I think this is not an efficient solution at all.

Are there pandas inbuilt function(s) to achieve the goal?

Answer

Use DataFrame.join with unpivot df1 by DataFrame.set_index with DataFrame.stack and rename for new column name:

df = df2.join(df1.set_index('col1_a').stack().rename('output'), on=['col1_b','col2'])
print (df)
  col1_b col2  output
0      b    B       5
1      c    C       9
2      a    A       1

JavaScript
 
df = df2.join(df1.set_index('col1_a').stack().rename('output'), on=['col1_b','col2'])
print (df)
  col1_b col2  output
0      b    B       5
1      c    C       9
2      a    A       1
​

Another idea is use DataFrame.melt, rename column and DataFrame.merge:

df1 = (df1.melt('col1_a', var_name='col2', value_name='output')
          .rename(columns={'col1_a':'col1_b'}))
    
df = df2.merge(df1, on=['col1_b','col2'], how='left')
print (df)
  col1_b col2  output
0      b    B       5
1      c    C       9
2      a    A       1

JavaScript
 
df1 = (df1.melt('col1_a', var_name='col2', value_name='output')
          .rename(columns={'col1_a':'col1_b'}))
    
df = df2.merge(df1, on=['col1_b','col2'], how='left')
print (df)
  col1_b col2  output
0      b    B       5
1      c    C       9
2      a    A       1
​

Advertisement

Answer