I’m trying to use the value of one cell to find the value of a cell in another column. The first cell value (‘source’) dictates which column to lookup.
import pandas as pd df = pd.DataFrame({'A': ['John', 'Andrew', 'Bob', 'Fred'], 'B': [ 'Fred', 'Simon', 'Andrew', 'Andrew'], 'source': ['A', 'B', 'A', 'B']}, ) print(df) A B source 0 John Fred A 1 Andrew Simon B 2 Bob Andrew A 3 Fred Andrew B
My required output value in the ‘output’ column is a lookup of the ‘source’:
A B source output 0 John Fred A John 1 Andrew Simon B Simon 2 Bob Andrew A Bob 3 Fred Andrew B Andrew
Failed attempts
df['output'] = df[df['source']]
This results in a ValueError: Wrong number of items passed 4, placement implies 1 because the df['source']
passes in a Series, not a string. I tried converting to a string using:
df['output'] = df[df['source'].convertDTypes(convert_string=True)]
which gave error AttributeError: ‘Series’ object has no attribute ‘convertDTypes’.
Working solution
I found a solution might by iterating through the rows using:
for index, row in df.iterrows(): column = df.loc[index, 'source'] df.at[index, 'output'] = df.loc[index, column]
However, this post suggests iterating is a bad idea. The code doesn’t seem very elegant, either.
I feel I’ve missed something basic here; this really should not be that hard.
Advertisement
Answer
Let us do numpy
way since lookup
will not longer work in the future version
df['new'] = df.values[df.index,df.columns.get_indexer(df.source)] df Out[339]: A B source new 0 John Fred A John 1 Andrew Simon B Simon 2 Bob Andrew A Bob 3 Fred Andrew B Andrew