I have a Pandas dataframe and a function that pulls entries from the dataframe. If the requested entry is not present in the dataframe—whether because the requested column does not exist, because the requested row/index does not exist, or both—I would like to return the string 'entry not found' instead of an error message.
import pandas as pd
df = pd.DataFrame({'col1': [12, 13, 14, 15], 'col2': [16, 15, 14, 13]})
Ideally, I would like to write my query function as
def query(col, idx):
    return df.get(col, idx, 'entry not found')
Unfortunately, the df.get() method only accepts two arguments, so I came up with the following alternatives.
def query1(col, idx):
    return df[col, idx]

def query2(col, idx):
    return df[col].get(idx, 'entry not found')

def query3(col, idx):
    return df.get(col, 'entry not found')[idx]

def query4(col, idx):
    return df.get(col, 'entry not found').get(idx, 'entry not found')
Only query2 and query4 work if the user asks for a row that doesn’t exist:
# User asks for a row that doesn't exist.
query1('col1', 24) # KeyError
query2('col1', 24) # 'entry not found'
query3('col1', 24) # ValueError: 24 is not in range
query4('col1', 24) # 'entry not found'
Whereas only query3 (kind of) works if the user asks for a column that doesn’t exist:
# User asks for a column that doesn't exist.
query1('col5', 3) # KeyError
query2('col5', 3) # KeyError
query3('col5', 3) # Returns 'r' ( = 4th char of 'entry not found')
query4('col5', 3) # AttributeError: 'str' object has no attribute 'get'
How can I obtain the desired behavior? Is there a way to do this without a heavy try: ... except: ... block?
Answer
What about using get twice:
def lookup(col, idx):
    """
    `col` is the column indexer, `idx` is the row indexer.
    """
    return df.get(col, {}).get(idx, "entry not found")
The first get looks for the column col:
- if it exists, it returns the column df[col] as a Series
- if it doesn't, it returns an empty dict {} (so that the following get still works)
The second get then looks for the row idx:
- if it is called on df[col] and idx is in the index, it essentially returns df.loc[idx, col]
- otherwise (missing row, or the empty dict from a missing column) it returns "entry not found"
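To check the behavior described above, here is a small self-contained sketch that runs the chained get against the four cases from the question. It assumes the same example dataframe; the default parameter is an addition for illustration and is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'col1': [12, 13, 14, 15], 'col2': [16, 15, 14, 13]})

def lookup(col, idx, default="entry not found"):
    # `default` is a hypothetical convenience parameter, not in the original answer.
    # DataFrame.get returns the column as a Series, or {} if the column is missing.
    # Both Series and dict provide .get(key, default), so the chain does not raise
    # for a missing column or a missing row label.
    return df.get(col, {}).get(idx, default)

print(lookup('col1', 3))    # column and row exist: 15
print(lookup('col1', 24))   # row missing: entry not found
print(lookup('col5', 3))    # column missing: entry not found
print(lookup('col5', 24))   # both missing: entry not found
```

Falling back to {} rather than to a string is what makes the chain safe: a str has no get method (the AttributeError from query4), and indexing it returns a single character (the 'r' from query3).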