I have a pandas DataFrame and a function that pulls entries from it. If the requested entry is not present (because the requested column does not exist, because the requested row/index does not exist, or both), I would like to return the string 'entry not found' instead of an error message.
import pandas as pd

df = pd.DataFrame({'col1': [12, 13, 14, 15], 'col2': [16, 15, 14, 13]})
Ideally, I would like to write my query function as
def query(col, idx):
    return df.get(col, idx, 'entry not found')

Unfortunately, the df.get() method only accepts two arguments (a key and a default), so I came up with the following alternatives.
def query1(col, idx):
    return df[col, idx]

def query2(col, idx):
    return df[col].get(idx, 'entry not found')

def query3(col, idx):
    return df.get(col, 'entry not found')[idx]

def query4(col, idx):
    return df.get(col, 'entry not found').get(idx, 'entry not found')

Only query2 and query4 work if the user asks for a row that doesn’t exist:
# User asks for a row that doesn't exist.
query1('col1', 24)  # KeyError
query2('col1', 24)  # 'entry not found'
query3('col1', 24)  # ValueError: 24 is not in range
query4('col1', 24)  # 'entry not found'

Conversely, only query3 (kind of) works if the user asks for a column that doesn’t exist:
# User asks for a column that doesn't exist.
query1('col5', 3)  # KeyError
query2('col5', 3)  # KeyError
query3('col5', 3)  # Returns 'r' ( = 4th char of 'entry not found')
query4('col5', 3)  # AttributeError: 'str' object has no attribute 'get'

How can I obtain the desired behavior? Is there a way to do this without a heavy try: ... except: ... block?
Answer
What about using get twice?
def lookup(col, idx):
    """
    `col` is the column indexer, `idx` is the row indexer.
    """
    return df.get(col, {}).get(idx, "entry not found")
The first get looks for a column named col:
- if it exists, it returns that column, df[col]
- if it doesn’t, it returns an empty dict {} (so that the subsequent get still works)
The second get then looks for the row idx:
- if the first get returned the column, this essentially returns df.loc[idx, col], provided that row exists
- otherwise, it returns "entry not found"
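
To make the two steps concrete, here is a minimal sketch that splits the chained call into its two get calls and runs it against the dataframe from the question. The intermediate name `column` and the `results` dict are only for illustration; they are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'col1': [12, 13, 14, 15], 'col2': [16, 15, 14, 13]})

def lookup(col, idx):
    # Step 1: fetch the column as a Series, or fall back to an empty dict.
    column = df.get(col, {})
    # Step 2: Series and dict both provide .get(key, default),
    # so this call works whichever one step 1 returned.
    return column.get(idx, "entry not found")

# Illustrative checks covering the four possible cases.
results = {
    "both exist": lookup('col1', 3),        # 15
    "missing row": lookup('col1', 24),      # 'entry not found'
    "missing column": lookup('col5', 3),    # 'entry not found'
    "both missing": lookup('col5', 24),     # 'entry not found'
}
print(results)
```

The empty dict is what keeps the second step safe: unlike the string default in query4, a dict has its own get method, so a missing column falls through to the same "entry not found" default as a missing row.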