How to combine queries with a single external variable using Pandas

I am trying to accept a variable number of search terms, separated by commas, from an HTML form (@search) and use them to query 2 columns of a dataframe.

Each column query works on its own, but I cannot get them to work together in an and/or way.

First column query:

filtered = df.query ('`Drug Name` in @search')

Second column query:

filtered = df.query ('BP.str.contains(@search, na=False)', engine='python')

Edit: combining them like this:

filtered = df.query ("('`Drug Name` in @search') and ('BP.str.contains(@search, na=False)', engine='python')")

gives the following error, highlighting the python identifier in the engine argument:

SyntaxError: Python keyword not valid identifier in numexpr query

edit 2

The dataframe is read from an Excel file, with columns: Drug Name (containing a single drug name), BP, U&E (with long descriptive text entries).

The search terms will be input via an HTML form:

search = request.values.get('searchinput').replace(" ","").split(',')

as a list of drugs which a patient may be on, sometimes with the addition of specific conditions relating to medication use. Sample user input:

Captopril, Paracetamol, kidney disease, chronic

I want the list to be checked against specific drug names and also to check other columns such as BP and U&E for any mention of the search terms.

edit 3

Apologies, but trying to implement the answers given is giving me stacks of errors. What I have below gives me 90% of what I'm after, letting me search both columns, including the whole contents of 'BP'. But I can only search a single term via the terminal. If I comment out and swap the lines which collect the user input (taking it from the HTML form as opposed to the terminal) I get:

TypeError: unhashable type: 'list'

@app.route('/', methods=("POST", "GET"))
def html_table():
    searchterms = []
    #searchterms = request.values.get('searchinput').replace(" ","").split(',')
    searchterms = input("Enter drug...")
    filtered = df.query('`Drug Name` in @searchterms | BP.str.contains(@searchterms, na=False)', engine='python')
    return render_template('drugsafety.html', tables=[filtered.to_html(classes='data')], titles=['na', 'Drug List'])

<form action="" method="post">
  <p><label for="search">Search</label>
  <input type="text" name="searchinput"></p>        
  <p><input type="submit"></p>
</form>

Sample data

The contents of the BP column can be quite long, descriptive and variable but an example is:

Every 12 months – Patients with CKD every 3 to 6 months.

Drug Name         BP                            U&E
Perindopril       Every 12 months               Not needed
Alendronic Acid   Not needed                    Every 12 months
Allopurinol       Whilst titrating - 3 months   Not needed

With this line:

searchterms = request.values.get('searchinput')

Entering ‘months’ into the html form outputs:

1   Perindopril  Every 12 months                Not needed 
14  Allopurinol  Whilst titrating – 3 months    Not needed

All good.

Entering ‘Alendronic Acid’ into the html form outputs:

13  Alendronic Acid Not needed  Every 12 months

Also good, but entering ‘Perindopril, Allopurinol’ returns nothing.

If I change the line to:

searchterms = request.values.get('searchinput').replace(" ","").split(',')

I get TypeError: unhashable type: 'list' when the page reloads.

However, if I then change:

filtered = df.query('`Drug Name` in @searchterms | BP.str.contains(@searchterms, na=False)', engine='python')

to:

filtered = df.query('`Drug Name` in @searchterms')

Then the unhashable type error goes and entering ‘Perindopril, Allopurinol’ returns:

1   Perindopril   Every 12 months                   Not needed
14  Allopurinol   Whilst titrating – Every 3 months Not needed

But I’m now no longer searching the BP column for the searchterms.

I just thought that maybe it's because searchterms is a list '[]', so I changed it to a tuple '()'. That didn't change anything.

Any help is much appreciated.

Answer

I am assuming you want to query 2 columns and want to return the row if any of the query matches.

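In this line, the issue is that engine='python' is inside the query string, so pandas tries to parse it as part of the expression instead of receiving it as a keyword argument of query():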

filtered = df.query ("('`Drug Name` in @search') and ('BP.str.contains(@search, na=False)', engine='python')")

The engine argument belongs outside the expression string, like this:

df.query("BP.str.contains(@search, na=False)", engine='python')

If you do searchterms = request.values.get('searchinput').replace(" ","").split(','), it converts your string to a list of words, which causes the unhashable type: 'list' error because str.contains expects a single str pattern as input, not a list.
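
The same message can be reproduced without pandas. This is only a minimal sketch, assuming that str.contains with regex matching hands its pattern to Python's re module; the terms list is illustrative:

```python
import re

terms = ["Perindopril", "Allopurinol"]

# Passing the list itself as a pattern fails: re expects a single str (or bytes).
try:
    re.compile(terms)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Joining the terms into one string gives re a valid pattern.
pattern = "|".join(terms)
compiled = re.compile(pattern)
print(bool(compiled.search("Perindopril 4mg")))
```

So the fix is to turn the list into one string pattern before it reaches str.contains.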

What you can do instead is join the search terms into a single regex. It will look something like this:

df.query("BP.str.contains('|'.join(@search), na=False, regex=True)", engine='python')

This searches for all the individual terms with one regex: '|'.join(@search) produces "searchterm_1|searchterm_2|…", and "|" means "or" in regex, so it looks for searchterm_1 or searchterm_2 in each BP column value.
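
One caveat when joining raw user input: characters such as ( or + have a special meaning in a regex. Here is a minimal sketch of building the pattern with those characters escaped. build_pattern is a hypothetical helper name introduced for illustration; re.escape is from the standard library:

```python
import re

def build_pattern(terms):
    """Join search terms into one regex alternation, escaping regex metacharacters."""
    cleaned = [term.strip() for term in terms if term.strip()]
    return "|".join(re.escape(term) for term in cleaned)

# The empty string is dropped so it cannot turn into an empty alternative.
pattern = build_pattern(["kidney disease", "CKD (stage 3)", ""])
print(pattern)
```

The result could then be referenced as @pattern in the query instead of calling join inside the expression. An empty pattern matches every string, so it is probably worth checking that at least one term survives the cleaning.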

To combine the outputs of both queries, you can run them separately and concatenate the results:

pd.concat([df.query("`Drug Name` in @search", engine='python'),df.query("BP.str.contains('|'.join(@search), na=False, regex=True)", engine='python')])
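
Note that a row matching both conditions appears twice in the concatenated result. As an alternative, here is a minimal sketch that puts both conditions in one query joined with or, so each row is returned at most once. The sample frame is a cut-down copy of the question's data, and pattern is a name introduced here for illustration:

```python
import re
import pandas as pd

df = pd.DataFrame({
    "Drug Name": ["Perindopril", "Alendronic Acid", "Allopurinol"],
    "BP": ["Every 12 months", "Not needed", "Whilst titrating - 3 months"],
    "U&E": ["Not needed", "Every 12 months", "Not needed"],
})

search = ["Perindopril", "titrating"]
# One regex alternation built outside the query string, with metacharacters escaped.
pattern = "|".join(re.escape(term) for term in search)

# engine='python' is a keyword argument of query(), not part of the expression.
filtered = df.query(
    "`Drug Name` in @search or BP.str.contains(@pattern, na=False)",
    engine="python",
)
print(filtered)
```

Here Perindopril matches on its name and Allopurinol matches on the text in BP, and the original row order is preserved.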

Also, string matching is strict: the in test needs an exact match, and str.contains is case-sensitive by default, so you may want to lowercase everything in both the dataframe and the query. Whitespace matters in the same way, and the .replace(" ","") call removes the spaces inside multi-word terms.
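
A minimal sketch of case-insensitive matching on both columns, using plain boolean masks rather than query(). The sample frame and variable names are illustrative; str.lower, isin and the case parameter of str.contains are standard pandas:

```python
import re
import pandas as pd

df = pd.DataFrame({
    "Drug Name": ["Perindopril", "Alendronic Acid", "Allopurinol"],
    "BP": ["Every 12 months", "Not needed", "Whilst titrating - 3 months"],
})

search = ["perindopril", "WHILST"]

# Exact name match, ignoring case: lowercase both the column and the terms.
name_match = df["Drug Name"].str.lower().isin([term.lower() for term in search])

# Substring match in BP, ignoring case via case=False.
pattern = "|".join(re.escape(term) for term in search)
text_match = df["BP"].str.contains(pattern, case=False, na=False)

# Keep a row if either condition holds.
filtered = df[name_match | text_match]
print(filtered)
```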

If you do searchterms = request.values.get('searchinput').replace(" ","").split(',') on Every 12 months, it gets converted to "Every12months". So you can remove the .replace() part and just use searchterms = request.values.get('searchinput').split(','), though each term then keeps any space that followed its comma.
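
A minimal sketch of splitting the form input while keeping the spaces inside multi-word terms. parse_terms is a hypothetical helper, not part of Flask or pandas:

```python
def parse_terms(raw):
    """Split comma-separated input, trimming whitespace around each term and dropping empties."""
    if not raw:
        return []
    return [term.strip() for term in raw.split(",") if term.strip()]

terms = parse_terms("Captopril, Paracetamol, kidney disease, chronic")
print(terms)
```

In the view this could be called as parse_terms(request.values.get('searchinput')). Handling None covers the first GET request, when the form has not been submitted yet.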
