Skip to content
Advertisement

String/regex search over Excel in Python issue

I’m a newb to SO, and relatively new to Python, so i’m sorry if this is a simple fix or an inappropriate question.

Firstly, my program generally works, but i’m trying to implement some redundancy/catchalls for to make it robust.

The program looks over a directory (and sub-dirs) of excel files, opens them individually, scours for data (on a specific sheet), and dumps it out to a csv. There are loops involved as each search term is effectively for the head of a column, and i want 4 values beneath this.

I use regular expressions to define search terms.

I’ve written a function to search over the excel sheet for a match to a regular expression. The sheet has strings and other format-types within the cells, hence the type(query) for strings.

def SearchXLWithRe(regex)
    for i in range(1, Row_limit):         # row limit is defined by OpenPyXL module
        for j in range(1, Column_limit):    # same here for column limit
            query = ws.cell(row = i, column = j).value
            if type(query) == str:         # i only want to look at strings
                if regex.search(query):    # of the responses that are strings, i want to match to the regex
                    return [i,j]

This function works for searches on strings that are there (which has so far always been the case). I want to add redundancy for when some excel files wont contain terms I want to search for, but others will (it could just return some made up coordinates for a blank cell at eg. 1000,1000 or something).

I have tried putting an else but as it’s looping over an excel doc and finding multiple string, all this returns is a None.

I think i have a simple logic problem, but I just can’t see it; if anyone can offer me some pointers the help would be gratefully (and eagerly!) received.

Questions i’ve reviewed (but i’m still lost):

In Python how should I test if a variable is None, True or False

OpenPyXL + How can I search for content in a cell in Excel, and if the content matches the search criteria update the content?

Advertisement

Answer

def SearchXLWithRe(regex)
    for i in range(1, Row_limit):         # row limit is defined by OpenPyXL module
        for j in range(1, Column_limit):    # same here for column limit
            query = ws.cell(row = i, column = j).value
            if type(query) == str:         # i only want to look at strings
                if regex.search(query):    # of the responses that are strings, i want to match to the regex
                    return [i,j]
     return [x,y] #x,y are the dummy locations

Just return after the for loops, it will only be executed if no match was found.

User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement