Why is this (presumably more efficient) dynamic al…

I have the following problem as homework:

Write a O(N^2) algorithm to determine whether the string can be broken into a list of words. You can start by writing an exponential algorithm and then using dynamic programming to improve the runtime complexity.

The naive exponential algorithm which I started out with is this:

def naiveWordBreak(S):
    if len(S) == 0:
        return True
    else:
        return any([(S[:i] in wordSet) and naiveWordBreak(S[i:]) for i in range(1, len(S) + 1)])

I then adapted this into the following dynamic algorithm:

def wordBreak(S):
    prefixTable = [False] * (len(S) + 1)
    prefixTable[0] = True
    return _helper(S, prefixTable)

def _helper(S, prefixTable):
    if prefixTable[len(S)]:
        return prefixTable[len(S)]
    else:
        for i in range(1, len(S) + 1):
            if S[:i] in wordSet and _helper(S[i:], prefixTable):
                prefixTable[i] = True
                return True

I am fairly confident from my proof and some testing that both algorithms are correct, however the recursive method should be exponential time while the dynamic method should be O(n^2). However, I got curious and used the timeit library to analyze the time it takes for both algorithms to run a batch of tests and the results were surprising. The dynamic method only beats the recursive method by a fraction of a second. More confusing is that after running the same test a couple of times, the recursive method actually gave a better runtime than the dynamic method. Here is the code I’m using for testing runtime:

def testRecursive():
    naiveWordBreak("alistofwords")
    naiveWordBreak("anotherlistofwords")
    naiveWordBreak("stableapathydropon")
    naiveWordBreak("retouchesreissueshockshopbedrollsunspotassailsinstilledintersectionpipitreappointx")
    naiveWordBreak("xretouchesreissueshockshopbedrollsunspotassailsinstilledintersectionpipitreappoint")
    naiveWordBreak("realignitingrains")


def testDynamic():
    wordBreak("alistofwords")
    wordBreak("anotherlistofwords")
    wordBreak("stableapathydropon")
    wordBreak("retouchesreissueshockshopbedrollsunspotassailsinstilledintersectionpipitreappointx")
    wordBreak("xretouchesreissueshockshopbedrollsunspotassailsinstilledintersectionpipitreappoint")
    wordBreak("realignitingrains")


def main():
    recTime = timeit.timeit(testRecursive, number=1)
    dynTime = timeit.timeit(testDynamic, number=1)

    print("Performance Results:n")
    print("Time for recursive method = {}n".format(recTime))
    print("Time for dynamic method = {}n".format(dynTime))

    if dynTime < recTime:
        print("Dynamic method produces better performance")
    else:
        print("Recursive method produces better performance")

The way I see it, there are only a few explanations for why the runtimes are inconsistent/not what I expected:

There is something wrong with my dynamic algorithm (or my analysis of it)
There is something wrong with my recursive algorithm
My test cases are insufficient
timeit isn’t actually an appropriate library for what I’m trying to do

Does anyone have any insights or explanations?

Answer

The naive recursive approach is only slow when there are many, many, ways to break up the same string into words. If there is only one way, then it will be linear.

Assuming that can, not and cannot are all words in your list, try a string like "cannot" * n. By the time you get to n=40, you should see the win pretty clearly.

Why is this (presumably more efficient) dynamic algorithm being outperformed by the naive recursive version?

Advertisement

Answer