@lru_cache decorator excessive cache misses

Tags: , , , ,



How can you configure lru_cache to key its cache based on actual values received, rather than how the function was called?

>>> from functools import lru_cache
>>> @lru_cache
... def f(x=2):
...     print("reticulating splines...")
...     return x ** 2
...
>>> f()
reticulating splines...
4
>>> f(2)
reticulating splines...
4
>>> f(x=2)
reticulating splines...
4

In other words, only the first call above should be a cache miss, the other two should be cache hits.

Answer

To do that, you would have to go through the process of binding arguments to formal parameters. The actual process of doing that is implemented in C code with no public interface, but there’s a (much slower) reimplementation in inspect. This is about 100x slower than using functools.lru_cache normally:

import functools
import inspect

def mycache(f=None, /, **kwargs):
    def inner(f):
        sig = inspect.signature(f)
        f = functools.lru_cache(**kwargs)(f)
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()
            return f(*bound.args, **bound.kwargs)
        return wrapper
    if f:
        return inner(f)
    return inner

@mycache
def f(x):
    print("reticulating splines...")
    return x ** 2

If the performance penalty of that approach is too much, you can instead use the following trick, which requires more code duplication but runs much faster, only about 2x slower than using lru_cache normally (and sometimes faster, with keyword arguments):

@functools.lru_cache
def _f(x):
    print("reticulating splines...")
    return x ** 2

def f(x=2):
    return _f(x)

This uses the much faster C-level argument binding to normalize the call to the memoized helper function, but requires duplicating the function’s parameters 3 times: once in the outer function’s signature, once in the helper’s signature, and once in the call to the helper.



Source: stackoverflow