
AttributeError: 'PandasExprVisitor' object has no attribute 'visit_Ellipsis', using pandas eval

I have a series of the form:

s

0    [133, 115, 3, 1]
1    [114, 115, 2, 3]
2      [51, 59, 1, 1]
dtype: object

Note that its elements are strings:

s[0]
'[133, 115, 3, 1]'

I’m trying to use pd.eval to parse these strings into a column of lists. This works for the sample data above.

pd.eval(s)

array([[133, 115, 3, 1],
       [114, 115, 2, 3],
       [51, 59, 1, 1]], dtype=object)

However, on much larger data (300,000 rows in this case), this fails miserably!

len(s)
300000

pd.eval(s)
AttributeError: 'PandasExprVisitor' object has no attribute 'visit_Ellipsis'

What am I missing here? Is there something wrong with the function or my data?


Answer

Your data is fine, and pandas.eval is buggy, but not in the way you think. A hint on the relevant GitHub issue page urged me to take a closer look at the documentation.

pandas.eval(expr, parser='pandas', engine=None, truediv=True, local_dict=None,
            global_dict=None, resolvers=(), level=0, target=None, inplace=False)

    Evaluate a Python expression as a string using various backends.

    Parameters:
        expr: str or unicode
            The expression to evaluate. This string cannot contain any Python
            statements, only Python expressions.
        [...]

As you can see, the documented behaviour is to pass strings to pd.eval, in line with the general (and expected) behaviour of the eval/exec class of functions. You pass a string, and end up with an arbitrary object.
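As a minimal sketch of that documented usage (the expression itself is only an illustration), passing a string looks like this:

```python
import pandas as pd

# The documented input is a string holding a single Python expression.
result = pd.eval("2 + 3")
print(result)
```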

As I see it, pandas.eval is buggy because it doesn’t reject the Series input expr up front, leading it to guess in the face of ambiguity. The fact that the default shortening of the Series' __repr__, which is designed for pretty printing, can drastically affect your result is the best proof of this.
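To see where an Ellipsis can come from, here is a small sketch using only the standard library. It assumes a long sequence has been shortened to text with a literal "..." in the middle, as pretty-printed output typically is; the string below is made up for illustration and is not pandas' actual intermediate output.

```python
import ast

# Hypothetical text form of a long sequence after pretty-print shortening.
shortened = "[[133, 115, 3, 1], [114, 115, 2, 3], ..., [51, 59, 1, 1]]"

# Python parses a bare "..." as the Ellipsis object, not as a syntax error.
tree = ast.parse(shortened, mode="eval")
elements = tree.body.elts  # top-level items of the outer list

parsed = eval(shortened)
print(parsed[2] is Ellipsis)  # True
print(len(elements))          # 4
```

An expression visitor with no handler for that node would presumably fail in the way the traceback in the question describes.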

The solution is then to step back from the XY problem and use the right tool to convert your data, preferably without using pandas.eval for this purpose at all. Even in the working cases where the Series is small, you can’t really be sure that future pandas versions won’t break this “feature” completely.
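One candidate for the right tool is ast.literal_eval from the standard library, which parses Python literals (lists, numbers, strings and so on) without evaluating arbitrary code. A minimal sketch, assuming every element of the Series is a well-formed list literal as in the sample data:

```python
import ast

import pandas as pd

# Sample data from the question: each element is a string, not a list.
s = pd.Series(["[133, 115, 3, 1]", "[114, 115, 2, 3]", "[51, 59, 1, 1]"])

# Parse each string on its own, so the length of the Series is irrelevant.
parsed = s.apply(ast.literal_eval)

print(parsed[0])        # [133, 115, 3, 1]
print(type(parsed[0]))  # <class 'list'>
```

If the strings also happen to be valid JSON, applying json.loads in the same way is another option worth considering.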
