
Why does `'{x[1:3]}'.format(x="asd")` cause a TypeError?

Consider this:

>>> '{x[1]}'.format(x="asd")
's'
>>> '{x[1:3]}'.format(x="asd")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: string indices must be integers

What could be the cause for this behavior?


Answer

An experiment based on your comment, checking what value the object’s __getitem__ method actually receives:

class C:
    def __getitem__(self, index):
        print(repr(index))

'{c[4]}'.format(c=C())
'{c[4:6]}'.format(c=C())
'{c[anything goes!@#$%^&]}'.format(c=C())
C()[4:6]

Output:

4
'4:6'
'anything goes!@#$%^&'
slice(4, 6, None)

So while the 4 gets converted to an int, the 4:6 isn’t converted to slice(4, 6, None) as it would be in ordinary slicing. Instead, it simply remains the string '4:6'. A string isn’t a valid index type for a str, hence the TypeError: string indices must be integers that you got.
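If the goal is just to get the sliced text into the result, one way around this is to do the slicing in ordinary Python code rather than inside the replacement field. A minimal sketch (the variable names are only for illustration):

```python
x = "asd"

# Slice first, then hand the result to str.format as the argument.
sliced_positional = "{}".format(x[1:3])
sliced_keyword = "{part}".format(part=x[1:3])

# f-strings evaluate real Python expressions, so the slice works inline.
sliced_fstring = f"{x[1:3]}"

print(sliced_positional, sliced_keyword, sliced_fstring)  # sd sd sd
```

In all three cases the slice is evaluated by the interpreter itself, so the format machinery never sees the text 1:3.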

Update:

Is that documented? Well… I don’t see anything really clear, but @GACy20 pointed out something subtle. The grammar has these rules:

field_name        ::=  arg_name ("." attribute_name | "[" element_index "]")*
element_index     ::=  digit+ | index_string
index_string      ::=  <any source character except "]"> +

Our c[4:6] is the field_name, and we’re interested in the element_index part 4:6. I think it would be clearer if digit+ had its own rule with a meaningful name:

field_name        ::=  arg_name ("." attribute_name | "[" element_index "]")*
element_index     ::=  index_integer | index_string
index_integer     ::=  digit+
index_string      ::=  <any source character except "]"> +

I’d say having index_integer and index_string would more clearly indicate that digit+ is converted to an integer (instead of staying a digit string), while <any source character except "]"> + would stay a string.

That said, looking at the rules as they are, perhaps we should ask “what would be the point of separating the digits case from the any-characters case, which would match it as well?” and conclude that the point is to treat pure digits differently, presumably by converting them to an integer. Or maybe some other part of the documentation even states that digit+ in general gets converted to an integer.
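The digits-versus-everything-else split can also be checked empirically. The sketch below uses a hypothetical Probe class (not part of any library) whose __getitem__ reports the type of the key it is handed, and then shows what I believe is the practical consequence for dict lookups:

```python
class Probe:
    """Hypothetical helper: reports the type and value of the key it receives."""

    def __getitem__(self, key):
        return f"{type(key).__name__}:{key!r}"


probe = Probe()

digits = "{p[4]}".format(p=probe)        # matches digit+, arrives as an int
negative = "{p[-1]}".format(p=probe)     # "-" is not a digit, so it stays a str
slice_like = "{p[4:6]}".format(p=probe)  # ":" is not a digit, so it stays a str
print(digits, negative, slice_like)

# Consequence for dicts: unquoted names look up str keys,
# while pure digits look up int keys.
by_name = "{d[key]}".format(d={"key": "value"})

try:
    "{d[1]}".format(d={"1": "one"})  # looks up the int 1, not the str "1"
    int_key_found = True
except KeyError:
    int_key_found = False

print(by_name, int_key_found)
```

This also suggests why '{x[-1]}'.format(x="asd") fails in the same way as the slice does: the -1 reaches str.__getitem__ as the string '-1', not as an integer.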

User contributions licensed under: CC BY-SA