How to find the index in a string in spark when I am dealing with a list?

How do I find the index of the numbers 0-9 in a string?

MyString = "Are all the black cats really black 045. I don't think so 098."
MyString.find([0-9])

TypeError: must be str, not list

How do I get around this and replicate what is essentially PATINDEX in SQL Server? The following answer gives me a perspective on searching with a regex, but I am still unable to pass in a list.

Answer

You can use a list comprehension to find the first index of each digit from 0 to 9 (-1 means that digit does not occur in the string):

>>> [MyString.find(str(i)) for i in range(10)]
[36, -1, -1, -1, 37, 38, -1, -1, 60, 59]

If you want the smallest index, i.e. the position of the first digit in the string, you can use min:

>>> min([j for j in [MyString.find(str(i)) for i in range(10)] if j != -1])
36
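
One caveat with the min approach: if the string contains no digits at all, the filtered list is empty and min raises a ValueError. A minimal sketch of a guard, assuming you want -1 back in that case (the function name first_digit_index is just an illustrative choice, not part of any library):

```python
MyString = "Are all the black cats really black 045. I don't think so 098."

def first_digit_index(text):
    # First index of each digit character 0-9; str.find returns -1 when absent.
    hits = [text.find(str(d)) for d in range(10)]
    # default=-1 is returned when no digit was found, instead of min() raising
    # ValueError on an empty sequence.
    return min((i for i in hits if i != -1), default=-1)

print(first_digit_index(MyString))     # 36
print(first_digit_index("no digits"))  # -1
```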

Or you can use re.search with a regex pattern:

>>> import re
>>> re.search("[0-9]", MyString).start()
36
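
Note that re.search returns None when nothing matches, so calling .start() directly fails on a string without digits. If you want something closer to PATINDEX, here is a rough sketch, assuming you want its convention of a 1-based position and 0 for "not found". The helper name patindex is hypothetical, and it takes a Python regex rather than a T-SQL wildcard pattern such as '%[0-9]%':

```python
import re

MyString = "Are all the black cats really black 045. I don't think so 098."

def patindex(pattern, text):
    # re.search returns None when there is no match, so check before .start().
    match = re.search(pattern, text)
    # Shift Python's 0-based index to a 1-based position; 0 means "not found".
    return match.start() + 1 if match else 0

print(patindex(r"[0-9]", MyString))          # 37
print(patindex(r"[0-9]", "no digits here"))  # 0

# re.finditer yields every match, if you need all positions (0-based here).
print([m.start() for m in re.finditer(r"[0-9]", MyString)])
# [36, 37, 38, 58, 59, 60]
```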

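Remember to write the regex pattern as a string (single or double quotes both work). Without quotes, [0-9] is an ordinary Python list containing the single value 0 - 9 (zero minus nine), i.e. [-9], which is why find raised the TypeError.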
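
You can check this quickly yourself. A small sketch (the exact wording of the error message may differ between Python versions, so only the exception type is shown):

```python
unquoted = [0-9]   # evaluated as a list literal holding 0 - 9
print(unquoted)    # [-9]

raised = False
try:
    "black 045".find(unquoted)  # str.find only accepts a str to search for
except TypeError:
    raised = True
print(raised)      # True
```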

User contributions licensed under: CC BY-SA