Is there any way to get the city names from the following raw strings without heavy iteration?
'nSegelbåt ntttttttttnttttttttntttttttttnttttt
tttttntttttttttttnttttttttttttÄlvsborgntttttttt
ttttntttttttttttnttttttttttntttttttntttttttnttttttnttttt'
'nSegelbåt ntttttttttnttttttttntttttttttntttt
ttttttntttttttttttnttttttttttttÄlvsborgnttt
tttttttttntttttttttttnttttttttttntttttttntttttttnttttttnttttt',
'nButiknSegelbåt ntttttttttnttttttttnttttttttt
nttttttttttntttttttttttnttttttttttttStockholmntttt
ttttttttntttttttttttnttttttttttntttttttnt
ttttttnttttttnttttt'
I need to get Älvsborg, Stockholm, etc., that is, the names of cities and towns. The names will of course vary.
The function is already heavy with iterations, so built-in or add-on functions/methods are preferable.
It is also possible to get the strings in the following format:
SegelbåtÄlvsborg
ButikSegelbåtStockholm
ButikSegelbåtStockholm
SegelbåtJönköping
SegelbåtGöteborg
ButikSegelbåtGöteborg
ButikSegelbåtGöteborg
SegelbåtSkaraborg
SegelbåtStockholm
SegelbåtStockholm
SegelbåtHalland
SegelbåtStockholm
ButikSegelbåtHelsingborg
SegelbåtStockholm
ButikSegelbåtKalmar
SegelbåtGöteborg
ButikSegelbåtGöteborg
ButikSegelbåtÖstergötland
ButikSegelbåt
ButikSegelbåtGöteborg
ButikSegelbåtGöteborg
ButikSegelbåtGöteborg
ButikSegelbåtGöteborg
SegelbåtStockholm
ButikSegelbåtHelsingborg
SegelbåtKalmar
SegelbåtGöteborg
which doesn't make the job any easier.
Thank you!
P.S. I can strip out the whitespace characters like this inside the for loop:
letters = ''.join(filter(lambda x: not x.isspace(), place.get_text()))
And after that I still need to separate the city names somehow…
Answer
You can just use str.split:
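In [1]: s = '\nSegelbåt \n\t\t\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\t\tÄlvsborg\n\t\t\t\t\t\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\n\t\t\t\t\t'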
In [2]: s.split() # when called with no argument it splits on all whitespace
Out[2]: ['Segelbåt', 'Älvsborg']
The city name seems to be the last element:
In [3]: s.split()[-1]
Out[3]: 'Älvsborg'
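One caveat: the sample output in the question includes an entry (ButikSegelbåt) with no city at all, in which case s.split()[-1] would return 'Segelbåt' instead. A minimal sketch of one way to guard against that follows. The helper name extract_city, the KNOWN_LABELS set, and the shortened sample strings are all assumptions made for illustration, not something taken from the real page:

```python
# Hypothetical sample data; the tab runs are shortened for readability.
raw_strings = [
    '\nSegelbåt \n\t\t\t\n\t\t\tÄlvsborg\n\t\t\t\n\t\t',
    '\nButik\nSegelbåt \n\t\t\t\n\t\t\tStockholm\n\t\t\t\n\t\t',
    '\nButik\nSegelbåt \n\t\t\t\n\t\t\t\n\t\t',  # no city present
]

# Assumption: these are the only non-city labels that can appear.
KNOWN_LABELS = {'Butik', 'Segelbåt'}


def extract_city(raw):
    """Return the last whitespace-separated token, or None if it is a known label."""
    tokens = raw.split()  # no argument: split on any run of whitespace
    if not tokens or tokens[-1] in KNOWN_LABELS:
        return None
    return tokens[-1]


cities = [extract_city(s) for s in raw_strings]
print(cities)  # ['Älvsborg', 'Stockholm', None]
```

Note that this still takes only the last token, so a city name made of several words would be cut down to its final word; splitting on '\n' and stripping each piece might be a better fit in that case.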
It looks like you're parsing HTML with BeautifulSoup. You may find it easier to select the proper elements directly instead of parsing what .get_text() produces.