Web scraping an image with highlighted text

Question

I am scraping this URL, which shows a newspaper image with highlighted words. My goal is to retrieve all the words highlighted in red. Inspecting the page shows that they carry the class image-overlay hit-rect ng-star-inserted, and the title attribute of those elements is what must be extracted. Using the following code snippet with Beau…

Accepted Answer

The data you're looking for is loaded from an external URL via JavaScript. To get it, you can request that URL directly, as in the following example:

    import requests

    api_url = "https://digi.kansalliskirjasto.fi/rest/binding-search/ocr-hits/761979"
    params = {"page": "12", "term": ["Katri", "Katrina", "Ikonen"]}

    # Each hit in the JSON response carries the highlighted word in its "text" field.
    data = [d["text"] for d in requests.get(api_url, params=params).json()]
    print(data)

Prints:

    ['Katri', 'Katrina', 'Katri', 'Katri', 'Katri', 'Katri', 'Katri', 'Katri', 'Ikonen.', 'Katrina', 'Katri', 'Ikonen.', 'Katri', 'Katrina', 'Katri', 'Katri', 'Katri']
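If it helps to see what that request looks like on the wire, here is a minimal sketch that builds the same URL with the standard library instead of requests and then tidies the results. The endpoint and parameters are copied from the answer above; the helper names build_url, fetch_hit_texts and count_words are my own, and stripping trailing punctuation (so that 'Ikonen.' is counted as 'Ikonen') is an assumption about what you want, not something the API does.

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint and parameters copied from the answer above.
API_URL = "https://digi.kansalliskirjasto.fi/rest/binding-search/ocr-hits/761979"
PARAMS = {"page": "12", "term": ["Katri", "Katrina", "Ikonen"]}


def build_url(api_url, params):
    """Return the full request URL; doseq=True repeats the key for list values."""
    return f"{api_url}?{urlencode(params, doseq=True)}"


def fetch_hit_texts(api_url, params, timeout=10):
    """Fetch the OCR hits and return their "text" fields (needs network access)."""
    with urlopen(build_url(api_url, params), timeout=timeout) as response:
        hits = json.load(response)
    return [hit["text"] for hit in hits]


def count_words(texts):
    """Count hits per word after stripping surrounding punctuation (hypothetical helper)."""
    return Counter(text.strip(".,;:!?") for text in texts)
```

Calling count_words(fetch_hit_texts(API_URL, PARAMS)) would give one count per search term. To scrape another page or another set of words, only the "page" and "term" values need to change, assuming the endpoint behaves the same way for them.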

Answer