I have a webpage to parse. The HTML code is in the figure.
I need to extract the price, which is simple text:
<div class="price"> "212,25 € " <sup>HT</sup>
This is the only “price” class on the page. So I call the find() method:
soup = BeautifulSoup(get(url, headers=headers, params=params).content, 'lxml') container = soup.find_all('div', class_="side-content") # Find a container cost = container.find('div', {'class': 'price'}) # Find price class cost_value = cost.next_sibling
The cost is None. I have tried .next_sibling function and .text functions. But as find() returns None, I have an exception. How can I fix it?
Advertisement
Answer
I have resolved it. The problem was in the JavaScript-generated data. So static parsing methods don’t work with it. I tried several solutions (including Selenium and an XHR script results capturing).
Finally, inside my parsed data I have found a static URL of a page that links to a separate web page, where this JavaScript code is executed and can be parsed by static methods.
The video “Python Web Scraping Tutorial: scraping dynamic JavaScript/Ajax websites with Beautiful Soup“ explains a similar solution.