Tag: beautifulsoup

BeautifulSoup: find an HTML element that contains spaces in its attributes

How to use BeautifulSoup to find an HTML element that contains spaces in its attributes. I would like to know how to use soup.find to find the title that I want, because BeautifulSoup represents the attrs of the title "that i want" like this: {'class': ['td', 'p1']}, but not like this: {'class': ['td p1']}. Answer: Note: Different approaches but both
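A minimal sketch of two ways to match an element whose class attribute holds space-separated values. The markup below is hypothetical (the question's HTML is not shown in this excerpt), and these two approaches are not necessarily the ones the truncated answer describes.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the original question's HTML is not shown here.
html = '<div class="td p1">that i want</div><div class="td">other</div>'
soup = BeautifulSoup(html, "html.parser")

# BeautifulSoup splits the class attribute into a list of values.
print(soup.div["class"])

# Approach 1: match the exact string value of the class attribute.
exact = soup.find("div", class_="td p1")

# Approach 2: a CSS selector that requires both classes, in any order.
css = soup.select_one("div.td.p1")

print(exact.get_text(), css.get_text())
```

The exact-string form depends on the order of the classes in the source, so the CSS selector is usually the more forgiving choice.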

How to loop over multiple pages of a website using Scrapy

beautifulsoup python scrapy web-scraping

Hello everybody out there! I have been working with BeautifulSoup for my scraping projects. Currently, I'm learning Scrapy. I have written code with BeautifulSoup that uses for loops to go over multiple pages of a single website. I looped over 10 pages and fetched the URLs of blog posts from those pages using the code below. I want to do the

Scrape Historical Bitcoin Data from Coinmarketcap with BeautifulSoup

beautifulsoup pandas python web-scraping

I'm trying to scrape historical Bitcoin data from coinmarketcap.com in order to get the close, volume, date, high and low values from the beginning of the year until Sep 30, 2021. I'm new to scraping with Python, and after going through threads and videos for hours, I don't know what my mistake is (or is there something with the website I

How to extract deeply nested tags using Beautiful Soup

beautifulsoup python web-scraping

I have the content below and I am trying to understand how to extract the <p> tag copy using Beautiful Soup (I am open to other methods). As you can see the <p> tags are not both nested inside the same <div>. I gave it a shot with the following method but that only seems to work when both <p>
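As a rough illustration of the general idea, find_all searches every descendant by default, so <p> tags at different nesting depths can be collected in one call. The markup here is a made-up stand-in, since the question's content is not included in this excerpt.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the two <p> tags do not share the same parent <div>.
html = """
<div class="outer">
  <div class="inner"><p>First paragraph</p></div>
  <section><div><div><p>Second paragraph</p></div></div></section>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# find_all walks the whole tree, regardless of how deeply a tag is nested.
paragraphs = [p.get_text(strip=True) for p in soup.find_all("p")]
print(paragraphs)
```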

How to get HTML changes after pressing button with Beautiful Soup and Requests

beautifulsoup python request web-scraping

I want to get the HTML of this site https://www.forebet.com/en/football-predictions after pressing the button More[+] enough times to load all games. Each time the button More[+] at the bottom of the page is pressed, the HTML changes and shows more football games. How do I get the request to the page with all the football games loaded? Answer: As stated, requests and BeautifulSoup

Python scraping – subtract class?

beautifulsoup html python request web-scraping

I am new to Python, programming, and scraping. I would like to subtract one HTML tag from another: "game_elements" contains all matches, including live ones, while "game_elements_live" contains only the live ones. In your opinion, is it possible to get only the non-live matches? I use requests and BeautifulSoup. Thank you so much. Answer: If you're using version 4.7.0 or
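One possible way to exclude live matches is a CSS :not() selector, which select() supports in Beautiful Soup 4.7.0 and later. This is only a sketch: the class names "game" and "live" and the markup are assumptions for illustration, and the truncated answer may describe a different approach.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the class names "game" and "live" are assumptions.
html = """
<div class="game live">Match A (live)</div>
<div class="game">Match B</div>
<div class="game">Match C</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Select every game element that does not also carry the "live" class.
non_live = soup.select("div.game:not(.live)")
names = [tag.get_text() for tag in non_live]
print(names)
```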

Web scraping Amazon review percentages with bs4

beautifulsoup python selenium selenium-webdriver

So I'm trying to get the review percentage for each number of stars on an Amazon product page. This is the output I want to get: And so far this is the output I got: As you see, I have managed to get the awesome feedback working but not the other ones… The problem is that I got all the

Web scraping from the span element

beautifulsoup python

I am on a scraping project and I am looking to scrape from the following. I want to extract only Christian, Islam as the output (without the 'Faith:'). This is my try: How can I get this done? Answer: There are several ways you can fix this. I would suggest the following: find all <span> in <div> that have not
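A minimal sketch of filtering <span> elements inside a <div>. Both the markup and the filter condition (skipping spans that carry a class attribute) are assumptions, because the original HTML and the end of the answer are not included in this excerpt.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the label span has a class, the value spans do not.
html = (
    '<div><span class="label">Faith:</span> '
    "<span>Christian</span>, <span>Islam</span></div>"
)
soup = BeautifulSoup(html, "html.parser")

# Keep only the spans without a class attribute (assumed filter condition).
values = [
    span.get_text(strip=True)
    for span in soup.div.find_all("span")
    if not span.has_attr("class")
]
result = ", ".join(values)
print(result)
```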

BeautifulSoup: Replace all <div> with aria-level attributes with <h> tags of the same level

beautifulsoup python

I have an HTML source where <div> elements serve as headings. Using BeautifulSoup and the attribute aria-level, I would like to replace all <div> elements with <h> tags of the same level. My code kind of works for my purpose, but it seems inelegant, and ideally the attributes of the former <div> elements would be removed. Output: What it should
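One way this could be approached is to rename each tag in place and then clear its attributes. The markup below is a hypothetical stand-in for the source described in the question, and this is not necessarily the approach the original post or its answer uses.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the source described in the question.
html = (
    '<div aria-level="1" class="x">Title</div>'
    '<div aria-level="2">Sub</div>'
    "<div>body</div>"
)
soup = BeautifulSoup(html, "html.parser")

# Passing True as the attribute value matches any <div> that has aria-level.
for div in soup.find_all("div", attrs={"aria-level": True}):
    level = div["aria-level"]
    div.name = f"h{level}"  # rename the tag in place, e.g. div -> h1
    div.attrs.clear()       # drop the attributes of the former <div>

print(soup)
```

Renaming in place keeps each element's children and position in the tree, so no new tags need to be created or inserted.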

Code scrapes first webpage twice, but then scrapes the next six as it’s meant to

beautifulsoup python selenium web-scraping

I'm trying to scrape football scores from 8 pages online. For some reason my code scrapes the results from the first page twice, goes on to scrape the next 6 pages as it should, then leaves out the final page. Here is my code. Help would be much appreciated. EDIT: I fixed it by shifting the loop up