I am new to Python, to programming, and to scraping. I would like to subtract one set of HTML tags from another: "game_elements" contains all matches, including the live ones, while "game_elements_live" contains only the live ones. In your opinion, is it possible to keep only the non-live matches? I use requests and BeautifulSoup. Thank you so much. Answer If you're using version 4.7.0 or
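The answer is cut off at "version 4.7.0 or"; BeautifulSoup 4.7.0+ (via soupsieve) supports CSS selectors such as :not(), which is one way to keep only the non-live matches. A minimal sketch follows; the class names ("event__match", "event__match--live") and the URL are placeholders, since the question does not show the actual markup.

    # Hedged sketch: class names and URL are placeholders, not the question's real markup.
    import requests
    from bs4 import BeautifulSoup

    html = requests.get("https://example.com/matches").text  # placeholder URL
    soup = BeautifulSoup(html, "html.parser")

    # BeautifulSoup >= 4.7.0 supports :not(), so live matches can be excluded directly.
    non_live = soup.select("div.event__match:not(.event__match--live)")

    # Equivalent without CSS selectors: keep only tags that lack the "live" class.
    all_matches = soup.find_all("div", class_="event__match")
    non_live_alt = [m for m in all_matches
                    if "event__match--live" not in m.get("class", [])]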
Tag: web-scraping
Unable to locate element by class name using selenium via Python. Why so?
I wrote the following code in order to scrape the text of the element <h3 class="h4 mb-10">Total nodes: 1,587</h3> from https://blockchair.com/dogecoin/nodes. I'm aware that there are perhaps less bloated options than selenium to scrape the target in question, yet the said code is just a snippet, a part of a more extensive script that relies on selenium for its other
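The usual causes of this error are that By.CLASS_NAME accepts only a single class name ("h4 mb-10" is two) and that the node count is filled in by JavaScript after the page loads. A minimal sketch, assuming the markup quoted in the question; the 20-second timeout is an arbitrary choice.

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    driver = webdriver.Chrome()
    driver.get("https://blockchair.com/dogecoin/nodes")

    # Chain both classes in a CSS selector and wait until the element exists,
    # since its text is rendered by JavaScript.
    h3 = WebDriverWait(driver, 20).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, "h3.h4.mb-10"))
    )
    print(h3.text)  # e.g. "Total nodes: 1,587"
    driver.quit()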
Code scrapes first webpage twice, but then scrapes the next six as it’s meant to
I'm trying to scrape football scores from 8 pages online. For some reason my code scrapes the results from the first page twice; it then scrapes the next six pages as it should, but leaves out the final page. Here is my code. Help would be much appreciated. EDIT: I fixed it by shifting the loop up
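Since the code is elided, here is a hedged sketch of the off-by-one pattern that produces exactly this symptom (first page fetched twice, last page skipped) and the shape of the fix the EDIT alludes to; the URL pattern and selector are placeholders.

    import requests
    from bs4 import BeautifulSoup

    BASE = "https://example.com/scores?page={}"  # placeholder URL pattern

    # Buggy shape: page 1 is requested before the loop and again on the first
    # pass, and range(1, 8) stops before page 8.
    # soup = BeautifulSoup(requests.get(BASE.format(1)).text, "html.parser")
    # for page in range(1, 8):
    #     soup = BeautifulSoup(requests.get(BASE.format(page)).text, "html.parser")

    # Fixed shape: every request happens inside the loop, once per page 1..8.
    for page in range(1, 9):
        soup = BeautifulSoup(requests.get(BASE.format(page)).text, "html.parser")
        for score in soup.select(".score"):  # placeholder selector
            print(score.get_text(strip=True))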
Scraping Table in Selenium and long single line printed instead of columns and rows
I am trying to scrape this website, and this is my code thus far: What prints out is a long list, and I am trying to figure out how to get it into a table format. Answer I’m getting the following output as a table format. Code: Output:
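Both the code and the output are elided in the excerpt; as a rough sketch, table rows are usually rebuilt either by iterating the <tr>/<td> elements or by handing the rendered HTML to pandas.read_html. The URL and selectors below are placeholders.

    from io import StringIO

    import pandas as pd
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    driver.get("https://example.com/table-page")  # placeholder

    # Option 1: rebuild rows and columns from <tr>/<td> elements.
    rows = []
    for tr in driver.find_elements(By.CSS_SELECTOR, "table tr"):
        cells = [td.text for td in tr.find_elements(By.TAG_NAME, "td")]
        if cells:
            rows.append(cells)

    # Option 2: let pandas parse every <table> in the rendered page source.
    df = pd.read_html(StringIO(driver.page_source))[0]
    print(df)
    driver.quit()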
Grabbing all data fields from a div in python beautifulsoup
The snippet below worked fine until the other day. Is there any way to extract all the data inside this div class="row mb-4" easily? What I am hoping is that even if further changes are made to the page, the script will still not be affected. Previous Output: Wanted Improved Output: Answer Try: Prints:
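The snippet itself is elided; one robust pattern for "grab everything inside the div, whatever gets added later" is to iterate the div's stripped_strings instead of hard-coding each field. Only the "row mb-4" class comes from the question; the URL is a placeholder.

    import requests
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(requests.get("https://example.com/item").text, "html.parser")

    row = soup.find("div", class_="row mb-4")
    if row:
        # stripped_strings yields every text fragment inside the div, so fields
        # added to the page later are still captured without changing the script.
        fields = list(row.stripped_strings)
        print(fields)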
Selenium bug: Message: invalid argument: ‘url’ must be a string
I have some simple selenium scraping code that returns all the search results, but when I run the for loop, it displays an error: Message: invalid argument: 'url' must be a string (Session info: chrome=93.0.4577.82) I would like to ask for some help. How can I avoid this error? Thanks. Answer You are trying to get the "user_data" immediately after opening
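The answer cuts off at "immediately after opening"; this error usually means driver.get() received a WebElement (or a list) instead of an href string, typically because the elements were looped over while also navigating away from the results page. A sketch of the usual fix, with the ".result a" selector and URL as placeholders: extract the href strings first, then navigate.

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    driver.get("https://example.com/search?q=python")  # placeholder

    # Buggy shape: driver.get(element) raises "invalid argument: 'url' must be a string".
    # for element in driver.find_elements(By.CSS_SELECTOR, ".result a"):
    #     driver.get(element)

    # Fix: collect plain href strings up front, then visit each one.
    urls = [a.get_attribute("href")
            for a in driver.find_elements(By.CSS_SELECTOR, ".result a")]
    for url in urls:
        driver.get(url)  # url is now a string
        print(driver.title)
    driver.quit()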
Word count from different URLs in Python
I have the following code, which provides me with the columns: Authors, Date, Blog name, Link and blog category. To further enhance this, I want to add the word count of the article and of the author, separately. The updated columns I am trying to achieve are: Authors, Date, Blog name, Link, blog category, description count, about count Example: For the
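The code and the example are elided; a minimal sketch of how the two extra columns could be derived is to fetch each Link, pull out the description and the author/about text, and count words with len(text.split()). The ".description" and ".about-author" selectors and the sample row values are assumptions, not the blog's real markup or data.

    import requests
    from bs4 import BeautifulSoup

    def word_count(node):
        # Returns 0 when the selector matched nothing.
        return len(node.get_text(" ", strip=True).split()) if node else 0

    def add_counts(row):
        soup = BeautifulSoup(requests.get(row["Link"]).text, "html.parser")
        row["description count"] = word_count(soup.select_one(".description"))
        row["about count"] = word_count(soup.select_one(".about-author"))
        return row

    rows = [{"Authors": "A. Author", "Date": "2021-01-01", "Blog name": "Example",
             "Link": "https://example.com/post", "blog category": "Tech"}]
    rows = [add_counts(r) for r in rows]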
Web-scraping returns empty values: possibly a protected site
I'm working on web-scraping from www.albumoftheyear.org, but in my code I can only get an empty df. I don't know whether the site is protected by something like Cloudflare and that is the cause, or whether I'm making a mistake with the selected tags. The basic idea is to iterate through the pages and collect the data (title, year, genre) from
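An empty df from a site like this is often just the default requests User-Agent being rejected, so the response is a challenge or error page and the selectors match nothing. A first check worth trying is to send browser-like headers and look at the status code; in the sketch below the listing path and the ".albumListRow" selector are placeholders.

    import requests
    from bs4 import BeautifulSoup

    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}
    resp = requests.get("https://www.albumoftheyear.org/", headers=headers)  # placeholder path
    print(resp.status_code)  # 403/503 points at protection rather than bad tags

    soup = BeautifulSoup(resp.text, "html.parser")
    rows = soup.select(".albumListRow")  # placeholder selector
    print(len(rows))  # 0 here, with a 200 status, points at the selectors instead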
document.scrollingElement is not working. Not able to scroll down for inspecting elements
Please find the attached screenshot. The code below prints only the first 4-5 rows, which are visible in the screenshot; it does not scroll down, and when inspecting elements it prints blank spaces. The same code runs successfully when the code inside the main function is written outside the function. add_data.py -> Answer Maybe you need to scroll to each
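The answer cuts off at "scroll to each"; with rows that are rendered lazily, the usual approach is to scroll each row into view with execute_script before reading its text, rather than scrolling the document as a whole. A sketch, with the URL and the "table tr" selector as placeholders.

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    driver.get("https://example.com/long-table")  # placeholder

    for row in driver.find_elements(By.CSS_SELECTOR, "table tr"):
        # Bring the row into the viewport so its cells are rendered before
        # their text is read; otherwise lazily loaded rows come back blank.
        driver.execute_script("arguments[0].scrollIntoView(true);", row)
        print(row.text)
    driver.quit()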
Is it possible to call a function inside another function in Python? (Web-Scraping problem)
I'm working on a web-scraping task and I can already collect the data in a very rudimentary way. Basically, I need a function to collect a list of songs and artists from Allmusic.com and then add the data to a df. In this example, I use this link: https://www.allmusic.com/mood/tender-xa0000001119/songs So far, I managed to accomplish most of the objective, however,
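To the question in the title: yes, one function can call another (or be defined inside it). A minimal sketch of a scraper that delegates row parsing to a second function and returns a df; only the URL comes from the question, and the ".songRow", ".title" and ".performer" selectors are placeholders for the page's real markup.

    import requests
    import pandas as pd
    from bs4 import BeautifulSoup

    def parse_row(row):
        # Helper called from inside collect_songs(): one function calling another.
        return {
            "song": row.select_one(".title").get_text(strip=True),
            "artist": row.select_one(".performer").get_text(strip=True),
        }

    def collect_songs(url):
        soup = BeautifulSoup(requests.get(url).text, "html.parser")
        return pd.DataFrame([parse_row(r) for r in soup.select(".songRow")])

    df = collect_songs("https://www.allmusic.com/mood/tender-xa0000001119/songs")
    print(df.head())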