Tag: beautifulsoup

Web scraping an image with highlighted text

I am scraping this URL, which shows a newspaper image with highlighted words. My goal is to retrieve all the words highlighted in red. Inspecting the page gives the class image-overlay hit-rect ng-star-inserted, from which the title attribute must be extracted: Using the following code snippet with BeautifulSoup: However, I get [] as a result! My expected result
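A minimal sketch of the lookup described in this question, run against a hypothetical HTML fragment (the markup and the title values below are made up; only the class names come from the question). It selects elements carrying all three classes and reads their title attribute. One possible cause of an empty result on the live page, offered as an assumption rather than a diagnosis: ng-star-inserted is a class that Angular adds at render time, so the overlay elements may not exist in the HTML that requests receives.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the class names quoted in the question.
SAMPLE_HTML = """
<div>
  <div class="image-overlay hit-rect ng-star-inserted" title="alpha"></div>
  <div class="image-overlay hit-rect ng-star-inserted" title="beta"></div>
  <div class="image-overlay other" title="ignored"></div>
</div>
"""


def highlighted_titles(html):
    """Return the title attribute of every element that has all three classes."""
    soup = BeautifulSoup(html, "html.parser")
    # Chained class selectors require every class; [title] requires the attribute.
    nodes = soup.select(".image-overlay.hit-rect.ng-star-inserted[title]")
    return [node["title"] for node in nodes]


titles = highlighted_titles(SAMPLE_HTML)
print(titles)
```

If the same function returns an empty list for the downloaded page, checking whether the class name appears anywhere in the raw response text is a quick way to tell a selector problem from a rendering problem.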

Python requests.get on links passed in a for loop always returns the content of the first link

beautifulsoup python python-requests

I’m trying to write a for loop over links that are opened and from which I then retrieve data. Right now my problem is that every iteration retrieves the same page (the first one), even though I change the link every time. Answer Try removing idCategory=5&idExpansion=1178 from the filterURL: Prints:

How to extract link to Package Sources from Arch User Repository (AUR) website

beautifulsoup python

I’m using BeautifulSoup to extract this line: from a webpage. Specifically, I want this part: iwgtk-0.8.tar.gz I’ve written this code: and I assume it is this line that fails. I’ve tried but that failed too. Answer Try selecting your elements more specifically: or, more conveniently, via a CSS selector, and use get('href') to get the URL or text / get_text() to get the link text
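A small sketch of the CSS-selector approach the answer mentions, using a hypothetical fragment (the markup, the list id and the example.org URLs are assumptions; only the file name iwgtk-0.8.tar.gz comes from the question). It picks the first anchor whose href ends in .tar.gz, then reads the URL with get('href') and the visible name with get_text().

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real page structure may differ.
SAMPLE_HTML = """
<ul id="sources">
  <li><a href="https://example.org/downloads/iwgtk-0.8.tar.gz">iwgtk-0.8.tar.gz</a></li>
</ul>
<a href="https://example.org/about">About</a>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Attribute selector: match anchors whose href ends with ".tar.gz".
link = soup.select_one('a[href$=".tar.gz"]')

# select_one returns None when nothing matches, so guard before reading.
url = link.get("href") if link else None
name = link.get_text(strip=True) if link else None

print(url)
print(name)
```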

Trying to get data from a table using beautifulsoup in python

beautifulsoup python python-requests

I am trying to get the “all splits” line of numbers from https://insider.espn.com/nba/player/splits/_/id/532/type/nba/year/2003/category/perGame (the HTML code is in the picture). My code returns the ‘all splits’ text instead of the numbers I’m looking for. How do I change the lookups in the GetStats function to get the numbers instead of the first-column descriptors? Answer To get the all_splits stats
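One way to read the numbers rather than the label, sketched on a hypothetical table (the column names and figures below are invented for illustration, and row_values is a made-up helper name). It assumes the label and its numbers sit in the same row; if the real page keeps the descriptors and the numbers in separate tables, the row position would have to be matched across tables instead.

```python
from bs4 import BeautifulSoup

# Hypothetical table; the values are placeholders, not real statistics.
SAMPLE_HTML = """
<table>
  <tr><th>Split</th><th>GP</th><th>MIN</th><th>PTS</th></tr>
  <tr><td>All Splits</td><td>82</td><td>36.1</td><td>20.4</td></tr>
  <tr><td>Home</td><td>41</td><td>35.8</td><td>21.0</td></tr>
</table>
"""


def row_values(html, label):
    """Return the cells that follow the first cell matching `label`."""
    soup = BeautifulSoup(html, "html.parser")
    for row in soup.find_all("tr"):
        cells = [cell.get_text(strip=True) for cell in row.find_all(["td", "th"])]
        # Compare case-insensitively so "all splits" and "All Splits" both match.
        if cells and cells[0].lower() == label.lower():
            return cells[1:]
    return []


all_splits = row_values(SAMPLE_HTML, "all splits")
print(all_splits)
```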

An Error while using bs4 and requests in replit

beautifulsoup python python-requests replit

When I use bs4 and requests locally it works, but when I put my code in Replit I get this error: Please help me! Can someone explain what the problem with Replit is? Answer This would be much easier to debug if you included a sample link [a plausible value of URL.format(username)]. The error seems to be

Downloading all zip files from url

beautifulsoup python web-scraping

I need to download all the zip files from the URL: https://www.ercot.com/mp/data-products/data-product-details?id=NP7-802-M The zip files are as shown in the pic: I am trying the following code: I have tried different versions of the above, but with no success so far. I am not sure how to proceed. Answer Everything you need comes from one endpoint that you can query and then

Extract data by looping though dates using pandas

beautifulsoup pandas python

I want to scrape exchange rate data from July 1, 2021 to June 30, 2022 by enumerating the exchangeDate variable and save it to Excel. Here is my code so far: How do I loop through all dates? Answer You can use something like this:
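A hedged sketch of the date loop, not the original answer's code: pandas.date_range generates every day in the period, and each date is formatted into a string for the exchangeDate parameter. The date format and the fetch_rate helper are assumptions, since the target site and its parameters are not shown here.

```python
import pandas as pd

# Every calendar day from July 1, 2021 to June 30, 2022, inclusive.
dates = pd.date_range(start="2021-07-01", end="2022-06-30", freq="D")

# Assumed format; adjust to whatever the site expects for exchangeDate.
exchange_dates = [day.strftime("%d.%m.%Y") for day in dates]


def fetch_rate(exchange_date):
    """Hypothetical placeholder for the real request and parsing step."""
    return {"exchangeDate": exchange_date, "rate": None}


# One row per date, collected into a single DataFrame.
frame = pd.DataFrame([fetch_rate(value) for value in exchange_dates])

# Saving would then be one call, which needs an Excel writer such as openpyxl:
# frame.to_excel("rates.xlsx", index=False)
print(len(frame), exchange_dates[0], exchange_dates[-1])
```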

Finding <p style class using BeautifulSoup

beautifulsoup finance python web-scraping

I am trying to scrape MSFT’s income statement using code I found here: How to Web scraping SEC Edgar 10-K Dynamic data They use the ‘span’ class to narrow the search. I do not see a span, so I am trying to use the <p class with no luck. Here is my code, it is largely unchanged from the answer

How to extract RSS links from website with Python

beautifulsoup python rss web-scraping

I am trying to extract all RSS feed links from some websites, if RSS exists on them, of course. These are some website links that have RSS, and below is a list of RSS links from those websites. My approach is to extract all links and then check whether some of them have RSS in them, but that is just a first step: I
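A possible sketch of that first step, extended with one common convention: many sites advertise their feeds through link tags whose type is application/rss+xml or application/atom+xml. The sample markup, the example.com base URL and the find_feed_links name are all hypothetical, and sites that do not follow the convention would need other checks.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}

# Hypothetical page used only to exercise the function.
SAMPLE_HTML = """
<html><head>
  <link rel="alternate" type="application/rss+xml" href="/feed.xml">
  <link rel="stylesheet" href="/style.css">
</head><body>
  <a href="/news/rss">RSS</a>
  <a href="/contact">Contact</a>
</body></html>
"""


def find_feed_links(html, base_url):
    """Collect absolute URLs that look like feeds, without duplicates."""
    soup = BeautifulSoup(html, "html.parser")
    found = []
    # Step 1: <link> tags that declare a feed MIME type.
    for tag in soup.find_all("link", href=True):
        if tag.get("type", "").lower() in FEED_TYPES:
            found.append(urljoin(base_url, tag["href"]))
    # Step 2: the question's heuristic, anchors with "rss" in the href.
    for anchor in soup.find_all("a", href=True):
        if "rss" in anchor["href"].lower():
            url = urljoin(base_url, anchor["href"])
            if url not in found:
                found.append(url)
    return found


feeds = find_feed_links(SAMPLE_HTML, "https://example.com/")
print(feeds)
```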

How to scrape table with flight data, avoiding an empty result?

beautifulsoup python python-requests selenium web-scraping

I’m trying to extract a table from a webpage and have tried a number of alternatives, but the table always seems to remain empty. Two of what I thought were the most promising sets of code are attached below. Any means of extracting the data from the webpage would be considered as helpful. I have also included a screenshot of