I am trying to scrape data from the HTML tables on this page and export it to a CSV. The only success I’ve had is with extracting the headers. I thought the problem might be the page not fully loading before the data is scraped, hence my use of the ‘requests_html’ library, but the issue persists. Here’s
Tag: python-requests
Python Web Scraping – How to Skip Over Missing Entries?
I am working on a project that involves analyzing the text of political emails from this website: https://politicalemails.org/. I am attempting to scrape all the emails using BeautifulSoup and pandas. I have a working chunk right here: The above results in pulling the data I want. However, I want to loop through larger chunks of the emails in this archive.
Certificates won’t work in Python requests
I am trying to create a little Python script to send data to a server using the requests module. To make it a bit more secure, I want to use self-signed certificates made in a program called XCA. When using the certificates in the browser, everything works and is secure. When using Postman to send a request
TypeError: list indices must be integers or slices, not str. Python
So, I have this piece of code: It gives me this response: Then I have this piece of code, which should give me the page I need: But I get this: I tried a few other ways to do this: But nothing works. Answer The value stored under 'pages' is a list, so you have to index into the list before looking up a key by name. json.loads(grade.content)['pages'][0]['id'] should work.
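A minimal sketch of why the indexing order matters. The response body below is hypothetical, since the real payload from the question is not shown; only the 'pages' and 'id' keys come from the answer.

```python
import json

# Hypothetical response body standing in for grade.content.
body = '{"pages": [{"id": 101, "title": "Grades"}, {"id": 102, "title": "Archive"}]}'

data = json.loads(body)
pages = data["pages"]  # this is a list of dicts, not a dict

try:
    pages["id"]  # indexing a list with a string key
except TypeError as exc:
    error_message = str(exc)  # the error from the question title

# Index into the list first, then look up the key.
first_page_id = pages[0]["id"]

# Or collect the id of every page.
all_page_ids = [page["id"] for page in pages]
```

If the response came from requests, calling .json() on it would give the same parsed structure as json.loads on its content.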
Get the desired table from the site
There is a link to a site with a schedule. The page has three <select> elements: institute (faculty), course, and group. How can I get the desired table through Requests? I tried POST and GET, without success. Maybe Requests will not help here at all, and it is better to try Selenium? Answer You can use BeautifulSoup to parse the elements:
Connection timeouts as a protection from site scraping?
I am new to Python and web scraping, but for the past two weeks I have been periodically scraping one website and successfully downloading images from it. I use different proxies and sometimes change them. But starting yesterday, all my proxies suddenly stopped working with a timeout error. I’ve tried a whole list of them and they all fail. Could this be a
How to add optional parameters in REST API?
I am using the following REST API: https://rest.ensembl.org/documentation/info/vep_hgvs_get An example of the code for this is as follows: On the documentation page, it says it has an optional parameter as follows: Name Description Example Values dbNSFP Include fields from dbNSFP, a database of pathogenicity predictions for missense variants. Multiple fields should be separated by commas. See dbNSFP README for field
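One way to handle an optional parameter such as dbNSFP is to build the query string from a dictionary and leave out anything that was not set. The sketch below only constructs the URL and sends nothing. The path layout, the HGVS notation, and the dbNSFP field names are assumptions for illustration and should be checked against the documentation page linked above.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://rest.ensembl.org"


def build_vep_url(hgvs_notation, species="human", **optional):
    """Build the request URL, adding only the optional parameters that are set.

    The /vep/<species>/hgvs/<notation> layout is an assumption here.
    """
    path = f"/vep/{quote(species)}/hgvs/{quote(hgvs_notation, safe=':')}"
    # Parameters left as None are dropped, so they are not sent at all.
    params = {name: value for name, value in optional.items() if value is not None}
    query = urlencode(params)
    return BASE_URL + path + ("?" + query if query else "")


# Hypothetical notation and field names; multiple fields are comma-separated.
plain_url = build_vep_url("ENST00000366667:c.803C>T")
url_with_option = build_vep_url(
    "ENST00000366667:c.803C>T",
    dbNSFP="LRT_pred,MutationTaster_pred",
)
```

With the requests library, the same dictionary can be passed through the params argument of requests.get, which encodes the query string for you.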
Python and FastAPI: keep getting 405 when posting response
I have made a FastAPI app at https://lectio-fastapi.herokuapp.com/docs#/ When I test it on the FastAPI docs page, it works like a charm. But when I try to access it from another Python program, I get a 405 when posting. The output I am getting is: And I am expecting to get these returns: It seems to me that I need to define something more
How to call the CITES species+ api in python?
I am trying to access the CITES Species+ API to get information on an input species name. Reference document: http://api.speciesplus.net/documentation/v1/references.html I tried to use the API with the provided API key. I get error code 401. Here is the code: Answer As @jonrsharpe said in a comment: you have to set the API key as a header; don’t put it in the URL. You
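A rough sketch of the header approach described in the answer. It only builds the request object and sends nothing. The header name X-Authentication-Token, the endpoint path, and the name query parameter are assumptions that should be confirmed against the reference document; the key is a placeholder.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint; check the reference document for the real path.
BASE_URL = "https://api.speciesplus.net/api/v1/taxon_concepts"


def build_species_request(api_key, species_name):
    """Return a request that carries the key in a header, not in the URL."""
    query = urlencode({"name": species_name})        # assumed parameter name
    headers = {"X-Authentication-Token": api_key}    # assumed header name
    return Request(f"{BASE_URL}?{query}", headers=headers)


req = build_species_request("YOUR_API_KEY", "Loxodonta africana")
# To send it: urllib.request.urlopen(req)
# With requests, the equivalent is requests.get(url, headers=headers, params=...).
```

Keeping the key out of the URL also keeps it out of server logs and browser history, which is one reason APIs tend to ask for it in a header.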
How to put each link separate in database with beautifulsoup python
Hello, I would like to add each link separately to the database. When I print out “new_lst” it displays every link, so I think it is putting the whole outcome in one row and not in separate rows. My code: Answer You are already iterating over with a for loop. Yes, it is putting the whole outcome in one line as
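A minimal sketch of inserting one row per link. The question does not say which database is in use, so this uses an in-memory SQLite database as a stand-in; the table name, column names, and the contents of new_lst are all hypothetical.

```python
import sqlite3

# Hypothetical links, standing in for the list scraped with BeautifulSoup.
new_lst = [
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/c",
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE links (id INTEGER PRIMARY KEY, url TEXT NOT NULL)")

# executemany runs the INSERT once per tuple, so each link gets its own row.
# Passing str(new_lst) as a single value would store the whole list in one row.
conn.executemany("INSERT INTO links (url) VALUES (?)", [(link,) for link in new_lst])
conn.commit()

rows = conn.execute("SELECT url FROM links ORDER BY id").fetchall()
conn.close()
```

The same pattern carries over to other DB-API drivers, although the placeholder style (? versus %s) depends on the driver.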