
Python web-scraper not working for TripAdvisor

I am trying to write a simple Python scraper in order to save all the reviews of a specific place on TripAdvisor.

The specific link I am using as example is the following:

https://www.tripadvisor.com/Attraction_Review-g319796-d5988326-Reviews-or50-Museo_de_Altamira-Santillana_del_Mar_Cantabria.html

Here is the code I am using, which is supposed to print the page's HTML:

JavaScript

If I run this code in the console, it hangs on requests.get(url) for a long time without any output. With another URL (for example, url = "https://stackoverflow.com/"), the HTML is displayed correctly right away. Why is TripAdvisor not working? How can I obtain its HTML?


Answer

Adding a User-Agent header should solve your issue as a first step, because some sites serve different content depending on it or use it for bot/automation detection. Open DevTools in your browser and copy the User-Agent from one of your requests:

JavaScript
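A minimal sketch of this suggestion, assuming the `requests` library used in the question. The User-Agent string below is only an illustrative placeholder (copy the real one from your own browser), and `fetch_html` with its `timeout` default is a hypothetical helper written for this example, not code from the original answer.

```python
import requests

# Illustrative placeholder: replace with the User-Agent your own browser sends
# (DevTools -> Network tab -> any request -> Request Headers).
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/120.0.0.0 Safari/537.36"
    )
}


def fetch_html(url, headers=HEADERS, timeout=10):
    """Return the page HTML, sending a browser-like User-Agent (hypothetical helper)."""
    # The timeout makes the call raise an exception instead of waiting indefinitely.
    response = requests.get(url, headers=headers, timeout=timeout)
    # Raise requests.HTTPError for 4xx/5xx responses instead of returning an error page.
    response.raise_for_status()
    return response.text
```

Calling `print(fetch_html(url))` with the URL from the question would then print the HTML if the site accepts the request. Whether the header alone is enough depends on the site's bot detection at the time, so treat this as a first step rather than a guaranteed fix.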

Example

JavaScript

Output

JavaScript
User contributions licensed under: CC BY-SA