Tag: web-crawler

Scrapy/regex: get a JSON object from HTML

I’m crawling reviews from a website with Scrapy in Python and want to extract all the reviews from the following part of the raw HTML as a dictionary. Getting window.cj.listings is no problem, but I can’t seem to extract window.cj.app_data with a regex. The following code works for getting the listings. But I get nothing from window.cj.app_data, when I
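
One possible approach, sketched under assumptions: the question's HTML and regex are not shown here, so the sample page, the helper name extract_js_object, and the premise that the assigned value is strict JSON (double-quoted keys and strings) are all hypothetical. Rather than matching the whole object with a regex, which tends to break on nested braces or semicolons inside strings, the sketch uses the regex only to find the assignment and lets json.JSONDecoder.raw_decode parse exactly one JSON value from that position.

```python
import json
import re


def extract_js_object(html, variable_name):
    """Return the JSON value assigned to `variable_name` in `html`, or None.

    Hypothetical helper: assumes the assigned value is strict JSON.
    """
    # Find "<variable_name> =" and consume the whitespace around "=".
    pattern = re.escape(variable_name) + r"\s*=\s*"
    match = re.search(pattern, html)
    if match is None:
        return None
    try:
        # raw_decode parses a single JSON value starting at the given index
        # and ignores whatever text follows it (e.g. the trailing ";").
        value, _end = json.JSONDecoder().raw_decode(html, match.end())
    except json.JSONDecodeError:
        return None
    return value


# Made-up sample page, only for illustration.
sample_html = """
<script>
window.cj.listings = [{"id": 1}];
window.cj.app_data = {"reviews": [{"author": "a", "text": "nice; really {good}"}]};
</script>
"""

app_data = extract_js_object(sample_html, "window.cj.app_data")
listings = extract_js_object(sample_html, "window.cj.listings")
```

In a Scrapy callback the same helper could be called on response.text. If the page assigns a JavaScript literal that is not valid JSON (single quotes, unquoted keys), this sketch returns None and a JavaScript-aware parser would be needed instead.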

How to handle “Redis.exceptions.ConnectionError: Connection has data”

eventlet python python-requests redis web-crawler

I receive the following output: I couldn’t find any issue related to this particular error. I emptied/flushed all Redis databases, so there should be no data there. I assume it has something to do with eventlet and patching. But even when I put the following code right at the beginning of the file, the error still appears. What does this error mean?

Python how to find the minimum number of moves for a directory iteration – crawler

iteration python python-3.x web-crawler

I’m working on a Python 3 program that has to return the number of moves for a directory iteration. The input is a list of operations, each denoting an action: ../ moves to the parent folder of the current folder; ./ remains in the same folder; x/ moves to the child folder named x. Actually,
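
A minimal sketch of one common reading of this problem: the excerpt is cut off, so it is an assumption here that "number of moves" means the minimum number of ../ steps needed to get back to the starting folder after all operations are applied, and that ../ at the starting folder does nothing. The function name min_moves_to_root is hypothetical.

```python
def min_moves_to_root(operations):
    """Return how many "../" moves lead back to the starting folder.

    Assumption: the answer is the folder depth reached after applying
    every operation, with "../" ignored when already at the start.
    """
    depth = 0
    for op in operations:
        if op == "../":
            # Move to the parent folder, but never above the starting folder.
            depth = max(0, depth - 1)
        elif op == "./":
            # Stay in the current folder.
            continue
        else:
            # Any other entry such as "x/" moves into a child folder.
            depth += 1
    return depth


moves = min_moves_to_root(["x/", "y/", "../", "z/", "./"])
```

Only the depth has to be tracked, not the folder names, because each step back up undoes exactly one step down.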

Crawling IMDB for movie trailers?

python web-crawler youtube

I want to crawl IMDB and download the trailers of movies (either from YouTube or IMDB) that fit some criteria (e.g.: released this year, with a rating above 2). I want to do this in Python – I saw that there were packages for crawling IMDB and downloading YouTube videos. The thing is, my current plan is to crawl IMDB