Tag: scrapy

XPath Selector to get IMDB release Date

I am practicing with XPath selectors, and I am finding it very difficult to extract the release date from this website. I can get to the div with class='txt-block', but not past that. I am trying to get the date underneath it, e.g. “18 July 2008 (USA)”. https://www.imdb.com/title/tt04685…
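One way to picture the problem: the date is not inside a child element of the txt-block div but is loose text that follows the heading element. The sketch below uses only the standard library and a made-up markup fragment. The structure of `SAMPLE_HTML`, the `h4` heading, and the function name `release_date` are assumptions for illustration, not the real page layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: an assumption about how the page might be structured.
SAMPLE_HTML = """
<div id="titleDetails">
  <div class="txt-block"><h4 class="inline">Country:</h4> USA</div>
  <div class="txt-block"><h4 class="inline">Release Date:</h4> 18 July 2008 (USA)</div>
</div>
"""


def release_date(markup):
    """Return the text that follows the 'Release Date:' heading, or None."""
    root = ET.fromstring(markup)
    # Visit every txt-block div and look for the one whose heading matches.
    for block in root.findall(".//div[@class='txt-block']"):
        heading = block.find("h4")
        if heading is not None and (heading.text or "").strip() == "Release Date:":
            # The date is the heading's tail: text after </h4>, still inside the div.
            return (heading.tail or "").strip()
    return None


print(release_date(SAMPLE_HTML))
```

In a Scrapy spider, under the same markup assumption, the equivalent selector would probably be something like `//div[@class="txt-block"]/h4[contains(text(), "Release Date")]/following-sibling::text()[1]`, followed by stripping whitespace from the result.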

Pyinstaller error on scrapy?

I am using Scrapy by importing it. I built the Python file using Pyinstaller. After building it, I ran the file ./new.py, but an error pops up. Answer: You did not use Pyinstaller properly when you built your stand-alone program. Here is a short, layman’s description of how Pyinstaller works: Pyinstaller b…

Scrapy: populate items with item loaders over multiple pages

I’m trying to crawl and scrape multiple pages, given multiple URLs. I am testing with Wikipedia, and to make it easier I just used the same XPath selector for each page, but I eventually want to use many different XPath selectors unique to each page, so each page has its own separate parsePage method. T…

Replacing characters in Scrapy item

I’m trying to scrape from a commerce website using Scrapy. For the price tag, I want to remove the “$”, but my current code does not work. What is the appropriate method to remove characters when using Scrapy? Answer: extract() returns a list; you can use extract_first() to get a sin…
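A minimal sketch of the idea in the answer: string methods such as replace() work on a single string, not on a list, so the value is taken with extract_first() first and cleaned afterwards. The helper name `clean_price` and the selector in the comment are hypothetical. The helper also handles None, because extract_first() returns None when nothing matches.

```python
def clean_price(raw):
    """Strip the dollar sign and surrounding whitespace from a scraped price."""
    if raw is None:
        # extract_first() yields None when the selector matches nothing.
        return None
    return raw.replace("$", "").strip()


# Inside a spider callback, usage might look like this (selector is an assumption):
#   raw = response.xpath("//span[@class='price']/text()").extract_first()
#   item["price"] = clean_price(raw)
print(clean_price("$19.99"))
```

Calling replace() directly on the result of extract() fails because a list has no such method, which is the most likely reason the original code did not work.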