
Tag: scrapy

When running a Scrapy crawler from a Python script, data is scraped successfully but the output file is empty (0 KB)

# Scrapy News Crawler # Defining a function to set headers and setting the link from which to start scraping # Iterating over headline links and getting headline details and date/time # Python script (separate file) Answer Instead of running your spider with cmdline.execute, you can run it with CrawlerProcess; read about common practices. You can see main.py as an example. You can declare the headers

Scrapy get only text ignoring the commented content

I researched but can’t find an answer to my question: I want to get the main content while ignoring the commented content. How should I do it? My Scrapy spider looks like: But this code gives me only some \n\t. Please help, thank you. Answer When /text() in XPath or ::text in CSS fails to produce the desired result, I use another library.
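The excerpt does not say which library the answer uses. As a rough illustration only, and assuming "commented content" means HTML comments, the sketch below swaps in the standard library's html.parser to collect visible text and skip comments. The class name, function name, and sample markup are hypothetical.

```python
from html.parser import HTMLParser


class TextOnlyParser(HTMLParser):
    """Collect text nodes; HTML comments are routed to handle_comment and dropped."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for text between tags, never for comment bodies.
        self.chunks.append(data)

    def handle_comment(self, data):
        # Explicitly ignore comments.
        pass


def visible_text(html):
    """Return the text of html with comments removed and whitespace collapsed."""
    parser = TextOnlyParser()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())


# Hypothetical markup, not taken from the original question.
sample = "<div>Main <!-- hidden note --> content\n\t</div>"
print(visible_text(sample))
```

In a spider, the raw markup of a selected node (for example the result of .get() on a selector) could be passed to visible_text() instead of relying on ::text alone.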

Scrapy: Crawled 0 pages (at 0 pages/min), scraped 0 items

I’m new to Python and I’m trying to scrape an HTML page with a Scrapy spider, but the response returns nothing. Wondering what’s wrong here? Thanks in advance for any help. The url: https://directory.lubesngreases.com/LngMain/includes/themes/MuraBootstrap3/remote/api/?fn=searchcompany&name&query&STATE&brand&COUNTRY&query2&mode=advanced&filters=%7B%7D&page=1&datatype=html My spider: Output: Answer I added print('url:', response.url) in parse() and I see that it runs this function. The first problem is that you use CSS in the wrong way.

Could this selenium code be recreated using scrapy?

I’m interested in getting a better idea of what Scrapy can do. Here is some very simple Selenium code that interacts with a website, fills in some boxes, clicks some elements and downloads a file. Could this code be replicated using Scrapy, so that the Scrapy version does exactly the same thing? Answer “selenium code be

json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0) Scrapy

Hi guys, I am trying to scrape/crawl this JSON-based site using Scrapy/BeautifulSoup: https://pk.profdir.com/jobs-for-angular-developer-lahore-punjab-cddb I have written the code below to fetch and read the JSON from the website: But it raises this error again and again: If anyone knows, please help me; it would be very helpful. Answer The JSON that is located inside <script> isn’t valid, so
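The answer is cut off, so its actual fix is not shown here. As a hedged sketch of how this error can be surfaced and handled, the example below pulls the contents of script blocks out of a page and parses each one defensively. The sample markup, the application/ld+json type, and the helper name are assumptions for illustration, not the real site's structure.

```python
import json
import re

# Hypothetical page fragment; the real site's markup is not shown in the excerpt.
SAMPLE_HTML = """
<script type="application/ld+json">{"title": "Angular Developer", "city": "Lahore"}</script>
<script type="application/ld+json">not json at all</script>
"""

# Capture the body of every JSON <script> block (assumed type attribute).
SCRIPT_RE = re.compile(
    r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)


def parse_embedded_json(html):
    """Return (parsed_objects, error_messages) for each JSON <script> block."""
    parsed, errors = [], []
    for raw in SCRIPT_RE.findall(html):
        try:
            parsed.append(json.loads(raw.strip()))
        except json.JSONDecodeError as exc:
            # Invalid or empty payloads land here instead of crashing the spider.
            errors.append(str(exc))
    return parsed, errors


parsed, errors = parse_embedded_json(SAMPLE_HTML)
print(parsed)
print(errors)
```

The second block fails with the same "Expecting value: line 1 column 1 (char 0)" message as in the question title, which is what json.loads reports when the string does not begin with a valid JSON value.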

How to locate a changing element in playwright? [closed]

I am filling an input box with a verification code, but the text that can be used to locate the input box keeps changing, just like “30 seconds later, you

Run scrapy splash as a script

I am trying to run a Scrapy script with Splash, as I want to scrape a JavaScript-based webpage, but I get no results. When I execute this script with the python command, I get this error: crochet._eventloop.TimeoutError. In addition, the print statement in the parse method never printed, so I suspect something is wrong with SplashRequest. The code that I wrote in
