I do not understand why my spider won't run. I tested the CSS selector separately, so I do not think the parsing method is the problem.
Traceback message: ReactorNotRestartable
    class espn_spider(scrapy.Spider):
        name = "fsu2021_spider"

        def start_requests(self):
            urls = "https://www.espn.com/college-football/team/_/id/52"
            for url in urls:
                yield scrapy.Request(url = url, callback = self.parse_front)

        def parse(self, response):
            schedule_link = response.css('div.global-nav-container li > a::attr(href)')

    process = CrawlerProcess()
    process.crawl(espn_spider)
    process.start()
Answer
In start_requests, urls is a single string, so "for url in urls:" loops over its characters one at a time and passes each character to scrapy.Request as if it were a URL. Change it to a list:
    urls = ["https://www.espn.com/college-football/team/_/id/52"]
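To see the difference without running the spider, here is a small standalone sketch in plain Python (no Scrapy required). The names team_url, from_string and from_list are made up for illustration and are not part of the original spider.

```python
# Illustrative only: team_url, from_string and from_list are hypothetical names.
team_url = "https://www.espn.com/college-football/team/_/id/52"

# Iterating over a str yields one character per step.
from_string = [url for url in team_url]

# Iterating over a one-element list yields the whole URL exactly once.
from_list = [url for url in [team_url]]

print(from_string[:5])  # ['h', 't', 't', 'p', 's']
print(from_list)
```

With the string version, the first "URL" handed to scrapy.Request would be just "h", which is not a valid request URL.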
Also, your spider has no parse_front method. If you simply left it out of the snippet, ignore this. If it was a mistake, point the callback at parse instead:
    yield scrapy.Request(url=url, callback=self.parse)