I’m trying to create a function that takes two arguments, a web URL and a search word, and prints the number of times the word appears on the page.
I’m not sure what I’m doing wrong: my code produces neither an error nor any output.

    from html.parser import HTMLParser
    from urllib.request import urlopen

    class customWebScraper(HTMLParser):
        def __init__(self, searchWord, desiredURL):
            HTMLParser.__init__(self)
            self.searchWord= ''
            self.desiredURL = ''


        def scrapePage(searchWord, desiredURL):
            wordCount = 0
            if searchWord.count(searchWord) > 0:
                wordCount += 1
            print(wordCount)

    searchWord= ''
    desiredURL = ''

    urlContents = urlopen(desiredURL).read().decode('utf-8')

    parseURL = customWebScraper(searchWord, desiredURL)
    parseURL.feed(urlContents)

So if a user types:
    customWebScraper('name', 'http://help.websiteos.com/websiteos/example_of_a_simple_html_page.htm')
it should output: 6
Answer
Here’s a simple example script that defines the function you want.

    from urllib.request import urlopen


    class customWebScraper:
        def __init__(self, searchWord, desiredURL):
            # Keep the arguments on the instance so scrapePage can use them.
            self.searchWord = searchWord
            self.desiredURL = desiredURL

        def scrapePage(self):
            # Download the page and decode it (assumes the page is UTF-8 encoded).
            url_content = urlopen(self.desiredURL).read().decode('utf-8')
            # Case-insensitive substring count over the raw HTML source.
            return url_content.lower().count(self.searchWord.lower())


    parseURL = customWebScraper('name', 'http://help.websiteos.com/websiteos/example_of_a_simple_html_page.htm')
    count = parseURL.scrapePage()
    print('"{}" appears in {} exactly {} times'.format(parseURL.searchWord, parseURL.desiredURL, count))

When I run it, the output is:
    "name" appears in http://help.websiteos.com/websiteos/example_of_a_simple_html_page.htm exactly 6 times
I assumed you wanted a case-insensitive match because, on the page you provided, name appears 6 times only if you also count appName, etc.
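Note that the count above runs over the raw HTML source and matches substrings, so markup such as attribute values is counted too. If you would rather count only the text between tags, or only whole words, one possible approach is sketched below using the same HTMLParser class the question started from. Treat it as a minimal sketch rather than a definitive implementation: TextExtractor, count_word, count_word_on_page, and the sample string are names made up here for illustration, and the fetch step assumes a UTF-8 page, just as the code above does. The key point is that feed() only reports text through handle_data, which does nothing unless a subclass overrides it.

```python
import re
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collects the text found between tags (hypothetical helper for this sketch)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # feed() calls this for every run of text outside of tags.
        self.chunks.append(data)


def count_word(html, word, whole_word=False):
    """Count case-insensitive occurrences of `word` in the text of `html`."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Join with spaces so text from adjacent elements does not run together.
    text = " ".join(parser.chunks)
    if whole_word:
        # \b keeps "appName" or "codename" from matching "name".
        pattern = r"\b" + re.escape(word) + r"\b"
        return len(re.findall(pattern, text, flags=re.IGNORECASE))
    return text.lower().count(word.lower())


def count_word_on_page(url, word, whole_word=False):
    """Fetch `url` and count `word` in its text (assumes a UTF-8 page)."""
    with urlopen(url) as response:
        html = response.read().decode("utf-8")
    return count_word(html, word, whole_word)


# Made-up sample so the example runs without network access.
sample = '<p class="name">Name: <b>appName</b> and codename, name.</p>'
print(count_word(sample, "name"))                   # 4: substrings in the text
print(count_word(sample, "name", whole_word=True))  # 2: whole words only
print(sample.lower().count("name"))                 # 5: raw source, includes class="name"
```

On this sample the raw-source count is higher because the class="name" attribute is counted as well. Text inside script and style elements is also passed to handle_data, so it is still included in the counts here; skipping it would need extra tracking in handle_starttag and handle_endtag.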