I am practicing web scraping using the requests and BeautifulSoup modules on the following website:
https://www.imdb.com/title/tt0080684/
My code so far correctly prints the JSON in question. I’d like help extracting only the name and description from that JSON into a response dictionary.
Code
# Send HTTP requests
import requests
import json
from bs4 import BeautifulSoup


class WebScraper:

    def send_http_request():
        # Obtain the URL via user input
        url = input('Input the URL:\n')

        # Get the webpage
        r = requests.get(url)

        soup = BeautifulSoup(r.content, 'html.parser')

        # Check response object's status code
        if r:
            p = json.loads("".join(soup.find('script', {'type':'application/ld+json'}).contents))
            print(p)
        else:
            print('\nInvalid movie page!')


WebScraper.send_http_request()
Desired Output
{"title": "Star Wars: Episode V - The Empire Strikes Back", "description": "After the Rebels are brutally overpowered by the Empire on the ice planet Hoth, Luke Skywalker begins Jedi training with Yoda, while his friends are pursued by Darth Vader and a bounty hunter named Boba Fett all over the galaxy."}
Answer
json.loads already gives you a dictionary, so you can pick out the name and description keys and then print a new JSON object using the json.dumps method:
# Send HTTP requests
import requests
import json
from bs4 import BeautifulSoup


class WebScraper:

    def send_http_request():
        # Obtain the URL via user input
        url = input('Input the URL:\n')

        # Get the webpage
        r = requests.get(url)

        soup = BeautifulSoup(r.content, 'html.parser')

        # Check response object's status code
        if r:
            p = json.loads("".join(soup.find('script', {'type':'application/ld+json'}).contents))
            # Keep only the two fields of interest and serialize them as JSON
            output = json.dumps({"title": p["name"], "description": p["description"]})
            print(output)
        else:
            print('\nInvalid movie page!')


WebScraper.send_http_request()
Output:
{"title": "Star Wars: Episode V - The Empire Strikes Back", "description": "Star Wars: Episode V - The Empire Strikes Back is a movie starring Mark Hamill, Harrison Ford, and Carrie Fisher. After the Rebels are brutally overpowered by the Empire on the ice planet Hoth, Luke Skywalker begins Jedi training..."}
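Since the question asks for a response dictionary rather than a printed string, it may help to keep the result as a dict and only call json.dumps when you need text. Below is a minimal sketch of that idea. The helper name build_response and the sample data are hypothetical and not part of the original code. It also assumes the parsed ld+json is a plain dict, and uses .get() so a missing key yields None instead of raising KeyError.

```python
import json


def build_response(data):
    """Return only the title and description from a parsed ld+json dict.

    `build_response` is a hypothetical helper name used for illustration.
    """
    # .get() returns None when the key is absent instead of raising KeyError
    return {
        "title": data.get("name"),
        "description": data.get("description"),
    }


# Made-up sample standing in for the dict returned by json.loads(...)
sample = {
    "@type": "Movie",
    "name": "Example Movie",
    "description": "An example plot summary.",
    "url": "/title/example/",
}

response = build_response(sample)   # a regular Python dict
as_text = json.dumps(response)      # serialize only when a string is needed
print(response)
print(as_text)
```

In the scraper above you would call build_response(p) in place of building the dictionary inline, and return or print the result as needed.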