BeautifulSoup Web Scraping to find values of a specific key within the result set

Question

I am scraping a webpage using Beautiful Soup. The object 'result' is of type <class 'bs4.element.Tag'>.

I am unable to access recipeIngredient as a dictionary key on 'result'. It raises a KeyError:

    KeyError: 'recipeIngredient'

How can I do this? I want to extract this from 'result':

    "recipeIngredient":["1 pound fresh, best-quality mozzarella (preferably buffalo milk)","4

Accepted Answer

You need to convert the data inside the script tag to JSON using json.loads. To get the data inside the script tag, use the .get_text method:

    import requests, json
    from bs4 import BeautifulSoup

    r = requests.get("https://cooking.nytimes.com/recipes/1018849-classic-caprese-salad?action=click&module=Collection%20Page%20Recipe%20Card&region=46%20Ways%20to%20Do%20Justice%20to%20Summer%20Tomatoes&pgType=collection&rank=1")
    c = r.content
    soup = BeautifulSoup(c, "html.parser")
    result = soup.find("script", {"type": "application/ld+json"})
    data = json.loads(result.get_text())
    print(data["recipeIngredient"])

Output:

    ['1 pound fresh, best-quality mozzarella (preferably buffalo milk)', '4 medium heirloom tomatoes', '1 bunch fresh basil, leaves only, some reserved for garnish', 'Flaky sea salt, such as Maldon', 'Coarsely ground black pepper', 'High-quality extra-virgin olive oil']
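For anyone who wants to try this without hitting the network, here is a minimal offline sketch of the same approach. The SAMPLE_HTML string and the extract_ingredients helper are hypothetical names made up for illustration; the real page's markup and JSON layout may differ (for example, the JSON-LD could be a list rather than a single object). The comment in the code also notes why the KeyError appears: indexing a Tag looks up HTML attributes, not keys inside the script's JSON text.

```python
import json
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page; the real markup may differ.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@type": "Recipe", "name": "Sample salad",
 "recipeIngredient": ["2 tomatoes", "1 bunch basil"]}
</script>
</head><body></body></html>
"""

def extract_ingredients(html):
    """Return the recipeIngredient list from the first JSON-LD script, or []."""
    soup = BeautifulSoup(html, "html.parser")
    result = soup.find("script", {"type": "application/ld+json"})
    if result is None:
        return []  # no JSON-LD block found in this document
    # result["recipeIngredient"] would raise KeyError here: indexing a Tag
    # looks up HTML attributes (such as "type"), not keys in the JSON text.
    data = json.loads(result.get_text())
    if isinstance(data, dict):
        return data.get("recipeIngredient", [])
    return []  # assumption: other JSON-LD shapes are simply not handled

ingredients = extract_ingredients(SAMPLE_HTML)
print(ingredients)
```

Using data.get with a default avoids a second KeyError on pages whose JSON-LD has no recipeIngredient entry; whether returning an empty list is the right fallback depends on your use case.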

Answer