I need to extract the value (0.04) from the last “td” tag in this HTML page.
<div class="boxContentInner">
  <table class="values non-zebra">
    <thead>
      <tr>
        <th>Apertura</th>
        <th>Max</th>
        <th>Min</th>
        <th>Variazione giornaliera</th>
        <th class="last">Variazione %</th>
      </tr>
    </thead>
    <tbody>
      <tr>
        <td id="open" class="quaternary-header">2708.46</td>
        <td id="high" class="quaternary-header">2710.20</td>
        <td id="low" class="quaternary-header">2705.66</td>
        <td id="change" class="quaternary-header changeUp">0.99</td>
        <td id="percentageChange" class="quaternary-header last changeUp">0.04</td>
      </tr>
    </tbody>
  </table>
</div>
I tried this code using BeautifulSoup with Python 2:
from bs4 import BeautifulSoup
import requests

page = requests.get('https://www.ig.com/au/indices/markets-indices/us-spx-500').text
soup = BeautifulSoup(page, 'lxml')
percent = soup.find('td', {'id': 'percentageChange'})
percent2 = percent.text
print percent2
The result is None.
Where is the error?
Answer
I had a look at https://www.ig.com/au/indices/markets-indices/us-spx-500, and it seems you are not searching for the right element when you call percent = soup.find('td', {'id': 'percentageChange'}). That lookup matches nothing, so find returns None.
The actual value is located in <span data-field="CPC">VALUE</span>.
You can retrieve it with the following:
percent = soup.find("span", {'data-field': 'CPC'})
print(percent.text.strip())
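Since find returns None when nothing matches, it may help to check the result before reading .text. The following is only a minimal sketch: SAMPLE_HTML is a made-up stand-in for the downloaded page (the real markup may differ), extract_field is a hypothetical helper name, and the built-in "html.parser" is used in place of lxml so that no extra parser is needed.

```python
from bs4 import BeautifulSoup

# Hypothetical inline HTML standing in for the downloaded page;
# the markup of the real page may differ.
SAMPLE_HTML = """
<div>
  <span data-field="CPC"> 0.04 </span>
</div>
"""


def extract_field(html, tag, attrs):
    """Return the stripped text of the first matching element, or None."""
    soup = BeautifulSoup(html, "html.parser")
    element = soup.find(tag, attrs)
    if element is None:
        # No match: return None instead of failing on element.text
        return None
    return element.get_text(strip=True)


# The span from the answer is present in the sample, so its text is returned.
found = extract_field(SAMPLE_HTML, "span", {"data-field": "CPC"})

# The td from the question is absent from the sample, so the helper returns None.
missing = extract_field(SAMPLE_HTML, "td", {"id": "percentageChange"})

print(found)
print(missing)
```

With a check like this, a missing element shows up as an explicit None rather than as an error on the .text access.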