Skip to content
Advertisement

BeautifulSoup: How to extract text encapsulated in multiple div/span/id tags

I need to extract the digits (0.04) in the “td” tag at the end of this html page.

      <div class="boxContentInner">
         <table class="values non-zebra">
   <thead>
   <tr>
      <th>Apertura</th>
      <th>Max</th>
      <th>Min</th>
      <th>Variazione giornaliera</th>
      <th class="last">Variazione %</th>
   </tr>
   </thead>
   <tbody>
   <tr>
      <td id="open" class="quaternary-header">2708.46</td>
      <td id="high" class="quaternary-header">2710.20</td>
      <td id="low" class="quaternary-header">2705.66</td>
      <td id="change" class="quaternary-header changeUp">0.99</td>
      <td id="percentageChange" class="quaternary-header last changeUp">0.04</td>
   </tr>
   </tbody>
</table>

      </div>

I tried this code using BeautifulSoup with Python 2.8:

from bs4 import BeautifulSoup 
import requests 

page= requests.get('https://www.ig.com/au/indices/markets-indices/us-spx-500').text 
soup = BeautifulSoup(page, 'lxml') 

percent= soup.find('td',{'id':'percentageChange'}) 
percent2=percent.text


print percent2


The result is NONE.

Where is the error?

Advertisement

Answer

I had a look at https://www.ig.com/au/indices/markets-indices/us-spx-500 and it seems you are not searching for the right id when doing percent= soup.find('td', {'id':'percentageChange'})

The actual value is located in <span data-field="CPC">VALUE</span>

enter image description here

You can retrieve this information with the below:

percent = soup.find("span", {'data-field': 'CPC'})
print(percent.text.strip())
User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement