BeautifulSoup: How to extract text encapsulated in multiple div/span/id tags

I need to extract the value (0.04) from the last “td” tag in this HTML snippet:

<div class="boxContentInner">
   <table class="values non-zebra">
      <thead>
         <tr>
            <th>Apertura</th>
            <th>Max</th>
            <th>Min</th>
            <th>Variazione giornaliera</th>
            <th class="last">Variazione %</th>
         </tr>
      </thead>
      <tbody>
         <tr>
            <td id="open" class="quaternary-header">2708.46</td>
            <td id="high" class="quaternary-header">2710.20</td>
            <td id="low" class="quaternary-header">2705.66</td>
            <td id="change" class="quaternary-header changeUp">0.99</td>
            <td id="percentageChange" class="quaternary-header last changeUp">0.04</td>
         </tr>
      </tbody>
   </table>
</div>

I tried this code using BeautifulSoup with Python 2.7:

from bs4 import BeautifulSoup
import requests

# Download the page and parse it
page = requests.get('https://www.ig.com/au/indices/markets-indices/us-spx-500').text
soup = BeautifulSoup(page, 'lxml')

# Look up the cell by its id and read its text
percent = soup.find('td', {'id': 'percentageChange'})
percent2 = percent.text

print percent2

The result is None.

Where is the error?

Answer

I had a look at https://www.ig.com/au/indices/markets-indices/us-spx-500, and it seems you are not searching for the right element: soup.find('td', {'id': 'percentageChange'}) matches nothing on that page, so find() returns None.

The actual value is located in <span data-field="CPC">VALUE</span>

You can retrieve it as follows:

percent = soup.find("span", {'data-field': 'CPC'})
print(percent.text.strip())
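
Since find() returns None whenever nothing matches, it may be worth checking the result before reading .text. The following is only a minimal sketch of that idea: SAMPLE_HTML is a made-up stand-in for the downloaded page (the real markup may differ), extract_field is a hypothetical helper name, and it uses the built-in 'html.parser' so it runs without lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page; the real markup may differ.
SAMPLE_HTML = """
<div>
  <span data-field="CPC"> 0.04 </span>
</div>
"""


def extract_field(html, field):
    """Return the stripped text of the first <span data-field=...>, or None."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find("span", {"data-field": field})
    # find() returns None when no element matches, so guard before using it.
    if tag is None:
        return None
    return tag.get_text(strip=True)


print(extract_field(SAMPLE_HTML, "CPC"))               # a matching span
print(extract_field(SAMPLE_HTML, "percentageChange"))  # no such span
```

With this guard, a selector that does not match shows up as a None return value rather than as an error on the .text access.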
