BeautifulSoup doesn’t find tables on webpage

I’m trying to get the data from the 1st table on a website. I’ve looked on here for similar problems and tried a number of the given solutions but can’t seem to find the table and ultimately the data in the table.

I’ve tried:

from bs4 import BeautifulSoup  
from selenium import webdriver  
driver = webdriver.Chrome(r'C:\folder\chromedriver.exe')  # raw string so that \f is not read as a form-feed escape
url = 'https://docs.microsoft.com/en-us/windows/release-information/'  
driver.get(url)  

tbla = driver.find_element_by_name('table') #attempt using by element name  
tblb = driver.find_element_by_class_name('cells-centered') #attempt using by class name  
tblc = driver.find_element_by_xpath('//*[@id="winrelinfo_container"]/table[1]') #attempt by using xpath  

I also tried using BeautifulSoup:

html = driver.page_source
soup = BeautifulSoup(html,'html.parser')
table = soup.find("table", {"class": "cells-centered"})
print(len(table))

Any help is much appreciated.

Answer

The table is inside an iframe, so you need to switch to that iframe before you can access the table.
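
To see why the lookups on the outer page come back empty, here is a minimal sketch using only Python's standard library. The two HTML strings are hypothetical stand-ins, not the real page markup (frame.html is a made-up URL): one represents the outer page, which only holds the iframe element, and the other represents the separate document loaded inside it.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts start tags by name within a single HTML document."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def handle_starttag(self, tag, attrs):
        self.counts[tag] = self.counts.get(tag, 0) + 1


def count_tags(html):
    """Return a dict mapping tag name to number of occurrences."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


# Hypothetical outer page: it contains the iframe element, but no table.
outer_page = (
    '<html><body>'
    '<iframe id="winrelinfo_iframe" src="frame.html"></iframe>'
    '</body></html>'
)

# Hypothetical document loaded inside the iframe: this is where the table is.
frame_page = (
    '<html><body><div id="winrelinfo_container">'
    '<table class="cells-centered"><tr><td>1</td></tr></table>'
    '</div></body></html>'
)

outer_counts = count_tags(outer_page)
frame_counts = count_tags(frame_page)

print("tables in outer page:", outer_counts.get("table", 0))
print("tables in frame page:", frame_counts.get("table", 0))
```

Searching the outer document, whether through Selenium locators or through BeautifulSoup on driver.page_source, only ever sees the first of these two documents until the driver has switched into the frame.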

Use WebDriverWait() with frame_to_be_available_and_switch_to_it() and the iframe's ID to wait for the frame and switch into it.

Then use WebDriverWait() with visibility_of_element_located() and a CSS selector to wait until the table is visible.

driver.get("https://docs.microsoft.com/en-us/windows/release-information/")
WebDriverWait(driver,10).until(EC.frame_to_be_available_and_switch_to_it((By.ID,"winrelinfo_iframe")))
table=WebDriverWait(driver,10).until(EC.visibility_of_element_located((By.CSS_SELECTOR,"table.cells-centered")))

You need the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

Alternatively, you can locate the table with XPath:

driver.get("https://docs.microsoft.com/en-us/windows/release-information/")
WebDriverWait(driver,10).until(EC.frame_to_be_available_and_switch_to_it((By.ID,"winrelinfo_iframe")))
table=WebDriverWait(driver,10).until(EC.presence_of_element_located((By.XPATH,'//*[@id="winrelinfo_container"]/table[1]')))

You can then load the table data into a pandas DataFrame and export it to a CSV file. This requires pandas.

from io import StringIO

driver.get("https://docs.microsoft.com/en-us/windows/release-information/")
WebDriverWait(driver,10).until(EC.frame_to_be_available_and_switch_to_it((By.ID,"winrelinfo_iframe")))
# outerHTML returns the table's markup as a string
table=WebDriverWait(driver,10).until(EC.presence_of_element_located((By.XPATH,'//*[@id="winrelinfo_container"]/table[1]'))).get_attribute('outerHTML')
# Wrap the string in StringIO: recent pandas versions deprecate passing raw HTML text to read_html
df=pd.read_html(StringIO(table))[0]
print(df)
df.to_csv("path/to/csv")

Install pandas: pip install pandas (read_html also needs an HTML parser such as lxml to be installed)

Then add the following import:

import pandas as pd
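
If pandas is not available, the same outerHTML string can be turned into CSV with the standard library alone. This is a rough sketch of a different technique from the pandas call above, not a drop-in replacement: it assumes a simple table with no nested tables and no rowspan/colspan handling, and the sample markup and the names TableRows and table_html_to_csv are made up for illustration.

```python
import csv
import io
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collects the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_html_to_csv(table_html):
    """Convert the markup of one simple HTML table to CSV text."""
    parser = TableRows()
    parser.feed(table_html)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


# Hypothetical sample markup standing in for the outerHTML string.
sample = (
    "<table class='cells-centered'>"
    "<tr><th>Version</th><th>Build</th></tr>"
    "<tr><td>A</td><td>1</td></tr>"
    "</table>"
)

print(table_html_to_csv(sample))
```

In the Selenium code, the table string returned by get_attribute('outerHTML') would take the place of sample, and the returned text could be written to a file of your choice.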


Source: stackoverflow