Why can’t I save my scraped HTML table to a pandas DataFrame?

Question

I have a Python script that scrapes an HTML table. When I try to save my scraped data to a pandas DataFrame, I get an error. Please help me check what I am doing wrong. Here is my code block. Here is the error I get. I want to save the scraped values above into a pandas DataFrame. That’s my aim. Please help if

Accepted Answer

In your variable row_data you are only saving one row, and you are overwriting it in every iteration. You probably want to use all rows in your DataFrame. You can, for example, create a new variable row_data_all and pass that to your DataFrame:

    row_data_all = []
    for row in tablebodies:
        tabledata = row.find_elements(selenium.webdriver.common.by.By.CSS_SELECTOR, 'tr, td')
        row_data = []
        for data in tabledata:
            row_data.append(data.text)
        row_data_all.append(row_data)
    pd.DataFrame(row_data_all, columns = row_headers)

In case you really wanted to create a DataFrame from a single row, you should use:

    pd.DataFrame(row_data, index = row_headers).T

Alternative

You can also use pandas’ read_html() method, which only needs the HTML source code. You can even pass it the source code of the entire page, and it will return a list of DataFrames of the tables found in the source code. This will also speed up your function a lot.

    html_table = driver.find_element(By.TAG_NAME, "table").get_attribute("outerHTML")
    df = pd.read_html(html_table)[0]
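To see the row-accumulation fix from the accepted answer without a browser, here is a minimal sketch that swaps the Selenium calls for hard-coded strings. The values in scraped_rows and row_headers are made up for illustration and stand in for whatever element.text returns in the real script; only the list handling and the DataFrame calls mirror the answer.

```python
import pandas as pd

# Hypothetical stand-ins for the scraped values (assumption: the real script
# gets these strings from Selenium's element.text).
row_headers = ["Name", "Price", "Change"]
scraped_rows = [
    ["AAA", "10.5", "+0.2"],
    ["BBB", "7.1", "-0.4"],
]

# Collect every row instead of overwriting a single row_data list.
row_data_all = []
for cells in scraped_rows:
    row_data = []
    for text in cells:
        row_data.append(text)
    row_data_all.append(row_data)

# A list of rows lines up with columns=row_headers.
df = pd.DataFrame(row_data_all, columns=row_headers)
print(df)

# After the loop, row_data holds only the last row. A flat list is treated as
# one column, so use the headers as the index and transpose to get one row.
single = pd.DataFrame(row_data, index=row_headers).T
print(single)
```

The key difference is the shape of the data: a list of lists gives one DataFrame row per inner list, while a flat list becomes a single column. If you try the read_html() route on a recent pandas release, note that passing a literal HTML string is deprecated there (as of pandas 2.1) and wrapping it in io.StringIO is the recommended form.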
