How do you scrape a table from a website which is hosting the table data outside of the HTML?

I am trying to scrape the table data from this URL: https://covid19criticalcare.com/pharmacies/

On my previous scrape I used the following Python packages: bs4 (BeautifulSoup), requests, mysql.connector, pandas, and sqlalchemy (create_engine).

But this URL's HTML doesn't contain the table data. Instead, the page seems to draw the data from an external database.

Could someone please point me in the right direction for scraping table data from a page set up like this, using a Python script?

I tried a blind scrape, using the same method as on my previous scrape:

from bs4 import BeautifulSoup
import requests
import mysql.connector
import pandas as pd
from sqlalchemy import create_engine

url = "https://covid19criticalcare.com/pharmacies/"

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}
result = requests.get(url, headers=headers)
doc = BeautifulSoup(result.text, "html.parser")

# Look for the first-column cells of the table in the static HTML
name = doc.find_all("td", class_="column-1")

td_names = []

for td in name:
    names = td.text
    td_names.append(names)

print(td_names)

Answer

As an alternative to @Naphat Theerawat's answer: since I noticed that you started with a selenium-based solution, you could reach your goal much more easily by combining it with pandas.

Load the website and extract the table from driver.page_source with pd.read_html(). To avoid iterating over each page, just select "Show All entries" first.
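To see what pd.read_html() does in isolation, here is a minimal sketch that runs on a hard-coded HTML string instead of a live page. The HTML, the pharmacy names, and the variable names are made up for illustration and are not taken from the real site; the sketch assumes pandas plus an HTML parser such as lxml is installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for driver.page_source: two tables, the second
# one containing a row that is hidden with an inline style.
html = """
<table>
  <thead><tr><th>Key</th><th>Value</th></tr></thead>
  <tbody><tr><td>a</td><td>1</td></tr></tbody>
</table>
<table>
  <thead><tr><th>Pharmacy Name</th><th>Website</th></tr></thead>
  <tbody>
    <tr><td>Example Pharmacy A</td><td>a.example</td></tr>
    <tr style="display: none"><td>Example Pharmacy B</td><td>b.example</td></tr>
  </tbody>
</table>
"""

# read_html returns a list with one DataFrame per <table> it finds,
# so an index is needed to pick the table of interest.
tables = pd.read_html(StringIO(html), displayed_only=False)
all_rows = tables[1]

# With the default displayed_only=True, the hidden row is skipped.
visible_rows = pd.read_html(StringIO(html))[1]

print(len(tables), len(all_rows), len(visible_rows))
```

Wrapping the markup in StringIO avoids the deprecation warning that newer pandas versions emit when a literal HTML string is passed directly.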

Example

from io import StringIO

import pandas as pd
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import Select, WebDriverWait
from webdriver_manager.chrome import ChromeDriverManager

url = 'https://covid19criticalcare.com/pharmacies/'

# Selenium 4 expects the driver path to be wrapped in a Service object
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.maximize_window()
driver.get(url)
wait = WebDriverWait(driver, 5)

# Switch the "Show ... entries" dropdown to All (value -1) so every row is rendered
select = Select(wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, '[name="DataTables_Table_0_length"]'))))
select.select_by_value('-1')
# With all rows on one page, the "next" pagination button becomes disabled
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, 'a.paginate_button.next.disabled')))

# read_html returns one DataFrame per <table>; index 1 picks the second one
df = pd.read_html(StringIO(driver.page_source), displayed_only=False)[1]
driver.quit()

df

Output

Pharmacy Name | Email | Phone | Website | Requires prescription? | Pharmacy Address | Based in the United States? | Overnight shipping to the United States? | Overnight International shipping? | Ships to the following States/Provinces
0 Covid Pharmacy | sales@0covidpharmacy.com | (785) 672 9222 | 0covidpharmacy.com | NO | 245 Krishna Market Channi RoadNagpur, Maharashtra 440001India | NO | YES | YES | AlabamaAlaskaArizonaArkansasCaliforniaColoradoConnecticutDelawareDistrict of ColumbiaFloridaGeorgiaHawaiiIdahoIllinoisIndianaIowaKansasKentuckyLouisianaMaineMarylandMassachusettsMichiganMinnesotaMississippiMissouriMontanaNebraskaNevadaNew HampshireNew JerseyNew MexicoNew YorkNorth CarolinaNorth DakotaOhioOklahomaOregonPennsylvaniaRhode IslandSouth CarolinaSouth DakotaTennesseeTexasUtahVermontVirginiaWashingtonWest VirginiaWisconsinWyomingGuamPuerto RicoVirgin IslandsArmed Forces AmericasArmed Forces EuropeArmed Forces Pacific
1 Ivermectin Service | ask24@1ivermectin.com | (888) 290 0964 (US), +91 22509 72606 (IN) | 1ivermectin.com | NO | 1/16, First Floor, Tardeo Air Conditioned Market Building, TardeoMumbai, Tardeo 400034India | NO | YES | YES | AlabamaAlaskaArizonaArkansasCaliforniaColoradoConnecticutDelawareDistrict of ColumbiaFloridaGeorgiaIdahoIllinoisIndianaIowaKansasKentuckyLouisianaMaineMarylandMassachusettsMichiganMinnesotaMississippiMissouriMontanaNebraskaNevadaNew HampshireNew JerseyNew MexicoNew YorkNorth CarolinaNorth DakotaOhioOklahomaOregonPennsylvaniaRhode IslandSouth CarolinaSouth DakotaTennesseeTexasUtahVermontVirginiaWashingtonWest VirginiaWisconsinWyomingPuerto RicoVirgin Islands
1 Life Pharmacy | sales@1lifepharmacy.net | (888) 560-0430 (US); +91 (807 ) 127-9990 (India) | 1lifepharmacy.net | NO | 302, Pride Plaza, Rajkot, 360002Rajkot, Gujarat 360002; 84118India | NO | YES | YES | AlabamaAlaskaArizonaArkansasCaliforniaColoradoConnecticutDelawareDistrict of ColumbiaFloridaGeorgiaHawaiiIdahoIllinoisIndianaIowaKansasKentuckyLouisianaMaineMarylandMassachusettsMichiganMinnesotaMississippiMissouriMontanaNebraskaNevadaNew HampshireNew JerseyNew MexicoNew YorkNorth CarolinaNorth DakotaOhioOklahomaOregonPennsylvaniaRhode IslandSouth CarolinaSouth DakotaTennesseeTexasUtahVermontVirginiaWashingtonWest VirginiaWisconsinWyoming
1-2-3 RX Global Pharmacy | doctor@123rx.net | (516) 758-2630 | 123rx.net | NO | 2967 Dundas St. W.Toronto, Ontario M6P 1Z2Canada | NO | YES | YES | AlabamaAlaskaArizonaArkansasCaliforniaColoradoConnecticutDelawareDistrict of ColumbiaFloridaGeorgiaHawaiiIdahoIllinoisIndianaIowaKansasKentuckyLouisianaMaineMarylandMassachusettsMichiganMinnesotaMississippiMissouriMontanaNebraskaNevadaNew HampshireNew JerseyNew MexicoNew YorkNorth CarolinaNorth DakotaOhioOklahomaOregonPennsylvaniaRhode IslandSouth CarolinaSouth DakotaTennesseeTexasUtahVermontVirginiaWashingtonWest VirginiaWisconsinWyoming
12 Angel Pharmacy Store | 12angel.store@gmail.com | (908) 866-4260 | 12angel.store | NO | 1050 Bharat Diamond BourseBandra Kurla ComplexMumbai, Maharashtra 400051India | NO | YES | YES | AlabamaAlaskaArizonaArkansasCaliforniaColoradoConnecticutDelawareDistrict of ColumbiaFloridaGeorgiaHawaiiIdahoIllinoisIndianaIowaKansasKentuckyLouisianaMaineMarylandMassachusettsMichiganMinnesotaMississippiMissouriMontanaNebraskaNevadaNew HampshireNew JerseyNew MexicoNew YorkNorth CarolinaNorth DakotaOhioOklahomaOregonPennsylvaniaRhode IslandSouth CarolinaSouth DakotaTennesseeTexasUtahVermontVirginiaWashingtonWest VirginiaWisconsinWyomingGuamPuerto RicoVirgin IslandsArmed Forces AmericasArmed Forces EuropeArmed Forces Pacific
24 x 7 Pharma | contact@24x7pharma.com | (851) 127-5721 | 24x7pharma.com | NO | Mahek IconSumul Diary Road, KatargamSurat, Gujarat 395003India | NO | YES | YES | AlabamaAlaskaArizonaArkansasCaliforniaColoradoConnecticutDelawareDistrict of ColumbiaFloridaGeorgiaHawaiiIdahoIllinoisIndianaIowaKansasKentuckyLouisianaMaineMarylandMassachusettsMichiganMinnesotaMississippiMissouriMontanaNebraskaNevadaNew HampshireNew JerseyNew MexicoNew YorkNorth CarolinaNorth DakotaOhioOklahomaOregonPennsylvaniaRhode IslandSouth CarolinaSouth DakotaTennesseeTexasUtahVermontVirginiaWashingtonWest VirginiaWisconsinWyomingGuamPuerto RicoVirgin IslandsArmed Forces AmericasArmed Forces EuropeArmed Forces Pacific
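
In the output above, the names in the last column run together without separators. One possible way to split them afterwards is sketched below. It is not part of the original answer: the helper split_fused and the short REGIONS list are hypothetical, and the list would have to be extended to cover every name that actually occurs in the column.

```python
def split_fused(text, names):
    """Split a string of known names that were concatenated without separators."""
    # Try longer names first so a longer name is preferred over a shorter
    # one whenever both match at the same position.
    ordered = sorted(names, key=len, reverse=True)
    parts, pos = [], 0
    while pos < len(text):
        for name in ordered:
            if text.startswith(name, pos):
                parts.append(name)
                pos += len(name)
                break
        else:
            # No known name starts here: fail loudly instead of guessing.
            raise ValueError(f"unrecognised text at position {pos}: {text[pos:pos + 20]!r}")
    return parts


# Assumed subset of names, for illustration only.
REGIONS = [
    "Virginia", "Washington", "West Virginia", "Wisconsin",
    "Wyoming", "Guam", "Puerto Rico", "Virgin Islands",
]

sample = "VirginiaWashingtonWest VirginiaWisconsinWyomingGuamPuerto RicoVirgin Islands"
print(split_fused(sample, REGIONS))
```

Assuming the column keeps the header shown above, the helper could be applied with something like df["Ships to the following States/Provinces"].map(lambda s: split_fused(s, REGIONS)). The matching is greedy and does not backtrack, so it relies on the list of names being complete.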

User contributions licensed under: CC BY-SA