I am new to web scraping and I am facing a problem. The appending part seems to append only the first row of the table I want to scrape. I am sure I am missing something. Any ideas? Thanks in advance! The code snippet is the following:
driver = visit_main_page()

contents = driver.find_elements_by_xpath('//*[@id="mw-content-text"]/div[1]')

tables = contents[0].find_elements_by_xpath('//*[@id="mw-content-text"]/div[1]/table')

data = {"Date": [], "Time": [], "Place": [], "Latitude": [], "Longitude": [], "Fatalities": [], "Magnitude": []}

for i in tables:

    try:
        dates = driver.find_elements_by_xpath('//*[@id="mw-content-text"]/div[1]/table[2]/tbody/tr[1]/td[1]')
        times = driver.find_elements_by_xpath('//*[@id="mw-content-text"]/div[1]/table[2]/tbody/tr[1]/td[2]')
        places = driver.find_elements_by_xpath('//*[@id="mw-content-text"]/div[1]/table[2]/tbody/tr[1]/td[3]')
        lat = driver.find_elements_by_xpath('//*[@id="mw-content-text"]/div[1]/table[2]/tbody/tr[1]/td[4]')
        long = driver.find_elements_by_xpath('//*[@id="mw-content-text"]/div[1]/table[2]/tbody/tr[1]/td[5]')
        fat = driver.find_elements_by_xpath('//*[@id="mw-content-text"]/div[1]/table[2]/tbody/tr[1]/td[6]')
        magn = driver.find_elements_by_xpath('//*[@id="mw-content-text"]/div[1]/table[2]/tbody/tr[1]/td[7]')
    except NoSuchElementException:
        print('No such content!')
        pass
    time.sleep(1)

    for d in dates:
        data['Date'].append(d.text)

    for t in times:
        data['Time'].append(t.text)

    for p in places:
        data['Place'].append(p.text)

    for la in lat:
        data['Latitude'].append(la.text)

    for lo in long:
        data['Longitude'].append(lo.text)

    for f in fat:
        data['Fatalities'].append(f.text)

    for m in magn:
        data['Magnitude'].append(m.text)
Answer
UPD
You are using the wrong locators.
All the XPaths you use to grab the values start with //*[@id="mw-content-text"]/div[1]/table[2]/tbody/tr[1], which points to the first row of one specific table. That is why only the first row gets appended.
To collect the data you are looking for, try this:
dates = driver.find_elements_by_xpath("//table[contains(@class,'wikitable')]//tbody//tr//td[1]")
times = driver.find_elements_by_xpath("//table[contains(@class,'wikitable')]//tbody//tr//td[2]")
places = driver.find_elements_by_xpath("//table[contains(@class,'wikitable')]//tbody//tr//td[3]")
lat = driver.find_elements_by_xpath("//table[contains(@class,'wikitable')]//tbody//tr//td[4]")
long = driver.find_elements_by_xpath("//table[contains(@class,'wikitable')]//tbody//tr//td[5]")
fat = driver.find_elements_by_xpath("//table[contains(@class,'wikitable')]//tbody//tr//td[6]")
magn = driver.find_elements_by_xpath("//table[contains(@class,'wikitable')]//tbody//tr//td[7]")
This is the main problem. The code after that looks correct.
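To see the effect of the row index without a browser, here is a minimal sketch that uses Python's standard-library xml.etree.ElementTree instead of Selenium. The HTML snippet and its values are made up for illustration. ElementTree supports only a subset of XPath, so the class test is written as [@class='wikitable'] rather than contains(...).

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the page; the values are made up.
HTML = """
<div>
  <table class="wikitable">
    <tbody>
      <tr><td>2020-01-01</td><td>10:00</td></tr>
      <tr><td>2020-02-02</td><td>11:30</td></tr>
    </tbody>
  </table>
</div>
""".strip()

root = ET.fromstring(HTML)

# tr[1] keeps only the first row, so a single date comes back.
first_row_dates = [
    td.text
    for td in root.findall(".//table[@class='wikitable']/tbody/tr[1]/td[1]")
]

# Without the index on tr, the first cell of every row is matched.
all_dates = [
    td.text
    for td in root.findall(".//table[@class='wikitable']/tbody/tr/td[1]")
]

print(first_row_dates)
print(all_dates)
```

The same idea carries over to the Selenium locators: an index on tr limits the match to one row, while leaving it off matches every row.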
You do not need to get contents and tables with this approach.
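One thing to watch for: if some rows have fewer cells than others (header rows, merged cells), collecting each column separately can leave the lists with different lengths. A possible alternative, sketched here under assumptions, is to fill the dictionary row by row. The helper name rows_to_data and the sample rows are hypothetical. With Selenium, each inner list would hold the text of the td elements found under one tr.

```python
COLUMNS = ["Date", "Time", "Place", "Latitude", "Longitude", "Fatalities", "Magnitude"]


def rows_to_data(rows):
    """Build the column dict from rows given as lists of cell texts."""
    data = {name: [] for name in COLUMNS}
    for cells in rows:
        # Skip rows that do not have one cell per column (e.g. header rows).
        if len(cells) < len(COLUMNS):
            continue
        # zip pairs each column name with the cell in the same position.
        for name, text in zip(COLUMNS, cells):
            data[name].append(text.strip())
    return data


# Made-up sample rows standing in for the text of each <td> in a <tr>.
sample_rows = [
    [],  # a header row contains only <th> cells, so it has no <td> texts
    ["2020-01-01", "10:00", "Somewhere", "1.0", "2.0", "0", "5.1"],
    ["2020-02-02", "11:30", "Elsewhere", "3.0", "4.0", "2", "6.3"],
]

data = rows_to_data(sample_rows)
print(data["Date"])
```

Because every kept row contributes exactly one value to each column, all lists in data end up with the same length.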