I am trying to use BeautifulSoup to get the links from this webpage: https://nfdc.faa.gov/nfdcApps/services/ajv5/fixes.jsp
I need the links to all of the fixes in Arizona (AZ), so I search for AZ and then click ‘A’ under ‘View fixes in alphabetical order:’. Each fix (e.g. ‘AALAN’) shows its link when I hover over it, but I am not able to scrape those links with BeautifulSoup in Python. How can I do this? Here is my code:
import requests
from bs4 import BeautifulSoup as bs

page = requests.get("https://nfdc.faa.gov/nfdcApps/services/ajv5/fix_search.jsp?selectType=state&selectName=AZ&keyword=")
soup = bs(page.content, "html.parser")

links = []
for link in soup.findAll('a'):
    links.append(link.get('href'))

print(links)
And this is what it outputs:
['http://www.faa.gov', 'http://www.faa.gov', 'http://www.faa.gov/privacy/', 'http://www.faa.gov/web_policies/', 'http://www.faa.gov/contact/', 'http://faa.custhelp.com/', 'http://www.faa.gov/viewer_redirect.cfm?viewer=pdf&server_name=employees.faa.gov', 'http://www.faa.gov/viewer_redirect.cfm?viewer=doc&server_name=employees.faa.gov', 'http://www.faa.gov/viewer_redirect.cfm?viewer=ppt&server_name=employees.faa.gov', 'http://www.faa.gov/viewer_redirect.cfm?viewer=xls&server_name=employees.faa.gov', 'http://www.faa.gov/viewer_redirect.cfm?viewer=zip&server_name=employees.faa.gov']
The links to the fixes are not there (e.g. https://nfdc.faa.gov/nfdcApps/services/ajv5/fix_detail.jsp?fix=1948394&list=yes is not in the list).
I am looking to compile a list of all the fix links for Arizona so I can acquire the data. Thanks!
Answer
Try:
import requests
from bs4 import BeautifulSoup

url = "https://nfdc.faa.gov/nfdcApps/services/ajv5/fix_search.jsp"

# Form fields sent with each POST; "alphabet" is overwritten per letter below.
data = {
    "alphabet": "A",
    "selectType": "STATE",
    "selectName": "AZ",
    "keyword": "",
}

alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Request the result page for every letter, A through Z.
for letter in alphabet:
    data["alphabet"] = letter
    soup = BeautifulSoup(requests.post(url, data=data).content, "html.parser")

    # Keep only the anchors whose href points at a fix detail page.
    for a in soup.select('[href*="fix_detail.jsp"]'):
        print("{:<10} {}".format(a.text.strip(), a["href"]))
Prints:
ITEMM      fix_detail.jsp?fix=17822&list=yes
ITUCO      fix_detail.jsp?fix=56147&list=yes
IVLEC      fix_detail.jsp?fix=11787&list=yes
IVVRY      fix_detail.jsp?fix=20962&list=yes
IWANS      fix_detail.jsp?fix=1948424&list=yes
IWEDU      fix_detail.jsp?fix=13301&list=yes
IXAKE      fix_detail.jsp?fix=585636&list=yes
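The hrefs printed above are relative, while the question asks for full links such as https://nfdc.faa.gov/nfdcApps/services/ajv5/fix_detail.jsp?fix=1948394&list=yes. A minimal sketch of one way to convert them, assuming the relative hrefs resolve against the search page URL: the helper name `to_absolute_links` and the `sample` list are hypothetical, introduced only for illustration, and the sample values are copied from the output above.

```python
from urllib.parse import urljoin

# URL of the search page used in the answer; relative hrefs are resolved
# against it (an assumption about how the site lays out its pages).
SEARCH_URL = "https://nfdc.faa.gov/nfdcApps/services/ajv5/fix_search.jsp"


def to_absolute_links(hrefs, base_url=SEARCH_URL):
    """Resolve relative hrefs against base_url, dropping duplicates in order."""
    seen = set()
    links = []
    for href in hrefs:
        full = urljoin(base_url, href)
        if full not in seen:
            seen.add(full)
            links.append(full)
    return links


# Hypothetical sample: two hrefs from the printed output, one repeated.
sample = [
    "fix_detail.jsp?fix=17822&list=yes",
    "fix_detail.jsp?fix=56147&list=yes",
    "fix_detail.jsp?fix=17822&list=yes",  # duplicate, kept only once
]

for link in to_absolute_links(sample):
    print(link)
```

In the loop from the answer, each `a["href"]` could be appended to a list instead of printed, and that list passed through this helper once all letters have been fetched.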