Iterating over table of divs using BeautifulSoup

Question

A div with class="tableBody" has many divs as children. I want to get all of its child divs and extract the string I have highlighted in the picture. The code I tried returns an empty list. I am trying to learn BS4 and would appreciate any help with the code.

Accepted Answer

The data you see on the page is loaded dynamically via JavaScript, so it is not present in the HTML of the initial page, which is why your selector finds nothing. You can use the requests module to replicate the request the page makes. For example:

    import requests
    from bs4 import BeautifulSoup

    url = 'https://www.ungm.org/Public/Notice/Search'

    # search form fields sent as JSON in the POST body
    payload = {
        "PageIndex": 0,
        "PageSize": 15,
        "Title": "",
        "Description": "",
        "Reference": "",
        "PublishedFrom": "",
        "PublishedTo": "12-Jul-2020",
        "DeadlineFrom": "12-Jul-2020",
        "DeadlineTo": "",
        "Countries": [],
        "Agencies": [],
        "UNSPSCs": [],
        "NoticeTypes": [],
        "SortField": "DatePublished",
        "SortAscending": False,
        "isPicker": False,
        "NoticeTASStatus": [],
        "IsSustainable": False,
        "NoticeDisplayType": None,
        "NoticeSearchTotalLabelId": "noticeSearchTotal",
        "TypeOfCompetitions": []
    }

    soup = BeautifulSoup(requests.post(url, json=payload).content, 'html.parser')

    for row in soup.select('.tableRow'):
        # text of every cell in this row
        cells = [cell.get_text(strip=True) for cell in row.select('.tableCell')]
        print(cells[1])  # notice title
        print('{:<30}{:<15}{:<15}{:<25}{:<45}{:<15}'.format(*cells[2:]))
        print('-'*80)

Prints:

    Supply and delivery of 78 smartphones
    13-Jul-2020 11:00 (GMT 2.00)  11-Jul-2020    FAO            Request for quotation    2020/FRMLW/FRMLW/106096                      Malawi
    --------------------------------------------------------------------------------
    Supply of LEGUMES SEEDS for rainfed season
    23-Jul-2020 14:00 (GMT 2.00)  11-Jul-2020    FAO            Invitation to bid        2020/FRMLW/FRMLW/106051                      Malawi
    --------------------------------------------------------------------------------
    Supply of MAIZE SEEDS for rainfed season
    22-Jul-2020 14:00 (GMT 2.00)  11-Jul-2020    FAO            Invitation to bid        2020/FRMLW/FRMLW/106050                      Malawi
    --------------------------------------------------------------------------------
    Procurement of Supply and Installation of Outdoor Metal Furniture for Rooftop Terrace at FAO Headquarters in Rome, Italy
    10-Aug-2020 12:00 (GMT 2.00)  11-Jul-2020    FAO            Invitation to bid        2020/CSAPC/CSDID/105286                      Italy
    --------------------------------------------------------------------------------
    Procurement of Silo for Emergency Project
    13-Jul-2020 13:00 (GMT 5.00)  11-Jul-2020    FAO            Invitation to bid        2020/FABGD/FABGD/106145                      Bangladesh
    --------------------------------------------------------------------------------
    Procurement of Concentrate Ruminant Feed
    13-Jul-2020 13:00 (GMT 5.00)  11-Jul-2020    FAO            Invitation to bid        2020/FABGD/FABGD/106064                      Bangladesh
    --------------------------------------------------------------------------------
    Purchase of Waste Collection Vehicles - (Two Tractors)
    22-Jul-2020 06:30 (GMT 0.00)  11-Jul-2020    UNOPS          Request for quotation    RFQ/2020/15298                               Sri Lanka
    --------------------------------------------------------------------------------
    Procurement of Laboratory Equipment and Material
    24-Jul-2020 22:23 (GMT -1.00) 11-Jul-2020    FAO            Invitation to bid        2020/FRGAM/FRGAM/106143                      Gambia
    --------------------------------------------------------------------------------
    Compra de chalecos para promotores comunitarios para la Oficina de Unicef Bolivar - LRFQ-2020-9159352
    16-Jul-2020 23:59 (GMT -3.00) 11-Jul-2020    UNICEF         Request for proposal     LRFQ-2020-9159352                            Venezuela
    --------------------------------------------------------------------------------
    Call for Proposals Quality Based Fixed Budget (CFPFB):
    26-Jul-2020 17:00 (GMT 3.00)  11-Jul-2020    UNDP           Request for proposal     UNDP-SYR-RPA-051-20                          Syrian Arab Republic
    --------------------------------------------------------------------------------
    Innovation and Design Specialist
    27-Jul-2020 00:00 (GMT -5.00) 11-Jul-2020    UNDP           Not set                  Innovation and Design Specialist             Turkey
    --------------------------------------------------------------------------------
    (RFI) from national and/or international CSOs/NGOs for potential partnership with UNDP and its pooled funding mechanism, the Darfur Community Peace and Stability Fund (DCPSF),
    26-Jul-2020 08:00 (GMT -7.00) 11-Jul-2020    UNDP           Request for information  RFI-SDN-20-002                               Sudan
    --------------------------------------------------------------------------------
    IRAQ-LRPS-017-2020-9159660 Rehabilitation of 3 water projects at Avrek, Grey Basi and Sarsenk in Duhok
    26-Jul-2020 12:00 (GMT 3.00)  11-Jul-2020    UNICEF         Request for proposal     9159660                                      Iraq
    --------------------------------------------------------------------------------
    106142 INVITACIÓN A COTIZAR PARA LA ADQUISICIÓN DE FERTILIZANTES, HERRAMIENTAS Y MATERIALES PARA ECA DE CACAO
    21-Jul-2020 22:00 (GMT -5.00) 10-Jul-2020    FAO            Request for quotation    2020/FLCOL/FLCOL/106142                      Colombia
    --------------------------------------------------------------------------------
    Achat de tablettes, de GPS et batteries rechargeable (206 tablettes, 68 GPS, et 181 pack chargeurs et batteries rechargeables) à livrer sur  Dakar
    28-Jul-2020 12:00 (GMT 0.00)  10-Jul-2020    FAO            Invitation to bid        2020/FRSEN/FRSEN/106093                      United Kingdom
    --------------------------------------------------------------------------------

EDIT: To get all pages, keep only the rows for the country 'Afghanistan', and save them to CSV, you can use this example:

    import csv
    import requests
    from bs4 import BeautifulSoup

    url = 'https://www.ungm.org/Public/Notice/Search'

    payload = {
        "PageIndex": 0,
        "PageSize": 15,
        "Title": "",
        "Description": "",
        "Reference": "",
        "PublishedFrom": "",
        "PublishedTo": "12-Jul-2020",
        "DeadlineFrom": "12-Jul-2020",
        "DeadlineTo": "",
        "Countries": [],
        "Agencies": [],
        "UNSPSCs": [],
        "NoticeTypes": [],
        "SortField": "DatePublished",
        "SortAscending": False,
        "isPicker": False,
        "NoticeTASStatus": [],
        "IsSustainable": False,
        "NoticeDisplayType": None,
        "NoticeSearchTotalLabelId": "noticeSearchTotal",
        "TypeOfCompetitions": []
    }

    page, all_data = 0, []
    while True:
        print('Page {}...'.format(page))
        payload['PageIndex'] = page
        soup = BeautifulSoup(requests.post(url, json=payload).content, 'html.parser')

        rows = soup.select('.tableRow')
        if not rows:
            # an empty page means there are no more results
            break

        for row in rows:
            cells = [cell.get_text(strip=True) for cell in row.select('.tableCell')]
            print(cells[1])
            print('{:<30}{:<15}{:<15}{:<25}{:<45}{:<15}'.format(*cells[2:]))
            print('-'*80)

            # we are only interested in Afghanistan:
            if 'afghanistan' in cells[7].lower():
                all_data.append([row['data-noticeid'], *cells[1:]])

        page += 1

    # write to csv file:
    with open('data.csv', 'w', newline='') as csvfile:
        csv_writer = csv.writer(csvfile, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
        for row in all_data:
            csv_writer.writerow(row)

This saves the matching rows to data.csv.

[Screenshot: data.csv opened in LibreOffice]
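If you want to try the paging and filtering logic without hitting the site, here is a minimal sketch that separates it from the network call. The helper names (collect_rows, filter_by_country, to_csv, fake_fetch) and the sample rows are made up for illustration and are not part of the site or of any library. In real use, fetch_page would wrap the requests.post and BeautifulSoup calls from the example above.

```python
import csv
import io
from typing import Callable, List

Row = List[str]


def collect_rows(fetch_page: Callable[[int], List[Row]], max_pages: int = 1000) -> List[Row]:
    """Request pages 0, 1, 2, ... and stop at the first empty page."""
    all_rows: List[Row] = []
    for page in range(max_pages):  # max_pages is a safety cap (assumption)
        rows = fetch_page(page)
        if not rows:
            break
        all_rows.extend(rows)
    return all_rows


def filter_by_country(rows: List[Row], country: str, column: int = -1) -> List[Row]:
    """Keep rows whose country cell contains `country`, ignoring case."""
    needle = country.lower()
    return [row for row in rows if needle in row[column].lower()]


def to_csv(rows: List[Row]) -> str:
    """Serialize rows to CSV text with the default csv dialect."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


def fake_fetch(page: int) -> List[Row]:
    """Hypothetical stand-in for the real request: two pages of made-up rows."""
    pages = [
        [["Notice A", "Malawi"], ["Notice B", "Afghanistan"]],
        [["Notice C", "Afghanistan"]],
    ]
    return pages[page] if page < len(pages) else []


rows = collect_rows(fake_fetch)
afghan = filter_by_country(rows, "afghanistan")
print(to_csv(afghan))
```

Passing the fetch function in as a parameter keeps the stop-on-empty-page loop testable on its own. The max_pages cap is an extra guard in case the server never returns an empty page.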
