I wrote this code to extract multiple pages of data from this site (base URL: https://www.goodreads.com/shelf/show/fiction).
```python
import requests
from bs4 import BeautifulSoup
import pandas as pd

page = 1
book_title = []

while page != 5:
    url = 'https://www.goodreads.com/shelf/show/fiction?page={page}'
    response = requests.get(url)
    page_content = response.text
    doc = BeautifulSoup(page_content, 'html.parser')
    a_tags = doc.find_all('a', {'class': 'bookTitle'})
    for tag in a_tags:
        book_title.append(tag.text)
    page = page + 1
```
But it's only showing the first 50 books' data. How can I extract the names of all fiction books across all pages using BeautifulSoup?
Answer
You can paginate through the fiction category via the site's search instead of the shelf URL: enter the "fiction" keyword in the search box and click the search button, which takes you to a URL like https://www.goodreads.com/search?q=fiction&qid=ydDLZMCwDJ. From there you can collect the data and build the URLs for the next pages.
```python
import requests
from bs4 import BeautifulSoup
import pandas as pd

book_title = []
url = 'https://www.goodreads.com/search?page={page}&q=fiction&qid=ydDLZMCwDJ&tab=books'

for page in range(1, 11):
    response = requests.get(url.format(page=page))
    page_content = response.text
    doc = BeautifulSoup(page_content, 'html.parser')
    a_tags = doc.find_all('a', {'class': 'bookTitle'})
    for tag in a_tags:
        book_title.append(tag.get_text(strip=True))

df = pd.DataFrame(book_title, columns=['Title'])
print(df)
```
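Note why your original loop only ever returned the first 50 books: the `'{page}'` placeholder in a plain string literal is never substituted, so every iteration requested the same URL. A minimal sketch of the fix, keeping your shelf URL (no request is made here, just URL construction):

```python
# In a plain string, '{page}' is literal text; the original loop
# therefore fetched the identical URL on every pass.
template = 'https://www.goodreads.com/shelf/show/fiction?page={page}'

# str.format (or an f-string) substitutes the page number,
# producing a distinct URL per iteration.
urls = [template.format(page=p) for p in range(1, 5)]
print(urls[0])   # ends with page=1
print(urls[-1])  # ends with page=4
```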
Output:
```
                                                 Title
0     Trigger Warning: Short Fictions and Disturbances
1    You Are Not So Smart: Why You Have Too Many Fr...
2       Smoke and Mirrors: Short Fiction and Illusions
3           Fragile Things: Short Fictions and Wonders
4                                   Collected Fictions
..                                                 ...
195  The Science Fiction Hall of Fame, Volume One, ...
196  The Art of Fiction: Notes on Craft for Young W...
197  Invisible Planets: Contemporary Chinese Scienc...
198                                  How Fiction Works
199  Monster, She Wrote: The Women Who Pioneered Ho...

[200 rows x 1 columns]
```
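Since the same title can appear on more than one results page, you may also want to deduplicate before saving. A small sketch using hypothetical sample titles in place of the scraped list:

```python
import pandas as pd

# Hypothetical scraped titles; note the duplicate entry.
book_title = ['Collected Fictions', 'How Fiction Works', 'Collected Fictions']

df = pd.DataFrame(book_title, columns=['Title'])
df = df.drop_duplicates().reset_index(drop=True)  # keep first occurrence only
df.to_csv('fiction_books.csv', index=False)       # persist the cleaned results
print(df)
```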