I’m scraping a news article; the link is in the code below.
I want to get the “13” string inside the element with the class comment__counter total_comment_share. That string is visible in Inspect Element, and you can check it yourself at that link. But when I call find() and print the result, the string is missing, so I can’t scrape it. This is my code:
import requests
from bs4 import BeautifulSoup

a = 'https://tekno.kompas.com/read/2020/11/12/08030087/youtube-down-pagi-ini-tidak-bisa-memutar-video'
b = requests.get(a)
c = b.content
d = BeautifulSoup(c, 'html.parser')
# match each div on its class attribute
e = d.find('div', {'class': 'social--inline eee'})
f = d.find('div', {'class': 'comment__read__text'})
print(f)
To make it clearer, my code calls find() on the comment__read__text class: I can find the elements, but not the “13” string. The result is the same if I call find() on the comment__counter total_comment_share class. This is the output from the code above:
<div class="comment__read__text"> <a href="http://tekno.kompas.com/komentar/2020/11/12/08030087/youtube-down-pagi-ini-tidak-bisa-memutar-video">Komentar <div class="comment__counter total_comment_share"></div></a> </div>
As you can see, the “13” string is not there. Does anyone know why? Any help would be appreciated.
Answer
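It’s because the comment count is not part of the initial HTML: the page makes a separate request while it loads and renders that content dynamically, so requests.get() on the article URL only sees the empty div. Request the comment API directly instead: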
import requests

a = 'https://tekno.kompas.com/read/2020/11/12/08030087/youtube-down-pagi-ini-tidak-bisa-memutar-video'
# ask the comment API for this article; limit=1 is enough to read the total
b = requests.get('https://apis.kompas.com/api/comment/list?urlpage={}&json&limit=1'.format(a))
c = b.json()
f = c["result"]["total"]
print(f)
PS: if you’re interested in scraping all the comments, just change limit to 100000, which will get you all the comments in one request as JSON.
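If you want something a bit more defensive, here is a minimal sketch of the same idea split into small functions. The endpoint and the result/total keys come from the snippet above; the names API_TEMPLATE, build_comment_api_url, extract_total and fetch_comment_total are hypothetical helpers made up for illustration, and the timeout and missing-key handling are assumptions about what you might want, not anything the API documents.

```python
# Endpoint taken from the answer above; the helper names are illustrative only.
API_TEMPLATE = "https://apis.kompas.com/api/comment/list?urlpage={}&json&limit={}"


def build_comment_api_url(article_url, limit=1):
    """Build the comment API URL for one article."""
    return API_TEMPLATE.format(article_url, limit)


def extract_total(payload):
    """Return payload["result"]["total"], or None if either key is missing."""
    result = payload.get("result") or {}
    return result.get("total")


def fetch_comment_total(article_url, timeout=10):
    """Fetch the comment count for an article (performs a network request)."""
    # Imported here so the two helpers above work without requests installed.
    import requests

    response = requests.get(build_comment_api_url(article_url), timeout=timeout)
    response.raise_for_status()  # fail loudly on HTTP errors
    return extract_total(response.json())


if __name__ == "__main__":
    url = ("https://tekno.kompas.com/read/2020/11/12/08030087/"
           "youtube-down-pagi-ini-tidak-bisa-memutar-video")
    print(fetch_comment_total(url))
```

Keeping the parsing in extract_total means you get None instead of a KeyError if the response ever comes back without a total.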