BeautifulSoup: how to extract only the paragraph text from this page?

I am unable to get the text inside the p tags. I want the text of all the p tags. I have tried this so far, but I am unable to extract the text.

import requests
from bs4 import BeautifulSoup

link = 'https://trumpwhitehouse.archives.gov/briefings-statements/remarks-president-trump-farewell-address-nation/'

page = requests.get(link)
soup = BeautifulSoup(page.content, 'lxml')
article = soup.findAll('p')
print(article)

I am getting many p tags in my output. How do I remove those tags? Here is my output:

 <p>The White House</p>, <p>THE PRESIDENT: My fellow Americans:
    and Four years ago, we launched a.<p>, and to restore the allegiance this government to its citizens. In short, we embarked on a to make America all Americans.</p>, <p>As I conclude my term asthe 45th
of the United States, I  — and so much more.</p>, <p>This week, and pray for its our best wishes, and we also want  — a very important word.</p>


Answer

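import requests
from bs4 import BeautifulSoup

res = requests.get("https://trumpwhitehouse.archives.gov/briefings-statements/remarks-president-trump-farewell-address-nation/")
soup = BeautifulSoup(res.text, "html.parser")
# Limit the search to the main content div, then collect its <p> tags
data = soup.find("div", class_="page-content").find_all("p")
for d in data:
    # get_text() returns the text without the surrounding tags
    print(d.get_text())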

Output:

The White House
THE PRESIDENT: My fellow Americans: Four years ago, we launched a great national effort to rebuild our country, to renew its spirit, and to restore the allegiance of this government to its citizens. In short, we embarked on a mission to make America great again — for all Americans.
....
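
If you want to try the same idea without hitting the site, here is a minimal sketch that runs on an inline HTML string. The SAMPLE_HTML markup and the extract_paragraphs helper are made up for illustration (the real page's markup may differ). Only the "page-content" class name comes from the answer above. It also skips empty paragraphs and collapses stray whitespace, which is an extra step not in the original answer.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page; the real markup may differ.
SAMPLE_HTML = """
<html><body>
  <div class="header"><p>Site navigation</p></div>
  <div class="page-content">
    <p>The White House</p>
    <p>THE PRESIDENT: My   fellow
       Americans:</p>
    <p></p>
  </div>
</body></html>
"""


def extract_paragraphs(html, container_class="page-content"):
    """Return the text of each non-empty <p> inside the container div."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", class_=container_class)
    if container is None:
        # No matching div: return nothing instead of raising AttributeError.
        return []
    paragraphs = []
    for p in container.find_all("p"):
        # get_text() drops the tags; split/join collapses runs of whitespace.
        text = " ".join(p.get_text().split())
        if text:
            paragraphs.append(text)
    return paragraphs


paragraphs = extract_paragraphs(SAMPLE_HTML)
print("\n".join(paragraphs))
```

Printing the result of find_all directly shows the tags because it is a list of Tag objects. Calling get_text() on each one is what gives you plain strings. The None check is there because soup.find returns None when the div is missing, and chaining .find_all on that would fail.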
User contributions licensed under: CC BY-SA