
Page scraping using Beautiful Soup, without links

I am using the following code to extract text from a web page:


The problem is that when I open the extracted text, I get all the links from the buttons at the top of the page, which I don’t want. How can I modify the above code to exclude them?

I also get the footnotes, which I may want, but as separate text. Is there a way to separate the footnotes from the main text?

Thanks


Answer

If you want to extract all the text, you can use the get_text() method.
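As a rough sketch of how that could look, the snippet below parses a small made-up HTML string instead of a real page. The sample markup, the nav element holding the top links, and the "footnotes" class name are all assumptions for illustration; the real page will need its own selectors. It drops the navigation links with decompose(), pulls the footnotes out with extract(), and then calls get_text() on what is left.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page; in practice the HTML would come from the downloaded page.
html = """
<html><body>
  <nav><a href="/home">Home</a> <a href="/about">About</a></nav>
  <p>Main text of the page.</p>
  <div class="footnotes"><p>1. A footnote.</p></div>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Remove the navigation block so its links do not end up in the extracted text.
# Assumes the unwanted links live inside a <nav> element.
for nav in soup.find_all("nav"):
    nav.decompose()

# Detach the footnotes from the tree and keep their text separately.
# The "footnotes" class name is an assumption about the page markup.
footnotes_text = ""
footnotes = soup.find("div", class_="footnotes")
if footnotes is not None:
    footnotes_text = footnotes.extract().get_text(separator=" ", strip=True)

# Whatever remains in the tree is treated as the main text.
main_text = soup.get_text(separator=" ", strip=True)

print(main_text)
print(footnotes_text)
```

If the links on the real page are not grouped in one container, the same idea should work by looping over the individual tags that hold them and calling decompose() on each.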


To save the result as a text file, you can use a pandas DataFrame.
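One possible way to do that is sketched below. The sample lines, the "text" column name, and the output file name page_text.txt are placeholders, not part of the original code. The text is put into a one-column DataFrame and written out with to_csv(), with the index and header turned off so only the text lines land in the file.

```python
import os
import tempfile

import pandas as pd

# Hypothetical extracted text, one entry per line of output.
lines = ["First paragraph of text.", "Second paragraph of text."]

# One row per line of text; the column name is arbitrary.
df = pd.DataFrame({"text": lines})

# Write to a temporary directory here; any writable path would do.
out_path = os.path.join(tempfile.gettempdir(), "page_text.txt")

# index=False and header=False keep row numbers and the column name out of the file.
df.to_csv(out_path, index=False, header=False)
```

Note that to_csv() applies CSV quoting, so lines containing commas or double quotes will be wrapped in quotes. If that is a problem, writing the string directly with open(out_path, "w") avoids it.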


#import
