
How to remove unwanted text when retrieving the title of a page using Python

Hi all, I have written a Python program to retrieve the title of a page. It works fine, but for some pages it also returns some unwanted text. How can I avoid that?

Here is my program:

JavaScript

Here is my output:

JavaScript

Instead of this, I expect to receive only this line:

JavaScript

Please help me with some ideas. All other websites work fine; only some websites give this problem.


Answer

Your problem is that you’re finding all the occurrences of “title” in the page. Beautiful Soup has a title attribute specifically for what you’re trying to do. Here’s your modified code:

JavaScript
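For illustration, here is a minimal sketch of the difference. It assumes the original program collected every title tag with find_all("title"), and that the extra text came from a second title element elsewhere in the page (for example inside an inline SVG icon). The sample HTML and the function names all_title_texts and page_title are made up for this example; they are not from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical page: the inline SVG icon carries its own <title> element.
SAMPLE_HTML = """
<html>
  <head><title>Example Page</title></head>
  <body>
    <svg><title>Menu icon</title></svg>
  </body>
</html>
"""


def all_title_texts(html):
    """Return the text of every <title> tag, which is what find_all gives you."""
    soup = BeautifulSoup(html, "html.parser")
    return [tag.get_text(strip=True) for tag in soup.find_all("title")]


def page_title(html):
    """Return only the first <title> in the document, or None if there is none."""
    soup = BeautifulSoup(html, "html.parser")
    if soup.title is None:
        return None
    return soup.title.get_text(strip=True)


if __name__ == "__main__":
    print(all_title_texts(SAMPLE_HTML))  # page title plus the SVG title
    print(page_title(SAMPLE_HTML))       # page title only
```

soup.title is shorthand for the first title tag in the document, so it skips any later ones. Checking for None first avoids an AttributeError on pages that have no title at all.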
User contributions licensed under: CC BY-SA