Python found URL is invalid

Hi there, I have the following problem:

I extracted a list of URLs from a .txt file with Python using this:

JavaScript

For some files, the output contains the following:

JavaScript

The problem is:

As you can see, it printed “#038;”. I think that translates to “&”, but there is already an “&” in front of it, and if I follow the link, it is invalid.

However, if I delete every “#038;”, the link works just fine.

How can I print the URLs without “#038;” in them, so that the links work?

Thanks so much

Answer

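Looks like an HTML entity encoding issue rather than a problem with the URL itself: “&#038;” is the HTML numeric character reference for “&”, so the “&” you see in front of “#038;” is part of the same escape sequence. Since you are only printing, you can use the string replace function to strip “#038;”.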
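The original snippet is not reproduced here, so below is a minimal sketch of two ways this could be done. The function names and the example.com URL are made up for illustration. The first variant is the plain string replace suggested above; the second uses `html.unescape` from the standard library, which decodes any HTML character reference, not just “&#038;”.

```python
import html


def strip_entity(url):
    """Remove the literal '#038;' so that '&#038;' collapses to '&'."""
    return url.replace("#038;", "")


def decode_entities(url):
    """Decode HTML character references such as '&#038;' or '&amp;'."""
    return html.unescape(url)


# Hypothetical URL in the shape described in the question.
raw = "https://example.com/page?a=1&#038;b=2"

print(strip_entity(raw))
print(decode_entities(raw))
```

The replace version only handles this one escape. `html.unescape` is probably the safer choice if the file might contain other entities as well, assuming the URLs were copied out of HTML in the first place.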
