
Python: fix French accents parsed as =C3=A9

In Python I'm stuck with a couple of French strings whose accented characters I can't convert back to normal, e.g.:

word1 = 'install=C3=A9' # should be installé
word2 = 'transf=E9r=E9' # should be transféré
word3 = 'bient=C3=B4t'  # should be bientôt

Most of the documentation I read says to open the file with encoding='utf-8' or similar, but here I'm stuck with the actual strings. Is there a way to decode the strings, or should I build a giant chain of .replace() calls?


Answer

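The encoding appears to be quoted-printable, the MIME transfer encoding in which each =XX stands for one byte written in hexadecimal. The standard-library quopri module can decode it: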

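import quopri

word1 = 'install=C3=A9'
# decodestring() returns bytes: b'install\xc3\xa9'
byte_string = quopri.decodestring(word1)
# Interpret those bytes as UTF-8 to get the text back
string = byte_string.decode('utf-8')
print(string)  # installé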

The function is designed to work on bytes (it also accepts a str as long as it contains only ASCII characters), so it is cleaner to declare the words as bytes:

word1 = b'install=C3=A9'
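
Note that the second string in the question, 'transf=E9r=E9', will not decode as UTF-8: a lone byte E9 is how Latin-1 (ISO-8859-1) encodes é. Below is a minimal sketch of one way to handle both cases, assuming the only encodings involved are UTF-8 and Latin-1. The helper name decode_qp and its encodings parameter are hypothetical, not part of quopri.

```python
import quopri


def decode_qp(text, encodings=("utf-8", "latin-1")):
    """Decode a quoted-printable str or bytes value to text.

    Hypothetical helper: tries each candidate encoding in order and
    returns the first result that decodes without error.
    """
    # Work on bytes; quoted-printable input is plain ASCII by definition.
    data = text.encode("ascii") if isinstance(text, str) else text
    raw = quopri.decodestring(data)
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings matched: %r" % (encodings,))


for word in ("install=C3=A9", "transf=E9r=E9", "bient=C3=B4t"):
    print(decode_qp(word))
```

UTF-8 is tried first because Latin-1 accepts any byte sequence; with the order reversed, 'install=C3=A9' would come out as 'installÃ©'.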
User contributions licensed under: CC BY-SA