In Python, I'm stuck with a couple of French strings whose accented characters are encoded, and I can't convert them back to normal text, e.g.:
word1 = 'install=C3=A9'  # should be installé
word2 = 'transf=E9r=E9'  # should be transféré
word3 = 'bient=C3=B4t'  # should be bientôt
Most of the documentation I have read says to open the files with encoding='utf-8' or similar, but here I'm stuck with actual strings. Is there a way to decode these strings, or should I build one giant chain of .replace() calls?
Answer
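The encoding appears to be quoted-printable, where each =XX sequence represents one byte as two hexadecimal digits. Python's standard quopri module can decode it: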
import quopri

word1 = 'install=C3=A9'
byte_string = quopri.decodestring(word1)  # b'install\xc3\xa9'
string = byte_string.decode('utf-8')  # interpret the raw bytes as UTF-8
print(string)  # installé
Strictly speaking, quopri.decodestring() is documented as taking bytes, so it is cleaner to declare the words as bytes literals:
word1 = b'install=C3=A9'
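One thing to watch for: the second word in the question, 'transf=E9r=E9', is not UTF-8. A lone byte 0xE9 is "é" in Latin-1 (ISO-8859-1), and decoding it as UTF-8 raises UnicodeDecodeError. Below is a minimal sketch that handles all three words by trying UTF-8 first and falling back to Latin-1. The helper name decode_qp and the fallback order are assumptions made for illustration, not part of quopri; if the data really comes from another single-byte code page such as cp1252, the fallback would need adjusting.

```python
import quopri


def decode_qp(text, encodings=("utf-8", "latin-1")):
    """Decode a quoted-printable str or bytes value to text.

    Hypothetical helper: tries each encoding in order and returns the
    first result that decodes without error.
    """
    # Quoted-printable input is plain ASCII, so a str can be encoded safely.
    raw = text.encode("ascii") if isinstance(text, str) else text
    data = quopri.decodestring(raw)
    for encoding in encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Only reachable with a custom encodings tuple: latin-1 accepts any byte.
    raise ValueError(f"could not decode {text!r} with {encodings}")


words = ["install=C3=A9", "transf=E9r=E9", "bient=C3=B4t"]
decoded = [decode_qp(word) for word in words]
print(decoded)  # ['installé', 'transféré', 'bientôt']
```

Trying UTF-8 first matters because Latin-1 never fails: it maps every byte to some character, so using it first would silently turn 'install=C3=A9' into 'installÃ©'.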