Python | Excel csv File Unicode Issue

I have a Python script that extracts users’ data from a Telegram group.
Here is the code:


After extracting the members, when I open the members--.csv file I see problems with Unicode characters.
How can I fix this issue?
I am using Excel 2016.


Answer

The problem is not your code, it’s Excel. When Excel opens a CSV file it assumes the default encoding for your Windows locale, and that encoding is typically not UTF-8. It is one of the many legacy code pages that were devised before Unicode came about, so the UTF-8 bytes your script wrote get interpreted as the wrong characters.
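The effect can be reproduced in Python by encoding text as UTF-8 and decoding it with a legacy code page. This is only a sketch: cp1252 is assumed here as the Windows default (it is common for Western European locales, but yours may differ), and the sample string is made up.

```python
# Hypothetical sample text; cp1252 is an assumed Windows default code page.
text = "café"

utf8_bytes = text.encode("utf-8")      # the bytes a UTF-8 CSV would contain
misread = utf8_bytes.decode("cp1252")  # how a cp1252 reader interprets them

print(misread)  # cafÃ©
```

The two bytes that encode "é" in UTF-8 are each shown as a separate character, which is the kind of garbling you see in the spreadsheet.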

If you use the text import wizard, there’s an option to select the text encoding, and you can choose UTF-8 there if you want. But that’s a pain to do every time you need to open a CSV.

There’s a way to make Excel recognize that the file is UTF-8 encoded and use that encoding automatically; many Microsoft products use the same trick. If the file starts with a Unicode Byte Order Mark (BOM), U+FEFF, encoded in UTF-8 (the 3-byte sequence 0xEF, 0xBB, 0xBF), Excel will recognize that the file is UTF-8 encoded and override its default. Python will automatically start your file with this BOM sequence if you open it for writing with the special encoding 'utf_8_sig'.
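A minimal sketch of this, assuming the script writes its rows with the standard csv module. The file name, column names and row values below are placeholders and are not taken from the original script.

```python
import codecs
import csv

# Hypothetical rows standing in for the extracted members.
rows = [["username", "name"], ["user1", "Zoë"], ["user2", "علی"]]

def write_members(path, rows):
    # 'utf-8-sig' is an alias of 'utf_8_sig'; Python writes the BOM first.
    # newline="" is what the csv module documentation recommends for writers.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        csv.writer(f).writerows(rows)

write_members("members_example.csv", rows)

# Check that the file really starts with the 3-byte UTF-8 BOM.
with open("members_example.csv", "rb") as f:
    starts_with_bom = f.read(3) == codecs.BOM_UTF8

print(starts_with_bom)  # True
```

If you later read such a file back in Python, open it with the same encoding so the BOM is stripped instead of ending up in the first field.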


Putting this special signature at the beginning of every file is not recommended; use it only when you know the file will be consumed by an application that requires it.

User contributions licensed under: CC BY-SA