I am trying to convert a huge CSV file from UTF-16 to UTF-8 using Python. Below is the code:

    with open(r'D:_appsaaaoutputsrcfile', 'rb') as source_file:
        with open(r'D:_appsaaaoutputdestfile', 'w+b') as dest_file:
            contents = source_file.read()
            dest_file.write(contents.decode('utf-16').encode('utf-8'))
But this code reads the whole file into memory at once and fails with a MemoryError. Please help me with an alternative method.
Answer
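One option is to convert the file piece by piece instead of all at once. Iterating over the raw bytes line by line is not reliable for UTF-16: a newline is encoded as two bytes, so a binary-mode split lands in the middle of a character, and only the start of the file carries the BOM. Reading fixed-size chunks through an incremental decoder avoids both problems:

    import codecs

    decoder = codecs.getincrementaldecoder('utf-16')()
    with open(r'D:_appsaaaoutputsrcfile', 'rb') as source_file, open(r'D:_appsaaaoutputdestfile', 'w+b') as dest_file:
        # Read 1 MiB at a time; the decoder buffers any incomplete character between chunks.
        for chunk in iter(lambda: source_file.read(1024 * 1024), b''):
            dest_file.write(decoder.decode(chunk).encode('utf-8'))
        # Flush the decoder; this raises if the file ends in the middle of a character.
        dest_file.write(decoder.decode(b'', final=True).encode('utf-8'))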
Or you could open both files in text mode with the desired encodings and let Python decode and encode as it goes:
    with open(r'D:_appsaaaoutputsrcfile', 'r', encoding='utf-16') as source_file, open(r'D:_appsaaaoutputdestfile', 'w+', encoding='utf-8') as dest_file:
        for line in source_file:
            dest_file.write(line)
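The text-mode approach can be wrapped in a small helper. What follows is only a sketch under a few assumptions: the function name `convert_utf16_to_utf8` and the `chunk_chars` parameter are made up for illustration, and it reads fixed-size chunks of characters rather than lines, so memory stays bounded even if the CSV contains extremely long lines. Passing `newline=''` turns off newline translation, so the original line endings should be written out unchanged.

```python
def convert_utf16_to_utf8(src_path, dst_path, chunk_chars=1024 * 1024):
    """Stream a UTF-16 text file into a UTF-8 file in bounded-size chunks.

    The name and the chunk_chars parameter are hypothetical, chosen for this sketch.
    """
    # newline='' disables newline translation on both sides,
    # so '\r\n' in the source stays '\r\n' in the output.
    with open(src_path, 'r', encoding='utf-16', newline='') as source_file, \
            open(dst_path, 'w', encoding='utf-8', newline='') as dest_file:
        while True:
            # read() in text mode counts characters, not bytes,
            # so a chunk never ends in the middle of a character.
            chunk = source_file.read(chunk_chars)
            if not chunk:
                break
            dest_file.write(chunk)
```

It would be called as `convert_utf16_to_utf8(src, dst)` with your own source and destination paths. A smaller `chunk_chars` lowers peak memory at the cost of more read calls.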