Skip to content
Advertisement

Faster alternative to Python’s zipfile module?

Is there a noticeably faster alternative to Python 2.7.4 zipfile module (with ZIP_DEFLATED) for zipping a large number of files into a single zip file? I had a look at czipfile https://pypi.python.org/pypi/czipfile/1.0.0, but that appears to be focused on faster decrypting (not compressing).

I am routinely having to process a large number of image files (~12,000 files of a combination of .exr and .tiff files) with each file between ~1MB – 6MB in size (and ~9 GB for all the files) into a single zip file for shipment. This zipping takes ~90 minutes to process (running on Windows 7 64bit).

If anyone can recommend a different python module (or alternatively a C/C++ library or even a standalone tool) that would be able to compress a large number of files into a single .zip file in less time than the zipfile module, that would be greatly appreciated (anything close to ~5-10% faster (or more) would be very helpful).

Advertisement

Answer

As Patashu mentions, outsourcing to 7-zip might be the best idea.

Here’s some sample code to get you started:

import os
import subprocess

path_7zip = r"C:Program Files7-Zip7z.exe"
path_working = r"C:temp"
outfile_name = "compressed.zip"
os.chdir(path_working)

ret = subprocess.check_output([path_7zip, "a", "-tzip", outfile_name, "*.txt", "*.py", "-pSECRET"])

As martineau mentioned you might experiment with compression methods. This page gives some examples on how to change the command line parameters.

User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement