Why does compression output a larger zip file?

I don't really understand Python compression. Whenever I try to compress a folder or a file, I end up with a much larger file, about 5.5 times bigger than the original. Why does this happen? Is there any way to compress a folder or a file with Python and get an output that is at most the size of the original? Here is the code I am using.

import os, zipfile

def zipfiles(filename, destname):
  try:
   zf = zipfile.ZipFile(destname, 'w', zipfile.ZIP_DEFLATED)
   for dirname, subdirs, files in os.walk(filename):
      zf.write(dirname)
      for filename in files:
          zf.write(os.path.join(dirname, filename))
   zf.close()
  except Exception, e:
   print str(e)
def main():
   x = raw_input('Enter Filename:   ')
   while len(x) == 0:
       x = raw_input('Enter Filename:   ')
   y = raw_input('Enter destination name:   ')
   while len(y) == 0:
       y = raw_input('Enter destination name:   ')
   zipfiles(x, y+'.zip')
main()

Answer

Make sure that the destination .zip file is not inside the folder you are compressing. Otherwise os.walk() will find the partially written archive and your script may add a copy of that file to itself, which can make the result much bigger than the original data.
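
One way to check whether this is what happened is to list what the oversized archive actually contains. The following is a minimal sketch, assuming Python 3; the helper name report_entries is hypothetical and not part of your code. It returns each entry's name, original size, and compressed size, so an entry named after the archive itself would stand out.

```python
import zipfile

def report_entries(archive_path):
    """Return (name, original size, compressed size) for each archive entry."""
    with zipfile.ZipFile(archive_path) as zf:
        return [(info.filename, info.file_size, info.compress_size)
                for info in zf.infolist()]
```

If the list includes the .zip file you were creating, the archive was picked up while it was still being written.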

Here's a revised version of your code that skips the archive when it is being created inside the folder being compressed:

import os, zipfile

def zipfiles(source_folder, destination_name):
    source_folder = os.path.abspath(source_folder)
    destination_path = os.path.abspath(destination_name)
    try:
        with zipfile.ZipFile(destination_name, 'w', zipfile.ZIP_DEFLATED) as zf:
            for dirname, subdirs, files in os.walk(source_folder):
                # zf.write(dirname)  # Not needed.
                for filename in files:
                    filepath = os.path.join(dirname, filename)
                    if filepath != destination_path:  # Skip file being created.
                        zf.write(filepath)
    except Exception as e:
        print(e)

def main():
    x = ''
    while not x:
        x = raw_input('Enter source folder name: ')
    y = ''
    while not y:
        y = raw_input('Enter destination archive file name: ')
    zipfiles(x, y+'.zip')

main()
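
Note that zf.write(filepath) with an absolute path stores the full directory path inside the archive. If you would rather store paths relative to the source folder, a variant along these lines may help. This is only a sketch, assuming Python 3; the function name zip_folder is made up for illustration, and error handling is left out on purpose.

```python
import os
import zipfile

def zip_folder(source_folder, destination_name):
    """Zip source_folder into destination_name, storing relative paths."""
    source_folder = os.path.abspath(source_folder)
    destination_path = os.path.abspath(destination_name)
    with zipfile.ZipFile(destination_path, 'w', zipfile.ZIP_DEFLATED) as zf:
        for dirname, subdirs, files in os.walk(source_folder):
            for filename in files:
                filepath = os.path.join(dirname, filename)
                if filepath == destination_path:
                    continue  # Skip the archive that is being written.
                # Store the path relative to source_folder, not the absolute path.
                zf.write(filepath, os.path.relpath(filepath, source_folder))
```

The second argument to zf.write() is the name the file gets inside the archive, so the extracted tree starts at the source folder's contents instead of at the filesystem root.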
User contributions licensed under: CC BY-SA