I have a zip file which was created on a Windows machine using System.IO.Compression.ZipFile (the archive contains many files and folders). I have Python code that runs on a Linux machine (a Raspberry Pi, to be exact) which has to unzip the archive and create all the necessary folders and files. I’m using Python 3.5.0 and the zipfile library; this is the sample code:
import zipfile
zip = zipfile.ZipFile("MyArchive.zip", "r")
zip.extractall()
zip.close()
Now when I run this code, instead of getting a nice unzipped directory tree, I get all the files in the root directory with weird names like Folder1\Folder2\MyFile.txt.
My assumption is that since the zip archive was created on Windows, and the directory separator on Windows is \ whereas on Linux it is /, the Python zipfile library treats \ as part of a file name instead of as a directory separator. Also note that when I extract this archive manually (not through Python code), all the folders are created as expected, so it seems that this is definitely a problem with the zipfile library. Another note is that for zip archives that were created with a different tool (not System.IO.Compression.ZipFile), it works OK using the same Python code.
Any insight on what’s going on and how to fix it?
Answer
What is happening is that while Windows recognizes both \ (path.sep) and / (path.altsep) as path separators, Linux only recognizes / (path.sep).
As @blhsing’s answer shows, the existing implementation of ZipFile always ensures that path.sep and / are considered valid separator characters. That means that on Linux, \ is treated as a literal part of the file name. To change that, you can set os.path.altsep to \, since ZipFile also converts path.altsep to path.sep during extraction whenever it is not None or empty.
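A minimal sketch of that idea, assuming the extraction code in zipfile reads os.path.altsep at call time (true of the CPython versions I have looked at, but it is an implementation detail and may change). The names backslash_as_altsep and extract_with_altsep are made up for this illustration. Because the attribute is global state, the sketch restores it as soon as extraction finishes:

```python
import contextlib
import os
import zipfile


@contextlib.contextmanager
def backslash_as_altsep():
    """Temporarily set os.path.altsep to a backslash (hypothetical helper)."""
    original = os.path.altsep
    # Global state: anything that reads os.path.altsep sees this value
    # until the context manager exits.
    os.path.altsep = "\\"
    try:
        yield
    finally:
        os.path.altsep = original


def extract_with_altsep(archive_path, dest_dir="."):
    """Extract the archive while backslashes count as an alternate separator."""
    with backslash_as_altsep(), zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest_dir)
```

Calling extract_with_altsep("MyArchive.zip") on Linux should then produce the Folder1/Folder2/MyFile.txt tree instead of a single flat file name.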
If you go down the road of modifying ZipFile itself, like the other answer suggests, just add a line to blindly change \ to path.sep, since / is always changed already anyway. That way, /, \ and possibly path.altsep will all be converted to path.sep. This is what the command line tool appears to be doing.
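If you would rather not touch the library or any global state, the same blind conversion can be applied to the member names just before extraction. This is a sketch of that variation, not the patch from the other answer; extract_normalizing_separators is a made-up name, and it assumes no file in the archive is meant to contain a literal backslash in its name:

```python
import zipfile


def extract_normalizing_separators(archive_path, dest_dir="."):
    """Extract an archive, treating backslashes in member names as separators."""
    with zipfile.ZipFile(archive_path) as zf:
        for info in zf.infolist():
            # Only the name used to build the output path is rewritten;
            # the member data is still read from the entry's recorded offset.
            info.filename = info.filename.replace("\\", "/")
            zf.extract(info, dest_dir)
```

Usage is a drop-in replacement for the extractall() call in the question: extract_normalizing_separators("MyArchive.zip").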