I was trying to reproduce the output of Node's pako gzip method in Python and ran into the following problem:
node:
const pako = require('pako');
const test = 'aaa';
var data = pako.gzip(test);
console.log(data);

Uint8Array(23) [ 31, 139, 8, 0, 0, 0, 0, 0, 0, 3, 75, 76, 76, 4, 0, 45, 115, 7, 240, 3, 0, 0, 0 ]
python:
import numpy
import gzip

test = 'aaa'
compressed_byte_pako = gzip.compress(test.encode('utf-8'))
compressed = numpy.frombuffer(compressed_byte_pako, dtype=numpy.uint8)
print(compressed)

[ 31 139 8 0 178 45 89 98 2 255 75 76 76 4 0 45 115 7 240 3 0 0 0]
Why do the two produce different output for the same input?
Answer
Although there is no guarantee that two gzip implementations will produce the same bytes, here they are compressing identically. The last 13 bytes of each output are the same: 5 bytes of compressed (deflate) data followed by the 8-byte trailer, which holds the CRC-32 and the length of the uncompressed data. Both will decompress to the original data ('aaa').
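This can be checked directly from the bytes printed in the question. The following is a minimal sketch using only the Python standard library; the variable names are chosen for illustration, and the byte lists are copied from the two outputs above.

```python
import gzip
import struct
import zlib

# Byte values copied from the two outputs in the question.
node_bytes = bytes([31, 139, 8, 0, 0, 0, 0, 0, 0, 3,
                    75, 76, 76, 4, 0, 45, 115, 7, 240, 3, 0, 0, 0])
python_bytes = bytes([31, 139, 8, 0, 178, 45, 89, 98, 2, 255,
                      75, 76, 76, 4, 0, 45, 115, 7, 240, 3, 0, 0, 0])

# Everything after the 10-byte header: deflate data plus the 8-byte trailer.
node_body = node_bytes[10:]
python_body = python_bytes[10:]
bodies_match = node_body == python_body

# The trailer is the CRC-32 and the length of the uncompressed data,
# each stored as a little-endian 32-bit integer.
crc, size = struct.unpack("<II", node_bytes[-8:])
trailer_ok = (crc, size) == (zlib.crc32(b"aaa"), 3)

# Both streams should decompress to the same original data.
node_text = gzip.decompress(node_bytes)
python_text = gzip.decompress(python_bytes)

print(bodies_match, trailer_ok, node_text, python_text)
```

Comparing the decompressed data, rather than the compressed bytes, is the more robust check when the two sides use different gzip implementations.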
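The first ten bytes of each are the gzip header, and that is where the two outputs differ. The pako header has a zero time stamp and no compression-level hint (extra-flags byte 0), and reports operating system 3 (Unix). The Python header carries a time stamp, sets the extra-flags byte to 2 (maximum compression), and reports operating system 255 (unknown).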
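To see exactly which header fields differ, the ten header bytes can be unpacked field by field. This is a minimal sketch assuming the fixed header layout from RFC 1952 (two ID bytes, compression method, flags, a 4-byte little-endian modification time, extra flags, and operating system); `GzipHeader` and `parse_gzip_header` are hypothetical names introduced here, not part of any library.

```python
import gzip
import struct
from collections import namedtuple

# Hypothetical helper type for the fixed 10-byte gzip header (RFC 1952).
GzipHeader = namedtuple("GzipHeader", "id1 id2 method flags mtime xfl os")

def parse_gzip_header(data):
    """Unpack the first 10 bytes of a gzip stream (little-endian, no padding)."""
    return GzipHeader(*struct.unpack("<BBBBIBB", data[:10]))

# Header bytes copied from the two outputs in the question.
node_header = parse_gzip_header(bytes([31, 139, 8, 0, 0, 0, 0, 0, 0, 3]))
python_header = parse_gzip_header(bytes([31, 139, 8, 0, 178, 45, 89, 98, 2, 255]))
print(node_header)
print(python_header)

# gzip.compress accepts mtime (Python 3.8+), which zeroes the time stamp
# so that repeated runs give the same bytes.
zeroed = gzip.compress(b"aaa", mtime=0)
zeroed_header = parse_gzip_header(zeroed)
print(zeroed_header)
```

Passing `mtime=0` removes the time stamp, but the extra-flags and operating-system bytes are written by the library and may still differ from pako's, so byte-for-byte equality of the header is not guaranteed.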