
Reading “a flat, binary array of 16-bit signed, little-endian (LSB) integers” from a file in Python

I’m trying to read an old file of snow data from here, but I’m having a lot of trouble just opening a single file and getting data out. The user guide says: “Each monthly binary data file with the file extension “.NSIDC8” contains a flat, binary array of 16-bit signed, little-endian (LSB) integers, 721 columns by 721 rows (row-major order, i.e. the top row of the array comprises the first 721 values in the file, etc.).” The data is 20 to 50 years old, so there’s not much coding documentation.

If I just open the file and run readlines, with this code:

import os

with open(os.path.join(folder, file), 'rb') as f:
    # contents = f.read()
    lines = f.readlines()

I get something looking like this: \x00P\x00@\x00\x19\x00\x13\x00C\x00F\x00\x11\x00r\x00:\x00.\x00\x02

If I use np.load(), the results are numbers like -6.85682214e+304.

I imagine I need to use the struct module and its unpack function, but I have no idea what format to use, and my attempts are not giving reasonable answers. For instance, I’ve tried reading just the first four bytes with '<i' as the format, as shown in the code below:

import struct

with open(os.path.join(folder, file), 'rb') as f:
    print(struct.unpack('<i', f.read(4)))

The print statement showed (-13041864,), which doesn’t make sense. Any insights would be greatly appreciated.


Answer

You can unpack the data 16 bits at a time by specifying this in your unpack format string. You’re using <i, which reads a 4-byte integer, but the data consists of 16-bit numbers, which take 2 bytes each. Instead, use <h.

For example,

import struct

# I chose a random file from their setup
with open("NL198303.v01.NSIDC8", "rb") as dfile:
    print(struct.unpack("<h", dfile.read(2)))
# prints (-200,); -200 is a "fixed value for corners" according to their docs

Here, h means “signed short”.
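To see where the -13041864 in the question probably came from, it may help to pack two 16-bit values and read them back as a single 32-bit integer. This is only a sketch using the struct module; the assumption that the first two values in the file are both -200 is inferred from the number in the question, not verified against the file.

```python
import struct

# Two little-endian signed 16-bit values, laid out as they would be in the file.
raw = struct.pack("<hh", -200, -200)

# Misreading those 4 bytes as one 32-bit signed integer ("<i").
as_int32 = struct.unpack("<i", raw)[0]

# Reading the same 4 bytes as two 16-bit signed integers ("<2h").
as_int16 = struct.unpack("<2h", raw)

print(as_int32)  # -13041864
print(as_int16)  # (-200, -200)
```

So the odd-looking value is consistent with two neighbouring -200 entries being merged into one number.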

I looked at several random locations in the file and only saw -200 and -250, corresponding to some sort of fixed boundary and ocean spots. Presumably there are other values somewhere, but I didn’t look.
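Rather than sampling random locations, the whole array can be read in one go and the distinct values counted. The following is a minimal sketch, not a tested reader for this dataset: `read_grid` and `demo.bin` are hypothetical names, the 721 by 721 shape is taken from the user guide quoted in the question, and the demo runs on a tiny synthetic file so that it works without the real data.

```python
import os
import struct
import tempfile
from collections import Counter

ROWS, COLS = 721, 721  # grid size quoted from the user guide


def read_grid(path, rows=ROWS, cols=COLS):
    """Return the file contents as a list of rows of Python ints."""
    with open(path, "rb") as f:
        raw = f.read()
    expected = rows * cols * 2  # 2 bytes per 16-bit value
    if len(raw) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(raw)}")
    # "<" = little-endian, "h" = signed 16-bit; the number repeats "h".
    flat = struct.unpack(f"<{rows * cols}h", raw)
    # Row-major order: the first `cols` values form the top row.
    return [list(flat[r * cols:(r + 1) * cols]) for r in range(rows)]


# Demo on a synthetic 2x3 file (made-up values, not real snow data).
demo_values = [-200, -250, 0, 12, -250, -200]
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.bin")  # hypothetical file name
    with open(demo_path, "wb") as f:
        f.write(struct.pack("<6h", *demo_values))
    grid = read_grid(demo_path, rows=2, cols=3)

# Count how often each value occurs across the whole grid.
counts = Counter(value for row in grid for value in row)
print(grid)
print(counts.most_common())
```

For a real file, read_grid("NL198303.v01.NSIDC8") should return 721 rows of 721 values, assuming the file is exactly 721 * 721 * 2 bytes long. If NumPy is available, np.fromfile(path, dtype="<i2").reshape(721, 721) is a shorter way to get the same data as an array.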

User contributions licensed under: CC BY-SA