I am trying to visualize a DICOM file with Python 3 and pyDicom which should contain a black 100×100 image with some curves drawn in it. The pixel data is extracted from tag (7fe0,0010) and, when printed, shows b'\x00\x00\x00...'. This I can easily convert to a 100×100 numpy array.
However, the curve data in (5000,3000) shows me b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xc0H@\x00\x00\x00\x00\x00\xc0X@\x00\x00\x00\x00\x00\xc0H@', which I am not able to convert to x,y coordinates in my 100×100 pixel image. In the DICOM file it says:
- curve dimensions: 2
- number of points: 2
- type of data: poly
- data value representation: 3
- curve label: horizontal axis
- curve data: 32 elements
The main question is: how do I decode the coordinates required for retracing the curve within my 100×100 image? My main concern is the fact that there should be 32 elements, but only 26 hex values in the output. Also, I have no clue how to deal with the \xc0H@ and \xc0X@. When I print those, it yields 192 72 64 and 192 88 64. How does Python decode these 2 hex codes to 6 numbers? And what do these numbers represent?
EDIT:
Apparently data value representation 3 means the data is represented as a floating point double. On the other hand, there should be two points in the data, so each point is represented by 16 elements? I don't see how these two statements are compatible. What is interesting is that the first \xc0H@ translates to 3 numbers as mentioned before, and by doing so completes the first 16 elements of the curve data. How can I convert this into a point in my 2D image?
Answer
Curve data has been retired in DICOM since 2004, so you will find the relevant information in the DICOM standard from 2004 (thanks to @kritzel_sw for the link).
As you already found out, a Data Value Representation of 3 means that the data entries are in double format, and with a Type of Data of polygon, you have x/y tuples in your data. As a double value is saved in 8 bytes, there are 16 bytes per point, so in your case (32 bytes of data) there are 2 points overall.
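The byte arithmetic can be checked directly on the byte string from the question. This is only a small sketch: the literal below is the printed curve data with its backslashes restored, and the variable names are made up for illustration.

```python
import struct

# Curve data copied from the question (tag 5000,3000), backslashes restored.
curve_data = (
    b"\x00\x00\x00\x00\x00\x00\x00\x00"
    b"\x00\x00\x00\x00\x00\xc0H@"
    b"\x00\x00\x00\x00\x00\xc0X@"
    b"\x00\x00\x00\x00\x00\xc0H@"
)

curve_dimensions = 2                     # "curve dimensions: 2" in the file
bytes_per_value = struct.calcsize("<d")  # size of one IEEE 754 double
bytes_per_point = curve_dimensions * bytes_per_value
number_of_points = len(curve_data) // bytes_per_point

print(len(curve_data), bytes_per_point, number_of_points)  # 32 16 2
```

The result agrees with the "number of points: 2" entry in the file.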
Pydicom does not (and probably will not) directly support the retired Curve module (though support for the Waveform module, the current equivalent, has been added in pydicom 2.1), so you have to decode the data yourself. You can do something like this (given double numbers):
    from struct import unpack
    from pydicom import dcmread

    ds = dcmread(filename)  # filename: path to your DICOM file
    data = ds[0x50003000].value
    # unpack('d') unpacks 8 bytes into a double
    numbers = [unpack('d', data[i:i+8])[0] for i in range(0, len(data), 8)]
    # I'm sure there is a nicer way for this...
    coords = [(numbers[i], numbers[i+1]) for i in range(0, len(numbers), 2)]
In your example data, this will return:
[(0.0, 49.5), (99.0, 49.5)]
i.e. the x/y coordinates (0.0, 49.5) and (99.0, 49.5), which correspond to a horizontal line in the middle of your image.
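To get from these coordinates to pixels, something along the following lines might work. It is a rough sketch using only the standard library, with several assumptions that are not confirmed by the file: the doubles are little-endian, x is the column and y is the row, the values are already in pixel units, and a nested list stands in for the 100×100 pixel array. A NumPy array can be indexed the same way with `image[row, col]`.

```python
import struct

# Curve data copied from the question (tag 5000,3000), backslashes restored.
curve_data = (
    b"\x00\x00\x00\x00\x00\x00\x00\x00"
    b"\x00\x00\x00\x00\x00\xc0H@"
    b"\x00\x00\x00\x00\x00\xc0X@"
    b"\x00\x00\x00\x00\x00\xc0H@"
)

# Assumption: little-endian doubles stored as interleaved (x, y) pairs.
points = list(struct.iter_unpack("<2d", curve_data))

# Stand-in for the 100x100 pixel array (all black).
width, height = 100, 100
image = [[0] * width for _ in range(height)]

# Draw each polyline segment by sampling along it and rounding to pixels.
for (x0, y0), (x1, y1) in zip(points, points[1:]):
    steps = max(int(max(abs(x1 - x0), abs(y1 - y0))), 1)
    for step in range(steps + 1):
        t = step / steps
        col = round(x0 + t * (x1 - x0))
        row = round(y0 + t * (y1 - y0))
        if 0 <= row < height and 0 <= col < width:
            image[row][col] = 255

print(points)
```

Since 49.5 lies exactly between two rows, the row that ends up being drawn depends on the rounding rule; Python's `round` picks row 50 here, and flooring would pick row 49 instead.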
As to the mismatch of 26 hex elements vs 32 bytes: a byte string representation uses hex escape notation only for the bytes that do not correspond to a printable ASCII character; the rest are shown as the ASCII characters themselves. So, for example, this part of your byte string: \x00\xc0H@ is 4 bytes long and could also be represented as \x00\xc0\x48\x40 in hex string notation.
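This is easy to verify in an interpreter. The snippet below takes the 4-byte fragment from above and shows it in a few different ways; the name `chunk` is just for illustration.

```python
chunk = b"\x00\xc0H@"

print(len(chunk))                     # 4
print(list(chunk))                    # [0, 192, 72, 64]
print(chunk.hex())                    # 00c04840
print(chunk == b"\x00\xc0\x48\x40")   # True
print(chr(0x48), chr(0x40))           # H @
```

The 192, 72 and 64 seen when printing the bytes are therefore simply the integer values of 0xc0, 'H' (0x48) and '@' (0x40), each being one byte of a double.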