I am new to this file type. I want to access files in the folder S65279_1148582599_LIVER__QIU__SHIFU__M_53
by reading the DICOMDIR located beside that folder. Inside the folder, I want to access the “IMG” files. The folder contents look like this:
IMG00000000    2021/12/22 8:55    13,408 KB
IMG00000001    2021/12/2 8:55     13,840 KB
IMG00000002    2021/12/22 8:54    8 KB
IMG00000035    2021/12/22 8:54    103 KB
IMG00000739    2021/12/22 8:54    2 KB
Objects.xml    2021/12/22 8:54    11 KB
After reading the DICOMDIR file using dcmread, I get something like this:
(0004, 0000) Group Length                         UL: 232
(0004, 1400) Offset of the Next Directory Record  UL: 0
(0004, 1410) Record In-use Flag                   US: 65535
(0004, 1420) Offset of Referenced Lower-Level Di  UL: 0
(0004, 1430) Directory Record Type                CS: 'SR DOCUMENT'
(0004, 1500) Referenced File ID                   CS: ['S65279_1148582599_LIVER_QIU_SHIFU_M_53', 'IMG00000739']
(0004, 1510) Referenced SOP Class UID in File     UI: Comprehensive SR Storage
(0004, 1511) Referenced SOP Instance UID in File  UI: 1.2.276.0.48.10201.1.20110506073656578008
(0004, 1512) Referenced Transfer Syntax UID in F  UI: Explicit VR Little Endian
(0008, 0000) Group Length                         UL: 48
(0008, 0005) Specific Character Set               CS: 'ISO_IR 192'
(0008, 0023) Content Date                         DA: '20110322'
(0008, 0033) Content Time                         TM: '095212'
(0020, 0000) Group Length                         UL: 10
(0020, 0013) Instance Number                      IS: '0'
How should I proceed to get ‘IMG00000739’? I have tried to access ReferencedFileID using ds[0x0004,0x1500], but this is not working.
Answer
A DICOMDIR file contains a linked list of directory records for patients, studies, series and images, along with some of their attributes. If all you need is a list of the DICOM files referenced in the DICOMDIR, you can just find all ReferencedFileID tags, each of which holds the components of a file path as a list. ReferencedFileID is only present in the image-level directory records, so searching for these entries gives you the paths to all contained DICOM images.
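To illustrate the record structure described above, here is a minimal sketch that walks a sequence of directory records and collects the paths from those that carry a ReferencedFileID. It is not taken from the original answer: the helper name referenced_paths, the stand-in records and the "media_root" folder are hypothetical. The sketch assumes that, after dcmread, the records are available as ds.DirectoryRecordSequence rather than at the top level of the dataset, which would also explain why ds[0x0004, 0x1500] fails there.

```python
import os
from types import SimpleNamespace


def referenced_paths(records, root):
    """Collect file paths from directory records that carry a ReferencedFileID.

    `records` is any iterable of objects exposing DICOM keywords as attributes
    (assumed here to behave like items of ds.DirectoryRecordSequence).
    """
    paths = []
    for record in records:
        file_id = getattr(record, "ReferencedFileID", None)
        if file_id is None:
            # Higher-level records (PATIENT, STUDY, SERIES) reference no file
            continue
        # A single-component ID may be a plain string rather than a list
        parts = [file_id] if isinstance(file_id, str) else list(file_id)
        paths.append(os.path.join(root, *parts))
    return paths


# Hypothetical stand-in records; with pydicom one would pass
# dcmread(dicomdir_path).DirectoryRecordSequence instead.
demo_records = [
    SimpleNamespace(DirectoryRecordType="SERIES"),
    SimpleNamespace(
        DirectoryRecordType="SR DOCUMENT",
        ReferencedFileID=["S65279_1148582599_LIVER_QIU_SHIFU_M_53", "IMG00000739"],
    ),
]
demo_paths = referenced_paths(demo_records, "media_root")
print(demo_paths)
```

The root argument should be the folder that contains the DICOMDIR file, since the referenced file IDs are relative to it.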
In pydicom, a DICOMDIR is represented by a FileSet, which handles the specifics of that file type and allows you to search the directory records.
So, if you want to get all DICOM file paths referenced in the DICOMDIR, you can do something like this:
import os

from pydicom import dcmread
from pydicom.fileset import FileSet

# dicomdir_path is the path to your DICOMDIR file
ds = dcmread(dicomdir_path)
fs = FileSet(ds)
root_path = fs.path

# returns all contained values of IMAGE level entries
file_ids = fs.find_values("ReferencedFileID")
for file_id in file_ids:
    # file_id is a list, unpack it into the components using *
    dcm_path = os.path.join(root_path, *file_id)
    print(dcm_path)
    # here you can collect the paths or load the dataset
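To get one specific file such as IMG00000739, the paths built in the loop above can be collected in a list and filtered by their last component. The following is only a sketch of that idea: pick_by_name and the example paths are hypothetical and not part of pydicom.

```python
import os


def pick_by_name(paths, name):
    """Return the first path whose final component equals `name`, else None."""
    for path in paths:
        if os.path.basename(path) == name:
            return path
    return None


# Hypothetical paths, as they might be collected from the loop above
folder = os.path.join("media_root", "S65279_1148582599_LIVER_QIU_SHIFU_M_53")
collected = [
    os.path.join(folder, "IMG00000035"),
    os.path.join(folder, "IMG00000739"),
]
target = pick_by_name(collected, "IMG00000739")
print(target)
```

The selected path can then be passed to dcmread to load that dataset.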