
Access DICOM files via DICOMDIR using pydicom

I am new to this file type. I want to access the files in the folder S65279_1148582599_LIVER__QIU__SHIFU__M_53 through the DICOMDIR located beside that folder. Inside the folder, I want to access the “IMG” files, which are listed like this:

IMG00000000 2021/12/22 8:55 13,408 KB
IMG00000001 2021/12/2  8:55 13,840 KB
IMG00000002 2021/12/22 8:54 8 KB
IMG00000035 2021/12/22 8:54 103 KB
IMG00000739 2021/12/22 8:54 2 KB
Objects.xml 2021/12/22 8:54 11 KB

After reading the DICOMDIR file using dcmread, I get something like this:

(0004, 0000) Group Length                        UL: 232
(0004, 1400) Offset of the Next Directory Record UL: 0
(0004, 1410) Record In-use Flag                  US: 65535
(0004, 1420) Offset of Referenced Lower-Level Di UL: 0
(0004, 1430) Directory Record Type               CS: 'SR DOCUMENT'
(0004, 1500) Referenced File ID                  CS: ['S65279_1148582599_LIVER_QIU_SHIFU_M_53', 'IMG00000739']
(0004, 1510) Referenced SOP Class UID in File    UI: Comprehensive SR Storage
(0004, 1511) Referenced SOP Instance UID in File UI: 1.2.276.0.48.10201.1.20110506073656578008
(0004, 1512) Referenced Transfer Syntax UID in F UI: Explicit VR Little Endian
(0008, 0000) Group Length                        UL: 48
(0008, 0005) Specific Character Set              CS: 'ISO_IR 192'
(0008, 0023) Content Date                        DA: '20110322'
(0008, 0033) Content Time                        TM: '095212'
(0020, 0000) Group Length                        UL: 10
(0020, 0013) Instance Number                     IS: '0'

How should I proceed to get ‘IMG00000739’? I have tried to access ReferencedFileID using ds[0x0004,0x1500], but this is not working.


Answer

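A DICOMDIR file contains a linked list of directory records for patients, studies, series and images, each with some of their attributes. These records are stored as items of the Directory Record Sequence (0004,1220), which is why ds[0x0004,0x1500] on the top-level dataset does not find the tag. If all you need is a list of the DICOM files referenced in the DICOMDIR, you can just collect all ReferencedFileID values; each one holds the components of a file path, relative to the DICOMDIR location, as a list. ReferencedFileID is only present in the lowest-level directory records, the ones that point to a file (IMAGE records, or SR DOCUMENT as in your output), so if you search for these entries, you will get the paths to all contained DICOM files.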
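To make the record structure concrete, here is a minimal sketch that walks the directory records by hand. The helper name referenced_paths and its parameters are made up for illustration, and it assumes the dataset returned by dcmread supports .get() with keyword strings, as pydicom datasets do. A single-component ReferencedFileID is assumed to come back as a plain string rather than a list, so both cases are handled.

```python
import os


def referenced_paths(dicomdir_ds, root_dir, record_type=None):
    """Return the file paths referenced by the directory records.

    Hypothetical helper: dicomdir_ds is the dataset read from the
    DICOMDIR, root_dir is the folder that contains the DICOMDIR.
    """
    paths = []
    for record in dicomdir_ds.get("DirectoryRecordSequence", []):
        file_id = record.get("ReferencedFileID")
        if file_id is None:
            # PATIENT, STUDY and SERIES records reference no file
            continue
        if record_type and record.get("DirectoryRecordType") != record_type:
            continue
        # Assumption: a single path component is a str, several are a list
        parts = [file_id] if isinstance(file_id, str) else list(file_id)
        paths.append(os.path.join(root_dir, *parts))
    return paths


# Usage with pydicom (not executed here):
#   from pydicom import dcmread
#   ds = dcmread("DICOMDIR")
#   print(referenced_paths(ds, ".", record_type="SR DOCUMENT"))
```

This only illustrates where the tag lives; it ignores the offsets that link the records together.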

In pydicom, a DICOMDIR is represented by a FileSet that handles the specifics of that file type and allows you to search the directory records.
So, if you want to get all DICOM file paths referenced in the DICOMDIR, you can do something like this:

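import os

from pydicom import dcmread
from pydicom.fileset import FileSet

dicomdir_path = "DICOMDIR"  # placeholder: path to your DICOMDIR file

ds = dcmread(dicomdir_path)
fs = FileSet(ds)
root_path = fs.path
# returns the ReferencedFileID values of all records that reference a file
file_ids = fs.find_values("ReferencedFileID")
for file_id in file_ids:
    # file_id is normally a list of path components; wrap a lone string
    # so that * does not unpack it character by character
    parts = [file_id] if isinstance(file_id, str) else file_id
    dcm_path = os.path.join(root_path, *parts)
    print(dcm_path)  # here you can collect the paths or load the dataset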
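
If you only need one specific file such as IMG00000739, a possible alternative is to iterate over the FileSet itself, which yields one instance per referenced file. The sketch below assumes each instance exposes a path attribute and a load() method, as pydicom's FileInstance does; find_instance is a hypothetical helper name, not part of pydicom.

```python
import os


def find_instance(file_set, file_name):
    """Return the first instance whose file name matches, else None.

    Hypothetical helper: file_set is any iterable of objects with a
    `path` attribute, for example a pydicom FileSet.
    """
    for instance in file_set:
        if os.path.basename(instance.path) == file_name:
            return instance
    return None


# Usage with pydicom (not executed here):
#   from pydicom import dcmread
#   from pydicom.fileset import FileSet
#   fs = FileSet(dcmread("DICOMDIR"))
#   instance = find_instance(fs, "IMG00000739")
#   if instance is not None:
#       ds = instance.load()  # reads the referenced DICOM file
```

Matching on the base name is a simplification; if several folders contain a file with the same name, compare the full path instead.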
User contributions licensed under: CC BY-SA