从NumPy数组创建pydicom文件

12

我正在尝试从标准大小的numpy数组(512 x 512或256 x 256)创建新的DICOM图像。

import dicom, dicom.UID
from dicom.dataset import Dataset, FileDataset

def write_dicom(pixel_array,filename):
    
    file_meta = Dataset()
    ds = FileDataset(filename, {},file_meta = file_meta,preamble="\0"*128)
    ds.PixelData = pixel_array.tostring()
    ds.save_as(filename)
    return

if __name__ == "__main__":
    import numpy as np
    pixel_array = np.tile(np.arange(256).reshape(16,16), (16,16)) * 4
    write_dicom(pixel_array,'pretty.dcm')

3
你的 write_dicom 函数是否包含比你展示的更多操作?否则,你似乎正在创建一个只包含像素数据的文件,这将不是有效的 DICOM 文件。为了使文件符合 DICOM 标准,你需要输入研究、系列、实例 UID、图像模态、患者数据等信息。 - Anders Gustafsson
你能提供文件中DICOM头信息的转储(当然需要隐藏病人姓名等敏感信息)吗?以下是华盛顿大学提供的非常有用且免费的工具链接,您可使用它查看DICOM文件的信息:http://nrg.wustl.edu/software/dicom-browser/ - Matt
GDCM的FAQ中包含一些关于为什么从随机光栅图像格式创建合法DICOM很困难的文本:http://sourceforge.net/apps/mediawiki/gdcm/index.php?title=General_questions#How_do_I_convert_my_raster_image_format_X_into_DICOM_.3F - timday
8个回答

8

2020更新 :)

这些答案都对我没用。以下是我最终采用的方法,可以保存一个有效的单色16bpp MR切片,至少在Slicer、Radiant和MicroDicom中能正确显示:

import pydicom
from pydicom.dataset import Dataset, FileDataset
from pydicom.uid import ExplicitVRLittleEndian
import pydicom._storage_sopclass_uids

image2d = image2d.astype(np.uint16)

print("Setting file meta information...")
# Populate required values for file meta information

meta = pydicom.Dataset()
meta.MediaStorageSOPClassUID = pydicom._storage_sopclass_uids.MRImageStorage
meta.MediaStorageSOPInstanceUID = pydicom.uid.generate_uid()
meta.TransferSyntaxUID = pydicom.uid.ExplicitVRLittleEndian  

ds = Dataset()
ds.file_meta = meta

ds.is_little_endian = True
ds.is_implicit_VR = False

ds.SOPClassUID = pydicom._storage_sopclass_uids.MRImageStorage
ds.PatientName = "Test^Firstname"
ds.PatientID = "123456"

ds.Modality = "MR"
ds.SeriesInstanceUID = pydicom.uid.generate_uid()
ds.StudyInstanceUID = pydicom.uid.generate_uid()
ds.FrameOfReferenceUID = pydicom.uid.generate_uid()

ds.BitsStored = 16
ds.BitsAllocated = 16
ds.SamplesPerPixel = 1
ds.HighBit = 15

ds.ImagesInAcquisition = "1"

ds.Rows = image2d.shape[0]
ds.Columns = image2d.shape[1]
ds.InstanceNumber = 1

ds.ImagePositionPatient = r"0\0\1"
ds.ImageOrientationPatient = r"1\0\0\0\-1\0"
ds.ImageType = r"ORIGINAL\PRIMARY\AXIAL"

ds.RescaleIntercept = "0"
ds.RescaleSlope = "1"
ds.PixelSpacing = r"1\1"
ds.PhotometricInterpretation = "MONOCHROME2"
ds.PixelRepresentation = 1

pydicom.dataset.validate_file_meta(ds.file_meta, enforce_standard=True)

print("Setting pixel data...")
ds.PixelData = image2d.tobytes()

ds.save_as(r"out.dcm")

请注意以下内容:
  • 按照 PyDicom 文档所建议的通过 FileDataset 构造函数进行操作会导致无法生成有效的头文件。
  • validate_file_meta 方法将为您创建一些缺失的元素(如版本)。
  • 您需要两次指定字节序和显式/隐式 VR。
  • 只要您相应地更新每个片的 ImagePositionPatient 和 InstanceNumber,此方法也将允许您创建有效的体积。
  • 确保您的 numpy 数组已转换为与 BitsStored 相同位数的数据格式。

1
谢谢您的回答。请问您能解释一下为什么要使用_storage_sopclass_uids,它并不是pydicom的公共API的一部分吗? - normanius

5
这是我需要编写的代码的功能版本。它将从给定的二维像素数组中编写16位灰度DICOM图像。根据DICOM标准,UID应对每个图像和系列都是唯一的,但这段代码并不考虑这点,因为我不知道UID实际上是什么作用。如果其他人了解,请告诉我,我会很高兴添加它。
import dicom, dicom.UID
from dicom.dataset import Dataset, FileDataset
import numpy as np
import datetime, time

def write_dicom(pixel_array,filename):
    """
    INPUTS:
    pixel_array: 2D numpy ndarray.  If pixel_array is larger than 2D, errors.
    filename: string name for the output file.
    """

    ## This code block was taken from the output of a MATLAB secondary
    ## capture.  I do not know what the long dotted UIDs mean, but
    ## this code works.
    file_meta = Dataset()
    file_meta.MediaStorageSOPClassUID = 'Secondary Capture Image Storage'
    file_meta.MediaStorageSOPInstanceUID = '1.3.6.1.4.1.9590.100.1.1.111165684411017669021768385720736873780'
    file_meta.ImplementationClassUID = '1.3.6.1.4.1.9590.100.1.0.100.4.0'
    ds = FileDataset(filename, {},file_meta = file_meta,preamble="\0"*128)
    ds.Modality = 'WSD'
    ds.ContentDate = str(datetime.date.today()).replace('-','')
    ds.ContentTime = str(time.time()) #milliseconds since the epoch
    ds.StudyInstanceUID =  '1.3.6.1.4.1.9590.100.1.1.124313977412360175234271287472804872093'
    ds.SeriesInstanceUID = '1.3.6.1.4.1.9590.100.1.1.369231118011061003403421859172643143649'
    ds.SOPInstanceUID =    '1.3.6.1.4.1.9590.100.1.1.111165684411017669021768385720736873780'
    ds.SOPClassUID = 'Secondary Capture Image Storage'
    ds.SecondaryCaptureDeviceManufctur = 'Python 2.7.3'

    ## These are the necessary imaging components of the FileDataset object.
    ds.SamplesPerPixel = 1
    ds.PhotometricInterpretation = "MONOCHROME2"
    ds.PixelRepresentation = 0
    ds.HighBit = 15
    ds.BitsStored = 16
    ds.BitsAllocated = 16
    ds.SmallestImagePixelValue = '\\x00\\x00'
    ds.LargestImagePixelValue = '\\xff\\xff'
    ds.Columns = pixel_array.shape[0]
    ds.Rows = pixel_array.shape[1]
    if pixel_array.dtype != np.uint16:
        pixel_array = pixel_array.astype(np.uint16)
    ds.PixelData = pixel_array.tostring()

    ds.save_as(filename)
    return



if __name__ == "__main__":
#    pixel_array = np.arange(256*256).reshape(256,256)
#    pixel_array = np.tile(np.arange(256).reshape(16,16),(16,16))
    x = np.arange(16).reshape(16,1)
    pixel_array = (x + x.T) * 32
    pixel_array = np.tile(pixel_array,(16,16))
    write_dicom(pixel_array,'pretty.dcm')

2
我刚刚在这里找到了你的方法,并想补充一下,“长点分 UID”是唯一标识符,根据定义,应该对于研究/患者/数据集是唯一的 - 因此,您可能希望避免为您创建的每个数据集提供完全相同的 UID 集,因为这可能会导致在将此数据导入任何其他地方时产生冲突。 - Chris
1
pydicom实际上提供了一个从头创建DICOM文件的示例:https://github.com/darcymason/pydicom/blob/master/source/dicom/examples/write_new.py - Chris
1
太好了,谢谢!我已经将它添加到顶部的解决方案中以增加可见性。 - Michael K
2
@Chris的链接已经失效,但这里有一个可用的链接:https://pydicom.github.io/pydicom/stable/auto_examples/input_output/plot_write_dicom.html - David Pärsson

4
我能进一步简化@Corvin的精彩回答。这是一个极简的代码示例,可以将(虚拟的)3D numpy数组保存为有效的DICOM图像,并且可以使用Amide打开。
#!/usr/bin/python3
import numpy
import pydicom
import pydicom._storage_sopclass_uids

# dummy image
image = numpy.random.randint(2**16, size=(512, 512, 512), dtype=numpy.uint16)

# metadata
fileMeta = pydicom.Dataset()
fileMeta.MediaStorageSOPClassUID = pydicom._storage_sopclass_uids.CTImageStorage
fileMeta.MediaStorageSOPInstanceUID = pydicom.uid.generate_uid()
fileMeta.TransferSyntaxUID = pydicom.uid.ExplicitVRLittleEndian

# dataset
ds = pydicom.Dataset()
ds.file_meta = fileMeta

ds.Rows = image.shape[0]
ds.Columns = image.shape[1]
ds.NumberOfFrames = image.shape[2]

ds.PixelSpacing = [1, 1] # in mm
ds.SliceThickness = 1 # in mm

ds.BitsAllocated = 16
ds.PixelRepresentation = 1
ds.PixelData = image.tobytes()

# save
ds.save_as('image.dcm', write_like_original=False)

根据观察,如果将输出的image.dcm文件传递给dciodvfy,会发现很多字段都缺失。这些字段的填充留给读者自行完成 ;)

3
上述示例可以工作,但会导致许多工具对DICOM文件进行抱怨,并且它们不能使用itk/SimpleITK作为堆栈完全读取。我发现从numpy生成DICOM的最佳方法是使用SimpleITK工具并逐层生成DICOM。一个基本示例(https://github.com/zivy/SimpleITK/blob/8e94451e4c0e90bcc6a1ffdd7bc3d56c81f58d80/Examples/DicomSeriesReadModifyWrite/DicomSeriesReadModifySeriesWrite.py)展示了如何加载堆栈,执行转换,然后重新保存文件,但这可以通过使用

进行简单修改
import SimpleITK as sitk
filtered_image = sitk.GetImageFromArray(my_numpy_array)

输出图像中的标签数量非常大,因此手动创建所有标签非常繁琐。此外,SimpleITK支持8、16、32位图像以及RGB,因此比在pydicom中制作它们要容易得多。
(0008, 0008) Image Type                          CS: ['DERIVED', 'SECONDARY']
(0008, 0016) SOP Class UID                       UI: Secondary Capture Image Storage
(0008, 0018) SOP Instance UID                    UI: 1.2.826.0.1.3680043.2.1125.1.35596048796922805578234000521866725
(0008, 0020) Study Date                          DA: '20170803'
(0008, 0021) Series Date                         DA: '20170803'
(0008, 0023) Content Date                        DA: 0
(0008, 0030) Study Time                          TM: '080429.171808'
(0008, 0031) Series Time                         TM: '080429'
(0008, 0033) Content Time                        TM: 0
(0008, 0050) Accession Number                    SH: ''
(0008, 0060) Modality                            CS: 'OT'
(0008, 0064) Conversion Type                     CS: 'WSD'
(0008, 0090) Referring Physician's Name          PN: ''
(0010, 0010) Patient's Name                      PN: ''
(0010, 0020) Patient ID                          LO: ''
(0010, 0030) Patient's Birth Date                DA: ''
(0010, 0040) Patient's Sex                       CS: ''
(0018, 2010) Nominal Scanned Pixel Spacing       DS: ['1', '3']
(0020, 000d) Study Instance UID                  UI: 1.2.826.0.1.3680043.2.1125.1.33389357207068897066210100430826006
(0020, 000e) Series Instance UID                 UI: 1.2.826.0.1.3680043.2.1125.1.51488923827429438625199681257282809
(0020, 0010) Study ID                            SH: ''
(0020, 0011) Series Number                       IS: ''
(0020, 0013) Instance Number                     IS: ''
(0020, 0020) Patient Orientation                 CS: ''
(0020, 0052) Frame of Reference UID              UI: 1.2.826.0.1.3680043.2.1125.1.35696880630664441938326682384062489
(0028, 0002) Samples per Pixel                   US: 1
(0028, 0004) Photometric Interpretation          CS: 'MONOCHROME2'
(0028, 0010) Rows                                US: 40
(0028, 0011) Columns                             US: 50
(0028, 0100) Bits Allocated                      US: 32
(0028, 0101) Bits Stored                         US: 32
(0028, 0102) High Bit                            US: 31
(0028, 0103) Pixel Representation                US: 1
(0028, 1052) Rescale Intercept                   DS: "0"
(0028, 1053) Rescale Slope                       DS: "1"
(0028, 1054) Rescale Type                        LO: 'US'
(7fe0, 0010) Pixel Data                          OW: Array of 8000 bytes

2
输出图像中的标签数量非常大,因此手动创建所有标签是繁琐的。那么你做了什么呢?我不明白。 - Monica Heddneck

2

Corvin的2020更新对我来说几乎有效。

元数据仍未写入文件,因此在读取时会引发以下异常:

pydicom.errors.InvalidDicomError:文件缺少DICOM文件元信息头或标头中缺少“DICM”前缀。

为了解决这个问题并将元数据写入dicom文件,我需要在save_as()调用中添加enforce_standard=True

ds.save_as(filename=out_filename, enforce_standard=True) 

2
看起来参数现在应该是 write_like_original=False。https://pydicom.github.io/pydicom/dev/tutorials/dataset_basics.html?highlight=enforce_standard - Will

1

对于那些需要的人来说,这是一个可用的工作配置和一个问题。 问题在另一个主题中创建多个jpg图像的Dicom 对我有效的是不带压缩的灰度。每次尝试使用压缩都失败了,我不知道为什么:

# Populate required values for file meta information
meta = pydicom.Dataset()
meta.TransferSyntaxUID = pydicom.uid.ExplicitVRLittleEndian
meta.MediaStorageSOPClassUID = pydicom._storage_sopclass_uids.MRImageStorage
meta.MediaStorageSOPInstanceUID = pydicom.uid.generate_uid()

# build dataset
ds = Dataset()
ds.file_meta = meta
ds.fix_meta_info()

# unknown options
ds.is_little_endian = True
ds.is_implicit_VR = False
ds.SOPClassUID = pydicom._storage_sopclass_uids.MRImageStorage
ds.SeriesInstanceUID = pydicom.uid.generate_uid()
ds.StudyInstanceUID = pydicom.uid.generate_uid()
ds.FrameOfReferenceUID = pydicom.uid.generate_uid()
ds.BitsStored = 16
ds.BitsAllocated = 16
ds.SamplesPerPixel = 1
ds.HighBit = 15
ds.ImagesInAcquisition = "1"
ds.InstanceNumber = 1
ds.ImagePositionPatient = r"0\0\1"
ds.ImageOrientationPatient = r"1\0\0\0\-1\0"
ds.ImageType = r"ORIGINAL\PRIMARY\AXIAL"
ds.RescaleIntercept = "0"
ds.RescaleSlope = "1"
ds.PixelRepresentation = 1

# Case options
ds.PatientName = "Anonymous"
ds.PatientID = "123456"
ds.Modality = "MR"
ds.StudyDate = '20200225'
ds.ContentDate = '20200225'

# convert image to grayscale
img = Image.open(filename).convert('L')
img.save(filename)

# open image, decode and ensure_even stream
with open(filename, 'rb') as f:
    arr = decode(f)

def ensure_even(stream):
    # Very important for some viewers
    if len(stream) % 2:
        return stream + b"\x00"
    return stream

# required for pixel handler
ds.BitsStored = 8
ds.BitsAllocated = 8
ds.HighBit = 7
ds.PixelRepresentation = 0

# grayscale without compression WORKS
ds.PhotometricInterpretation = "MONOCHROME2"
ds.SamplesPerPixel = 1  # 1 color = 1 sample per pixel
ds.file_meta.TransferSyntaxUID = pydicom.uid.ExplicitVRLittleEndian
ds.PixelData = ensure_even(arr.tobytes())

# JPEGBaseline compressed DOES NOT WORK
# ds.PixelData = encapsulate([ensure_even(arr.tobytes())])
# ds.PhotometricInterpretation = "YBR_FULL"
# ds.SamplesPerPixel = 3  # 3 colors = 3 sampleperpixel
# ds.file_meta.TransferSyntaxUID = pydicom.uid.JPEGBaseline
# ds.compress(pydicom.uid.JPEGBaseline)

# JPEGExtended compressed DOES NOT WORK
# ds.PixelData = encapsulate([ensure_even(arr.tobytes())])
# ds.PhotometricInterpretation = "YBR_FULL_422"
# ds.SamplesPerPixel = 3  # 3 colors = 3 sampleperpixel
# ds.file_meta.TransferSyntaxUID = pydicom.uid.JPEGExtended
# ds.compress(pydicom.uid.JPEGExtended)

# JPEG2000 compressed DOES NOT WORK
# ds.PhotometricInterpretation = "RGB"
# ds.SamplesPerPixel = 3  # 3 colors = 3 sampleperpixel
# ds.file_meta.TransferSyntaxUID = pydicom.uid.JPEG2000
# ds.PixelData = encapsulate([ensure_even(arr.tobytes())])
# ds.compress(pydicom.uid.JPEG2000)

# Image shape
ds['PixelData'].is_undefined_length = False
array_shape = arr.shape
ds.Rows = array_shape[0]
ds.Columns = array_shape[1]

# validate and save
pydicom.dataset.validate_file_meta(ds.file_meta, enforce_standard=True)
new_filename = filename.replace('.jpg', name + '.dcm')
ds.save_as(new_filename, write_like_original=False)

1

DICOM是一种非常复杂的格式。有许多方言,兼容性问题往往是靠运气。您可以尝试使用nibabel,也许它的方言更适合RadiAnt或MicroDicom。

总的来说,我建议尽可能使用Nifti格式。它的标准更加简洁,不兼容的情况很少。nibabel也支持此格式。


但是医院数据库仍然使用.dcm格式。因此,这是一个重要的问题。 - prerakmody

0

对于3D CT扫描,您可以使用以下代码

def vol_to_dicom_for_ct(path_img_ct, patient_name, patient_id, path_dicom):

    """
    Converts a .nrrd/.mha/.nifti file into its .dcm files

    Params
    ------
    path_img_ct: str, the path of the .nrrd/.mha/.nifti file
    patient_name: str
    patient_id: str
    path_dicom: str, the final output directory
    
    Note: Verify the output with dciodvfy
        - Ref 1: https://www.dclunie.com/dicom3tools/workinprogress/index.html
        - Ref 2: https://manpages.debian.org/unstable/dicom3tools/dciodvfy.1.en.html
    """

    try:
        
        import sys
        import copy
        import random
        import shutil
        import subprocess
        import numpy as np

        if Path(path_img_ct).exists():
            
            try:
                import pydicom
                import pydicom._storage_sopclass_uids
            except:
                subprocess.check_call([sys.executable, '-m', 'pip', 'install', '--user', 'pydicom'])
                import pydicom  

            try:
                import SimpleITK as sitk
            except:
                subprocess.check_call([sys.executable, '-m', 'pip', 'install', '--user', 'SimpleITK']) # 2.1.1
                import SimpleITK as sitk
            
            try:
                import matplotlib.pyplot as plt
            except:
                subprocess.check_call([sys.executable, '-m', 'pip', 'install', '--user', 'matplotlib']) # 2.1.1
                import matplotlib.pyplot as plt

            # Step 0 - Create save directory
            if Path(path_dicom).exists():
                shutil.rmtree(path_dicom)
            Path(path_dicom).mkdir(exist_ok=True, parents=True)
            
            # Step 1 - Get volume params
            img_ct      = sitk.ReadImage(str(path_img_ct))
            img_spacing = tuple(img_ct.GetSpacing())
            img_origin  = tuple(img_ct.GetOrigin()) # --> dicom.ImagePositionPatient
            img_array   = sitk.GetArrayFromImage(img_ct).astype(np.int16) # [D,H,W]

            # Step 2 - Create dicom dataset
            ds                     = pydicom.dataset.Dataset()
            ds.FrameOfReferenceUID = pydicom.uid.generate_uid() # this will stay the same for all .dcm files of a volume
            
            # Step 2.1 - Modality details
            ds.SOPClassUID         = pydicom._storage_sopclass_uids.CTImageStorage
            ds.Modality            = 'CT'
            ds.ImageType           = ['ORIGINAL', 'PRIMARY', 'AXIAL']

            # Step 2.2 - Image Details
            ds.PixelSpacing               = [float(img_spacing[0]), float(img_spacing[1])]
            ds.SliceThickness             = str(img_spacing[-1])
            ds.Rows                       = img_array.shape[1]
            ds.Columns                    = img_array.shape[2]
            
            ds.PatientPosition            = 'HFS'
            ds.ImageOrientationPatient    = [1, 0, 0, 0, 1, 0]
            ds.PositionReferenceIndicator = 'SN'
            
            ds.SamplesPerPixel            = 1
            ds.PhotometricInterpretation  = 'MONOCHROME2'
            ds.BitsAllocated              = 16
            ds.BitsStored                 = 16
            ds.HighBit                    = 15
            ds.PixelRepresentation        = 1
            
            ds.RescaleIntercept           = "0.0"
            ds.RescaleSlope               = "1.0"
            ds.RescaleType                = 'HU'
            
            # Step 3.1 - Metadata
            fileMeta = pydicom.Dataset()
            fileMeta.MediaStorageSOPClassUID = pydicom._storage_sopclass_uids.CTImageStorage
            fileMeta.MediaStorageSOPInstanceUID = pydicom.uid.generate_uid() # this will change for each .dcm file of a volume
            fileMeta.TransferSyntaxUID = pydicom.uid.ExplicitVRLittleEndian
            ds.file_meta = fileMeta

            # Step 3.2 - Include study details
            ds.StudyInstanceUID    = pydicom.uid.generate_uid()
            ds.StudyDescription    = ''
            ds.StudyDate           = '19000101'                   # needed to create DICOMDIR
            ds.StudyID             = str(random.randint(0,1000))  # needed to create DICOMDIR

            # Step 3.3 - Include series details
            ds.SeriesInstanceUID   = pydicom.uid.generate_uid()
            ds.SeriesDescription   = ''
            ds.SeriesNumber        = str(random.randint(0,1000)) # needed to create DICOMDIR

            # Step 3.4 - Include patient details
            ds.PatientName      = patient_name
            ds.PatientID        = patient_id

            # Step 3.5 - Manufacturer details
            ds.Manufacturer           = 'MICCAI2015'
            ds.ReferringPhysicianName = 'Mody'                   # needed for identification in RayStation
            ds.ManufacturerModelName  = 'test_offsite'

            # Step 4 - Make slices
            for slice_id in range(img_array.shape[0]):
                
                # Step 4.1 - Slice identifier
                random_uuid = pydicom.uid.generate_uid()
                ds.file_meta.MediaStorageSOPInstanceUID = random_uuid
                ds.SOPInstanceUID = random_uuid
                ds.InstanceNumber = str(slice_id+1)
                
                vol_origin_tmp          = list(copy.deepcopy(img_origin))
                vol_origin_tmp[-1]      += img_spacing[-1]*slice_id
                ds.ImagePositionPatient = vol_origin_tmp

                # Step 4.2 - Slice data
                img_slice               = img_array[slice_id,:,:]
                # plt.imshow(img_slice); plt.savefig(str(Path(path_dicom, '{}.png'.format(slice_id)))); plt.close()
                ds.PixelData            = img_slice.tobytes()

                save_path               = Path(path_dicom).joinpath(str(ds.file_meta.MediaStorageSOPInstanceUID) + '.dcm')
                ds.save_as(str(save_path), write_like_original=False)

            return ds.StudyInstanceUID, ds.SeriesInstanceUID

        else:
            print (' - [ERROR][vol_to_dicom_for_ct()] Error in path: path_img_ct: ', path_img_ct)
            return None, None

    except:
        traceback.print_exc()


网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接