Subject: Medical Image Format FAQ, Part 1/8
Date: 9 Apr 1996 13:32:59 -0400
Summary: This posting contains answers to the most Frequently Asked
. Question on alt.image.medical - how do I convert from image
. format X from vendor Y to something I can use ? In addition
. it contains information about various standard formats.

Posting-Frequency: monthly
Version: 3.01

This message is automatically posted once a month to help readers looking
for information about medical image formats. If you don't want to see this
posting every month, please add the subject line to your kill file.

Contents:

    part1    - contains index, general information & standard formats
    part2    - contains standard formats (continued)
    part3    - contains information about proprietary CT formats
    part4    - contains information about proprietary MR formats
    part5    - contains information about proprietary other formats
    part6    - contains information about hosts & compression
    part7    - contains general information sources
    part8    - contains DICOM information sources

Tools that describe and convert many of the formats described in this document
are available in the dicom3tools package from

    "ftp://ftp.rahul.net/pub/dclunie/".

A Mosaic browsable version of this FAQ is available at:

    "http://www.rahul.net/dclunie/medical-image-faq/html/".

Html, postscript and text forms of the FAQ are available at:

    "ftp://ftp.rahul.net/pub/dclunie/medical-image-faq/".

Many FAQs, including this Listing, are available on the archive site
rtfm.mit.edu in the directory pub/usenet/news.answers.  The name under
which a FAQ is archived appears in the Archive-name line at the top of
the article.

There's a mail server on that machine. You send an e-mail message to
mail-server@rtfm.mit.edu containing the keyword "help" (without quotes!)
in the message body. To fetch this particular FAQ send a message with the
following body:

    send usenet/news.answers/medical-image-faq/part1
    ...
    send usenet/news.answers/medical-image-faq/part8

Please direct comments or questions and especially contributions to

    "dclunie@flash.us.com"

or reply to this article. All unknown formats and test images gratefully
accepted, but please don't email them, rather contact me and we can
arrange to exchange documents or disks or tapes by snail mail.

Changes this issue

    Reorganized DICOM information sources into part 8.
    Fixed lots of typos and formatting errors.
    Add NEMA web,ftp sites.
    Update NLM visible human web site addresses.
    Add DISC 96 tutorial web site.
    Changed Evergreen references to new company IMNET Systems.

Changes last issue

    Added Imatron CT format.
    Update Intech addresses.
    Update radsci-l listserv address.
    Fix errors in signa 3x/4x description.
    Add HL7/DICOM IMSIG site.
    Made other HL7 links more obvious :).
    Updated NIH Image entry to include web site.

The next part is table of contents.


Subject: Contents

1.  Introduction

    1.1 Objective
    1.2 Types of Formats
    1.3 In Desperation - Quick & Dirty Tricks

2.  Standard Formats

    2.1 ACR/NEMA 1.0 and 2.0
    2.2 ACR/NEMA DICOM 3.0
    2.3 Papyrus
    2.4 Interfile V3.3
    2.5 Qsh
    2.6 DEFF

3.  Proprietary Formats

    3.1 Proprietary Formats - General Information

.3.1.1 SPI (Standard Product Interconnect)

    3.2 CT - Proprietary Formats

.3.2.1 General Electric CT

.      3.2.1.1 GE CT 9800

..      3.2.1.1.1 GE CT 9800 Image data
..      3.2.1.1.2 GE CT 9800 Tape format
..      3.2.1.1.3 GE CT 9800 Raw data

.      3.2.1.2 GE CT Advantage - Genesis

..      3.2.1.2.1 GE CT Advantage Image data
..      3.2.1.2.2 GE CT Advantage Archive format
..      3.2.1.2.3 GE CT Advantage Raw data

.      3.2.1.3 GE CT Pace
.      3.2.1.4 GE CT Sytec

.3.2.2 Siemens CT

.      3.2.2.1 Siemens Somatom DR
.      3.2.2.2 Siemens Somatom Plus
.      3.2.2.3 Siemens Somatom AR

.3.2.3 Philips CT
.3.2.4 Picker CT
.3.2.5 Toshiba CT
.3.2.6 Hitachi CT
.3.2.7 Shimadzu CT
.3.2.8 Elscint CT
.3.2.9 Imatron CT

    3.3 MR - Proprietary Formats

.3.3.1 General Electric MR

.      3.3.1.1 GE MR Signa 3.x,4.x

..      3.3.1.1.1 GE MR Signa 3.x,4.x Image data
..      3.3.1.1.2 GE MR Signa 3.x,4.x Tape format
..      3.3.1.1.3 GE MR Signa 3.x,4.x Raw data

.      3.3.1.2 GE MR Signa 5.x - Genesis

..      3.3.1.2.1 GE MR Signa 5.x Image data
..      3.3.1.2.2 GE MR Signa 5.x Archive format
..      3.3.1.2.3 GE MR Signa 5.x Raw data

.      3.3.1.3 GE MR Max
.      3.3.1.4 GE MR Vectra

.3.3.2 Siemens MR

.      3.3.2.1 Siemens Magnetom GBS/GBS II

..      3.3.2.1.1 Siemens Magnetom GBS/GBS II Native Format
..      3.3.2.1.2 Siemens Magnetom GBS/GBS II SPI Format

.      3.3.2.2 Siemens Magnetom SP

..      3.3.2.2.1 Siemens Magnetom SP Native Format
..      3.3.2.2.2 Siemens Magnetom SP SPI Format

.      3.3.2.3 Siemens Magnetom Impact

..      3.3.2.3.1 Siemens Magnetom Impact Native Format
..      3.3.2.3.2 Siemens Magnetom Impact SPI Format

.      3.3.2.4 Siemens Magnetom Vision

..      3.3.2.4.1 Siemens Magnetom Vision Native Format
..      3.3.2.4.2 Siemens Magnetom Vision SPI Format

.3.3.3 Philips MR

.      3.3.3.1 Philips Gyroscan S5
.      3.3.3.2 Philips Gyroscan ACS
.      3.3.3.3 Philips Gyroscan T5
.      3.3.3.4 Philips Gyroscan NT5 & NT15

.3.3.4 Picker MR
.3.3.5 Toshiba MR
.3.3.6 Hitachi MR
.3.3.7 Shimadzu MR
.3.3.8 Elscint MR

    3.4 Proprietary Workstations

.3.4.1 ISG Workstations

.      3.4.1.1 Gyroview

    3.5 Other Proprietary Formats

.3.5.1 Analyze From Mayo

4.  Host Machines

    4.1 Data General

.4.1.1 Data General Data

.      4.1.1.1 Data General Integers
.      4.1.1.2 Data General Floating Point

.4.1.2 Data General Operating System

.      4.1.2.1 Data General RDOS
.      4.1.2.2 Data General AOS/VS

.4.1.3 Data General Network

    4.2 Vax

.4.2.1 Vax Data

.      4.2.1.1 Vax Integers
.      4.2.1.2 Vax Floating Point
.      4.2.1.3 Vax Strings

.4.2.2 Vax Operating System

.      4.2.2.1 Vax VMS
.      4.2.2.2 ULTRIX
.      4.2.2.3 OSF

    4.3 Sun - Sun3 68000 and Sun4 Sparc

.4.3.1 Sun Data

.      4.3.1.1 Sun Integers
.      4.3.1.2 Sun Floating Point
.      4.3.1.3 Sun Strings

.4.3.2 Sun Operating System

5.  Compression Schemes

    5.1 Reversible Compression
    5.2 Irreversible Compression

.5.2.1 Perimeter Encoding

6.  Getting Connected

    6.1 Tapes
    6.2 Ethernet
    6.3 Serial Ports

7.  Sources of Information

    7.1 Contacts and Sites
    7.2 Relevant FAQ's
    7.3 Mailservers
    7.4 References
    7.5 Organizations and Societies
    7.6 Usenet Newsgroups
    7.7 DICOM Information Sources

8.  Acknowledgements

The next part is part1 - general information & standard formats.


1.  Introduction

    1.1 Objective

.The goal of this FAQ is to facilitate access to medical images stored
.on digital imaging modalities such as CT and MR scanners, and their
.accompanying descriptive information. The document is designed
.for those who do not have access to the necessary proprietary tools
.or descriptions, particularly in those moments when inspiration
.strikes and one just can't wait for the local sales person to track
.down the necessary authority and go through the cycle of
.correspondence necessary to get a non-disclosure agreement in place, by
.which time interest in the project has usually faded, and another great
.research opportunity has passed! It may also be helpful for those keen
.to experiment with home-grown PACS-like systems using their existing
.equipment, and for those whose equipment is still useful but so old
.that even the host computer vendor doesn't support it any more!

.There is of course no substitute for the genuine tools or descriptions
.from the equipment vendors themselves, and pointers to helpful
.individuals in various organizations, as well as names and catalog
.numbers of various useful documents, are included here where known.

.In addition, several small, well known companies with good
.reputations specialize in such connectivity problems.
.Contact information is provided for them, though I personally have no
.experience with their products and am not endorsing them.

.Finally, great care has been taken not to include any information that
.has been released under non-disclosure agreements. What is included
.here is the result of either information freely released by vendors,
.handy hints from others working in the field, or in many cases close
.scrutiny of hex dumps and experimentation with scanner parameters and
.study of the effects on the image files. The intent is to spread
.hard-earned knowledge gained over many years amongst those new to the
.field or a particular piece of equipment, not to threaten anyone's
.proprietary interests, or to substitute for the technical support
.available from vendors that ranges from free to extortionate, and
.excellent to abysmal, depending on who you are dealing with and where
.in the world you are located!

.Please use this information in the spirit in which it is intended, and
.where possible contribute whatever you know in order to expand the
.information to cover more vendors and equipment.

    1.2 Types of Formats

.Later sections will deal with the problems of getting the image files
.from the modality to the workstation, but for the moment assume the
.files are there and need to be deciphered.

.Three types of information are generally present in these files:

.   - image data, which may be unmodified or compressed,
.   - patient identification and demographics,
.   - technique information about the exam, series, and slice/image.

.Extracting the image information alone is usually straightforward and
.is described in 1.3. Dealing with the descriptive information, for
.example to make use of the data for dissemination in a PACS
.environment, or to extract geometry details in order to combine images
.into 3D datasets, is more difficult and requires deeper understanding
.of how the files are constructed.

.There are three basic families of formats that are in popular use:

.   - fixed format, where layout is identical in each file,
.   - block format, where the header contains pointers to information,
.   - tag based format, where each item contains its own length.

.The block format is one of the most popular. In most cases, though,
.the early part of the header contains only a limited number of
.pointers to large blocks, and the blocks are almost always in the same
.place and of constant length (for standard rather than reformatted
.images at least), so if one doesn't know the specifics of the layout
.one can get by assuming a fixed format. I presume the pointers reflect
.the intent of the designers to handle future expansion and revision of
.the format.

.The example par excellence of the tag based format is the ACR/NEMA
.style of data stream, which, though never intended as a file format per
.se has proven useful as a model. See for example the sections dealing
.with the ACR/NEMA standards as well as DICOM (whose creators are about
.to vote on a media interchange format after all this time) and Papyrus.
.ACR/NEMA style tags are described in more detail elsewhere, but each is
.self-contained and self-describing (at least if you have the
.appropriate data dictionary) and contains its own length, so if you
.can't interpret it you can skip it! Very convenient. Most file formats
.based on this scheme are just concatenated series of tags, and apart
.from having to guess the byte order, which is not specified (unlike
.TIFF which is a similar deal for those in the "real" imaging world),
.and sometimes skip a fixed length but short header, are dead easy to
.handle.

.To identify such a file just do a "strings <file | grep 'ACR-NEMA'" -
.if it is such a file, just look through the start of the hex dump until
.you start to see the characteristic sequentially ordered pairs of 16
.bit words that identify ACR/NEMA attributes, decide the byte order, et
.voila, you can pipe it into any general ACR/NEMA dumping program to see
.what it contains. If you see even group tags, they will be described in
.the standard. If you see odd group tags then they are vendor specific
.and you will have to ask the vendor or correlate them with
.identification information printed on the film until you figure out the
.ones that are important to you.
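
.The same check can be sketched in Python. This is only an illustration:
.it assumes the Recognition Code text is stored in reading order and is
.immediately preceded by its (0008,0010) tag and 32 bit length, and the
.function name find_recognition_code is made up for this example.

```python
import struct

# The (group, element) pair for Recognition Code in both byte orders.
TAG_LE = struct.pack("<HH", 0x0008, 0x0010)
TAG_BE = struct.pack(">HH", 0x0008, 0x0010)

def find_recognition_code(data):
    """Return (offset_of_tag, byte_order), or None if no marker is found.

    Assumes the 8 bytes before the text are a 4 byte tag followed by a
    4 byte length, which is how the element is laid out in a byte stream.
    """
    pos = data.find(b"ACR-NEMA")
    if pos < 8:
        return None  # marker absent, or no room for tag and length
    tag = data[pos - 8:pos - 4]
    if tag == TAG_LE:
        return pos - 8, "little"
    if tag == TAG_BE:
        return pos - 8, "big"
    return pos - 8, "unknown"
```

.A result of "unknown" means the string was found but the bytes before it
.do not look like the expected tag, so the hex dump still needs a look.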

    1.3 In Desperation - Quick & Dirty Tricks

.Because radiologists, radiographers, technologists, physicists and
.imaging programmers are dedicated long suffering creatures who work
.long hours under adverse conditions for little reward, the vendors in
.their generosity have seen fit to make life a little easier, by almost
.universally putting the image data at the end of the file. Rarely you
.will see files that are padded out to fixed record size boundaries (eg.
.Vax VMS 512 byte records), and sometimes overlay plane data may be
.stored after the image data. Furthermore there is almost always an
.option at archive time to allow for storage in an uncompressed and
.totally unadulterated form. Even in ACR/NEMA the tag for image pixel
.data is numerically the highest and hence the last to appear in the
.sequence which is guaranteed to be sorted. They could have screwed us
.up totally by gratuitously adding variable length blocks of other stuff
.at the end, but the only time I have encountered this was on a Siemens
.Impact with the ACR/NEMA based SPI format padd

.In other words, if an image is 256 by 256, uncompressed, and 12-16 bits
.deep (and hence usually, though not always, stored as two bytes per
.pixel), then we all know that the file is going to contain
.256*256*2=131072 bytes of pixel data at the end of the file. If the
.file is say 145408 bytes long, as all GE Signa 3X/4X files are for
.example, then you need to skip 14336 bytes of header before you get to
.the data. Presume row by row starting with top left hand corner raster
.order, try both alternatives for byte order, deal with the 16 to 8 bit
.windowing problem, and very soon you have your image on the screen of
.your workstation.
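
.The arithmetic above can be sketched in Python as follows. It is a
.rough illustration only: it assumes uncompressed two byte pixels at the
.very end of the file, and the function names are invented for this
.example. The window limits are whatever range of values you want to see.

```python
import struct

def header_length(file_size, rows=256, cols=256, bytes_per_pixel=2):
    """Bytes to skip, assuming the pixel data sits at the end of the file."""
    return file_size - rows * cols * bytes_per_pixel

def extract_pixels(data, rows, cols, big_endian):
    """Read rows*cols 16 bit pixels from the tail of the file contents."""
    skip = header_length(len(data), rows, cols)
    if skip < 0:
        raise ValueError("file is too small for the assumed image size")
    fmt = (">" if big_endian else "<") + str(rows * cols) + "H"
    return struct.unpack_from(fmt, data, skip)

def window_to_8bit(pixels, low, high):
    """Map the range low..high linearly onto 0..255, clipping outside it."""
    span = max(high - low, 1)
    return bytes(min(255, max(0, (p - low) * 255 // span)) for p in pixels)

# The GE Signa 3X/4X case from the text: 145408 byte files.
print(header_length(145408))
```

.Read the whole file into "data", try big_endian as both True and False,
.and keep whichever gives something that looks like anatomy.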

.This technique is so useful, even NIH Image for the Macintosh (an
.excellent must-have free program BTW.) provides a raw import tool to do
.this, and describes it in the manual using the 14336 byte offset! This
.tool is something that is sadly lacking in most commercial image
.handling programs for non-medical applications, which can't import
.images with more than 8 bits per channel.

.Of course you have to live without the identification, demographic and
.technique information (other than what can be derived from the file
.name in some cases), but for many research and presentation purposes
.this is quite adequate.

.Occasionally one runs into clever files where four 12 bit words are
.packed into three 16 bit words and one goes crazy trying to figure out
.the logic of how they are packed. The back of the old ACR/NEMA standard
.describes somewhere one way in which this is done. One should still be
.able to calculate the length easily enough.

.I haven't yet encountered a format that did nasty things like have
.strips of rows separated by padding ... I guess we are lucky that most
.images are nice powers of two or even multiples thereof (256,320,512).

.Of course the GE CT 9800 uses perimeter encoding even when DPCM
.compression is not selected, so this technique won't work.

2.  Standard Formats

    2.1 ACR/NEMA 1.0 and 2.0

.ACR/NEMA Standards Publication No. 300-1985      <- ACR/NEMA 1.0
.ACR/NEMA Standards Publication No. 300-1988      <- ACR/NEMA 2.0
.ACR/NEMA Standards Publication PS2-1989          <- data compression

.The American College of Radiology (ACR) and the National Electrical
.Manufacturers Association (NEMA) recognized some time ago the need for
.standards to facilitate multi-vendor connectivity to promote the
.development of PACS and what is now referred to as Wide Area
.Networking. The first such standard was version 1.0 which was released
.in 1985 as ACR/NEMA Standards Publication No. 300-1985, subsequently
.revised several times, then revised again and released as version 2.0
.in 1988, described in ACR/NEMA Standards Publication No. 300-1988.
.There it remained until a radically revised and reorganized approach,
.preserving backward compatibility, was released during 1992-1993 as
.ACR/NEMA Standards Publication PS3, also referred to as DICOM 3.

.In the interim, to facilitate the transfer of compressed images,
.another standard described in ACR/NEMA Standards Publication PS2-1989,
.was released which described various means of extending standard
.300-1988 to handle compression utilizing a broad range of reversible
.and irreversible schemes. Though this part of the standard was
.apparently never implemented by anyone, and has been quietly bypassed by
.those working on DICOM 3 compression, it makes very interesting reading
.and is a nice summary of applicable techniques.

.What does one need to know about ACR/NEMA 1.0 and 2.0 ? The standards
.define a mechanism along the lines of the layered ISO-OSI (Open Systems
.Interconnect) model, with physical, transport/network, session, and
.presentation and application layers. Unless one actually wants to
.physically connect to a device that supports the unique 50 pin
.point-to-point electrical interface, then one really only needs to be
.aware of how ACR/NEMA implements the presentation and application
.layers, which are described in terms of a "message format". This
.message format is important to many people, not because anyone
.seriously wants to connect devices in the limited fashion envisaged by
.these early standards, but because many proprietary formats and other
.de facto standards have adopted the ACR/NEMA message format and its
.corresponding data dictionary and extension mechanisms.

.The message format is described in sections 4, 5 and 10 of ACR/NEMA SP
.300-1988 which are summarized briefly here. Section 6 describes command
.structure which is not really relevant other than that commands are
.also structured in the same way as data and consume part of the data
.dictionary. You will not encounter command tags in data streams
.("messages") encapsulated in file formats though.

.A message consists of a series of "data elements" each of which
.contains a piece of information. Each element is described by an
."element name" consisting of a pair of 16 bit unsigned integers ("group
.number", "data element number"). The data stream is ordered by
.ascending group number, and within each group by ascending data element
.number. Each element may occur only once in a message. Even numbered
.groups describe elements defined by the standard. Odd numbered groups
.are available for use by vendors or users, but must conform to the same
.structure as standard elements. Following the (group number, data
.element number) pair is a length field that is a 32 bit unsigned even
.integer that describes the number of bytes from the end of the length
.field to the beginning of the next data element.

.The last part of a data element is its value, which is defined by the
.data dictionary to be an ascii (numeric AN or text AT) or binary value
.(BI 16 bit or BD 32 bit). The values may be single or multiple.
.Multiple ascii values are delimited by the backslash (05CH) character.
.Odd length ascii values are padded with a space (020H).
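
.A small Python sketch of splitting such a value, assuming the pad
.space is appended at the end of the value (the function names are
.made up for illustration):

```python
def split_ascii_values(raw):
    """Split an ascii value field into its backslash delimited parts."""
    text = raw.decode("ascii").rstrip(" ")  # drop the pad space, if any
    return text.split("\\")

def numeric_values(raw):
    """Same, for AN (ascii numeric) values, converted to floats."""
    return [float(v) for v in split_ascii_values(raw)]

# A 7 character value padded with a space to an even length of 8.
print(numeric_values(b"0.5\\0.5 "))
```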

.For example:

.    0008 0010  000C 0000  4341 2D52 454E 414D
....  3120 302E

.is data element "Recognition Code" because that is what the dictionary
.defines group 0008 element 0010 to be. The dictionary says it is of
.type AT (ascii text), has a value multiplicity of single and only
.enumerated values are allowed, in this case the ascii string "ACR-NEMA
.1.0". It is of length 0000000C hex or 12 bytes long.
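
.To check a dump like this by hand or by machine, the following Python
.sketch decodes the words shown above. It assumes the 16 bit words are
.written to the byte stream little endian; decode_element is just a name
.chosen for this example.

```python
import struct

# The ten 16 bit words from the example, exactly as printed.
WORDS = [0x0008, 0x0010, 0x000C, 0x0000,
         0x4341, 0x2D52, 0x454E, 0x414D, 0x3120, 0x302E]

def decode_element(data, offset=0):
    """Return (group, element, length, value) for the element at offset."""
    group, element, length = struct.unpack_from("<HHI", data, offset)
    start = offset + 8  # 2 + 2 bytes of tag, 4 bytes of length
    return group, element, length, data[start:start + length]

stream = struct.pack("<%dH" % len(WORDS), *WORDS)
print(decode_element(stream))
```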

.The electrical interface is a 16 bit one, and hence even though 32 bit
.binary values are defined to be transmitted least significant word
.first (though the order for the 32 bit length is not actually
.specified), there is no mention in the standard as to how to
.encapsulate the message in an 8 bit world, hence different users and
.vendors have chosen little or big endian schemes. The new DICOM
.standard assumes a default little endian representation which seems to
.be the most appropriate considering the old definition for 32 bit
.words, which specified that the least significant 16 bit word be
.transmitted first.

.Hence there are three likely possible byte orders that a vendor
.interpreting the ACR/NEMA standard in a byte oriented world may have
.used:

.    - little endian 16 and 32 bit words, as in DICOM 3,
.    - big endian 16 and 32 bit words, as in DICOM 3,
.    - big endian 16 bit words, but the least significant half of
.      a 32 bit word is sent first (as per ACR/NEMA 2.0).

.The choice seems to be made usually on the basis of the native byte
.order of integers on the host processor. Most of the formats I have
.encountered are one of the first two, but I did encounter one from
.Philips that used the last scheme and it drove me crazy for a while,
.until I appreciated the subtlety of it ! I call it "Big Bad Endian"
.format in my implementation that recognizes it, but that may be a value
.judgement on my part :)
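
.The three schemes can be told apart by reading the same four bytes each
.way and seeing which gives a sensible length. A Python sketch, with
.invented scheme and function names, and with "sensible" taken to mean
.even and no longer than the rest of the data (my own heuristic):

```python
SCHEMES = ("little", "big", "big-bad")

def read_u32(data, offset, scheme):
    """Read a 32 bit unsigned value under one of the three byte orders."""
    b = data[offset:offset + 4]
    if scheme == "little":
        return int.from_bytes(b, "little")
    if scheme == "big":
        return int.from_bytes(b, "big")
    if scheme == "big-bad":
        # big endian 16 bit words, least significant word sent first
        low = int.from_bytes(b[0:2], "big")
        high = int.from_bytes(b[2:4], "big")
        return (high << 16) | low
    raise ValueError("unknown scheme: " + scheme)

def plausible_schemes(data, offset):
    """Schemes under which the length at offset is even and fits the data."""
    remaining = len(data) - (offset + 4)
    result = []
    for scheme in SCHEMES:
        length = read_u32(data, offset, scheme)
        if length % 2 == 0 and length <= remaining:
            result.append(scheme)
    return result
```

.For a length of 12 stored as the bytes 00 0C 00 00 followed by 12 bytes
.of value, only the "big-bad" reading gives a length that fits.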

.Notice particularly how this design allows one to parse the message
.even if the data dictionary is not complete. Consider an element that
.has an unrecognized element name. One cannot interpret the content of
.the element and so has to ignore it. One doesn't even know whether it
.contains binary or ascii information (this is what DICOM later refers
.to as "implicit representation"). Despite this, the length value allows
.one to skip to the next element and proceed.
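
.A Python sketch of such a walk, assuming a bare stream of elements with
.no file header in front and a byte order that has already been decided.
.The tiny dictionary and the names KNOWN, walk_elements and make_element
.are only for illustration.

```python
import struct

# A deliberately incomplete data dictionary.
KNOWN = {
    (0x0008, 0x0010): "Recognition Code",
    (0x0028, 0x0010): "Rows",
    (0x0028, 0x0011): "Columns",
}

def walk_elements(data, little_endian=True):
    """Yield (group, element, name, value); name is None if unrecognized."""
    fmt = "<HHI" if little_endian else ">HHI"
    offset = 0
    while offset + 8 <= len(data):
        group, element, length = struct.unpack_from(fmt, data, offset)
        value = data[offset + 8:offset + 8 + length]
        yield group, element, KNOWN.get((group, element)), value
        offset += 8 + length  # the length alone is enough to skip ahead

def make_element(group, element, value):
    """Build one little endian element, for trying the walker out."""
    return struct.pack("<HHI", group, element, len(value)) + value

demo = (make_element(0x0008, 0x0010, b"ACR-NEMA 2.0")
        + make_element(0x0009, 0x0010, b"\x01\x02\x03\x04")  # odd group
        + make_element(0x0028, 0x0010, struct.pack("<H", 256)))
for group, element, name, value in walk_elements(demo):
    print("(%04X,%04X)" % (group, element), name, len(value))
```

.The odd group element in the middle is reported with no name and then
.stepped over, and the Rows element after it is still found.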

.Over the years there has been much discussion amongst those who favour
.such implicit dictionary driven schemes, and those who prefer explicit
.representations, including explicit description of the element type
.(binary or ascii, etc.) and even the element description itself! Some
.would prefer the message to contain something like
."RecognitionCode='ACR-NEMA 2.0';" for example. The nuclear medicine
.groups have adopted a de facto standard called Interfile that makes use
.of ACR/NEMA data elements, but uses such a descriptive representation.
.Their argument is that the data stream is much more readable which is
.true enough, and more readily extensible.

.The groups are organized as follows:

.    0000                    Command
.    0008                    Identifying
.    0010                    Patient
.    0018                    Acquisition
.    0020                    Relationship
.    0028                    Image Presentation
.    4000                    Text
.    6000-601E (even)        Overlay
.    7FE0                    Pixel Data

.Some of the more interesting elements are:

.    (nnnn,0000) BD S Group Length           # of bytes in group nnnn
.    (nnnn,4000) AT M Comments

.    (0008,0010) AT S Recognition Code       # ACR-NEMA 1.0 or 2.0
.    (0008,0020) AT S Study Date             # yyyy.mm.dd
.    (0008,0021) AT S Series Date            # yyyy.mm.dd
.    (0008,0022) AT S Acquisition Date       # yyyy.mm.dd
.    (0008,0023) AT S Image Date             # yyyy.mm.dd
.    (0008,0030) AT S Study Time             # hh.mm.ss.frac
.    (0008,0031) AT S Series Time            # hh.mm.ss.frac
.    (0008,0032) AT S Acquisition Time       # hh.mm.ss.frac
.    (0008,0033) AT S Image Time             # hh.mm.ss.frac
.    (0008,0060) AT S Modality               # CT,NM,MR,DS,DR,US,OT

.    (0010,0010) AT S Patient Name
.    (0010,0020) AT S Patient ID
.    (0010,0030) AT S Patient Birthdate      # yyyy.mm.dd
.    (0010,0040) AT S Patient Sex            # M, F, O for other
.    (0010,1010) AT S Patient Age            # xxxD or W or M or Y

.    (0018,0010) AT M Contrast/Bolus Agent   # or NONE
.    (0018,0030) AT M Radionuclide
.    (0018,0050) AN S Slice Thickness        # mm
.    (0018,0060) AN M KVP
.    (0018,0080) AN S Repetition Time        # ms
.    (0018,0081) AN S Echo Time              # ms
.    (0018,0082) AN S Inversion Time         # ms
.    (0018,1120) AN S Gantry Tilt            # degrees

.    (0020,1040) AT S Position Reference     # eg. iliac crest
.    (0020,1041) AN S Slice Location         # in mm (signed)

.    (0028,0010) BI S Rows
.    (0028,0011) BI S Columns
.    (0028,0030) AN M Pixel Size             # row\col in mm
.    (0028,0100) BI S Bits Allocated         # eg. 16 bit
.    (0028,0101) BI S Bits Stored            # eg. 12 bit for CT
.    (0028,0102) BI S High Bit               # eg. 11
.    (0028,0103) BI S Pixel Representation   # 1 signed, 0 unsigned

.    (7FE0,0010) BI M Pixel Data             # as described by grp 0028

.The way in which the pixel data is stored can vary tremendously, though
.thankfully most users and vendors use the simple unimaginative scheme
.that is shown above, ie. one 12 bit pixel stored in the low order part of
.a 16 bit word with no attempt at packing more compactly. Following are
.some examples shown in Appendix E of the standard. Note that when one
.adds the little/big endian question the permutations mount!

.Bits Allocated = 16
.Bits Stored    = 12
.High Bit       = 11

...  |<------------------ pixel ----------------->|
.    ______________ ______________ ______________ ______________
.   |XXXXXXXXXXXXXX|              |              |              |
.   |______________|______________|______________|______________|
.    15          12 11           8 7            4 3            0

.---------------------------

.Bits Allocated = 16
.Bits Stored    = 12
.High Bit       = 15

.   |<------------------ pixel ----------------->|
.    ______________ ______________ ______________ ______________
.   |              |              |              |XXXXXXXXXXXXXX|
.   |______________|______________|______________|______________|
.    15          12 11           8 7            4 3            0

.---------------------------
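
.For the two 16 bit layouts above, the pixel value can be pulled out of
.each word with a shift and a mask derived from Bits Stored and High Bit.
.A Python sketch follows; the function names are invented, and treating
.signed pixels as two's complement is my assumption.

```python
def extract_pixel(word, bits_stored, high_bit):
    """Return the stored pixel bits from one allocated word."""
    shift = high_bit - bits_stored + 1   # position of the lowest pixel bit
    mask = (1 << bits_stored) - 1
    return (word >> shift) & mask

def to_signed(value, bits_stored):
    """Interpret a stored value as two's complement (assumed encoding)."""
    if value & (1 << (bits_stored - 1)):
        return value - (1 << bits_stored)
    return value

# High Bit = 11: pixel in the low 12 bits, top 4 bits unused.
print(hex(extract_pixel(0xFABC, 12, 11)))
# High Bit = 15: pixel in the high 12 bits, bottom 4 bits unused.
print(hex(extract_pixel(0xABCF, 12, 15)))
```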

.Bits Allocated = 12
.Bits Stored    = 12
.High Bit       = 11

.   ------ 2 ----->|<------------------ pixel 1 --------------->|
.    ______________ ______________ ______________ ______________
.   |              |              |              |              |
.   |______________|______________|______________|______________|
.    15          12 11           8 7            4 3            0

.   -------------- 3 ------------>|<------------ 2 --------------
.    ______________ ______________ ______________ ______________
.   |              |              |              |              |
.   |______________|______________|______________|______________|
.    15          12 11           8 7            4 3            0

.   |<------------------ pixel 4 --------------->|<----- 3 ------
.    ______________ ______________ ______________ ______________
.   |              |              |              |              |
.   |______________|______________|______________|______________|
.    15          12 11           8 7            4 3            0

.---------------------------

.And so on ... refer to the standard itself for more detail.
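
.The packed 12 bit case in the last diagram can be sketched in Python as
.below. This is my reading of the diagram, not a quotation from the
.standard: it assumes each pixel's low order bits come first, so that
.three 16 bit words form one 48 bit value holding four pixels. The
.function names are made up for this example.

```python
def packed_length(rows, cols):
    """Bytes of pixel data when 12 bit pixels are packed with no gaps."""
    return rows * cols * 3 // 2

def unpack_12bit(words):
    """Unpack four 12 bit pixels from each group of three 16 bit words."""
    if len(words) % 3:
        raise ValueError("need a multiple of three 16 bit words")
    pixels = []
    for i in range(0, len(words), 3):
        # Word 1 holds the lowest 16 bits, word 3 the highest.
        combined = words[i] | (words[i + 1] << 16) | (words[i + 2] << 32)
        for k in range(4):
            pixels.append((combined >> (12 * k)) & 0xFFF)
    return pixels

print([hex(p) for p in unpack_12bit([0x2111, 0x3322, 0x4443])])
print(packed_length(256, 256))
```

.If the result looks scrambled, the vendor may have used a different bit
.order within the pixels, or a different byte order for the words.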

The next part is part2 - standard formats (continued).

