TIFF
Tag Image File Format or Tagged Image File Format, commonly known by the abbreviations TIFF or TIF, is an image file format for storing raster graphics images, popular among graphic artists, the publishing industry, and photographers.
TIFF is widely supported by scanning, faxing, word processing, optical character recognition, image manipulation, desktop publishing, and page-layout applications.
The format was created by Stephen Carlsen, an engineer at Aldus Corporation, for use in desktop publishing. It published the latest version 6.0 in 1992, subsequently updated with an Adobe Systems copyright after the latter acquired Aldus in 1994. Several Aldus or Adobe technical notes have been published with minor extensions to the format, and several specifications have been based on TIFF 6.0, including TIFF/EP, TIFF/IT, TIFF-F and TIFF-FX.
History
TIFF was created as an attempt to get desktop scanner vendors of the mid-1980s to agree on a common scanned image file format, in place of a multitude of proprietary formats. Carlsen's three objectives were: Implement import and printing of 'basic' bitmapped images, like MacPaint files. Implement import and printing of higher resolution images, coming from scanners. Define and promote an industry standard for storing and processing scanned images, so that Aldus wouldn't have to write import filters for every model of every scanner entering the budding desktop scanner market.In the beginning, TIFF was only a binary image format, because that was all that desktop scanners could handle. As scanners became more powerful, and as desktop computer disk space became more plentiful, TIFF grew to accommodate grayscale images, then color images. Today, TIFF, along with JPEG and PNG, is a popular format for deep-color images.
The first version of the TIFF specification was published by the Aldus Corporation in the autumn of 1986 after two major earlier draft releases. It can be labeled as Revision 3.0. It was published after a series of meetings with various scanner manufacturers and software developers. In April 1987 Revision 4.0 was released and it contained mostly minor enhancements. In October 1988 Revision 5.0 was released and it added support for palette color images and LZW compression.
TIFF is a complex format, defining many tags of which typically only a few are used in each file. This led to implementations supporting many varying subsets of the format, a situation that gave rise to the joke that TIFF stands for Thousands of Incompatible File Formats.
This problem was addressed in revision 6.0 of the TIFF specification by introducing a distinction between Baseline TIFF and TIFF Extensions. Additional extensions are defined in two supplements to the specification which were published in September 1995 and March 2002 respectively.
Overview
A TIFF file contains one or several images, termed subfiles in the specification. The basic use case for having multiple subfiles is to encode a multipage telefax in a single file, but it is also allowed to have different subfiles be different variants of the same image, for example scanned at different resolutions. Rather than being a continuous range of bytes in the file, each subfile is a data structure whose top-level entity is called an image file directory. Baseline TIFF readers are only required to make use of the first subfile, but each IFD has a field for linking to a next IFD.The IFDs are where the tags for which TIFF is named are located. Each IFD contains one or several entries, each of which is identified by its tag. The tags are arbitrary 16-bit numbers; their symbolic names such as ImageWidth often used in discussions of TIFF data do not appear explicitly in the file itself. Each IFD entry has an associated value, which may be decoded based on general rules of the format, but it depends on the tag what that value then means. There may within a single IFD be no more than one entry with any particular tag. Some tags are for linking to the actual image data, other tags specify how the image data should be interpreted, and still other tags are used for image metadata.
TIFF images are made up of rectangular grids of pixels.
The two axes of this geometry are termed horizontal and vertical. Horizontal and vertical resolution need not be equal.
A baseline TIFF image divides the vertical range of the image into one or several strips, which are encoded separately. Historically this served to facilitate TIFF readers with limited capacity to store uncompressed data — one strip would be decoded and then immediately printed — but the present specification motivates it by "increased editing flexibility and efficient I/O buffering". A TIFF extension provides the alternative of tiled images, in which case both the horizontal and the vertical ranges of the image are decomposed into smaller units.
An example of these things, which also serves to give a flavor of how tags are used in the TIFF encoding of images, is that a striped TIFF image would use tags 273 , 278 , and 279 . The StripOffsets point to the blocks of image data, the StripByteCounts say how long each of these blocks are, and RowsPerStrip says how many rows of pixels there are in a strip; the latter is required even in the case of having just one strip, in which case it merely duplicates the value of tag 257 . A tiled TIFF image instead uses tags 322 , 323 , 324 , and 325 . The pixels within each strip or tile appear in row-major order, left to right and top to bottom.
The data for one pixel is made up of one or several samples; for example an RGB image would have one Red sample, one Green sample, and one Blue sample per pixel, whereas a greyscale or palette color image only has one sample per pixel. TIFF allows for both additive and subtractive color models. TIFF does not constrain the number of samples per pixel, nor does it constrain how many bits are encoded for each sample, but baseline TIFF only requires that readers support a few combinations of color model and bit-depth of images. Support for custom sets of samples is very useful for scientific applications; 3 samples per pixel is at the low end of multispectral imaging, and hyperspectral imaging may require hundreds of samples per pixel. TIFF supports having all samples for a pixel next to each other within a single strip/tile but also different samples in different strips/tiles. The default format for a sample value is as an unsigned integer, but a TIFF extension allows declaring them as alternatively being signed integers or IEEE-754 floats, as well as specify a custom range for valid sample values.
TIFF images may be uncompressed, compressed using a lossless compression scheme, or compressed using a lossy compression scheme. The lossless LZW compression scheme has at times been regarded as the standard compression for TIFF, but this is technically a TIFF extension, and the TIFF6 specification notes the patent situation regarding LZW. Compression schemes vary significantly in at what level they process the data: LZW acts on the stream of bytes encoding a strip or tile, whereas the JPEG compression scheme both transforms the sample structure of pixels and encodes pixels in 8×8 blocks rather than row by row.
Most data in TIFF files are numerical, but the format supports declaring data as rather being textual, if appropriate for a particular tag. Tags that take textual values include Artist, Copyright, DateTime, DocumentName, InkNames, and Model.
MIME type
The MIME type image/tiff without an application parameter is used for Baseline TIFF 6.0 files or to indicate that it is not necessary to identify a specific subset of TIFF or TIFF extensions. The optional "application" parameter is defined for image/tiff to identify a particular subset of TIFF and TIFF extensions for the encoded image data, if it is known. According to RFC 3302, specific TIFF subsets or TIFF extensions used in the application parameter must be published as an RFC.MIME type image/tiff-fx is based on TIFF 6.0 with TIFF Technical Notes TTN1 and TTN2. It is used for Internet fax compatible with the ITU-T Recommendations for Group 3 black-and-white, grayscale and color fax.
Digital preservation
Adobe holds the copyright on the TIFF specification along with the two supplements that have been published. These documents can be found on the Adobe TIFF Resources page. The Fax standard in RFC 3949 is based on these TIFF specifications.TIFF files that strictly use the basic "tag sets" as defined in TIFF 6.0 along with restricting the compression technology to the methods identified in TIFF 6.0 and are adequately tested and verified by multiple sources for all documents being created can be used for storing documents. Commonly seen issues encountered in the content and document management industry associated with the use of TIFF files arise when the structures contain proprietary headers, are not properly documented, or contain "wrappers" or other containers around the TIFF datasets, or include improper compression technologies, or those compression technologies are not properly implemented.
Variants of TIFF can be used within document imaging and content/document management systems using CCITT Group IV 2D compression which supports black-and-white images, among other compression technologies that support color. When storage capacity and network bandwidth was a greater issue than commonly seen in today's server environments, high-volume storage scanning, documents were scanned in black and white to conserve storage capacity.
The inclusion of the SampleFormat tag in TIFF 6.0 allows TIFF files to handle advanced pixel data types, including integer images with more than 8 bits per channel and floating point images. This tag made TIFF 6.0 a viable format for scientific image processing where extended precision is required. An example would be the use of TIFF to store images acquired using scientific CCD cameras that provide up to 16 bits per photosite of intensity resolution. Storing a sequence of images in a single TIFF file is also possible, and is allowed under TIFF 6.0, provided the rules for multi-page images are followed.