1. GRAPHICS INTERCHANGE FORMAT (GIF)
The Graphics Interchange Format (GIF) is a bitmap image format that was introduced by CompuServe in 1987 and has since come into widespread usage on the World Wide Web due to its wide support and portability.
The format supports up to 8 bits per pixel, allowing a single image to reference a palette of up to 256 distinct colors chosen from the 24-bit RGB color space. It also supports animations and allows a separate palette of 256 colors for each frame. The color limitation makes the GIF format unsuitable for reproducing color photographs and other images with continuous color, but it is well-suited for simpler images such as graphics or logos with solid areas of color.
GIF images are compressed using the Lempel-Ziv-Welch (LZW) lossless data compression technique to reduce the file size without degrading the visual quality. This compression technique was patented in 1985. Controversy over the licensing agreement between the patent holder, Unisys, and CompuServe in 1994 inspired the development of the Portable Network Graphics (PNG) standard; since then all the relevant patents have expired.
CompuServe introduced the GIF format in 1987 to provide a color image format for their file downloading areas, replacing their earlier run-length encoding (RLE) format, which was black and white only. GIF became popular because it used LZW data compression, which was more efficient than the run-length encoding that formats such as PCX and MacPaint used, and fairly large images could therefore be downloaded in a reasonably short time, even with very slow modems.
The original version of the GIF format was called 87a. In 1989, CompuServe devised an enhanced version, called 89a, that added support for multiple images in a stream, interlacing and storage of application-specific metadata. The two versions can be distinguished by looking at the first six bytes of the file, which, when interpreted as ASCII, read “GIF87a” and “GIF89a”, respectively.
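The version check described above can be sketched in a few lines of Python. The function name `gif_version` is hypothetical, and the sketch only inspects the six signature bytes; it does not validate the rest of the file.

```python
from typing import Optional


def gif_version(header: bytes) -> Optional[str]:
    """Return '87a' or '89a' for a GIF signature, or None if it is not a GIF."""
    signature = header[:6]  # the first six bytes, read as ASCII
    if signature == b"GIF87a":
        return "87a"
    if signature == b"GIF89a":
        return "89a"
    return None


# Example: a (truncated) GIF89a stream and a non-GIF stream.
print(gif_version(b"GIF89a\x01\x00\x01\x00"))
print(gif_version(b"\x89PNG\r\n\x1a\n"))
```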
The GIF89a feature of storing multiple images in one file, accompanied by control data, is used extensively on the Web to produce simple animations. The optional interlacing feature, which stores image scan lines out of order so that even a partially downloaded image is somewhat recognizable, also helped GIF’s popularity, as a user could abort the download if the image was not what was required.
The creators of the format pronounced GIF with a soft “g”, /ˈdʒɪf/, as in “George”. However, many people pronounce GIF with a hard “g”, /ˈɡɪf/, as in “gift”, reflecting the hard “g” of “Graphics” in the expanded name (Graphics Interchange Format). According to the creator of the GIF format, Steve Wilhite, the pronunciation deliberately echoes that of an American peanut butter brand, Jif, and the employees of CompuServe would often say “Choosy developers choose GIF”, spoofing this brand’s television commercials. This pronunciation was also identified by CompuServe in their documentation of a graphics display program called CompuShow. Both pronunciations are given as correct by the Oxford English Dictionary and the American Heritage Dictionary.
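As an illustration of the out-of-order storage, the sketch below lists the order in which rows are stored for an interlaced image. The four-pass pattern (every 8th row from row 0, every 8th row from row 4, every 4th row from row 2, every 2nd row from row 1) is taken from the GIF89a specification rather than from this text, and the function name is hypothetical.

```python
from typing import List


def gif_interlace_order(height: int) -> List[int]:
    """Return image row numbers in the order an interlaced GIF stores them."""
    passes = [(0, 8), (4, 8), (2, 4), (1, 2)]  # (first row, row step) per pass
    order: List[int] = []
    for start, step in passes:
        order.extend(range(start, height, step))
    return order


# After the first pass alone, one row in eight is already available,
# which is why a partial download gives a coarse preview.
print(gif_interlace_order(8))
```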
- GIFs are suitable for sharp-edged line art (such as logos) with a limited number of colors. This takes advantage of the format’s lossless compression, which favors flat areas of uniform color with well defined edges (in contrast to JPEG, which favors smooth gradients and softer images).
- GIFs can also be used to store low-color sprite data for games.
- GIFs can be used for small animations and low-resolution film clips.
- In view of the general limitation on the GIF image palette to 256 colors, it is not usually used as a format for digital photography. Digital photographers use image file formats capable of reproducing a greater range of colors, such as TIFF, RAW or the lossy JPEG, which is more suitable for compressing photographs.
- The PNG format is a popular alternative to GIF images since it uses better compression techniques and does not have a limit of 256 colors, but PNGs do not support animations. The MNG and APNG formats, both derived from PNG, support animations, but aren’t widely used.
GIF is palette based: although any palette selection can be one of millions of shades, the maximum number that can be used in a frame is 256. These are stored in a “palette”, a table that associates each palette selection number with a specific RGB value. The limitation to 256 colors seemed reasonable at the time of GIF’s creation because few people could afford the hardware to display more. Simple graphics, line drawings, cartoons, and grey-scale photographs typically need fewer than 256 colors. In addition, one of the colors in the palette can optionally be set as fully transparent. A transparent pixel takes on the color of the pixel in the same position in the background, which may have been determined by a previous frame of animation.
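The palette lookup and the transparency rule can be sketched as follows. This is a minimal illustration on flat lists of pixels, not a GIF decoder; the function name and the way the background is passed in are assumptions made for the example.

```python
from typing import List, Optional, Tuple

RGB = Tuple[int, int, int]


def render_indexed(indices: List[int], palette: List[RGB],
                   background: List[RGB],
                   transparent_index: Optional[int] = None) -> List[RGB]:
    """Map palette indices to RGB values, letting the background show
    through wherever the pixel uses the transparent palette entry."""
    out: List[RGB] = []
    for idx, bg in zip(indices, background):
        if idx == transparent_index:
            out.append(bg)            # keep the pixel already underneath
        else:
            out.append(palette[idx])  # ordinary palette lookup
    return out


palette = [(0, 0, 0), (255, 0, 0)]   # entry 0 is treated as transparent below
background = [(9, 9, 9)] * 3         # e.g. left over from a previous frame
print(render_indexed([0, 1, 0], palette, background, transparent_index=0))
```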
There exist ways to dither or diffuse photographs by using pixels of 2 or more different colors to approximate an in-between color, but this transformation inevitably loses some detail. The algorithms used to select the palette and to perform the dithering vary widely in output quality. Additionally, dithering significantly reduces the image’s compressibility and thus works contrary to GIF’s main purpose.
In the early days of graphical web browsers, graphics cards with 8-bit buffers (allowing only 256 colors) were common, and GIF images were often made using the web-safe palette, which was based on the common subset of the standard Windows and Macintosh palettes. This ensured predictable display but severely limited the choice of colors. Now that 24-bit graphics cards are the norm, optimized palettes make less sense when creating images, though some web designers still advise the use of the web-safe palette.
There are at least two rarely-used methods that can generate a GIF that, if decoded according to the GIF89a standard, will produce an animation that ends with a 24-bit RGB truecolor image.
GIF89a was designed based on the principle of rendering images (known as frames when used for animation) to a logical, fixed-size screen. Each image could optionally have no delay after it is rendered, and could have its own 256-color palette. Also, each image need not fill the entire logical screen, and the animation can cease after the last frame; it need not begin again. The multi-frame, zero-delay, and unique-palette features, optionally combined with transparency, allow for each image to replace only a portion of the previous image’s pixel data. When used without looping, a more-than-256-color final result can be achieved.
For example, a GIF can be encoded to render as a series of overlapping full-screen images, each image filling in color that wasn’t in the previous one. Transparent pixels can be used to preserve colors from previous images.
These methods are not widely supported by GIF-generating software, and Web browsers and other image viewers may not contain completely compliant GIF89a implementations, so their ability to display such GIFs accurately may be limited.
In computing, JPEG (pronounced JAY-peg; IPA: /ˈdʒeɪpɛg/) is a commonly used method of compression for photographic images. The degree of compression can be adjusted, allowing a selectable tradeoff between storage size and image quality. JPEG typically achieves 10:1 compression with little perceptible loss in image quality.
JPEG compression is used in a number of image file formats. JPEG/Exif is the most common image format used by digital cameras and other photographic image capture devices; along with JPEG/JFIF, it is the most common format for storing and transmitting photographic images on the World Wide Web. These format variations are often not distinguished, and are simply called JPEG.
The name “JPEG” stands for Joint Photographic Experts Group, the name of the committee that created the standard. The group was organized in 1986, issuing a standard in 1992, which was approved in 1994 as ISO 10918-1. JPEG is distinct from MPEG (Moving Picture Experts Group), which produces compression schemes for video.
The JPEG compression algorithm is at its best on photographs and paintings of realistic scenes with smooth variations of tone and color. For web usage, where the bandwidth used by an image is important, JPEG is very popular. JPEG/Exif is also the most common format saved by digital cameras.
On the other hand, JPEG is not as well suited for line drawings and other textual or iconic graphics, where the sharp contrasts between adjacent pixels cause noticeable artifacts. Such images are better saved in a lossless graphics format such as TIFF, GIF, PNG, or a raw image format. JPEG is also not well suited to files that will undergo multiple edits, as some image quality will usually be lost each time the image is decompressed and recompressed (generation loss). To avoid this, an image that is being modified or may be modified in the future can be saved in a lossless format such as PNG, and a copy exported as JPEG for distribution.
As JPEG is a lossy compression method (it removes information from the image), it should not be used in astronomical or medical imaging or for other purposes where exact reproduction of the data is required. Lossless formats such as PNG should be used instead.
The compression method is usually lossy, meaning that some visual quality is lost in the process and cannot be restored. There are variations on the standard baseline JPEG that are lossless; however, these are not widely supported.
There is also an interlaced “Progressive JPEG” format, in which data is compressed in multiple passes of progressively higher detail. This is ideal for large images that will be displayed while downloading over a slow connection, allowing a reasonable preview after receiving only a portion of the data. However, progressive JPEGs are not as widely supported, and even some software which does support them (such as some versions of Internet Explorer) only displays the image once it has been completely downloaded.
There are also many medical imaging systems that create and process 12-bit JPEG images. The 12-bit JPEG format has been part of the JPEG specification for some time, but again, this format is not as widely supported.
A number of alterations to a JPEG image can be performed losslessly (that is, without recompression and the associated quality loss) as long as the image size is a multiple of 1 MCU block (Minimum Coded Unit), usually 16 pixels in both directions for 4:2:0.
Blocks can be rotated in 90 degree increments, flipped in the horizontal, vertical and diagonal axes and moved about in the image. Not all blocks from the original image need to be used in the modified one.
The top and left of a JPEG image must lie on a block boundary, but the bottom and right need not do so. This limits the possible lossless crop operations, and also what flips and rotates can be performed on an image whose edges do not lie on a block boundary for all channels.
When using lossless cropping, if the bottom or right side of the crop region is not on a block boundary then the rest of the data from the partially used blocks will still be present in the cropped file and can be recovered relatively easily by anyone with a hex editor and an understanding of the format.
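The block-boundary constraints above can be illustrated with a small helper that moves the top-left corner of a requested crop onto the nearest MCU boundary at or before it, leaving the bottom and right edges alone. The function names are hypothetical, and the default MCU size of 16 is the usual 4:2:0 value mentioned above, not a universal constant.

```python
from typing import Tuple


def snap_crop_to_mcu(left: int, top: int, right: int, bottom: int,
                     mcu: int = 16) -> Tuple[int, int, int, int]:
    """Expand a crop rectangle so its top-left corner lies on an MCU boundary.

    The bottom and right edges need not be aligned, so they are unchanged.
    """
    snapped_left = (left // mcu) * mcu   # round down to a block boundary
    snapped_top = (top // mcu) * mcu
    return snapped_left, snapped_top, right, bottom


def is_size_mcu_aligned(width: int, height: int, mcu: int = 16) -> bool:
    """True if both dimensions are whole multiples of the MCU size."""
    return width % mcu == 0 and height % mcu == 0


print(snap_crop_to_mcu(21, 35, 100, 90))
print(is_size_mcu_aligned(640, 480), is_size_mcu_aligned(641, 480))
```

Because the snapped rectangle is larger than the one requested, the extra pixels remain in the file, which is the recoverable-data effect described above.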
It is also possible to transform between baseline and progressive formats without any loss of quality, since the only difference is the order in which the coefficients are placed in the file.
The file format is known as ‘JPEG Interchange Format’ (JIF), as specified in Annex B of the standard. However, this “pure” file format is rarely used, primarily because of the difficulty of programming encoders and decoders that fully implement all aspects of the standard and because of certain shortcomings of the standard:
- Color space definition
- Component sub-sampling registration
- Pixel aspect ratio definition
Several additional standards have evolved to address these issues. The first of these, released in 1992, was JPEG File Interchange Format (or JFIF), followed in recent years by Exchangeable image file format (Exif) and ICC color profiles.
There is some confusion between the original ‘JPEG Interchange Format’ (JIF) and the similarly titled ‘JPEG File Interchange Format’ (JFIF). In some ways JFIF is a cut-down version of the JIF standard in that it specifies certain constraints (such as standard color space), while in other ways it is an extension of JIF due to the standard Application Segment header. The documentation for the original JFIF standard states:
- JPEG File Interchange Format is a minimal file format which enables JPEG bitstreams to be exchanged between a wide variety of platforms and applications. This minimal format does not include any of the advanced features found in the TIFF JPEG specification or any application specific file format. Nor should it, for the only purpose of this simplified format is to allow the exchange of JPEG compressed images.
Image files that employ JPEG compression are commonly called “JPEG files”. Most image capture devices (such as digital cameras) and most image editing software programs that write to a “JPEG file” are actually creating a file in the JFIF and/or Exif format.
Strictly speaking, the JFIF and Exif standards are incompatible because they each specify that their header appears first. In practice, most JPEG files in Exif format contain a small JFIF header that precedes the Exif header. This allows older readers to correctly handle the older format JFIF header, while newer readers also decode the following Exif header.
The most common filename extensions for files employing JPEG compression are .jpg and .jpeg, though .jpe, .jfif and .jif are also used. It is also possible for JPEG data to be embedded in other file types – TIFF encoded files often embed a JPEG image as a thumbnail of the main image.
Many JPEG files embed an ICC color profile (color space). Commonly used color profiles include sRGB and Adobe RGB. Because these color spaces use a non-linear transformation, the dynamic range of an 8-bit JPEG file is about 11 stops. However, many applications are not able to deal with JPEG color profiles and simply ignore them.
A JPEG image contains a sequence of markers, each of which begins with a 0xFF byte followed by a byte indicating what kind of marker it is. Some markers consist of just those two bytes; others are followed by two bytes indicating the length of marker-specific payload data that follows. (The length includes the two bytes for the length, but not the two bytes for the marker.) Some markers are followed by entropy-coded data; the length of such a marker does not include the entropy-coded data.
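The marker layout described above can be walked with a short routine. The sketch below is not a full parser: it stops at the first Start Of Scan marker, ignores fill bytes, and runs on a hand-built byte string rather than a real image. The marker codes used (0xD8 start of image, 0xD9 end of image, 0xD0 to 0xD7 restart, 0x01 TEM, 0xDA start of scan, 0xE0 APP0) come from the JPEG standard rather than from this text, and the function name is hypothetical.

```python
import struct
from typing import List, Tuple

# Markers that consist of just the two marker bytes (no length field).
STANDALONE = {0x01, 0xD8, 0xD9} | set(range(0xD0, 0xD8))
SOS = 0xDA  # entropy-coded data follows this segment


def list_segments(data: bytes) -> List[Tuple[int, bytes]]:
    """Return (marker_type, payload) pairs up to and including the SOS segment."""
    segments: List[Tuple[int, bytes]] = []
    pos = 0
    while pos + 1 < len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected 0xFF at offset {pos}")
        marker = data[pos + 1]
        pos += 2
        if marker in STANDALONE:
            segments.append((marker, b""))
            continue
        (length,) = struct.unpack(">H", data[pos:pos + 2])
        # The length counts its own two bytes but not the marker bytes.
        segments.append((marker, data[pos + 2:pos + length]))
        pos += length
        if marker == SOS:
            break  # what follows is entropy-coded data, not a marker segment
    return segments


sample = (b"\xFF\xD8"                                        # start of image
          + b"\xFF\xE0" + struct.pack(">H", 7) + b"JFIF\x00"  # APP0 segment
          + b"\xFF\xDA" + struct.pack(">H", 3) + b"\x00"      # start of scan
          + b"\x12\x34" + b"\xFF\xD9")                        # scan data, end
for marker, payload in list_segments(sample):
    print(hex(marker), payload)
```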
Within the entropy-coded data, after any 0xFF byte, a 0x00 byte is inserted by the encoder before the next byte, so that there does not appear to be a marker where none is intended. Decoders must skip this 0x00 byte. This technique, called byte stuffing, is only applied to the entropy-coded data, not to marker payload data.
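Byte stuffing and its removal are simple enough to show directly. The two function names are hypothetical; the sketch assumes the input to `unstuff_bytes` contains only stuffed entropy-coded data (restart markers, if present, pass through unchanged).

```python
def stuff_bytes(entropy_data: bytes) -> bytes:
    """Encoder side: insert 0x00 after every 0xFF in entropy-coded data."""
    return entropy_data.replace(b"\xFF", b"\xFF\x00")


def unstuff_bytes(stuffed: bytes) -> bytes:
    """Decoder side: drop the 0x00 that follows each 0xFF."""
    return stuffed.replace(b"\xFF\x00", b"\xFF")


raw = bytes([0x12, 0xFF, 0x00, 0xFF, 0xFF, 0x34])
stuffed = stuff_bytes(raw)
print(stuffed.hex(" "))
print(unstuff_bytes(stuffed) == raw)
```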
There are other Start Of Frame markers that introduce other kinds of JPEG.
Since several vendors might use the same APPn marker type, application-specific markers often begin with a standard or vendor name (e.g., “Exif” or “Adobe”) or some other identifying string.
At a restart marker, block-to-block predictor variables are reset, and the bitstream is synchronized to a byte boundary. Restart markers provide means for recovery after bitstream error. Since the runs of macroblocks between restart markers may be independently decoded, these runs may be decoded in parallel.
Although a JPEG file can be encoded in various ways, most commonly it is done with JFIF encoding. The encoding process consists of several steps:
- The representation of the colors in the image is converted from RGB to YCbCr, consisting of one luma component (Y), representing brightness, and two chroma components, (Cb and Cr), representing color. This step is sometimes skipped.
- The resolution of the chroma data is reduced, usually by a factor of 2. This reflects the fact that the eye is less sensitive to fine color details than to fine brightness details.
- The image is split into blocks of 8×8 pixels, and for each block, each of the Y, Cb, and Cr data undergoes a discrete cosine transform (DCT). A DCT is similar to a Fourier transform in the sense that it produces a kind of spatial frequency spectrum.
- The amplitudes of the frequency components are quantized. Human vision is much more sensitive to small variations in color or brightness over large areas than to the strength of high-frequency brightness variations. Therefore, the magnitudes of the high-frequency components are stored with a lower accuracy than the low-frequency components. The quality setting of the encoder (for example 50 or 95 on a scale of 0–100 in the Independent JPEG Group’s library) affects to what extent the resolution of each frequency component is reduced. If an excessively low quality setting is used, the high-frequency components are discarded altogether.
- The resulting data for all 8×8 blocks is further compressed with a lossless algorithm, a variant of Huffman encoding.
The decoding process reverses these steps. In the remainder of this section, the encoding and decoding processes are described in more detail.
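The DCT step in the list above can be sketched from the standard two-dimensional DCT-II definition for an 8×8 block. This is a direct, unoptimized transcription intended only to show what the transform computes; real encoders use fast factorizations and first shift sample values to be centered on zero, which is omitted here. The function name is hypothetical.

```python
import math
from typing import List


def dct_8x8(block: List[List[float]]) -> List[List[float]]:
    """Two-dimensional DCT-II of an 8x8 block (orthonormal scaling)."""
    def c(k: int) -> float:
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):          # vertical frequency index
        for v in range(8):      # horizontal frequency index
            total = 0.0
            for x in range(8):
                for y in range(8):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out


# A perfectly flat block has energy only in the lowest-frequency coefficient.
flat = [[10.0] * 8 for _ in range(8)]
coeffs = dct_8x8(flat)
print(round(coeffs[0][0], 6))
```

Smooth image regions behave much like the flat block: most of the energy ends up in a few low-frequency coefficients, which is what makes the later quantization step effective.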
Many of the options in the JPEG standard are not commonly used, and as mentioned above, most image software uses the simpler JFIF format when creating a JPEG file, which among other things specifies the encoding method. Here is a brief description of one of the more common methods of encoding when applied to an input that has 24 bits per pixel (eight each of red, green, and blue). This particular option is a lossy data compression method.
First, the image should be converted from RGB into a different color space called YCbCr. It has three components Y, Cb and Cr: the Y component represents the brightness of a pixel, the Cb and Cr components represent the chrominance (split into blue and red components). This is the same color space as used by digital color television as well as digital video including video DVDs, and is similar to the way color is represented in analog PAL video and MAC but not by analog NTSC, which uses the YIQ color space. The YCbCr color space conversion allows greater compression without a significant effect on perceptual image quality (or greater perceptual image quality for the same compression). The compression is more efficient as the brightness information, which is more important to the eventual perceptual quality of the image, is confined to a single channel, more closely representing the human visual system.
This conversion to YCbCr is specified in the JFIF standard, and should be performed for the resulting JPEG file to have maximum compatibility. However, some JPEG implementations in “highest quality” mode do not apply this step and instead keep the color information in the RGB color model, where the image is stored in separate channels for red, green and blue luminance. This results in less efficient compression, and would not likely be used if file size was an issue.
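A minimal sketch of the conversion for a single 8-bit pixel follows. The coefficients are the full-range ones given in the JFIF specification (they are not quoted in this text), and the rounding and clamping policy is an assumption of the example.

```python
from typing import Tuple


def rgb_to_ycbcr(r: int, g: int, b: int) -> Tuple[int, int, int]:
    """Convert one 8-bit RGB pixel to 8-bit YCbCr (JFIF full-range formulas)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b

    def clamp(value: float) -> int:
        return max(0, min(255, round(value)))

    return clamp(y), clamp(cb), clamp(cr)


# Neutral colors carry no chroma: Cb and Cr sit at the midpoint, 128.
print(rgb_to_ycbcr(255, 255, 255))
print(rgb_to_ycbcr(0, 0, 0))
```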
Due to the densities of color- and brightness-sensitive receptors in the human eye, humans can see considerably more fine detail in the brightness of an image (the Y component) than in the color of an image (the Cb and Cr components). Using this knowledge, encoders can be designed to compress images more efficiently.
The transformation into the YCbCr color model enables the next step, which is to reduce the spatial resolution of the Cb and Cr components (called “downsampling” or “chroma subsampling“). The ratios at which the downsampling can be done on JPEG are 4:4:4 (no downsampling), 4:2:2 (reduce by factor of 2 in horizontal direction), and most commonly 4:2:0 (reduce by factor of 2 in horizontal and vertical directions). For the rest of the compression process, Y, Cb and Cr are processed separately and in a very similar manner.
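The 4:2:0 case can be sketched as averaging each 2×2 group of chroma samples into one. Averaging is only one possible downsampling filter and is an assumption of this example, as is the requirement that the plane have even width and height; the function name is hypothetical.

```python
from typing import List


def subsample_420(plane: List[List[int]]) -> List[List[int]]:
    """Halve a chroma plane horizontally and vertically by 2x2 averaging.

    Assumes the plane has even width and height.
    """
    height, width = len(plane), len(plane[0])
    return [
        [(plane[y][x] + plane[y][x + 1]
          + plane[y + 1][x] + plane[y + 1][x + 1] + 2) // 4  # rounded mean
         for x in range(0, width, 2)]
        for y in range(0, height, 2)
    ]


cb = [[10, 20, 30, 40],
      [10, 20, 30, 40],
      [50, 50, 60, 60],
      [50, 50, 60, 60]]
print(subsample_420(cb))
```

Applied to both Cb and Cr, this leaves each chroma plane with a quarter of its original samples, while the Y plane is kept at full resolution.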
After subsampling, each channel must be split into 8×8 blocks of pixels. If the data for a channel does not represent an integer number of blocks then the encoder must fill the remaining area of the incomplete blocks with some form of dummy data. Filling the edge pixels with a fixed color (typically black) creates ringing artifacts along the visible part of the border; repeating the edge pixels is a common technique that reduces the visible border, but it can still create artifacts.
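The edge-repetition technique mentioned above can be sketched as follows: the last column and last row are repeated until both dimensions are multiples of the block size. The function name is hypothetical and the sketch works on a plane stored as a list of rows.

```python
from typing import List


def pad_to_blocks(plane: List[List[int]], block: int = 8) -> List[List[int]]:
    """Pad a channel to a whole number of blocks by repeating edge pixels."""
    height, width = len(plane), len(plane[0])
    padded_width = -(-width // block) * block    # round up to a multiple
    padded_height = -(-height // block) * block
    rows = [row + [row[-1]] * (padded_width - width) for row in plane]
    rows += [list(rows[-1]) for _ in range(padded_height - height)]
    return rows


small = [[1, 2, 3, 4, 5],
         [6, 7, 8, 9, 10],
         [11, 12, 13, 14, 15]]
padded = pad_to_blocks(small)
print(len(padded), len(padded[0]))
print(padded[0])
```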
The resulting compression ratio can be varied according to need by being more or less aggressive in the divisors used in the quantization phase. Ten to one compression usually results in an image that cannot be distinguished by eye from the original. 100 to one compression is usually possible, but will look distinctly artifacted compared to the original. The appropriate level of compression depends on the use to which the image will be put.
Those who use the World Wide Web may be familiar with the irregularities known as compression artifacts (commonly known as ‘jaggies’) that appear in JPEG images. These are due to the quantization step of the JPEG algorithm. They are especially noticeable around sharp corners between contrasting colours (text is a good example as it contains many such corners). They can be reduced by choosing a lower level of compression; they may be eliminated by saving an image using a lossless file format, though for photographic images this will usually result in a larger file size. Compression artifacts make low-quality JPEGs unacceptable for storing heightmaps. The images created with ray-tracing programs have noticeable blocky shapes on the terrain.
Some programs allow the user to vary the amount by which individual blocks are compressed. Stronger compression is applied to areas of the image that show fewer artifacts. This way it is possible to manually reduce JPEG file size with less loss of quality.
Since the quantization stage always results in a loss of information, the baseline JPEG method is always lossy. Information is lost both in the quantization itself and in the rounding of floating-point values to integers: even if the quantization matrix is a matrix of ones, information will still be lost in the rounding step.
JPEG compression artifacts blend well into photographs with detailed non-uniform textures, allowing higher compression ratios. Notice how a higher compression ratio first affects the high-frequency textures in the upper-left corner of the image, and how the contrasting lines become fuzzier. A very high compression ratio severely affects the quality of the image, although the overall colors and image form are still recognizable. However, the precision of colors suffers less (to the human eye) than the precision of contours (based on luminance). This justifies first transforming images into a color model that separates luminance from chromatic information, and then subsampling the chromatic planes (which may also use coarser quantization), so that more bits are left to preserve the precision of the luminance plane.
For reference, the uncompressed 24-bit RGB bitmap image below (73,242 pixels) would require 219,726 bytes (excluding all other information headers). The file sizes indicated below include the internal JPEG information headers and some metadata. For full quality images (Q=100), about 8.25 bits per color pixel are required. For grayscale images, a minimum of 6.5 bits per pixel is enough (comparable Q=100 color information requires about 25% more encoded bits). The full quality image below (Q=100) is encoded at 9 bits per color pixel; the medium quality image (Q=25) uses 1 bit per color pixel. For most applications, the quality factor should not go below 0.75 bit per pixel (Q=12.5), as demonstrated by the low quality image. The image at lowest quality uses only 0.13 bit per pixel and displays very poor color; it would only be usable after subsampling to a much lower display size.
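The effect of quantization and rounding can be seen on a handful of coefficients. The coefficient values and the divisors below are illustrative numbers chosen for this example, not data from any particular image or quantization table, and the function names are hypothetical.

```python
from typing import List


def quantize(coeffs: List[float], divisors: List[int]) -> List[int]:
    """Divide each coefficient by its divisor and round to an integer."""
    return [round(c / q) for c, q in zip(coeffs, divisors)]


def dequantize(levels: List[int], divisors: List[int]) -> List[int]:
    """Decoder side: multiply the stored integers back by the divisors."""
    return [level * q for level, q in zip(levels, divisors)]


coeffs = [-415.38, -30.19, -61.20, 27.24]   # illustrative DCT coefficients
ones = [1, 1, 1, 1]                         # "matrix of ones": rounding only
coarse = [16, 11, 10, 16]                   # illustrative larger divisors

print(dequantize(quantize(coeffs, ones), ones))
print(dequantize(quantize(coeffs, coarse), coarse))
```

With divisors of one, only the fractional parts are lost; with larger divisors, the reconstructed values drift further from the originals, which is the trade of accuracy for smaller stored integers.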
The medium quality photo uses only 4.3% of the storage space but has little noticeable loss of detail or visible artifacts. However, once a certain threshold of compression is passed, compressed images show increasingly visible defects. See the article on rate distortion theory for a mathematical explanation of this threshold effect.
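The size figures above follow from simple arithmetic, sketched below with hypothetical helper names. The bits-per-pixel values quoted in the text are rounded and the quoted file sizes include headers and metadata, so the computed share for the medium quality image (about 4.2%) is only expected to be close to the 4.3% stated above, not identical.

```python
PIXELS = 73_242  # pixel count of the example image


def raw_rgb_bytes(pixels: int) -> int:
    """Uncompressed size of a 24-bit RGB image: three bytes per pixel."""
    return pixels * 3


def encoded_bytes(pixels: int, bits_per_pixel: float) -> float:
    """Approximate payload size for a given average bit rate."""
    return pixels * bits_per_pixel / 8


raw = raw_rgb_bytes(PIXELS)
medium = encoded_bytes(PIXELS, 1.0)   # medium quality: about 1 bit per pixel
print(raw)
print(round(medium), f"{medium / raw:.1%}")
```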
Between 2004 and 2008, new research emerged on ways to further compress the data contained in JPEG images without modifying the represented image. This has applications in scenarios where the original image is only available in JPEG format, and its size needs to be reduced for archival or transmission. Standard general-purpose compression tools cannot significantly compress JPEG files.
Typically, such schemes take advantage of improvements to the naive scheme for coding DCT coefficients, which fails to take into account:
- Correlations between magnitudes of adjacent coefficients in the same block;
- Correlations between magnitudes of the same coefficient in adjacent blocks;
- Correlations between magnitudes of the same coefficient/block in different channels;
- The DC coefficients, when taken together, resemble a downscaled version of the original image multiplied by a scaling factor. Well-known schemes for lossless coding of continuous-tone images can be applied, achieving somewhat better compression than the Huffman coded DPCM used in JPEG.
Some standard but rarely-used options already exist in JPEG to improve the efficiency of coding DCT coefficients: the arithmetic coding option, and the progressive coding option (which produces lower bitrates because values for each coefficient are coded independently, and each coefficient has a significantly different distribution). Modern methods have improved on these techniques by reordering coefficients to group coefficients of larger magnitude together; using adjacent coefficients and blocks to predict new coefficient values; dividing blocks or coefficients up among a small number of independently coded models based on their statistics and adjacent values; and most recently, by decoding blocks, predicting subsequent blocks in the spatial domain, and then encoding these to generate predictions for DCT coefficients.
A freely-available tool called PackJPG is based on the 2007 paper “Improved Redundancy Reduction for JPEG Files.” There are also at least two companies selling proprietary tools with similar capabilities, Infima’s JPACK and Allume’s StuffIt Deluxe, both of which claim to have pending patents on their respective undisclosed technologies.
In 2002 Forgent Networks asserted that it owned and would enforce patent rights on the JPEG technology, arising from a patent that had been filed on October 27, 1986, and granted on October 6, 1987. The announcement created a furor reminiscent of Unisys’ attempts to assert its rights over the GIF image compression standard.
The JPEG committee investigated the patent claims in 2002 and were of the opinion that they were invalidated by prior art. Others also concluded that Forgent did not have a patent that covered JPEG. Nevertheless, between 2002 and 2004 Forgent was able to obtain about US$105 million by licensing their patent to some 30 companies. In April 2004, Forgent sued 31 other companies to enforce further license payments. In July of the same year, a consortium of 21 large computer companies filed a countersuit, with the goal of invalidating the patent. In contrast to other major computer companies such as Sony and Philips, Microsoft launched a major lawsuit against Forgent. In February 2006, the United States Patent and Trademark Office agreed to re-examine Forgent’s JPEG patent at the request of the Public Patent Foundation. On May 26, 2006 the USPTO found the patent invalid based on prior art. The USPTO also found that Forgent knew about the prior art, and did not tell the Patent Office, making any appeal to reinstate the patent highly unlikely to succeed.
The JPEG committee has as one of its explicit goals that their standards (in particular their baseline methods) be implementable without payment of license fees, and they have secured appropriate license rights for their upcoming JPEG 2000 standard from over 20 large organizations.
Beginning in August 2007, another company, Global Patent Holdings, LLC, claimed that its patent (the ’341 patent) is infringed by the downloading of JPEG images on either a website or through e-mail. If not invalidated, this patent could apply to any website that displays JPEG images. The patent emerged in July 2007 following a seven-year reexamination by the U.S. Patent and Trademark Office in which all of the original claims of the patent were revoked, but an additional claim (claim 17) was confirmed.
In its first two lawsuits following the reexamination, both filed in Chicago, Illinois, Global Patent Holdings sued the Green Bay Packers, CDW, Motorola, Apple, Orbitz, Officemax, Caterpillar, Kraft and Peapod as defendants. A third lawsuit was filed on December 5, 2007 in Southern Florida against ADT Security Services, AutoNation, Florida Crystals Corp., HearUSA, MovieTickets.com, Ocwen Financial Corp. and Tire Kingdom, and a fourth lawsuit on January 8, 2008 in Southern Florida against the Boca Raton Resort & Club. A fifth lawsuit was filed against Global Patent Holdings in Nevada. That lawsuit was filed by Zappos.com, Inc., who was allegedly threatened by Global Patent Holdings, and seeks a judicial declaration that the ’341 patent is invalid and not infringed. The patent owner has also used the patent to sue or threaten outspoken critics of broad software patents, including Gregory Aharonian and the anonymous operator of a website blog known as the “Patent Troll Tracker.”
On December 21, 2007, patent lawyer Vernon Francissen of Chicago asked the U.S. Patent and Trademark Office to reexamine the sole remaining claim of the ’341 patent on the basis of new prior art. On March 5, 2008, the U.S. Patent and Trademark Office agreed to reexamine the ’341 patent, finding that the new prior art raised substantial new questions regarding the patent’s validity. In light of the reexamination, the accused infringers in four of the five pending lawsuits have filed motions to suspend (stay) their cases until completion of the U.S. Patent and Trademark Office’s review of the ’341 patent. On April 23, 2008, a judge presiding over the two lawsuits in Chicago, Illinois granted the motions in those cases. On July 22, 2008, the Patent Office issued the first “Office Action” of the second reexamination, finding the claim invalid based on nineteen separate grounds.
- JPEG (lossy and lossless): ITU-T T.81, ISO/IEC IS 10918-1
- JPEG (extensions): ITU-T T.84
- JPEG-LS (lossless, improved): ITU-T T.87, ISO/IEC IS 14495-1
- JBIG (black and white pictures): ITU-T T.82, ISO/IEC IS 11544-1
- JPEG 2000 (successor of JPEG/JPEG-LS): ITU-T T.800, ISO/IEC IS 15444-1
- JPEG-2000 (extensions): ITU-T T.801
- JPEG XR (formerly called HD Photo) undergoing final voting as ISO/IEC 29199-2
Portable Network Graphics (PNG) is a bitmapped image format that employs lossless data compression. PNG was created to improve upon and replace GIF (Graphics Interchange Format) as an image-file format not requiring a patent license. It is pronounced /ˈpɪŋ/ or spelled out as P-N-G. The PNG initialism is optionally recursive, unofficially standing for “PNG’s Not GIF”.
PNG supports palette-based (palettes of 24-bit RGB colors), greyscale or RGB images. PNG was designed for transferring images on the Internet, not professional graphics, and so does not support other color spaces (such as CMYK).
PNG files nearly always use the file extension “PNG” or “png” and are assigned the MIME media type “image/png” (approved October 14, 1996).
The motivation for creating the PNG format came in early 1995 when it came to light that the LZW data compression algorithm used in the GIF format had been patented by Unisys. For more on this controversy see GIF (Unisys and LZW patent enforcement). There were also other problems with the GIF format which made a replacement desirable, notably its limitation to 256 colors at a time when computers capable of displaying far more than 256 colors were becoming common. Although GIF allows for animation, it was decided that PNG should be a single-image format. A companion format called MNG (Multi-image Network Graphics) has been defined for animation.
A precursory discussion thread on the newsgroup “comp.graphics” with the subject “Thoughts on a GIF-replacement file format”, dating back to January 1995, contains many proposals that would later become part of the PNG file format. In that thread Oliver Fromme, author of the popular MS-DOS JPEG viewer QPEG, proposed the name PING, meaning “PING is not GIF”, and also the PNG extension, for the first time.
- October 1, 1996 – Version 1.0 of the PNG specification was released, and later appeared as RFC 2083. It became a W3C Recommendation on October 1, 1996.
- December 31, 1998 – Version 1.1, with some small changes and the addition of three new chunks, was released.
- August 11, 1999 – Version 1.2, adding one extra chunk, was released.
- November 10, 2003 – PNG is now an International Standard (ISO/IEC 15948:2003). This version of PNG differs only slightly from version 1.2 and adds no new chunks.
- March 3, 2004 – ISO/IEC 15948:2004.
After the header come a series of chunks, each of which conveys certain information about the image. Chunks declare themselves as critical or ancillary, and a program encountering an ancillary chunk that it does not understand can safely ignore it. This chunk-based structure is designed to allow the PNG format to be extended while maintaining compatibility with older versions.
The chunks each have a header specifying their size and type. This is immediately followed by the actual data, and then the checksum of the data. Chunks are given a four-letter, case-sensitive ASCII name. The case of the different letters in the name (bit 5 of the numeric value of the character) provides the decoder with some information on the nature of chunks it does not recognize.
The case of the first letter indicates if the chunk is critical or not. If the first letter is uppercase, the chunk is critical; if not, the chunk is ancillary. Critical chunks contain information that is necessary to read the file. If a decoder encounters a critical chunk it does not recognize, it must abort reading the file or supply the user with an appropriate warning.
The case of the second letter indicates if the chunk is “public” (either in the specification or the registry of special purpose public chunks) or “private” (not standardised). Uppercase is public and lowercase is private. This ensures that public and private chunk names can never conflict with each other (although two private chunk names could conflict).
The third letter must be uppercase to conform to the PNG specification. It is reserved for future expansion. Decoders should treat a chunk with a lower case third letter the same as any other unrecognised chunk.
The case of the fourth letter indicates if a chunk is safe to copy by editors that do not recognize it. If lowercase, the chunk may be safely copied regardless of the extent of modifications to the file. If uppercase, it may only be copied if the modifications have not touched any critical chunks.
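The four naming rules above can be sketched in a few lines of Python. This is only an illustration: the function name `chunk_properties` and the dictionary keys are hypothetical, not part of any PNG library.

```python
def chunk_properties(name: str) -> dict:
    """Decode the four property bits carried by the letter case of a chunk name."""
    if len(name) != 4 or not (name.isascii() and name.isalpha()):
        raise ValueError("chunk names are four ASCII letters")
    # Bit 5 (value 0x20) of an ASCII letter is set for lowercase, clear for uppercase.
    lower = [bool(ord(ch) & 0x20) for ch in name]
    return {
        "critical": not lower[0],      # uppercase first letter: critical chunk
        "public": not lower[1],        # uppercase second letter: public chunk
        "reserved_ok": not lower[2],   # third letter must be uppercase
        "safe_to_copy": lower[3],      # lowercase fourth letter: safe to copy
    }


for chunk_name in ("IHDR", "tEXt", "bKGD"):
    print(chunk_name, chunk_properties(chunk_name))
```

A decoder that meets an unknown chunk only needs the first flag to decide whether it can carry on; an editor additionally consults the last flag before copying the chunk into a modified file.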
A decoder must be able to interpret the following critical chunks to read and render a PNG file:
- IHDR must be the first chunk; it contains the header.
- PLTE contains the palette, a list of colors.
- IDAT contains the image, which may be split among multiple IDAT chunks. Doing so increases file size slightly, but makes it possible to generate a PNG in a streaming manner.
- IEND marks the image end.
The PLTE chunk is essential for color type 3 (indexed color). It is optional for color types 2 and 6 (truecolor and truecolor with alpha) and it must not appear for color types 0 and 4 (greyscale and greyscale with alpha).
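As a rough sketch of the chunk layout described above, the following Python code builds a tiny PNG in memory and walks its chunks. The details not stated in this text are taken from the PNG specification and should be read as assumptions here: the 8-byte file signature, the 4-byte big-endian length field, and a CRC-32 computed over the chunk type plus data. The helper names `make_chunk`, `iter_chunks` and `tiny_png` are hypothetical.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # fixed 8-byte file header


def make_chunk(ctype: bytes, data: bytes) -> bytes:
    """Serialize one chunk: length, type, data, then CRC-32 over type + data."""
    crc = zlib.crc32(ctype + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)


def iter_chunks(png: bytes):
    """Yield (type, data) for each chunk, verifying every checksum."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(png):
        length, ctype = struct.unpack(">I4s", png[pos:pos + 8])
        data = png[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", png[pos + 8 + length:pos + 12 + length])
        if crc != zlib.crc32(ctype + data) & 0xFFFFFFFF:
            raise ValueError("bad CRC in chunk %r" % ctype)
        yield ctype, data
        pos += 12 + length


def tiny_png() -> bytes:
    """Build a 1x1, 8-bit greyscale image (a made-up test fixture)."""
    # width, height, bit depth, color type 0, compression, filter, interlace
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x80")  # filter-type byte 0, then one grey pixel
    return (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))


for ctype, data in iter_chunks(tiny_png()):
    print(ctype.decode("ascii"), len(data))
```

The fixture has no PLTE chunk because it uses color type 0, for which a palette must not appear.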
Other image attributes that can be stored in PNG files include gamma values, background color, and textual metadata information. PNG also supports color management through the inclusion of ICC color space profiles.
- bKGD gives the default background color. It is intended for use when there is no better choice available, such as in standalone image viewers (but not web browsers, see below for more details).
- cHRM gives the chromaticity coordinates of the display primaries and white point.
- gAMA specifies gamma.
- hIST can store the histogram, or total amount of each color in the image.
- iCCP is an ICC color profile.
- iTXt contains UTF-8 text, compressed or not, with an optional language tag.
- pHYs holds the intended pixel size and/or aspect ratio of the image.
- sBIT (significant bits) indicates the color-accuracy of the source data.
- sPLT suggests a palette to use if the full range of colors is unavailable.
- sRGB indicates that the standard sRGB color space is used.
- tEXt can store text that can be represented in ISO 8859-1, with one name=value pair for each chunk.
- tIME stores the time that the image was last changed.
- tRNS contains transparency information. For indexed images, it stores alpha channel values for one or more palette entries. For truecolor and greyscale images, it stores a single pixel value that is to be regarded as fully transparent.
- zTXt contains compressed text with the same limits as tEXt.
The lowercase first letter in these chunks indicates that they are ancillary: a decoder does not need them to read the file. The lowercase last letter in some chunks indicates that they are safe to copy, even if the application concerned does not understand them.
PNG images can either use palette-indexed color or be made up of one or more channels (numerical values directly representing quantities about the pixels). When there is more than one channel in an image all channels have the same number of bits allocated per pixel (known as the bit depth of the channel). Although the PNG specification always talks about the bit depth of channels, most software and users generally talk about the total number of bits per pixel (sometimes also referred to as bit depth or color depth). Since multiple channels can affect a single pixel, the number of bits per pixel is often higher than the number of bits per channel, as shown in the illustration below.
The number of channels will depend on whether the image is greyscale or color and whether it has an alpha channel. PNG allows the following combinations of channels:
- indexed (channel containing indexes into a palette of colors)
- greyscale
- greyscale and alpha (level of transparency for each pixel)
- red, green and blue (RGB/truecolor)
- red, green, blue and alpha
With indexed color images, the palette is always stored in RGB at a depth of 8 bits per channel (24 bits per palette entry). The palette must not have more entries than the image bitdepth allows for but it may have fewer (so if an image for example only uses 90 colors there is no need to have palette entries for all 256).
Indexed color PNGs are allowed to have 1, 2, 4 or 8 bits per pixel by the standard; greyscale images with no alpha channel allow for 1, 2, 4, 8 or 16 bits per pixel. Everything else uses a bit depth per channel of either 8 or 16. The combinations this allows are given in the table above. The standard requires that decoders can read all supported color formats but many image editors can only produce a small subset of them.
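The permitted combinations can be tabulated with a short Python sketch. The color type numbers are the ones quoted earlier for the PLTE chunk; the table name `COLOR_TYPES` and the function `bits_per_pixel` are hypothetical helpers written for this illustration.

```python
# color type -> (description, number of channels, allowed bit depths per channel)
COLOR_TYPES = {
    0: ("greyscale", 1, (1, 2, 4, 8, 16)),
    2: ("truecolor (RGB)", 3, (8, 16)),
    3: ("indexed", 1, (1, 2, 4, 8)),
    4: ("greyscale with alpha", 2, (8, 16)),
    6: ("truecolor with alpha (RGBA)", 4, (8, 16)),
}


def bits_per_pixel(color_type: int, bit_depth: int) -> int:
    """Total bits per pixel for a legal color type / bit depth combination."""
    name, channels, depths = COLOR_TYPES[color_type]
    if bit_depth not in depths:
        raise ValueError("%s images do not allow a bit depth of %d" % (name, bit_depth))
    return channels * bit_depth


for ctype, (name, channels, depths) in COLOR_TYPES.items():
    sizes = [bits_per_pixel(ctype, d) for d in depths]
    print("%-28s bits per pixel: %s" % (name, sizes))
```

The largest case is truecolor with alpha at 16 bits per channel, which needs 64 bits per pixel.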
PNG offers a variety of transparency options. With truecolor and greyscale images either a single pixel value can be declared as transparent or an alpha channel can be added. For paletted images, alpha values can be added to palette entries. The number of such values stored may be less than the total number of palette entries, in which case the remaining entries are considered fully opaque.
The scanning of pixel values for binary transparency is supposed to be performed before any color reduction to avoid pixels becoming unintentionally transparent. This is most likely to pose an issue for systems that can decode 16 bits per channel images (as they must be compliant with the specification) but only output at 8 bits per channel (the norm for all but the highest end systems).
PNG uses a non-patented lossless data compression method known as DEFLATE, which is the same algorithm used in the zlib compression library. This method is combined with prediction, where for each image line, a filter method is chosen that predicts the color of each pixel based on the colors of previous pixels and subtracts the predicted color of the pixel from the actual color. An image line filtered in this way is often more compressible than the raw image line would be, especially if it is similar to the line above (since deflate has no understanding that an image is a 2D entity, and instead just sees the image data as a stream of bytes). Compression is further improved by choosing filter methods adaptively on a line-by-line basis. This improvement, and a heuristic method of implementing it commonly used by PNG-writing software, were created by Lee Daniel Crocker, who tested the methods on many images during the creation of the format.
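To make the prediction step concrete, here is a minimal Python sketch of one of the PNG filter types, the “Sub” filter, which predicts each byte from the byte to its left. It is an illustration under simplifying assumptions: the image is synthetic, one byte per pixel is assumed, and the filter-type byte that a real PNG stream puts in front of every line is left out. The function names are hypothetical.

```python
import zlib


def sub_filter(row: bytes, bpp: int = 1) -> bytes:
    """Replace each byte by its difference (mod 256) from the byte bpp positions to its left."""
    return bytes((row[i] - (row[i - bpp] if i >= bpp else 0)) & 0xFF
                 for i in range(len(row)))


def sub_unfilter(filtered: bytes, bpp: int = 1) -> bytes:
    """Invert sub_filter by adding the already reconstructed left neighbour back."""
    out = bytearray()
    for i, value in enumerate(filtered):
        left = out[i - bpp] if i >= bpp else 0
        out.append((value + left) & 0xFF)
    return bytes(out)


# Synthetic 256x64 greyscale image: line y is a ramp with slope y + 1.
rows = [bytes((x * (y + 1)) & 0xFF for x in range(256)) for y in range(64)]
raw = b"".join(rows)
filtered = b"".join(sub_filter(row) for row in rows)

raw_size = len(zlib.compress(raw, 9))
filtered_size = len(zlib.compress(filtered, 9))
print("unfiltered:", raw_size, "bytes; Sub-filtered:", filtered_size, "bytes")
```

After filtering, every line of this test image is almost constant, which DEFLATE handles far better than the original ramps. Real encoders choose among several filter types per line rather than always using Sub.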
PNG offers an optional 2-dimensional, 7-pass interlacing scheme – the Adam7 algorithm. This is more sophisticated than GIF’s 1-dimensional, 4-pass scheme, and allows a clearer low-resolution image to be visible earlier in the transfer. However, as a 7-pass scheme, it tends to reduce the data’s compressibility more than simpler schemes.
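The seven passes can be illustrated with a short Python sketch. The starting offsets and step sizes in the `ADAM7` table below come from the PNG specification rather than from this text, and the function name `adam7_passes` is hypothetical.

```python
# (x_start, y_start, x_step, y_step) for passes 1 to 7
ADAM7 = [
    (0, 0, 8, 8),
    (4, 0, 8, 8),
    (0, 4, 4, 8),
    (2, 0, 4, 4),
    (0, 2, 2, 4),
    (1, 0, 2, 2),
    (0, 1, 1, 2),
]


def adam7_passes(width: int, height: int) -> list:
    """For each pass, list the (x, y) pixel coordinates transmitted in that pass."""
    return [
        [(x, y) for y in range(y0, height, dy) for x in range(x0, width, dx)]
        for x0, y0, dx, dy in ADAM7
    ]


passes = adam7_passes(8, 8)
print([len(p) for p in passes])
```

For a single 8×8 block each pass sends twice as many pixels as the one before it (apart from the first two), so a coarse version of the whole image arrives after only a small fraction of the data.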
PNG itself does not support animation at all. MNG is an extension to PNG that does; it was designed by members of the PNG Group. MNG shares PNG’s basic structure and chunks, but it is significantly more complex and has a different file signature, which automatically renders it incompatible with standard PNG decoders.
The complexity of MNG led to the proposal of APNG by developers of the Mozilla Foundation. It is based on PNG, supports animation and is simpler than MNG. APNG offers fallback to single-image display for PNG decoders that do not support APNG. However, neither of these formats is currently widely supported. APNG is supported in Firefox 3.0 and Opera 9.5. The PNG Group decided in April 2007 not to embrace APNG. Several alternatives are under discussion: ANG, aNIM/mPNG, PNG in GIF and its subset RGBA in GIF.
Adobe Fireworks (formerly by Macromedia) uses PNG as its native file format, allowing other image editors and preview utilities to view the flattened image. However, Fireworks by default also stores metadata for layers, animation, vector data, text and effects. Such files should not be distributed directly. Fireworks can instead export the image as an optimized PNG without the extra metadata for use on web pages, etc.
Other popular graphics programs which support the PNG format include Adobe Photoshop, Corel Photo-Paint, Corel PaintShop Pro, The GIMP, GraphicConverter, Helicon Filter, Inkscape, Jasc Paint Shop Pro (Corel), Pixel image editor, Paint.NET and Xara. Some programs bundled with popular operating systems which support PNG include Microsoft’s Paint and Apple’s iPhoto and Preview.
Some image processing programs have PNG compression problems, mainly related to lack of full implementation of the PNG compressor library. These include:
- Microsoft’s Paint for Windows XP
- Microsoft Picture It! Photo Premium 9
- older versions of Adobe Photoshop.
Adobe’s Fireworks is sometimes placed in this category, but its difficulties are less severe than the other entries. The confusion stems from a misunderstanding of the mechanics of its Save format: though PNGs, the intermediate images produced by that option include large, private chunks containing complete layer and vector information, which allows further, lossless editing. Properly saved with the Export option, Fireworks’ PNGs are competitive with those produced by other image editors, but are no longer editable as anything but flattened bitmaps. Fireworks is unable to save size-optimized vector-editable PNGs.
GIF remains in wider use than PNG because of a few limitations of PNG:
- No support on old browsers (such as Internet Explorer below version 4).
- No animation, still images only (unlike GIF, though Mozilla’s unofficial APNG format is a potential solution).
PNG compatible browsers include: Apple Safari, Google Chrome, Mozilla Firefox, Opera, Camino, Internet Explorer 7, and many others. For the complete comparison, see Comparison of web browsers (Image format support).
However, Internet Explorer (Windows), before version 7, has a fair share of issues, which prevent it from using PNG to its full potential.
- 4.0 crashes on large PNG chunks.
- 4.0 does not include the functionality to view .png files, but there is a registry fix.
- 5.0 and 5.01 have broken OBJECT support.
- 5.01 prints palette images with black (or dark gray) backgrounds under Windows 98, sometimes with radically altered colors.
- 6.0 fails to display PNG images of 4097 or 4098 bytes in size.
- 6.0 cannot open a PNG file that contains one or more zero-length IDAT chunks.
- 6.0 sometimes completely loses the ability to display PNGs; there are various fixes.
- 6.0 and below fails to display alpha-channel of PNG images used as CSS background.
- 6.0 and below has inconsistent/broken gamma support.
- 6.0 and below has no ICC-profile (iCCP) support.
- 6.0 and below has no color-correction support.
- 6.0 and below has progressive display of interlaced images (replicating method).
- 6.0 and below has broken alpha-channel transparency support (will display the default background color instead). However there are various fixes:
- webfx – PNG Behavior (IE behavior/.htc)
- The PNG problem in Windows Internet Explorer (IE behavior/.htc) (unmaintained)
- TwinHelix – Near-native PNG support with alpha opacity to IE 5.5 and 6 (IE behavior/.htc)
- A Better IE 5.5 and 6 PNG Fix (supports CSS background-position, background-repeat) (IE behavior/.htc)
- PNG-24 Alpha Transparency With Microsoft Internet Explorer or better (MSIE 5.5+) (PHP)
- Cross Browser PNG Transparency (CSS)
- CSS PNG fix (with background call none fix) (CSS)
- SitePoint – Use 8-bit PNGs
- Upgrade to IE7, which has full transparency support.
PNG icons have been supported in most distributions of Linux since at least 1999, in desktop environments such as GNOME. In 2006, Microsoft introduced PNG icons into Windows with the release of Windows Vista. PNG icons are supported in Mac OS X as well. Another operating system with third-party PNG icon support is AmigaOS 3/4 (and its clones – MorphOS and AROS).
Generally, PNG files without unnecessary metadata should have a smaller file size than the identical image encoded in GIF format. PNG gives the image creator far more flexibility than GIF, but care must be taken to avoid PNG files that are needlessly large.
As GIF is limited to 256 colors, many image editors will automatically reduce the color depth when saving an image in GIF format. Often when people save the same truecolor image as PNG and GIF, they see that the GIF is smaller, and do not realise it is possible to create a 256 color PNG that has identical quality to the GIF with a smaller file size. This leads to the misconception that PNG files are larger than equivalent GIF files.
In practice, well optimized and compressed PNG files are usually much smaller than GIF files (10 % to 50 % savings) for filesizes above about 100 bytes.
Some versions of Adobe Photoshop, CorelDRAW and MS Paint provide poor PNG compression effort, further fueling the idea that PNG is larger than GIF. Many graphics programs (such as Apple’s Preview software) save PNGs with large amounts of metadata and color-correction data that are generally unnecessary for Web viewing. Unoptimized PNG files from Adobe Fireworks are also notorious for this.
Adobe Photoshop’s performance on PNG files is much improved in the CS Suite when using the Save For Web feature (which also allows explicit PNG/8 use).
Various tools are available for optimizing PNG files. OptiPNG and pngcrush are both open-source software optimizers that run from a Unix command line or a Windows Command Prompt, and effectively reduce the size of PNG files. littleutils is another open-source package that contains a wrapper script called opt-png. This script uses pngcrush and a variant of pngrewrite to reduce bit depth when possible, reducing PNG file sizes further. Perl scripts can use Image-Pngslimmer, which allows some dynamic optimization.
Other tools such as AdvanceCOMP and Ken Silverman’s PNGOUT can reduce the file size further still, giving the competent user the smallest file size possible for a given PNG image. The current version of IrfanView can use PNGOUT as an external plug-in, and the screenshots show PNGOUT’s save options.
pngcrush and PNGOUT have the extra ability to remove all color correction data from PNG files (gamma, white balance, ICC color profile, standard RGB color profile). This often results in much smaller file sizes. The following command line options achieve this with pngcrush:
pngcrush -rem gAMA -rem cHRM -rem iCCP -rem sRGB InputFile.png OutputFile.png
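For illustration only, the chunk-removal part of that command can be approximated in Python by copying every chunk except the four color-correction ones. This sketch is not how pngcrush works internally, and unlike pngcrush it does not recompress the image data. It assumes the chunk layout from the PNG specification (4-byte big-endian length, 4-byte type, data, 4-byte CRC); `strip_chunks` and `make_chunk` are hypothetical names.

```python
import struct
import zlib

SIGNATURE = b"\x89PNG\r\n\x1a\n"
COLOR_CHUNKS = frozenset({b"gAMA", b"cHRM", b"iCCP", b"sRGB"})


def strip_chunks(png: bytes, unwanted=COLOR_CHUNKS) -> bytes:
    """Return a copy of the PNG data without the chunks whose type is in `unwanted`."""
    if not png.startswith(SIGNATURE):
        raise ValueError("not a PNG file")
    kept = [SIGNATURE]
    pos = len(SIGNATURE)
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        ctype = png[pos + 4:pos + 8]
        end = pos + 12 + length          # length + type + data + CRC
        if ctype not in unwanted:
            kept.append(png[pos:end])    # copy the chunk byte for byte
        pos = end
    return b"".join(kept)


def make_chunk(ctype: bytes, data: bytes) -> bytes:
    """Helper for the demo below: serialize one chunk with its CRC-32."""
    crc = zlib.crc32(ctype + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)


# Demo on a made-up 1x1 greyscale image that carries a gAMA chunk.
demo = (SIGNATURE
        + make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
        + make_chunk(b"gAMA", struct.pack(">I", 45455))
        + make_chunk(b"IDAT", zlib.compress(b"\x00\x80"))
        + make_chunk(b"IEND", b""))
stripped = strip_chunks(demo)
print(len(demo), "->", len(stripped), "bytes")
```

Because the kept chunks are copied unchanged, their checksums stay valid and no image data is touched.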
There is a GUI front end for OptiPNG, pngcrush and advpng that runs on Mac OS X.
Since Windows Vista icons may contain PNG subimages, the optimizations can be applied to them as well. At least one icon editor, Pixelformer, is able to perform a special optimization pass while saving ICO files, thereby reducing their sizes.
Tagged Image File Format (abbreviated TIFF) is a file format for storing images, including photographs and line art. It is now under the control of Adobe Systems. Originally created by the company Aldus for use with what was then called “desktop publishing”, the TIFF format is widely supported by image-manipulation applications, by publishing and page layout applications, by scanning, faxing, word processing, optical character recognition and other applications. Adobe Systems, which acquired Aldus, now holds the copyright to the TIFF specification. TIFF has not had a major update since 1992, though several Aldus/Adobe technical notes have been published with minor extensions to the format, and several specifications, including TIFF/EP, have been based on the TIFF 6.0 specification.
The phrases “Tagged Image File Format” and “Tag Image File Format” were used as the subtitle to some early versions of the TIFF specification; the current specification, TIFF 6.0, does not use either subtitle phrase; the name is now, simply, “TIFF”.
TIFF was originally created as an attempt to get desktop scanner vendors of the mid-1980s to agree on a common scanned image file format, rather than have each company promote its own proprietary format. In the beginning, TIFF was only a binary image format (only two possible values for each pixel), since that was all that desktop scanners could handle. As scanners became more powerful, and as desktop computer disk space became more plentiful, TIFF grew to accommodate grayscale images, then color images. Today, TIFF is a popular format for high-color-depth images, along with JPEG and PNG. Adobe Systems, which acquired the PageMaker publishing program from Aldus, now controls the TIFF specification.
The TIFF is a flexible, adaptable file format for handling images and data within a single file, by including the header tags (size, definition, image-data arrangement, applied image compression) defining the image’s geometry. For example, a TIFF can be a container file holding compressed JPEG and RLE (run-length encoding) images. A TIFF also can include a vector-based Clipping path (outlines, croppings, image frames). The ability to store image data in a lossless format makes the TIFF file a useful image archive, because, unlike standard JPEG files, the TIFF using lossless compression (or none) may be edited and re-saved without losing image quality; other TIFF options are layers and pages.
Although TIFF is now an accepted standard format, its extensibility provoked compatibility problems when it was introduced. Programmers were free to specify new tags and options, but not every program supported every tag created. As a result, TIFF became a lowest-common-denominator image file format. Today, most TIFF images and readers remain based upon uncompressed 32-bit CMYK or 24-bit RGB images.
The TIFF offers the option of using LZW compression, a lossless data-compression technique for reducing a file’s size. Until 2004, use of this option was limited because the LZW technique was covered by several patents; these patents have since expired.
Every TIFF begins with a 2-byte indicator of byte order: “II” for little-endian and “MM” for big-endian byte ordering. The next 2 bytes represent the number 42, selected “for its deep philosophical significance”. How the 42 is read depends on the byte order given by the 2-byte indicator. All words, double words, etc., in the TIFF file are read according to the indicated byte order. The TIFF 6.0 Specification (Section 7: Additional baseline TIFF Requirements) says that compliant TIFF readers must support both byte orders (II and MM); TIFF writers, however, may choose the byte order convenient for their image. The image-processing community’s joke about the early TIFF’s consistency problems is that the name stands for “Thousands of Incompatible File Formats”.
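A reader that supports both byte orders might start along the lines of the following Python sketch. The byte-order mark and the number 42 are as described above; the 4-byte offset of the first image file directory that follows them is taken from the TIFF 6.0 specification and is an assumption as far as this text is concerned. The function name `read_tiff_header` is hypothetical.

```python
import struct


def read_tiff_header(data: bytes):
    """Return (struct byte-order prefix, offset of the first image file directory)."""
    order = data[:2]
    if order == b"II":
        endian = "<"      # little-endian
    elif order == b"MM":
        endian = ">"      # big-endian
    else:
        raise ValueError("missing TIFF byte-order indicator")
    # The magic number and the following offset are read in the indicated byte order.
    magic, first_ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("magic number is %d, expected 42" % magic)
    return endian, first_ifd_offset


print(read_tiff_header(b"II\x2a\x00\x08\x00\x00\x00"))
print(read_tiff_header(b"MM\x00\x2a\x00\x00\x00\x08"))
```

Every later multi-byte value in the file would be unpacked with the same `endian` prefix.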
The TIFF file format uses 32-bit offsets, so each file is limited to 4 gigabytes.
The TIFF format is the standard in document imaging and document management systems using CCITT Group IV 2D compression, which supports black-and-white (bitonal, monochrome) images. In high-volume storage scanning, documents are scanned in black and white (not in colour or in grayscale) to conserve storage capacity. An average A4 scanning produces 30 KB of data at 200 ppi (pixels per inch of resolution) and 50 KB of data at 300 ppi; 300 ppi is more common than 200 ppi.
The TIFF format can save multi-page documents to a single TIFF file rather than a series of files for each scanned page. Multi-page support and 2D compression of bitonal images led to the TIFF’s becoming the standard storage format for facsimiles, especially on Fax Servers.
The inclusion of the SampleFormat tag in TIFF 6.0 allows TIFF files to handle advanced pixel data types, including integer images with more than 8 bits per channel and floating point images. This tag made TIFF 6.0 a viable format for scientific image processing where extended precision is required. An example is the use of TIFF to store images acquired using scientific CCD cameras that provide up to 16 bits per pixel of intensity resolution.
Storing a sequence of images in a single TIFF file is also possible, and is allowed under TIFF 6.0, provided the rules for multi-page images are followed.
Developers can apply for a block of “private tags” to enable them to include their own proprietary information inside a TIFF file without causing problems for file interchange. TIFF readers are required to ignore tags that they do not recognize, and a developer’s private tags are guaranteed not to clash with anyone else’s tags or with the standard set of tags defined in the specification.
The TIFF file format is unusual in comparison to other image formats, in that it is composed of small descriptor blocks containing offsets into the file which point to the actual pixel image data (composed of bands of pixel rows). This means that incorrect offset values can cause programs to attempt to read erroneous portions of the file or attempt to read past the physical end of file. As with most other image file formats, improperly encoded packet or line lengths within the file can cause poorly written rendering programs to overflow their internal buffers. Properly written image rendering programs generally avoid such pitfalls.
Multiple buffer overflows have been found in Libtiff. Some of these have also been used to execute unsigned code on the PlayStation Portable, as well as run third party applications on the iPhone and iPod Touch.
The BMP file format, sometimes called bitmap or DIB file format (for device-independent bitmap), is an image file format used to store bitmap digital images, especially on Microsoft Windows and OS/2 operating systems.
Many graphical user interfaces use bitmaps in their built-in graphics subsystems; for example, the Microsoft Windows and OS/2 platforms’ GDI subsystem, where the specific format used is the Windows and OS/2 bitmap file format, usually named with the file extension of
In uncompressed BMP files, and many other bitmap file formats, image pixels are stored with a color depth of 1, 4, 8, 16, 24, or 32 bits per pixel. Images of 8 bits and fewer can be either grayscale or indexed color. An alpha channel (for transparency) may be stored in a separate file, where it is similar to a grayscale image, or in a fourth channel that converts 24-bit images to 32 bits per pixel.
Uncompressed bitmap files (such as BMP) are typically much larger than compressed (with any of various methods) image file formats for the same image. For example, the 1058×1058 Wikipedia logo, which occupies about 287.65 KB in the PNG format, takes about 3358 KB as a 24-bit BMP file. Uncompressed formats are generally unsuitable for transferring images on the Internet or other slow or capacity-limited media.
The bits representing the bitmap pixels are packed within rows. Depending on the color depth, a pixel in the picture will occupy at least n/8 bytes (n is the bit depth, since 1 byte equals 8 bits). The approximate size for an n-bit ($2^n$ colors) BMP file in bytes can be calculated, including the effect of starting each row on a 32-bit dword boundary, as:
- $\text{RowSize} = 4 \cdot \left\lfloor \frac{n \cdot \text{width} + 31}{32} \right\rfloor$, where the floor function gives the highest integer that is less than or equal to the argument; that is, the number of 32-bit dwords needed to hold a row of n-bit pixels; this value multiplied by 4 gives the byte count.
- $\text{FileSize} \approx 54 + 4 \cdot 2^n + \text{RowSize} \cdot \text{height}$, where height and width are given in pixels.
In the formula above, 54 is the size of the headers in the popular Windows V3 BMP version (14-byte BMP file header plus 40-byte DIB V3 header); some other header versions will be larger or smaller than that, as described in tables below. And $4 \cdot 2^n$ is the size of the color palette; this size is an approximation, as the color palette size will be $3 \cdot 2^n$ bytes in the OS/2 V1 version, and some other versions may optionally define only the number of colors needed by the image, potentially fewer than $2^n$. Only files with 8 or fewer bits per pixel use a palette; for 16-bit (or higher) bitmaps, omit the palette part from the size calculation:
- $\text{FileSize} \approx 54 + \text{RowSize} \cdot \text{height}$
For detailed information, see the sections on file format below.
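The size estimate described above can be written as a small Python sketch. It assumes the Windows V3 layout (54 bytes of headers, four bytes per palette entry, a full palette of 2 to the power n entries); the function names are hypothetical.

```python
def bmp_row_size(width: int, bits_per_pixel: int) -> int:
    """Bytes per stored row: the pixel bits rounded up to a whole number of 32-bit dwords."""
    return (bits_per_pixel * width + 31) // 32 * 4


def bmp_file_size(width: int, height: int, bits_per_pixel: int,
                  header_size: int = 54) -> int:
    """Approximate size in bytes of an uncompressed BMP file."""
    # Only bitmaps with 8 or fewer bits per pixel carry a palette.
    palette = 4 * 2 ** bits_per_pixel if bits_per_pixel <= 8 else 0
    return header_size + palette + bmp_row_size(width, bits_per_pixel) * height


# The 1058x1058, 24-bit example mentioned earlier in this section.
print(bmp_row_size(1058, 24))         # 3176 bytes per row (3174 bytes of pixels plus padding)
print(bmp_file_size(1058, 1058, 24))  # 3360262 bytes
```

The result is slightly above the 1058 × 1058 × 3 = 3,358,092 bytes of raw pixel data, because each row is padded by two bytes and the headers add 54 bytes.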
Microsoft has defined a particular representation of color bitmaps of different color depths, as an aid to exchanging bitmaps between devices and applications with a variety of internal representations. They called these device-independent bitmaps or DIBs, and the file format for them is called DIB file format or BMP file format. According to Microsoft support:
A device-independent bitmap (DIB) is a format used to define device-independent bitmaps in various color resolutions. The main purpose of DIBs is to allow bitmaps to be moved from one device to another (hence, the device-independent part of the name). A DIB is an external format, in contrast to a device-dependent bitmap, which appears in the system as a bitmap object (created by an application…). A DIB is normally transported in metafiles (usually using the StretchDIBits() function), BMP files, and the Clipboard (CF_DIB data format).
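A typical BMP file usually contains the following blocks of data, in this order:
- BMP header: identifies the file.
- DIB header: gives detailed information about the image.
- Color palette: lists the colors used by indexed-color bitmaps.
- Bitmap data: holds the image itself, pixel by pixel.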
The following sections discuss the data stored in the BMP file or DIB in detail. This is the standard BMP file format. Some bitmap images may be stored using a slightly different format, depending on the application that creates them. Also, not all fields are used; a value of 0 will be found in these unused fields.
A BMP file is loaded into memory as a DIB data structure, an important component of the Windows GDI API. The DIB data structure is the same as the BMP file format, but without the 14-byte BMP header.
The BMP header is the block of bytes at the start of the file and is used to identify the file. A typical application reads this block first to ensure that the file is actually a BMP file and that it is not damaged. The first two bytes of the BMP header are the magic number ‘BM’, stored as two ASCII characters in that order. All of the other integer values are stored in little-endian format (i.e. least-significant byte first).
This block of bytes tells the application detailed information about the image, which will be used to display the image on the screen. The block also matches the header used internally by Windows and OS/2 and has several different variants. All of them contain a dword field, specifying their size, so that an application can easily determine which header is used in the image. The reason that there are different headers is that Microsoft extended the DIB format several times. The new extended headers can be used with some GDI functions instead of the older ones, providing more functionality. Since the GDI supports a function for loading bitmap files, typical Windows applications use that functionality. One consequence of this is that for such applications, the BMP formats that they support match the formats supported by the Windows version being run. See the table below for more information.
For compatibility reasons, most applications use the older DIB headers for saving files. With OS/2 being obsolete, for now the only common format is the V3 header. See next table for its description. All values are stored as unsigned integers, unless explicitly noted.
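As a hedged illustration, the following Python sketch reads the 14-byte BMP header and a 40-byte V3 DIB header. The field layout follows Microsoft's BITMAPFILEHEADER and BITMAPINFOHEADER structure definitions rather than anything stated in this text, and the function name and dictionary keys are hypothetical.

```python
import struct


def parse_bmp_headers(data: bytes) -> dict:
    """Parse the BMP file header plus a V3 DIB header from the start of a file."""
    if len(data) < 54 or data[:2] != b"BM":
        raise ValueError("not a BMP file with a V3 header")
    # After the magic number: file size, two reserved words, offset of the pixel data.
    file_size, _reserved1, _reserved2, pixel_offset = struct.unpack("<IHHI", data[2:14])
    # Every DIB header variant starts with a dword giving its own size.
    (dib_size,) = struct.unpack("<I", data[14:18])
    if dib_size != 40:
        raise ValueError("this sketch only handles the 40-byte V3 DIB header")
    (width, height, planes, bits_per_pixel, compression, image_size,
     x_ppm, y_ppm, colors_used, colors_important) = struct.unpack(
        "<iiHHIIiiII", data[18:54])
    return {
        "file_size": file_size,
        "pixel_offset": pixel_offset,
        "width": width,
        "height": abs(height),
        "top_down": height < 0,   # a negative height marks top-to-bottom row order
        "bits_per_pixel": bits_per_pixel,
        "compression": compression,
        "colors_used": colors_used,
    }


# Made-up headers for a 2x2, 24-bit image (two 8-byte rows of pixel data).
sample = b"BM" + struct.pack("<IHHI", 70, 0, 0, 54) + struct.pack(
    "<IiiHHIIiiII", 40, 2, 2, 1, 24, 0, 16, 2835, 2835, 0, 0)
info = parse_bmp_headers(sample)
print(info)
```

Width and height are unpacked as signed integers here because the height can be negative; the remaining fields are unsigned.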
A 32-bit version of DIB with integrated alpha channel has been introduced with Windows XP and is used within its logon and theme system; it has yet to gain wide support in image editing software, but has been supported in Adobe Photoshop since version 7 and Adobe Flash since version MX 2004 (then known as Macromedia Flash).
The palette occurs in the BMP file directly after the BMP header and the DIB header. Therefore, its offset is the size of the BMP header plus the size of the DIB header.
The palette is a block of bytes (a table) listing the colors available for use in a particular indexed-color image. Each pixel in the image is described by a number of bits (1, 4, or 8) which index a single color in this table. The purpose of the color palette in indexed-color bitmaps is to tell the application the actual color that each of these index values corresponds to.
A DIB always uses the RGB color model. In this model, a color is defined in terms of different intensities (from 0 to 255) of the additive primary colors red (R), green (G), and blue (B). A color is thus defined using the 3 values for R, G and B (though stored in reverse order in each palette entry).
The number of entries in the palette is either $2^n$ or a smaller number specified in the header (in the OS/2 V1 format, only the full-size palette is supported). Each entry contains four bytes, except in the case of the OS/2 V1 versions, in which case there are only three bytes per entry. The first (and only for OS/2 V1) three bytes store the values for blue, green, and red, respectively, while the last one is unused and is filled with 0 by most applications.
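A palette stored this way could be decoded as in the following Python sketch, which reorders each blue, green, red entry into the more familiar (red, green, blue) form. The function name `read_palette` is hypothetical.

```python
def read_palette(raw: bytes, entry_size: int = 4) -> list:
    """Decode a BMP palette into (red, green, blue) tuples.

    entry_size is 4 for the usual entries (blue, green, red, unused)
    and 3 for OS/2 V1 entries (blue, green, red).
    """
    if entry_size not in (3, 4):
        raise ValueError("palette entries are 3 or 4 bytes long")
    colors = []
    for i in range(0, len(raw) - entry_size + 1, entry_size):
        blue, green, red = raw[i], raw[i + 1], raw[i + 2]
        colors.append((red, green, blue))
    return colors


# A two-entry palette: pure blue, then white, with the unused fourth byte set to 0.
print(read_palette(bytes([255, 0, 0, 0, 255, 255, 255, 0])))
```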
As mentioned above, the color palette is not used when the bitmap is 16-bit or higher; there are no palette bytes in those BMP files.
This block of bytes describes the image, pixel by pixel. Pixels are stored “upside-down” with respect to normal image raster scan order, starting in the lower left corner, going from left to right, and then row by row from the bottom to the top of the image. Uncompressed Windows bitmaps can also be stored from the top row to the bottom, if the image height value is negative.
In the original DIB, the only four legal numbers of bits per pixel are 1, 4, 8, and 24. In all cases, each row of pixels is extended to a 32-bit (4-byte) boundary, filling with an unspecified value (not necessarily 0) so that the next row will start on a multiple-of-four byte location in memory or in the file. The total number of bytes in a row can be calculated as the image size/bitmap height in pixels. Following these rules there are several ways to store the pixel data depending on the color depth and the compression type of the bitmap.
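The row order and padding rules can be combined into a small address calculation. The Python sketch below is an illustration for uncompressed bitmaps only; `pixel_row_offset` is a hypothetical helper, and the `height` argument is the signed value from the header.

```python
def pixel_row_offset(y: int, width: int, height: int, bits_per_pixel: int) -> int:
    """Byte offset, within the pixel data block, of image row y (0 is the top row)."""
    stride = (bits_per_pixel * width + 31) // 32 * 4   # row length padded to 4 bytes
    rows = abs(height)
    if not 0 <= y < rows:
        raise IndexError("row out of range")
    # Positive height: rows are stored bottom-up. Negative height: top-down.
    stored_row = y if height < 0 else rows - 1 - y
    return stored_row * stride


# A 3-pixel-wide, 24-bit image has 9 bytes of pixels per row, padded to 12.
print(pixel_row_offset(0, 3, 2, 24))    # top row of a bottom-up image is stored last
print(pixel_row_offset(0, 3, -2, 24))   # top row of a top-down image is stored first
```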
One-bit (two-color, for example, black and white) pixel values are stored in each bit, with the first (left-most) pixel in the most-significant bit of the first byte. An unset bit will refer to the first color table entry, and a set bit will refer to the last (second) table entry.
Four-bit color (16 colors) is stored with two pixels per byte, the left-most pixel being in the more significant nibble. Each pixel value is an index into a table of up to 16 colors.
Eight-bit color (256 colors) is stored one pixel value per byte. Each byte is an index into a table of up to 256 colors.
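The three indexed layouts above differ only in how many pixels share a byte, so one routine can unpack all of them. This is an illustrative sketch; unpack_indices is a name chosen for this example, and the row passed in is assumed to be a single uncompressed row that may include padding bytes.

```python
def unpack_indices(row, width_px, bits_per_pixel):
    """Return the palette index of each pixel in one stored row."""
    if bits_per_pixel not in (1, 4, 8):
        raise ValueError("only 1-, 4- and 8-bit rows are indexed")
    pixels_per_byte = 8 // bits_per_pixel
    mask = (1 << bits_per_pixel) - 1
    indices = []
    for x in range(width_px):
        byte = row[x // pixels_per_byte]
        # The left-most pixel sits in the most significant bits of the byte.
        shift = 8 - bits_per_pixel * (x % pixels_per_byte + 1)
        indices.append((byte >> shift) & mask)
    return indices
```

Padding bytes at the end of the row are ignored because the loop stops after width_px pixels.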
The simplicity of the BMP file format, and its widespread familiarity in Windows and elsewhere, as well as the fact that this format is relatively well documented and free of patents, makes it a very common format that image processing programs from many operating systems can read and write.
While most BMP files have a relatively large file size because they are not compressed, many BMP files contain redundant data and can be compressed considerably with lossless data compression algorithms such as ZIP (in some cases down to 0.1% of the original size).
The X Window System uses a similar XBM format for black-and-white images, and XPM (pixmap) for color images. There are also a variety of “raw” formats, which save raw data with no other information. The Portable Pixmap (PPM) and Truevision TGA formats also exist, but are less often used, or are used only for special purposes; for example, TGA can contain transparency information.
6. ADOBE ILLUSTRATOR ARTWORK
Adobe Illustrator Artwork (AI) is a proprietary file format developed by Adobe Systems for representing single-page vector-based drawings in either the EPS or PDF formats. The .ai filename extension is used by Adobe Illustrator.
Early versions of the AI file format are true EPS files with a restricted, compact syntax, with additional semantics represented by Illustrator-specific DSC comments that conform to DSC’s Open Structuring Conventions. These files are identical to their corresponding Illustrator EPS counterparts, but with the EPS procsets (procedure sets) omitted from the file and instead externally referenced using %%Include directives.
Recent versions of the AI file format, including the PDF-based formats and recent EPS formats, are based on a native format called PGF that is unrelated to both EPS and PDF. PDF compatibility is achieved not by extending the PDF format, but by embedding a complete copy of the native PGF data within the PDF file. The same kind of “dual path” approach is also used when recent versions of Illustrator are saving EPS-compatible files.
7. AutoCAD DXF
AutoCAD DXF (Drawing Interchange Format, or Drawing Exchange Format) is a CAD data file format developed by Autodesk for enabling data interoperability between AutoCAD and other programs.
DXF was originally introduced in December 1982 as part of AutoCAD 1.0, and was intended to provide an exact representation of the data in the AutoCAD native file format, DWG (Drawing), for which Autodesk for many years did not publish specifications. Because of this, correct imports of DXF files have been difficult. Autodesk now publishes the DXF specifications on its website (http://usa.autodesk.com/adsk/servlet/item?siteID=123112&id=8446698) for versions of DXF dating from AutoCAD Release 13 to AutoCAD 2009.
As AutoCAD has become more powerful, supporting more complex object types, DXF has become less useful. Certain object types, including ACIS solids and regions, are not documented. Other object types, including AutoCAD 2006’s dynamic blocks, and all of the objects specific to the vertical-market versions of AutoCAD, are partially documented, but not well enough to allow other developers to support them. For these reasons many CAD applications use the DWG format, which can be licensed from Autodesk or, non-natively, from the Open Design Alliance.
ASCII versions of DXF can be read with a text editor. The basic organization of a DXF file is as follows:
- HEADER section – General information about the drawing. Each parameter has a variable name and an associated value.
- CLASSES section – Holds the information for application-defined classes whose instances appear in the BLOCKS, ENTITIES, and OBJECTS sections of the database. Generally does not provide sufficient information to allow interoperability with other programs.
- TABLES section – This section contains definitions of named items.
  - Application ID (APPID) table
  - Block Record (BLOCK_RECORD) table
  - Dimension Style (DIMSTYLE) table
  - Layer (LAYER) table
  - Linetype (LTYPE) table
  - Text style (STYLE) table
  - User Coordinate System (UCS) table
  - View (VIEW) table
  - Viewport configuration (VPORT) table
- BLOCKS section – This section contains Block Definition entities describing the entities comprising each Block in the drawing.
- ENTITIES section – This section contains the drawing entities, including any Block References.
- OBJECTS section – Contains the data that apply to nongraphical objects, used by AutoLISP and ObjectARX applications.
- THUMBNAILIMAGE section – Contains the preview image for the DXF file.
- END OF FILE
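Because ASCII DXF is plain text, the section layout listed above can be recovered with very little code. An ASCII DXF file is a sequence of two-line pairs, a numeric group code followed by a value; each section opens with the pair 0/SECTION followed by 2/name, closes with 0/ENDSEC, and the file ends with 0/EOF. The sketch below only lists section names. The function names and the tiny SAMPLE drawing are invented for this illustration, and real files contain far more data.

```python
def read_group_pairs(text):
    """Yield (group_code, value) pairs from ASCII DXF text."""
    lines = text.splitlines()
    for i in range(0, len(lines) - 1, 2):
        yield int(lines[i].strip()), lines[i + 1].strip()


def section_names(text):
    """Return the names of the sections in the order they appear."""
    names = []
    expect_name = False
    for code, value in read_group_pairs(text):
        if code == 0 and value == "SECTION":
            expect_name = True           # the next 2/<name> pair names it
        elif expect_name and code == 2:
            names.append(value)
            expect_name = False
        elif code == 0 and value == "EOF":
            break                        # end-of-file marker
    return names


# A hand-written, deliberately tiny drawing: an empty HEADER section and
# an ENTITIES section holding one LINE from (0, 0) to (1, 1) on layer "0".
SAMPLE = "\n".join([
    "0", "SECTION", "2", "HEADER", "0", "ENDSEC",
    "0", "SECTION", "2", "ENTITIES",
    "0", "LINE", "8", "0", "10", "0.0", "20", "0.0", "11", "1.0", "21", "1.0",
    "0", "ENDSEC",
    "0", "EOF",
])

if __name__ == "__main__":
    print(section_names(SAMPLE))
```

Reading the pairs lazily with a generator keeps the sketch usable on large drawings, since only the section markers are retained.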
All graphical elements can be specified in a textual source file that can be compiled into a binary file or one of two text representations. CGM provides a means of graphics data interchange for computer representation of 2D graphical information independent from any particular application, system, platform, or device. As a metafile, i.e. a file containing information that describes or specifies another file, the CGM format has numerous elements to provide functions and to represent entities, so that a wide range of graphical information and geometric primitives can be accommodated. Rather than establish an explicit graphics file format, CGM contains the instructions and data for reconstructing graphical components to render an image using an object-oriented approach.
Although CGM is not widely supported for web pages and has been supplanted by other formats in the graphic arts, it is still prevalent in engineering, aviation, and other technical applications.
The initial CGM implementation was effectively a streamed representation of a sequence of Graphical Kernel System primitive operations. It has been adopted to some extent in the areas of technical illustration and professional design, but has largely been superseded by formats such as SVG and DXF.
The World Wide Web Consortium has developed WebCGM, a profile of CGM intended for the use of CGM on the Web.
- 1986 – ANSI X3.122-1986 (ANSI X3 committee)
- 1987 – ISO 8632-1987 (ISO)
- 1991 – ANSI/ISO 8632-1987 (ANSI and ISO)
- 1992 – ISO 8632:1992, a.k.a. CGM:1992 (ISO)
- 1999 – ISO/IEC 8632:1999, 2nd Edition (ISO/IEC JTC1/SC24)
- December 17, 2001 – WebCGM (W3C)
- January 30, 2007 – WebCGM 2.0 (W3C)
While MIF is essentially specific to a single program (FrameMaker), it was widely used in the complex document workflows of small enterprises, especially in the industrial and manufacturing sector.
The SVG specification is an open standard that has been under development by the World Wide Web Consortium (W3C) since 1999. SVG images and their behaviours are defined in XML text files. This means that they can be searched, indexed, scripted and, if required, compressed. SVG files can be edited with any text editor, but specialist SVG development environments are also available. These offer a wide range of specialised and general-purpose features.
Since 2001, SVG has progressed from version 1.0 to 1.2 and has been modularised to allow various profiles to be published, including SVG Print, SVG Basic and SVG Tiny. Being an efficient, widely understood and flexible image format, SVG is also well-suited to small and mobile devices. The SVG Basic and SVG Tiny specifications were developed with just such uses in mind and many current mobile devices support them.
SVG has been in development since 1999 by a group of companies within the W3C after the competing standards PGML (developed from Adobe’s PostScript) and VML (developed from Microsoft’s RTF) were submitted to W3C in 1998. SVG drew on experience designing both those formats.
SVG allows three types of graphic objects: vector graphic shapes (such as paths made of straight lines and curves), raster images, and text.
Graphical objects can be grouped, styled, transformed, and composited into previously rendered objects. SVG does not directly support z-indices that separate drawing order from document order for objects, unlike some other vector markup languages such as VML. Text can be in any XML namespace suitable to the application, which enhances searchability and accessibility of the SVG graphics. The feature set includes nested transformations, clipping paths, alpha masks, filter effects, template objects and extensibility.
While being primarily designated as a vector graphics markup language, the specification is also designed with the basic capabilities of a page description language, like Adobe’s PDF. It contains provisions for rich graphics, and is also compatible with the CSS specification’s properties for styling purposes; thus, unlike XHTML and XSL-FO which are layout-oriented languages, SVG is a fully presentational language. A much more print-specialized subset of SVG (SVG Print, authored by Canon, HP, Adobe and Corel) is currently a W3C Working Draft.
SVG drawings can be dynamic and interactive. Time-based modifications to the elements can be described in SMIL, or can be programmed in a scripting language (e.g., ECMAScript). The W3C explicitly recommends SMIL as the standard for animation in SVG; however, it is more common to find SVG animated with ECMAScript because it is a language that many developers already understand, and it is more compatible with existing renderers. A rich set of event handlers such as onmouseover and onclick can be assigned to any SVG graphical object.
SVG images, being XML, contain many repeated fragments of text and are thus particularly suited to compression by gzip, though other compression methods may be used effectively. Once an SVG image has been compressed by gzip it may be referred to as an “SVGZ” image, with the corresponding filename extension. The resulting file may be as small as 20% of the original size.
SVG was developed by the W3C SVG Working Group starting in 1998, after Macromedia and Microsoft introduced Vector Markup Language (VML) and Adobe Systems and Sun Microsystems submitted a competing format known as PGML. The working group was chaired by Chris Lilley of the W3C.
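The gzip step can be tried with Python's standard gzip module. This is a small sketch; to_svgz and from_svgz are names chosen for this example, and the size reduction depends entirely on how repetitive a given drawing is.

```python
import gzip


def to_svgz(svg_text):
    """Compress SVG source text into the bytes of an SVGZ file."""
    return gzip.compress(svg_text.encode("utf-8"))


def from_svgz(data):
    """Recover the SVG source text from SVGZ bytes."""
    return gzip.decompress(data).decode("utf-8")


# A made-up drawing with many near-identical elements, the kind of
# repetition that gzip handles well.
rects = "".join(
    '<rect x="%d" y="0" width="5" height="5" fill="blue"/>' % (i * 6)
    for i in range(200)
)
EXAMPLE_SVG = '<svg xmlns="http://www.w3.org/2000/svg">' + rects + "</svg>"

if __name__ == "__main__":
    packed = to_svgz(EXAMPLE_SVG)
    print(len(EXAMPLE_SVG.encode("utf-8")), "bytes before,", len(packed), "after")
```

Because the compression is lossless, from_svgz(to_svgz(text)) returns the original text unchanged.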
- SVG 1.0 became a W3C Recommendation on September 4, 2001.
- SVG 1.1 became a W3C Recommendation on January 14, 2003. The SVG 1.1 specification is modularized in order to allow subsets to be defined as profiles. Apart from this, there is very little difference between SVG 1.1 and SVG 1.0.
- SVG Tiny 1.2 became a W3C Recommendation on December 22, 2008.
- SVG Full 1.2 is a W3C Working Draft. SVG Tiny 1.2 was initially released as a profile, and later refactored to be a complete specification, including all needed parts of SVG 1.1 and SVG 1.2. SVG 1.2 Full adds modules onto the SVGT 1.2 core.
- SVG Print adds syntax for multi-page documents and mandatory color management support.
Because of industry demand, two mobile profiles were introduced with SVG 1.1: SVG Tiny (SVGT) and SVG Basic (SVGB). These are subsets of the full SVG standard, mainly intended for user agents with limited capabilities. In particular, SVG Tiny was defined for highly restricted mobile devices such as cellphones, and SVG Basic was defined for higher-level mobile devices, such as PDAs.
Neither mobile profile includes support for the full DOM, while only SVG Basic has optional support for scripting, but because they are fully compatible subsets of the full standard most SVG graphics can still be rendered by devices which only support the mobile profiles.
SVGT 1.2 adds a microDOM (μDOM), allowing all mobile needs to be met with a single profile.
- Paths
- Simple or compound shape outlines drawn with curved or straight lines can be filled in or outlined (or used as a clipping path) and are expressed in a highly compact coding in which, for example, M precedes the initial numeric X and Y coordinates and L precedes a subsequent point to which a line should be drawn.
- Basic Shapes
- Straight-line paths or paths made up of a series of connected straight-line segments (polylines), as well as closed polygons, circles and ellipses can be drawn. Rectangles and round-cornered “rectangles” are other standard elements.
- Text
- Unicode character text included in an SVG file is expressed as XML character data. Many visual effects are possible, and the SVG specification automatically handles bidirectional text (as when composing a combination of English and Arabic text, for example), vertical text (as Chinese was historically written) and characters along a curved path (such as the text around the edges of the Great Seal of the United States).
- Painting
- SVG shapes can be filled and/or outlined (painted with a color, a gradient or a pattern). Fills can be opaque or have various degrees of transparency. “Markers” are end-of-line features, such as arrowheads, or symbols which can appear at the vertices of a polygon.
- Color
- Colors can be applied to all visible SVG elements, either directly or via the ‘fill’, ‘stroke’ and other properties. Colors are specified in the same way as in CSS2, i.e. using names like blue, in hexadecimal such as #22ff00, in decimal like rgb(255,255,127), or as percentages in the same rgb() notation.
- Gradients and Patterns
- SVG shapes can be filled or outlined with solid colors as above, or with color gradients or with repeating patterns. Color gradients can be linear or radial (circular), and can involve any number of colors as well as repeats. Opacity gradients can also be specified. Patterns are based on predefined raster or vector graphic objects, which can be repeated in x and/or y directions. Gradients and patterns can be animated and scripted.
- Clipping, Masking and Compositing
- Graphic elements, including text, paths, basic shapes and combinations of these, can be used as outlines to define both ‘inside’ and ‘outside’ regions that can be painted (with colors, gradients and patterns) independently. Fully opaque clipping paths and semi-transparent masks are composited together to calculate the color and opacity of every pixel of the final image, using simple alpha blending.
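Several of the features listed above fit into one very small drawing: a path written with the M and L commands, a round-cornered rectangle, the different color notations, partial transparency, and a linear gradient used as a fill. The sketch below builds such a file with Python's standard xml.etree.ElementTree module. The function name build_example_svg, the gradient id "fade", and all coordinates are arbitrary choices for illustration.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def build_example_svg():
    """Return the source text of a small SVG drawing."""
    ET.register_namespace("", SVG_NS)  # serialize without a namespace prefix

    def tag(name):
        return "{%s}%s" % (SVG_NS, name)

    svg = ET.Element(tag("svg"), {"width": "100", "height": "100",
                                  "viewBox": "0 0 100 100"})

    # A linear gradient running from a named color to a hexadecimal one.
    defs = ET.SubElement(svg, tag("defs"))
    grad = ET.SubElement(defs, tag("linearGradient"), {"id": "fade"})
    ET.SubElement(grad, tag("stop"), {"offset": "0%", "stop-color": "blue"})
    ET.SubElement(grad, tag("stop"), {"offset": "100%", "stop-color": "#22ff00"})

    # A round-cornered rectangle with a half-transparent decimal rgb() fill.
    ET.SubElement(svg, tag("rect"), {
        "x": "5", "y": "5", "width": "90", "height": "90", "rx": "10",
        "fill": "rgb(255,255,127)", "fill-opacity": "0.5",
    })

    # A triangle: M moves to the first point, L draws lines, Z closes it.
    ET.SubElement(svg, tag("path"), {
        "d": "M 10 10 L 90 10 L 50 80 Z",
        "fill": "url(#fade)", "stroke": "black",
    })
    return ET.tostring(svg, encoding="unicode")


if __name__ == "__main__":
    print(build_example_svg())
```

Document order is drawing order here: the rectangle is emitted first, so the triangle is painted on top of it.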
The use of SVG on the web is in its infancy; there is a great deal of inertia due to the long-time use of pure raster formats and other formats like Adobe Flash or Java applets, and browser support for SVG is still uneven. Web sites which serve SVG images, for example Wikipedia, typically also provide the images in a raster format, either automatically by HTTP content negotiation or allowing the user to directly choose the file.
There are several advantages to native support: plugins would not need to be installed, SVG could be freely mixed with other formats in a single document, and rendering and scripting across different document formats would be considerably more reliable. At this time all major browsers have committed to some level of SVG support except Internet Explorer, which will not support SVG in the upcoming version IE8 either. Other browsers’ implementations are not yet fully functional. As of 2008, only Opera and Safari support embedding via the <img> tag. Tim Berners-Lee, the inventor of the Web, has been critical of Internet Explorer for its failure to support SVG.
- Opera (since 8.0) has support for the SVG 1.1 Tiny specification while Opera 9 includes SVG 1.1 Basic support and some of SVG 1.1 Full. Since 9.5 Opera has partial SVG Tiny 1.2 support.
- Browsers based on the Gecko layout engine (such as Firefox, Flock, Netscape, Camino, SeaMonkey and Epiphany) have had incomplete support for the SVG 1.1 Full specification since 2005. The Mozilla site has an overview of the modules which are supported in Firefox and an overview of the modules which are still in development. Gecko 1.9, included in Firefox 3.0, adds support for more of the SVG specification (including filters).
- Browsers based on WebKit (such as Apple’s Safari, Google Chrome, and The Omni Group’s OmniWeb) have had incomplete support for the SVG 1.1 Full specification since 2006. This includes Safari 3.0 and later (included with Mac OS X v10.5 and Mac OS X v10.4.11) as well as Mobile Safari as of iPhone OS 2.1.
- Amaya has partial SVG support.
Adobe provides SVG Viewer, the most widely used SVG plugin, but plans to discontinue support on January 1, 2009. SVG Viewer will remain available for download after this date. The plugin supports most of SVG 1.0/1.1. Adobe SVG plugin support for pre-3.0 versions of Safari is for PowerPC only. User-reported issues include the lack of a scrolling feature for viewing any area of the SVG that lies outside the visible area of its containing window.
KDE’s SVG plugin for Konqueror is KSVG. KSVG2 was rolled into the KDE 4 core, making SVG rendering native. (SVG finds increasing use on the KDE platform: this system-wide support for SVG graphics in version 4 follows early support for SVG wallpaper in version 3.4.)
Corel once offered an SVG Viewer plugin, but has ceased development.
Images are usually automatically rasterised using a library such as ImageMagick, which provides a quick but incomplete implementation of SVG, or Batik, which implements nearly all of SVG 1.1 but requires the Java Runtime Environment.
- Inkscape is a free software SVG drawing program for Linux, Microsoft Windows and Mac OS X.
- The Batik SVG Toolkit can be used by Java programs to render, generate, and manipulate SVG graphics.
- xfig allows import and export of SVG drawings.
- The GNOME project has had integrated SVG support throughout the desktop since 2000.
- OpenOffice.org Draw can export SVG drawings. Import extensions are available to import SVG images into OpenOffice.org Draw.
- Go-oo Draw (OpenOffice.org variant) can open and export SVG files.
- OxygenOffice Draw (OpenOffice.org variant) can open and export SVG files.
- Adobe Illustrator supports both the import and export of SVG images. Photoshop, however, does not support SVG import. When writing SVG files, Illustrator embeds a complete copy of the image in a proprietary format for later re-editing. This often results in changes being lost if another editor is used and the file is then reopened in Illustrator.
- CorelDRAW has an SVG export and import filter.
- Xara Xtreme has an SVG export and import filter in its free/open source Linux version.
- Microsoft Visio can save files in the SVG format as well as the SVG compressed format. Graphs created in Microsoft Excel or figures from Microsoft Word can be cut and pasted into Microsoft Visio documents.
- The GIMP allows SVG images to be imported as paths or rendered bitmaps.
- Blender will import SVG graphics as paths.
- Cairo is a vector graphics based library which can generate SVG. It has bindings for many programming languages including Haskell, Java, Perl, Python, Scheme, Smalltalk and several others.
- Altsoft Xml2PDF allows converting SVG files to PDF, PS, and various GDI+ formats.
The most popular implementations for mobile phones are by Ikivo and BitFlash, while for PDAs, BitFlash and Intesis have implementations. Flash Lite by Adobe has optionally supported SVG Tiny since version 1.1. At the SVG Open 2005 conference, Sun demonstrated a mobile implementation of SVG Tiny 1.1 for the CLDC platform.
Mobile SVG players from Ikivo and BitFlash come pre-installed, i.e., the manufacturers burn the SVG player code into their mobiles before shipping them to customers. Mobiles can also include full web browsers (such as Opera Mini and the iPhone’s Safari) which include SVG support.
The level of SVG Tiny support available varies from mobile to mobile, depending on the manufacturer and version of the SVG engine installed. Many of the new mobiles support additional features beyond SVG Tiny 1.1, like gradient and opacity; this standard is often referred to as SVGT 1.1+.
Nokia’s S60 platform has built-in support for SVG. For example, icons are generally rendered using the platform’s SVG engine. Nokia has also led the JSR 226: Scalable 2D Vector Graphics API expert group, which defines a Java ME API for SVG presentation and manipulation. This API has been implemented in S60 Platform 3rd Edition Feature Pack 1 onward. Some Series 40 phones also support SVG (such as the 6280).
Most Sony Ericsson phones beginning with K700 (by release date) support SVG Tiny 1.1. Phones beginning with K750 also support such features as opacity and gradients. Phones with Java Platform-8 have support for JSR 226.
Windows Metafile (WMF) is a graphics file format on Microsoft Windows systems, originally designed in the early 1990s. Windows Metafiles are intended to be portable between applications and may contain both vector and bitmap components. In contrast to raster formats such as JPEG and GIF which are used to store bitmap graphics such as photographs, scans and graphics, Windows Metafiles generally are used to store line-art, illustrations and content created in drawing or presentation applications. Most Windows clipart is in the WMF format.
Essentially, a WMF file stores a list of function calls that have to be issued to the Windows graphics layer GDI in order to display an image on screen. Since some GDI functions accept pointers to callback functions for error handling, a WMF file may include executable code.
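The "list of function calls" can be made visible by walking the records of a metafile. The sketch below relies on details of the published WMF layout that are not spelled out in the text above and should be read as assumptions of this example: an optional 22-byte placeable header identified by the key 0x9AC6CDD7, an 18-byte file header whose second 16-bit word gives the header size in words, and records that each start with a 32-bit size (in 16-bit words) and a 16-bit function number, ending with function 0. The function names are invented for illustration, and only three record types are named.

```python
import struct

PLACEABLE_KEY = 0x9AC6CDD7
RECORD_NAMES = {0x0000: "META_EOF", 0x0213: "META_LINETO", 0x0214: "META_MOVETO"}


def iter_wmf_records(data):
    """Yield (function_number, parameter_bytes) for each record."""
    pos = 0
    if len(data) >= 4 and struct.unpack_from("<I", data, 0)[0] == PLACEABLE_KEY:
        pos = 22                                   # skip the placeable header
    header_words = struct.unpack_from("<H", data, pos + 2)[0]
    pos += header_words * 2                        # skip the file header
    while pos + 6 <= len(data):
        size_words, function = struct.unpack_from("<IH", data, pos)
        if size_words < 3:
            raise ValueError("corrupt record size")
        yield function, data[pos + 6:pos + size_words * 2]
        if function == 0:                          # end-of-file record
            break
        pos += size_words * 2


def sample_wmf():
    """Build a hand-made metafile: move to a point, draw a line, end."""
    records = (
        struct.pack("<IHhh", 5, 0x0214, 10, 20)    # MoveTo, parameters y, x
        + struct.pack("<IHhh", 5, 0x0213, 30, 40)  # LineTo, parameters y, x
        + struct.pack("<IH", 3, 0x0000)            # end of file
    )
    total_words = (18 + len(records)) // 2
    header = struct.pack("<HHHIHIH", 1, 9, 0x0300, total_words, 0, 5, 0)
    return header + records


if __name__ == "__main__":
    for function, params in iter_wmf_records(sample_wmf()):
        print(RECORD_NAMES.get(function, hex(function)), params.hex())
```

A renderer would dispatch each function number to the matching drawing call; this sketch only lists the records and performs no drawing.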
WMF is a 16-bit format introduced in Windows 3.0. It is the native vector format for Microsoft Office applications such as Word, PowerPoint, and Publisher. A newer 32-bit version with additional commands is called Enhanced Metafile (EMF). EMF is also used as a graphics language for printer drivers.
As with other Microsoft file formats, no specification of the format was previously available, and alternative implementations had to reverse engineer existing WMF files, which was difficult and error-prone. In September 2006, Microsoft published the WMF file format specification under the Microsoft Open Specification Promise, promising not to assert patent rights against implementors of the file format.
In December 2005, a vulnerability was reported to Microsoft by Symantec. It was assessed and classified as critical. In certain cases, the graphics rendering engine allowed remote code execution. This vulnerability was resolved in a security update on January 5, 2006 on Microsoft TechNet (MS06-001) and generally released January 10, 2006. Details can be found in Microsoft Knowledge Base Article “Vulnerability in Graphics Rendering Engine Could Allow Remote Code Execution” (912919). It was also referred to as the WMF (Windows Meta File) vulnerability.
The WMF format was designed to be executed by the Windows graphics layer GDI in order to restore the image, but because WMF binary files contain the definitions of the GDI graphic primitives that constitute the image, it is possible to design alternative libraries that render WMF binary files or convert them into other graphic formats.
For example, the Batik library is able to render WMF files and convert them to their SVG equivalent. The Vector Graphics package of the FreeHEP Java library allows the saving of Java2D drawings as EMF files.