Chroma subsampling
Chroma subsampling is the practice of encoding images with lower resolution for chroma information than for luma information, taking advantage of the human visual system's lower acuity for color differences than for luminance.
It is used in many video and still image encoding schemes, both analog and digital, including JPEG encoding.
Rationale
Digital signals are often compressed to reduce file size and save transmission time. Since the human visual system is much more sensitive to variations in brightness than in color, a video system can be optimized by devoting more bandwidth to the luma component than to the color difference components Cb and Cr. In compressed images, for example, the 4:2:2 Y'CbCr scheme requires two-thirds the bandwidth of non-subsampled "4:4:4" R'G'B'. This reduction results in almost no visual difference as perceived by the viewer.
How subsampling works
The human visual system processes color information at about a third of the resolution of luminance. Therefore it is possible to sample color information at a lower resolution while maintaining good image quality.
This is achieved by encoding RGB image data into a composite black-and-white image, with separate color difference data. For example, with Y'CbCr, the gamma-encoded R'G'B' components are weighted and then summed to create the luma component Y'. The color difference components are created by subtracting two of the weighted components from the third. A variety of filtering methods can be used to limit the resolution.
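The weighting and differencing described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the weights are the BT.601 values, chosen here as an assumption (other standards use different weights), and the function name `rgb_to_ycbcr` is hypothetical.

```python
# Assumed BT.601 luma weights; other standards (e.g. BT.709) use different values.
KR, KG, KB = 0.299, 0.587, 0.114

def rgb_to_ycbcr(r, g, b):
    """Convert gamma-encoded R'G'B' in [0, 1] to Y' in [0, 1] and Cb, Cr in [-0.5, 0.5]."""
    y = KR * r + KG * g + KB * b       # weighted sum of the components -> luma
    cb = (b - y) / (2 * (1 - KB))      # blue minus luma, scaled to [-0.5, 0.5]
    cr = (r - y) / (2 * (1 - KR))      # red minus luma, scaled to [-0.5, 0.5]
    return y, cb, cr

white = rgb_to_ycbcr(1.0, 1.0, 1.0)    # neutral color: both differences are about zero
blue = rgb_to_ycbcr(0.0, 0.0, 1.0)     # saturated blue: Cb reaches its maximum
```

Because B' - Y' expands to (1 - KB)·B' - KR·R' - KG·G', each difference is one weighted component minus the other two, as the text describes. Only Y' carries the black-and-white picture, so Cb and Cr are the planes that can be stored at reduced resolution.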
Regarding gamma and transfer functions
Gamma-encoded luma should not be confused with linear luminance. The presence of gamma encoding is denoted with the prime symbol (').
Gamma-correcting electro-optical transfer functions are used because of the nonlinear response of human vision. The use of gamma improves the perceived signal-to-noise ratio in analogue systems, and allows for more efficient data encoding in digital systems. This encoding uses more levels for darker colors than for lighter ones, matching the sensitivity of human vision.
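To illustrate how gamma encoding spends more code levels on dark tones, the sketch below counts the 8-bit codes that cover the darkest 10% of linear light. It assumes a pure power law with exponent 2.2, which is a simplification; real transfer functions are usually piecewise, and the names `encode` and `codes_below` are hypothetical.

```python
GAMMA = 2.2  # assumed exponent, for illustration only

def encode(linear):
    """Gamma-encode a linear-light value in [0, 1] with a pure power law."""
    return linear ** (1 / GAMMA)

def codes_below(linear_threshold, bits=8, gamma_encoded=True):
    """Count the integer codes that represent linear light up to the threshold."""
    max_code = 2 ** bits - 1
    value = encode(linear_threshold) if gamma_encoded else linear_threshold
    return int(value * max_code) + 1   # codes 0 .. floor(value * max_code)

dark_linear = codes_below(0.1, gamma_encoded=False)
dark_gamma = codes_below(0.1, gamma_encoded=True)
```

Under these assumptions the darkest tenth of the linear range gets 26 codes without gamma encoding and 90 codes with it.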
Sampling systems and ratios
The subsampling scheme is commonly expressed as a three-part ratio J:a:b, or as four parts if an alpha channel is present, that describes the number of luminance and chrominance samples in a conceptual region that is J pixels wide and 2 pixels high. The parts are, in order:
- J: horizontal sampling reference. Usually, 4.
- a: number of chrominance samples in the first row of J pixels.
- b: number of changes of chrominance samples between first and second row of J pixels. b is usually either zero or equal to a.
- Alpha: horizontal factor. May be omitted if alpha component is not present, and is equal to J when present.
The mapping examples given are only theoretical and for illustration. Also, the diagram does not indicate any chroma filtering, which should be applied to avoid aliasing. To calculate the required bandwidth factor relative to 4:4:4, sum all the factors and divide the result by 12 (the sum of the factors for 4:4:4).
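The bandwidth calculation can be sketched as a small helper. The function name `bandwidth_factor` is hypothetical, and dividing by the number of parts times J (12 for three parts with J = 4, 16 when an alpha part is included) is a generalization assumed here rather than a rule stated in the text.

```python
from fractions import Fraction

def bandwidth_factor(ratio):
    """Return the bandwidth of a J:a:b[:alpha] scheme relative to full sampling."""
    parts = [int(p) for p in ratio.split(":")]
    if len(parts) not in (3, 4):
        raise ValueError("expected J:a:b or J:a:b:alpha")
    j = parts[0]
    # Full sampling contributes J for every part: 12 for 4:4:4, 16 for 4:4:4:4.
    return Fraction(sum(parts), len(parts) * j)

factors = {r: bandwidth_factor(r) for r in ("4:4:4", "4:2:2", "4:2:0", "4:1:1")}
```

For example, `bandwidth_factor("4:2:2")` gives 2/3, while both `"4:2:0"` and `"4:1:1"` give 1/2.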
Types of sampling and subsampling
4:4:4
Each of the three Y'CbCr components has the same sample rate, thus there is no chroma subsampling. This scheme is sometimes used in high-end film scanners and cinematic post-production.
"4:4:4" is sometimes used, incorrectly, to refer to the R'G'B' color space, which implicitly also has no chroma subsampling. Formats such as HDCAM SR can record 4:4:4 R'G'B' over dual-link HD-SDI.
4:2:2
The two chroma components are sampled at half the horizontal sample rate of luma: the horizontal chroma resolution is halved. This reduces the bandwidth of an uncompressed video signal by one-third, so with 8 bits per component and no alpha, 16 bits per pixel are enough, as in NV16.
Many high-end digital video formats and interfaces use this scheme:
- AVC-Intra 100
- Digital Betacam
- Betacam SX
- DVCPRO50 and DVCPRO HD
- Digital-S
- CCIR 601 / Serial digital interface / D-1
- ProRes
- XDCAM HD422
- Canon MXF HD422
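The horizontal halving used by 4:2:2 can be sketched as below. Averaging each horizontal pair is a simple box filter chosen here as an assumption; actual encoders may use other filters or keep co-sited samples, and the names `subsample_422` and `bits_per_pixel_422` are hypothetical.

```python
def subsample_422(chroma_plane):
    """Halve horizontal chroma resolution by averaging each horizontal pair."""
    out = []
    for row in chroma_plane:
        if len(row) % 2:
            raise ValueError("row width must be even")
        out.append([(row[x] + row[x + 1]) / 2 for x in range(0, len(row), 2)])
    return out

def bits_per_pixel_422(bits_per_component=8):
    """Average bits per pixel: full-rate luma plus two half-rate chroma planes."""
    return bits_per_component + 2 * bits_per_component / 2

cb = [[10, 20, 30, 50],
      [0, 0, 100, 100]]
half = subsample_422(cb)   # each row shrinks from 4 samples to 2
```

The row count is unchanged, since 4:2:2 reduces chroma resolution only horizontally.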
4:1:1
In the 480i "NTSC" system, if the luma is sampled at 13.5 MHz, then with 4:1:1 subsampling the Cr and Cb signals are each sampled at 3.375 MHz, which corresponds to a maximum Nyquist bandwidth of 1.6875 MHz, whereas a traditional "high-end broadcast analog NTSC encoder" would have a Nyquist bandwidth of 1.5 MHz and 0.5 MHz for the I/Q channels. However, in most equipment, especially cheap TV sets and VHS/Betamax VCRs, the chroma channels have only 0.5 MHz of bandwidth for both Cr and Cb. Thus the DV system actually provides a superior color bandwidth compared to the best composite analog specifications for NTSC, despite having only 1/4 of the chroma bandwidth of a "full" digital signal.
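The sample-rate arithmetic above can be checked with a short sketch. The helper name `chroma_rates` is hypothetical; it assumes the chroma rate is the luma rate scaled by a/J and that the Nyquist bandwidth is half the sample rate.

```python
def chroma_rates(luma_rate_mhz, j, a):
    """Return (chroma sample rate, Nyquist bandwidth) in MHz for a J:a:b scheme."""
    chroma_rate = luma_rate_mhz * a / j    # a chroma samples per J luma samples
    return chroma_rate, chroma_rate / 2    # Nyquist limit is half the sample rate

rate_411, nyquist_411 = chroma_rates(13.5, 4, 1)
rate_422, nyquist_422 = chroma_rates(13.5, 4, 2)
```

With a 13.5 MHz luma rate, 4:1:1 gives 3.375 MHz chroma sampling and a 1.6875 MHz Nyquist bandwidth, matching the figures in the text.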
Formats that use 4:1:1 chroma subsampling include:
- DVCPRO / D-7
- 480i "NTSC" DV and DVCAM
- YJK, a proprietary color space implemented by the Yamaha V9958 graphic chip on MSX2+ computers.
4:2:0
Different variants of 4:2:0 chroma configurations are found in:
- All ISO/IEC MPEG and ITU-T VCEG H.26x video coding standards including H.262/MPEG-2 Part 2 implementations
- DVD-Video and Blu-ray Disc
- 576i "PAL" DV and DVCAM
- HDV
- AVCHD and AVC-Intra 50
- Apple Intermediate Codec
- Most common JPEG/JFIF and MJPEG implementations
- VC-1
- WebP
Sampling positions
There are four main variants of 4:2:0 schemes, with different horizontal and vertical sample siting relative to the 2×2 "square" of the original input size.
- In MPEG-2, MPEG-4, and AVC, Cb and Cr are taken at the midpoint of the left edge of the 2×2 square. In other words, they have the same horizontal location as the top-left pixel, but are shifted one-half pixel down vertically. Also called "left".
- In JPEG/JFIF, H.261, and MPEG-1, Cb and Cr are taken at the center of the 2×2 square. In other words, they are offset one-half pixel to the right and one-half pixel down compared to the top-left pixel. Also called "center".
- In HEVC for BT.2020 and BT.2100 content, Cb and Cr are sampled at the same location as the group's top-left Y pixel. Also called "top-left". An analogous co-sited sampling is used in MPEG-2 4:2:2.
- In 4:2:0 PAL-DV, Cr is sampled at the same location as the group's top-left Y pixel, but Cb is sampled one pixel down. It is also called "top-left" in ffmpeg.
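The first three siting variants can be expressed as offsets from the top-left luma pixel of each 2×2 group, as in the sketch below. The table and function names are hypothetical, and the PAL-DV variant, which sites Cb and Cr differently from each other, is left out for simplicity.

```python
# (dx, dy) offset of the chroma sample from the top-left luma pixel of each
# 2x2 group, in luma pixel units. Keys follow the names used in the text.
SITING_OFFSETS = {
    "left": (0.0, 0.5),      # MPEG-2, MPEG-4, AVC
    "center": (0.5, 0.5),    # JPEG/JFIF, H.261, MPEG-1
    "top-left": (0.0, 0.0),  # HEVC for BT.2020 and BT.2100 content
}

def chroma_position(cx, cy, siting):
    """Return the (x, y) luma-grid position of 4:2:0 chroma sample (cx, cy)."""
    dx, dy = SITING_OFFSETS[siting]
    return 2 * cx + dx, 2 * cy + dy

positions = {name: chroma_position(1, 2, name) for name in SITING_OFFSETS}
```

A resampler that ignores these offsets, for example by treating "left" content as "center", shifts the chroma by half a luma pixel relative to the luma.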
Interlaced and progressive
Original. This image shows a single field. The moving text has some motion blur applied to it.
4:2:0 progressive sampling applied to moving interlaced material. The chroma leads and trails the moving text. This image shows a single field.
4:2:0 interlaced sampling applied to moving interlaced material. This image shows a single field.
In the 4:2:0 interlaced scheme, however, the vertical resolution of the chroma is roughly halved, since each chroma sample effectively describes an area 2 samples wide by 4 samples tall instead of 2×2. In addition, the spatial displacement between the two fields can result in comb-like chroma artifacts.
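The difference in vertical coverage can be illustrated by listing which frame rows one chroma row describes in each scheme. This is a sketch under the assumption that fields are formed from alternating frame rows and that each chroma row covers two rows of whatever picture it was subsampled from; the function names are hypothetical.

```python
def luma_rows_progressive(chroma_row):
    """Frame rows covered by one chroma row when 4:2:0 is applied to the frame."""
    return [2 * chroma_row, 2 * chroma_row + 1]

def luma_rows_interlaced(chroma_row, field):
    """Frame rows covered when 4:2:0 is applied to each field separately.

    field is 0 for the field holding even frame rows, 1 for odd frame rows.
    """
    field_rows = [2 * chroma_row, 2 * chroma_row + 1]   # rows within the field
    return [2 * r + field for r in field_rows]          # map back to frame rows

progressive = luma_rows_progressive(0)
even_field = luma_rows_interlaced(0, 0)
odd_field = luma_rows_interlaced(0, 1)
```

In the interlaced case the first chroma rows of the two fields cover frame rows 0 and 2, and 1 and 3, so together they span four frame rows rather than two.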
Original still image.
4:2:0 progressive sampling applied to a still image. Both fields are shown.
4:2:0 interlaced sampling applied to a still image. Both fields are shown.
If the interlaced material is to be de-interlaced, the comb-like chroma artifacts can be removed by blurring the chroma vertically.
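A vertical chroma blur of this kind can be sketched with a small low-pass filter. The [1, 2, 1] / 4 kernel and the edge handling (repeating the first and last rows) are assumptions made for illustration, and the function name is hypothetical.

```python
def blur_chroma_vertically(plane):
    """Apply a [1, 2, 1] / 4 vertical low-pass filter, repeating edge rows."""
    height = len(plane)
    out = []
    for y, row in enumerate(plane):
        above = plane[max(y - 1, 0)]
        below = plane[min(y + 1, height - 1)]
        out.append([(a + 2 * c + b) / 4 for a, c, b in zip(above, row, below)])
    return out

combed = [[0], [100], [0], [100]]   # alternating rows, as in a comb artifact
smoothed = blur_chroma_vertically(combed)
```

The alternating 0/100 pattern becomes 25, 50, 50, 75, which removes the row-to-row oscillation at the cost of vertical chroma detail.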