Code-excited linear prediction
Code-excited linear prediction (CELP) is a linear predictive speech coding algorithm originally proposed by Manfred R. Schroeder and Bishnu S. Atal in 1985. At the time, it provided significantly better quality than existing low bit-rate algorithms, such as residual-excited linear prediction (RELP) and linear predictive coding (LPC) vocoders. Along with its variants, such as algebraic CELP, relaxed CELP, low-delay CELP and vector sum excited linear prediction, it is currently the most widely used speech coding algorithm. It is also used in MPEG-4 Audio speech coding. CELP is commonly used as a generic term for a class of algorithms and not for a particular codec.
Background
The CELP algorithm is based on four main ideas:
- Using the source-filter model of speech production through linear prediction (LP);
- Using an adaptive and a fixed codebook as the input (excitation) of the LP model;
- Performing a search in closed-loop in a "perceptually weighted domain";
- Applying vector quantization (VQ).
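The linear prediction step in the first idea can be illustrated with a short sketch. It is a minimal example, not taken from any particular codec: it assumes the autocorrelation method with the Levinson-Durbin recursion and the sign convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p. The function names `autocorrelation` and `levinson_durbin` are illustrative choices.

```python
def autocorrelation(frame, max_lag):
    """Return r[0..max_lag], the autocorrelation of one speech frame."""
    return [
        sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LP normal equations from autocorrelation values r.

    Returns (a, error) where a = [1, a_1, ..., a_p] describes
    A(z) = 1 + a_1 z^-1 + ... + a_p z^-p (assumed sign convention)
    and error is the remaining prediction error energy.
    """
    a = [1.0] + [0.0] * order
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:
            break  # silent or perfectly predictable frame: stop early
        # Correlation between the current prediction error and lag i
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / error  # reflection coefficient for this stage
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        error *= 1.0 - k * k
    return a, error


# Hypothetical test frame: a decaying exponential, which a first-order
# predictor models almost exactly.
frame = [0.5 ** n for n in range(20)]
coeffs, residual_energy = levinson_durbin(autocorrelation(frame, 2), 2)
```

For this frame the first coefficient comes out close to -0.5 and the second close to zero, which reflects that each sample is about half of the previous one.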
CELP decoder
Before exploring the complex encoding process of CELP, we introduce the decoder. Figure 1 describes a generic CELP decoder. The excitation is produced by summing the contributions from the fixed and adaptive codebooks:

$e[n] = e_f[n] + e_a[n]$

where $e_f[n]$ is the fixed codebook contribution and $e_a[n]$ is the adaptive codebook contribution. The fixed codebook is a vector quantization dictionary that is hard-coded into the codec. This codebook can be algebraic or be stored explicitly. The entries in the adaptive codebook consist of delayed versions of the excitation. This makes it possible to efficiently code periodic signals, such as voiced sounds.
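The way the two codebook contributions combine can be sketched as follows. This is an illustrative example under stated assumptions: each contribution is scaled by its own gain, and a pitch lag shorter than the subframe is handled by periodically repeating the last `lag` samples of the past excitation. Real codecs differ in these details, and the names `adaptive_vector` and `build_excitation` are hypothetical.

```python
def adaptive_vector(past_excitation, lag, length):
    """Adaptive codebook entry: the past excitation delayed by `lag` samples.

    If lag < length, the last `lag` samples are repeated periodically
    (an assumption of this sketch).
    """
    if not 1 <= lag <= len(past_excitation):
        raise ValueError("lag must lie within the stored excitation history")
    start = len(past_excitation) - lag
    return [past_excitation[start + (i % lag)] for i in range(length)]


def build_excitation(past_excitation, lag, adaptive_gain, fixed_vector, fixed_gain):
    """Sum the scaled adaptive and fixed contributions for one subframe."""
    adaptive = adaptive_vector(past_excitation, lag, len(fixed_vector))
    return [
        adaptive_gain * a + fixed_gain * f
        for a, f in zip(adaptive, fixed_vector)
    ]


# Hypothetical values for a four-sample subframe
past = [0.0, 0.0, 1.0, 2.0, 3.0]
excitation = build_excitation(past, lag=3, adaptive_gain=0.5,
                              fixed_vector=[1.0, 0.0, 0.0, -1.0], fixed_gain=2.0)
```

After each subframe, a decoder built this way would append `excitation` to `past` so that later subframes can refer back to it.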
The filter that shapes the excitation has an all-pole model of the form $1/A(z)$, where $A(z)$ is called the prediction filter and is obtained using linear prediction. An all-pole filter is used because it is a good representation of the human vocal tract and because it is easy to compute.
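A minimal sketch of this all-pole shaping (synthesis) filter is given below. It assumes the prediction filter is written as A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, so that each output sample is s[n] = e[n] - (a_1 s[n-1] + ... + a_p s[n-p]). Other texts use the opposite sign for the coefficients. The function name `synthesize` and the `memory` argument are illustrative.

```python
def synthesize(excitation, a, memory=None):
    """Filter the excitation through 1/A(z).

    a      -- [1, a_1, ..., a_p], coefficients of A(z) (assumed convention)
    memory -- previous outputs, most recent first; zeros if omitted
    """
    order = len(a) - 1
    past = list(memory) if memory is not None else [0.0] * order
    output = []
    for e in excitation:
        # Subtract the prediction formed from the previous output samples
        s = e - sum(a[k + 1] * past[k] for k in range(order))
        output.append(s)
        if order:
            past = [s] + past[:-1]  # shift the filter memory
    return output


# Impulse response of a hypothetical first-order filter 1 / (1 - 0.5 z^-1)
impulse_response = synthesize([1.0, 0.0, 0.0, 0.0], [1.0, -0.5])
# impulse_response == [1.0, 0.5, 0.25, 0.125]
```

The recursion needs only p multiply-adds per sample, which is one reason the all-pole form is cheap to compute.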
CELP encoder
The main principle behind CELP is called analysis-by-synthesis and means that the encoding is performed by perceptually optimizing the decoded signal in a closed loop. In theory, the best CELP stream would be produced by trying all possible bit combinations and selecting the one that produces the best-sounding decoded signal. This is obviously not possible in practice for two reasons: the required complexity is beyond any currently available hardware, and the "best-sounding" selection criterion implies a human listener.
In order to achieve real-time encoding using limited computing resources, the CELP search is broken down into smaller, more manageable, sequential searches using a simple perceptual weighting function. Typically, the encoding is performed in the following order:
- Linear prediction coefficients are computed and quantized, usually as line spectral pairs.
- The adaptive codebook is searched and its contribution removed.
- The fixed codebook is searched.
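The last step of this sequence can be sketched as a small analysis-by-synthesis loop. The example is deliberately simplified and makes several assumptions: the perceptual weighting is omitted, the target is the signal left after the adaptive codebook contribution has been removed, the filter starts from zero memory, and the codebook is a short explicit list. The names `synthesize` and `search_fixed_codebook` are hypothetical.

```python
def synthesize(excitation, a):
    """Zero-state all-pole filter 1/A(z), with a = [1, a_1, ..., a_p]."""
    order = len(a) - 1
    past = [0.0] * order
    output = []
    for e in excitation:
        s = e - sum(a[k + 1] * past[k] for k in range(order))
        output.append(s)
        if order:
            past = [s] + past[:-1]
    return output


def search_fixed_codebook(target, codebook, a):
    """Return (index, gain, error) of the entry that best matches the target.

    Each candidate is decoded (filtered through 1/A(z)), scaled by its
    least-squares optimal gain, and compared with the target.
    """
    best = None
    for index, code in enumerate(codebook):
        y = synthesize(code, a)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue  # an all-zero entry cannot match anything
        gain = sum(t * v for t, v in zip(target, y)) / energy
        error = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best


# Hypothetical three-entry codebook and a target built from entry 1
a = [1.0, -0.5]
codebook = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [1.0, 0.0, -1.0, 0.0]]
target = [2.0 * v for v in synthesize(codebook[1], a)]
index, gain, error = search_fixed_codebook(target, codebook, a)
```

Here the search recovers entry 1 with a gain of 2. Searching the adaptive codebook follows the same pattern, with delayed excitation vectors as candidates.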
Noise weighting