Fast Fourier transform
A fast Fourier transform (FFT) is an algorithm that computes the discrete Fourier transform (DFT) of a sequence, or its inverse. A Fourier transform converts a signal from its original domain to a representation in the frequency domain and vice versa.
The DFT is obtained by decomposing a sequence of values into components of different frequencies. This operation is useful in many fields, but computing it directly from the definition is often too slow to be practical. An FFT rapidly computes such transformations by factorizing the DFT matrix into a product of sparse factors. As a result, it manages to reduce the complexity of computing the DFT from $O(n^2)$, which arises if one simply applies the definition of the DFT, to $O(n \log n)$, where $n$ is the data size. The difference in speed can be enormous, especially for long data sets where $n$ may be in the thousands or millions.
As the FFT is merely an algebraic refactoring of terms within the DFT, the DFT and the FFT both perform mathematically equivalent and interchangeable operations, assuming that all terms are computed with infinite precision. However, in the presence of round-off error, many FFT algorithms are much more accurate than evaluating the DFT definition directly or indirectly. There are many different FFT algorithms based on a wide range of published theories, from simple complex-number arithmetic to group theory and number theory. The best-known FFT algorithms depend upon the factorization of $n$, but there are FFTs with $O(n \log n)$ complexity for all $n$, even prime $n$. Many FFT algorithms depend only on the fact that $e^{-2\pi i/n}$ is a primitive $n$th root of unity, and thus can be applied to analogous transforms over any finite field, such as number-theoretic transforms. Since the inverse DFT is the same as the DFT, but with the opposite sign in the exponent and a $1/n$ factor, any FFT algorithm can easily be adapted for it.
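To make the last point concrete, here is a minimal sketch of adapting a forward transform to compute the inverse: conjugate, transform, conjugate again, and scale by $1/n$. Python and NumPy are an illustrative choice here, not something prescribed by the text.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal(8) + 1j * rng.standard_normal(8)
n = len(X)

# Inverse DFT from a forward FFT: conjugate, transform, conjugate, scale by 1/n.
x = np.conj(np.fft.fft(np.conj(X))) / n

assert np.allclose(x, np.fft.ifft(X))   # matches the library's inverse transform
```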
Fast Fourier transforms are widely used for applications in engineering, music, science, and mathematics. The basic ideas were popularized in 1965, but some algorithms had been derived as early as 1805. In 1994, Gilbert Strang described the FFT as "the most important numerical algorithm of our lifetime", and it was included in the Top 10 Algorithms of the 20th Century by the IEEE magazine Computing in Science & Engineering.
History
The development of fast algorithms for the DFT was prefigured in Carl Friedrich Gauss's unpublished 1805 work on the orbits of the asteroids Pallas and Juno. Gauss wanted to interpolate the orbits from sample observations; his method was very similar to the one that would be published in 1965 by James Cooley and John Tukey, who are generally credited for the invention of the modern generic FFT algorithm. While Gauss's work predated even Joseph Fourier's 1822 results, he did not analyze the method's complexity, and eventually used other methods to achieve the same end.

Between 1805 and 1965, some versions of the FFT were published by other authors. Frank Yates in 1932 published his version, called the interaction algorithm, which provided efficient computation of Hadamard and Walsh transforms. Yates' algorithm is still used in the field of statistical design and analysis of experiments. In 1942, G. C. Danielson and Cornelius Lanczos published their version to compute the DFT for x-ray crystallography, a field where calculation of Fourier transforms presented a formidable bottleneck. While many methods in the past had focused on reducing the constant factor for $O(n^2)$ computation by taking advantage of symmetries, Danielson and Lanczos realized that one could use the periodicity and apply a doubling trick to "double $n$ with only slightly more than double the labor", though like Gauss they did not do the analysis to discover that this led to $O(n \log n)$ scaling. In 1958, I. J. Good published a paper establishing the prime-factor FFT algorithm that applies to discrete Fourier transforms of size $n = n_1 n_2$, where $n_1$ and $n_2$ are coprime.
James Cooley and John Tukey independently rediscovered these earlier algorithms and published a more general FFT in 1965 that is applicable when $n$ is composite and not necessarily a power of 2, as well as analyzing the $O(n \log n)$ scaling. Tukey came up with the idea during a meeting of President Kennedy's Science Advisory Committee where a discussion topic involved detecting nuclear tests by the Soviet Union by setting up sensors to surround the country from outside. To analyze the output of these sensors, an FFT algorithm would be needed. In discussion with Tukey, Richard Garwin recognized the general applicability of the algorithm not just to national security problems, but also to a wide range of problems including one of immediate interest to him, determining the periodicities of the spin orientations in a 3-D crystal of Helium-3. Garwin gave Tukey's idea to Cooley for implementation. Cooley and Tukey published the paper in a relatively short time of six months. As Tukey did not work at IBM, the patentability of the idea was doubted and the algorithm went into the public domain, which, through the computing revolution of the next decade, made the FFT one of the indispensable algorithms in digital signal processing.
Definition
Let $x_0, \ldots, x_{n-1}$ be complex numbers. The DFT is defined by the formula
$$X_k = \sum_{m=0}^{n-1} x_m e^{-i 2\pi k m / n}, \qquad k = 0, \ldots, n-1,$$
where $e^{i 2\pi / n}$ is a primitive $n$th root of 1.
Evaluating this definition directly requires $O(n^2)$ operations: there are $n$ outputs $X_k$, and each output requires a sum of $n$ terms. An FFT is any method to compute the same results in $O(n \log n)$ operations. All known FFT algorithms require $\Theta(n \log n)$ operations, although there is no known proof that lower complexity is impossible.
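As a point of reference, the definition above can be evaluated directly in a few lines. This is a minimal Python/NumPy sketch (NumPy is an illustrative choice, not named by the text); its cost grows as $O(n^2)$ because it forms all $n \times n$ terms of the sum.

```python
import numpy as np

def dft_direct(x):
    """Evaluate the DFT definition term by term: n outputs, each a sum of
    n terms, i.e. O(n^2) operations overall."""
    x = np.asarray(x, dtype=complex)
    n = len(x)
    k = np.arange(n)
    # Matrix of e^{-i*2*pi*k*m/n}; row k holds the n terms of output X_k.
    W = np.exp(-2j * np.pi * np.outer(k, k) / n)
    return W @ x

x = np.random.default_rng(0).standard_normal(8)
assert np.allclose(dft_direct(x), np.fft.fft(x))
```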
To illustrate the savings of an FFT, consider the count of complex multiplications and additions for $n = 4096$ data points. Evaluating the DFT's sums directly involves $n^2$ complex multiplications and $n(n-1)$ complex additions, of which $O(n)$ operations can be saved by eliminating trivial operations such as multiplications by 1, leaving about 30 million operations. In contrast, the radix-2 Cooley–Tukey algorithm, for $n$ a power of 2, can compute the same result with only $(n/2)\log_2(n)$ complex multiplications and $n\log_2(n)$ complex additions, in total about 70,000 operations, more than four hundred times fewer than with direct evaluation. In practice, actual performance on modern computers is usually dominated by factors other than the speed of arithmetic operations and the analysis is a complicated subject, but the overall improvement from $O(n^2)$ to $O(n \log n)$ remains.
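The arithmetic behind those figures can be checked directly; this short Python snippet (an illustrative aside, not from the text) simply evaluates the operation-count formulas quoted above.

```python
import math

n = 4096
log2n = int(math.log2(n))                  # 12

direct_ops = n * n + n * (n - 1)           # n^2 multiplications + n(n-1) additions
fft_ops = (n // 2) * log2n + n * log2n     # radix-2: 24,576 mults + 49,152 adds

print(direct_ops)                          # 33,550,336 (about 30 million after
                                           # discarding trivial operations)
print(fft_ops)                             # 73,728 (about 70,000)
print(direct_ops / fft_ops)                # roughly 455, i.e. > 400x fewer
```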
Algorithms
Cooley–Tukey algorithm
By far the most commonly used FFT is the Cooley–Tukey algorithm. This is a divide-and-conquer algorithm that recursively breaks down a DFT of any composite size $n = n_1 n_2$ into $n_1$ smaller DFTs of size $n_2$, along with $O(n)$ multiplications by complex roots of unity traditionally called twiddle factors. This method was popularized by a publication of Cooley and Tukey in 1965, but it was later discovered that those two authors had independently re-invented an algorithm known to Carl Friedrich Gauss around 1805.
The best-known use of the Cooley–Tukey algorithm is to divide the transform into two pieces of size $n/2$ at each step, and it is therefore limited to power-of-two sizes, but any factorization can be used in general. These are called the radix-2 and mixed-radix cases, respectively. Although the basic idea is recursive, most traditional implementations rearrange the algorithm to avoid explicit recursion. Also, because the Cooley–Tukey algorithm breaks the DFT into smaller DFTs, it can be combined arbitrarily with any other algorithm for the DFT, such as those described below.
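A compact way to see the recursion is a direct transcription of the radix-2 decimation-in-time idea. The following Python/NumPy sketch (an illustration assuming a power-of-two length, not production code) splits the input into even- and odd-indexed halves and recombines them with twiddle factors.

```python
import numpy as np

def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    x = np.asarray(x, dtype=complex)
    n = len(x)
    if n == 1:
        return x
    even = fft_radix2(x[0::2])                   # half-size DFT of even samples
    odd = fft_radix2(x[1::2])                    # half-size DFT of odd samples
    twiddle = np.exp(-2j * np.pi * np.arange(n // 2) / n)   # twiddle factors
    return np.concatenate([even + twiddle * odd,
                           even - twiddle * odd])

x = np.random.default_rng(1).standard_normal(16)
assert np.allclose(fft_radix2(x), np.fft.fft(x))
```

Practical libraries perform the same recombination iteratively, as noted above, but the recursive form maps most directly onto the description.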
Other FFT algorithms
For $n = n_1 n_2$ with coprime $n_1$ and $n_2$, one can use the prime-factor algorithm, based on the Chinese remainder theorem, to factorize the DFT similarly to Cooley–Tukey but without the twiddle factors. The Rader–Brenner algorithm is a Cooley–Tukey-like factorization but with purely imaginary twiddle factors, reducing multiplications at the cost of increased additions and reduced numerical stability; it was later superseded by the split-radix variant of Cooley–Tukey. Algorithms that recursively factorize the DFT into smaller operations other than DFTs include the Bruun and QFT algorithms. Bruun's algorithm, in particular, is based on interpreting the FFT as a recursive factorization of the polynomial $z^n - 1$, here into real-coefficient polynomials of the form $z^m - 1$ and $z^{2m} + a z^m + 1$.

Another polynomial viewpoint is exploited by the Winograd FFT algorithm, which factorizes $z^n - 1$ into cyclotomic polynomials; these often have coefficients of 1, 0, or −1, and therefore require few multiplications, so Winograd can be used to obtain minimal-multiplication FFTs and is often used to find efficient algorithms for small factors. Indeed, Winograd showed that the DFT can be computed with only $O(n)$ irrational multiplications, leading to a proven achievable lower bound on the number of multiplications for power-of-two sizes; this comes at the cost of many more additions, a tradeoff no longer favorable on modern processors with hardware multipliers. In particular, Winograd also makes use of the PFA as well as an algorithm by Rader for FFTs of prime sizes.
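The claim about cyclotomic coefficients can be sanity-checked symbolically. This brief Python/SymPy snippet (SymPy being an illustrative tool choice, not referenced above) confirms that $z^n - 1$ splits into the cyclotomic polynomials $\Phi_d$ over the divisors $d$ of $n$, with all coefficients in $\{-1, 0, 1\}$ for the small sizes tried; this coefficient property in fact holds for every $n < 105$.

```python
from sympy import Symbol, cyclotomic_poly, divisors, expand, prod

z = Symbol('z')
for n in (8, 15, 16):
    # z^n - 1 is the product of the cyclotomic polynomials Phi_d(z) with d | n.
    phis = [cyclotomic_poly(d, z, polys=True) for d in divisors(n)]
    assert expand(prod([p.as_expr() for p in phis]) - (z**n - 1)) == 0
    # Their coefficients are all 0 or +/-1, so they cost few multiplications.
    assert all(c in (-1, 0, 1) for p in phis for c in p.all_coeffs())
```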
Rader's algorithm, exploiting the existence of a generator for the multiplicative group modulo prime $n$, expresses a DFT of prime size $n$ as a cyclic convolution of size $n - 1$, which can then be computed by a pair of ordinary FFTs via the convolution theorem. Another prime-size FFT is due to L. I. Bluestein, and is sometimes called the chirp-z algorithm; it also re-expresses a DFT as a convolution, but this time of the same size, via the identity $km = -\tfrac{(k-m)^2}{2} + \tfrac{k^2}{2} + \tfrac{m^2}{2}$.
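Read against the definition $X_k = \sum_m x_m e^{-i 2\pi k m / n}$, that identity splits the exponent into a fixed "chirp" times a convolution. The following Python/NumPy sketch (the library choice and the padding strategy are illustrative assumptions, not prescribed by the text) implements Bluestein's rearrangement, evaluating the convolution with zero-padded power-of-two FFTs.

```python
import numpy as np

def bluestein_fft(x):
    """DFT of arbitrary length n via Bluestein's chirp-z rearrangement.

    The identity km = -(k-m)^2/2 + k^2/2 + m^2/2 turns the DFT into a
    convolution, evaluated here with zero-padded power-of-two FFTs.
    """
    x = np.asarray(x, dtype=complex)
    n = len(x)
    k = np.arange(n)
    chirp = np.exp(-1j * np.pi * k * k / n)          # e^{-i*pi*k^2/n}
    a = x * chirp                                    # pre-multiplied input
    # Convolution kernel b_j = e^{+i*pi*j^2/n} for j = -(n-1) .. n-1,
    # laid out in an array of power-of-two length m >= 2n-1.
    m = 1 << (2 * n - 1).bit_length()
    b = np.zeros(m, dtype=complex)
    b[:n] = np.conj(chirp)
    b[m - n + 1:] = np.conj(chirp[1:])[::-1]
    conv = np.fft.ifft(np.fft.fft(a, m) * np.fft.fft(b))[:n]
    return chirp * conv                              # post-multiply by the chirp

x = np.random.default_rng(2).standard_normal(7)      # prime length n = 7
assert np.allclose(bluestein_fft(x), np.fft.fft(x))
```

Because the convolution length only needs to be at least $2n - 1$, the inner transforms can always be taken at a convenient power-of-two size, which is why the method works for any $n$, including primes.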
Hexagonal fast Fourier transform aims at computing an efficient FFT for hexagonally sampled data by using a new addressing scheme for hexagonal grids, called Array Set Addressing.
FFT algorithms specialized for real or symmetric data
In many applications, the input data for the DFT are purely real, in which case the outputs satisfy the symmetry $X_{n-k} = X_k^*$, and efficient FFT algorithms have been designed for this situation. One approach consists of taking an ordinary algorithm and removing the redundant parts of the computation, saving roughly a factor of two in time and memory. Alternatively, it is possible to express an even-length real-input DFT as a complex DFT of half the length, followed by post-processing operations.
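The redundancy is easy to observe numerically. In this small Python/NumPy check (NumPy's rfft is used as one example of a specialized real-input routine, an illustrative choice rather than one named in the text), the second half of the spectrum of a real signal is the conjugate mirror of the first half.

```python
import numpy as np

x = np.random.default_rng(3).standard_normal(16)   # purely real input
n = len(x)
X = np.fft.fft(x)

# Hermitian symmetry of the spectrum of real data: X[n-k] = conj(X[k]).
k = np.arange(1, n)
assert np.allclose(X[n - k], np.conj(X[k]))

# Hence only the first n//2 + 1 outputs are independent, which is exactly
# what a specialized real-input routine returns.
assert np.allclose(np.fft.rfft(x), X[:n // 2 + 1])
```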
It was once believed that real-input DFTs could be more efficiently computed by means of the discrete Hartley transform, but it was subsequently argued that a specialized real-input DFT algorithm can typically be found that requires fewer operations than the corresponding DHT algorithm for the same number of inputs. Bruun's algorithm is another method that was initially proposed to take advantage of real inputs, but it has not proved popular.
There are further FFT specializations for the cases of real data that have even/odd symmetry, in which case one can gain another factor of roughly two in time and memory and the DFT becomes the discrete cosine/sine transform (DCT/DST). Instead of directly modifying an FFT algorithm for these cases, DCTs/DSTs can also be computed via FFTs of real data combined with pre- and post-processing.
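As one concrete illustration of the second route, a DCT-I can be obtained from the FFT of an even-symmetric extension of the data. The sketch below (written against the unnormalized cosine-sum definition of the DCT-I, an assumption about conventions rather than anything fixed by the text) compares the two.

```python
import numpy as np

def dct1_via_fft(x):
    """Unnormalized DCT-I of x, computed as the FFT of an even-symmetric
    extension of length 2(n-1); the spectrum of that extension is real."""
    x = np.asarray(x, dtype=float)
    y = np.concatenate([x, x[-2:0:-1]])      # x_0 .. x_{n-1}, x_{n-2}, .., x_1
    return np.fft.fft(y).real[:len(x)]

def dct1_direct(x):
    """Defining cosine sum of the DCT-I, for comparison."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    k = np.arange(n)
    out = x[0] + (-1.0) ** k * x[-1]
    for m in range(1, n - 1):
        out += 2.0 * x[m] * np.cos(np.pi * k * m / (n - 1))
    return out

x = np.random.default_rng(4).standard_normal(9)
assert np.allclose(dct1_via_fft(x), dct1_direct(x))
```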