Newton's method


In numerical analysis, the Newton–Raphson method, also known simply as Newton's method, named after Isaac Newton and Joseph Raphson, is a root-finding algorithm which produces successively better approximations to the roots of a real-valued function. The most basic version starts with a real-valued function f, its derivative f′, and an initial guess x_0 for a root of f. If f satisfies certain assumptions and the initial guess is close, then

x_1 = x_0 − f(x_0) / f′(x_0)

is a better approximation of the root than x_0. Geometrically, (x_1, 0) is the x-intercept of the tangent to the graph of f at (x_0, f(x_0)): that is, the improved guess, x_1, is the unique root of the linear approximation of f at the initial guess, x_0. The process is repeated as

x_{n+1} = x_n − f(x_n) / f′(x_n)

until a sufficiently precise value is reached. The number of correct digits roughly doubles with each step. This algorithm is first in the class of Householder's methods, and was succeeded by Halley's method. The method can also be extended to complex functions and to systems of equations.

Description

The purpose of Newton's method is to find a root of a function. The idea is to start with an initial guess near a root, approximate the function by its tangent line near the guess, and then take the root of the linear approximation as a next guess at the function's root. This will typically be closer to the function's root than the previous guess, and the method can be iterated.
[Figure: x_{n+1} is a better approximation than x_n for the root of the function f.]
The best linear approximation to an arbitrary differentiable function f near the point x_n is the tangent line to the curve, with equation

y = f(x_n) + f′(x_n)(x − x_n).

The root of this linear function, the place where it intercepts the x-axis, can be taken as a closer approximate root x_{n+1}:

x_{n+1} = x_n − f(x_n) / f′(x_n).
[Figure: Iteration typically improves the approximation.]
The process can be started with any arbitrary initial guess, though it will generally require fewer iterations to converge if the guess is close to one of the function's roots. The method will usually converge provided the initial guess is close enough to the unknown root and f′(x_0) ≠ 0. Furthermore, for a root of multiplicity 1, the convergence is at least quadratic in some sufficiently small neighbourhood of the root: the number of correct digits of the approximation roughly doubles with each additional step. More details can be found in the analysis section below.
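As a concrete sketch of this iteration, the short Python routine below repeats the update x_{n+1} = x_n − f(x_n)/f′(x_n) until the step size falls below a tolerance; the tolerance, iteration cap, and the sample function f(x) = x^2 − 2 are illustrative choices, not part of the method itself.

```python
def newton(f, f_prime, x0, tol=1e-12, max_iter=50):
    """Iterate x_{n+1} = x_n - f(x_n)/f'(x_n) until the step is below tol."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        dfx = f_prime(x)
        if dfx == 0:
            raise ZeroDivisionError("zero derivative; Newton step undefined")
        step = fx / dfx
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("did not converge within max_iter iterations")

# Example: approximate sqrt(2) as the positive root of f(x) = x^2 - 2.
root = newton(lambda x: x * x - 2, lambda x: 2 * x, x0=1.5)
```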
Householder's methods are similar but have higher order for even faster convergence. However, the extra computations required for each step can slow down the overall performance relative to Newton's method, particularly if f or its derivatives are computationally expensive to evaluate.

History

In the Old Babylonian period, the side of a square of known area could be effectively approximated, and this is conjectured to have been done using a special case of Newton's method, described algebraically below, by iteratively improving an initial estimate; an equivalent method can be found in Hero of Alexandria's Metrica, and so it is often called Heron's method. Jamshīd al-Kāshī used a method to solve x^P − N = 0, that is, to find the Pth root of N, a method that was algebraically equivalent to Newton's method; a similar method appears in Trigonometria Britannica, published by Henry Briggs in 1633. Al-Kāshī's method first appeared in his 1427 publication Miftāḥ al-Ḥisāb. His work was founded on the earlier contributions of the polymath al-Bīrūnī and the mathematician Sharaf al-Dīn al-Ṭūsī. The contributions of al-Kāshī remained largely unknown to the Western scientific community for centuries, until the work of François Viète. In 1600, Viète rediscovered a technique similar to al-Kāshī's in the context of solving scalar polynomial equations of degree six.
The method that laid the groundwork for the modern Newton's method, later developed by Joseph Raphson and Thomas Simpson, first appeared in Isaac Newton's works De analysi per aequationes numero terminorum infinitas and De methodis fluxionum et serierum infinitarum. However, while Newton gave the basic ideas, his method differs from the modern method given above. He applied the method only to polynomials, starting with an initial root estimate and extracting a sequence of error corrections. He used each correction to rewrite the polynomial in terms of the remaining error, and then solved for a new correction by neglecting higher-degree terms. He did not explicitly connect the method with derivatives or present a general formula. Newton applied this method to both numerical and algebraic problems, producing Taylor series in the latter case. Nevertheless, in the later second and third editions of his 1687 Philosophiæ Naturalis Principia Mathematica, Newton did apply his method in an iterative manner to a nonpolynomial equation, specifically Kepler's equation; these were his first published uses of the method in this form.
Newton may have derived his method from a similar, less precise method of the mathematician Viète; however, the two methods are not the same. The essence of Viète's own method can be found in the work of the mathematician Sharaf al-Dīn al-Ṭūsī.
The Japanese mathematician Seki Kōwa used a form of Newton's method in the 1680s to solve single-variable equations, though the connection with calculus was missing.
Newton's method was first published in 1685 in A Treatise of Algebra both Historical and Practical by John Wallis. In 1690, Raphson published a simplified description in Analysis aequationum universalis. Raphson also applied the method only to polynomials, but he avoided Newton's tedious rewriting process by extracting each successive correction from the original polynomial. This allowed him to derive a reusable iterative expression for each problem. Finally, in 1740, Simpson described Newton's method as an iterative method for solving general nonlinear equations using calculus, essentially giving the description above. In the same publication, Simpson also gave the generalization to systems of two equations and noted that Newton's method can be used for solving optimization problems by setting the gradient to zero.
In 1879, Arthur Cayley, in The Newton–Fourier imaginary problem, was the first to notice the difficulties in generalizing Newton's method to complex roots of polynomials with degree greater than 2 and complex initial values. This opened the way to the study of the theory of iterations of rational functions.

Practical considerations

Newton's method is a powerful technique—if the derivative of the function at the root is nonzero, then the convergence is at least quadratic: as the method converges on the root, the difference between the root and the approximation is squared at each step. However, there are some difficulties with the method.

Difficulty in calculating the derivative of a function

Newton's method requires that the derivative can be calculated directly. An analytical expression for the derivative may not be easily obtainable or could be expensive to evaluate. In these situations, it may be appropriate to approximate the derivative by using the slope of a line through two nearby points on the function. Using this approximation would result in something like the secant method whose convergence is slower than that of Newton's method.
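As a rough sketch of this idea, the derivative in the Newton update can be replaced by the slope of a line through two nearby points; the step size h, tolerance, and iteration cap below are illustrative assumptions, and the resulting iteration behaves more like the secant method than like true Newton's method.

```python
def newton_finite_difference(f, x0, h=1e-7, tol=1e-10, max_iter=100):
    """Newton-like iteration with f'(x) replaced by the
    forward-difference slope (f(x + h) - f(x)) / h."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        slope = (f(x + h) - fx) / h   # approximate derivative at x
        if slope == 0:
            raise ZeroDivisionError("flat slope; cannot take a step")
        step = fx / slope
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("did not converge within max_iter iterations")
```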

Failure of the method to converge to the root

It is important to review the proof of quadratic convergence of Newton's method before implementing it. Specifically, one should review the assumptions made in the proof. In situations where the method fails to converge, it is because the assumptions made in this proof are not met.
For example, in some cases, if the first derivative is not well behaved in the neighborhood of a particular root, then it is possible that Newton's method will fail to converge no matter where the initialization is set. In some cases, Newton's method can be stabilized by using successive over-relaxation, or the speed of convergence can be increased by using the same method.
In a robust implementation of Newton's method, it is common to place limits on the number of iterations, bound the solution to an interval known to contain the root, and combine the method with a more robust root-finding method.
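One common safeguard, sketched below under the assumption that a bracketing interval [a, b] with a sign change of f is known, is to reject any Newton step that leaves the bracket and take a bisection step instead; the specific tolerances and the iteration limit are illustrative choices.

```python
def safeguarded_newton(f, f_prime, a, b, tol=1e-12, max_iter=100):
    """Newton iteration confined to a bracket [a, b] with f(a)*f(b) < 0.
    Falls back to bisection whenever the Newton step is unusable."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    x = 0.5 * (a + b)
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol or (b - a) < tol:
            return x
        # Shrink the bracket so it still contains a sign change.
        if fa * fx < 0:
            b = x
        else:
            a, fa = x, fx
        dfx = f_prime(x)
        # Accept the Newton step only if it is defined and stays in [a, b].
        newton_ok = dfx != 0 and a < x - fx / dfx < b
        x = x - fx / dfx if newton_ok else 0.5 * (a + b)
    return x
```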

Slow convergence for roots of multiplicity greater than 1

If the root being sought has multiplicity greater than one, the convergence rate is merely linear unless special steps are taken. When there are two or more roots that are close together, it may take many iterations before the iterates get close enough to one of them for the quadratic convergence to be apparent. However, if the multiplicity m of the root is known, the following modified algorithm preserves the quadratic convergence rate:

x_{n+1} = x_n − m · f(x_n) / f′(x_n).
This is equivalent to using successive over-relaxation. On the other hand, if the multiplicity m of the root is not known, it is possible to estimate m after carrying out one or two iterations, and then use that value to increase the rate of convergence.
If the multiplicity m of the root is finite, then g(x) = f(x)/f′(x) will have a root at the same location with multiplicity 1. Applying Newton's method to find the root of g(x) recovers quadratic convergence in many cases, although it generally involves the second derivative of f(x). In a particularly simple case, if f(x) = x^m, then g(x) = x/m and Newton's method finds the root in a single iteration with

x_{n+1} = x_n − g(x_n)/g′(x_n) = x_n − (x_n/m)/(1/m) = 0.
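A minimal sketch of the multiplicity-scaled update x_{n+1} = x_n − m·f(x_n)/f′(x_n) is shown below; the multiplicity m must be supplied, and the test function f(x) = (x − 1)^3 with m = 3 is only an illustrative assumption.

```python
def modified_newton(f, f_prime, x0, m, tol=1e-12, max_iter=100):
    """Newton's method scaled by the known root multiplicity m,
    which restores quadratic convergence at an m-fold root."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if fx == 0:
            return x
        step = m * fx / f_prime(x)
        x -= step
        if abs(step) < tol:
            return x
    return x

# f(x) = (x - 1)^3 has a triple root at 1; plain Newton converges only
# linearly here, while supplying m = 3 restores fast convergence.
root = modified_newton(lambda x: (x - 1) ** 3,
                       lambda x: 3 * (x - 1) ** 2,
                       x0=2.0, m=3)
```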

Slow convergence

The function f(x) = x^2 has a root at 0. Since f is continuously differentiable at its root, the theory guarantees that Newton's method, initialized sufficiently close to the root, will converge. However, since the derivative f′ is zero at the root, quadratic convergence is not ensured by the theory. In this particular example, the Newton iteration is given by

x_{n+1} = x_n − f(x_n)/f′(x_n) = x_n / 2.
It is visible from this that Newton's method could be initialized anywhere and converge to zero, but at only a linear rate. If initialized at 1, dozens of iterations would be required before ten digits of accuracy are achieved.
The function f(x) = x + x^{4/3} also has a root at 0, where it is continuously differentiable. Although the first derivative f′ is nonzero at the root, the second derivative f″ does not exist there, so that quadratic convergence cannot be guaranteed. In fact the Newton iteration is given by

x_{n+1} = x_n − f(x_n)/f′(x_n) = x_n^{4/3} / (3 + 4 x_n^{1/3}).

From this, it can be seen that the rate of convergence is superlinear but subquadratic. This can be seen in the following tables, the left of which shows Newton's method applied to f(x) = x + x^{4/3} above and the right of which shows Newton's method applied to f(x) = x + x^2. The quadratic convergence in the iteration shown on the right is illustrated by the orders of magnitude in the distance from the iterate to the true root being approximately doubled from row to row. While the convergence on the left is superlinear, the order of magnitude is only multiplied by about 4/3 from row to row.
Iterates x_n and values f(x_n) for f(x) = x + x^{4/3} (left pair of columns) and f(x) = x + x^2 (right pair of columns), both starting from x_0 = 1:

x_n              f(x_n)           x_n              f(x_n)
1                2                1                2
1.4286 × 10^−1   2.1754 × 10^−1   3.3333 × 10^−1   4.4444 × 10^−1
1.4669 × 10^−2   1.8260 × 10^−2   6.6666 × 10^−2   7.1111 × 10^−2
9.0241 × 10^−4   9.8961 × 10^−4   3.9216 × 10^−3   3.9369 × 10^−3
2.5750 × 10^−5   2.6511 × 10^−5   1.5259 × 10^−5   1.5259 × 10^−5
2.4386 × 10^−7   2.4539 × 10^−7   2.3283 × 10^−10  2.3283 × 10^−10
5.0366 × 10^−10  5.0406 × 10^−10  5.4210 × 10^−20  5.4210 × 10^−20
1.3344 × 10^−13  1.3344 × 10^−13  2.9387 × 10^−39  2.9387 × 10^−39

The rate of convergence is distinguished from the number of iterations required to reach a given accuracy. For example, the function f(x) = x^20 − 1 has a root at 1. Since f′(1) ≠ 0 and f is smooth, it is known that any Newton iteration convergent to 1 will converge quadratically. However, if initialized at 0.5, the first few iterates of Newton's method are approximately 26214, 24904, 23658, 22476, decreasing slowly, with only the 200th iterate being 1.0371. The following iterates are 1.0103, 1.00093, 1.0000082, and 1.00000000065, illustrating quadratic convergence. This highlights that quadratic convergence of a Newton iteration does not mean that only a few iterates are required; this fast convergence only applies once the sequence of iterates is sufficiently close to the root.
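The behaviour just described can be reproduced with a few lines of Python; the loop below simply counts plain Newton steps for f(x) = x^20 − 1 starting from x_0 = 0.5, with an illustrative stopping tolerance and iteration cap.

```python
# Count how many plain Newton steps f(x) = x**20 - 1 needs from x0 = 0.5.
x, iterations = 0.5, 0
while abs(x - 1) > 1e-10 and iterations < 1000:
    x -= (x ** 20 - 1) / (20 * x ** 19)   # Newton update for f(x) = x^20 - 1
    iterations += 1
print(iterations, x)   # roughly two hundred iterations despite eventual quadratic convergence
```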