Inverse problem
An inverse problem in science is the process of calculating from a set of observations the causal factors that produced them: for example, calculating an image in X-ray computed tomography, source reconstruction in acoustics, or calculating the density of the Earth from measurements of its gravity field. It is called an inverse problem because it starts with the effects and then calculates the causes. It is the inverse of a forward problem, which starts with the causes and then calculates the effects.
Inverse problems are among the most important mathematical problems in science because they tell us about parameters that we cannot directly observe. They arise in system identification, optics, radar, acoustics, communication theory, signal processing, medical imaging, computer vision, geophysics, oceanography, meteorology, astronomy, remote sensing, natural language processing, machine learning, nondestructive testing, slope stability analysis and many other fields.
History
Starting with the effects to discover the causes has concerned physicists for centuries. A historical example is the calculations of Adams and Le Verrier which led to the discovery of Neptune from the perturbed trajectory of Uranus. However, a formal study of inverse problems was not initiated until the 20th century.
One of the earliest examples of a solution to an inverse problem was discovered by Hermann Weyl and published in 1911, describing the asymptotic behavior of eigenvalues of the Laplace–Beltrami operator. Today known as Weyl's law, it is perhaps most easily understood as an answer to the question of whether it is possible to hear the shape of a drum. Weyl conjectured that the eigenfrequencies of a drum would be related to the area and perimeter of the drum by a particular equation, a result improved upon by later mathematicians.
The field of inverse problems was later touched on by the Soviet-Armenian physicist Viktor Ambartsumian.
While still a student, Ambartsumian thoroughly studied the theory of atomic structure, the formation of energy levels, and the Schrödinger equation and its properties. Once he had mastered the theory of eigenvalues of differential equations, he pointed out the apparent analogy between discrete energy levels and the eigenvalues of differential equations. He then asked: given a family of eigenvalues, is it possible to find the form of the equations whose eigenvalues they are? Essentially, Ambartsumian was examining the inverse Sturm–Liouville problem, which deals with determining the equations of a vibrating string. His paper was published in 1929 in the German physics journal Zeitschrift für Physik and remained in obscurity for a rather long time. Describing this situation many decades later, Ambartsumian said, "If an astronomer publishes an article with a mathematical content in a physics journal, then the most likely thing that will happen to it is oblivion."
Nonetheless, toward the end of the Second World War, this article, written by the 20-year-old Ambartsumian, was found by Swedish mathematicians and became the starting point for a whole area of research on inverse problems.
Important efforts were then devoted to a "direct solution" of the inverse scattering problem, especially by Gelfand and Levitan in the Soviet Union, who proposed an analytic constructive method for determining the solution. When computers became available, some authors investigated the possibility of applying their approach to similar problems, such as the inverse problem for the 1D wave equation. It rapidly turned out, however, that the inversion is an unstable process: noise and errors can be tremendously amplified, making a direct solution hardly practicable.
Around the 1970s, least-squares and probabilistic approaches emerged and proved very helpful for determining the parameters involved in various physical systems. These approaches met with considerable success. Nowadays inverse problems are also investigated in fields outside physics, such as chemistry, economics, and computer science. As numerical models become prevalent in many parts of society, we may eventually expect an inverse problem to be associated with each of them.
Conceptual understanding
Since Newton, scientists have extensively attempted to model the world. In particular, when a mathematical model is available, we can foresee, given some parameters that describe a physical system, the behavior of the system. This approach is known as mathematical modeling and the above-mentioned physical parameters are called the model parameters or simply the model. To be precise, we introduce the notion of state of the physical system: it is the solution of the mathematical model's equation. In optimal control theory, these equations are referred to as the state equations. In many situations we are not truly interested in knowing the physical state but just its effects on some objects. Hence we have to introduce another operator, called the observation operator, which converts the state of the physical system into what we want to observe. We can now introduce the so-called forward problem, which consists of two steps:
- determination of the state of the system from the physical parameters that describe it
- application of the observation operator to the estimated state of the system so as to predict the behavior of what we want to observe.
In this approach we basically attempt to predict the effects knowing the causes.
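The two steps of the forward problem can be sketched in Python for one illustrative case. This is a minimal sketch, not a general solver: it assumes a 1D diffusion process with a single model parameter (the diffusion coefficient), uses the closed-form solution for an instantaneous point release as the state, and takes the observation operator to be sampling at a few sensor positions. The function names, grid and sensor indices are hypothetical choices made for the illustration.

```python
import numpy as np

def solve_state(diffusivity, x, t, mass=1.0):
    """Step 1: state of the system, here the concentration c(x, t).

    Assumption: closed-form solution of the 1D diffusion equation for an
    instantaneous point release of `mass` at x = 0 and t = 0.
    """
    x = np.asarray(x, dtype=float)[:, None]  # shape (nx, 1)
    t = np.asarray(t, dtype=float)[None, :]  # shape (1, nt)
    spread = 4.0 * diffusivity * t
    return mass / np.sqrt(np.pi * spread) * np.exp(-x**2 / spread)

def observe(state, sensor_indices):
    """Step 2: observation operator, keeping the state only at sensor positions."""
    return state[sensor_indices, :]

def forward(diffusivity, x, t, sensor_indices):
    """Forward problem: model parameter -> state -> predicted observations."""
    return observe(solve_state(diffusivity, x, t), sensor_indices)

x = np.linspace(-5.0, 5.0, 101)   # spatial grid
t = np.linspace(0.5, 5.0, 10)     # observation times
sensors = [40, 50, 60]            # grid indices of three hypothetical sensors
predicted = forward(0.8, x, t, sensors)  # one row per sensor, one column per time
```

Keeping the state computation and the observation operator as separate functions mirrors the two-step description: the same state could be observed through a different set of sensors without recomputing it.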
Taking the Earth as the physical system, the table below lists, for different physical phenomena, the model parameters that describe the system, the physical quantity that describes its state, and the observations commonly made on that state.
| Governing equations | Model parameters | State of the physical system | Common observations on the system |
| Newton's law of gravity | Distribution of mass | Gravitational field | Measurement made by gravimeters at different surface locations |
| Maxwell's equations | Distribution of magnetic susceptibility | Magnetic field | Magnetic field measured at different surface locations by magnetometers |
| Wave equation | Distribution of wave-speeds and densities | Wave-field caused by artificial or natural seismic sources | Particle velocity measured by seismometers placed at different surface locations |
| Diffusion equation | Distribution of the diffusion coefficient | Concentration of the diffusing material as a function of space and time | Monitoring of this concentration measured at different locations |
In the inverse problem approach we, roughly speaking, try to know the causes given the effects.
General statement of the inverse problem
The inverse problem is the "inverse" of the forward problem: instead of determining the data produced by particular model parameters, we want to determine the model parameters that produce the data $d_\text{obs}$ that is the observation we have recorded. Our goal, in other words, is to determine the model parameters $p$ such that
$$d_\text{obs} = F(p)$$
where $F$ is the forward map. We denote by $M$ the number of model parameters, and by $N$ the number of recorded data.
We introduce some useful concepts and the associated notations that will be used below:
- The space of models denoted by $P$: the vector space spanned by model parameters; it has $M$ dimensions;
- The space of data denoted by $D$: $D = \mathbb{R}^N$ if we organize the measured samples in a vector with $N$ components;
- $F(p)$: the response of model $p$; it consists of the data predicted by model $p$;
- $F(P)$: the image of $P$ by the forward map, it is a subset of $D$ made of responses of all models;
- $d_\text{obs} - F(p)$: the data misfits associated with model $p$: they can be arranged as a vector, an element of $D$.
When operator $F$ is linear, the inverse problem is linear. Otherwise, that is most often, the inverse problem is nonlinear.
Also, models cannot always be described by a finite number of parameters. This is the case when we look for distributed parameters: in such cases the goal of the inverse problem is to retrieve one or several functions. Such inverse problems are inverse problems with infinite dimension.
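The notions of data misfit and of a linear versus nonlinear forward map can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the two forward maps are made-up examples (a matrix product and the same matrix applied to squared parameters), and `looks_linear` is a hypothetical helper that checks superposition on random inputs, which can reveal nonlinearity but cannot prove linearity.

```python
import numpy as np

def misfit(forward_map, p, d_obs):
    """Data misfit: observed data minus the response of model p."""
    return d_obs - forward_map(p)

def looks_linear(forward_map, n_params, trials=20, seed=0):
    """Check F(a*p1 + b*p2) == a*F(p1) + b*F(p2) on random inputs.

    Passing this check is necessary for linearity, not sufficient.
    """
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        p1 = rng.normal(size=n_params)
        p2 = rng.normal(size=n_params)
        a, b = rng.normal(size=2)
        lhs = forward_map(a * p1 + b * p2)
        rhs = a * forward_map(p1) + b * forward_map(p2)
        if not np.allclose(lhs, rhs):
            return False
    return True

# Hypothetical forward maps with 2 model parameters and 3 data.
A = np.array([[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]])

def linear_map(p):
    return A @ p

def nonlinear_map(p):
    return A @ p**2

p_true = np.array([0.5, -1.0])
d_obs = linear_map(p_true)                   # noise-free synthetic observations
residual = misfit(linear_map, p_true, d_obs) # zero vector for the true model
```

Here the misfit is a vector with as many components as there are data, so it lives in the data space rather than in the space of models.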
Linear inverse problems
In the case of a linear forward map and when we deal with a finite number of model parameters, the forward map can be written as a linear system
$$d = Fp$$
where $F$ is the matrix that characterizes the forward map. The linear system can be systematically solved by means of both regularization and Bayesian methods.
An elementary example: Earth's gravitational field
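Only a few physical systems are actually linear with respect to the model parameters. One such system from geophysics is that of the Earth's gravitational field. The Earth's gravitational field is determined by the density distribution of the Earth in the subsurface. Because the lithology of the Earth changes quite significantly, we are able to observe minute differences in the Earth's gravitational field on the surface of the Earth. From our understanding of gravity, we know that the mathematical expression for gravity is:
$$d = \frac{G\,p}{r^2}$$
Here $d$ is a measure of the local gravitational acceleration, $G$ is the universal gravitational constant, $p$ is the local mass of the rock in the subsurface and $r$ is the distance from the mass to the observation point.
By discretizing the above expression, we are able to relate the discrete data observations on the surface of the Earth to the discrete model parameters in the subsurface that we wish to know more about. For example, consider the case where we have measurements carried out at 5 locations on the surface of the Earth. In this case, our data vector $d$ is a column vector of dimension $5 \times 1$: its $i$-th component is associated with the $i$-th observation location. We also know that we only have five unknown masses in the subsurface with known location: we denote by $r_{ij}$ the distance between the $i$-th observation location and the $j$-th mass. Thus, we can construct the linear system relating the five unknown masses to the five data points as follows:
$$d = Fp, \qquad d = \begin{bmatrix} d_1 \\ d_2 \\ d_3 \\ d_4 \\ d_5 \end{bmatrix}, \quad p = \begin{bmatrix} p_1 \\ p_2 \\ p_3 \\ p_4 \\ p_5 \end{bmatrix}, \quad F = \begin{bmatrix} \frac{G}{r_{11}^2} & \frac{G}{r_{12}^2} & \frac{G}{r_{13}^2} & \frac{G}{r_{14}^2} & \frac{G}{r_{15}^2} \\ \frac{G}{r_{21}^2} & \frac{G}{r_{22}^2} & \frac{G}{r_{23}^2} & \frac{G}{r_{24}^2} & \frac{G}{r_{25}^2} \\ \frac{G}{r_{31}^2} & \frac{G}{r_{32}^2} & \frac{G}{r_{33}^2} & \frac{G}{r_{34}^2} & \frac{G}{r_{35}^2} \\ \frac{G}{r_{41}^2} & \frac{G}{r_{42}^2} & \frac{G}{r_{43}^2} & \frac{G}{r_{44}^2} & \frac{G}{r_{45}^2} \\ \frac{G}{r_{51}^2} & \frac{G}{r_{52}^2} & \frac{G}{r_{53}^2} & \frac{G}{r_{54}^2} & \frac{G}{r_{55}^2} \end{bmatrix}$$
To solve for the model parameters that fit our data, we might be able to invert the matrix $F$ to directly convert the measurements into our model parameters. For example:
$$p = F^{-1} d_\text{obs}$$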
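The five-station, five-mass example can be sketched with NumPy. The geometry below is entirely hypothetical (station positions, mass positions and depths in metres, masses in kilograms) and was chosen so that the matrix is comfortably invertible; the matrix entries follow the simplified point-mass expression, with entry (i, j) equal to the gravitational constant divided by the squared distance between station i and mass j. The code uses `numpy.linalg.solve` rather than forming the inverse explicitly, which gives the same result for an invertible matrix.

```python
import numpy as np

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2

# Hypothetical geometry: stations on the surface, one point mass below each.
station_x = np.array([0.0, 100.0, 200.0, 300.0, 400.0])  # m
mass_x = station_x.copy()                                 # m
mass_depth = np.array([30.0, 40.0, 35.0, 45.0, 30.0])     # m

# r[i, j]: distance between the i-th station and the j-th mass.
r = np.sqrt((station_x[:, None] - mass_x[None, :]) ** 2 + mass_depth[None, :] ** 2)
F = G / r**2  # 5 x 5 forward matrix

# Synthetic, noise-free data generated from assumed "true" masses.
p_true = np.array([2.0e9, 1.0e9, 3.0e9, 1.5e9, 2.5e9])  # kg
d_obs = F @ p_true

# Recover the masses from the data by solving the square linear system.
p_est = np.linalg.solve(F, d_obs)
```

With noise-free data and this well-separated geometry, `p_est` reproduces `p_true` up to rounding error. That outcome depends on the chosen geometry and on the absence of noise.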
A system with five equations and five unknowns is a very specific situation: our example was designed to end up with this specificity. In general, the numbers of data and unknowns are different so that matrix $F$ is not square.
However, even a square matrix can have no inverse: matrix $F$ can be rank deficient and the solution of the system $d_\text{obs} = Fp$ is then not unique. The solution of the inverse problem will therefore be undetermined. This is a first difficulty. Over-determined systems have other issues.
Noise may also corrupt our observations, possibly placing $d_\text{obs}$ outside the space $F(P)$ of possible responses to model parameters, so that a solution of the system $d_\text{obs} = Fp$ may not exist. This is another difficulty.
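Both difficulties can be reproduced on a small synthetic case. The sketch below is an illustration under assumptions, not part of the original example: the geometry and masses are hypothetical, two of the masses are deliberately placed at the same location to make the matrix rank deficient, and a least-squares fit (`numpy.linalg.lstsq`) is used only as a numerical test of whether the data lie in the range of the forward matrix.

```python
import numpy as np

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2
station_x = np.array([0.0, 100.0, 200.0, 300.0, 400.0])  # hypothetical stations (m)

def forward_matrix(mass_x, mass_depth):
    """Matrix with entries G / r_ij**2 for stations at the surface."""
    r2 = (station_x[:, None] - mass_x[None, :]) ** 2 + mass_depth[None, :] ** 2
    return G / r2

# The second and third masses share one location, so two columns are identical.
mass_x = np.array([0.0, 100.0, 100.0, 300.0, 400.0])
mass_depth = np.array([30.0, 40.0, 40.0, 45.0, 30.0])
F = forward_matrix(mass_x, mass_depth)
rank = np.linalg.matrix_rank(F)  # smaller than 5: F is rank deficient

# Non-uniqueness: shifting mass between the co-located bodies leaves the data unchanged.
p_a = np.array([2.0e9, 1.0e9, 3.0e9, 1.5e9, 2.5e9])
p_b = p_a + 5.0e8 * np.array([0.0, 1.0, -1.0, 0.0, 0.0])
same_data = np.allclose(F @ p_a, F @ p_b, rtol=1e-12, atol=0.0)

def relative_residual(matrix, d):
    """Smallest achievable ||matrix @ p - d|| / ||d||, found by least squares."""
    p_ls, *_ = np.linalg.lstsq(matrix, d, rcond=None)
    return np.linalg.norm(matrix @ p_ls - d) / np.linalg.norm(d)

# Non-existence: noise generally pushes the data out of the range of F.
rng = np.random.default_rng(0)
d_clean = F @ p_a
d_noisy = d_clean * (1.0 + 0.01 * rng.normal(size=5))  # about 1% noise
res_clean = relative_residual(F, d_clean)  # at rounding-error level
res_noisy = relative_residual(F, d_noisy)  # no model fits the noisy data exactly
```

Two different mass vectors, `p_a` and `p_b`, predict the same data, which is the non-uniqueness described above. The residual that stays nonzero for the noisy data shows that no model reproduces those data exactly.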