Halting problem


In computability theory, the halting problem is the problem of determining, from a description of an arbitrary computer program and an input, whether the program will finish running, or continue to run forever. The halting problem is undecidable, meaning that no general algorithm exists that solves the halting problem for all possible program–input pairs. The problem comes up often in discussions of computability since it demonstrates that some functions are mathematically definable but not computable.
A key part of the formal statement of the problem is a mathematical definition of a computer and program, usually via a Turing machine. The proof then shows, for any program that might determine whether programs halt, that a "pathological" program exists for which makes an incorrect determination. Specifically, is the program that, when called with some input, passes its own source and its input to f and does the opposite of what predicts will do. The behavior of on shows undecidability as it means no program will solve the halting problem in every possible case.

Background

The halting problem is a decision problem about properties of computer programs on a fixed Turing-complete model of computation, i.e., all programs that can be written in some given programming language that is general enough to be equivalent to a Turing machine. The problem is to determine, given a program and an input to the program, whether the program will eventually halt when run with that input. In this abstract framework, there are no resource limitations on the amount of memory or time required for the program's execution; it can take arbitrarily long and use an arbitrary amount of storage space before halting. The question is simply whether the given program will ever halt on a particular input.
For example, in pseudocode, the program
does not halt; rather, it goes on forever in an infinite loop. On the other hand, the program
does halt.
While deciding whether these programs halt is simple, more complex programs prove problematic. One approach to the problem might be to run the program for some number of steps and check if it halts. However, as long as the program is running, it is unknown whether it will eventually halt or run forever. Turing proved no algorithm exists that always correctly decides whether, for a given arbitrary program and input, the program halts when run with that input. The essence of Turing's proof is that any such algorithm can be made to produce contradictory output and therefore cannot be correct.

Programming consequences

Some infinite loops can be quite useful. For instance, event loops are typically coded as infinite loops. However, most subroutines are intended to finish. In particular, in hard real-time computing, programmers attempt to write subroutines that are guaranteed to finish, and to finish before a given deadline.
Sometimes these programmers use some general-purpose programming language, but attempt to write in a restricted style, such as MISRA C or SPARK, that makes it easy to prove that the resulting subroutines finish before the given deadline.
Other times these programmers apply the rule of least power; they deliberately use a computer language that is not quite fully Turing-complete. Often, these are languages that guarantee all subroutines finish, such as Rocq.

Common pitfalls

The difficulty in the halting problem lies in the requirement that the decision procedure must work for all programs and inputs. A particular program either halts on a given input or does not halt. Consider one algorithm that always answers "halts" and another that always answers "does not halt". For any specific program and input, one of these two algorithms answers correctly, even though nobody may know which one. Yet neither algorithm solves the halting problem generally.
There are programs that simulate the execution of whatever source code they are given. Such programs can demonstrate that a program does halt if this is the case: the interpreter itself will eventually halt its simulation, which shows that the original program halted. However, an interpreter will not halt if its input program does not halt, so this approach cannot solve the halting problem as stated; it does not successfully answer "does not halt" for programs that do not halt.
The halting problem is theoretically decidable for linear bounded automata or deterministic machines with finite memory. A machine with finite memory has a finite number of configurations, and thus any deterministic program on it must eventually either halt or repeat a previous configuration:
However, a computer with a million small parts, each with two states, would have at least 21,000,000 possible states:
Although a machine may be finite, and finite automata "have a number of theoretical limitations":
It can also be decided automatically whether a nondeterministic machine with finite memory halts on none, some, or all of the possible sequences of nondeterministic decisions, by enumerating states after each possible decision.

History

In April 1936, Alonzo Church published his proof of the undecidability of a problem in the lambda calculus. Turing's proof was published later, in January 1937. Since then, many other undecidable problems have been described, including the halting problem which emerged in the 1950s.

Timeline

  • Origin of the halting problem

Many papers and textbooks refer the definition and proof of undecidability of the halting problem to Turing's 1936 paper. However, this is not correct. Turing did not use the terms "halt" or "halting" in any of his published works, including his 1936 paper. A search of the academic literature from 1936 to 1958 showed that the first published material using the term "halting problem" was. However, Rogers says he had a draft of available to him, and Martin Davis states in the introduction that "the expert will perhaps find some novelty in the arrangement and treatment of topics", so the terminology must be attributed to Davis. Davis stated in a letter that he had been referring to the halting problem since 1952. The usage in Davis's book is as follows:
A possible precursor to Davis's formulation is Kleene's 1952 statement, which differs only in wording:
The halting problem is Turing equivalent to both Davis's printing problem and to the printing problem considered in Turing's 1936 paper. However, Turing equivalence is rather loose and does not mean that the two problems are the same. There are machines which print but do not halt, and halt but not print. The printing and halting problems address different issues and exhibit important conceptual and technical differences. Thus, Davis was simply being modest when he said:

Formalization

In his original proof Turing formalized the concept of algorithm by introducing Turing machines. However, the result is in no way specific to them; it applies equally to any other model of computation that is equivalent in its computational power to Turing machines, such as Markov algorithms, Lambda calculus, Post systems, register machines, or tag systems.
What is important is that the formalization allows a straightforward mapping of algorithms to some data type that the algorithm can operate upon. For example, if the formalism lets algorithms define functions over strings then there should be a mapping of these algorithms to strings, and if the formalism lets algorithms define functions over natural numbers then there should be a mapping of algorithms to natural numbers. The mapping to strings is usually the most straightforward, but strings over an alphabet with n characters can also be mapped to numbers by interpreting them as numbers in an n-ary numeral system.

Representation as a set

The conventional representation of decision problems is the set of objects possessing the property in question. The halting set
represents the halting problem.
This set is recursively enumerable, which means there is a computable function that lists all of the pairs it contains. However, the complement of this set is not recursively enumerable.
There are many equivalent formulations of the halting problem; any set whose Turing degree equals that of the halting problem is such a formulation. Examples of such sets include:
  • .

    Proof concept

outlined a proof by contradiction that the halting problem is not solvable. The proof proceeds as follows: Suppose that there exists a total computable function halts that returns true if the subroutine f halts and returns false otherwise. Now consider the following subroutine:

def g -> None:
if halts:
loop_forever

halts must either return true or false, because halts was assumed to be total. If halts returns true, then g will call loop_forever and never halt, which is a contradiction. If halts returns false, then g will halt, because it will not call loop_forever; this is also a contradiction. Overall, g does the opposite of what halts says g should do, so halts can not return a truth value that is consistent with whether g halts. Therefore, the initial assumption that halts is a total computable function must be false.

Sketch of rigorous proof

The concept above shows the general method of the proof, but the computable function halts does not directly take a subroutine as an argument; instead it takes the source code of a program. Moreover, the definition of g is self-referential. A rigorous proof addresses these issues. The overall goal is to show that there is no total computable function that decides whether an arbitrary program i halts on arbitrary input x; that is, the following function h is not computable:
Here program i refers to the i th program in an enumeration of all the programs of a fixed Turing-complete model of computation.

Possible values for a total computable function f arranged in a 2D array. The orange cells are the diagonal. The values of f and g are shown at the bottom; U indicates that the function g is undefined for a particular input value.

The proof proceeds by directly establishing that no total computable function with two arguments can be the required function h. As in the sketch of the concept, given any total computable binary function f, the following partial function g is also computable by some program e:
The verification that g is computable relies on the following constructs :
  • computable subprograms,
  • duplication of values,
  • conditional branching,
  • not producing a defined result,
  • returning a value of 0.
The following pseudocode for e illustrates a straightforward way to compute g:


procedure e:
if f 0 then
return 0
else
loop forever


Because g is partial computable, there must be a program e that computes g, by the assumption that the model of computation is Turing-complete. This program is one of all the programs on which the halting function h is defined. The next step of the proof shows that h will not have the same value as f.
It follows from the definition of g that exactly one of the following two cases must hold:
  • f = 0 and so g = 0. In this case program e halts on input e, so h = 1.
  • f ≠ 0 and so g is undefined. In this case program e does not halt on input e, so h = 0.
In either case, f cannot be the same function as h. Because f was an arbitrary total computable function with two arguments, all such functions must differ from h.
This proof is analogous to Cantor's diagonal argument. One may visualize a two-dimensional array with one column and one row for each natural number, as indicated in the table above. The value of f is placed at column i, row j. Because f is assumed to be a total computable function, any element of the array can be calculated using f. The construction of the function g can be visualized using the main diagonal of this array. If the array has a 0 at position, then g is 0. Otherwise, g is undefined. The contradiction comes from the fact that there is some column e of the array corresponding to g itself. Now assume f was the halting function h, if g is defined, g halts so f = 1. But g = 0 only when f = 0, contradicting f = 1. Similarly, if g is not defined, then halting function f = 0, which leads to g = 0 under g's construction. This contradicts the assumption of g not being defined. In both cases contradiction arises. Therefore any arbitrary computable function f cannot be the halting function h.