Computational complexity


In computer science, the computational complexity or simply complexity of an algorithm is the amount of resources required to run it. Particular focus is given to computation time and memory storage requirements. The complexity of a problem is the complexity of the best algorithms that allow solving the problem.
The study of the complexity of explicitly given algorithms is called analysis of algorithms, while the study of the complexity of problems is called computational complexity theory. Both areas are highly related, as the complexity of an algorithm is always an upper bound on the complexity of the problem solved by this algorithm. Moreover, for designing efficient algorithms, it is often fundamental to compare the complexity of a specific algorithm to the complexity of the problem to be solved. Also, in most cases, the only thing that is known about the complexity of a problem is that it is no higher than the complexity of the most efficient known algorithms. Therefore, there is a large overlap between analysis of algorithms and complexity theory.
As the amount of resources required to run an algorithm generally varies with the size of the input, the complexity is typically expressed as a function n → f(n), where n is the size of the input and f(n) is either the worst-case complexity or the average-case complexity. Time complexity is generally expressed as the number of required elementary operations on an input of size n, where elementary operations are assumed to take a constant amount of time on a given computer and to change only by a constant factor when run on a different computer. Space complexity is generally expressed as the amount of memory required by an algorithm on an input of size n.

Resources

Time

The resource that is most commonly considered is time. When "complexity" is used without qualification, this generally means time complexity.
The usual units of time are not used in complexity theory because they are too dependent on the choice of a specific computer and on the evolution of technology. For instance, a computer today can execute an algorithm significantly faster than a computer from the 1960s; however, this is not an intrinsic feature of the algorithm but rather a consequence of technological advances in computer hardware. Complexity theory seeks to quantify the intrinsic time requirements of algorithms, that is, the basic time constraints an algorithm would place on any computer. This is achieved by counting the number of elementary operations that are executed during the computation. These operations are assumed to take constant time on a given machine, and are often called steps.
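As a minimal illustration (a Python sketch with an illustrative function name; the choice of what counts as a step is only a convention), an algorithm can be instrumented to count its own elementary operations; the count depends only on the algorithm and its input, not on the speed of the machine running it.

```python
def sum_of_squares(values):
    """Sum the squares of a list of numbers, counting elementary operations.

    By convention, each multiplication and each addition is counted as one
    step; a different convention would change the count only by a constant
    factor.
    """
    total = 0
    steps = 0
    for v in values:
        total += v * v   # one multiplication and one addition
        steps += 2
    return total, steps

# The step count is a property of the algorithm and of the input size,
# not of the computer the algorithm runs on.
print(sum_of_squares(list(range(10)))[1])    # 20 steps for an input of size 10
print(sum_of_squares(list(range(1000)))[1])  # 2000 steps for an input of size 1000
```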

Bit complexity

Formally, the bit complexity refers to the number of operations on bits that are needed for running an algorithm. With most models of computation, it equals the time complexity up to a constant factor. On computers, the number of operations on machine words that are needed is also proportional to the bit complexity. So, the time complexity and the bit complexity are equivalent for realistic models of computation.

Space

Another important resource is the size of computer memory that is needed for running algorithms.

Circuit

In circuit complexity, computation is modeled by Boolean circuits, and the resources that are considered are the number of gates of a circuit (its size) and the length of its longest path from an input to an output (its depth).

Communication

For the class of distributed algorithms that are commonly executed by multiple, interacting parties, the resource that is of most interest is the communication complexity. It is the necessary amount of communication between the executing parties.
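As a small sketch (illustrative Python, not a formal protocol model), the trivial protocol for deciding whether two parties hold equal n-bit strings has one party send its entire input to the other, for a communication cost of n bits; for deterministic protocols this is essentially the best possible for this problem, although randomized protocols can use far fewer bits.

```python
def equality_protocol(alice_bits: str, bob_bits: str):
    """Trivial protocol for the EQUALITY problem: Alice sends her whole
    input to Bob, who compares it with his own.

    The communication cost is len(alice_bits) bits.
    """
    message = alice_bits              # n bits sent from Alice to Bob
    bits_sent = len(message)
    return message == bob_bits, bits_sent

print(equality_protocol("101101", "101101"))  # (True, 6)
print(equality_protocol("101101", "100101"))  # (False, 6)
```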

Others

The number of arithmetic operations is another resource that is commonly used. In this case, one talks of arithmetic complexity. If one knows an upper bound on the size of the binary representation of the numbers that occur during a computation, the time complexity is generally the arithmetic complexity multiplied by a constant factor.
For many algorithms the size of the integers that are used during a computation is not bounded, and it is not realistic to consider that arithmetic operations take a constant time. Therefore, the time complexity, generally called bit complexity in this context, may be much larger than the arithmetic complexity. For example, the arithmetic complexity of the computation of the determinant of an n × n integer matrix is O(n^3) for the usual algorithms (Gaussian elimination). The bit complexity of the same algorithms is exponential in n, because the size of the coefficients may grow exponentially during the computation. On the other hand, if these algorithms are coupled with multi-modular arithmetic, the bit complexity may be reduced to roughly n^4, that is, O(n^4) up to logarithmic factors.
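The gap between the two measures can be seen on a much simpler computation (an illustrative Python sketch, unrelated to the determinant algorithms above): repeated squaring uses only k arithmetic operations, but the operands double in bit length at every step, so the bit complexity is exponential in k.

```python
def repeated_squaring(k):
    """Compute 2**(2**k) by squaring k times.

    Arithmetic complexity: k multiplications.
    Bit complexity: exponential in k, because the operand roughly doubles
    in bit length at every squaring.
    """
    x = 2
    for i in range(1, k + 1):
        x = x * x
        print(f"after squaring {i}: {x.bit_length()} bits")
    return x

repeated_squaring(6)
# after squaring 1: 3 bits
# after squaring 6: 65 bits  (after k squarings: 2**k + 1 bits)
```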
In sorting and searching, the resource that is generally considered is the number of entry comparisons. This is generally a good measure of the time complexity if data are suitably organized.
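For instance (a Python sketch with illustrative names), binary search on a sorted array of n entries uses at most about log2(n) entry comparisons, which illustrates why the bound depends on the data being suitably organized.

```python
def binary_search(sorted_values, target):
    """Return (index, comparisons) if target is found, else (-1, comparisons).

    Each loop iteration is counted as one (three-way) entry comparison.
    """
    comparisons = 0
    lo, hi = 0, len(sorted_values) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if sorted_values[mid] == target:
            return mid, comparisons
        elif sorted_values[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, comparisons

data = list(range(1_000_000))            # sorted data, n = 10**6
print(binary_search(data, 765_432)[1])   # at most about log2(10**6) ~ 20 comparisons
```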

Complexity as a function of input size

It is impossible to count the number of steps of an algorithm on every possible input separately. As the complexity generally increases with the size of the input, the complexity is typically expressed as a function of the size n of the input. However, the complexity of an algorithm may vary dramatically for different inputs of the same size. Therefore, several complexity functions are commonly used.
The worst-case complexity is the maximum of the complexity over all inputs of size n, and the average-case complexity is the average of the complexity over all inputs of size n. Generally, when "complexity" is used without being further specified, it is the worst-case time complexity that is considered.
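As a concrete sketch (illustrative Python; the averaging is over the position of the searched entry, one common convention), linear search in an array of n entries makes n comparisons in the worst case and about (n + 1) / 2 on average.

```python
def linear_search_comparisons(values, target):
    """Number of entry comparisons made by a linear search for target."""
    for i, v in enumerate(values):
        if v == target:
            return i + 1
    return len(values)

n = 100
values = list(range(n))
counts = [linear_search_comparisons(values, t) for t in values]

print(max(counts))                # worst case: n = 100 comparisons
print(sum(counts) / len(counts))  # average case: (n + 1) / 2 = 50.5 comparisons
```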

Asymptotic complexity

It is generally difficult to compute precisely the worst-case and the average-case complexity. In addition, these exact values are of little practical use, as any change of computer or of model of computation would change the complexity somewhat. Moreover, the resource use is not critical for small values of n, so for small n the ease of implementation is generally more important than a low complexity.
For these reasons, one generally focuses on the behavior of the complexity for large n, that is, on its asymptotic behavior as n tends to infinity. Therefore, the complexity is generally expressed using big O notation.
For example, the usual algorithm for integer multiplication has a complexity of O(n^2); this means that there is a constant c_u such that the multiplication of two integers of at most n digits may be done in a time less than c_u n^2. This bound is sharp in the sense that the worst-case complexity and the average-case complexity are Ω(n^2), which means that there is also a constant c_l such that these complexities are larger than c_l n^2. The radix does not appear in these complexities, as changing the radix changes only the constants c_u and c_l.
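A sketch of the schoolbook algorithm (illustrative Python; digits are stored least-significant first) makes the quadratic bound concrete: multiplying two n-digit integers performs exactly n^2 digit-by-digit products, whatever the radix.

```python
def schoolbook_multiply(a_digits, b_digits, radix=10):
    """Multiply two non-negative integers given as little-endian digit lists.

    Returns the product's digits and the number of digit-by-digit
    multiplications, which is exactly len(a_digits) * len(b_digits).
    """
    result = [0] * (len(a_digits) + len(b_digits))
    digit_products = 0
    for i, a in enumerate(a_digits):
        carry = 0
        for j, b in enumerate(b_digits):
            digit_products += 1
            total = result[i + j] + a * b + carry
            result[i + j] = total % radix
            carry = total // radix
        result[i + len(b_digits)] += carry
    return result, digit_products

# Two 4-digit numbers: 4 * 4 = 16 digit products, i.e. quadratic in n.
digits, count = schoolbook_multiply([4, 3, 2, 1], [8, 7, 6, 5])      # 1234 * 5678
print(count)                                          # 16
print(sum(d * 10**k for k, d in enumerate(digits)))   # 7006652
```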

Models of computation

The evaluation of the complexity relies on the choice of a model of computation, which consists of defining the basic operations that are done in a unit of time. When the model of computation is not explicitly specified, it is generally implicitly assumed to be a multitape Turing machine, since several more realistic models of computation, such as random-access machines, are asymptotically equivalent for most problems. It is only for very specific and difficult problems, such as integer multiplication in time O(n log n), that the explicit definition of the model of computation is required for proofs.

Deterministic models

A deterministic model of computation is a model of computation such that the successive states of the machine and the operations to be performed are completely determined by the preceding state. Historically, the first deterministic models were recursive functions, lambda calculus, and Turing machines. The model of random-access machines is also widely used, as a closer counterpart to real computers.
When the model of computation is not specified, it is generally assumed to be a multitape Turing machine. For most algorithms, the time complexity is the same on multitape Turing machines as on RAM-machines, although some care may be needed in how data is stored in memory to get this equivalence.

Non-deterministic computation

In a non-deterministic model of computation, such as non-deterministic Turing machines, some choices may be made at some steps of the computation. In complexity theory, one considers all possible choices simultaneously, and the non-deterministic time complexity is the time needed when the best choices are always made. In other words, one considers that the computation is done simultaneously on as many (identical) processors as needed, and the non-deterministic computation time is the time spent by the first processor that finishes the computation. This parallelism is partly amenable to quantum computing via superposed entangled states when running specific quantum algorithms, such as Shor's algorithm for finding the prime factors of an integer.
Even though such a computation model is not realistic yet, it has theoretical importance, mostly related to the P = NP problem, which questions the identity of the complexity classes formed by taking "polynomial time" and "non-deterministic polynomial time" as upper bounds. Simulating an NP-algorithm on a deterministic computer usually takes "exponential time". A problem is in the complexity class NP if it may be solved in polynomial time on a non-deterministic machine. A problem is NP-complete if, roughly speaking, it is in NP and is not easier than any other NP problem. Many combinatorial problems, such as the knapsack problem, the travelling salesman problem, and the Boolean satisfiability problem, are NP-complete. For all these problems, the best known algorithm has exponential complexity. If any one of these problems could be solved in polynomial time on a deterministic machine, then all NP problems could also be solved in polynomial time, and one would have P = NP. It is generally conjectured that P ≠ NP, with the practical implication that the worst cases of NP problems are intrinsically difficult to solve, i.e., take longer than any reasonable time span for interesting lengths of input.
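A small sketch of this contrast for the subset-sum variant of the knapsack problem (illustrative Python; not the best known algorithms): verifying a proposed certificate takes polynomial time, whereas the straightforward deterministic search examines up to 2^n subsets.

```python
from itertools import combinations

def verify(values, target, indices):
    """Polynomial-time verifier: do the certificate indices select
    distinct entries of `values` that sum to `target`?"""
    return (len(set(indices)) == len(indices)
            and all(0 <= i < len(values) for i in indices)
            and sum(values[i] for i in indices) == target)

def brute_force(values, target):
    """Deterministic search over all 2**n subsets (exponential time)."""
    n = len(values)
    for size in range(n + 1):
        for indices in combinations(range(n), size):
            if sum(values[i] for i in indices) == target:
                return list(indices)
    return None

values = [3, 34, 4, 12, 5, 2]
certificate = brute_force(values, 9)     # exponential search for a solution
print(certificate)                       # [2, 4]  (4 + 5 = 9)
print(verify(values, 9, certificate))    # True, checked in polynomial time
```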