Theoretical computer science


Theoretical computer science is a subfield of computer science and mathematics that focuses on the abstract and mathematical foundations of computation.
It is difficult to circumscribe the theoretical areas precisely. The ACM's Special Interest Group on Algorithms and Computation Theory provides the following description:

History

While logical inference and mathematical proof had existed previously, in 1931 Kurt Gödel proved with his incompleteness theorem that there are fundamental limitations on what statements could be proved or disproved.
Information theory was added to the field with a 1948 mathematical theory of communication by Claude Shannon. In the same decade, Donald Hebb introduced a mathematical model of learning in the brain. With mounting biological data supporting this hypothesis with some modification, the fields of neural networks and parallel distributed processing were established. In 1971, Stephen Cook and, working independently, Leonid Levin, proved that there exist practically relevant problems that are NP-complete – a landmark result in computational complexity theory.
Modern theoretical computer science research is based on these basic developments, but includes many other mathematical and interdisciplinary problems that have been posed, as shown below:

Topics

Algorithms

An algorithm is a step-by-step procedure for calculations. Algorithms are used for calculation, data processing, and automated reasoning.
An algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function. Starting from an initial state and initial input, the instructions describe a computation that, when executed, proceeds through a finite number of well-defined successive states, eventually producing "output" and terminating at a final ending state. The transition from one state to the next is not necessarily deterministic; some algorithms, known as randomized algorithms, incorporate random input.

Automata theory

is the study of abstract machines and automata, as well as the computational problems that can be solved using them. It is a theory in theoretical computer science, under discrete mathematics. Automata comes from the Greek word αὐτόματα, meaning "self-acting".
Automata Theory is the study of self-operating virtual machines to help in the logical understanding of input and output process, without or with intermediate stage of computation.

Coding theory

is the study of the properties of codes and their fitness for a specific application. Codes are used for data compression, cryptography, error correction and more recently also for network coding. Codes are studied by various scientific disciplines – such as information theory, electrical engineering, mathematics, and computer science – for the purpose of designing efficient and reliable data transmission methods. This typically involves the removal of redundancy and the correction of errors in the transmitted data.

Computational complexity theory

is a branch of the theory of computation that focuses on classifying computational problems according to their inherent difficulty, and relating those classes to each other. A computational problem is understood to be a task that is in principle amenable to being solved by a computer, which is equivalent to stating that the problem may be solved by mechanical application of mathematical steps, such as an algorithm.
A problem is regarded as inherently difficult if its solution requires significant resources, whatever the algorithm used. The theory formalizes this intuition, by introducing mathematical models of computation to study these problems and quantifying the amount of resources needed to solve them, such as time and storage. Other complexity measures are also used, such as the amount of communication, the number of gates in a circuit and the number of processors. One of the roles of computational complexity theory is to determine the practical limits on what computers can and cannot do.

Computational geometry

is a branch of computer science devoted to the study of algorithms that can be stated in terms of geometry. Some purely geometrical problems arise out of the study of computational geometric algorithms, and such problems are also considered to be part of computational geometry.
The main impetus for the development of computational geometry as a discipline was progress in computer graphics and computer-aided design and manufacturing, but many problems in computational geometry are classical in nature, and may come from mathematical visualization.
Other important applications of computational geometry include robotics, geographic information systems , integrated circuit design, computer-aided engineering , computer vision.

Computational learning theory

Theoretical results in machine learning mainly deal with a type of inductive learning called supervised learning. In supervised learning, an algorithm is given samples that are labeled in some
useful way. For example, the samples might be descriptions of mushrooms, and the labels could be whether or not the mushrooms are edible. The algorithm takes these previously labeled samples and
uses them to induce a classifier. This classifier is a function that assigns labels to samples including the samples that have never been previously seen by the algorithm. The goal of the supervised learning algorithm is to optimize some measure of performance such as minimizing the number of mistakes made on new samples.

Computational number theory

, also known as algorithmic number theory, is the study of algorithms for performing number theoretic computations. The best known problem in the field is integer factorization.

Cryptography

is the practice and study of techniques for secure communication in the presence of third parties. More generally, it is about constructing and analyzing protocols that overcome the influence of adversaries and that are related to various aspects in information security such as data confidentiality, data integrity, authentication, and non-repudiation. Modern cryptography intersects the disciplines of mathematics, computer science, and electrical engineering. Applications of cryptography include ATM cards, computer passwords, and electronic commerce.
Modern cryptography is heavily based on mathematical theory and computer science practice; cryptographic algorithms are designed around computational hardness assumptions, making such algorithms hard to break in practice by any adversary. It is theoretically possible to break such a system, but it is infeasible to do so by any known practical means. These schemes are therefore termed computationally secure; theoretical advances, e.g., improvements in integer factorization algorithms, and faster computing technology require these solutions to be continually adapted. There exist information-theoretically secure schemes that cannot be broken even with unlimited computing power—an example is the one-time pad—but these schemes are more difficult to implement than the best theoretically breakable but computationally secure mechanisms.

Data structures

A data structure is a particular way of organizing data in a computer so that it can be used efficiently.
Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to specific tasks. For example, databases use B-tree indexes for small percentages of data retrieval and compilers and databases use dynamic hash tables as look up tables.
Data structures provide a means to manage large amounts of data efficiently for uses such as large databases and internet indexing services. Usually, efficient data structures are key to designing efficient algorithms. Some formal design methods and programming languages emphasize data structures, rather than algorithms, as the key organizing factor in software design. Storing and retrieving can be carried out on data stored in both main memory and in secondary memory.

Distributed computation

studies distributed systems. A distributed system is a software system in which components located on networked computers communicate and coordinate their actions by passing messages. The components interact with each other in order to achieve a common goal. Three significant characteristics of distributed systems are: concurrency of components, lack of a global clock, and independent failure of components. Examples of distributed systems vary from SOA-based systems to massively multiplayer online games to peer-to-peer applications, and blockchain networks like Bitcoin.
A computer program that runs in a distributed system is called a distributed program, and distributed programming is the process of writing such programs. There are many alternatives for the message passing mechanism, including RPC-like connectors and message queues. An important goal and challenge of distributed systems is location transparency.

Information-based complexity

studies optimal algorithms and computational complexity for continuous problems. IBC has studied continuous problems as path integration, partial differential equations, systems of ordinary differential equations, nonlinear equations, integral equations, fixed points, and very-high-dimensional integration.

Formal methods

are a particular kind of mathematics based techniques for the specification, development and verification of software and hardware systems. The use of formal methods for software and hardware design is motivated by the expectation that, as in other engineering disciplines, performing appropriate mathematical analysis can contribute to the reliability and robustness of a design.
Formal methods are best described as the application of a fairly broad variety of theoretical computer science fundamentals, in particular logic calculi, formal languages, automata theory, and program semantics, but also type systems and algebraic data types to problems in software and hardware specification and verification.