Symbolic artificial intelligence

In artificial intelligence, symbolic artificial intelligence
is the term for the collection of all methods in artificial intelligence research that are based on high-level symbolic representations of problems, logic, and search. Symbolic AI used tools such as logic programming, production rules, semantic nets and frames, and it developed applications such as knowledge-based systems, symbolic mathematics, automated theorem provers, ontologies, the semantic web, and automated planning and scheduling systems. The Symbolic AI paradigm led to important ideas in search, symbolic programming languages, agents, multi-agent systems, the semantic web, and the strengths and limitations of formal knowledge and reasoning systems.
Symbolic AI was the dominant paradigm of AI research from the mid-1950s until the mid-1990s. Researchers in the 1960s and the 1970s were convinced that symbolic approaches would eventually succeed in creating a machine with artificial general intelligence and considered this the ultimate goal of their field. An early boom, with early successes such as the Logic Theorist and Samuel's Checkers Playing Program, led to unrealistic expectations and promises and was followed by the first AI Winter as funding dried up. A second boom occurred with the rise of expert systems, their promise of capturing corporate expertise, and an enthusiastic corporate embrace. That boom, and some early successes, e.g., with XCON at DEC, was followed again by later disappointment. Problems with difficulties in knowledge acquisition, maintaining large knowledge bases, and brittleness in handling out-of-domain problems arose. Another, second, AI Winter followed. Subsequently, AI researchers focused on addressing underlying problems in handling uncertainty and in knowledge acquisition. Uncertainty was addressed with formal methods such as hidden Markov models, Bayesian reasoning, and statistical relational learning. Symbolic machine learning addressed the knowledge acquisition problem with contributions including Version Space, Valiant's PAC learning, Quinlan's ID3 decision-tree learning, case-based learning, and inductive logic programming to learn relations.
Neural networks, a subsymbolic approach, had been pursued from early days and reemerged strongly in 2012. Early examples are Rosenblatt's perceptron learning work, the backpropagation work of Rumelhart, Hinton and Williams, and work in convolutional neural networks by LeCun et al. in 1989. However, neural networks were not viewed as successful until about 2012: "Until Big Data became commonplace, the general consensus in the Al community was that the so-called neural-network approach was hopeless. Systems just didn't work that well, compared to other methods.... A revolution came in 2012, when a number of people, including a team of researchers working with Hinton, worked out a way to use the power of GPUs to enormously increase the power of neural networks." Over the next several years, deep learning had spectacular success in handling vision, speech recognition, speech synthesis, image generation, and machine translation, though symbolic approaches continue to be useful in a few domains such as computer algebra systems and proof assistants.
However, given the inherent complexity of intelligence itself, it remains an open question whether symbolic AI will be completely supplanted by connectionist AI, or whether symbolic AI may yet experience a resurgence. More recently, work by Zhang et al. has further argued that, at least at the theoretical level, there is no clear superiority among different technological paradigms.

History

A short history of symbolic AI to the present day follows below. Time periods and titles are drawn from Henry Kautz's 2020 AAAI Robert S. Engelmore Memorial Lecture and the longer Wikipedia article on the History of AI, with dates and titles differing slightly for increased clarity.

The first AI summer: irrational exuberance, 1948–1966

Success at early attempts in AI occurred in three main areas: artificial neural networks, knowledge representation, and heuristic search, contributing to high expectations. This section summarizes Kautz's reprise of early AI history.

Approaches inspired by human or animal cognition or behavior

Cybernetic approaches attempted to replicate the feedback loops between animals and their environments. A robotic turtle, with sensors, motors for driving and steering, and seven vacuum tubes for control, based on a preprogrammed neural net, was built as early as 1948. This work can be seen as an early precursor to later work in neural networks, reinforcement learning, and situated robotics.
An important early symbolic AI program was the Logic theorist, written by Allen Newell, Herbert Simon and Cliff Shaw in 1955–56, as it was able to prove 38 elementary theorems from Whitehead and Russell's Principia Mathematica. Newell, Simon, and Shaw later generalized this work to create a domain-independent problem solver, GPS. GPS solved problems represented with formal operators via state-space search using means-ends analysis.
During the 1960s, symbolic approaches achieved great success at simulating intelligent behavior in structured environments such as game-playing, symbolic mathematics, and theorem-proving. AI research was concentrated in four institutions in the 1960s: Carnegie Mellon University, Stanford, MIT and University of Edinburgh. Each one developed its own style of research. Earlier approaches based on cybernetics or artificial neural networks were abandoned or pushed into the background.
Herbert Simon and Allen Newell studied human problem-solving skills and attempted to formalize them, and their work laid the foundations of the field of artificial intelligence, as well as cognitive science, operations research and management science. Their research team used the results of psychological experiments to develop programs that simulated the techniques that people used to solve problems. This tradition, centered at Carnegie Mellon University would eventually culminate in the development of the Soar architecture in the middle 1980s.

Heuristic search

In addition to the highly specialized domain-specific kinds of knowledge that we will see later used in expert systems, early symbolic AI researchers discovered another more general application of knowledge. These were called heuristics, rules of thumb that guide a search in promising directions: "How can non-enumerative search be practical when the underlying problem is exponentially hard? The approach advocated by Simon and Newell is to employ heuristics: fast algorithms that may fail on some inputs or output suboptimal solutions." Another important advance was to find a way to apply these heuristics that guarantees a solution will be found, if there is one, not withstanding the occasional fallibility of heuristics: "The A* algorithm provided a general frame for complete and optimal heuristically guided search. A* is used as a subroutine within practically every AI algorithm today but is still no magic bullet; its guarantee of completeness is bought at the cost of worst-case exponential time.

Early work on knowledge representation and reasoning

Early work covered both applications of formal reasoning emphasizing first-order logic, along with attempts to handle common-sense reasoning in a less formal manner.

Modeling formal reasoning with logic: the "neats"

Unlike Simon and Newell, John McCarthy felt that machines did not need to simulate the exact mechanisms of human thought, but could instead try to find the essence of abstract reasoning and problem-solving with logic, regardless of whether people used the same algorithms.
His laboratory at Stanford focused on using formal logic to solve a wide variety of problems, including knowledge representation, planning and learning.
Logic was also the focus of the work at the University of Edinburgh and elsewhere in Europe which led to the development of the programming language Prolog and the science of logic programming.

Modeling implicit common-sense knowledge with frames and scripts: the "scruffies"

Researchers at MIT found that solving difficult problems in vision and natural language processing required ad hoc solutions—they argued that no simple and general principle would capture all the aspects of intelligent behavior. Roger Schank described their "anti-logic" approaches as "scruffy".
Commonsense knowledge bases are an example of "scruffy" AI, since they must be built by hand, one complicated concept at a time.

The first AI winter: crushed dreams, 1967–1977

The first AI winter was a shock:

The second AI summer: knowledge is power, 1978–1987

Knowledge-based systems

As limitations with weak, domain-independent methods became more and more apparent, researchers from all three traditions began to build knowledge into AI applications. The knowledge revolution was driven by the realization that knowledge underlies high-performance, domain-specific AI applications.
Edward Feigenbaum said:

"In the knowledge lies the power."

to describe that high performance in a specific domain requires both general and highly domain-specific knowledge. Ed Feigenbaum and Doug Lenat called this The Knowledge Principle:

Success with expert systems

This "knowledge revolution" led to the development and deployment of expert systems, the first commercially successful form of AI software.
Key expert systems were:

DENDRAL, which found the structure of organic molecules from their chemical formula and mass spectrometer readings.
MYCIN, which diagnosed bacteremia – and suggested further lab tests, when necessary – by interpreting lab results, patient history, and doctor observations. "With about 450 rules, MYCIN was able to perform as well as some experts, and considerably better than junior doctors."
INTERNIST and CADUCEUS which tackled internal medicine diagnosis. Internist attempted to capture the expertise of the chairman of internal medicine at the University of Pittsburgh School of Medicine while CADUCEUS could eventually diagnose up to 1000 different diseases.
GUIDON, which showed how a knowledge base built for expert problem solving could be repurposed for teaching.
XCON, to configure VAX computers, a then laborious process that could take up to 90 days. XCON reduced the time to about 90 minutes.

DENDRAL is considered the first expert system that relied on knowledge-intensive problem-solving. It is described below, by Ed Feigenbaum, from a Communications of the ACM interview, :
The other expert systems mentioned above came after DENDRAL. MYCIN exemplifies the classic expert system architecture of a knowledge-base of rules coupled to a symbolic reasoning mechanism, including the use of certainty factors to handle uncertainty. GUIDON shows how an explicit knowledge base can be repurposed for a second application, tutoring, and is an example of an intelligent tutoring system, a particular kind of knowledge-based application. Clancey showed that it was not sufficient simply to use MYCIN's rules for instruction, but that he also needed to add rules for dialogue management and student modeling. XCON is significant because of the millions of dollars it saved DEC, which triggered the expert system boom where most all major corporations in the US had expert systems groups, to capture corporate expertise, preserve it, and automate it:
Chess expert knowledge was encoded in Deep Blue. In 1996, this allowed IBM's Deep Blue, with the help of symbolic AI, to win in a game of chess against the world champion at that time, Garry Kasparov.