Cyc

Cyc is a long-term artificial intelligence project that aims to assemble a comprehensive ontology and knowledge base that spans the basic concepts and rules about how the world works. Hoping to capture common sense knowledge, Cyc focuses on implicit knowledge. The project began in July 1984 at MCC and was developed later by the Cycorp company.
The name "Cyc" is a registered trademark owned by Cycorp. CycL has a publicly released specification, and dozens of HL modules were described in Lenat and Guha's textbook, but the Cyc inference engine code and the full list of HL modules are Cycorp-proprietary.

WWW cyc.com

History

The project was begun in July 1984 by Douglas Lenat at the Microelectronics and Computer Technology Corporation, a research consortium started by two United States–based corporations "to counter a then ominous Japanese effort in AI, the so-called 'fifth-generation' project." From January 1995 on, the project was under active development by Cycorp, where Douglas Lenat was the CEO.
The CycL representation language started as an extension of RLL. In 1989, CycL had expanded in expressive power to higher-order logic.
Cyc's ontology grew to about 100,000 terms in 1994, and as of 2017, it contained about 1,500,000 terms. The Cyc knowledge base involving ontological terms was largely created by hand axiom-writing; it was at about 1 million in 1994, and as of 2017, it was at about 24.5 million.
By 2002, Cyc was described as having "consumed $60 million and 600 person-years of effort from programmers, philosophers and others—collectively known as Cyclists—who have been codifying what Lenat calls 'consensus reality' and entering it into a massive database."
In 2008, Cyc resources were mapped to many Wikipedia articles.

Knowledge base

The knowledge base is divided into microtheories. Unlike the knowledge base as a whole, each microtheory must be free from monotonic contradictions. Each microtheory is a first-class object in the Cyc ontology; it has a name that is a regular constant. The concept names in Cyc are CycL terms or constants. Constants start with an optional #$ and are case-sensitive. There are constants for:

Individual items known as individuals, such as #$BillClinton or #$France.Collections, such as #$Tree-ThePlant or #$EquivalenceRelation. A member of a collection is called an instance of that collection.Functions, which produce new terms from given ones. For example, #$FruitFn, when provided with an argument describing a type of plants, will return the collection of its fruits. By convention, function constants start with an upper-case letter and end with the string Fn.Truth functions, which can apply to one or more other concepts and return either true or false. For example, #$siblings is the sibling relationship, true if the two arguments are siblings. By convention, truth function constants start with a lowercase letter.

For every instance of the collection #$ChordataPhylum, there exists a female animal, which is its mother.

Inference engine

An inference engine is a computer program that tries to derive answers from a knowledge base. The Cyc inference engine performs general logical deduction. It also performs inductive reasoning, statistical machine learning and symbolic machine learning, and abductive reasoning.
The Cyc inference engine separates the epistemological problem from the heuristic problem. For the latter, Cyc used a community-of-agents architecture in which specialized modules, each with its own algorithm, became prioritized if they could make progress on the sub-problem.

Releases

OpenCyc

The first version of OpenCyc was released in spring 2002 and contained only 6,000 concepts and 60,000 facts. The knowledge base was released under the Apache License. Cycorp stated its intention to release OpenCyc under parallel, unrestricted licences to meet the needs of its users. The CycL and SubL interpreter was released free of charge, but only as a binary, without source code. It was made available for Linux and Microsoft Windows. The open source Texai project released the RDF-compatible content extracted from OpenCyc. The user interface was in Java 6.
Cycorp was a participant of a working group for the Semantic Web, Standard Upper Ontology Working Group, which was active from 2001 to 2003.
A Semantic Web version of OpenCyc was available starting in 2008, but ending sometime after 2016.
OpenCyc 4.0 was released in June 2012. OpenCyc 4.0 contained 239,000 concepts and 2,093,000 facts; however, these are mainly taxonomic assertions.
4.0 was the last released version, and around March 2017, OpenCyc was shut down for the purported reason that "because such “fragmenting” led to divergence, and led to confusion amongst its users and the technical community generally thought that OpenCyc fragment was Cyc.".

ResearchCyc

In July 2006, Cycorp released the executable of ResearchCyc 1.0, a version of Cyc aimed at the research community, at no charge. In addition to the taxonomic information, ResearchCyc includes more semantic knowledge; it also includes a large lexicon, English parsing and generation tools, and Java-based interfaces for knowledge editing and querying. It contains a system for ontology-based data integration.

Applications

In 2001, GlaxoSmithKline was funding the Cyc, though for unknown applications. In 2007, the Cleveland Clinic has used Cyc to develop a natural-language query interface of biomedical information on cardiothoracic surgeries. A query is parsed into a set of CycL fragments with open variables. The Terrorism Knowledge Base was an application of Cyc that tried to contain knowledge about "terrorist"-related descriptions. The knowledge is stored as statements in mathematical logic. The project lasted from 2004 to 2008. Lycos used Cyc for search term disambiguation, but stopped in 2001. CycSecure was produced in 2002, a network vulnerability assessment tool based on Cyc, with trials at the US STRATCOM Computer Emergency Response Team.
One Cyc application has the stated aim to help students doing math at a 6th grade level. The application, called MathCraft, was supposed to play the role of a fellow student who is slightly more confused than the user about the subject. As the user gives good advice, Cyc allows the avatar to make fewer mistakes.

Criticisms

The Cyc project has been described as "one of the most controversial endeavors of the artificial intelligence history". Catherine Havasi, CEO of Luminoso, says that Cyc is the predecessor project to IBM's Watson. Machine-learning scientist Pedro Domingos refers to the project as a "catastrophic failure" for the unending amount of data required to produce any viable results and the inability for Cyc to evolve on its own.
Gary Marcus, a cognitive scientist and the cofounder of an AI company called Geometric Intelligence, said in 2016 that "it represents an approach that is very different from all the deep-learning stuff that has been in the news." This is consistent with Doug Lenat's position that "Sometimes the veneer of intelligence is not enough".

Notable employees

This is a list of some of the notable people who work or have worked on Cyc either while it was a project at MCC or Cycorp.