Knowledge graph

A knowledge graph is a knowledge base that uses a graph-structured data model or topology to integrate knowledge and data. Knowledge graphs are often used to store interlinked descriptions of entities — real-world objects, events, situations or abstract concepts — with free-form semantics, not fitting into a single traditional ontology.
Since the development of the Semantic Web, knowledge graphs are often associated with linked open data projects, focusing on the connections between concepts and entities. They are also prominently associated with and used by search engines such as Google, Bing, and Yahoo; knowledge engines and question-answering services such as WolframAlpha, Apple's Siri, and Amazon Alexa; and social networks such as LinkedIn and Facebook.
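As an illustration of the graph-structured data model described above, a knowledge graph can be sketched as a set of (subject, predicate, object) triples. This is a minimal sketch rather than any particular system's storage format; the entity names, relation names, and the helper functions `build_index` and `match` are hypothetical and chosen only for the example.

```python
from collections import defaultdict

# Illustrative facts written as (subject, predicate, object) triples.
# The identifiers are hypothetical and not taken from any real dataset.
TRIPLES = [
    ("Ada_Lovelace", "born_in", "London"),
    ("London", "capital_of", "United_Kingdom"),
    ("Ada_Lovelace", "collaborated_with", "Charles_Babbage"),
    ("Charles_Babbage", "born_in", "London"),
]


def build_index(triples):
    """Index outgoing edges by subject: subject -> [(predicate, object), ...]."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return index


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for s, p, o in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# Which entities have a "born_in" edge pointing at London?
born_in_london = [s for s, _, _ in match(TRIPLES, predicate="born_in", obj="London")]
```

Because every fact has the same three-part shape, new kinds of entities and relations can be added without changing a schema, which is one reason this representation suits interlinked descriptions of heterogeneous entities.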


The term was coined as early as 1972, in a discussion of how to build modular instructional systems for courses. In the late 1980s, Groningen and Twente universities jointly began a project called Knowledge Graphs, focusing on the design of semantic networks with edges restricted to a limited set of relations, to facilitate algebras on the graph. In subsequent decades, the distinction between semantic networks and knowledge graphs was blurred.
Some early knowledge graphs were topic-specific. In 1985, WordNet was founded, capturing semantic relationships between words and meanings, an application of this idea to language itself. In 1998, Andrew Edmonds of Science in Finance Ltd in the UK created a system called ThinkBase that offered fuzzy-logic-based reasoning in a graphical context. In 2005, Marc Wick founded GeoNames to capture relationships between different geographic names and locales and associated entities.
In 2007, both DBpedia and Freebase were founded as graph-based knowledge repositories for general-purpose knowledge. DBpedia focused exclusively on data extracted from Wikipedia, while Freebase also included a range of public datasets. Neither described itself as a 'knowledge graph', but both developed and described related concepts.
In 2012, Google introduced its Knowledge Graph, building on DBpedia and Freebase among other sources. It later incorporated RDFa and microdata formats from indexed web pages, which in time were standardized around published vocabularies. The Google Knowledge Graph became a successful complement to string-based search within Google, and its popularity online brought the term into more common use.
Since then, several large multinationals have advertised their use of knowledge graphs, further popularising the term. These include Facebook, LinkedIn, Airbnb, Microsoft, Amazon, Uber and eBay.


There is no single commonly accepted definition of a knowledge graph. Most definitions view the topic through a Semantic Web lens and include these features:
There are, however, many knowledge graph representations for which some of these features are not relevant. For those knowledge graphs this simpler definition may be more useful:
In addition to the above examples, the term has been used to describe open knowledge projects such as YAGO and Wikidata; federations like the Linked Open Data cloud; a range of commercial search tools, including Yahoo’s semantic search assistant Spark, Google’s Knowledge Vault, and Microsoft’s Satori; and the LinkedIn and Facebook entity graphs.

Using a knowledge graph for reasoning over data

A knowledge graph formally represents semantics by describing entities and their relationships. Knowledge graphs may make use of ontologies as a schema layer. By doing this, they allow logical inference for retrieving implicit knowledge rather than only allowing queries requesting explicit knowledge.
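The difference between explicit and implicit knowledge can be made concrete with a small forward-chaining sketch. It assumes a toy schema layer with a single `subclass_of` relation and two rules loosely modelled on RDFS subclass semantics; the facts, relation names, and the `infer` function are hypothetical, and production reasoners are considerably more elaborate.

```python
# Hypothetical explicit facts: one instance statement plus a small
# class hierarchy acting as the schema (ontology) layer.
FACTS = {
    ("Rex", "type", "Dog"),
    ("Dog", "subclass_of", "Mammal"),
    ("Mammal", "subclass_of", "Animal"),
}


def infer(facts):
    """Apply two rules repeatedly until no new triples appear.

    Rule 1: (A subclass_of B), (B subclass_of C) => (A subclass_of C)
    Rule 2: (x type A),        (A subclass_of B) => (x type B)
    """
    closed = set(facts)
    while True:
        new = set()
        for s, p, o in closed:
            if p not in ("subclass_of", "type"):
                continue
            for s2, p2, o2 in closed:
                # Follow one subclass_of edge out of the current object.
                if p2 == "subclass_of" and s2 == o:
                    new.add((s, p, o2))
        if new <= closed:  # fixpoint reached: nothing left to derive
            return closed
        closed |= new


closed = infer(FACTS)
implicit = closed - FACTS  # triples that were derived rather than stated
```

A query for everything of type `Animal` finds nothing in `FACTS` alone, but returns `Rex` when run against `closed`, because `("Rex", "type", "Animal")` is derived from the schema rather than stated explicitly.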
To allow the use of knowledge graphs in various machine learning tasks, several methods for deriving latent feature representations of entities and relations have been devised. These knowledge graph embeddings allow knowledge graphs to be connected to machine learning methods that require feature vectors, similar to word embeddings. This can complement other estimates of conceptual similarity.
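To make the idea of an embedding concrete, the sketch below scores triples in the style of TransE, one well-known embedding model in which a relation is treated as a translation from the head entity's vector to the tail entity's vector. TransE is used here only as an example, since the text above does not name a specific method. The two-dimensional vectors are hand-picked for illustration and not learned, and the function names are hypothetical.

```python
import math

# Hand-picked 2-d vectors for illustration only. Real embeddings are learned
# from the graph and usually have tens or hundreds of dimensions.
ENTITY = {
    "Paris": (1.0, 0.0),
    "France": (1.0, 1.0),
    "Berlin": (3.0, 0.0),
    "Germany": (3.0, 1.0),
}
RELATION = {"capital_of": (0.0, 1.0)}


def transe_distance(head, relation, tail):
    """TransE-style distance ||h + r - t||; smaller means more plausible."""
    h, r, t = ENTITY[head], RELATION[relation], ENTITY[tail]
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(head, relation):
    """Order all entities from most to least plausible tail for (head, relation)."""
    return sorted(ENTITY, key=lambda tail: transe_distance(head, relation, tail))


def cosine_similarity(a, b):
    """Cosine similarity between two entity vectors."""
    u, v = ENTITY[a], ENTITY[b]
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


best_tail = rank_tails("Paris", "capital_of")[0]
```

With these toy vectors, `France` is the closest tail for `("Paris", "capital_of", ?)`, and `Paris` is more similar to `Berlin` than to `France` under cosine similarity, which is the kind of similarity estimate a downstream model could consume as a feature.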