Single source of truth


In information science and information technology, single source of truth (SSOT) architecture, or single point of truth (SPOT) architecture, for information systems is the practice of structuring information models and associated data schemas such that every data element is mastered in only one place, providing data normalization to a canonical form.
There are several scenarios with respect to copies and updates:
  • The master data is never copied and instead only references to it are made; this means that all reads and updates go directly to the SSOT.
  • The master data is copied, but the copies are only read and only the master data is updated; if read requests are made only on copies, this is an instance of command query responsibility segregation (CQRS).
  • The master data is copied and the copies are updated; this needs a reconciliation mechanism when there are concurrent updates:
      • Updates on copies can be thrown out whenever a concurrent update is made on the master, so they are not considered fully committed until propagated to the master.
      • Concurrent updates are merged.
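The first reconciliation option above (discarding a copy's update when the master has changed in the meantime) can be sketched with a simple version counter. This is only an illustrative sketch: the names `Master`, `Copy`, `checkout`, and `propagate` are hypothetical and do not come from any particular product.

```python
from dataclasses import dataclass


@dataclass
class Master:
    """The single authoritative record (the SSOT)."""
    value: str
    version: int = 0

    def update(self, value: str) -> None:
        self.value = value
        self.version += 1


@dataclass
class Copy:
    """A local copy that remembers which master version it was taken from."""
    value: str
    base_version: int
    dirty: bool = False

    def update(self, value: str) -> None:
        # A local update is only tentative until it reaches the master.
        self.value = value
        self.dirty = True


def checkout(master: Master) -> Copy:
    """Take a fresh copy of the master record."""
    return Copy(value=master.value, base_version=master.version)


def propagate(copy: Copy, master: Master) -> bool:
    """Try to commit a copy's tentative update to the master.

    Returns True only if a tentative update was committed. A tentative
    update is thrown out if the master changed after the copy was taken.
    """
    committed = False
    if copy.dirty and copy.base_version == master.version:
        master.update(copy.value)
        committed = True
    # Either way, refresh the copy from the master afterwards.
    copy.value = master.value
    copy.base_version = master.version
    copy.dirty = False
    return committed


master = Master("12 Oak St")
a, b = checkout(master), checkout(master)
a.update("34 Elm St")
b.update("56 Pine Rd")
print(propagate(a, master))  # a's update reaches the master first
print(propagate(b, master))  # b's update is stale and is discarded
```

Merging concurrent updates, the second option, would replace the discard branch with a merge function; that choice depends heavily on the data and is not shown here.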
The advantages of SSOT architectures include easier prevention of mistaken inconsistencies and greatly simplified version control. Without an SSOT, dealing with inconsistencies implies either complex and error-prone consensus algorithms or a simpler architecture that is liable to lose data in the face of inconsistency.
Ideally, SSOT systems provide data that are authentic, relevant, and referable.
Deployment of an SSOT architecture is becoming increasingly important in enterprise settings where incorrectly linked duplicate or de-normalized data elements pose a risk for retrieval of outdated, and therefore incorrect, information. Common examples are as follows:
  • In electronic health records (EHRs), it is imperative to accurately validate patient identity against a single referential repository, which serves as the SSOT. Duplicate representations of data within the enterprise would be implemented with pointers rather than with duplicate database tables, rows, or cells. This ensures that updates to data elements in the authoritative location are distributed comprehensively to all federated database constituencies in the larger enterprise architecture. EHRs exemplify how SSOT architecture can be both necessary and difficult to achieve: it is difficult because inter-organization health information exchange raises substantial cybersecurity challenges, yet it is necessary to prevent medical errors, to avoid the wasted costs of inefficiency, and to make the primary care and medical home concepts feasible.
  • Single-source publishing as a general principle or ideal in content management relies on having SSOTs, via transclusion or substitution. Substitution happens via libraries of objects that can be propagated as static copies which are later refreshed when necessary. Component content management systems are a class of content management systems that aim to provide competence on this level.
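The difference between transclusion and substitution in single-source publishing can be illustrated with a small sketch. The `{{name}}` reference syntax, the `library` dictionary, and the `render` function are assumptions made for this example only; real content management systems use their own reference mechanisms.

```python
import re

# Hypothetical component library: each fragment is mastered exactly once.
library = {"warranty": "Covered for 12 months."}

# Assumed reference syntax: {{component_name}}
REF = re.compile(r"\{\{(\w+)\}\}")


def render(template: str, components: dict[str, str]) -> str:
    """Resolve every {{name}} reference against the component library."""
    return REF.sub(lambda m: components[m.group(1)], template)


manual = "Product manual. {{warranty}}"
leaflet = "Leaflet. {{warranty}}"

# Substitution: a static copy is produced once and stays as it is
# until somebody refreshes it.
static_manual = render(manual, library)

# The single master fragment is edited in one place only.
library["warranty"] = "Covered for 24 months."

# Transclusion: rendering on demand always reflects the current master.
print(render(manual, library))
print(render(leaflet, library))
print(static_manual)  # still the old text until refreshed
```

The static copy is cheaper to serve but can go stale, which is why substitution needs a refresh step, while transclusion trades some rendering cost for always reading the master.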

    Implementation

Ontologic interactions

An acknowledged prerequisite of SSOT architecture is the ontologic condition that no more than a single truth exists, an assertion that is ontologic in both the IT sense and the general sense of the word. In many instances, this presents no problem. The broadest contexts, however, require adequate comparison and reconciliation of epistemic regimes. An archetypal example of this class of reconciliation is that two theological seminary libraries, from two different religions, could exchange information with an SSOT architecture, but the unification of truth would reside on the level of the statement that "religion X asserts that God is purple whereas religion Y asserts that God is green", rather than on the level of "God is purple" or "God is green".

Architectures or architectural features

An ideal implementation of SSOT is rarely possible in most enterprises. This is because many organisations have multiple information systems, each of which needs access to data relating to the same entities. Often these systems are purchased as commercial off-the-shelf products from vendors and cannot easily be modified. Each of these systems therefore needs to store its own copy of the records for common data or entities. For example, an enterprise resource planning system may store a customer record; the customer relationship management system also needs a copy of the customer record, and the warehouse dispatch system might also need a copy of some or all of the customer data. In cases where vendors do not support such modifications, it is not always possible to replace these records with pointers to the SSOT.
For organisations wishing to implement a Single Source of Truth, some supporting architectures are:

Master data management (MDM)

A master data management (MDM) system typically serves as the source of truth for an organization's metadata, helping to ensure accuracy and consistency throughout that organization's multiple data sources. Typically the MDM system acts as a hub for multiple systems, many of which could allow updates to different aspects of the information on a given entity. For example, the customer relationship management (CRM) system may be the "source of truth" for most aspects of the customer, and is updated by a call centre operator. However, a customer may also update their address via a customer service web site that has a different back-end database from the CRM system. The MDM application receives updates from multiple sources, acts as a broker to determine which updates are to be regarded as authoritative, and then syndicates the updated data to all subscribing systems. The MDM application normally requires an enterprise service bus (ESB) to syndicate its data to multiple subscribing systems.
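The broker-and-syndicate role described above can be sketched roughly as follows. Everything here is hypothetical: the `MdmHub` class, the per-attribute `authority` rule, and the use of in-process callbacks as a stand-in for an ESB are simplifying assumptions, not a description of any real MDM product.

```python
from typing import Callable

Record = dict[str, str]
Subscriber = Callable[[str, Record], None]


class MdmHub:
    """Toy MDM hub: brokers updates and syndicates the accepted result."""

    def __init__(self, authority: dict[str, str]) -> None:
        # Assumed rule: each attribute has exactly one source system
        # that is allowed to change it.
        self.authority = authority
        self.golden: dict[str, Record] = {}
        self.subscribers: list[Subscriber] = []

    def subscribe(self, callback: Subscriber) -> None:
        """Register a subscribing system (stands in for an ESB topic)."""
        self.subscribers.append(callback)

    def submit(self, source: str, entity_id: str, changes: Record) -> Record:
        """Apply only the changes for which `source` is authoritative."""
        accepted = {
            attr: value
            for attr, value in changes.items()
            if self.authority.get(attr) == source
        }
        if accepted:
            record = self.golden.setdefault(entity_id, {})
            record.update(accepted)
            # Syndicate a snapshot of the updated record to every subscriber.
            for notify in self.subscribers:
                notify(entity_id, dict(record))
        return accepted


hub = MdmHub(authority={"name": "crm", "address": "web"})
warehouse_copy: dict[str, Record] = {}
hub.subscribe(lambda entity_id, rec: warehouse_copy.update({entity_id: rec}))

# The CRM is authoritative for the name but not for the address.
hub.submit("crm", "cust-1", {"name": "Ada", "address": "stale address"})
# The customer web site is authoritative for the address.
hub.submit("web", "cust-1", {"address": "1 New Rd"})
print(warehouse_copy["cust-1"])
```

A production hub would also need conflict rules richer than one owner per attribute, such as timestamps or survivorship policies; this sketch deliberately leaves those out.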

Event store and event sourcing (ES)

In event-oriented architectures, it has become increasingly common to find an implementation of the event sourcing pattern, which stores the system state as an ordered sequence of state changes. This requires an event store, a type of database designed to hold all the events that change the state of the system. The event store in an architecture that combines event sourcing, command query responsibility segregation (CQRS), domain-driven design, and messaging is in effect a "single source of truth". It has the additional advantage that it can also act as an enterprise service bus, since other components can listen directly to the event store for state changes as they pass through. In addition, because it saves all the events, it also plays the role of a data warehouse. A final advantage is that a further technique for obtaining a single source of truth, not otherwise covered here, can be implemented through such a system.
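As a minimal sketch of the idea that state is stored as an ordered sequence of state changes, the following example keeps an append-only log and rebuilds the current state by replaying it. The bank-account domain, the event kinds, and the names `Event`, `EventStore`, and `balance` are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Event:
    """An immutable state change."""
    kind: str    # "deposited" or "withdrawn" (hypothetical event kinds)
    amount: int


@dataclass
class EventStore:
    """Append-only log; the log itself is the single source of truth."""
    events: list[Event] = field(default_factory=list)

    def append(self, event: Event) -> None:
        self.events.append(event)

    def read(self, upto: Optional[int] = None) -> list[Event]:
        """Return the first `upto` events, or all of them if `upto` is None."""
        return self.events[:upto]


def balance(events: list[Event]) -> int:
    """Rebuild state by folding over the ordered events."""
    total = 0
    for event in events:
        if event.kind == "deposited":
            total += event.amount
        elif event.kind == "withdrawn":
            total -= event.amount
    return total


store = EventStore()
store.append(Event("deposited", 100))
store.append(Event("withdrawn", 30))
store.append(Event("deposited", 5))

print(balance(store.read()))        # current state
print(balance(store.read(upto=2)))  # state as of the second event
```

Because no event is ever overwritten, any earlier state can be recomputed from a prefix of the log, which is what lets the same store serve historical reporting.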

Data warehouse (DW)

While the primary purpose of a data warehouse is to support reporting and analysis of data that has been combined from multiple sources, the fact that such data has been combined means that the data warehouse is often used as a de facto SSOT. Generally, however, the data available from the data warehouse are not used to update other systems; rather, the DW becomes the "single source of truth" for reporting to multiple stakeholders. In this context, the data warehouse is more correctly referred to as a "single version of the truth", since other versions of the truth exist in its operational data sources.