DNA database


A DNA database or DNA databank is a database of DNA profiles which can be used in the analysis of genetic diseases, genetic fingerprinting for criminology, or genetic genealogy. DNA databases may be public or private, the largest ones being national DNA databases.
DNA databases are often employed in forensic investigations. When a search of a national DNA database links a crime scene to a person whose DNA profile is stored in the database, that link is often referred to as a cold hit. A cold hit is of particular value in linking a specific person to a crime scene, but is of less evidential value than a DNA match made without the use of a DNA database. Some research indicates that DNA databases of criminal offenders reduce crime rates.
Whether larger databases actually weaken match evidence has been debated. Stockmarr argued that searching a database increases the chance of coincidental matches, while Balding maintained that the evidence stays just as strong regardless of how the search was done. Recent analysis suggests both were partly right: traditional DNA profiling operates in a regime where false matches are so rare that database size barely matters, but threshold-based screening systems—which flag individuals matching some number of attributes rather than requiring an exact match—operate in a different regime where false alerts become statistically inevitable as populations grow.
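The contrast between the two regimes can be illustrated numerically. The following is a minimal sketch, assuming every comparison is independent and every unrelated person has the same false-match probability. The function names, the random match probability of 1e-15, and the screening parameters (6 of 10 attributes, each matching by chance with probability 0.2) are illustrative assumptions, not figures taken from the analyses mentioned above.

```python
import math


def prob_any_false_match(p: float, n: int) -> float:
    """Chance of at least one coincidental match among n unrelated people,
    assuming independent comparisons with per-person false-match probability p."""
    if p >= 1.0:
        return 1.0
    # Equivalent to 1 - (1 - p)**n, but computed with log1p/expm1 so that
    # very small values of p do not vanish in floating-point rounding.
    return -math.expm1(n * math.log1p(-p))


def threshold_match_prob(m: int, k: int, q: float) -> float:
    """Chance that an unrelated person matches at least k of m attributes,
    each attribute matching independently with probability q (binomial tail)."""
    return sum(math.comb(m, j) * q**j * (1 - q) ** (m - j) for j in range(k, m + 1))


# Hypothetical exact-match regime: a full profile rarely matches by chance.
exact_p = 1e-15
# Hypothetical threshold screening: flag anyone matching 6 or more of 10 attributes.
loose_p = threshold_match_prob(m=10, k=6, q=0.2)

for size in (10**4, 10**6, 10**8):
    print(size, prob_any_false_match(exact_p, size), prob_any_false_match(loose_p, size))
```

Under these assumed numbers, the exact-match column stays close to zero even for the largest population, while the threshold column is already close to one for the smallest.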

Types

Forensic

A forensic database is a centralized DNA database that stores DNA profiles of individuals and enables DNA samples collected from a crime scene to be searched and compared against the stored profiles. Its most important function is to produce matches between a suspected individual and biological markers from the crime scene, which provides evidence to support criminal investigations and leads that help identify potential suspects. The majority of national DNA databases are used for forensic purposes.
The Interpol DNA database is used in criminal investigations. Interpol maintains an automated DNA database called DNA Gateway that contains DNA profiles submitted by member countries, collected from crime scenes, missing persons, and unidentified bodies. The DNA Gateway was established in 2002 and, at the end of 2013, held more than 140,000 DNA profiles from 69 member countries. Unlike other DNA databases, DNA Gateway is used only for information sharing and comparison: it does not link a DNA profile to any individual, and it does not include information on an individual's physical or psychological condition.

Genealogical

National and forensic DNA databases are not available for non-police purposes. Because DNA profiles can also be used for genealogical purposes, separate genetic genealogy databases have been created to store the DNA profiles produced by genealogical DNA tests. GenBank is a public genetic genealogy database that stores genome sequences submitted by many genetic genealogists. GenBank contains a large number of DNA sequences gathered from more than 140,000 registered organizations, and it is updated every day to ensure a uniform and comprehensive collection of sequence information. The sequences are mainly obtained from individual laboratories or large-scale sequencing projects. The files stored in GenBank are divided into groups such as BCT, VRL, and PRI. GenBank can be accessed through NCBI's retrieval system, and the BLAST tool can then be used to find a given sequence within GenBank or to find similarities between two sequences.

Medical

A medical DNA database is a DNA database of medically relevant genetic variations. It collects individuals' DNA, which can be linked to their medical records and lifestyle details. By recording DNA profiles, scientists may discover how genetic and environmental factors interact in the occurrence of certain diseases, and thus find new drugs or effective treatments for controlling those diseases. Such databases are often run in collaboration with the National Health Service.

National

A national DNA database is a DNA database maintained by the government for storing DNA profiles of its population. Profiles are typically generated by PCR-based short tandem repeat (STR) analysis. National databases are generally used for forensic purposes, including searching and matching DNA profiles of potential criminal suspects.
In 2009, Interpol reported that there were 54 national police DNA databases in the world and that 26 more countries planned to start one. In Europe, Interpol reported 31 national DNA databases, with six more planned. In 2014, the DNA working group of the European Network of Forensic Science Institutes made 33 recommendations for DNA database management, along with guidelines for auditing DNA databases. Some countries, such as Qatar, have adopted privately developed DNA databases.
Typically, only a tiny subset of the individual's genome is sampled, taken from 13 or 16 regions that vary strongly from one individual to another.
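A profile of this kind can be pictured as a small table with one entry per sampled region. The following is a minimal sketch, assuming each region (locus) is recorded as the pair of repeat counts observed on the two chromosome copies. The locus names are real STR loci, but the allele values, the `Profile` type and the `compare` function are hypothetical and do not describe any particular database's storage format or matching rules.

```python
# A profile maps each STR locus to the two repeat counts observed there.
Profile = dict[str, tuple[float, float]]


def compare(a: Profile, b: Profile) -> tuple[int, int]:
    """Return (matching loci, loci compared) over the loci typed in both profiles."""
    shared = a.keys() & b.keys()
    # Allele order carries no meaning, so sort each pair before comparing.
    matching = sum(1 for locus in shared if sorted(a[locus]) == sorted(b[locus]))
    return matching, len(shared)


# Hypothetical allele values, for illustration only.
crime_scene: Profile = {"TH01": (6, 9.3), "FGA": (21, 24), "vWA": (16, 17)}
reference: Profile = {"TH01": (9.3, 6), "FGA": (21, 24), "vWA": (16, 17), "TPOX": (8, 8)}
other: Profile = {"TH01": (7, 9), "FGA": (21, 24), "vWA": (14, 17)}

print(compare(crime_scene, reference))
print(compare(crime_scene, other))
```

Only loci present in both profiles are compared, so a partial crime-scene profile can still be checked against a fuller reference profile.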

United Kingdom

The first national DNA database in the United Kingdom, the National DNA Database (NDNAD), was established in April 1995. By 2006, it contained 2.7 million DNA profiles, as well as other information from individuals and crime scenes; in 2020 it held 6.6 million profiles. The information is stored in the form of a digital code, which is based on the nomenclature of each STR. The database originally recorded 6 STR markers for each profile in 1995, 10 markers from 1999, and 16 core markers plus a gender identifier from 2014. Scotland has used 21 STR loci, two Y-DNA markers and a gender identifier since 2014. In the UK, police have wide-ranging powers to take DNA samples and retain them if the subject is convicted of a recordable offence. Because of the large number of DNA profiles stored in the NDNAD, "cold hits" may occur during DNA matching: unexpected matches between an individual's DNA profile and a DNA profile from an unsolved crime scene. A cold hit can introduce a new suspect into an investigation and so help to solve old cases.
In England and Wales, anyone arrested on suspicion of a recordable offence must submit a DNA sample, the profile of which is then stored on the DNA database. Those not charged or not found guilty have their DNA data deleted within a specified period of time. In Scotland, the law similarly requires the DNA profiles of most people who are acquitted be removed from the database.

New Zealand

New Zealand was the second country to set up a DNA database. In 2019, the New Zealand DNA Profile Databank held 40,000 DNA profiles and 200,000 samples.

United States

The United States national DNA database is called the Combined DNA Index System (CODIS). It is maintained at three levels: national, state and local. Each level implements its own DNA index system. The National DNA Index System (NDIS) allows DNA profiles to be exchanged and compared between participating laboratories nationally. Each State DNA Index System (SDIS) allows DNA profiles to be exchanged and compared between laboratories within a state, and the Local DNA Index System (LDIS) allows DNA profiles to be collected at local sites and uploaded to SDIS and NDIS.
CODIS software integrates and connects all the DNA index systems at the three levels. CODIS is installed at each participating laboratory site and uses a standalone network, the Criminal Justice Information Services Wide Area Network (CJIS WAN), to connect to other laboratories. To decrease the number of irrelevant matches at NDIS, the Convicted Offender Index requires all 13 CODIS STRs to be present for a profile upload. Forensic profiles require only 10 of the STRs to be present for an upload.
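The upload thresholds can be expressed as a simple completeness check. The following is a minimal sketch, assuming a profile is summarised by the set of loci that were successfully typed. The locus names are those of the original 13 CODIS core loci; the function name `eligible_for_upload` and the index labels are hypothetical, and the sketch does not model the actual CODIS software or any other acceptance criteria.

```python
# The original 13 CODIS core STR loci.
CODIS_CORE_LOCI = frozenset({
    "CSF1PO", "D3S1358", "D5S818", "D7S820", "D8S1179", "D13S317", "D16S539",
    "D18S51", "D21S11", "FGA", "TH01", "TPOX", "vWA",
})

# Minimum number of core loci that must be present, per index (from the text above).
MIN_CORE_LOCI = {"offender": 13, "forensic": 10}


def eligible_for_upload(typed_loci: set[str], index: str) -> bool:
    """Check whether enough core loci were typed for the given index.

    `index` is a hypothetical label: "offender" or "forensic".
    """
    core_present = CODIS_CORE_LOCI & typed_loci
    return len(core_present) >= MIN_CORE_LOCI[index]


# A degraded crime-scene sample in which only 10 core loci could be typed.
partial = set(sorted(CODIS_CORE_LOCI)[:10])

print(eligible_for_upload(set(CODIS_CORE_LOCI), "offender"))
print(eligible_for_upload(partial, "offender"))
print(eligible_for_upload(partial, "forensic"))
```

The lower forensic threshold reflects that crime-scene samples are often degraded or incomplete, whereas reference samples from offenders can normally be typed in full.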
As of March 2011, CODIS held 361,176 forensic profiles and 9,404,747 offender profiles, making it the largest DNA database in the world. As of the same date, CODIS had produced over 138,700 matches to requests, assisting in more than 133,400 investigations.
The growing public approval of DNA databases has seen the creation and expansion of many states' own DNA databases. Political measures such as California Proposition 69, which increased the scope of the DNA database, have already met with a significant increase in numbers of investigations aided. Forty-nine states in the USA, all apart from Idaho, store DNA profiles of violent offenders, and many also store profiles of suspects. A 2017 study showed that DNA databases in U.S. states "deter crime by profiled offenders, reduce crime rates, and are more cost-effective than traditional law enforcement tools".
CODIS is also used to help find missing persons and identify human remains. It is connected to the National Missing Persons DNA Database; samples provided by family members are sequenced by the University of North Texas Center for Human Identification, which also runs the National Missing and Unidentified Persons System. UNTCHI can sequence both nuclear and mitochondrial DNA.
The Department of Defense maintains a DNA database to identify the remains of service members. The Department of Defense Serum Repository maintains more than 50,000,000 records, primarily to assist in the identification of human remains. Submission of DNA samples is mandatory for US servicemen, but the database also includes information on military dependents. The National Defense Authorization Act of 2003 provided a means for federal courts or military judges to order the use of the DNA information collected to be made available for the purpose of investigation or prosecution of a felony, or any sexual offense, for which no other source of DNA information is reasonably available.

Australia

The Australian national DNA database is called the National Criminal Investigation DNA Database (NCIDD). By July 2018, it contained more than 837,000 DNA profiles. The database originally used nine STR loci and a sex gene for analysis; this was increased to 18 core markers in 2013. NCIDD combines all forensic data, including DNA profiles, advanced biometrics and cold cases.