Combined DNA Index System
The Combined DNA Index System (CODIS) is the United States national DNA database created and maintained by the Federal Bureau of Investigation. CODIS consists of three levels of information: Local DNA Index Systems, where DNA profiles originate; State DNA Index Systems, which allow laboratories within a state to share information; and the National DNA Index System, which enables states to compare DNA information with one another.
The CODIS software contains multiple databases, organised by the type of information being searched against. Examples of these databases include missing persons, convicted offenders, and forensic samples collected from crime scenes. Each state, as well as the federal system, has its own laws for the collection, upload, and analysis of information contained within its database. However, for privacy reasons, the CODIS database does not contain any personal identifying information, such as the name associated with the DNA profile. The uploading agency is notified of any hits to its samples and is responsible for disseminating personal information in accordance with its laws.
Establishment
The creation of a national DNA database within the U.S. was first mentioned by the Technical Working Group on DNA Analysis Methods in 1989. The FBI's strategic goal was to maximize the voluntary participation of states and avoid what happened several years earlier, when eight western states, frustrated with the process of creating the national Integrated Automated Fingerprint Identification System network, formed their own Western Identification Network. The FBI's strategy to discourage states from creating systems that competed with CODIS was to develop DNA databasing software and provide it free of charge to state and local crime laboratories. This strategic decision, to provide software free of charge for the purpose of gaining market share, was innovative at that time and predated the browser wars. In 1990, the FBI began a pilot DNA databasing program with 14 state and local laboratories.
In 1994, Congress passed the DNA Identification Act, which authorised the FBI to create a national DNA database of convicted offenders as well as separate databases for missing persons and forensic samples collected from crime scenes. The DNA Identification Act also required that laboratories participating in the CODIS program maintain accreditation from an independent nonprofit organisation that is actively involved in the forensic fields, and that scientists processing DNA samples for submission into CODIS maintain proficiency and be routinely tested to ensure the quality of the profiles being uploaded into the database. The national level of CODIS was implemented in October 1998. Today, all 50 states, the District of Columbia, federal law enforcement, the Army Laboratory, and Puerto Rico participate in the national sharing of DNA profiles.
Database structure
The CODIS database contains several different indexes for the storage of DNA profile information. For assistance in criminal investigations, three indexes exist: the offender index, which contains DNA profiles of those convicted of crimes; the arrestee index, which contains profiles of those arrested for crimes pursuant to the laws of the particular state; and the forensic index, which contains profiles collected from a crime scene. Additional indexes, such as the unidentified human remains index, the missing persons index, and the biological relatives of missing persons index, are used to assist in identifying missing persons. Speciality indexes also exist for other specimens that do not fall into the other categories. These indexes include the staff index, for profiles of employees who work with the samples, and the multi-allelic offender index, for single-source samples that have three or more alleles at two or more loci.
Non-criminal indexes
While CODIS is generally used for linking crimes to other crimes and potentially to suspects, there are non-criminal portions of the database, such as the missing person indexes. The missing persons portion of the database is maintained by the FBI at the NDIS level of CODIS, allowing all states to share information. Created in 2000 using the existing CODIS infrastructure, this section of the database is designed to help identify human remains by collecting and storing DNA information on missing individuals or their relatives. Unidentified remains are processed for DNA by the University of North Texas Center for Human Identification, which is funded by the National Institute of Justice. Nuclear, Y-STR, and mitochondrial analysis can be performed on both unknown remains and known relatives to maximize the chance of identifying remains.
Statistics
NDIS contained more than 14 million offender profiles, more than four million arrestee profiles, and more than one million forensic profiles. The effectiveness of CODIS is measured by the number of investigations aided through database hits. CODIS had aided in over 520 thousand investigations and produced more than 530 thousand hits. Each state has its own SDIS database and can set its own inclusionary standards, which can be less strict than those at the national level. For this reason, several profiles that are present in state-level databases are not in the national database and are not routinely searched across state lines.
Scientific basis
The bulk of identifications using CODIS rely on short tandem repeats (STRs) that are scattered throughout the human genome and on statistics that are used to calculate the rarity of a specific profile in the population. STRs are a type of copy-number variation and comprise a sequence of nucleotide base pairs that is repeated over and over again. At each location tested during DNA analysis, also known as a locus, a person has two sets of repeats, one from the father and one from the mother. Each set is measured, and the number of repeat copies is recorded. If both sets, inherited from the parents, contain the same number of repeats at that locus, the person is said to be homozygous at that locus. If the repeat numbers differ, the person is said to be heterozygous. Each distinct variant that can occur at a locus is an allele. This repeat determination is performed across a number of loci, and the repeat values make up the DNA profile that is uploaded to CODIS. As of January 1, 2017, upload of known offender profiles to the national level requires 20 loci.
Alternatively, CODIS allows for the upload of mitochondrial DNA (mtDNA) information into the missing persons indexes. Since mtDNA is passed down from mother to offspring, it can be used to link remains to still-living relatives who have the same mtDNA.
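The homozygous/heterozygous distinction described above can be illustrated with a minimal sketch. It assumes a profile is stored as a mapping from locus name to the two recorded repeat counts; the `Profile` type, the function names, and the repeat values are hypothetical and chosen only for illustration. They do not reflect the actual CODIS data format.

```python
from typing import Dict, Tuple

# Hypothetical representation: locus name -> the two recorded repeat counts
# (one set inherited from each parent). Not the real CODIS record layout.
Profile = Dict[str, Tuple[int, int]]


def zygosity(alleles: Tuple[int, int]) -> str:
    """Classify one locus as homozygous or heterozygous."""
    first, second = alleles
    return "homozygous" if first == second else "heterozygous"


def summarize(profile: Profile) -> Dict[str, str]:
    """Return the zygosity of every locus in the profile."""
    return {locus: zygosity(alleles) for locus, alleles in profile.items()}


# Illustrative values only; these repeat counts are invented for the example.
example_profile: Profile = {
    "D3S1358": (15, 17),  # repeat counts differ
    "TPOX": (8, 8),       # same repeat count from both parents
}

if __name__ == "__main__":
    for locus, kind in summarize(example_profile).items():
        print(locus, kind)
```

A real profile would carry one such pair for each required locus; the sketch covers only the per-locus comparison, not the population statistics used to estimate how rare a profile is.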
Loci
Prior to January 1, 2017, the national level of CODIS required that known offender profiles have a set of 13 loci called the "CODIS core". Since then, the requirement has expanded to include seven additional loci. Partial profiles are also allowed in CODIS in separate indexes and are common in crime scene samples that are degraded or are mixtures of multiple individuals. Upload of these profiles to the national level of CODIS requires at least eight of the core loci to be present, as well as a profile rarity of 1 in 10 million.
Loci that fall within a gene are named after the gene. For example, TPOX is named after the human thyroid peroxidase gene. Loci that do not fall within genes are given a standard naming scheme for uniformity. These loci are named D + the chromosome the locus is on + S + the order in which the location on that chromosome was described. For example, D3S1358 is on the third chromosome and is the 1358th location described. The CODIS core is listed below; loci with asterisks are the new core and were added to the list in January 2017.
The loci used in CODIS were chosen because they are in regions of noncoding DNA, sections that do not code for proteins. These sections should not reveal any additional information about the person to investigators, such as their hair or eye colour, or their race. However, advances in the understanding of genetic markers and ancestry have indicated that the CODIS loci may contain phenotypic information.
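The naming scheme for loci that do not fall within genes (D + chromosome + S + order of description) lends itself to a small parsing sketch. The function name `parse_locus_name`, the `LocusName` type, and the regular expression are assumptions made for illustration. In particular, accepting `X` and `Y` as chromosome labels is an assumption that goes beyond the example given in the text.

```python
import re
from typing import NamedTuple, Optional


class LocusName(NamedTuple):
    chromosome: str  # chromosome label, e.g. "3"
    order: int       # order in which the location was described


# D + chromosome + S + order. Allowing X and Y here is an assumption.
_LOCUS_PATTERN = re.compile(r"D(\d{1,2}|X|Y)S(\d+)")


def parse_locus_name(name: str) -> Optional[LocusName]:
    """Split a D...S... locus name into its parts.

    Returns None for names that do not follow the scheme, such as
    gene-derived names like TPOX.
    """
    match = _LOCUS_PATTERN.fullmatch(name)
    if match is None:
        return None
    return LocusName(chromosome=match.group(1), order=int(match.group(2)))


if __name__ == "__main__":
    print(parse_locus_name("D3S1358"))  # chromosome 3, 1358th location
    print(parse_locus_name("TPOX"))     # gene-derived name, not parsed
```

Gene-derived names are returned as `None` rather than raising an error, so a caller can handle both kinds of locus name in a single pass.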