Massively parallel sequencing
Massively parallel sequencing is any of several high-throughput approaches to DNA sequencing using the concept of massively parallel processing; it is also called next-generation sequencing or second-generation sequencing. Some of these technologies emerged between 1993 and 1998 and have been commercially available since 2005. These technologies use miniaturized and parallelized platforms for sequencing of 1 million to 43 billion short reads per instrument run.
Many NGS platforms differ in engineering configurations and sequencing chemistry. They share the technical paradigm of massively parallel sequencing via spatially separated, clonally amplified DNA templates or single DNA molecules in a flow cell. This design is very different from that of Sanger sequencing (also known as capillary sequencing or first-generation sequencing), which is based on electrophoretic separation of chain-termination products produced in individual sequencing reactions. The massively parallel design allows sequencing to be completed on a much larger scale.
History
In the 1990s, Applied Biosystems dominated DNA sequencing technology with its automated capillary electrophoresis Sanger sequencing machines. However, the early 2000s saw many new companies entering the market, driven by the goal of reducing genome sequencing costs below $1000 following the enthusiasm generated by the Human Genome Project. Many of these new methods were first developed with funding from the National Institutes of Health under the 'Technology Development for the $1,000 Genome' program, launched during Francis Collins’ tenure as director of the National Human Genome Research Institute.
The first next-generation sequencers were based on pyrosequencing, originally developed by Pyrosequencing AB and later commercialized by 454 Life Sciences. In 2003, 454 Life Sciences launched the GS20, the first NGS DNA sequencer. This system provided reads approximately 400–500 bp long with 99% accuracy, enabling sequencing of about 25 million bases in a four-hour run at significantly lower cost than Sanger sequencing. The sequencing machines developed by 454 represented a paradigm shift by enabling the mass parallelisation of sequencing reactions, which significantly boosted the amount of DNA sequenced per run and made 454 Life Sciences the first major success in commercial NGS technology.
Also in 2003, Solexa began developing a competing method known as Sequencing by Synthesis. In 2004, Solexa acquired colony sequencing technology from Manteia, producing densely clustered DNA fragments immobilized on flow cells. These dense clusters generated stronger fluorescent signals, improving accuracy and reducing optical costs. In 2005, Solexa integrated an engineered DNA polymerase and reversible terminator nucleotides, allowing repeated cycles of sequencing and imaging. The first commercial sequencer based on this technology, Genome Analyzer, was launched in 2006, providing shorter reads but higher throughput and paired-end sequencing capability.
In 2007, Roche acquired 454 Life Sciences and Illumina acquired Solexa. In the same year, Applied Biosystems introduced SOLiD, a ligation-based sequencing platform. However, SOLiD encountered issues sequencing palindromic regions and was eventually discontinued. In 2011, Ion Torrent introduced another alternative, which used semiconductor-based sensors to detect the protons released during nucleotide incorporation. Ion Torrent systems rapidly produced 100 bp reads but frequently struggled to sequence homopolymers accurately, ultimately leading to their abandonment.
Due to limitations in competing methods, Illumina’s SBS technology eventually dominated the sequencing market. By 2012, expectations that 454 would gain a substantial share of the sequencing market had not been realized, and Roche’s 2007 acquisition was increasingly viewed as underperforming; that same year, Roche made an unsuccessful attempt to acquire Illumina. In October 2013, Roche announced that it would shut down 454, and stop supporting the platform by mid-2016. By 2014, Illumina controlled approximately 70% of DNA sequencer sales and generated over 90% of sequencing data. That year, Illumina introduced the HiSeq X Ten platform, significantly increasing throughput and claiming the long-targeted goal of sequencing human genomes at roughly $1000 each. Illumina surpassed this milestone in 2017 with the release of NovaSeq, a system capable of generating over 3000 Gbp per run.
NGS platforms
DNA sequencing with commercially available NGS platforms is generally conducted with the following steps. First, DNA sequencing libraries are generated by clonal amplification by PCR in vitro. Second, the DNA is sequenced by synthesis, such that the DNA sequence is determined by the addition of nucleotides to the complementary strand rather than through chain-termination chemistry. Third, the spatially segregated, amplified DNA templates are sequenced simultaneously in a massively parallel fashion without the requirement for a physical separation step. Most NGS platforms follow these steps, but each implements them with a different strategy.
NGS parallelization of the sequencing reactions generates hundreds of megabases to gigabases of nucleotide sequence reads in a single instrument run. This has enabled a drastic increase in available sequence data and fundamentally changed genome sequencing approaches in the biomedical sciences.
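The second and third steps can be illustrated with a toy model. The following is a minimal sketch, not a description of any vendor's chemistry: it assumes that each cluster incorporates exactly one complementary nucleotide per cycle (as with an idealized reversible terminator), that base calls are error-free, and it ignores strand directionality. The function name `sequence_by_synthesis` and the example templates are hypothetical.

```python
# Toy model of sequencing by synthesis (illustrative assumptions only):
# one complementary base is incorporated per cluster per cycle, with no errors.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_synthesis(templates, cycles):
    """Read all templates in parallel, one base per cycle.

    Each element of `templates` stands for one spatially separated cluster.
    In every cycle, each cluster adds the nucleotide complementary to its
    next template base, and that base is recorded as the cycle's "image".
    """
    reads = ["" for _ in templates]
    for cycle in range(cycles):
        # All clusters are extended and imaged in the same cycle, so no
        # physical separation step is needed between templates.
        for i, template in enumerate(templates):
            if cycle < len(template):
                reads[i] += COMPLEMENT[template[cycle]]
    return reads


templates = ["ACGTAC", "TTGACA", "GGCATC"]
reads = sequence_by_synthesis(templates, cycles=4)
print(reads)  # ['TGCA', 'AACT', 'CCGT']
```

In this sketch the number of cycles sets the read length, while the number of clusters sets how many reads are produced per run, which is why throughput scales with cluster density rather than with run time alone.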
Newly emerging NGS technologies and instruments have further reduced the cost of sequencing, which is approaching $1000 per genome.
The features of the massively parallel sequencing platforms that were commercially available as of 2014 are summarized in the table. Because NGS technologies are advancing rapidly, technical specifications and pricing are in flux.
| Platform | Template preparation | Chemistry | Max read length (bp) | Run time (days) | Max Gb per run |
| Roche 454 | Clonal-emPCR | Pyrosequencing | 400‡ | 0.42 | 0.40-0.60 |
| GS FLX Titanium | Clonal-emPCR | Pyrosequencing | 400‡ | 0.42 | 0.035 |
| Illumina MiSeq | Clonal Bridge Amplification | Reversible Dye Terminator | 2x300 | 0.17-2.7 | 15 |
| Illumina HiSeq | Clonal Bridge Amplification | Reversible Dye Terminator | 2x150 | 0.3-11 | 1000 |
| Illumina Genome Analyzer IIX | Clonal Bridge Amplification | Reversible Dye Terminator | 2x150 | 2-14 | 95 |
| Life Technologies SOLiD4 | Clonal-emPCR | Oligonucleotide 8-mer Chained Ligation | 20-45 | 4-7 | 35-50 |
| Life Technologies Ion Proton | Clonal-emPCR | Native dNTPs, proton detection | 200 | 0.5 | 100 |
| Complete Genomics | Gridded DNA-nanoballs | Oligonucleotide 9-mer Unchained Ligation | 7x10 | 11 | 3000 |
| Helicos Biosciences Heliscope | Single Molecule | Reversible Dye Terminator | 35‡ | 8 | 25 |
| Pacific Biosciences SMRT | Single Molecule | Phospholinked Fluorescent Nucleotides | 10,000 ; 30,000+ | 0.08 | 0.5 |
Run times and gigabase output per run for single-end sequencing are noted. Run times and outputs approximately double when performing paired-end sequencing.
‡Average read lengths for the Roche 454 and Helicos Biosciences platforms.
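As a rough worked example, the table's figures can be turned into an approximate daily throughput by dividing the maximum output by the run time. The sketch below assumes that run times are given in days and, where the table lists a range, pairs the longest run time with the maximum output; both are assumptions rather than statements from the table. The helper `gb_per_day` and the choice of four platforms are illustrative only.

```python
# Approximate single-end throughput (Gb/day) from the table above.
# Assumptions: run times are in days; for ranges, the upper bound is used
# and paired with the maximum output per run.
PLATFORMS = [
    # (platform, run time in days, max Gb per run)
    ("Illumina MiSeq", 2.7, 15),
    ("Illumina HiSeq", 11, 1000),
    ("Life Technologies Ion Proton", 0.5, 100),
    ("Pacific Biosciences SMRT", 0.08, 0.5),
]


def gb_per_day(run_days, gb_per_run):
    """Return output per day for one instrument running back to back."""
    return gb_per_run / run_days


rates = {name: gb_per_day(days, gb) for name, days, gb in PLATFORMS}

# List platforms from highest to lowest estimated daily output.
for name, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {rate:.1f} Gb/day")
```

Under these assumptions, a short run with moderate output can yield more data per day than a long run with a large output, so the per-run figures in the table are best compared alongside run time.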
Template preparation methods for NGS
Two methods are used in preparing templates for NGS reactions: amplified templates originating from single DNA molecules, and single DNA molecule templates. For imaging systems that cannot detect single fluorescence events, amplification of DNA templates is required. The three most common amplification methods are emulsion PCR, rolling circle amplification and solid-phase amplification. The final distribution of templates can be spatially random or on a grid.