Top-down proteomics
Top-down proteomics (TDP) is a method of protein identification that identifies and quantifies unique proteoforms through the analysis of intact proteins. The name derives from the analogous approach to DNA sequencing. During mass spectrometry, intact proteoforms are typically ionized by electrospray ionization and analysed using a variety of mass analysers, including Orbitrap, ion cyclotron resonance, and time-of-flight instruments. Effective fractionation is critical for sample handling before mass-spectrometry-based proteomics. Typical proteome analysis routinely involves digesting intact proteins and then inferring protein identities from mass spectrometry of the resulting peptides. Top-down proteomics using mass spectrometry instead interrogates protein structure by measuring a proteoform's intact mass and then dissociating the intact ions directly in the gas phase. Top-down proteoform analysis can also be achieved by resolving a proteoform from all other proteoforms and then applying peptide-centric LC-MS/MS to characterise the isolated proteoform.
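The relationship between a proteoform's intact mass and the series of multiply charged ions that electrospray ionization produces can be sketched in a few lines. This is a minimal illustration, assuming positive-mode ions of the form [M + zH]^z+ and ignoring adducts and isotopic distributions. The function names and the example mass are hypothetical and are not taken from any particular software package.

```python
PROTON_MASS = 1.007276466  # Da, mass of a proton


def mz_for_charge(neutral_mass, z):
    """Return the m/z of the [M + zH]^z+ ion of a species with the given neutral mass."""
    return (neutral_mass + z * PROTON_MASS) / z


def neutral_mass_from_adjacent(mz_low, mz_high):
    """Infer charge and neutral mass from two adjacent charge-state peaks.

    mz_high is the peak carrying charge z; mz_low is its neighbour carrying
    charge z + 1 (more charges, hence lower m/z).
    Returns (z, neutral_mass), where z belongs to mz_high.
    """
    # (mz_low - p) = M / (z + 1) and (mz_high - mz_low) = M / (z * (z + 1)),
    # so their ratio is z.
    z = round((mz_low - PROTON_MASS) / (mz_high - mz_low))
    neutral_mass = z * (mz_high - PROTON_MASS)
    return z, neutral_mass


if __name__ == "__main__":
    example_mass = 16951.0  # Da, hypothetical intact proteoform mass
    for charge in (10, 11, 12):
        print(charge, round(mz_for_charge(example_mass, charge), 4))
```

Real deconvolution software fits the whole charge-state envelope and the isotopic pattern rather than a single pair of peaks. The two-peak version above is only meant to show why a single intact mass appears as many peaks in the spectrum.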
A single gene can code for many protein products, and the resulting canonical amino acid sequences can be further modified by any number of post-translational modifications (PTMs) or non-physiological adducts. These varied protein species, or proteoforms, define proteomes and are the functional entities underlying biological processes. Thus, truly comprehensive or 'deep' proteome analyses must assess proteoforms.
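How quickly modifications multiply the number of possible proteoforms can be shown with a small combinatorial sketch. It assumes that each modifiable site varies independently, so the count is an upper bound rather than the number of species actually present in a sample. The site names and modification states are hypothetical.

```python
from itertools import product
from math import prod


def count_proteoforms(site_states):
    """Upper bound on proteoforms of one sequence: product of states per site."""
    return prod(len(states) for states in site_states.values())


def enumerate_proteoforms(site_states):
    """Yield every combination of site states as a {site: state} dictionary."""
    sites = list(site_states)
    for combo in product(*(site_states[site] for site in sites)):
        yield dict(zip(sites, combo))


if __name__ == "__main__":
    # Hypothetical protein with two modifiable residues.
    example_sites = {
        "S10": ["unmodified", "phospho"],
        "K14": ["unmodified", "acetyl", "methyl"],
    }
    print(count_proteoforms(example_sites))
    for proteoform in enumerate_proteoforms(example_sites):
        print(proteoform)
```

With n independent binary sites the bound is 2^n, so even a modest number of sites on a single sequence variant gives a large potential proteoform space.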
There are two general approaches to proteome analysis: bottom-up and top-down. The former, bottom-up proteomics (BUP), is a peptide-centric or proteogenomic approach that infers the identities of canonical protein sequences by correlation with existing databases, mostly derived from genome sequencing projects. In contrast, TDP can, in theory, yield comprehensive proteome analyses at the level of proteoforms, provided the methods used effectively address the full breadth of species in a proteome.
Adopted from analytical chemistry, the term 'top-down' in proteomics refers to the separation of intact proteoforms and their subsequent identification, and is agnostic as to how that is achieved. Currently, two analytical approaches enable proteome assessments to different extents: integrated (or integrative) TDP (iTDP) and mass spectrometry-intensive TDP (MSi-TDP). Although some interpretations in the literature misleadingly imply that the latter defines TDP, this is a misconception when considering what is genuinely required for fully effective, comprehensive proteome analyses.
Integrated Top-Down Proteomics (iTDP)
Further developed, refined, and optimized since the original report of a routine multi-dimensional separation of protein species, and subsequently coupled with western blotting and MS, this approach was the first to identify the range of protein species/proteoforms in a variety of samples. Currently, the iTDP analytical approach offers the highest proteoform resolution and a routine approach to full proteome analysis. In the case of two-dimensional polyacrylamide gel electrophoresis (2D-PAGE, or 2DE), spots and/or regions of interest can be excised from the gel, proteolytically digested using well-established methods, and the resulting peptides then assessed using LC/MS/MS to identify canonical amino acid sequences and their inherent PTMs. Integration of this sequence information with the isoelectric point and molecular weight information from 2DE thus enables definitive identification of proteoforms based on several key defining physico-chemical characteristics. In addition to highly sensitive and quantitative total proteoform detection using fluorescent stains, notably Coomassie Brilliant Blue used as a near-IR dye, gel staining protocols also enable the identification of broad proteoform groups containing the same PTM. Thus, iTDP integrates the best available approaches to enable truly comprehensive, deep proteome analyses at the critically necessary level of proteoforms. Accordingly, new approaches can also be integrated once they have been critically evaluated and fully vetted to ensure comprehensive, quantitative analysis.
Advantages
- The main advantage of the iTDP approach is the routine ability to detect the full potential range of proteoforms in native proteomes. This results from capitalizing on integration of the best available analytical approaches and continuous integration of modifications to the approach as new refinements and optimizations are established.
- iTDP can be performed through sequentially combining any number of fractionation techniques available to the researcher, such as chromatography, density-gradient ultrafiltration, or electrophoresis, to name a few.
- 2DE enables parallel resolution of replicate samples, rather than the serial approach of BUP and MSi-TDP, which can result in significant variation between LC-MS runs. It also allows resolved samples from several gels to be combined if necessary to ensure high quality MS/MS identifications, even of very low abundance species.
- Focusing on one small, selected portion of a gel-resolved proteome at a time enables full implementation of the power of MS/MS, yielding better data than the en masse, whole proteome digest BUP approach. Because fewer proteoforms, and thus fewer peptides, are introduced into LC/MS/MS, individual peptides can be analysed at higher concentrations, increasing the quality of their MS/MS spectra and the likelihood of correctly localising PTMs.
- Both the first and second dimensions of 2DE are adaptable and easily modified to enhance proteome coverage as necessary. This flexibility and adaptability further complement the additional analytical capacity enabled by excision and third electrophoretic separations of primary gel regions, as well as the subsequent deep imaging of the primary gel to expand the dynamic range of detection to even very low abundance proteoforms.
- Generally straightforward data analysis.
- High quality iTDP analyses are fully enabled by established mid-range LC/MS systems; while advanced and/or specialized systems continue to drive throughput and/or sequence coverage, these are not essential to enabling iTDP analyses.
- Western blotting after 2DE can also be used to capitalize on the availability of high-quality antibodies. Indeed, this was one of the first approaches to identify multiple variants of a given protein in the same sample. Criteria to ensure the highest quality western blots are well-established if not always widely followed.
- The primary focus of the iTDP approach is the comprehensiveness of analyses and thus data quality, rather than high throughput. "It is not the rate or volume of data generated but rather the quality that ultimately matters".
Disadvantages
- The primary focus of the iTDP approach is the comprehensiveness of analyses and thus data quality, rather than high throughput. Many claim this as a drawback of the approach. With the widespread adoption of BUP since the turn of the century, a much-touted goal of proteomics has been to achieve high-throughput analyses of amino acid sequences, comparable to the throughput of genomic analyses. Critically, this seems unlikely considering the vast potential speciation of protein products and thus the complexity of native proteomes. A truly disruptive technology would be required to genuinely enable quantitatively comprehensive, high-throughput proteome analyses.
- 2DE has been described as time-consuming or labour-intensive. Again, the issue is clearly one of analytical quality over speed. While it is true that iTDP (notably when performed with full, parallel technical replicates) can take longer than a single BUP or MSi-TDP run, there is not a substantial difference in throughput once the inherent technical aspects of those approaches are factored in. Furthermore, recent refinements have further optimized sample handling and increased 2DE throughput.
- It is difficult to ensure full, quantitative recovery of intact proteoforms from polyacrylamide gels, and recovery varies with the size of the species and the PTMs present. Whether attempted via passive diffusion from 'mashed' gel pieces or the use of 'dissolvable' formulations, full quantitative recovery has never been demonstrated, and there is concern that the necessary treatments can modify the resolved native proteoforms. Thus, while recovery of fully intact proteoforms from the gel would be optimal to ensure full sequence coverage, in-gel digestion is an effective option for subsequent LC/MS/MS analyses.
Mass spectrometry-intensive TDP (MSi-TDP)
Advantages
- As with iTDP, the main advantage of MSi-TDP is the capacity, within limits, to fully assess given proteoforms, including isotopic variants.
- MSi-TDP can complement BUP approaches. Characterization of small proteins can be a significant challenge in BUP if too few tryptic peptides are generated for analysis. MSi-TDP enables detection of low-mass proteins, thus providing more detailed coverage of proteoforms in the lower MW range.
- Sequentially combining any number of fractionation techniques available to the researcher, such as chromatography, density-gradient ultrafiltration, or electrophoresis, dramatically increases the depth and quality of proteoform and proteome analysis.
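
The point made in the list above about small proteins yielding too few tryptic peptides for BUP can be illustrated with a simple in silico digest. This sketch assumes the commonly used trypsin rule (cleavage C-terminal to lysine or arginine, except when the next residue is proline) with no missed cleavages. The length window used to decide which peptides are "usable" is an illustrative assumption, and the example sequence is hypothetical.

```python
def tryptic_peptides(sequence):
    """Digest a protein sequence after K or R, except when followed by P."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # C-terminal peptide that does not end in K or R.
        peptides.append(sequence[start:])
    return peptides


def usable_peptides(peptides, min_len=7, max_len=30):
    """Keep peptides inside an assumed length window for LC/MS/MS analysis."""
    return [p for p in peptides if min_len <= len(p) <= max_len]


if __name__ == "__main__":
    small_protein = "MKRPAGKAAEDLLSGTWKVR"  # hypothetical 20-residue sequence
    peptides = tryptic_peptides(small_protein)
    print(peptides)
    print(usable_peptides(peptides))
```

For a short sequence such as this one, only a single peptide falls inside the assumed window, which leaves little evidence for peptide-based identification. Analysing the intact species avoids that limitation.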