Optical pooled screening


Optical pooled screening is a type of high-content single-cell genetic screen that profiles the phenotypes of individual cells by optical microscopy. The phenotypic profile of each cell is linked to one or several genetic features by in situ genotyping. OPS is used to determine the effect of genetic elements on the characteristics of cells and tissues. Single-cell screening methods like OPS have been adopted by the biotechnology industry for applications in drug development.
High-content pooled single-cell genetic screens became available as a functional genomics technique starting circa 2016. While the genetic intervention can be of any type that can be associated with a genetic sequence in the cell, including modifications in protein-coding or regulatory sequences, CRISPR systems are the most common method for effecting genetic perturbations in OPS. The high-content nature of OPS data enables screens for cellular phenotypes not considered prior to data generation, as well as in-depth analysis of the primary screening data to classify and prioritize screening hits. As an intrinsically single-cell-resolved approach, OPS can identify perturbation effects on the distribution of single-cell phenotypes across cells.
Researchers use OPS to visually assess how gene disruptions and other genetic perturbations change cellular characteristics such as morphology (for example, as profiled by Cell Painting), protein localization, or intracellular signaling downstream of cellular receptors. OPS requires in situ genotyping, for example by in situ sequencing of the perturbation itself or of a nucleotide sequence "barcode" in each cell, which links image-based cell phenotypes to specific genetic alterations at the single-cell level. OPS is used in functional genomics, drug discovery, and disease research.

Context

OPS is one of two approaches available to generate high-content single-cell screening data. High-content single-cell functional genomic screens differ from previously established pooled genetic screening approaches, which rely on enrichment of perturbation identifier frequency in selected versus non-selected or original cell populations. In contrast, high-content single-cell screens like OPS match cell phenotypes and perturbation identifiers at the single-cell level, enabling post-hoc characterization and possible classification of phenotypes based on the primary screening data output. Perturbed cell phenotypes are interpreted based on the nature of the perturbations enriched in a phenotypic class, or a quantitative trait can be directly mapped to a genetic alteration in a regulatory or coding sequence.
In contrast to NGS approaches for high-content single-cell screening, OPS directly reads out cellular structures and, in live-cell settings, dynamic molecular and cellular functions, and it can achieve high resolution of cell states. As an imaging method, OPS is applicable where spatial relationships are relevant, for example, the subcellular distribution or localization of organelles or molecular components, and spatial relationships among cells. Imaging assays can also score cell non-autonomous phenotypes such as cell-cell interaction phenotypes, tissue context-dependent phenotypes, and the effect genes have outside the cell. As a live-cell imaging method, OPS enables studies of cellular dynamics using advanced imaging modalities, such as single-molecule fluorescence microscopy.
The capability of OPS to connect the phenotype of each cell in the pooled library to its genotype distinguishes OPS from image-based pooled enrichment screens such as robotic picking, Visual Cell Sorting, CRISPR-based microRaft followed by guide RNA identification, single-cell isolation following time-lapse imaging, AI-photoswitchable screening, optical enrichment, image-enabled cell sorting, and Photopick. These methods all work by segregating cell populations according to pre-specified single-cell image characteristics and then reading out perturbation identifier abundance in bulk from the segregated populations.

History

OPS was developed concurrently with single-cell screening methods based on NGS, i.e. Perturb-seq, CRISP-seq, and CROP-seq. The first dissemination of the OPS methodology also occurred in 2016, but the first scientific publications did not appear until the year after. One report of an OPS described a small CRISPR interference screen that perturbed different components regulating a fluorescent reporter protein. In this study, the live-cell phenotyping step was followed by FISH-based readout of barcodes expressed by T7 RNA polymerase from the same plasmid as the CRISPR single guide RNA. Another early report described an OPS with a bacterial library of mutated fluorescent proteins also followed by FISH-based readout of barcodes. Applications in human cells with CRISPR perturbations were subsequently reported with readout of thousands of sgRNA CRISPR perturbations by in situ sequencing of sgRNA and barcode sequences amplified from mRNA using a molecular inversion probe and rolling circle amplification and sequencing by synthesis chemistry; and in another example, readout of >100 sgRNA perturbations by FISH. Protein epitopes have also been applied to encode genomic perturbations for enrichment and in vivo OPS with readout from tissue sections.
A genome-wide scale loss-of-function CRISPR OPS in human cells was reported in 2023 and included high-content phenotypes recorded from >10 million cells assigned to one of 80,408 sgRNA perturbations. Other genome-wide OPS datasets were reported for infection of human cells by filoviruses, cell signaling, and morphological characterization under different culture conditions. New protocols for nucleotide-level barcode readout incorporate "Zombie" in situ T7 RNA polymerase-driven in vitro transcription for amplification or pre-amplification of OPS readout. A recent application of OPS is genome-wide tracking of chromosome loci over the cell cycle.

Methodology

Creation and use of genetic libraries

OPS requires genetically perturbed cell populations similar to those used for Perturb-seq, CRISP-seq, CROP-seq, and enrichment screens. In mammalian systems, viral transduction is commonly used to introduce elements of the genetic perturbation system, such as sgRNAs, into cells. A general challenge in perturbation engineering is maintaining linkage between sgRNA and barcode elements, or among multiple sgRNAs or barcodes. Specific protocols and construct designs able to maintain the intended linkage have been developed. Errors in component synthesis, procedures for production of DNA or viruses, and processes occurring in the cell population used for screening can de-link elements. These effects can be mitigated to maintain screen performance, which is particularly important for systems capable of multiple perturbations.
Bacterial libraries for OPS have been generated using episomal and chromosomally integrated genomic perturbations. A preferred method is to express sgRNA or ORFs from plasmids that also encode T7-expressed RNA barcodes. Strain libraries based on chromosomal mutations have been constructed using the phage lambda-derived Red recombination system. For chromosomally expressed barcodes, Zombie in situ T7 in vitro transcription pre-amplification can achieve the target concentration required for detection by in situ sequencing or sequential FISH genotyping protocols.

Data analysis methods

OPS data analysis comprises the extraction of phenotype parameter scores from each cell and matching these scores with perturbation genotype identifiers extracted from each cell using a series of digital image analysis steps. Then, the distributions of phenotype parameter scores can be determined for each perturbation genotype and compared or tested against the distributions observed for cells with control perturbations or a different perturbation genotype.
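The comparison step described above can be illustrated with a minimal sketch, assuming each cell has already been reduced to one phenotype score and one perturbation identifier. The identifiers (`non-targeting`, `sgGENE_A`, `sgGENE_B`), the function names, and the choice of a permutation test on the difference of medians are assumptions made for illustration, not a method prescribed for OPS.

```python
import random
from collections import defaultdict
from statistics import median


def group_scores(cells):
    """Group per-cell phenotype scores by perturbation identifier."""
    groups = defaultdict(list)
    for perturbation, score in cells:
        groups[perturbation].append(score)
    return dict(groups)


def permutation_pvalue(test, control, n_perm=2000, seed=0):
    """Two-sided permutation test on the absolute difference of medians."""
    rng = random.Random(seed)
    observed = abs(median(test) - median(control))
    pooled = list(test) + list(control)
    n_test = len(test)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of cells
        diff = abs(median(pooled[:n_test]) - median(pooled[n_test:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


def compare_to_control(cells, control_id="non-targeting"):
    """Compare each perturbation's score distribution with the control's."""
    groups = group_scores(cells)
    control = groups[control_id]
    results = {}
    for perturbation, scores in groups.items():
        if perturbation == control_id:
            continue
        results[perturbation] = {
            "n_cells": len(scores),
            "median_shift": median(scores) - median(control),
            "p_value": permutation_pvalue(scores, control),
        }
    return results


# Toy data (hypothetical): (perturbation identifier, phenotype score) per cell.
cells = (
    [("non-targeting", s) for s in [1.0, 1.1, 0.9, 1.2, 0.8, 1.0, 1.05, 0.95]]
    + [("sgGENE_A", s) for s in [2.0, 2.2, 1.9, 2.1, 2.3, 1.8, 2.05, 2.15]]
    + [("sgGENE_B", s) for s in [1.0, 0.9, 1.1, 1.05, 0.95, 1.2, 0.85, 1.0]]
)
results = compare_to_control(cells)
for name, stats in sorted(results.items()):
    print(name, stats)
```

In this toy example the scores for `sgGENE_A` are clearly shifted relative to the control cells, while those for `sgGENE_B` are not. A real screen would involve many more cells and perturbations, and would likely require correction for multiple testing.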
Primary analysis of phenotype images involves two major steps. The first is cell segmentation and the alignment of segmentation masks across all the available images. The second is feature identification and the extraction of feature scores from the pixel-level data. Primary analysis of phenotype images may involve a range of computational approaches, including feature selection, dimensionality reduction methods such as PCA, clustering, and machine learning approaches such as support vector machines. For live-cell imaging, the segmented cells are tracked in time-lapse movies, and time-dependent phenotypes can additionally be scored.
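The feature extraction step can be sketched as follows, assuming segmentation has already produced a label mask in which 0 marks background and each positive integer marks the pixels of one cell. The three features computed here (area, mean intensity, centroid) and the function name are illustrative choices only; practical pipelines typically compute many more features with dedicated image analysis libraries.

```python
from collections import defaultdict


def extract_features(label_mask, intensity):
    """Compute simple per-cell features from a label mask and an intensity image.

    Both inputs are 2D lists of equal shape. Label 0 is treated as background.
    """
    acc = defaultdict(lambda: {"area": 0, "total": 0.0, "rows": 0, "cols": 0})
    for r, (mask_row, img_row) in enumerate(zip(label_mask, intensity)):
        for c, (label, value) in enumerate(zip(mask_row, img_row)):
            if label == 0:
                continue  # skip background pixels
            cell = acc[label]
            cell["area"] += 1
            cell["total"] += value
            cell["rows"] += r
            cell["cols"] += c
    features = {}
    for label, cell in acc.items():
        area = cell["area"]
        features[label] = {
            "area": area,
            "mean_intensity": cell["total"] / area,
            "centroid": (cell["rows"] / area, cell["cols"] / area),
        }
    return features


# Toy 3x4 image (hypothetical) containing two segmented cells.
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 2],
    [0, 0, 0, 2],
]
image = [
    [10, 20, 0, 0],
    [30, 40, 0, 5],
    [0, 0, 0, 7],
]
features = extract_features(mask, image)
for label in sorted(features):
    print(label, features[label])
```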
Primary analysis of in situ genotype data also involves two major steps. The first is the identification of signal loci, the association of loci with cells, and the analysis of the sequence of signals at each locus, in a manner similar to single-particle tracking. The second is the assignment of perturbation identifiers to signal loci and cells. Primary analysis of genotype images may involve a range of computational approaches, including machine learning. Primary analysis concludes with the merging of single-cell phenotypes and genotypes and the identification of the set of cells with matched single-cell phenotype scores and genotype identifiers.
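The assignment and merging steps can be sketched under simplifying assumptions: each signal locus has already been associated with a cell and decoded into a short read, reads are matched exactly against a barcode library, and a cell receives a perturbation identifier only if enough of its mapped reads agree. The thresholds `min_spots` and `min_fraction`, the barcode sequences, and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def call_cell_genotypes(spots, barcode_library, min_spots=2, min_fraction=0.7):
    """Assign one perturbation identifier per cell by majority vote over reads.

    spots: iterable of (cell_id, read) pairs, one per signal locus.
    barcode_library: dict mapping barcode sequence to perturbation identifier.
    Reads that do not match the library exactly are discarded.
    """
    per_cell = defaultdict(Counter)
    for cell_id, read in spots:
        perturbation = barcode_library.get(read)
        if perturbation is not None:
            per_cell[cell_id][perturbation] += 1
    calls = {}
    for cell_id, counts in per_cell.items():
        perturbation, top = counts.most_common(1)[0]
        total = sum(counts.values())
        # Keep only cells with enough reads and a clear majority.
        if total >= min_spots and top / total >= min_fraction:
            calls[cell_id] = perturbation
    return calls


def merge_phenotype_genotype(phenotypes, genotypes):
    """Keep only cells that have both a phenotype score and a genotype call."""
    return {
        cell_id: (genotypes[cell_id], score)
        for cell_id, score in phenotypes.items()
        if cell_id in genotypes
    }


# Toy inputs (hypothetical barcodes, reads, and scores).
library = {"ACGT": "sgGENE_A", "TTAG": "sgGENE_B"}
spots = [
    (1, "ACGT"), (1, "ACGT"), (1, "ACGA"),  # one unmatched read is ignored
    (2, "TTAG"), (2, "ACGT"),               # ambiguous cell, no call
    (3, "TTAG"), (3, "TTAG"), (3, "TTAG"),
    (4, "TTAG"),                            # too few reads, no call
]
phenotypes = {1: 0.8, 2: 1.4, 3: 0.3, 5: 0.9}

genotypes = call_cell_genotypes(spots, library)
matched = merge_phenotype_genotype(phenotypes, genotypes)
print(matched)
```

Cells with ambiguous or too few reads are left unassigned in this sketch, so only cells with matched phenotype scores and genotype identifiers enter downstream analysis.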
Secondary analysis entails testing for perturbation effects and integration with other biological database resources and plausibility considerations based on general biological knowledge. New machine learning approaches for the identification and interpretation of perturbation effects from OPS datasets and for the optimal design of OPS experiments are active areas of development.

Applications

OPS has been applied across multiple research areas and for a variety of purposes.
  • Functional Genomics and Cell Biology: OPS facilitates comprehensive functional studies by revealing how specific genetic changes affect a wide range of cell functions, cell biological characteristics, and molecular processes.
  • Drug Discovery: By identifying genes that regulate disease-associated cellular pathways, phenotypes, or states, and the gene functions that must be intact for a drug to act, OPS helps researchers discover new drug targets and better understand the molecular mechanisms of drugs.
  • Disease Research: OPS is used to investigate the etiology and pathophysiology of diseases, including cancer, neurodegenerative conditions (studied in cell models), and infectious diseases. By identifying genes associated with disease phenotypes and treatment responses in research models, and by exploring the impact of genes and alleles known to be associated with clinically defined disease and treatment response in humans, OPS can contribute to the fundamental understanding of disease.
  • Diagnostics: OPS has been used in combination with antibiotic susceptibility testing to identify the species in a mixed sample after the phenotypic susceptibility has been determined for each cell.