Human microbiome
The human microbiome is the aggregate of all microbiota that reside on or within human tissues and biofluids along with the corresponding anatomical sites in which they reside, including the gastrointestinal tract, skin, mammary glands, seminal fluid, uterus, ovarian follicles, lung, saliva, oral mucosa, ocular surface, and the biliary tract. Types of human microbiota include bacteria, archaea, fungi, protists, and viruses. Though micro-animals can also live on the human body, they are typically excluded from this definition. In the context of genomics, the term human microbiome is sometimes used to refer to the collective genomes of resident microorganisms; however, the term human metagenome has the same meaning.
The human body hosts many microorganisms, with approximately the same order of magnitude of non-human cells as human cells. Some microorganisms that humans host are commensal, meaning they co-exist without harming humans; others have a mutualistic relationship with their human hosts. Conversely, some non-pathogenic microorganisms can harm human hosts via the metabolites they produce, like trimethylamine, which the human body converts to trimethylamine N-oxide via FMO3-mediated oxidation. Certain microorganisms perform tasks that are known to be useful to the human host, but the role of most of them is not well understood. Those that are expected to be present, and that under normal circumstances do not cause disease, are sometimes deemed normal flora or normal microbiota.
The Human Microbiome Project took on the task of sequencing the genomes of the human microbiota, focusing particularly on the microbiota that normally inhabit the skin, mouth, nose, digestive tract, and vagina. It reached a milestone in 2012 when it published its initial results.
Terminology
Though these microorganisms are widely known as flora or microflora, the term is technically a misnomer, since the word root flora pertains to plants, whereas biota refers to the total collection of organisms in a particular ecosystem. More recently, the more appropriate term microbiota has come into use, though it has not eclipsed the entrenched use and recognition of flora with regard to bacteria and other microorganisms. Both terms appear in the literature.
Relative numbers
The number of bacterial cells in the human body is estimated to be around 38 trillion, while the estimate for human cells is around 30 trillion. The number of bacterial genes is estimated to be 2 million, about 100 times the approximately 20,000 human genes.
Study
The problem of elucidating the human microbiome is essentially one of identifying the members of a microbial community, which includes bacteria, eukaryotes, and viruses. This is done primarily using deoxyribonucleic acid (DNA)-based studies, though ribonucleic acid (RNA), protein, and metabolite-based studies are also performed. DNA-based microbiome studies can typically be categorized as either targeted amplicon studies or, more recently, shotgun metagenomic studies. The former focus on specific known marker genes and are primarily informative taxonomically, while the latter take a whole-metagenome approach that can also be used to study the functional potential of the community. One challenge that is present in human microbiome studies, but not in other metagenomic studies, is to avoid including the host DNA in the study.
Aside from simply elucidating the composition of the human microbiome, one of the major questions is whether there is a "core", that is, a subset of the community that is shared among most humans. If there is a core, then it would be possible to associate certain community compositions with disease states, which is one of the goals of the Human Microbiome Project (HMP). It is known that the human microbiome is highly variable both within a single subject and among different individuals, a phenomenon that is also observed in mice.
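The idea of a "core" can be made concrete with a small sketch. The prevalence threshold, the function name `core_taxa`, and the toy presence data below are illustrative assumptions rather than values or methods from any study; the sketch only shows one simple way to ask which taxa are shared among most subjects.

```python
from collections import Counter


def core_taxa(samples, min_prevalence=0.9):
    """Return taxa present in at least `min_prevalence` of the samples.

    `samples` maps a subject identifier to the set of taxa detected in it.
    The 0.9 default is an arbitrary illustrative cutoff, not a standard.
    """
    if not samples:
        return set()
    counts = Counter()
    for taxa in samples.values():
        counts.update(set(taxa))  # count each taxon once per subject
    n = len(samples)
    return {taxon for taxon, c in counts.items() if c / n >= min_prevalence}


# Invented presence/absence data for four hypothetical subjects.
samples = {
    "subject_1": {"Bacteroides", "Faecalibacterium", "Prevotella"},
    "subject_2": {"Bacteroides", "Faecalibacterium", "Ruminococcus"},
    "subject_3": {"Bacteroides", "Akkermansia"},
    "subject_4": {"Bacteroides", "Faecalibacterium"},
}

print(sorted(core_taxa(samples, min_prevalence=0.75)))
```

Lowering the threshold enlarges the candidate core and raising it shrinks the core, which is one reason the existence of a core depends on how it is defined.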
On 13 June 2012, a major milestone of the HMP was announced by the National Institutes of Health director Francis Collins. The announcement was accompanied by a series of coordinated articles published in Nature and several journals in the Public Library of Science on the same day. By mapping the normal microbial make-up of healthy humans using genome sequencing techniques, the researchers of the HMP created a reference database and established the boundaries of normal microbial variation in humans. From 242 healthy U.S. volunteers, more than 5,000 samples were collected from tissues at 15 to 18 body sites, such as the mouth, nose, skin, lower intestine, and vagina. All the DNA, human and microbial, was analyzed with DNA sequencing machines. The microbial genome data were extracted by identifying the bacteria-specific ribosomal RNA, 16S rRNA. The researchers calculated that more than 10,000 microbial species occupy the human ecosystem, and they have identified 81–99% of the genera.
Analysis after the processing
Statistical analysis is essential to validate the results obtained; when it is paired with graphical tools, the outcome is easily visualized and understood.
Once a metagenome is assembled, it is possible to infer the functional potential of the microbiome. The computational challenges for this type of analysis are greater than for single genomes, because metagenome assemblies are usually of poorer quality, and many recovered genes are incomplete or fragmented. After the gene identification step, the data can be used to carry out functional annotation by aligning the target genes against databases of orthologs.
Marker gene analysis
Marker gene analysis is a technique that uses primers to target a specific genetic region and makes it possible to determine microbial phylogenies. The genetic region is characterized by a highly variable region, which can confer detailed identification; it is delimited by conserved regions, which function as binding sites for the primers used in PCR. The main gene used to characterize bacteria and archaea is the 16S rRNA gene, while fungal identification is based on the internal transcribed spacer (ITS). The technique is fast and relatively inexpensive and yields a low-resolution classification of a microbial sample; it is well suited to samples that may be contaminated by host DNA. Primer affinity varies among DNA sequences, which may result in biases during the amplification reaction; indeed, low-abundance samples are susceptible to overamplification errors, since contaminating microorganisms become over-represented as the number of PCR cycles increases. Optimizing primer selection can therefore help to decrease such errors, although it requires complete knowledge of the microorganisms present in the sample and their relative abundances.
Marker gene analysis can be influenced by the choice of primers; in this kind of analysis, it is desirable to use a well-validated protocol. The first step in a marker gene amplicon analysis is to remove sequencing errors; many sequencing platforms are very reliable, but most of the apparent sequence diversity is still due to errors made during the sequencing process. A first approach to reducing this effect is to cluster sequences into operational taxonomic units (OTUs): this process consolidates similar sequences into a single feature that can be used in further analysis steps. This method, however, discards single-nucleotide polymorphisms (SNPs), because sequences that differ only by an SNP are clustered into a single OTU.
Another approach is oligotyping, which uses position-specific information from 16S rRNA sequencing to detect small nucleotide variations and to discriminate between closely related but distinct taxa. Such methods give as output a table of DNA sequences and counts of the different sequences per sample, rather than OTUs.
Another important step in the analysis is to assign a taxonomic name to the microbial sequences in the data. This can be done using machine learning approaches that can reach a genus-level accuracy of about 80%. Other popular analysis packages provide support for taxonomic classification using exact matches to reference databases and should provide greater specificity, but poor sensitivity. Unclassified microorganisms should be further checked for organelle sequences.
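The effect of OTU clustering on single-nucleotide variants can be illustrated with a minimal sketch. It assumes equal-length, pre-aligned reads, a simple position-wise identity, and a 97% threshold; the function names, the greedy abundance-ordered strategy, and the toy reads are illustrative choices and do not reproduce any particular software package.

```python
from collections import Counter


def identity(a, b):
    """Fraction of matching positions; assumes equal-length, pre-aligned reads."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(reads, threshold=0.97):
    """Cluster reads into OTUs and return {centroid_sequence: total_count}."""
    unique = Counter(reads)
    otus = {}
    # The most abundant sequences are visited first and become centroids.
    for seq, count in sorted(unique.items(), key=lambda kv: (-kv[1], kv[0])):
        for centroid in otus:
            if identity(seq, centroid) >= threshold:
                otus[centroid] += count  # merged: the variant is no longer visible
                break
        else:
            otus[seq] = count  # no centroid is close enough: start a new OTU
    return otus


# Invented 40-nucleotide reads: a reference, a one-SNP variant, a distant read.
base = "ACGT" * 10
snp = "T" + base[1:]           # 39/40 identical to base (0.975)
distant = "TTTT" + base[4:]    # 37/40 identical to base (0.925)
reads = [base] * 5 + [snp] * 2 + [distant] * 3

otus = greedy_otus(reads)
print(len(otus), "OTUs; SNP variant kept separately:", snp in otus)
```

In this toy input the one-SNP variant is absorbed into the OTU of the more abundant sequence, which is the loss of resolution that exact-sequence approaches such as oligotyping are meant to avoid.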
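The trade-off of exact matching can be seen in a minimal sketch. The reference dictionary, the placeholder names `Genus_A` and `Genus_B`, and the function `classify_exact` are hypothetical and stand in for a real reference database and classifier.

```python
def classify_exact(sequences, reference):
    """Assign a taxon only when a sequence matches a reference entry exactly."""
    return {seq: reference.get(seq, "Unclassified") for seq in sequences}


# Hypothetical reference database: sequence -> taxon label.
reference = {
    "ACGTACGTAC": "Genus_A",
    "TTGACCGTAA": "Genus_B",
}

# The second query differs from a reference entry by a single base.
queries = ["ACGTACGTAC", "ACGTACGTAT", "TTGACCGTAA"]

assignments = classify_exact(queries, reference)
for seq, taxon in assignments.items():
    print(seq, taxon)
```

Every assignment made this way is backed by an identical reference sequence, but a query that differs by even one base is left unclassified, which is why exact matching tends toward high specificity and low sensitivity.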
Phylogenetic analysis
Many methods that exploit phylogenetic inference use the 16S rRNA gene for Archaea and Bacteria and the 18S rRNA gene for Eukaryotes. Phylogenetic comparative methods (PCMs) are based on the comparison of multiple traits among microorganisms; the principle is that the more closely organisms are related, the more traits they share. PCMs are usually coupled with phylogenetic generalized least squares or other statistical analyses to obtain more significant results. Ancestral state reconstruction is used in microbiome studies to impute trait values for taxa whose traits are unknown. This is commonly performed with PICRUSt, which relies on available databases. Phylogenetic variables are chosen by researchers according to the type of study: by selecting variables that carry significant biological information, it is possible to reduce the dimension of the data to be analysed.
Phylogeny-aware distances are usually computed with UniFrac or similar measures, such as Sørensen's index or Rao's D, to quantify the differences between communities. All these methods are negatively affected by horizontal gene transfer (HGT), since it can generate errors and lead to the correlation of distant species. There are different ways to reduce the negative impact of HGT, such as the use of multiple genes or of computational tools that assess the probability of putative HGT events.
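As an illustration of a phylogeny-aware distance, the following sketch computes unweighted UniFrac in its usual form: the fraction of total branch length that leads to members of only one of the two communities. The tree encoding (each node mapped to its parent and branch length), the function names, and the toy tree are assumptions made for this example and are not the interface of any existing tool.

```python
def branches_covered(tree, tips):
    """Return the nodes whose parent branch lies on a path from a tip to the root."""
    covered = set()
    for tip in tips:
        node = tip
        while node not in covered:
            parent, _length = tree[node]
            if parent is None:  # the root has no parent branch
                break
            covered.add(node)
            node = parent
    return covered


def unweighted_unifrac(tree, community_a, community_b):
    """Branch length unique to one community divided by branch length in either."""
    a = branches_covered(tree, community_a)
    b = branches_covered(tree, community_b)

    def total_length(nodes):
        return sum(tree[n][1] for n in nodes)

    observed = total_length(a | b)
    if observed == 0:
        return 0.0
    return total_length(a ^ b) / observed


# Toy tree: node -> (parent, branch length). Two clades, X and Y, with two tips each.
tree = {
    "root": (None, 0.0),
    "X": ("root", 1.0), "Y": ("root", 1.0),
    "t1": ("X", 1.0), "t2": ("X", 1.0),
    "t3": ("Y", 1.0), "t4": ("Y", 1.0),
}

print(unweighted_unifrac(tree, {"t1", "t2"}, {"t3", "t4"}))  # communities in separate clades
print(unweighted_unifrac(tree, {"t1", "t3"}, {"t2", "t4"}))  # communities interleaved on the tree
```

Two communities drawn from separate clades share no branches and are maximally distant, whereas communities whose members are interleaved across the tree share the internal branches and score lower.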
Ecological network analysis
Microbial communities develop through very complex dynamics, which can be viewed and analyzed as an ecosystem. The ecological interactions between microbes govern the community's change, equilibrium, and stability, and can be represented by a population dynamics model.
The study of the ecological features of the microbiome is growing rapidly and makes it possible to understand the microbiome's fundamental properties. Understanding the underlying rules of microbial communities could help with treating diseases related to unstable microbial communities.
A basic question is whether different humans, who host different microbial communities, share the same underlying microbial dynamics. Increasing evidence suggests that the dynamics are indeed universal. Answering this question is a basic step toward developing treatment strategies based on the complex dynamics of human microbial communities.
Further properties must be taken into account when developing intervention strategies for controlling human microbial dynamics. Controlling microbial communities could help in treating serious diseases.
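One widely used population dynamics model for interacting species is the generalized Lotka-Volterra system; it is chosen here only as an example, since the text does not name a specific model. The sketch below integrates it with a simple forward Euler scheme. The growth rates, interaction matrix, step size, and function name are invented for illustration.

```python
def simulate_glv(x0, growth, interactions, dt=0.01, steps=5000):
    """Integrate dx_i/dt = x_i * (r_i + sum_j A_ij * x_j) with forward Euler.

    x0           -- initial abundances
    growth       -- intrinsic growth rates r_i
    interactions -- matrix A; A[i][j] is the effect of species j on species i
    """
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        dx = [
            x[i] * (growth[i] + sum(interactions[i][j] * x[j] for j in range(n)))
            for i in range(n)
        ]
        # Clamp at zero so numerical error cannot produce negative abundances.
        x = [max(xi + dt * dxi, 0.0) for xi, dxi in zip(x, dx)]
    return x


# Invented two-species community with self-limitation and mutual competition.
growth = [1.0, 0.8]
interactions = [
    [-1.0, -0.5],
    [-0.4, -1.0],
]

final = simulate_glv([0.1, 0.1], growth, interactions)
print([round(v, 3) for v in final])
```

With these parameters the coexistence equilibrium, found by solving r + Ax = 0, is x = (0.75, 0.5), and the simulated abundances settle close to it. The signs and magnitudes of the entries of A encode the ecological network whose structure governs the stability of the community.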