Gene regulatory network
A gene regulatory network (GRN) is a collection of molecular regulators that interact with each other and with other substances in the cell to govern the gene expression levels of mRNA and proteins which, in turn, determine the function of the cell. GRNs also play a central role in morphogenesis, the creation of body structures, which in turn is central to evolutionary developmental biology.
The regulator can be DNA, RNA, protein or any combination of two or more of these three that form a complex, such as a specific sequence of DNA and a transcription factor that activates that sequence. The interaction can be direct or indirect. In general, each mRNA molecule goes on to make a specific protein. In some cases this protein will be structural, and will accumulate at the cell membrane or within the cell to give it particular structural properties. In other cases the protein will be an enzyme, i.e., a micro-machine that catalyses a certain reaction, such as the breakdown of a food source or toxin. Some proteins, though, serve only to activate other genes; these are the transcription factors that are the main players in regulatory networks or cascades. By binding to the promoter region at the start of other genes they turn them on, initiating the production of another protein, and so on. Some transcription factors are inhibitory.
In single-celled organisms, regulatory networks respond to the external environment, optimising the cell at a given time for survival in this environment. Thus a yeast cell, finding itself in a sugar solution, will turn on genes to make enzymes that process the sugar to alcohol. This process, which we associate with wine-making, is how the yeast cell makes its living, gaining energy to multiply, which under normal circumstances would enhance its survival prospects.
In multicellular animals the same principle has been put in the service of gene cascades that control body shape. Each time a cell divides, two cells result which, although they contain the same genome in full, can differ in which genes are turned on and making proteins. Sometimes a self-sustaining feedback loop ensures that a cell maintains its identity and passes it on. Less understood is the mechanism of epigenetics, by which chromatin modification may provide cellular memory by blocking or allowing transcription. A major feature of multicellular animals is the use of morphogen gradients, which in effect provide a positioning system that tells a cell where in the body it is, and hence what sort of cell to become. A gene that is turned on in one cell may make a product that leaves the cell and diffuses through adjacent cells, entering them and turning on genes only when it is present above a certain threshold level. These cells are thus induced into a new fate, and may even generate other morphogens that signal back to the original cell. Over longer distances morphogens may use the active process of signal transduction. Such signalling controls embryogenesis, the building of a body plan from scratch through a series of sequential steps. Such signalling also controls and maintains adult bodies through feedback processes, and the loss of such feedback because of a mutation can be responsible for the cell proliferation that is seen in cancer. In parallel with this process of building structure, the gene cascade turns on genes that make structural proteins that give each cell the physical properties it needs.
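The threshold readout of a morphogen gradient described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the morphogen level decays exponentially with distance from its source and that a cell compares the local level against two fixed thresholds; the function names, decay length, threshold values and fate labels are all hypothetical and are not taken from any particular organism.

```python
import math

def morphogen_concentration(distance, source_level=1.0, decay_length=5.0):
    """Morphogen level at a given distance from the source.

    Assumption: an exponentially decaying steady-state gradient.
    """
    return source_level * math.exp(-distance / decay_length)

def cell_fate(concentration, high_threshold=0.5, low_threshold=0.1):
    """Map a local morphogen level to a (hypothetical) cell fate."""
    if concentration >= high_threshold:
        return "fate_A"      # genes that need a high level are turned on
    if concentration >= low_threshold:
        return "fate_B"      # only genes with a low threshold are turned on
    return "default"         # morphogen too dilute to induce any new fate

# A row of 20 cells at distances 0, 1, ..., 19 from the source.
fates = [cell_fate(morphogen_concentration(d)) for d in range(20)]
print(fates)
```

With these illustrative parameters, the cells nearest the source adopt one fate, an intermediate band adopts a second, and distant cells remain in the default state, so a single graded signal yields discrete spatial domains.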
Overview
At one level, biological cells can be thought of as "partially mixed bags" of biological chemicals; in the discussion of gene regulatory networks, these chemicals are mostly the messenger RNAs and proteins that arise from gene expression. These mRNA and proteins interact with each other with various degrees of specificity. Some diffuse around the cell. Others are bound to cell membranes, interacting with molecules in the environment. Still others pass through cell membranes and mediate long range signals to other cells in a multi-cellular organism. These molecules and their interactions comprise a gene regulatory network.
[Figure: Example of a regulatory network]
The nodes of this network can represent genes, proteins, mRNAs, protein/protein complexes or cellular processes. Nodes that are depicted as lying along vertical lines are associated with the cell/environment interfaces, while the others are free-floating and can diffuse. Edges between nodes represent interactions between the nodes, that can correspond to individual molecular reactions between DNA, mRNA, miRNA, proteins or molecular processes through which the products of one gene affect those of another, though the lack of experimentally obtained information often implies that some reactions are not modeled at such a fine level of detail. These interactions can be inductive, with an increase in the concentration of one leading to an increase in the other, inhibitory, with an increase in one leading to a decrease in the other, or dual, when depending on the circumstances the regulator can activate or inhibit the target node. The nodes can regulate themselves directly or indirectly, creating feedback loops, which form cyclic chains of dependencies in the topological network. The network structure is an abstraction of the system's molecular or chemical dynamics, describing the manifold ways in which one substance affects all the others to which it is connected. In practice, such GRNs are inferred from the biological literature on a given system and represent a distillation of the collective knowledge about a set of related biochemical reactions. To speed up the manual curation of GRNs, some recent efforts try to use text mining, curated databases, network inference from massive data, model checking and other information extraction technologies for this purpose.
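The graph abstraction described above, with signed edges and feedback loops, can be sketched as follows. The network is a hypothetical toy example: the node names, the edges and the encoding of activation as +1 and inhibition as -1 are assumptions made for illustration, and dual interactions are not represented.

```python
# Hypothetical toy network: +1 = activation (inductive), -1 = inhibition.
edges = {
    ("A", "B"): +1,
    ("B", "C"): +1,
    ("C", "A"): -1,   # closes the loop A -> B -> C -| A
    ("C", "C"): +1,   # direct self-regulation of C
    ("A", "D"): -1,
}

def successors(node):
    """Nodes directly regulated by `node`."""
    return [target for (source, target) in edges if source == node]

def find_feedback_loops():
    """Enumerate simple cycles, each reported once from its smallest node."""
    loops = []

    def walk(start, node, path):
        for nxt in successors(node):
            if nxt == start:
                loops.append(path[:])          # cycle closed
            elif nxt > start and nxt not in path:
                walk(start, nxt, path + [nxt])

    for start in sorted({n for edge in edges for n in edge}):
        walk(start, start, [start])
    return loops

def loop_sign(loop):
    """Product of edge signs: -1 is a negative loop, +1 a positive loop."""
    sign = 1
    for i, source in enumerate(loop):
        target = loop[(i + 1) % len(loop)]
        sign *= edges[(source, target)]
    return sign

loops = find_feedback_loops()
for loop in loops:
    print(loop, "negative" if loop_sign(loop) < 0 else "positive")
```

In this toy network the three-node cycle contains one inhibitory edge and is therefore a negative feedback loop, while the self-activation of C is a positive one. The exhaustive search is only practical for small networks.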
Genes can be viewed as nodes in the network, with inputs being proteins such as transcription factors, and outputs being the level of gene expression. The value of a node is given by a function of the values of its regulators in previous time steps. These functions have been interpreted as performing a kind of information processing within the cell, which determines cellular behavior. The basic drivers within cells are the concentrations of some proteins, which determine both the spatial and temporal coordinates of the cell, as a kind of "cellular memory". Gene networks are only beginning to be understood, and a next step for biology is to attempt to deduce the function of each gene "node", to help understand the behavior of the system at increasing levels of complexity, from gene to signaling pathway, cell or tissue level.
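The idea that a node's value is a function of its regulators' values at the previous time step can be illustrated with a small synchronous Boolean network. The three genes and their update rules below are hypothetical, chosen only to make the dynamics easy to follow.

```python
# Hypothetical update rules: each gene's next state is a Boolean function
# of the current states of its regulators.
rules = {
    "X": lambda s: not s["Z"],          # Z represses X
    "Y": lambda s: s["X"],              # X activates Y
    "Z": lambda s: s["X"] and s["Y"],   # X and Y jointly activate Z
}

def step(state):
    """Synchronous update: every gene reads the same previous state."""
    return {gene: bool(rule(state)) for gene, rule in rules.items()}

def trajectory(state, n_steps):
    """List of states visited, starting from `state`."""
    states = [state]
    for _ in range(n_steps):
        state = step(state)
        states.append(state)
    return states

start = {"X": True, "Y": False, "Z": False}
for t, s in enumerate(trajectory(start, 5)):
    print(t, {gene: int(on) for gene, on in s.items()})
```

Starting from X on and the other genes off, this toy network returns to its initial state after five steps, an example of the cyclic behavior that a negative feedback loop can produce in a Boolean model.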
Mathematical models of GRNs have been developed to capture the behavior of the system being modeled, and in some cases to generate predictions that correspond with experimental observations. In other cases, models have made accurate novel predictions that can be tested experimentally, thus suggesting approaches that might not otherwise be considered when designing a laboratory protocol. Modeling techniques include differential equations, Boolean networks, Petri nets, Bayesian networks, graphical Gaussian network models, stochastic models, and process calculi. Conversely, techniques have been proposed for generating models of GRNs that best explain a set of time series observations. It has also been shown that the ChIP-seq signal of histone modifications is more strongly correlated with transcription factor motifs at promoters than RNA levels are. Hence it has been proposed that time-series histone modification ChIP-seq could provide more reliable inference of gene regulatory networks than methods based on expression levels.
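As an illustration of the differential-equation approach, the following sketch models a single gene whose transcription is activated by a transcription factor. It assumes a Hill-type activation function, linear degradation of mRNA and protein, and simple forward Euler integration; all parameter names and values are hypothetical.

```python
def hill_activation(tf, k=1.0, n=2):
    """Fraction of maximal transcription at transcription factor level `tf`."""
    return tf**n / (k**n + tf**n)

def simulate(tf_level, t_end=50.0, dt=0.01,
             alpha=2.0, beta=1.0, deg_m=1.0, deg_p=0.5):
    """Integrate  dm/dt = alpha * hill(tf) - deg_m * m
                  dp/dt = beta * m        - deg_p * p
    with forward Euler steps, starting from m = p = 0.
    """
    m, p = 0.0, 0.0
    for _ in range(int(t_end / dt)):
        dm = alpha * hill_activation(tf_level) - deg_m * m
        dp = beta * m - deg_p * p
        m += dt * dm
        p += dt * dp
    return m, p

m_final, p_final = simulate(tf_level=1.0)
print(round(m_final, 3), round(p_final, 3))
```

Setting both derivatives to zero gives the steady state m* = alpha * hill(tf) / deg_m and p* = beta * m* / deg_p, which the simulation approaches after a sufficiently long run. A larger network would add one such pair of equations per gene, with the protein of one gene serving as the transcription factor of another.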
Structure and evolution
Global feature
Gene regulatory networks are generally thought to be made up of a few highly connected nodes and many poorly connected nodes nested within a hierarchical regulatory regime. Thus gene regulatory networks approximate a hierarchical scale-free network topology. This is consistent with the view that most genes have limited pleiotropy and operate within regulatory modules. This structure is thought to evolve due to the preferential attachment of duplicated genes to more highly connected genes. Recent work has also shown that natural selection tends to favor networks with sparse connectivity.
There are primarily two ways that networks can evolve, both of which can occur simultaneously. The first is that network topology can be changed by the addition or subtraction of nodes, or parts of the network may be expressed in different contexts. The Drosophila Hippo signaling pathway provides a good example. The Hippo signaling pathway controls both mitotic growth and post-mitotic cellular differentiation. It was found that the network in which the Hippo signaling pathway operates differs between these two functions, which in turn changes the behavior of the pathway. This suggests that the Hippo signaling pathway operates as a conserved regulatory module that can be used for multiple functions depending on context. Thus, changing network topology can allow a conserved module to serve multiple functions and alter the final output of the network. The second way networks can evolve is by changing the strength of interactions between nodes, such as how strongly a transcription factor binds to a cis-regulatory element. Such variation in the strength of network edges has been shown to underlie between-species variation in vulva cell fate patterning of Caenorhabditis worms.
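How preferential attachment can produce a few highly connected nodes and many poorly connected ones can be seen in a minimal growth simulation. This is a generic preferential-attachment sketch, not a model of gene duplication itself: it assumes each new node forms a single undirected link to an existing node chosen with probability proportional to that node's current degree. The function names and network size are hypothetical.

```python
import random
from collections import Counter

def grow_network(n_nodes, seed=0):
    """Grow an undirected network one node at a time.

    Each new node links to one existing node, picked with probability
    proportional to that node's degree (preferential attachment).
    """
    rng = random.Random(seed)
    edges = [(0, 1)]
    # Each node appears in this list once per edge it touches, so a uniform
    # draw from it selects a node in proportion to its degree.
    endpoints = [0, 1]
    for new_node in range(2, n_nodes):
        target = rng.choice(endpoints)
        edges.append((new_node, target))
        endpoints.extend([new_node, target])
    return edges

def degrees(edges):
    """Number of links per node."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts

deg = degrees(grow_network(500))
ranked = sorted(deg.values(), reverse=True)
print("largest degrees:", ranked[:5])
print("median degree:", ranked[len(ranked) // 2])
```

In a typical run most nodes end up with a single link while a handful of early nodes accumulate many, the heavy-tailed pattern associated with scale-free topologies.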
Local feature
Another widely cited characteristic of gene regulatory networks is their abundance of certain repetitive sub-networks known as network motifs. Network motifs can be regarded as repetitive topological patterns when dividing a big network into small blocks. Previous analysis found several types of motifs that appeared more often in gene regulatory networks than in randomly generated networks. One such motif is the feed-forward loop, which consists of three nodes: a regulator A that controls a second regulator B, with both A and B controlling a target C. This motif is the most abundant among all possible motifs made up of three nodes, as is shown in the gene regulatory networks of fly, nematode, and human.
The enriched motifs have been proposed to follow convergent evolution, suggesting they are "optimal designs" for certain regulatory purposes. For example, modeling shows that feed-forward loops are able to coordinate the change in node A and the expression dynamics of node C, creating different input-output behaviors. The galactose utilization system of E. coli contains a feed-forward loop which accelerates the activation of the galactose utilization operon galETK, potentially facilitating the metabolic transition to galactose when glucose is depleted. The feed-forward loop in the arabinose utilization system of E. coli delays the activation of the arabinose catabolism operon and transporters, potentially avoiding an unnecessary metabolic transition due to temporary fluctuations in upstream signaling pathways. Similarly, in the Wnt signaling pathway of Xenopus, the feed-forward loop acts as a fold-change detector that responds to the fold change, rather than the absolute change, in the level of β-catenin, potentially increasing the resistance to fluctuations in β-catenin levels. Following the convergent evolution hypothesis, the enrichment of feed-forward loops would be an adaptation for fast response and noise resistance.
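Counting occurrences of the feed-forward loop in a directed network can be sketched as below. The example networks are hypothetical, and the brute-force search over ordered node triples is an assumption made for clarity rather than the method used in the studies mentioned above; it is only practical for small networks. Assessing enrichment would additionally require comparing the count against randomized networks, which this sketch does not do.

```python
from itertools import permutations

def count_feed_forward_loops(edges):
    """Count ordered triples (A, B, C) with edges A->B, B->C and A->C."""
    edge_set = set(edges)
    nodes = {node for edge in edge_set for node in edge}
    count = 0
    for a, b, c in permutations(nodes, 3):
        if (a, b) in edge_set and (b, c) in edge_set and (a, c) in edge_set:
            count += 1
    return count

# Hypothetical toy networks given as (regulator, target) pairs.
with_ffl = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("Z", "W")]
cycle_only = [("A", "B"), ("B", "C"), ("C", "A")]

print(count_feed_forward_loops(with_ffl))
print(count_feed_forward_loops(cycle_only))
```

The first toy network contains one feed-forward loop (X regulates Y and Z, and Y regulates Z), whereas the three-node cycle contains none, since a cycle lacks the direct A-to-C shortcut.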
A recent study found that yeast grown in an environment of constant glucose developed mutations in glucose signaling pathways and growth regulation pathways, suggesting that regulatory components responding to environmental changes are dispensable in a constant environment.
On the other hand, some researchers hypothesize that the enrichment of network motifs is non-adaptive. In other words, gene regulatory networks can evolve to a similar structure without specific selection on the proposed input-output behavior. Support for this hypothesis often comes from computational simulations. For example, fluctuations in the abundance of feed-forward loops in a model that simulates the evolution of gene regulatory networks by randomly rewiring nodes may suggest that the enrichment of feed-forward loops is a side-effect of evolution. In another model of gene regulatory network evolution, the ratio of the frequencies of gene duplication and gene deletion shows great influence on network topology: certain ratios lead to the enrichment of feed-forward loops and create networks that show features of hierarchical scale-free networks. De novo evolution of coherent type 1 feed-forward loops has been demonstrated computationally in response to selection for their hypothesized function of filtering out a short spurious signal, supporting adaptive evolution, but for non-idealized noise, a dynamics-based system of feed-forward regulation with a different topology was instead favored.
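The hypothesized filtering function of a coherent type 1 feed-forward loop can be illustrated with a minimal discrete-time sketch. It assumes that A activates both B and C, that B also activates C, that C requires both inputs (AND logic), and that B needs several consecutive time steps of input from A before it becomes active. The `delay` parameter and the signals are hypothetical, and this idealized model is not the evolutionary simulation referred to above.

```python
def simulate_c1_ffl(input_signal, delay=3):
    """Output C over time for a coherent type 1 feed-forward loop.

    A activates B and C, B activates C, and C needs both A and B.
    Assumption: B becomes active only after A has been on for more than
    `delay` consecutive steps, and switches off as soon as A does.
    """
    output = []
    steps_a_on = 0
    for a in input_signal:
        steps_a_on = steps_a_on + 1 if a else 0
        b = steps_a_on > delay
        output.append(int(bool(a) and b))
    return output

short_pulse = [0, 1, 1, 0, 0, 0, 0, 0]
long_pulse = [0, 1, 1, 1, 1, 1, 1, 0]

print(simulate_c1_ffl(short_pulse))
print(simulate_c1_ffl(long_pulse))
```

Under these assumptions the short pulse ends before B has accumulated, so C never switches on, while the sustained signal activates C after a lag and C switches off as soon as A does.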