Monday, April 20, 2009

How Smart is the Cell? Part II: The Gene Activation Network as an Analog Computer

In Part I we examined the enzyme activation network, observing the analogy to an analog computer. Now we're going to examine how DNA fits into this picture. Most of us are familiar with the protein synthesis process, which is essential to create the unactivated enzymes described in Part I, as well as the other enzymes and proteins within the cell.

The sequence that will finally code for the protein is carried in the DNA. It is transcribed, creating a primary RNA transcript, which is edited (processed) to create messenger RNA (mRNA). Messenger RNA is translated to a protein, which is then (sometimes) subject to further processing. This protein may be one of the enzymes participating in an activation network. Every step in the process is subject to regulatory control. Even the disposal of mRNA is sometimes subject to regulation by various enzymes, modifying the level of translation.

We're going to focus on transcription initiation here, although the other steps can add complexity (and, potentially, intelligence) to the overall interaction network.

Transcription initiation for (most) protein-coding genes is very different between prokaryotes and eukaryotes. I'm going to focus on eukaryotes here. There are three types of RNA polymerase used by eukaryotes, but the only one that really matters for our purposes is RNA polymerase II, as the others normally only transcribe non-translated RNA such as ribosomal RNA and transfer RNA.

The fundamental key to control of transcription is the ability of proteins to fit like a key in a lock into openings in the DNA strand. Most of these openings are in the major (wide) groove of the DNA helix, although there are exceptions. (We should note that double-stranded RNA doesn't offer the same wide groove: it forms an A-form helix whose major groove is deep and narrow, rather than the B-form normally used by DNA.) In addition to a key/lock relationship with DNA, these proteins (enzymes) also have key/lock relationships with one another, allowing interactions among them.

The classic description of transcriptional regulation involves a laundry list of elements such as promoters, enhancers, silencers, insulators, and locus control regions (LCR). (A good description based on these terms is given in Transcriptional Regulatory Elements in the Human Genome.)

I'm going to use a slightly newer version here, based on the modular nature of most of the elements involved. (I will be working primarily from The Evolution of Transcriptional Regulation in Eukaryotes by Gregory A. Wray, Matthew W. Hahn, Ehab Abouheif, James P. Balhoff, Margaret Pizer, Matthew V. Rockman and Laura A. Romano.) Basically, there is a promoter that interacts with a number of modules, which variously act as enhancers, silencers, insulators, locus control regions, etc. These modules, in turn, are made up of one or more Transcription Factor Binding Sites (TFBS or "binding site") that interact with enzymes (or, perhaps ribozymes) called Transcription Factors (TF). Figure 1 gives a picture of the promoter region and its surrounding control regions.

Figure 1 (from The Evolution of Transcriptional Regulation in Eukaryotes).

The binding sites that make up a module can interact with more than one TF, and when two or more binding sites are too near to one another or overlap, the binding of one TF can interfere with that of another. Moreover, any TF that binds to DNA may then perform a number of other functions, such as binding to other proteins like co-factors, looping factors, or the chromatin remodeling complex. The action taken may depend on what other factors are present, so that a TF that acts to enhance transcription in the presence of the right co-factor could act to repress it when the co-factor is absent and the TF interferes with the binding of another enhancing TF.

Figure 2 (from The Evolution of Transcriptional Regulation in Eukaryotes).
Context dependence of binding site activity (the arrow at the promoter represents successful transcription initiation): (A) The binding site successfully affects the promoter. (B) The TF is absent. (C) Local chromatin is condensed (whether or not the TF is present). (D) An adjacent site is occupied, masking the binding site. (E) TF is present but in an inactive form (e.g. not phospho-activated). (F) A different TF has a higher affinity for the binding site. (G) Here two TF's must be present to affect transcription. (H) One of the TF's is absent. (I) Another co-factor interacts with the TF with greater affinity than the other TF. (J) A different cofactor (TF) has greater affinity for the other binding site.

Figure 2 illustrates some ways that a TF's interaction with a binding site can be influenced by context. This context includes the presence or absence of TF's and other factors that are ultimately produced from DNA that (usually) lies elsewhere than the coding region of the gene being regulated.

A TF can interact with many binding sites, and its activity with each site will be independent of the others, except that when it's present in relatively small amounts, there will be competition among sites for TF activity.
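To make the competition among sites concrete, here is a minimal Python sketch. It assumes simple equilibrium binding with one TF and several kinds of site; the function names, concentrations, and dissociation constants are all made up for illustration and are not taken from Wray et al.

```python
def free_tf(total_tf, sites, iterations=60):
    """Free TF concentration at equilibrium, found by bisection.

    sites is a list of (site_concentration, dissociation_constant) pairs.
    Conservation of TF: free + sum(bound at each site) = total.
    """
    lo, hi = 0.0, total_tf
    for _ in range(iterations):
        mid = (lo + hi) / 2
        bound = sum(s * mid / (mid + k) for s, k in sites)
        if mid + bound > total_tf:
            hi = mid  # guessed too much free TF
        else:
            lo = mid  # guessed too little free TF
    return (lo + hi) / 2


def occupancies(total_tf, sites):
    """Fraction of each kind of site that is occupied by the TF."""
    f = free_tf(total_tf, sites)
    return [f / (f + k) for _, k in sites]


weak_site = (1.0, 1.0)      # hypothetical: concentration 1, Kd 1
strong_sites = (5.0, 0.1)   # hypothetical: many high-affinity sites

for total in (1.0, 1000.0):
    alone = occupancies(total, [weak_site])[0]
    shared = occupancies(total, [weak_site, strong_sites])[0]
    print(f"total TF {total}: alone {alone:.3f}, with competitors {shared:.3f}")
```

With scarce TF, the high-affinity sites soak up most of it and the weak site's occupancy drops. With abundant TF, the two cases are nearly identical, which is the "independent of the others" regime.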

The interaction between a TF and its binding site depends on the specific sequence of DNA in the binding site. However, experiments with TF binding have shown that there are many sequences that will bind any particular TF, usually all very similar. By comparing these sequences, it's usually possible to find a consensus sequence that is very similar to all of them.
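The idea of a consensus sequence can be sketched in a few lines of Python. This assumes the binding sites are already aligned and of equal length, and simply takes the most common base at each position; the sequences below are invented for illustration and are not from any table in the paper.

```python
from collections import Counter


def consensus(sites):
    """Most common base at each position of pre-aligned, equal-length sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*sites)
    )


def mismatches(site, reference):
    """Number of positions at which a site differs from the reference."""
    return sum(a != b for a, b in zip(site, reference))


# Hypothetical binding sites for a single TF.
observed_sites = ["TGACTCA", "TGAGTCA", "TGACTCT", "AGACTCA", "TGACTAA"]

reference = consensus(observed_sites)
print("consensus:", reference)
for site in observed_sites:
    print(site, "differs at", mismatches(site, reference), "position(s)")
```

Real analyses usually keep the full base frequencies at each position (a position weight matrix) rather than collapsing them to a single string, since the frequencies carry information about how much each mismatch costs.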

The consensus sequence will generally have the highest binding energy; that is, it will stick most tightly to the TF. However, other similar sequences may be able to bind to the TF, although with different behavior. A few example consensus sequences are found in Table 2 of The Evolution of Transcriptional Regulation in Eukaryotes. This means that when there are multiple binding sites with slightly different sequences, they will have different binding energies, and the activity will be different for the same concentration of TF. Note also that there can be multiple TF's with similar (but non-identical) consensus sequences, so that different binding sites may bind to different TF's depending on the relative concentrations of the TF's.
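Here is a rough Python sketch of two TF's competing for a single binding site. It assumes simple equilibrium binding in which at most one TF can occupy the site at a time; the function name and all the numbers are hypothetical.

```python
def site_occupancy(conc_a, kd_a, conc_b, kd_b):
    """Fraction of time one site is held by TF A and by TF B.

    Each TF gets a statistical weight of concentration / Kd; the empty
    site has weight 1. A lower Kd means tighter binding.
    """
    weight_a = conc_a / kd_a
    weight_b = conc_b / kd_b
    total = 1.0 + weight_a + weight_b
    return weight_a / total, weight_b / total


# Hypothetical site: binds A tightly (Kd 1) and B ten times more weakly (Kd 10).
for conc_b in (1.0, 100.0):
    frac_a, frac_b = site_occupancy(1.0, 1.0, conc_b, 10.0)
    print(f"[B] = {conc_b}: A holds {frac_a:.2f}, B holds {frac_b:.2f}")
```

At equal concentrations the tighter-binding TF dominates, but a large enough excess of the weaker binder takes the site over, which is the dependence on relative concentrations described above.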

The effect of each TF concentration on transcription rate will generally be analog. Although a high enough concentration will saturate any particular binding site, producing full-bore transcription (assuming it's an enhancer), lower concentrations will cause each TF/binding-site interaction to perform an analog calculation.
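A minimal way to picture this graded-then-saturating behavior in Python is below. It assumes, purely for illustration, that the transcription rate is proportional to the occupancy of a single enhancer site; the function name and parameters are mine, not the paper's.

```python
def transcription_rate(tf_conc, kd, max_rate=1.0):
    """Rate proportional to the occupancy of one enhancer binding site.

    Occupancy follows the simple saturation curve c / (c + Kd).
    """
    return max_rate * tf_conc / (tf_conc + kd)


kd = 1.0  # hypothetical dissociation constant, arbitrary units
for conc in (0.0, 0.1, 0.5, 1.0, 5.0, 50.0, 1000.0):
    print(f"[TF] = {conc:7.1f} -> rate {transcription_rate(conc, kd):.3f}")
```

Below saturation, every change in concentration shows up as a change in rate, which is the analog part; far above Kd the curve flattens out at the maximum rate.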

Each of the points mentioned above adds to the complexity of the analog calculation performed by a particular module, as well as by the overall promoter. The inputs are the concentrations of various TF's (and other factors) in the nucleus; the output is the transcription rate at the promoter. Indeed, as Wray, et al describe it:
Two aspects of promoter function are reminiscent of analog logic circuits ([ref's]). (1) Individual modules can function as Boolean (off/on) or scalar (quantitative) elements whose interactions have predictable, additive effects on transcription. Multiple modules are sometimes required to produce a single phase of expression. [...] Conversely, a single module may be involved in several different phases of expression. [...] (2) Promoters integrate multiple, diverse inputs and produce a single, scalar output: the rate of transcriptional initiation. A familiar analogy is a neuron, which receives input from many sources but whose output is simply how often it fires. In many promoters, signal integration happens at the basal promoter, through specific interactions between bound transcription factors and components of the RNA polymerase II enzyme complex ([ref's]). In some promoters, however, a distinct module may integrate signals from other modules. (Section 3.5.6)
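The picture in that quote, Boolean and scalar modules feeding a promoter that produces one scalar output, can be sketched in Python as follows. This is a toy under stated assumptions: effects are taken to be strictly additive, silencers are modeled as negative weights, and every name, weight, and threshold is hypothetical.

```python
def boolean_module(tf_conc, threshold):
    """Off/on module: fully on once its TF passes a threshold."""
    return 1.0 if tf_conc >= threshold else 0.0


def scalar_module(tf_conc, kd):
    """Quantitative module: output rises smoothly with TF concentration."""
    return tf_conc / (tf_conc + kd)


def promoter_output(module_outputs, weights, max_rate=1.0):
    """Additive integration of module outputs into one transcription rate.

    Positive weights act like enhancers, negative weights like silencers.
    The result is clamped to the range 0..max_rate.
    """
    total = sum(w * m for w, m in zip(weights, module_outputs))
    return max(0.0, min(max_rate, total))


# Hypothetical nuclear concentrations of three TF's.
tf_levels = {"A": 2.0, "B": 0.5, "C": 0.1}

modules = [
    boolean_module(tf_levels["A"], threshold=1.0),  # enhancer, on/off
    scalar_module(tf_levels["B"], kd=0.5),          # enhancer, graded
    scalar_module(tf_levels["C"], kd=0.5),          # silencer, graded
]
rate = promoter_output(modules, weights=[0.4, 0.6, -0.5])
print(f"transcription rate: {rate:.3f}")
```

Many inputs go in and a single number comes out, much like the neuron in the quoted analogy.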

The interconnections between TF concentration and transcription rates have a fairly good analog in transistor networks. There has to be a separate resistor corresponding to each combination of TF and binding site, as well as the complex interaction logic described by Wray, et al above. Adding complexity, some binding sites can affect multiple transcriptions (promoters) (Wray, et al section 3.3.8). In addition, the competition for TF's among binding sites may not have an easy analog in transistor networks, but it certainly adds complexity (and potentially intelligence) to the calculation performed.

The overall interconnections not only form a very complex network, but one with a great deal of feedback, leading to a number of potential meta-stable or stable states. In this, it is similar to many electronic analog circuits.
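As a small illustration of how feedback produces more than one stable state, here is a Python sketch of a single gene whose product activates its own transcription. The model (basal production plus a cooperative self-activation term, minus first-order decay) and every parameter value are assumptions chosen to show the effect, not measurements from any real gene.

```python
def production(x, basal=0.05, vmax=1.0, k=0.5, n=4):
    """Production rate of a self-activating gene product at level x."""
    return basal + vmax * x**n / (k**n + x**n)


def simulate(x0, decay=1.0, dt=0.01, steps=5000):
    """Integrate dx/dt = production(x) - decay * x with simple Euler steps."""
    x = x0
    for _ in range(steps):
        x += dt * (production(x) - decay * x)
    return x


# Two starting levels on either side of the tipping point.
for start in (0.3, 0.6):
    print(f"start at {start} -> settles near {simulate(start):.3f}")
```

The same gene, with the same parameters, ends up either nearly off or nearly fully on depending only on where it started. That memory of the starting state is what the feedback adds.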

Even if none of the TF's and associated enzymes participated in the network of phospho-activating enzymes described in Part I, the system would still make a good analogy to an electronic analog computer, and a very complex one at that. (Note that I'm using the word analog in a different meaning from analogy here.) To quote Wray, et al:
[... B]ecause transcription factors can influence the expression of other transcription factors, a transcriptional activator can repress other genes through the intermediate step of activating a repressor, or vice versa ([ref's]). [emphasis original]
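The indirect repression in that quote can be sketched as a two-step cascade in Python. It assumes steady-state levels and simple saturation curves for both steps; the function names and constants are hypothetical.

```python
def activation(tf_conc, kd):
    """Output that rises with TF concentration (0 to 1)."""
    return tf_conc / (tf_conc + kd)


def repression(tf_conc, kd):
    """Output that falls with TF concentration (1 to 0)."""
    return kd / (tf_conc + kd)


def target_rate(activator, kd_act=1.0, kd_rep=0.2):
    """An activator turns on a repressor, which turns down the target gene."""
    repressor_level = activation(activator, kd_act)
    return repression(repressor_level, kd_rep)


for activator in (0.0, 1.0, 10.0):
    print(f"activator {activator:5.1f} -> target rate {target_rate(activator):.3f}")
```

Although the first factor is an activator everywhere it binds, raising its level lowers the target's transcription rate.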

If we trace all the interconnections, the network also turns out to be very highly interconnected:
Even using conservative criteria for recognizing interactions, these analyses indicate that most transcription factors directly regulate a few percent of the genes in the Saccharomyces genome. Genetic networks are therefore highly connected, with each node that is represented by a transcription factor linked to many other nodes.

It seems likely that many of the interconnections usually remain "dark" (at a specific time in a specific cell): either the TF (or an associated factor) is not present or some other essential factor for expression is missing for most of the promoters a present TF can affect.

When we include the potential for enzyme activation by phosphorylation or other methods, the result is to hook the enzyme system described in Part I to the gene activation system. The common input to the gene system from the enzyme system is well documented (Wray, et al):
Post-translational modifications, most commonly phosphorylation, can also modulate binding specificity. Several enzymes, including the MAP and Janus kinases, fine-tune the phosphorylation state of transcription factors, exerting a significant influence on overall transcription patterns ([ref's]).
The activity of many transcription factors depends on post-translational covalent modifications, most commonly phosphorylation ([ref's]), acetylation ([ref's]), and glycosylation ([ref's]). These modifications often provide an important point of control over transcription, and phosphorylation in particular is often dynamically regulated ([ref's]). [my emphasis]

In addition to simple control over TF concentration and activation, a variety of other methods exist for the enzyme system to control gene transcription. For instance, key/lock interactions between proteins participate when TF's interact with one another, co-factors, components of the basal transcriptional machinery, etc. All these interactions can be mediated or modified through phosphorylation or other sorts of activation.

Another way transcription can be modified (by the enzyme system) is through nuclear localization. The nucleoplasm is separated from the cytoplasm by the nuclear envelope, a double-membrane system pierced by a number of nuclear pores, which allow small molecules (including many enzymes) to pass freely between the nucleoplasm and the cytoplasm, but block larger molecules such as large proteins and especially polymers. (A more detailed, and peer-reviewed, discussion of the nuclear pore may be found in The Nuclear Pore Complex as a Transport Machine by Michael P. Rout and John D. Aitchison.)

Figure 3 (from Daniel Stoffler's Nuclear Pore Complex Project)

By modifying the interaction of a TF or associated factor with the transport machinery of the nuclear pore, the enzyme system can affect transcription without either creating or destroying TF's.

Once transcription is complete, there are further controls that allow the network of enzymes and genes to control itself. I'm not going to list all of them, but one that cannot be left out involves alternative splicing of RNA. This is a process in which the transcribed RNA is edited, usually removing a number of sections called introns, while creating the final mRNA. Which introns are removed, and sometimes where their boundaries are set, determines the sequence of the final mRNA, and thus the character of the protein. Alternative splicing can be affected by enzymes, but also by non-coding RNA's of various types and provenance, including those snipped out of the introns of other genes. A description of some of the ways in which alternative splicing can be regulated may be found in Regulation of Alternative Splicing: More than Just the ABC by Amy E. House and Kristen W. Lynch. It may turn out that the number of data connections that affect protein (enzyme) concentration via regulation of alternative splicing is greater than the number via transcription regulation.

We've seen that, potentially, the cell has an enormous number of analog interconnections that could be built into an analog computer. However, most of the actual systems of activation and gene expression that have been studied are digital: expression or activation is normally bistable, with a rapid switch from one state to the other, often both ways. (Not always, however. Some such systems are like a trigger that reacts very fast but takes somewhat longer to reset.) Given the large potential number of interacting enzymes and genes, a few hundred given over to digital effects hardly subtracts from the potential for building an analog computer (a "brain") from the others. In addition, many of these digital systems have analog aspects.

The cell exists in an analog world. Concentrations of nutrients, toxins, and threat indicators are present in varying concentrations, often gradients that must be sampled for proper reaction. Here, the analog signal is converted into whatever digital signals are necessary to initiate reaction and impose decisions. Note, however, that much of the control of that reaction is analog: how fast, how far to turn, etc.

Even in a completely digital response, such as that of the nerve synapse to the arrival of a depolarization wave, there is often an amplification effect that is analog in its nature, just as the state switch of an electronic logic gate or flip-flop is analog while taking place.

Also, the process of unwinding DNA from the nucleosome can probably be considered digital: either the DNA is off the nucleosome and available for transcription, or it isn't.

I found several papers addressing mixed analog/digital systems:

Positive feedback in eukaryotic gene networks: cell differentiation by graded to binary response conversion by Attila Becskei, Bertrand Séraphin and Luis Serrano

Switch-like genes populate cell communication pathways and are enriched for extracellular proteins by Adam Ertel, Aydin Tozeren

Transcriptional Autoregulatory Loops Are Highly Conserved in Vertebrate Evolution by Szymon M. Kiełbasa, Martin Vingron

Functional characteristics of a double positive feedback loop coupled with autorepression by Subhasis Banerjee and Indrani Bose

Why so many of the cell's information systems are digital, when analog systems are potentially so much smarter, is mostly a matter of speed and energy costs. But that's a subject for another post.

Next: How Smart is the Cell? Part III: Programming, Power, and Speed

Links (I found these articles while researching this post and make no warranty regarding their applicability or agreement with my position.)

Functional Phosphorylation Sites in the C-Terminal Region of the Multivalent Multifunctional Transcriptional Factor CTCF


Unraveling transcription regulatory networks by protein–DNA and protein–protein interaction mapping

The Evolution of Transcriptional Regulation in Eukaryotes

Energy-dependent fitness: A quantitative model for the evolution of yeast transcription factor binding sites

Specificity and robustness in transcription control networks

Cooperation between complexes that regulate chromatin structure and transcription

Transcriptional Regulation by the Numbers 1: Models

Single molecule analysis of RNA polymerase elongation reveals uniform kinetic behavior

Using noise to probe and characterize gene circuits

Real-Time Kinetics of Gene Activity in Individual Bacteria

Nature, Nurture, or Chance: Stochastic Gene Expression and Its Consequences

