Tuesday, April 28, 2009


My next post on cell smarts is taking longer than I'd like, so I thought I'd post some of the links I found researching it: anything about mitochondria:

Mitochondrial-Nuclear Communications

Perinuclear, perigranular and sub-plasmalemmal mitochondria have distinct functions in the regulation of cellular calcium transport

Emerging functions of mammalian mitochondrial fusion and fission

Heterologous mitochondrial DNA recombination in human cells

Organization and dynamics of human mitochondrial DNA

Mitochondrial Fusion in Human Cells Is Efficient, Requires the Inner Membrane Potential, and Is Mediated by Mitofusins

Mitochondrial Regulation of Intracellular Ca2+ Signaling: More Than Just Simple Ca2+ Buffers

Mitochondria: The Hub of Cellular Ca2+ Signaling

H-ras, K-ras, and inner plasma membrane raft proteins operate in nanoclusters with differential dependence on the actin cytoskeleton

Dynamic molecular confinement in the plasma membrane by microdomains and the cytoskeleton meshwork

Mitochondria: More Than Just a Powerhouse

Mitochondrial Morphology and Dynamics in Yeast and Multicellular Eukaryotes

Mitochondrial fusion and fission in mammals

Mitochondria: Dynamic Organelles in Disease, Aging, and Development

Functions and dysfunctions of mitochondrial dynamics

Peroxisome Fission in Hansenula polymorpha Requires Mdv1 and Fis1, Two Proteins Also Involved in Mitochondrial Fission

Transcriptional Paradigms in Mammalian Mitochondrial Biogenesis and Function

Expression and Maintenance of Mitochondrial DNA

Read more!

Thursday, April 23, 2009

How Smart is the Cell? Part III: Programming, Power, and Speed

In parts I and II we observed that the enzyme and gene activation systems could be considered as analog computers, analogous to networks of transistors and resistors. In this part we'll consider these systems in the light of current use of computers, noting the contrasts and similarities.

Analogy and Analog

First, a word about analogy. The word "analogy" comes from the Greek "analogia" ("αναλογια"), which combines the word "logion" with the prefix "ana". "Ana" is a kind of "double negative", while "logia" is either a feminine singular noun referring to a "science" or body of knowledge/sayings, or a neuter plural referring to many sayings. Used by itself today, the word normally applies to "a supposed collection of sayings of Jesus", but such words as "archaelogia" (science/body of knowledge about history) retain much their original meaning. A classic Greek would probably understand such words as "geologia" "meteorologia", etc. (The relationship between the feminine and neuter nouns is hardly accidental, it is present in many Indo-European languages and reflected in the identify of case inflection. An example from Latin is "casum"/"casa": the former referring to "room" the latter "rooms"/"house".)

In Greek, then, "analogia" would refer to representing one science or body of knowledge with another. More specifically, the word came to be used for "proportionality". In English, we use the broader sense of some system of identifying a particular subject in terms of another subject through similarities or parallels.

Early creation of analog computers used scalar quantities such as voltage to represent real-world scalars, while early digital computing machines replicated manual processes using "arabic" numberals. As the Stanford Encyclopedia of Philosophy points out:
As the case of the architect's model makes plain, analog representation may be discrete in nature (there is no such thing as a fractional number of windows). Among computer scientists, the term ‘analog’ is sometimes used narrowly, to indicate representation of one continuously-valued quantity by another (e.g., speed by voltage). [my emphasis]
Similarly, from A Review of Analog Computing:
[... I]n a fundamental sense all computing is based on an analogy, that is, on a systematic relationship between the states and processes in the computer and those in the primary system. In a digital computer, the relationship is more abstract and complex than simple proportionality, but even so simple an analog computer as a slide rule goes beyond strict proportion (i.e., distance on the rule is proportional to the logarithm of the number). In both analog and digital computation—indeed in all computation—the relevant abstract mathematical structure of the problem is realized in the physical states and processes of the computer, but the realization may be more or less direct ([ref's]).

Therefore, despite the etymologies of the terms “analog” and “digital,” in modern usage the principal distinction between digital and analog computation is that the former operates on discrete representations in discrete steps, while the later operated on continuous representations by means of continuous processes ([ref's]). That is, the primary distinction resides in the topologies of the states and processes, and it would be more accurate to refer to discrete and continuous computation ([ref's]). [my emphasis]

We're going to use it in the narrow fashion.

Programming of Analog Computers

I've made analogies between the enzyme and gene activation systems, and analog computers created from networks of transistors and resistors, which are fairly "tight" in the sense that the voltage (or whatever, depending on specifics of design) is analogous to the concentration of enzyme or other factor. However, when we talk about "programming" analog computers, that draws an analogy to digital computers that is much looser.

What we know of as "programming" is pretty much a digital phenomenon. Let's back up to much simpler systems begin with: consider a digital system that samples its input (as discrete phenomena) and calculates an output, also discretely represented. This may be analogous to a network of transistors and resistors: each performs a pre-designed function and nothing else. Now expand the digital system to execute a "program" entered by setting switches on a board. These switches represent, in machine language, the steps the digital computer must follow to perform its activity. We may consider this analogous to a network of transistors and resistors with some potentiometers. twisting the knobs on the analog computer is equivalent to setting switches on the digital.

Now, digital computers will normally have memory of some type, registers to hold intermediate results, etc. But suppose some of this memory is used for its program: the program may be loaded into it just as switches are set manually, but the program itself may be data created through calculations. Finding an analogy to that with analog computers is much harder.

Instead, "programming" for analog computers normally refer(ed) to wiring.

Figure 1: Hitachi 240: the patch panel above where "programming" takes place, below are the knobs which are used to enter input conditions, etc. (image from The HITACHI 240 Analog Computer by VAXMAN.)

There isn't really any analogy here with the enzyme system. If we regard the presence (or absence) of binding regions that react to specific transcription factors as analogous to patch cords in the panel of Figure 1, we might say that the total sequence of non-coding DNA "programs" the gene activation system, but even here the analogy is loose because there are many possible sequences for any binding region, with different "strengths" of affinity for TF's. Instead, we will consider the non-coding sequences (and such protein-coding sequences as perform dual functions) as "programming" but not try to make analogies with the patch cord system. Thus, we have no good electronic analogy for "programming" in the gene activation system.

In a general sense, the interactions between enzymes and DNA or other enzymes are "built in". This is because they have a "key/lock" interaction, which has no relationship to the amino-acid sequence that can be manipulated. Changes to the sequence are essentially random "shots in the dark" with results that have to be evaluated by Darwinian selection. There may be a partial exception regarding some DNA recognition regions where there is a predictable relationship between amino acid sequence and DNA base sequence, but we won't go into this here.

To discover something analogous to "programming" for the enzyme activation system, we have to take a closer look at the way catalysts work. This will also bring in the other elements of our title: power and speed.

A Word about Catalysis

When the catalyst amount is small, the reaction rate increases about linearly with substrate concentration. For our purposes, both the catalyst and the substrate are enzymes present in very small amounts. Thus the rate will almost always fall into region A in Figure 2:

Figure 2: Plot of substrate concentration vs. reaction rate (from the Medical Biochemistry Page).

Similarly, the rate will vary about linearly with catalyst amount.

Note that we usually aren't talking here about a reaction at chemical equilibrium. The energy of both phosphorylation and dephosphorylation is such that each reaction will proceed until the equilibrium ratio is in the neighborhood of 10,000:1. Thus the ratio of phosphorylated target enzyme to unphosphorylated will depend primarily on the relative rates, which in turn depend on the concentrations of kinases and phosphatases.

Enzymes Making a Basic Calculation

Let's take a simple example. We'll look at an enzyme we'll call Es, our substrate enzyme. (We'll assume linear responses here, which is usually a good approximation, although not exact.) Its output will be the absolute concentration of EsP, the phospho-activated form. The dephosphorylized form will be called Es0 (the zero for no phosphate). For simplicity's sake, we'll have two inputs, treated as absolute concentrations: Ek, the kinase that activates our substrate, and Ep, the phosphatase that deactivates it.

Start with 50% phosphorylation:

1×Ek + 1×Ep = 50% EsP, 50% Es0

That is, the amounts of kinase and phosphatase are balanced so the amounts of active and inactive substrate are equal. Now, if we double the amount of kinase, we double the ratio of active to inactive substrate:

2×Ek + 1×Ep = 67% EsP, 33% Es0

If we double the amount of phosphatase as well, we return the ratio to equal:

2×Ek + 2×Ep = 50% EsP, 50% Es0

Thus, in this simple case, we see that the absolute concentrations of the input enzymes don't matter, just the relative concentrations among the inputs.

Turning the Enzyme Knob

What happens if we go back to the beginning then add a whole bunch of new substrate, from a newly expressed gene?

*1×Ek + 1×Ep = 25% EsP, 75% Es0

I've put an asteroid beside the line, because it's out of balance: we just doubled the total amount EsP+Es0 by adding as much Es0 as the previous total. Since the rate of each reaction depends on both the concentration of its enzyme (unchanged) and that of its substrate (tripled for Es0), the rate of activation will be tripled compared to the rate of deactivation, and the ratios will quickly return to equal:

*1×Ek + 1×Ep = 25% EsP, 75% Es0
=== quickly ===>
1×Ek + 1×Ep = 50% EsP, 50% Es0

As you can see, the fraction of substrate that's activated remains constant for constant inputs (subject to settling time), however the absolute concentration has just doubled! To make it clearer, let's do it with absolute concentrations of Es: We start with 10 units of Es, divided equally:

1×Ek + 1×Ep = 5×EsP, 5×Es0

We add 10 more units of Es0:

*1×Ek + 1×Ep = 5×EsP, 15×Es0
=== quickly ===>
1×Ek + 1×Ep = 10×EsP, 10×Es0

As you can see increasing the total amount of our substrate enzyme is analogous to "turning up the gain" on an electronic amplifier", or twisting the knob on the potentiometer in our transistor network. The same input creates a stronger output.

In summary, then, the enzyme activation system can be "programmed" by the gene system, by increasing (or decreasing) the rate of gene expression for any specific substrate enzyme.

Programming of the Gene Activation System

What about the "programming" of the gene system I mentioned earlier? Well, the gene activation system is run primarily by the binding sites: their location and specific binding energies for each transcription factor. Since this is the result of mutation, as filtered through Darwinian selection, we can say that evolution has programmed the gene activation system.

Power and Speed

We saw, above, that increasing all the inputs by the same amount would leave the output the same (to a first approximation), while increasing the total substrate would increase the output (roughly) proportionally. So, in principle, couldn't we increase the concentration of all the enzymes in the system and have the same result?

Yes (sort of), but...

There are several factors involved here besides the actual answer to a calculation. Note that each time a molecule of substrate is activated, one molecule of ATP is hydrolyzed to ADP. Since the rates of activation and deactivation are equal (subject to a settling time labeled "quickly" in the above equations), one molecule of EsP is also hydrolyzed to Es0 + Pi (inorganic phosphate). If we double the total amount of substrate (Es), we double the amount of ATP being "burned" by the calculation.

So wouldn't it be better to keep the concentrations as low as possible? Yes, within limits. Let's go back to our original example:

1×Ek + 1×Ep = 50% EsP, 50% Es0

and reduce the inputs to 1/100th their previous:

0.01×Ek + 0.01×Ep = 50% EsP, 50% Es0

It looks like the same calculation, but what happens if we double the amount of kinase:

*0.02×Ek + 0.01×Ep = 50% EsP, 50% Es0
=== slowly ===>
0.02×Ek + 0.01×Ep = 67% EsP, 33% Es0

The settling time has been increased by 100, the speed reduced by the same factor. Thus, you can have smart, or you can have fast, but to have both you've got to spend a lot of energy. Another problem is random variation in quantities: Gene expression is roughly continuous, but if you reduce the concentrations enough you get enough noise in the system to be a problem. (Nature, Nurture, or Chance: Stochastic Gene Expression and Its Consequences by Arjun Raj and Alexander van Oudenaarden provides a recent review of noise in gene expression.)

What's needed, then, is a balance: enough to keep noise from interfering with the calculations, but not enough to burn more ATP than necessary.

For real speed, however, a digital approach can save a lot of energy. Here, the total concentrations of many enzymes in the system are high, but the activated forms of many are kept at very low relative concentrations. The system stays in a stable state until some external event triggers a positive feedback process that very quickly switches state. Once the new state has been achieved, either all the activated enzymes have either been reduced to very low concentrations, or their substrates have.

Is there trade-off in the gene activation system corresponding to the one described above for enzymes? Yes. The cost of transcribing RNA is two ATP's per base unit, as is the cost of adding each amino acid residue to a protein. (This is made clear in any biochemistry textbook, but I haven't found a good on-line reference for it.) The level of any protein can be maintained through continuous creation, overcoming the garbage disposal system, with a relatively low cost. However, when a gene has to be expressed at large levels, there has to be a dedicated disposal system for its protein. This can be either continuously maintained, or driven by part of the dynamic control system. Maintaining large amounts of transcription factors and high expression rates would also be expensive for the cell, and generally only happen for a few at a time, if that.

Are the concentrations of enzymes the same throughout the cell? Not always. In Part II I mentioned nuclear localization of transcription factors, but there's much more than that, the subject of the next post in this series.

Next How Smart is the Cell? Part IV: Local Intelligence

Links: (These include all the papers I used in creating this article, including the basics for many calculations not included here. I made them only to assure myself that I wasn't talking out my hat.)

The Modern History of Computing from Stanford Encyclopedia of Philosophy

A great disappearing act: the electronic analogue computer by Chris Bissell

A Review of Analog Computing Technical Report UT-CS-07-6 by Bruce J. MacLennan

Biochemical Energetics from Biochemistry of Metabolism

Enzyme Kinetics from the Medical Biochemistry Page

The Computational Versatility of Proteomic Signaling Networks

Sniffers, buzzers, toggles and blinkers: dynamics of regulatory and signaling pathways in the cell

Quantitative analysis of signaling networks

Coupled positive and negative feedback circuits form an essential building block of cellular signaling pathways

The Contents of Adenine Nucleotides, Phosphagens and some Glycolytic Intermediates in Resting Muscles from Vertebrates and Invertebrates by ISIDOROS BEIS and ERIC A. NEWSHOLME.

Roles of the creatine kinase system and myoglobin in maintaining energetic state in the working heart by Fan Wu and Daniel A Beard. Read more!

Monday, April 20, 2009

How Smart is the Cell? Part II: The Gene Activation network as an Analog Computer

In Part I we examined the enzyme activation network, observing the analogy to an analog computer. Now we're going to examine who DNA fits into this picture. Most of us are familiar with the protein synthesis process, which is essential to create the unactivated enzymes described in part one, as well as the other enzymes and proteins within the cell.

The sequence that will finally code for the protein is carried in the DNA. It is transcribed, creating transcript RNA which is edited (processed) to create messenger RNA (mRNA). Messenger RNA is translated to a protein, which is then (sometimes) subject to further processing. This protein may be one of the enzymes participating in an activation network. Every step in the process is subject to regulatory control. Even the disposal of mRNA is sometimes subject to regulation by various enzymes, modifying the level of translation.

We're going to focus on transcription initiation here, although the other steps can add complexity (and, potentially intelligence) to the overall interaction network.

Transcription Initiation for (most) protein-coding genes is very different between prokaryotes and Eukaryotes. I'm going to focus on Eukaryotes here. There are three types of RNA Polymerase used by Eukaryotes, but the only one that really matters for our purposes is RNA polymerase II, as the others normally only transcribe non-translated RNA such as ribozymes and nuclear RNA.

The fundamental key to control of transcription is the ability of proteins to fit like a key in a lock into openings in the DNA strand. Most of these openings are in the wide groove of the DNA helix, although there are exceptions. (We should note that RNA doesn't have a wide groove, as it cannot form the coiling configuration normally used by DNA.) In addition to a key/lock relationship with DNA, these proteins (enzymes) also have key/lock relationships with one another, allowing interactions among them.

The classic description of Transcriptional regulation involves a laundry list of elements such as promoters, enhancers, silencers, insulators, or locus control regions (LCR). (A good description based on these terms is given in Transcriptional Regulatory Elements in the Human Genome.)

I'm going to use a slightly newer version here, based on the modular nature of most of the elements involved. (I will be working primarily from The Evolution of Transcriptional Regulation in Eukaryotes by Gregory A. Wray, Matthew W. Hahn, Ehab Abouheif, James P. Balhoff, Margaret Pizer, Matthew V. Rockman and Laura A. Romano.) Basically, there is a promoter that interacts with a number of modules, which variously act as enhancers, silencers, insulators, locus control regions, etc. These modules, in turn, are made up of one or more Transcription Factor Binding Sites (TFBS or "binding site") that interact with enzymes (or, perhaps ribozymes) called Transcription Factors (TF). Figure 1 gives a picture of the promoter region and its surrounding control regions.

Figure 1 (from The Evolution of Transcriptional Regulation in Eukaryotes ). Click to see the original caption.

The binding sites that make up a module can interact with more than one TF, and when two or more binding sites are too near to or overlap one another, the binding of one TF can interfere with that of another. Moreover, any TF that binds to DNA then may perform a number of other functions, such as binding to enzymes such as co-factors, looping factors, or the chromatin remodeling complex. The action taken may depend on what other factors are present, so that a TF that acts to enhance transcription in the presence of the right co-factor could act to repress it when the co-factor was absent and it interfered with the binding of another enhancing TF.

Figure 2 (from The Evolution of Transcriptional Regulation in Eukaryotes).
Context dependence of binding site activity (the arrow at the promoter represents successful transcription initiation): (A) The binding site successfully affects the promoter. (B) The TF is absent (C) Local Chromatin is condensed (whether or not the TF is present). (D) An adjacent site is occupied, masking the binding site. (E) TF is present but in an inactive form (e.g. not phospho-activated). (F) A different TF has a higher affinity for the binding site. (G) Here two TF's must be present to affect transcription. (H) One of the TF's is absent. (I) Another co-factor interacts with the TF with greater affinity than the other TF. (J) A different cofactor (TF) has greater affinity for the other binding site.
Click to see the original caption.

Figure 2 illustrates some ways that a TF's interaction with a binding site can be influenced by context. This context includes the presence/absence of TF's and other factors that are ultimately produced by DNA coding elsewhere (usually) from the coding region of the gene being regulated.

A TF can interact with many binding sites, and its activity with each site will be independent of the others, except that when it's present in relatively small amounts, there will be competition among sites for TF activity.

The interaction between a TF and its binding site depends on the specific sequence of DNA in the binding site. However, experiments with TF binding have shown that there are many sequences that will bind any particular TF, usually all very similar. By comparing these sequences, it's usually possible to find a consensus sequence that is very similar to all of them.

The consensus sequence will generally have the highest binding energy, that is it will stick tightest to the TF. However, other similar sequences may be able to bind to the TF, although with different behavior. A few example consensus sequences are found in table 2 from The Evolution of Transcriptional Regulation in Eukaryotes. This means that when there are multiple binding sites with slightly different sequences, they will have different binding energies, and the activity will be different for the same concentration of TF. Note also that there can be multiple TF's with similar (but non-identical) consensus sequences, so that different binding sites may bind to different TF's depending on the relative concentrations of the TF's.

The affect of each TF concentration on transcription rate will be generally analog. Although a high enough concentration will saturate any particular binding site, producing full-bore transcription (assuming it's an enhancer), lower concentrations will cause each TF/binding site activity to perform an analog calculation.

Each of the points mentioned above adds to the complexity of the analog calculation performed by a particular module, as well as the overall promoter. The inputs are the concentrations of various TF's (and other factors) in the nucleus, the output is the transcription rate at the promoter. Indeed, as Wray, et al describe it
Two aspects of promoter function are reminiscent of analog logic circuits ([ref's]). (1) Individual modules can function as Boolean (off/on) or scalar (quantitative) elements whose interactions have predictable, additive effects on transcription. Multiple modules are sometimes required to produce a single phase of expression. [...] Conversely, a single module may be involved in several different phases of expression. [...] (2) Promoters integrate multiple, diverse inputs and produce a single, scalar output: the rate of transcriptional initiation. A familiar analogy is a neuron, which receives input from many sources but whose output is simply how often it fires. In many promoters, signal integration happens at the basal promoter, through specific interactions between bound transcription factors and components of the RNA polymerase II enzyme complex ([ref's]). In some promoters, however, a distinct module may integrate signals from other modules. (Section 3.5.6)

The interconnections between TF concentration and transcription rates have a fairly good analog in transistor networks. There has to be a separate resistor corresponding to each combination of TF and binding site, as well as the complex interaction logic described by Wray, et al above. Adding complexity, some binding sites can affect multiple transcriptions (promoters) (Wray, et al section 3.3.8). In addition, the competition for TF's among binding sites may not have an easy analog in transistor networks, but it certainly adds complexity (and potentially intelligence) to the calculation performed.

The overall interconnections not only form a very complex network, but one with a great deal of feedback, leading to a number of potential meta-stable or stable states. In this, it is similar to many electronic analog circuits.

Even if none of the TRF's and associated enzymes participated in the network of phospho-activating enzymes described in Part I, the system still makes a good analogy to an electronic analog computer, and a very complex one at that. (Note that I'm using the word analog in a different meaning from analogy here.) To quote Wray, et al:
[... B]ecause transcription factors can influence the expression of other transcription factors, a transcriptional activator can repress other genes through the intermediate step of activating a repressor, or vice versa ([ref's]). [emphasis original]

If we trace all the interconnections, it is also very highly interconnected:
Even using conservative criteria for recognizing interactions, these analyses indicate that most transcription factors directly regulate a few percent of the genes in the Saccharomyces genome. Genetic networks are therefore highly connected, with each node that is represented by a transcription factor linked to many other nodes.

It seems likely that many of the interconnections usually remain "dark" (at a specific time in a specific cell): either the TF (or an associated factor) is not present or some other essential factor for expression is missing for most of the promoters a present TF can affect.

When we include the potential for enzyme activation by phosphorylation or other methods, the result is to hook the enzyme system described in Part I to the gene activation system. The common input to the gene system from the enzyme system is well documented (Wray, et al):
Post-translational modifications, most commonly phosphorylation, can also modulate binding specificity. Several enzymes, including the MAP and Janus kinases, fine-tune the phosphorylation state of transcription factors, exerting a significant influence on overall transcription patterns ([ref's]).
The activity of many transcription factors depends on post-translational covalent modifications, most commonly phosphorylation ([ref's]), acetylation ([ref's]), and glycosylation ([ref's]). These modifications often provide an important point of control over transcription, and phosphorylation in particular is often dynamically regulated ([ref's]). [my emphasis]

In addition to simple control over TF concentration and activation, a variety of other methods exist for the enzyme system to control gene transcription. For instance, key/lock interactions between proteins participate when TF's interact with one another, co-factors, components of the basal transcriptional machinery, etc. All these interactions can be mediated or modified through phosphorylation or other sorts of activation.

Another way transcription can be modified (by the enzyme system) is through nuclear localization. The nucleoplasm is separated from the cytoplasm by the nuclear envelope, a double-membrane system pierced by a number of nuclear pores, which allow small molecules (including many enzymes) to pass freely between the nucleoplasm and the cytoplasm, but block larger molecules such as large proteins and especially polymers. (A more detailed, and peer-reviewed, discussion of the nuclear pore may be found in The Nuclear Pore Complex as a Transport Machine by Michael P. RoutDagger and John D. Aitchison.)

Figure 3 (from Daniel Stoffler's Nuclear Pore Complex Project)

By modifying the interaction of a TF or associated factor with the transport machinery of the nuclear pore, the enzyme system can affect transcription without either creating or destroying TF's.

Once transcription is complete, there are further controls that allow the network of enzymes and genes to control itself. I'm not going to list all of them, but one that cannot be left out involves alternative splicing of RNA. This is a process in which the transcribed RNA is edited, usually removing a number of sections called introns, while creating the final mRNA. Which introns are removed, and sometimes where their boundaries are set, determines the sequence of the final mRNA, and thus the character of the protein. Alternative splicing can be affected by enzymes, but also by non-coding RNA's of various types and provenance, including those snipped out of the introns of other genes. A description of some of the ways in which alternative splicing can be regulated may be found in Regulation of Alternative Splicing: More than Just the ABC by Amy E. House and Kristen W. Lynch. It may turn out that the number of data connections that affect protein (enzyme) concentration via regulation of internal splicing is greater than the number via transcription regulation.

We've seen that, potentially, the cell has an enormous number of analog interconnections that could be built into an analog computer. However, most of the actual systems of activation and gene expression that have been studied are digital: expression or activation is normally bistable, with a rapid switch from one state to the other, often both ways. (Not always, however. Some such systems are like a trigger that reacts very fast but takes somewhat longer to reset.) Given the large potential number of interacting enzymes and genes, a few hundred given over to digital effects hardly subtracts from the potential for building an analog (brain) from the others. In addition, many of these digital systems have analog aspects.

The cell exists in an analog world. Concentrations of nutrients, toxins, and threat indicators are present in varying concentrations, often gradients that must be sampled for proper reaction. Here, the analog signal is converted into whatever digital signals are necessary to initiate reaction and impose decisions. Note, however, that much of the control of that reaction is analog: how fast, how far to turn, etc.

Even in a completely digital response, such as that of the nerve synapse to the arrival of a depolarization wave, there is often an amplification effect that is analog in its nature, just as the state switch of an electronic logic gate or flip-flop is analog while taking place.

Also, the process of unwinding chromatin from the Nucleosome, can probably be considered digital: either the DNA is off the nucleosome available for transcription, or it isn't.

I found several papers addressing mixed and analog/digital systems:

Positive feedback in eukaryotic gene networks: cell differentiation by graded to binary response conversion by Attila Becskei, Bertrand Séraphin and Luis Serrano

Switch-like genes populate cell communication pathways and are enriched for extracellular proteins by Adam Ertel, Aydin Tozeren

Transcriptional Autoregulatory Loops Are Highly Conserved in Vertebrate Evolution by Szymon M. Kiełbasa, Martin Vingron

Functional characteristics of a double positive feedback loop coupled with autorepression by Subhasis Banerjee and Indrani Bose

The question of why so many of the cell's information systems are digital when analog systems are potentially so much smarter is mostly a matter of speed and energy costs. But that's a subject for another post.

Next: How Smart is the Cell? Part III: Programming, Power, and Speed

Links (I found these articles while researching this post and make no warranty regarding their applicability or agreement with my position.)

Functional Phosphorylation Sites in the C-Terminal Region of the Multivalent Multifunctional Transcriptional Factor CTCF


Unraveling transcription regulatory networks by protein–DNA and protein–protein interaction mapping

The Evolution of Transcriptional Regulation in Eukaryotes

Energy-dependent fitness: A quantitative model for the evolution of yeast transcription factor binding sites

Specificity and robustness in transcription
control networks

Cooperation between complexes that regulate chromatin structure and transcription

Transcriptional Regulation by the Numbers 1: Models

Single molecule analysis of RNA polymerase elongation reveals uniform kinetic behavior

Using noise to probe and characterize gene circuits

Real-Time Kinetics of Gene Activity in Individual Bacteria

Nature, Nurture, or Chance: Stochastic Gene Expression and Its Consequences Read more!

How Smart is the Cell? Part I: Enzymes as an Analog Computer.

Just how smart is the cell? That is, what capacity does a typical cell have to react "intelligently" to outside or internal stimuli? How much memory does it potentially have?

We'll start with a relatively simple system, phospho-activating enzymes. These are enzymes (proteins with catalytic ability) that work differently depending on whether one particular amino acid residue has had a phosphate group attached to it. The process is called Reversible phosphorylation, described at the KinasePhos site:

Figure 1 (from KinasePhos)

Protein phosphorylation, performed by a group of enzymes known as kinases and phosphotransferases (Enzyme Commission classification 2.7), is a post-translational modification essential to correct functioning within the cell. The post-translational modification of proteins by phosphorylation is the most abundant type of cellular regulation. It affects a multitude of cellular signal pathways, including metabolism, growth, differentiation and membrane transport. The enzymes must be sufficiently specific and act only on a defined subset of cellular targets to ensure signal fidelity. Proteins can be phosphorylated at serine, threonine and tyrosine residues.

Catalysis is defined as "the process in which the rate of a chemical reaction is either increased or decreased by means of a chemical substance known as a catalyst." In this case we'll assume it is increased. However, the reaction itself will only take place if it's energetically favorable. The rate at which it takes place depends on the relative concentrations of substrates and products, specifically how far they are from the equilibrium position. It also depends on the quantity of catalyst.

For Phosphorylation, referring to Figure 1 above, the process usually involves splitting a phosphate off of ATP (Adenosine triphosphate), the primary source of energy within the cell (it doesn't have to be ATP, GTP and other nucleotide triphosphates can also be used), and attaching it to the Hydroxyl group of an amino acid residue. This is very favorable energetically, at least until almost all of the particular target protein has been phosphorylated. Any enzyme that performs this function is called a "kinase", in this case a "protein kinase".

Similarly, for dephosphorylation, the phosphate is removed and added to the pool of inorganic phosphate. This reaction is usually not quite as favorable energetically as phosphorylation, but enough so that only a few thousandths or less of the target is left phosphorylated at equilibrium.

Now, let's start with an enzyme that only works when it's phosphorylated. We'll assume for simplicity's sake that there is only one site where this can occur. The enzyme will be activated by a kinase (another enzyme), and deactivated by a phosphatase. Or, it can be activated by several (or many) kinases, and deactivated by several (or many) phosphatases. Likewise, any specific kinase or phosphatase can act on more than one target protein (enzyme or otherwise). But what does that enzyme do? In our case, it also acts as a kinase or a phosphatase on other enzymes. Thus, enzyme A can activate enzyme B, enzyme B can activate enzyme C, and so on. At the same time, other enzymes can be de-activating them.

Remember that the rate of any catalytic reaction depends on the quantity of catalyst, as well as the relative quantities of substrate and product. The equilibrium concentration of activated (Phosphorylated) enzyme will depend on the rates of phosphorylation and dephosphorylation of all the kinases and phosphatases that affect it. This means that a network of various interacting kinases and phosphatases can drive the quantities of active enzymes in a very complex fashion. We could make an analogy with transistors: each activatable enzyme in the network corresponds to a transistor. The quantity of activated enzyme corresponds to the voltage at the output. The combination of kinase and phosphatase activity targeting it corresponds to the voltage at the input. If we hook transistors together with resistors, the specificity of kinase or phosphatase activity of any enzyme A targeting enzyme B corresponds to a resistor linking transistor A with B.

The analogy isn't exact, but in general, a cell whose cytoplasm contains 100 activatable kinase/phosphatase enzymes hooked into a network would have roughly the same computing ability as a network of 100 transistors. Consider that most small applications require only a handful of transistors, and that the number of known genes varies from 1-5,000 even in bacteria (with humans having 20-30,000). This means that even bacteria could have a network of several hundred enzymes (each coded by a gene) working together as a "brain" to control its activity. Eukaryotes usually have many more genes (5,000 an effective minimum). Not only that, but many genes in Eukaryotes have multiple transcription types, which can produce many enzymes from one gene. Given that different transcriptions often vary through inclusion or exclusion of a functional unit, this means that a single gene can "mix and match" phosphorylation sites for activation with active sites for performing activation or deactivation on other enzymes. It would be easy for even the simplest Eukaryote to have a network of several thousand different enzymes acting as its "chemical brain". Just to add more complexity, many enzymes have multiple sites of phosphorylation, often with different effects on activity.

While phospho-activation is probably the most common form of enzyme activation, there are other types of activation. Most of them work pretty much the way phospho-activation does, for our purposes.

An important thing to remember about analog computers is that each node (transistor, enzyme, etc.) can vary within a range. It can assume any value in that range, not just the "1" and "0" that digital computers use. Digital computers are generally made out of analog computers, by linking two transistors into a Flip-flop. Thus, to create a digital system out of transistors, at least two nodes capable of assuming any value within a range are linked together into a system that can only know "0" and "1". This represents a tremendous loss of intelligence relative to analog networks of transistors. Or enzymes.

While most of the activation systems that have been studied are digital, this doesn't mean most of those in the cell are. It's a lot easier to study digital systems in the cell, and those are the ones most likely to be noticed. This means that, potentially, a cell could have a surprisingly smart "chemical brain"

Next: How Smart is the Cell? Part II: The Gene Activation network as an Analog Computer Read more!

Friday, April 17, 2009

Nerves and Skin

I just discovered a large review article: Neuronal Control of Skin Function: The Skin as a Neuroimmunoendocrine Organ by Dirk Roosterman, Tobias Goerge, Stefan W. Schneider, Nigel W. Bunnett and Martin Steinhoff. For some reason, it's open access, despite being a very large and recent. I haven't read it yet (just the first few para's), so I'll add comments as they occur to me. Read more!

Thursday, April 16, 2009

Y-Haplotypes in Africa

A new paper, Genetic and demographic implications of the Bantu expansion: insights from human paternal lineages, finds "a recent origin for most paternal lineages in west Central African populations most likely resulting from the expansion of Bantu-speaking farmers that erased the more ancient Y-chromosome diversity found in this area", although the same is not true for mtDNA analyses, and "some traces of ancient paternal lineages are observed in these populations, mainly among hunter-gatherers." Much more interestingly, "[w]e also find the intriguing presence of paternal lineages belonging to Eurasian haplogroup R1b1*, which might represent footprints of demographic expansions in central Africa not directly related to the Bantu expansion."

A quick overview of previous info at Wiki. Note that R1b1* means R1b1 without any further mutations (or returned to such state by back-mutation). R1b1 (without the asteroid) refers to the group of haplotypes with the single nucleotide polymorphism (SNP) M343, along with any and all further mutations.

Dienekes' Anthropology Blog (from whom I got the reference) has an informative article, with some info beyond the abstact: Paternal traces of Bantu expansion + African R1b1 mystery

Searching for other information on this subject, I found:

Bantu language trees reflect the spread of farming
across sub-Saharan Africa: a maximum-parsimony

Farmers and Their Languages:
The First Expansions

A Back Migration from Asia to Sub-Saharan Africa Is Supported by High-Resolution Analysis of Human Y-Chromosome Haplotypes also here

The phylogeography of Y chromosome binary haplotypes and the origins of
modern human populations

Phylogeographic Analysis of Haplogroup E3b (E-M215) Y Chromosomes Reveals Multiple Migratory Events Within and Out Of Africa

The Levant versus the Horn of Africa: Evidence for Bidirectional Corridors of Human Migrations

The Making of the African mtDNA Landscape

Variation of Female and Male Lineages in Sub-Saharan Populations: the Importance of Sociocultural Factors

Bantu and European Y-lineages in Sub-Saharan Africa

Map of Africa

Another map of Africa

Yet more, with climate data

One paper I could only find in abstract:

The genetic legacy of western Bantu migrations Read more!

Saturday, April 11, 2009

Entropy and Information

What I'm doing here is beginning to dig into the relationship between entropy and information as it applies to replication in earliest life. Based on my amateur and sparsely read understanding, the cost in energy of adding a specific monomer out of a group of potential monomers to a growing polymer should be much higher than that of adding a random monomer.

For the moment I'm going to leave it at that, adding further as comments as they become appropriate. Any comments that help improve understanding of this subject will be very welcome.

Note: I'm not going to allow this thread to devolve into a debate over creationism of any sort (including "ID").

Links to Relevant Papers (Abstracts only where I couldn't find full texts.)

Shannon entropy applied from panda's thumb

Entropy and Information Theory by Robert M. Gray

Evolution of Biological Information by Thomas D. Schneider

Evolved RNA Secondary Structure and the Rooting of the Universal Tree of Life by Gustavo Caetano-Anollés

Life's Emergence is Not an Axiom: A Reply to Yockey by Avshalom C. Elitzur

About a Symmetry of the Genetic Code by A. J. Koch and J. Lehmann

Self-organization vs. self-ordering events in life-origin models by David L. Abela, and Jack T. Trevors

Global similarities in nucleotide base composition among disparate functional classes of single-stranded RNA imply adaptive evolutionary convergence. by E Schultes, Hraber, and T H LaBean

Evolution in thermodynamic perspective: An ecological approach by Bruce H. Weber, David J. Depew, C. Dyke, Stanley N. Salthe, Eric D. Schneider, Robert E. Ulanowicz and Jeffrey S. Wicken

The Origin and Evolution of tRNA Inferred from Phylogenetic Analysis of Structure by Feng-Jie Sun and Gustavo Caetano-Anollés


Causation and the Origin of Life. Metabolism or Replication First? by Addy Pross

The Driving Force for Life’s Emergence: Kinetic and Thermodynamic Considerations by Addy Pross

On the Emergence of Biological Complexity: Life as a Kinetic State of Matter by Addy Pross

Thermodynamics, information and life revisited, Part I: To be or entropy by Peter A. Corning, Stephen Jay Kline

Thermodynamics, information and life revisited, Part II: Thermoeconomics and Control information by Peter A. Corning, Stephen J. Kline

Modeling the Emergence of Complexity: Complex Systems, the Origin of Life and Interactive On-Line Art by Christa Sommerer and Laurent Mignonneau

What is complexity? by Christoph Adami

Does complexity always increase during major evolutionary transitions? by Mercedes Bleda-Maza de Lizana, Hannelore Brandt, Ille C. Gebeshuber, Michael MacPherson, Tetsuya Matsuguchi, and Szabolcs Számadó

I'll add further links in subsequent comments as I find them, so this post can be left unchanged. Read more!