Meme Storage in DNA

by Douglas C. Klimesh

It is well known how genes are stored in DNA. Genes are distinct units that house the linear instructions for building all the molecules that produce all that we know of as life. What is not known is how the instructions for controlling all this machinery, from the molecular level to the body level, are stored. Inherited instincts must also be stored in DNA, because DNA is the only storage medium involved in the transfer of information to the next generation. I refuse to believe that instinctive behavior can be generated by enzyme expression alone. Given that only about 3% of the DNA of the human genome codes for amino acids, the other 97%, called introns, should be used for storing behavior instructions among other things. I can’t imagine that instructions for building a computer such as the brain would be stored digitally and not include the software for running it. Nature is very efficient, so it seems very strange that nature would go to such great lengths to make and preserve exact copies of DNA and yet leave so much of it unused. I propose there is some type of language stored in intron DNA.

A meme is an idea or a unit of thought. This essay is a (large) meme, as is each sentence and each word; more precisely, a word is not the meme itself but the representation of a meme. A meme is not tangible in the same way computer software is not tangible, and memes can be thought of as software or programming or as instructions for action or thought. Humans transfer memes every day by conversation, mass media, books, the Internet, and any other way an idea can be conveyed. By coming in contact with a meme, one’s thought process, and therefore oneself, is usually changed in some (usually slight) way. Ethologists use the term "fixed action pattern", which is a good definition of a meme when dealing with instinctive behavior. Some memetists have a more precise definition, such as that a meme must be transferred by imitation (Susan Blackmore, The Meme Machine). For the specific purposes of this essay, I am using a broad concept of a meme. Although mine is not the normal use of the word, meme was coined by Richard Dawkins to correspond to gene, so the use of the word to represent thought stored in DNA seems quite appropriate.

Neurobiology and its related fields have explained much about how life works. I feel there are large enough holes in neurobiology to warrant exploring other ideas. As Thomas Kuhn contends in The Structure of Scientific Revolutions, it is the holes in the current theory that lead to the next theory. Sometimes a scientific theory has worked so well for so long that it is hard to see anything else. And as with gaps in many strong scientific theories, the holes in neurobiology are subtle and trivialized. The hole is this: how are inherited instincts stored?

"We have only 300 unique genes in the human (genome) that are not in the mouse," said Craig Venter, president of Celera Genomics, the Maryland firm that led one of the mapping teams. "This tells me genes can’t possibly explain all of what makes us what we are."

Eric Lander, a geneticist at the Massachusetts Institute of Technology and a scientific leader of the Human Genome Project, sounded awestruck as he summarized the article he and his scientific allies published in Nature. "I think the junk is the biggest surprise in the genome," Lander said. Less than 1.5 percent of the genome seems to code for proteins, Lander said.

A seemingly common way to visualize the workings of the brain is to think of the brain as computer hardware and the mind as software. Mind may be a cloudy term, but the concept of software is there, whether or not we can put a finger on it. Considering how much our thoughts and the words of others control our actions, it seems obvious that the brain is a computer that is at least partially run by software. However, neurobiology tends to concern itself only with the hardware. In fact, Ira Black argues in Information and the Brain that there is no software. On this view, when something is learned, neurons change and make new connections to represent the new concept, and hormones and other molecules are the programming or at least the message. How hormones work, and the genes that code for them and their receptors, are fairly well understood. Also somewhat understood are the complex layers of molecular control for gene expression. It’s understandable for neurobiologists to think they will eventually know all about the brain just by discovering all the molecular mechanisms. However, it’s like taking apart and analyzing an unknown type of computer; of course you’re only going to find hardware.

I refuse to believe there is no software in the brain and elsewhere. Sure, neurons may reconfigure themselves when something is learned. Life is very efficient, and the brain is just optimizing itself for the new task. Just think about some of the complex tasks a brain does, such as understanding a sentence and using vision, that we humans are only beginning to get computers to do. How can a brain, arguably more powerful than any human-made computer, not use software? The power of our "Information Age" is due to software (and the hardware to run it). Human technology generally can be thought of as trying to "catch up" with what nature has already mastered. Humans invented software, so why not nature? One could argue that it is not necessary for life. I would agree that prokaryotes and probably many simpler eukaryotes do not use software, just as humans got along for most of our history without building programmable computers. However, once tasks become sufficiently complex, computers become necessary for efficiency. Biologists are still uncovering the fantastic molecular mechanisms of basic cellular processes. One could argue that we are born without any software, but what about complicated inherited instincts? If there is software, then it must be stored in DNA; where else?

Besides DNA, another possible means of inherited instinct storage is Rupert Sheldrake’s morphogenetic fields. [4] Sheldrake proposes a nonphysical field that controls physical entities. This field, originally proposed to explain embryo growth and cell differentiation, extends beyond space and time so that all the members of a species that have ever existed exert some influence over a current member. A popular example of this field is the "Hundredth Monkey" phenomenon, where once a significant portion of a species’ population learns something (in this case monkeys washing potatoes), members of the species not physically connected with the other members (on a separate island) will automatically "know" it (or at least be much more inclined to figure it out). Sheldrake also uses morphogenetic fields to explain how nature chooses particular crystal formations and protein folding. Although scientists generally dismiss such nonphysical things, I believe in some type of morphogenetic fields. However, I don’t believe morphogenetic fields can explain the hybridization of instinctive memes as discussed below.

The only other possible theory of inherited instinct storage that I have found is in Stuart Hameroff’s Ultimate Computing. [5] He proposes that the microtubules making up the cytoskeleton of basically every living cell, in addition to providing the structure and shape of the cell and the means of intracellular transport, also function as electrical information processors. So when a cell divides, it may pass on genetic information not just in DNA but also in the form of microtubules, which are integral to cell division mechanics. Although this type of generational information transfer seems possible, what is more interesting is Hameroff’s larger idea that these microtubules function as electrical information processors. The microtubules do not seem to be information storage devices but instead seem to be electrical information transfer and logic devices. These tubes could make up the logic gates of a cellular computer. A living cell is a very complex machine, and it seems that a computer would be required to control it. However, the structure and mechanism of these logic gates are unknown. Hameroff hints that one type of logic gate is the structure of one tube perpendicular to, and not touching, another. Also, some type of information processing seems to occur in just the tube itself. So now we have some type of logic gates built up from these microtubules, and we also have memory storage in the form of DNA. So I am proposing that some type of machine code, somewhat similar to the machine code of silicon microprocessors, is stored in DNA. Evidence to support this is that prokaryotes very rarely have introns, and they have far fewer and simpler microtubules. A simple analogy is the comparison between automobiles of decades ago and the new cars of today that have an embedded computer controlling fuel/air mixtures and many other engine and other controls. The prokaryotes correspond to the older cars, where engine controls were done mechanically or using simple logic devices. The eukaryotes correspond to the new cars. Both cars work, but the new cars "learn" the driving style and environment of the driver and can be reprogrammed for, say, high performance with the swap of a memory chip.

Currently scientists do not understand the function of introns. It has been suggested that introns are used as a checksum or error correcting code for exons, the coding regions. Even if this is true, it does not explain why, generally, the more complex a species is, the more introns it has. Another idea about introns comes from research into genetic algorithms, which shows that having a "scratch pad" area where nonfunctional genes are stored improves genetic algorithms. I recommend the Winter 1998 issue of Evolutionary Computation for information about introns and genetic algorithms. However, H. Eugene Stanley and his collaborators and others [1] have shown that long-range correlations exist between base pairs of introns but not exons. Furthermore, Stanley et al., using statistical linguistic analysis, have found that introns have certain statistical features in common with natural languages and that exons do not. [2] [3] Although the methods and results have been seriously questioned, it seems that the differences between the coding and the noncoding regions cannot be accounted for by saying that introns are just unused or scratch pad areas of genes.
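
As a rough illustration of what a statistical linguistic test on a sequence can look like, the sketch below treats every overlapping run of k bases as a "word", ranks the words by how often they occur, and fits a straight line to the log-log rank-frequency plot (a Zipf-style analysis). It is only a minimal sketch and not a reproduction of the cited authors' method: the word length, the function names, and the toy sequence are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-base 'word' in the sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def zipf_slope(counts):
    """Least-squares slope of log(frequency) against log(rank).

    Word counts in natural-language text typically give a slope near -1;
    a flatter slope means the words are used more evenly.
    Needs at least two distinct words.
    """
    freqs = sorted(counts.values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical toy sequence, not real genomic data.
toy_sequence = "ATGCGATATATATGCGCGATATTTATATGCGATATAAATATGCG" * 5
counts = kmer_counts(toy_sequence, 3)
slope = zipf_slope(counts)
print(len(counts), "distinct 3-base words; fitted slope:", round(slope, 2))
```

Running the same two functions on coding and noncoding stretches of a real genome, and comparing the fitted slopes, would be the natural next step; the toy sequence here only shows that the mechanics work.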

Geneticists may feel smug knowing how DNA translates to proteins, but being able to understand the software code will be orders of magnitude more powerful. Trying to decipher this DNA code would be similar to deciphering the code of an embedded (no keyboard or monitor) computer’s digital tape storage without any documentation about the microprocessor. Although the logic gate structure is unknown, the more efficient way to decipher the code of an unknown microprocessor is likely to analyze the stored code, not to take the processor apart. Imagine trying to figure out the machine language commands of a present-day silicon microprocessor with millions of transistors by taking it apart. Instead you would give it various software inputs and analyze the outputs.

An analogy to help picture the DNA software decipher problem is to imagine looking at the raw binary code, or even its hexadecimal equivalent, of a standard computer program. A computer program is divided into sections of two types: the actual software instructions and the data that they manipulate. For example, most likely at the top of your screen now are the words "File", "Edit", and "View". Stored somewhere in the program (definitely on your hard drive, and probably in dynamic RAM as well) are those words translated into zeros and ones by a code known as ASCII. This code assigns each text character (letters, digits, and punctuation) a number that is stored in an eight-bit byte, which can hold decimal values from 0 to 255; standard ASCII itself uses only the values 0 to 127. Now suppose you used a program not much more complicated than a simple text editor to read the object code, the code that the microprocessor (the hardware) understands. Your Intel Pentium, AMD Athlon, Motorola G4, or whatever microprocessor you are using does not understand the ASCII code nor the text it represents. So if you were to view the raw zeros and ones of the whole program converted into ASCII, most of it would be garbage (junk), but you would find the words "File" and "Edit" somewhere. This assumes that you are not versed in the arcane (and mostly useless given today’s powerful high level languages such as Java) knowledge of machine language programming, so you won’t know the hexadecimal machine codes.
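
This experiment can be sketched in a few lines. The example below scans raw bytes and keeps only the runs of printable ASCII characters, much as the Unix `strings` utility does. The sample bytes are made up for illustration (they are not taken from any real program), and the minimum run length of four characters is an arbitrary assumption.

```python
import re


def printable_runs(data, min_len=4):
    """Return runs of at least min_len printable ASCII bytes found in data."""
    # Bytes 0x20 through 0x7E are the printable ASCII characters.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


# Hypothetical stand-in for a compiled program: opaque bytes with
# menu labels embedded among them.
fake_program = (
    b"\x55\x89\xe5\x83\xec\x10" + b"File\x00" +
    b"\xc7\x45\xfc\x00\x00\x00\x00" + b"Edit\x00" +
    b"\x8b\x45\xfc\xc9\xc3" + b"View\x00"
)

print(printable_runs(fake_program))
```

Everything the filter throws away is the "junk" of the analogy: bytes that mean nothing as text but may mean a great deal to the processor.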

These English words correspond to the amino acid coding in DNA. If DNA is a computer or the storage mechanism for a cellular digital computer, then the protein coding regions are just the output, the data that is manipulated. And we humans have only the vaguest of clues as to the digital software code. Compare the ratio of the amount of ASCII text data to the total size of the program with the ratio of protein coding regions to the total size of the human genome. For example, the program directory of one of my web browsers is about 22 megabytes in size. In ASCII, approximately 1,000 text characters are stored in 1 kilobyte (KB), and 1,000 kilobytes equal 1 megabyte (MB).
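
The comparison can be made concrete with the figures already given. The sketch below asks how much embedded text a 22 MB program directory would need for its text fraction to match the genome's coding fraction. It uses the 3% figure from the start of this essay and the "less than 1.5 percent" figure quoted from Lander. The actual amount of text in the browser directory is not measured here, so nothing below says whether the two ratios really agree.

```python
# Figures taken from the essay: a 22 MB program directory,
# 1,000 characters per kilobyte, 1,000 kilobytes per megabyte.
PROGRAM_SIZE_MB = 22
CHARS_PER_KB = 1000
KB_PER_MB = 1000

total_chars = PROGRAM_SIZE_MB * KB_PER_MB * CHARS_PER_KB


def matching_text_chars(coding_fraction):
    """Characters of text that would make up coding_fraction of the program."""
    return round(total_chars * coding_fraction)


for label, fraction in [("3%", 0.03), ("1.5%", 0.015)]:
    chars = matching_text_chars(fraction)
    print(f"{label} of the program: {chars:,} characters "
          f"({chars / CHARS_PER_KB:,.0f} KB)")
```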

This analogy also shows how hard it is going to be to decipher this code. (This analogy is open to misinterpretation. The embedded English words correspond to what we understand when we look at the raw data. The words do not correspond to memes.) Not only can microprocessor instructions be of various lengths, but not all of the code is instructions; some of it can be labels and data. [This "data" may correspond to the form of the body.] Once we start dealing with species that have nervous systems and brains, we encounter another layer of programming. Although a single cell cannot comprehend complex organism actions, the brain is "just" a network of neurons. Similarly, a single human does not have the comprehension of the combined thought of our whole society, yet each person knows English. To get a complex task involving a large number of people to occur, the people just have to be told in English what their individual tasks are. In other words, to program the brain is to program many individual neurons. Thus the difference between cellular and brain level programming is mainly one of complexity. To continue with the silicon computer analogy, the high level computer language that most programmers program in usually gets compiled (translated) into machine language that the microprocessor (hardware) understands. One high level language command is the equivalent of a number of machine language commands. So although brain level memes are more complex, they should be based on cell level memes.
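
The one-to-many relationship between a high level command and low level commands is easy to see in practice. The sketch below uses Python's standard `dis` module to list the low level instructions behind a single line of Python. These are bytecode instructions for the Python virtual machine rather than true microprocessor machine code, so this is an analogy within the analogy, and the function `one_command` is a made-up example.

```python
import dis


def one_command(a, b, c):
    # A single high level statement.
    return a * b + c


instructions = list(dis.get_instructions(one_command))
for ins in instructions:
    print(ins.opname, ins.argrepr)

print("1 high level statement ->", len(instructions), "low level instructions")
```

The exact instruction names and count depend on the Python version, but there is always more than one instruction for the one statement.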

The question arises as to the mechanism of converting DNA base pair sequences to electrical signals. Already we know fairly well how messenger RNA is converted to amino acid chains, so it is not hard to picture a possible mechanism existing, especially when we understand start and stop codons. However given the current work of measuring the conductivity of DNA, especially by Jacqueline Barton et al. [Web Reference], more efficient mechanisms seem possible. By intercalating conductive molecules around a number of base pairs, say a codon of three base pairs, in RNA and either measuring the conductivity or sending a particular signal along these base pairs, an electrical "signature" can be generated.

However, a recent invention called Nanopore Technology sheds light on the most likely mechanism. Scientists are excited about this discovery because of the possibility of using it to sequence DNA very fast. I am excited about it because I believe a similar structure is already at work in our brains. Imagine a single strand of DNA feeding through a nanopore at a neuronal postsynaptic membrane. Suppose this nanopore is just the right size so that, depending on the base sequence of the DNA in the nanopore at the time, calcium and potassium ions flow through at different rates. Then a specific voltage is generated for a specific DNA sequence. Thus by pulling the DNA through at a constant rate, electrical signals in the neuron are generated from the DNA. Also, a recent study shows how DNA strands can move through confined spaces.
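
A toy model of this idea is to slide a fixed-width window (the "pore") along a strand and assign a signal level to whichever bases are inside it at each step. Everything numeric here is an assumption made for illustration: the window width of three, the base-to-digit mapping, and the arbitrary signal units are invented, and no claim is made that real pores or neurons behave this way. The only property the sketch demonstrates is that each distinct three-base window can be given its own level.

```python
# Hypothetical digit for each base; any one-to-one mapping would do.
BASE_DIGIT = {"A": 0, "C": 1, "G": 2, "T": 3}


def window_level(window):
    """Read the bases in the pore as a base-4 number (arbitrary signal units)."""
    level = 0
    for base in window:
        level = level * 4 + BASE_DIGIT[base]
    return level


def pore_signal(strand, width=3):
    """Signal level at each step as the strand is pulled through the pore."""
    return [window_level(strand[i:i + width])
            for i in range(len(strand) - width + 1)]


print(pore_signal("ATGGCA"))
```

Pulling the strand through one base at a time yields one level per step, so a constant pulling rate turns the sequence into a time series of levels.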

Interestingly, Hameroff has teamed up with Roger Penrose to propose that consciousness is due to some type of quantum coherence effect going on within the microtubules. [6] I believe they are on the right track and that consciousness (and artificial intelligence) can only exist in the physical world in/as a quantum computer. As a quantum computer is a very complicated concept, it is beyond the scope of this essay. Basically, instead of storing bits as either 0 or 1, a quantum computer uses quantum states, which in the cell I believe to be electron spin. Thus the state of the quantum bit, or qubit, can be in a superposition of both 0 and 1 if there is no outside observation of the state while it is computing. Thus, theoretically, quantum computers would be orders of magnitude faster than classical computers. I would say that consciousness is coupled with this quantum computer. In other words, the command that the microprocessor is currently interpreting, after being read in from DNA, is experienced as a thought by the consciousness. The point is that the eukaryotic cell contains not just a classical computer, but a quantum computer as well. Since humans have not yet built a quantum computer, much less programmed one, the software question becomes even more complex. I use the term meme for this essay instead of, say, software because, as Stanley et al. have tried to show, introns have natural language characteristics. Because consciousness is part of all living beings, we are dealing with more of a natural language than microprocessor level instructions.

How does life break down instincts into various memes? A textbook example of an inherited meme is a certain spider that spins an egg cocoon in exactly the same way every time using thousands of movements. The spider will do the whole sequence of steps from start to finish in exactly the same way every time despite experience, being moved to a new location, previous partial cocoon completion, or running out of silk.[7] This whole cocoon building and egg laying behavior pattern is probably stored as a single meme. The spider mother does not raise her young, so there is not a question of passing down the meme by rearing. Because memes are passed down through the raising and socialization of offspring, especially in humans, confusion can arise when figuring out the exact mechanism for meme inheritance.

Other studies show the hybridization of memes. Separate studies of complicated instinctive procedures such as nest building and mating rituals of certain birds show that mating slightly different species together will produce overall (non-working) actions that combine instinctive elements from both parents.[7] These and other behavior hybridization studies show that memes are stored like genes.

I assume that memes are coupled with specific exons. Thus activating a particular gene could trigger a particular thought or at least a particular action. This could help explain the "disheveled" gene in mice (also found in flies and humans) that disrupts socialization in mice when it is removed.[8] Also detected is a gene that influences fearfulness in mice. (Web reference) Possibly having a particular thought (on the cellular level) could activate a certain gene. So genetic/memetic experiments are possible. One could swap all the introns of a particular gene that is only expressed at a point early in development with the introns around a gene that is expressed only later in life and see if the gene expressions are reversed. This theory also supports the idea of multiple copies of a gene in the genome where the exon is the same, but the introns are different. So there would be different instructions expressed from a particular gene at different times. Plants could also be used for these experiments. Also some genes are not (exon) expressed at all and may store only memes. In a gene, memes could be software for controlling the enzyme that the gene codes for, similar to device driver software necessary for most computer hardware.

The most probable structure of DNA coding is fractal. Nature obviously uses fractals as the structure of many organisms. Humans have now written fractal image compression routines, so it is quite possible that nature uses fractal compression to store the three dimensional form of the body. Language itself has fractal characteristics due to its self-referential nature, and Stanley et al. have found that introns have some fractal characteristics.

I have discussed many of the problems and complexities involved in the process of decoding memes in DNA, but there are positive aspects. The human genome is being mapped and recorded onto the Internet, so computer analysis of a large number of introns is possible. By setting up a distributed Internet computing project, very inexpensive supercomputer-equivalent analysis is possible. Because we already know exactly where genes and introns start and stop on the chromosome, and given the large number of genes, DNA analysis is possible without needing a lab. However, by attacking the problem on all fronts, success should come more quickly. The mysteries of DNA will be unraveled not only by doing experiments with intron removal and alteration on various species, but also by searching for the molecular structures of DNA-to-electrical conversion and the electrical logic devices of microtubules.

Whether or not anything I have postulated about the specifics of meme storage in DNA is true, I strongly believe that such storage exists and that scientists should be trying to decode these memes and looking for the molecular structures involved with them. I have mentioned many possible uses of introns: error correcting code, scratch pad area, microprocessor type instructions, brain type instructions, enzyme control instructions, and storing the form of the body. I assume that introns do many of these things, just as a gene can have multiple introns. Like most complex, cutting edge scientific investigations today, the decoding of memes in DNA requires many areas of knowledge. This requires a collaboration of experts from many diverse fields. It also requires people who are knowledgeable in many fields but not necessarily expert in any (such as myself). But first it requires a belief in the goal.


[1] Ivan Amato. "DNA Shows Unexplained Patterns Writ Large". Science. v257, p747. Aug 7, 1992.

[2] R. N. Mantegna et al. "Linguistic features of non-coding DNA Sequences". Phys. Rev. Lett. 73, p3169. Dec 5, 1994.

[3] Philip Yam. Sci. Am. March 1995 p24.

[4] Rupert Sheldrake. A New Science of Life: The Hypothesis of Morphic Resonance. Rochester, VT: Park Street Press, 1995.

[5] Stuart Hameroff. Ultimate Computing: Biomolecular Consciousness and Nanotechnology. 1987.

[6] Hameroff, S.R., and Penrose, R., (1996) Orchestrated reduction of quantum coherence in brain microtubules: A model for consciousness. In: Toward a Science of Consciousness - The First Tucson Discussions and Debates, S.R. Hameroff, A. Kaszniak and A.C. Scott (eds.), MIT Press, Cambridge, MA.

[7] Purves, William K. and Orians, Gordon H. and Heller, H. Craig. Life: The Science of Biology, Third Edition. Sunderland, MA: Sinauer Associates, 1992. p.984.

[8] Cell. Sept. 5, 1997; 90(5):895-905.

I welcome comments. Last modified Dec 1, 2002.