This note is about a powerful conjecture regarding the fundamental roles suggested by the true significance of DNA. A conjecture is not settled Science; it is a hypothesis to be validated by empirical evidence and logic (mathematics). Normally, conjectures are not aired publicly. However, I am now 88 years old and housebound, and I do not have the energy to engage in the usual scholarly processes. At the same time, I am lucky enough to have:
The basic point of this lecture is to show that regular Computing, Biology and Psychology will all be found, in the coming decades and centuries, to be Computational Sciences.
The first point is that all three of these sciences share the same binary-integer-based hard evidence which, for the last two, is encoded by DNA molecules. DNA is a storage technology and should be looked at primarily as one. The question then is: why is DNA so predominant in the evidence of biological and psychological studies?
The answer that I suggest is that DNA achieves probably the highest density of information storage that is possible in this universe. On top of that, it offers a reliability that is unmatched by any man-made storage technology.
In fact, DNA takes between ten and one hundred atoms to store one bit of information. The ultimate density would be one atom per bit, but one atom per bit is exactly what the physicists have been trying to get with quantum computing, and they have been unsuccessful because systems that are based on a few atoms tend to decohere over time. So, my supposition is that we will probably be able to prove, in the future, that ten atoms is the minimum per bit, and that the practical figure is, for sure, somewhere between ten and one hundred atoms.
DNA reliability is incredible. We have hard evidence that DNA has been recovered from fossils stored in amber that formed between one hundred and twenty and one hundred and thirty-six million years ago. Compare that with the storage technology that we have developed over the years, which can only guarantee storage for a few thousand years at best.
My conjecture is that the DNA molecule is probably the optimal density/reliability molecule in the universe. If that is the case, then any other biology or any other psychology would also be based on DNA. Now, why does DNA have these abilities? The answer is that the basic structure of DNA depends crucially on the chemistry of carbon. Carbon is an atom whose outer shell holds four electrons out of a possible eight. In other words, carbon has as many electrons to give to other atoms as it has holes to accept electrons from other atoms. Therefore, carbon has the richest connectivity of virtually any atom in the first few rows of the Mendeleev table. Silicon, which is in the next row of the Mendeleev table, also has four electrons and four holes in its outer, third shell. Carbon has two shells and Silicon has three. However, Carbon is a much lighter nucleus, so that probably increases the value of the electric connection with other atoms.
I expect that in the next few centuries, mankind will discover more and more planets or moons that have life: planets which are in the Goldilocks zone of their star. They will also find out that such planets use the same or nearly the same chemistry, DNA based or very close to DNA. Now, that supposition does not imply that the life found on other planets and moons will be exactly identical to what happened on Earth. In fact, it is likely that the actual course of evolution on alien bodies would go a different way because of random events, like the one sixty-six million years ago when a large asteroid struck the Yucatan, creating the Chicxulub crater. Such random events can completely change the course of evolution.
DNA is particularly good at implementing evolutionary processes. It is a technology that allows a trial and error approach to discovering the truth. The way DNA does it is by being a molecule capable of duplicating itself with minor, infrequent errors. These infrequent errors generate different trials. The trial part of the trial and error strategy is provided by the errors (mutations) made in duplicating the DNA. The discovery part arises from the fact that new versions of DNA and their related organisms are eliminated faster if they are less fit.
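The trial and error loop described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not a model of real biology: the bit-string "genome", the `fitness` function (number of positions matching an arbitrary target string), and the parameters `pop_size`, `rate`, `generations` and `seed` are all hypothetical choices made for the example.

```python
import random


def mutate(bits, rate, rng):
    """Duplicate a bit string, flipping each bit with a small probability."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[b] if rng.random() < rate else b for b in bits)


def fitness(bits, target):
    """Hypothetical fitness: how many positions agree with the target."""
    return sum(a == b for a, b in zip(bits, target))


def evolve(target, pop_size=50, rate=0.02, generations=200, seed=0):
    """Return (best string found, generations used)."""
    rng = random.Random(seed)
    population = ["0" * len(target)] * pop_size
    for generation in range(generations):
        # "Error" part: the less fit half of the population is eliminated.
        population.sort(key=lambda s: fitness(s, target), reverse=True)
        survivors = population[: pop_size // 2]
        if survivors[0] == target:
            return survivors[0], generation
        # "Trial" part: survivors duplicate with infrequent copying errors.
        population = survivors + [mutate(s, rate, rng) for s in survivors]
    best = max(population, key=lambda s: fitness(s, target))
    return best, generations


best, used = evolve("1011001110001111")
print(best, used)
```

Because the fitter half is always kept unchanged, the best string in the population can never get worse from one generation to the next; the copying errors are the only source of new candidates.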
The information stored in DNA is of two basic types. There is information that was generated during the process of evolution and that information is stored in the genome. The genome is well understood to be several strings of binary code and in particular, for the human race, there are twenty three strings or chromosomes of this binary code. In total, these strings have about 3.2 billion bits of binary information (400 MB).
The second type of information is generated by the life experiences of the individual of a particular species. The lifetime experience is apparently coded in the neuron cells' DNA. In fact, very recently, in the last year (2019), biologists have been sequencing the DNA of neurons and discovering that they have different DNA codes. I mean, when you sequence the genetic information of regular cells, you find that the individual has essentially the same DNA in all of its cells. That is not the case for the neurons. Neurons have highly variable DNA. The biologists, right now, are wrestling with this and scratching their heads, but Computer Science has absolutely no problem with it. In fact, when I saw this I said: ok, this is where the temporary lifetime information is getting stored. In other words, my conjecture is: the neuronal DNA encodes what the organism learns during its lifetime. In the last half a dozen years or so, we have learned to manipulate DNA strings. We can take any DNA from any living species and sequence it. Sequencing does not mean reading; sequencing is finding out what the individual codes are, not necessarily understanding what they mean. The other thing we learned is how to modify or edit these codes.
We learned the latter by observing the mechanism that bacteria employ to combat viruses: they snip some of the virus DNA and insert it into their own genome, so that when a bacterium encounters the same virus again it recognizes it and knows to kill it. CRISPR is the mechanism bacteria use to recognize the viruses; basically, it can learn how to find and cut a segment of virus DNA and then edit it back into the DNA of the bacterium's own genome. This is exactly the technique you need to edit any sequence. We have developed technologies which go by the names CRISPR/Cas9 and Cas13. In other words, we can create molecules that have the ability to find a particular sequence of DNA, cut it and replace it with an alternative sequence. That is, we now have several DNA cut & paste technologies.
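As a software analogy only, the find, cut and replace operation just described looks like an ordinary string edit. The sketch below assumes DNA is represented as a plain Python string of the letters A, C, G and T; the function name `cut_and_paste` and the example sequences are hypothetical, and nothing here models the actual chemistry.

```python
def cut_and_paste(dna, target, replacement):
    """Find the first occurrence of `target`, cut it out, paste `replacement`."""
    position = dna.find(target)
    if position == -1:
        # Target sequence not present: leave the strand untouched.
        return dna
    return dna[:position] + replacement + dna[position + len(target):]


genome = "ATGCGTACGTTAGC"
edited = cut_and_paste(genome, "TACG", "GGGG")
print(edited)  # ATGCGGGGGTTAGC
```

Only the first match is edited here; a real editing tool also has to worry about matches at unintended sites, which this sketch ignores.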
In the last half a dozen years we have learned how to read the individual bits and to edit them by cutting and pasting these DNA codes. The next step would be to try and really understand what these codes mean, but before we do that let me turn to a different area which is not Biology, not Psychology, but the Mathematics behind computers, which is known as Computer Science.
One of the fundamental problems solved in the 1930s was the notion of what is computable: which things can be computed by machines, that is, by machine-like algorithms. Around 1937, some fundamental discoveries were made by three Mathematical geniuses: Alan Turing, an Englishman; Alonzo Church, an American; and Kurt Gödel, an Austrian. These three people discovered the same thing using three different approaches, which were eventually proven by Turing to be totally equivalent. The result was this: any function that computes and evaluates values which are integers can be effectively computed by machines. By effectively computing we mean that the time to do the computation would not exceed some polynomial expression of the number of variables, and also that the computation would not require unlimited storage. So, given that a function evaluates to integers, can be computed in polynomial time and does not require unlimited storage, the answer is that yes, indeed, it is computable.
Now, what does this Theory of Computer Science have to do with genomic and neuronal DNA? Well, the answer is very easy: these DNA strings are binary integers. They are binary integers, which means that they are computable by Turing machines.
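To make the "DNA strings are binary integers" point concrete, here is a minimal sketch of one possible encoding. The two-bits-per-base mapping in `BASE_TO_BITS` is an assumption chosen for illustration (the text does not specify a mapping), and the function names are hypothetical.

```python
# Assumed mapping: each of the four bases is written as two bits.
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def dna_to_int(dna):
    """Read a DNA string left to right as one (possibly huge) integer."""
    value = 0
    for base in dna:
        value = (value << 2) | BASE_TO_BITS[base]
    return value


def int_to_dna(value, length):
    """Invert dna_to_int; `length` is needed to restore leading 'A's."""
    bases = []
    for _ in range(length):
        bases.append(BITS_TO_BASE[value & 0b11])
        value >>= 2
    return "".join(reversed(bases))


number = dna_to_int("GATTACA")
print(number, bin(number))
print(int_to_dna(number, 7))
```

Python integers have arbitrary precision, so the same functions work unchanged on strings far longer than a machine word.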
A further implication of all of this is that when we say it is computable by Turing machines, what we really mean is that both of these kinds of strings, the genomic and the neuronal, are computable by machine-like processes which do not require magic intervention of any kind. Therefore, all of life and all of minds are not some magic creation; they are the result of logical, predictable processes, which may require an enormous number of copies and an enormous amount of time, both of which, as we know from Science in general, were available to these processes.
What we need to do in the future is not only to confirm all of the above, but also to start thinking about Biology and Psychology not as mushy descriptive Sciences but as hard Sciences: Sciences that not only have empirical evidence but also have Mathematics relevant to the data structures and data types utilized by the DNA empirical evidence.
Another way to put it is we can view, not only regular computing but also Biological computing and Neurological computing, to be processes that are guided by some very sophisticated software. So the issue in front of mankind as far as I can tell, is that in the next decades or centuries we have to find out the structures or architectural organizing principles of the:
The key issue, as it was for computing software, is the Architecture of the software. Now, I am not saying that the Architecture of the Biological software or the Neurological software will be identical to what we have in Computers, not by a long shot. I fully anticipate that this software will be far more complex than anything mankind has created, but it will still have an Architecture. Furthermore, we will not understand the software and have a predictive science until we have full Architectural control and understanding of the Mathematical insights organizing the enormous mass of empirical evidence.
Before proceeding to sketch how such a research effort could proceed, let me mention two major unexplained mysteries which seem to be explicitly & simply explained by the above conjecture:
Given that neurons last for the remainder of life, and given the recently discovered cut & paste process they employ to encode fresh information, the first item could easily be explained if we were to find out that the process for healing the DNA double-strand breaks deteriorates with advanced age.
About the second item, given the complexity & sophistication of human information retrieval processes, it is very likely that it is not encoded in the Read Only Software available in the genome. That is, it is very likely to rely on bootstrap processes to progressively code it in the writable neuronal memory.
The amount of empirical evidence is overwhelming. The reason we are making progress is that computers allow us to keep a huge amount of data. If we did not have these tools, we could not even begin to collect the information. We would not know how to organize it or store it.
We have in front of us generations of hard work involving processes like the parsing of these binary strings. That is, understanding what is data and what is process, or what are instructions and what is hierarchical structure. If man-made software is any guide, this means that we will eventually learn how to compile DNA codes from some kind of source code, or to start from the DNA and go in reverse, decompiling it to create the source.
In other words, learn how to express the codes' semantics by a symbolic language that can be read and understood by people and can be processed by a computer generating the correct DNA binary string. In other words, there is a lot of work to be done. I would not even bother to talk about it but I am at the end of my life and I want to leave some vision of what should be done to progress. It is a very very exciting direction but it is going to be a lot of work and very complex.
Some general guidelines are:
Starting at the very top, at the human level, which is the most complex, may guarantee financial success, but it also ensures that the work will be the most difficult.
Hopefully, I have shown above that it is very probable, that the use of the DNA storage technology by biological & psychological processes is tantamount to discovering and recording very rare & interesting binary sequences or integers. That is software!
I can show, with an easy computation, how RARE such strings are. Take, for instance, the human genome, which completely specifies how to build a human. We now know that it comprises 3.2 billion bits. The number of possible strings of that length corresponds to a one followed by about 3.2 billion/3 zeros, i.e. a one followed by roughly a billion zeros! Virtually all these possibilities will be nonsense! Another way to see this is: it took evolution between 3 and 4 billion years to discover the right genomic sequences to build a human being. This is a speed of around a bit per year!
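The two estimates above can be checked with a short computation. This sketch takes the figures in the text as given (3.2 billion bits, 3 to 4 billion years); the constant and function names are hypothetical. Dividing by three is a rounding of multiplying by log10(2), which is about 0.301.

```python
import math

GENOME_BITS = 3.2e9                # figure stated in the text
EVOLUTION_YEARS = (3.0e9, 4.0e9)   # range stated in the text


def decimal_zeros(bits):
    """Approximate number of zeros after the leading digit of 2**bits."""
    return bits * math.log10(2)


def discovery_rate(bits, years):
    """Average bits 'discovered' per year."""
    return bits / years


print(f"2**{GENOME_BITS:.1e} is about 10**{decimal_zeros(GENOME_BITS):.3e}")
for years in EVOLUTION_YEARS:
    print(f"{years:.0e} years -> {discovery_rate(GENOME_BITS, years):.2f} bits/year")
```

With these inputs the exponent comes out a little under one billion, and the rate falls between 0.8 and about 1.07 bits per year, consistent with "around a bit per year".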
At present it is estimated that 5 billion species have existed on Earth so far. Even if you assume all of them to be as complex as mankind, this still means discovering only 5 billion bits/year, or a little over 600 MB/year. Our present digital technology produces meaningful bits many times over this amount per hour!
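The unit conversions used here and earlier (bits to megabytes) are easy to verify. The sketch assumes 8 bits per byte and decimal megabytes (one million bytes), and uses the text's own figures of 5 billion species, about one bit per species per year, and a 3.2 billion bit genome; the function name is hypothetical.

```python
def bits_to_megabytes(bits):
    """Convert bits to decimal megabytes (8 bits per byte, 1e6 bytes per MB)."""
    return bits / 8 / 1e6


SPECIES = 5e9                      # estimate quoted in the text
BITS_PER_SPECIES_PER_YEAR = 1.0    # rate derived in the text for one genome

total_bits_per_year = SPECIES * BITS_PER_SPECIES_PER_YEAR
print(bits_to_megabytes(total_bits_per_year), "MB/year")
print(bits_to_megabytes(3.2e9), "MB per genome")
```

This gives 625 MB/year for all species combined and 400 MB for one genome, matching the rounded figures in the text.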
This is the deep, essential change brought about by digital technology: the increase of the speed of discovery of interesting, beautiful, inspiring, uplifting bit strings by, so far, eight orders of magnitude (100 million fold) over & above what the pre-hi-tech world could do. Futurist Ray Kurzweil is right that we are facing THE SINGULARITY but he is wrong about:
Thank you, that is the end of my story.