Applefinch

Decoding Evolution

applefinch — Sun, 19 Feb 2017 15:58:30 +0000

Unless you were raised by wolves, you will not be surprised to hear that the DNA of different organisms contains a kind of recipe for developing and maintaining that organism over its lifetime. Squirrel DNA has a recipe for squirrels, dog DNA for dogs and so forth.

But wolves notwithstanding, you might be surprised to hear that there is another extremely ancient message in DNA that comes to us over billions of years. It is an evolutionary journal of the stages of evolution that led up to squirrels, dogs, and every other living organism from a single common ancestral population billions of years ago.

More surprising is that the message is buried not in one set of DNA, but in the pattern of similarities and differences in DNA between different organisms. You won’t get this message from only analyzing squirrel DNA, but it will start to surface as you compare the genes of a number of related organisms.

Powerful mathematical techniques can be applied to comparing the gene sequences of related organisms to decode their evolutionary history. Now with the advent of cheap and fast computing power, and cheap and fast gene sequencing, biologists all around the world are busy analyzing DNA and constructing the entire evolutionary tree of life using these techniques.

But how is it that mathematics can be used in biology to chart the history of an organism’s evolution? What kind of information would be in DNA that could be decoded with mathematics and how was that discovered?

In the latter part of the 20th century, as molecular genetics came into its own, biologists realized that the steps in the process of evolution fulfill the definition of what is called a Branching Markov Process. Markov processes are a class of processes that share a certain set of properties that were characterized very early in the 20th century by Russian mathematician Andrey Markov. While this realization might sound like something of interest to only mathematicians, the implications are:

If the diversity of all life on Earth has come about through evolution from a common ancestral population, the history of that billion year journey is recorded in the DNA of organisms alive today.
The genetic message that comes to us across deep time is encoded in the highly unique and recognizable pattern produced by a Branching Markov Process.
We should be able to confirm the theory of evolution for all life on Earth by finding the Markov message in the DNA comparisons between different living organisms.
The message contains what we need to recreate the tree of life. We can recreate that tree branch by branch by sequencing the DNA of related organisms and comparing them using Markov mathematics.

“Mathematics is the language with which God has written the universe.” – Galileo Galilei

Galileo would probably not be too surprised to hear that the process that gave rise to the entire diversity of life on Earth can be described mathematically. Nor would he be surprised to hear that the same mathematics can be used to decode the history of that process. The mathematics of Markov analysis are pretty fierce, but fortunately Markov processes are easy to describe using no mathematics at all. I promise that, and I promise that there will be no quiz on this on Friday.

A Drunkard’s Walk

First let me define a Markov Process as a process that changes the state of something in a small random way over and over again. The most popular example would be a drunk person walking across a field. The “state” of the walking drunkard at any given moment would be the position of the step he has just taken. As he staggers around the field, each new step is a new state. If you record the position of each step he takes, you will have a sequence of positions or states. Markov analysis would call the sequence of states a Markov Chain.

What makes this a Markov Process that produces a Markov Chain is that each new state is closely related to its previous state. The drunkard can only change his position by a distance equal to or less than one stride length. If you number each state (or step, in this case) you would find that any state Sn is not much different than its predecessor Sn-1. The difference is equal to or less than the length of the drunkard’s stride.

Markov processes have a very short memory. They don’t look back very far in their history. If I told you the location the drunkard was in at this very moment, you can’t predict where his next step will be, but you can predict that it will land somewhere within a circle whose radius is no longer than the length of one stride. Notice I did not have to tell you anything about his previous positions leading up to where he is now for you to predict the possibilities for his next step. So this drunkard’s walking process has “the Markov Property”, where each new state is only dependent on its current state.
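If you would rather see that in code than in prose, here is a minimal sketch of the drunkard’s walk in Python. The names next_step and drunkards_walk, the stride length of 1.0, and the fixed random seed are all illustrative assumptions of mine, not anything standard. The thing to notice is that next_step is handed only the current position; it never sees the history.

```python
import math
import random

STRIDE = 1.0  # assumed maximum stride length, in arbitrary units


def next_step(position, rng):
    """Return a new position that depends only on the current position."""
    x, y = position
    angle = rng.uniform(0.0, 2.0 * math.pi)  # stagger off in a random direction
    length = rng.uniform(0.0, STRIDE)        # never farther than one stride
    return (x + length * math.cos(angle), y + length * math.sin(angle))


def drunkards_walk(start, steps, seed=0):
    """Build a Markov Chain: the list of positions (states) visited."""
    rng = random.Random(seed)
    chain = [start]
    for _ in range(steps):
        # Only the most recent state is consulted: the Markov Property.
        chain.append(next_step(chain[-1], rng))
    return chain


chain = drunkards_walk(start=(0.0, 0.0), steps=1000)
longest = max(math.dist(a, b) for a, b in zip(chain, chain[1:]))
print(f"states recorded: {len(chain)}")
print(f"longest single step: {longest:.3f} (stride limit is {STRIDE})")
```

Running it records 1,001 states (the start plus 1,000 steps), and no single step is longer than one stride.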

If you happened to run into the drunkard just standing in the field you might ask him where he came from. If he can’t manage to remember, you would be hard pressed to figure that out. All you know is that the step before the position he is standing on is within a circle that is no larger than one length of his stride.

So far it doesn’t seem like a Markov process is much of a help in figuring out the history of a process. But let’s take another example. This time the process is that a number of different people around the country are going to make a chain letter, back before there was email. They are instructed to each sign the letter, make a copy, and mail the copy to another person who has not already received and signed the letter. Suppose the letter has no signing page and no room anywhere to sign it, so the signers put their name anywhere on the letter they can find room.

Now suppose you are the last guy to get the letter. You get your copy in the mail and it has all the signatures except yours. Stepping back for a moment, notice that this is another Markov Process. The letter has gone through a series of states where each copy is the same as the one before it but has one more signature. But notice that this time the copy arrives in the mail and it has more information about its Markov Process than the drunkard case. If it has 23 signatures besides yours you at least know it went through 23 previous states. You know this because this particular Markov process left some evidence. The small changes from state to state accumulated on the document. The only problem is that since they were signed all over the front page, you still don’t know what order the states occurred in. In fact, the signers might have all met at a high school reunion somewhere and all signed it at once and then sent you a copy.

This is fun but how does this relate to evolution, you might ask? Let’s suppose you caught a squirrel, took a sample of its DNA and sequenced it. You now have the equivalent of that chain letter. The squirrel is the offspring of a set of parents who at conception made a copy of their DNA and passed it on to the first cell of the new squirrel. But since the copying process is imperfect there is a chance of a slight mutation (or more) taking place that altered the DNA copy. The same thing happened when the squirrel’s parents were born from the grandparents, and so forth. The DNA has gone through a couple of billion years of imperfect copying where each copy varied slightly from the previous one by a small random mutation. And like the letter signing example, the results of each step along the way accumulated in the DNA in the form of mutations generation after generation all the way down to the squirrel you have in your laboratory cage. I think the implications are clear so far that the process of evolution, among other things, is a Markov Process when it comes to what happens to DNA over the many generations it is handed down.

But we still have the same problem as the chain letter. We don’t know the order in which the mutations occurred. So how can we chart the evolutionary history of the squirrel? All we can see is the one final state, which has carried with it all the accumulated mutations like so many signatures. Fortunately for us and for evolutionary biologists, the analogy of the chain letter is not complete when it comes to the Markov Process of evolution. There is more that we can examine than the DNA of a single organism.

To get closer to what happens in evolution, let’s suppose a new scenario where people are mailing chain letters. But let’s modify the rules to produce a mega-chain letter. Let’s have each person sign the letter but make five copies and distribute them to five different individuals who are supposed to do the same thing. Notice that this still describes a Markov Process, but rather than producing one Markov Chain it produces a large number of chains that are all different in a certain way. When Bob receives a letter he signs it and makes copies. But when he sends a copy to each of five different people he is establishing five branches off of his particular chain. Why? Because each of the five copies will now receive a different signature in the next step of the process. And then the same thing will happen again as those people sign, copy, and forward to five more people.

If you drew a diagram for this, each state would produce new branches that go off on their separate paths, further branching at each new state. A diagram of this would show the branches fanning out from one point, then each branch fanning out again, and so on, to form a tree-like structure. You could trace a path on the diagram from the originator of the letter all the way to any final recipient. Each such path would be its own Markov Chain, different from all the others. But the entire tree of chains would be a set of Branching Markov Chains.

Now if you picked up one of the letters from its final recipient, you would still have the same problem as the original chain letter. You have a document with a lot of signatures, but no way to tell in which order they occurred. But something really interesting happens if you have all the final letters. The first thing you would notice is that none of them have the exact same set of signatures. But the next thing you would notice is that they all have at least one signature in common (the originator of the letter, let’s call him Bob). As you keep comparing them all, the next thing you might notice is that the letters can be sorted into five groups, where the letters in a particular group share two signatures, one being Bob and one being someone else. Let’s say the first group is the Bob-Mary group and the second is the Bob-Joe group and so on. These second shared signatures come from the first set of five recipients of Bob’s mailing. The next thing you might notice is that the Bob-Mary letters can be further broken down into five subgroups where the letters in each subgroup share another signature. That would be the Bob-Mary-Pete subgroup and the Bob-Mary-Sue subgroup and so on.

By now you can see a pattern emerging. The similarities and differences in the signatures on the final letters contain information about the branching that a single letter does not have. And the successive grouping and dividing of the letters allow you to redraw the original tree that describes the order of the letter signing that originally took place, even if you did not already know how the letters were created.

If you start diagramming this, since the letters all have the Bob name, you plant the tree at Bob. Seeing that all letters can be sorted into five groups by another name, you then can draw the Bob-Mary branch, then the Bob-Joe branch, and so on. Then because each of those groups can be further divided into their own five subgroups you can draw those branches and so on. Soon you will have drawn the entire tree that describes the original routing that all the letters took from one person to the next.
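The sorting trick can be sketched in a few lines of Python. This is only a toy under stated assumptions: everyone signs (including the last recipients), every letter is copied to the same number of people (three here rather than five, to keep the output small), and the names personN and the functions mail_letters, rebuild_tree, and show are made up for illustration. Each letter is stored as an unordered set, so the signing order really is lost. The recovery leans on one observation from the sorting: the earlier someone signed, the more final letters carry their name.

```python
from collections import Counter
from itertools import count


def mail_letters(branching=3, rounds=3):
    """Simulate the branching chain letter and return the final letters."""
    names = (f"person{i}" for i in count(1))  # hypothetical signer names
    letters = [frozenset({"Bob"})]            # Bob signs the original
    for _round in range(rounds):
        letters = [
            letter | {next(names)}            # each recipient adds a name
            for letter in letters
            for _copy in range(branching)     # one copy per recipient
        ]
    return letters


def rebuild_tree(letters):
    """Recover the routing tree from the unordered signature sets alone."""
    # Count how many final letters carry each name.
    seen_on = Counter(name for letter in letters for name in letter)
    tree = {}
    for letter in letters:
        # Most widely shared name first: that recovers the signing order.
        path = sorted(letter, key=lambda name: -seen_on[name])
        node = tree
        for name in path:
            node = node.setdefault(name, {})
    return tree


def show(tree, indent=0):
    """Print the tree with one level of indentation per generation."""
    for name, branches in tree.items():
        print(" " * indent + name)
        show(branches, indent + 2)


final_letters = mail_letters()
tree = rebuild_tree(final_letters)
show(tree)
```

With three rounds of three copies each there are 27 final letters, and the rebuilt tree has Bob at the root with three branches under him, each of which splits in three again. Note that this shortcut only works because the toy process is perfectly clean; it is not how real phylogenetic software operates.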

You might notice that this tree that you draw from the letter comparisons looks like a genealogy chart. The similarity is not a coincidence. If you substitute a human genome for the letter, and you realize that each child born inherits a copy of the parents’ genes but with a random variation, you can see that the building of an extended family in each generation is a Branching Markov Process. And what is being changed along each branch is the state of the genome as it is copied and handed off to the next generations, where it accumulates mutations just like the branching chain letter accumulated signatures.

The final results will produce relatives who can be grouped by their genetic differences and similarities just like the final chain letters could be grouped by similarities and differences in the signatures. And just like the chain letters, you can recreate the family tree diagram for those youngsters even without having the DNA of any of their parents, grandparents, or any of their ancestors. Now simply project that notion back millions of generations and you can see that the entire world of living organisms is related in the same way that people in an extended family are related. The reason why we know that is the same reason why we can tell siblings from cousins and second cousins and so forth in a family tree by comparing their DNA.
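Here is a small Python sketch of why siblings and cousins can be told apart by comparison alone. It is a deliberate simplification: each “genome” is a random string of 2,000 letters, each child is copied from a single parent with exactly ten random changes (real inheritance mixes two parents and mutates far more rarely), and the names copy_with_mutations and differences are my own. Nothing here is taken from real sequencing software.

```python
import random

BASES = "ACGT"


def copy_with_mutations(genome, rng, n_mutations=10):
    """Return an imperfect copy: a few random sites change to another base."""
    child = list(genome)
    for site in rng.sample(range(len(child)), n_mutations):
        child[site] = rng.choice([b for b in BASES if b != child[site]])
    return "".join(child)


def differences(a, b):
    """Count the sites at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


rng = random.Random(42)
grandparent = "".join(rng.choice(BASES) for _ in range(2000))
parent_1 = copy_with_mutations(grandparent, rng)
parent_2 = copy_with_mutations(grandparent, rng)
child_1a = copy_with_mutations(parent_1, rng)
child_1b = copy_with_mutations(parent_1, rng)
child_2a = copy_with_mutations(parent_2, rng)

# Only the children are compared; no ancestor's genome is consulted.
siblings = differences(child_1a, child_1b)
cousins = differences(child_1a, child_2a)
print(f"siblings differ at {siblings} sites, cousins at {cousins} sites")
```

Siblings are separated by two copying steps and cousins by four, so the cousins end up with roughly twice as many differences. That gap is the grouping signal, and it shows up without ever looking at the parents or the grandparent.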

Evolution is a Branching Markov Process and it accumulates information across the genomes of related organisms in the precise pattern that we would expect from that kind of process. Encoded in that pattern is evolution’s billion year journal of where it has been and what it has accomplished.

If that seemed too easy, you are correct. Sorting piles of messages doesn’t seem very math intensive. But consider that the genome has billions of base pairs that could be mutated in any which way. Sorting them by hand would take to the end of time. Fortunately, powerful computer techniques can be used to do the analysis.

To give you a feel for what those techniques have to deal with, let me extend my last branching chain letter problem by one more challenging angle. Suppose your job at the FBI was to establish the alibi of many of the participants in the chain letter. But to make the problem more challenging, suppose none of the chain letter participants signed any of the letters.

You go back to your office and ponder this for a while and then while staring at the letters spread out on your desk you realize that some of them have other artifacts on them. It looks like some of the letters in the process lay around for a while on people’s desks before they were copied and mailed out. One group has a coffee mug stain, another has the remnants of a swatted fly, another group has a phone number written in the corner, another has doodles, and so forth. While looking for other artifacts, you notice that some of the copiers had a dirty glass causing some letters to have little smudges and nits that the copiers picked up.

This is important because each letter picked up unique artifacts at the location where it was copied, and those artifacts were copied five times, just as a signature would have been. That means you might be able to do the same analysis using the artifacts alone. In fact there might be further artifacts due to the unique lens distortions in the copiers at each location as well. That means you might be able to do the analysis using the distortion from each of the lenses and the artifacts found on the letters. While they are good stand-ins for the missing signatures, they are distributed all over each of the letters.

So now you decide to pull out your heavy hitter Markov Analysis software. You scan each letter with a very high resolution scanner which turns the letter into a grid of pixel values like any digital image. Then you feed them into the bigtime Markov software which uses the similarities and differences between all the letters on the basis of the pixel values. That way it can include the effect of all the artifacts from lying around each office and from the lens distortions of each copier. The software crunches for a half hour or so and then comes back with a tree for the same reason that your manual signature sort produced a tree.

Just to be sure you instruct the software to redo the analysis a number of times using different specific areas on the letters. For example, you have it analyze all the letters but only using the top left hand quarter of each letter, then repeat that for the top right hand quarter, and so on. Each analysis produces its own tree that it coaxed out of its particular area of the letters.

If you can see why each of the trees it produces should be identical (or very close) then you are starting to realize the implications of how one could prove almost beyond a shadow of a doubt that the letters were indeed produced by a branching chain letter process (which is a branching Markov Process). It would be very difficult if not impossible to fake all those letters in a way that could possibly produce the same results unless the information accumulated through the branching chain letter process.
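To make the section-by-section idea concrete, here is a heavily simplified Python sketch. It is a stand-in for the Markov software, not an implementation of it: there are only four final letters, each letter is a flat list of 4,000 pixel values rather than a two-dimensional image, a “quarter” is one contiguous quarter of that list, and the “tree” is just the answer to the question of which letters pair up. The function names, the pixel counts, and the 80 smudges per copy are all assumptions chosen for illustration.

```python
import random


def photocopy(letter, rng, n_artifacts=80):
    """Copy a letter (a list of pixel values), adding random smudges."""
    copy = list(letter)
    for pixel in rng.sample(range(len(copy)), n_artifacts):
        copy[pixel] = rng.randrange(256)
    return copy


def distance(a, b):
    """Count the pixels at which two scans differ."""
    return sum(x != y for x, y in zip(a, b))


def best_pairing(letters, region):
    """Choose how to pair up the four letters using only one region."""
    cut = {name: pixels[region] for name, pixels in letters.items()}
    pairings = [
        (("A", "B"), ("C", "D")),
        (("A", "C"), ("B", "D")),
        (("A", "D"), ("B", "C")),
    ]
    # The pairing with the fewest within-pair differences wins.
    return min(
        pairings,
        key=lambda p: sum(distance(cut[x], cut[y]) for x, y in p),
    )


rng = random.Random(7)
original = [rng.randrange(256) for _ in range(4000)]
office_1 = photocopy(original, rng)   # first branch
office_2 = photocopy(original, rng)   # second branch
letters = {
    "A": photocopy(office_1, rng), "B": photocopy(office_1, rng),
    "C": photocopy(office_2, rng), "D": photocopy(office_2, rng),
}

quarters = [slice(start, start + 1000) for start in range(0, 4000, 1000)]
results = [best_pairing(letters, quarter) for quarter in quarters]
for number, result in enumerate(results, start=1):
    print(f"quarter {number}: {result}")
```

Every quarter is analyzed on its own, with different pixels and different smudges, yet each one should report the same pairing: A with B and C with D. That agreement comes from the shared copying history and nothing else.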

The unsigned branching chain letter analogy is very close to what happens when DNA from different related species is sequenced and analyzed. Each genome from each different organism is like one of the final letters in the analogy, where the billions of base pairs that make up the information in the genome are like the pixels from the images of the letters. Comparing the whole genomes of a group of different but related organisms should give us a tree, and so should comparing any particular section of the genome between those organisms. We should end up with a set of trees that should agree with each other if the information in those genomes has accumulated through the Branching Markov Process of evolution.

However, since this message from the past comes to us over millions of years, and natural selection is carefully preserving some genes over others (along with other interference), it is not surprising that the message is somewhat noisy. Also consider that unlike our nice clean chain letter scenario, we don’t have all the final results. Most of the species in that huge tree of related organisms going back millions of years are now extinct. So in actual practice the trees that biologists get from the different sections of the genome do not all agree one hundred percent.

If the trees don’t agree one hundred percent, how much confidence should we have in the results? Going back to the branching chain letter analogy, suppose we don’t have all the final letters. And suppose the letters we do have were all copied one last time on a really bad copier. Now suppose the section by section analysis of the incomplete set of noisy letters produced trees that agreed with each other by eighty percent or so. How confident would you be that the letters were originally created by the branching chain letter process? To get a better feel for that, ask yourself what other kind of process could produce a set of letters that produce any trees at all, or trees from different sections that agree with each other in any way. If not for being generated through a branching chain letter process, you should only get nonsense from the analysis.

The same goes for DNA. Trees from genetic comparisons that agree to greater than eighty percent are almost miraculous considering that some of the information we are using is billions of years old. Like with the noisy letter analogy, if the genes in a squirrel did not come about through this Branching Markov process, comparing them to other species in the rodent family using Markov analysis would just produce nonsense. This is because the number of different combinations for mutations by any other process would be astronomical. So it would be a cosmic coincidence to get trees that agree so well. And it would be a million cosmic coincidences if the Branching Markov Process of evolution did not produce all the diversity of life on Earth, yet we could get good agreement in the sets of trees we obtain from the comparisons of the hundreds of thousands of organisms we have analyzed so far.
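One way to get a feel for “astronomical” is to count how many different trees are even possible. The short Python sketch below uses a standard counting result: for n labeled species there are (2n-3)!! distinct rooted trees in which every branch point splits in two, where “!!” means multiplying every other number (1 x 3 x 5 x ...). Restricting to two-way splits and treating every tree as equally likely under “nonsense” are my simplifying assumptions, and the function name rooted_tree_count is made up for this illustration.

```python
def rooted_tree_count(n_species):
    """Count rooted two-way-branching trees for n labeled species: (2n-3)!!"""
    total = 1
    for odd in range(3, 2 * n_species - 2, 2):
        total *= odd
    return total


for n_species in (4, 10, 20, 50):
    trees = rooted_tree_count(n_species)
    print(f"{n_species:>2} species: {trees:.3e} possible trees")
```

For just 10 species there are already more than 34 million possible trees, so two unrelated analyses landing on the same tree by luck would be a long shot. The count grows explosively from there.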

One final slam dunk for evolution. When we classify organisms the old school way by comparing their anatomy and behavior (as we have been doing for about 200 years now) the evolutionary tree we build from those classifications also agrees with the trees we get from the mathematical gene sequence comparisons.

Using only evidence from living organisms, we can read and decode evolution’s billion year journal that comes down to us through deep time. The medium for that message is the DNA of those organisms.

On 747s, Tornadoes, and Junkyards

applefinch — Sat, 04 Feb 2017 20:57:19 +0000

The chance that higher life forms might have emerged in this way is comparable to the chance that a tornado sweeping through a junkyard might assemble a Boeing 747 from the materials therein. – Sir Fred Hoyle

This is a perfect analogy. But not for what it is intended for. It suggests choosing a ridiculous hypothesis (random chance) for how a 747 might be produced and it compares that to an opinion on how probable it is that life originated in the same way. It’s a ridiculous hypothesis because, as with almost everything else in the universe, random chance is not what is responsible for assembling 747s. We know this because we happen to know how 747s are put together.

Short Answer: Yes, correct. 747s by random chance is ridiculous. And so is suggesting that life came about by random chance. That is why there are no serious proposals in science that suggest pure random chance for the origin of 747s, of life, or of much of anything else for that matter.

Consider that random chance is also a bad hypothesis for why things fall to the ground when you drop them. We happen to know why that happens, too. Most processes in the universe involve some degree of chance and some degree of necessity in one proportion or another. The falling stuff example would be a highly determined process, with gravity attracting things with 100% probability (the opposite of random chance).

Although we don’t have a comprehensive testable hypothesis for how life first began, there are no serious proposals that it happened by pure random chance. So what the 747/tornado/junkyard analogy is perfect for is showing why it is a waste of time choosing a really bad hypothesis for how one thing came about as an analogy for how something else might have come about. It is astronomically unlikely that either 747s or life originate through pure random chance.

The formal name for this deceptive form of argument is the Straw Man Fallacy, where one misrepresents an opponent’s position (usually by replacing it with a ridiculous one) and argues against the Straw Man version rather than the opponent’s actual position.

The 747/junkyard argument is a perfect analogy for why straw man arguments are considered a fallacy. In this specific case of life and 747s by random chance it is informally called Hoyle’s Fallacy (sometimes called the Junkyard Fallacy), named after Sir Fred Hoyle who first proposed it. Hoyle was a brilliant and much-honored astronomer (though, contrary to what is often assumed, never a Nobel Prize winner), but in his later years he seemed to subscribe to some unusual notions such as life on Earth being seeded from somewhere else in space (Panspermia).

Science accepts nothing merely on authority, even the authority of celebrated scientists, especially when they are outside their field of expertise. There is no excuse for logical fallacies even from celebrated scientists. Don’t be Fred Hoyle (at least in this respect). Don’t use false analogies that employ straw man arguments. If you are a Christian advocating for a miraculous origin for life, a string of logical fallacies is unworthy of your apologetics.

But isn’t life arising from non-life unlikely?

Well yes, it is downright miraculous in my personal opinion. We surely don’t see it happening all the time. We don’t know how many times life may have got started on Earth. It may have gotten started a number of times when conditions were really harsh and changing on Earth, and did not survive. And it may have started a number of times after our current life on Earth began but entered a world already full of very hungry organisms well adapted by being as badass as they can be.

One thing we do know is that the more we look for the first signs of life in the fossil record the farther back in time we find evidence of first life. At the moment, it seems that life got started very quickly after conditions were conducive for it, probably in about 300,000 years.

Haven’t scientists calculated the odds of life forming and found it impossible?

Well, some people have offered their calculations on that. In fact Hoyle himself calculated it as one chance in 10⁴⁰⁰⁰⁰. That doesn’t look good for the home team, does it? But here is the problem with those kinds of calculations. Remember that Hoyle assumed that the process was random chance. So there should be no surprise that he gets a number that looks to be impossible.

Why is that important? Because if you don’t know the process by which something rare happens, you have no basis to calculate how unlikely it is. For example, consider a scientist from 150 years ago trying to calculate the odds of a tornado forming. If he chooses random chance as the cause, he might try to calculate the odds of each molecule in a huge column of air all starting to move in the same circular direction by random chance or accident. The number of molecules would be astronomically big, so the calculations would have to show that the chances are astronomically small. So small, in fact, that the frequent occurrence of tornadoes might move him to conclude that they are acts of a vengeful God.
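The arithmetic behind that kind of calculation is easy to sketch in Python. The toy model below is my own assumption, not Hoyle’s actual calculation and not real meteorology: it pretends each part (a molecule, say) independently has a 1-in-2 chance of “cooperating” and asks for the chance that all of them do. The function name log10_chance_all_agree is made up for this example. It works with logarithms because the raw probabilities are far too small to store as ordinary numbers.

```python
import math


def log10_chance_all_agree(n_parts, p_each=0.5):
    """Return log10 of the chance that n independent parts all cooperate."""
    return n_parts * math.log10(p_each)


for n_parts in (10, 1_000, 1_000_000):
    exponent = -log10_chance_all_agree(n_parts)
    print(f"{n_parts:>9,} parts: about 1 chance in 10^{exponent:,.0f}")

# How many coin flips must all land heads to reach odds of 1 in 10^40000?
flips_needed = math.ceil(40_000 / math.log10(2))
print(f"{flips_needed:,} coin flips in a row")
```

Under this model a number as forbidding as one chance in 10⁴⁰⁰⁰⁰ takes only about 133,000 coin flips all landing heads. A column of air contains vastly more molecules than that, so the “pure chance” odds for a tornado come out far worse. The tiny number tells you about the assumption of independence and pure chance, not about how the thing actually happens.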

But these days we know how tornadoes form. Storm fronts set up unstable conditions where warm air sits below a layer of cold air. The rising warm air currents affect all the molecules at once, creating air vortices. If you understand how that works, you would perform a very different calculation and find that tornadoes are rather likely occurrences in storm fronts.

Sure, we know how tornadoes form but we don’t know how life first began.

That is correct. But that is the only difference. In one case we know how it happens and in the other case we don’t. But our ignorance in the unknown case does not make calculations that assume pure random chance any more meaningful than they are in the case where we are not ignorant. It simply is a really bad assumption. Almost nothing we know about happens by pure random chance.

In the study of logic we call this the Argument from Ignorance Fallacy. It means that our ignorance of how something happens cannot be appealed to in support of an otherwise unsupportable hypothesis. (e.g. We don’t know how tornadoes form, so they must be acts of God.)

Isn’t it unusual that we don’t know how life began?

Not really. Consider that modern science only got started a few hundred years ago with methods formalized by Newton. Consider that it was only in my lifetime that we figured out the double helical structure of DNA. And it is only in the last few decades that we have been able to sequence the information in the whole genome of an organism. There is nothing simple about living organisms and we have just scratched the surface of how biology works in general. It is not unusual that we are still working on figuring out how life got started to begin with.

Now consider that at any point in history there is a long list of things about nature that we have not yet figured out. That list might seem to get shorter, but usually figuring out something new adds more questions to the list. Scientists do not see this as something alarming or demotivating. They understand that this list is basically what you might call a “job description”. There is nothing more exciting to a scientist than a new frontier of challenges to work on.

There is nothing we should read into our current ignorance about various things in nature except that those things are challenging, complex, and interesting. Life might be the most complex thing in the universe, in fact. (As a physicist by training, it certainly seems that way to me. We know much more about how stars work than we do about how all of biology works.)

But how do we know that life began by some natural process?

We don’t. As I said before, we don’t have a comprehensive testable hypothesis for the origin of life. We look for a natural origin because that is what scientists do. In fact science can only confirm or reject hypotheses that are testable. It cannot deal with untestable opinions or personal beliefs.

Also, most religious groups, including most Christian denominations, don’t see natural processes as being something that conflicts with God’s providence over those processes. For them the question is not did God create life, but how did he create life? And so they accept that the scientific method is the best way to determine that.

But related to the previous question, our ignorance of how something works or how something began is not a good argument for it not happening by a natural process. The only thing it is a good argument for is increased funding in that area of science.

Do we know anything about the origin of life?

In fact the field itself, called abiogenesis (meaning a non-biological origin of life), is vigorous, exciting, and rather productive in one sense. Every day a new finding suggests ways that bits of the life processes might have got started. And every day we find out more about what in nature might have given rise to various metabolic processes and organic molecules. It has become clear that

The big problem in solving the abiogenesis problem is that we might end up knowing a handful of ways to get life started, but not having enough information to know which one was the one that was the cause of the life we have on Earth.

A few of the researchers into the origin of life happen to have produced a number of highly accessible books in the popular press for the lay reader. Nick Lane, for example, is a well respected researcher in the field and manages to write for the lay reader in ways that make the work come alive (no pun intended).

What Took Us So Long?

applefinch — Wed, 11 Jan 2017 15:05:48 +0000

A few years ago I took my family on vacation to a cabin by a lake in Maine. The mid August weather was perfect for a lakeside retreat with cool nights and dry sunlit days. Finding myself awake at midnight one night, I got up and slipped outside into the pines and walked the short distance to the shore of the lake. Out of the corner of my eye, I saw a flash in the sky and looked up quickly enough to see the transient streak of a meteor before it flared out. Recalling that it was the time of year for the Perseid meteor shower, I decided to take a canoe out into the middle of the cove to see if I could catch more of the show.

I pushed the canoe into the water as smoothly as possible, not wanting to disturb the complete silence of the night. The water was so still that it reflected constellations like a mirror. A couple of quiet paddles and the wide bottom canoe slid across the surface like a skater on ice. I stopped in the middle of the cove, placed the seat cushions on the floor of the canoe and slipped down to lie on my back and gaze at the sky.

Drifting and turning very slowly and silently, with the sides of the canoe blocking any presence of the shore, it gave the illusion that I was floating in space. With my eyes adjusted for pitch blackness, it was only me and the astonishing beauty of the night sky. I lost count of the number of meteors I observed that night, so I figured my plan was vindicated.

If you have ever seen the night sky on a dry clear night, undiminished by the lights of civilization, you can understand how the Milky Way got its name. The Milky Way is our view into our own galaxy, the immense cluster of stars that make up the cosmic neighborhood of our own Sun. In conditions like I described above, the visible stars are so numerous that it looks more like a milky fog than it does a collection of individual stars. And it stretches across the sky in a band from horizon to horizon that looks very much like a celestial highway.

If you have learned to recognize some of the constellations and the names of some of the stars they contain, you can often pick them out just about anywhere you can see the night sky. But on that particular night, they seemed so close that I felt I could reach up and touch them. Rather than being “way up there”, they seemed up close and personal, like longtime faithful friends.

It is easy to forget that for hundreds of thousands of years, up until the advent of outdoor artificial lighting, people moved about with the constant presence of this up close and personal canopy of the night sky, not simply seeing it as the rare treat that I did that August night on the lake.

In times past we conducted our affairs with the features of the night sky looming large and prominent in our field of vision. The constellations and stars surely must have felt like faithful companions that accompanied us wherever we went. It is no surprise that many of the names of stars and constellations come down to the present day from before the birth of Christ. Those names have outlasted many of the civilizations that created them.

More than just names, many cultures created entire stories to go with constellations, partly for entertainment and partly as religious belief. Some cultures considered the stars themselves to be deities. One thing they could not fail to notice is that none of the stars move with respect to each other. Where on Earth kingdoms rise and fall, rivers flood, forests burn, and every living thing flares up only to burn out in a short time, the heavens would seem immutable and eternal to people in ancient cultures. If it were not for the effect of the Sun and Moon, the constellations would seem to simply rotate around the Earth. It’s no surprise that some cosmologies had the stars affixed to the inside surface of a hollow sphere, rotating around a stationary Earth at the center.

So in one sense, the stars themselves seem boring. Or better said, they seem bored with us. By that I mean that whether the stars are deities or not, they seem completely uninterested in the affairs of man and the natural events on Earth. Wars, famine, fires, floods, and harvests good or bad all happen with no correlation to the constant, deliberate rotation of that celestial sphere. If you need a calendar for planting or harvesting, don’t bother with the stars, since one day/night rotation of the sphere is no different from the next. It’s like trying to tell time with just a clockface but no hands.

And so ancient observers who took to their watchtowers at night to record the celestial events had no reason to record over and over again the position of the stars in regard to each other. The answer was always the same.

(What about astrology, you might ask? Didn’t astrologers believe that our destiny was in the stars and could be forecast by their motion? Yes, but not the stars that make up the constellations. Up until the invention of the telescope (and, for many people, for some time after), the planets were considered wandering stars. They are the exception to the rest of the stars, which seemed glued permanently to the inside of that celestial sphere. But more on planets later.)

If the stars are cold, distant, unchanging, and uninterested, the Sun, on the other hand, is close, warm, and obviously life-giving. Ancient cultures could not fail to notice that their very existence was tied to the presence of the Sun. The Sun was something to be reckoned with whether they thought it was divine or not.

More than just giving light and heat during each day, the Sun has a motion that is not as featureless as that of the stars. The Sun rises and sets in a different spot on the horizon each day. The point on the eastern horizon where the Sun rises and the point on the western horizon where it sets move slowly southward as winter approaches, come to a stop at midwinter (the winter solstice), and then start moving north again as summer approaches, only to turn around again at midsummer (the summer solstice) and start back southward.

This motion is also perfectly correlated with the hours of daylight each day and roughly correlated with the seasons. As human populations began switching from hunter-gatherer to agrarian lifestyles about 10,000 years ago, this correlation became important to them. It is no surprise that cultures built stone structures that were aligned with the sunrise at the winter solstice, and no surprise that they were eager to see that important “turn-around”, meaning that another cycle of seasons was beginning.

Even with something to observe that indicates the annual cycle of the seasons, there is still not enough information to know week by week when to plant crops and when to harvest them for maximum production. Fortunately, there is another celestial body that cycles faster than the Sun’s annual round of solstices. The Moon makes a revolution around the Earth about every 28 days. And it does so independently of the Sun’s motion. By observing both the motion of the Sun and the motion of the Moon, one can get a pretty good estimate of what week of the year the current week is and use that to make decisions about planting, harvesting, breeding of livestock, and so on.

Consider what Genesis says about the purpose of the Sun and the Moon:

¹⁴And God said, “Let there be lights in the vault of the sky to separate the day from the night, and let them serve as signs to mark sacred times, and days and years, ¹⁵and let them be lights in the vault of the sky to give light on the earth.” – Genesis 1:14-15

Holy days scheduled to celestial events are still with us. Easter might be the most important holiday for Christians. Easter is a “moveable feast” that occurs on the first Sunday after the first full Moon on or after the vernal (spring) equinox.

But if you are scanning the heavens for signs and portents instead of keeping a calendar, the motions of the Sun, Moon, and stars are not very helpful. Imagine the court astrologer telling his patron that today would be a good day to go to war because the Sun rose this morning as expected. Or because the stars stayed in their constellations last night as they have done since the beginning of recorded history. Divining your destiny by the Sun, Moon, or stars is much like waiting for a clock to do something unexpected.

Now notice what Genesis has to say about the stars:

“…he made the stars also.” – Genesis 1:16

I love how that is tossed off as if they almost forgot to mention the stars. The authors of Genesis seem to find no purpose for stars.

Fortunately for astrologers, there are a few strange stars that wander around, moving through the constellations like lost sheep. These are the planets, of course. The name itself means “wanderers”. While the clockwork rotation of the Sun, Moon, and stars makes them ideal for calendar keeping, the motion of the planets seems so arbitrary that they are useless for that.

What the Sun, Moon, and planets have in common is that they all drift through a narrow band of twelve constellations we call the Zodiac. What is unique about the planets is that they speed up and slow down dramatically, and at some points they appear to stop, reverse direction around a loop, and then continue on in the original direction. The loops are referred to as retrograde motion. The reason for this strange apparent motion is the shifting perspective we have as our own planet orbits the Sun, combined with the different rates at which the other planets orbit the Sun. The different orbital rates of the planets, including the Earth, mean that these retrogrades are not going to happen for a given planet in the same Zodiac constellation each time.

Consider that major life and death decisions were made on the basis of astrological deliberations. If you are in the astrology business, no one is going to pay you for telling them that the Sun rose on time, or that Mars was in Capricorn last night when anyone can see that for themselves. The big money is not in knowing where every planet was last night but in knowing precisely where the Sun, Moon, and planets were on the day the king was born some 30 years ago (even if some of the planets were not visible on that night because they were on the other side of the Earth, in daylight).

Herein lies the big problem. Putting up some sighting stones to pin down the solstice is one thing. That is going to happen like clockwork year after year. Predicting or “post-dicting” where the planets will be in the sky on a date decades into the future or a date decades in the past is another thing altogether. And that “other thing” is the impetus or driving force behind what ultimately established modern science.

We stared at planets for thousands of years, carefully recording their strange but fascinating motion through the Zodiac. We built crude geometric models to predict where they were in the sky on any particular night in history. And we moved armies and chose spouses based on what we thought the positions of the planets told us about our destinies.

But with all that serious life and death planet-gazing, we never in thousands of years figured out that the Earth is also a planet and that all the planets orbit the Sun. The questions I would like to answer for you are:

1. Why and how did we suddenly develop dead certainty about the Sun-centered makeup of the solar system about 250 years ago?
2. What took us so long to get there?

Answering the first question will tell us why we can develop dead certainty about any scientific theory. Answering the second question helps us answer the first, because if there was a scientific revolution, what was it revolting from? (Hint: The answer to question #1 is not the telescope or Galileo’s testimony, or even Copernicus’s work).

The Audacity of Induction

applefinch — Tue, 08 Nov 2016 03:05:48 +0000

A zealous Intelligent Design proponent once offered the following proof of design in the universe.

1. Every beautiful thing we have ever seen was produced by a creative being.
2. The universe contains beautiful things.
3. Therefore, those things [that is, the beautiful things in the universe] were produced by a creative being.

While there are so many things wrong with this logic, my first comment to him was to say that #1 needed to be proven first. He answered with, “#1 is arrived at through Induction, which is a valid logical operation.” He offered that almost all scientific theories contain premises arrived at through Induction. He is correct on both counts: induction is a valid logical operation and it is unavoidable in scientific theories. But he is wrong about how it is applied.

What is Induction

“What is good for the goose is good for the gander” would be an example of inductive reasoning. But what is more interesting is universal induction, such as “what is good for these few geese is good for all ganders”. Universal Induction is when we take what we believe is true for a limited set of cases and apply it to all other similar cases (a few geese, all ganders).

If induction seems a bit sketchy to you, your instincts are good. It seems a bit audacious to make just a few observations, then wave your hands, make a bold claim about everything, and just walk away. But consider something like geometry. We can create geometric proofs about a perfect circle and declare them true for all perfect circles by induction. For example, “The longest line that can be drawn between two points on a circle goes through the center of the circle. This is true for all perfect circles.” We can do this because a perfect circle is something that we have characterized completely, such that all perfect circles have the exact same properties.

But when we come back to making claims about nature, we don’t get to make the definitions. Our claims about nature are not absolute like they are in geometry, rather they are “empirical” meaning all we can do is observe nature with our senses and hope we are finding something universal in those observations. What we can observe about a few geese may not necessarily be true about all geese. Geese come from nature, not from our own definitions. All we can do is try to discover the universal essence of geese-ness through observations.

Why is induction necessary in science? Because we will never be able to observe everything for all time everywhere in the universe. If your theory is about geese, you will never be able to exhaustively prove it by examining all geese that have ever lived, that are now living, and will live in the future. But how can science be so successful if most of it is based on what seems to be a house of cards of exaggeration? It would seem that one could “prove” just about anything by just declaring it “induction!”

The answer to this lies in risk. Science will only allow inductive premises for those things that have a possibility of failing if the induction is wrong. Although you cannot test all geese, any universal claim you make about geese can be proven wrong by just one goose that does not live up to your claim. Suppose you have observed that all the geese in your town are white. Let’s call that the “range” of your observations that support your geese theory. By induction you impose your observation on a much larger “domain”: let’s say that all geese everywhere, ever, are white.

Once you do that, you can test your geese theory by observing more geese. Each new white goose you observe adds more confidence to your theory. Each successful new observation moves one more goose from the domain to your range of actual observations. But it would take only one observation of a non-white goose to explode your theory completely. The important thing here is that there is a huge imbalance of audacity vs risk.

In science, our geese theory would be considered a kind of prediction engine that predicts the color of all geese everywhere. Prediction is used in a formal sense in that our predictions have to hold for all new observations whether they are about geese in the past, present, or the future.
It would be considered a scientific theory because the prediction is said to be “falsifiable”, meaning it can readily fail if the theory is wrong. Scientific theories are engines of falsifiable prediction. Our geese theory is falsifiable because it forbids geese of any color but white. And so we build confidence fastest in risky, falsifiable theories that demand things and forbid things in their predictions but manage to “fail to fail” when predicting the outcome of ever more novel observations in their subject matter.

This requirement for testing a theory on the basis of its falsifiable prediction allows for our initial theory to be wrong but detectably wrong as it starts to fail in its predictive power. A theory that fails predictions is one that can be improved or replaced over and over again until it no longer fails for the moment. Then we continue the process forever. We don’t only do this ourselves. What scientists do is to publish their work in professional journals with enough explanation to allow others to test the predictions on their own. So if our range is all geese in our hometown, we can enlist professional geese watchers all around the world to try to find a non-white goose. Those scientists in turn publish their findings in professional journals and cite our original article. And the science community keeps track of all of those articles and citations. In this way a public record for predictive range and success accumulates over time from many different independent observers.

If you want to examine that public record at any time, you can do a literature search in the professional journals and you will find references to all the articles that cite our article on geese theory. At any time we can examine how well the theory is holding up all around the world.

This is what gets us from a flat Earth to a spherical Earth to an egg-shaped Earth in our ever improving theories about the universe. Bold theories are proposed and then thrown against the wall of nature day in and day out to see if they continue to stick. Any slight discrepancies in what they predict vs what we see is seen as an opportunity by scientists to perhaps discover something new, even at the expense of this theory that has been so successful so far. You won’t win a prize for reporting the next white goose, but your name may be made on being the first to discover a non-white goose.

Our acceptance of scientific theories always has to be provisional because of this. But while that looks like a weakness, risk and falsifiable prediction are the rubber meeting the road of nature. It’s what gives us traction to climb the hill of better and better explanations for things we see in nature.

Finding Dawkins’ Weasels

applefinch — Sun, 22 May 2016 17:05:23 +0000

Back in the days of the Apple II, biologist Richard Dawkins wrote a small BASIC program, called Weasel, that demonstrated how the random mutation and selection in evolution could generate information. The program used “critters” whose DNA was a string of characters. The program is “dirt simple” but succeeds in achieving its goal after only about fifty generations. The goal is for the random mutation and selection process to end up with the given target string. Dawkins’ choice of string was a line from Hamlet, “Methinks it is like a weasel.” I wrote a version in TypeScript so I could embed it in a blog article.

In the following demonstration, simply hit Reset and then Run and watch the program converge on the target string. (Sorry, it only works in recent versions of Chrome. I will update it when I fix the IE and Edge problems).

Dawkins’ Weasels

%CODE1%

Some Intelligent Design proponents criticized the value of Dawkins’ Weasel program on a number of points that were readily answered by Dawkins and others. However, one such criticism is worth discussion and that is that the program itself already contains the information that is supposedly created by the process. Although the program is carefully written so that the random mutation process does not have access to that information (only the selection process does), the drama of the program is somewhat undermined for those who don’t appreciate the significance of that.

Recently, I set out to write a new Weasel program that generates what the ID community would call Complex Specified Information (which they insist is a signature of something produced by an intelligence) where the information is not present in the program itself. I call the program Steiner Weasels, because the program ends up generating Steiner Trees between a set of given points using generations of weasels that are randomly mutated and selected based on criteria that are much simpler than a Steiner Tree.

In the following demonstration, hit Reset, then Run and watch the weasels create Steiner Trees whose specification is not contained in the program. Let it run for about 150 generations. Reset and Run it a few times to see how it deals with different layouts. Also, feel free to hit Earthquake a few times while it is running to see how it scrambles to adapt to new layouts.

Steiner Weasels

%CODE2%

If you would like to know more about Steiner Weasels, I invite you to read an in-depth explanation in Shakespeare, Evolution, and Weasels.

Shakespeare, Evolution, and Weasels

applefinch — Sat, 30 Apr 2016 17:00:25 +0000

Note: This article contains a number of interactive demonstrations which only work in the later versions of Chrome. They do not yet work in Internet Explorer or Edge.

Monkeys With Typewriters

How long would it take for an infinite number of monkeys banging on typewriters to come up with the complete works of Shakespeare? Most of us would probably answer with “never” and dismiss the question as one of those rhetorical questions that interest people with too much time on their hands. But suppose we lower the bar somewhat and ask the same question about a single monkey and a single line from a Shakespeare play.

So let’s start over and ask how long it would take for a roomful of monkeys to come up with the line, “Methinks it is like a weasel.” (Hamlet: Act 3, Scene 2)? This seems more manageable, and perhaps more interesting, because we can almost imagine the possibility that monkeys pounding away on typewriters might manage to come up with a particular line of Shakespeare that has only 27 characters (including spaces and punctuation). It turns out that if you gave a monkey a typewriter and he managed to stay focused long enough to randomly hit 27 keys, the chances of him producing the weasel passage are 1 in 10^53 (1 followed by 53 zeros). It seems that a monkey would not live long enough to accomplish this.
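The arithmetic behind a figure like that can be sketched in a few lines of Python. The article does not say how many keys the typewriter has, so the 92-key value below is an assumption, chosen because it gives odds close to 1 in 10^53 for 27 keystrokes. The names `KEYS`, `LENGTH`, and `odds_exponent` are illustrative only.

```python
import math

# Assumed values: the article states the string length but not the key count.
KEYS = 92      # hypothetical number of distinct keys on the typewriter
LENGTH = 27    # number of keystrokes, as stated in the article


def odds_exponent(keys: int, length: int) -> float:
    """Return x such that the chance of typing one specific string is 1 in 10**x."""
    # There are keys**length equally likely strings; take log10 of that count.
    return length * math.log10(keys)


if __name__ == "__main__":
    x = odds_exponent(KEYS, LENGTH)
    print(f"Chance of one specific string: about 1 in 10^{x:.0f}")
```

The exact exponent shifts with the assumed number of keys and the number of characters counted, but any realistic choice leaves the odds far beyond what a monkey's lifetime of typing could cover.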

But what is so special about that particular line of Shakespeare? Is the monkey less likely to produce that line than any other string of 27 characters? The answer is no, not really. Choose any string of 27 characters, such as “uvm(%%$ejis &^ mcn..?/[!}”, and the chances are the same 1 in 10^53 that the monkey will produce that particular string. So mathematically, there is nothing special about the weasel passage or any other 27 character string.

Shakespeare from Space

Intelligent Design proponent William Dembski offers a different approach to this. He asks us how we would react if we received a signal from deep space that had a phrase like, “Methinks it is like a weasel.”, compared to how we would react if we got some random 27 elements of gibberish. In the first case we would be astounded, whereas in the second case we would not be surprised at all. His point is that although either string has the same mathematical probability, the weasel passage is recognizable by a criterion that stands outside of the mathematics. It is not only a specific string, but it is specific to the human English language. It is not only specific but it has a “specification”, which in this case would be “a line of Shakespeare.” His opinion is that an unlikely sequence or arrangement of things that can be described by a brief external specification is unlikely to arise (repeatedly) out of random natural processes or laws of nature, and must be a sign of intelligent origin.

Dembski’s Complex Specified Information

A line of Shakespeare, a line of English text of any sort, the carving of a human face, or a sequence of the first 50 prime numbers, are all examples of sequences or arrangements that have an easily identifiable specification. Dembski adds that when a sequence or arrangement of things is both highly complex and readily “specified” as we have just seen, the chances of it occurring without the benefit of guidance by an intelligence are vanishingly small. For example, the complete works of Shakespeare are far more complex (they have more integrated moving parts) than the one weasel passage, and they can be readily specified as “the works of Shakespeare.” Dembski has defined this combination of complexity and specification as Complex Specified Information, or CSI. If the monkeys produced the works of Shakespeare or we received the works of Shakespeare in a signal from deep space, he says we would have to conclude that there was some form of intelligent guidance involved somewhere. According to Dembski, a high degree of CSI is a “signature” of the involvement of an intelligent agent.

CSI in DNA

Dembski then turns our attention to living creatures and shines his CSI flashlight on the information that we find in the gene sequences of DNA. Genes consist of specific arrangements of four types of nucleotides that specify the structure of a specific protein that the cell should produce for a specific function. Dembski says that gene sequences are both complex and specified to such a large degree that their arrangements could not have come about through any natural processes. He insists that an intelligence must have guided the formation of the information in DNA.

He asks how a process like evolution, which uses chance mutations that are as random as monkeys with typewriters, could produce such complex information that specifies something as nuanced and functional as the human eye, for example. The usual answer from biology is natural selection. But for some that answer is not very satisfying, since all natural selection does is eliminate information. If natural selection is the “survival of the fittest”, how can a random process bring about the “arrival of the fittest”? Or using the monkeys analogy, how could simply selecting out and discarding the pages that are not scenes from Shakespeare ever help the monkeys produce anything but gibberish?

Dawkins’ Weasels and CSI

Way back in the days of the Apple II, evolutionary biologist Richard Dawkins wrote a simple computer program to demonstrate how random mutation and natural selection (RM + NS) could produce complex specified information. His program is often called the Weasel Program because the goal of the program was to use random mutation and selection to generate the line from Hamlet, “Methinks it is like a weasel.” WEASEL works as follows:

1. Generate a string of random characters and call it the Parent String.
2. Make 1000 copies of the Parent String and call them the Children Strings.
3. Go through each of the child strings one by one and make a small number of random mutations. The mutations happen at random places in the string. They can add a randomly chosen character, substitute one character for another randomly chosen character, or simply delete a character.
4. Compare each child string to the target string, “Methinks it is like a weasel.”
5. Select the child string whose mutations have brought it closest to the target string.
6. Declare the selected child string to be the new Parent String.
7. Repeat steps 2 through 6 using the new Parent String.

The important thing to keep in mind is that the mutations that occur in step 3 happen randomly, with no foresight about the target string about weasels. After that step is complete, each child string is still very much like its Parent String, but each has a few of its own particular random variations. The selection process for choosing the “fittest” child string happens without knowing how the child strings were produced. It simply selects from what it is given by comparing each child with the target.
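The steps above can be sketched in Python. This is a minimal sketch, not Dawkins' original program and not the TypeScript demo on this page. It assumes substitution-only mutations (no insertions or deletions), an alphabet of letters plus space and period, and a 4% chance of mutation per character. The names `ALPHABET`, `MUTATION_RATE`, `mutate`, `score`, and `evolve` are illustrative.

```python
import random
import string
from typing import Tuple

TARGET = "Methinks it is like a weasel."
# Assumed alphabet: letters, space, and period (enough to spell the target).
ALPHABET = string.ascii_letters + " ."
CHILDREN = 1000        # copies per generation, as in step 2
MUTATION_RATE = 0.04   # assumed per-character chance of a substitution


def mutate(parent: str, rng: random.Random) -> str:
    """Step 3: copy the parent with random substitutions; never reads TARGET."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < MUTATION_RATE else ch
        for ch in parent
    )


def score(candidate: str) -> int:
    """Step 4: count the positions where the candidate matches the target."""
    return sum(a == b for a, b in zip(candidate, TARGET))


def evolve(seed: int = 0, max_generations: int = 1000) -> Tuple[str, int]:
    """Run steps 1 to 7; return the final string and the generations used."""
    rng = random.Random(seed)
    # Step 1: a random Parent String of the same length as the target.
    parent = "".join(rng.choice(ALPHABET) for _ in TARGET)
    for generation in range(1, max_generations + 1):
        children = [mutate(parent, rng) for _ in range(CHILDREN)]
        # Steps 5 and 6: the closest child becomes the new parent.
        parent = max(children, key=score)
        if parent == TARGET:
            return parent, generation
    return parent, max_generations


if __name__ == "__main__":
    final, generations = evolve()
    print(f"{final!r} after {generations} generations")
```

Note that `mutate` never looks at `TARGET`; only `score` does. That mirrors the separation between the random mutation step and the selection step.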

Since you have had the persistence to get this far, your reward is a demonstration of Dawkins Weasels right here in your very own browser. In the demo window below, you may enter a target string of your own or leave it as it is. Press Reset, then press Run. The program will cycle through a number of generations to evolve the initial random string into the target string (each generation is numbered to the right of its fittest string.) When it has found its weasel, you can hit the Stop button or let it run for a while to see what happens (nothing happens).

Scenario 1: Dawkins’ Weasels

%CODE1%

Notice that for a string about the size of the weasel passage it achieves its goal in somewhere between 30 and 60 generations. Pretty good for such a simpleminded process. But how did it get to its goal so fast when we already established that no number of monkeys could do the same thing in their lifetime? Does the weasel program cheat? The answer is yes, but it “cheats” in the way that evolution cheats. The secret is that there is a step in the weasel scenario that is not in the monkey scenario. That would be step 6, where each “best” child becomes the parent of the next generation, allowing the next generation to start with a better string than the last one. That way, the string for each new parent ratchets towards the goal with small but relentless steps.

Weasels Cheat at Poker

In biology, this is called Inheritance. It cheats like you would be cheating if you kept the best cards in your pretty good poker hand for use in the next game, improving on the hand each game through discards and the random replacement cards from the top of the deck. In only a few games you could build yourself a Royal Flush.

WEASEL is dramatic in that such a mindlessly simple program running on an Apple II demonstrates that a random process followed by a selection process can generate CSI repeatedly. However, Dawkins’ critics in the Intelligent Design movement raised a number of objections claiming that the program cheats in other ways that are not found in the process proposed by the theory of evolution. Most of the objections were inconsequential, but one is interesting enough to spend some time on since it gets to the heart of another question that Intelligent Design proponents ask, which is “How can a process unguided by intelligence produce information?”.

Are Dawkins’ Weasels Guided?

The complaint is that the WEASEL program has the actual target string built into it. In other words, the information that the program is supposed to be generating through RM + NS is already built into the program. As such, it does not answer the question about how information can be generated by an “unguided” process such as evolution. The first answer to this objection is that the program is carefully written so that the process that produces mutated children from the parent works in complete isolation from the process that evaluates each child with respect to the target string. In other words, there is an information barrier such that information only flows out of the genes of the children and never flows in from the outside world.

Imagine two black boxes: a Copy Box and a Selector Box. You place a weasel in the Copy Box, a bell rings, and you take out the 1000 children it has produced (all with slight random variations). You place the 1000 children in the Selector Box which selects one child weasel and disposes of all the others. You take out the selected child weasel and put it into the Copy Box. Rinse and repeat.

The Copy Box doesn’t know or care what criteria the Selector Box used to select the one child weasel, because the important point is that there is no information about those criteria flowing from the Selector Box to the Copy Box. The child-producing process in the Copy Box is not informed by any information from the outside world. It simply waits for someone to give it a weasel and then proceeds to make 1000 copies of that weasel. But since the Copy Box is not perfect, the copies are not exact replicas of the original weasel. They have a small number of random variations in their weasel strings. The process that produces the variations is no more guided than the bounces of a well-made pair of dice thrown against a wall.

Introducing Steiner Weasels

So what happens if we model the same process with a computer program that does not have the target built into it? Is that possible, and will it be able to produce CSI? If CSI is defined as information that can be seen to have a separate specification, how can a simple computer program generate that kind of information if it does not already contain the specification?

Having got this far, please be entertained for a moment by running the next demo Scenario 1 below. Hit Reset, then Run. After it seems to have settled down, hit Stop and read the explanation below it.

Scenario #1: Weasels Find Some Food Sources

%CODE2%

The remaining scenarios are all various versions of a weasel program of my own called Steiner Weasels. The back story is that Steiner Weasels are very dumb weaselly creatures that move across a field in straight lines, stopping for a moment to eat stuff on the ground. After eating, they then move in another straight line (often in another direction) for a while until they stop and eat again. The locations where they stop and eat are each determined by an individual gene in their DNA. In Scenario #1, their DNA contains five genes. (In all the remaining scenarios, the thin lines are the “paths”, and the small solid dots at the end of the lines are where the Steiner Weasels stop and eat.)

Steiner Weasels burn calories as they move along, in proportion to the length of their travel. The longer they travel between eating stops, the more calories they burn. Not surprisingly, Steiner Weasels acquire calories when they stop and eat. The field on which they are moving and eating has a number of different kinds of food sources. At each stop it makes, a Steiner Weasel can only get calories from a different kind of food source, so if it makes three stops, it has to eat a different food type at each stop. (The food sources are denoted by the green circles.)

Finally, each different kind of food comes from its own source on the field, but the food is dispersed somewhat by wind and so forth, so a Steiner Weasel can get some amount of food of the required type at each stop even if it is not close to the source. However, the closer a food stop is to a particular source, the more food it can eat of that type when it stops. When the Steiner Weasels have traveled to all their stops, they try to reproduce, but only the Steiner Weasel with the most calories gets that privilege, with the rest of them dying off. That means that the surviving SW is the one whose total calories (cals consumed – cals spent) is the one who gets to reproduce. Like in Dawkins’ Weasel program, the surviving weasel produces thousands of children by replicating itself. And also like Dawkins’ program, the replication is imperfect such that the children end up inheriting the parent’s genes, but with a small number or degree of mutations.

Its important to note here that the Steiner Weasels are not intelligent. They make no decisions about where they stop and eat or how many stops they make. Those things are completely determined by their genes.

In scenario #1, the mutations that occur in the genes of the children are small random variations in the location of where they stop to eat. The scenario starts with a parent with three genes, each set to a random location. A few thousand children are then replicated from the parent, and then each one is mutated randomly. Then each child is examined for how many calories it spent and how many it acquired. The one with the most calories left over is the one that is chosen to be the next parent, and the remaining ones are killed off. This cycle of generations happens over and over until you hit the stop button.
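The generation cycle just described can be sketched in a few lines of Python. This is a minimal sketch and not the code behind the demo. The way food falls off with distance (`FOOD_AT_SOURCE`, `SPREAD`), the cost of travel (`COST_PER_UNIT`), the size of a mutation (`MUTATION_SIZE`), the function names, and the rule that stop number i feeds on source number i are all assumptions made up for illustration.

```python
import math
import random

# All constants below are illustrative assumptions, not values from the demo.
FOOD_AT_SOURCE = 100.0  # calories available right at a food source
SPREAD = 30.0           # how far the wind disperses food from its source
COST_PER_UNIT = 0.1     # calories burned per unit of distance traveled
MUTATION_SIZE = 2.0     # standard deviation of a random shift in a stop


def net_calories(stops, sources):
    """Calories eaten at the stops minus calories burned walking between them."""
    eaten = sum(
        FOOD_AT_SOURCE * math.exp(-math.dist(stop, source) / SPREAD)
        for stop, source in zip(stops, sources)
    )
    burned = COST_PER_UNIT * sum(
        math.dist(a, b) for a, b in zip(stops, stops[1:])
    )
    return eaten - burned


def mutate(stops, rng):
    """Copy a weasel imperfectly: each stop is shifted by a small random amount."""
    return [
        (x + rng.gauss(0, MUTATION_SIZE), y + rng.gauss(0, MUTATION_SIZE))
        for x, y in stops
    ]


def next_parent(parent, sources, rng, n_children=1000):
    """One generation: copy with mutation, then keep only the best child."""
    children = [mutate(parent, rng) for _ in range(n_children)]
    return max(children, key=lambda child: net_calories(child, sources))


def evolve(sources, generations=100, n_children=1000, seed=0):
    """Start from random stops and repeat the copy-and-select cycle."""
    rng = random.Random(seed)
    parent = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in sources]
    for _ in range(generations):
        parent = next_parent(parent, sources, rng, n_children)
    return parent
```

Note that `mutate` never sees `sources`, and `next_parent` never looks at how a child was altered. It only compares the leftover calories, which is the same separation as between the Copy Box and the Selector Box.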

The location of the food sources on the field is chosen randomly when the Reset button is pressed and the parent weasel is established. The weasel world starts (or resumes) its generation cycles when the Run button is pressed. Reset and Run it again and watch the food stops acquire food sources and slowly move towards them to get as close as possible. You might notice that some stops get closer to the source than others. This is because the selection for most calories is somewhat of a tradeoff. The closer to a food source the more calories acquired, but it also makes the path longer which increases the calories spent getting to the source.

The Earthquake Button. I am guessing you already pressed it, being the curious reader that you are. The Earthquake button causes a random number of food sources to shift in location by random amounts. After the weasels have found their food sources, while the world is still running, hit the Earthquake button a few times and watch the weasels adapt to their newly relocated sources.

Without further ado, Scenario #2 is the same as scenario #1, but a new mutation type has been added where new genes can be added or deleted randomly. Since each gene specifies the location of a food stop, the weasels evolve additional genes until there is one gene per food source so all of the sources can be covered. Any children who evolve more genes than food sources will be less fit, because they will waste calories moving along paths to stops that do not produce any calories.

Run scenario #2 and watch it work until it settles down, then hit the Earthquake button a few times and watch the weasels evolve new food stops and match the new food source locations.

Scenario #2: Weasels Find All the Food Sources

%CODE2%

The astute reader might have realized that so far Steiner Weasels are not much better than Dawkins’ Weasels when it comes to a built-in target. The astute reader would be correct. Although the program generates food sources and places them in random locations, once that happens, the program has a fixed target to work with. But have patience. The first few scenarios are meant to introduce you to the back story of Steiner Weasels and how their world evaluates them for fitness.

In the previous scenarios, the SW children improved their fitness by evolving food stops that were closer to their food sources. But the paths that they take from one source to another look wasteful in terms of calories. If you recall from the description above, the DNA of a weasel contains a set of genes, each of which contains the location of a food stop. In addition to a food stop, each gene also contains information about the paths that a weasel follows from stop to stop. If you run any of the SW scenarios above, you can see that the paths form a tree-like structure, where a given stop might have one or more paths to other stops.

Here is where it starts to get interesting. If we add another mutation type that switches a path from one randomly chosen stop to another randomly chosen stop, the total length of the route a weasel takes will get either longer or shorter as a consequence. Since a shorter route means fewer calories expended, child weasels can become fitter not only by getting as close to their food sources as possible, but also by traveling between the stops by a more efficient route. Run Scenario #3 and watch how it finds all the food sources and sorts out a very efficient route. It won’t always find the absolute most efficient route unless you let it run for a very long time, but it is amazing how quickly it approaches that ideal.
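A path-switching mutation of this kind might look like the sketch below. It assumes, purely for illustration, that the tree is stored as a list `parent_of`, where `parent_of[i]` is the stop that stop i is attached to and the first stop is the root. The real program's gene layout is not shown in this article, so the names and the representation are hypothetical.

```python
import math
import random


def tree_length(stops, parent_of):
    """Total length of all paths; parent_of[i] is the stop that stop i hangs from."""
    return sum(
        math.dist(stops[i], stops[p])
        for i, p in enumerate(parent_of)
        if p is not None
    )


def descends_from(node, ancestor, parent_of):
    """True if walking from node toward the root passes through ancestor."""
    while node is not None:
        if node == ancestor:
            return True
        node = parent_of[node]
    return False


def switch_path(parent_of, rng):
    """Mutation: reattach one randomly chosen stop to another randomly chosen stop."""
    mutated = list(parent_of)
    movable = [i for i, p in enumerate(parent_of) if p is not None]
    if not movable:
        return mutated
    child = rng.choice(movable)
    # Reattaching a stop below itself would cut it off from the root,
    # so only stops outside its own subtree are candidates.
    candidates = [
        j for j in range(len(parent_of))
        if not descends_from(j, child, parent_of)
    ]
    mutated[child] = rng.choice(candidates)
    return mutated
```

The mutation itself has no idea whether the new tree is shorter or longer. Only the calorie count in the selection step decides whether the change survives.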

Scenario #3: Weasels Optimize Their Routes to Food Sources

%CODE2%

Now for the $50,000 question. Where does the information in the genes come from about the most efficient route? There is nowhere in the program where that information can be found. Not only are the mutations of genetic information occurring randomly with no regard to how the results will be evaluated, but the selection process knows nothing about the possible routes, either. It simply subtracts the calories spent from the calories gained and chooses the child with the highest remaining calories.

Consider that the number of possible tree routes that can be drawn between all the food sources is astronomically high. There is the most efficient route, the least efficient route, and every route in between. Next, consider that for a purely random process, any given tree route between all the 15 food sources is equally likely, just like any combination of 15 letters from monkeys typing is equally likely. But where the number of possibilities is so high, it is better to say that they are also all equally unlikely. Just as the monkeys producing a line of Shakespeare by pure chance is extremely unlikely, so is the weasels producing the most efficient route between points by pure chance. (The answer is that the process of random variation and selection is not “pure” chance, but an elegantly simple ballet of chance and necessity.)

Furthermore, according to Dembski, what makes the line of Shakespeare special among all possible combinations of letters is that it has a “specification”. It can be specified as to its line number, scene, and act in a particular play, in a particular language, written by a particular author. Similarly, the most efficient route between the food sources also has a specification, which is the “minimal traversal tree” between a set of points (graph theory calls the version that uses only the given points a minimum spanning tree). In fact, the minimal traversal tree is extremely valuable information when it comes to shipping logistics, because profitability in a shipping company is maximized by minimizing fuel costs and delivery time.
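To make the idea of an independent specification concrete, here is a minimal sketch of Prim's algorithm, a standard textbook method for finding the shortest tree that connects a set of points using only those points (a minimum spanning tree). It is not part of the Steiner Weasels program. Unlike the weasels, it works with full knowledge of where every point is, and it does not add any shortcut stops. The function name is my own.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm: grow the tree by always adding the cheapest new edge."""
    if not points:
        return [], 0.0
    in_tree = {0}
    edges = []
    total = 0.0
    while len(in_tree) < len(points):
        # Cheapest edge from a point already in the tree to one not yet in it.
        length, a, b = min(
            (math.dist(points[i], points[j]), i, j)
            for i in in_tree
            for j in range(len(points))
            if j not in in_tree
        )
        in_tree.add(b)
        edges.append((a, b))
        total += length
    return edges, total
```

For the four corners of a unit square, for example, this returns three edges with a total length of 3.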

But wait, that’s not all. There is one final scenario that raises the bar to another level. This one is almost spooky. In Scenario #4 below, one more mutation type has been added, where a newly created gene with a random stop is inserted between existing stops. Run scenario #4 and let it settle down for a while, then read the commentary that follows it.

Scenario #4: Weasels Find Shortcuts With Steiner Points

%CODE2%

Looking at the results of running Scenario #4, you might have noticed that there seems to be something broken about it. It looks like some stops are far from a food source and show no signs of getting closer. On closer inspection you might notice that there are no free food sources and these wayward stops are extra stops. There are more stops than food sources. (If you did not get any extra stops, hit Reset and Run again and let it settle down. Usually two out of three runs produce at least one of these mysterious stops.)

The scenario is not broken. If you look more carefully, you will see that the weasels found some shortcuts. Although there is no calorie value in the extra stops (because all the food types have been consumed by other food stops), the extra stops shorten the lengths of a few paths. These stops are called Steiner Points and the weasels have created a Steiner Tree out of an ordinary tree. They had no idea they were creating a Steiner Tree, but they managed to do so because the addition of the extra points lowered the calories spent in moving from one food source to another.
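A small worked example shows why an extra stop with no food can still pay for itself. Take three food sources at the corners of a triangle whose sides are all 1 unit long (a made-up layout, chosen because the best extra stop is then simply the center of the triangle).

```python
import math

# Three food sources at the corners of a triangle with sides of length 1.
a = (0.0, 0.0)
b = (1.0, 0.0)
c = (0.5, math.sqrt(3) / 2)

# Ordinary tree: connect the corners directly, using two of the three sides.
ordinary_tree = math.dist(a, b) + math.dist(b, c)

# Steiner tree: add one extra stop in the middle and connect every corner to it.
extra_stop = ((a[0] + b[0] + c[0]) / 3, (a[1] + b[1] + c[1]) / 3)
steiner_tree = sum(math.dist(corner, extra_stop) for corner in (a, b, c))

print(round(ordinary_tree, 3))  # 2.0
print(round(steiner_tree, 3))   # 1.732
```

The extra stop cuts the total path length from 2 to about 1.73, a saving of roughly 13 percent in calories spent, even though nothing is eaten there.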

It’s important to remember that in all the Steiner Weasel scenarios, the process that copies and mutates genes is completely isolated from any information about where the food sources are, or even how many calories are acquired and spent. The process that evaluates and selects for the fittest weasel has no information about how the genes in the weasels are altered. And yet after about 100 to 150 generations, the genes of the fittest weasel have values that put that weasel closest to each food source and following the most efficient route, including a few shortcuts.

Where Does the Steiner Tree Information Come From?

Where does that information come from in the genes of that weasel? Since the processes that alter the contents of the genes have no knowledge of where the food sources are, or of optimum paths, or of how to make shortcuts, one can say that the information is newly “created” in the genes. That would be reasonable from the point of view of the gene itself because the genes are closed off from information coming in from the outside. They can only mutate their own contents in random amounts.

But another way to look at it is that the genes are just a kind of storage medium for information, whereas the process itself (variation, selection, inheritance) is where all the action is. One could say that the process “discovers” the information and stores it in the genes. But that is not quite right, since the information about the optimum route between the food sources, or information about where to place shortcuts is not anywhere in the program or its environment.

Consider that if I gave you fifteen different random points on a graph, there is no place you could go to look up on the Internet or anywhere else what the optimum route is between them. There is no mathematical formula where you could just plug in the numbers and get the answer. You would have to figure it out by trial and error using lots of intelligence with full knowledge of where the points are on the graph. So it is difficult to say that the process discovers the right information, because the information is not hanging around to be discovered.

My answer is that, given that the only information available is the location of the points, the process discovers something that is a logical consequence of that information.

Whatever the answer is about where the information comes from, it is a mistake to look at only one part of the process. It does not come from the pure chance of random mutation. It does not come from selection. And it does not come from inheritance. It comes as a result of all three of those steps repeated over and over again.

Finally, can we say that this extremely simple process generates Complex Specified Information? Yes we can. It manages to generate something highly unlikely, which is a Steiner Tree between a set of points. The chance of getting a Steiner Tree by pure chance is vanishingly small, whereas this process generates them over and over again. Is the information “complex”? You could argue that with only fifteen points, the information is not very complex. But except for computing resources, the process can deal with any number of points. I chose fifteen because it was enough points to be interesting, but not so many that it took forever to produce results.

Symbols, Codes, and Information

applefinch — Sun, 22 Nov 2015 14:59:53 +0000

Shannon’s Theory of Information

Information is Not Meaning. It Symbolizes Meaning.

What does it mean to have a theory of information? How can information be studied, categorized or quantified? How can we measure the information in a Shakespeare play or a sad message you receive about the death of a close friend? The answer starts with realizing that information is not the same as meaning. Where meaning is something experienced by a sender or receiver of information, information itself is something more basic. In the realm of human experience, information is to meaning as the motion of objects is to classical ballet. When humans communicate information they do so symbolically, whether by sound, the written word, gestures, and even art or music. The symbols themselves don’t actually contain meaning but rather select meaning out of a shared context between the sender and the receiver. (I recommend reading my previous article on Information and Meaning if you need more clarification).

It was this realization that led mathematician Claude Shannon to develop an entire theory of information. Shannon was working for Bell Laboratories in the mid 20th century on a project to solve an engineering problem about reliable communications on noisy telephone lines. He not only solved that problem, but in the process ended up founding the modern science of information theory on which we have based all our modern communications and computing technology. But Shannon’s theory reaches beyond engineering to go deep into quantifying information as something as fundamental to the universe as energy or matter. It would not be wrong to think of Claude Shannon as the “Isaac Newton” of information. Where Newton developed a “System of the World” for motion all throughout the universe, Shannon has done the same for information.

Information Reduces Uncertainty By Answering a Question

Shannon’s theory is based on just a few simple concepts. First, the function of information is to inform. Shannon says that we are informed to the degree that our uncertainty about something is reduced. Uncertainty is not knowing the answer to a question. Information reduces uncertainty by answering a question. Think of uncertainty as a hole, and information as that which reduces the size of that hole.

Uncertainty is Related to the Number of Possible Answers to a Question.

Complex questions can be broken down further into simpler sub-questions until you reach the point where the simplest questions have only a certain number of possible answers. Let’s call that number N. For example, suppose you wanted to know the exact date that the first shot was fired in the Civil War. That question can be broken down to separate questions about the day of the week, the month, day of the month, and the year. The number of possible answers for the day-of-week question is seven, so in this case N = 7. Suppose you only have a vague idea for when the war began. Let’s say that for the day-of-week question, you are completely clueless. Although there are only seven possible answers, they are all equally probable as far as you are concerned. This represents the maximum uncertainty for this question for this case where the number of possible answers is N = 7.

Although you are maximally uncertain about both the day of the week and the month the shot was fired, Shannon would say that the uncertainty about the month is higher than the uncertainty about the day of the week, because any particular month has a one in twelve chance of being the answer, whereas any particular day of the week has a one in seven chance of being the answer. Where all answers are equally probable, the question with the highest N has the highest uncertainty. [!fnYou can get a better feel for this by considering the extreme case where one question is whether it was day or night when the shot was fired (N = 2), and the other is the name of the person who fired the shot (N = some very large number). As you might be completely clueless about the answer to either question, you can understand why the uncertainty about the name of the shooter is far higher than the uncertainty about day or night./fn]

The Possible Answers Are Represented by a Set of Symbols That Make Up a Code.

Shannon says that if each question has a particular number of possible answers (N), the answers to a particular question can be represented by a set of N unique symbols. It doesn’t matter what the symbols are as long as there are N symbols that can be distinguished from each other by both the sender and the receiver of information. The sender needs to be able to select and send a symbol that the receiver can associate with one of the possible answers. The set of N symbols that represent the answers to a question is called a Code. [!fnFor example, one code would be English words. The list of symbols in the Code for the day and night question could simply be [daytime, nighttime]. Similarly, days of the week could be the usual English [Mon, Tue, …, Sun] and so forth. The Code for the month might be [Jan, Feb, Mar, …, Dec]./fn]

The Smallest Code With Information has Two Symbols

Consider a question that has only one answer, where N = 1. Your uncertainty about that answer is zero, so a message containing the one symbol in the code will not reduce your uncertainty. A question with two answers, however, has the possibility of uncertainty, such as the daytime-nighttime question above. Any information-carrying code has to have at least two symbols.

Information Can Be Measured in Bits

You can represent the symbols in a code by numbering them from 0 to N-1. The Code for [Daytime, Nighttime] can be represented by [0, 1], for example. If you encoded that in binary bits, you would only need one bit, since one bit can be either 0 or 1. The answer to that question can be sent with one binary bit set to either 0 or 1. A question that has more possible answers needs more bits. A Code for the day of the week needs seven symbols that could be numbered 0 through 6. That range of numbers would require 3 bits [!fn Three binary bits can encode the numbers 0 through 7 with the following sequences [000, 001, 010, 011, 100, 101, 110, 111]/fn]. The general rule is that where the number of symbols in the code is N, the number of bits is I = log2(N) [!fn log2(N) is the log base 2 of the number N /fn], rounded up to a whole number when every symbol is sent as a fixed number of binary digits. When all N symbols are equally probable, this is the average amount of information in a symbol in a Code of N symbols.
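The rule can be checked with a few lines of Python. This is just an illustration of the arithmetic, and the function names are my own.

```python
import math


def information_in_bits(n_symbols):
    """Average information per symbol when all N symbols are equally probable."""
    return math.log2(n_symbols)


def whole_bits_needed(n_symbols):
    """Smallest number of binary digits that can number the symbols 0 to N-1."""
    return (n_symbols - 1).bit_length()


for name, n in [("day or night", 2), ("day of week", 7), ("month", 12)]:
    print(name, round(information_in_bits(n), 2), whole_bits_needed(n))
```

For the day of the week, the information is about 2.81 bits, which has to be rounded up to 3 whole bits if each answer is sent as its own fixed-length binary number.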

The Possible Answers Are Not Always Equally Probable

In some cases the receiver might not consider all possible answers as equally probable. Consider a weather application that is telling you the current temperature. Since you are fully aware of the season or can see that it is snowing outside, not all possible temperature messages from the weather service are equally probable to you, as temperatures near freezing in this particular case are far more probable than a temperature of 90 degrees. A weather application could exploit this by knowing the average daily temperature for each day of the year, averaged over the last five years, and sending only the differences from that average from the service to your phone. The code could use fewer bits for smaller differences, coming closer to an “optimum” code.

This is much more than an engineering exercise that saves you money on your phone’s data plan. It actually represents the amount of information carried by each symbol in the Code. Since your expectation increases for values closer to the average daily temperature, receiving one of those values has less influence on your already low uncertainty. This is consistent with what was stated above, that the amount of information carried by a symbol in a Code is directly related to how much uncertainty it reduces in the receiver.
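The standard way to put numbers on this is to say that a symbol with probability p carries -log2(p) bits, and that the average over all symbols in a Code is its Shannon entropy. The sketch below uses a hypothetical Code of four temperature bands with made-up probabilities, just to show that an expected answer carries less information than a surprising one.

```python
import math


def surprisal(probability):
    """Bits of information received when an outcome of this probability occurs."""
    return -math.log2(probability)


def average_information(probabilities):
    """Shannon entropy: the average surprisal over all symbols in the Code."""
    return sum(p * surprisal(p) for p in probabilities if p > 0)


# Hypothetical Code with four temperature bands (probabilities are invented).
clueless = [0.25, 0.25, 0.25, 0.25]  # every band equally probable
snowing = [0.70, 0.20, 0.08, 0.02]   # near-freezing bands far more probable

print(average_information(clueless))
print(round(average_information(snowing), 2))
print(round(surprisal(0.70), 2), round(surprisal(0.02), 2))
```

When you are clueless, each message carries the full 2 bits. When you can already see the snow, the average drops, and the message you expected carries far fewer bits than the one you did not.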

Not All Symbols Carry Information

Consider two different messages, one saying “Last night’s lowest temperature was 10 degrees F”, and the other saying “Last night’s lowest temperature was a frosty 10 degrees F.” The addition of the word “frosty” adds no additional information for a human receiver. If the person already has no uncertainty about what 10 degrees feels like, the word “frosty” does not reduce it further, so it contains no information. A Code could be very inefficient in that way, using far more bits than the minimum required to reduce the uncertainty.

Information Always Comes in a Message Containing Symbols Selected From Codes.

Although English words are used for both pieces of information, “daytime, Tuesday” would be a message that answers two questions, with two Codes. The first symbol comes from a Code of two possible symbols, and the second from a Code of seven possible symbols. Shannon says that information is always carried in a message that contains one or more symbols that are part of one or more Codes answering one or more questions.

Information is a Material Thing

This might come as a surprise to the reader, but the consequence of all of this is that no information moves in the world without involving energy or matter. There is no way you can send a symbol to me without using a pulse of energy or moving some matter somewhere. Bits are more than just a mathematical notion. You cannot transmit or store even one bit without storing it as energy or as a piece of matter. If this were not true, you would never need more computer memory or disk space, and you would not have to pay your Internet provider a usage fee for each megabyte of data you sent or received.

This became such a powerful fundamental observation that it led to solving such problems as how many extra bits you might need to use to overcome random noise in a communications channel or in imperfect persistence in computer memory. It even led to linking Shannon Information to other fundamental properties in the universe such as thermodynamics.

Information Does Not Require an Intelligent Sender or Receiver

Here is the most dramatic outcome from Shannon’s theory. Consider that if Shannon Information is not meaning but symbolizes meaning, and that it informs by reducing the receiver’s uncertainty as to which symbol in a Code was selected by the sender, and if it can be specified, stored, and transmitted in a certain number of bits, then Shannon Information is not something that only pertains to human communications. It turns out that Shannon Information is everywhere in nature. We can find it, isolate it, and quantify it just like we can do for a message between two humans. In Shannon’s Information Theory, there is no requirement for an intelligent sender or an intelligent receiver of information. In fact, naturally produced information is flowing all throughout the universe.

Information and Meaning

applefinch — Sat, 19 Sep 2015 18:33:41 +0000

What would you think if you received a text message that said:

taH pagh taHbe

You would probably suspect that someone “butt dialed” that message by mistake. However, it might amuse you to know that this message (call it message #1) is one of the most recognized lines from Shakespeare, written in Klingon. The line is from Hamlet’s soliloquy in Shakespeare’s play Hamlet. (Yes, there is an actual language called Klingon, developed by people related to the production of the well known Star Trek movies.) The well known line in English (message #2) is:

To be, or not to be

This line is spoken by Hamlet himself in Act III, Scene 1, as he ponders one of life’s deepest questions about the meaning of life. The full context of the line is:

To be, or not to be: that is the question:
Whether ’tis nobler in the mind to suffer
The slings and arrows of outrageous fortune
Or to take arms against a sea of troubles,
And by opposing end them?—To die,—to sleep,—
No more; and by a sleep to say we end
The heartache, and the thousand natural shocks
That flesh is heir to,—’tis a consummation
Devoutly to be wish’d.

The Ultimate Meaning

For Hamlet the question of being or not being is one of life or death, because the answer he is getting is that life itself does not have enough meaning to outweigh the inevitable misery, suffering, and injustice that accompany it. Hamlet is considering suicide. Through his plays, Shakespeare seemed to be able to open a window into the human condition, bringing out every facet of it at a time when the study of the human heart through psychology was hundreds of years away. If ever there was literature with meaning, this scene from Hamlet is heavy with it, where the meaning of the scene is all about the meaning of life.

Unless you were raised by wolves and had never heard of Shakespeare, the moment you hear that first line of the soliloquy there is no confusion about what he is talking about. The meaning is perfectly clear. If you received message #2 as a text from a thespian friend suffering from depression, you would immediately get on the phone or rush over to their house to perhaps avert a tragedy. But if message #2 is perfectly clear and laden with life and death meaning, why would message #1 be just nonsense to you? If the two messages are equivalent in regard to Shakespeare’s profound meaning, why doesn’t message #1 evoke in you the same response?

The Ultimate Meaning in 19 Bytes

The answer is obvious, of course. You don’t speak Klingon. But beneath that obvious answer there is a more interesting question. If that line of Shakespeare is heavy with meaning, where did the meaning go in message #1? You might answer that the meaning is still there, but without a knowledge of Klingon, we don’t recognize it. But let me propose to you that neither message #1 nor message #2 actually contains any meaning. That might sound outrageous, but let me ask you how you think all that anguish about the human condition and the question of whether life is worth living could be contained in a set of about 12 little marks on a piece of paper or a computer screen. Consider that the characters in message #2 that come to you from the web server were stored in 19 bytes of memory (if you include the spaces). How much information about the meaning of life can be stored in 19 bytes of computer memory?
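The byte count is easy to verify. The snippet below assumes the messages are stored one byte per character (plain ASCII), which is an assumption on my part about how the web server holds them.

```python
message_1 = "taH pagh taHbe"       # the Klingon line as quoted above
message_2 = "To be, or not to be"  # the English line

for message in (message_1, message_2):
    stored = message.encode("ascii")  # one byte per character
    print(len(stored), "bytes:", message)
```

The English line comes out at 19 bytes and the Klingon line at 14. Neither number says anything about how much meaning the receiver will find in them.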

Meaning As Shared Context

The answer to that question is that message #2 does not so much ‘contain’ meaning as it does ‘select’ meaning. Each of the six words in that message is part of a vocabulary shared by you, me, and Shakespeare. When Shakespeare sat down to write that line, he had profound meaning in mind. His goal was to transmit that meaning to his audience. In order to do that, he selected words from our shared vocabulary, knowing that you and I and the rest of his audience already know the meaning of “to be”, and the meaning of “not to be”. When you or I hear or read the words “to be”, we associate those words with meaning that is already in our heads. If you are still uncertain about this, consider a line a bit further on in the same soliloquy, where Shakespeare writes,

who would fardels bear

Shakespeare didn’t write that line to make English class difficult for high school students, but in good faith that his audience knew what fardels are. The problem is that although Shakespeare and his audience shared a certain English vocabulary at the time of his writing, here in the present day our vocabulary does not completely overlap with the 16th century Elizabethan audience’s. So when we read or hear the word “fardels”, it doesn’t compute. We are unable to match it up with a meaning in our own minds. The word “fardels” means “burdens”, which highlights the fact that it’s not the meaning of the word that confused us but our lack of that word in our vocabulary. We know what burdens are, but we just didn’t recognize the word itself.

Words Are Symbols For Shared Meaning

Now we can safely say that the words in message #2 and their meaning are separate concerns. The words don’t contain meaning but merely symbolize meaning. The words are symbols. And given that the words and the meaning are separate, any set of symbols will do for me to send information to you as long as both the meaning and the set of symbols I use are part of a context that we share between us. In other words, we need to share the same set of meanings and an agreement on which symbols will represent which meaning. For example, if you and I had troubled to learn Klingon, I could recite Hamlet to you in Klingon and it might rock your soul as much as it might in English.

In the case of English and many other languages, words are symbols made up of symbols. English words are made up of symbols from a set of 26 letters, ten digits, spaces, and punctuation. But in other languages words themselves are discrete symbols, such as Chinese or Japanese ideograms. In either case, most adult humans have a vocabulary of about 25,000 words. If I am to communicate meaning to you effectively, I need to choose wisely from my set of 25,000 word symbols based on what they mean to me and send them to you, hoping that you know each of the symbols and will associate approximately the same meaning in your mind with each of them when you receive them. I can form them out of English letters, or I could number the words from 0 to 24,999, send you a string of numbers, and you can look up the words from a chart (which is partly what zipping up a text file does).
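The numbered-chart idea can be sketched as follows. The four-word vocabulary is a tiny stand-in for the 25,000-word chart, and the function names are hypothetical.

```python
import math

# A tiny stand-in for the shared vocabulary; a real chart would have ~25,000 entries.
shared_vocabulary = ["be", "not", "or", "to"]
number_of = {word: number for number, word in enumerate(shared_vocabulary)}


def encode(text):
    """Sender: replace each word with its number in the shared chart."""
    return [number_of[word] for word in text.lower().split()]


def decode(numbers):
    """Receiver: look each number up in the same chart."""
    return " ".join(shared_vocabulary[n] for n in numbers)


numbers = encode("to be or not to be")
print(numbers)          # [3, 0, 2, 1, 3, 0]
print(decode(numbers))  # to be or not to be
print(round(math.log2(25_000), 1))  # bits per word for a 25,000-word chart
```

The numbers only work because sender and receiver hold the same chart. A word that is missing from the chart, such as “fardels”, cannot be sent at all in this sketch: `encode` raises a `KeyError`.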

Human Communication Is Always Symbolic

There is nothing unique about Shakespeare communicating to us symbolically. All of us do the same thing when we send or speak words to each other. But there is nothing unique about the written word as symbol, either. We create and use all kinds of symbols besides written words to communicate with each other. In fact, since we cannot read each other’s minds, we have no choice but to communicate symbolically in all of our communications. Written and spoken words, gestures, dance, music, visual arts, touch, and so forth are no exception to the rule. All human communication takes place by humans sending symbols to other humans.

Symbols and Information Theory

If all human communication takes place through symbols, that means that the way we are informed about something is through the receiving of symbols. If those symbols come from a finite set of symbols (e.g., a 20,000 word vocabulary, or the three colors of a traffic light), it should be possible to quantify information by quantifying symbols. We can ask questions such as “what is the smallest set of symbols I would need to inform you about someone’s age in years?” Or, “how many symbols would I need to store on a computer disk if I wanted to store certain information (e.g., a zipped version of Shakespeare’s Hamlet)?”

As it turns out, the recognition that information is carried by symbols led to the establishment of the science of Information Theory by a Bell Labs mathematician named Claude Shannon in the late 1940s. In the process he discovered that the flow of information is not just an engineering problem but something that is as fundamental in the universe as matter or energy.
