{"id":102,"date":"2006-08-06T12:39:48","date_gmt":"2006-08-06T12:39:48","guid":{"rendered":"http:\/\/scientopia.org\/blogs\/goodmath\/2006\/08\/06\/qa-what-is-information\/"},"modified":"2006-08-06T12:39:48","modified_gmt":"2006-08-06T12:39:48","slug":"qa-what-is-information","status":"publish","type":"post","link":"http:\/\/www.goodmath.org\/blog\/2006\/08\/06\/qa-what-is-information\/","title":{"rendered":"Q&amp;A: What is information?"},"content":{"rendered":"<p>I received an email from someone with some questions about information theory; they relate to some sufficiently common questions\/misunderstandings of information theory that I thought it was worth turning the answer into a post.<br \/>\nThere are two parts here: my correspondent started with a question; and then  after I answered it, he asked a followup.<br \/>\nThe original question:<br \/>\n&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;<br \/>\n&gt;Recently in a discussion group, a member posted a series of symbols, numbers,<br \/>\n&gt;and letters:<br \/>\n&gt;<br \/>\n&gt;`+<br \/>\n&gt;The question was what is its meaning and whether this has information or not.<br \/>\n&gt;I am saying is does have information and an unknown meaning (even though I am<br \/>\n&gt;sure they were just characters randomly typed by the sender), because of the<br \/>\n&gt;fact that in order to recognize a symbol, that is information. Others are<br \/>\n&gt;claiming there is no information because there is no meaning. Specifically that<br \/>\n&gt;while the letters themselves have meaning, together the message or &#8220;statement&#8221;<br \/>\n&gt;does not, and therefore does not contain information. 
Or that the information<br \/>\n&gt;content is zero.<br \/>\n&gt;<br \/>\n&gt;Perhaps there are different ways to define information?<br \/>\n&gt;<br \/>\n&gt;I think I am correct that it does contain information, but just has no meaning.<br \/>\nThis question illustrates one of the most common errors made when talking about information. Information is *not* language: it does *not* have *any* intrinsic meaning. You can describe information in a lot of different ways; in this case, the one that seems most intuitive is: information is something that reduces a possibility space. When you sit down to create a random string of characters, there is a huge possibility space of strings that could be generated. A specific string narrows the space to one possibility.  Anything that reduces possibilities is something that *generates* information.<br \/>\nFor a very different example, suppose we have a lump of radium. Radium is a radioactive metal which breaks down and emits alpha particles  (among other things). Suppose we take our lump of radium, and put it into an alpha particle detector, and record the time intervals between the emission of alpha particles.<br \/>\nThe radium is *generating information*. Before we started watching it, there was a huge range of possibilities for exactly when the decays would occur. Each emission &#8211; each decay event &#8211; narrows the possibility space of other emissions. So the radium is generating information.<br \/>\nThat information doesn&#8217;t have any particular *meaning*, other than being the essentially random time-stamps at which we observed alpha particles. But it&#8217;s information.<br \/>\nA string of random characters may not have any *meaning*; but that doesn&#8217;t mean it doesn&#8217;t contain information. 
It *does* contain information; in fact, it *must* contain information: it is a *distinct* string, a *unique* string &#8211; one possibility out of many for the outcome of the random process of generation; and as such, it contains information.<br \/>\nThe Followup<br \/>\n&#8212;&#8212;&#8212;&#8212;&#8212;<br \/>\n&gt;The explanation I have gotten from the person I have been debating, as to what<br \/>\n&gt;he says is information is:<br \/>\n&gt;<br \/>\n&gt;I = -log<sub>2<\/sub> P(E)<br \/>\n&gt;<br \/>\n&gt;where:<br \/>\n&gt;I: information in bits<br \/>\n&gt;P: probability<br \/>\n&gt;E: event<br \/>\n&gt;<br \/>\n&gt;So for the example:<br \/>\n&gt;`+<br \/>\n&gt;He says:<br \/>\n&gt;&#8221;I find that there is no meaning, and therefore I infer no information. I<br \/>\n&gt;calculate that the probability of that string occuring was ONE (given no<br \/>\n&gt;independent specification), and therefore the amount of information was ZERO. I<br \/>\n&gt;therefore conclude it has no meaning.&#8221;<br \/>\n&gt;<br \/>\n&gt;For me, even though that string was randomly typed, I was able to look at the<br \/>\n&gt;characters, and find there placement on a QWERTY keyboard, compare the<br \/>\n&gt;characters to the digits on the hand, and found that over all, the left hand<br \/>\n&gt;index finger keys were used almost twice as much. I could infer that a person<br \/>\n&gt;was left handed tried to type the &#8220;random&#8221; sequence.  So to me, even though I<br \/>\n&gt;don&#8217;t know the math, and can&#8217;t measure the amount of information, the fact I<br \/>\n&gt;was able to make that inference of a left handed typer tells me that there is<br \/>\n&gt;information, and not &#8220;noise&#8221;.<br \/>\nThe person quoted by my correspondent is an idiot; and clearly one who&#8217;s been reading Dembski or one of his pals. 
In my experience, they&#8217;re the only ones who continually stress that log-based equation for information.<br \/>\nBut even if we ignore the fact that he&#8217;s a Dembski-oid, he&#8217;s *still* an idiot. You&#8217;ll notice that nowhere in the equation does *meaning* enter into the definition of information content. What *does* matter is the *probability* of the &#8220;event&#8221;; in this case, the probability of the random string of characters being the result of the process that generated it.<br \/>\nI don&#8217;t know how he generated that string. For the sake of working things through, let&#8217;s suppose that it was generated by pounding keys on a minimal keyboard; and let&#8217;s assume that the odds of hitting any key on that keyboard are equal (probably an invalid assumption, but it will only change the quantity of information, not its presence, which is the point of this example.) A basic minimal keyboard has about sixty keys. (I&#8217;m going by counting the keys on my &#8220;Happy Hacking Keyboard&#8221;.) Of those, seven are shifts of various sorts (2 shift, 2 control, one alt, one command, one mode), and one is &#8220;return&#8221;. So we&#8217;re left with 52 keys.  To make things simple, let&#8217;s ignore the possibility of shifts affecting the result (this will result in us getting a *lower* information content, but that&#8217;s fine). The string is 80 characters long; each *specific* character generated is an event with probability 1\/52. So each *character* of the string has, by the Shannon-based definition quoted above, -log<sub>2<\/sub>(1\/52), or about 5.7 bits of information. The string as a whole has 80 times that: about 456 bits of information.<br \/>\nThe fact that the process that generated it is random doesn&#8217;t make the probability of a particular string be one. 
For a trivial example, I&#8217;m going to close my eyes and pound on my keyboard with both hands twice, and then select the first 20 characters from each:<br \/>\nFirst: &#8220;`;kldsl;ksd.z.md\u03a9.l.x`&#8221;<br \/>\nSecond: &#8220;`lkficeewrflk;erwm,.r`&#8221;<br \/>\nQuite different results overall? Seems like I&#8217;m a bit heavy on the right hand on the k\/l area. But note how different the outcomes are. *That* is information. Information without *semantics*; without any intrinsic meaning. But that *doesn&#8217;t matter*. Information *is not language*; the desire of [certain creationist morons][gitt] to demand that information have the properties of language is nonsense, and has *nothing* to do with the mathematical meaning of information.<br \/>\nThe particular importance of this fact is that it&#8217;s a common creationist canard that information *must* have meaning; that therefore information can&#8217;t be created by a natural process, because natural processes are random, and randomness has no meaning; that DNA contains information; and therefore DNA can&#8217;t be the result of a natural process.<br \/>\nThe sense in which DNA contains information is *exactly* the same as that of the random strings &#8211; both the ones in the original question, and the ones that I created above. DNA&#8217;s information is *no different*.<br \/>\n*Meaning* is something that you get from language; information *does not* have to be *language*. Information *does not* have to have *any* meaning at all. 
That&#8217;s why we have distinct concepts of *messages*, *languages*, *semantics*, and *information*: because they&#8217;re all different.<br \/>\n[gitt]: http:\/\/scienceblogs.com\/goodmath\/2006\/07\/bad_bad_bad_math_aig_and_infor.php<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I received an email from someone with some questions about information theory; they relate to some sufficiently common questions\/misunderstandings of information theory that I thought it was worth turning the answer into a post. There are two parts here: my correspondent started with a question; and then after I answered it, he asked a followup. [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[30],"tags":[],"class_list":["post-102","post","type-post","status-publish","format-standard","hentry","category-information-theory"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p4lzZS-1E","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/102","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.goodmath.o
rg\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/comments?post=102"}],"version-history":[{"count":0,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/102\/revisions"}],"wp:attachment":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/media?parent=102"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/categories?post=102"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/tags?post=102"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}