{"id":283,"date":"2007-01-21T19:39:04","date_gmt":"2007-01-21T19:39:04","guid":{"rendered":"http:\/\/scientopia.org\/blogs\/goodmath\/2007\/01\/21\/misrepresenting-simulations\/"},"modified":"2007-01-21T19:39:04","modified_gmt":"2007-01-21T19:39:04","slug":"misrepresenting-simulations","status":"publish","type":"post","link":"http:\/\/www.goodmath.org\/blog\/2007\/01\/21\/misrepresenting-simulations\/","title":{"rendered":"Misrepresenting Simulations"},"content":{"rendered":"<p>Yet another reader forwarded me a link to a <a href=\"http:\/\/rightwingnation.com\/index.php\/2007\/01\/19\/2793\/\">rather dreadful article<\/a>. This one seems to be by<br \/>\nsomeone who knows better, but prefers to stick with his political beliefs rather than an honest<br \/>\nexploration of the facts.<\/p>\n<p> He&#8217;s trying to help provide cover for the anti-global warming cranks. Now, in light of all of the<br \/>\ndata that we&#8217;ve gathered, and all of the different kinds of analyses that have been used<br \/>\non that data, for anyone in the real world, it&#8217;s pretty undeniable that global warming is<br \/>\na real phenomenon, and that at least part of it is due to humanity. <\/p>\n<p><!--more--><\/p>\n<p> One of the standard arguments from the supposed skeptics about global warming is<br \/>\nthat much of our understanding of it is generated using <em>simulations<\/em>. That&#8217;s exactly the tack that this article takes:<\/p>\n<blockquote>\n<p>Think of a (mathematical) function. The function takes values as input, processes the values, and spits out a value. A simulation is a lot like a function, with three crucial exceptions. First, a simulation has no real input values, but uses instead simulated input. 
Second, because the input is simulated, instead of running the simulation only once (as you would with a function), you have to run a simulation many times (iterations), then statistically analyze the output &#8212; the simulated output, calculated from the simulated input. Third, because the output of the simulation is simulated output and there are multiple outputs (because the simulation must be run many times), it must be statistically analyzed for reliability and the multiple results must be analyzed statistically.<\/p>\n<p>A function is certain. A simulation is uncertain.<\/p>\n<\/blockquote>\n<p> This is utter bullshit. No nice way to put it; it&#8217;s utter crap. It is <em>not<\/em> a fact that<br \/>\nsimulations don&#8217;t take <em>real<\/em> input data. It is also <em>not<\/em> a fact that all simulations<br \/>\nneed to be run multiple times. This is a pure strawman; <em>some<\/em> simulations are run with<br \/>\nsimulated input; <em>some<\/em> simulations are run multiple times with varying inputs to get a sense<br \/>\nof trends. It is manifestly <em>not<\/em> the case that <em>all<\/em> simulations work this way; and<br \/>\nmany of the simulations used for analyzing global warming trends were fully deterministic simulations run using <em>real<\/em> measurements. <\/p>\n<p> Let&#8217;s take a moment and ask the fundamental question: What is a simulation?<\/p>\n<p> For our purposes: a simulation is a computer program which implements a mathematical model<br \/>\nof some phenomenon. The input to a simulation is the data needed to describe an initial state<br \/>\nof that phenomenon. 
The simulation is run against the input data, and produces a description of<br \/>\nwhat its mathematical model predicts about the state of the phenomenon at subsequent points in time.<\/p>\n<p> Given a simulation model, it can be run in a number of different ways:<\/p>\n<ol>\n<li> Verification: given a set of data for some point in the past, the simulation can be run<br \/>\nto see if its results match the present.<\/li>\n<li> Prediction: given a set of data for the present, the simulation can be run to<br \/>\ngenerate predictions about the future.<\/li>\n<li> Exploration: given a <em>manufactured<\/em> set of data, the simulation can be run<br \/>\nto explore what could happen in a given situation, or to study how the model works<br \/>\nin various cases.<\/li>\n<\/ol>\n<p> The author of the original article pretends that everything is case 3 &#8211; the exploratory case,<br \/>\nwhere the data being input is manufactured, rather than being a representation of real measurements.<br \/>\nHe also assumes that the simulation is <em>stochastic<\/em>: that is, that it is using some sort of<br \/>\nrandomness in its input, so that starting from the <em>same<\/em> input data, the simulation will<br \/>\ngenerate different results. In <em>my<\/em> experience, that&#8217;s incredibly rare in computer simulations<br \/>\n&#8211; in fact, I can&#8217;t recall ever seeing a <em>single<\/em> simulation that wasn&#8217;t fully deterministic &#8211;<br \/>\nmeaning that it always generates the same result for the same input.<\/p>\n<p> There&#8217;s one thing he got right: functions are certain. Given a function, you <em>know<\/em> what<br \/>\nits result is, and you can check whether or not the result is correct. 
For a simulation, the result<br \/>\nis fuzzier &#8211; the simulation may be generating a &#8220;correct&#8221; result in the sense that it doesn&#8217;t have<br \/>\nany bugs, and it generates the result that its model says it should; and at the same time be<br \/>\ncompletely <em>wrong<\/em> because the mathematical model that it&#8217;s using doesn&#8217;t accurately<br \/>\nrepresent reality.  And there&#8217;s one other thing that&#8217;s true, although he doesn&#8217;t explicitly mention it: many simulations are probabilistic, in the sense that instead of generating exactly one result, they compute multiple possibilities: when there are choices about what to do in the model, they<br \/>\nrun all of them, generating a probability for each of the branches.<\/p>\n<p> With that out of the way, let&#8217;s continue on to the next interesting part of the article.<\/p>\n<blockquote>\n<p>So how does a simulation simulate input values? Usually, by taking real data, analyzing it statistically to determine its frequency and distribution, then using statistics to generate input values using the same frequency and distribution. Note that in order for this to work, one must assume that the data are stable, that is, that the frequency and distribution of the data will not change over time.<\/p>\n<p>The part of the simulation that corresponds to a function is known as the model. Obviously, only an accurate model can produce reliable results (output), and the more accurate the model, the more reliable the results.<\/p>\n<p> Simulations can be powerful tools for making predictions, and are used in many fields, including<br \/>\nbusiness. 
However, because simulations use simulated (that is, not real) input and result in<br \/>\nsimulated (that is, not real) output, they have no evidentiary power &#8212; that is, you cannot use a<br \/>\nsimulation as evidence for anything, nor can you call the output of a simulation (real) data.<\/p>\n<\/blockquote>\n<p> The first paragraph is not nonsense, but it&#8217;s not exactly <em>honest<\/em>, either. He wants to make<br \/>\nit look like the models are as uncertain as possible, so he stresses the idea of a stochastic model.<br \/>\nA stochastic model is entirely probabilistic &#8211; its model is based on nothing but statistics about how<br \/>\nthings have behaved before. Stochastic models are relatively weak, as models go; they&#8217;re generally<br \/>\nused in cases where we don&#8217;t have a good behavioral model that accurately models the real phenomenon<br \/>\nthat&#8217;s being simulated. In any situation where we use simulations, we <em>strongly<\/em> prefer<br \/>\nphysical models &#8211; that is, instead of using stochastic models, we prefer a simulation that&#8217;s based on<br \/>\nreally simulating physical behaviors, rather than just playing with probabilities. What he&#8217;s saying is misleading &#8211; because he&#8217;s pretending that <em>all<\/em> models are stochastic.<\/p>\n<p> He <em>also<\/em> deliberately overstates the case about the evidentiary value of simulations. A simulation is never considered the equivalent of real-world evidence in terms of quality; but simulations<br \/>\nmost emphatically <em>are<\/em> frequently used as supporting evidence for various theories. The quality of a simulation as evidence is generally based on how well it performs in verification runs &#8211; a simulation that can be demonstrated to generate accurate results in a wide variety of situations is considered a good piece of supporting evidence when run on data similar to the data used for the verifications. 
<\/p>\n<p> For example, the US Army&#8217;s ordnance testing facility in Maryland now uses simulations<br \/>\nfor <em>most<\/em> of its tests. It uses tests for information gathering, and it periodically<br \/>\nperforms tests to validate the simulations; but the quality of the simulations has gotten high enough that for many purposes, they no longer consider it necessary to run real tests of dropping explosives from an airplane. They quite definitely consider the results of those simulations to be evidence!<\/p>\n<p> And now, finally, we get to the sleaziest part.<\/p>\n<blockquote>\n<p>You can use estimated data in your model, but doing so inserts another layer of uncertainty into the simulation results. Let me explain.<\/p>\n<p>Since the data in our model are estimated, we must use statistics to determine their validity. Since Mr. Lewis uses dice, I will as well, sticking for the sake of simplicity and clarity to rolling one die.<\/p>\n<p>If you roll a die, the probability that you will roll, say, a 1 is 1\/6. If you roll the die a second time, the probability that you will roll, say, a second 1 is 1\/6 * 1\/6, or 1\/36. If you roll the die a third time, the probability that you will roll, say, a third 1 is 1\/6 * 1\/6 * 1\/6, or 1\/216, and so forth.<\/p>\n<p>Estimated variables in a simulation model must be treated in the same way as rolling a die, because each is uncertain, and each involves probability. Assuming that our climatologist is ethical, each of the estimated variables in his model should fall within the statistical norm of reliability, or be 95% reliable. Given the complexity of climatological models, hundreds of such estimated variables would be necessary, but for clarity&#8217;s sake, we will say the model includes only 50 such variables.<\/p>\n<p> That means that the reliablility of the simulation model is 0.95^50 (0.95 raised to the fiftieth power), or 0.0769, or 7.7%. 
So even if we didn&#8217;t have the uncertainty of the simulated input (not to mention the additional uncertainty of assuming that the data are stable), even if there were no uncertainty in our ouputs themselves, our simulation results would only 7.7% reliable.\t<\/p>\n<\/blockquote>\n<p> This is a <em>thoroughly<\/em> dishonest bunch of babble, which is <em>in no way<\/em> an accurate description of <em>anything<\/em>. <\/p>\n<p> Even if we accept what he says at face value: that there are multiple variables<br \/>\nin a simulation which need to be considered separately in terms of probability &#8211; he&#8217;s quite<br \/>\ndeliberately ignoring the <em>correct<\/em> way of combining those probabilities. In fact, he&#8217;s really just trying to play the inverse of a classic &#8220;big numbers&#8221; game &#8211; he wants to artificially combine things to make the probability look as untrustworthy as possible. The trick is in pretending<br \/>\nthat all 50 (or whatever) variables in the simulation are <em>independent<\/em>. In real<br \/>\nclimate simulations, the kinds of things that become variables are <em>not<\/em> independent. To give a couple of examples, real climatological simulations will include a parameter to describe the<br \/>\nhumidity of airmasses based on temperature; and a parameter to describe the viscosity of airmasses based on temperature and humidity. Those <em>are not<\/em> independent &#8211; the viscosity of the airmass is determined in part by its humidity; the ability of the airmass to pick up more moisture while over the ocean is determined in part by its viscosity. The probabilities of these things being correct are not independent &#8211; if one is right, the other is almost certainly right; and if one is wrong, the other is almost certainly wrong &#8211; because each depends on the correctness of the other. 
Dependent variables get treated <em>very<\/em> differently in a probability calculation than independent variables &#8211; that&#8217;s what <a href=\"http:\/\/en.wikipedia.org\/wiki\/Bayesian_inference\">Bayes&#8217; theorem<\/a> is all about.<\/p>\n<p> But it&#8217;s much worse than just making a misleading probability argument. He&#8217;s very deliberately<br \/>\nmischaracterizing how we model the accuracy of a simulation. The accuracy of a simulation is based on<br \/>\nits performance and the known accuracy of the fundamental model which it&#8217;s based on. So, for example,<br \/>\nmost airplane manufacturers no longer use wind tunnels &#8211; computational fluid dynamics simulations<br \/>\ngenerate <em>better<\/em> results than the wind tunnel (Do a websearch on &#8220;Boeing&#8221; and &#8220;Tranair&#8221;). The<br \/>\nreliability of the simulation is based on two things. One is a long history of measuring things on<br \/>\ninstrumented aircraft, and comparing the measurements to the predictions from the simulations; the<br \/>\nother is the known accuracy of the Navier-Stokes equations, and the computational methods used to<br \/>\nimplement NS systems. On the basis of those two, we come up with results about how accurate we<br \/>\nbelieve the models to be. <\/p>\n<p> And further &#8211; we look at simulations based on <em>multiple<\/em> models. If 20 different models, generated in 20 different ways, all of which have strong track records for accuracy &#8211; if all 20 of them have been implemented by simulations whose quality has been demonstrated &#8211; and all of them generate nearly the same result, and <em>no<\/em> system\/model with a proven track record disagrees, then we consider the results of those simulations to be <em>very strong<\/em> evidence. <\/p>\n<p> Of course, he saves the worst for last.<\/p>\n<blockquote>\n<p> Climatological simulations cannot be taken very seriously. 
They can certainly never be taken as evidence or proof, as no simulation can be, because they are simulations. They aren&#8217;t real.<\/p>\n<p> What disturbs me about all this global warming warfare is that the climatologists know this. They know that their models have no evidentiary power. Yet, they disingenuously claim the reverse. This isn&#8217;t science. It&#8217;s politics. It&#8217;s dishonest. And it&#8217;s a breach of professional ethics and integrity.<\/p>\n<\/blockquote>\n<p> So says the man who just threw together a bundle of lies and misrepresentations to try to<br \/>\nsupport a pre-determined opinion without regard for the facts.  And he accuses <em>others<\/em><br \/>\nof dishonesty and breaches of ethics and integrity.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Yet another reader forwarded me a link to a rather dreadful article. This one seems to be by someone who knows better, but prefers to stick with his political beliefs rather than an honest exploration of the facts. He&#8217;s trying to help provide cover for the anti-global warming cranks. 
Now, in light of all of [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[5],"tags":[],"class_list":["post-283","post","type-post","status-publish","format-standard","hentry","category-bad-physics"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p4lzZS-4z","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/283","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/comments?post=283"}],"version-history":[{"count":0,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/283\/revisions"}],"wp:attachment":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/media?parent=283"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/categories?post=283"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/tags?post=283"}],"curies":[{"name":"
wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}