{"id":279,"date":"2007-01-18T13:58:25","date_gmt":"2007-01-18T13:58:25","guid":{"rendered":"http:\/\/scientopia.org\/blogs\/goodmath\/2007\/01\/18\/basics-standard-deviation\/"},"modified":"2007-01-18T13:58:25","modified_gmt":"2007-01-18T13:58:25","slug":"basics-standard-deviation","status":"publish","type":"post","link":"http:\/\/www.goodmath.org\/blog\/2007\/01\/18\/basics-standard-deviation\/","title":{"rendered":"Basics: Standard Deviation"},"content":{"rendered":"<p> When we look at the data for a population, often the first thing we do is look at the mean. But even if we <em>know<\/em> that the distribution is perfectly normal, the mean by itself isn&#8217;t enough to understand what it is telling us about the population. We also need to know something about how the data is spread out around the mean &#8211; that is, how <em>wide<\/em> the bell curve is around the mean.<\/p>\n<p> There&#8217;s a basic measure that tells us that: it&#8217;s called the <em>standard deviation<\/em>. The standard deviation describes the spread of the data, and is the basis for how we compute things like the degree of certainty and the margin of error.<\/p>\n<p><!--more--><\/p>\n<p> Suppose we have a population of data points, P={p<sub>1<\/sub>,&#8230;,p<sub>n<\/sub>}. We know that the mean is the sum of the points (the p<sub>i<\/sub>s) divided by the number of points, |P|. The way to describe the spread is based roughly on the concept of the <em>average<\/em> difference between the points and the mean.<\/p>\n<p> So what happens if we naively compute the average of the differences between the data points and the mean? That is, compute the mean difference? 
In other words, if the mean is M and the average difference is <em>d<\/em>, can we use the following?<\/p>\n<p><!-- equation image --><br \/>\n<img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" alt=\"mean-distance.mmf.jpg\" src=\"https:\/\/i0.wp.com\/scientopia.org\/img-archive\/goodmath\/img_135.jpg?resize=88%2C37\" width=\"88\" height=\"37\" \/><\/p>\n<p> Unfortunately, that won&#8217;t work. If we work it through, we find that, by the definition of the mean, the average difference <em>d<\/em> will always be 0. After all, the mean is the balance point of the distribution: the sum of the differences for the values <em>larger<\/em> than the mean (which are positive) exactly cancels the sum of the differences for the values <em>smaller<\/em> than the mean (which are negative), and so the sum, and therefore the average, must be 0.<\/p>\n<p> How do we get around that? By making all of the differences positive. And how do we do that? Square them. The standard deviation, which is usually written &sigma;, is a <em>root mean-square<\/em> measure, which means that it&#8217;s the square root of the mean (average) of the squared differences between the points and the mean. The mean of the squares, before taking the square root, is also a useful figure, called the <em>variance<\/em>; that is, the variance is &sigma;<sup>2<\/sup>. The standard deviation written in equational form, where M is the mean and P is the set of points, is:<br \/>\n<!-- equation image --><br \/>\n<img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" alt=\"sdev.mmf.jpg\" src=\"https:\/\/i0.wp.com\/scientopia.org\/img-archive\/goodmath\/img_136.jpg?resize=109%2C38\" width=\"109\" height=\"38\" \/><\/p>\n<p> Let&#8217;s run through an example. 
Take the list of salaries (in thousands of dollars) from the <a href=\"\">mean<\/a> article: [ 20, 20, 22, 25, 25, 25, 28, 30, 31, 32, 34, 35, 37, 39, 39, 40, 42, 42, 43, 80, 100, 300, 700, 3000 ]. The sum of these is 4789. There are 24 values. So the mean (rounding off to 2 significant figures) is 4789\/24 &asymp; 200. So what&#8217;s the standard deviation? <\/p>\n<ol>\n<li> First, we&#8217;ll compute the sum of the squares of the differences:<br \/>\n(20-200)<sup>2<\/sup> + (20-200)<sup>2<\/sup> + (22-200)<sup>2<\/sup> + (25-200)<sup>2<\/sup> + &#8230; + (700-200)<sup>2<\/sup> + (3000-200)<sup>2<\/sup> =<br \/>\n32400+32400+31684+30625+30625+30625+29584+28900+28561+28224+27556+27225+26569+25921+25921+25600+24964+24964+24649+14400+10000+10000+250000+7840000 = 8661397.<\/li>\n<li> Then we&#8217;ll divide by the number of points: 8661397\/24 &asymp; 360891. So the variance is roughly 360,000.<\/li>\n<li> Then take the square root of the variance: the square root of 360,000 is 600.<\/li>\n<\/ol>\n<p> So, for our salaries, the mean is $200,000 with a standard deviation of $600,000. That right there should be enough to give us a good sense that there&#8217;s something very strange about the distribution of numbers here, because salaries can&#8217;t be less than zero, but the standard deviation is <em>three times<\/em> the size of the mean!<\/p>\n<p> But what does the standard deviation <em>mean<\/em> precisely? The best way to define it is in probabilistic terms. 
In a population P with a roughly normal distribution, mean M, and standard deviation &sigma;: <\/p>\n<ul>\n<li> About 2\/3 (roughly 68%) of the values in P will be within the range M +\/- &sigma;.<\/li>\n<li> About 95% of the values will be within the range M +\/- 2&sigma;.<\/li>\n<li> About 99.7% of the values will be within the range M +\/- 3&sigma;.<\/li>\n<\/ul>\n<p> For <em>any<\/em> population P with mean M and standard deviation &sigma;, regardless of whether the distribution is normal, Chebyshev&#8217;s inequality guarantees that at least 1 - 1\/k<sup>2<\/sup> of the values lie within the range M +\/- k&sigma;:<\/p>\n<ul>\n<li> <em>At least<\/em> 1\/2 of the values in P will be within the range M +\/- &radic;2&sigma; (about 1.41&sigma;).<\/li>\n<li> <em>At least<\/em> 3\/4 of the values in P will be within the range M +\/- 2&sigma;.<\/li>\n<li> <em>At least<\/em> 8\/9 of the values in P will be within the range M +\/- 3&sigma;.<\/li>\n<\/ul>\n<p> If you have a population P which is very large, you often want to make an estimate about the population using a <em>sample<\/em>, where a sample is a subset P&#8217; &sub; P of the population. Since the standard deviation of the sample, measured around the sample&#8217;s own mean, is generally slightly smaller than the standard deviation of the population as a whole, we add a correction factor for sampled populations. In the equation for the standard deviation, instead of dividing by the size of the sample, |P&#8217;|, we divide by the size of the sample minus one: |P&#8217;|-1. The ideal correction factor is a lot more complicated, but in practice, the &#8220;subtract one from the size of the sample&#8221; trick is an excellent approximation, and so it&#8217;s used nearly universally.<\/p>\n<p> Next topic in the basics will be something closely related: confidence intervals and margins of error.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>When we look at the data for a population, often the first thing we do is look at the mean. But even if we know that the distribution is perfectly normal, the mean by itself isn&#8217;t enough to understand what it is telling us about the population. 
We also [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[74,61],"tags":[],"class_list":["post-279","post","type-post","status-publish","format-standard","hentry","category-basics","category-statistics"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p4lzZS-4v","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/279","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/comments?post=279"}],"version-history":[{"count":0,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/posts\/279\/revisions"}],"wp:attachment":[{"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/media?parent=279"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/categories?post=279"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.goodmath.org\/blog\/wp-json\/wp\/v2\/tags?post=279"}],"curies":[{"na
me":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}