# Selective Data and Global Warming

One of the most common sleazy tricks used by various sorts of denialists
comes back to statistics – invalid and deceptive sampling methods. In fact,
the very first real post on the original version of this blog was a shredding of
a paper by Mark and David Geier that did this.

Proper statistical analysis relies on a kind of blindness. Many of the things
that you look for, you need to look for in a way that doesn’t rely on any a priori
knowledge of the data. If you look at the data, and find what appears to be an
interesting property of it, you have to be very careful to show that it’s
a real phenomenon – and you do that by performing blind analyses that demonstrate
its reality.

The reason that I bring this up is that one of my fellow SBers,
Tim Lambert, posted something over at his blog, Deltoid, about a particularly
sleazy example of this by Michael Duffy, a global warming denialist.

The situation is that Duffy claims
that global warming stopped in 2002. It didn’t. But he makes it look like it did by using a deliberately dishonest way of sampling the data.

When looking at things like climate, one way of finding trends is to
take periodic trending samples. That is, take every two-year interval, and
compute the difference between the two years. (So, for example, to look at two-year trends since 2000, you’d look at 2000-2002, 2001-2003, 2002-2004, 2003-2005, etc.) To look for strong trends in
this way, you need to be sure that you’re capturing the right phenomenon – because climate is chaotic, if you look at a period of time that’s too short, you can
see a lot of noise. So, for example, you might look at every 2 year trend, every 4 year trend, every 6 year trend, every 8 year trend, and every 10 year trend.
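As a rough sketch of what "take every N-year interval" means in practice, here is a bit of Python. The function name `interval_trends` and the temperature values are made up for illustration; they are not real data, and this is not the analysis that Lambert or Duffy actually ran.

```python
def interval_trends(years, temps, span):
    """Return (start_year, end_year, change) for every `span`-year interval."""
    by_year = dict(zip(years, temps))
    return [
        (y, y + span, by_year[y + span] - by_year[y])
        for y in years
        if y + span in by_year
    ]


# Invented values, purely for illustration; NOT real temperature data.
years = list(range(2000, 2008))
temps = [0.30, 0.45, 0.50, 0.48, 0.42, 0.55, 0.50, 0.52]

for span in (2, 4, 6):
    print(f"{span}-year trends:")
    for start, end, change in interval_trends(years, temps, span):
        print(f"  {start}-{end}: {change:+.2f}")
```

The point of running it for several spans at once is that you report all of them, rather than picking the one span that happens to tell the story you like.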

Let me take a moment to explain one very important word in the discussion above: chaotic. In mathematics, chaos has a very specific meaning. It doesn’t mean random without pattern. It means that there’s a high sensitivity
to initial conditions, and a particular kind of stochastic self-similarity. The canonical example of this is brownian motion. Take a cup of tea, and float a grain of pepper on it. Now, every second, plot its position in the tea. It’s going to float around in seemingly random ways. But there’s a pattern to its motion. You’ll see it make some large moves, but they’ll be rare in comparison to average.

There’s a lot more to mathematical chaos than that, and I’ll probably write about it at some time. But the thing that’s important here is that the chaotic
behavior of things like brownian motion can mask trends. If you stir the tea in
the teacup, you’ll find the pepper jumping around in a chaotic fashion – but there’ll be an underlying trend for it to move in a circle. If you drop a ping-pong ball into a river, it’ll move all over the place – it will sometimes even get caught in an eddy, and move backwards. But overall, there’ll be a strong trend for it to move downriver.

If you did a trend analysis of the motion of the ping-pong ball, you’d be
looking at “How far did it move downriver in a given period of time?” – so you’d record its position every second, and then look at the difference in its position
over 1 second intervals, 5 second intervals, 10 second intervals, etc.

If you wanted to argue that the ping-pong ball had completely stopped moving
downriver, you couldn’t just pick out a few short intervals, and show that in
three consecutive two-second intervals, its position didn’t move downriver. The chaotic nature of its motion means that you’d expect intervals of that length where it didn’t move downriver.
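To see why that's expected, here is a toy simulation: a one-dimensional random walk with a small steady drift, standing in for the ping-pong ball. Everything about it is an assumption made for illustration (the drift and noise values are invented, the function names are hypothetical, and a real river is nothing like this simple), but it shows how often an interval of a given length ends with no downriver progress at all.

```python
import random


def simulate_positions(steps, drift=0.1, noise=1.0, seed=0):
    """Positions of a walker that drifts downriver while being jostled randomly."""
    rng = random.Random(seed)
    positions = [0.0]
    for _ in range(steps):
        positions.append(positions[-1] + drift + rng.gauss(0.0, noise))
    return positions


def fraction_not_downriver(positions, interval):
    """Share of all `interval`-step windows in which the walker made no progress."""
    changes = [
        positions[i + interval] - positions[i]
        for i in range(len(positions) - interval)
    ]
    return sum(change <= 0 for change in changes) / len(changes)


positions = simulate_positions(20000)
for interval in (2, 20, 200):
    share = fraction_not_downriver(positions, interval)
    print(f"{interval:>3}-step intervals with no downriver progress: {share:.1%}")
```

With these settings you should see a large share of the 2-step intervals showing no progress, and a much smaller share of the 200-step ones, even though the underlying drift never changes.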

To get back to the weather issue:
climate is chaotic. There are a lot of bumps in it. If you look at short
trends, you see a huge amount of noise. But if you look at slightly longer
trends, a very strong pattern starts to appear. Even that has its bumps, but you can see a very compelling pattern in the data.

So, our denialist friend did trending – up to six year trends. And that’s
what he focuses his discussion on: six year trends. Why, you might ask, would he look specifically at six year trends? That’s easy. Because six-year trends are
the longest ones that produce the results he wants. Plot seven year or 8 year
trends, and suddenly, you can see the warming trend again. In fact, it’s an extremely obvious thing. Just look at the graph (taken from RealClimate).

What’s going on mathematically is that there is an upward trend in
the data. Most estimates put that warming trend at around 5 degrees F per
century – or about 1/20th of a degree per year. But yearly variation – the
chaotic component – is plus or minus a couple of degrees. So over short periods of time, that yearly variation drowns out the trend. But if you look at longer trends – which damp out the random yearly variation, while allowing the trend to accumulate – then the overall warming trend becomes visible.

What Duffy did is look at his data, and try to find a way of presenting
it that appeared to support his pre-selected conclusion. And he managed to find
one. He didn’t show a complete analysis – he couldn’t, because a complete analysis would have refuted his argument. So he selectively chose a way of analyzing the
data that would produce the desired results: he looked at the data to find the
longest period where trend analysis would show what he wanted – and he stopped there.

As sleazy tactics go, this is pretty extreme. As I said earlier, the
very first post on this blog was a takedown of an autism crank paper. This
is far worse than the autism paper – which was pretty bad. In the case of
the autism paper, they wanted to find an inflection point in the data, so they looked at the data, and picked something that would produce the result they
wanted. Arguably, you could just be clueless about proper statistical methods, and
do that by mistake. In the case of this global warming thing, there is no
possibility that this was caused by clueless error. This was deliberate
deception by cherrypicking data to produce a desired result.

## 0 thoughts on “Selective Data and Global Warming”

1. g

In mathematics, chaos has a very specific meaning. […] The canonical example of this is brownian motion.

Yes … and no. Brownian motion is *not* an example of chaos in that very specific mathematical sense. And the “trend plus noise” character of the climate data is what matters here, not the fact that a lot of the noise is there because the weather is a chaotic system. (If the noise were the result of archangels playing dice or, er, Brownian motion, it wouldn’t make any difference to the wrongness of quoting alleged trends based on looking at too few datapoints to make the signal outweigh the noise.)
No question that Duffy is being either egregiously dishonest or stunningly incompetent, probably the former. It’s not only that 7 years or 8 years would show a warming trend; going *down* from 6 years to 5 or 4 would (at least according to my eyeballs) show a warming trend too.
‘Course, in 1993 even an 8-year sample would have “shown” that global warming had stopped. Which would have been bullshit, just like Duffy’s 6-year sample now.

2. Dreamer

Brownian Motion of a particle appears random because of the excessive number of generators of any motion (degrees of freedom), hence our ability to predict is hindered by inability to control and observe all variables in the system. It does differ from the simpler deterministic chaos systems, which have finite and few degrees of freedom that appear random due to high complexity. However, the latter is an example of mathematical chaos which is really a subset of chaotic systems. The important thing is their sensitivity to initial conditions, given a minute change in any condition their behaviour will rapidly diverge from the previous observation even if everything else is EXACTLY the same.
The point is that climate, and weather, are chaotic systems. This is why it is impossible to really predict where trends will lead, or what is causing them, with precision. I’ve never been a fan of ‘global warming’ as a term, though I know it’s only a label, because it seemed apparent that there were too many potential outcomes to say if things would get warmer or cooler. Heck, average global temperatures could remain the same in the long term. However, what’s important is that locally things could change significantly. Greater extremes, changing rainfall patterns, etc etc. That is why I feel temperature is a really really crappy indicator (I can understand that it isn’t the only measure in climate science, but to the lay that is what they think of); it’s like only measuring the x-axis motion of a molecule, it might show you an overall motion or it might average out to nothing if the flow is in the y or z direction.
I am personally satisfied that climate change is real enough, especially given the large degree of research and modelling that has been undertaken. The specifics don’t matter, simply that at the end of the day they show that a dynamical system on the scale of a planet is sensitive to human scale perturbation over a long enough period.
The benefits of sustainability are there even if the climate were more robust (insensitive to human scale perturbation), or we luck out (maintains a human friendly equilibrium after perturbation), simply by allowing more to be done with finite resources. Frankly, the risk of a bad outcome is enough for me.

3. Sili

I guess I can see what makes these people tick (for a change). When I see that trend I get scared too – only difference is that I don’t try to close my eyes and chant “La la la la I can’t hear you!”. Nor do I try to blind and deafen others – and that is the despicable thing.

4. Mgccl

I remember I wrote an article about how global warming is not caused by humans…
and I found people use data to mislead the population into believing stuff… like Al Gore…

5. JimD

Com’on Mark…
The true denialists don’t believe in _human made_ global warming. You believe CO2 causes it?
PS: first time poster, long time watcher, wonderful blog! Keep it coming!

6. Stephen Wells

Arrhenius wrote a paper in the late 19th century pointing out that, given the physical properties of CO2, doubling its level in the atmosphere would increase average global temperature by about 5 degrees. We can’t say we weren’t warned.

7. bill r

Mark,
You’ve assumed a linear trend and then gone through the data until you find one. What happens when you use a longer baseline and/or don’t force linearity? The GISS data goes back to 1880 or so. Perhaps you could post on non-parametric smoothing and change-point detection, or the problems with cherry-picking the data range.

8. Mark C. Chu-Carroll

Bill R:
I haven’t done anything. I haven’t done my own analysis of the data. I want to be clear about that: I’m not an expert, and I haven’t done an independent analysis of the data. I have read several studies, and looked at their methods, and based on that, I think that the data is highly consistent and supportive of warming. Every paper, every analysis that I’ve seen by people who argue against warming has obvious errors. That doesn’t mean that you can’t make a compelling argument against human-caused global warming, but if there is one, I haven’t seen it yet.
If you take the GISS data, and look at trends, you find a remarkably consistent pattern. It’s not superimposing a line on the data; it’s just looking at trends in the data, and seeing what they say.
Look at that diagram. That’s 30 years of data. It’s not linear. You can see three discrete regions with consistent slopes: there’s the earliest period in the data, which has an upward slope; there’s the early nineties, where the trends briefly have a downward slope, and then there’s the late nineties onward, where the trends have a slightly larger upward slope than the earliest part of the graph.
Real statistical analysis doesn’t start with a pre-conceived model of the data. You take the data, and see what it says – and given any potential conclusion, you should be highly skeptical of it until you can’t discard it.
When you look at warming data, you find a consistent correlation with an upward trend. No matter how you look at the data, that trend is there – and it correlates closely
with CO2 levels. Nothing else correlates as tightly.
So the data supports a correlation. To move from correlation to causation isn’t a statistical thing. You need an explanation for the mechanism by which the correlated
factors have a causative link. The greenhouse effect provides a mechanism, which can be experimentally demonstrated, and which provides that explanation.

9. bill r

Mark,
Excuse me, my error. I thought your discussion of looking at various lengths was your work, not reporting of someone else’s.
The 30 year plot does look impressive, until one starts looking at the autocorrelations in the data, at which point it becomes less so. A simple multiplier to adjust the sample size for autocorrelation is (1-r)/(1+r). Depending on the baseline, the raw autocorrelation in the GISTEMP data is in the 0.6 to 0.8 range, giving a multiplier of 1/4 to 1/9 and an effective sample size of 7 1/2 – 3 1/3. With those sample sizes, the trend becomes much less impressive, and the pre-chosen filter becomes very important.
The correlation with CO2 is, as you point out, quite strong. The yearly CO2 (Mauna Loa) that I’ve seen is also pretty much monotone in time, so that correlation is strongly influenced by other time trends out there. (storks and babies in Holland, anyone?)
I’m not an expert in time series, either, but a lot of basic technology seems to be ignored in the discussions I’ve seen.
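A minimal sketch of the adjustment described in this comment, assuming the 30 yearly points of the plot under discussion and the simple (1-r)/(1+r) multiplier quoted above. The function name `effective_sample_size` is hypothetical, and this is only the rule of thumb from the comment, not a full time-series treatment.

```python
def effective_sample_size(n, r):
    """Scale a sample size n by (1 - r) / (1 + r) for lag-1 autocorrelation r."""
    if not -1.0 < r < 1.0:
        raise ValueError("autocorrelation r must lie strictly between -1 and 1")
    return n * (1.0 - r) / (1.0 + r)


# 30 yearly data points, with the autocorrelation range quoted in the comment.
for r in (0.6, 0.8):
    n_eff = effective_sample_size(30, r)
    print(f"r = {r}: effective sample size is about {n_eff:.2f}")
```

For r = 0.6 the multiplier is 1/4 and for r = 0.8 it is 1/9, which is where the 7 1/2 and 3 1/3 figures come from.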

10. dhogaza

I’m not an expert in time series, either, but a lot of basic technology seems to be ignored in the discussions I’ve seen.

This guy is an expert in time series analysis (it’s his job) and has a lot of stuff to say about trends in climate data. You should spend some time at his site, very informative stuff.

11. Stephen Wells

bill r, please also consider the actual physics, e.g. why CO2 and methane are called greenhouse gases in the first place.
Arguing about global warming on the basis of the recent climate record is like saying “Well, my ECG says I haven’t had a heart attack yet, I can carry on eating McDonald’s three meals a day.” It’s fairly obvious that (i) we’re boosting atmospheric CO2 by burning fossil carbon so rapidly and (ii) boosting atmospheric CO2 will, ceteris paribus, warm the globe, on basic physical principles. The point of looking in the recent climate record is not to determine whether AGW is a threat, because we know it’s a threat. The point is to determine how much trouble we’re in already.

12. bill r

Stephen,
Look at Mark’s original post, above. He’s discussing analysis of the recent temperature record. I don’t think either of us are writing about the physics. It’s about the support from external evidence. When you ignore the dependence due to the time series the evidence appears to offer stronger support than when you adjust it for the dependencies.
To follow up on your point, where are the regressions driven by CO2? Simple regressions on time are rather silly, unless warming is driven by time, like aging. Time is being used as a proxy.

13. bill r

dhogaza,
I’ve been there. Who is that masked man? He writes like someone who uses time series a lot. I liked the sleight of hand at the end of his autocorrelation post. Using an ar(1) regression for monthly data on time, sweet!

14. Stephen Wells

bill r, I’m saying that playing with time series without considering the physics puts you into the stamp-collecting field, not the scientific one.

15. Wry Mouth

Global Weather Change is something I want to really get down into someday… maybe this summer, when I get some time off… of course, I second your main point here, which is that one has to approach data analysis with an almost sociopathically detached mindset. “The data are what the data are,” as my wife would say.
Cherry-picking is to be avoided, and data excluded (outliers, etc.) only for the best reasons; not for any old reason.
The politicization of a natural phenomenon is aggravating to me; the scores of celebrities and others grappling for leverage using “Global Warming” as an empty mantra just seem — to me — to be interfering with any sort of modeling and analysis of the phenomena.
As for me and my children, I worry not so much — if Warming happens, or Cooling, in any significant manner, I trust them to be smart enough to adapt.
;o/

using “Global Warming” as an empty mantra

The fact being that it isn’t empty doesn’t preclude it from being used in politics of course.
Albeit dealing with facts is harder for the politician. 😛

17. Wry Mouth

Let me clarify the “empty mantra” jibe — I don’t necessarily mean to imply that the Global Warming model(s) are empty or false. On the contrary; I have already owned up to meteorological models being outside my areas of expertise or even glancing knowledge.
What I wanted to say was that there are those — many — who use the phrase “Global Warming” to try and leverage political or popular power, while having NO or LITTLE idea about what the models are or what they suggest or how they are derived. Indeed, I feel at times that it seems they barely know how to spell GLOBAL WARMING; yet have jumped on the idea that one can use it to cow others into going along with one’s own plans.
That’s all. Sore spot. I feel the same way about a lot of things that people tend to use without the requisite fore- and after-thought. ;o/

18. bill r

Stephen,
Actually, I think we both can agree on that. These posts aren’t about the physics or the climatology, they are about correlations with time. That was my point at the end of the last post.

19. johnny

you wrote What Duffy did is look at his data, and try to find a way of presenting it that appeared to support his pre-selected conclusion. And he managed to find one. He didn’t show a complete analysis – he couldn’t, because a complete analysis would have refuted his argument. So he selectively chose a way of analyzing the data that would produce the desired results: he looked at the data to find the longest period where trend analysis would show what he wanted – and he stopped there.
sorry, but you did exactly the same thing – ‘cherry-picked’ an 8-year trend which supported _your_ point, and ignored the 6-year trend supporting his.
your cherry-picked graph shows a seemingly-dramatic upward trend (of .5 degrees centigrade over 25 years). this may or may not be dramatic, may or may not be a trend…and may or may not be significant. given the typically geological time-scale of global atmospheric conditions, your 25-year ‘trend’ might well be a completely irrelevant blip, but since you chose not to display a graph showing a vastly longer time-scale we lose any such context.
i think projecting malign aims onto another person for having a different viewpoint than yours and accusing them of ‘sleazy tactics’ is silly. he has his opinion; you have yours. you both have ‘cherry-picked’ highly subjective data snapshots which seemingly support your disparate positions, and no doubt each truly believe your positions.
frankly it reminds me of so-called ‘technical analysis’ of the stock-market: you claim and see trends where such may or may not (and typically _does not_ exist)…but then conveniently dispose of any countervailing data, firmly convinced of the ‘reality’ you have created.
i don’t claim to know who is correct, and to me that is the fundamental error in both your stances: claiming you _do_ know you’re correct, when in fact you couldn’t possibly.

20. Tom Farrell

I love the fact that you complain about somebody’s small sample size, and then to prove your point you use a 25 year data sample.
Show me a 5000 year data sample, and then maybe I’ll begin to consider that the temperature rise you show isn’t within the realm of ordinary variation. But then, you won’t, because that would disprove your point…

21. Dave Beardson

Hi Mark, I really like your site, I’ve been a reader for a while now. I’d just like to point out that if people are interested in long term temperature series they’re easily available from here:
http://www.ncdc.noaa.gov/paleo/
Well, maybe it’s easy for me because I know my way around the site. Anyway, this problem with the “no warming” can mostly be chalked up to the atypical 1998 temperatures.
In paleoclimatology what we normally do is standardize all the values to a climate normal (some 30 year average temperature) and then detrend everything else. That gets rid of some of the autocorrelation I believe.
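A rough sketch of that procedure, assuming that "climate normal" means the mean over a chosen 30-year baseline and that "detrend" means subtracting a least-squares straight line. The function names and the synthetic series are invented for illustration; real paleoclimate work may use different baselines and detrending methods.

```python
def anomalies(years, temps, base_start, base_end):
    """Express temperatures as departures from the mean of a baseline period."""
    base = [t for y, t in zip(years, temps) if base_start <= y <= base_end]
    if not base:
        raise ValueError("no data points fall inside the baseline period")
    normal = sum(base) / len(base)
    return [t - normal for t in temps]


def detrend(years, values):
    """Subtract the least-squares straight line from the series."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    return [y - (mean_y + slope * (x - mean_x)) for x, y in zip(years, values)]


# Synthetic series: a steady rise plus a repeating wiggle. NOT real data.
years = list(range(1950, 1990))
temps = [14.0 + 0.01 * (y - 1950) + (0.1 if y % 2 else -0.1) for y in years]

anoms = anomalies(years, temps, 1951, 1980)
residuals = detrend(years, anoms)
print(f"largest residual after detrending: {max(abs(r) for r in residuals):.3f}")
```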

22. Frank

Oh sure, nice argument. Real nice argument. Guess what: the truth is global warming is a hoax. Now you say you found the data you wanted? Well that’s just the way science works. Anyone can prove anything. The real truth is the word of God. Science is a scam and global warming is a hoax perpetrated by Communists. How many climatologists are Russians? How many are atheists? I’ll bet you’ll be surprised when you find out the answer. And when someone like me can destroy your argument just like this maybe you should consider a different line of work… you do work don’t you? Or are my tax dollars paying for that too? This country used to be great until people like you appeased the terrorists.

23. Skeptic

But Gore and his crowd did the same thing with the data used to plot their hockey stick graph. They conveniently left out the medieval warm period, which shows temperatures spiked even without anthropogenic carbon dioxide emissions. If you’re going to tell the tale, tell both sides.

24. Evan

For an alternative viewpoint, check out Climate Audit by Steve McIntyre. He’s the amateur who discovered the egregious errors in the hockey stick graph 2 YEARS after it was presented to the UN. If you want to talk about intellectual dishonesty, at least include Michael Mann (of the aforementioned realclimate.org).
http://www.climateaudit.org/

25. Anonymous

A real scientist would use a graph to support his claims that goes back more than 30 years. There is no doubt the globe is warming now, the question is to look at long term trends.

26. Charlie

A real scientist would use a graph to support his claims that goes back more than 30 years. There is no doubt the globe is warming now, the question is to look at long term trends. People forget that in the 60’s and 70’s some people were claiming the onset of a new ice age. People forget that we conveniently started taking measurements of temperature at one of the coolest points in recent geologic history. And people forget that vegetation alone, including the rain forest they fight to save, is responsible for more greenhouse gas emissions than all of human kind. There is no denying that pollution such as what is going on in China and other industrialized cities in the US is wrong and damaging in other ways, but that it will end up causing the catastrophes predicted of global warming is questionable if not doubtful.

27. Ross

Since PV=nRT, we should just mess with P until T goes back to where we want it. I’m off to Home Depot for an air compressor… brb.

28. Stephen

The ABC radio program Occam’s Razor has had three recent programs. A two part climate denier, and a rebuttal. The rebuttal covers this bit about cherry picking. It surprised me a little, because i couldn’t remember any. Pretty smooth. He doesn’t consider it likely that the denier is incompetent. He talks about how a volcano reduced global temperatures for a couple years. 2002? You can probably just listen to the rebuttal.
http://tinyurl.com/3vxrwr
http://tinyurl.com/4raccn
and the rebuttal:

5000 years of data? Here is 800 000 years of data:

As with the earlier report, the new data reinforces how we’ve now sailed off into terra incognita when it comes to the atmosphere. The feedbacks and cycles that we can observe in the ice cores all operated within narrow ranges of temperatures and greenhouse gasses, and we’re now outside those ranges, meaning that the current conditions represent an experiment with a sample size of one in which we may find new and unexpected feedbacks.

So AGW exists, is confirmed with every extended dataset, and is unprecedented in the current conditions (current tectonics and historically increased solar output, et cetera).
I’m not sure what the point is of denying all the science behind the established consensus, but I trust it must be comforting to look the other way.

30. Episkipos

“It means that there’s a high sensitivity to initial conditions, and a particular kind of stochastic self-similarity. The canonical example of this is brownian motion. Take a cup of tea, and float a grain of pepper on it. Now, every second, plot its position in the tea. It’s going to float around in seemingly random ways. But there’s a pattern to its motion. You’ll see it make some large moves, but they’ll be rare in comparison to average.”
I have found the idea that we can determine if the current warming trend is directly related to human activities a bit, well, egotistical. Current methods, observation abilities and general understanding of large systems such as global weather are exceedingly primitive albeit cutting edge in the current moment.
Before continuing I’d like to point out that the issue of warming and the issue of pollution should not be synonymous. Any reasonable person should agree pollution is not in our best interest even if it does not contribute to global warming. We should endeavour to reduce pollution and even eliminate it altogether.
The above quote has two issues that I find very disconcerting though. First is the idea of “initial conditions”. We simply will never have that data. So as a logical result our models are always wrong. Secondly “You’ll see it make some large moves, but they’ll be rare in comparison to average.” suggests that large moves in data do not alter the underlying “average” which is also a misunderstanding in my view. All it takes is one large “unexpected” event to radically change the model over a given time frame. The idea of the bell curve ‘average’ when accounting for patterns in such things as weather patterns over Epochs of which we have insufficient data is what I refer to as egotistical. Fine for professors seeking tenure but less than adequate for modeling reality.
Also since I for one do not study climate change 18 hours a day and have not done so for 50 years I need to defer to those of us that do so. In specialized societies such as our own this is the norm.
However when there is not overwhelming consensus among those that do study the same thing it is a red flag to me to listen to both sides and wait for a consensus to emerge.
Thanks

31. Mick

So where is that .3 degree temp rise for the decade that was Proclaimed by the Sacred Consensus of the IPCC?
Tag whatever year you want to the cooling trend but the truth remains that the IPCC model failed to replicate “reality”, that inconvenient world of actual observations.
Now are we all gonna drink the Koolaid because of this failed model?

@ Episkipos:
Cool that you accept consensus. That is AGW.
@ Mick:
First, your claim contradicts the presented data in the post. Second, I can’t find any such proclamation. IPCC reviews the current science, so AFAIU there isn’t a single “IPCC model”, and the last report mentions much lower likely rates for the decade in question.
Besides, as the post indicates it is rather risky to look at such short intervals. The main trend over longer periods counts, as well as the fact that the models while becoming increasingly firm (for example, by being rejected by data as you suggest) all predict GW.

33. William Connolley

Climate isn’t chaotic. Weather is chaotic. Climate is the statistics of weather and is stable.

34. pokenhorn

Of course the earth’s climate warms and cools. How else could we have ice ages? The panic these Gore-ites are trying to create is intended to create fear. This hoax is the ultimate tool to make people fearful, and therefore manageable. All the evidence says that the earth’s climate does not stay the same. Therefore, it is either warming or cooling. If you are troubled that it is warming, by default you must wish it were cooling. Why?? I say let’s heat this baby up and open new lands, presently too cold to be desirable for habitation, for human occupation. Relax everybody. The world-wide LEFT is at work here. Driven by their own inner demons to tell everyone else what to do, they have stumbled on to this grand ploy to put themselves perpetually in power in one grand stroke.

35. Sohrab Saran

Since this is the most scientific and logical thread I could find before dinnertime, I would like to ask whether anyone can point me to the DETAILED MATHS of global warming. I feel that relying on statistical evidence of temperature is not good enough because the results would be garbled by (say) a huge sheet of ice falling into the ocean and causing a temporary cooling effect. This is what I would like to see:
1. Earth = basically ball of mud weighing X tonnes traveling at Y km/h receiving radiant energy of Z megawatts per second.
2. Rate of adding CO2 to atmosphere due to Coal+Petroleum+Natural Gas+biological = a+b+c+d = e tons of CO2 per second.
Now the tricky parts:
3. Ocean absorption model (mathematical formula) = ????? – will need the tonnes of water of the earth, current CO2 levels, when will the saturation point be reached….
4. Atmospheric change in reflection/absorption ratio
Someone on some web page said that even H2O is a greenhouse gas having greater absorption spectrum so why is CO2 such a criminal?
5. Any natural methods of CO2 removal … biological absorption (decline curve due to deforestation, climate change etc.)
6. Rate of global warming will be given by a formula based on all the above…
I feel that this will be more clear and convincing than the statistics…
I think that whatever mentioned above is already being modeled by the supercomputers so please point me to the website that explains it like a high-school physics derivation – government-funded IPCC – are you guys listening? Pachauri Sahaab???? – please put it into physics textbooks after 7th standard (at least in India) as a chapter on Global Climate Management.