How significance tests are misused in climate science

Posted on 12 November 2010 by Maarten Ambaum

Guest post by Dr Maarten H. P. Ambaum from the Department of Meteorology, University of Reading, U.K.

Climate science relies heavily on statistics to test hypotheses. For example, we may want to ask whether the global mean temperature has really risen over the past ten years. A standard answer is to calculate a temperature trend from data and then ask whether this temperature trend is “significantly” upward; many scientists would then use a so-called significance test to answer this question. But it turns out that this is precisely the wrong thing to do.

This poor practice appears to be widespread. A new paper in the Journal of Climate reports that three quarters of papers in a randomly selected issue of the same journal used significance tests in this misleading way. It is fair to say, though, that most of the time significance tests are only one part of the evidence provided.

The post by Alden Griffith on the 11th of August 2010 lucidly points to some of the problems with significance tests. Here we summarize the findings from the Journal of Climate paper, which explores how it is possible that significance tests are so widely misused and misrepresented in the mainstream climate science literature.

Not surprisingly, preprints of the paper have been enthusiastically picked up by those on the sceptic side of the climate change debate. We had better find out what is really happening here.

Consider a scientist who is interested in measuring some effect and who does an experiment in the lab. Now consider the following thought process that the scientist goes through:

  1. My measurement stands out from the noise.
  2. So my measurement is not likely to be caused by noise.
  3. It is therefore unlikely that what I am seeing is noise.
  4. The measurement is therefore positive evidence that there is really something happening.
  5. This provides evidence for my theory.
This apparently innocuous train of thought contains a serious logical fallacy, and it appears at a spot where not many people notice it.

To the surprise of most, the logical fallacy occurs between step 2 and step 3. Step 2 says that there is a low probability of finding our specific measurement if our system would just produce noise. Step 3 says that there is a low probability that the system just produces noise. These sound the same but they are entirely different.

This can be compactly described using Bayesian statistics: Bayesian statistics relies heavily on conditional probabilities. We use notations such as p(M|N) to mean the probability that M is true if N is known to be true, that is, the probability of M, given N. Now say that M is the statement “I observe this effect” and N is the statement “My system just produces noise”. Step 2 in our thought experiment says that p(M|N) is low. Step 3 says that p(N|M) is low. As you can see, the conditionals are swapped; these probabilities are not the same. We call this the error of the transposed conditional.
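The error of the transposed conditional can be made concrete with a small simulation. The sketch below is purely illustrative: the 50% prior on the null, the trend size of 0.02 per step, and the 0.05 slope threshold are all assumed numbers, not values from the paper. It generates many short series that are either pure noise (N true) or noise plus a weak trend (N false), lets M be "the fitted slope exceeds the threshold", and then estimates both conditionals by counting:

```python
import random

random.seed(42)

def slope(series):
    """Least-squares slope of a series against its time index."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den

TRIALS = 20000
THRESHOLD = 0.05          # M: "the fitted slope exceeds 0.05 per step"
noise_count = trend_count = m_given_noise = m_given_trend = 0

for _ in range(TRIALS):
    noise_only = random.random() < 0.5          # prior p(N) = 0.5 (assumed)
    drift = 0.0 if noise_only else 0.02         # weak real trend when N is false
    series = [drift * t + random.gauss(0.0, 1.0) for t in range(30)]
    observed = slope(series) > THRESHOLD        # did M happen?
    if noise_only:
        noise_count += 1
        m_given_noise += observed
    else:
        trend_count += 1
        m_given_trend += observed

p_M_given_N = m_given_noise / noise_count                      # what the test reports
p_N_given_M = m_given_noise / (m_given_noise + m_given_trend)  # what we care about
print(f"p(M|N) ~ {p_M_given_N:.3f}, p(N|M) ~ {p_N_given_M:.3f}")
```

With these assumed numbers the run typically yields p(M|N) near 0.01, comfortably "significant", while p(N|M) comes out roughly an order of magnitude larger: the two conditionals answer genuinely different questions.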

How about a significance test? A significance test in fact returns a value of p(M|N), the so-called p-value. In this context N is called the “null-hypothesis”. It returns the probability of observing an outcome (M: we observe an upward trend in the temperature record) given that the null-hypothesis is true (N: in reality there is no upward trend, there are just natural variations).

The punchline is that we are not at all interested in this probability. We are interested in the probability p(N|M), the probability that the null hypothesis is true (N: there is no upward temperature trend, just natural variability) given that we observe a certain outcome (M: we observe some upward trend in the temperature record).

Climate sceptics want to argue that p(N|M) is high (“Whatever your data show me, I still think there is no real trend; probably this is all just natural variability”), while many climate scientists have tried to argue that p(N|M) is low (“Look at the data: it is very unlikely that this is just natural variability”). Note that low p(N|M) means that the logical opposite of the null-hypothesis (not N: there really is an upward temperature trend) is likely to be true.

Who is right? There are many independent reasons to believe that p(N|M) is low; standard physics for example. However many climate scientists have shot themselves in the foot by publishing low values of p(M|N) (in statistical parlance, low p(M|N) means a “statistically significant result”) and claiming that this is positive evidence that p(N|M) is low. Not so.

We can make some progress though. Bayes' theorem shows how the two probabilities are related. The aforementioned paper shows in detail how this works. It also shows how significance tests can be used; typically to debunk false hypotheses. These aspects may be the subject of a further post.
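Bayes' theorem makes the relationship explicit: p(N|M) = p(M|N) p(N) / p(M), where p(M) = p(M|N) p(N) + p(M|not N) p(not N). A back-of-the-envelope sketch, in which all three input numbers are assumptions chosen for illustration rather than values from the paper:

```python
# Bayes' theorem: p(N|M) = p(M|N) * p(N) / p(M),
# where p(M) = p(M|N)*p(N) + p(M|not N)*p(not N).
p_M_given_N = 0.04       # the "significant" p-value reported by the test
p_M_given_notN = 0.50    # chance of seeing the trend if it is real (assumed)
p_N = 0.5                # prior belief in the null hypothesis (assumed)

p_M = p_M_given_N * p_N + p_M_given_notN * (1 - p_N)
p_N_given_M = p_M_given_N * p_N / p_M
print(round(p_N_given_M, 3))  # 0.074
```

Even with a 4% p-value, the null retains over 7% posterior probability under these assumptions, and a more sceptical prior (p_N closer to 1) would leave it far higher, which is exactly why a low p(M|N) on its own settles nothing about p(N|M).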

In the meantime, we need to live with the fact that “statistically significant” results are not necessarily in any relevant sense significant. This doesn't mean that those results are false or irrelevant. It just means that the significance test does not provide a way of quantifying the validity of some hypothesis.

So next time someone shows you a “statistically significant” result, do tell them: “I don't care how low your p-value is. Show me the physics and tell me the size of the effect. Then we can discuss whether your hypothesis makes sense.” Stop quibbling about meaningless statistical smoke and mirrors.

Reference:
M. H. P. Ambaum, 2010: Significance tests in climate science. J. Climate, 23, 5927-5932. doi:10.1175/2010jcli3746.1



Comments


Comments 51 to 84 out of 84:

  1. I'm happy to third KR's comment. It is straightforward to show that it is reasonable to talk of the probability that a hypothesis is true. If BP and I were to bet on the number of times a coin I took from my pocket came up heads, and I flipped six heads in a row, then BP might well hypothesize that my coin was biased. However, no matter how many times I got one head after another, he could never know for certain that the hypothesis was true, as (infinitely) long runs of heads are possible, just (infinitely) improbable. But does that mean he is limited to saying "I don't know" when asked if his hypothesis is true? Of course not; most people would have no difficulty in quantifying their belief in the truth of BP's hypothesis. Indeed, that is exactly what gamblers do whenever they make a wager, which IIRC is where the Bayesian approach to probability (a mathematical framework for quantifying belief in the truth of an uncertain proposition) originated.

BTW - as I believe Donald Rumsfeld (sort of) said - there are things you know, there are things you know you don't know, and there are things you don't know you don't know. Statistics of any framework is a good way to deal with the things you know. Bayesian statistics also has a good way of dealing with what you know you don't know - you introduce a minimally informative prior (using techniques like MaxEnt and transformation groups to decide what is minimally informative) representing your ignorance of that element of the analysis and integrate it out (marginalisation). The things we don't know we don't know, we can't do too much about, other than adopting a cautious approach, avoiding overstatement of our findings and being willing to recognise when we are wrong.
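The coin example above is easy to quantify with Bayes' rule. A minimal sketch, assuming (purely for illustration) that a biased coin lands heads 90% of the time and that the prior probability of the coin being biased is 1%:

```python
P_BIASED_PRIOR = 0.01   # prior belief the coin is biased (assumed)
P_HEADS_BIASED = 0.9    # heads probability for a biased coin (assumed)
P_HEADS_FAIR = 0.5

def posterior_biased(k):
    """Posterior probability the coin is biased after k heads in a row."""
    like_biased = P_HEADS_BIASED ** k * P_BIASED_PRIOR
    like_fair = P_HEADS_FAIR ** k * (1 - P_BIASED_PRIOR)
    return like_biased / (like_biased + like_fair)

for k in (0, 2, 4, 6, 8, 10):
    print(k, round(posterior_biased(k), 3))
```

Under these assumptions, six heads lift the belief from 1% to about 26%, and ten heads push it to around 78%, yet it never reaches certainty: long runs make bias ever more probable without ever proving it.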
  2. 51 KR, 52 Tom, 53 Dikran: The specifics of SOC aside, I'm not sure where your difference with BP lies; it appears to be one of emphasis. The uncertainty for BP is crucial; for the others it's a minor irritation.

Specifically to KR on your point 2: Sceptics seem to believe that the introduction of the IPCC into climate science has caused some leap-frogging in that process you describe, in order to get all the way to CAGW and to provide the basis on which policy can be formulated. I'm not sure anybody expects perfection, but recognition of the imperfection would be a useful start. As an example, I was browsing through Maarten's department website. There's plenty on clouds and the suggestion that we haven't even got the measurements right yet. It becomes a matter of (expert) opinion as to whether that constitutes certainty or uncertainty.

Dikran, you seem to introduce an element of the subjective into the discussion. The gambler (or we) can judge the hypothesis based on our "belief"; we could replace that word with experience or expert opinion. In fact, the example you give of six heads in a row could very easily lead us down the wrong path. I could point out the chances of that are ~1%, suggest you're trying to cheat me and, if this was the Wild West, shoot you. That might be evidence of my poor gambling experience (or violent tendencies). The stats in this case have done nothing to bring us to this conclusion; it's actually all down to opinion. In fact, my poor use of the stats might have contributed to giving me certainty about my wrong decision. As Maarten points out, stats don't point us to which is the correct path to take.

As I'm ignorant on this subject: to what extent do stats rely on subjective judgement? I'd always thought the point of them was to get away from this, but from the descriptions I'm reading here, at some level subjectivity seems to be part of the process. In Bayesian statistics the attempt is to build it into the process; in frequentist statistics (as in the example in #49) it seems to be added afterwards.
  3. #51 KR at 05:17 AM on 17 November, 2010: "You state that weather is in a state of Self Organized Criticality - SOC. I have been unable to find any references that indicate this; do you have a paper to link to on this subject?"

There's a review paper: Suraje Dessai & Martin E. Walter, "Self-Organised Criticality and the Atmospheric Sciences: selected review, new findings and future directions", XE: Extreme Events workshop, Developing a Research Agenda for the 21st Century, Boulder, Colorado, June 7-9, 2000.

"We suspect theories of complexity, such as SOC, have been underrepresented in the atmospheric sciences because of their "soft science" character. Atmospheric sciences have historically developed from centuries of advancement in the hard sciences, such as physics, mathematics and chemistry, etc. It would have been unlikely to see a quick transition from the classical reductionist and reproducible science approach towards an abstract, holistic and probabilistic complex science. Proof of this is the fact that only a small number of scientists have cited the few applications of these theories in the atmospheric sciences."

"Another possibility for the neglect of SOC in the atmospheric sciences is the increased funding of applied atmospheric sciences (e.g. climate change research) vis-à-vis the decreased funding of "basic" research, i.e., according to Byerley (1995) research to increase knowledge; to answer a scientific as opposed to a practical question."

This one has some ideas of its own: R. R. Joshi and A. M. Selvam, "Identification of Self-Organized Criticality in Atmospheric Low Frequency Variability", Fractals, Vol. 7, No. 4 (1999), pp. 421-425.

This is also interesting (possible scale-free behavior of atmospheric flows from 1 cm to 1000 km): S. Lovejoy, "Area-Perimeter Relation for Rain and Cloud Areas", Science, 9 April 1982, Vol. 216, no. 4542, pp. 185-187. DOI: 10.1126/science.216.4542.185.

Or this one on 1/f noise in a particular time series: C. Varotsos, M.-N. Assimakopoulos & M. Efstathiou, "Technical Note: Long-term memory effect in the atmospheric CO2 concentration at Mauna Loa", Atmos. Chem. Phys., 7, 629-634, 2007.
  4. HumanityRules - The reason this is an important point is due to the oft-repeated claims by various people that summarize to "any uncertainty invalidates human driven global warming", or at least needs "sound science", a favorite claim of the Marshall Institute, for example.

There is always some level of uncertainty in science, some small chance that your theory doesn't actually match up with the behavior of the universe. Perhaps you need better measurements (Newtonian vs. Einstein's physics, for example) to point it out. Perhaps your epicycles are way too complicated. Perhaps you have been deceived by ideology, or bad choice of drugs! But agreement with known physics and statistical evaluation of the data (whether Bayesian or frequentist) helps you to rank hypotheses in order of agreement with the data. That's key to making a scientific judgement. Yep, it's somewhat subjective. All induction is. But given a pile of reasons on one side of an argument and a pile of illogic, poor data, or contradictions on the other, you can generally make that call. If not, study some more.

Yes, there is some possible uncertainty even in the law of gravity - it might stop working tomorrow. Nature may be manipulated by the lawn gnome Illuminati. Climate science may be the result of a cabal of One World Socialists bent on world domination. Or the centuries of scientific research and independent investigators may have identified key elements of how we affect climate by our actions. We can rank the uncertainties - only the last is even remotely a probable (first and second definitions) hypothesis. Small uncertainties (part of the nature of science and inductive reasoning) do not unseat an entire block of science - especially when the alternatives presented are hugely more uncertain.
  5. Berényi - Thank you, those are some very interesting papers. I'm a bit concerned that Joshi et al. 1999 only seems to study 90 days of data. We know "weather" is chaotic; climate doesn't seem to show the same fractal nature of variation, and 90 days of data can only support cyclic variations of 30-45 days at most, not millennia. Extending that analysis to thousands of years will take a great deal more data.

The CO2 paper states: "Therefore the long-range correlations in the atmospheric CO2 that deduced from the present analysis can help in recognition of anthropogenically induced changes caused by increased CO2 emissions to the atmosphere on the background of natural atmosphere changes." So they appear to be identifying patterns of variation that can be identified superimposed on anthropogenic CO2 change to better identify the signature, and don't seem to make any claims about low frequency (long term) variation coming from "pink noise".

Finally, the cloud cover/fractal distribution paper is excellent. That's a very clean analysis of fractal dimensionality. But I don't see the connection between fractal self-similarity and 1/f noise. Nor do I see the significance with regard to tracking climate changes. I really don't see how these descriptions of weather are critical issues for climate. Parametric descriptions of fractal systems are perfectly adequate for analyzing mass behavior, as long as the parametric description includes observed internal variability at the appropriate scales.
  6. 56 KR: "any uncertainty invalidates human driven global warming" - that seems like the extreme end of a spectrum of opinion that starts somewhere near "some uncertainty may question the magnitude of human driven global warming". This may in fact be where the IPCC sits. I'm happy to agree that the example you highlight is wrong, but I don't see that this invalidates the whole spectrum. I actually don't know where I sit on that spectrum; it probably changes on a daily basis, maybe something like "there is sufficient uncertainty that we cannot accurately attribute warming."

Anyway, I think there are a couple of statements in your post that come out of the false dichotomy of denier/alarmist. They are products of the politics rather than the science. For me this is a barrier to resolving the issue. If Judith Curry can step outside of the consensus to ask questions only to be labelled incompetent or a dupe or worse, then I don't think there is much hope for the process. I really don't need to be convinced that the Marshall Institute is wrong; I need to see that Curry asking questions is accepted as part of the normal scientific process.

Those statements were: "But given a pile of reasons on one side of an argument and a pile of illogic, poor data, or contradictions on the other, you can generally make that call. If not, study some more." and "Small uncertainties (part of the nature of science and inductive reasoning) do not unseat an entire block of science - especially when the alternatives presented are hugely more uncertain."

It strikes me that "AGW vs the rest of the world" is more Hollywood than science. (Anyway, sorry, this is drifting away from stats.)
  7. #51 KR at 05:17 AM on 17 November, 2010: "We can only state that a particular hypothesis is more probable than others given the evidence, the statistics of our data. And whether using Bayesian or frequentist methods, we can estimate from the statistics the probability (second definition) that our hypothesis is supported by that data. That's how induction works, and how we can learn something new."

Induction is not a scientific method. It is a heuristic method (one of many) used to arrive at universal propositions of any kind, some being scientific among them. But what makes a universal proposition scientific is not the fact that it is supported by data, but that it is not contradicted by any of them.

In Galileo's time, according to the prevailing theory of heavenly bodies, they were supposed to be perfect spheres. Up to the moment Galileo constructed his first (improved) telescope in late 1609 and started to study the skies with it in November, this theory was consistent with observations. However, it was not based on induction in any sense; that is, it was not the case that many heavenly bodies had thoroughly been observed, all of them found to be perfectly round with a smooth surface, and a universal law of their shape arrived at. In fact, only the angular extents of the Sun and the Moon are large enough to be seen as other than point sources with the naked eye. Quite the contrary: there was a general principle stating that the Heavens were eternal and perfect, while the Earth was home to transient and imperfect phenomena (supported by the cosmological role of each, well known to anyone familiar with Scripture). From this distinction the Theory of Roundness follows easily. What is more, it is also consistent with Occam's Razor. Why should, after all, heavenly bodies assume any shape other than the most perfect one, the sphere?

A single lunar observation of Galileo's was enough to falsify this theory and replace it with another one, stating that all the heavenly bodies are like Earth, at least in principle. Not an inductive step either. (He saw bright spots on the dark side of the Moon, more than 1/20 lunar diameter away from the terminator, and concluded correctly, using simple geometry, that those were peaks of mountains more than 4 Italian miles - 7320 m - high, illuminated by the rising sun at the first streak of lunar dawn.)

In fact, the motivation behind new revelations is seldom true induction. They are more often than not based on novel application of general principles coupled to a few select facts, like Einstein's geometrodynamics, based entirely on symmetry principles with a healthy addition of fact in the form of the Weak Equivalence Principle, verified by Baron Roland von Eötvös and his team with 10^-8 precision for a few samples, as reported in 1909 at the 16th International Geodesic Conference in London. Mathematics, as the language of Natural Philosophy, plays a very special role in this process.

Eugene P. Wigner, "The unreasonable effectiveness of mathematics in the natural sciences" (Richard Courant lecture in mathematical sciences delivered at New York University, May 11, 1959), Communications on Pure and Applied Mathematics, Volume 13, Issue 1, pages 1-14, February 1960. DOI: 10.1002/cpa.3160130102. "The miracle of the appropriateness of the language of mathematics for the formulation of the laws of physics is a wonderful gift which we neither understand nor deserve."
  8. KR (#51), thank you for pointing to your reply here. My disagreement with Tom on the other thread was almost purely philosophical, akin to arguing religion or politics, and should be ended as irresolvable (my friends always say "just shut up"). Therefore I appreciate your (and BP's) efforts in this thread to keep the focus on climate science, or at least empirically based science.

I do not believe a hypothesis in climate or any other science can be solely conditioned on statistical measurements. If we did that, we could say (a la Nixon): "We are all Bayesians now". But in reality there are physical connections (which could be chained), and statistical measurements are merely a result of the physical connections (chained or not). We can start with the handy fact that everything in the universe is physically connected, albeit in some cases extremely weakly, and chained in other cases. For example, the fact that we can detect a distant quasar means its fields impact the earth in some way (thankfully not enough to complicate weather or climate predictions).

I could hypothesize that having more clouds in Norway causes my garden in Virginia to be warmer. There may be some extremely small direct connection (e.g. via gravity) which I would ignore. I can make measurements and find some correlation. But I know from well supported theories of weather and the geographic limits of weather systems that there is no direct physical connection. The existing theories point to many possible confounding (and causal) factors with supporting evidence, including statistical measurements and direct physical connections. Based on these relationships we end up building a model which is (AFAIK) always an oversimplification. There is then a strong argument for the applicability of statistical measurements to support the oversimplified model.

Assuming that there are alternative oversimplified models to choose from, do we "rank order" them based on the data, as KR says? I suggest we do not. What appears to be imprecision in the data is actually inapplicability to the modeled relationship. The essential difference between the applicability of a measurement (the hypothesized causality in a theoretical model) and a conditional probability as defined by Dr Ambaum is that the former is either true or not true and the latter is a measurement. This difference may appear obvious and superficial for most directly connected phenomena, but it is easily confused in a complex model. This is because the imprecision of the measurements gets conflated with the weakness of the causality.
  9. Berényi - "Induction is not a scientific method. It is an heuristic method (one of many) used to arrive at universal propositions of any kind, some being scientific among them. But what makes a universal proposition scientific is not the fact it is supported by data, but that it is not contradicted by any of them."

I would have to disagree. Induction is not only a scientific method, it is the scientific method. Generalizing from observations and forming a 'universal' proposition is done via induction - deduction cannot teach you anything you don't already know. Testing inductive hypotheses for validity, yes, you've described that quite well. But the universal propositions you describe are generated by induction.

In fact, even deductive logic is based on induction. Given deductive logic, you can derive interrelationships and implications, starting from your premises and reasoning through first and higher order logic:

1. All men are mortal
2. Socrates is a man
3. Therefore, Socrates is mortal

But your premises - those are inductive. You believe them because you have observed them to be valid (all men are mortal), an inductive statement from experience.

From a book I read recently:
A: "The world is a disk, which sits on the backs of four giant elephants."
B: "What's under the elephants?"
A: "Oh, from there it's more elephants all the way down..."
  10. KR, that reference seems a bit garbled. I've seen versions where it's elephants standing on the back of a turtle, and where it's turtles all the way down ... but is there really one where it's elephants all the way down? http://en.wikipedia.org/wiki/Turtles_all_the_way_down
  11. Ned - I think you may be right! I must have mixed that story with the Discworld series, where the disc rests on the backs of four elephants (who themselves stand on a turtle). Sigh. I have to work on my metaphors a bit, obviously.

One more note on induction - our data (outside the most simplistic cases) is never perfect; there's always some noise in the measurements. Newtonian physics was fine until our measurements improved, whereupon we got Einstein. We take these inductively generated hypotheses, test them against the data (hopefully increasing the probability, the inductive likelihood of being correct), and judge them against other hypotheses based on those inductive supports. After enough evidence accumulates, enough tests performed, we can accept these hypotheses as generally applicable.

But - our knowledge is not perfect. Newtonian physics was thought to be a 'universal proposition'; turns out it's useful in many cases, but not correct. We have to keep in mind the separation between the world we live in [the baseline against which we work] and our theories of how it operates [which are our best evaluations, not crystalline truths].
  12. BP, can you provide a deductive chain of reasoning that establishes the theory of evolution? If not, does that mean the theory is not scientific?
  13. BP wrote "Induction is not a scientific method. It is an heuristic method (one of many) used to arrive at universal propositions of any kind, some being scientific among them. But what makes a universal proposition scientific is not the fact it is supported by data, but that it is not contradicted by any of them." No, creationism is not unequivocally contradicted by any observation, but it isn't a scientific theory. According to Popper the *possibility* of falsification distinguishes scientific theories from unscientific ones. Creationism is non-falsifiable as the deity may have buried dinosaur bones as a test of our faith etc.
  14. KR, here is a simple pre-Newtonian physics: Velocity = Force * Constant. For example, as I push a shopping cart with a constant force, it travels at a constant speed. It attained that speed (from rest) as I applied the force and maintains that speed as long as I apply the force. You might argue there is some sort of theoretical "friction" in the wheels but you will have to show how to measure that friction along with your new theory. Once you demonstrate your Newtonian theory, my theory is not "useful in many cases, but not correct", it is simply wrong and discarded unless your theory allows it to be true in special cases. In your example above, Newtonian theory is 100% incorrect. The fact that it is "useful in many cases" simply indicates that measurements are being taken and utilized with low enough precision to appear correct in Newtonian theory. Those measurements are not "probabilistically correct" in any way, they are simply too imprecise to be correct (for relativity theory) or impossible to ever measure precisely (for quantum mechanical theory).
  15. Eric - What was Newtonian physics prior to the more accurate measurements? Was it a universal proposition? A complete truth? Or was it rather the best we could do at the time? As is Einstein's physics now? I would recommend for your reading topics on The Problem of Induction, in particular David Hume, Karl Popper, and Wesley Salmon (who I had the pleasure of taking some courses with). Those links contain some overviews and multiple links to further discussions.
  16. KR, thanks. You haven't demonstrated that my pre-Newtonian physics example above is not "universal" or "complete" (nor have you defined those). Next I will read Salmon since he seems to have the best counterargument.
  17. #65 Dikran Marsupial at 05:02 AM on 18 November, 2010 No, creationism is not unequivocally contradicted by any observation, but it isn't a scientific theory. According to Popper the *possibility* of falsification distinguishes scientific theories from unscientific ones. Creationism is non-falsifiable as the deity may have buried dinosaur bones as a test of our faith etc. You are correct. I should have added to "not contradicted [by data, observation, measurement, whatever]" possibility of falsification as well, that is, the theory also have to be able to specify under what state of affairs it is considered to be contradicted by facts. This is precisely one of the most serious drawbacks of CO2 induced warming theory. In the above sense it is not falsifiable, because 1. The concept of "forcing" does not have a proper definition. This fact is shown by the existence of an arbitrary fudge factor attached to each kind of forcing, called "efficacy" (for example according to some studies the same forcing expressed in W/m2 in case of black carbon on snow is supposed to have more than three times the efficacy of atmospheric carbon dioxide - but should measurements indicate polar soot pollution is high enough to explain recent warming at high latitudes, this fudge factor is always malleable enough to leave room for significant CO2 effect, enhanced of course by some supposed water vapor feedback). 2. "Climate sensitivity" does not have a sharp enough definition either. We have no idea about either the shape of the response function (if it is a first order one or has some more complex form) or the time constant(s) involved, that is, in what time climate is supposed to attain equilibrium after a step change in a particular "forcing". 
According to a bunch of studies just about anything is consistent with AGW theory, including increased or decreased storm activity, multi-year flat OHC, drought, flood, warming or cooling, more snow, less snow, increasing or decreasing sea ice. One only wonders what state of affairs would constitute a falsification of this theory. I mean if century scale climate sensitivity to CO2 doubling is in fact less than 1°C (negative feedback), is there a climate indicator that would show it beyond reasonable doubt in significantly less time than a century? The literature is distressingly silent about it, although exactly this kind of study would have the capacity to make propositions about AGW scientific, therefore it is indispensable to any level of credibility. A recent study goes as far as claiming severe continental scale winter cooling is not only consistent with "global warming", but it is a consequence of it, kind of proof. JOURNAL OF GEOPHYSICAL RESEARCH VOL. 115, D21111, 11 PP., 2010 doi:10.1029/2009JD013568 A link between reduced Barents-Kara sea ice and cold winter extremes over northern continents Vladimir Petoukhov & Vladimir A. Semenov "Our results imply that several recent severe winters do not conflict the global warming picture but rather supplement it" At the same time they do not bother with elaborating on other effects of a supposed partially ice free Barents-Kara sea in winter, like where this oceanic heat lost to the arctic winter atmosphere is supposed to go or how this loss influences overall OHC. If they would, I suppose we could see a strong local negative feedback at work, barely consistent with positive feedback. The meticulous PR transition from the original buzzword "global warming" through "climate change" to "climate disruption" does not help building public confidence either. 
It is not only the case that we do not have a definition of "disruption" sharp enough to be falsifiable; it is also utterly impossible to define what is supposed to constitute climate disruption as opposed to natural variability. Questions like these have nothing to do directly with significance tests or the way they are used, the failure to explicitly define Bayesian priors, etc., except that in an honest falsifiability study these ingredients would find their proper place.
    0 0
  18. BP @ 69 - the key point was that your claim that induction is not scientific is simply incorrect; it would be better if we resolved that issue rather than be diverted by more tangential matters. Is the theory of evolution scientific? Having said which, it is clearly not true that the theory of CO2-induced warming is non-falsifiable. All it would take would be a period with increasing radiative forcing due to CO2 without warming, sufficiently extended that the lack of warming could not be attributed to the natural variability of the climate, and that could not be attributed to the action of other known forcings. AGW theory is directly falsifiable by observations and hence is a scientific theory. For a concrete example: a thirty-year period of cooling, with increasing CO2 and all other forcings remaining approximately constant, would kill AGW theory stone dead. It is also not the case that forcing is inadequately defined - see e.g. the definition given in the glossary of the IPCC WG1 report. You appear not to understand the reason for "efficacy": it simply allows the effect on climate of different forcings to be expressed in terms of the effect of CO2 - it is a help in comparing the relative importance of different factors, nothing more. As to the paper you cite, a theory isn't falsified by the observation of something that the theory predicts, so that is no indication that the theory is not falsifiable. It is just an observation that doesn't falsify the theory.
    0 0
  19. @Berényi Péter Your claim: "it simply does not make sense to talk about the probability of hypotheses being true (or false). It's either true or false. Of course it is entirely possible we are ignorant about its truth value" In what respect is a hypothesis, given that it has been decided, true or false?
    0 0
  20. @KR Berényi Péter claimed that "Induction is not a scientific method.". To this you responded: "I would have to disagree. Induction is not only a scientific method, it is the scientific method. Generalizing from observations and forming a 'universal' proposition is done via induction". Perhaps Berényi's claim is that we cannot use positive evidence, such as induction, as support for a hypothesis being true, and that therefore any such attempt is non-scientific.
    0 0
  21. #70 Dikran Marsupial at 22:10 PM on 18 November, 2010 Is the theory of evolution scientific?

I am not aware of a single well-defined scientific theory of evolution. If you mean that peculiarly American idiosyncrasy, the so-called Evilution vs. Cretinism controversy, I refuse to play that game. There are specific theories of various aspects of the overall evolutionary process that can be called scientific; none of them is based on induction. If you mean the simple observation that the geological record is full of fossil remnants of extinct species, that's not a theory, just a bunch of facts begging for a theoretical explanation. Some attempts at such an explanation may be inherently scientific in nature, others not so much. Early theories like Lamarckism, Spencerism or Darwinism are already falsified, at least to the extent that they were specific enough in their predictions and proposed mechanisms to lend themselves to a well-defined logical procedure such as falsification. There is also a cohort of recent theories going under the umbrella term Neo-Darwinism, all based on a unification of ideas from Alfred Russel Wallace (a spiritualist) and Gregor Johann Mendel (an Augustinian monk). It is not a unified theory either, just a meta-theory, which encouraged the formation of various scientific theories, some of them still standing. There is nothing specifically inductive in the principles underlying those theories. They are generally based on the postulated existence of variable replicators in an environment with finite resources. As the replicators are capable of increasing their numbers exponentially, some (natural) selection inevitably occurs. However, the outcome depends heavily on the type of replicators; spontaneous development of even simple autocatalytic sets is empirically unsupported in real chemistry. In this sense the generation of complexity by evolution is still not well understood.
There is strong indication that below a certain (quite high) level of algorithmic complexity, entities are not able to function as Darwinian replicators. There needs to be a specific type of variability in the replication process in order for selection to be able to work in a creative way; that is, it's not true that just any kind of variability would suffice. This is why abiogenesis is still outside the realm of science, with no "standard model" of the origin of life in sight. As we have never seen life outside Earth, there is no empirical basis for assessing the probability of the spontaneous occurrence of life either. All we know is the conditional probability of life having appeared, provided we consider this problem, and that conditional probability is exactly 1 (see anthropic principle). So I do not quite know what you are trying to get at by bringing up evolution in the present context, but it is obviously more problematic than you would imagine.
    "a thirty year period of cooling, with increasing CO2 and all other forcings remaining approximately constant would kill AGW theory stone dead" That's not true. Between 1943 and 1976 (33 years) the global land-ocean temperature index dropped (by 0.12°C) while atmospheric CO2 concentration increased from about 300 ppmv to 332 ppmv. If CO2 radiative forcing is supposed to be a logarithmic function of its concentration, this is 14.6% of the forcing for CO2 doubling. If we go with the IPCC mean estimate of 3°C for doubling, surface temperature should have increased by 0.44°C during the same period. Therefore the missing heat is 0.56°C in 33 years, which indicates a cooling trend at a 1.7°C/century rate without the CO2 contribution. And that is with the heavily adjusted GISTEMP figures (raw temperature data as measured by thermometers show a more severe cooling in this period, in excess of 0.3°C). The effects of CH4 and other trace gases with absorption lines in the thermal IR are not taken into account either. Therefore AGW theory would have been killed stone dead a long time ago, were it not for the "all other forcings remaining approximately constant" clause. In the IPCC Fourth Assessment Report: Climate Change 2007: Working Group I: The Physical Science Basis: Glossary the following definition is given:
    Radiative forcing Radiative forcing is the change in the net, downward minus upward, irradiance (expressed in W m–2) at the tropopause due to a change in an external driver of climate change, such as, for example, a change in the concentration of carbon dioxide or the output of the Sun. Radiative forcing is computed with all tropospheric properties held fixed at their unperturbed values, and after allowing for stratospheric temperatures, if perturbed, to readjust to radiative-dynamical equilibrium. Radiative forcing is called instantaneous if no change in stratospheric temperature is accounted for. For the purposes of this report, radiative forcing is further defined as the change relative to the year 1750 and, unless otherwise noted, refers to a global and annual average value. Radiative forcing is not to be confused with cloud radiative forcing, a similar terminology for describing an unrelated measure of the impact of clouds on the irradiance at the top of the atmosphere.
    The remarkable part of it is that radiative forcing is defined at the tropopause: an ill-defined surface (because of occasional tropopause folding events) high up in the atmosphere but well below any satellite orbit, a surface where practically no measurements of IR irradiance are made (either up or down). The only direct way to determine whether a CO2-induced warming effect (of a magnitude similar to the one estimated by the IPCC) is falsified by the observed surface cooling between 1943 and 1976 is to analyze the difference of two unmeasured quantities at an unknown surface. Otherwise, as this quantity is obviously unknown, one can assume there was a negative forcing there, canceling the effect of increasing CO2 concentrations. And this is exactly what people do (by setting supposed aerosol effects to a suitable value, neither supported nor contradicted by measurements). That's what I mean by the theory being malleable enough to resist falsification attempts. Not because it is true, but because it is flexible (not a good property for a theory that is supposed to be scientific).
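The back-of-envelope numbers above (14.6% of a doubling, 0.44°C expected) can be checked with a short sketch. This takes the comment's figures at face value and assumes the standard logarithmic dependence of CO2 forcing on concentration:

```python
import math

# Fraction of a full CO2 doubling represented by 300 -> 332 ppmv,
# using the logarithmic dependence of forcing on concentration.
# The concentrations and the 3 degC/doubling sensitivity are the
# values quoted in the comment, taken at face value here.
C0, C1 = 300.0, 332.0                      # ppmv, 1943 and 1976 (as quoted)
frac = math.log(C1 / C0) / math.log(2.0)   # fraction of one doubling
sensitivity = 3.0                          # degC per doubling (IPCC mean)
expected = frac * sensitivity              # implied warming

print(f"fraction of doubling: {frac:.3f}")      # ~0.146
print(f"implied warming: {expected:.2f} degC")  # ~0.44
```

Note that this reproduces the comment's arithmetic only; it assumes the full equilibrium response is realized within the 33-year period, which is itself part of what the replies below dispute.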
    0 0
  22. First, as to the thirty-year period of cooling: I note that you ignored the part about all other forcings remaining approximately constant, which does not hold for the 1943-1976 period. Indeed, the sulphate aerosol issue made some scientists at the time discuss the possibility of an ice age. Second, a precise definition of radiative forcing is not needed for observational falsification of the theory as a whole. Popper's idea of falsifiability assumes that both parties attempt to find a test of the theory in good faith. If we observed a sufficiently extended period of cooling with CO2 rising and all other forcings constant, that would falsify AGW theory, regardless of how you measured radiative forcing. Lastly, regarding evolution, let's take Darwinian evolution as a particular case; it is entirely inductive, a general principle based on observation of particular examples. But most scientists would agree it is science. If you think it has been falsified (please do give links - genuinely interested), then it is by definition falsifiable, and hence a scientific theory according to Popper.
    0 0
  23. Berényi - So, the cycle of multiple observations, generalizations, identification of common elements, hypothesis, testing - that's not science by induction???? Because that is how the theory of evolution came about, and it is inductive reasoning. You are using a very different dictionary than most, if that is the case. Darwin, incidentally, did not propose a mechanism for inheritance, acknowledging that while offspring carried traits of their parents, he did not know the details. Your reference to Wallace is interesting, but I don't think it relevant - both Darwin and Wallace held much the same ideas. And your use of the terms "spiritualist" and "Augustinian monk" appears to be an ad hominem reference - I sincerely hope I am incorrect in that appraisal.
    0 0
  24. The issue about stratospheric intrusions looks like a red herring to me. According to the paper BP linked, they are small-scale events (300 m - 1 km), which suggests they have very little effect on the measurement of global radiative forcing.
    0 0
  25. BP's claim that induction is not a valid part of the scientific method is absurd. Induction is a cornerstone of the scientific method, without which we would be unable to perform scientific research. No wonder BP has such trouble making a coherent scientific argument if he lacks this basic understanding (his maths is clearly not too shabby, although he seems to lack sufficient grounding in statistics to be coherent there too). It looks like BP is not alone in criticising the validity of induction, although as Fisher (1955, pp. 74-75) shows, this anti-induction interpretation, in the context of the kinds of probabilistic reasoning used in climate science and elsewhere, is clearly totally incorrect. It may make sense in some engineering fields, but as I have no expertise in that area, I cannot comment.
    0 0
  26. #75 KR at 10:33 AM on 21 November, 2010 that is how the theory of evolution came about, and it is inductive reasoning

You do not have to believe every single word Darwin put down in his Autobiography, such as "I worked on true Baconian principles, and without any theory collected facts on a wholesale scale". In fact he did no such thing; he was much more a follower of John Stuart Mill in this respect. Here is the first paper on (Darwinian) evolution. You can check for yourself how much of it is based on sheer induction, as opposed to a quick hypothesis deduced from a few undeniable universal facts: creatures, given the opportunity, are capable of increasing their numbers exponentially in an ever-changing environment with finite resources, and they are similar to (but not identical with) their progenitors. That's all.

Jour. of the Proc. of the Linnean Society (Zoology), 3 (July 1858): 53-62. On the Tendency of Species to form Varieties, and on the Perpetuation of Varieties and Species by Natural Means of Selection. Charles R. Darwin & Alfred R. Wallace

In a sense Erasmus Alvey Darwin, Charles Robert Darwin's brother, was right in his letter of 23 Nov 1859, stating "In fact the a priori reasoning is so entirely satisfactory to me that if the facts wont fit in, why so much the worse for the facts is my feeling" (upon reading the Origin). The plethora of facts in Darwin's book On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life supporting the theory can all be considered failed falsification attempts. That is, they are not needed for the derivation of the theory, but turned out to be consistent with it. The original factual base of the theory stands unchallenged to this day:
    1. We still have not found a single species that would not increase its numbers exponentially in a favorable environment
    2. No environment is known that would be stable on a geological timescale
    3. No environment with infinite or exponentially growing resources is found (except the environment of human society, due to ever shifting technological definition of "resource").
    4. No species is found in which offspring and progenitor are either strictly identical or wholly dissimilar (if the entire life cycle of the species is taken into account)
In this sense the empirical basis of the theory is not falsified. The problems discovered later have nothing to do with this quick-and-dirty inductive step; it is still a masterpiece. Problems arose not because of hasty induction, but because of some vagueness and much hand-waving in its deductive structure (in the ratiocination phase, to use Mill's term), some of it due to sketchy definitions of basic concepts, some to lack of rigorous formalism. The issues centered around point 4 above. Darwin stuck to the idea of blending inheritance until the end of his life, although he would have had the chance to read about Mendel's results in Hermann Hoffmann's book (1869), had he not skipped page 52 for lack of mathematical training and interest. Therefore the important difference between phenotype and genotype, along with the quantized nature of inheritance, was unknown to him (and, understandably, so was recombination, as it was discovered later). However, even with the tremendous advance in formalization and the description (and utilization) of the standard digital information storage and retrieval system encapsulated in all known life forms, the evolution of complexity is still not understood (although this is the single most important aspect of the theory as far as general human culture is concerned). Even a proper, widely agreed upon definition of complexity is lacking, and while there is no way to assign probabilities to candidates like Kolmogorov complexity, it makes even less sense to talk about the probability of individual propositions dependent on this concept being true, either in a Bayesian context or otherwise. The current status of AGW theory is much the same. It is also highly deductive, based on the single observation that carbon dioxide has a strong absorption band in the thermal infrared. That is the only inductive step, other than those necessary for launching general atmospheric physics, of course.
Otherwise the structure of the theory is supposed to be entirely deductive, relying on computational climate models as devices of inference. However, according to Galileo, the great Book of the Universe is written in the language of mathematics, not computer programs. The difference is essential. Mathematical formulae as used in physics lend themselves to all kinds of transformation, making it possible to reveal hidden symmetries or conservation principles, construct perturbation theories, prove equivalences (like Schrödinger's exploit with matrix and wave mechanics), or analyze the general properties of a dynamical system (like the existence and geometry of attractors), etc. On the other hand, there is no meaningful transformation for the code base of a General Circulation Model (other than compiling it under a specific operating system). Move on folks, there's nothing to see here.

There is a metaphysical difference between our viewpoints. In unstructured problems like spam filtering, Bayesian inference may be useful. As soon as some noticeable structural difference occurs between spam and legitimate email, spammers are quick to exploit it, so it is a race for crumbs of information. Stock prices work much the same way, from a strictly statistical point of view. On the other hand, as soon as meaning is considered, it is no longer justified to attach Bayesian probabilities to propositions concerning this meaning. One either understands what is being said or not (if you take your time and actually read and understand each piece of your incoming mail, it is easy to tell spam and the rest apart, even for non-experts). To make a long story short, I think Galileo's statement about the hidden language is not just a metaphor; there's more to it. There's indeed a message to be decoded, written in an utterly non-human language. It is a metaphysical statement of course and as such has no immediate bearing on questions of physics.
Nevertheless, metaphysical stance plays an undeniable role in the manner in which people approach problems, and even in their choice of problems. Even more so in their assessment of what constitutes a proper solution.
    0 0
  27. Berényi - From the John Stuart Mill reference you provided: "We have found that all Inference, consequently all Proof, and all discovery of truths not self-evident, consists of inductions, and the interpretation of inductions: that all our knowledge, not intuitive, comes to us exclusively from that source." and: "We shall fall into no error, then, if in treating of Induction, we limit our attention to the establishment of general propositions. The principles and rules of Induction as directed to this end, are the principles and rules of all Induction; and the logic of Science is the universal Logic, applicable to all inquiries in which man can engage." Mill quite correctly limits the purest use of "induction" to generalization from the particular to the universal. He also notes a major limitation on induction, the question of "enough proof": "Why is a single instance, in some cases, sufficient for a complete induction, while in others, myriads of concurring instances, without a single exception known or presumed, go such a very little way toward establishing a universal proposition? Whoever can answer this question knows more of the philosophy of logic than the wisest of the ancients, and has solved the problem of induction." Perhaps significance testing? Darwin did a great deal of data collection - look up "Darwin and snails" for some examples. There is indeed a metaphysical difference between our approaches, Berényi - you seem to treat the universe as a purely mathematical problem, with the idea that we can have exact knowledge, and that induction is somehow not part of science and our investigations. I, on the other hand, feel that we but see through a glass darkly, one that sometimes has bugs spattered on it, and that our science is a series of improving approximations of what goes on in the universe, complete with averages, parametric expressions, and other methods for handling complexity without complete knowledge.
And hence significance tests (just to tie this fairly wild excursion back to the topic) are useful to determine how confident you are in what you think you know. Just remember - deduction is exact within the limitations of the premises. But it cannot teach you anything you don't already know. For that, you must use induction to create new premises. Only induction can teach you something new.
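To make the tie-back concrete, here is a minimal sketch of the kind of significance test under discussion: an ordinary least squares trend fit with a t-statistic for the slope. The data are synthetic (a prescribed trend plus a deterministic stand-in for noise, so the example is reproducible), and, as the post at the top argues, a large |t| only measures evidence against the no-trend null hypothesis; it is not the probability that the trend is real.

```python
import math

# Synthetic "annual anomaly" series: a 0.01 degC/yr trend plus
# deterministic alternating noise (a stand-in for random scatter).
n = 30
x = list(range(n))
y = [0.01 * xi + 0.05 * (-1) ** xi for xi in x]

# Ordinary least squares fit of y = intercept + slope * x.
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
intercept = ybar - slope * xbar

# Standard error of the slope and the t-statistic against slope = 0.
resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
se = math.sqrt(sum(r * r for r in resid) / ((n - 2) * sxx))
t = slope / se

# |t| > ~2.05 is the usual 5% two-sided threshold for 28 degrees of freedom.
print(f"slope = {slope:.4f} degC/yr, t = {t:.1f}")
```

Rejecting the null here says only that a zero trend is a poor fit to these points; turning that into a statement about the hypothesis itself is exactly the misuse the original post describes.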
    0 0
  28. Berényi Péter wrote "Problems arose not because of hasty induction, but some vagueness and much hand waving in its deductive structure..." Yes, because Darwin's theory of evolution is almost entirely inductive in nature (which was, rather obviously, the whole point of using it as an example), it is hardly surprising that the deductive structure is rather lacking. Given that it rests almost entirely on inductive foundations, is it science or not?
    0 0
  29. I believe that BP is deeply confused about the difference between scientific laws, which can be proven in the same way that mathematical theorems can be proven, and scientific theories, which sometimes rely on scientific laws but usually rely on chains of induction. Induction, by definition, is based on evidence. I suspect his misunderstanding comes from an engineering background, where consideration of these matters is relatively unimportant. It's worth noting that a lot of the basis of modern computer programming comes from Whitehead and Russell's (1911) theory of logical types, which itself is in parts not amenable to deductive proof, but provides much of the basis for modern functional computer programming (without which we would not have this forum :]). Again the selective ignoring of comments by the so-called sceptic contingent here is instructive, in that BP has chosen to ignore the detailed critique of anti-induction reasoning provided by the very famous statistician Fisher in my comment #77
    0 0
  30. Some remarks in no particular order.
    • I have never told you induction was not a necessary ingredient of the scientific endeavor. What I keep telling you is that it's not a scientific method, much less the scientific method. Induction is always a messy process, never governed by an established rule set in practice. It is best considered part of heuristics.
    • True Baconian "inductive method" was never practiced by anyone, ever. Not a single scientific discovery was made by applying those silly lists.
    • There's this persistent myth that deduction cannot teach us anything we don't already know, and that only induction is capable of doing that. That's simply not true. Just consider GIMPS (the Great Internet Mersenne Prime Search). They're looking for prime numbers whose binary representation is devoid of zeroes. The quest is entirely deductive, but we surely acquire new knowledge as the search proceeds. The only credible way to dispute this is to fully specify the 55th Mersenne prime right now.
    • Or consider the discovery of Neptune on 23-24 September 1846 by Johann Gottfried Galle and Heinrich Louis d'Arrest, using the 24.4 cm aperture, 4 m long achromatic refractor of the New Berlin Observatory. But they already knew where to look and what to look for (unlike Galileo Galilei, who also saw and documented the planet on 28 December 1612 and again on 27-28 January 1613, but failed to recognize and report it). They were simply told by Urbain Jean Joseph Le Verrier where the planet was supposed to be. He used inverse perturbation theory applied to Newtonian celestial mechanics to calculate the mass and orbital elements of an unknown planet, in an entirely deductive manner, to explain observed anomalies in the orbital elements of Uranus. He had made a conceptual error in his calculation of errors, so the mass and orbital elements of the newly discovered planet turned out to be outside the error bounds he had given, but its celestial position was still within limits. The calculation of error bounds was corrected only after the discovery.
    • No amount of induction based on careful observation of Uranus' orbit alone would possibly lead to such a result without an axiomatized background theory. The inverse square law of Newtonian gravitation was itself of course based on induction, but originally only on a few examples (the known lunar and planetary orbits of the time), and was verified by Newton to 4% accuracy. It later turned out to be more than a millionfold better. That's what I mean when I tell you that in science (somewhat miraculously) much more comes out than was put in.
    • As for "the issue about stratospheric intrusions", the small scale (300 m - 1 km) is rather instructive (it is not resolved by GCMs). Large-scale stratospheric folding events are well known; now it looks like folding happens on all scales in a rather fractal-like manner. BTW, small scale in itself does not mean it has only a minuscule effect on radiative forcing, for it can happen all over the globe. Troposphere-stratosphere mixing has the potential to bring down extremely dry stratospheric air to the upper troposphere (while freeze-drying humid air of tropospheric origin). The overall effect on the radiative balance can be huge.
    • I'll return to Fisher 1955 later.
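The GIMPS point above can be made concrete: whether a candidate Mersenne number 2^p - 1 (a run of p ones in binary, hence "devoid of zeroes") is prime is settled by a purely deductive computation, the Lucas-Lehmer test. A toy sketch (not GIMPS's actual highly optimized implementation):

```python
def lucas_lehmer(p: int) -> bool:
    """Deterministic Lucas-Lehmer primality test for M = 2**p - 1 (p an odd prime)."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0  # M is prime iff the recurrence ends at 0 mod M

# 2**p - 1 in binary is a run of p ones; its primality is a deductive
# consequence, yet unknown to anyone until the computation is done.
for p in (3, 5, 7, 11, 13):
    print(p, bin((1 << p) - 1), "prime" if lucas_lehmer(p) else "composite")
# 2**11 - 1 = 2047 = 23 * 89 is the composite case in this range.
```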
    0 0
  31. Berényi - You are quite correct about deduction in mathematics; given that the premises are defined as a consistent system, without inductive input, math (and pure logic) is pretty much by definition a purely deductive system. The 55th Mersenne prime is an extended deduction from posited premises - it's already contained in the premises, even if we haven't ground our way down to it. I once took part in a graduate class where we proved the equivalence of syntax and semantics for propositional logic - that took the entire term! This had been stated (with good reason) before, but apparently the full proof had never been explicitly worked out prior to that time. But that conclusion was based entirely on the premises we started with. Now, back to the real world. At least some of the premises we use for any deductive argument about the real world (as opposed to a self-contained by-definition realm) are observational, inductive premises. Johannes Kepler could not have formulated his theory of elliptical orbits without Tycho Brahe's body of observations. And that theory was induced as a generalization that accounted for observational (fuzzy, noisy) evidence. And hence back to significance tests - they perform as tests on the strength of our observational knowledge. Enough - I will not debate this with you any further, especially as it is too far off topic. I understand, Berényi, that you do not like induction as a principle for understanding the world, and seem to object to the lack of certainty involved. Unfortunately, that is the world we live in, where we use induction to tie possible maths to how the universe works, and not the realm of by-definition premises of pure mathematics.
    0 0
  32. BP wrote "Induction is always a messy process, never governed by an established rule set in practice." Yep, and science is a messy process. BP, you have an incorrect definition of science. What's odd is your certainty of your definition despite your lack of background as a scientist, and in stark contrast to the explicit descriptions of science by real, working scientists (including me and other commenters here), historians of science, philosophers of science, anthropologists of science, and sociologists of science. We have linked you to a multitude of those descriptions, but you have either ignored them or simply insisted they are wrong. I'm going to remind all of us of the point of this too-long exchange with BP: Some "skeptics" of anthropogenic global warming claim that the conclusions of climatologists are not convincing because climatologists do not behave like the "real" scientists in other fields. The original post by Maarten at the top of this page was taken by some skeptics as more evidence of that. Those skeptics then ignored Maarten's later points that this particular statistical incorrectness is common in scientific fields outside climatology, and does not have a profound effect on the overall conclusions of climatologists.
    0 0
  33. Berényi Péter@82 wrote "Troposphere-stratosphere mixing has the potential to bring down extremely dry stratospheric air to the upper troposphere (while freeze-drying humid air of tropospheric origin). Overall effect on radiative balance can be huge." So how about giving a reference to a paper that establishes that tropospheric-stratospheric mixing actually does have a significant effect on the measurement or definition of global radiative forcing, rather than a paper discussing an aspect of the structure of stratospheric intrusions that didn't actually provide any evidence whatsoever relevant to the question under discussion (i.e. it was a red herring).
    0 0
  34. Tom #84 Strongly agree. It's instructive that BP's comment #82 does not consider the messier parts of science, where exact measurements are impossible, real experimental designs are impossible, and we have to rely on quasi-experiments. He clearly has an incorrect or incomplete mental definition of science.
    0 0

© Copyright 2024 John Cook