# Spurious Correlations in Climate Science

Posted May 27, 2018

**FIGURE 1: NO USEFUL INFORMATION IN REGRESSION COEFFICIENTS WITHOUT CORRELATION**

**FIGURE 2: SPURIOUS CORRELATION IMPOSED BY SHARED TREND IN THE ABSENCE OF RESPONSIVENESS AT THE TIME SCALE OF INTEREST**

**FIGURE 3: DEMONSTRATION OF THE SPURIOUSNESS OF CORRELATION BETWEEN CUMULATIVE VALUES**

**FIGURE 4: DETRENDED CORRELATION EXPLAINED**

- You know who Charles Darwin is, of course, but you may not have heard of his mad cousin **Francis Galton**, who did the math for Darwin’s theory of evolution. Two of the many procedures Galton came up with to help him make sense of the data are still used today and are possibly the two most widely used tools in all of statistics. They are ordinary least squares (OLS) linear **regression** and OLS **correlation**.
- Both of these statistics are measures of a linear relationship between two variables X and Y. The linear regression coefficient **B** of Y against X is a measure of how much Y changes **on average** for a unit change in X, and the linear correlation **R** is a measure of how close the observed changes are to the average. This close relationship between regression and correlation is described in **Figure 1** above, which consists of four charts. The data for these charts are generated by a Monte Carlo simulation used to control the degree of correlation.
- In the **HIGH (R=0.94) and VERY HIGH (R=0.98) correlation charts**, linear regression tells us that on average, a unit change in X causes Y to change by **B=5, and the data are very consistent with this requirement**. The consistency in this case derives from the low variance of the regression coefficient implied by high correlation. The strong correlation also implies that the observed changes in Y for a unit increase in X are close to the average value of B=5 over the full span of the data and for any selected sub-span of the time series. A sample of five observed values shows changes of 5.0 to 5.4 for very high correlation, and 4.5 to 5.4 for high correlation.
- In the
**LOW (R=0.36) and MID (R=0.7) correlation charts**, the regression coefficients are correspondingly less precise, varying from **B=1.8 to B=7.1 for LOW-R and B=3.5 to B=5.6 for MID-R** in the five random estimates presented. The point here is that without a sufficient degree of correlation between the time series at the time scale of interest, **though regression coefficients can be computed, the computed coefficients may have no interpretation**. The weak correlations in these cases also imply that **the observed changes in Y for a unit increase in X would be different in sub-spans of the time series**. The so-called “split-half” test, which compares the first half of the time series with the second half, may be used to examine the instability of the regression coefficient imposed by low correlation or by violations of other OLS assumptions, such as independence and identical Gaussian distributions of the source of each data value in the time series.
- An issue specific to the analysis of time series data is that the portion of the observed correlation that derives from shared long term trends, which has no interpretation at the time scale of interest, must be separated from **the responsiveness of Y to changes in X at the time scale of interest**. If this separation is not made, the correlation used in the evaluation may be, and often is, spurious. An example of such a **spurious correlation** is shown in **Figure 2** above. It was provided by the **TYLERVIGEN** collection of spurious correlations **[LINK]**. As is evident, the spurious correlation between drownings and marriages derives from a shared trend. The fluctuations around the trend at an appropriate time scale are clearly not correlated.
- The separation of these effects may be carried out using **detrended correlation analysis**. Briefly, the trend component is removed from both time series and the residuals are tested for the **responsiveness of Y to changes in X at the appropriate time scale**. The procedure and its motivation are described quite well in **ALEX TOLLEY’S LECTURE** (Figure 4) **[LINK]**. The motivation and procedure for detecting and removing such spurious correlations in time series data are also set out in the short downloadable paper described below.
- **SPURIOUS CORRELATIONS IN TIME SERIES DATA:** Unrelated time series data can show spurious correlations by virtue of a shared drift in the long term trend. The spuriousness of such correlations is demonstrated with examples. The SP500 stock market index, GDP at current prices for the USA, and the number of homicides in England and Wales in the sample period 1968 to 2002 are used for this demonstration. Detrended analysis shows the expected result that at an annual time scale the GDP and SP500 series are related and that neither of these time series is related to the homicide series. Correlations between the source data and those between cumulative values show spurious correlations of the two financial time series with the homicide series.
These results have implications for empirical evidence that attributes changes in temperature and carbon dioxide levels in the surface-atmosphere system to fossil fuel emissions. **FULL TEXT**
- Yet another example of spurious correlations in time series data is the apparent homicide sensitivity of atmospheric carbon dioxide concentration described in a related post **[LINK]**. **It is for these reasons that the argument that “the theory that X causes Y is supported by the data because X shows a rising trend and at the same time we see that Y has also been going up” is specious: for the data to be declared consistent with causation, it must be shown that Y is responsive to X at the appropriate time scale when the spurious effect of the shared trend is removed.** Examples from climate science are presented in the seven downloadable papers described in the next seven paragraphs.
- **Are fossil fuel emissions causing atmospheric CO2 levels to rise?** **RESPONSIVENESS OF ATMOSPHERIC CO2 CONCENTRATION TO EMISSIONS**: The IPCC carbon cycle accounting assumes that changes in atmospheric CO2 are driven by fossil fuel emissions on a year by year basis. A testable implication of the validity of this assumption is that changes in atmospheric CO2 should be correlated with fossil fuel emissions at an annual time scale net of long term trends. A test of this relationship with in situ CO2 data from Mauna Loa 1958-2016 and flask CO2 data from twenty-three stations around the world 1967-2015 is presented. The test fails to show that annual changes in atmospheric CO2 levels can be attributed to annual emissions.
The finding is consistent with prior studies that found no evidence relating the rate of warming to emissions. Together they imply that the IPCC carbon budget is flawed, possibly because of insufficient attention to uncertainty, excessive reliance on net flows, and the use of circular reasoning that subsumes a role for fossil fuel emissions in the observed increase in atmospheric CO2. **FULL TEXT**
- **Can sea level rise be attenuated by reducing or eliminating fossil fuel emissions?** **A TEST OF THE ANTHROPOGENIC SEA LEVEL RISE HYPOTHESIS**: Detrended correlation analysis of a global sea level reconstruction 1807-2010 does not show that changes in the rate of sea level rise are related to the rate of fossil fuel emissions at any of the nine time scales tried. The result is checked against the measured data from sixteen locations in the Pacific and Atlantic regions of the Northern Hemisphere. No evidence could be found that observed changes in the rate of sea level rise are unnatural phenomena that can be attributed to fossil fuel emissions. These results are inconsistent with the proposition that the rate of sea level rise can be moderated by reducing emissions. It is noted that correlation is a necessary but not sufficient condition for a causal relationship between emissions and acceleration of sea level rise. **FULL TEXT**
- **Can ocean acidification be attenuated by reducing or eliminating fossil fuel emissions?** **A TEST OF THE ANTHROPOGENIC OCEAN ACIDIFICATION HYPOTHESIS**: Detrended correlation analysis of annual fossil fuel emissions and mean annual changes in ocean CO2 concentration in the sample period 1958-2014 shows no evidence that the two series are causally related. The finding is inconsistent with the claim that fossil fuel emissions have a measurable impact on the CO2 concentration of the oceans at a lag and time scale of one year. **FULL TEXT**
- **Is surface temperature responsive to atmospheric CO2 levels?** **EMPIRICAL TEST FOR THE CHARNEY CLIMATE SENSITIVITY FUNCTION**: Monthly means of Mauna Loa atmospheric CO2 concentrations are used in conjunction with surface temperature data from two different sources for the sample period 1979-2017 to test the validity and reliability of the empirical Charney climate sensitivity function. Detrended correlation analysis of temperature in five global regions from two different sources did not show that surface temperature is responsive to changes in the logarithm of atmospheric CO2 at an annual time scale. Correlations observed in source data are thus shown to be spurious. We conclude that the empirical Charney Climate Sensitivity function is specious because it is based on a spurious correlation. **FULL TEXT**
- **Is surface temperature responsive to atmospheric CO2 levels?** **UNCERTAINTY IN EMPIRICAL CLIMATE SENSITIVITY**: Atmospheric CO2 concentrations and surface temperature reconstructions in the study period 1850-2017 are used to estimate observed equilibrium climate sensitivity. Comparison of climate sensitivities in the first and second halves of the study period and a study of climate sensitivities in a moving 60-year window show that the estimated values of climate sensitivity are unstable and unreliable and that therefore they may not contain useful information. These results are not consistent with the existence of a climate sensitivity parameter that determines surface temperature according to atmospheric CO2 concentration. **FULL TEXT**
- **Is surface temperature responsive to atmospheric CO2 levels?** **FROM CLIMATE SENSITIVITY TO TRANSIENT CLIMATE RESPONSE**: A testable implication of the theory of anthropogenic global warming (AGW) is the Equilibrium Climate Sensitivity (ECS), the coefficient of proportionality between the logarithm of atmospheric CO2 and surface temperature. This line of research has been held back by large uncertainties in empirical estimates of the ECS.
An alternative to the ECS that offers a more stable metric for AGW is the Carbon Climate Response or Transient Climate Response to Cumulative Emissions (CCR/TCRE). It is computed as the coefficient of proportionality between cumulative fossil fuel emissions and temperature. The CCR/TCRE metric provides a direct connection from emissions to temperature without the intervening step of atmospheric accumulation. We show here that though the CCR/TCRE is stable, it has no interpretation in terms of AGW because the proportionality it describes is spurious and specious. **FULL TEXT**
- **Is surface temperature responsive to atmospheric CO2 levels?** **THE CHARNEY HOMICIDE SENSITIVITY TO CO2**: Homicides in England and Wales 1898-2003 are studied against the atmospheric carbon dioxide data for the same period. The Charney Equilibrium Sensitivity of homicides is found to be λ=1.7 thousands of additional annual homicides for each doubling of atmospheric CO2. The sensitivity estimate is supported by a strong correlation of ρ=0.95 and detrended correlation of ρ=0.86. The analysis illustrates that spurious proportionalities in time series data, in conjunction with inadequate statistical rigor in the interpretation of empirical Charney climate sensitivity estimates, impede the orderly accumulation of knowledge in this line of research. **FULL TEXT**
- A further caution needed in regression and correlation analysis of time series data arises when the **source data are preprocessed** prior to analysis. In most cases, the **effective sample size** of the preprocessed data is less than that of the source data because preprocessing involves **using data values more than once**. For example, taking moving averages involves multiplicity in the use of the data that reduces the effective sample size (EFFN), and the effect of that on the **degrees of freedom** (DF) must be taken into account when carrying out hypothesis tests.
- The procedures and their rationale are described in this freely downloadable paper: **ILLUSORY STATISTICAL POWER IN TIME SERIES DATA**: Preprocessing of time series data with moving average and autoregressive processes serves a useful purpose in time series analysis, but the further use of the preprocessed series for computing probability in hypothesis tests or for constructing confidence intervals **requires a correction to the degrees of freedom imposed on the filtered series by multiplicity**. Multiplicity derives from repeated use of the same data item in the source data series for the computation of multiple items in the filtered series. A procedure for estimating multiplicity and the effective degrees of freedom implied by multiplicity is proposed and its utility is demonstrated with examples. It is found that without a multiplicity correction the filtered series can show an illusory increase in statistical power. **FULL TEXT**
- Failure to correct for this effect on DF may result in a **false sense of statistical power and faux rejection of the null in hypothesis tests**, as shown in this analysis of Kerry Emanuel’s famous paper on what he called “increasing destructiveness” of North Atlantic hurricanes: **CIRCULAR REASONING IN CLIMATE SCIENCE**: A literature review shows that the circular reasoning fallacy is common in climate change research. It is facilitated by confirmation bias and by activism such that the prior conviction of researchers is subsumed into the methodology. Example research papers on the impact of fossil fuel emissions on tropical cyclones, on sea level rise, and on the carbon cycle demonstrate that the conclusions drawn by researchers about their anthropogenic cause derive from circular reasoning. The validity of the anthropogenic nature of global warming and climate change and that of the effectiveness of proposed measures for climate action may therefore be questioned solely on this basis. **FULL TEXT**
- When the statistics are done correctly, we find no evidence for the claim that “human caused climate change is supercharging tropical cyclones”, as in the downloadable paper
**A General Linear Model for Trends in Tropical Cyclone Activity**. ABSTRACT: The ACE index is used to compare tropical cyclone activity worldwide among seven decades from 1945 to 2014. Some increase in tropical cyclone activity is found relative to the earliest decades. No trend is found after the decade 1965-1974. A comparison of the six cyclone basins in the study shows that the Western Pacific Basin is the most active basin and the North Indian Basin the least. The advantages of using a general linear model for trend analysis are described. **FULL TEXT**
- **CUMULATIVE VALUES OF A TIME SERIES:** An extreme case of the effect of preprocessing on degrees of freedom occurs when a time series of **cumulative values** is derived from the source data, as in the famous Matthews paper on the proportionality of warming to cumulative emissions [Matthews, H. Damon, et al. “The proportionality of global warming to cumulative carbon emissions.” *Nature* 459.7248 (2009): 829].
- It has been shown in these downloadable papers that the time series of cumulative values has an effective sample size of EFFN=2, and therefore there are no degrees of freedom and there is no statistical power. The correlation between cumulative values is therefore spurious and does not contain useful information. The spuriousness is demonstrated in **Figure 3** and explored in detail in the papers described in the next five paragraphs. Abstracts and download links for the five papers are provided.
- **EFFECTIVE-N OF THE CUMULATIVE VALUES OF A TIME SERIES**: In the computation of cumulative values of a time series of length N, all data items except for the last item are used more than once. The multiplicity in the use of the data reduces the effective value of N. We show that for time series of cumulative values the effective value of N is too small to yield sufficient degrees of freedom to make inferences about the population. It is not possible to evaluate the statistical significance of a correlation between cumulative values for this reason, even when the magnitude of the correlation coefficient observed in the sample is large. The results provide a rationale for the findings of a previous work in which the spuriousness of correlations between cumulative values was demonstrated with Monte Carlo simulation. **FULL TEXT**
- **LIMITATIONS OF THE TCRE**: Observed correlations between cumulative emissions and cumulative changes in climate variables form the basis of the Transient Climate Response to Cumulative Emissions (TCRE) function. The TCRE is used to make forecasts of future climate scenarios based on different emission pathways and thereby to derive their policy implications for climate action. Inaccuracies in these forecasts likely derive from a statistical weakness in the methodology used. The limitations of the TCRE are related to its reliance on correlations between cumulative values of time series data. Time series of cumulative values contain neither time scale nor degrees of freedom. Their correlations are spurious. No conclusions may be drawn from them. **FULL TEXT**
- **From Equilibrium Climate Sensitivity to Carbon Climate Response**: A testable implication of the theory of anthropogenic global warming (AGW) is the Equilibrium Climate Sensitivity (ECS), the coefficient of proportionality between the logarithm of atmospheric CO2 and surface temperature. This line of research has been held back by large uncertainties in empirical estimates of the ECS.
An alternative to the ECS that offers a more stable metric for AGW is the Carbon Climate Response or Transient Climate Response to Cumulative Emissions (CCR/TCRE). It is computed as the coefficient of proportionality between cumulative fossil fuel emissions and temperature. The CCR/TCRE metric provides a direct connection from emissions to temperature without the intervening step of atmospheric accumulation. We show here that though the CCR/TCRE is stable, it has no interpretation in terms of AGW because the proportionality it describes is spurious and specious. **FULL TEXT**
- **SPURIOUS CORRELATIONS BETWEEN CUMULATIVE VALUES**: Monte Carlo simulation shows that cumulative values of unrelated variables have a tendency to show spurious correlations. The results have important implications for the theory of anthropogenic global warming because empirical support for the theory that links warming to fossil fuel emissions rests entirely on a correlation between cumulative values. **FULL TEXT**
- **EXTRATERRESTRIAL FORCING OF SURFACE TEMPERATURE**: It is proposed that visitation by extraterrestrial spacecraft (UFO) alters the electromagnetic properties of the earth, its atmosphere, and its oceans and that these changes can cause global warming leading to climate change and thence to the catastrophic consequences of floods, droughts, severe storms, and sea level rise. An empirical test of this theory is presented with data for UFO sightings and surface temperature reconstructions for the study period 1910-2015. The results show strong evidence of proportionality between surface temperature and cumulative UFO sightings. We conclude that the observed warming since the Industrial Revolution is due to an electromagnetic perturbation of the climate system by UFO extraterrestrial spacecraft. **FULL TEXT**
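
The link between correlation and the stability of the regression coefficient described for Figure 1, and the split-half test mentioned alongside it, can be tried out with a small Monte Carlo simulation. What follows is a minimal sketch and not the simulation behind Figure 1. The function names, the sample size, the number of trials, and the noise levels are all assumptions made for illustration. The noise levels were picked so that, with X of unit variance and a true B=5, the population correlation lands near the four values quoted for the charts.

```python
import math
import random
import statistics


def ols_slope(x, y):
    """Ordinary least squares slope B of y regressed on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def pearson_r(x, y):
    """Pearson correlation R between x and y."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def split_half_experiment(noise_sd, trials=200, n=50, true_b=5.0, seed=1):
    """Simulate Y = true_b * X + noise many times (hypothetical setup).

    Returns (mean R over the full span,
             mean absolute gap between first-half and second-half slopes).
    """
    rng = random.Random(seed)
    half = n // 2
    r_values, gaps = [], []
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [true_b * a + rng.gauss(0.0, noise_sd) for a in x]
        r_values.append(pearson_r(x, y))
        b_first = ols_slope(x[:half], y[:half])
        b_second = ols_slope(x[half:], y[half:])
        gaps.append(abs(b_first - b_second))
    return statistics.fmean(r_values), statistics.fmean(gaps)


if __name__ == "__main__":
    # Illustrative noise levels: population R near 0.98, 0.94, 0.7, 0.36.
    for noise_sd in (1.0, 1.8, 5.1, 13.0):
        mean_r, mean_gap = split_half_experiment(noise_sd)
        print(f"noise_sd={noise_sd:5.1f}  mean R={mean_r:.2f}  "
              f"mean split-half gap in B={mean_gap:.2f}")
```

As the noise grows and R falls, the gap between the slope estimated from the first half and the slope estimated from the second half should widen, which is the instability the split-half test is meant to expose.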
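
The detrended correlation procedure described above (remove the trend from both series, then correlate the residuals) can be sketched in a few lines. This is only an illustration under stated assumptions: the trend is taken to be a straight line fitted by OLS against time, and the two synthetic series, their drift rates, and the helper names `detrend` and `trending_pair` are hypothetical. The papers cited above may use a different detrending method or time scale.

```python
import math
import random
import statistics


def pearson_r(x, y):
    """Pearson correlation between x and y."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def detrend(series):
    """Residuals of the series around its OLS linear trend against time."""
    n = len(series)
    t_mean = (n - 1) / 2
    s_mean = statistics.fmean(series)
    slope = (sum((i - t_mean) * (v - s_mean) for i, v in enumerate(series))
             / sum((i - t_mean) ** 2 for i in range(n)))
    return [v - (s_mean + slope * (i - t_mean)) for i, v in enumerate(series)]


def trending_pair(n=400, seed=7):
    """Two made-up series that share nothing except an upward drift."""
    rng = random.Random(seed)
    x = [0.5 * i + rng.gauss(0.0, 5.0) for i in range(n)]
    y = [0.3 * i + rng.gauss(0.0, 5.0) for i in range(n)]
    return x, y


if __name__ == "__main__":
    x, y = trending_pair()
    print("source correlation:   ", round(pearson_r(x, y), 3))
    print("detrended correlation:", round(pearson_r(detrend(x), detrend(y)), 3))
```

The source series correlate strongly only because both drift upward. Once the drift is removed, the residuals are independent noise and their correlation should sit near zero.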
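
The claim that cumulative values of unrelated variables tend to show spurious correlations can be checked with a simulation along the following lines. It is a sketch and not the simulation used in the papers or in Figure 3. The choice of independent uniform values between 0 and 1 (positive, like annual emissions), the series length, and the function name `cumulative_experiment` are assumptions made for the example.

```python
import math
import random
import statistics
from itertools import accumulate


def pearson_r(x, y):
    """Pearson correlation between x and y."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cumulative_experiment(trials=200, n=50, seed=3):
    """Correlate independent positive series, then their running totals.

    Returns (mean |R| between the source series,
             mean R between their cumulative values).
    """
    rng = random.Random(seed)
    source_r, cumulative_r = [], []
    for _ in range(trials):
        # Two unrelated series: independent draws from uniform(0, 1).
        x = [rng.random() for _ in range(n)]
        y = [rng.random() for _ in range(n)]
        source_r.append(abs(pearson_r(x, y)))
        cumulative_r.append(
            pearson_r(list(accumulate(x)), list(accumulate(y))))
    return statistics.fmean(source_r), statistics.fmean(cumulative_r)


if __name__ == "__main__":
    src, cum = cumulative_experiment()
    print(f"mean |R| between source series:  {src:.2f}")
    print(f"mean R between cumulative series: {cum:.2f}")
```

Because every value is positive, each running total can only rise, so the two cumulative series share an upward drift even though the source values are unrelated.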
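
One way to see why a series of cumulative values might have an effective sample size near 2 is to count how often each source value is reused. The working below rests on an assumption that is not confirmed by the text above: that EFFN is taken as N divided by the average multiplicity. The symbols m_i (the number of cumulative values that include source item i) and m-bar (the average multiplicity) are introduced here for illustration only.

```latex
% Item i (i = 1, ..., N) enters the cumulative sums i, i+1, ..., N:
m_i = N - i + 1

% Average multiplicity over all N items:
\bar{m} = \frac{1}{N} \sum_{i=1}^{N} (N - i + 1) = \frac{N + 1}{2}

% Effective sample size under the assumed definition:
\mathrm{EFFN} = \frac{N}{\bar{m}} = \frac{2N}{N + 1} \approx 2 \quad \text{for large } N
```

Under this reading the last item is used exactly once and every earlier item more than once, in line with the description in the EFFECTIVE-N abstract, and the result stays just below 2 however long the series is.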
