Thongchai Thailand


Posted on: May 24, 2020




Methodological flaws have become commonplace in financial research to the point that familiarity has bred de facto acceptance. Some of these only require refinements in the findings, but in most cases the methods used call the findings themselves into question. In this paper we survey these flaws and investigate their potential impact on conclusions normally drawn from the research. We use simulation to demonstrate how an incorrect conclusion about a known population may be drawn using these methods. We then offer alternative methods that may be used to avoid the pitfalls described. Finally, we examine the validity of the “large sample” statistical model in understanding phenomena and developing theory in financial economics.


Finance, like the other social sciences, inherited a version of the scientific method from the natural sciences, and to this day classical Neyman-Pearson large sample hypothesis testing is generally accepted as the only credible research tool in finance. In this research model, we think we live in the world of Ho, the null hypothesis, where entropy is maximized and no patterns, and therefore no information, exist. To find information we seek patterns. We do this by randomly selecting a finite sample from an infinity of possibilities. This is our observation; and given the distributional properties of Ho we may compute the odds that we would observe such a sample by chance. If this probability turns out to be lower than a threshold value of unlikelihood we proclaim that it is unlikely that we would make such an observation in Ho; therefore we must not be in Ho. The distributional properties of Ho are usually assumed to be described by the Gaussian function.

In practice, financial researchers often stray from this pattern seeking model partly because of convenience and partly because of unfamiliarity with the actual statistical underpinnings of the research methods; but more importantly, I believe, because the methodology, though a stunning success in the natural sciences, is not a good fit in finance. The Krueger and Kennedy (1990) finding that stock market performance is highly correlated with football games is a harsh reminder that spurious statistics do exist, but such a faux pas is harder to identify when the relationships found are the ones we are looking for, or ones that are easily rationalized in terms of a dominant theoretical framework.


A commonly found departure from the pattern seeking model is to look for a number of different patterns. The more patterns you look for, the greater the odds that a chance pattern will be observed in a sample taken from the Ho distribution. The procedure compromises the statistical method because even in the world of Ho, if you look long enough you will find the pattern you are looking for. In Finance papers multiple comparisons are frequently used in conjunction with multiple values of the alpha probability and ex post selection of the direction of the effect. Table 1 shows a typical result (Aggarwal 1990). A common practice is to set up a table of many t-tests and then to identify the ‘improbable’ ones with asterisks. The usual code is * = ‘significant at alpha=10%’, ** = ‘significant at alpha=5%’, and *** = ‘significant at alpha=1%’.

In Table 2 we perform a similar test on 10 samples of n=100 drawn from the Ho population and find three spurious ‘rejections’ which might have been reported as findings in financial research. The example underlines the need for careful analysis of multiple comparison cases.
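The experiment behind Table 2 can be repeated many times to see how often a table of ten tests on pure noise contains at least one 'finding'. The following is a minimal sketch, not the simulation used for Table 2: it assumes Ho is the standard normal distribution, tests only at the two-sided 5% level, and uses the tabulated critical value of about 1.984 for Student's t with 99 degrees of freedom. The function names, the seed, and the number of trials are illustrative choices.

```python
import math
import random

# Two-sided 5% critical value of Student's t with 99 degrees of freedom (tabulated).
T_CRIT_99DF_5PCT = 1.984


def t_statistic(sample):
    """One-sample t-statistic for Ho: population mean = 0."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return mean / math.sqrt(variance / n)


def count_rejections(rng, n_samples=10, n=100):
    """Draw n_samples samples of size n from N(0, 1); count tests with |t| above the critical value."""
    rejections = 0
    for _ in range(n_samples):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(t_statistic(sample)) > T_CRIT_99DF_5PCT:
            rejections += 1
    return rejections


def simulated_experimentwide_rate(trials=1000, seed=1):
    """Fraction of simulated ten-test tables with at least one spurious rejection."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials) if count_rejections(rng) > 0)
    return hits / trials


print(simulated_experimentwide_rate())
```

Under these assumptions the printed fraction should land near 0.40 rather than 0.05, since every rejection in the simulation is spurious by construction.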

At an alpha level of 5%, the probability that all 10 samples would fail to reject Ho is 0.95^10 or about 60%. This means that there is a 40% probability that at least one chance rejection will occur. In other words, the experimentwide error rate is 40% instead of 5%. To hold the experimentwide error rate at 5%, each comparison must be made at about alpha = 0.005. In general, when making m comparisons, the single comparison alpha level (ac) and the experimentwide alpha level (ae) are related by the equation

(1-ac)^m = (1-ae)
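The relation can be solved in either direction. The sketch below assumes the m comparisons are independent, which is the condition under which the equation holds exactly; the function names are illustrative.

```python
def experimentwide_alpha(alpha_c, m):
    """Experimentwide error rate ae implied by m independent tests, each at level ac."""
    return 1.0 - (1.0 - alpha_c) ** m


def per_comparison_alpha(alpha_e, m):
    """Single comparison level ac needed to hold the experimentwide rate at ae."""
    return 1.0 - (1.0 - alpha_e) ** (1.0 / m)


print(round(experimentwide_alpha(0.05, 10), 4))  # 0.4013
print(round(per_comparison_alpha(0.05, 10), 4))  # 0.0051
```

The two printed values correspond to the 40% and the alpha of about 0.005 quoted in the text for m = 10.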


In linear regression models with many explanatory x-variables, the beta coefficient of each x-variable may be interpreted only if the x-variables act independently on y, the response variable. When the x-variables are correlated, or even when some linear combinations of the x-variables are correlated, the regression coefficients become unstable so that neither the direction nor the magnitude of the observed ‘effect’ may have a valid interpretation.

Most papers in Finance that use linear regression do not check for this condition and the few that do refer only to a correlation matrix of the x-variables. A correlation matrix is unable to detect correlations between linear combinations.

Some Examples of Multi-collinearity in Financial Research

(i) The Intervention dummy variable in time

An example model is presented in Table 3 (Blennerhassett and Bowman 1994). The authors wish to measure the impact of the screen trading intervention in the time series of off-market trading. But here they are analyzing the data in real time and not in event time, and the authors suspect that the model is subject to historical effects. An intervention dummy variable identifies the implementation of screen trading. The addition of the time variable to the model is an attempt to remove historical effects. But the dummy variable is also a time indicator. Therefore the two variables are highly correlated. Simulated data in Table 3 show a time variable with two different intervention dummy variables. Note that both dummies are highly correlated with time and, unless one dummy is symmetrical around the other, the dummy variables are also likely to be correlated with each other. Models such as these are common in financial research that attempts to model the impact of a historical event such as a change in regulation, market structure, interest rates, trade agreements, and so on.
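The correlations among the simulated series in Table 3 can be checked directly. The sketch below rebuilds the twenty rows of simulated data (time running from 1 to 20, one dummy switching on at day 11 and the other at day 7) and computes the Pearson coefficients with a small helper; the variable and function names are illustrative.

```python
import math

time_index = list(range(1, 21))
dummy1 = [0] * 10 + [1] * 10  # intervention takes effect on day 11
dummy2 = [0] * 6 + [1] * 14   # intervention takes effect on day 7


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


print(round(pearson(time_index, dummy1), 3))  # 0.867
print(round(pearson(time_index, dummy2), 3))  # 0.795
print(round(pearson(dummy1, dummy2), 3))      # 0.655
```

These values match the correlation matrix reported with Table 3: any step dummy defined on the time axis is, by construction, strongly correlated with time itself.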

(ii) Explanatory variables that are algebraically related

Capital structure, that is, the amount of debt a firm carries relative to equity, is a contentious and unresolved topic in Finance (DeAngelo and Masulis 1980, Bowen, Daley, and Huber 1982, Bradley, Jarrell, and Kim 1984, Kane, Marcus, and McDonald 1984, Kim and Sorensen 1986). A popular model of capital structure maintained that the value of debt lies in its role as a tax shield and, therefore, that firms that enjoy other forms of tax shields such as depreciation would use less debt. The empirical results are conflicting and inconclusive. The regression coefficients of the models differ in sign from one study to another. Some found the non debt tax shield (ndts) coefficient to be negative and argued that the ndts model is correct since it showed that firms with non debt tax shields used less debt. Others found the coefficient to be positive and argued that firms with more tax shields enjoyed higher cash flows and therefore had larger debt capacities. Still others found no relationship between ndts and capital structure.

A possible explanation for the failure of these empirical studies is the natural algebraic relationship that exists between total assets and depreciation. Since both of these variables were used in the model, the regression coefficients were unstable and their spurious sign and magnitude became interpreted into financial theory. The empirical results summarized in Table 4 reveal their random and contradictory nature.

(iii) Linear combinations of explanatory variables are correlated

Frequently the starting point of empirical investigations in finance is not theory but an available database such as COMPUSTAT or CRSP. All the variables in the database become candidates for the model and models built in this manner are frequently unstable even when a correlation matrix of the explanatory variables does not reveal the extent of the dependencies between them.

One of the many models proposed for measuring the determinants of capital structure includes D=total debt, E= book value of equity, and A=total assets as explanatory variables (Kim and Sorensen 1986). The three variables have an exact accounting relationship. A = D+E.
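The identity A = D+E illustrates why a pairwise correlation matrix can miss an exact dependency. The sketch below uses made-up data, not figures from any of the cited studies: debt and equity are drawn independently (an assumption chosen to make the point starkly), so their pairwise correlation is near zero and the correlations of each with assets are only moderate, yet assets are perfectly correlated with the linear combination D+E. All names and parameter values are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)
# Hypothetical firms: debt and equity drawn independently of each other.
debt = [rng.gauss(50.0, 10.0) for _ in range(500)]
equity = [rng.gauss(50.0, 10.0) for _ in range(500)]
assets = [d + e for d, e in zip(debt, equity)]  # accounting identity A = D + E

combination = [d + e for d, e in zip(debt, equity)]

print("corr(D, E)     =", round(pearson(debt, equity), 3))
print("corr(A, D)     =", round(pearson(assets, debt), 3))
print("corr(A, E)     =", round(pearson(assets, equity), 3))
print("corr(A, D + E) =", round(pearson(assets, combination), 3))
```

No entry in the pairwise matrix looks alarming here, but with all three variables in one regression the design matrix is rank deficient and the coefficients cannot be separately determined.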

(iv) The small firm effect

Banz (1981) first reported that stocks of small firms tend to have higher returns and this relationship was quickly confirmed by studies that followed. Investigators of asset pricing and market efficiency saw Banz’s size effect as an anomaly that must be removed from the data before other effects can be detected. Fama and French (1992) first removed the size effect from the data before they tested the relationship between beta and returns. If the Sharpe (1964) model is correct, they reasoned, then there ought to be a significant and positive relationship between beta risk and returns. They found that the relationship if any between these variables was negative.

But small firms not only have higher returns; they also have higher betas. Chan and Chen (1988) report a very strong correlation between beta and size. This relationship implies that tests of asset pricing such as those of Fama and French (1992) and Fama and MacBeth (1973) are likely to produce spurious values of the coefficients for beta risk; and this is indeed the case. Some studies have found a strong positive effect of beta risk that is consistent with the CAPM (capital asset pricing model) while others have found no significant relationship and still others report a relationship in the opposite direction. These empirical results have caused a great deal of controversy and have forced financial economists to re-evaluate some fundamental assumptions of asset pricing. Yet, these findings may turn out to be a fluke of multi-collinearity.

For example, when Fama and French (1992) removed the size effect from the data they may have also removed that which they intended to measure – the beta effect. A failure to find a beta effect in the residuals of the size effect does not imply an absence of a beta effect. There are better ways to find the unique contributions of the two correlated variables. For example, one might first regress beta against size and use these residuals as the unique contribution of beta and then regress size against beta and use those residuals as the unique contribution of size. Alternately, one might look for orthogonal principal components of size and beta.
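The residual approach described above can be sketched in a few lines. The data here are synthetic and purely illustrative: a hypothetical log-size variable and a beta that falls as size rises, with made-up coefficients. The helper fits a simple least-squares line with an intercept and returns the residuals.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def residuals(y, x):
    """Residuals from the least-squares regression of y on x with an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


rng = random.Random(42)
# Hypothetical cross-section: smaller firms tend to have higher betas.
log_size = [rng.gauss(0.0, 1.0) for _ in range(300)]
beta = [1.0 - 0.6 * s + rng.gauss(0.0, 0.3) for s in log_size]

beta_unique = residuals(beta, log_size)  # part of beta not explained by size
size_unique = residuals(log_size, beta)  # part of size not explained by beta

print("corr(beta, size)        =", round(pearson(beta, log_size), 3))
print("corr(beta_unique, size) =", round(pearson(beta_unique, log_size), 3))
print("corr(size_unique, beta) =", round(pearson(size_unique, beta), 3))
```

Each residual series is uncorrelated with the variable it was regressed on, which is what makes it usable as a 'unique contribution'. The two residual series are not, in general, uncorrelated with each other; that is one reason the principal components alternative may be preferred when mutually orthogonal factors are needed.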

An added complication in asset pricing research is that some of the empirical models include the PE ratio (PE = stock price over accounting earnings) as an explanatory variable in addition to the risk measure beta. But PE too contains a risk measure. Current financial theory interprets PE as a combination of two effects. Ceteris paribus, higher perceived risk would lower the PE ratio and higher perceived growth would raise the PE ratio. Empirical studies are complicated by a high collinearity between PE and beta. There are other problems with asset pricing studies that have to do with the time series nature of the data and the methods by which the concept of risk is rendered, and we review these concerns in another paper.


Sample sizes in financial research are typically very large. For example, the Fama and French (1992) paper on asset pricing models is a study of monthly stock returns from over 2000 firms in the period 1962 to 1989. This represents over 672,000 observations. When we treat these observations as a sample we get a sampling distribution with a tiny standard deviation. For instance, the degrees of freedom are approximately 670,000, whose square root is about 818. In this case the standard deviation of returns was 52%. This means that the standard deviation of the sampling distribution is about 0.0636%, or about 6 basis points, and a difference as small as 12 basis points might be considered significant although no fund manager would pursue a policy to take advantage of such a small effect. Many of the empirical investigations of the determinants of capital structure cited in this paper have proposed regression models that were found to be statistically significant at r-squared values of 1 to 5%. That is to say, 95 to 99% of the sum of squares in the sample remains unexplained and is considered by the model to be random background noise.
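The arithmetic behind these figures is short enough to reproduce. The sketch below takes the numbers quoted in the paragraph (2000 firms, monthly returns from 1962 through 1989, a 52% standard deviation of returns, roughly 670,000 degrees of freedom) and treats the standard error as the standard deviation divided by the square root of the degrees of freedom, as the text does; the function name is illustrative.

```python
import math


def standard_error_pct(sd_pct, degrees_of_freedom):
    """Standard deviation of the sampling distribution, in percent."""
    return sd_pct / math.sqrt(degrees_of_freedom)


firms = 2000
months = (1989 - 1962 + 1) * 12  # 28 years of monthly returns
observations = firms * months

se_pct = standard_error_pct(52.0, 670_000)
se_basis_points = se_pct * 100.0  # 1% = 100 basis points

print(observations)                   # 672000
print(round(se_basis_points, 1))      # 6.4
print(round(2 * se_basis_points, 1))  # 12.7
```

A difference of roughly two standard errors, here about 12 to 13 basis points, is enough to clear the conventional 5% threshold, which is the point the paragraph makes about statistical versus practical significance.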


Black (1993) observes that although there are many researchers in finance, perhaps thousands, there is only one history, and therefore, pretty much just one set of data. Researchers use past studies to formulate new hypotheses and models. They then test these models on the same data that were used to test the past hypotheses. Recall that classical hypothesis testing is formulated on the basis of a priori hypotheses and the probability of the data given the null. But in finance we find ourselves repeatedly testing the probability of the data given the hypothesis given the data; i.e. testing for patterns ex post.

In this sense classical statistics is a failure in financial research and I would like to propose a re-evaluation of our research agenda and methodology and an emphasis on well designed case studies on small samples and even anecdotes.


Table 1

Country          Variable        Sample statistic   Rejection status
United Kingdom   spot rate        0.3619            –
United Kingdom   forward rate    -0.2821            –
United Kingdom   risk premium    -0.8977            **
West Germany     spot rate        0.2299            –
West Germany     forward rate     0.4454            *
West Germany     risk premium    -0.4544            *
Switzerland      spot rate       -0.2643            –
Switzerland      forward rate    -0.0340            –
Switzerland      risk premium    -1.6703            **

Rejection status: – = fail to reject, * = reject at alpha=10%, ** = reject at alpha=5%


Table 2

t-Test: Ho: µ = 0 Ha: µ ≠ 0

Sample   Sample mean    t-statistic   p-value   Rejection status
y1       -0.11183675    -1.342        0.1828    –
y2       -0.18801382    -1.995        0.0487    **
y3       -0.04132946    -0.438        0.6621    –
y4       -0.21933539    -2.112        0.0372    **
y5       -0.05426054    -0.533        0.5953    –
y6       -0.04306246    -0.469        0.6403    –
y7       -0.05917902    -0.712        0.4780    –
y8        0.14411801     1.308        0.1939    –
y9        0.15260649     1.570        0.1197    –
y10       0.18155813     1.804        0.0742    *


Table 3

MODEL: Vt = Bo + B1D + B2t
Vt = ratio of on-market to total market value traded on day t
D = 1 after screen trading was implemented
D = 0 before screen trading was implemented
t = time in days

(coefficients and t-statistics)
Bo 0.3135 18.2
B1 0.1444 4.7
B2 0.000139 0.35

Simulated data
time dummy1 dummy2
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
6 0 0
7 0 1
8 0 1
9 0 1
10 0 1
11 1 1
12 1 1
13 1 1
14 1 1
15 1 1
16 1 1
17 1 1
18 1 1
19 1 1
20 1 1

Pearson Product-Moment Correlation
time dummy1 dummy2
time 1.000
dummy1 0.867 1.000
dummy2 0.795 0.655 1.000


Table 4

Note: a negative effect supports the DeAngelo-Masulis theory.

Study                       Direction   Statistical test
Bradley, Jarrell, and Kim   positive    significant
Kim and Sorensen            negative    not significant
Boquist and Moore           positive    significant
Bowen, Daley, and Huber     negative    significant
DeAngelo and Masulis        negative    significant


F-value of the test: 44.392
R-squared: 0.334
Number of observations: 800

AAA rate regression weight: -0.004382
t-value for beta: -2.504
probability t<-2.504 0.0125


Aggarwal, Raj, Distribution of spot and forward exchange rates: empirical evidence and investor valuation of skewness and kurtosis, Decision Sciences, v21, p588-595, 1990

Banz, Rolf, The relationship between return and market value of common stocks, Journal of Financial Economics, v9 p3, 1981

Black, Fisher, Beta and return, Journal of Portfolio Management, Fall 1993, p8

Blennerhassett, Michael, and Robert Bowman, A change in market microstructure – the switch to screen trading in the New Zealand Stock Exchange, Asia Pacific Finance Association, Sydney, Australia, 1994

Boquist, John and William Moore, Inter-industry leverage differences and the DeAngelo-Masulis tax shield hypothesis, Financial Management, Spring 1984, p5-9

Bowen, Robert, Daley, Lane, and Charles Huber, Evidence on the existence and determinants of inter-industry differences in leverage, Financial Management, Winter 1982, p10-20

Brown, Keith, W.B. Harlow, and Seha Tinic, How rational investors deal with uncertainty: reports of the death of the efficient market theory are greatly exaggerated, Financial Management Collection, Fall 1990

Brown, S.J., and J.B. Warner, Using daily stock returns, the case of event studies, Journal of Financial Economics, March 1985, p3

Chan, K. C., and Naifu Chen, An unconditional asset pricing test and the role of firm size as an instrumental variable for risk, Journal of Finance, v43, p309, 1988

Chan, K. C., and Josef Lakonishok, Are reports of beta’s death premature?, Journal of Portfolio Management, Summer 1993

Chen, N.F, and Ingersoll, E., Exact pricing in linear factor models with finitely many assets: A note, Journal of Finance June 1983 page 985

Chen, Naifu, Richard Roll, and Stephen Ross, Economic forces and the stock market: testing the APT and alternate asset pricing theories, Working paper, December 1983

Chen, Naifu, Some empirical tests of the theory of arbitrage pricing, Journal of Finance, Dec 1983, p1393-1414

DeAngelo, H. and R.W. Masulis, Optimal capital structure under corporate and personal taxation, Journal of Financial Economics, v8, March 1980, p3-29

Dimson, Elroy, Risk measurement when shares are subject to infrequent trading, Journal of Financial Economics, v7, p197, 1979 (the non-synchronicity problem)

Dybvig, Phillip, and Ross, Stephen, Yes, the APT is Testable, Journal of Finance, Sep, 1985

Fama, Eugene, and Kenneth French, The cross section of expected stock returns, Journal of Finance, v47:2, 1992, p427

Fama, Eugene, and James MacBeth, Risk, return, and equilibrium, Journal of Political Economy, 1973, 81, p607

Garman, Mark, and Michael Klass, On the estimation of security price volatilities from historical data, Journal of Business, v53, p67, 1980

Gatward, Paul and Ian Sharp, Capital structure dynamics with interrelated adjustments: Australian evidence, Third International Conference on Asia Pacific Financial Markets, Singapore, 1993

Haugen, Robert, and Nardin Baker, Interpreting risk and expected return: comment, Journal of Portfolio Management, Spring 1993, p36 (confirms F-F and rationalizes higher returns for lower risk: the market prices growth stocks too high)

Hsieh, David, Nonlinear Dynamics in Financial Markets, Financial Analysts Journal, July-August 1995, p55

Kane, Alex, Marcus, Alan, and Robert McDonald, How big is the tax advantage of debt, Journal of Finance, July 1984, p841-855

Kim, Wi, and Eric Sorensen, Evidence of the impact of the agency cost of debt on the corporate debt policy, Journal of Financial and Quantitative Analysis, v21:2, July 1986, p131-143

Kolb, Robert, and Ricardo Rodriguez, The regression tendencies of betas, The Financial Review, v24:2 May, 1989, p319 (beta is not stationary)

Krueger, Thomas, and William Kennedy, An examination of the superbowl stock market predictor, Journal of Finance, June, 1990, p691

Kryzanowski, Lawrence, Simon Lalancette, and Minh Chau To, Some tests of APT mispricing using mimicking portfolios, Financial Review, v29: 2, p153, May 1994

Mandelbrot, Benoit, The variation of certain speculative prices, Journal of Business, October 1963

Markowitz, Harry, Portfolio selection, Journal of Finance, v7, March 1952, p77

Parkinson, Michael, The extreme value method for estimating the variance of the rate of return, Journal of Business, v53, p61, 1980

Peters, Edgar E., Fractal structure in the capital markets, Financial Analysts Journal, July/August 1989, pp. 32-37

Reinganum, Marc, A new empirical perspective on the CAPM, Journal of Financial and Quantitative Analysis, v16, p439, 1981

Roll, Richard, A critique of the asset pricing theory’s tests, Journal of Financial Economics, March 1977, p129

Roll, Richard and Stephen Ross, An empirical investigation of the arbitrage pricing theory, Journal of Finance, Dec 1980, p1073

Ross, Stephen, The arbitrage theory of capital pricing, Journal of Economic Theory, v13, p341, 1976

Scheinkman, J.A., and Blake LeBaron, Non-linear dynamics and stock returns, Working paper number 181, Dept of Economics, University of Chicago, 1990

Schwert, G. W., Why does stock market volatility change over time?, Journal of Finance, Dec 1989, p1115

Sharpe, William, A simplified model for portfolio returns, Management Science, 1962, p277

Sharpe, William, Capital asset prices: a theory of market equilibrium under conditions of risk, Journal of Finance, v19, p425, 1964

Shukla, Ravi, and Charles Trzcinka, Research on risk and return: Can measures of risk explain anything?, Journal of Portfolio Management, Spring 1991 (weekly returns, capm just as good as multifactor apt)

Velleman, Paul, Definition and comparison of robust nonlinear data smoothing algorithms, Journal of the American Statistical Association, v75, September 1980, p609-615


Jamal, you have some real insights in this paper.

First, you have identified the problems that most computer modelers (for example, those in finance, those in climate science, and those now doing pandemic forecasting) won’t acknowledge they have:

“[W]e think we live in the world of Ho the null hypothesis, where entropy is maximized and no patterns and therefore no information exists. To find information we seek patterns. *** The more patterns you look for the greater the odds that a chance pattern will be observed in a sample taken from the Ho distribution. The procedure compromises the statistical method because even in the world of Ho, if you look long enough you will find the pattern you are looking for.”

Like you, I do not feel that financial markets can be properly analyzed using statistical methods alone. People – and especially investors – are not rational actors:

“In practice, financial researchers often stray from this pattern seeking model partly because of convenience and partly because of unfamiliarity with the actual statistical underpinnings of the research methods; but more importantly, I believe, because the methodology, though a stunning success in the natural sciences, is not a good fit in finance.”

No statistical theory, including Portfolio theory and its diversification and balancing, could have protected an investor from, for example, the Great Financial Crisis of 2008/9. People panicked, just like people are doing now.

Thank you sir for your very insightful comment. Fully agree. Humans are very smart and they know a lot but they also have an ego that makes them think they ought to “knowitall”. They don’t know how to say we don’t know and most of the time that is the honest answer. This is the biggest joke in climate science.
