# DATA SELECTION BIAS BIBLIOGRAPHY

Posted on: June 15, 2021

THIS POST IS A LITERATURE REVIEW OF DATA SELECTION BIAS IN EMPIRICAL RESEARCH.

THE CONTEXT OF THIS LITERATURE SURVEY IS DATA SELECTION BIAS IN CLIMATE SCIENCE, WHERE A CAUSE AND EFFECT THEORY FOR THE TEMPERATURE CYCLES OF THE HOLOCENE IS PROPOSED ON THE BASIS OF DATA FOR ONE SUCH EVENT SELECTED BY THE RESEARCHERS.

Manski, Charles F. “3 The selection problem in econometrics and statistics.” (1993): 73-84.

This chapter discusses the selection problem in econometrics and statistics. Because censored data are so common, econometricians and statisticians have devoted much effort to their analysis. In particular, the following selection problem has drawn substantial attention: each member of a population is characterized by a triple (y, z, x), where y lies in a finite dimensional real space Y, z = 0 or 1, and x lies in a finite dimensional real space X. The selection problem is the failure of the censored-sampling process to identify P(y | x). The sampling process does identify the selection probability P(z = 1 | x), the censoring probability P(z = 0 | x), and the measure of y conditional on selection, P(y | x, z = 1). It is uninformative regarding the measure of y conditional on censoring, P(y | x, z = 0). Although the econometrics and statistics literatures on the selection problem differ in important respects, they both focus primarily on situations in which one has strong prior information on the distribution of (y, z) conditional on x. The chapter discusses the selection problem in the absence of prior information and the selection problem with prior information. Censoring creates an identification problem. Identification depends on the prior knowledge a researcher is willing to assert in the application of interest. As researchers are heterogeneous in their applications and in their prior beliefs, so must be their perspectives on the selection problem.
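The identification problem described in this abstract can be illustrated with a small simulation. The sketch below is not taken from the chapter. It assumes a binary outcome y bounded in [0, 1], treats z = 1 as marking the cases whose outcome is observed, and ignores the covariate x for simplicity. The function name `worst_case_bounds` and the selection probabilities are hypothetical choices made for illustration. Because the law of total probability gives E[y] = E[y | z = 1] P(z = 1) + E[y | z = 0] P(z = 0), and the data say nothing about E[y | z = 0], the most that can be said without prior information is that E[y] lies in an interval whose width is P(z = 0).

```python
import random


def worst_case_bounds(observed_y, n_total, y_min=0.0, y_max=1.0):
    """Bounds on the population mean of y when only selected outcomes are seen.

    Assumes every outcome lies in [y_min, y_max]; the censored outcomes are
    replaced by the smallest and the largest value they could possibly take.
    """
    n_obs = len(observed_y)
    if n_obs == 0 or n_total < n_obs:
        raise ValueError("need at least one observed outcome and n_total >= n_obs")
    p_selected = n_obs / n_total              # estimate of P(z = 1)
    mean_selected = sum(observed_y) / n_obs   # estimate of E[y | z = 1]
    lower = mean_selected * p_selected + y_min * (1.0 - p_selected)
    upper = mean_selected * p_selected + y_max * (1.0 - p_selected)
    return lower, upper


random.seed(0)
population = []
for _ in range(10_000):
    y = 1 if random.random() < 0.4 else 0
    # Hypothetical selection rule: the chance of being observed depends on y.
    z = 1 if random.random() < (0.9 if y == 1 else 0.5) else 0
    population.append((y, z))

true_mean = sum(y for y, _ in population) / len(population)
observed = [y for y, z in population if z == 1]
naive_mean = sum(observed) / len(observed)    # mean of the selected cases only
lower, upper = worst_case_bounds(observed, len(population))

print(f"true mean {true_mean:.3f}, naive mean {naive_mean:.3f}")
print(f"worst-case bounds [{lower:.3f}, {upper:.3f}]")
```

In this setup the mean of the selected cases overstates the population mean because cases with y = 1 are more likely to be observed, while the bounds contain the true mean regardless of how selection operates. Narrowing the interval requires the kind of prior information the abstract refers to.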

Berk, Richard A. “An introduction to sample selection bias in sociological data.” American sociological review (1983): 386-398.

Heckman, James J. “Sample selection bias as a specification error.” Econometrica: Journal of the econometric society (1979): 153-161.

Stolzenberg, Ross M., and Daniel A. Relles. “Tools for intuition about sample selection bias and its correction.” American Sociological Review (1997): 494-507.

Winship, Christopher, and Robert D. Mare. “Models for sample selection bias.” Annual review of sociology 18.1 (1992): 327-350.

ABSTRACT: When observations in social research are selected so that they are not independent of the outcome variables in a study, sample selection leads to biased inferences about social processes. Nonrandom selection is both a source of bias in empirical research and a fundamental aspect of many social processes. This chapter reviews models that attempt to take account of sample selection and their applications in research on labor markets, schooling, legal processes, social mobility, and social networks. Variants of these models apply to outcome variables that are censored or truncated, whether explicitly or incidentally, and include the tobit model, the standard selection model, models for treatment effects in quasi-experimental designs, and endogenous switching models. Heckman’s two-stage estimator is the most widely used approach to selection bias, but its results may be sensitive to violations of its assumptions about the way that selection occurs. Recent econometric research has developed a wide variety of promising approaches to selection bias that rely on considerably weaker assumptions. These include a number of semi- and nonparametric approaches to estimating selection models, the use of panel data, and the analyses of bounds of estimates. The large number of available methods and the difficulty of modelling selection indicate that researchers should be explicit about the assumptions behind their methods and should present results that derive from a variety of methods.
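The bias from explicit truncation mentioned in this abstract can be worked out for the simplest case. The sketch below is an illustration under stated assumptions and is not Heckman's two-stage estimator: it assumes a normally distributed outcome that is observed only when it exceeds a known cutoff, with hypothetical values for the mean, standard deviation, and cutoff. Under those assumptions the mean of the selected cases is mu + sigma * lambda(alpha), where alpha = (cutoff - mu) / sigma and lambda is the inverse Mills ratio, the same quantity that enters the standard selection model as a correction term.

```python
import random
from statistics import NormalDist, fmean


def inverse_mills_ratio(alpha):
    """phi(alpha) / (1 - Phi(alpha)) for the standard normal distribution."""
    std_normal = NormalDist()
    return std_normal.pdf(alpha) / (1.0 - std_normal.cdf(alpha))


# Hypothetical population parameters and truncation point.
mu, sigma, cutoff = 0.0, 1.0, 0.5

random.seed(1)
draws = [random.gauss(mu, sigma) for _ in range(200_000)]
selected = [y for y in draws if y > cutoff]   # explicit truncation from below

alpha = (cutoff - mu) / sigma
predicted_mean = mu + sigma * inverse_mills_ratio(alpha)
simulated_mean = fmean(selected)

print(f"population mean        {mu:.3f}")
print(f"mean of selected cases {simulated_mean:.3f}")
print(f"mu + sigma * lambda    {predicted_mean:.3f}")
```

The selected cases have a mean well above the population mean, and the gap matches the inverse Mills ratio term. The result depends on the normality assumption, which is one reason the abstract warns that such corrections may be sensitive to violations of their assumptions.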

Hug, Simon. “Selection bias in comparative research: The case of incomplete data sets.” Political Analysis (2003): 255-274.

Taha, Ahmed E. “Data and selection bias: A case study.” UMKC L. Rev. 75 (2006): 171.

RELATED POST ON DATA SELECTION BIAS: https://tambonthongchai.com/2020/10/09/a-data-selection-bias/