History of the modern epidemiological concept of confounding

Alfredo Morabia

doi:10.1136/jech.2010.112565

Essay

Correspondence to Alfredo Morabia, Center for the Biology of Natural Systems, Queens College - CUNY, 163-03 Horace Harding Expressway, Flushing, NY 11365, USA; alfredo.morabia@qc.cuny.edu

Abstract

The epidemiological concept of confounding has had a convoluted history. It was first expressed as an issue of group non-comparability, later as an uncontrolled fallacy, then as a controllable fallacy named confounding, and, more recently, as an issue of group non-comparability in the distribution of potential outcome types. This latest development synthesised the apparent disconnect between phases of the history of confounding. Group non-comparability is the essence of confounding, and the statistical fallacy its consequence. This essay discusses how confounding was perceived in the 18th and 19th centuries, reviews how the concept evolved across the 20th century and finally describes the modern definition of confounding.

History
methodology me


Introduction

To an unprepared mind, the terms ‘confounding’ or ‘confounder’ do not immediately evoke the consequences of comparing groups of people who differ on determinants of the studied outcome. Expressions suggesting the imbalanced distribution of multiple independent causes across groups would have conveyed the meaning more directly, but epidemiology retained the verb confound. The reason is that the theoretical work on the concept of confounding started with a description of a statistical fallacy: under some conditions, the effect of an exposure could be similar in each stratum of a third variable, but when these strata were pooled, it was as if the effect of the exposure of interest got ‘mixed’ with that of the third variable.1 The fallacy was thus aptly named confounding, from an old usage of the Medieval Latin verb ‘confundere’, which meant mixing.2 Earlier attempts to trace the history of confounding essentially focused on this conceptualisation of confounding as a fallacy.2 3

Compared to earlier reports, the present essay expands the history of the concept known today as epidemiologic confounding in the phases preceding and following the time when it was primarily viewed as a fallacy. After discussing how confounding was perceived in the 18th and 19th centuries, the essay reviews how the concept evolved across the 20th century and finally describes the modern definition of confounding.

Method

The methodological approach driving this history of confounding is inspired by Piaget's genetic epistemology.3 4 The leading idea is that scientific disciplines are in continual construction, formalisation and organisation. Their methods and concepts are commonsensical when the discipline first appears, but become increasingly theoretical and abstract as the discipline acquires experience and addresses questions of increasingly complex nature.4

The genetic epistemology approach assumes that the concept that was eventually named confounding: A. had a history, which started with commonsensical observations, B. evolved into an increasingly abstract, formal and overarching concept, and C. is still evolving today.

To trace the history of confounding, this essay uses the four phases (preformal, early, classic and modern) previously identified in the history of epidemiological methods and concepts.3 It focuses on the theory of confounding, and does not cover the statistical approaches to confounding-related issues (eg, collapsibility5), methods of adjusting for confounding, or the history of causal inference, which, even though closely related and overlapping at times with that of confounding, has a broader scope.5 6

Preformal confounding

Non-comparability of groups is the most primitive epidemiological concern to which the modern concept of confounding can be traced. When, in 1747, Lind7 compared the efficacy of candidate treatments of scurvy, he made sure his six experimental pairs of seamen were comparable, a priori, in terms of determinants of scurvy lethality such as disease stage, food and air quality.8

In the 19th century, group non-comparability was a formidable criticism of epidemiological studies.9 Hence the emphasis put by John Snow10 on the comparability of the 1854 London clients of the Southwark and Vauxhall water company, who drank polluted Thames water and experienced high mortality from cholera, with those of the Lambeth Company, who received relatively sewage-free water and experienced low mortality from cholera. Both groups, Snow insisted, were similar in social standing, housing space and occupations. He specifically investigated neighbourhoods with mixed water supply, in which adjacent houses could be supplied by different water companies.11 But for Snow's contemporaries, like Farr,12 who believed that cholera was due to air pollution, that is, miasma,13 the two companies served clients who differed substantially in ways thought to be relevant to the occurrence of cholera, such as elevation above sea level, family income and quality of housing. How could Snow confound his critics? He could only speculate that the clients of the two companies must have been comparable as a large number of people (‘no fewer than 300 000’), ‘were divided into two groups without their choice, and, in most cases, without their knowledge’.11 Retrospectively, we understand that the contagionist Snow was arguing against the miasmatic idea that the two client populations were not comparable on some miasma-related confounding characteristics. Snow claimed that comparability was plausible, but he lacked the techniques developed subsequently to achieve comparability analytically or by design.

Early theory of confounding

Indeed, epidemiologists of the first half of the 20th century began to formally address the criticism of non-comparability.9 They implemented new techniques such as random allocation of treatment,14 restriction of the study sample,15 standardisation of risks and rates15 16 and exposure propensity scores.17 18 They improved the epidemiological study designs, such as retrospective cohort studies16 19 and case-control studies.17 20 All of these efforts aimed to design studies and/or analyse the data in ways that purposefully optimised comparability on alternative causes of the studied outcome. Surprisingly, the first definition of the concept we refer to today as ‘confounding’ did not follow from this line of efforts to achieve balanced comparisons.

As shown in table 1, Yule,25 in 1903,1 Greenwood,26 in 1935,21 and Hill,27 in 1939,22 apparently independently described a fallacy resulting from pooling data when a third variable was not equally distributed in the compared groups. Yule used the imaginary example of an attribute, not transmitted by fathers to sons or by mothers to daughters, but that showed ‘considerable apparent inheritance’ when the data of fathers, sons, mothers and daughters were analysed together. Greenwood21 imagined an immunisation experiment, in which the risk of death was similar among the inoculated and the non-inoculated in a first and in a second group of patients, but pooling the two groups together resulted in a spurious protective effect of the inoculation. In Hill's example, a treatment did not work for men or women, but reduced mortality when the male and female data were combined.22

Table 1 Five historical representations of the concept of confounding as a fallacy resulting from mixing strata of exposure to a third factor

Yet, Yule, Greenwood and Hill do not seem to have viewed the fallacy as a common issue in population studies, and did not suggest computing a weighted average of the stratum-specific effects to bypass it. Apparently, their examples fell into oblivion. The subsequent phase of the history of the epidemiological concept of ‘confounding’ appears to be an offshoot of discussions related to the modelling of interactions.

Classic theory of confounding

Fisher28 used the verb ‘confound’ in 1926, to describe the implication of discarding some high-order interactions in the analysis of data from studies with factorial designs.2 Precision could be improved, but the sacrifice of interactions would amalgamate strata, eliminating and therefore ‘confounding’ the manifestation of some of the underlying heterogeneity of effects.29

In my view, Fisher was using the term ‘confounding’ in the same way it had been used earlier by the English philosopher Mill, that is, as the consequence of ignoring causal interactions. For Mill, confounding meant ‘intermixture of causes,’ which he defined as two or more causes, ‘modifying the effects of one another’.30 Mill was referring to a mixing of effects that were heterogeneous across strata of one of the causes. This was different from Yule's fallacy, in which the exposure had a single effect, which was similar in all strata of the extraneous factor, except for being confounded in the pooled effect.

It is at that point that Simpson,31 building on Fisher's work, made the contribution now known as Simpson's paradox. Simpson showed that discarding the interaction terms could affect the estimation of the pooled effect even when the stratum-specific effects were homogeneous. This could actually leave ‘considerable scope for paradox and error’. He gave the example of an imaginary trial (see table 1) in which the treatment homogeneously increased the survival odds both for males and females as separate groups, but had no effect when genders were pooled.23
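The pattern Simpson described can be checked with a few lines of arithmetic. The sketch below uses hypothetical counts, chosen only so that they reproduce the pattern in the text; they are not taken from table 1. The function names `survival_odds_ratio` and `pool` are likewise illustrative.

```python
from fractions import Fraction

# Hypothetical (alive, dead) counts by sex and treatment arm; illustrative only.
counts = {
    "male":   {"treated": (8, 5),   "untreated": (4, 3)},
    "female": {"treated": (12, 15), "untreated": (2, 3)},
}

def survival_odds_ratio(treated, untreated):
    """Odds ratio of survival, treated versus untreated, from (alive, dead) pairs."""
    alive_t, dead_t = treated
    alive_u, dead_u = untreated
    return Fraction(alive_t * dead_u, dead_t * alive_u)

def pool(table, arm):
    """Sum the (alive, dead) counts of one treatment arm over all strata."""
    alive = sum(stratum[arm][0] for stratum in table.values())
    dead = sum(stratum[arm][1] for stratum in table.values())
    return alive, dead

# Stratum-specific odds ratios, then the odds ratio after pooling the sexes.
stratum_or = {
    sex: survival_odds_ratio(arms["treated"], arms["untreated"])
    for sex, arms in counts.items()
}
pooled_or = survival_odds_ratio(pool(counts, "treated"), pool(counts, "untreated"))

print({sex: str(value) for sex, value in stratum_or.items()})
print(str(pooled_or))
```

With these counts the survival odds ratio is 6/5 in each sex, yet exactly 1 once males and females are pooled: the treatment appears beneficial within each stratum and ineffective overall.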

Simpson posited that, for second-order interactions to be ignored, the third variable had to be independent of the treatment among the non-outcomed and independent of the outcome variable among the unexposed. Otherwise, stratification had to be preserved. This became the core of the classic epidemiological definition of confounding.

From 1959 on, expressions appeared in the epidemiology literature that evoked Yule's fallacy or Simpson's paradox without explicitly referring to them. Papers and textbooks mentioned ‘indirect associations’,32 ‘misleading associations’ produced by ‘extraneous factors’,33 or ‘indirect associations generated by factors related to both outcome and exposure’.34 The term ‘confounding’ itself began to appear in epidemiological articles and textbooks in the 1970s.35–37 Its usage may have reflected the influence of the sociologist Kish, who had defined the term in 1959.2 38

Around 1980, it was specified, in addition to the two conditions formulated by Simpson, that the third variable should not mediate the relation of exposure to outcome.39 40 This third condition highlighted the need for a priori, non-statistical knowledge about the relationship of the potential confounder with the other studied variables.41

Overall, table 1 shows the similarity of the quantitative examples used to illustrate confounding as a mixing of effects, from Yule1 to Rothman,24 that is, across most of the 20th century. Yule's expression of confounding as, ‘a fallacy caused by the mixing of records (ie, strata)’1, is analogous to Rothman's, ‘on the simplest level, confounding may be considered as a mixing of effects’.24

Modern theory of confounding

The classic definition of confounding had weaknesses. It was derived from the relation of additional variables to exposure and outcome, and not from the characteristics of the studied association, such as non-comparability. A variable could meet the classic definition and not be a confounder.42 Matching for a confounder had different implications in cohort and case-control studies.39 43 Screening for confounding by comparing the stratum-specific and the pooled effects could lead to different conclusions based on whether one used risk ratios, risk differences or ORs.39
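The last weakness can be illustrated numerically. The sketch below is a minimal example under stated assumptions: the risks are hypothetical, the two strata are assumed to be of equal size, and exposure is assumed to be equally frequent in both strata, so the third variable is unrelated to exposure. The helper `effect_measures` is an illustrative name, not an established routine.

```python
from fractions import Fraction

def effect_measures(risk_exposed, risk_unexposed):
    """Return the risk difference, risk ratio and odds ratio for two risks."""
    def odds(risk):
        return risk / (1 - risk)
    return {
        "RD": risk_exposed - risk_unexposed,
        "RR": risk_exposed / risk_unexposed,
        "OR": odds(risk_exposed) / odds(risk_unexposed),
    }

# Hypothetical (exposed, unexposed) risks in two strata of a third variable.
strata = {
    "stratum 1": (Fraction(4, 5), Fraction(1, 2)),
    "stratum 2": (Fraction(1, 2), Fraction(1, 5)),
}

stratum_specific = {name: effect_measures(*risks) for name, risks in strata.items()}

# Assumption: equal stratum sizes and equal exposure prevalence in each stratum,
# so the pooled risks are the simple means of the stratum-specific risks.
pooled_exposed = sum(risks[0] for risks in strata.values()) / len(strata)
pooled_unexposed = sum(risks[1] for risks in strata.values()) / len(strata)
pooled = effect_measures(pooled_exposed, pooled_unexposed)

for measure in ("RD", "RR", "OR"):
    per_stratum = [str(stratum_specific[name][measure]) for name in strata]
    print(measure, per_stratum, "pooled:", str(pooled[measure]))
```

In this constructed example the risk difference is 3/10 in both strata and after pooling, which would suggest no confounding. The odds ratio is 4 in both strata but 169/49 (about 3.45) after pooling, which would suggest confounding. The risk ratio differs between strata (8/5 and 5/2; pooled 13/7). The same data thus lead to different screening conclusions depending on the measure chosen.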

The modern definition of confounding was inspired by work in the analysis of randomised controlled trials. In 1923, Neyman44 defined a causal effect as the impossible contrast between the outcome of a single unit, say an individual, if assigned the experimental treatment, and the outcome of that same individual if concurrently assigned the reference treatment.45 In 1974, Rubin stated the fundamental problem of effect identification in terms similar to those of Neyman.46 If ‘y(E)−y(C)’ is the effect of treatment E versus control C on outcome Y, and assuming y(E) and y(C) need to be measured at time 2 on the same person: ‘The problem in measuring y(E)−y(C) is that we can never observe both y(E) and y(C) since we cannot return to time t₁ to give the other treatment’.46

Each individual can be observed in only one treatment state at any point in time. Of the two potential outcomes (ie, under the experimental or under the reference treatments), one is observed, and the other needs to remain hypothetical. Thus, as described by Copas in 1973,47 there could be four individual types of potential outcome pairs for a dichotomous treatment (A and B) followed by a dichotomous outcome (success or failure) according to whether a subject would respond to A and B, A but not B, B but not A, or neither A nor B.

There is evidence in the literature of ongoing epidemiological reflection about potential outcomes in the 1980s,48 49 but it was not until a 1986 paper by Greenland and Robins that the potential outcome approach to confounding was made widely accessible to epidemiologists.42 In Greenland and Robins' paper, the potential outcome model was confined to deterministic risks (ie, risks that can equal either 0 or 1), but it differed from previous discussions46 50 because it used the four ‘causal’ types47 dubbed ‘doomed’, ‘exposure causative’, ‘exposure preventive’ and ‘immune’ (see table 2, which imitates a table in Greenland and Robins' 1986 paper).

Table 2 Definition and notation of potential outcome types and their outcomes according to two potential outcomes

The example in table 2 shows that if a centenarian lady has been vaccinated and does not get the flu, she has no way of knowing whether she was susceptible and the vaccine was ‘preventive’, or whether she is naturally ‘immune’. Similarly, if a non-vaccinated person gets the flu, she cannot know whether she would have avoided the flu had she been vaccinated. She could be ‘doomed’ or she could lack the protection of the ‘preventive’ vaccine. The effect of the vaccine cannot be identified, or its parameter estimated, without knowing both potential outcomes, under vaccination as well as under no vaccination. This is the logical impasse mentioned by Rubin46: both potential outcomes cannot be observed simultaneously in the same person.
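The impasse can be made concrete by listing, for each observed combination of vaccination and flu status, the potential outcome types that remain possible. The sketch below assumes that each type is fully defined by a pair of deterministic potential outcomes, as the four type names suggest; the encoding and the function name `compatible_types` are illustrative and are not taken from table 2.

```python
# Each type is encoded as a pair of potential outcomes:
# (gets flu if vaccinated, gets flu if not vaccinated).
# This encoding is an assumption inferred from the four type names.
POTENTIAL_OUTCOME_TYPES = {
    "doomed": (True, True),
    "exposure causative": (True, False),
    "exposure preventive": (False, True),
    "immune": (False, False),
}

def compatible_types(vaccinated, got_flu):
    """List the types consistent with one observed (vaccination, flu) pair.

    Only the potential outcome under the treatment actually received is
    observed, so only that element of each pair can be checked.
    """
    observed_index = 0 if vaccinated else 1
    return [
        name
        for name, outcomes in POTENTIAL_OUTCOME_TYPES.items()
        if outcomes[observed_index] == got_flu
    ]

# A vaccinated person who does not get the flu.
print(compatible_types(vaccinated=True, got_flu=False))
# A non-vaccinated person who gets the flu.
print(compatible_types(vaccinated=False, got_flu=True))
```

Every observation leaves exactly two candidate types: ‘exposure preventive’ or ‘immune’ for the vaccinated person without flu, and ‘doomed’ or ‘exposure preventive’ for the non-vaccinated person with flu. The individual effect therefore cannot be read off from a single observed outcome.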

Consider now two large randomised groups of N subjects each, and suppose that each group comprises d doomed, c causative, p preventive and i immune subjects with respect to flu, where d+c+p+i=N. One group gets the vaccine and the other does not. The risk difference of getting the flu is identifiable as groups are large and comparable with respect to their potential outcome types, assuming there were no gross violations of the assignment protocol,51 misclassification, or losses in the follow-up. They are, in Greenland and Robins' terminology, ‘exchangeable’. The risk of flu is R_V=(d+c)/N in those vaccinated, and R_NV=(d+p)/N in those not vaccinated. The risk difference between the non-vaccinated and the vaccinated is RD=R_NV−R_V=(p−c)/N. The risk difference only ‘partially identifies’ the vaccine effect, because a zero risk difference could be due to the vaccine causing as many flu cases (c) as it prevents (p). ‘Full identification’ is possible if, for example, the vaccine does not contain killed or weakened influenza virus, but only split particles of the flu virus, which cannot cause flu. Under this scenario, there are no ‘c’ subjects (c=0) and the risk difference is simply p/N.
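The arithmetic of the exchangeable case can be sketched as follows. The risk difference is taken here as the risk in the non-vaccinated minus the risk in the vaccinated, so that a protective vaccine yields a positive value. The counts in the usage lines are hypothetical and the function names are illustrative.

```python
from fractions import Fraction

def flu_risks(d, c, p, i):
    """Risks of flu in two exchangeable groups with the same type counts.

    d, c, p, i are the numbers of doomed, causative, preventive and immune
    subjects in each group of N = d + c + p + i.
    """
    n = d + c + p + i
    risk_vaccinated = Fraction(d + c, n)      # doomed and causative get flu
    risk_non_vaccinated = Fraction(d + p, n)  # doomed and preventive get flu
    return risk_vaccinated, risk_non_vaccinated

def risk_difference(d, c, p, i):
    """Risk in the non-vaccinated minus risk in the vaccinated: (p - c) / N."""
    risk_vaccinated, risk_non_vaccinated = flu_risks(d, c, p, i)
    return risk_non_vaccinated - risk_vaccinated

# Partial identification: the vaccine changes the outcome of 100 subjects,
# yet harm (c) and protection (p) cancel and the risk difference is zero.
print(risk_difference(d=100, c=50, p=50, i=800))

# Full identification when the vaccine cannot cause flu (c = 0): RD = p / N.
print(risk_difference(d=100, c=0, p=50, i=850))
```

The first call returns 0 although 50 subjects are harmed and 50 protected. The second returns 1/20, which equals p/N because no subject is of the causative type.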

However, if the groups were not at least ‘partially’ exchangeable, if, for example, there were more ‘doomed’ subjects (eg, centenarians with lethargic immune response) in the vaccinated group than in the non-vaccinated group, the ds would not cancel out, and the risk difference would be confounded.
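The way the doomed fail to cancel can be written out explicitly. In the sketch below, the subscripts V and NV on the type counts are notation introduced here for illustration: they allow the composition of the vaccinated and non-vaccinated groups to differ, and the risk difference is taken as non-vaccinated minus vaccinated.

```latex
\begin{align*}
R_V &= \frac{d_V + c_V}{N}, \qquad R_{NV} = \frac{d_{NV} + p_{NV}}{N},\\
RD &= R_{NV} - R_V = \frac{p_{NV} - c_V}{N} + \frac{d_{NV} - d_V}{N}.
\end{align*}
```

When the groups are exchangeable, d_V = d_NV, the second term vanishes and the risk difference reduces to (p−c)/N. When the vaccinated group contains more doomed subjects, the second term is negative and the observed risk difference understates the protection afforded by the vaccine.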

This theory of confounding derived from potential outcome contrasts has been generalised from randomised to observational studies;52 it has helped to formally distinguish confounding from selection bias53 and has recently been revisited by its authors.54

Conclusion

From a broad historical perspective, the modern definition of confounding based on potential outcome contrasts has reinstated group non-comparability as the essence of confounding, establishing the statistical fallacy as one of its consequences.

What is already known on this subject

Earlier attempts to trace the history of confounding focused on the period when confounding was conceptualised as a fallacy resulting from mixing the effect of the studied variables with that of a third variable.

What this study adds

The present essay expands the history of the concept known today as epidemiologic confounding to the 18th and 19th centuries, when it began to be viewed as an issue of non-comparability between groups.
It also explains how the modern definition of confounding based on potential outcome contrasts has reinstated group non-comparability as the essence of confounding and established that the statistical fallacy, from which confounding draws its name, is a consequence of group non-comparability.

Acknowledgments

This essay was presented as an invited lecture at XVIII IEA World Congress of Epidemiology and the VII Brazilian Congress of Epidemiology, in Porto Alegre, 21–24 September, 2008. I am indebted to Sander Greenland, Sharon Schwartz, Olli Miettinen, Jan Vandenbroucke, Jamey Robins, Timothy Lash, Paolo Vineis and Raj Bhopal for discussions and comments on the many earlier versions of this manuscript.

References

1. Yule GU. Notes on the theory of association of attributes in statistics. Biometrika 1903;2:121–34.
2. Vandenbroucke JP. The history of confounding. In: Morabia A, ed. History of epidemiological methods and concepts. Basel: Birkhäuser, 2004:313–26.
3. Morabia A. Epidemiology: an epistemological perspective. In: Morabia A, ed. History of epidemiological methods and concepts. Basel: Birkhäuser, 2004:1–126.
4. Piaget J. Genetic epistemology. New York: Columbia University Press, 1970.
5. Greenland S, Robins JM, Pearl J. Confounding and collapsibility in causal inference. Stat Sci 1999;14:29–46.
6. Vineis P. Causality in epidemiology. In: Morabia A, ed. History of epidemiological methods and concepts. Basel: Birkhäuser, 2004:337–50.
7. Tröhler U. James Lind and scurvy: 1747 to 1795. The James Lind Library, 2003. http://www.jameslindlibrary.org (accessed 11 Aug 2008).
8. Lind J. A treatise of scurvy, 1753. Edinburgh: University Press, 1953.
9. Morabia A. Epidemiological methods and concepts in the nineteenth century and their influences on the twentieth century. In: Holland WW, Olsen J, Florey C, eds. The development of modern epidemiology. Personal reports from those who were there. New York: Oxford University Press, 2007:17–29.
10. Frerichs RR. John Snow (1813–1858), 2009. http://www.ph.ucla.edu/epi/snow/encyclopediasummaryfrerichs.html (accessed 2 Mar 2010).
11. Snow J. On the mode of communication of cholera. 2nd edn. London: Churchill, 1855:74–5.
12. Halliday S. William Farr: Campaigning statistician. J Med Biogr 2000;8:220–7. http://www.ph.ucla.edu/epi/snow/farr/farr_compiler_a.html.
13. Vinten-Johansen P, Brody H, Paneth N, et al. Cholera, chloroform and the science of medicine: a life of John Snow. Oxford: Oxford University Press, 2003:260.
14. Therapeutic Trial Committee of the Medical Research Council. The serum treatment of lobar pneumonia. Lancet 1934;1:290–5.
15. Goldberger J, Wheeler GA, Sydenstricker E. A study of the relation of family income and other economic factors to pellagra incidence in seven cotton-mill villages of South Carolina in 1916. Public Health Rep 1920;35:2673–714.
16. Weinberg W. Die Kinder der Tuberkulosen. Leipzig: S Hirzel, 1913.
17. Lane-Claypon J. A further report on cancer of the breast: reports on public health and medical subjects. London, UK: His Majesty's Stationery Office, 1926. Report No.: 32.
18. Morabia A. Janet Lane-Claypon: interphase epitome. Epidemiology 2010;21:573–6.
19. Winkelstein W Jr. Vignettes of the history of epidemiology: three firsts by Janet Elizabeth Lane-Claypon. Am J Epidemiol 2004;160:97–101.
20. Stocks P, Karn M. A co-operative study of the habits, home life, dietary and family histories of 450 cancer patients and of an equal number of control patients. Ann Eugen 1933;5:237–79.
21. Greenwood M. Epidemics & crowd diseases: Introduction to the study of epidemiology. North Stratford, NH: Ayer Company Publishers, Incorporated, 1935:84–5.
22. Hill AB. Principles of medical statistics. London: The Lancet Ltd, 1939:125–7.
23. Simpson EH. The interpretation of interaction in contingency tables. J Roy Stat Soc Ser B 1951;13:238–41.
24. Rothman KJ. Modern epidemiology. 1st edn. Boston: Little Brown and Co, 1986:89.
25. Kendall M. George Udny Yule 1871–1951. J R Stat Soc 1952;115:156–61.
26. Wilkinson L. ‘Greenwood, Major (1880–1949)’, Oxford Dictionary of National Biography, 2004. http://www.oxforddnb.com/view/article/51797 (accessed 26 Feb 2010).
27. Doll R. Austin Bradford Hill. 8 July 1897–18 April 1991. Biogr Mem Fellows R Soc 1994;40:129–40.
28. Yates F, Mather K. Ronald Aylmer Fisher 1890–1962. Biogr Mem Fellows R Soc 1963;9:91–120.
29. Street DJ. Fisher's Contributions to Agricultural Statistics. Biometrics 1990;46:937–45.
30. Mill JS. A system of logic (Eighth Edition 1881). In: Nagel E, ed. J.S. Mill's Philosophy of Scientific Method. New York: Hafner Publishing Co, 1881:3–356.
31. Edward H Simpson (born 1922). http://en.wikipedia.org/wiki/Edward_H._Simpson, 2010 (accessed 2 Mar 2010).
32. Cornfield J, Haenszel W, Hammond EC, et al. Smoking and lung cancer: recent evidence and a discussion of some questions. J Natl Cancer Inst 1959;22:173–203.
33. Mantel N, Haenszel W. Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst 1959;22:719–48.
34. MacMahon B, Pugh TF, Ipsen J. Epidemiologic methods. Boston: Little, Brown and Co, 1960:14–16.
35. MacMahon B, Pugh TF. Epidemiology: principles and methods. Boston: Little, Brown and Co, 1970.
36. Miettinen OS. Matching and design efficiency in retrospective studies. Am J Epidemiol 1970;91:111–18.
37. Susser M. Causal thinking in the health sciences. New York: Oxford, 1973.
38. Kish L. Some statistical problems in research design. Am Sociol Rev 1959;24:328–38.
39. Miettinen OS, Cook EF. Confounding: essence and detection. Am J Epidemiol 1981;114:593–603.
40. Greenland S, Neutra R. Control of confounding in the assessment of medical technology. Int J Epidemiol 1980;9:361–7.
41. Rothman KJ. Epidemiologic methods in clinical trials. Cancer 1977;39:1771–5.
42. Greenland S, Robins JM. Identifiability, exchangeability, and epidemiological confounding. Int J Epidemiol 1986;15:413–19.
43. Kupper LL, Karon JM, Kleinbaum DG, et al. Matching in epidemiologic studies: validity and efficiency considerations. Biometrics 1981;37:271–91.
44. O'Connor J, Robertson E. Jerzy Neyman (1894–1981), 2003. http://www-history.mcs.st-andrews.ac.uk/Biographies/Neyman.html (accessed 2 Mar 2010).
45. Splawa-Neyman J. On the application of probability theory to agricultural experiments. Essay on Principles. Section 9 (1923). Statist Sci 1990;5:465–80.
46. Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol 1974;66:688–701.
47. Copas J. Randomization models for the matched and unmatched 2×2 tables. Biometrika 1973;60:467–76.
48. Miettinen OS. Theoretical epidemiology. New York: Wiley, 1985:251–2.
49. Robins JM, Morgenstern H. The foundations of confounding in epidemiology. Comput Math Applic 1987;14:869–916.
50. Holland P. Statistics and causal inference (with discussion). J Am Stat Assoc 1986;81:940–70.
51. Greenland S. Randomization, statistics, and causal inference. Epidemiology 1990;1:421–9.
52. Rubin DB. [On the Application of Probability Theory to Agricultural Experiments. Essay on Principles. Section 9.] Comment: Neyman (1923) and Causal Inference in Experiments and Observational Studies. Statist Sci 1990;5:472–80.
53. Hernan MA, Hernandez-Diaz S, Robins JM. A structural approach to selection bias. Epidemiology 2004;15:615–25.
54. Greenland S, Robins JM. Identifiability, exchangeability and confounding revisited. Epidemiologic Perspectives & Innovations 2009;6:4.

Footnotes

Competing interests None.
Patient consent Obtained.
Provenance and peer review Not commissioned; externally peer reviewed.

[1] ↵
Yule GU
. Notes on the theory of association of attributes in statistics. Biometrika 1903;2:121–34.
OpenUrl FREE Full Text

[2] Yule GU

[3] ↵
Morabia A
Vandenbroucke JP
. The history of confounding. In: Morabia A, ed. History of epidemiological methods and concepts. Basel: Birkhäuser, 2004:313–26.

[4] Morabia A

[5] Vandenbroucke JP

[6] ↵
Morabia A
Morabia A
. Epidemiology: an epistemological perspective. In: Morabia A, ed. History of epidemiological methods and concepts. Basel: Birkhäuser, 2004:1–126.

[7] Morabia A

[8] Morabia A

[9] ↵
Piaget J
. Genetic epistemology. New York: Columbia University Press, 1970.

[10] Piaget J

[11] ↵
Greenland S,
Robins JM,
Pearl J
. Confounding and collapsibility in causal inference. Stat Sci 1999;14:29–46.
OpenUrl CrossRef Web of Science

[12] Greenland S,

[13] Robins JM,

[14] Pearl J

[15] ↵
Morabia A
Vineis P
. Causality in epidemiology. In: Morabia A, ed. History of epidemiological methods and concepts. Basel: Birkhäuser, 2004:337–50.

[16] Morabia A

[17] Vineis P

[18] ↵
Tröhler U
. James Lind and scurvy: 1747 to 1795. The James Lind Library, 2003. http://www.jameslindlibrary.org (accessed 11 Aug 2008).

[19] Tröhler U

[20] ↵
Lind J
. A treatise of scurvy, 1753. Edinburgh: University Press, 1953.

[21] Lind J

[22] ↵
Holland WW,
Olsen J,
Florey C
Morabia A
. Epidemiological methods and concepts in the nineteenth century and their influences on the twentieth century. In: Holland WW, Olsen J, Florey C, eds. The development of modern epidemiology. Personal reports from those who were there. New York: Oxford University Press, 2007:17–29.

[23] Holland WW,

[24] Olsen J,

[25] Florey C

[26] Morabia A

[27] ↵
Frerichs RR
. John Snow (1813–1858), 2009. http://www.ph.ucla.edu/epi/snow/encyclopediasummaryfrerichs.html (accessed 2 Mar 2010).

[28] Frerichs RR

[29] ↵
Snow J
. On the mode of communication of cholera. 2nd edn. London: Churchill, 1855:74—5.

[30] Snow J

[31] ↵
Halliday S
. William Farr: Campaigning statistician. J Med Biogr 2000;8:220–7. http://www.ph.ucla.edu/epi/snow/farr/farr_compiler_a.html.
OpenUrl PubMed

[32] Halliday S

[33] ↵
Vinten-Johansen P,
Brody H,
Paneth N,
et al
. Cholera, chloroform and the science of medicine: a life of John Snow. Oxford: Oxford University Press, 2003:260.

[34] Vinten-Johansen P,

[35] Brody H,

[36] Paneth N,

[37] et al

[38] ↵
Therapeutic Trial Committee of the Medical Research Council. The serum treatment of lobar pneumonia. Lancet 1934;1:290–5.
OpenUrl

[39] ↵
Goldberger J,
Wheeler GA,
Sydenstricker E
. A study of the relation of family income and other economic factors to pellagra incidence in seven cotton-mill villages of South Carolina in 1916. Public Health Rep 1920;35:2673–714.
OpenUrl PubMed

[40] Goldberger J,

[41] Wheeler GA,

[42] Sydenstricker E

[43] ↵
Weinberg W
. Die Kinder der Tuberkulosen. Leipzig: S Hirzel, 1913.

[44] Weinberg W

[45] ↵
Lane-Claypon J
. A further report on cancer of the breast: reports on public health and medical subjects. London, UK: His Majesty's Stationary Office, 1926. Report No.: 32.

[46] Lane-Claypon J

[47] ↵
Morabia A
. Janet Lane-Claypon—interphase epitome. Epidemiology 2010;21:573–6.
OpenUrl CrossRef PubMed Web of Science

[48] Morabia A

[49] ↵
Winkelstein W Jr.
. Vignettes of the history of epidemiology: three firsts by Janet Elizabeth Lane-Claypon. Am J Epidemiol 2004;160:97–101.
OpenUrl Abstract/FREE Full Text

[50] Winkelstein W Jr.

[51] ↵
Stocks P,
Karn M
. A co-operative study of the habits, home life, dietary and family histories of 450 cancer patients and of an equal number of control patients. Ann Eugen 1933;5:237–79.
OpenUrl CrossRef

[52] Stocks P,

[53] Karn M

[54] ↵
Greenwood M
. Epidemics & crowd diseases: Introduction to the study of epidemiology. North Stratford, NH: Ayer Company Publishers, Incorporated, 1935: 84—5.

[55] Greenwood M

[56] ↵
Hill AB
. Principles of medical statistics. London: The Lancet Ltd, 1939:125–7.

[57] Hill AB

[58] ↵
Simpson EH
. The interpretation of interaction in contingency tables. J Roy Stat Soc Ser B 1951;13:238–41.
OpenUrl

[59] Simpson EH

[60] ↵
Rothman KJ
. Modern epidemiology. 1st edn. Boston: Little Brown and Co, 1986:89.

[61] Rothman KJ

[62] ↵
Kendall M
. George Udny Yule 1871–1951. J R Stat Soc 1952;115:156–61.
OpenUrl

[63] Kendall M

[64] ↵
Wilkinson L
. ‘Greenwood, Major (1880–1949)’, Oxford Dictionary of National Biography, 2004. http://www.oxforddnb.com/view/article/51797 (accessed 26 Feb 2010).

[65] Wilkinson L

[66] ↵
Doll R
. Austin Bradford Hill. 8 July 1897–18 April 1991. Biogr Mem Fellows R Soc 1994;40:129–40.
OpenUrl

[67] Doll R

[68] ↵
Yates F,
Mather K
. Ronald Aylmer Fisher 1890–1962. Biogr Mem Fellows R Soc 1963;9:91–120.
OpenUrl FREE Full Text

[69] Yates F,

[70] Mather K

[71] ↵
Street DJ
. Fisher's Contributions to Agricultural Statistics. Biometrics 1990;46:937–45.
OpenUrl CrossRef Web of Science

[72] Street DJ

[73] ↵
Nagel E
Mill JS
. A system of logic (Eight Edition 1881). In: Nagel E, ed. J.S. Mill's Philosophy of Scientific Method. New York: Hafner Publishing Co, 1881:3–356.

[74] Nagel E

[75] Mill JS

[76] ↵
Edward H Simpson (born 1922). http://en.wikipedia.org/wiki/Edward_H._Simpson, 2010 (accessed 2 Mar 2010).

[77] ↵
Cornfield J,
Haenszel W,
Hammond EC,
et al
. Smoking and lung cancer: recent evidence and a discussion of some questions. J Natl Cancer Inst 1959;22:173–203.
OpenUrl PubMed Web of Science

[78] Cornfield J,

[79] Haenszel W,

[80] Hammond EC,

[81] et al

[82] ↵
Mantel N,
Haenszel W
. Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst 1959;22:719–48.
OpenUrl PubMed Web of Science

[83] Mantel N,

[84] Haenszel W

MacMahon B, Pugh TF, Ipsen J. Epidemiologic methods. Boston: Little, Brown and Co, 1960:14–16.

MacMahon B, Pugh TF. Epidemiology: principles and methods. Boston: Little, Brown and Co, 1970.

Miettinen OS. Matching and design efficiency in retrospective studies. Am J Epidemiol 1970;91:111–18.

Susser M. Causal thinking in the health sciences. New York: Oxford, 1973.

Kish L. Some statistical problems in research design. Am Sociol Rev 1959;24:328–38.

Miettinen OS, Cook EF. Confounding: essence and detection. Am J Epidemiol 1981;114:593–603.

Greenland S, Neutra R. Control of confounding in the assessment of medical technology. Int J Epidemiol 1980;9:361–7.

Rothman KJ. Epidemiologic methods in clinical trials. Cancer 1977;39:1771–5.

Greenland S, Robins JM. Identifiability, exchangeability, and epidemiological confounding. Int J Epidemiol 1986;15:413–19.

Kupper LL, Karon JM, Kleinbaum DG, et al. Matching in epidemiologic studies: validity and efficiency considerations. Biometrics 1981;37:271–91.

O'Connor J, Robertson E. Jerzy Neyman (1894–1981), 2003. http://www-history.mcs.st-andrews.ac.uk/Biographies/Neyman.html (accessed 2 Mar 2010).

Splawa-Neyman J. On the application of probability theory to agricultural experiments. Essay on Principles. Section 9 (1923). Statist Sci 1990;5:465–80.

Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol 1974;66:688–701.

Copas J. Randomization models for the matched and unmatched 2×2 tables. Biometrika 1973;60:467–76.

Miettinen OS. Theoretical epidemiology. New York: Wiley, 1985:251–2.

Robins JM, Morgenstern H. The foundations of confounding in epidemiology. Comput Math Applic 1987;14:869–916.

Holland P. Statistics and causal inference (with discussion). J Am Stat Assoc 1986;81:940–70.

Greenland S. Randomization, statistics, and causal inference. Epidemiology 1990;1:421–9.

Rubin DB. [On the Application of Probability Theory to Agricultural Experiments. Essay on Principles. Section 9.] Comment: Neyman (1923) and Causal Inference in Experiments and Observational Studies. Statist Sci 1990;5:472–80.

Hernan MA, Hernandez-Diaz S, Robins JM. A structural approach to selection bias. Epidemiology 2004;15:615–25.

Greenland S, Robins JM. Identifiability, exchangeability and confounding revisited. Epidemiologic Perspectives & Innovations 2009;6:4.
