## Abstract

Multilevel analysis has recently emerged as a useful analytical technique in several fields, including public health and epidemiology. This glossary defines key concepts and terms used in multilevel analysis.

- multilevel analysis

Multilevel analysis, originally developed in the fields of education, sociology, and demography, has received increasing attention in public health and epidemiology over the past few years. This glossary defines key terms and concepts in multilevel analysis. The intent is to provide conceptual explanations of basic concepts, particularly those that are fundamental, that have been used inconsistently or that lend themselves to confusion. Selected terms and concepts more broadly related to the presence of multiple levels of organisation (such as group level variables and inferential fallacies) are also included. Although the glossary often refers to individuals nested within groups, multilevel analysis is applicable to a broad range of situations involving units at a lower level (or micro units) nested within units at a higher level (or macro units) (including for example, persons nested within studies as in meta-analysis, and measures over time nested within individuals as in the analysis of repeat measures). References to terms that have their own specific entry are in small capitals.

## AGGREGATE DATA

Term used to refer to data or variables for a higher level unit (for example, a group) constructed by combining information for the lower level units of which the higher level unit is composed (for example, individuals within the group). Examples of aggregate data include summaries of the properties of individuals comprising a group, for example, the percentage of persons in a neighbourhood with complete high school or the mean income of state residents. Implicit in most uses of the term aggregate data is the idea that aggregate variables are merely summaries of the properties of lower level units and not measures of higher level properties themselves (although this is not necessarily true in all cases, see derived variables).

## ATOMISTIC FALLACY

The fallacy sometimes present when drawing inferences regarding variability across groups (or the relation between group level variables) based on individual level data, or more generally, the fallacy of drawing inferences regarding variability across units defined at a higher level based on data collected for units at a lower level. The atomistic fallacy arises because associations between two variables at the individual level may differ from associations between analogous variables measured at the group level. For example, a study of individuals may find that increasing individual level income is associated with decreasing coronary heart disease mortality. If it is inferred from these data that at the country level, increasing per capita income is associated with decreasing coronary heart disease mortality, the researcher may be committing the atomistic fallacy (because across countries, increasing per capita income may actually be associated with *increasing* coronary heart disease mortality). The sources of the atomistic fallacy are similar to those of the ecologic fallacy. In the atomistic fallacy, the conceptual model being tested corresponds to the higher level, but the data are collected for a lower level.^{1,}^{2} The atomistic fallacy has sometimes been referred to as the individualistic fallacy.^{3,}^{4}

## COMPOSITIONAL EFFECTS

When inter-group (or inter-context) differences in an outcome (for example, disease rates) are attributable to differences in group composition (that is, in the characteristics of the individuals of which the groups are comprised) they are said to result from compositional effects.^{5} On the other hand, when group differences are attributable to the effects of group level variables or properties, they are said to result from contextual effects.
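A minimal Python sketch of a purely compositional difference, using made-up numbers. The two neighbourhoods, the income strata, and the rates are all hypothetical: both neighbourhoods share identical stratum specific disease rates and differ only in the share of low income residents, yet their crude rates differ.

```python
# Hypothetical stratum specific disease rates, identical in both neighbourhoods.
RATE_BY_INCOME = {"low": 0.20, "high": 0.10}

# Hypothetical composition: share of residents in each income stratum.
COMPOSITION = {
    "neighbourhood_A": {"low": 0.8, "high": 0.2},
    "neighbourhood_B": {"low": 0.2, "high": 0.8},
}


def crude_rate(composition, rates):
    """Crude rate as the composition-weighted average of stratum specific rates."""
    return sum(share * rates[stratum] for stratum, share in composition.items())


crude = {name: crude_rate(comp, RATE_BY_INCOME) for name, comp in COMPOSITION.items()}
for name, rate in crude.items():
    print(name, round(rate, 3))
```

Because the stratum specific rates are the same everywhere in this toy example, the whole difference in crude rates is attributable to group composition; a contextual effect would instead show up as different rates for comparable individuals in different neighbourhoods.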

## CONTEXTUAL ANALYSIS

An analytical approach originally used in sociology to investigate the effect of collective or group characteristics on individual level outcomes.^{4,}^{6,}^{7} In contextual analysis, group level predictors (often constructed by aggregating the characteristics of individuals within groups) are included together with individual level variables in standard regressions with individuals as the units of analysis (contextual effects models). This approach permits the simultaneous examination of how individual level and group level variables are related to individual level outcomes. It thus allows for macro processes that are presumed to have an impact on individuals over and above the effects of individual level variables.^{6} The terms “contextual analysis” and multilevel analysis have sometimes been used synonymously,^{8–}^{10} and both approaches are similar in allowing the investigation of how group level (or macro) and individual level (or micro) variables (as well as their interactions) are related to individual level outcomes. However, multilevel models are more general than the original contextual models in that (1) they allow (and account for) the possibility of residual correlation between individuals within groups; and (2) they allow examination of between group variability and the factors associated with it. In contrast, contextual models often do not account for residual correlation (although they can be modified to do so) and do not allow the examination of inter-group variability or of the factors associated with it (see also variance components).

## CONTEXTUAL EFFECTS

Term generally used to refer to the effects of variables defined at a higher level (usually at the group level) on outcomes defined at a lower level (usually at the individual level) after controlling for relevant individual level (lower level) confounders. The term is most often used to refer to the effect of a derived group level variable (for example, mean neighbourhood income) on an individual level outcome (such as blood pressure) after controlling for its individual level namesake (for example, individual level income).^{6,}^{11} However, “contextual effects” is also sometimes used to refer to the effects of group level variables generally be they derived variables or integral variables, and can apply to any situation involving lower level units nested within higher level units (for example, contextual effects of country characteristics on disease rates for small areas, contextual effects of tissue characteristics on cell biology). Contextual effects are sometimes contrasted with compositional effects.^{5}

## CONTEXTUAL EFFECTS MODELS

Regression models with individuals as the units of analysis that include both group level and individual level variables as predictors of individual level outcomes. Traditional contextual effects models are equivalent to multilevel models in which all coefficients are modelled as fixed (that is, no error terms are included in the group level or level 2 equations, see multilevel models). See contextual analysis.

## CONTEXTUAL VARIABLES

See derived variables and group level variables.

## CROSS LEVEL EFFECTS

Term used to refer to the main effects of higher level variables (for example, group level variables) on outcomes at a lower level (for example, individual level outcomes) as well as to modifications of the effects of lower level (individual level) variables by higher level (group level) variables (see cross level interaction).^{12} Examples include the effect of country level income inequality on individual level self reported health (effect of a higher level variable on outcomes at a lower level), and the presence of stronger associations between individual level income and self reported health in the presence of high country level income inequality (modifications of the effects of lower level variables by higher level variables). The term “ecological effects” has sometimes been used as a synonym for “cross level effects”.^{12}

## CROSS LEVEL INFERENCE

The drawing of inferences regarding factors associated with variability in the outcome at one level based on data collected at another level (for example, drawing inferences regarding relations between individual level variables based on group level associations, or vice versa). See ecologic fallacy and atomistic fallacy.

## CROSS LEVEL INTERACTION

Refers to the interaction between higher level and lower level variables, that is, to modification of the effects of lower level variables by characteristics of the higher level units to which the lower level units belong (or vice versa).^{5,}^{12} For example, if the relation between individual level income and blood pressure differs by neighbourhood characteristics (that is, neighbourhood and individual level variables interact), there is said to be a cross level interaction. In multilevel models, whenever group specific estimates of the effect of a lower level variable are modelled as a function of higher level (group level) variables (as in equation (3) under the entry for multilevel models), a cross level interaction appears in the final model (γ_{11} *G*_{j} *I*_{ij} in equation (4) under multilevel models).
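As a small illustration, the following Python sketch computes the group specific effect of an individual level variable as a function of a group level variable, in the spirit of equation (3) under multilevel models. The coefficient values and the function name are hypothetical, chosen only to make the arithmetic visible.

```python
def group_specific_slope(gamma10, gamma11, g_j, u1j=0.0):
    """Effect of the individual level variable in group j.

    gamma10: common slope across groups
    gamma11: change in the slope per unit of the group level variable
             (the cross level interaction)
    g_j:     value of the group level variable for group j
    u1j:     group specific random deviation of the slope (0 by default)
    """
    return gamma10 + gamma11 * g_j + u1j


# Hypothetical values: the income slope is -2.0 where G = 0 and is
# weakened by 0.5 for every unit increase in the group level variable.
GAMMA10, GAMMA11 = -2.0, 0.5
slope_at_g0 = group_specific_slope(GAMMA10, GAMMA11, g_j=0.0)
slope_at_g2 = group_specific_slope(GAMMA10, GAMMA11, g_j=2.0)
print(slope_at_g0, slope_at_g2)
```

If γ_{11} were 0, the slope would be the same in every group (apart from random variation) and there would be no cross level interaction.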

## DERIVED VARIABLES

A type of group level variable constructed by mathematically summarising the characteristics of individuals in the group (for example, means, proportions, or measures of dispersion, such as, percentage of persons with incomplete high school, mean income, standard deviation of the income distribution).^{11,}^{13} Some derived variables have no individual level analogue (for example, standard deviation of the income distribution) and therefore necessarily refer to group level constructs. Others (for example, mean neighbourhood income) do have individual level analogues (for example, individual level income), but may provide information on group level constructs, distinct from their individual level namesake. The mean of the dependent variable in the group (for example, proportion infected in a study of the causes of infection) can be thought of as a special type of derived variable.^{14} Although derived and integral variables are sometimes presented as conceptually distinct, they are closely interrelated. Derived variables often operate by shaping certain integral properties of the group. For example, the composition of a group may influence the predominant types of interpersonal contacts, values, and norms or may shape organisations or regulations within the group that affect all members.^{15} The terms “analytical variables” and “aggregate variables” have been used as synonyms for “derived variables”. The term “contextual variables” has also been used as a synonym for “derived variables” ^{14} although it is sometimes used to refer to group level variables generally.^{6,}^{13}

## ECOLOGICAL FALLACY

The fallacy sometimes present when drawing inferences at the individual level (that is, regarding relations between individual level variables) based on group level data. The ecological fallacy arises because associations between two variables at the group level (or ecological level) may differ from associations between analogous variables measured at the individual level. These differences between individual level and group level associations were first described for correlation coefficients^{16} but may also be present for other measures of association such as regression coefficients.^{11,}^{17} More generally, the fallacy may occur whenever data for units at a higher level are used to draw inferences regarding factors associated with variability across units at a lower level, that is, when the conceptual model being tested corresponds to the lower level, but the data are collected for a higher level.^{1,}^{2} Suppose a researcher finds that at the country level, increasing per capita income is associated with increasing mortality attributable to traffic accidents. If the researcher infers that at the individual level, increasing personal income is associated with increasing motor vehicle related mortality, he or she may be committing the ecological fallacy, because within countries, motor vehicle related mortality may always be lower in high income than in low income persons. In the case of regression coefficients, the sources of the ecological fallacy include (1) the lack of information on constructs pertaining to a lower level of organisation; and (2) the failure to realise that a variable defined and measured at one level of organisation may tap into a different construct than its namesake at another level.^{18}
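The divergence between group level and individual level associations can be sketched with a toy dataset in Python. All numbers are invented for illustration: within each of three hypothetical countries the outcome falls as individual income rises, while across countries the mean outcome rises with mean income.

```python
def slope(xs, ys):
    """Ordinary least squares slope of ys regressed on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical (income, mortality risk score) pairs for persons in three countries.
countries = {
    "A": [(1, 3), (2, 2), (3, 1)],
    "B": [(4, 6), (5, 5), (6, 4)],
    "C": [(7, 9), (8, 8), (9, 7)],
}

# Individual level association, estimated separately within each country.
within_slopes = {
    name: slope([x for x, _ in people], [y for _, y in people])
    for name, people in countries.items()
}

# Group level association, estimated from the country means only.
mean_income = [sum(x for x, _ in p) / len(p) for p in countries.values()]
mean_risk = [sum(y for _, y in p) / len(p) for p in countries.values()]
between_slope = slope(mean_income, mean_risk)

print(within_slopes, between_slope)
```

Reading the positive between country slope as a statement about individuals would be an instance of the ecological fallacy; reading the negative within country slopes as a statement about countries would be an instance of the atomistic fallacy.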

## EMPIRICAL BAYES ESTIMATES

Estimates of parameters for a given group or higher level unit (for example, estimates of group specific intercepts or slopes, such as b_{0j} and b_{1j} in equation (1), under multilevel models) obtained by combining information from the group itself with information from other similar groups investigated.^{10,}^{19,}^{20} This is particularly useful when estimating parameters for a group with few within group observations. These estimates are “optimally” weighted averages that combine information derived from the group itself with the mean for all similar groups. The weighted average shifts the group specific estimate (derived using data only for that particular group) towards the mean for similar groups. The less precise the group specific estimate and the less the variability observed across groups, the greater the shift towards the overall group mean. Thus, the estimate for a given group is based not only on its own data but also on estimates for other groups and the characteristics groups share.^{20} Empirical Bayes estimates of parameters for a given group can be derived from multilevel models using estimates of the group level errors (for example, U_{0j} and U_{1j}, see multilevel models) for that particular group. Empirical Bayes estimates are also sometimes referred to as “shrinkage estimates” because they “shrink” the group specific estimate towards the overall mean (although in fact when the overall mean is greater than the group specific estimate, the “shrunken” or empirical Bayes estimate may actually be greater than the group specific estimate).

In public health, empirical Bayes estimation can be used, for example, to derive improved estimates of rates of death or diseases for small areas with few observations,^{21} or to estimate rates of different health outcomes for individual providers (hospitals, physicians, etc).^{22} In other applications (which do not involve the structure of individuals within groups although they are analogous to it), empirical Bayes estimates of regression coefficients have been used to obtain improved estimates of associations in studies investigating the role of multiple exposures.^{23}
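The weighting can be sketched in Python for the simplest case. This sketch assumes a random intercept model without covariates and treats the between group variance (τ_{00}), the within group variance (σ^{2}), and the overall mean as known; in practice they are estimated from the data. The function names and numbers are hypothetical.

```python
def shrinkage_weight(tau00, sigma2, n_j):
    """Weight given to the group's own mean: between group variance divided by
    between group variance plus the sampling variance of the group mean."""
    return tau00 / (tau00 + sigma2 / n_j)


def empirical_bayes_mean(group_mean, n_j, overall_mean, tau00, sigma2):
    """Weighted average of the group specific mean and the overall mean."""
    w = shrinkage_weight(tau00, sigma2, n_j)
    return w * group_mean + (1.0 - w) * overall_mean


# Hypothetical values: overall mean 100, tau00 = 4, sigma2 = 16.
OVERALL, TAU00, SIGMA2 = 100.0, 4.0, 16.0
small_group = empirical_bayes_mean(110.0, n_j=2, overall_mean=OVERALL,
                                   tau00=TAU00, sigma2=SIGMA2)
large_group = empirical_bayes_mean(110.0, n_j=64, overall_mean=OVERALL,
                                   tau00=TAU00, sigma2=SIGMA2)
low_group = empirical_bayes_mean(90.0, n_j=2, overall_mean=OVERALL,
                                 tau00=TAU00, sigma2=SIGMA2)
print(round(small_group, 2), round(large_group, 2), round(low_group, 2))
```

Under these assumptions the group with two observations is pulled much closer to the overall mean than the group with 64, and a group whose own mean lies below the overall mean is shifted upwards, which is why “shrunken” estimates are not always smaller.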

## ENVIRONMENTAL VARIABLES

In the context of ecological studies and multilevel analysis, the term “environmental variables” has sometimes been used to refer to group level measures of physical or chemical exposures. Environmental variables, so defined, have been proposed as a “type” of group level variable, distinct from derived variables and integral variables.^{11} These variables are not derived by aggregating the characteristics of individuals but they do have group level and individual level analogues (for example, days of sunlight in the community and individual level sunlight exposure information). In contrast with derived and integral variables, which may be used as indicators of group level constructs, group level environmental variables are used exclusively as proxies for individual level exposures (which may be more difficult to measure for logistic or methodological reasons), rather than as indicators of a group level property, which is conceptually different from the analogous measure at the individual level.

## FIXED EFFECTS/FIXED COEFFICIENTS

Regression coefficients (intercepts or covariate effects) that are not allowed to vary randomly across higher level units (see multilevel models). For example, in the case of persons nested within neighbourhoods, two options are available for modelling the effects of neighbourhood. One option is to include a dummy variable for each neighbourhood. In this case the neighbourhood coefficients are modelled as fixed (sometimes called “fixed effects”). Another option is to assume that the neighbourhoods in the sample are a random sample of a larger population of neighbourhoods and that the coefficients for the “neighbourhood effect” vary randomly around an overall mean (for example, as reflected by U_{0j} in equation (2) under the entry for multilevel models). In this case, the neighbourhood effects are modelled as random (sometimes called “random effects”, see random effects models). In the same example, the coefficients for individual level covariates can also be modelled as fixed or random. For example, if the relation between individual level income and blood pressure is not allowed to vary randomly across neighbourhoods, the coefficient for individual level income is fixed (“fixed coefficient”). On the other hand, if the coefficient for individual level income is allowed to vary randomly across neighbourhoods around an overall mean effect (as reflected by U_{1j} in equation (3) under the entry for multilevel models), the coefficient for income is modelled as random (sometimes called a “random coefficient”, see random coefficient models). Although the terms “fixed effects” and “fixed coefficients” are sometimes distinguished as noted above, they are often used interchangeably. Fixed effects models or fixed coefficient models are models in which all effects or coefficients are fixed. See also random effects/random coefficients.

## GROUP LEVEL VARIABLES

Term used to refer to variables that characterise groups. The terms group level variables, macro variables and ecological variables are often used interchangeably.^{2,}^{6,}^{11,}^{14,}^{24} Group level variables may be used as proxies for unavailable or unreliable individual level data (for example, when neighbourhood mean income is used as a proxy for the individual level income of individuals living in the neighbourhood) or as indicators of group level constructs (for example, when mean neighbourhood income is used as an indicator of neighbourhood characteristics that may be related to individual level outcomes independently of individual level income). It is the second usage (as indicators of group level constructs) that is of particular interest in multilevel analysis. Group level variables have been classified into two basic types:^{11,}^{13,}^{24} derived variables and integral variables. Two additional types of group level variables, structural variables^{13} and environmental variables,^{11} are sometimes distinguished. The term contextual variables has been used as a synonym for group level variables generally^{6,}^{13} although it is sometimes reserved for derived group level variables.^{11,}^{14}

## HIERARCHICAL (LINEAR) MODELS

See multilevel models.

## INDIVIDUAL LEVEL VARIABLES

Term used to refer to variables that characterise individuals and refer to individual level constructs (for example, age or personal income).

## INDIVIDUALISTIC FALLACY

Term used as a synonym for the atomistic fallacy. May sometimes also be used as a synonym for the psychologistic fallacy.

## INTEGRAL VARIABLES

A type of group level variable. Integral variables differ from derived variables (another type of group level variable) in that they are not summaries of the characteristics of individuals in the group. Integral variables have no individual level analogues and necessarily refer to group level constructs. Examples of integral variables include the existence of certain types of laws, political or economic system, social disorganisation, or population density.^{11,}^{13} Integral variables have also been referred to as primary or global variables.

## INTRACLASS CORRELATION

A measure of the degree of resemblance between lower level units belonging to the same higher level unit or cluster.^{25} In the case of individuals nested within groups (for example, neighbourhoods), the intraclass correlation measures the extent to which values of the dependent variable are similar for individuals belonging to the same group. It can be thought of as the average correlation between values of two randomly drawn lower level units (for example, individuals) in the same, randomly drawn higher level unit (for example, neighbourhood). It can also be defined as the proportion of the variance in the outcome that is between the groups or higher level units. In the case of a simple random intercept model, the intraclass correlation coefficient is estimated by the ratio of the population variance between groups (τ_{00}) to the total variance (τ_{00} + σ^{2}) (see multilevel models).^{25} The estimation of the intraclass correlation coefficient in models including random covariate effects, or in the case of non-normally distributed dependent variables, is more complex and not always straightforward.
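For the simple random intercept case, the ratio can be written as a short Python function. This is only a sketch: the variance components are passed in as if already estimated, and the function name and example values are hypothetical.

```python
def intraclass_correlation(tau00, sigma2):
    """Share of the total variance that lies between groups:
    tau00 / (tau00 + sigma2), for a simple random intercept model."""
    if tau00 < 0 or sigma2 < 0 or tau00 + sigma2 == 0:
        raise ValueError("variances must be non-negative and not both zero")
    return tau00 / (tau00 + sigma2)


# Hypothetical variance components: 2 between groups, 8 within groups.
icc = intraclass_correlation(tau00=2.0, sigma2=8.0)
print(icc)
```

With these values one fifth of the variance in the outcome lies between groups; the same function does not apply to models with random covariate effects or non-normal outcomes, where the calculation is less straightforward.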

## MARGINAL MODELS

See population-average models.

## MIXED MODELS

Term used to refer to models that contain a mixture of fixed effects (or fixed coefficients) and random effects (or random coefficients). In mixed models some of the regression coefficients (intercepts or covariate effects) are allowed to vary randomly across higher level units but others are not (see multilevel models). Thus mixed models can be thought of as a particular case of the more general multilevel models (although the term is also occasionally used as a synonym of multilevel models generally). Sometimes the term mixed models is also used to encompass models that account for correlation between lower level units (for example, individuals) within higher level units (for example, neighbourhoods) in other ways—that is, by modelling the correlations or covariances themselves rather than by allowing for random effects or random coefficients.^{26} These models (which are not multilevel models) have also been called covariance pattern models,^{26} marginal models, or population average models.

## MULTILEVEL ANALYSIS

An analytical approach that is appropriate for data with nested sources of variability, that is, involving units at a lower level or micro units (for example, individuals) nested within units at a higher level or macro units (for example, groups such as schools or neighbourhoods).^{5,}^{10,}^{19,}^{24,}^{25,}^{27–}^{30} Multilevel analysis allows the simultaneous examination of the effects of group level and individual level variables on individual level outcomes while accounting for the non-independence of observations within groups. Multilevel analysis also allows the examination of both between group and within group variability as well as how group level and individual level variables are related to variability at both levels. Thus, multilevel models can be used to draw inferences regarding the causes of inter-individual variation (or the relation of group and individual level variables to individual level outcomes) but inferences can also be made regarding inter-group variation, whether it exists in the data, and to what extent it is accounted for by group and individual level characteristics. In multilevel analysis, groups or contexts are not treated as unrelated but are conceived as coming from a larger population of groups about which inferences are to be made. Multilevel analysis thus allows researchers to deal with the micro-level of individuals and the macro-level of groups or contexts simultaneously.^{5}

Multilevel analysis has a broad range of applications in many situations involving nested sources of random variability such as persons nested within neighbourhoods,^{5,}^{30} patients nested within providers,^{31} meta analysis (observations nested within sites),^{19,}^{32} longitudinal data analysis (repeat measurements over time nested within persons),^{28,}^{33,}^{34} multivariate responses (multiple outcomes nested within individuals),^{5} the analysis of repeat cross sectional surveys (multiple observations nested within time periods),^{35} the examination of geographical variations in rates (rates for smaller areas nested within regions or larger areas),^{36} and the examination of interviewer effects (respondents nested within interviewers).^{37} Multilevel analysis can also be used in situations involving multiple nested contexts^{19,}^{28} (for example, multiple measures over time on individuals nested within neighbourhoods) as well as overlapping or cross classified contexts (for example, children nested within neighbourhoods and schools).^{38} The statistical models used in multilevel analysis are referred to as multilevel models^{25,}^{28,}^{29} or hierarchical linear models.^{19,}^{39}

## MULTILEVEL MODELS

The statistical models used in multilevel analysis.^{19,}^{25,}^{28,}^{29} The terms “hierarchical models” and “multilevel models” are often used synonymously. These models (or variants of them) have previously appeared in different literatures under a variety of names including random effects models or random coefficient models,^{40–}^{42} “covariance components models” or “variance components models”,^{43,}^{44} and mixed models.^{26} A simplified example for the case of a normally distributed dependent variable, a single individual level (lower level unit) predictor and a single group level (higher level unit) predictor is provided below. Analogous models can be formulated for non-normally distributed dependent variables.^{10,}^{28,}^{39,}^{45}

In the case of multilevel analysis involving two levels (for example, individuals nested within groups), the multilevel model can be conceptualised as a two stage system of equations.

In the first stage (level 1), a separate individual level regression is defined for each group or higher level unit:

$$Y_{ij} = b_{0j} + b_{1j} I_{ij} + e_{ij} \qquad (1)$$

*Y*_{ij} = outcome variable for i^{th} individual in j^{th} group

*I*_{ij} = individual level variable for i^{th} individual in j^{th} group

b_{0j} is the group specific intercept

b_{1j} is the group specific effect of the individual level variable

Individual level errors (e_{ij}) are assumed to be independent and identically distributed with a mean of 0 and a variance of σ^{2}. The same regressors are generally used in all groups, but regression coefficients (b_{0j} and b_{1j}) are allowed to vary from one group to another.

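In a second stage (level 2), each of the group or context specific regression coefficients defined in equation (1) (b_{0j} and b_{1j} in this example) is modelled as a function of group level (or higher level) variables:

$$b_{0j} = \gamma_{00} + \gamma_{01} G_{j} + U_{0j} \qquad (2)$$

$$b_{1j} = \gamma_{10} + \gamma_{11} G_{j} + U_{1j} \qquad (3)$$

*G*_{j} = group level variable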

γ_{00} is the common intercept across groups

γ_{01} is the effect of the group level predictor on the group specific intercepts

γ_{10} is the common slope associated with the individual level variable across groups

γ_{11} is the effect of the group level predictor on the group specific slopes

The errors in the level 2 equations (U_{0j} and U_{1j}), sometimes called “macro errors”, are assumed to be normally distributed with mean 0 and variances τ_{00} and τ_{11} respectively. τ_{01} represents the covariance between intercepts and slopes. Thus, multilevel analysis summarises the distribution of the group specific coefficients in terms of two parts: a “fixed” part that is common across groups (γ_{00} and γ_{01} for the intercept, and γ_{10} and γ_{11} for the slope) and a “random” part (U_{0j} for the intercept and U_{1j} for the slope) that is allowed to vary from group to group (see also fixed coefficients and random coefficients).

By including an error term in the group level equations (equations (2) and (3)), these models allow for sampling variability in the group specific coefficients (b_{0j} and b_{1j}) and also for the fact that the group level equations are not deterministic (that is, the possibility that not all relevant macro-level variables have been included in the model). The underlying assumption is that group specific intercepts and slopes are random samples from a normally distributed population of group specific intercepts and slopes, or alternatively, that the macro errors are exchangeable—that is, that the residual variation in group specific coefficients across groups is unsystematic.^{10}

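An alternative way to present the model fitted in multilevel analysis is to substitute equations (2) and (3) in (1) to obtain:

$$Y_{ij} = \gamma_{00} + \gamma_{01} G_{j} + \gamma_{10} I_{ij} + \gamma_{11} G_{j} I_{ij} + U_{0j} + U_{1j} I_{ij} + e_{ij} \qquad (4)$$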

The model includes the effects of group level variables (γ_{01}), individual level variables (γ_{10}) and their interaction (γ_{11}) on the individual level outcome *Y*_{ij}. These coefficients (γ_{01}, γ_{10} and γ_{11}), which are common to all individuals regardless of the group to which they belong, are often called the fixed coefficients (or fixed effects). The model also includes a random intercept component (U_{0j}), and a random slope component (U_{1j}). The values of these components vary randomly across groups, and hence U_{0j} and U_{1j} are referred to as the random coefficients (or random effects). The parameters of the above equations (fixed effects, random effects, variances of the random effects, and residual variance) are simultaneously estimated using iterative methods. The level 1 and level 2 variances and covariance (σ^{2}, τ_{00}, τ_{11}, τ_{01}) are called the (co)variance components.

Many variants of the more general model illustrated above are possible. For example, only group specific intercepts (b_{0j}) may be modelled as random (these models have also been called random effects models). When covariate effects (b_{1j} in the example above) are modelled as random these models have also been called random coefficient models. When some of the coefficients are fixed and others are random these models have also been called “mixed effects models” or simply mixed models. When all coefficients are modelled as fixed (no random errors are included in level 2 equations) these models are reduced to traditional contextual effects models. Multilevel models can also account for multiple nested contexts (or levels),^{19,}^{28} allowing fixed and random coefficients to be associated with variables measured at different levels of the data hierarchy being analysed. Multilevel models can also be modified to allow for non-hierarchical, overlapping or cross classified contexts (for example, children simultaneously nested within neighbourhoods and schools).^{38}
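The two stage structure described in this entry can be sketched as a small data generating routine in Python. This is an illustration under stated assumptions, not an estimation procedure: the function name, the parameter values, and the standard normal individual level predictor are hypothetical, and the intercept and slope errors are drawn independently (τ_{01} = 0) to keep the sketch short.

```python
import random


def simulate_two_level(group_values, n_per_group, gamma, tau00, tau11, sigma2, seed=0):
    """Generate (group index, G_j, I_ij, Y_ij) rows from the two stage model.

    gamma is the tuple (gamma00, gamma01, gamma10, gamma11).
    Simplifying assumption: U_0j and U_1j are independent (tau01 = 0).
    """
    rng = random.Random(seed)
    g00, g01, g10, g11 = gamma
    rows = []
    for j, g_j in enumerate(group_values):
        u0j = rng.gauss(0.0, tau00 ** 0.5)   # macro error for the intercept
        u1j = rng.gauss(0.0, tau11 ** 0.5)   # macro error for the slope
        b0j = g00 + g01 * g_j + u0j          # level 2: group specific intercept
        b1j = g10 + g11 * g_j + u1j          # level 2: group specific slope
        for _ in range(n_per_group):
            i_ij = rng.gauss(0.0, 1.0)       # hypothetical individual level predictor
            e_ij = rng.gauss(0.0, sigma2 ** 0.5)
            y_ij = b0j + b1j * i_ij + e_ij   # level 1: individual level regression
            rows.append((j, g_j, i_ij, y_ij))
    return rows


rows = simulate_two_level(
    group_values=[0.0, 1.0, 2.0], n_per_group=4,
    gamma=(10.0, 2.0, 1.0, 0.5), tau00=1.0, tau11=0.25, sigma2=4.0,
)
print(len(rows))
```

Setting τ_{00} and τ_{11} to zero removes the random part, so the generated outcomes follow a traditional contextual effects model with only fixed coefficients plus individual level error.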

## NON-INDEPENDENCE OF OBSERVATIONS

Refers to situations in which dependent variables for observations at a lower level nested within the same higher level unit (or cluster) are correlated, even after measured characteristics are taken into account. For example, two persons from the same neighbourhood may tend to have more similar blood pressure levels than two persons from different neighbourhoods, even after measured individual and neighbourhood characteristics are taken into account. In the case of repeat measures on individuals over time, two blood pressure measurements on the same person may tend to be more similar than two measures on different persons even after relevant covariates are taken into account. One reason for this correlation may have to do with the omission of important higher level variables that observations within the same higher level unit share. This residual correlation violates the assumption of independence of observations underlying usual regression approaches. Ignoring this correlation may lead to incorrect inferences. Efficiency of estimation may also be reduced.^{40} Multilevel models account for potential residual correlation by modelling intercepts and regression coefficients as random (for example, by allowing for macro level errors, *U*_{0j} and *U*_{1j} in second level equations, see multilevel models).

## POPULATION-AVERAGE MODELS

Models that account for correlation between lower level units within higher level units (or clusters) by modelling the correlations or covariances themselves rather than by allowing for random effects or random coefficients as multilevel models do.^{40,}^{46} These correlations are taken into account in the estimation of regression coefficients and their standard errors. Different correlation structures (describing within cluster or within higher level unit correlations) can be specified. “Population-average models” are also referred to as “marginal models”^{40,}^{46} or “covariance pattern models”.^{26} Whereas multilevel models model the dependent variable conditional on the random effects (or random coefficients), population-average models model the marginal expectation of the dependent variables across the population (in a sense, “averaged” across the random effects). For this reason, marginal models have also been called “population-average” models (as a way to contrast them with subject specific random effects models).^{46} The Generalised Estimating Equation (GEE) approach is one approach to fitting marginal models.^{46}

Population-average models model the population-average response as a function of covariates without explicitly accounting for heterogeneity across higher level units.^{46} In contrast, multilevel models investigate and explain the source of group to group variation (and of the within group correlation) by modelling group specific regression coefficients as a function of group level variables plus random variation. Therefore, although population-average models account for the correlation between outcomes within higher level units, the source of this correlation is not directly investigated (the correlation, and sometimes higher level effects themselves, are viewed as nuisance parameters that must be taken into account but are not of direct interest). Consequently, population-average models do not allow examination of group to group variation, of the group level or individual level variables related to it, or of the degree of variation present between and within groups, as multilevel models do (see variance components). Differences between the two types of models also have consequences for the interpretation of regression coefficients: in the multilevel model, the regression coefficient estimates how the response changes as a function of covariates *conditional* on the random effects; in the marginal model, the coefficient expresses how the response changes as a function of covariates “averaged” over group to group heterogeneity (or group random effects).^{40,}^{46} In linear models for continuous dependent variables these coefficients are mathematically equivalent, but in models for non-normally distributed dependent variables (for example, logistic models) the marginal parameter values will usually be smaller in absolute value than their random effects analogues.^{46,}^{47}
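
The attenuation of marginal coefficients in logistic models can be checked numerically. The sketch below is an illustration under stated assumptions, not a fitting procedure: it assumes a random intercept logistic model with a single binary covariate, a normally distributed random intercept with variance `tau00`, and hypothetical parameter values. It averages the conditional probabilities over the random intercept on a simple grid and then compares the resulting marginal log odds ratio with the conditional (subject specific) coefficient.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    return math.log(p / (1.0 - p))


def marginal_probability(linear_predictor, tau00, n_points=2001):
    """Average P(Y = 1) over a random intercept U ~ N(0, tau00).

    Uses an evenly spaced grid over +/- 8 standard deviations, weighted by
    the (unnormalised) normal density; adequate for illustration only.
    """
    sd = math.sqrt(tau00)
    if sd == 0.0:
        return expit(linear_predictor)  # no heterogeneity to average over
    low = -8.0 * sd
    step = 16.0 * sd / (n_points - 1)
    total = weight_sum = 0.0
    for k in range(n_points):
        u = low + k * step
        w = math.exp(-0.5 * (u / sd) ** 2)
        total += w * expit(linear_predictor + u)
        weight_sum += w
    return total / weight_sum


def marginal_log_odds_ratio(gamma00, gamma10, tau00):
    """Population-average log odds ratio implied by the conditional model."""
    p0 = marginal_probability(gamma00, tau00)            # covariate = 0
    p1 = marginal_probability(gamma00 + gamma10, tau00)  # covariate = 1
    return logit(p1) - logit(p0)


conditional = 1.0  # hypothetical subject specific log odds ratio
marginal = marginal_log_odds_ratio(gamma00=0.0, gamma10=conditional, tau00=4.0)
print(f"conditional: {conditional:.3f}  marginal: {marginal:.3f}")
```

When `tau00` is set to zero the two coefficients coincide; the larger the between group variance, the further the marginal coefficient moves towards zero.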

## PSYCHOLOGISTIC FALLACY

An inferential fallacy that may arise from the failure to consider group characteristics in drawing inferences regarding the causes of variability across individuals^{1,}^{2}—that is, assuming that individual level outcomes can be explained exclusively in terms of individual level characteristics. Although the level at which data are collected may fit the conceptual model being investigated (that is, individual level), important facts pertaining to other levels (that is, group level) may have been ignored.^{1,}^{2} For example, a study based on individuals might find that immigrants are more likely to develop depression than natives. But suppose this is only true for immigrants living in communities where they are a small minority. A researcher ignoring the contextual effect of community composition might attribute the higher overall rate in immigrants to the psychological effects of immigration or to genetic factors, ignoring the importance of community level factors and thus committing the psychologistic fallacy.^{1} The term “psychologistic fallacy” is not entirely appropriate because the individual level factors used to explain the outcome are not always exclusively psychological.^{2} Although the term “individualistic fallacy” may appear more adequate, it has also been used as a synonym for the related but distinct atomistic fallacy.^{3,}^{4} See also sociologistic fallacy.

## RANDOM COEFFICIENT MODELS

Term originally used for models in which the regression coefficients corresponding to covariates in the model are treated as random rather than fixed^{19,}^{26} (that is, models containing random coefficients, see for example *b*_{1j} in the entry for multilevel models). Traditional random coefficient models do not include higher level (or group level) predictors in the group level equations for the covariate effects (that is, in a traditional random coefficient model, equation (3) would be *b*_{1j} = *γ*_{10} + *U*_{1j}).^{19} Thus random coefficient models can be thought of as a particular case of the more general multilevel models. However, the term random coefficient models is sometimes used more broadly to refer to multilevel models generally. See also random effects models.

## RANDOM EFFECTS/RANDOM COEFFICIENTS

Regression coefficients (intercepts or covariate effects) that are allowed to vary randomly across higher level units (that is, are assumed to be realisations of values from a probability distribution) (see multilevel models). For example, in the case of persons nested within neighbourhoods, neighbourhood effects can be assumed to vary randomly around an overall mean (random effect, see random effects models). Similarly, the effect of personal income on individual health may be allowed to vary randomly across neighbourhoods (random coefficient, see random coefficient models). Although the terms “random effects” and “random coefficients” are sometimes distinguished as noted above, they are often used interchangeably. The use of random effects or random coefficients is especially appropriate when the higher level units (or groups) can be thought of as random samples from a larger population of units (or groups) about which inferences are to be made. See also fixed effects/fixed coefficients.

## RANDOM EFFECTS MODELS

Term originally used for models in which differences across groups (or other classification system) are treated as random rather than fixed^{19,}^{26} (that is, models containing random effects). For example, in the case involving individuals nested within neighbourhoods, a model treating neighbourhood differences as fixed would include all neighbourhoods represented in the sample as a set of dummy variables in a regression equation with individuals as the units of analysis (see fixed effects/fixed coefficients). In contrast, a random effects model would treat neighbourhood differences as realisations from a probability distribution; that is, neighbourhood intercepts would be allowed to vary randomly across neighbourhoods following a probability distribution (see multilevel models). An underlying assumption is that the neighbourhoods in the study are a random sample from a larger population of neighbourhoods about which inferences are to be made. Random effects models can be thought of as a particular case of the more general multilevel models in which only intercepts are allowed to vary randomly across groups (that is, random intercept models). Sometimes, however, the term random effects models is used more broadly to refer to multilevel models generally (that is, models that allow for both random intercepts and random covariate effects). See also random coefficient models.
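
The practical difference between the two treatments can be sketched for a model with no covariates. In the example below the blood pressure values, the neighbourhood labels, and the variance components `tau00` and `sigma2` are hypothetical, and the variance components are taken as known rather than estimated, which a real analysis would not do. The dummy variable (fixed) estimate for a neighbourhood is simply its own sample mean, whereas the random intercept prediction is a weighted average of that mean and the overall mean, with less weight on the neighbourhood's own data when it contributes few observations.

```python
def fixed_effect_estimates(groups):
    """Dummy variable (fixed effects) estimates: each neighbourhood's own mean."""
    return {name: sum(ys) / len(ys) for name, ys in groups.items()}


def random_intercept_predictions(groups, tau00, sigma2):
    """Predicted neighbourhood intercepts under a random intercept model.

    tau00 (between neighbourhood variance) and sigma2 (within neighbourhood
    variance) are assumed known here for simplicity.
    """
    means = fixed_effect_estimates(groups)
    # Variance of each neighbourhood mean around the overall mean.
    var_of_mean = {name: tau00 + sigma2 / len(ys) for name, ys in groups.items()}
    # Precision weighted overall mean (plays the role of gamma_00).
    total_weight = sum(1.0 / v for v in var_of_mean.values())
    gamma00 = sum(means[n] / var_of_mean[n] for n in groups) / total_weight
    # Weight given to the neighbourhood's own mean (between 0 and 1).
    weight_own = {n: tau00 / var_of_mean[n] for n in groups}
    return {n: gamma00 + weight_own[n] * (means[n] - gamma00) for n in groups}


# Hypothetical systolic blood pressure values for three neighbourhoods.
groups = {
    "A": [120, 130, 125, 135, 128, 122],
    "B": [150, 160],
    "C": [110, 115, 112, 118],
}

fixed = fixed_effect_estimates(groups)
random_pred = random_intercept_predictions(groups, tau00=100.0, sigma2=50.0)

for name in groups:
    print(f"{name}: fixed {fixed[name]:.1f}  random intercept {random_pred[name]:.1f}")
```

The weighting shown here is the idea behind the empirical Bayes estimates described in their own entry.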

## RESIDUAL CORRELATION

See non-independence of observations.

## SOCIOLOGISTIC FALLACY

An inferential fallacy that may arise from the failure to consider individual level characteristics in drawing inferences regarding the causes of variability across groups.^{1,}^{2} Although the level at which data are collected may fit the conceptual model being investigated (that is, group level), important facts pertaining to other levels (that is, the individual level) may have been ignored.^{1} Suppose a researcher finds that communities with higher rates of transient population have higher rates of schizophrenia, and he/she concludes that higher rates of transient population lead to social disorganisation, breakdown of social networks, and increased risk of schizophrenia among all community inhabitants. But suppose that schizophrenia rates are only increased for transient residents (because transient residents tend to have fewer social ties, and individuals with few social ties are at greater risk of developing schizophrenia). That is, rates of schizophrenia are high for transient residents and low for non-transient residents, regardless of whether they live in communities with a high or a low proportion of transient residents. If this is the case, the researcher would be committing the sociologistic fallacy in attributing the higher schizophrenia rates to social disorganisation affecting all community members rather than to differences across communities in the percentage of transient residents. See also psychologistic fallacy.
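
The arithmetic behind the schizophrenia example can be written out directly. In the sketch below all rates and proportions are hypothetical values chosen for illustration. The individual level rates are identical in every community, yet the community level rate rises with the proportion of transient residents, purely because of differences in composition.

```python
def community_rate(prop_transient, rate_transient, rate_non_transient):
    """Overall community rate as a weighted average of the two resident groups."""
    return prop_transient * rate_transient + (1.0 - prop_transient) * rate_non_transient


# Hypothetical individual level rates, the same in every community.
RATE_TRANSIENT = 0.020
RATE_NON_TRANSIENT = 0.005

# Hypothetical proportion of transient residents in three communities.
proportion_transient = {"low": 0.10, "medium": 0.30, "high": 0.60}

rates = {
    name: community_rate(p, RATE_TRANSIENT, RATE_NON_TRANSIENT)
    for name, p in proportion_transient.items()
}

for name, rate in rates.items():
    print(f"{name} transience: community rate {rate:.4f}")
```

A researcher who sees only the community level rates observes an association between transience and schizophrenia rates even though, in this constructed example, living in a high transience community has no effect on any individual's risk.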

## STRUCTURAL VARIABLES

A type of group level variable that refers to relations or interactions between members of a group,^{13} for example, characteristics of social networks within the group or patterns of contacts or interactions between members of the group. Structural variables are sometimes considered a subtype of integral variables.^{12,}^{18}

## SUBJECT SPECIFIC MODELS

Term used to refer to random effects/random coefficient models (or multilevel models generally) in order to contrast them with population-average models. “Subject specific” is used because the term was originally developed in the context of longitudinal data analysis,^{46} where individuals or subjects are the higher level units and repeat measures are the lower level units. In this case, the fixed effects coefficients derived from a random effects, random coefficient, or multilevel model are conditional on person level (or person specific) random effects, hence the term “subject specific”. More generally, they can be thought of as “higher level unit” specific (or cluster specific), because they are conditional on higher level unit specific (or cluster specific) random effects. For example, in the entry for multilevel models, the estimate of *γ*_{01} is conditional on group level random effects (as reflected by the presence of *U*_{0j} and *U*_{1j}).

## VARIANCE COMPONENTS

Using multilevel models, the total variance in individual level outcomes (or lower level outcomes generally) can be decomposed into variance within and between groups (or higher level units generally). For example, the variance in blood pressure across individuals can be decomposed into variance within and between neighbourhoods. These components are referred to as variance components. The ability to estimate the variance components (which provide important information on the variability in the outcome between and within groups) is a key feature of multilevel models, and is what distinguishes multilevel models from traditional contextual effects models and population-average models. For this reason, multilevel models have also sometimes been referred to as variance component or covariance component models. See also multilevel models.
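
For a balanced design with no covariates, the decomposition can be sketched with the classical one way analysis of variance (method of moments) estimators. This is a simplification chosen for illustration: multilevel software typically estimates variance components by maximum likelihood or related methods, and the blood pressure values and function name below are hypothetical. The names `tau00` and `sigma2` follow the notation used for the between group and within group variances in the entry for intraclass correlation.

```python
def variance_components(groups):
    """Method of moments variance components for a balanced one way layout.

    Returns (tau00, sigma2): the between group and within group variances.
    Assumes every group has the same number of observations.
    """
    sizes = {len(ys) for ys in groups.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes equal group sizes")
    n = sizes.pop()
    j = len(groups)

    means = {name: sum(ys) / n for name, ys in groups.items()}
    grand_mean = sum(means.values()) / j

    ss_between = n * sum((m - grand_mean) ** 2 for m in means.values())
    ss_within = sum((y - means[name]) ** 2 for name, ys in groups.items() for y in ys)

    ms_between = ss_between / (j - 1)
    ms_within = ss_within / (j * (n - 1))

    sigma2 = ms_within
    tau00 = max((ms_between - ms_within) / n, 0.0)  # truncate negative estimates at zero
    return tau00, sigma2


# Hypothetical blood pressure values for four persons in each of three neighbourhoods.
groups = {
    "A": [118, 122, 120, 124],
    "B": [130, 134, 128, 132],
    "C": [140, 138, 144, 142],
}

tau00, sigma2 = variance_components(groups)
share_between = tau00 / (tau00 + sigma2)  # proportion of variance between neighbourhoods
print(f"between: {tau00:.2f}  within: {sigma2:.2f}  share between: {share_between:.3f}")
```

The proportion of the total variance that lies between neighbourhoods is the quantity described under intraclass correlation.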

## REFERENCES

## ATOMISTIC FALLACY

The fallacy sometimes present when drawing inferences regarding variability across groups (or the relation between group level variables) based on individual level data, or more generally, the fallacy of drawing inferences regarding variability across units defined at a higher level based on data collected for units at a lower level. The atomistic fallacy arises because associations between two variables at the individual level may differ from associations between analogous variables measured at the group level. For example, a study of individuals may find that increasing individual level income is associated with decreasing coronary heart disease mortality. If it is inferred from these data that at the country level, increasing per capita income is associated with decreasing coronary heart disease mortality, the researcher may be committing the atomistic fallacy (because across countries, increasing per capita income may actually be associated with *increasing* coronary heart disease mortality). The sources of the atomistic fallacy are similar to those of the ecologic fallacy. In the atomistic fallacy, the conceptual model being tested corresponds to the higher level, but the data are collected for a lower level.^{1,}^{2} The atomistic fallacy has sometimes been referred to as the individualistic fallacy.^{3,}^{4}

## COMPOSITIONAL EFFECTS

When inter-group (or inter-context) differences in an outcome (for example, disease rates) are attributable to differences in group composition (that is, in the characteristics of the individuals of which the groups are comprised) they are said to result from compositional effects.^{5} On the other hand, when group differences are attributable to the effects of group level variables or properties, they are said to result from contextual effects.

## CONTEXTUAL ANALYSIS

An analytical approach originally used in sociology to investigate the effect of collective or group characteristics on individual level outcomes.^{4,}^{6,}^{7} In contextual analysis, group level predictors (often constructed by aggregating the characteristics of individuals within groups) are included together with individual level variables in standard regressions with individuals as the units of analysis (contextual effects models). This approach permits the simultaneous examination of how individual level and group level variables are related to individual level outcomes. It thus allows for macro processes that are presumed to have an impact on individuals over and above the effects of individual level variables.^{6} The terms “contextual analysis” and multilevel analysis have sometimes been used synonymously, ^{8–}^{10} and both approaches are similar in allowing the investigation of how group level (or macro) and individual level (or micro) variables (as well as their interactions) are related to individual level outcomes. However, multilevel models are more general than the original contextual models in that (1) they allow (and account for) the possibility of residual correlation between individuals within groups; and (2) they allow examination of between group variability and the factors associated with it. In contrast, contextual models often do not account for residual correlation (although they can be modified to do so) and do not allow the examination of inter-group variability or of the factors associated with it (see also variance components).

## CONTEXTUAL EFFECTS

Term generally used to refer to the effects of variables defined at a higher level (usually at the group level) on outcomes defined at a lower level (usually at the individual level) after controlling for relevant individual level (lower level) confounders. The term is most often used to refer to the effect of a derived group level variable (for example, mean neighbourhood income) on an individual level outcome (such as blood pressure) after controlling for its individual level namesake (for example, individual level income).^{6,}^{11} However, “contextual effects” is also sometimes used to refer to the effects of group level variables generally be they derived variables or integral variables, and can apply to any situation involving lower level units nested within higher level units (for example, contextual effects of country characteristics on disease rates for small areas, contextual effects of tissue characteristics on cell biology). Contextual effects are sometimes contrasted with compositional effects.^{5}

## CONTEXTUAL EFFECTS MODELS

Regression models with individuals as the units of analysis that include both group level and individual level variables as predictors of individual level outcomes. Traditional contextual effects models are equivalent to multilevel models in which all coefficients are modelled as fixed (that is, no error terms are included in the group level or level 2 equations, see multilevel models). See contextual analysis.

## CONTEXTUAL VARIABLES

See derived variables and group level variables.

## CROSS LEVEL EFFECTS

Term used to refer to the main effects of higher level variables (for example, group level variables) on outcomes at a lower level (for example, individual level outcomes) as well as to modifications of the effects of lower level (individual level) variables by higher level (group level) variables (see cross level interaction).^{12} Examples include the effect of country level income inequality on individual level self reported health (effect of a higher level variable on outcomes at a lower level), and the presence of stronger associations between individual level income and self reported health in the presence of high country level income inequality (modifications of the effects of lower level variables by higher level variables). The term “ecological effects” has sometimes been used as a synonym for “cross level effects”.^{12}

## CROSS LEVEL INFERENCE

The drawing of inferences regarding factors associated with variability in the outcome at one level based on data collected at another level (for example, drawing inferences regarding relations between individual level variables based on group level associations, or vice versa). See ecologic fallacy and atomistic fallacy.

## CROSS LEVEL INTERACTION

Refers to the interaction between higher level and lower level variables—that is, to modification of the effects of lower level variables by characteristics of the higher level units to which the lower level units belong (or vice versa).^{5,}^{12} For example, if the relation between individual level income and blood pressure differs by neighbourhood characteristics (that is, neighbourhood and individual level variables interact), there is said to be a cross level interaction. In multilevel models whenever group specific estimates of the effect of a lower level variable are modelled as a function of higher level (group level) variables (as in equation (3) under the entry for multilevel models), a cross level interaction appears in the final model (γ_{11} *C*_{j} *I*_{ij} in equation (4) under multilevel models).

## DERIVED VARIABLES

A type of group level variable constructed by mathematically summarising the characteristics of individuals in the group (for example, means, proportions, or measures of dispersion, such as, percentage of persons with incomplete high school, mean income, standard deviation of the income distribution).^{11,}^{13} Some derived variables have no individual level analogue (for example, standard deviation of the income distribution) and therefore necessarily refer to group level constructs. Others (for example, mean neighbourhood income) do have individual level analogues (for example, individual level income), but may provide information on group level constructs, distinct from their individual level namesake. The mean of the dependent variable in the group (for example, proportion infected in a study of the causes of infection) can be thought of as a special type of derived variable.^{14} Although derived and integral variables are sometimes presented as conceptually distinct, they are closely interrelated. Derived variables often operate by shaping certain integral properties of the group. For example, the composition of a group may influence the predominant types of interpersonal contacts, values, and norms or may shape organisations or regulations within the group that affect all members.^{15} The terms “analytical variables” and “aggregate variables” have been used as synonyms for “derived variables”. The term “contextual variables” has also been used as a synonym for “derived variables” ^{14} although it is sometimes used to refer to group level variables generally.^{6,}^{13}

## ECOLOGICAL FALLACY

The fallacy sometimes present when drawing inferences at the individual level (that is, regarding relations between individual level variables) based on group level data. The ecological fallacy arises because associations between two variables at the group level (or ecological level) may differ from associations between analogous variables measured at the individual level. These differences between individual level and group level associations were first described for correlation coefficients ^{16} but may also be present for other measures of association such as regression coefficients.^{11,}^{17} More generally, the fallacy may occur whenever data for units at a higher level are used to draw inferences regarding factors associated with variability across units at a lower level—that is, when the conceptual model being tested corresponds to the lower level, but the data are collected for a higher level.^{1,}^{2} Suppose a researcher finds that at the country level, increasing per capita income is associated with increasing mortality attributable to traffic accidents. If he/she infers that at the individual level, increasing personal income is associated with increasing motor vehicle related mortality, she may be committing the ecological fallacy, because within countries, motor vehicle related mortality may always be lower in high income than in low income persons. In the case of regression coefficients, the sources of the ecological fallacy include (1) the lack of information on constructs pertaining to a lower level of organisation; and (2) the failure to realise that a variable defined and measured at one level of organisation may tap into a different construct than its namesake at another level.^{18}

## EMPIRICAL BAYES ESTIMATES

Estimates of parameters for a given group or higher level unit (for example, estimates of group specific intercepts or slopes, such as *b*_{0j} and b_{1j} in equation (1), under multilevel models) obtained by combining information from the group itself with information from other similar groups investigated.^{10,}^{19,}^{20} This is particularly useful when estimating parameters for a group with few within group observations. These estimates are “optimally” weighted averages that combine information derived from the group itself with the mean for all similar groups. The weighted average shifts the group specific estimate (derived using data only for that particular group) towards the mean for similar groups. The less precise the group specific estimate and the less the variability observed across groups, the greater the shift towards the overall group mean. Thus, the estimate for a given group is based not only on its own data but also takes into account estimates for other groups and the characteristics groups share.^{20} Empirical Bayes estimates of parameters for a given group can be derived from multilevel models using estimates of the group level errors (for example, U_{0j} and U_{1j ,} see multilevel models) for that particular group. Empirical Bayes estimates are also sometimes referred to as “shrinkage estimates” because they “shrink” the group specific estimate towards the overall mean (although in fact when the overall mean is greater than the group specific estimate, the “shrunken” or empirical Bayes estimate may actually be greater than the group specific estimate). 
In public health, empirical Bayes estimation can be used, for example, to derive improved estimates of rates of death or diseases for small areas with few observations,^{21} or to estimate rates of different health outcomes for individual providers (hospitals, physicians, etc).^{22} In other applications (which do not involve the structure of individuals within groups although they are analogous to it), empirical Bayes estimates of regression coefficients have been used to obtain improved estimates of associations in studies investigating the role of multiple exposures.^{23}

## ENVIRONMENTAL VARIABLES

In the context of ecological studies and multilevel analysis, the term “environmental variables” has sometimes been used to refer to group level measures of physical or chemical exposures. Environmental variables, so defined, have been proposed as a “type” of group level variable, distinct from derived variables and integral variables.^{11} These variables are not derived by aggregating the characteristics of individuals but they do have group level and individual level analogues (for example, days of sunlight in the community and individual level sunlight exposure information). In contrast with derived and integral variables, which may be used as indicators of group level constructs, group level environmental variables are used exclusively as proxies for individual level exposures (which may be more difficult to measure for logistic or methodological reasons), rather than as indicators of a group level property, which is conceptually different from the analogous measure at the individual level.

## FIXED EFFECTS/FIXED COEFFICIENTS

Regression coefficients (intercepts or covariate effects) that are not allowed to vary randomly across higher level units (see multilevel models). For example, in the case of persons nested within neighbourhoods, two options are available for modelling the effects of neighbourhood. One option is to include a dummy variable for each neighbourhood. In this case the neighbourhood coefficients are modelled as fixed (sometimes called “fixed effects”). Another option is to assume that the neighbourhoods in the sample are a random sample of a larger population of neighbourhoods and that the coefficients for the “neighbourhood effect” vary randomly around an overall mean (for example, as reflected by U_{oj} in equation 2 under the entry for multilevel models). In this case, the neighbourhood effects are modelled as random (sometimes called “random effects”, see random effects models). In the same example, the coefficients for individual level covariates can also be modelled as fixed or random. For example, if the relation between individual level income and blood pressure is not allowed to vary randomly across neighbourhoods, the coefficient for individual level income is fixed (“fixed coefficient”). On the other hand, if the coefficient for individual level income is allowed to vary randomly across neighbourhoods around an overall mean effect (as reflected by U_{1j} in equation 3 under the entry for multilevel models), the coefficient for income is modelled as random (sometimes called a “random coefficient”, see random coefficient models). Although the terms “fixed effects” and “fixed coefficients “ are sometimes distinguished as noted above, they are often used interchangeably. Fixed effects models or fixed coefficient models are models in which all effects or coefficients are fixed. See also random effects/random coefficients.

## GROUP LEVEL VARIABLES

Term used to refer to variables that characterise groups. The terms group level variables, macro variables and ecological variables are often used interchangeably.^{2,}^{6,}^{11,}^{14,}^{24} Group level variables may be used as proxies for unavailable or unreliable individual level data (for example, when neighbourhood mean income is used as a proxy for the individual level income of individuals living in the neighbourhood) or as indicators of group level constructs (for example, when mean neighbourhood income is used as an indicator of neighbourhood characteristics that may be related to individual level outcomes independently of individual level income). It is the second usage (as indicators of group level constructs) that is of particular interest in multilevel analysis. Group level variables have been classified into two basic types.^{11,}^{13,}^{24} derived variables and integral variables. Two additional types of group level variables, structural variables ^{13} and environmental variables^{11} are sometimes distinguished. The term contextual variables has been used as a synonym for group level variables generally ^{6,}^{13} although it is sometimes reserved for derived group level variables.^{11,}^{14}

## HIERARCHICAL (LINEAR) MODELS

See multilevel models

## INDIVIDUAL LEVEL VARIABLES

Term used to refer to variables that characterise individuals and refer to individual level constructs (for example, age or personal income).

## INDIVIDUALISTIC FALLACY

Term used as a synonym for the atomistic fallacy. May sometimes also be used as a synonym for the psychologistic fallacy.

## INTEGRAL VARIABLES

A type of group level variable. Integral variables differ from derived variables (another type of group level variable) in that they are not summaries of the characteristics of individuals in the group. Integral variables have no individual level analogues and necessarily refer to group level constructs. Examples of integral variables include the existence of certain types of laws, political or economic system, social disorganisation, or population density.^{11,}^{13} Integral variables have also been referred to as primary or global variables.

## INTRACLASS CORRELATION

A measure of the degree of resemblance between lower level units belonging to the same higher level unit or cluster.^{25} In the case of individuals nested within groups (for example, neighbourhoods), the intraclass correlation measures the extent to which values of the dependent variable are similar for individuals belonging to the same group. It can be thought of as the average correlation between values of two randomly drawn lower level units (for example, individuals) in the same, randomly drawn higher level unit (for example, neighbourhood). It can also be defined as the proportion of the variance in the outcome that is between the groups or higher level units. In the case of a simple random intercept model, the intraclass correlation coefficient is estimated by the ratio of population variance between groups (τ_{00}) to the total variance (τ_{00} + σ^{2}).^{25} (see multilevel models) The estimation of the intraclass correlation coefficient in models including random covariate effects, or in the case of non-normally distributed dependent variables, is more complex and not always straightforward .

## MARGINAL MODELS

See population-average models.

## MIXED MODELS

Term used to refer to models that contain a mixture of fixed effects (or fixed coefficients) and random effects (or random coefficients). In mixed models some of the regression coefficients (intercepts or covariate effects) are allowed to vary randomly across higher level units but others are not (see multilevel models). Thus mixed models can be thought of as a particular case of the more general multilevel models (although the term is also occasionally used as a synonym of multilevel models generally). Sometimes the term mixed models is also used to encompass models that account for correlation between lower level units (for example, individuals) within higher level units (for example, neighbourhoods) in other ways—that is, by modelling the correlations or covariances themselves rather than by allowing for random effects or random coefficients.^{26} These models (which are not multilevel models) have also been called covariance pattern models,^{26} marginal models, or population average models.

## MULTILEVEL ANALYSIS

An analytical approach that is appropriate for data with nested sources of variability, that is, data involving units at a lower level or micro units (for example, individuals) nested within units at a higher level or macro units (for example, groups such as schools or neighbourhoods).^{5,}^{10,}^{19,}^{24,}^{25,}^{27–}^{30} Multilevel analysis allows the simultaneous examination of the effects of group level and individual level variables on individual level outcomes while accounting for the non-independence of observations within groups. Multilevel analysis also allows the examination of both between group and within group variability as well as how group level and individual level variables are related to variability at both levels. Thus, multilevel models can be used to draw inferences regarding the causes of inter-individual variation (or the relation of group and individual level variables to individual level outcomes) but inferences can also be made regarding inter-group variation, whether it exists in the data, and to what extent it is accounted for by group and individual level characteristics. In multilevel analysis, groups or contexts are not treated as unrelated but are conceived as coming from a larger population of groups about which inferences are to be made. Multilevel analysis thus allows researchers to deal with the micro-level of individuals and the macro-level of groups or contexts simultaneously.^{5}

Multilevel analysis has a broad range of applications in many situations involving nested sources of random variability such as persons nested within neighbourhoods,^{5,}^{30} patients nested within providers,^{31} meta-analysis (observations nested within sites),^{19,}^{32} longitudinal data analysis (repeat measurements over time nested within persons),^{28,}^{33,}^{34} multivariate responses (multiple outcomes nested within individuals),^{5} the analysis of repeat cross sectional surveys (multiple observations nested within time periods),^{35} the examination of geographical variations in rates (rates for smaller areas nested within regions or larger areas),^{36} and the examination of interviewer effects (respondents nested within interviewers).^{37} Multilevel analysis can also be used in situations involving multiple nested contexts^{19,}^{28} (for example, multiple measures over time on individuals nested within neighbourhoods) as well as overlapping or cross classified contexts (for example, children nested within neighbourhoods and schools).^{38} The statistical models used in multilevel analysis are referred to as multilevel models^{25,}^{28,}^{29} or hierarchical linear models.^{19,}^{39}

## MULTILEVEL MODELS

The statistical models used in multilevel analysis.^{19,}^{25,}^{28,}^{29} The terms “hierarchical models” and “multilevel models” are often used synonymously. These models (or variants of them) have previously appeared in different literatures under a variety of names including random effects models or random coefficient models,^{40–}^{42} “covariance components models” or “variance components models”,^{43,}^{44} and mixed models.^{26} A simplified example for the case of a normally distributed dependent variable, a single individual level (lower level unit) predictor, and a single group level (higher level unit) predictor is provided below. Analogous models can be formulated for non-normally distributed dependent variables.^{10,}^{28,}^{39,}^{45}

In the case of multilevel analysis involving two levels (for example, individuals nested within groups), the multilevel model can be conceptualised as a two stage system of equations.

In the first stage (level 1), a separate individual level regression is defined for each group or higher level unit:

*Y*_{ij} = b_{0j} + b_{1j} *I*_{ij} + e_{ij}   (1)

where

*Y*_{ij} is the outcome variable for the i^{th} individual in the j^{th} group

*I*_{ij} is the individual level variable for the i^{th} individual in the j^{th} group

b_{0j} is the group specific intercept

b_{1j} is the group specific effect of the individual level variable

Individual level errors (e_{ij}) are assumed to be independent and identically distributed with a mean of 0 and a variance of σ^{2}. The same regressors are generally used in all groups, but the regression coefficients (b_{0j} and b_{1j}) are allowed to vary from one group to another.

In a second stage (level 2), each of the group or context specific regression coefficients defined in equation (1) (b_{0j} and b_{1j} in this example) is modelled as a function of group level (or higher level) variables:

b_{0j} = γ_{00} + γ_{01} *G*_{j} + U_{0j}   (2)

b_{1j} = γ_{10} + γ_{11} *G*_{j} + U_{1j}   (3)

where

*G*_{j} is the group level variable

γ_{00} is the common intercept across groups

γ_{01} is the effect of the group level predictor on the group specific intercepts

γ_{10} is the common slope associated with the individual level variable across groups

γ_{11} is the effect of the group level predictor on the group specific slopes

The errors in the level 2 equations (U_{0j} and U_{1j}), sometimes called “macro errors”, are assumed to be normally distributed with mean 0 and variances τ_{00} and τ_{11} respectively. τ_{01} represents the covariance between intercepts and slopes. Thus, multilevel analysis summarises the distribution of the group specific coefficients in terms of two parts: a “fixed” part that is common across groups (γ_{00} and γ_{01} for the intercept, and γ_{10} and γ_{11} for the slope) and a “random” part (U_{0j} for the intercept and U_{1j} for the slope) that is allowed to vary from group to group (see also fixed coefficients and random coefficients).
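The assumptions on the macro errors can be written compactly in matrix form. The expression below is a sketch that assumes, as is usual in practice, that the two macro errors are jointly (bivariate) normal; the symbols are those defined in this entry.

```latex
\begin{pmatrix} U_{0j} \\ U_{1j} \end{pmatrix}
\sim N\!\left(
  \begin{pmatrix} 0 \\ 0 \end{pmatrix},
  \begin{pmatrix}
    \tau_{00} & \tau_{01} \\
    \tau_{01} & \tau_{11}
  \end{pmatrix}
\right)
```

The diagonal elements are the variances of the group specific intercepts and slopes around their fixed parts, and the off-diagonal element τ_{01} is their covariance.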

By including an error term in the group level equations (equations (2) and (3)), these models allow for sampling variability in the group specific coefficients (b_{0j} and b_{1j}) and also for the fact that the group level equations are not deterministic (that is, the possibility that not all relevant macro-level variables have been included in the model). The underlying assumption is that group specific intercepts and slopes are random samples from a normally distributed population of group specific intercepts and slopes, or alternatively, that the macro errors are exchangeable—that is, that the residual variation in group specific coefficients across groups is unsystematic.^{10}

An alternative way to present the model fitted in multilevel analysis is to substitute equations (2) and (3) in (1) to obtain:

*Y*_{ij} = γ_{00} + γ_{01} *G*_{j} + γ_{10} *I*_{ij} + γ_{11} *G*_{j} *I*_{ij} + U_{0j} + U_{1j} *I*_{ij} + e_{ij}

The model includes the effects of group level variables (γ_{01}), individual level variables (γ_{10}), and their interaction (γ_{11}) on the individual level outcome *Y*_{ij}. These coefficients (γ_{01}, γ_{10}, and γ_{11}), which are common to all individuals regardless of the group to which they belong, are often called the fixed coefficients (or fixed effects). The model also includes a random intercept component (U_{0j}) and a random slope component (U_{1j}). The values of these components vary randomly across groups, and hence U_{0j} and U_{1j} are referred to as the random coefficients (or random effects). The parameters of the above equations (fixed effects, random effects, variances of the random effects, and residual variance) are simultaneously estimated using iterative methods. The level 1 and level 2 variances and covariance (σ^{2}, τ_{00}, τ_{11}, τ_{01}) are called the (co)variance components.
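The equivalence between the two stage formulation and the single combined equation can be checked numerically. The sketch below generates data from the level 2 and level 1 equations and compares each outcome with the value given by the combined form. All parameter values are hypothetical, and the two macro errors are drawn independently (that is, τ_{01} is assumed to be 0) purely to keep the example short.

```python
import random

# Hypothetical parameter values chosen only for illustration.
GAMMA_00, GAMMA_01 = 120.0, -2.0  # intercept equation: common intercept, effect of G_j
GAMMA_10, GAMMA_11 = 0.5, 0.1     # slope equation: common slope, effect of G_j
TAU_00, TAU_11, SIGMA2 = 9.0, 0.04, 25.0

rng = random.Random(1)
max_abs_diff = 0.0

for j in range(50):                        # groups (higher level units)
    g_j = rng.gauss(0.0, 1.0)              # group level variable G_j
    u_0j = rng.gauss(0.0, TAU_00 ** 0.5)   # macro error for the intercept
    u_1j = rng.gauss(0.0, TAU_11 ** 0.5)   # macro error for the slope

    # Level 2: group specific intercept and slope.
    b_0j = GAMMA_00 + GAMMA_01 * g_j + u_0j
    b_1j = GAMMA_10 + GAMMA_11 * g_j + u_1j

    for i in range(20):                    # individuals (lower level units)
        i_ij = rng.gauss(0.0, 1.0)         # individual level variable I_ij
        e_ij = rng.gauss(0.0, SIGMA2 ** 0.5)

        # Level 1: regression within group j.
        y_two_stage = b_0j + b_1j * i_ij + e_ij

        # Combined form: fixed part plus random part.
        y_combined = (GAMMA_00 + GAMMA_01 * g_j + GAMMA_10 * i_ij
                      + GAMMA_11 * g_j * i_ij
                      + u_0j + u_1j * i_ij + e_ij)

        max_abs_diff = max(max_abs_diff, abs(y_two_stage - y_combined))

print(f"largest discrepancy between the two forms: {max_abs_diff:.2e}")
```

Any discrepancy is floating point rounding only, which reflects the fact that the combined equation is an algebraic rearrangement and not a different model.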

Many variants of the more general model illustrated above are possible. For example, only group specific intercepts (*b*_{0j}) may be modelled as random (these models have also been called random effects models). When covariate effects (*b*_{1j} in the example above) are modelled as random these models have also been called random coefficient models. When some of the coefficients are fixed and others are random these models have also been called “mixed effects models” or simply mixed models. When all coefficients are modelled as fixed (no random errors are included in the level 2 equations) these models are reduced to traditional contextual effects models. Multilevel models can also account for multiple nested contexts (or levels),^{19,}^{28} allowing fixed and random coefficients to be associated with variables measured at different levels of the data hierarchy being analysed. Multilevel models can also be modified to allow for non-hierarchical, overlapping or cross classified contexts (for example, children simultaneously nested within neighbourhoods and schools).^{38}

## NON-INDEPENDENCE OF OBSERVATIONS

Refers to situations in which dependent variables for observations at a lower level nested within the same higher level unit (or cluster) are correlated, even after measured characteristics are taken into account. For example, two persons from the same neighbourhood may tend to have more similar blood pressure levels than two persons from different neighbourhoods, even after measured individual and neighbourhood characteristics are taken into account. In the case of repeat measures on individuals over time, two blood pressure measurements on the same person may tend to be more similar than two measures on different persons even after relevant covariates are taken into account. One reason for this correlation may have to do with the omission of important higher level variables that observations within the same higher level unit share. This residual correlation violates the assumption of independence of observations underlying usual regression approaches. Ignoring this correlation may lead to incorrect inferences. Efficiency of estimation may also be reduced.^{40} Multilevel models account for potential residual correlation by modelling intercepts and regression coefficients as random (for example, by allowing for macro level errors, *U*_{0j} and *U*_{1j} in second level equations, see multilevel models).
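A small simulation can show why ignoring this correlation may lead to incorrect inferences. The sketch below repeatedly draws clustered samples from a random intercept model and compares the actual sampling variance of the overall mean with the variance a naive analysis would report if all observations were treated as independent. The parameter values are hypothetical. The comparison with the design effect 1 + (n − 1)ρ relies on a standard result for equal sized clusters that is not part of this glossary.

```python
import random
import statistics

# Hypothetical design: 20 groups of 10 individuals, ICC = 1 / (1 + 4) = 0.2.
J, N_PER, TAU_00, SIGMA2 = 20, 10, 1.0, 4.0
rng = random.Random(42)


def grand_mean_one_sample():
    """Draw one clustered sample and return its overall mean."""
    values = []
    for _ in range(J):
        u_0j = rng.gauss(0.0, TAU_00 ** 0.5)  # shared by everyone in group j
        values.extend(u_0j + rng.gauss(0.0, SIGMA2 ** 0.5) for _ in range(N_PER))
    return statistics.fmean(values)


means = [grand_mean_one_sample() for _ in range(1000)]
empirical_var = statistics.variance(means)     # actual sampling variance of the mean

naive_var = (TAU_00 + SIGMA2) / (J * N_PER)    # assumes independent observations
icc = TAU_00 / (TAU_00 + SIGMA2)
design_effect = 1 + (N_PER - 1) * icc          # standard result for equal cluster sizes

ratio = empirical_var / naive_var
print(f"empirical / naive variance = {ratio:.2f}, design effect = {design_effect:.2f}")
```

The naive variance understates the true sampling variance by a factor close to the design effect, so standard errors computed under independence would be too small.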

## POPULATION-AVERAGE MODELS

Models that account for correlation between lower level units within higher level units (or clusters) by modelling the correlations or covariances themselves rather than by allowing for random effects or random coefficients as multilevel models do.^{40,}^{46} These correlations are taken into account in the estimation of regression coefficients and their standard errors. Different correlation structures (describing within cluster or within higher level unit correlations) can be specified. “Population-average models” are also referred to as “marginal models”^{40,}^{46} or “covariance pattern models”.^{26} Whereas multilevel models model the dependent variable conditional on the random effects (or random coefficients), population-average models model the marginal expectation of the dependent variable across the population (in a sense, “averaged” across the random effects). For this reason, marginal models have also been called “population-average” models (as a way to contrast them with subject specific random effects models).^{46} The Generalised Estimating Equation (GEE) approach is one approach to fitting marginal models.^{46}

Population-average models model the population-average response as a function of covariates without explicitly accounting for heterogeneity across higher level units.^{46} In contrast, multilevel models investigate and explain the source of group to group variation (and of the within group correlation) by modelling group specific regression coefficients as a function of group level variables plus random variation. Therefore, although population-average models account for the correlation between outcomes within higher level units, the source of this correlation is not directly investigated (the correlation, and sometimes higher level effects themselves, are viewed as nuisance parameters that must be taken into account but are not of direct interest). Consequently, population-average models do not allow examination of group to group variation, of the group level or individual level variables related to it, or of the degree of variation present between and within groups, as multilevel models do (see variance components). Differences between the two types of models also have consequences for the interpretation of regression coefficients: in the multilevel model, the regression coefficient estimates how the response changes as a function of covariates *conditional* on the random effects; in the marginal model, the coefficient expresses how the response changes as a function of covariates “averaged” over group to group heterogeneity (or group random effects).^{40,}^{46} In the case of continuous, normally distributed dependent variables (linear models) these coefficients are mathematically equivalent, but in the case of non-normally distributed variables (for example, logistic models) the marginal parameter values will usually be smaller in absolute value than their random effects analogues.^{46,}^{47}
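The difference between conditional and marginal coefficients in a logistic model can be illustrated by averaging a random intercept logistic model over its random effect. The sketch below assumes a hypothetical model logit P(Y = 1 | x, U_{0j}) = β_{0} + β_{1}x + U_{0j} with a binary covariate x and U_{0j} normally distributed with variance τ_{00}. It obtains the marginal probabilities by simple numerical integration and then computes the implied marginal log odds ratio. The parameter values and helper functions are assumptions made for this example.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    return math.log(p / (1.0 - p))


def marginal_probability(linear_predictor, tau00, n_points=4001, width=8.0):
    """Average expit(linear_predictor + u) over u ~ N(0, tau00) (trapezoid rule)."""
    sd = math.sqrt(tau00)
    lower, upper = -width * sd, width * sd
    step = (upper - lower) / (n_points - 1)
    total = 0.0
    for k in range(n_points):
        u = lower + k * step
        density = math.exp(-0.5 * (u / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        weight = 0.5 if k in (0, n_points - 1) else 1.0  # trapezoid end points
        total += weight * expit(linear_predictor + u) * density * step
    return total


# Hypothetical conditional (group specific) parameters.
BETA_0, BETA_1, TAU_00 = -1.0, 1.0, 4.0

p0 = marginal_probability(BETA_0, TAU_00)            # population-average risk, x = 0
p1 = marginal_probability(BETA_0 + BETA_1, TAU_00)   # population-average risk, x = 1
marginal_beta_1 = logit(p1) - logit(p0)

print(f"conditional log odds ratio: {BETA_1:.2f}")
print(f"marginal log odds ratio:    {marginal_beta_1:.2f}")
```

The marginal log odds ratio is closer to zero than the conditional one, and the gap widens as τ_{00} increases. When τ_{00} approaches 0 the two coincide.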

## PSYCHOLOGISTIC FALLACY

An inferential fallacy that may arise from the failure to consider group characteristics in drawing inferences regarding the causes of variability across individuals^{1,}^{2}—that is, assuming that individual level outcomes can be explained exclusively in terms of individual level characteristics. Although the level at which data are collected may fit the conceptual model being investigated (that is, individual level), important facts pertaining to other levels (that is, group level) may have been ignored.^{1,}^{2} For example, a study based on individuals might find that immigrants are more likely to develop depression than natives. But suppose this is only true for immigrants living in communities where they are a small minority. A researcher ignoring the contextual effect of community composition might attribute the higher overall rate in immigrants to the psychological effects of immigration or to genetic factors, ignoring the importance of community level factors and thus committing the psychologistic fallacy.^{1} The term “psychologistic fallacy” is not entirely appropriate because the individual level factors used to explain the outcome are not always exclusively psychological.^{2} Although the term “individualistic fallacy” may appear more adequate, it has also been used as a synonym for the related but distinct atomistic fallacy.^{3,}^{4} See also sociologistic fallacy.

## RANDOM COEFFICIENT MODELS

Term originally used for models in which the regression coefficients corresponding to covariates in the model are treated as random rather than fixed^{19,}^{26} (that is, models containing random coefficients, see for example *b*_{1j} in the entry for multilevel models). Traditional random coefficient models do not include higher level (or group level) predictors in the group level equations for the covariate effects (that is, in a traditional random coefficient model, equation (3) would be *b*_{1j} = *γ*_{10} + *U*_{1j}).^{19} Thus random coefficient models can be thought of as a particular case of the more general multilevel models. However, the term random coefficient models is sometimes used more broadly to refer to multilevel models generally. See also random effects models.

## RANDOM EFFECTS/RANDOM COEFFICIENTS

Regression coefficients (intercepts or covariate effects) that are allowed to vary randomly across higher level units (that is, are assumed to be realisations of values from a probability distribution) (see multilevel models). For example, in the case of persons nested within neighbourhoods, neighbourhood effects can be assumed to vary randomly around an overall mean (random effect, see random effects models). Similarly, the effect of personal income on individual health may be allowed to vary randomly across neighbourhoods (random coefficient, see random coefficient models). Although the terms “random effects” and “random coefficients” are sometimes distinguished as noted above, they are often used interchangeably. The use of random effects or random coefficients is especially appropriate when the higher level units (or groups) can be thought of as random samples from a larger population of units (or groups) about which inferences are to be made. See also fixed effects/fixed coefficients.

## RANDOM EFFECTS MODELS

Term originally used for models in which differences across groups (or other classification system) are treated as random rather than fixed^{19,}^{26} (that is, models containing random effects). For example, in the case involving individuals nested within neighbourhoods, a model treating neighbourhood differences as fixed would include all neighbourhoods represented in the sample as a set of dummy variables in a regression equation with individuals as the units of analysis (see fixed coefficients). In contrast, a random effects model would treat neighbourhood differences as realisations from a probability distribution, that is, neighbourhood intercepts would be allowed to vary randomly across neighbourhoods following a probability distribution (see multilevel models). An underlying assumption is that the neighbourhoods in the study are a random sample from a larger population of neighbourhoods about which inferences are to be made. Random effects models can be thought of as a particular case of the more general multilevel models in which only intercepts are allowed to vary randomly across groups (that is, random intercept models). Sometimes, however, the term random effects models is used more broadly to refer to multilevel models generally (that is, models that allow for both random intercepts and random covariate effects). See also random coefficient models.

## RESIDUAL CORRELATION

See non-independence of observations.

## SOCIOLOGISTIC FALLACY

An inferential fallacy that may arise from the failure to consider individual level characteristics in drawing inferences regarding the causes of variability across groups.^{1,}^{2} Although the level at which data are collected may fit the conceptual model being investigated (that is, group level), important facts pertaining to other levels (that is, the individual level) may have been ignored.^{1} Suppose a researcher finds that communities with higher rates of transient population have higher rates of schizophrenia, and he/she concludes that higher rates of transient population lead to social disorganisation, breakdown of social networks, and increased risk of schizophrenia among all community inhabitants. But suppose that schizophrenia rates are only increased for transient residents (because transient residents tend to have fewer social ties, and individuals with few social ties are at greater risk of developing schizophrenia). That is, rates of schizophrenia are high for transient residents and low for non-transient residents, regardless of whether they live in communities with a high or a low proportion of transient residents. If this is the case, the researcher would be committing the sociologistic fallacy in attributing the higher schizophrenia rates to social disorganisation affecting all community members rather than to differences across communities in the percentage of transient residents. See also psychologistic fallacy.

## STRUCTURAL VARIABLES

A type of group level variable that refers to relations or interactions between members of a group,^{13} for example, characteristics of social networks within the group or patterns of contacts or interactions between members of the group. Structural variables are sometimes considered a subtype of integral variables.^{12,}^{18}

## SUBJECT SPECIFIC MODELS

Term used to refer to random effects/random coefficient models (or multilevel models generally) in order to contrast them with population-average models. “Subject specific” is used because the term was originally developed in the context of longitudinal data analysis,^{46} where individuals or subjects are the higher level units and repeat measures are the lower level units. In this case, the fixed effects coefficients derived from a random effects, random coefficient, or multilevel model are conditional on person level (or person specific) random effects, hence the term “subject specific”. More generally, they can be thought of as “higher level unit” specific (or cluster specific), because they are conditional on higher level unit (or cluster specific) random effects. For example, in the entry for multilevel models, the estimate of γ_{01} is conditional on group level random effects (as reflected by the presence of U_{0j} and U_{1j}).

## VARIANCE COMPONENTS

Using multilevel models the total variance in individual level outcomes (or lower level outcomes generally) can be decomposed into variance within and between groups (or higher level units generally). For example, the variance in blood pressure across individuals can be decomposed into variance within and between neighbourhoods. These components are referred to as variance components. The ability to estimate the variance components (which provide important information on the variability in the outcome between and within groups) is a key feature of multilevel models, and what distinguishes multilevel models from traditional contextual effects models and population-average models. For this reason, multilevel models have also sometimes been referred to as variance component or covariance component models. See also multilevel models.
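The idea of splitting variability into within group and between group parts can be illustrated with the descriptive sum of squares identity (total = within + between). The blood pressure values and neighbourhood labels below are hypothetical. Note that this is only a descriptive decomposition of the sample: the model based variance components (σ^{2} and τ_{00}) are estimated by multilevel software and differ from these raw quantities, in part because observed group means also contain sampling variability.

```python
import statistics

# Hypothetical systolic blood pressure values (mmHg) in three neighbourhoods.
neighbourhoods = {
    "A": [118.0, 122.0, 125.0, 119.0],
    "B": [130.0, 134.0, 128.0, 136.0],
    "C": [121.0, 127.0, 124.0, 128.0],
}

all_values = [y for ys in neighbourhoods.values() for y in ys]
grand_mean = statistics.fmean(all_values)

# Total variability around the overall mean.
ss_total = sum((y - grand_mean) ** 2 for y in all_values)

# Variability of individuals around their own neighbourhood mean.
ss_within = sum(
    (y - statistics.fmean(ys)) ** 2
    for ys in neighbourhoods.values()
    for y in ys
)

# Variability of neighbourhood means around the overall mean.
ss_between = sum(
    len(ys) * (statistics.fmean(ys) - grand_mean) ** 2
    for ys in neighbourhoods.values()
)

print(f"total = {ss_total:.1f}, within = {ss_within:.1f}, between = {ss_between:.1f}")
print(f"share of variability between neighbourhoods = {ss_between / ss_total:.2f}")
```

In this constructed example most of the variability lies between neighbourhoods, which is the situation in which a multilevel analysis would be most informative.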

## REFERENCES
