Impact of small group size on neighbourhood influences in multilevel models
  1. Katherine P Theall (1)
  2. Richard Scribner (1)
  3. Stephanie Broyles (2)
  4. Qingzhao Yu (1)
  5. Jigar Chotalia (1)
  6. Neal Simonsen (1)
  7. Matthias Schonlau (3)
  8. Bradley P Carlin (4)

  1. Louisiana State University Health Sciences Center, School of Public Health, New Orleans, Louisiana, USA
  2. Pennington Biomedical Center, Population Sciences, Baton Rouge, Louisiana, USA
  3. RAND Corporation, Center for Population Health and Health Disparities, Santa Monica, California, USA
  4. University of Minnesota School of Public Health, Biostatistics, Minneapolis, Minnesota, USA

Correspondence to Dr Katherine P Theall, Louisiana State University Health Sciences Center, School of Public Health, Epidemiology Program, 1615 Poydras Street, Suite 1400, New Orleans, LA 70112, USA; kthea1{at}


Background Given the growing availability of multilevel data from national surveys, researchers interested in contextual effects may find themselves with a small number of individuals per group. Although there is a growing body of literature on sample size in multilevel modelling, few studies have explored the impact of group sizes of fewer than five.

Methods In a simulated analysis of real data, the impact of group sizes of fewer than five was examined for both a continuous and a dichotomous outcome in a simple two-level model. Models with group sizes of one to five were compared with models fitted to the complete data. Four different linear and logistic models were examined: empty models; models with a group-level covariate; models with an individual-level covariate; and models with an aggregated group-level covariate. The study further evaluated whether the impact of small group size differed depending on the total number of groups.
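The abstract does not spell out how the reduced datasets were built, so the following is only a minimal sketch of one plausible approach: randomly pick a given fraction of groups and cut each down to a small fixed size, leaving the remaining groups complete. The function name `thin_groups`, its parameters, and the dictionary layout (group id mapped to a list of individual records) are all assumptions made for illustration.

```python
import random


def thin_groups(groups, size, fraction, seed=0):
    """Return a copy of `groups` in which about `fraction` of the groups
    are randomly reduced to at most `size` members each.

    `groups` maps a group id to a list of individual records.
    This is an illustrative assumption, not the study's actual procedure.
    """
    rng = random.Random(seed)
    ids = sorted(groups)
    n_thin = round(fraction * len(ids))
    to_thin = set(rng.sample(ids, n_thin))

    thinned = {}
    for gid in ids:
        members = groups[gid]
        if gid in to_thin and len(members) > size:
            # Keep a simple random sample of `size` individuals.
            thinned[gid] = rng.sample(members, size)
        else:
            # Untouched groups keep all of their members.
            thinned[gid] = list(members)
    return thinned


# Example: 20 groups of 10 individuals, 90% reduced to one individual each.
full = {j: list(range(10)) for j in range(20)}
reduced = thin_groups(full, size=1, fraction=0.9)
```

Each reduced dataset would then be analysed with the same model as the complete data, so that estimates and SEs can be compared side by side.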

Results When the number of groups was large (N=459), neither fixed nor random components were affected by small group size, even when 90% of tracts contained only one individual and even when an aggregated group-level covariate was examined. As the number of groups decreased, the SE estimates of both fixed and random effects were inflated. Furthermore, group-level variance estimates were more affected than were fixed components.
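To make the notion of a group-level variance component concrete, the sketch below computes an intraclass correlation for a continuous outcome with the classical one-way ANOVA (moment) estimator for unbalanced groups. This is a simpler technique swapped in for illustration and is not the multilevel estimation used in the study; the function name `anova_icc` and the input layout are assumptions. Singleton groups contribute nothing to the within-group sum of squares, so the estimator is undefined when every group has exactly one member.

```python
def anova_icc(groups):
    """One-way ANOVA estimate of the intraclass correlation.

    `groups` maps a group id to a non-empty list of numeric outcomes.
    Illustrative moment estimator only, not a multilevel model fit.
    """
    sizes = [len(v) for v in groups.values()]
    n_total = sum(sizes)
    n_groups = len(sizes)
    if n_groups < 2 or n_total == n_groups:
        # No within-group degrees of freedom: the within-group
        # variance cannot be estimated with this method.
        raise ValueError("need at least two groups and one group with n > 1")

    grand_mean = sum(sum(v) for v in groups.values()) / n_total
    ss_between = 0.0
    ss_within = 0.0
    for values in groups.values():
        mean_j = sum(values) / len(values)
        ss_between += len(values) * (mean_j - grand_mean) ** 2
        ss_within += sum((y - mean_j) ** 2 for y in values)

    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)
    # Effective group size for unbalanced data.
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (n_groups - 1)
    # Negative moment estimates are truncated at zero.
    var_group = max(0.0, (ms_between - ms_within) / n0)
    total = var_group + ms_within
    return var_group / total if total > 0 else 0.0


icc = anova_icc({1: [1, 3], 2: [5, 7]})  # between-group variance 7, within 2
```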

Conclusions Analyses of datasets with a small to moderate number of groups, most of which are very small (n<5), may fail to find or even consider a group-level effect when one exists, and may also be underpowered to detect fixed effects.

  • Body mass index
  • body weight
  • multilevel modelling
  • neighbourhood
  • obesity
  • population surveys
  • sample size
  • small area epidemiology


  • Funding This research was supported by grants from the Centers for Disease Control and Prevention (1K01SH000002-01 to KPT) and the National Institute on Alcohol Abuse and Alcoholism (R01AA013749 to RS). The views presented in this paper are those of the authors and do not represent those of the funding agencies.

  • Competing interests None.

  • Ethics approval This study was conducted with the approval of the Louisiana State University Health Sciences Center.

  • Provenance and peer review Not commissioned; externally peer reviewed.