Background Each year numerous studies evaluate longitudinal data within a lifecourse context, inappropriately using standard multiple linear regression to analyse later-life health status (e.g. systolic blood pressure, SBP) with respect to repeated measures of early-life experiences (e.g. body mass index, BMI). We examine these problems, via simulation, to give clear guidance on what happens if the basic theory of causation (as informed by directed acyclic graphs; DAGs) is ignored. In our previous work on birth weight (BW), we showed that for ages 0–17 years our simulation results are consistent with the Avon Longitudinal Study of Parents and Children (ALSPAC), with maximal attenuation when later-life mediators are included. This study extends that analysis to age 70 years.
Methods We simulated a lifecourse dataset comprising BW, repeated measures of BMI up to 70 years, and a single z-score SBP measure at age 70, using a multivariate normal distribution designed to represent real data, informed for ages 0–17 by the ALSPAC study. We conducted a series of multivariable regression analyses taking BW and BMI at age 70 years as exposures, similar to those frequently seen in lifecourse research.
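As a rough illustration of this kind of simulation, the sketch below draws a small lifecourse dataset from a multivariate normal distribution and compares an unadjusted regression of SBP on BW with one that adjusts for a mediator. The variable set, the path coefficients in `B`, the sample size, and the helper `exposure_coef` are all hypothetical choices made for illustration. They are not the ALSPAC-informed values or the code used in the study, and this toy structure is not intended to reproduce the magnitudes reported here.

```python
import numpy as np

# Hypothetical variables in temporal order (a much reduced set of ages).
names = ["bw", "bmi17", "bmi60", "bmi70", "sbp70"]
ix = {name: i for i, name in enumerate(names)}

# Illustrative direct effects (assumed, NOT from ALSPAC or the paper).
# B[i, j] is the direct effect of variable j on variable i.
B = np.zeros((5, 5))
B[ix["bmi17"], ix["bw"]] = 0.3
B[ix["bmi60"], ix["bmi17"]] = 0.6
B[ix["bmi70"], ix["bmi60"]] = 0.9
B[ix["sbp70"], ix["bw"]] = -0.1
B[ix["sbp70"], ix["bmi70"]] = 0.3

# For x = Bx + e with independent unit-variance errors e,
# x = (I - B)^{-1} e, so Cov(x) = (I - B)^{-1} (I - B)^{-T}.
T = np.linalg.inv(np.eye(5) - B)
cov = T @ T.T
# Total effect of BW on SBP: direct effect plus the path through BMI.
true_total_effect = T[ix["sbp70"], ix["bw"]]

rng = np.random.default_rng(0)
data = rng.multivariate_normal(np.zeros(5), cov, size=200_000)


def exposure_coef(data, outcome, exposure, adjust=()):
    """OLS coefficient of `exposure` in a model for `outcome`,
    with an intercept and the variables in `adjust` as covariates."""
    cols = [ix[exposure]] + [ix[a] for a in adjust]
    X = np.column_stack([np.ones(len(data)), data[:, cols]])
    beta, *_ = np.linalg.lstsq(X, data[:, ix[outcome]], rcond=None)
    return beta[1]  # beta[0] is the intercept


unadjusted = exposure_coef(data, "sbp70", "bw")
adjusted_for_mediator = exposure_coef(data, "sbp70", "bw", adjust=["bmi70"])

print(f"true total effect:      {true_total_effect:.4f}")
print(f"unadjusted estimate:    {unadjusted:.4f}")
print(f"adjusted for BMI at 70: {adjusted_for_mediator:.4f}")
```

In this toy model BW has no confounders, so the unadjusted coefficient estimates the total effect, whereas conditioning on BMI at 70 years (a mediator) leaves only the assumed direct effect.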
Results When we extended the maximum age to 70 years, the analyses with BW as the exposure showed that adjusting for any mediator attenuated the estimate, moving it away from the true association. This was most severe when later-life measures were included. When BMI at 70 years was the exposure, optimal confounder adjustment was achieved when later-life measures of BMI were included in the model. Further adjustment (e.g. for early-adulthood BMI) yielded little additional effect and need not be considered, in the interests of parsimony. However, the attenuation observed here was positive, whereas in our previous analyses of a younger cohort it was negative. This disparity is due to much greater collinearity among BMI measures in later life than among those in early life (i.e. BMI at 60 and 70 years versus BMI at 12 and 17 years).
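One common way to quantify collinearity of this kind is the variance inflation factor (VIF), which can be read off the diagonal of the inverse of the predictors' correlation matrix. The minimal sketch below compares two pairs of BMI measures. The correlations of 0.6 and 0.9 are hypothetical values chosen only to contrast a weaker early-life correlation with a stronger later-life one, and `vif_from_corr` is an illustrative helper, not part of the study.

```python
import numpy as np


def vif_from_corr(corr):
    """Variance inflation factors of a set of predictors, computed as the
    diagonal of the inverse of their correlation matrix."""
    corr = np.asarray(corr, dtype=float)
    return np.diag(np.linalg.inv(corr))


# Hypothetical correlations, for illustration only.
early_life = [[1.0, 0.6], [0.6, 1.0]]  # e.g. BMI at 12 and 17 years
later_life = [[1.0, 0.9], [0.9, 1.0]]  # e.g. BMI at 60 and 70 years

vif_early = vif_from_corr(early_life)
vif_late = vif_from_corr(later_life)

print("VIF, early-life pair:", np.round(vif_early, 3))
print("VIF, later-life pair:", np.round(vif_late, 3))
```

For two predictors with correlation r the VIF reduces to 1 / (1 - r^2), so under these assumed values the later-life pair inflates coefficient variances by a factor of about 5.3, compared with about 1.6 for the early-life pair.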
Conclusion The common practice of analysing multiple longitudinal measures in a single regression model can lead to incorrect estimates of the effect of the main exposure. The distinction between genuine confounders and mediators needs to be better understood, and the use of DAG theory is recommended. Different adverse impacts were encountered for cohorts with outcomes recorded at different ages, with the extent of these impacts depending upon the correlation structure.