Article Text
Abstract
Introduction The CES-D scale is commonly used to assess depressive symptoms (DS) in large population-based studies. Missing data (MD) in one or several of the 20 items of the scale are frequent and may create biases. Reasons for not completing items, and their impact on the estimated prevalence of DS under various hypotheses, are explored.
Methods In 2005, 71 412 women from the French E3N cohort returned a questionnaire containing the CES-D scale. An interview study was carried out on a random sample of 204 participants to examine different hypotheses for the MD mechanism. The prevalence of DS was estimated with different methods for handling MD: complete-case analysis, single imputation, and multiple imputation from the CES-D items, with or without covariates, under missing at random (MAR) and missing not at random (MNAR) assumptions.
Results Of the 71 412 women, 45% had at least one missing value in the scale. The interviews showed that participants were not embarrassed to answer questions about DS. Potential reasons for nonresponse were identified, and both the MAR and MNAR hypotheses remained plausible. Among complete responders, the prevalence of DS was 26.1%. After multiple imputation under the MAR assumption, it was 28.6%, 29.8% and 31.7% among women with up to 4, 10 and 20 missing values, respectively. The estimates were robust to the choice of imputation model and to the various MNAR scenarios.
Conclusion The CES-D scale can easily be used to assess DS in large cohorts. Multiple imputation under the MAR assumption, using the CES-D items only, handles MD reliably.
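
The three families of MD handling named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation. The cutoff of 16, the person-mean rule for single imputation, and all function names are assumptions made for illustration, since the abstract does not specify them. The multiple-imputation step is deliberately simplified: it draws each missing item from that item's observed marginal distribution and averages the prevalence over m imputed datasets, whereas a real analysis under MAR would condition the draws on the other items (and possibly covariates).

```python
import random

N_ITEMS = 20   # the CES-D has 20 items, each scored 0 to 3
CUTOFF = 16    # ASSUMED threshold for DS; not stated in the abstract


def prevalence(rows):
    """Share of respondents whose total score reaches the cutoff."""
    return sum(sum(r) >= CUTOFF for r in rows) / len(rows)


def complete_case_prevalence(rows):
    """Complete-case analysis: drop anyone with a missing item (None)."""
    return prevalence([r for r in rows if None not in r])


def person_mean_impute(row):
    """Single imputation (assumed rule): fill gaps with the respondent's
    own mean over observed items. Needs at least one observed item."""
    observed = [v for v in row if v is not None]
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in row]


def single_imputation_prevalence(rows):
    return prevalence([person_mean_impute(r) for r in rows])


def multiple_imputation_prevalence(rows, m=5, seed=0):
    """Simplified stand-in for multiple imputation: each missing item is
    drawn from that item's observed values, m datasets are built, and the
    m prevalence estimates are averaged (pooled point estimate only)."""
    rng = random.Random(seed)
    pools = [[r[j] for r in rows if r[j] is not None] for j in range(N_ITEMS)]
    estimates = []
    for _ in range(m):
        imputed = [
            [rng.choice(pools[j]) if v is None else v for j, v in enumerate(r)]
            for r in rows
        ]
        estimates.append(prevalence(imputed))
    return sum(estimates) / m


# Toy data: two complete respondents and one with half the items missing.
rows = [[1] * 20, [0] * 20, [1] * 10 + [None] * 10]
cc = complete_case_prevalence(rows)
si = single_imputation_prevalence(rows)
mi = multiple_imputation_prevalence(rows)
```

On the toy data, complete-case analysis simply discards the third respondent, while both imputation approaches keep her in the denominator, which is why the estimates can differ.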