Abstract
Introduction The impact of missing data (MD) on the validity of results is often discussed but seldom investigated. Sensitivity analyses under different scenarios of non-ignorable MD are an attractive approach, but they are rarely performed, probably for lack of tools.
Methods We propose an R function that performs multiple imputation based on the principle of mixture modelling, assuming that the distribution of the variable of interest differs according to whether its value is missing or observed. We propose a 3-step strategy:
1. Fit an imputation model assuming ignorable MD;
2. Modify the imputation model by adding a parameter (for categorical variables, the OR comparing the odds of the category of interest among subjects with MD with the odds among subjects without MD; for continuous variables, the difference in expected values between the two groups);
3. Impute MD under the scenario thus specified.
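The three steps above can be sketched in a few lines of Python for a single binary variable. This is only an illustration under strong simplifying assumptions, not the SensMice implementation: the function names `shift_probability` and `impute_binary` are hypothetical, the "imputation model" is reduced to the observed proportion (no covariates), and the model parameters are not redrawn between imputations as a proper multiple imputation procedure would do.

```python
import random


def shift_probability(p_ignorable, odds_ratio):
    """Step 2: multiply the odds of the category of interest by the assumed OR."""
    if not 0.0 < p_ignorable < 1.0:
        raise ValueError("p_ignorable must lie strictly between 0 and 1")
    odds = p_ignorable / (1.0 - p_ignorable) * odds_ratio
    return odds / (1.0 + odds)


def impute_binary(values, odds_ratio, n_imputations=5, seed=0):
    """Return (p_ignorable, p_shifted, completed datasets) for a 0/1 variable.

    Missing entries are given as None (hypothetical convention for this sketch).
    """
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]

    # Step 1: deliberately crude imputation model under ignorable MD,
    # here just the observed proportion of the category of interest.
    p_ignorable = sum(observed) / len(observed)

    # Step 2: apply the sensitivity parameter to subjects with MD.
    p_shifted = shift_probability(p_ignorable, odds_ratio)

    # Step 3: impute each missing value under the specified scenario,
    # leaving observed values untouched.
    completed = [
        [v if v is not None else int(rng.random() < p_shifted) for v in values]
        for _ in range(n_imputations)
    ]
    return p_ignorable, p_shifted, completed


if __name__ == "__main__":
    data = [1, 0, None, 1, None, 0, 0, None]
    p0, p1, datasets = impute_binary(data, odds_ratio=2.0)
    print(p0, p1, len(datasets))
```

Setting `odds_ratio=1.0` reproduces the ignorable-MD scenario, so a range of values around 1 gives a simple grid of scenarios to compare. For a continuous variable, the analogous modification would add the assumed difference to the expected value used for imputation.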
A sensitivity analysis was performed on data from HIV+ patients to assess the robustness of the OR between mental health and self-reported viral load, which was subject to MD. We assumed that non-responders were more likely to have a high viral load.
Results The adjusted OR was reduced from 2.01 [1.21 to 3.35] to 1.75 [1.03 to 2.97]. Conclusions were robust to the explored scenarios, reinforcing confidence in the results of the analysis assuming ignorable MD.
Conclusion A sensitivity analysis is easy to perform using the proposed package SensMice. The impact of imposed variations in the imputation model on the overall results helps to assess their robustness. This is particularly useful for self-reported characteristics, for which MD are often strongly suspected to be non-ignorable.