P18 Multilevel latent class modelling of simulated healthcare provider-level causal effects in observational data
  1. WJ Harrison (1,2),
  2. PD Baxter (2),
  3. MS Gilthorpe (1,2,3)
  (1) Leeds Institute for Data Analytics, University of Leeds, Leeds, UK
  (2) School of Medicine, University of Leeds, Leeds, UK
  (3) The Alan Turing Institute, London, UK


Background Healthcare provider performance is commonly assessed using patient outcomes, e.g. survival rates. Patient characteristics that may affect outcomes in the absence of genuine provider-level differences must therefore be balanced across providers to ensure a fair comparison. There are many methods that can accommodate this patient ‘casemix’ but none that also allow the assessment of provider-level covariate effects, i.e. the potential causes of performance differences. We aim to demonstrate the utility of multilevel latent class (MLC) modelling to identify causal provider-level covariate effects after accommodating patient differences.

Methods We simulated data for patients and providers, based on a previously utilised real-world dataset of patients diagnosed with colorectal cancer. Age at diagnosis, sex and socioeconomic status were included at the patient level, and we explored a continuous outcome. We included both binary and continuous effects at the provider level, to reflect organisational features such as surgeon speciality or available beds, although these were analysed separately to demonstrate proof-of-principle. We simulated unique sets of 100 datasets using a range of coefficient effect values and error variances. Interest lies in the ability of the MLC model to recover these simulated provider-level coefficient effects.
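The simulate-and-recover design described in the Methods can be illustrated with a minimal sketch. Everything numeric here is an assumption made for illustration: the numbers of providers and patients, the patient-level coefficients, the provider random-intercept variance, and the reading of "33% error variance" as the residual share of total outcome variance. The recovery step uses ordinary least squares purely as a stand-in; it is not the MLC model used in the study.

```python
import numpy as np


def simulate_dataset(rng, n_providers=50, patients_per_provider=100,
                     beta_provider=0.5, error_share=0.33):
    """Simulate one two-level dataset (patients nested within providers).

    All sizes and coefficients are hypothetical, not taken from the study.
    """
    n = n_providers * patients_per_provider
    provider = np.repeat(np.arange(n_providers), patients_per_provider)

    # Patient-level casemix: age at diagnosis, sex, socioeconomic status.
    age = rng.normal(70.0, 10.0, n)
    sex = rng.integers(0, 2, n)
    ses = rng.normal(0.0, 1.0, n)

    # Binary provider-level covariate (e.g. an organisational feature)
    # and a provider random intercept to induce clustering.
    z = rng.integers(0, 2, n_providers)
    u = rng.normal(0.0, 0.1, n_providers)

    signal = (0.02 * (age - 70.0) + 0.1 * sex - 0.1 * ses
              + beta_provider * z[provider] + u[provider])

    # Assumption: error variance is a fixed share of total outcome variance.
    error_var = error_share / (1.0 - error_share) * signal.var()
    y = signal + rng.normal(0.0, np.sqrt(error_var), n)

    # Design matrix: intercept, centred age, sex, SES, provider covariate.
    X = np.column_stack([np.ones(n), age - 70.0, sex, ses, z[provider]])
    return X, y


def recover_provider_effect(X, y):
    """Stand-in estimator (ordinary least squares), not the MLC model."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[-1]  # coefficient on the provider-level covariate


def summarise_recovery(n_datasets=100, seed=0, beta_provider=0.5):
    """Median and 2.5th/97.5th percentiles of recovered effects."""
    rng = np.random.default_rng(seed)
    estimates = np.array([
        recover_provider_effect(
            *simulate_dataset(rng, beta_provider=beta_provider))
        for _ in range(n_datasets)
    ])
    lo, hi = np.percentile(estimates, [2.5, 97.5])
    return float(np.median(estimates)), float(lo), float(hi)


if __name__ == "__main__":
    median, lo, hi = summarise_recovery()
    print(f"median recovered effect: {median:.3f} ({lo:.3f} to {hi:.3f})")
```

Repeating the summary over a grid of simulated coefficient values and error shares would mirror the structure of the comparison reported in the Results, with the stand-in estimator replaced by the model of interest.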

Results Models contained one patient-level latent class and up to five provider-level latent classes. For the binary provider-level covariate, median recovered values were almost identical to simulated effects throughout, e.g. for the simulated coefficient value 0.500 at 33% error variance, the median recovered value was 0.499 (95% CI 0.489–0.509) across all models. For the continuous provider-level covariate, median recovered values moved closer to the simulated effects as the number of provider-level latent classes increased, e.g. for the simulated coefficient value 0.200 at 33% error variance, the median recovered value was 0.153 (95% CI 0.113–0.184) for two provider-level classes and 0.191 (95% CI 0.168–0.210) for five provider-level classes.

Discussion The MLC modelling approach successfully recovered the simulated coefficient values, which fell within credible intervals for models with at least three provider-level latent classes. Very small simulated coefficient values were not recovered as well as larger values, which may be due to the variability introduced during simulation dominating the coefficient effect. Some attenuation of effect was also seen for the continuous provider-level covariate. We have demonstrated the utility of this approach to separate modelling for prediction (to accommodate patient casemix) from modelling for causal inference (to explore provider-level effects) across a data hierarchy. There is much scope to extend the assessment of upper-level causal effects by consideration of a multivariable directed acyclic graph (DAG).

  • Latent class
  • patient casemix
  • causal inference
