Background The widespread use of risk algorithms in clinical medicine is testimony to how they have helped transform clinical decision-making. Risk algorithms have a similar but underdeveloped potential to support decision-making for population health.
Objective To describe the role of predictive risk algorithms in a population setting.
Methods First, predictive risk algorithms and how clinicians use them are described. Second, the population uses of risk algorithms are described, highlighting the strengths of risk algorithms for health planning. Lastly, the way in which predictive risk algorithms are developed is discussed briefly and a guide for algorithm assessment in population health presented.
Conclusion For the past 20 years, absolute and baseline risk has been a cornerstone of population health planning. The most accurate and discriminating method to generate such estimates is the use of multivariable risk algorithms. Routinely collected data can be used to develop algorithms with characteristics that are well suited to health planning and such data are increasingly available. The widespread use of risk algorithms in clinical medicine is testimony to how they have helped transform clinical decision-making. Risk algorithms have a similar but underdeveloped potential to support decision-making for population health.
- public health
- social inequalities
- health policy
- public health policy
In the clinical setting, predictive risk algorithms are embedded in clinicians' daily practice as the primary tool to estimate individual risk of future disease. Hundreds of risk algorithms are now used to guide clinical decisions about disease prevention and treatment. Population health planners have a similar task of understanding how best to prevent disease in populations. As in the clinical setting, baseline risk assessment has long been an underpinning of population health decision-making. Over 20 years ago the epidemiologist Geoffrey Rose stated “All policy decisions should be based on absolute measures of risk”1 — baseline risk, in other words. However, population health planners have only recently begun to appreciate that routinely collected population data can be used to develop robust risk algorithms that offer opportunities to generate more accurate, discriminating and useful predictions on health issues and strategies than have been available in the past.
This overview describes the role of predictive risk algorithms in the population setting. We start by describing predictive risk algorithms and how clinicians use them. Next, we describe population uses of risk algorithms, highlighting the strengths of risk algorithms for health planning. Lastly, we discuss briefly how predictive risk algorithms are developed and present a guide for algorithm assessment in population health.
We use the examples of cardiovascular disease and diabetes to demonstrate both clinical and population uses of algorithms. For clinical prevention of cardiovascular disease, countries with national guidelines recommend the use of risk algorithms such as the Framingham risk tool and the SCORE (Systematic Coronary Risk Evaluation), although there are many others.2 The same risk algorithms have been used to calculate risk for entire populations to assess the potential preventive benefit of both clinical and community-wide preventive interventions.3–5 Diabetes predictive tools demonstrate how risk algorithms, developed using routinely collected data, can provide insights into the contribution of risk factors, such as obesity, to the evolving burden of disease in populations.6
What is a predictive risk algorithm?
A risk algorithm predicts an outcome. Prognostication in health dates back to Hippocrates, but it is only in the past 30 years that rigorous analytic approaches have been developed.7 Clinical risk algorithms are now available to help clinical decision-making throughout life—from seconds after birth8 to the end of life.9 This review focuses on risk algorithms used to predict the future risk of a disease, but risk algorithms are also commonly used to calculate the probability of a disease state.10 In addition to disease prediction, risk algorithms are used to predict healthcare use, risk of adverse events and other outcomes.11 For population health purposes, multivariable risk algorithms are based on individual data such as population health surveys, administrative data and cohort studies that include information about exposure to different risks and long-term follow-up for development of chronic disease. These data are becoming increasingly available, allowing for the creation of such algorithms.
How do clinicians use risk algorithms?
Clinicians mainly use risk algorithms to discriminate the level of risk for individual patients to enable planning of treatment or preventive care. Discrimination is the ability to distinguish between people at high and low risk. A risk tool is considered discriminating if it can correctly predict which patients will develop an outcome. Risk algorithms are particularly helpful when there is a wide range of risk, when multiple factors contribute to risk and when the decision to investigate or treat is strongly influenced by a patient's level of baseline risk. Conversely, clinicians use simpler measures such as likelihood ratios when there is only one or a few exposures to an outcome, particularly when the relative risk of a single exposure is high.
Risk estimated using a risk algorithm often differs from a clinician's perceived risk, particularly when risk is low.12 ,13 Because discrimination is so important to clinicians, most algorithms are developed to be as discriminating as possible, while maintaining calibration (also called accuracy).14 An algorithm is accurate, or well-calibrated, if predicted risk closely approximates observed risk. Box 1 provides an example of a risk prediction algorithm commonly used in the clinical setting.
Case example of risk prediction in clinical medicine—algorithms to predict cardiovascular risk
The cardiovascular risk algorithms predict risk using exposure variables that are easily measured in the clinical setting (such as age, sex, smoking status, lipid levels and blood pressure).
Cardiovascular prevention guidelines recommend the use of risk algorithms to stratify people into categories of high, medium or low risk, with different treatment recommended for patients depending on their risk category. For example, the European guidelines recommend statins for people whose 10-year risk of cardiovascular disease is above the cut-off point of 15% (which corresponds to ‘medium’ baseline risk) because statins are both efficacious and cost-effective in this group.14
Clinical algorithms are assessed using measures of discrimination (receiver operating characteristic curves and C statistic) and accuracy/calibration (the difference between observed and predicted estimates).10,11 Plots of risk deciles are a common approach to visually summarise discrimination and calibration (figure 1).
Preferably, algorithms are validated in populations that are separate from the original derivation population. Risk algorithms typically require recalibration when disease incidence or the distribution of risk exposure in the target population differs from that in the derivation population. For this reason, SCORE has separate algorithms for use in European countries at high and low risk.
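As a minimal sketch of the two properties assessed in box 1, the following Python computes a C statistic (the proportion of case and non-case pairs in which the case received the higher predicted risk, with ties counted as one half) and a simple overall calibration ratio of observed to predicted events. The function names and the toy data are illustrative assumptions and are not taken from any published algorithm.

```python
from itertools import product


def c_statistic(risks, outcomes):
    """Proportion of (case, non-case) pairs ranked correctly; ties count 0.5."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    non_cases = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    score = 0.0
    for case_risk, non_case_risk in product(cases, non_cases):
        if case_risk > non_case_risk:
            score += 1.0
        elif case_risk == non_case_risk:
            score += 0.5
    return score / (len(cases) * len(non_cases))


def observed_to_predicted(risks, outcomes):
    """Observed events divided by the sum of predicted risks.

    A value near 1.0 suggests good overall calibration.
    """
    return sum(outcomes) / sum(risks)


# Hypothetical predicted 10-year risks and observed outcomes (1 = event).
risks = [0.05, 0.10, 0.20, 0.40, 0.25]
outcomes = [0, 0, 0, 1, 0]

print("C statistic:", c_statistic(risks, outcomes))
print("Observed/predicted:", observed_to_predicted(risks, outcomes))
```

In practice both measures would also be examined within deciles of predicted risk, as in the decile plots described in box 1, rather than only for the population as a whole.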
How can population health planners use risk algorithms?
Population uses of risk algorithms are largely similar to clinical uses, but the different settings require different terminology. Rather than the clinical term of risk discrimination or stratification, the analogous population health term is risk diffusion. Population risk is diffused when individuals in a population share a similar baseline risk. Correspondingly, population risk is concentrated when risk varies considerably among individuals. When risk is concentrated, a small proportion of the population—those at the highest risk—bears a large proportion of the overall or population risk.
Geoffrey Rose viewed the examination of population risk and risk diffusion as a cornerstone of population-based health planning. He argued that when population risk is diffused, a population health strategy that shifts the distribution curve a small amount in an entire population has a greater effect on population outcome than treating only people with high levels of that risk factor. Rose performed his studies in the 1990s by assessing baseline risk for chronic diseases using single risk factors such as cholesterol. Not surprisingly, given the tools available at the time, Rose observed only small differences in baseline risk throughout the population and concluded that risks for coronary artery disease and other chronic conditions were diffused in the population. However, using the more discriminating risk algorithms developed to consider multiple risk factors, current assessment shows that population risk for heart disease and other conditions is considerably more concentrated in identifiable high-risk groups.4
Policy makers and researchers still unwaveringly quote Rose. We have previously argued, “Too often, advocates for a particular population health strategy quote Rose's principle that ‘shifting the curve is the best approach’ without his required caveat, ‘when risk is diffused in the population’. Too often, we assume that risk is widely distributed without actually assessing it, let alone using an appropriately discriminating risk assessment method such as multivariate risk algorithms”.15
As in the clinical setting, decile plots of baseline risk are an intuitive way to visualise the diffusion of population risk. Baseline risk is defined as risk at a baseline time (T0). The term baseline risk often implies that only exposure at or before time T0 is used to estimate risk. The term baseline risk has the same meaning in both a clinical and a population setting. In the population setting the terms baseline risk and population risk are interchangeable, for the most part. A decile plot for a clinical risk algorithm that is highly discriminating (box 1 and figure 1) will have the same appearance as a decile plot of population risk that is concentrated. For example, cardiovascular disease in developed countries typically has a concentration of risk in the 10th decile that is more than 20 times higher than that of the first decile.16–18 In contrast, diabetes risk is typically less concentrated with the large proportion of overweight versus obese people contributing to most of the population risk.19 Decile plots can also be translated into Lorenz or concentration curves which are more commonly used in population health.
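The idea of concentrated versus diffused population risk can be made concrete with a short sketch. Assuming a list of individual predicted risks is available, the Python below computes the mean risk in each decile, the ratio of the highest to the lowest decile, and the share of total population risk carried by the highest-risk tenth. The two toy populations and all function names are hypothetical.

```python
def decile_means(risks):
    """Mean predicted risk in each tenth of the population, lowest risk first."""
    ordered = sorted(risks)
    n = len(ordered)
    if n < 10:
        raise ValueError("need at least 10 individuals")
    means = []
    for d in range(10):
        group = ordered[d * n // 10:(d + 1) * n // 10]
        means.append(sum(group) / len(group))
    return means


def top_decile_share(risks):
    """Fraction of total population risk carried by the highest-risk tenth."""
    ordered = sorted(risks)
    top = ordered[9 * len(ordered) // 10:]
    return sum(top) / sum(ordered)


# Hypothetical populations of 10 people each.
concentrated = [0.01] * 9 + [0.21]   # one person carries most of the risk
diffused = [0.03] * 10               # everyone shares the same risk

for label, population in (("concentrated", concentrated), ("diffused", diffused)):
    means = decile_means(population)
    print(label,
          "top/bottom decile ratio:", round(means[-1] / means[0], 1),
          "top decile share:", round(top_decile_share(population), 2))
```

The cumulative version of the same shares, taken across all deciles, gives the points of a Lorenz or concentration curve.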
Population health planners use risk algorithms for several key purposes, including projection of new cases, description of population risk and evaluation of different preventive strategies. We describe some of these functions below in detail and, in table 1, summarise how each is calculated.
Projecting the number of new cases of a disease
The simplest use of risk algorithms in a population setting is to predict the future incidence of a disease. In epidemiology and health planning, the prediction or prognostication of future health is described using different terms, including forecasting, projection, future estimation, trend analysis and extrapolation. Surprisingly, this application of risk algorithms is infrequently used despite having several advantages over the more commonly used projection methods. For example, the most common approach examines the past incident cases of a disease and then creates a trend that is extrapolated into the future, often incorporating projected changes in the age structure of the population.20 WHO's prediction that diabetes will increase by 39% worldwide from 2010 to 2030 was generated using this approach.21 Projections using this approach are often accurate, especially in the short term, because age is an important risk exposure for many diseases.
The main advantage of using predictive risk algorithms is the ability to directly incorporate baseline exposure characteristics in addition to age and sex. Predictive risk algorithms can help to describe the contribution of current behavioural risk exposure to the development of future cases of a disease, and they can improve predictive accuracy when risk exposure is changing. For example, diabetes projections generated using risk algorithms can estimate the contribution of different categories of weight.19 Such estimates show that because population risk for diabetes is diffused, the obese contribute a smaller proportion of new cases than overweight people, despite obese people having a considerably higher individual risk for diabetes. The projection of worldwide increase of diabetes will be inaccurate if there is a global trend for increasing weights, unless projections include information about population weights.21 Another advantage of multivariable risk algorithms is that their predictive accuracy can be described, validated, calibrated and recalibrated as needed, whereas these assessment methods are underdeveloped—and rarely performed—in many other predictive or forecasting approaches.
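A projection of this kind can be sketched as a weighted sum: each survey respondent's predicted risk is multiplied by the number of people that respondent represents, and the products are added within exposure categories. The records, field names and risk values below are invented for illustration; a real application would take predicted risks from a validated algorithm applied to survey data.

```python
from collections import defaultdict


def projected_cases(respondents):
    """Expected new cases per exposure category: sum of weight * predicted risk."""
    cases = defaultdict(float)
    for person in respondents:
        cases[person["bmi_group"]] += person["weight"] * person["risk"]
    return dict(cases)


# Hypothetical survey records. 'weight' is the number of people each
# respondent represents; 'risk' is an assumed predicted 10-year risk.
respondents = [
    {"bmi_group": "normal", "weight": 5000, "risk": 0.02},
    {"bmi_group": "overweight", "weight": 4000, "risk": 0.06},
    {"bmi_group": "obese", "weight": 1000, "risk": 0.15},
]

by_group = projected_cases(respondents)
total = sum(by_group.values())
for group, cases in by_group.items():
    print(f"{group}: {cases:.0f} expected cases ({cases / total:.0%} of total)")
```

With these assumed numbers the overweight group contributes more expected cases than the obese group, even though its individual risk is lower, because it is a much larger group.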
Contribution of specific risk factors to population risk
Of the three general approaches for estimating the contribution of a specific risk factor to overall population risk, two use risk algorithms. The first approach is the one used by Rose to assess the strategy of shifting the population health curve of a specific risk exposure. This approach can be used if the risk exposure is contained within the algorithm and the algorithm's predictive accuracy has been assessed for this purpose.22 ,23 In Rose's population strategy, a risk exposure, such as cholesterol, is theoretically reduced by a small amount in the entire population. Baseline risk is calculated both before and after the risk exposure is reduced. The difference between the two estimates represents the potential reduction in population risk or number of cases associated with the risk factor reduction.
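The before-and-after calculation in Rose's population strategy might be sketched as follows. The logistic risk function and its coefficients are purely hypothetical stand-ins for a validated algorithm, and the three-person population is invented; only the structure of the calculation (predict, shift the exposure for everyone, predict again, take the difference) reflects the text.

```python
import math


def predicted_risk(age, cholesterol, intercept=-9.0, b_age=0.06, b_chol=0.5):
    """Hypothetical logistic risk model; coefficients are illustrative only."""
    linear = intercept + b_age * age + b_chol * cholesterol
    return 1.0 / (1.0 + math.exp(-linear))


def expected_cases(population, shift=0.0):
    """Expected cases when everyone's cholesterol is lowered by `shift` (mmol/L)."""
    return sum(
        predicted_risk(p["age"], p["cholesterol"] - shift) for p in population
    )


# Hypothetical population with exposure measured at baseline.
population = [
    {"age": 45, "cholesterol": 5.0},
    {"age": 55, "cholesterol": 6.0},
    {"age": 65, "cholesterol": 7.0},
]

before = expected_cases(population)
after = expected_cases(population, shift=0.5)
print(f"Expected cases before shift: {before:.3f}")
print(f"Expected cases after shift:  {after:.3f}")
print(f"Potential reduction:         {before - after:.3f}")
```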
The second approach is the same as the method of assessing intervention benefit (described below), except that a harm is assessed instead of a benefit. In this method, the baseline risk for those exposed to the risk is multiplied by the relative risk of the exposure.
The third and more common method of assessing the contribution of a specific risk factor to population outcomes is a method proposed by Levin in the 1950s, often called the attributable fraction in the population or population-attributable risk.22,24–26 Conceptually, Levin's method and multivariable risk algorithms share the same basic approach. The differences between the two methods are their time perspective (Levin's method is historically focused, while predictive algorithms are more commonly used to estimate future events) and a focus on the population exposed to the risk factor (risk algorithms) versus the total population of exposed and non-exposed people (Levin's method). Levin's method becomes progressively more challenging to implement for many uses, such as in the setting of multiple risks, when assessing the impact of varying multiple risks, and when risks vary across many subgroups (age, sex, socioeconomic position, etc).25
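For a single dichotomous exposure, Levin's formula is usually written as PAF = p(RR − 1) / (1 + p(RR − 1)), where p is the prevalence of exposure in the total population and RR is the relative risk. A minimal sketch, with a hypothetical function name and example values:

```python
def levin_paf(prevalence, relative_risk):
    """Population-attributable fraction for one dichotomous exposure (Levin)."""
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    if relative_risk <= 0.0:
        raise ValueError("relative risk must be positive")
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical example: 25% of the population exposed, relative risk of 3.
print(f"{levin_paf(0.25, 3.0):.3f}")
```

The single-exposure form shown here is what becomes hard to extend when several exposures and subgroups must be handled together, which is the limitation noted above.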
Assessing the potential benefit of health interventions
Estimating the potential population benefit of health interventions is straightforward, particularly when the relative benefit of an intervention is known. In the clinical setting, risk algorithms are used to describe potential individual efficacy or effectiveness in absolute terms such as number needed to treat. In population health planning, the analogous uses of risk algorithms are the calculation of community efficacy and community effectiveness. Efficacy (the intervention benefit in the ideal setting) is calculated by multiplying an individual's baseline risk (clinical setting) or average population baseline risk (population setting) by the relative benefit of an intervention. Intervention efficacy is estimated using intervention studies specifically designed to estimate efficacy. Community efficacy is calculated by multiplying average absolute benefit by the size of the target population. Community effectiveness (Equation 1) is calculated by adjusting community efficacy by various attenuating factors that reduce intervention coverage: the level of awareness, access, screening and diagnosis in the population targeted; compliance of providers; and adherence of consumers.27
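The chain of calculations described above can be sketched in a few lines. The sketch assumes that each attenuating factor is expressed as a proportion between 0 and 1 and that the factors combine multiplicatively; that multiplicative form is an assumption made here for illustration, not a reproduction of the article's Equation 1. All names and numbers are hypothetical.

```python
from math import prod


def community_efficacy(mean_baseline_risk, relative_benefit, target_population):
    """Events avoided under ideal conditions.

    absolute benefit = baseline risk * relative benefit (relative risk reduction)
    community efficacy = absolute benefit * size of target population
    """
    absolute_benefit = mean_baseline_risk * relative_benefit
    return absolute_benefit * target_population


def community_effectiveness(efficacy_events, attenuating_factors):
    """Scale community efficacy by proportions such as access and adherence.

    Assumes the factors multiply together (an illustrative assumption).
    """
    for name, value in attenuating_factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a proportion between 0 and 1")
    return efficacy_events * prod(attenuating_factors.values())


# Hypothetical inputs: 20% mean baseline risk, 25% relative risk reduction,
# and a target population of 100 000 people.
ideal = community_efficacy(0.20, 0.25, 100_000)
real_world = community_effectiveness(
    ideal,
    {"access_and_diagnosis": 0.8, "provider_compliance": 0.5, "consumer_adherence": 0.5},
)
print(f"Community efficacy: {ideal:.0f} events avoided")
print(f"Community effectiveness: {real_world:.0f} events avoided")
```

Replacing the single mean baseline risk with individual predicted risks from an algorithm would let the same calculation be repeated for subgroups whose baseline risk differs from the population average.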
Box 2 describes how clinical risk algorithms (the Framingham risk tool and SCORE) have been used in a population setting to evaluate the community efficacy of various international guidelines for cardiovascular disease prevention.
Case example of population use of a clinical risk algorithm
The Framingham, SCORE and other cardiovascular risk algorithms are used in the population setting less often than in the clinical setting because they require information on measures such as blood pressure and lipid levels that are not routinely collected in populations. That said, population health surveys that included clinical measures have been used to estimate population risk of cardiovascular disease and the population impact of risk factors such as cholesterol, or medical prevention such as cardiovascular prevention guidelines.3 ,28 We found that small differences between international cardiovascular prevention programmes can have a large effect on the number of people recommended statins and on the overall population effectiveness of the recommendations.28 We speculated that the least effective and efficient prevention guidelines did not fully consider the population implications of their recommendations.
Cardiovascular risk algorithms are also incorporated into population simulation models that generate risk factor distributions in populations. Such models assess the population impact of risk factors such as sodium (44 000 to 92 000 deaths a year in the USA)5 and medical treatment such as cardiovascular prevention guidelines.29
When calculating community effectiveness, historical estimates of disease incidence are often used, rather than predictive multivariable algorithms, to approximate current population risk.22 Each method has distinct advantages. The main advantage of historical estimates is their widespread availability, whereas predictive risk algorithms have at least three important advantages:
First, risk algorithms use risk exposure data to generate estimates of baseline risk, so they are helpful when incidence data are not available or reliable. For example, the incidence of heart disease is difficult to estimate routinely in many target populations because many events are ‘silent’ (patients are asymptomatic or do not go to hospital). However, these events are often captured during the development and validation of the risk algorithm. This was the case with the Framingham and SCORE algorithms. Similarly, disease incidence is often missing for specific and important target populations, such as those defined by socioeconomic position or by risk exposure (eg, tobacco users).
Second, as previously discussed, risk algorithms are helpful when there is large variation in baseline risk among individuals or groups, often reflecting multiple risk exposures that contribute to risk. In these situations, overall or average baseline risk can be used, but may produce a less accurate approximation of risk than can be achieved using the more nuanced information available from applying a risk algorithm to population data. One particular concern with using average population disease incidence is that it will underestimate the community effectiveness of interventions for disadvantaged groups, who typically have a higher-than-average baseline risk.30
Third, when a target population's characteristics and risk exposures are changing, observed historic estimates do not reflect the current baseline risk of a population.
How are predictive risk algorithms developed for the population health setting?
Risk algorithms for the population setting are developed using the same approach as for the clinical setting, except that developers typically place greater emphasis on calibration/accuracy of the algorithm and on simple measures of exposure available in routinely collected population data.
Regardless of their intended use, risk algorithms are developed using data from a cohort that captures individuals' risk exposure at baseline. Clinical risk algorithms for common diseases generally contain clinical information such as biomedical measures (eg, blood pressure and lipid levels in the case of cardiovascular algorithms) and can often be equally discriminating whether or not they include detailed measures of risk exposure such as genetic and specific disease markers (eg, C-reactive protein in the case of cardiovascular disease).31–33 Indeed, inclusion of too many specific markers can result in the algorithm being ‘overfitted’ with a consequent erosion of calibration.14
Judicious development can lead to algorithms that are simple and easy to use without compromising either discrimination or accuracy. That stated, Diamond has outlined the challenge that discrimination and accuracy mathematically compete with each other: “Even if we could all agree on the ‘justifiable proportions’, the ingredients will just not mix. At best they can be made to form a very unstable emulsion—akin to Béarnaise.”14 With this in mind, users of algorithms in a population setting will typically prefer a well-calibrated or accurate algorithm over one that is highly discriminating, because of the implications for resource allocation inherent in population health decisions. In predicting the number of future cases of a disease, calibration is essential, whereas a highly discriminating algorithm offers no appreciable advantage over a less discriminating one. That is not to say that discrimination is irrelevant for other population uses; indeed it is central to the debate about whether disease is best prevented using community-wide versus high-risk approaches. Because a less discriminating algorithm makes population risk appear more diffused, an overemphasis on calibration at the expense of discrimination will favour the efficacy and efficiency of community-wide preventive strategies over a high-risk approach.
Similar care is needed when balancing discrimination and calibration for equity uses. Inequities between social groups will not be identified or will be under-represented if a risk algorithm lacks discriminating power. The lack of discrimination should be apparent if the calibration of the algorithm is specifically assessed across social groups. When discrimination is lacking, there will be an underprediction of risk in disadvantaged social groups and vice versa in advantaged groups.
Multivariable risk tools are the most discriminating risk approach available, and so are well suited to examining social inequities and to assessing the efficacy of interventions or strategies to reduce health inequities.15 Rose did not explicitly discuss population risk and inequity, but the implicit conclusion of diffused population risk (by definition, risk that is similar across the population or across social groups) is that inequities across social groups are apparently small. Restating earlier points, single risk factors lack discriminating power unless the relative risk is large, so single risk factors are generally not well suited to examining health inequities between social groups.
An additional difference between clinical and population uses of risk algorithms is consideration of causal exposures. For both uses, exposures that are causally related to outcomes should be assessed carefully for inclusion in an algorithm. However, for purposes that are solely predictive, as is often the case with clinical prediction, a causal exposure need not be added to an algorithm if it does not improve predictive properties (discrimination and calibration). In the population setting, even more care is needed when examining causal exposure, particularly when the intended use of the algorithm is to evaluate the population burden or risk of a causal exposure; the relative risk of an exposure is high; the exposure is common; or the exposure varies across a population that is important for algorithm use. Similarly, algorithms should be reassessed as new evidence of exposures emerges.
How are risk algorithms assessed for the population setting?
Methodological standards for evaluation of clinical risk algorithms are largely applicable to the population setting.10 However, the previously noted differences in algorithm development will affect how population health users assess a risk algorithm. Table 2 summarises two groups of characteristics of robust algorithms for population planning. ‘Useful’ characteristics reflect practical implications that are often the focus of algorithm users in the population setting. ‘Valid’ characteristics describe analytical properties of an algorithm in relation to its intended uses. These properties focus on discrimination and calibration and also consider the causal relationship between risk exposures and disease outcomes.
To plan services and prevention strategies, health planners at all levels and with all healthcare systems require accurate estimates of disease risk. The most accurate and discriminating method of generating such estimates is the use of multivariable risk algorithms. Although indispensable at the clinical or individual level, they are not often used for population health. Greater development and use of risk algorithms for population health planning is warranted: routinely collected data that can be used for their development are increasingly available, and risk algorithms developed using such data are often as discriminating and accurate as tools using more detailed but harder-to-collect clinical data.
What is already known on this subject
In the clinical setting, predictive risk algorithms are embedded in clinicians' daily practice as the primary tool to estimate individual risk of future disease.
Examination of population risk and risk diffusion is a cornerstone of population-based health planning, as described by Geoffrey Rose. Population risk is often assumed to be diffused, with analytical assessment often missing.
What this study adds
Routinely collected population data can be used to develop robust risk algorithms that offer accurate, discriminating population risk estimates.
Risk algorithms developed for the population setting offer a range of opportunities to support health planning, including the projection of disease cases, the description of population risk, and the evaluation of health risks and preventive interventions.
DCM holds a CIHR/PHAC chair in applied public health. KW holds the Canada Research Chair in public health policy. We are grateful to Amy Zierler for editorial assistance. This study was supported by Simulation Technology for Applied Research, a CIHR new emerging team, and the Population Health Improvement Research Network that receives funding from the Ontario Ministry of Health and Long-term Care. The opinions, results and conclusions reported in this paper are those of the authors and are independent of the funding sources. No endorsement by the authors' host agencies is intended or should be inferred.