What does self rated health measure? Results from the British Whitehall II and French Gazel cohort studies
  1. Archana Singh-Manoux1,2,
  2. Pekka Martikainen3,
  3. Jane Ferrie2,
  4. Marie Zins1,
  5. Michael Marmot2,
  6. Marcel Goldberg1
  1. 1INSERM, U687, HNSM, Saint-Maurice Cédex, France
  2. 2Department of Epidemiology and Public Health, University College London, UK
  3. 3Population Research Unit, Department of Sociology, University of Helsinki, Finland
  1. Correspondence to:
 Dr A Singh-Manoux
 INSERM, U687, HNSM, 14 rue du Val d’Osne, 94415 Saint-Maurice Cédex, France; Archana.Singh-Manoux{at}


Objectives: To investigate the determinants of self rated health (SRH) in men and women in the British Whitehall II study and the French Gazel cohort study.

Methods: The cross sectional analyses reported in this paper use data from wave 1 of the Whitehall II study (1985–88) and wave 2 of the Gazel study (1990). Determinants were either self reported or obtained through medical screening and employer’s records. The Whitehall II study is based on 20 civil service departments located in London. The Gazel study is based on employees of France’s national gas and electricity company (EDF-GDF). SRH data were available on 6889 men and 3403 women in Whitehall II and 13 008 men and 4688 women in Gazel.

Results: Correlation analysis was used to identify determinants of SRH from 35 measures in Whitehall II and 33 in Gazel. Stepwise multiple regressions identified five determinants (symptom score, sickness absence, longstanding illness, minor psychiatric morbidity, number of recurring health problems) in Whitehall II, explaining 34.7% of the variance in SRH. In Gazel, four measures (physical tiredness, number of health problems in the past year, physical mobility, number of prescription drugs used) explained 41.4% of the variance in SRH.

Conclusion: Measures of mental and physical health status contribute most to the SRH construct. The part played by age, early life factors, family history, sociodemographic variables, psychosocial factors, and health behaviours in these two occupational cohorts is modest.

  • self rated health
  • stepwise regression

The single item measure of self rated health (SRH) has been widely reported to be a predictor of mortality after adjusting for traditional risk factors, sociodemographics, and measures of health status.1–4 Reviews on a total of 46 studies in various settings have found this relation to be robust.5,6 Attempts to unravel this association in the past have focused on controlling for a wide variety of risk factors: family history,7 measures of health,2,8–12 chronic diseases,4,7,8,12–19 functional ability,7,10,12,14,17,19–21 sociodemographic variables,2–4,7,9,10,12–16,21,22 psychosocial factors,2,10,15,21–24 and health behaviours.2–4,7–9,15,16 This line of inquiry suggests that SRH is a multidimensional and a holistic measure,5,25 even though its correlates have rarely been examined in empirical analyses.

Understanding the relation between SRH and health or mortality requires further investigation into the criteria by which people judge their health. The aim of this paper is to identify the determinants of SRH in two different cohorts, in separate analyses for men and women. The comparative framework is aimed at assessing whether the correlates of SRH are similar in different cultural contexts. We use a range of variables, traditionally used as controls in the examination of the SRH-mortality relation, to examine the extent to which they predict SRH. Six categories of variables are used: age, early life factors and measures of family history, sociodemographic variables, psychosocial factors, health behaviours, and measures of health and disease.


Study populations

Data for the analyses come from two European cohort studies: the British Whitehall II study and the French Gazel cohort study.

The Whitehall II study

The Whitehall II study was established in 1985 as a longitudinal study to examine the socioeconomic gradient in health and disease among 10 308 civil servants (6895 men and 3413 women).26 All civil servants aged 35–55 years in 20 London based departments were invited to participate by letter, and 73% agreed. Baseline examination (phase 1) took place during 1985–1988, and entailed a clinical examination and a self administered questionnaire containing sections on demographic characteristics, health, lifestyle factors, work characteristics, social support, and life events. Clinical examination included measures of blood pressure, anthropometry, biochemical measurements, neuroendocrine function, and subclinical markers of cardiovascular disease. Subsequent phases of data collection have alternated between postal questionnaire alone and postal questionnaire accompanied by a clinical examination. Since baseline six phases of data collection have been completed, with the most recent phase (phase 7) completed in 2004. The University College London ethics committee approved the study.

The GAZEL cohort study

The GAZEL cohort was established in 1989, on employees of France’s national gas and electricity company: Electricité de France-Gaz de France (EDF-GDF). Further details of this study can be found elsewhere.27 At baseline, 20 625 (15 011 men and 5614 women), aged 35–50, gave consent to participate in this study. The study design consists of an annual questionnaire used to collect data on health, lifestyle, individual, familial, social and occupational factors, and life events. Various sources within EDF-GDF provide additional data about Gazel participants. Occupational and personal data are updated through human resources department files. Medical data on sick leaves and incidence of cancer and coronary heart diseases come from the company’s health insurance department. Occupational physicians collect data on working conditions and occupational exposures. Sources outside EDF-GDF provide data on causes of death, health care use, and admissions to hospital. Every five years, volunteers are invited to visit a health centre run by the national health insurance fund, where they undergo a full standardised medical examination.


All the variables used in the analyses come from wave 1 (1985–88) of the Whitehall II study and wave 2 (1990) of the Gazel study, as the Gazel wave 1 data lacked several critical variables. The earliest suitable wave of each study was used to reduce bias related to loss to follow up.

Self rated health

The Whitehall II measure of SRH asked respondents: “Over the past 12 months would you say your health has been—very good, good, average, poor or very poor”. The Gazel participants responded on an eight point scale, anchored by 1 = ”very good” and 8 = ”very bad”, to the following question: “How would you judge the state of your general health?”

Explanatory variables

Tables 1A and 1B show the explanatory variables, definitions, and analysis categories for the Whitehall II and Gazel studies, respectively. These variables have been grouped into six theoretical categories: age, early life factors and measures of family history, sociodemographic variables, psychosocial factors, health behaviours, and measures of health and disease.

Table 1A

 Variables used to assess self rated health in the Whitehall II data

Table 1B

 Variables used to assess self rated health in Gazel data

Statistical analysis

Analyses were carried out separately for men and women and for the two cohorts. The first step consisted of examining correlations to decide which variables would be used to predict SRH. All variables that were significantly correlated with SRH at p<0.05 were then used as determinants of SRH. We used multiple linear regressions to build models to predict SRH.28 Linearity of association between the determinants and SRH was checked using the one way analysis of variance test of linearity; further examination of residual scatterplots found no important deviations from the assumptions of normality and linearity. In the first set of analyses, the variables were entered in a hierarchical multiple regression; the order of entry of variables (or sets of variables) was theoretically determined. We chose the order in an attempt to reflect temporality over the life course; early life measures were entered before later life measures, health behaviours before health itself, and so on. The order of entry of variables was as follows: age, early life factors and measures of family history, sociodemographic variables, psychosocial factors, health behaviours, and measures of health and disease. F tests were used to compute the significance of each set of variables; the resulting change in r2 reflected the additional variance in the outcome explained by that set. The change in r2 was tested at p<0.05 to check if it was significantly different from zero.
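The F test for the change in r2 between two nested blocks can be sketched as follows. This is a minimal Python illustration of the standard formula, not the SPSS procedure used in the paper; the function name, the number of cases, and the numbers of predictors are hypothetical values chosen only to show the calculation.

```python
def f_change(r2_reduced, r2_full, n, k_reduced, k_full):
    """F statistic for the increase in r2 when a block of predictors is added.

    n is the number of cases; k_reduced and k_full are the numbers of
    predictors (excluding the intercept) before and after adding the block.
    """
    df_block = k_full - k_reduced   # numerator degrees of freedom
    df_error = n - k_full - 1       # denominator degrees of freedom
    gain = r2_full - r2_reduced     # change in r2 attributable to the block
    return (gain / df_block) / ((1.0 - r2_full) / df_error)


# Hypothetical inputs (assumptions, not values taken from the tables).
f_stat = f_change(r2_reduced=0.206, r2_full=0.427, n=1000, k_reduced=10, k_full=15)
print(f"F({15 - 10}, {1000 - 15 - 1}) = {f_stat:.2f}")
```

The resulting statistic would be compared with the F distribution on the two degrees of freedom shown; that lookup is left out here to keep the sketch free of external libraries.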

The next step involved a stepwise method of multiple regression to identify a small subset of determinants of SRH. This analysis is appropriate here as the ratio of cases to determinants is much larger than the recommended 50:1.29 The basic procedures involved are: identifying an initial model with one determinant, the one most highly correlated with SRH; iteratively “stepping”, that is, repeatedly changing the model at the previous step by adding or removing determinants in accordance with the defined “stepping criteria”; and terminating the search when stepping is no longer possible given the stepping criteria. Given the large number of cases in the analysis, the stepping criteria were set at p = 0.001 for entry and p = 0.01 for exit.30 Furthermore, only variables that explained more than 1% of the variance (r2>0.01) in SRH were retained as determinants. Stepwise regression provides an objective method for selecting variables that maximise r2 with the smallest number of variables used. As a large number of determinants were used, multicollinearity was checked throughout using tolerance statistics, with the threshold set here at 0.50. All analyses were carried out using SPSS 11.5.1 (SPSS for Windows, rel 11.0.1, Chicago, SPSS, 2001).
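The selection logic can be illustrated with a simplified sketch in Python. It is an assumption-laden stand-in rather than the SPSS routine: it performs forward steps only, uses the 1% gain in r2 as its sole stopping rule, and omits the p value entry and exit criteria. The data are simulated and the variable names are hypothetical.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(y, columns):
    """r2 of a least squares fit of y on the given columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(row[j] * row[k] for row in design) for k in range(p)] for j in range(p)]
    xty = [sum(row[j] * y[i] for i, row in enumerate(design)) for j in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def forward_select(y, predictors, min_gain=0.01):
    """Add the predictor with the largest r2 gain until no gain exceeds min_gain."""
    selected, current_r2 = [], 0.0
    remaining = dict(predictors)
    while remaining:
        gains = {
            name: r_squared(y, [predictors[s] for s in selected] + [col]) - current_r2
            for name, col in remaining.items()
        }
        best = max(gains, key=gains.get)
        if gains[best] <= min_gain:
            break
        selected.append(best)
        current_r2 += gains[best]
        del remaining[best]
    return selected, current_r2


# Simulated data: the outcome depends on two predictors and not on the third.
random.seed(0)
n = 500
symptoms = [random.gauss(0, 1) for _ in range(n)]
absence = [random.gauss(0, 1) for _ in range(n)]
unrelated = [random.gauss(0, 1) for _ in range(n)]
srh = [2.0 * s + 1.0 * a + random.gauss(0, 1) for s, a in zip(symptoms, absence)]

selected, r2 = forward_select(
    srh, {"symptoms": symptoms, "absence": absence, "unrelated": unrelated}
)
print(selected, round(r2, 3))
```

A full stepwise procedure would also re-test the variables already in the model at each step and remove any that no longer meet the exit criterion.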


SRH data were available on 6889 men and 3403 women (10 292 overall; 99.85% of total number at baseline) in the Whitehall II study and 13 008 men and 4688 women in the Gazel study (17 696 overall; 85.80% of total number at baseline). Because of missing values, our analyses, at best, include 7043 (68.43% of those with SRH data) people in Whitehall II and 17 438 (98.54% of those with SRH data) in the Gazel study. Lower numbers in the Whitehall II data are related to four different versions of a questionnaire being used at phase 1 of the study; the first 2913 participants answered versions 1–3 of the questionnaire, which did not have several key variables used here (for example, longstanding illness). Thus, many of the bivariate analyses are based on smaller numbers. The key variable associated with missing data in both cohorts is employment grade, with data more likely to be missing among people in low employment grades.
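The sample accounting above can be recomputed from the counts quoted in the text. The short Python sketch below does only that arithmetic; the dictionary and function names are hypothetical, and percentages are rounded to one decimal place.

```python
# Counts quoted in the text: baseline participants, those with SRH data,
# and the largest number of cases available for analysis.
baseline = {"Whitehall II": 6895 + 3413, "Gazel": 15011 + 5614}
with_srh = {"Whitehall II": 6889 + 3403, "Gazel": 13008 + 4688}
analysed = {"Whitehall II": 7043, "Gazel": 17438}


def pct(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100.0 * part / whole, 1)


coverage = {
    cohort: {
        "srh_of_baseline": pct(with_srh[cohort], baseline[cohort]),
        "analysed_of_srh": pct(analysed[cohort], with_srh[cohort]),
    }
    for cohort in baseline
}
for cohort, shares in coverage.items():
    print(cohort, shares)
```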

Tables 2A and 2B present bivariate analyses, correlation coefficients here, to show the association between SRH and its correlates in the Whitehall II and the Gazel data, respectively. The correlations were deemed to be significant if p<0.05. Table 2A (Whitehall II results) shows that in men four (age, father’s social class, job demand, and alcohol consumption) of the 35 determinants considered and two among women (father’s social class and number of medicines used in the past 14 days) did not have a significant association with SRH. These variables were dropped in subsequent analyses. Symptom score, sickness absence, and the number of recurring health problems were the three variables most strongly associated with SRH in the Whitehall II data (table 2A).

Table 2A

 Whitehall II: correlations between self rated health† and determinants

Table 2B

 Gazel: correlations between self rated health† and determinants

Table 2B presents the correlations between 33 determinants and SRH in the Gazel data. Three of these variables in men (height, father’s social class, support received during life events) and nine in women (height, father’s social class, myocardial infarction in parents, cancer in parents, cancer in siblings, smoking, alcohol consumption, frequency of eating fresh fruit, and frequency of eating fresh vegetables) were not associated with SRH and are not considered in further analyses. The measures most strongly related to SRH in the Gazel data in bivariate analysis were: a measure of physical tiredness, number of health problems in the past year and emotional reactions, as measured by the Nottingham health profile (NHP).32 The principal differences between the bivariate associations in Gazel and Whitehall II stem from the lack of association between SRH and measures of early life factors, family history, and health behaviours among the Gazel women.

Table 3 presents the results of the hierarchical multiple regression, where determinants were entered in theoretically defined blocks. The top half of table 3 presents results for Whitehall II data; 31 correlates in men and 33 in women, grouped in five blocks for men (as age was not significantly correlated with SRH in men) and six blocks for women. In both men and women, all the blocks considered for analysis made a significant contribution to the prediction of SRH. The largest improvement in r2, in both men and women, was from model 5 to 6, reflecting the importance of health variables in predicting SRH. All the variables considered together explained 36.1% of the variance in SRH in men and 36.9% in women.

Table 3

 Predicting self rated health using categories of determinants. Whitehall II and Gazel

The lower half of table 3 shows results for the Gazel data. All the blocks considered for analyses (none of the measures of early life factors and family history were considered for further analyses in women) predicted SRH significantly. The largest improvement in r2 in men was from model 5 to 6 (change in r2 = 0.221), followed by model 3 to 4 (change in r2 = 0.171). Among women, the largest improvement was from model 3 to 4 (change in r2 = 0.231), followed by model 5 to 6 (change in r2 = 0.191). Thus measures of health and psychosocial factors were major determinants of SRH in the Gazel data. Overall, the variables considered explained 42.7% of the variance in SRH in men and 44.3% in women.

Table 4A presents the results from stepwise regression in the Whitehall II data. Five variables explained a total of 33% of the variance in SRH in men and 35% in women. Symptom score was the strongest correlate of SRH in both men and women, accounting for 15.4% of the variance in SRH in men and 19.7% in women. Sickness absence was the second strongest correlate, accounting for 9.6% of the variance in SRH in men and 8.1% in women. The difference between men and women relates to the fourth correlate: affect balance score in men and GHQ in women. Affect balance score33 is a measure of psychological wellbeing whereas GHQ34 measures minor psychiatric morbidity. These two measures are highly correlated in men (r = −0.62, p<0.0001) and women (r = −0.65, p<0.0001) in the Whitehall II study. In analyses of men and women together (table 4A), it is the GHQ that comes out as a correlate; the other four remain the same. Longstanding illness and the number of recurring health problems were the other two predictors, explaining about 4% and 1.2% of the variance, respectively.

Table 4A

 Whitehall II: predicting SRH using stepwise regression†

What this paper adds

  • Data from two large European cohorts were used to examine the determinants of self rated health from six categories of variables: age, early life factors and measures of family history, sociodemographic variables, measures of health and disease, health behaviour, and psychosocial factors.

  • Our results show that it is mostly measures of health, mental and physical, that are independent determinants of self rated health.

Table 4B presents the results from a similar stepwise regression in the Gazel data. In men, five variables met the stepping criteria and explained 41.5% of the variance in SRH. Feeling “physically tired” explained most of the variance in SRH in men (32.1%) and women (34.6%). One of the determinants in men is a non-health measure: work satisfaction, explaining 1% of the variance in SRH. Among women, four determinants (physical tiredness, number of health problems in the past year, emotional reactions (NHP),31 and physical mobility (NHP)31) explained 43.9% of the variance in SRH. In analyses of men and women together, four measures of health (physical tiredness, number of health problems in the past year, physical mobility (NHP),31 and number of prescription drugs) explained 41.4% of the variance in SRH.

Table 4B

 Gazel: predicting SRH using stepwise regression†

The solution from the stepwise regression (tables 4A and 4B) was checked with other forms of statistical regression (forward selection and backward deletion) and no differences were found. There were no problems of collinearity in the final stepwise models; all tolerance levels were over 0.50.
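Tolerance for a determinant is conventionally defined as one minus the r2 obtained by regressing that determinant on the other determinants in the model, so with only two determinants it reduces to one minus their squared correlation. The Python sketch below covers that two-determinant case only, applied to the affect balance and GHQ correlations reported above. The function name is hypothetical, and the tolerances for the full models depend on every determinant entered, so these figures are illustrative rather than a reproduction of the reported checks.

```python
def pairwise_tolerance(r):
    """Tolerance when the model holds exactly two determinants with correlation r."""
    return 1.0 - r ** 2


THRESHOLD = 0.50  # tolerance cut-off stated in the statistical analysis section

# Correlations between affect balance score and GHQ reported in the text.
for label, r in [("men", -0.62), ("women", -0.65)]:
    tol = pairwise_tolerance(r)
    print(f"{label}: tolerance = {tol:.4f}, above threshold: {tol > THRESHOLD}")
```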


SRH has two key features: it is subjective and evaluative. However, little is known about its determinants. The purpose of these analyses was to identify its determinants, using multivariate regression analyses, in two prospective cohort studies. Six categories of determinants were considered: age, early life factors and family history, sociodemographic variables, psychosocial factors, health behaviour, and health. These blocks explained 36.1% of the variance in men (with 31 determinants) in the Whitehall II data and 36.9% in women (with 33 determinants). In the Gazel data the variance explained was 42.7% in men (with 30 determinants) and 44.3% in women (with 24 determinants). A search for a smaller set of correlates using stepwise regression found five determinants overall in the Whitehall II data and four in the Gazel data explaining 34.7% and 41.4% of the variance in the two studies, respectively. In both studies, it is essentially measures of health that were independent determinants of SRH.

Before further discussion of the results, the similarities and differences in the two cohorts need to be highlighted. Besides cultural differences between the two, there are also other differences. Although they are both occupation based cohorts, the work content of the employees is quite different. The Whitehall II study is based on employees of the British civil service, all of whom have office based jobs. The Gazel study is based on the employees of the gas and electricity public utilities in France (EDF-GDF), with many manual workers, both skilled and unskilled. At baseline (1989), 14.8% of male and 29.1% of female Gazel participants were classified as being “unskilled” workers by their employer in a three level classification: unskilled workers, skilled workers, and managers.27 The essential similarity between the two cohorts is that participants in both studies had comparatively stable employment. Earlier comparative work shows similar social gradients in sickness absence and SRH in the two cohorts.34

An explicit examination of the determinants of SRH is important as it provides an insight into the interpretation of this concept. The main question relates to whether people use conventional indicators of health or if other elements like family history, psychosocial factors, and socioeconomic position influence the assessment of SRH. In other words, to what extent is the perception of one’s health status determined by traditional measures of health and to what extent is it determined by other factors, which may be correlated with SRH at the bivariate level? Our results show that the statistically significant correlations at the bivariate level (tables 2A and 2B) do not translate to independent associations in multivariate analyses. The salient if unsurprising finding of this study is that SRH is mostly a measure of health; measures of early life factors, family history, sociodemographics, and health behaviours make only minor contributions to SRH, whereas measures of health and, in Gazel, psychosocial factors are important correlates of SRH. It has been suggested that SRH is a multidimensional phenomenon5,25; our results show that it is mostly dimensions of health that are independent determinants of SRH.

Results from the stepwise regression (men and women combined) in the Whitehall II data (table 4A) show that SRH is predicted by measures of current health status (symptom score), health over the past year (sickness absence and recurring health problems over the year), and longer term health problems (longstanding illness), along with a measure of minor psychiatric morbidity (GHQ). Sex differences in the determinants of SRH in Whitehall II are minor; in men it is the affect balance score that is a correlate rather than the GHQ. Sex differences in the Gazel data are more pronounced. The nature of the psychosocial determinant differs by sex: it is “satisfaction with work” among men and “emotional reactions” among women.

The striking aspect of the Gazel results is the importance of the measure “physically tired” in predicting SRH; it explains about 81.2% of the total variance explained (33.6% of 41.4%) by the full model. This result can be explained in two ways. Firstly, it reflects the essential difference between the two cohorts, the results from the Gazel participants reflecting the fact that their work is more physically demanding. This interpretation is supported by results (not shown but available from the authors) of a stepwise regression that excluded the “physical tiredness” variable, which resulted in the variable “work physically tiring” (table 1B) becoming an independent determinant of SRH. The second interpretation is that the measure of “physical tiredness” is a good comprehensive measure of health, as reflected in the strong correlation with SRH.

In both studies only part of the variance was explained by the determinants examined, 34.7% in Whitehall II and 41.4% in the Gazel data. In the Whitehall II data we examined (results not shown) whether other physiological measures (cholesterol, blood pressure) added to the prediction equation and found it not to be the case. However, previous analysis on Gazel data has shown SRH to be linked to diseases, in both cross sectional and longitudinal analysis.35 One could speculate about the unexplained variance, traditionally seen to be made up of determinants not included in the regression equation, measurement error, and individual differences. It is probable that individual history of past health and expected or aspired future health are further determinants of SRH.

Policy implications

Self rated health is a widely used measure of health status, mainly used because of its simplicity and its strong relation with outcomes like mortality. Self rated health is believed to be a multidimensional phenomenon even though the determinants of self rated health have rarely been investigated in empirical analysis. Our analysis shows that this simple measure is in effect a valid measure of health.

Comparison with other studies

SRH is a widely used measure in epidemiological studies; a search on Medline (combination self rated health OR self assessed health OR self reported health OR self perceived health AND 2002 as year of publication) found 1991 papers.36 However, an explicit quantitative analysis of the independent determinants of SRH remains elusive in the research literature. An exception is a study by Jylhä and colleagues that examined the correlational structure of SRH in an Italian and Finnish sample.19 This study used logistic regression to examine the relation between good (very good and fairly good) and poor (average, fairly poor, and poor) health using a smaller range of determinants (number of diseases, functional ability, number of symptoms, problems with vision and hearing, number of medical drugs being used, and education). All determinants were significantly related to SRH in multivariate analyses; the odds ratios associated with “functional ability” and “number of symptoms” were the highest. We were able to extend the analyses by including a wider range of correlates and quantifying the unique variance in SRH explained by the correlates.

Study strengths and limitations

To our knowledge, this is the first study to investigate explicitly the determinants of SRH by quantifying the amount of variance in SRH explained by independent determinants. Data come from two large European cohorts, providing sufficient numbers in each cohort to permit “stepwise regression” without the risk of “overfitting”.29,30 We were able to make an objective assessment of the predictive power of several variables. Several categories of determinants were examined in this study. There is evidence to suggest that SRH forms a continuum from poor through average to good health,37,38 justifying our use of linear regression models and further permitting comparability of a five point and an eight point measure of SRH. Furthermore, it has been shown that different ways of treating the SRH variable (dichotomous, ordinal, etc) yield similar results.39 A number of limitations should also be noted. Firstly, data here are from employees with stable jobs and cannot be assumed to represent general populations. However, similarity of results in the two contexts provides support for the generalisability of our findings, at least to people in this age group. A recent study40 examined the 59 year longitudinal trajectory of SRH and concluded that it was comparatively stable until age 50; the predictors of SRH beyond this age might be different. Secondly, our analyses are exploratory in nature; further work with other measures of health is required to delineate the determinants of SRH and possible sex differences. Furthermore, the determinants in the two studies are not strictly comparable; this is rarely possible when international comparisons are being made. Rather than restrict analyses to measures where some degree of equivalence could be established, we chose to use multiple measures from the two studies to permit an assessment of the principal determinants of SRH. Finally, data are not missing at random; non-response is higher among persons who are socially disadvantaged and likely to be in poorer health. This is likely to lead to some underestimation of the association between SRH and its determinants.

In conclusion, our results show the determinants of SRH to be measures of physical and mental health. The “multidimensional” nature attributed to SRH mainly translates to it being a holistic measure of health rather than a measure of non-health circumstances as well.


AS-M is supported by a “Chaire d’excellence” award from the French Ministry of Research. PM is supported by the Academy of Finland. JEF is supported by the MRC (grant number G8802774). The Whitehall II study is supported by grants from the Medical Research Council; British Heart Foundation; Health and Safety Executive; Department of Health; National Heart Lung and Blood Institute (HL36310), US, NIH; National Institute on Aging, US, NIH; Agency for Health Care Policy Research (HS06516); and the JD and CT MacArthur Foundation Research Networks on Successful Midlife Development and Socio-economic Status and Health. We thank all participating civil service departments and their welfare, personnel, and establishment officers; the Occupational Health and Safety Agency; the Council of Civil Service Unions; all participating civil servants in the Whitehall II study; and all members of the Whitehall II study team. The Gazel cohort is supported by Electricité de France-Gaz de France (EDF-GDF). We would like to thank all the staff at INSERM 687; special thanks to Sébastien Bonenfant for his help with the Gazel data.



  • Conflicts of interest: none declared.
