Conceptualisation, development, and evaluation of a measure of unplanned pregnancy
- 1Centre for Sexual and Reproductive Health Research, Department of Public Health and Policy, London School of Hygiene and Tropical Medicine, University of London, London, UK
- 2Health Services Research Unit, Department of Public Health and Policy, London School of Hygiene and Tropical Medicine, University of London
- Correspondence to: Dr G Barrett, Department of Health and Social Care, Brunel University, Borough Road, Middlesex TW7 5DU, UK
- Accepted 10 December 2003
Study objective: To develop a measure of unplanned pregnancy that is valid, reliable, and appropriate in the context of contemporary demographic trends and social mores and can be used in a variety of situations, including the production of population prevalence estimates.
Design: A two stage study design: qualitative (inductive) methods to delineate the construct of pregnancy planning, and quantitative/psychometric methods to establish the means of measurement.
Setting: Eight health service providers (comprising 14 clinics, including antenatal, abortion, and one general practitioner) across London, Edinburgh, Hertfordshire, Salisbury, and Southampton in the UK.
Participants: Samples comprised a mixture of pregnant (continuing pregnancy and opting for abortion) and recently pregnant (post-abortion and postnatal) women. At the qualitative stage, 47 women took part in depth interviews (20 of whom were re-interviewed after the birth of their baby). Items were pre-tested with 26 women, and two psychometric field tests were carried out with, respectively, 390 and 651 women.
Main results: A six item measure of unplanned pregnancy was produced. Psychometric testing demonstrated the measure’s high reliability (Cronbach’s α = 0.92; test-retest reliability = 0.97) and high face, content, and construct validity. Women’s positions in relation to pregnancy planning are represented by the range of scores (0–12).
Conclusions: A psychometric measure of unplanned pregnancy, the development of which was informed by lay views, is now available. The measure is suitable for use with any pregnancy regardless of outcome (that is, birth, abortion, miscarriage) and is highly acceptable to women.
The concept of “unplanned pregnancy” is widely used in health research and policy. Attempts to measure it have been numerous, varying from studies in which the concept is assumed to be self evident to those in which more sophisticated measurement strategies have been used.1–5 The approach taken by large national surveys has tended towards the latter, eliciting planning status by means of multi-dimensional questions, probing (in various combinations) intentions, contraceptive use, reactions to pregnancy, timing of pregnancy plans, and family size intentions.6–12 The most influential of these surveys has been the US National Survey of Family Growth12 whose forms of measurement have been widely adopted.13–15
In recent years, however, there has been growing awareness of the limitations of existing questions.5,16–21 No new estimates of unplanned pregnancy have been produced in Britain since 1991, and in the US National Survey of Family Growth items have been added incrementally to ensure validity.12,22 As most questions currently used to assess unplanned pregnancy were developed several decades ago, before legal abortion was available and when the primary concern was with excess fertility in marital relationships,20 it is not surprising that such questions are becoming dated. Also, as therapeutic advances have effectively raised the upper age limit for pregnancy and as employment opportunities have increased for women, the social context of childbearing has changed. Furthermore, measurement to date has tended to assume congruence between intentions and behaviour despite evidence to the contrary.4,6,7,17
Calls are now being made for a reconsideration of the conceptual basis of unplanned pregnancy.17–19 Improvements to the yield of national fertility surveys will only come, it is said, from “intensive work to refine the measurement of often elusive concepts”.10 In this spirit we began a three year study in 1998 to develop a new measure of unplanned pregnancy for use in Britain. We describe here the development and psychometric validation of the measure.
The overall aim of the study was to develop a measure of pregnancy planning/intention that is valid, reliable, and appropriate in the context of contemporary demographic trends and social mores, and can be used to establish population estimates of unplanned pregnancy. To achieve this we used a two stage study design: (1) qualitative methods to delineate the construct of pregnancy planning; and (2) quantitative/psychometric methods to establish the means of measurement.
To develop a conceptual model of pregnancy planning/intention, we used depth interviews (that is, flexible interviews that use normal modes of conversation) to elicit women’s accounts of the circumstances in which they became pregnant. The aim was to build a model based on the key elements of the interviews through which an understanding of women’s experiences could be gained. (Details of the methodology and qualitative findings have previously been reported.23,24)
As previous evidence had suggested that answers elicited before and after birth may be different,25,26 we conducted follow up interviews, after the birth, with women who had continued their pregnancies. The aim was to assess if, or how, women’s accounts changed over this time period.
Item development and piloting
The conceptual model produced during the qualitative stage informed item development; items were developed, without limit, until the dimensions of the model were adequately represented. The items were piloted with a small sample of women and qualitative interviews were used to check women’s understanding. Amendments to the items were made incrementally during piloting.
Item analysis and selection (first psychometric field test)
We screened items for homogeneity27 using inter-item correlations and Cronbach’s α.28 We then devised a five step strategy for item analysis and selection: (1) Remove items with more than 5% missing data29; (2) Remove any item with a maximum endorsement frequency of ≥80% on any response option30,31; (3) Remove any item with an item-total correlation of <0.2 (refs 30,31); (4) Rank the remaining items according to the item-total correlations and then, starting with the lowest rank, remove items if they correlate highly (that is, >0.75) with another question27,30–32; (5) Return items to the scale, in reverse order of removal, until an α of >0.90 is reached.27,32 (Although a criterion of 0.7 is often cited for internal consistency, we used a more stringent criterion to allow for the possibility that the Cronbach’s α might be lower in future samples.27,32) The strategy enabled us to maximise homogeneity while still maintaining content validity.
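The five step strategy can be sketched in code. The following Python is an illustration only, not the authors' implementation: the function names and data layout are hypothetical, and two details the text leaves open are filled with labelled assumptions (correlations are computed on complete cases only, and the item-total correlation is the "corrected" form, that is, each item against the sum of the other items).

```python
from collections import Counter
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx and syy else 0.0


def cronbach_alpha(columns):
    """Cronbach's alpha for a list of item score lists (at least two items)."""
    k = len(columns)
    totals = [sum(vals) for vals in zip(*columns)]
    item_var = sum(pvariance(c) for c in columns)
    return k / (k - 1) * (1 - item_var / pvariance(totals))


def select_items(rows, items):
    """rows: list of dicts mapping item name -> score (None if missing)."""
    n = len(rows)
    # Step 1: drop items with more than 5% missing data.
    keep = [i for i in items if sum(r[i] is None for r in rows) / n <= 0.05]

    # Step 2: drop items where one response option attracts >= 80% of answers.
    def max_endorsement(i):
        answered = [r[i] for r in rows if r[i] is not None]
        return max(Counter(answered).values()) / len(answered)

    keep = [i for i in keep if max_endorsement(i) < 0.80]

    # Assumption: correlations use complete cases on the surviving items.
    complete = [r for r in rows if all(r[i] is not None for i in keep)]
    col = {i: [r[i] for r in complete] for i in keep}

    def item_total(i, pool):
        # Assumption: "corrected" item-total (item vs sum of the other items).
        rest = [sum(vals) for vals in zip(*(col[j] for j in pool if j != i))]
        return pearson(col[i], rest)

    # Step 3: drop items with an item-total correlation below 0.2.
    keep = [i for i in keep if item_total(i, keep) >= 0.2]

    # Step 4: from the lowest ranked item upwards, drop an item if it
    # correlates above 0.75 with another item still in the scale.
    removed = []
    for i in sorted(keep, key=lambda i: item_total(i, keep)):
        if any(pearson(col[i], col[j]) > 0.75 for j in keep if j != i):
            keep.remove(i)
            removed.append(i)

    # Step 5: return items in reverse order of removal until alpha > 0.90.
    while removed and (len(keep) < 2
                       or cronbach_alpha([col[i] for i in keep]) <= 0.90):
        keep.append(removed.pop())
    return keep
```

The thresholds (5%, 80%, 0.2, 0.75, 0.90) are the ones stated in the text; everything else about the sketch is a design choice made for brevity.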
Evaluating the item reduced measure (second psychometric field test)
A second independent field test was carried out to establish the psychometric properties of the final, item reduced, measure (appendix, available to view on the journal web site http://www.jech.com/supplemental).
Before analysis, missing items were imputed for the 18 women in this field test with incomplete responses, using the method applied to the SF-36 (ref 33): where a subject has completed at least 50% of the items of a scale, the mean score of their completed items is substituted for the missing items.
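The imputation rule described here is simple enough to show as a short sketch. The function name is hypothetical, and leaving the imputed value unrounded is an assumption, as the text does not say whether substituted means were rounded to a valid response option.

```python
def impute_half_scale(responses):
    """Apply the 'at least 50% completed' person-mean rule to one respondent.

    responses: list of item scores, with None marking a missing item.
    Returns the completed list, or None if fewer than half the items
    were answered (no score can then be derived under this rule).
    """
    answered = [r for r in responses if r is not None]
    if len(answered) * 2 < len(responses):
        return None
    person_mean = sum(answered) / len(answered)
    # Assumption: the substituted mean is kept unrounded.
    return [person_mean if r is None else r for r in responses]
```

For example, a respondent who answered five of six items keeps those five answers and receives their own mean on the sixth, whereas a respondent who answered only two of six items receives no score.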
Acceptability of the measure
Acceptability was assessed by examining rates of missing data for the overall score and the distribution of scores. The reading level of the measure was assessed using the Flesch-Kincaid grade level scale.34 Field notes were kept by researchers of women’s experiences of completing the measure; notes were based on the researchers’ observations and informal questioning of women in clinics.
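For readers unfamiliar with the Flesch-Kincaid grade level, it is a linear function of average sentence length and average syllables per word. The sketch below applies the standard published formula to counts supplied by the caller; the function name is hypothetical, and how words, sentences, and syllables were counted for this measure is not described in the text.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level from raw counts of a text.

    grade = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
```

Short sentences and short words both lower the grade, which is why brief, plainly worded questionnaire items tend to score well on this scale.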
Internal consistency was assessed using the Cronbach’s α statistic (>0.7 indicating acceptable reliability28,35), and test-retest reliability was examined in two ways: (1) a standard test-retest where a sub-sample of women were required to complete the repeat measure 7 to 14 days after first completion (an interval comparable to that used for other measures31); and (2) a long term test-retest that only included women who had completed the measure initially when they were pregnant, and then completed the repeat measure some months later, after the birth. The rationale for this second test was to assess the stability of scores before and after birth in light of existing evidence which suggests that women’s reporting is unstable over this period.25,26 In both instances, test-retest reliability was measured using the weighted κ (the non-parametric equivalent of the intra-class correlation coefficient), a score of 0.61–0.80 indicating “substantial” reliability and >0.80 indicating “almost perfect” reliability.36
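As an illustration of the test-retest statistic, the sketch below computes Cohen's weighted κ for two sets of scores from the same women. It is a minimal sketch under stated assumptions: the function name is hypothetical, the statistic is applied to total scores, and quadratic disagreement weights are the default because that is the form usually described as equivalent to the intra-class correlation coefficient. The text does not state which weighting scheme was used.

```python
def weighted_kappa(first, second, categories, power=2):
    """Cohen's weighted kappa for paired ratings on an ordered scale.

    first, second: ratings from the two occasions, in the same subject order.
    categories: every possible rating, in ascending order.
    power: 1 for linear weights, 2 for quadratic weights (assumed default).
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(first)
    # Observed joint proportions.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(first, second):
        observed[index[a]][index[b]] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            weight = (abs(i - j) / (k - 1)) ** power  # disagreement weight
            num += weight * observed[i][j]            # observed disagreement
            den += weight * row[i] * col[j]           # chance disagreement
    return 1 - num / den
```

With scores on the 0–12 scale, `categories` would be `list(range(13))`. Values are then read against the bands quoted in the text (0.61–0.80 "substantial", above 0.80 "almost perfect").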
Content validity was assessed by comparing items in the final item reduced instrument with the conceptual model.
Two methods were used to assess construct validity: principal component analysis for within scale analyses; and hypothesis testing. We used principal component analysis (using varimax rotation and requesting as many factors as there were eigenvalues >1) to test the hypothesis that all variables would load onto one factor.37 For hypothesis testing there were two levels of hypotheses, from the qualitative findings and from the literature (table 1).
Although construct validity is also sometimes tested by considering the relation between the new measure and an established measure of a similar construct (convergent validity) or with a known measure of a different construct (discriminant validity), not enough is known about the nature of the construct of “pregnancy planning” to enable these hypotheses to be formulated. Criterion related validity is usually established by comparing a new psychometric measure with an established measure (ideally a “gold standard”) of the same construct. However, as the absence of an existing measure was the reason for developing this measure, testing (concurrent or predictive) criterion related validity was also not possible. Responsiveness refers to an instrument’s ability to detect change (over time) in a dynamic construct (for example, change in health status); as conception is an event at one point in time and therefore not a dynamic construct, testing responsiveness was not appropriate.
Interpreting the scores
Interpretation of scores is normally an ongoing process over the life of a measure. As a first step in this process, we used a content based method of interpretation,38 using the item score patterns in the second field test and data from the qualitative stage, to provide the contextual detail necessary for initial interpretation of the scores.
As we aimed to develop a measure that could be used to produce population estimates, women who were (or had been) pregnant were our target population. Although many men clearly have an important role in pregnancy planning, we did not include them on the grounds that not all men know about the pregnancies of their partner (or ex-partner), and because relying on information collected from couples, rather than women alone, would introduce substantial biases into a population sample.
All samples were constructed to include women of all ages, both those whose pregnancies were continued and those whose pregnancies ended in abortion. Recruitment of study participants was from eight health service providers (comprising 14 clinics, including antenatal, abortion, and one general practitioner) across London, Edinburgh, Hertfordshire, Salisbury, and Southampton in the UK.
The qualitative sample was purposively sampled according to the above criteria, and for follow up, all women who were eligible and could be contacted were re-interviewed. The pilot/item development sample was also purposively sampled.
As psychometric field test samples must reflect the populations for whom the measure is designed,27,30 unselected clinic populations were invited to take part until the ratio of abortions to live births in the samples was consistent with that in the national population39–42 (that is, abortions comprise 22% of conceptions). Women who were continuing their pregnancy or opting for abortion were recruited in hospital clinics. Three researchers (led by GB) carried out the recruitment. Women were approached directly by the researcher in a waiting room or a side room at some time during their appointment; only occasionally women declined to participate. Postnatal women were recruited in two ways. Firstly, women (14 in first field test and 170 in second field test) were identified from records of recent births at the participating hospitals and were sent the questionnaire via post; response rates were 79% (11) and 67% (112) respectively. Secondly, postnatal women were recruited at community clinics run by health visitors. Sample sizes complied with guidance.43
For the test-retest samples, women were invited to volunteer to complete a second questionnaire at home. Because of issues of confidentiality (particularly the problem of sending material about pregnancy to women’s homes), women undergoing abortion were excluded from this process, and therefore from the standard test-retest. Of the 467 women invited to participate, 340 (73%) agreed. For each of the two test-retests, 121 women were selected (on a quota basis to include a range of ages) to achieve the sample sizes necessary for repeated observations.44
Multi-centre ethical approval was obtained for the study.
The qualitative sample comprised 47 women: 28 were continuing their pregnancies (although one had a miscarriage a couple of days before the interview), two were about to have abortions, and 17 had recently had abortions, most within the past two weeks. Women’s ages ranged from 15 to 43. Women’s educational and occupational levels and marital/relationship situations varied widely.
Of the 27 women in the qualitative sample who subsequently had a baby, 20 were re-interviewed. One declined an interview and six could not be recontacted. The mix of ages and personal circumstances in the sample was, however, maintained. At the time of follow up, the infants’ ages ranged from two to six months, and the time between interviews was seven to ten months.
Twenty six women, aged 16 to 42, took part in the piloting of the items. Fifteen were continuing their pregnancies, five were about to have abortions, and six had had babies in the past three months.
Altogether 390 women took part in the first field test and 651 in the second field test. Table 2 shows the characteristics of both field test samples. The samples were consistent with national data in terms of birth/abortion ratio and were largely consistent in terms of age distribution and marital status, although women born abroad were slightly over-represented45–47 (table 3).
Ninety eight women (81%) completed the repeat measure for the standard test-retest; 90 (74%) were in the seven to 14 day window eligible for analysis.
Ninety women (76%) completed the repeat measure for the long term test-retest; 87 (72%) were eligible for analysis. (Two women had become pregnant again and one woman was still pregnant at 39 weeks.) The interval between the two questionnaires was six months or more for most women.
The circumstances in which women became pregnant are summarised by six thematic areas: (1) expressed intentions; (2) desire for motherhood; (3) contraceptive use; (4) pre-conceptual preparations; (5) personal circumstances/timing; and (6) partner influences. These areas formed the dimensions of the conceptual model (fig 1). The model reflects the complexities of women’s accounts by encompassing a range of positions on each dimension (for example, positive, negative, ambivalent) and by neither requiring, nor assuming, congruence between dimensions.
In the follow up interviews, women were, overall, extremely consistent in their descriptions of the circumstances in which they became pregnant, confirming many features of their earlier interviews spontaneously. Only one woman modified an aspect of her account (her contraceptive use) relating to the conceptual model. (In contrast, greater change was noted regarding decision making after confirmation of pregnancy, and the interviews showed that women could clearly distinguish between their thoughts and feelings about events leading to their pregnancies and their thoughts and feelings about these events later, in light of their new experiences.)
Development and piloting of items
Eleven items were developed from the conceptual model: contraceptive use was represented by two items, personal circumstances/timing by four, partner influences by two, and the remaining dimensions by one item each. During piloting, one item (relating to contraception) was separated into two items, and minor changes to the wording and layout were made.
Item analysis and selection
No item exceeded the threshold of 5% missing data. One question met the exclusion criterion of an endorsement frequency of 80% or more on a single response option and was therefore removed. Table 4 shows the inter-item and item-total correlations of the remaining items. No item had an item-total correlation of <0.2, and applying the strategy of considering the inter-item correlations resulted in eight items (2, 4a, 4b, 4c, 4d, 6, 7, 8) being removed. Cronbach’s α of the remaining three items was 0.78. To achieve an α of >0.90, three items (4b, 6, 7) were returned.
After imputation of missing data, scores were available for all women in the sample. All scores were represented; 2.3% of women scored the minimum (0) and 25.0% the maximum (12). The skew statistic was −0.4; however, visual inspection of the distribution suggested that the scores were negatively skewed and possibly bimodal (with peaks at scores 2 and 12). The readability level of the measure was 6.7 on the Flesch-Kincaid grade level score (that is, suitable for an 11 year old), and most women completed the measure in 60–90 seconds. The measure was well received and did not cause offence.
A new, psychometrically evaluated, measure of unplanned pregnancy is now available for use.
The new measure is based on lay views, rather than professional conceptualisations, of pregnancy planning.
The measure is suitable for use with all women regardless of (intended or actual) pregnancy outcome.
The Cronbach’s α was 0.92. (Item-total correlations ranged from 0.60 to 0.89 and inter-item correlations ranged from 0.44 to 0.83.) For the standard test-retest the weighted κ was 0.97, and for the long term test-retest it was 0.86.
Comparison of the six item measure with the conceptual model showed that content validity had been maintained, with one question representing each dimension of the model. All hypotheses to test construct validity were supported (table 1), suggesting that the scale is indeed measuring the degree to which a pregnancy is planned. The results of principal component analysis confirmed that all variables loaded onto one factor (eigenvalue 4.33), with high factor loadings for each item: qu1 0.70, qu2 0.90, qu3 0.93, qu4 0.90, qu5 0.89, and qu6 0.75.
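The reported eigenvalue and loadings can be cross-checked against each other, because in a principal component analysis of a correlation matrix the eigenvalue of a component equals the sum of its squared loadings. The short sketch below does this arithmetic. The share of variance it derives is not reported in the text and is included only as an illustration; the variable names are hypothetical.

```python
# Factor loadings as reported for the six items.
loadings = {"qu1": 0.70, "qu2": 0.90, "qu3": 0.93,
            "qu4": 0.90, "qu5": 0.89, "qu6": 0.75}

# Eigenvalue of the single component = sum of squared loadings.
eigenvalue = sum(value ** 2 for value in loadings.values())

# With standardised items, total variance equals the number of items,
# so the component's share of variance is eigenvalue / number of items.
variance_share = eigenvalue / len(loadings)

print(round(eigenvalue, 2))      # 4.33, matching the reported eigenvalue
print(round(variance_share, 2))  # 0.72
```

The sum of squared loadings reproduces the reported eigenvalue of 4.33, which implies that the single component accounts for roughly 72% of the variance in the six items.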
Interpreting the scores
The increasing scores of the measure (zero to 12) represent increasing degrees of pregnancy planning/intention and there are no obvious cut points in the scale; each score provides additional information. In terms of producing population estimates, we suggest (on the basis of preliminary interpretation) the division of scores into a minimum of three groups, that is, 10–12 (planned), 4–9 (ambivalent), and 0–3 (unplanned).
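The suggested grouping for population estimates can be written as a small helper. This is a sketch only: the function names are hypothetical, and the scoring of each of the six items as 0, 1, or 2 is an assumption that is consistent with the 0–12 range but is not stated in this text (the items themselves are in the online appendix).

```python
def total_score(item_scores):
    """Sum six item scores into a 0-12 total.

    Assumption: each item is scored 0, 1, or 2.
    """
    if len(item_scores) != 6 or any(s not in (0, 1, 2) for s in item_scores):
        raise ValueError("expected six item scores, each 0, 1, or 2")
    return sum(item_scores)


def planning_group(score):
    """Map a 0-12 total to the three preliminary groups suggested in the text."""
    if not 0 <= score <= 12:
        raise ValueError("score must lie between 0 and 12")
    if score >= 10:
        return "planned"
    if score >= 4:
        return "ambivalent"
    return "unplanned"
```

For example, `planning_group(total_score([2, 2, 1, 2, 2, 1]))` gives `"planned"` for a total of 10. Because the text notes that there are no obvious cut points, analyses that can use the full 0–12 score retain more information than the three groups.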
We developed a six item measure of unplanned pregnancy. Psychometric testing demonstrated the high internal consistency, high stability (standard and longer term), and excellent face, content, and construct validity of the measure. One limitation is that for reasons of confidentiality, the standard test-retest only included women who continued their pregnancies, thus our assessment of test-retest reliability does not provide any information about the stability of the scale when used with women whose pregnancies ended in abortion. Interestingly, the findings of the long term test-retest (and the qualitative follow up interviews) directly contradict previous evidence concerning the stability of women’s reports of pregnancy planning after birth.25,26 The reason for this may be that the items of the measure permit women a wider range of answers and therefore do not force women into categories that may be invalid. The measure was developed in Britain and is therefore appropriate for use with this population. As with other measures, re-validation would be required before application to other countries.
Compared with previous questions used to assess pregnancy planning, the measure has a number of advantages: it makes no assumptions about the nature of women’s relationships; it does not rely on women having fully formed childbearing plans; it does not assume a particular form of family building; and it is suitable for use with any pregnancy regardless of outcome. Because of its conceptual basis, the measure does not presume that women have clearly defined intentions and/or behaviour consistent with intentions. Women may occupy a range of positions in relation to pregnancy planning, and these are represented by the range of scores from zero to 12. The scores also provide more sophisticated information about pregnancy planning than the dichotomous categories of planned and unplanned. The measure is short (only six items) and highly acceptable (that is, easy to understand, inoffensive, and quick to complete), attributes that make it suitable for use in large scale surveys.
Unplanned pregnancy is often used as a proxy indicator of poor sexual health, and reducing the number of unplanned pregnancies is a policy aim of many countries around the world, including the USA and the UK.
Existing methods of eliciting pregnancy planning status have become dated.
A new measure will facilitate the production of reliable estimates of unplanned pregnancy.
The measure, with its conceptual basis, represents a clear break with the forms of measurement found in the previous British surveys and the current US and Demographic and Health Surveys (the last being the main data source of the international family planning movement). As such, the measure avoids the assumption that members of modern (post-demographic transition) societies are universally rational and instrumental in terms of their fertility decisions and control, an assumption that some have seen as characterising research on fertility and fertility change in the 20th century.48,49 Instead, the measure permits representation of a range of positions (for example, actions congruent with intentions, actions inconsistent with intentions, ambivalence in fertility intentions and actions, etc), thereby providing a more complex and realistic portrayal of human fertility behaviour than existing questions.
We wish to acknowledge the valuable contribution of our project collaborators: Lynne Cho, Karen Dunnell, Penny Edgington, Elaine Fisher, Anna Glasier, Isaac Manyonda, Stanley Okolo, Catherine Paterson, Connie Smith, and R William Stones. We would also like to thank the many others who have helped with this project, including: Maureen Batley, Audrey Brown, Jill Brown, Jenny Charman, Carrie Free, Rolla Khadduri, Patricia Kingori, Sunethra Kossine, Maya Malalgoda, Janet Peacock, Jan Sanders, Margaret Thorogood, Ros Tolcher, and Maddy Ward.
Funding: this project was supported by a Medical Research Council/London Region special training fellowship in Health Services Research.
Conflicts of interest: none declared.