Original article
Methodological approaches to shortening composite measurement scales
Abstract
Composite measurement scales (CMSs) have contributed to improving the measurement of complex medical phenomena such as physical and psychological functioning or health-related quality of life. However, their use in patient care and research is often limited by their length and excessive respondent burden. In such situations, short instruments should be made available. Efforts to develop short instruments have largely focused on shortening existing instruments. To investigate the methodology currently used in the shortening of CMS, we assessed 42 studies reported in medical, psychological, and educational journals between 1984 and 1994. A number of methodological and statistical considerations important in the CMS shortening procedure were found to have been ignored or neglected by authors developing short forms from existing CMS. Serious flaws appear mainly to result from inadequate conceptualization of the shortening process, and inappropriate use and excess credit given to statistical techniques used to select items to be retained in short forms. When performed, the assessment of measurement properties of the short form was often inappropriate, and cross validation studies were seldom conducted. We propose recommendations for shortening existing CMS, to help authors and investigators develop and choose, respectively, shortened measurement instruments. These recommendations address the preliminary choice of the original CMS to be shortened, and the two successive phases to be considered in the development of short forms: the shortening process itself, where items are selected, and the validation of the shortened CMS, which should be conducted independently using independent subject samples.
Cited by (193)
Development of an optimal short form of the GAD-7 scale with cross-cultural generalizability based on Riskslim
2024, General Hospital Psychiatry
Despite the relatively small number of items in the GAD-7, even shorter forms are increasingly sought to reduce testing time in large-scale mental health screenings. As a result, short forms based on the GAD-7, such as the GAD-2 and GAD-mini, have become popular. However, the GAD-2 and GAD-mini have shown lower diagnostic accuracy in some cultural contexts, implying that a validated short form of the GAD-7 may be lacking for large-scale cross-cultural anxiety screening. To develop an optimal short form of the GAD-7 with cross-cultural stability, we used seven GAD-7 datasets from six countries, totaling 47,484 participants. Five short forms of 2 to 6 items were constructed using the Riskslim machine learning algorithm. We evaluated the diagnostic accuracy of these short forms in the training and test sets using the coefficient of determination (R²) and area under the curve (AUC). The GAD-R2 performed poorly in some cultures, whereas all of the 3- to 6-item short forms performed well across cultures, with the GAD-R6 showing the highest diagnostic accuracy in all cultures. The GAD-R3 outperformed the GAD-R2, GAD-2, and GAD-mini in all cultures and had higher generalizability across cultures and special populations. Given that the GAD-R3 is shorter than the GAD-R6 and nearly as accurate, we recommend the GAD-R3 for clinical studies and epidemiologic investigations, with an optimal actual cutoff value of 15. Overall, we recommend the GAD-R3 as the short form of the GAD-7 in cross-cultural studies. However, the 2-item GAD scale is also optimal as a short form in clinical practice.
Spanish adaptation and validation of the empowerment of parents in the intensive care-neonatology (EMPATHIC-N) questionnaire
2023, Anales de Pediatria
Parental satisfaction is rarely measured in the neonatal intensive care unit due to a lack of specific assessment tools. The EMpowerment of PArents in THe Intensive Care-Neonatology (EMPATHIC-N) questionnaire is an instrument to assess satisfaction in relation to family-centred care that has been validated in several countries, but not Spain.
To perform the translation and cultural adaptation of the EMPATHIC-N to Spanish followed by its validation for the purpose of assessing satisfaction in parents with children admitted to the neonatal intensive care unit.
The questionnaire first underwent forward and backward translation and transcultural adaptation by a panel of experts through a standardized process based on the Delphi method, followed by a pilot study in 8 parents and then a cross-sectional study in the neonatal intensive care unit of a tertiary care hospital to assess the reliability and convergent validity of the Spanish version.
The study demonstrated the comprehensibility, validity, feasibility, applicability and usefulness of the Spanish version of the EMPATHIC-N in the field of paediatric health after evaluation by 19 professionals and 60 parents. The content validity was found to be excellent (0.93). The reliability and convergent validity of the Spanish version of the EMPATHIC-N were analysed in a sample of 65 completed questionnaires. The Cronbach α for each domain was greater than 0.7, indicating high internal consistency. We assessed validity by analysing the correlation of the 5 domains with the 4 general satisfaction items. The validity was found to be adequate (rs, 0.4-0.76; P < .01).
The Spanish version of the EMPATHIC-N questionnaire is a comprehensible, useful, valid and reliable instrument to measure satisfaction in the parents of children admitted to neonatal care units.
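As an illustration of the internal-consistency statistic reported in this abstract, the following is a minimal sketch of Cronbach's α computed from raw item responses. The function name `cronbach_alpha` and the toy `responses` matrix are hypothetical and are not taken from the study; the sketch assumes complete data (no missing answers) and uses sample variances.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Transpose respondents x items into one tuple of scores per item.
    columns = list(zip(*item_scores))
    sum_item_variances = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


# Hypothetical responses: 6 respondents answering a 4-item domain (1-5 scale).
responses = [
    [4, 5, 4, 5],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [1, 2, 2, 1],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha: {alpha:.2f}")
```

A value above the conventional 0.7 threshold is usually read as acceptable internal consistency, although very high values can also signal item redundancy.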
Keeping perfectionistic academics safe from themselves with mindfulness
2023, Personality and Individual Differences
While perfectionism may be perceived as necessary to perform in academia, it has also been associated with burnout. This study investigated whether work-related cognitive spillover into nonwork time explains relationships between perfectionistic concerns and exhaustion among academics. We also examine whether facets of mindfulness can safeguard against exhaustion among perfectionistic academics (moderated mediation). We used a one-year multi-wave prospective design with 262 academics. Results suggested that non-reactivity protects perfectionistic academics from emotional exhaustion. These findings highlight the potential of mindfulness-based interventions for reducing the adverse impact of perfectionism.
Shortened versions of self-reported questionnaires may be used to reduce respondent burden. When shortened screening tools are used, it is desirable to maintain equivalent diagnostic accuracy to full-length forms. This manuscript presents a case study that illustrates how external data and individual participant data meta-analysis can be used to assess the equivalence in diagnostic accuracy between a shortened and full-length form. This case study compares the Patient Health Questionnaire-9 (PHQ-9) and a 4-item shortened version (PHQ-Dep-4) that was previously developed using optimal test assembly methods. Using a large database of 75 primary studies (34,698 participants, 3,392 major depression cases), we evaluated whether the PHQ-Dep-4 cutoff of ≥ 4 maintained equivalent diagnostic accuracy to a PHQ-9 cutoff of ≥ 10. Using this external validation dataset, a PHQ-Dep-4 cutoff of ≥ 4 maximized the sum of sensitivity and specificity, with a sensitivity of 0.88 (95% CI 0.81, 0.93), 0.68 (95% CI 0.56, 0.78), and 0.80 (95% CI 0.73, 0.85) for the semi-structured, fully structured, and MINI reference standard categories, respectively, and a specificity of 0.79 (95% CI 0.74, 0.83), 0.85 (95% CI 0.78, 0.90), and 0.83 (95% CI 0.80, 0.86) for the semi-structured, fully structured, and MINI reference standard categories, respectively. While equivalence with a PHQ-9 cutoff of ≥ 10 was not established, we found the sensitivity of the PHQ-Dep-4 to be non-inferior to that of the PHQ-9, and the specificity of the PHQ-Dep-4 to be marginally smaller than the PHQ-9.
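The cutoff criterion mentioned in this abstract, maximizing the sum of sensitivity and specificity, can be sketched as follows. This is a simplified illustration on a single toy sample, not the individual participant data meta-analysis used in the study; the function names and the `scores` and `is_case` data are hypothetical.

```python
def sens_spec(scores, is_case, cutoff):
    """Sensitivity and specificity when 'score >= cutoff' counts as a positive screen."""
    tp = sum(1 for s, d in zip(scores, is_case) if d and s >= cutoff)
    fn = sum(1 for s, d in zip(scores, is_case) if d and s < cutoff)
    tn = sum(1 for s, d in zip(scores, is_case) if not d and s < cutoff)
    fp = sum(1 for s, d in zip(scores, is_case) if not d and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores, is_case, cutoffs):
    """Cutoff that maximizes sensitivity + specificity (first one wins on ties)."""
    return max(cutoffs, key=lambda c: sum(sens_spec(scores, is_case, c)))


# Hypothetical short-form scores (range 0-12) and reference-standard diagnoses.
scores = [0, 1, 1, 2, 3, 3, 5, 2, 4, 5, 6, 8, 9]
is_case = [False] * 7 + [True] * 6

cutoff = best_cutoff(scores, is_case, range(1, 13))
sensitivity, specificity = sens_spec(scores, is_case, cutoff)
print(f"cutoff >= {cutoff}: sensitivity {sensitivity:.2f}, specificity {specificity:.2f}")
```

Because the cutoff is tuned on the same sample in which it is evaluated, its apparent accuracy is optimistic, which is why an external validation dataset, as used in the study, matters.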
Self-report measure of dispositional flow experience in the video game context: Conceptualisation and scale development
2022, International Journal of Human Computer Studies
Flow theory has been widely applied to explain video game players' gaming and purchasing behaviour. However, because of conceptual and empirical flaws in current measurement instruments, researchers can hardly apply them to measure the dispositional flow experience of adult video game players. In this research, we conceptualised flow experience and developed a measurement instrument for it in the video game context. To achieve these objectives, we conducted five phases, each with different participants: conceptualisation of the constructs and item generation (n = 13), expert judging (n = 5), pre-test (n = 96), initial development and validation (n = 289), and advanced development and validation (n = 593). We applied both qualitative and quantitative analyses to conceptualise and measure the flow experience of video game players, including grounded theory and several statistical tools of latent variable modelling. We obtained a 28-item scale that performs well in the first-order model. Moreover, we tested three hierarchical structures of flow experience: a unidimensional model, an independent antecedent model, and a hierarchical antecedent model. Results show that the hierarchical antecedent model is the best structure to represent flow experience. We named our scale the Video Game Dispositional Flow Scale (VGDFS).
Development and validation of a short version of the French Hand Function Sort questionnaire in vocational rehabilitation
2021, Annals of Physical and Rehabilitation Medicine
The Hand Function Sort (HFS) is a pictorial self-administered questionnaire with 62 items. It is a valid and reliable scale focused on the physical function of the upper limbs. It is used to predict the return to work.
We aimed to develop and validate a short version of the French version of the HFS (HFS-F) to simplify its use in clinical practice.
We included patients with upper-limb chronic pain hospitalised for vocational rehabilitation from 2012 to 2019. Vocational rehabilitation aims to improve the autonomy of patients to regain their previous working capacity. The 62 items of the HFS-F were analysed in terms of patient and expert assessments, floor/ceiling effect, item-to-total correlation, principal component analysis, and Rasch analysis. A short HFS-F was developed. Thereafter, we assessed its internal consistency, test–retest reliability, criterion validity with the full-length HFS-F, construct validity with different scales (Disabilities of the Arm, Shoulder, and Hand [DASH]; Brief Pain Inventory [BPI]; Hospital Anxiety and Depression [HAD]), standard error of measurement (SEM), and minimal detectable change (MDC).
Six experts were consulted, 34 patients were interviewed, and 629 questionnaires were analysed. Among the items, 25 were selected after the final round with the six experts. The internal consistency and test–retest reliability were excellent (Cronbach α = 0.95, intraclass correlation coefficient = 0.92, 95% confidence interval [95% CI] 0.87 to 0.95). The correlation coefficient between scores of the short and full-length HFS-F was 0.841 (95% CI: 0.752 to 0.897, P < 10⁻⁴), and those between the short HFS-F score and the DASH, BPI, HAD-Anxiety, and HAD-Depression scores were −0.816 (95% CI: −0.714 to −0.881, P < 10⁻⁴), −0.529 (95% CI: −0.338 to −0.674, P < 10⁻⁴), −0.451 (95% CI: −0.244 to −0.614, P = 0.0001), and −0.360 (95% CI: −0.140 to −0.542, P = 0.0018), respectively. The SEM and MDC values were estimated at 6/100 and 17/100, respectively.
A short version of the HFS-F was developed and validated. We named this questionnaire the 25 HFS-F.
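The relationship between the SEM and MDC values reported in this abstract can be checked with a short sketch. It assumes the conventional formulas SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM; the abstract does not state which formulas the authors used, and the function names and the standard deviation in the second example are hypothetical.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from the score SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem, z=1.96):
    """Minimal detectable change at the 95% confidence level.

    The sqrt(2) factor accounts for measurement error in both of the two
    measurements being compared.
    """
    return z * math.sqrt(2.0) * sem


# SEM of 6 points on a 0-100 scale, as reported in the abstract.
mdc = mdc95(6.0)
print(f"MDC95 for SEM = 6: {mdc:.1f} points")

# Hypothetical example: SD of 10 points and ICC of 0.75.
example_sem = sem_from_icc(10.0, 0.75)
print(f"SEM for SD = 10, ICC = 0.75: {example_sem:.1f} points")
```

Under these assumptions, an SEM of 6/100 gives an MDC of about 16.6/100, which rounds to the reported 17/100.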