Abstract
In behavioural sciences, local dependence and DIF are common, and purification procedures that eliminate items with these weaknesses often result in short scales with poor reliability. Graphical loglinear Rasch models (Kreiner & Christensen, in Statistical Methods for Quality of Life Studies, ed. by M. Mesbah, F.C. Cole & M.T. Lee, Kluwer Academic, pp. 187–203, 2002), in which uniform DIF and uniform local dependence are permitted, solve this dilemma by modelling the local dependence and DIF. Identifying loglinear Rasch models by a stepwise model search is often very time-consuming, since the initial item analysis may disclose a great deal of spurious and misleading evidence of DIF and local dependence that has to be disposed of during the modelling procedure.
Like graphical models, graphical loglinear Rasch models possess Markov properties that are useful during the statistical analysis if they are used methodically. This paper describes how. It contains a systematic study of the Markov properties and the way they can be used to distinguish spurious from genuine evidence of DIF and local dependence and proposes a strategy for initial item screening that will reduce the time needed to identify a graphical loglinear Rasch model that fits the item responses. The last part of the paper illustrates the item screening procedure on simulated data and on data on the PF subscale measuring physical functioning in the SF36 Health Survey inventory.
References
Ackerman, T.A. (1992). A didactic explanation of item bias, item impact, and item validity from a multidimensional perspective. Journal of Educational Measurement, 29, 67–91.
Agresti, A. (1984). Analysis of ordinal categorical data. New York: Wiley.
Andersen, E.B. (1977). Sufficient statistics and latent trait models. Psychometrika, 42, 69–81.
Anderson, C.J., & Böckenholt, U. (2000). Graphical regression models for polytomous variables. Psychometrika, 65, 497–509.
Anderson, C.J., & Yu, H.-T. (2007). Log-multiplicative association models as item response models. Psychometrika, 72, 5–23.
Bartolucci, F. (2007). A class of multidimensional IRT models for testing unidimensionality and clustering items. Psychometrika, 72, 141–158.
Bartolucci, F., & Forcina, A. (2005). Likelihood inference on the underlying structure of IRT models. Psychometrika, 70, 31–44.
Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B, 57, 289–300.
Besag, J., & Clifford, P. (1991). Sequential Monte Carlo p-values. Biometrika, 78, 301–304.
Bishop, Y.M.M., Fienberg, S.E., & Holland, P.W. (1975). Discrete multivariate analysis: theory and practice. Cambridge: MIT Press.
Christensen, K.B., & Kreiner, S. (2007). A Monte Carlo approach to unidimensionality testing in polytomous Rasch models. Applied Psychological Measurement, 31, 20–30.
Clauser, B., Mazor, K.M., & Hambleton, R.K. (1994). The effect of score group width on the Mantel–Haenszel procedure. Journal of Educational Measurement, 31, 67–78.
Davis, J.A. (1967). A partial coefficient for Goodman and Kruskal’s gamma. Journal of the American Statistical Association, 62, 189–193.
Dawid, A.P. (1979). Conditional independence in statistical theory (with discussion). Journal of the Royal Statistical Society, Series B, 41, 1–31.
Fayers, P.M., & Machin, D. (2007). Quality of life: the assessment, analysis, and interpretation of patient reported outcomes (2nd edn.). Chichester: Wiley.
Fidalgo, A.M., Mellenbergh, G.J., & Muniz, J. (2000). Effects of DIF, test length, and purification type on robustness and power of Mantel–Haenszel procedures. Methods of Psychological Research Online, 5, 43–53.
Fischer, G.H. (1995). The derivation of polytomous Rasch models. In Fischer, G.H., & Molenaar, I.W. (Eds.) Rasch models: Foundations, recent developments, and applications (pp. 293–306). New York: Springer.
Finch, H. (2005). The MIMIC model as a method for detecting DIF: comparison with Mantel–Haenszel, SIBTEST and the IRT Likelihood Ratio. Applied Psychological Measurement, 29, 278–295.
Frank, O., & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.
French, B.F., & Maller, S.J. (2007). Iterative purification and effect size use with logistic regression for differential item functioning detection. Educational and Psychological Measurement, 67, 373–393.
Hagenaars, J.A. (1998). Categorical causal modelling: latent class analysis and directed log-linear models with latent variables. Sociological Methods and Research, 26, 436–486.
Hanson, B.A. (1998). Uniform DIF and DIF defined by differences in item response functions. Journal of Educational and Behavioral Statistics, 23, 244–253.
Holland, P.W. (1981). When are item response models consistent with observed data. Psychometrika, 46, 79–92.
Holland, P.W., & Hoskens, M. (2003). Classical test theory as a first-order item response theory: Application to true-score prediction from a possibly nonparallel test. Psychometrika, 68, 123–150.
Holland, P.W., & Rosenbaum, P.R. (1986). Conditional association and unidimensionality in monotone latent variable models. Annals of Statistics, 14, 1523–1543.
Holland, P.W., & Thayer, D.T. (1988). Differential item performance and the Mantel–Haenszel procedure. In Wainer, H., & Braun, H. (Eds.) Test validity (pp. 129–145). Hillsdale: Lawrence Erlbaum Associates.
Hoskens, M., & De Boeck, P. (1997). A parametric model for local dependence among test items. Psychological Methods, 2, 261–277.
Humphreys, K., & Titterington, D.M. (2003). Variational approximations for categorical causal modelling with latent variables. Psychometrika, 68, 391–412.
Ip, E.H. (2001). Testing for local dependence in dichotomous item response models. Psychometrika, 66, 109–132.
Ip, E.H. (2002). Locally dependent latent trait model and the Dutch Identity revisited. Psychometrika, 67, 367–386.
Junker, B.W. (1993). Conditional association, essential independence and monotone unidimensional item response models. Annals of Statistics, 21, 1359–1378.
Junker, B.W., & Sijtsma, K. (2000). Latent and manifest monotonicity in item response models. Applied Psychological Measurement, 24, 65–81.
Kelderman, H. (1984). Loglinear Rasch model tests. Psychometrika, 49, 223–245.
Kelderman, H. (1989). Item bias detection using loglinear IRT. Psychometrika, 54, 681–697.
Kelderman, H. (1992). Computing maximum likelihood estimates of loglinear models from marginal sums with special attention to loglinear item response theory. Psychometrika, 57, 437–450.
Kelderman, H. (2005). Building IRT models from scratch: Graphical models, exchangeability, marginal freedom, scale type, and latent traits. In van der Ark, A., Croon, M.A., & Sijtsma, K. (Eds.) New developments in categorical data analysis for the social and behavioural sciences (pp. 167–187). Hillsdale: Lawrence Erlbaum.
Kreiner, S. (1986). Computerized exploratory screening of large-dimensional contingency tables. In De Antoni, F., Lauro, N., & Rizzi, A. (Eds.) COMPSTAT 1986 (pp. 43–48). Heidelberg: Physica Verlag.
Kreiner, S. (1987). Analysis of multidimensional contingency tables by exact conditional tests: Techniques and strategies. Scandinavian Journal of Statistics, 14, 97–112.
Kreiner, S. (1993/2006). Validation of index scales for analysis of survey data. In Dean, K. (Ed.) Population health research (pp. 116–144). London: Sage Publications. Reprinted in D.J. Bartholomew (Ed.) (2006), Measurement, vol. III (pp. 297–328). London: Sage Publications.
Kreiner, S. (2003). Introduction to DIGRAM (Research report 03/10). Copenhagen: Dept. of Biostatistics, Univ. of Copenhagen.
Kreiner, S. (2007). Validity and objectivity: reflections on the role and nature of Rasch models. Nordic Psychology, 59, 268–298.
Kreiner, S., & Christensen, K.B. (2002). Graphical Rasch models. In Mesbah, M., Cole, F.C., & Lee, M.T. (Eds.) Statistical methods for quality of life studies (pp. 187–203). Dordrecht: Kluwer Academic.
Kreiner, S., & Christensen, K.B. (2004). Analysis of local dependence and multidimensionality in graphical loglinear Rasch models. Communications in Statistics. Theory and Methods, 33, 1239–1276.
Kreiner, S., & Christensen, K.B. (2006). Validity and objectivity in health related summated scales: Analysis by graphical loglinear Rasch models. In von Davier, M., & Carstensen, C.H. (Eds.) Multivariate and mixture distribution Rasch models—extensions and applications (pp. 329–346). New York: Springer.
Kreiner, S., Pedersen, J.H., & Siersma, V. (2009). Derivation and testing hypotheses in chain graph models (Research report 09/9). Copenhagen: Dept. of Biostatistics, University of Copenhagen. Retrieved from http://biostat.ku.dk/reports/2009/Research_report_09-09.pdf.
Lauritzen, S.L. (1996). Graphical models. Oxford: Clarendon Press.
Lord, F.M. (1980). Applications of item response theory to practical testing problems. Hillsdale: Lawrence Erlbaum.
Mazor, K.M., Clauser, B.E., & Hambleton, R.K. (1992). The effect of sample size on the functioning of the Mantel–Haenszel statistic. Educational and Psychological Measurement, 52, 443–451.
Mellenbergh, G.J. (1982). Contingency table models for assessing item bias. Journal of Educational Statistics, 7, 105–108.
Park, D.G., & Lautenschlager, G.J. (1990). Improving IRT item bias with iterative linking and ability scale purification. Applied Psychological Measurement, 14, 163–173.
Penfield, R.D. (2001). Assessing differential item functioning among multiple groups: A comparison of three Mantel–Haenszel procedures. Applied Measurement in Education, 14, 235–259.
Penfield, R.D., & Camilli, G. (2007). Differential item functioning and item bias. In Rao, C.R., & Sinharay, S. (Eds.) Handbook of statistics: psychometrics (pp. 125–168). Amsterdam: Elsevier.
Raju, N.S., Drasgow, F., & Slinde, J.A. (1993). An empirical comparison of the area methods, Lord’s chi-square test, and the Mantel–Haenszel technique for assessing differential item functioning. Educational and Psychological Measurement, 53, 301–315.
Rasch, G. (1961/2006). On general laws and the meaning of measurement in psychology. In Neyman, J. (Ed.) Proceedings of the 4th Berkeley symposium on mathematical statistics and probability: Vol. 4 (pp. 321–333). Berkeley: University of California Press. Reprinted in D.J. Bartholomew (Ed.). Measurement, vol. I (pp. 319–334). London: Sage Publications.
Rijmen, F., Vansteelandt, K., & De Boeck, P. (2008). Latent class models for diary method data: Parameter estimation by local computations. Psychometrika, 73, 167–182.
Rosenbaum, P.R. (1984). Testing the conditional independence and monotonicity assumptions of item response theory. Psychometrika, 49, 425–435.
Rosenbaum, P.R. (1988). Item bundles. Psychometrika, 53, 349–359.
Rosenbaum, P.R. (1989). Criterion-related construct validity. Psychometrika, 54, 625–633.
Su, Y.-H., & Wang, W.-C. (2005). Efficiency of the Mantel, Generalized Mantel–Haenszel, and logistic discriminant function analysis methods in detecting differential item functioning for polytomous items. Applied Measurement in Education, 18, 313–350.
Swaminathan, H., & Rogers, H.J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27, 361–370.
Tjur, T. (1982). A connection between Rasch’s item analysis model and a multiplicative Poisson model. Scandinavian Journal of Statistics, 9, 23–30.
Van der Ark, L.A., & Bergsma, W.P. (2010). A note on stochastic ordering of the latent trait using the sum of polytomous item scores. Psychometrika, 75, 272–279.
Williams, N.J., & Beretvas, S.N. (2006). DIF identification using HGLM for polytomous items. Applied Psychological Measurement, 30, 22–42.
Zumbo, B.D. (1999). A handbook on the theory and methods of differential item functioning (DIF). Ottawa: Directorate of Human Resources Research and Evaluation, National Defence.
Cite this article
Kreiner, S., Christensen, K.B. Item Screening in Graphical Loglinear Rasch Models. Psychometrika 76, 228–256 (2011). https://doi.org/10.1007/s11336-011-9203-y
DOI: https://doi.org/10.1007/s11336-011-9203-y