Item Screening in Graphical Loglinear Rasch Models

Psychometrika

Abstract

In the behavioural sciences, local dependence and differential item functioning (DIF) are common, and purification procedures that eliminate items with these weaknesses often result in short scales with poor reliability. Graphical loglinear Rasch models (Kreiner & Christensen, in Statistical Methods for Quality of Life Studies, ed. by M. Mesbah, F.C. Cole & M.T. Lee, Kluwer Academic, pp. 187–203, 2002), in which uniform DIF and uniform local dependence are permitted, resolve this dilemma by modelling the local dependence and DIF. Identifying loglinear Rasch models by a stepwise model search is often very time-consuming, since the initial item analysis may disclose a great deal of spurious and misleading evidence of DIF and local dependence that has to be disposed of during the modelling procedure.

Like graphical models, graphical loglinear Rasch models possess Markov properties that are useful during the statistical analysis if they are used methodically. This paper describes how. It contains a systematic study of the Markov properties and the way they can be used to distinguish spurious from genuine evidence of DIF and local dependence, and it proposes a strategy for initial item screening that reduces the time needed to identify a graphical loglinear Rasch model that fits the item responses. The last part of the paper illustrates the item screening procedure on simulated data and on data from the PF subscale measuring physical functioning in the SF-36 Health Survey inventory.

References

  • Ackerman, T.A. (1992). A didactic explanation of item bias, item impact, and item validity from a multidimensional perspective. Journal of Educational Measurement, 29, 67–91.

  • Agresti, A. (1984). Analysis of ordinal categorical data. New York: Wiley.

  • Andersen, E.B. (1977). Sufficient statistics and latent trait models. Psychometrika, 42, 69–81.

  • Anderson, C.J., & Böckenholt, U. (2000). Graphical regression models for polytomous variables. Psychometrika, 65, 497–509.

  • Anderson, C.J., & Yu, H.-T. (2007). Log-multiplicative association models as item response models. Psychometrika, 72, 5–23.

  • Bartolucci, F. (2007). A class of multidimensional IRT models for testing unidimensionality and clustering items. Psychometrika, 72, 141–158.

  • Bartolucci, F., & Forcina, A. (2005). Likelihood inference on the underlying structure of IRT models. Psychometrika, 70, 31–44.

  • Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B, 57, 289–300.

  • Besag, J., & Clifford, P. (1991). Sequential Monte Carlo p-values. Biometrika, 78, 301–304.

  • Bishop, Y.M.M., Fienberg, S.E., & Holland, P.W. (1975). Discrete multivariate analysis: theory and practice. Cambridge: MIT Press.

  • Christensen, K.B., & Kreiner, S. (2007). A Monte Carlo approach to unidimensionality testing in polytomous Rasch models. Applied Psychological Measurement, 31, 20–30.

  • Clauser, B., Mazor, K.M., & Hambleton, R.K. (1994). The effect of score group width on the Mantel–Haenszel procedure. Journal of Educational Measurement, 31, 67–78.

  • Davis, J.A. (1967). A partial coefficient for Goodman and Kruskal’s Gamma. Journal of the American Statistical Association, 69, 174–180.

  • Dawid, A.P. (1979). Conditional independence in statistical theory (with discussion). Journal of the Royal Statistical Society, Series B, 41, 1–31.

  • Fayers, P.M., & Machin, D. (2007). Quality of life: the assessment, analysis, and interpretation of patient reported outcomes (2nd edn.). Chichester: Wiley.

  • Fidalgo, A.M., Mellenbergh, G.J., & Muniz, J. (2000). Effects of DIF, test length, and purification type on robustness and power of Mantel–Haenszel procedures. Methods of Psychological Research Online, 5, 43–53.

  • Fischer, G.H. (1995). The derivation of polytomous Rasch models. In Fischer, G.H., & Molenaar, I.W. (Eds.) Rasch models: Foundations, recent developments, and applications (pp. 293–306). New York: Springer.

  • Finch, H. (2005). The MIMIC model as a method for detecting DIF: comparison with Mantel–Haenszel, SIBTEST and the IRT Likelihood Ratio. Applied Psychological Measurement, 29, 278–295.

  • Frank, O., & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.

  • French, B.F., & Maller, S.J. (2007). Iterative purification and effect size use with logistic regression for differential item functioning detection. Educational and Psychological Measurement, 67, 373–393.

  • Hagenaars, J.A. (1998). Categorical causal modelling: latent class analysis and directed log-linear models with latent variables. Sociological Methods and Research, 26, 436–486.

  • Hanson, B.A. (1998). Uniform DIF and DIF defined by differences in item response functions. Journal of Educational and Behavioral Statistics, 23, 244–253.

  • Holland, P.W. (1981). When are item response models consistent with observed data? Psychometrika, 46, 79–92.

  • Holland, P.W., & Hoskens, M. (2003). Classical test theory as a first-order item response theory: Application to true-score prediction from a possibly nonparallel test. Psychometrika, 68, 123–150.

  • Holland, P.W., & Rosenbaum, P.R. (1986). Conditional association and unidimensionality in monotone latent variable models. Annals of Statistics, 14, 1523–1543.

  • Holland, P.W., & Thayer, D.T. (1988). Differential item performance and the Mantel–Haenszel procedure. In Wainer, H., & Braun, H. (Eds.) Test validity (pp. 129–145). Hillsdale: Lawrence Erlbaum Associates.

  • Hoskens, M., & De Boeck, P. (1997). A parametric model for local dependence among test items. Psychological Methods, 2, 261–277.

  • Humphreys, K., & Titterington, D.M. (2003). Variational approximations for categorical causal modelling with latent variables. Psychometrika, 68, 391–412.

  • Ip, E.H. (2001). Testing for local dependence in dichotomous item response models. Psychometrika, 66, 109–132.

  • Ip, E.H. (2002). Locally dependent latent trait model and the Dutch Identity revisited. Psychometrika, 67, 367–386.

  • Junker, B.W. (1993). Conditional association, essential independence and monotone unidimensional item response models. Annals of Statistics, 21, 1359–1378.

  • Junker, B.W., & Sijtsma, K. (2000). Latent and manifest monotonicity in item response models. Applied Psychological Measurement, 24, 65–81.

  • Kelderman, H. (1984). Loglinear Rasch model tests. Psychometrika, 49, 223–245.

  • Kelderman, H. (1989). Item bias detection using loglinear IRT. Psychometrika, 54, 681–697.

  • Kelderman, H. (1992). Computing maximum likelihood estimates of loglinear models from marginal sums with special attention to loglinear item response theory. Psychometrika, 57, 437–450.

  • Kelderman, H. (2005). Building IRT models from scratch: Graphical models, exchangeability, marginal freedom, scale type, and latent traits. In van der Ark, L.A., Croon, M.A., & Sijtsma, K. (Eds.) New developments in categorical data analysis for the social and behavioural sciences (pp. 167–187). Hillsdale: Lawrence Erlbaum.

  • Kreiner, S. (1986). Computerized exploratory screening of large-dimensional contingency tables. In De Antoni, F., Lauro, N., & Rizzi, A. (Eds.) COMPSTAT 1986 (pp. 43–48). Heidelberg: Physica Verlag.

  • Kreiner, S. (1987). Analysis of multidimensional contingency tables by exact conditional tests: Techniques and strategies. Scandinavian Journal of Statistics, 14, 97–112.

  • Kreiner, S. (1993/2006). Validation of index scales for analysis of survey data. In Dean, K. (Ed.) Population health research (pp. 116–144). London: Sage Publications. Reprinted in D.J. Bartholomew (Ed.) (2006), Measurement, vol. III (pp. 297–328). London: Sage Publications.

  • Kreiner, S. (2003). Introduction to DIGRAM (Research report 03/10). Copenhagen: Dept. of Biostatistics, Univ. of Copenhagen.

  • Kreiner, S. (2007). Validity and objectivity: reflections on the role and nature of Rasch models. Nordic Psychology, 59, 268–298.

  • Kreiner, S., & Christensen, K.B. (2002). Graphical Rasch models. In Mesbah, M., Cole, F.C., & Lee, M.T. (Eds.) Statistical methods for quality of life studies (pp. 187–203). Dordrecht: Kluwer Academic.

  • Kreiner, S., & Christensen, K.B. (2004). Analysis of local dependence and multidimensionality in graphical loglinear Rasch models. Communications in Statistics. Theory and Methods, 33, 1239–1276.

  • Kreiner, S., & Christensen, K.B. (2006). Validity and objectivity in health related summated scales: Analysis by graphical loglinear Rasch models. In von Davier, M., & Carstensen, C.H. (Eds.) Multivariate and mixture distribution Rasch models—extensions and applications (pp. 329–346). New York: Springer.

  • Kreiner, S., Pedersen, J.H., & Siersma, V. (2009). Derivation and testing hypotheses in chain graph models (Research report 09/9). Copenhagen: Dept. of Biostatistics, University of Copenhagen. Retrieved from http://biostat.ku.dk/reports/2009/Research_report_09-09.pdf.

  • Lauritzen, S.L. (1996). Graphical models. Oxford: Clarendon Press.

  • Lord, F.M. (1980). Applications of item response theory to practical testing problems. Hillsdale: Lawrence Erlbaum.

  • Mazor, K.M., Clauser, B.E., & Hambleton, R.K. (1992). The effect of sample size on the functioning of the Mantel–Haenszel statistic. Educational and Psychological Measurement, 52, 443–451.

  • Mellenbergh, G.J. (1982). Contingency table models for assessing item bias. Journal of Educational Statistics, 7, 105–108.

  • Park, D.G., & Lautenschlager, G.J. (1990). Improving IRT item bias with iterative linking and ability scale purification. Applied Psychological Measurement, 14, 163–173.

  • Penfield, R.D. (2001). Assessing differential item functioning among multiple groups: A comparison of three Mantel–Haenszel procedures. Applied Measurement in Education, 14, 235–259.

  • Penfield, R.D., & Camilli, G. (2007). Differential item functioning and item bias. In Rao, C.R., & Sinharay, S. (Eds.) Handbook of statistics: psychometrics (pp. 125–168). Amsterdam: Elsevier.

  • Raju, N.S., Drasgow, F., & Slinde, J.A. (1993). An empirical comparison of the area methods, Lord’s chi-square test, and the Mantel–Haenszel technique for assessing differential item functioning. Educational and Psychological Measurement, 53, 301–315.

  • Rasch, G. (1961/2006). On general laws and the meaning of measurement in psychology. In Neyman, J. (Ed.) Proceedings of the 4th Berkeley symposium on mathematical statistics and probability: Vol. 4 (pp. 321–333). Berkeley: University of California Press. Reprinted in D.J. Bartholomew (Ed.). Measurement, vol. I (pp. 319–334). London: Sage Publications.

  • Rijmen, F., Vansteelandt, K., & De Boeck, P. (2008). Latent class models for diary method data: Parameter estimation by local computations. Psychometrika, 73, 167–182.

  • Rosenbaum, P.R. (1984). Testing the conditional independence and monotonicity assumptions of item response theory. Psychometrika, 49, 425–435.

  • Rosenbaum, P.R. (1988). Item Bundles. Psychometrika, 53, 349–359.

  • Rosenbaum, P.R. (1989). Criterion-related construct validity. Psychometrika, 54, 625–633.

  • Su, Y.-H., & Wang, W.-C. (2005). Efficiency of the Mantel, Generalized Mantel–Haenszel, and logistic discriminant function analysis methods in detecting differential item functioning for polytomous items. Applied Measurement in Education, 18, 313–350.

  • Swaminathan, H., & Rogers, H.J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27, 361–370.

  • Tjur, T. (1982). A connection between Rasch’s item analysis model and a multiplicative Poisson model. Scandinavian Journal of Statistics, 9, 23–30.

  • Van der Ark, L.A., & Bergsma, W.P. (2010). A note on stochastic ordering of the latent trait using the sum of polytomous item scores. Psychometrika, 75, 272–279.

  • Williams, N.J., & Beretvas, S.N. (2006). DIF identification using HGLM for polytomous items. Applied Psychological Measurement, 30, 22–42.

  • Zumbo, B.D. (1999). A handbook on the theory and methods of differential item functioning (DIF). Ottawa: Directorate of Human Resources Research and Evaluation, National Defence.

Corresponding author

Correspondence to Svend Kreiner.

About this article

Cite this article

Kreiner, S., Christensen, K.B. Item Screening in Graphical Loglinear Rasch Models. Psychometrika 76, 228–256 (2011). https://doi.org/10.1007/s11336-011-9203-y

  • DOI: https://doi.org/10.1007/s11336-011-9203-y
