Background: Dichotomization of continuous variables before analysis has frequently been criticized, but nonetheless remains a common approach. We were interested in the effects of dichotomization of an outcome variable when two predictors are examined.
Methods: Assuming a log-normally distributed continuous outcome and two independent variables, one with three levels and one binary, we evaluated the results that logistic regression would yield after dichotomization of the outcome. Different cutoffs, predictor effects and dispersions were examined, with a special focus on interaction terms.
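As a minimal sketch of this kind of evaluation (not the authors' code), the following assumes that the two predictors act additively on the log scale of the outcome, with no true interaction, and computes the exact population log odds of exceeding a cutoff rather than fitting a model to simulated data. The function names, the parameter names b0, b1, b2 and sigma, and all numeric values are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def prob_above(cutoff, mu, sigma):
    """P(Y > cutoff) when log(Y) ~ Normal(mu, sigma^2)."""
    return 1.0 - NormalDist(mu, sigma).cdf(math.log(cutoff))


def logit(p):
    """Log odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def interaction_contrast(cutoff, b0, b1, b2, sigma, level=2):
    """Log odds ratio for the binary predictor x2 at x1 = level,
    minus the same log odds ratio at x1 = 0.

    Assumed (hypothetical) model: log(Y) = b0 + b1*x1 + b2*x2 + error,
    with x1 in {0, 1, 2}, x2 in {0, 1}, and no interaction term.
    A nonzero return value is an interaction on the logit scale that
    arises only from dichotomizing Y at the cutoff.
    """
    def log_odds(x1, x2):
        mu = b0 + b1 * x1 + b2 * x2
        return logit(prob_above(cutoff, mu, sigma))

    return (log_odds(level, 1) - log_odds(level, 0)) - (
        log_odds(0, 1) - log_odds(0, 0)
    )


# Illustrative parameter values only; none are taken from the study.
for cutoff in (1.0, 3.0, 10.0):
    contrast = interaction_contrast(cutoff, b0=0.0, b1=0.5, b2=0.5, sigma=0.5)
    print(f"cutoff={cutoff}: interaction contrast on logit scale = {contrast:.3f}")
```

In this sketch the contrast is zero whenever either predictor has no effect, and otherwise varies with the cutoff and the dispersion, which is the kind of dependence on parameter combinations that the evaluation examines.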
Results: Depending on the specific parameter combination, dichotomization introduced spurious interactions, some of them substantial, between the two predictor variables in their association with the outcome. These interactions could reach statistical significance even with modest sample sizes. Real-life data on sex × weight as determinants of gamma-glutamyltransferase provided a practical example of these issues.
Conclusions: These findings add a new aspect to the controversy surrounding dichotomization of continuous variables. Researchers should critically examine whether the validity of their results might be compromised by such phenomena.