Breaking up is hard to do: the heartbreak of dichotomizing continuous data

David L Streiner

doi:10.1177/070674370204700307

Can J Psychiatry. 2002 Apr;47(3):262-6. doi: 10.1177/070674370204700307.

Author

David L Streiner¹

Affiliation

¹ Department of Psychiatry, University of Toronto, Kunin-Lunenfeld Applied Research Unit, Baycrest Centre for Geriatric Care, Toronto, ON. dstreiner@klaru-baycrest.on.ca

PMID: 11987478

Abstract

Researchers often take variables that are measured on a continuum and break them into categories (for example, above or below some cut-point), either to place subjects into groups or to serve as an outcome measure. In this article, we show that the rationales given for this practice are weak and that categorization results in lost information and reduced statistical power, and therefore in an increased probability of a Type II error. Dichotomizing a continuous variable is justified only when the distribution of that variable is highly skewed or its relation with another variable is nonlinear.
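The loss of power described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the article: it assumes a bivariate normal predictor and outcome, a median split, and hypothetical settings (a true correlation of 0.3, 100 subjects per study, a two-sided test at the 0.05 level). The function names and parameters are illustrative only. Under these assumptions a median split is expected to shrink the observed correlation by a factor of about sqrt(2/pi), roughly 0.80, so the same data should reach significance less often once the predictor is dichotomized.

```python
import math
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def is_significant(r, n, t_crit):
    """Test H0: correlation = 0 using t = r * sqrt((n - 2) / (1 - r^2)).

    When one variable is a 0/1 group label, this is equivalent to the
    pooled-variance two-sample t-test comparing the two groups.
    """
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return abs(t) > t_crit


def simulate_power(rho=0.3, n=100, n_sims=2000, t_crit=1.984, seed=0):
    """Estimate power with the predictor kept continuous vs. median-split.

    Assumption: t_crit = 1.984 approximates the two-sided 5% critical
    value of t with 98 degrees of freedom, so it only suits n = 100.
    Returns (power_continuous, power_dichotomized).
    """
    rng = random.Random(seed)
    hits_continuous = 0
    hits_dichotomized = 0
    noise_sd = math.sqrt(1 - rho ** 2)
    for _ in range(n_sims):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        ys = [rho * x + noise_sd * rng.gauss(0, 1) for x in xs]
        # Median split: 1.0 for subjects above the sample median, else 0.0.
        cut = statistics.median(xs)
        groups = [1.0 if x > cut else 0.0 for x in xs]
        if is_significant(pearson_r(xs, ys), n, t_crit):
            hits_continuous += 1
        if is_significant(pearson_r(groups, ys), n, t_crit):
            hits_dichotomized += 1
    return hits_continuous / n_sims, hits_dichotomized / n_sims


if __name__ == "__main__":
    power_cont, power_dich = simulate_power()
    print(f"power, continuous predictor:   {power_cont:.2f}")
    print(f"power, median-split predictor: {power_dich:.2f}")
```

Both analyses are run on exactly the same simulated subjects, so any gap between the two estimates comes only from discarding the within-group variation of the predictor. The exact figures depend on the assumed effect size, sample size, and seed.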

MeSH terms

Bias
Clinical Trials as Topic / statistics & numerical data*
Data Interpretation, Statistical*
Humans
Outcome Assessment, Health Care / statistics & numerical data*
Psychiatric Status Rating Scales / statistics & numerical data*
Psychometrics