Missing data, where entire observations or individual variable values are unavailable for analysis, are a common challenge in research using complex databases. Although missing data clearly lead to a loss of information and hence reduced statistical power, a more insidious consequence is that they may introduce selection bias, which could invalidate the entire study. Fortunately, many solutions to this problem have been put forward, based on different ways of filling in (imputing) values where there are none. Unfortunately, several of these solutions are inherently flawed and may introduce more bias than they remove.
This brief presentation will introduce central concepts in the literature on missing data, go through the circumstances under which missing data may introduce selection bias, and show flaws in several methods that are sometimes used to impute data.
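As a minimal illustration of the kind of flaw in question, the following Python sketch applies mean imputation to a small made-up data set. The data, the rule that values above 7 go unrecorded, and all variable names are assumptions chosen for the example; they are not taken from the presentation.

```python
from statistics import mean, pvariance

# Hypothetical complete data: ten measurements, 1.0 through 10.0.
full = [float(v) for v in range(1, 11)]

# Assumed missingness rule: values above 7 are never recorded,
# so whether a value is missing depends on the value itself.
observed = [v if v <= 7 else None for v in full]

# Complete-case analysis: simply drop the missing values.
complete_cases = [v for v in observed if v is not None]

# Mean imputation: replace every missing value with the observed mean.
fill_value = mean(complete_cases)
imputed = [fill_value if v is None else v for v in observed]

# Compare mean and (population) variance across the three versions.
summary = {
    "full data": (mean(full), pvariance(full)),
    "complete cases": (mean(complete_cases), pvariance(complete_cases)),
    "mean imputed": (mean(imputed), pvariance(imputed)),
}

for label, (m, var) in summary.items():
    print(f"{label:>15}: mean = {m:.2f}, variance = {var:.2f}")
```

In this constructed case the complete-case mean is already too low because the largest values are the ones that are missing. Mean imputation leaves that biased mean unchanged and additionally shrinks the variance, so the filled-in data look more precise than they are.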
Disclosure of Interest: None declared