Missing data are a very common problem in epidemiological studies, and can affect both the accuracy and the precision of any estimates generated from the study data. The magnitude of the problem will depend to some extent on whether the missing data refer to outcomes, exposures or confounders. A number of simple methods are commonly used for analysing incomplete data, but they rely on unrealistic assumptions about the mechanism causing the data to be missing. Multiple imputation is a more sophisticated method which can produce precise, reliable estimates in a wide range of circumstances, but it still relies on unverifiable assumptions about the missing data.
I will outline the problems caused by missing data, and explain when simpler methods of analysing incomplete data would be appropriate. I will also explain the idea behind multiple imputation, and the situations in which it can be used effectively. I will illustrate the differences between conventional analyses and multiple imputation analyses, and show how to interpret the results of multiple imputation. My aim is that you will leave able to decide with confidence whether it would be appropriate to use multiple imputation to deal with missing data in a given situation, and able to understand the statistical output it generates.
Disclosure of Interest: None declared