Background The use of administrative health databases (AHD) is a promising strategy for studying the impact of rheumatoid arthritis (RA) at the population level. Previous studies, using different methodologies, have shown moderate to excellent accuracy in case identification, with sensitivity (Se) ranging from 65 to 100% and specificity (Sp) from 55 to 97%.
Objectives To derive and validate a diagnostic algorithm that accurately identifies RA cases in the general population using the AHD of the Italian health system.
Methods A cross-sectional diagnostic study design was applied. To derive the algorithm, a first random sample of 900 visits between 2007 and 2010 was drawn from the electronic medical records of a tertiary rheumatology centre, applying a case:control ratio of 1:2. A second sample of 138 patients from a secondary care rheumatology clinic, with the same case:control ratio, and a third sample of 4457 subjects from the general population (primary care registry) were used for external validation. Diagnoses were clinically validated according to standardized criteria.
Clinical and administrative data were linked using deterministic record linkage through tax code.
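Deterministic linkage of this kind amounts to an exact-match join on a shared identifier. The following is a minimal Python sketch, assuming each source is a list of records carrying a `tax_code` field; the field names, the normalisation step and the sample values are illustrative assumptions, not details reported in the study.

```python
def normalise(tax_code):
    """Trim and upper-case the key so trivial formatting differences do not block a match."""
    return tax_code.strip().upper()


def link_records(clinical, administrative):
    """Exact-match (deterministic) join of two record lists on the tax code.

    Returns the linked clinical records (each with its administrative rows)
    and, separately, the clinical records that found no match.
    """
    # Index administrative rows by normalised key; one subject may have many rows.
    index = {}
    for row in administrative:
        index.setdefault(normalise(row["tax_code"]), []).append(row)

    linked, unmatched = [], []
    for patient in clinical:
        key = normalise(patient["tax_code"])
        matches = index.get(key)
        if matches:
            linked.append({**patient, "tax_code": key, "admin_rows": matches})
        else:
            unmatched.append(patient)
    return linked, unmatched


# Made-up example data (not from the study).
clinical = [
    {"tax_code": "abc123 ", "clinical_ra": True},
    {"tax_code": "XYZ789", "clinical_ra": False},
]
administrative = [
    {"tax_code": "ABC123", "source": "prescription"},
    {"tax_code": "ABC123", "source": "hospital_discharge"},
]
linked, unmatched = link_records(clinical, administrative)
```

Keeping unmatched records in a separate list, rather than silently dropping them, makes the linkage rate easy to audit.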
Items useful for identifying RA cases were defined through a literature-informed process involving clinicians, analysts and statisticians.
Using a priori beliefs and empirical data, an algorithm was developed that applied the most specific criteria in the first step and progressively more sensitive criteria in subsequent steps. Accuracy was assessed by calculating Se and Sp with 95% confidence intervals (CI). The consistency of these estimates was tested both by internal validation (bootstrap) and by two fully independent external validations. Positive and negative predictive values (PPV and NPV) were also estimated in the general population sample.
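The accuracy assessment described above can be sketched in Python as follows. The abstract does not state which interval method or bootstrap scheme was used, so the Wilson score interval, the percentile bootstrap stratified by clinical diagnosis, and all data below are assumptions made for illustration only.

```python
import math
import random


def wilson_ci(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (about 95% for z=1.96)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def se_sp(pairs):
    """pairs: list of (algorithm_positive, clinically_confirmed) booleans."""
    tp = sum(1 for a, c in pairs if a and c)
    fn = sum(1 for a, c in pairs if not a and c)
    tn = sum(1 for a, c in pairs if not a and not c)
    fp = sum(1 for a, c in pairs if a and not c)
    return tp / (tp + fn), tn / (tn + fp)


def bootstrap_ci(pairs, n_boot=1000, seed=0):
    """Percentile bootstrap intervals for Se and Sp.

    Cases and controls are resampled separately (an assumption), which keeps
    the case:control ratio fixed and avoids empty denominators.
    """
    rng = random.Random(seed)
    cases = [p for p in pairs if p[1]]
    controls = [p for p in pairs if not p[1]]
    se_vals, sp_vals = [], []
    for _ in range(n_boot):
        sample = rng.choices(cases, k=len(cases)) + rng.choices(controls, k=len(controls))
        se, sp = se_sp(sample)
        se_vals.append(se)
        sp_vals.append(sp)

    def percentile_interval(values):
        ordered = sorted(values)
        return ordered[int(0.025 * len(ordered))], ordered[int(0.975 * len(ordered)) - 1]

    return percentile_interval(se_vals), percentile_interval(sp_vals)


# Made-up data: 30 confirmed cases (27 flagged), 60 controls (5 flagged).
pairs = ([(True, True)] * 27 + [(False, True)] * 3
         + [(False, False)] * 55 + [(True, False)] * 5)
se, sp = se_sp(pairs)
se_wilson = wilson_ci(27, 30)
se_boot, sp_boot = bootstrap_ci(pairs)
```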
Results The following variables were included in the algorithm: ICD9 code 714.0 recorded by a rheumatologist, ICD9 codes for other rheumatologic diseases, code 714 in the Hospital Discharge Form, prescription of DMARDs (including biologics), and prescription of steroids. In the derivation sample, a four-step algorithm identified clinically diagnosed RA cases with a Se of 96.4% (95% CI: 93.6-98.2) and a Sp of 90.3% (95% CI: 87.5-92.7), confirmed by bootstrap estimates [Se 96.3% (95% CI: 96.2-96.4); Sp 90.3% (95% CI: 90.2-90.4)].
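A stepwise rule of this type, running from the most specific criteria to more sensitive ones, might be organised as in the Python sketch below. The abstract does not report the actual step definitions, so the four rules, their order and all field names here are hypothetical placeholders that only illustrate the cascade structure.

```python
from dataclasses import dataclass


@dataclass
class AdminProfile:
    """Administrative flags for one subject; all field names are hypothetical."""
    ra_code_by_rheumatologist: bool = False  # ICD9 714.0 recorded by a rheumatologist
    other_rheum_code: bool = False           # ICD9 code for another rheumatologic disease
    ra_hospital_discharge: bool = False      # code 714 in a Hospital Discharge Form
    dmard_prescription: bool = False         # DMARDs, including biologics
    steroid_prescription: bool = False


# Placeholder rules, ordered from most specific to most sensitive.
# These are NOT the study's criteria, which the abstract does not give.
STEPS = [
    lambda s: s.ra_code_by_rheumatologist and s.dmard_prescription and not s.other_rheum_code,
    lambda s: s.ra_code_by_rheumatologist and not s.other_rheum_code,
    lambda s: s.ra_hospital_discharge and s.dmard_prescription,
    lambda s: s.dmard_prescription and s.steroid_prescription and not s.other_rheum_code,
]


def classify(profile):
    """Return the step (1-4) at which the subject is flagged as RA, or None if never flagged."""
    for number, rule in enumerate(STEPS, start=1):
        if rule(profile):
            return number
    return None


step_a = classify(AdminProfile(ra_code_by_rheumatologist=True, dmard_prescription=True))
step_b = classify(AdminProfile(ra_hospital_discharge=True, dmard_prescription=True))
step_c = classify(AdminProfile())
```

Returning the step number rather than a bare yes/no keeps track of how specific the evidence behind each identified case was.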
The external validation on the secondary care sample showed highly consistent results: Se 93.8% (95% CI: 79.2-99.2) and Sp 90.7% (95% CI: 81.7-96.2).
Final validation at the population level showed a Se of 93.3% (95% CI: 77.9-99.2), a Sp of 99.7% (95% CI: 99.5-99.9), a PPV of 70% (95% CI: 53.5-84.4) and an NPV of 100% (95% CI: 99.8-100).
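A PPV well below the Sp is what Bayes' rule leads one to expect when the condition is uncommon in the sample. The standard relationship between predictive values, Se, Sp and prevalence can be sketched in Python as below; the prevalence used in the example is an illustrative assumption, not a figure reported in this abstract.

```python
def predictive_values(se, sp, prevalence):
    """PPV and NPV from sensitivity, specificity and prevalence (Bayes' rule)."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    true_pos = se * prevalence
    false_pos = (1.0 - sp) * (1.0 - prevalence)
    true_neg = sp * (1.0 - prevalence)
    false_neg = (1.0 - se) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Se and Sp are the population-level estimates above; the prevalence is assumed.
ppv, npv = predictive_values(se=0.933, sp=0.997, prevalence=0.007)
```

With a prevalence this low, even a small false-positive rate among the many unaffected subjects produces a count of false positives comparable to the number of true cases, which pulls the PPV down while leaving the NPV close to 1.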
Conclusions AHD are a valuable tool for identifying RA cases at the population level, making population-level impact studies feasible. Data on misclassification will be useful for improving estimates of disease occurrence and for selecting subjects for cohort studies.
Steinberg DM. Biostatistics 2009; 10: 94-105.
Mac Gregor A. J Rheumatol 1994; 21: 1420-6.
Disclosure of Interest None Declared