
THU0156 Impact of Verification Bias on the Evaluation of Diagnostic Accuracy of Lung Ultrasound in Rheumatoid Arthritis-Related Interstitial Lung Disease
  1. M. Antivalle,
  2. M. Chevallard,
  3. M. Battellino,
  4. M.C. Ditto,
  5. V. Varisco,
  6. F. Rigamonti,
  7. A. Batticciotto,
  8. F. Atzeni,
  9. P. Sarzi-Puttini
  1. Rheumatology, L. Sacco University Hospital, Milano, Italy


Background Lung ultrasound (LUS) was reported to have a good diagnostic accuracy in the assessment of rheumatoid arthritis-related interstitial lung disease (RA-ILD), as compared to high resolution CT (HRCT), currently considered the diagnostic gold standard. However, partial verification, leading to biased estimates of the accuracy of LUS, is the rule in these studies, as the probability of undergoing HRCT is higher in patients with ILD.

Objectives To assess the influence of verification bias on estimates of the diagnostic accuracy of LUS in the detection of RA-ILD.

Methods We analyzed data from a previously reported study comparing the diagnostic accuracy of LUS with that of usual clinical practice in detecting RA-ILD [1]. LUS was performed in 152 RA patients, and its accuracy in the detection of ILD was compared to 3 clinical algorithms (Alg1: bibasilar crackles and dyspnea; Alg2: as Alg1 + reduced pulmonary function tests (PFTs); Alg3: as Alg1 + positive Chest X-Ray). By design, partial verification was substantial: only 71/152 patients were verified with HRCT, 49/152 had PFTs, and 72/152 had Chest X-Ray.

Estimates of sensitivity and specificity based on observed data only (which assumes a missing completely at random model, MCAR) were compared with the estimates obtained with the Begg-Greenes method (missing at random assumption, MAR) [2], and with multivariate imputation (multivariate imputation by chained equations, MICE) [3].
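The difference between the complete-case (MCAR) estimate and the Begg-Greenes (MAR) correction can be illustrated with a minimal Python sketch. It assumes a single binary test and that verification depends only on the test result. The function names and all counts are hypothetical and chosen for illustration; they are not data from the study.

```python
def complete_case(tp, fp, fn, tn):
    """Sensitivity and specificity from verified patients only (MCAR assumption)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


def begg_greenes(tp, fp, fn, tn, n_test_pos, n_test_neg):
    """Begg-Greenes correction for partial verification (MAR assumption).

    tp, fp, fn, tn: counts among patients verified with the reference standard.
    n_test_pos, n_test_neg: ALL test-positive / test-negative patients,
    verified or not.
    """
    # Disease probability given the test result, estimated in verified patients
    p_dis_given_pos = tp / (tp + fp)
    p_dis_given_neg = fn / (fn + tn)

    # Expected 2x2 table if every tested patient had been verified
    exp_tp = n_test_pos * p_dis_given_pos
    exp_fp = n_test_pos * (1 - p_dis_given_pos)
    exp_fn = n_test_neg * p_dis_given_neg
    exp_tn = n_test_neg * (1 - p_dis_given_neg)

    sensitivity = exp_tp / (exp_tp + exp_fn)
    specificity = exp_tn / (exp_tn + exp_fp)
    return sensitivity, specificity


# Hypothetical example: 160 patients tested, 75 verified.
# Test-positive patients are verified far more often than test-negative ones.
verified = dict(tp=40, fp=10, fn=5, tn=20)
naive_se, naive_sp = complete_case(**verified)
corr_se, corr_sp = begg_greenes(**verified, n_test_pos=60, n_test_neg=100)

print(f"Complete case: Se={naive_se:.3f}, Sp={naive_sp:.3f}")
print(f"Begg-Greenes:  Se={corr_se:.3f}, Sp={corr_sp:.3f}")
```

With these hypothetical counts, the complete-case sensitivity is higher than the corrected one and the complete-case specificity is lower, which is the usual direction of verification bias when test-positive patients are preferentially verified.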

Results Accuracy estimates based on complete data only (MCAR) differed significantly from the MAR estimates and from MICE results, and tended to inflate sensitivity estimates. Figure 1 reports the point sensitivity and specificity estimates under different assumptions for LUS and the 3 clinical algorithms, and the “ignorance region” as defined by Kosinski and Barnhart [4]. LUS had the highest diagnostic accuracy. However, LUS estimates of sensitivity and specificity varied from 63.4% to 88.2%, and from 68.5% to 81.3% respectively. Figure 2 reports the influence of estimation model on the likelihood ratios of RA-ILD based on LUS results. LR+ varied from 2.80 to 4.33, and LR- from 0.17 to 0.45.
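Likelihood ratios follow directly from sensitivity and specificity: LR+ = Se / (1 - Sp) and LR- = (1 - Se) / Sp. The short Python sketch below shows that arithmetic. The function name is illustrative, and pairing the highest reported LUS sensitivity (88.2%) with the lowest reported specificity (68.5%) is an assumption made for the example; the abstract does not state which estimates belong to the same model.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary diagnostic test."""
    lr_pos = sensitivity / (1 - specificity)   # how much a positive result raises the odds of disease
    lr_neg = (1 - sensitivity) / specificity   # how much a negative result lowers the odds of disease
    return lr_pos, lr_neg


# Assumed pairing of reported extremes (not confirmed by the abstract)
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.882, specificity=0.685)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

Under that assumed pairing the arithmetic gives an LR+ of about 2.80 and an LR- of about 0.17. Because both ratios depend on sensitivity and specificity together, a shift in either estimate between models changes how a positive or negative LUS result should be interpreted.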

Conclusions Verification bias can severely affect estimates of the diagnostic accuracy of LUS in RA-ILD, and should always be taken into account in the interpretation of results.


  1. Antivalle M, et al. Lung ultrasound screening for interstitial lung disease in rheumatoid arthritis. Comparison with usual detection algorithms in clinical practice. Arthritis Rheum 2014;66(suppl):S1301.

  2. Begg CB, Greenes RA. Assessment of diagnostic tests when disease verification is subject to selection bias. Biometrics 1983;39:207–215.

  3. van Buuren S. Multiple imputation of discrete and continuous data by fully conditional specification. Stat Methods Med Res 2007;16:219–242.

  4. Kosinski AS, Barnhart HX. Accounting for nonignorable verification bias in assessment of diagnostic tests. Biometrics 2003;59:163–171.

Disclosure of Interest None declared
