Background Musculoskeletal ultrasound (MSK-US) is widely used for assessing disease severity and response to therapy in rheumatoid arthritis (RA) clinical trials. Although reliability exercises have been published, little has been written on standardizing the many factors that may affect reliability.
Objectives The aim of this study was to optimize the reliability of MSK-US acquisition and scoring before embarking on a multicenter MSK-US-based RA study.
Methods This two-site MSK-US study evaluated the following joints: bilateral radiocarpal, intercarpal, radioulnar, MCP 1-5, PIP 1-5, knees, and MTP 2-5. Reliability between our two main ultrasonographers, each with >5 years of MSK-US experience, was assessed in 3 phases (pre-face-to-face [pre-F2F], F2F, and post-F2F). A 1st draft MSK-US RA atlas served as the initial reference for intensive teleconferences, which resulted in a 2nd draft atlas (images refined and scoring rules further delineated). Pre-F2F: The 2 ultrasonographers independently scored still images of 2 patients. Inter-reader reliability was calculated and the most discrepant scores were discussed. F2F: Each ultrasonographer independently scanned the same 6 patients and then re-scanned 2 of them. Inter- and intra-reader reliability were calculated, and the most discrepant scores were again discussed. Factors contributing to discordance were identified and addressed: cutoffs for synovitis scores, focal point positioning, positioning of deformed joints, room temperature, layer of gel used, prior familiarity with the machine, and color adjustment for power Doppler (PD) to accommodate color blindness (present in 7% of males and 0.5% of females). The 3rd and final atlas, which clearly delineated the discrepancies among scores, was developed as a reference for the post-F2F phase. Post-F2F: Each ultrasonographer rescored 2 patients' scans acquired during the F2F phase. Intra- and inter-reader reliability were calculated using percent agreement, weighted kappa, intraclass correlation coefficient (ICC), and Spearman correlation.
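As an illustration of two of the reliability measures named above, the following minimal sketch computes percent (exact) agreement and a linearly weighted Cohen's kappa for two readers. It is not the study's analysis code. The 0-3 ordinal scale, the linear weighting scheme, the function names, and the example scores are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter


def percent_agreement(a, b):
    """Share of items on which both readers gave the identical score."""
    if len(a) != len(b) or not a:
        raise ValueError("score lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def weighted_kappa(a, b, categories=(0, 1, 2, 3)):
    """Cohen's kappa with linear disagreement weights |i - j| / (k - 1).

    The 0-3 ordinal scale is an assumption, not taken from the abstract.
    """
    if len(a) != len(b) or not a:
        raise ValueError("score lists must be non-empty and of equal length")
    n = len(a)
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}

    # Joint and marginal counts of the category indices.
    joint = Counter((index[x], index[y]) for x, y in zip(a, b))
    row = Counter(index[x] for x in a)
    col = Counter(index[y] for y in b)

    # Observed weighted disagreement.
    d_obs = sum(abs(i - j) / (k - 1) * c / n for (i, j), c in joint.items())
    # Weighted disagreement expected by chance from the marginals.
    d_exp = sum(
        abs(i - j) / (k - 1) * row[i] * col[j] / n**2
        for i in range(k)
        for j in range(k)
    )
    if d_exp == 0:
        raise ValueError("kappa is undefined when both readers use one identical category")
    return 1 - d_obs / d_exp


# Hypothetical scores for 8 joints from two readers (not study data).
reader_a = [0, 1, 2, 3, 1, 2, 0, 3]
reader_b = [0, 1, 2, 2, 1, 3, 0, 3]

print(f"exact agreement: {percent_agreement(reader_a, reader_b):.2f}")  # 0.75
print(f"weighted kappa:  {weighted_kappa(reader_a, reader_b):.2f}")     # 0.80
```

With these hypothetical scores the readers agree exactly on 6 of 8 joints, and both disagreements are off by one grade, so the weighted kappa penalizes them only lightly. Quadratic weights would be an equally plausible choice and would give a different value.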
Results F2F and post-F2F intra-reader reliability ranged from 79% to 94% for exact agreement and from 0.64 to 0.89 for Spearman correlation. Across the pre-F2F, F2F, and post-F2F phases, inter-reader reliability ranged from 64% to 82% for exact agreement and from 0.59 to 0.83 for Spearman correlation (Tables 1 & 2). Inter-reader reliability improved from pre-F2F (64% agreement and 0.77 Spearman correlation) to post-F2F (82% agreement and 0.83 Spearman correlation).
Conclusions Our study highlights the importance of standardizing MSK-US acquisition in RA clinical trials and provides practical tools to improve reliability. We recommend that acquisition be standardized before multicenter MSK-US-based RA trials begin.
Disclosure of Interest None declared