Background: Radiographic progression in clinical trials is assessed by interpreting changes in the total radiographic joint score, and the reliability of these scores is evaluated at the level of sum scores. It is not known how consistently changes in individual joints are identified by independent readers and across independent readings.
Patients and Methods: A total of 7255 single joints from 178 patients who participated in the Trial of Etanercept and Methotrexate with Radiographic Patient Outcomes (TEMPO) were evaluated. Every image was independently scored twice according to the Sharp–van der Heijde method by two independent readers, so that four scores per joint were available. Absolute agreement and consistency of negative and positive erosion change scores across readers and readings were compared at the per-joint level as well as at the per-patient level.
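As an illustration of the per-joint comparison, the following minimal Python sketch classifies the four erosion change scores of a single joint by the direction of change. The function name `classify_joint` and the category labels are hypothetical and are not taken from the study; the sketch assumes integer change scores in which zero denotes no change.

```python
from typing import Sequence


def classify_joint(changes: Sequence[int]) -> str:
    """Classify one joint from its change scores (one per reading).

    Category labels are illustrative assumptions, not the study's terms.
    """
    positive = sum(c > 0 for c in changes)
    negative = sum(c < 0 for c in changes)

    if positive == 0 and negative == 0:
        return "no_change"    # all readings scored zero change
    if positive > 0 and negative > 0:
        return "opposite"     # readings disagree on the direction of change
    if positive == len(changes) or negative == len(changes):
        return "consistent"   # every reading shows the same direction
    return "partial"          # some readings show change, the rest show none


# Example: two readers, two readings each, for three joints
for scores in ([0, 0, 0, 0], [1, 2, 1, 1], [1, 0, -1, 0]):
    print(scores, classify_joint(scores))
```

Counting joints per category across all patients would yield tallies of the kind reported in the Results, under the assumptions stated above.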
Results: The number of joints showing a change in erosion score was very low in this trial: 691 of the 7255 analysed joints (9.5%) had at least one non-zero change score among the four readings. Absolute agreement between readings was remarkably poor: only 12 joints showed a consistently positive or negative change in all four readings. Change scores in opposite directions in the same joint across independent readings were rare (25 joints). The frequency of opposite joint scores in the same patient (mixed change patterns) was reader dependent.
Conclusion: Substantial intrareader and interreader disagreement in scoring change in individual joints is common. Opposite joint scores in the same patient, however, are rare and reader dependent. Notwithstanding these inconsistencies at the individual joint level, the total Sharp score is a useful and discriminatory outcome measure.
Competing interests: None.