Objective To evaluate the performance of whole body (WB) MRI versus conventional (CON) MRI in assessing active inflammatory lesions of the entire spine in patients with established and clinically active axial spondyloarthritis (SpA) using the Spondyloarthritis Research Consortium of Canada (SPARCC) MRI index.
Methods 32 consecutive patients with SpA fulfilling the modified New York criteria and with clinically active disease (Bath Ankylosing Spondylitis Disease Activity Index (BASDAI) score ≥4) were scanned by sagittal WB and CON MRI of the spine. The MR images were scored independently in random order by three readers blinded to patient identifiers. Active inflammatory lesions of the spine were recorded on a web-based scoring form. Pearson correlation coefficients were used to compare scores for WB MRI and CON MRI for each rater and intraclass correlation coefficients (ICC) were used to assess interobserver reliability.
Results The median percentage of inflammatory lesions recorded concordantly for both WB MRI and CON MRI ranged from 83% to 91% for the three readers; 4–9% were only recorded by WB MRI and 4–9% were recorded by CON MRI only. The Pearson correlation coefficients between WB and CON MRI per rater were 0.79, 0.89 and 0.81, respectively. The ICC(2, 1) were 0.75, 0.80 and 0.68 for CON MRI and 0.82, 0.83 and 0.93 for WB MRI for the three possible reader pairs.
Conclusion WB MRI and CON MRI scores showed a high correlation and comparable high reliability for the detection of active inflammatory lesions in the spine of patients with clinically active SpA.
Assessing disease activity in axial spondyloarthritis (SpA) is primarily based on patient self-reports; physical and laboratory investigations lack sensitivity. Active spinal inflammation has a considerable impact on physical function and everyday activities; however, these spinal inflammatory lesions cannot be displayed by plain radiography which captures secondary bony changes such as syndesmophytes or ankylosis. MRI is the preferred imaging modality to detect active inflammatory lesions in the axial skeleton long before the appearance of syndesmophytes or spinal ankylosis as detected by plain radiography.1
Conventional (CON) MRI of the complete spine, which acquires separate images of the upper and lower halves of the spine, covers less anatomy than whole body (WB) MRI, which also includes the sacroiliac joints (SIJ), chest wall, shoulders and hips. WB MRI is based on multichannel technology and parallel imaging.2 However, it remains to be proven that WB MRI has the same sensitivity and reliability for the detection of active spinal inflammatory lesions as CON MRI.
The objective of this study was to evaluate the performance of WB MRI versus CON MRI in assessing active inflammatory lesions in the entire spine of patients with SpA with established and clinically active disease by using the Spondyloarthritis Research Consortium of Canada (SPARCC) MRI index.3 This cross-sectional study is designed to compare WB MRI against CON MRI, and therefore precludes any comparison of their diagnostic utility in patients with early undifferentiated SpA or monitoring disease course.
Patients with active SpA who fulfilled the modified New York classification criteria4 were consecutively recruited in a single rheumatology outpatient clinic from April 2006 to August 2007. Active disease was defined as a Bath Ankylosing Spondylitis Disease Activity Index (BASDAI) score ≥4 and/or a score ≥4 on BASDAI item 2, which assesses spinal pain on a 0 to 10 numerical rating scale (NRS).5
There were no limitations on the age and disease duration of the study subjects in order to obtain a representative cross-sectional sample of patients with SpA in a rheumatology outpatient clinic. Several clinical and laboratory parameters served to compare disease activity with the MRI findings: the complete BASDAI score, BASDAI item 2 score, intensity and duration of morning stiffness of the back and intensity of nocturnal pain (both assessed by a NRS ranging from 0 to 10), Bath Ankylosing Spondylitis Functional Index (BASFI),6 C-reactive protein (CRP) and erythrocyte sedimentation rate (ESR).
Patients with ongoing or previous therapy (within the preceding 6 months) with tumour necrosis factor α inhibitors or other biological agents, bone neoplasms, osteomyelitis, prior surgery of the spine, pelvic or shoulder girdle and major spinal deformity were excluded from the study. Additional exclusion criteria were pregnancy and technical contraindications to MRI such as cardiac pacemakers and similar devices.
Whole body MRI
A Siemens Avanto 1.5 Tesla magnet (Siemens Medical Solutions, Erlangen, Germany) with 18 independent radiofrequency channels was employed. Various combinations of up to six coils simultaneously plugged into the system were used in order to provide sagittal images covering the entire spine and the sacrum2 with the following parameters: Turbo Short Tau Inversion Recovery (STIR): repetition time (TR) 6270 ms, echo time (TE) 93 ms, inversion time (TI) 130 ms, Turbo Factor 21, PAT (parallel acquisition techniques) Factor 2, PAT mode GRAPPA (generalised autocalibrating partially parallel acquisition); T1-weighted spin echo sequence: TR 401 ms, TE 11 ms, PAT Factor 2, PAT mode GRAPPA. Two imaging steps were used with a field of view (FOV) of 450 mm and an imaging matrix of 780 × 448 (STIR) and 890 × 448 (T1-weighted spin echo) pixels per step, respectively; 3 mm section thickness, interslice gap 0.3 mm, 20 sections resulting in a FOV of 780 × 450 mm. The acquisition times for the two sequences were 2 min 49 s and 2 min 21 s, respectively.
Conventional MRI
Sagittal turbo STIR and turbo T1-weighted spin echo images of the upper and the lower half of the spine were obtained. Images of the upper half of the spine extended from C0 to at least Th7, and images of the lower half from Th7 to at least S3. Imaging parameters for the sagittal STIR sequence were TR 3920 ms, TE 70 ms, TI 150 ms, Turbo Factor 9, PAT Factor 2, PAT mode GRAPPA, matrix size 448 × 269 pixels per step. The parameters of the sagittal T1 spin echo images were TR 659 ms, TE 11 ms, Turbo Factor 3, PAT Factor 2, PAT mode GRAPPA, with a matrix size of 512 × 256 pixels. In both sequences the FOV was 450 mm and the slice thickness was 3 mm with an interslice gap of 0.3 mm. Twenty sections were obtained per sequence. Acquisition times were 2 min 03 s for each of the T1-weighted sequences and 1 min 59 s for each of the STIR sequences.
Analysis of MR images
The WB and CON MR images were scored independently by three readers (AGJ, WPM, JH) not involved in patient recruitment and blinded to patient identity and clinical parameters (figure 1). One reader (AGJ, a radiologist) is an experienced OMERACT (Outcome Measures in Rheumatoid Arthritis Clinical Trials) reader and one reader (WPM, a rheumatologist) is co-author of the SPARCC MRI index3; the third reader (JH) is a staff radiologist experienced in musculoskeletal MRI who participated together with WPM in a video teleconference calibration exercise on the SPARCC score (by using reference cases of spinal MRI of either technique). Each reader also reviewed an online training module on the SPARCC spine scoring index available at www.arthritisdoctor.ca. The films were reviewed in random order (concerning both the sequence of patients and of WB and CON MRI) on electronic work stations in the institution of each reader.
Active inflammatory lesions on STIR sequences of the 23 discovertebral units (DVU) from segment C2/C3 to segment L5/S1 on WB MRI and CON MRI films were rated on a web-based scoring program of the SPARCC MRI index (www.sparccmri.filipow.ca). This is based on the concept of a DVU which is defined as the region between two virtual horizontal lines through the middle of two adjacent vertebrae including the intervertebral disc and the two adjacent vertebral endplates.7 The SPARCC MRI method evaluates the four quadrants of each DVU. A three-dimensional evaluation of each lesion is conducted by assessing three consecutive sagittal slices of each quadrant in a dichotomous manner (1=increased signal; 0=normal signal on STIR sequences) which gives a scoring range per DVU of 0–12. An additional score for each of the three slices assessing the extent of inflammatory lesions by recording depth in relation to the endplate (1: ≥1 cm; 0: <1 cm) results in a total scoring range per DVU of 0–15. Spinal T1 spin echo sequences were available for anatomical reference.
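The per-DVU arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the scoring range as stated in this section (12 dichotomous quadrant ratings plus 3 depth flags); the function name `score_dvu` and the input layout are assumptions made for the example, not part of the SPARCC web-based scoring program.

```python
from typing import Sequence


def score_dvu(signal: Sequence[Sequence[int]], depth: Sequence[int]) -> int:
    """Sum the ratings for one discovertebral unit (DVU).

    signal: 3 sagittal slices x 4 quadrants, each 1 (increased STIR signal) or 0.
    depth:  one flag per slice, 1 if the lesion depth is >= 1 cm, else 0.
    Hypothetical input layout chosen for this sketch.
    """
    if len(signal) != 3 or any(len(slice_) != 4 for slice_ in signal):
        raise ValueError("signal must contain 3 slices of 4 quadrants each")
    if len(depth) != 3:
        raise ValueError("depth must contain one flag per slice (3 flags)")
    values = [v for slice_ in signal for v in slice_] + list(depth)
    if any(v not in (0, 1) for v in values):
        raise ValueError("all ratings are dichotomous (0 or 1)")
    # 12 quadrant ratings (0-12) plus 3 depth flags gives a range of 0-15.
    return sum(values)


# Illustrative ratings only, not study data.
example_signal = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
example_depth = [1, 0, 0]
print(score_dvu(example_signal, example_depth))  # 4
```

A DVU with increased signal in every quadrant of every slice and a deep lesion on all three slices reaches the maximum of 15.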
The readers scored active inflammatory lesions of all 23 DVU as well as the 6 most affected DVU as specified in the SPARCC MRI index for use in clinical trials. It has been shown that a limited assessment of the 6 most affected DVU performs at least as well as a total assessment of the spine using the SPARCC MRI spine index.8
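A minimal sketch of the two summary scores, assuming that "most affected" means the six DVU with the highest individual scores for a given reader and scan. The function name `sparcc_sums` and the example values are hypothetical.

```python
from typing import Sequence, Tuple


def sparcc_sums(dvu_scores: Sequence[int]) -> Tuple[int, int]:
    """Return (sum over all 23 DVU, sum over the 6 highest-scoring DVU).

    Assumption for this sketch: the 6 most affected DVU are the 6 largest
    per-DVU scores, regardless of their position in the spine.
    """
    if len(dvu_scores) != 23:
        raise ValueError("expected one score per DVU from C2/C3 to L5/S1 (23)")
    total = sum(dvu_scores)
    top_six = sum(sorted(dvu_scores, reverse=True)[:6])
    return total, top_six


# Illustrative per-DVU scores (0-15 each), not study data.
scores = [0] * 5 + [4, 7, 2, 0, 9, 12, 3, 0, 0, 5, 1, 0, 0, 6, 0, 0, 0, 0]
print(sparcc_sums(scores))  # (49, 43)
```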
To evaluate the inflammatory changes within the entire spine, a single DVU analysis of the mean sum score of the three readers was performed for all 23 DVU from C2/C3 to L5/S1.
To assess the association between WB MRI and CON MRI scores within one rater, scatterplots of these respective measurements were inspected and Pearson correlation coefficients computed. Mean sum scores were compared using a Mann–Whitney test. To check for systematic relations between the differences of the CON MRI and WB MRI measurements and their means, increasing variability and potentially necessary transformation of rater scores within each rater, Bland–Altman plots with corresponding limits of agreement were inspected.9 The average score of each rater for the two MRI methods on the x-axis was plotted against the score difference (CON-WB) on the y-axis including 95% limits of agreement. No need for any transformation of score measurements was detected. In all these analyses, no correction for multiple testing was performed and the significance level α was set to 0.05. The intrareader correlation between the two methods was defined as moderate, good, very good and excellent by values >0.5, >0.6, >0.8 and >0.9, respectively.
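The within-reader comparison can be illustrated with a short pure-Python sketch that computes a Pearson correlation coefficient and Bland–Altman 95% limits of agreement (mean difference ± 1.96 standard deviations of the differences). The function names and the example scores are assumptions for illustration; the statistical software used in the study is not stated in this section.

```python
import math
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between paired scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bland_altman(
    con: Sequence[float], wb: Sequence[float]
) -> Tuple[List[float], List[float], float, Tuple[float, float]]:
    """Return per-patient means (x-axis), differences CON - WB (y-axis),
    the mean difference, and the 95% limits of agreement."""
    diffs = [c - w for c, w in zip(con, wb)]
    means = [(c + w) / 2 for c, w in zip(con, wb)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return means, diffs, bias, (bias - 1.96 * sd, bias + 1.96 * sd)


# Illustrative scores of one reader for six patients, not study data.
con_scores = [10, 0, 25, 7, 40, 3]
wb_scores = [12, 1, 20, 9, 38, 2]
r = pearson_r(con_scores, wb_scores)
_, _, bias, limits = bland_altman(con_scores, wb_scores)
print(f"r = {r:.2f}, mean difference = {bias:.2f}, limits = {limits}")
```

Plotting the returned means against the differences, with horizontal lines at the mean difference and the two limits, gives the Bland–Altman plot described in the text.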
Within each method, intraclass correlation coefficients (ICC) were computed based on a two-way random effects model with single measurements to compare interobserver reliability.10 The rationale for using this model is that raters are considered to be a random sample from the population of all raters; this ICC variant is commonly abbreviated as ICC(2, 1). Consequently, the results are representative of a larger population of raters. The ICC variant ICC(3, 1) was additionally calculated; this approach considers the raters to be fixed and thus not representative of a larger population. ICCs were computed for each reader pair and for all three reader pairs jointly for the two MRI methods (CON MRI vs WB MRI) and number of DVU category (all 23 DVU vs the 6 most affected DVU). ICCs >0.5, >0.6, >0.8 and >0.9 were regarded as representing moderate, good, very good and excellent reproducibility, respectively.
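Both ICC variants can be derived from the mean squares of a two-way analysis of variance with patients as rows and readers as columns. The sketch below uses the standard single-measurement formulas; the function name `icc_two_way` and the example ratings are assumptions for illustration and do not reproduce the study data or the software used by the authors.

```python
from typing import Sequence, Tuple


def icc_two_way(ratings: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Return (ICC(2, 1), ICC(3, 1)) for an n-subjects x k-raters table."""
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 subjects and raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols                  # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))

    # ICC(2, 1): raters are a random sample, so rater variance counts as error.
    icc21 = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    # ICC(3, 1): raters are fixed, so systematic rater offsets are ignored.
    icc31 = (msr - mse) / (msr + (k - 1) * mse)
    return icc21, icc31


# Illustrative ratings (6 subjects x 4 raters), not study data.
table = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc21, icc31 = icc_two_way(table)
print(f"ICC(2, 1) = {icc21:.2f}, ICC(3, 1) = {icc31:.2f}")
```

For a reader pair the table has two columns; for all three readers jointly it has three. Because ICC(2, 1) counts systematic differences between raters as disagreement, it cannot exceed ICC(3, 1) whenever the between-rater mean square is larger than the residual mean square.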
Thirty-two patients (27 men) participated in the study, all of whom fulfilled the modified New York criteria; 30 had primary ankylosing spondylitis and 2 had associated Crohn's disease. The median age was 35.5 years (range 17.3–65.5) and median symptom duration was 10.5 years (range 1–37). Sixteen patients (50%) had a symptom duration of ≤10 years. Twenty-seven of 31 patients (87%) were HLA B27 positive; this genetic marker was not tested in one patient.
The median BASDAI global score was 4.3 (range 2.0–7.2) and the median BASDAI item 2 score was 6.5 (range 4–10). The median ESR was 15 mm/h (range 2–60). The BASFI ranged from 0 to 7.7 with a median value of 3.5. The median Bath Ankylosing Spondylitis Metrology Index (BASMI)11 score was 1.0 (range 0–7).
Intrareader correlation for WB MRI and CON MRI scores of the spine
Of the 23 DVU from C2/C3 to L5/S1, the median number rated concordantly by WB MRI and CON MRI for the presence or absence of inflammatory lesions was 19, 21 and 19 for readers 1, 2 and 3, respectively. Inflammatory changes were recorded only by WB MRI in a median of 2, 1 and 2 DVU and only by CON MRI in a median of 2, 1 and 2 DVU for readers 1, 2 and 3, respectively (table 1).
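The percentages quoted in the abstract appear to follow directly from these median DVU counts divided by the 23 DVU assessed. A short check, with the helper name `pct_of_dvu` chosen for this sketch:

```python
TOTAL_DVU = 23  # C2/C3 to L5/S1


def pct_of_dvu(count: int) -> int:
    """Express a DVU count as a whole-number percentage of all 23 DVU."""
    return round(100 * count / TOTAL_DVU)


concordant = [19, 21, 19]   # readers 1, 2, 3
wb_only = [2, 1, 2]
con_only = [2, 1, 2]

print([pct_of_dvu(c) for c in concordant])  # [83, 91, 83]
print([pct_of_dvu(c) for c in wb_only])     # [9, 4, 9]
print([pct_of_dvu(c) for c in con_only])    # [9, 4, 9]
```

For each reader the three counts sum to 23, so every DVU falls into exactly one of the three categories.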
Bland–Altman plots showed no evidence of major discrepancies between the two methods being related to the SPARCC MRI scores for inflammation. Limits of agreement were wider for the 23 DVU than for the 6 DVU SPARCC MRI score for all three readers (figure 2). The Pearson correlation coefficient between WB MRI and CON MRI per rater for the three readers was 0.79, 0.89 and 0.81, respectively, for all 23 DVU and 0.84, 0.87 and 0.81, respectively, for the 6 most affected DVU (table 2).
Figure 3 shows the mean sum score of the three readers per DVU for WB MRI and CON MRI, and it also shows the previously described clustering of inflammatory lesions in the thoracic spine.12 The mean sum score of the three readers showed no statistical difference between CON MRI and WB MRI for 23 and 6 DVU, with the exception of one p value of 0.049 (Mann–Whitney test) for 23 DVU in one reader.
Interobserver reliability for WB MRI and CON MRI scores of the spine
The interobserver correlation for WB MRI and CON MRI scores assessed by ICC(3, 1), which considers the observers as not being representative of a larger population of readers, ranged from 0.78 to 0.88 for CON MRI and from 0.82 to 0.96 for WB MRI for the three reader pairs (table 3). Under the assumption that the three readers are representative of a larger population of observers, the ICC(2, 1) yielded a range of 0.68–0.81 for CON MRI and 0.81–0.93 for WB MRI for the three reader pairs. Reader pair 2/3 trained by a pre-readout calibration exercise showed a higher inter-reader correlation for WB MRI with values ranging from 0.93 to 0.96 for both ICC variants.
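To relate these ranges to the verbal categories defined in the statistical methods (>0.5 moderate, >0.6 good, >0.8 very good, >0.9 excellent), a small lookup is sufficient. The function name `classify` and the label "below moderate" for values of 0.5 or less are assumptions for this sketch; the text does not name that category.

```python
def classify(value: float) -> str:
    """Map a correlation or ICC value to the category used in this study."""
    thresholds = (
        (0.9, "excellent"),
        (0.8, "very good"),
        (0.6, "good"),
        (0.5, "moderate"),
    )
    for threshold, label in thresholds:
        if value > threshold:  # the cut-offs are strict inequalities
            return label
    return "below moderate"  # label assumed for this sketch


# Range end points reported for the three reader pairs.
reported = {
    "ICC(3, 1) CON MRI": (0.78, 0.88),
    "ICC(3, 1) WB MRI": (0.82, 0.96),
    "ICC(2, 1) CON MRI": (0.68, 0.81),
    "ICC(2, 1) WB MRI": (0.81, 0.93),
}
for name, (low, high) in reported.items():
    print(f"{name}: {classify(low)} to {classify(high)}")
```

Under these cut-offs the CON MRI ranges span good to very good and the WB MRI ranges span very good to excellent.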
Correlation of WB MRI and CON MRI scores of the spine with clinical variables
Only 3 out of 96 Pearson correlations between mean spine MRI scores from all three readers (WB or CON, 23 or 6 DVU) and BASDAI global score, BASDAI item 2 (axial pain), BASFI, ESR and CRP, nocturnal pain or morning stiffness reached statistical significance (data not shown) and were considered false positives; at a significance level of 0.05, roughly five of 96 tests (96 × 0.05 = 4.8) would be expected to reach significance by chance alone.
This cross-sectional study of 32 patients with active SpA representing a broad disease spectrum showed a very good correlation between WB MRI and CON MRI in assessing active inflammatory lesions of the entire spine. The inter-reader reliability was good to excellent depending on the statistical assumptions concerning the characteristics of the reader team and on pre-readout calibration. Together with a very good intrareader correlation and inter-reader agreement for the evaluation of inflammatory changes of the SIJ,13 WB MRI proved to be a valid alternative to CON MRI for assessing inflammation of the entire axial skeleton in a sample of 32 patients with confirmed and clinically active SpA. Moreover, with an examination time of only 30 min to image the entire axial skeleton including the shoulder and hip girdles, WB MRI is more convenient for the patient than performing separate images of the SIJ and of the upper and lower halves of the spine by CON MRI.
The high clinical relevance of spinal inflammation in terms of pain and impaired physical function contrasts with the limited options of how to assess spinal disease activity. With its more comprehensive assessment of inflammation compared with CON MRI of a limited region, WB MRI may serve as an objective and quantitative measure of inflammatory lesions in the entire axial skeleton in SpA as well as in the hip and shoulder girdles and of the anterior chest wall. Our WB MRI validation study of the entire spine confirms the previous finding that the thoracic spine is the second most frequent axial region affected by inflammation in SpA after the SIJ.12
Using the SPARCC MRI index, the intraobserver correlation between the two MRI methods was very good both for all 23 DVU and for the 6 most affected DVU. The inter-reader reliability expressed as ICC(2, 1) (regarding the readers as representative of a larger population of similar observers) and as ICC(3, 1) (considering the raters as not representative of a larger group of observers) ranged from good to excellent. The inter-reader reliability for WB MRI was consistently higher than for CON MRI, possibly due to a more convenient orientation for a reader on a single WB MRI film representing the entire spine than on two CON images displaying the upper and lower halves of the spine separately. Excellent inter-reader reliability for both ICC(2, 1) and ICC(3, 1) was observed for WB MRI and for the calibrated reader pair 2/3; this finding probably reflects the easier interpretation of one single WB MRI film and underscores the relevance of a pre-readout calibration. The range of interobserver correlation of 0.68–0.96 for the three reader pairs in our study is comparable to the range of 0.73–0.97 for the three reader pairs in the validation study of the spinal SPARCC MRI index.3 A multireader study of the ASAS/OMERACT Working Group comparing three different scoring methods for spinal MRI activity in SpA showed a range of ICC(2, 1) status scores of 0.55 to 0.93 for the SPARCC score, which compares well with the range of 0.68–0.96 in the present study; the SPARCC score consistently resulted in higher ICCs than two other scoring indices.14
In parallel to the SIJ scores,13 there was no correlation of spinal inflammation scores by either of the two MRI techniques with several clinical and laboratory parameters for disease activity, thus reinforcing our hypothesis that inflammatory MRI lesions in the axial skeleton may represent another quality of disease activity compared with clinical and laboratory parameters.
There were five patients with inflammation in the SIJ but not in the spine, and four patients with inflammation in the spine but not in the SIJ. It is of interest that the four patients with inflammation in the spine but not in the SIJ were relatively older, had longer disease duration and had near complete ankylosis of both SIJs on plain radiography. The same observation has recently been reported and interpreted to be a consequence of disease duration15 which is associated with ankylosis of the SIJ.
Before the widespread use of WB MRI in clinical practice, we need more data on reliability of change scores and on the clinical relevance of inflammation in the hip and shoulder girdle as well as the anterior chest wall also displayed by WB MRI. Further research is needed to define standards for reporting spinal inflammatory lesions observed on MRI, evaluating sensitivity and specificity of these inflammatory lesions and for assessing whether spinal inflammatory MR changes are indeed predictive of future structural damage visible as radiographic syndesmophyte formation.16
In conclusion, WB MRI and CON MRI showed a very good correlation for active spinal inflammatory lesions as measured by the SPARCC MRI method in patients with confirmed SpA. The inter-reader reliability to detect active inflammatory lesions of the entire spine was good to excellent, depending on the statistical assumptions concerning the characteristics of the reader team and on pre-readout calibration.
The authors thank the patients for their participation, Désirée van der Heijde, Department of Rheumatology, Leiden University Medical Center, Leiden, The Netherlands, for her advice concerning the study design, Tracey Clare, clinical research manager, Paul Filipow, data manager, Department of Radiology, University of Alberta, Edmonton, Canada for coordinating the web-based SPARCC scoring index, and Christian Streng, Balgrist University Hospital, Zurich, Switzerland, for his technical assistance with figure 1. We thank the following Swiss rheumatologists, internists and primary care physicians for referral of their patients: D Amgwerd, Spreitenbach; G Bickel, Rapperswil; C Boetschi, Romanshorn; C Brunner, Zurich; S Buergin, Basel; P De Vecchi, St Moritz; D Galovic, Pfaeffikon; T Gerber, Zurich; M Giger, Menzingen; C Gut, Reinach; F Haefelin, Zurich; G Hajnos, Zurich; J Imholz, Zurich; C Jeanneret, Schwerzenbach; M Klopfstein, Biel; I Kramers, Zurich; A Meniconi, Schwyz; S Pfister, Buelach; A Schmidt, Basel; J Sturzenegger, Kreuzlingen; P Sutter, Zurich; F Tapernoux, Rueti; B Weiss, Basel; R Wuethrich, Brugg.
Funding The project was funded by the Walter L and Johanna Wolf Foundation, Zurich, Switzerland and the Foundation for Scientific Research at the University of Zurich, Switzerland.
Competing interests None.
Ethics approval This study was conducted with the approval of the Spezialisierte Unterkommission Orthopaedie/Bewegungsapparat der Kantonalen Ethikkommission Zuerich (KEK), Gesundheitsdirektion Kanton Zuerich. The local ethics committee approved the protocol and all patients gave written informed consent.
WPM is a Scientist of the Alberta Heritage Foundation for Medical Research.
Provenance and peer review Not commissioned; externally peer reviewed.