Choose wisely: imaging for diagnosis of axial spondyloarthritis
  1. Torsten Diekhoff1,
  2. Iris Eshed2,
  3. Felix Radny1,
  4. Katharina Ziegeler1,
  5. Fabian Proft3,
  6. Juliane Greese1,
  7. Dominik Deppe1,
  8. Robert Biesen4,
  9. Kay Geert Hermann1,
  10. Denis Poddubnyy5
  1. 1Department of Radiology, Charité Universitätsmedizin Berlin, Berlin, Germany
  2. 2Radiology, Sheba Medical Center, Tel Hashomer, Israel
  3. 3Department of Gastroenterology, Infectiology and Rheumatology, Charité Universitätsmedizin Berlin Campus Benjamin Franklin, Berlin, Germany
  4. 4Department of Rheumatology and Clinical Immunology, Charité Universitätsmedizin Berlin, Berlin, Germany
  5. 5Division of Gastroenterology, Infectious Diseases and Rheumatology, Charité Universitätsmedizin Berlin, Berlin, Germany
  1. Correspondence to Torsten Diekhoff, Department of Radiology, Charité Universitätsmedizin Berlin, Berlin, Berlin, Germany; torsten.diekhoff{at}


Objective To assess the diagnostic accuracy of radiography (X-ray, XR), CT and MRI of the sacroiliac joints for diagnosis of axial spondyloarthritis (axSpA).

Methods 163 patients (89 with axSpA; 74 with degenerative conditions) underwent XR, CT and MR. Three blinded experts categorised the imaging findings into axSpA, other diseases or normal in five separate reading rounds (XR, CT, MR, XR+MR, CT+MR). The clinical diagnosis served as reference standard. Sensitivity and specificity for axSpA and inter-rater reliability were compared.

Results XR showed lower sensitivity (66.3%) than MR (82.0%) and CT (76.4%) and also an inferior specificity of 67.6% vs 86.5% (MR) and 97.3% (CT). XR+MR was similar to MR alone (sensitivity 77.5%/specificity 87.8%) while CT+MR was superior (75.3%/97.3%). CT had the best inter-rater reliability (kappa=0.875), followed by MR (0.665) and XR (0.517). XR+MR was similar (0.662) and CT+MR (0.732) superior to MR alone.

Conclusions XR had inferior diagnostic accuracy and inter-rater reliability compared with cross-sectional imaging. MR alone was similar in diagnostic performance to XR+MR. CT had the best accuracy, strengthening the importance of structural lesions for the differential diagnosis in axSpA.

  • spondylitis, ankylosing
  • magnetic resonance imaging
  • low back pain

Data availability statement

Data are available on reasonable request. All data relevant to the study are included in the article or uploaded as online supplemental information. All source data including but not limited to scoring results and primary imaging are available from the corresponding author on request.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:

Key messages

What is already known about this subject?

  • Current imaging guidelines recommend radiography of the sacroiliac joints as first-line modality, followed by MRI when axial spondyloarthritis is suspected. Recently, low-dose CT was introduced for detection of structural lesions in the sacroiliac joints; however, its impact for diagnostic workup is still unclear.

What does this study add?

  • Radiography is inferior to MRI and CT of the sacroiliac joints in establishing the diagnosis when axial spondyloarthritis is suspected.

  • Combined radiography and MRI had no added value on readers’ performance or inter-rater reliability compared with MRI alone, and diagnostic scenarios with radiography as the first imaging test showed low specificity.

  • CT shows superior specificity and positive likelihood ratio and only a small shortfall in sensitivity compared with MRI, underlining the importance of structural lesions for the differential diagnosis.

How might this impact on clinical practice or future developments?

  • Radiography should be avoided whenever MRI is readily available, and current guidelines must be re-evaluated. CT is a highly specific alternative whenever MRI is inconclusive, unfeasible or not available.


The European Alliance of Associations for Rheumatology guidelines still recommend X-ray (XR) as first-line imaging in axial spondyloarthritis (axSpA) and MRI if the diagnosis cannot be established by XR and clinical features.1 2 While XR may miss early changes and has low inter-rater reliability, MR has proven to be superior in detecting erosions3 and depicts periarticular and intra-articular fatty metaplasia and active inflammation of bone marrow and soft tissues.4 Therefore, the question arises whether XR should always be used as a first-line imaging test or could be replaced by cross-sectional techniques.

A third modality that is gaining increasing attention as the gold standard for detecting structural lesions is CT.5 While conventional CT is also unable to assess bone marrow changes and active inflammation of the sacroiliac joint (SIJ), it provides higher spatial resolution, thinner slices and direct depiction of the cortical bone compared with standard MR. This is one of the reasons why structural lesions in MR are not included in the Assessment of Spondyloarthritis international Society (ASAS) definition of positive imaging.6 However, much knowledge has been gained since, and there is increasing evidence suggesting that MR can be used for structural assessment as well7 8—but it is still open how this impacts the diagnosis. While structural lesions might be less important in terms of classifying patients for study purposes, they are decisive for the differential diagnosis.9 10 For example, a common condition in women, osteitis condensans, shows bone marrow oedema, sclerosis and fat metaplasia in the SIJ and might be, therefore, difficult to distinguish from axSpA.11 The presence or absence of structural lesions such as erosions can have a decisive role in differentiating axSpA from non-inflammatory mechanical conditions.

In light of this complex diagnostic situation, the aim of the present study was to compare the three major modalities, XR, MR and CT of SIJ, regarding their capabilities in the diagnosis and differential diagnosis of axSpA in patients with low back pain using the final judgement of the rheumatologist as standard of reference.



The SacroIliac joint MAgnetic resonance imaging and Computed Tomography (SIMACT) study, in which patients underwent XR, MR and CT, is already well described in the literature.3 7 For the second study, patients also underwent MR and dual-energy CT of the SIJ, from which conventional CT images were reconstructed. All patients had chronic back pain and were referred for imaging with suspected or known axSpA. Patients were excluded if one of the modalities was not available.


Images were anonymised separately and read independently in the following five sessions: XR, MR and CT alone, XR plus MR and CT plus MR. Thus, every patient was presented five times to each reader either with just one modality (XR, MR and CT) or with two modalities (XR plus MR and CT plus MR). Oblique-coronal T1-weighted and short-tau inversion recovery sequences were provided in the MR datasets, and CT volumes were reconstructed in 4 and 1 mm slices in oblique-coronal orientation.

Image assessment

Three expert musculoskeletal radiologists, completely blinded to identifying information, clinical data including the clinical diagnosis and findings of the other modalities and previous imaging as well as the prevalence of axSpA within the study population, used an online electronic case report form to answer the following questions for each image dataset:

  1. Grading using the modified New York Criteria (mNYC): grade 0–4 for each SIJ.

  2. Are unequivocal structural lesions compatible with axSpA present: yes or no?

  3. Does MR fulfil the ASAS criteria for MR-positivity (MR only): yes or no?

  4. Overall impression: normal or pathologic?

  5. If pathologic: axSpA or other?

Readers were advised to give their personal expert opinion.

Statistical analysis

Descriptive statistics were computed for clinical parameters and scoring results, where agreement of at least two readers was used to report results by modality. Sensitivity, specificity and positive and negative likelihood ratios (LRs) were calculated using the expert rheumatologists’ assessment as standard of reference. Fleiss’ kappa was used for assessing the inter-rater reliability of the imaging diagnosis. The percentage of patients undergoing MR and diagnostic accuracy values were calculated for four different clinical scenarios:

  1. Clinical standard with XR followed by MR if mNYC negative.

  2. Radiography followed by MR if no SIJ shows sacroiliitis grade 3 or 4.

  3. MR only.

  4. CT followed by MR if negative.

Calculations were performed using Microsoft Excel, Graphpad Prism (version 9) and SPSS (version 27).
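The "agreement of at least two readers" rule can be illustrated with a minimal Python sketch. The readings below are hypothetical and are not study data, and the function name `consensus` is chosen for illustration only. The article does not state how a three-way disagreement was handled, so the sketch simply returns `None` in that case.

```python
from collections import Counter

def consensus(labels):
    """Return the label given by at least two of three readers, else None."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= 2 else None

# Hypothetical overall impressions of three readers for three patients.
readings = [
    ("axSpA", "axSpA", "other"),    # two readers agree on axSpA
    ("normal", "other", "normal"),  # two readers agree on normal
    ("axSpA", "other", "normal"),   # no two readers agree
]

consensus_labels = [consensus(r) for r in readings]
print(consensus_labels)
```

The consensus label per patient and modality is then what would be compared with the clinical reference diagnosis.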



We analysed a total of 163 patients (see figure 1)—89 (54.6%) with axSpA, 56 (34.4%) with degenerative or mechanical SIJ disease such as osteoarthritis or osteitis condensans ilii, and 18 (11%) with non-specific back pain unrelated to SIJ. Mean age was 38.2±10.6 years, and symptom duration 79.3±89.5 months; 82.3% had inflammatory back pain, 50% were female and 60.2% HLA-B27-positive. Patient characteristics are summarised in online supplemental 1.

Figure 1

Flow chart of patient inclusion. After excluding 19 patients with missing imaging data, a total of 163 patients were included, 89 with the final diagnosis of axSpA. The image datasets were anonymised into five different chunks: radiography, MR and CT alone and MR combined with XR, and MR combined with CT. The datasets were separately presented to the readers. axSpA, axial spondyloarthritis; XR, X-ray.

Imaging findings

Grading results based on the mNYC are provided as means of all readers in online supplemental 2. Stages of axSpA in the study population ranged from early disease to established axSpA with advanced structural damage. 83 XRs, 70 CTs and 75 MRs were scored positive for structural damage. Interestingly, the number of patients positive for structural lesions increased when scoring XR and MR (81) compared with MR alone but decreased for CT and MR (71).

Diagnostic modalities

Sensitivity, specificity and the corresponding LRs for each modality alone and their combinations are shown in figure 2 and in more detail in online supplemental 3. Symptom duration had no effect on diagnostic accuracy (see online supplemental 4).

Figure 2

Frequency of positive and negative findings in radiography (XR), CT, MRI and combinations with resulting diagnostic accuracy values. Numbers are percentages of positive imaging results in patients with and without axSpA. axSpA, axial spondyloarthritis; LR−/+, negative/positive likelihood ratio; SE, sensitivity; SP, specificity.

XR showed lower sensitivity than MR and CT and also an inferior specificity compared with MR and CT. XR+MR was similar to MR alone in terms of sensitivity and specificity, while CT+MR was superior to MR alone. Imaging examples are shown in figure 3.
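As a worked example of how these accuracy measures relate, the following Python sketch derives sensitivity, specificity and both likelihood ratios from 2×2 counts. The counts are an assumption: they are back-calculated from the percentages reported in the abstract and the group sizes (89 patients with axSpA, 74 without) and are not taken from the article's tables or supplements.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute accuracy measures from the cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # LR+ = SE / (1 - SP); LR- = (1 - SE) / SP
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
    }

# Assumed counts, back-calculated from the reported percentages.
counts = {
    "XR": dict(tp=59, fn=30, tn=50, fp=24),
    "MR": dict(tp=73, fn=16, tn=64, fp=10),
    "CT": dict(tp=68, fn=21, tn=72, fp=2),
}

metrics = {name: diagnostic_metrics(**c) for name, c in counts.items()}
for name, m in metrics.items():
    print(f"{name}: SE={m['sensitivity']:.1%} SP={m['specificity']:.1%} "
          f"LR+={m['lr_pos']:.2f} LR-={m['lr_neg']:.2f}")
```

Because the positive likelihood ratio divides sensitivity by the false-positive rate, the very small number of false-positive CT readings dominates the comparison between modalities.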

Figure 3

Imaging examples. (A) Female patient with osteitis condensans (23 years old, HLA-B27 negative, normal CRP). Radiography suggests bilateral erosions and joint space blurring (arrows) with mild sclerosis. However, cross-sectional imaging shows no erosions but some bone marrow oedema (arrows) and sclerosis (arrowheads) consistent with the final diagnosis. (B) Male patient with axSpA (53 years old, HLA-B27 positive, long history of back pain). Radiography shows only mild blurring of the joint space (arrowheads) and capsular calcification (arrow) and was deemed negative by all readers. However, MR and CT show extensive ankylosis (arrowheads) with preservation of only a small portion of the joint space, suggesting advanced axSpA. (C) Female patient with mechanical joint disease (34 years old, HLA-B27 negative, normal CRP). Radiography and T1W MR show extensive sclerosis (arrowheads) and irregularities (arrows) on the left side, MR-STIR extensive bone marrow oedema (arrowheads) and joint fluid (arrow). Both were misclassified by the readers as positive for axSpA. In this patient, only CT ruled out erosions (arrow) and confirmed the diagnosis of osteitis condensans and iliosacral complex as an anatomical variant. (D) Male patient with axSpA (40 years old, HLA-B27 positive, normal CRP). Radiography shows only minor irregularities (arrows) and was deemed negative. MR shows small cysts (arrows) and minor irregularities (arrowhead) as well as some bone marrow oedema on STIR but was judged negative by two of the three readers. Only CT shows very tiny erosions, confirming the diagnosis of axSpA (arrows). axSpA, axial spondyloarthritis; CRP, C reactive protein; STIR, short-tau inversion recovery; XR, X-ray.

Inter-rater reliability

Inter-rater agreement was moderate for XR with a Fleiss’ kappa of 0.517 (95% CI 0.428 to 0.605), substantial for MR (0.665, 95% CI 0.576 to 0.753) and almost perfect for CT (0.875, 95% CI 0.786 to 0.964). XR+MR had similar inter-rater agreement (0.662, 95% CI 0.573 to 0.751) compared with MR alone, while CT+MR showed higher reliability (0.732, 95% CI 0.643 to 0.821).
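For readers unfamiliar with the statistic, Fleiss' kappa can be sketched in a few lines of Python. The rating table below is a hypothetical toy example, not study data. The verbal bands in `interpret` follow the widely used Landis and Koch convention; that the article relies on this convention is an assumption based on its wording.

```python
def fleiss_kappa(table):
    """table[i][j] = number of raters who assigned subject i to category j."""
    n_subjects = len(table)
    n_raters = sum(table[0])
    n_categories = len(table[0])
    # Share of all assignments that fall into each category.
    p_j = [sum(row[j] for row in table) / (n_subjects * n_raters)
           for j in range(n_categories)]
    # Observed agreement among rater pairs for each subject.
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in table]
    p_bar = sum(p_i) / n_subjects    # mean observed agreement
    p_e = sum(p * p for p in p_j)    # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)

def interpret(kappa):
    """Verbal label according to the Landis and Koch bands."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial"), (1.00, "almost perfect")]:
        if kappa <= upper:
            return label
    return "almost perfect"

# Hypothetical table: 4 patients, 3 readers, categories (axSpA, other, normal).
toy = [[3, 0, 0], [2, 1, 0], [0, 3, 0], [0, 0, 3]]
kappa = fleiss_kappa(toy)
print(round(kappa, 3), interpret(kappa))
```

Applied to the reported point estimates, these bands place 0.517 in the moderate range, 0.665 in the substantial range and 0.875 in the almost perfect range.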

Clinical scenarios

The evaluation of the different scenarios is presented in figure 4. An increase of the threshold of radiographic positivity improves the specificity of diagnostic imaging but also increases the number of MRs needed for diagnosis (from 49% to 75%) and still performs inferiorly compared with MR alone. CT before MR shows (similar to CT alone) a high specificity and might be an alternative whenever MR is unavailable.

Figure 4

Clinical scenarios. (A) The current clinical standard (MR in patients with mNYC-negative XR) shows the highest sensitivity but only poor specificity. (B) XR considered positive if sacroiliitis grade 3 or 4 is present unilaterally. This increases the specificity, but MR still must be performed in nearly 75% of patients. (C) MR alone outperforms the scenarios with XR as imaging of first choice, showing better overall diagnostic accuracy. (D) CT as first-line imaging showed the best diagnostic accuracy and specificity. However, only 3% of CT-negative patients are positive when adding MR, calling into question whether the additional MRI is beneficial at all. axSpA, axial spondyloarthritis; DA, diagnostic accuracy; FN, false-negative; FP, false-positive; mNYC, modified New York Criteria; NPV, negative predictive value; PPV, positive predictive value; TN, true-negative; TP, true-positive; SE, sensitivity; SP, specificity; XR, X-ray.
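The logic of a "first-line test, then MR if negative" pathway can be sketched as follows. This is a simplified illustration under stated assumptions: the per-patient results are hypothetical, the function `run_pathway` is not from the article, and the final result is taken to be positive if either the first-line test or the follow-up MR is positive. The article does not spell out its exact computation for each scenario.

```python
def run_pathway(patients, first_line):
    """Evaluate 'first-line test, then MR only if the first test is negative'."""
    tp = fp = tn = fn = mr_needed = 0
    for p in patients:
        positive = p[first_line]
        if not positive:
            mr_needed += 1          # MR is performed only after a negative test
            positive = p["MR"]
        if p["axSpA"] and positive:
            tp += 1
        elif p["axSpA"]:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "mr_fraction": mr_needed / len(patients),
    }

# Hypothetical patients (True = positive finding or diagnosis).
patients = [
    {"axSpA": True,  "XR": True,  "CT": True,  "MR": True},
    {"axSpA": True,  "XR": False, "CT": True,  "MR": True},
    {"axSpA": True,  "XR": False, "CT": False, "MR": False},
    {"axSpA": False, "XR": True,  "CT": False, "MR": False},
    {"axSpA": False, "XR": False, "CT": False, "MR": True},
    {"axSpA": False, "XR": False, "CT": False, "MR": False},
]

xr_first = run_pathway(patients, "XR")
ct_first = run_pathway(patients, "CT")
print(xr_first)
print(ct_first)
```

Under this rule, a false-positive first-line result can never be corrected by the follow-up MR, which is why the specificity of the first test limits the specificity of the whole pathway.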


In this study, we designed a unique reading exercise asking three expert radiologists to separately review a total of 815 image datasets acquired in a mixed cohort of 163 patients with early to established axSpA, non-specific low back pain, and SIJ degeneration. Our study has two key results: first, radiography is neither sensitive (66%) nor specific (68%) nor reliable (kappa=0.52) in diagnosing axSpA and contributes little when added to MR. Second, CT is similarly sensitive (76% vs 82%) but far more specific (97% vs 87%) than standard MR and the most reliable imaging method in our analysis. When added to MR, CT improves specificity far more markedly than it reduces sensitivity. The current clinical standard can be improved by increasing the threshold for XR positivity (eg, grade 3 or 4 unilaterally) or by omitting XR completely.

Our results underline the importance of structural lesions for the differential diagnosis when axSpA is suspected. CT is certainly the gold standard for structural lesions, displaying the cortical and trabecular bone directly and benefitting from superior resolution and thinner slices compared with MR. The inferior inter-rater reliability of MR alone or in combination with CT might be attributable to the variety of findings that can be detected (eg, fatty metaplasia or bone marrow oedema) and need to be taken into account by readers. Their combinations might be non-specific or complex to interpret. Furthermore, MR is prone to artefacts in bone marrow that might mimic erosions and, thus, lead to false-positive interpretation12 while other changes such as sclerosis can mask other important lesions. Therefore, specific definitions must be established and followed when reading MR.13

Previous studies have already shown that MR is superior to radiography in detecting structural changes3 and can be improved further by using more sophisticated pulse sequences that generate images with greater tissue contrast7 8 or CT-like images.12 14 Other investigators have reported the potential of (low-dose) CT for detecting structural lesions of the SIJ15 or spine.5 Our analysis provides more data in terms of differential diagnosis and diagnostic pathways when SpA is suspected, suggesting that XR adds little once MR has been performed or is easily available, although XR might provide some additional information relevant for the differential diagnosis of back pain, for example, on hip joint disease. Furthermore, bone marrow changes in MR seem to be less specific for axSpA and their interpretation is complicated for the differential diagnosis,4 16 17 providing evidence that CT can be a reasonably sensitive and highly specific alternative to MR. Also, CT might be a useful addition if MR images are ambiguous. When only CT (with or without XR) is readily available, we would recommend adding MR only if the changes seen on the CT scan are inconclusive (ie, very early disease without clear structural changes). However, in view of its radiation exposure, we explicitly do not recommend CT rather than MR as the first-line imaging method. As low-dose CT is comparable to XR in terms of radiation exposure,3 we prefer CT over XR.

Our results contribute to the current discussion within ASAS and other groups, as to whether XR should be a first-line imaging test in suspected axSpA or be replaced by cross-sectional techniques. The authors conclude that it is advisable to avoid XR whenever MR is readily available while clinicians may fall back on XR if MR is not available. However, costs of misdiagnosis and of undertreatment or overtreatment must be included in this calculation. Further studies might address the cost-effectiveness of XR compared with cross-sectional imaging.

Limitations of our study include the use of a rough scoring system not providing details on detected lesions. Our focus here was on global scoring relevant for diagnostic decision-making. About 35% of our axSpA patients did not show characteristic or sufficient inflammatory SIJ changes to meet the ASAS definition of an active MR. Although available—yet not for all patients—we did not include dual-energy CT data, which might have added information undepictable by conventional CT18 because we deliberately focused on conventional techniques widely used in routine clinical practice. There were small differences between the two CT protocols, but they were comparable in terms of radiation exposure (low-dose protocols).

Furthermore, we only assessed imaging findings. Access to clinical information might have improved readers’ diagnostic accuracy. Thus, our approach does not fully capture clinical reality as our aim was to provide an unbiased assessment of imaging findings only. Although no information on sex was provided, complete blinding to sex is usually not possible in pelvic imaging (radiography, MR). Also, we present reading data from three expert radiologists; the performance of imaging outside specialised centres might be considerably worse. While we did not assess intrareader variability in this study, we expect variability to be low because images were assessed by radiologists with expertise in SpA. Also, we did not analyse other aspects of imaging such as radiation exposure, costs or time effectiveness. Imaging is an important part of establishing the diagnosis in suspected axSpA, and imaging findings will always have an impact on the final diagnosis made by the rheumatologist. However, the current reading exercise was unrelated to the clinical diagnostic strategy in order to rule out circularity in the approach. Further, the ‘yes’ or ‘no’ response system used in this study is closer to a classification than to a diagnostic approach in patients. The readers in our study were experienced radiologists with expertise in SpA. Therefore, results might be different when images are assessed by non-expert readers. Continuous training of radiologists and rheumatologists in the interpretation of imaging findings is the only way to improve diagnostic confidence in routine clinical practice. Finally, no follow-up data were available.

In conclusion, XR is inferior to cross-sectional imaging and should be replaced by MR or CT for differential diagnosis. While MR is the most sensitive imaging technique, it lacks specificity compared with CT. CT alone has high diagnostic accuracy despite its insensitivity to bone marrow lesions such as fatty metaplasia or osteitis. Adding CT to MR improves specificity at a minor expense of sensitivity.

Ethics statements

Ethics approval

The present analysis included patients from two prospective studies (approval by the institutional review board under EA1/086/16 and EA1/073/10). All patients gave written informed consent. Patients were not involved in the study planning.


The authors thank Bettina Herwig for language editing and reviewer 4 for the significant contribution during the review process.



  • Handling editor Josef S Smolen

  • TD and IE contributed equally.

  • Contributors TD: conception and design of the study, design of scoring system, image scoring, data evaluation, statistical calculations, article draft, critical revision of the manuscript for important intellectual content. IE: conception and design of the study, image scoring, data evaluation, critical revision of the manuscript for important intellectual content. FR: data management, critical revision of the manuscript for important intellectual content. KZ: image scoring, data evaluation, statistical calculations, critical revision of the manuscript for important intellectual content. FP: patient acquisition, data collection, critical revision of the manuscript for important intellectual content. JG: patient acquisition, data collection, critical revision of the manuscript for important intellectual content. DD: patient acquisition, data evaluation, critical revision of the manuscript for important intellectual content. RB: data evaluation, critical revision of the manuscript for important intellectual content. KGH: conception and design of the study, critical revision of the manuscript for important intellectual content. DP: conception and design of the study, statistical calculations, critical revision of the manuscript for important intellectual content.

  • Funding TD received a research grant from the Assessment of Spondyloarthritis international Society. Other than that, the authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests TD reports an ASAS research grant during the conduct of the study, personal fees from Canon MS, MSD, Roche and Novartis and an institutional grant from Canon MS outside the submitted work. DP reports grants and personal fees from AbbVie, Eli Lilly, MSD, Novartis, Pfizer and personal fees from Bristol-Myers Squibb, Roche, UCB, Biocad, GlaxoSmithKline and Gilead outside the submitted work. KGH reports personal fees from AbbVie, Novartis, Merck and Pfizer outside the submitted work. For the remaining authors none were declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.