The development of Assessment of SpondyloArthritis international Society classification criteria for axial spondyloarthritis (part II): validation and final selection
- M Rudwaleit1,
- D van der Heijde2,
- R Landewé3,
- J Listing4,
- N Akkoc5,
- J Brandt6,
- J Braun7,
- C T Chou8,
- E Collantes-Estevez9,
- M Dougados10,
- F Huang11,
- J Gu12,
- M A Khan13,
- Y Kirazli14,
- W P Maksymowych15,
- H Mielants16,
- I J Sørensen17,
- S Ozgocmen18,
- E Roussou19,
- R Valle-Oñate20,
- U Weber21,
- J Wei22,
- J Sieper1,23
- 1Rheumatology, Med Klinik I, Charité, Campus Benjamin Franklin, Berlin, Germany
- 2Leiden University Medical Center, Leiden, The Netherlands
- 3Maastricht University Medical Center, Maastricht, The Netherlands
- 4Epidemiology Unit, German Rheumatology Research Centre, Berlin, Germany
- 5Dokuz Eylul University Hospital, Izmir, Turkey
- 6Rheumatology Private Practice, Berlin, Germany
- 7Rheumazentrum Ruhrgebiet, Herne and Ruhr University, Bochum, Germany
- 8Veterans General Hospital, Taipei, Taiwan
- 9University of Córdoba, Córdoba, Spain
- 10Hospital Cochin, Paris, France
- 11Chinese PLA General Hospital, Beijing, China
- 12Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
- 13Case Western Reserve University, MetroHealth Medical Center, Cleveland, Ohio, USA
- 14University of Ege, Izmir, Turkey
- 15University of Alberta, Edmonton, Canada
- 16University Hospital, Ghent, Belgium
- 17University Hospital, Copenhagen, Denmark
- 18Firat University Hospital, Elazig, Turkey
- 19King George Hospital, London, UK
- 20University Militar Hospital, Bogotá, Colombia
- 21Balgrist University Hospital, Zurich, Switzerland
- 22Chung Shan Medical University, Taichung, Taiwan
- 23German Rheumatology Research Center, Berlin, Germany
- Dr M Rudwaleit, Charité, Universitätsmedizin Berlin, Campus Benjamin Franklin, Rheumatologie, Med Klinik I, Hindenburgdamm 30, 12203 Berlin, Germany
- Accepted 6 March 2009
- Published Online First 17 March 2009
Objective: To validate and refine two sets of candidate criteria for the classification/diagnosis of axial spondyloarthritis (SpA).
Methods: All Assessment of SpondyloArthritis international Society (ASAS) members were invited to include consecutively new patients with chronic (⩾3 months) back pain of unknown origin that began before 45 years of age. The candidate criteria were first tested in the entire cohort of 649 patients from 25 centres, and then refined in a random selection of 40% of cases and thereafter validated in the remaining 60%.
Results: Upon diagnostic work-up, axial SpA was diagnosed in 60.2% of the cohort. Of these, 70% did not fulfil modified New York criteria and, therefore, were classified as having “non-radiographic” axial SpA. Refinement of the candidate criteria resulted in new ASAS classification criteria that are defined as: the presence of sacroiliitis by radiography or by magnetic resonance imaging (MRI) plus at least one SpA feature (“imaging arm”) or the presence of HLA-B27 plus at least two SpA features (“clinical arm”). The sensitivity and specificity of the entire set of the new criteria were 82.9% and 84.4%, and for the imaging arm alone 66.2% and 97.3%, respectively. The specificity of the new criteria was much better than that of the European Spondylarthropathy Study Group criteria modified for MRI (sensitivity 85.1%, specificity 65.1%) and slightly better than that of the modified Amor criteria (sensitivity 82.9%, specificity 77.5%).
Conclusion: The new ASAS classification criteria for axial SpA can reliably classify patients for clinical studies and may help rheumatologists in clinical practice in diagnosing axial SpA in those with chronic back pain.
Trial registration number: NCT00328068.
The concept of spondyloarthritis (SpA) comprises ankylosing spondylitis (AS), psoriatic arthritis, arthritis/spondylitis with inflammatory bowel disease (IBD), and reactive arthritis.1–3 Patients with typical features of SpA that do not fulfil the criteria for one of these subtypes have also been incorporated in the SpA concept as undifferentiated SpA,4 5 which is reflected in the European Spondylarthropathy Study Group (ESSG)1 and Amor criteria.2 SpA patients can also be distinguished according to their clinical presentation as patients with predominantly peripheral SpA or with predominantly axial SpA,1 with some overlap between these two subtypes. In 2004, the Assessment of SpondyloArthritis international Society (ASAS) had decided to improve current SpA criteria especially for application in the early disease stage. As a first step the ASAS group has focused on patients with predominantly axial SpA.6 Radiographic sacroiliitis has been an essential part of the widely accepted modified New York criteria for AS.3 However, radiographic changes may reflect the consequences of inflammation (structural damage) rather than inflammation itself, which may be readily detectable by magnetic resonance imaging (MRI), often years before the appearance of radiographic sacroiliitis.7–10
Candidate criteria for axial SpA that include patients with and without definite radiographic sacroiliitis were developed first11 and, in a second step, as reported herein, validated in an independent prospective international study and further refined and re-tested, after which the most appropriate set of criteria was selected by voting among ASAS members.
All rheumatologists who are ASAS members were invited to participate in this study. For inclusion, eligible patients had to have chronic back pain (for more than 3 months) of unknown origin (no definite diagnosis) that began before 45 years of age, with or without peripheral symptoms, when they first presented for diagnostic work-up at the respective ASAS centre. To prevent selection bias, participants were instructed to include patients in a strictly consecutive manner. This could be accomplished either by including all eligible patients (without exceptions) or, alternatively, by including every first, second or third patient per day who met the inclusion criteria. Diagnostic work-up was performed after written informed consent was obtained and the results were documented in a case report form (CRF).
Clinical, laboratory and imaging data
Clinical data included gender, age, duration and age at onset of back pain. For inflammatory back pain (IBP) the following features were recorded (yes vs no): insidious onset, morning stiffness, improvement with exercise, improvement with rest, alternating buttock pain, pain at night with improvement upon getting out of bed. Based on the clinical history the local rheumatologist had to decide whether IBP was present or absent. A good response of back pain to a full dose of non-steroidal anti-inflammatory drugs (NSAID) was defined as “not anymore present” or “much better”. The presence of extraspinal manifestations (current or in the past), ie, enthesitis, peripheral arthritis, uveitis, dactylitis, psoriasis, IBD and a positive family history of SpA (AS, reactive arthritis, psoriasis, IBD, uveitis) was also documented. Schober’s test, lateral spinal flexion and chest expansion were documented, and laboratory tests included HLA-B27 and C-reactive protein (CRP).
Plain radiographs of the pelvis were taken in all patients, and sacroiliitis was graded locally for each sacroiliac joint separately (grades 0 to 4) according to the modified New York criteria.3 While MRI investigation of the sacroiliac joints was considered obligatory in the first 20 patients in each centre, MRI investigation of the spine was optional. MRI findings were documented as the presence or absence of typical signs of active inflammation. Chronic changes on MRI such as erosions or fatty degeneration were documented but not considered in the analyses, because their value has not yet been precisely defined. Finally, the local rheumatologist (ASAS member) had to make a judgement about the diagnosis (SpA or no SpA) and had to indicate the level of confidence with this judgement on a numerical rating scale from 0 (not confident at all) to 10 (very confident).
Data verification and analysis
The expert physician’s diagnosis was used as a gold standard. Before statistical data analysis, 149 CRF (22%) were randomly selected from all centres by two of us (MR and JS) for scrutiny and plausibility of the diagnosis (SpA vs no SpA). Since implausibility was identified in four cases (2.7%) only, all cases were retained in the database for further analysis.
The performance of the two sets of candidate criteria for axial SpA (fig 1)11 was analysed descriptively in terms of sensitivity and specificity using cross tables. Various definitions for IBP, ie, Calin criteria,12 Berlin criteria,13 and IBP according to experts14 were compared in the candidate criteria. The ASAS-endorsed IBP experts definition requires the presence of at least four out of the following five parameters: (1) age at onset less than 40 years; (2) insidious onset; (3) improvement with exercise; (4) no improvement with rest; (5) night pain with improvement upon getting up. The candidate criteria for axial SpA were also compared with the ESSG1 and Amor2 criteria as well as with modifications of these by adding the parameter “active sacroiliitis on MRI” to the list of ESSG parameters, of which at least one is required in addition to IBP or peripheral arthritis, and to the list of Amor criteria (MRI contributing 3 points similar to and as an alternative for radiographic sacroiliitis).
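The "at least four out of five" rule of the ASAS-endorsed IBP experts definition can be written as a simple count. The following is a minimal Python sketch; the record type `BackPainHistory` and its field names are hypothetical and chosen only for illustration, not taken from the study's case report form.

```python
from dataclasses import dataclass


@dataclass
class BackPainHistory:
    """Hypothetical record of the five IBP parameters (names are illustrative)."""
    age_at_onset_years: float
    insidious_onset: bool
    improves_with_exercise: bool
    improves_with_rest: bool
    night_pain_improves_on_getting_up: bool


def ibp_according_to_experts(history: BackPainHistory) -> bool:
    """Return True if at least four of the five parameters are present."""
    parameters = [
        history.age_at_onset_years < 40,            # (1) age at onset less than 40 years
        history.insidious_onset,                    # (2) insidious onset
        history.improves_with_exercise,             # (3) improvement with exercise
        not history.improves_with_rest,             # (4) no improvement with rest
        history.night_pain_improves_on_getting_up,  # (5) night pain improving on getting up
    ]
    return sum(parameters) >= 4


# Invented example patient: parameters (1) to (4) present, (5) absent.
example = BackPainHistory(32, True, True, False, False)
print(ibp_according_to_experts(example))
```

Note that parameter (4) is the absence of improvement with rest, so the recorded yes/no answer is negated before counting.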
In addition to testing the prespecified candidate criteria we also investigated minor modifications of the criteria. To evaluate the refined criteria the dataset was split randomly into two parts: in the first 40% of the data, refined sets of criteria were tested and validated subsequently in the remaining 60% of the dataset. Sensitivity analyses were performed for different levels of confidence with the diagnosis (axial SpA vs no SpA), for the exclusion of individual centres, the exclusion of AS patients already fulfilling the modified New York criteria and restriction to patients with MRI of the sacroiliac joints available. All data analyses were performed jointly by five ASAS members (MR, DvdH, RL, JL, JS) using SPSS 14.0 during a 2-day meeting. Multivariable logistic regression analysis was performed to identify parameters contributory to the classification (axial SpA vs no SpA).
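The cross-table analysis and the random 40%/60% split described above can be sketched as follows. This is an illustration in Python only (the study itself used SPSS 14.0); the function names, the fixed random seed and the toy records are assumptions made for the example and are not study data.

```python
import random


def sensitivity_specificity(records):
    """Compute sensitivity and specificity from a 2x2 cross table.

    records: iterable of (criteria_fulfilled, expert_diagnosis_is_spa) booleans,
    with the expert diagnosis treated as the gold standard.
    """
    tp = fn = tn = fp = 0
    for fulfilled, has_spa in records:
        if has_spa and fulfilled:
            tp += 1
        elif has_spa:
            fn += 1
        elif fulfilled:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one SpA and one non-SpA case")
    return tp / (tp + fn), tn / (tn + fp)


def split_40_60(records, seed=0):
    """Randomly split records into a 40% refinement set and a 60% validation set."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary illustrative choice
    cut = round(0.4 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Toy data (invented): 10 SpA cases, 8 of them fulfilling the criteria,
# and 10 non-SpA cases, 1 of them fulfilling the criteria.
toy = [(True, True)] * 8 + [(False, True)] * 2 + [(False, False)] * 9 + [(True, False)]
print(sensitivity_specificity(toy))
refinement, validation = split_40_60(toy)
print(len(refinement), len(validation))
```

Refining the criteria on one part and validating them on the other guards against tuning the criteria to chance features of a single dataset.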
Final selection process
All ASAS members were invited to an ASAS meeting held preceding the EULAR Conference in 2008. At this meeting, the results from this study were presented and discussed and the final set of criteria was selected by voting.
Contribution of participating centres
Twenty-five centres in 16 countries provided 661 patients; complete CRF were available in 649 patients: 348 patients from Western Europe (14 centres), 72 from Turkey (four centres), 187 from Asia (five centres), 26 from Canada (one centre) and 16 from Colombia (one centre). Eighteen centres (72%) provided at least 10 patients each and 14 provided at least 20 patients each. The completeness of clinical, laboratory and radiographic data was very good (96–100%), so that all 649 patients could be analysed; 391 of them had axial SpA (60.2%) and 258 did not (39.8%). In general, experts felt confident with their diagnosis, as indicated by levels of confidence of 6 or greater in 95% and 7 or greater in 87% (scale 0–10).
Demographic and clinical data
The characteristics of the patients are shown in table 1. As expected, the frequency of SpA features was higher in the axial SpA compared with the no SpA group. Of note, limitation of anterior (Schober’s test) or sagittal (lateral spinal flexion) lumbar spinal movement was equally frequent in the two groups (table 1).
Definite radiographic sacroiliitis (grade 2 bilateral or grade 3–4 unilateral) was present in 29.7% of axial SpA patients and 10.7% had unilateral grade 2 sacroiliitis (table 1). The duration of back pain (mean 9.4 years, SD 9.0) was significantly longer in those with definite radiographic sacroiliitis than in those without (4.7 years, SD 6.2; p<0.001), supporting the concept that it takes time to develop radiographic changes in axial SpA.5 9 10 15
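The radiographic categories used here can be expressed in terms of the two per-joint grades (0 to 4). The sketch below is in Python with hypothetical function names. It assumes that "grade 2 bilateral" means at least grade 2 on both sides, as in the modified New York criteria, and that "unilateral grade 2" means grade 2 on one side with a lower grade on the other.

```python
def _check_grade(grade: int) -> None:
    """Sacroiliac joints are graded 0 to 4 for each side separately."""
    if grade not in (0, 1, 2, 3, 4):
        raise ValueError(f"invalid sacroiliitis grade: {grade!r}")


def definite_radiographic_sacroiliitis(left_grade: int, right_grade: int) -> bool:
    """Grade >=2 bilaterally, or grade 3-4 on at least one side."""
    _check_grade(left_grade)
    _check_grade(right_grade)
    bilateral_grade_2_or_more = left_grade >= 2 and right_grade >= 2
    grade_3_or_4_on_one_side = max(left_grade, right_grade) >= 3
    return bilateral_grade_2_or_more or grade_3_or_4_on_one_side


def unilateral_grade_2_sacroiliitis(left_grade: int, right_grade: int) -> bool:
    """Grade 2 on one side and less than grade 2 on the other (assumed reading)."""
    _check_grade(left_grade)
    _check_grade(right_grade)
    lower, higher = sorted((left_grade, right_grade))
    return higher == 2 and lower < 2


print(definite_radiographic_sacroiliitis(2, 2))
print(unilateral_grade_2_sacroiliitis(2, 1))
```

Under these definitions the two categories are mutually exclusive, which matches their separate reporting above.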
MRI of the sacroiliac joints was performed in 495 (76%) patients and of the spine in 274 (42%), and those showing active inflammation are shown in table 1. Of 235 patients who had undergone MRI of both the sacroiliac joints and the spine, 130 patients had a diagnosis of axial SpA and 26.9% of them had active inflammation in the sacroiliac joints and spine, 36.2% in the sacroiliac joints only, 5.4% in the spine but not in the sacroiliac joints and the remaining 31.5% had no active inflammation on MRI.
Performance of candidate criteria for axial SpA
The sensitivity and specificity of the two sets of candidate criteria using various definitions of IBP are shown in table 2.
Overall, there were no major differences between candidate criteria sets 1 and 2. Interestingly, the choice of IBP definition in set 1 only marginally influenced the overall performance of the criteria. IBP according to experts and the Berlin definition of IBP performed similarly well in the candidate criteria, and both were superior to the Calin criteria in terms of specificity. For subsequent analyses we decided to use the IBP according to experts definition. In comparison with the ESSG and Amor criteria and with their modifications (incorporating active sacroiliitis on MRI), the candidate criteria set 1 had a better sensitivity, but its specificity was not optimal, nor did it improve considerably if IBP was made obligatory for the clinical arm (as required in criteria set 2). Performance of these candidate criteria did not change on restricting the analyses to patients who were diagnosed with a high level of confidence (⩾7 or ⩾8), or by excluding patients with definite radiographic sacroiliitis (AS) or by excluding patients from particular centres (data not shown).
Refinement of candidate classification criteria
First, the specificity of the candidate criteria set 1 (with IBP according to experts as the definition of IBP) was analysed separately for its “imaging arm” (sacroiliitis on radiographs or MRI plus one or more SpA feature) and for its “clinical arm” (three or more SpA features). Whereas the specificity of the imaging arm was excellent (97.3%), that of the clinical arm was 76.7% and, therefore, accounted for the moderate specificity (74.0%) of the entire set of criteria.
To refine the clinical arm we looked for SpA features that could increase specificity without losing too much sensitivity. HLA-B27 was a candidate because of its high sensitivity and specificity and its good face validity for axial SpA. Furthermore, unilateral radiographic sacroiliitis and to a lesser extent “a good response to NSAID” appeared to discriminate well between axial SpA and no SpA (table 1). The NSAID response was also contributory to the disease classification in multivariable logistic regression analyses (table 3), but from a clinical point of view it was decided that poor responders to NSAID should not be excluded from being classified through the clinical arm. Other parameters that were contributory in the logistic regression analysis were not discriminatory, such as anterior or lateral lumbar flexion (table 1).
Therefore, various sets of refined candidate criteria with HLA-B27 as an obligatory parameter in the clinical arm were generated: HLA-B27 plus two or more other SpA features (set 3a), HLA-B27 plus one or more other SpA feature (set 3b) and HLA-B27 or unilateral radiographic sacroiliitis plus two or more other SpA features (set 4). These refined sets of criteria were first evaluated in a random selection of 40% of the cases and thereafter validated in the remaining 60% of cases.
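The refined sets differ only in their clinical arm, which the following Python sketch makes explicit. The function names and the string labels for the variants are illustrative. Which items count as SpA features is defined in the paper's figure and table of definitions, so the caller is assumed to supply the counts.

```python
def imaging_arm(sacroiliitis_on_imaging: bool, n_spa_features: int) -> bool:
    """Sacroiliitis on radiographs or MRI plus one or more SpA feature."""
    return sacroiliitis_on_imaging and n_spa_features >= 1


def clinical_arm(
    variant: str,
    hla_b27: bool,
    n_other_spa_features: int,
    unilateral_xray_sacroiliitis: bool = False,
) -> bool:
    """Clinical arm of the refined candidate criteria sets 3a, 3b and 4.

    n_other_spa_features counts SpA features other than the obligatory parameter.
    """
    if variant == "3a":  # HLA-B27 plus two or more other SpA features
        return hla_b27 and n_other_spa_features >= 2
    if variant == "3b":  # HLA-B27 plus one or more other SpA feature
        return hla_b27 and n_other_spa_features >= 1
    if variant == "4":   # HLA-B27 or unilateral radiographic sacroiliitis, plus two or more
        return (hla_b27 or unilateral_xray_sacroiliitis) and n_other_spa_features >= 2
    raise ValueError(f"unknown variant: {variant!r}")


# Invented example: HLA-B27 negative, unilateral radiographic sacroiliitis,
# two other SpA features. The clinical arm is met in set 4 but not in set 3a.
print(clinical_arm("3a", False, 2, unilateral_xray_sacroiliitis=True))
print(clinical_arm("4", False, 2, unilateral_xray_sacroiliitis=True))
```

A patient is classified as having axial SpA if either the imaging arm or the clinical arm of the chosen set is fulfilled.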
The sensitivity and specificity of the various sets of refined candidate criteria (sets 3a,b and 4) in comparison with the original candidate criteria (sets 1 and 2) are shown in table 4. The candidate criteria set 3a (HLA-B27 plus two or more other SpA features) and set 4 (HLA-B27 or radiographic sacroiliitis grade 2 plus two or more other SpA features) performed best and had slightly better specificity than the modified Amor criteria. There were no differences in the performance between the various criteria when the analysis was restricted to patients with available MRI of the sacroiliac joints, or when restricted to patients with higher levels of diagnostic confidence (data not shown).
Final selection of new classification criteria
The results of these analyses were discussed at an ASAS meeting preceding EULAR 2008. In a formal voting process, all participants (100%) voted for either set 3a or set 4 as the new ASAS classification criteria, and not for candidate criteria set 1 or set 2, the modified ESSG criteria or the modified Amor criteria. In a final voting step, the majority of ASAS members voted for set 3a (fig 2). Definitions for all parameters of the classification criteria are listed in table 5.
Performance of the new criteria for axial SpA as diagnostic criteria
These criteria had 82.9% sensitivity and 84.4% specificity when evaluated in the entire dataset of 649 patients (positive likelihood ratio (LR+) 5.3, negative likelihood ratio (LR−) 0.20). In this specific setting of rheumatology referral centres, the post-test probability of SpA increased from 60.2% (prevalence of axial SpA equals pretest probability) to 89.0% after fulfilment of these criteria and decreased to 23.5% if these criteria were not fulfilled. If the imaging arm (sacroiliitis) alone was considered (sensitivity 66.2%, specificity 97.3%) the post-test probability increased from 60.2% to 97.5% in case of fulfilment of the criteria (LR+ 24.5), but decreased to only 34.5% if the criteria were not fulfilled (LR− 0.35).
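The likelihood ratios and post-test probabilities quoted above follow from sensitivity, specificity and pretest probability through the odds form of Bayes' theorem. A minimal Python sketch is given below; the function names are illustrative. Because it starts from the rounded percentages in the text rather than the underlying patient counts, its results may differ from the reported figures in the last digit.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


def post_test_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


prevalence = 0.602  # pretest probability of axial SpA in this cohort

for label, sens, spec in [("entire criteria", 0.829, 0.844), ("imaging arm", 0.662, 0.973)]:
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    fulfilled = post_test_probability(prevalence, lr_pos)
    not_fulfilled = post_test_probability(prevalence, lr_neg)
    print(f"{label}: LR+ {lr_pos:.1f}, LR- {lr_neg:.2f}, "
          f"post-test {fulfilled:.1%} if fulfilled, {not_fulfilled:.1%} if not")
```

The same two functions can be reused with a different `prevalence` to explore how the criteria would behave in settings where axial SpA is less common than in this referral cohort.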
In a series of accompanying papers published in the Annals of the Rheumatic Diseases11 14 and this paper, we have described under the auspices of ASAS the development, validation and formal assessment by the ASAS community of new classification criteria for axial SpA. These criteria encompass patients with established radiographic sacroiliitis (that will be classified as AS) and also patients who have not (yet) developed radiographic sacroiliitis and, therefore, are referred to as non-radiographic axial SpA. In this latter group of patients there is in fact an unmet need because the burden of disease can be substantial16 and biological agents may decrease signs and symptoms.17 18 These new criteria will provide a new standard for classifying non-radiographic axial SpA, which, importantly, will facilitate the conduct of clinical trials and observational studies in this group.6 As such, these criteria might serve as a basis for an extension of the use of tumour necrosis factor blockers to the non-radiographic stage of axial SpA.
A two-step process was applied to arrive at the new classification criteria.19 20 First, by means of paper patients with possible non-radiographic axial SpA, we integrated the expert opinion of 20 ASAS experts and we constructed candidate classification criteria for axial SpA.11 In a second step, ASAS has validated and refined the candidate criteria in a prospective, international study of more than 600 patients with chronic back pain of unknown origin. Finally, at an international meeting the ASAS group decided in a formal voting process about the new classification criteria. The 10-item list of SpA features (fig 2) appears to be rather long but, on the other hand, is comprehensive. Although the performance of the criteria in this study was not jeopardised by the exclusion of parameters such as CRP, dactylitis or IBD, the ASAS group felt that these parameters should be maintained in the list because they represent recognisable domains of the SpA concept. Of note, spinal mobility measures were not included in the new criteria as they did not differentiate between axial SpA and no SpA in this group of patients with relatively early disease (table 1).
In this study, a high proportion (60%) of back pain patients were diagnosed with axial SpA, which seems to be a representative prevalence figure among patients who are referred to rheumatologists because of some suspicion of SpA by the primary care physician. This referral bias probably explains the relatively high frequency of some of the typical SpA features such as HLA-B27 or IBP in the no SpA group in our study. We have tried to avoid any further selection by instructing all participating centres to include patients in a strictly consecutive manner. Moreover, only undiagnosed patients with chronic back pain could be included. The inclusion criteria and the study design we have applied are thus close to true diagnostic studies, with the expert opinion as a gold standard for the diagnosis. However, it is important to realise that the new ASAS criteria should be used primarily as classification criteria. The new criteria will also perform quite well as diagnostic criteria if applied by rheumatologists, and if a prevalence of axial SpA of 60% in the rheumatology setting is assumed as was the case in our study (post-test probability 89% following fulfilment of the criteria and 23% if the criteria are not fulfilled). It remains to be seen how the new criteria perform in settings with a substantially lower prevalence of SpA (eg, <10%), for which more flexible diagnostic approaches have been proposed.6 21
The new classification criteria performed clearly better than the ESSG and the Amor criteria, which were developed in the pre-MRI era, which may suggest that the “gestalt” of SpA has changed over the years by the introduction of MRI of the axial skeleton. Nonetheless, the Amor criteria also performed quite well when modified by adding MRI and might be used in certain settings. However, the ASAS group voted uniformly for the new ASAS classification criteria for axial SpA. In the new criteria, active sacroiliitis on MRI as one of the imaging parameters requires the clear-cut presence of active inflammatory lesions, which are typically seen in sacroiliitis associated with SpA. A more detailed definition of such lesions will be provided elsewhere (Rudwaleit et al, unpublished).
In the absence of a true and unequivocal gold standard for the diagnosis of SpA, we used the local expert’s opinion as a gold standard, although it might have been biased due to existing criteria sets, or by new diagnostic developments such as MRI and by discussions with other experts. The new criteria thus need regular updating, and the ASAS community will follow up the patients in this validation cohort in order to challenge the accuracy of the current diagnosis. At this time, however, the proposed set of criteria offers good performance and good face validity for axial SpA and was unequivocally selected by the ASAS group.
Competing interests: None.
Funding: This study was supported financially by the Assessment of SpondyloArthritis international Society (ASAS).
The following ASAS centres have included at least one patient: N Akkoc, Izmir, Turkey; J Brandt, Berlin, Germany; F Heldmann, J Braun, Herne, Germany; E Collantes-Estevez, Córdoba, Spain; C-T Chou, Taipei, Taiwan; J Darmawan, Semarang, Indonesia; C Hudry, M Dougados, Paris, France; T Duruöz, Manisa, Turkey; O Fitzgerald, Dublin, Ireland; J Gu, Guangzhou, China; F Huang, Beijing, China; Y Kirazli, Izmir, Turkey; R Landewé, D van der Heijde, Maastricht, The Netherlands; A Linnssen, Ijmuiden, The Netherlands; W Maksymowych, Edmonton, Canada; M Matucci-Cerinic, Firenze, Italy; F van den Bosch, H Mielants, Ghent, Belgium; M Ostergaard, Hvidovre, Denmark; S Ozgocmen, Elazig, Turkey; M Rudwaleit, J Sieper, Berlin, Germany; E Roussou, London, UK; C Naclerio, S Scarpato, Scafati, Italy; IJ Sørensen, Copenhagen, Denmark; R Valle-Oñate, Bogotá, Colombia; U Weber, Balgrist, Switzerland; J Wei, Taichung, Taiwan.
Ethics approval: Ethics approval for the conduct of the study was obtained from the local ethics committee in each centre.
Patient consent: Obtained.