Objective: Inflammatory back pain (IBP) is an important clinical symptom in patients with axial spondyloarthritis (SpA), and relevant for classification and diagnosis. In the present report, a new approach for the development of IBP classification criteria is discussed.
Methods: Rheumatologists (n = 13) who are experts in SpA took part in a 2-day international workshop to investigate 20 patients with back pain and possible SpA. Each expert documented the presence/absence of clinical parameters typical for IBP, and judged whether IBP was considered present or absent based on the received information. This expert judgement was used as the dependent variable in a logistic regression analysis in order to identify those individual IBP parameters that contributed best to a diagnosis of IBP. The new set of IBP criteria was validated in a separate cohort of patients (n = 648).
Results: Five parameters best explained IBP according to the experts. These were: (1) improvement with exercise (odds ratio (OR) 23.1); (2) pain at night (OR 20.4); (3) insidious onset (OR 12.7); (4) age at onset <40 years (OR 9.9); and (5) no improvement with rest (OR 7.7). If at least four out of these five parameters were fulfilled, the criteria had a sensitivity of 77.0% and specificity of 91.7% in the patients participating in the workshop, and 79.6% and 72.4%, respectively, in the validation cohort.
Conclusion: This new approach with real patients defines a set of IBP criteria using overall expert judgement on IBP as the gold standard. The IBP experts’ criteria are robust, easy to apply and have good face validity.
Chronic back pain is the leading symptom in patients with axial spondyloarthritis (SpA) including patients with ankylosing spondylitis (AS).1 2 In order to clinically differentiate back pain caused by inflammation of the sacroiliac joints/lower spine from other causes, attempts have been made to describe and to define clinical criteria for this specific kind of back pain, which has been termed inflammatory back pain (IBP).3–6 The following clinical features have been proposed to be used in different sets of criteria: (i) “back pain starting at an age of less than 40 or 45 years”, because the disease usually starts in the third decade of life and an onset after 45 years of age is exceptional;7 (ii) “chronic back pain for longer than 3 months”, because back pain due to non-inflammatory causes is very commonly acute in onset and often self-limiting;8 (iii) “insidious onset”, because mechanically caused back pain including disc herniation or sciatica, for example, is frequently of acute onset; (iv) “morning stiffness” (normally of the low back); and (v) “improvement with exercise, but not with rest” (either as single items or in combinations as one item). Morning stiffness of the affected musculoskeletal sites and improvement with exercise are symptoms indicating musculoskeletal inflammation, which are also characteristic for other inflammatory rheumatic diseases such as rheumatoid arthritis and polymyalgia rheumatica, although at different locations in these disease entities; (vi) “pain at night (with improvement upon getting up)” results from worsening of symptoms when patients are at rest, a concept similar to that of morning stiffness; and (vii) “alternating buttock pain”, which likely indicates active inflammation of the sacroiliac joints fluctuating from one side to the other, but which has never been defined according to temporal characteristics.
The Calin criteria4 are the first and most frequently used set of criteria for IBP, and they have been utilised in the European Spondyloarthropathy Study Group (ESSG) criteria.9 Modified definitions of IBP have been used in the modified New York criteria for AS,10 and also in the Amor criteria for SpA.11 More recently, a new set of criteria for IBP has been proposed (Berlin criteria), based on a study in 101 patients with AS and 112 control patients with mechanical low back pain.6 In general, criteria for IBP were either derived from studies comparing patients with AS with patients having back pain of other (most often mechanical) origin, or from the experience of single experts. Although IBP is regarded as a typical clinical symptom for axial SpA, its sensitivity and specificity with respect to diagnosis of axial SpA do not exceed 80%.4–6 12–15 Notwithstanding these limitations, the symptom of IBP has been successfully used as a screening parameter for axial SpA in young patients with chronic low back pain seen by primary care physicians or orthopaedists.16
The Assessment of Spondyloarthritis International Society (ASAS) started an international project in 2004 to develop new classification criteria for axial and peripheral SpA. As part of this project, international SpA experts met in Berlin in a real patient exercise to personally investigate patients with possible SpA (via medical history and physical examination). This exercise was undertaken because it was felt that the patient–investigator interaction was of pivotal importance since interpretation of information elicited from patients is not only content dependent but also investigator dependent. The experts had to decide whether they considered the patients to be suffering from inflammatory back pain or not, without knowledge of the final diagnosis (SpA or no SpA). This set-up gave us the unique opportunity to use the judgement of the expert(s) on the presence or absence of IBP (and not on the diagnosis of SpA) as the gold standard, and to statistically derive an optimal set of IBP classification criteria referred to as “IBP according to experts”.
In all, 13 international rheumatologists (all listed as coauthors of this manuscript) from 9 countries in Europe and North America who are considered experts in AS/SpA, are full members of ASAS and who participate in the development of new ASAS classification criteria for axial SpA met during a 2-day workshop in Berlin. They obtained clinical history and performed physical examination of 20 patients with chronic back pain and suspected axial SpA. The experts had to make a decision whether they considered the patients to be suffering from IBP or not, without knowledge about the final diagnosis (SpA or no SpA). All patients had presented previously to the Rheumatology outpatient clinic of the Charité (Campus Benjamin Franklin, Berlin, Germany) for diagnostic work-up and were selected by the local organising rheumatologists (MR, JS) who themselves did not assess patients in this workshop. The 13 ASAS experts were divided into 4 groups: 3 groups of 3 experts and 1 group of 4 experts. During the 2 days each group of experts interviewed and examined 10 patients with help of an interpreter (6 patients on day 1, 4 patients on day 2; 1 expert participated only on day 2, and, therefore examined only 4 patients). Each expert independently documented the presence or absence of a given clinical symptom, a manifestation and a laboratory or imaging finding. Crosstalk between experts and discussion or interpretation of findings within the group of experts was not allowed.
In addition to gender, age and duration of back pain, the following clinical history items that are related to IBP were assessed in a yes/no fashion: (1) age at onset <40 years, (2) duration of back pain >3 months, (3) insidious onset, (4) morning stiffness of the back, (5) improvement with exercise, (6) improvement with rest, (7) alternating buttock pain and (8) pain at night with improvement upon getting out of bed. In addition, each ASAS expert had to judge whether IBP was present or absent in a given patient after taking the clinical history.
Information on other clinical, laboratory and imaging features was also collected by the experts as part of the project to develop new classification criteria for SpA (data not presented here). Importantly, IBP features including the overall judgement on the presence of IBP were documented prior to the assessment of other manifestations including the physical examination, and prior to looking at radiographs and MRIs, so that the expert IBP judgement was based solely on the set of eight IBP questions and not on diagnosis, thereby reducing possible bias.
All patients had chronic back pain of unknown origin and, in the opinion of the local rheumatologists, had clinical features compatible with spondyloarthritis. According to the assessments and judgements of the local rheumatologists 16/20 patients fulfilled the ESSG classification criteria,9 and 8/20 had 6 or more (range 7–12) Amor points (reflecting definite SpA) while a further 7/20 patients had 5 Amor points (reflecting probable SpA).11 Four patients did not fulfil the ESSG or the Amor (6 points) criteria. Further demographic and clinical data are shown in table 1. The study was approved by the local ethical committee and informed consent was obtained from all patients.
The “IBP according to experts” criteria, which evolved from this workshop, were validated in the international ASAS study on new classification criteria for axial SpA. Similar to the expert meeting in Berlin, individual IBP parameters as well as the overall judgement on the presence of IBP were assessed by the local rheumatologist in patients with chronic back pain of unclear origin and onset of back pain <45 years of age (n = 648, mean age 33.6 years, male gender 44.5%, diagnosis of axial SpA 60.2%).17
In total, 124 clinical judgements on IBP were available on the 20 patients from 13 experts (12 experts evaluated 10 patients each, 1 expert 4 patients only). Nine IBP judgements were missing, and six IBP judgements were excluded from further analysis because the expert could not decide on the presence or absence of IBP. The remaining 109 expert judgements on IBP were used for further analysis. For each patient and each IBP parameter there were between 4 and 7 judgements. Concordance among experts regarding individual IBP items was calculated as follows: first, concordance for a given IBP parameter was calculated for each patient (expressed as 0–1). Second, the concordance rates of all patients assessed for a given IBP parameter were summed and then divided by the number of patients (n = 20). The χ2 test was used for comparison of proportions. Stepwise forward and backward logistic regression analysis was applied to identify IBP parameters that independently contribute to IBP, with the overall expert judgement on IBP (present or absent) as binary outcome. Since the 109 judgements were made by 13 experts and based on 20 patients, they cannot be considered independent judgements. We therefore also applied a multilevel approach adjusting for patient dependence (level 1) and expert dependence (level 2), using generalised estimating equations (GEE) for binomial outcomes (conditional logistic regression analysis). In that analysis, IBP judgement was the dependent variable, and patient and expert were within-subject factors.
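The two-step concordance calculation described above can be sketched as follows. The text does not state how the per-patient concordance (0–1) was derived; this sketch assumes it is the share of experts who gave the most common answer for that patient, which is an assumption and not necessarily the authors' formula. The function names and the toy ratings are hypothetical and are not study data.

```python
from collections import Counter

def patient_concordance(ratings):
    """Share of raters giving the most common answer (assumed definition)."""
    counts = Counter(ratings)
    return max(counts.values()) / len(ratings)

def item_concordance(ratings_by_patient):
    """Step 2 of the text: sum the per-patient values, divide by number of patients."""
    per_patient = [patient_concordance(r) for r in ratings_by_patient.values()]
    return sum(per_patient) / len(per_patient)

# Hypothetical yes/no ratings of one IBP item by the experts who saw each patient.
toy_ratings = {
    "patient_a": [True, True, True, True],           # 4 of 4 agree
    "patient_b": [True, True, False, True],          # 3 of 4 agree
    "patient_c": [False, False, True, True, False],  # 3 of 5 agree
}
overall = item_concordance(toy_ratings)  # mean of 1.0, 0.75 and 0.6

# Bookkeeping of judgements as reported in the text.
expected_judgements = 12 * 10 + 1 * 4              # 12 experts x 10 patients, 1 expert x 4
analysed_judgements = expected_judgements - 9 - 6  # minus missing and undecided
```

Under this assumed definition a yes/no item cannot score below 0.5 for any patient, so it should be read as one plausible interpretation of the 0–1 scale rather than a reconstruction of the published values.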
IBP was considered to be present in 61 of 109 (56%) expert judgements. The frequencies of individual IBP parameters in patients considered to have IBP and patients considered not to have IBP are shown in fig 1. The concordance among experts regarding IBP items was fairly high: 0.94 (age of onset), 0.95 (duration of back pain >3 months), 0.83 (insidious onset), 0.84 (pain at night), 0.89 (morning stiffness), 0.77 (improvement with exercise), 0.72 (no improvement with rest) and 0.86 (alternating buttock pain). The concordance rate regarding the global judgement on presence or absence of IBP was 0.83.
Logistic regression analysis revealed the following five parameters to be independently contributory to IBP: (1) improvement with exercise (odds ratio (OR) 23.1, 95% CI 3.5 to 154.4; p = 0.001), (2) pain at night (OR 20.4, 95% CI 3.5 to 118.8; p = 0.001), (3) insidious onset (OR 12.7, 95% CI 2.9 to 56.4; p = 0.001), (4) age at onset <40 years (OR 9.9, 95% CI 2.1 to 47.1; p = 0.004) and (5) no improvement with rest (OR 7.7, 95% CI 1.8 to 33.3; p = 0.006). Three parameters were not independently contributory to IBP: duration of back pain >3 months (present in 96.6% vs 91.7% of IBP vs no IBP), alternating buttock pain (present in 41.1% vs 10.9%) and morning stiffness (present in 78.7% vs 41.7%). GEE analysis, which adjusts for patient dependency and expert dependency, revealed the same five parameters and gave almost identical odds ratios (data not shown). Table 2 shows the five parameters that independently contribute to IBP according to the ASAS experts, and which form the new set of “IBP according to experts” criteria.
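As an illustration, the five-item set can be applied as a simple counting rule. This is a minimal sketch: the dictionary keys and function names are hypothetical, and the default cut-off of four items is taken from the threshold reported in the abstract.

```python
# The five items of the "IBP according to experts" criteria (key names are hypothetical).
IBP_EXPERT_ITEMS = (
    "improvement_with_exercise",
    "pain_at_night",
    "insidious_onset",
    "age_at_onset_under_40",
    "no_improvement_with_rest",
)

def count_items(answers):
    """Number of the five items answered 'yes'; missing items count as 'no'."""
    return sum(1 for item in IBP_EXPERT_ITEMS if answers.get(item, False))

def fulfils_ibp_expert_criteria(answers, threshold=4):
    """True if at least `threshold` of the five items are present."""
    return count_items(answers) >= threshold

# Hypothetical patient history (not study data): four of five items present.
example = {
    "improvement_with_exercise": True,
    "pain_at_night": True,
    "insidious_onset": True,
    "age_at_onset_under_40": True,
    "no_improvement_with_rest": False,
}
is_ibp = fulfils_ibp_expert_criteria(example)
```

Treating an unanswered item as absent is a design choice of this sketch; the text does not say how missing single items were handled.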
The resulting sensitivities and specificities if ⩾3/5, ⩾4/5, or all five parameters of the IBP expert criteria were present are shown in table 3. The best trade-off between sensitivity and specificity was found if ⩾4/5 were fulfilled. We also compared in the workshop patients the IBP expert criteria with the Calin criteria4 and with the IBP criteria by Rudwaleit et al,6 using again the expert opinion on presence or absence of IBP, and not the diagnosis, as gold standard (table 3).
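The comparison of cut-offs can be sketched as below, with the expert judgement on IBP (not the diagnosis) as the reference. The item counts and judgements are hypothetical toy values chosen only to show the calculation; they do not reproduce table 3.

```python
def sensitivity_specificity(item_counts, expert_ibp, threshold):
    """Sensitivity and specificity of 'at least `threshold` items present',
    using the expert judgement on IBP as the gold standard.
    Assumes at least one IBP and one non-IBP judgement (no zero division)."""
    tp = fn = tn = fp = 0
    for n_items, ibp in zip(item_counts, expert_ibp):
        positive = n_items >= threshold
        if ibp and positive:
            tp += 1
        elif ibp:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical data: number of the five items present, and expert judgement on IBP.
item_counts = [5, 4, 4, 3, 2, 4, 3, 1, 0, 2]
expert_ibp = [True, True, True, True, True, False, False, False, False, False]

results = {t: sensitivity_specificity(item_counts, expert_ibp, t) for t in (3, 4, 5)}
```

Raising the cut-off can only lower sensitivity and raise specificity, which is why an intermediate threshold tends to give the most balanced trade-off.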
The new “IBP according to experts” definition was validated in the international ASAS study on new classification criteria for axial SpA (table 3). The sensitivity of the “IBP according to experts” criteria (at least four of the five present) was similar in the validation study as compared to the workshop patients (79.6% vs 77.0%), whereas the specificity was somewhat lower (72.4% vs 91.7%). The performance of the Calin criteria and of the Berlin criteria in the validation study is also shown in table 3.
An unprecedented approach for the development of an IBP definition has been used in this study: firstly, a group of internationally recognised experts with a longstanding reputation in the field of SpA and AS met in Berlin and personally investigated patients with a possible diagnosis of SpA; and secondly, the single expert’s judgement on whether IBP was present or absent in a given patient was used as the gold standard, not the final diagnosis of SpA. This latter aspect importantly adds to the face validity of the ASAS experts’ IBP criteria proposed here.
Although for practical reasons the number of experts and the number of patients was limited in the workshop, 109 judgements on IBP were analysed. The high concordance rate of the experts of between 0.72 and 0.95 for single IBP parameters indicates that communication with the patients was good and any language barrier (which was solved by independent non-medical interpreters) was not a problem in this exercise.
The new criteria for IBP presented here do not reveal major differences in comparison to other established IBP criteria. This is not surprising because existing IBP criteria have been applied all over the world in daily clinical practice for many years and some of the participating experts have been involved in the development of the other criteria. In fact, the new criteria represent some sort of synthesis of existing criteria. The items “disease onset at an age <40 years”, “insidious onset” and “improvement with exercise” also represent three of the five Calin criteria.4 Interestingly, “morning stiffness” was substituted by “pain at night”, which describes a somewhat similar domain. As can be seen from fig 1, “morning stiffness” and “pain at night” were similar in regard to differentiation of IBP from no IBP, though “pain at night” performed better in the logistic regression analysis. Morning stiffness is a rather frequent complaint in patients with any kind of back pain, especially when the duration of the morning stiffness is not quantified. The value of morning stiffness as an IBP parameter seems to be better when there is a differentiation between <30 min and >30 min, as was recently proposed by the Berlin criteria;6 however, morning stiffness was not quantified in this workshop.
The item “no improvement of back pain with rest” appeared to be important to the experts and became part of the new IBP criteria. In the modified New York criteria for AS,10 IBP was defined in one single item as morning stiffness, improvement with exercise but not by rest and symptom duration >3 months. Additionally, in the Berlin criteria for IBP “improvement with exercise, but not with rest” was used as one conditional item.6 Since “improvement with exercise” and “no improvement with rest” as single items were independently contributory to IBP according to our analysis, we decided to use them independently in the new criteria as well. “Alternating buttock pain”, which is part of the Berlin criteria,6 was not independently contributory to IBP and, therefore, was not included in the new expert criteria. Although “alternating buttock pain” differentiates well between IBP and no IBP, a limitation of this item seems to be its rather low prevalence of around 40%, which was also the case in the previous study,6 and which makes this item of limited value in any criteria set.
The duration of “back pain >3 months”, an item from the Calin criteria, did not differentiate between IBP and no IBP (fig 1). However, this was due to the selection of patients who were chosen because of chronic low back pain. Thus, nearly all patients with and without a final diagnosis of SpA had back pain for longer than 3 months. In general, however, symptom duration >3 months remains an important entry parameter in less selected back pain patients before considering SpA as a possible diagnosis and before assessing IBP.13 Focussing on patients with chronic back pain (>3 months) and young onset in the application of IBP criteria has also been applied in other studies.6 16 Thus, the new IBP criteria are applicable in patients with chronic low back pain (>3 months) and not necessarily in patients with acute low back pain.
Using the expert judgement on IBP as the gold standard, the Berlin criteria had a worse specificity (62.5%) than the Calin criteria (73%) in the workshop patients, which may be explained by the fact that two of the four items of the Berlin criteria could not be accurately assessed in this exercise. These two items were (a) morning stiffness >30 min (in the present study morning stiffness was assessed only as yes or no) and (b) pain at the second half of the night only. In the present study only pain at night with improvement upon getting up was assessed, without differentiation between the first and the second half of the night.
The validation study on ASAS candidate classification criteria for axial SpA17 gave us the opportunity to validate the new “IBP according to experts” criteria. In this large cohort of 648 patients with chronic back pain of unclear origin and age at onset of back pain <45 years, the “IBP according to experts” criteria (at least 4/5 present) showed a similar sensitivity of 79.6% (against IBP as gold standard, rather than diagnosis) but a lower specificity of 72.4%, which was still reasonably good. In this validation study the Calin criteria had a higher sensitivity but a lower specificity, whereas the Berlin criteria had a lower sensitivity but a better specificity than the “IBP according to experts” criteria. Notwithstanding small differences between various sets of criteria, the “IBP according to experts” criteria demonstrated a well balanced trade-off between sensitivity and specificity in both the workshop patients and the validation cohort.
The approach we chose took advantage of the large experience of the participating experts in SpA in daily clinical practice and in the design and conduct of clinical studies. The resulting “IBP experts” criteria are easy to apply and have strong face validity. Moreover, they have proven robust in the ongoing ASAS international classification study on axial SpA in which 649 patients with back pain with and without a diagnosis of SpA have already been included, by far the largest number of patients included in any study on IBP. Whether these criteria also perform well in primary care has to be tested.
We would like to acknowledge the following rheumatologists who contributed to the ASAS study on classification criteria on axial SpA but did not participate in the workshop: Nurullah Akkoc, Izmir, Turkey; Jürgen Braun, Herne, Germany; Chung-Tei Chou, Taipei, Taiwan; John Darmawan, Semarang, Indonesia; Tuncay Duruöz, Manisa, Turkey; Oliver Fitzgerald, Dublin, Ireland; Jieruo Gu, Guangzhou, China; Feng Huang, Beijing, China; Yesim Kirazli, Izmir, Turkey; Annelies Linssen, Ijmuiden, The Netherlands; Marco Matucci-Cerinic, Florence, Italy; Walter Maksymowych, Edmonton, Canada; Mikkel Ostergaard, Hvidovre, Denmark; Salih Ozgocmen, Elazig, Turkey; Euthalia Roussou, London, UK; Salvatore Scarpato, Scafati, Italy; Rafael Valle Oñate, Bogotá, Colombia; Ulrich Weber, Zürich, Switzerland; James Wei, Taichung, Taiwan.
Competing interests: None declared.
Funding: The ASAS workshop in Berlin as well as the international study on new classification criteria for axial SpA are official ASAS scientific projects, financially supported by ASAS.
Ethics approval: The study was approved by the local ethical committee and informed consent was obtained from all patients.