Abstract
Background The Ankylosing Spondylitis Disease Activity Score (ASDAS) is a new composite index to assess disease activity in ankylosing spondylitis (AS). It fulfils important aspects of truth, feasibility and discrimination. Criteria for disease activity states and improvement scores are important for use in clinical practice, observational studies and clinical trials and so far have not been developed for the ASDAS.
Objective To determine clinically relevant cut-off values for disease activity states and improvement scores using the ASDAS.
Methods For the selection of cut-offs, data from the Norwegian disease modifying antirheumatic drug (NOR-DMARD) registry, a cohort of patients with AS starting conventional or biological DMARDs, were used. Receiver operating characteristic analysis against several external criteria was performed, and several approaches were used to determine the optimal cut-offs. The final choice was made on clinical and statistical grounds, after debate and voting by Assessment of SpondyloArthritis international Society members. Crossvalidation was performed in NOR-DMARD and in Ankylosing Spondylitis Study for the Evaluation of Recombinant Infliximab Therapy, a database of patients with AS participating in a randomised placebo-controlled trial with a tumour necrosis factor blocker.
Results Four disease activity states were chosen by consensus: inactive disease, moderate, high and very high disease activity. The three cut-offs selected to separate these states were: 1.3, 2.1 and 3.5 units. Selected cut-offs for improvement were: change ≥1.1 units for clinically important improvement and change ≥2.0 units for major improvement. Results of the crossvalidation strongly supported the cut-offs.
Conclusions Cut-off values for disease activity states and improvement using the ASDAS have been developed. They proved to have external validity and a good performance compared to existing criteria.
Introduction
Ankylosing spondylitis (AS) is a chronic inflammatory rheumatic disease that affects the axial skeleton. It is characterised by inflammatory back pain, bony fusion of the spine, decreased mobility, functional impairment and decreased quality of life. Other clinical features of AS include asymmetric peripheral oligoarthritis, enthesitis, fatigue and specific organ involvement such as anterior uveitis, psoriasis and chronic inflammatory bowel disease.1
The concept of disease activity, a reflection of the underlying inflammation, encompasses a wide range of domains and measures.2 Since currently used single component measures or indices have limitations because they measure only one aspect of the disease, are fully patient or doctor oriented, or lack face and/or construct validity, the Assessment of SpondyloArthritis international Society (ASAS) has developed a new disease activity score for use in AS: the ‘Ankylosing Spondylitis Disease Activity Score’ (ASDAS).3
Designed in analogy of the DAS4 for rheumatoid arthritis (RA), the ASDAS is a composite index with continuous measurement properties. The development process resulted in four candidate ASDAS scores,3 all of them fulfilling important aspects of truth, feasibility and discrimination.3 5 The ASAS membership has selected the ASDAS with C reactive protein (CRP) as the preferred version and with erythrocyte sedimentation rate (ESR) as the alternative version.3
In order to increase interpretability, a disease activity measure requires criteria for identifying ‘disease activity states’ (or ‘status’) and ‘improvement’ (or ‘response criteria’). Improvement scores help to determine whether treatments really work, that is whether they actually produce clinically important improvement, allowing investigators, clinicians, regulators and patients to determine the efficacy (or lack thereof) of a given intervention and to communicate about response using the same metric.6 Disease activity states measure clinical disease activity at specific timepoints. They are important for supporting decisions about entry into clinical trials, for supporting treatment changes and for defining therapeutic goals. Furthermore, in light of recent therapeutic advances and the increasing potential to improve the outcomes of patients with AS, the definition of criteria for disease states according to the ASDAS is highly relevant, as the prognosis may be different in patients depending on the disease activity states they attain, even if the same level of improvement is achieved. This observation highlights the importance of reporting disease activity states and not just absolute and categorical therapeutic responses, an important concept that has been clearly demonstrated in RA.7
Criteria for disease activity states and improvement scores are therefore important for use in clinical practice, observational studies and clinical trials and so far have not been developed for the ASDAS. In the present study, we evaluated clinically relevant cut-off values for disease activity states and improvement scores using both forms of the ASDAS.
Patients and methods
ASDAS calculation
The ASDAS formulas3 are as follows:
ASDAS-CRP (the preferred version):
0.12 × Back Pain + 0.06 × Duration of Morning Stiffness + 0.11 × Patient Global + 0.07 × Peripheral Pain/Swelling + 0.58 × Ln(CRP + 1)
ASDAS-ESR (the alternative version):
0.08 × Back Pain + 0.07 × Duration of Morning Stiffness + 0.11 × Patient Global + 0.09 × Peripheral Pain/Swelling + 0.29 × √(ESR)
CRP is in mg/litre, ESR is in mm/h; the range of other variables is from 0 to 10; Ln represents the natural logarithm; √ represents the square root.
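As an illustration only, the two formulas can be written as a minimal Python sketch. The function and parameter names are hypothetical and not part of the original publication; the coefficients are those given above, and the check for negative laboratory values is an added assumption.

```python
import math


def asdas_crp(back_pain, morning_stiffness, patient_global, peripheral, crp_mg_l):
    """ASDAS-CRP from the formula above.

    The four clinical items are on a 0-10 scale; CRP is in mg/litre.
    """
    if crp_mg_l < 0:
        raise ValueError("CRP cannot be negative")  # assumed sanity check
    return (0.12 * back_pain
            + 0.06 * morning_stiffness
            + 0.11 * patient_global
            + 0.07 * peripheral
            + 0.58 * math.log(crp_mg_l + 1))  # natural logarithm


def asdas_esr(back_pain, morning_stiffness, patient_global, peripheral, esr_mm_h):
    """ASDAS-ESR from the formula above; ESR is in mm/h."""
    if esr_mm_h < 0:
        raise ValueError("ESR cannot be negative")  # assumed sanity check
    return (0.08 * back_pain
            + 0.07 * morning_stiffness
            + 0.11 * patient_global
            + 0.09 * peripheral
            + 0.29 * math.sqrt(esr_mm_h))
```

For example, with all four clinical items scored 5 and an ESR of 16 mm/h, the ESR-based sketch gives 0.40 + 0.35 + 0.55 + 0.45 + 0.29 × 4 = 2.91.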
Nomenclature for ASDAS disease activity states and improvement scores
During the 2010 ASAS workshop in Berlin, Germany, upon presentation of results and discussion, four disease activity states and two improvement scores were chosen by consensus: (1) disease activity states: ‘inactive disease’, ‘moderate disease activity’, ‘high disease activity’ and ‘very high disease activity’; and (2) improvement scores: ‘minimal clinically important improvement’ (MCII) and ‘major improvement’.
Study population used for the selection of cut-offs
For the selection of cut-offs we used data from the Norwegian disease modifying antirheumatic drug (NOR-DMARD) register,8 9 a Norwegian five-centre register that includes consecutive patients with AS (according to the treating doctor) starting a new conventional or biological DMARD regimen. Measures of disease activity and health status are assessed at baseline, 3, 6, 12 months and yearly thereafter. Patients from the NOR-DMARD register are an appropriate representation of patients with AS in general, as seen by rheumatologists in Norway. Of the patients from NOR-DMARD that we analysed, 69% were men, 90% were positive for human leucocyte antigen (HLA)-B27, the mean (SD) age was 43.3 (10.7) years and the mean disease duration since diagnosis was 12.0 (10.6) years. Detailed characteristics of patients included in NOR-DMARD have been described previously.8 9
In order to have the best representation of the disease activity states being studied, 3-month data (n=331–336) were used to select the cut-off for ‘inactive disease’ and between ‘moderate’ and ‘high disease activity’, while baseline data (n=467–477) were only used to select the cut-off for ‘very high disease activity’. This choice was made because the large majority of patients from NOR-DMARD had (very) active disease at baseline (eg, none of the patients fulfilled ASAS partial remission criteria). Change scores between baseline and 3-month assessment (n=295) were used to select the cut-offs for improvement. The development of cut-offs was performed using ASDAS-CRP, the preferred ASDAS version.
Study populations used for crossvalidation of the cut-offs
Crossvalidation was performed in NOR-DMARD (with an additional timepoint at 6 months) and in an 80% random sample of the Ankylosing Spondylitis Study for the Evaluation of Recombinant Infliximab Therapy (ASSERT) cohort (n=219–223).10 In brief, ASSERT was a randomised 24-week placebo-controlled trial with infliximab that included patients with AS (according to the modified New York criteria11) with a Bath Ankylosing Spondylitis Disease Activity Index (BASDAI)12 and a spinal pain score ≥4 (range 0–10). The ASSERT population was typical of patients with moderate to severe AS. Of the patients from ASSERT that we analysed, 79% were men, 89% were positive for HLA-B27, the mean (SD) age was 39.3 (10.1) years and the mean disease duration was 10.6 (8.7) years. Detailed characteristics of patients in the ASSERT trial have been described previously.10 For the validation we used baseline, 12-week and 24-week data.
The validation of the cut-offs was performed for ASDAS-CRP and ASDAS-ESR. Owing to the statistical approach used in the development of the ASDAS formulas,3 it was expected that the cut-offs developed with ASDAS-CRP would also be applicable to ASDAS-ESR.
Measurement instruments
Patient assessment of global disease activity and the six individual questions of the BASDAI were available in NOR-DMARD and ASSERT. The range of all scores is from 0 to 10. CRP (mg/litre) was also available in both databases, while ESR (mm/h) and physician's global assessment of disease activity were only available in NOR-DMARD. With these assessments, ASDAS-CRP could be calculated in both databases while ASDAS-ESR could only be calculated in NOR-DMARD.
In previous studies concerning the ASDAS,3 5 no description has been given as to how values below the CRP threshold of detection should be handled. This has now been studied and we recommend that in such cases half of the value of the threshold should be used (eg, if the limit of detection is 4 mg/litre, a value of 2 should be used). The use of the high sensitivity CRP assay is preferred.
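The recommendation for values below the detection limit could be sketched as follows. The function name and the convention that an undetectable result is passed as `None` are assumptions made for this illustration.

```python
def crp_for_asdas(measured_crp, detection_limit):
    """Return the CRP value (mg/litre) to enter into ASDAS-CRP.

    measured_crp: the reported value, or None when the laboratory reports
                  only 'below detection limit' (assumed convention).
    detection_limit: the assay's limit of detection in mg/litre.
    """
    if measured_crp is None or measured_crp < detection_limit:
        # Recommended handling: use half of the detection threshold.
        return detection_limit / 2
    return measured_crp
```

With a detection limit of 4 mg/litre, an undetectable result is entered as 2, as in the example above.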
The Bath Ankylosing Spondylitis Functional Index (BASFI)13 was also available in both databases, allowing us to calculate ASAS partial remission and ASAS response criteria.14 15 Moreover, having BASDAI total score available, we were also able to calculate response measures used for the evaluation of efficacy of anti-tumour necrosis factor (TNF) treatment in clinical practice, based on the BASDAI, that is the proportion of patients who had at least 2 units improvement (ΔBASDAI≥2) or at least 50% improvement (BASDAI50).
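The two BASDAI-based response measures described above can be illustrated with a short sketch. The function name and the returned dictionary keys are hypothetical, and treating a baseline of 0 as a BASDAI50 non-response is an added assumption.

```python
def basdai_responses(baseline, follow_up):
    """Classify BASDAI response between two visits (BASDAI range 0-10).

    Returns a dict with two booleans:
      'delta_basdai_ge_2': at least 2 units improvement
      'basdai50': at least 50% improvement relative to baseline
    """
    improvement = baseline - follow_up  # positive means lower BASDAI
    return {
        "delta_basdai_ge_2": improvement >= 2,
        # A baseline of 0 cannot improve by 50% (assumed handling).
        "basdai50": baseline > 0 and improvement >= 0.5 * baseline,
    }
```

A patient moving from 6.0 to 3.5 improves by 2.5 units, which meets ΔBASDAI≥2 but not BASDAI50, since the latter would require an improvement of at least 3.0.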
Use of the receiver operating characteristic analysis for the selection of cut-offs in NOR-DMARD
As there is no universal gold standard to assess disease activity in AS, we performed receiver operating characteristic (ROC) analysis against predefined external criteria considered to be representative of the various disease activity states. Because ASDAS cut-offs should be representative of the perspectives of patients and doctors, we used the patient and physician global assessments at predefined levels (<1, <3 and >6 cm) as external constructs for ‘inactive disease’, to separate ‘moderate’ from ‘high disease activity’ and for ‘very high disease activity’, respectively. Additionally, for determining the cut-off for ‘inactive disease’ we also used ASAS partial remission as an external criterion (table 1).
One of the questions from ASAS members was about estimating the relationship between BASDAI and ASDAS as the BASDAI cut-off of 4 has been extensively used in trials with TNF blockers to determine ‘high disease activity’. Therefore, we compared BASDAI (<3, <3.5 and <4 cm) with the cut-off between ‘moderate’ and ‘high disease activity’ (table 1).
Regarding improvement, the most frequently recommended external criterion for ROC analysis (an anchor-based approach) is the ‘global rating of change’ (GRC), a Likert-type scale scored for change by the patient.16–18 In NOR-DMARD such a scale is available in the form of a single question where patients score the change in their health status according to five categories: ‘much better’, ‘better’, ‘unchanged’, ‘worse’ and ‘much worse’. For the ROC analysis, external anchors were constructed by dichotomising the rating scale for change in two different ways: a cut-off between ‘much better/better’ and ‘unchanged/worse/much worse’ in order to determine ‘MCII’, and a cut-off between ‘much better’ and ‘better/unchanged/worse/much worse’ to determine ‘major improvement’. Moreover, we used the entire cohort in the ROC analysis, rather than just the two groups adjacent to the dichotomisation point, because it has been shown that this procedure maximises precision and yields a more logical estimate of the cut-offs.19 The same principle was used in the ROC analysis for disease activity states.
We applied three methods of ‘optimal’ cut-off determination: (1) fixed 90% specificity, (2) the Youden index and (3) the closest point to (0,1), that is the point where the shoulder of the ROC curve is closest to the upper left corner of the plot. The first method is particularly important in the clinical context (the aim is to avoid misclassifying patients with low/moderate disease activity as inactive), while the last two methods provide the best balance between sensitivity and specificity.20–22
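The three selection rules can be sketched in plain Python as below. This is an illustration, not the SPSS procedure used in the study. The function names are hypothetical, and two points are assumptions: that every observed score is a candidate cut-off, and that 'fixed 90% specificity' means the most sensitive cut-off whose specificity is at least 90%.

```python
import math


def roc_points(scores, is_positive, positive_if_below=True):
    """Return (cut-off, sensitivity, specificity) for each observed score.

    is_positive: True where the external criterion is met (eg, patient
    global < 1 cm when studying 'inactive disease').
    positive_if_below: True when low scores indicate the state of interest.
    """
    n_pos = sum(is_positive)
    n_neg = len(is_positive) - n_pos
    points = []
    for cut in sorted(set(scores)):
        if positive_if_below:
            flagged = [s <= cut for s in scores]
        else:
            flagged = [s >= cut for s in scores]
        tp = sum(f and p for f, p in zip(flagged, is_positive))
        tn = sum((not f) and (not p) for f, p in zip(flagged, is_positive))
        points.append((cut, tp / n_pos, tn / n_neg))
    return points


def select_cutoffs(points, min_specificity=0.90):
    """Apply the three 'optimal' cut-off rules to a list of ROC points."""
    # (1) Highest sensitivity among cut-offs meeting the specificity floor.
    eligible = [p for p in points if p[2] >= min_specificity]
    fixed = max(eligible, key=lambda p: p[1]) if eligible else None
    # (2) Youden index: sensitivity + specificity - 1.
    youden = max(points, key=lambda p: p[1] + p[2] - 1)
    # (3) Smallest distance to (0,1) in (1 - specificity, sensitivity) space.
    closest = min(points, key=lambda p: math.hypot(1 - p[2], 1 - p[1]))
    return {"fixed_specificity": fixed, "youden": youden,
            "closest_to_0_1": closest}
```

In this sketch the fixed-specificity rule can return a different, more conservative cut-off than the other two rules, reflecting the priority it gives to specificity.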
Comparison of the cut-off for ‘MCII’ obtained by the ROC method with ‘minimal detectable improvement’ obtained by other methods
The ROC method assesses which change on the measurement instrument corresponds with an important/meaningful change defined by the anchor, in this case the patient.23 This is higher in hierarchy than ‘minimal detectable improvement’ based on measurement precision.18 However, it is important to assure that the ‘MCII’ lies within boundaries that can be assessed beyond measurement error.23 Therefore, we compared ‘MCII’ obtained by the ROC method with various methods of determining ‘minimal detectable improvement’ and used this to benchmark the choice of the cut-off value for ‘MCII’.
Comparison was made with the ‘mean change’ (a less reliable anchor-based approach)24 and several distribution-based approaches: the ‘Wyrwich standard error of measurement’,25 the ‘Jacobson's reliable change index’,26 the ‘0.5*SD approach’,27 and the ‘smallest detectable change approach’28 (appendix 1).
Crossvalidation study
Crossvalidation was performed in NOR-DMARD and ASSERT for ASDAS-CRP and in NOR-DMARD for ASDAS-ESR. In order to allow comparisons between ASDAS-CRP and ASDAS-ESR, only patients with both values available were used for crossvalidation in NOR-DMARD. However, including all patients with obtainable data for each ASDAS version (approximately 10% more patients) the results were similar (data not shown).
Several crossvalidation approaches were used:
(1) Calculation of sensitivity and specificity of ASDAS cut-off values in comparison with several other criteria at different timepoints.
(2) Assessment of the longitudinal distribution of patients over ASDAS disease activity states before and after start of treatment.
(3) Mean values of BASDAI and ASDAS across the four ASDAS activity states.
(4) Percentage of patients achieving ASDAS improvement criteria (‘MCII’ and ‘major improvement’) in comparison to other widely used improvement criteria (ΔBASDAI≥2, BASDAI50, ASAS20 and ASAS40), 3 and 6 months after start of treatment.
(5) In order to assess discriminative power, χ2 and p values were calculated for the differences between placebo and infliximab in ASSERT.
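For approach (5), a χ2 comparison of responder counts between two treatment arms can be sketched as follows. This assumes a Pearson χ2 test on a 2×2 table without continuity correction; the text does not state which variant was computed in SPSS, and the function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table

                  responders   non-responders
        arm 1         a              b
        arm 2         c              d

    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain observations")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

A larger χ2 for one response criterion than for another, computed on the same two arms, indicates that the first criterion separates the arms more strongly, which is how discriminative power is compared here.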
SPSS V.17.0 (SPSS, Chicago, Illinois, USA) was used for all statistical analyses.
Results
Selection of the optimal cut-offs for disease activity states and improvement scores
The cut-offs for the various external criteria, according to fixed 90% specificity, Youden index and closest point to (0,1) are presented in table 1. The 90% specificity criterion was considered to be the most clinically relevant cut-off for ‘inactive disease’, to separate ‘moderate’ from ‘high disease activity’ and for improvement scores. In these cases, specificity is clinically more important in order to reduce the risk of misclassifying patients whose disease remains active (or who have not really improved) according to the external construct. Regarding the cut-off for ‘very high disease activity’, we considered that it would be better to have the best balance between sensitivity and specificity.
The definite choice for appropriate cut-offs was facilitated by consistent results across all external criteria (table 1). Such concordance between patient and physician global scores (and ASAS partial remission criteria, in the case of ‘inactive disease’) adds to the robustness of our results.
The three cut-offs for disease activity states selected after debate and voting by ASAS members were as follows: <1.3 between ‘inactive disease’ and ‘moderate disease activity’, <2.1 between ‘moderate’ and ‘high disease activity’ and >3.5 between ‘high’ and ‘very high disease activity’ (figure 1A). The cut-off between ‘moderate’ and ‘high disease activity’ (<2.1 units) corresponded to a BASDAI cut-off of <3.5 cm (table 1).
The cut-offs selected for improvements were: change of ≥1.1 units for ‘MCII’ and change of ≥2.0 units for ‘major improvement’ (figure 1B). Importantly, the cut-off for ‘MCII’ exceeded the ‘minimal detectable improvement’ based on measurement error, which ranged from 0.4 to 1.1 (appendix 1).
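The selected cut-offs translate into a simple classification, sketched below. The function names and the label for patients who do not reach 'MCII' are hypothetical. Boundary values are handled as the inequalities above are written (<1.3, <2.1, >3.5), and improvement is taken as a decrease in ASDAS from baseline.

```python
def asdas_state(asdas):
    """Map an ASDAS value to one of the four disease activity states."""
    if asdas < 1.3:
        return "inactive disease"
    if asdas < 2.1:
        return "moderate disease activity"
    if asdas <= 3.5:
        return "high disease activity"
    return "very high disease activity"


def asdas_improvement(baseline, follow_up):
    """Return the highest improvement category reached between two visits."""
    decrease = baseline - follow_up  # positive means lower disease activity
    if decrease >= 2.0:
        return "major improvement"
    if decrease >= 1.1:
        return "clinically important improvement"
    return "no clinically important improvement"  # hypothetical label
```

Any change that qualifies as 'major improvement' also satisfies the 'MCII' threshold; the sketch reports only the higher category.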
Crossvalidation results
Regarding ASDAS-CRP, the cut-offs developed in NOR-DMARD at 3 months showed similar results in terms of sensitivity and specificity against the same (and other) external constructs in NOR-DMARD at 6 months and in ASSERT at 3 and 6 months (table 2). Noticeably, results in ASSERT often surpassed the results in NOR-DMARD, yielding higher sensitivities (above 80%) while retaining the same level of specificity (approximately 90%). For the cut-off between ‘high’ and ‘very high disease activity’ (analysis only performed at baseline) the slightly lower concordance probably reflects the higher subjectivity of the cut-off and a different selection criterion for the ‘optimal’ cut-off.
The longitudinal distribution of ASDAS-CRP disease activity states in both databases (table 3) showed a clinically and statistically significant shift of treated patients from higher disease activity states towards lower disease activity states. Interestingly, in the longitudinal analysis of ASSERT, the differences between the infliximab and placebo groups clearly discriminate between the two treatment arms: at 6-month follow-up 31.9% (infliximab) versus 0% (placebo) of the patients had ‘inactive disease’ (p<0.001), while 12.3% (infliximab) versus 53.6% (placebo) had ‘very high disease activity’ (p<0.001). Moreover, ‘inactive disease’ according to the ASDAS had higher discriminatory capacity (χ2=23.4, p<0.001) than ASAS partial remission criteria (χ2=13.2, p<0.001).
Comparison of BASDAI and ASDAS mean values across the four ASDAS activity states during follow-up (table 4) showed that ASDAS disease activity states were in agreement with clinically relevant numerical differences in BASDAI mean values: BASDAI mean value for ASDAS ‘inactive disease’ ranged from 0.78 to 1.12, while for ASDAS ‘very high disease activity’ it ranged from 6.93 to 7.29 (scale 0–10).
Finally, in both databases, ASDAS ‘MCII’ (ΔASDAS≥1.1) was able to identify more patients with clinically meaningful improvement than the classical criteria: for example in ASSERT at 6-month follow-up, 57.5% of patients achieved ASDAS ‘MCII’, while 51.6%, 41.6% and 52.5% achieved ΔBASDAI≥2, BASDAI50 and ASAS20, respectively (table 5). ASDAS ‘MCII’ was also able to discriminate better between infliximab and placebo groups when compared to classical response criteria (higher χ2 values). Regarding ASDAS ‘major improvement’ (ΔASDAS≥2.0) it was often a more stringent criterion than ASAS40, supporting its validity as a measure of large improvement. Moreover, similarly to the ‘MCII’ cut-off, it showed a higher capacity to discriminate between active and placebo groups compared to usual response criteria (higher χ2 values).
Regarding ASDAS-ESR, overall the results of the crossvalidation in NOR-DMARD were very similar to ASDAS-CRP (tables 2–5). No relevant differences were observed for ‘improvement cut-offs’, while regarding the cut-off values for disease activity states, ASDAS-ESR showed a trend to categorise slightly more patients in lower disease activity states compared to ASDAS-CRP (eg, in NOR-DMARD at 6 months 26.0% had ‘inactive disease’ according to ASDAS-ESR and 20.8% according to ASDAS-CRP) and slightly fewer patients in higher disease activity states (13.0% had ‘very high disease activity’ according to ASDAS-ESR and 18.2% according to ASDAS-CRP).
Discussion
This study sought to determine cut-off values for disease activity states and improvement scores in AS based on the ASDAS. The definition of such criteria is of clinical and scientific importance.6 7 We developed the cut-offs in a routine care population of patients with AS (NOR-DMARD) and validated them in the same population at a different timepoint and in a TNF blocker trial population (ASSERT). The fact that the cut-offs performed at least as well in the trial population enhances their potential for application in both settings. Noticeably, the results of the crossvalidation with ASDAS-CRP and ASDAS-ESR were very similar, supporting the use of the same cut-offs with both ASDAS versions.
The cut-offs were developed on clinical and statistical grounds and showed a remarkable consistency between the various external constructs that were used. Regarding improvement cut-offs, the availability of a GRC questionnaire in NOR-DMARD allowed us to use the most adequate gold standard for this purpose.17 18 29 Importantly, the cut-off for ‘MCII’ was beyond the bounds of measurement error according to all tested methods.
ASDAS categories will facilitate studying the impact of disease activity states on prognosis. Furthermore, the cut-off for ‘inactive disease’ may be an important guideline for achieving a therapeutic aim. Compared to ASAS partial remission criteria, ASDAS ‘inactive disease’ has the advantage of being independent of BASFI: patients with a lot of structural damage that (as a consequence) have a high BASFI30 may never achieve ASAS partial remission, while they may more easily achieve ‘inactive disease’. In light of the results of the crossvalidation, the new ASDAS-based improvement cut-offs may also facilitate the discrimination between treatment arms in clinical trials, and therefore result in smaller sample sizes.
The major limitation of our study is probably the lack of a universal and broadly accepted ‘gold standard’ for clinical disease activity in AS. However, we believe that the use of patient and physician global assessments as external constructs and their remarkable consistency for the selection of cut-offs overcomes this limitation. The use of arbitrary cut-offs for the external constructs may also be questioned, but this was the only possible approach and the predefined cut-offs were discussed and accepted by ASAS members as representative of the disease activity states under study.
In summary, cut-off values for disease activity states and levels of improvement have been developed for the ASDAS. These cut-offs have proven to have external validity and a good performance in crossvalidation. They have been endorsed by ASAS and are now ready to be used in clinical practice, observational studies and clinical trials.
Acknowledgments
This research was conducted while PM was an ARTICULUM Fellow. The authors thank NOR-DMARD contributors and Centocor/Schering-Plough for providing the databases that allowed the analyses necessary to write this manuscript.
Footnotes
Competing interests None.
Ethics approval This study was conducted with the approval of the ASSERT trial and the NOR-DMARD registry, the two databases used in our analysis.
Provenance and peer review Not commissioned; externally peer reviewed.