Background The simplified disease activity index (SDAI) and the clinical disease activity index (CDAI) are established instruments to measure disease activity in rheumatoid arthritis (RA). To date, no validated response definitions for the SDAI and CDAI are available.
Objective The authors aimed to define minor, moderate and major response criteria for the SDAI.
Methods The authors used data from two clinical trials on infliximab versus methotrexate in early (ASPIRE) or established (ATTRACT) RA, and identified the three SDAI cutpoints based on the best agreement (by κ statistics) with the American College of Rheumatology (ACR)20/50/70 responses. Cutpoints were then tested for different aspects of validity in the trial datasets and in a Norwegian disease modifying antirheumatic drug prescription dataset (NOR-DMARD).
Results Based on agreement with the ACR response, the minor, moderate and major responses were identified as SDAI 50%, 70% and 85% improvement. These cutpoints had good face validity concerning the disease activity states achieved by the different response definitions. Construct validity was shown by a clear association of increasing SDAI response categories with increasing levels of functional improvement, achievement of better functional states and lower annual radiographic progression. Across SDAI 50/70/85, the sensitivities regarding a patient-perceived improvement decreased (73%/39%/22%) and the specificities increased (61%/89%/96%) in a meaningful way. Further, the cutpoints discriminated the different treatment arms in ASPIRE and ATTRACT. The same cutpoints were used for the CDAI, with similar results in the validation analyses.
Conclusion These new response criteria expand the usefulness of the SDAI and CDAI for their use as endpoints in clinical trials beyond the definition of disease activity categories.
Various means are available to assess disease activity of rheumatoid arthritis (RA). Since the disease per se is multi-faceted,1 typically composite disease activity measures are used that combine different variables.2–4 The most commonly used indices are the disease activity score based on a 28-joint count (DAS28), the simplified disease activity index (SDAI) and the clinical disease activity index (CDAI), all of which can be applied in clinical trials and clinical practice.5–10 The use of these indices to follow disease activity over time has also become a very important aspect in the care for patients with RA.11–16
Response criteria inherently require assessments at two or more time points. The most commonly used response criteria in clinical trials are those of the American College of Rheumatology (ACR),8 which were derived based on the discrimination of responses between active drug and placebo. The most recently proposed revision of the ACR criteria has maintained its focus on response.17 The response criteria of the European League Against Rheumatism (EULAR) are based on the DAS28 and were developed on the basis of the notion that not only the change in disease activity upon therapeutic intervention is important, but potentially also the disease activity state that has been reached.18
Although we have recently shown that it is clinically more important to achieve a good state than merely a good response,19 it has been recommended to maintain the reporting of response rates in addition to reporting of disease activity states, especially for the short-term perspective of clinical trials.20 21
The newer SDAI and CDAI currently have no validated response measures, but in a previous study we have shown that the patients' perception of response is dependent on the starting level of disease activity.19 Using the ACR response definition as reference, we aimed to take this observation further by developing and validating response criteria for these two indices that (A) are simple, (B) relate to patients' perception of a response, (C) are based on relative changes from a baseline (to reflect the aforementioned findings), and (D) predict important outcomes of RA.
Overall study design
Derivation of response cutpoints
Our study design is based on four a priori assumptions: (A) Response metric: on the basis of the results from an earlier study,19 we defined the response metric of the SDAI to be based on percentage change in score, which was best correlated with the patients' perception of improvement; (B) Gold standard: we will define three levels of cutpoints based on the best agreement with the traditional ACR20/50/70 system, which is the most established way of determining response in clinical trials of RA; these response levels will be labelled as ‘minor’, ‘moderate’ and ‘major’; (C) Simplicity: to maintain the simplicity inherent to the concept of the SDAI and CDAI, we will test and accept response cutpoints between 5% and 95% in steps of 5; (D) Similarity: given that SDAI and CDAI are identical except for the inclusion of C-reactive protein in the SDAI, we will aim to develop definitions of response that can be applied to both indices in the same way; therefore, only SDAI will be used for derivation of response cutpoints, but validations of these cutpoints will be performed for both scores.
Validation of response cutpoints
We will perform fourfold testing of the derived cutpoints for (A) face validity, by investigating which states of disease activity are achieved by patients when they show a certain level of response; (B) construct validity, by investigating to what extent the different response levels are associated with important outcomes of the disease, namely structural and functional outcomes; (C) consistency with patient perception, by analysing the sensitivities and specificities which the defined responses exhibit concerning the patients' perception of improvement; and (D) discriminant validity, by studying how the different response levels discriminate responses achieved with an active agent compared with a placebo or comparator drug.
Patients and datasets
We used three data sources for the purpose of our study: two from clinical trials and one observational dataset. One trial was on established RA, the anti-TNF trial in rheumatoid arthritis with concomitant therapy (ATTRACT),22 comparing infliximab (IFX) plus methotrexate (MTX) versus placebo plus MTX in patients with active disease despite MTX therapy. The other was on early RA, the active controlled study of patients receiving infliximab for treatment of rheumatoid arthritis of early onset (ASPIRE),23 comparing IFX plus MTX versus MTX alone in MTX-naïve patients with RA. For both trials, we had 1-year data on courses of disease activity, physical function and radiographic outcomes (total Sharp score) available. An 80% random patient sample of both trials was kindly provided by Centocor (Malvern, PA, USA). For all analyses, we pooled the different dosing groups of IFX (two in ASPIRE; four in ATTRACT).
The source of the observational data was a Norwegian prescription dataset (the NOR-DMARD study).24 We identified and analysed the first documented course with any disease-modifying antirheumatic drug (DMARD) in each patient. Available data included measures of disease activity and function (modified health assessment questionnaire, mHAQ) at start of the respective DMARD therapy (baseline) and after 3, 6 and 12 months follow up; radiographic data were not available.
NOR-DMARD allowed us to investigate the consistency with patient perception (see validation of response cutpoints), as it includes a patient-reported measure of perceived improvement on DMARD therapy.19 At all visits except the baseline visit, patients were asked to assess the improvement of their disease activity on a 5-point Likert scale. The wording of the question was: ‘Since you started treatment in this follow-up study, has your rheumatic disease improved, been unchanged or become worse?’ (originally in Norwegian: ‘Siden du startet behandling i denne oppfølgingsundersøkelsen, er du blitt bedre, uforandret eller verre i din revmatiske sykdom?’), and the response options were: ‘considerably better’, ‘better’, ‘unchanged’, ‘worse’ and ‘considerably worse’.
Derivation of the best response cutpoints
We derived three response cutpoints for the SDAI based on the point (per cent change of SDAI) of best statistical agreement (Cohen's κ) with ACR20 (minor response), ACR50 (moderate response) or ACR70 (major response) at 6 months. A κ value of 0 indicates agreement as expected by chance, and a value of 1 indicates perfect agreement (negative values, which indicate agreement worse than chance, are possible but not of interest here). Based on this analysis in the three active treatment arms of the trials (ASPIRE: MTX+IFX and MTX alone; ATTRACT: MTX+IFX), we obtained and summarised the best cutpoints for the three ‘% SDAI change’ response levels to be further evaluated for different aspects of validity.
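The cutpoint search described above can be illustrated with a minimal sketch. It assumes that improvement is coded as a positive percentage change, that a patient counts as a responder when the change is at or above the candidate cutpoint, and that ties are resolved in favour of the lowest cutpoint; none of these details is stated in the text. The function names and the toy data are hypothetical and are not trial data.

```python
def cohen_kappa(a, b):
    """Cohen's kappa for two equally long sequences of booleans."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    # Observed agreement: both raters say yes, or both say no.
    agree = sum(1 for x, y in zip(a, b) if bool(x) == bool(y))
    observed = agree / n
    # Chance agreement from the marginal proportions.
    p_a = sum(1 for x in a if x) / n
    p_b = sum(1 for y in b if y) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both ratings are constant")
    return (observed - expected) / (1 - expected)


def best_cutpoint(pct_change, reference, candidates=range(5, 100, 5)):
    """Return (cutpoint, kappa) with the highest kappa against the reference.

    Candidates default to 5%..95% in steps of 5, as in the study design.
    """
    best = None
    for cut in candidates:
        predicted = [change >= cut for change in pct_change]
        if all(predicted) or not any(predicted):
            # kappa may be undefined for constant predictions; skip them
            continue
        kappa = cohen_kappa(predicted, reference)
        if best is None or kappa > best[1]:
            best = (cut, kappa)
    return best


# Hypothetical toy data: % SDAI improvement and ACR20 status per patient.
toy_change = [10, 30, 45, 52, 60, 72, 80, 90]
toy_acr20 = [False, False, False, True, True, True, True, True]
toy_best = best_cutpoint(toy_change, toy_acr20)
```

In this toy example only the 50% cutpoint reproduces the reference classification exactly, so the search returns it with κ = 1.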
Testing and validation of the derived response cutpoints
Clinical face validity
We first identified which states of disease activity would be reached by patients who achieved a response by the minor, moderate and major response cutpoints of the SDAI using the 6-month time point as a clinically relevant time to assess response to therapy. Face validity would suggest that patients with major response should mostly be reaching low disease activity or remission, and that this proportion would decrease when the cutpoints for moderate response or minor response are evaluated. The 6-month cutpoint was chosen because recent therapeutic strategies suggest that a good clinical response should be reached within 6 months from start of therapy.16 Also, we tested the consistency of the SDAI response levels in comparison with the EULAR response criteria (which are based on DAS28 response and state achieved).25
A construct of the disease, RA, is the fact that (A) higher levels of disease activity are associated with higher levels of functional impairment, and thus (B) improvement of disease activity is reflected also in an improvement of physical function, and that (C) ongoing disease activity leads to joint destruction, although for the latter it has been shown that the disease activity state reached, as opposed to the response level reached, is important.26
We tested these outcomes in all arms of the ATTRACT and ASPIRE trials and in NOR-DMARD; in the latter, we did not examine joint damage as there were no radiographic data available. We looked at both the 6-month and the 12-month time points of responses and outcomes. For radiographic analysis we compared the 1-year radiographic progression between SDAI non-responders, SDAI minor (but not moderate) responders, SDAI moderate (but not major) responders, and SDAI major responders at 6 and 12 months.
Validity regarding patient perspective
We identified sensitivities and specificities for the three response levels regarding the degree of improvement as reported by the patients on a 5-point Likert scale after 6 and 12 months (see above).19 For the analysis of the patients' relevant response, the status of the anchor was defined as follows: ‘considerably worse’ or ‘worse’ or ‘unchanged’ = no subjective improvement; ‘improved’ or ‘considerably improved’ = subjective perception of improvement. Since in clinical practice a patient with worsening SDAI would not even be considered to be assessed for the presence or absence of a relevant response, we excluded all patients with negative relative SDAI changes. We used a receiver operating characteristics (ROC) curve analysis to show the changes in sensitivities and specificities across the range of different cutpoints.27
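As an illustration of this step, the sketch below computes sensitivity and specificity of a single cutpoint against the dichotomised patient anchor. It assumes improvement is coded as a positive percentage change, so that patients with a negative value (worsening) are dropped, and that a change at or above the cutpoint counts as a response. The function name and the toy data are hypothetical.

```python
def sensitivity_specificity(pct_change, perceived_improved, cutpoint):
    """Sensitivity and specificity of 'change >= cutpoint' for perceived improvement.

    Patients with a negative relative change are excluded, mirroring the
    exclusion described in the text.
    """
    kept = [(c, bool(p)) for c, p in zip(pct_change, perceived_improved) if c >= 0]
    tp = sum(1 for c, p in kept if c >= cutpoint and p)
    fn = sum(1 for c, p in kept if c < cutpoint and p)
    tn = sum(1 for c, p in kept if c < cutpoint and not p)
    fp = sum(1 for c, p in kept if c >= cutpoint and not p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one improved and one non-improved patient")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: % SDAI improvement and the dichotomised anchor.
toy_change = [-20, 10, 40, 55, 60, 75, 90, 30]
toy_improved = [False, False, True, True, False, True, True, False]

sens_50, spec_50 = sensitivity_specificity(toy_change, toy_improved, 50)
sens_85, spec_85 = sensitivity_specificity(toy_change, toy_improved, 85)
```

Evaluating the function over the whole grid of cutpoints gives the points of an ROC curve; in the toy data, moving from the 50% to the 85% cutpoint lowers sensitivity and raises specificity.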
Another validity criterion for a response measure in RA is its ability to discriminate active treatment from control treatment in clinical trials. We used the ATTRACT and ASPIRE trials for this assessment. In ATTRACT, infliximab was compared with true placebo (both added to the background of insufficiently effective methotrexate). In ASPIRE, IFX plus MTX was compared with MTX alone in an MTX-naïve setting. We used the χ² test to compare the statistical power to discriminate the proportion of responders between active and comparator regimens in the two trials at 1 year.
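A comparison of responder proportions between two arms can be sketched as a Pearson χ² test on a 2×2 table. The version below is an assumption about the details: it applies no continuity correction (the text does not say whether one was used), and it obtains the p value from the identity that holds for one degree of freedom, p = erfc(√(χ²/2)). The function name and the counts are hypothetical.

```python
import math


def chi_square_2x2(responders_a, n_a, responders_b, n_b):
    """Pearson chi-square (no continuity correction) for two responder proportions.

    Returns (chi2, p_value) with one degree of freedom.
    """
    observed = [
        [responders_a, n_a - responders_a],
        [responders_b, n_b - responders_b],
    ]
    total = n_a + n_b
    row_totals = [n_a, n_b]
    col_totals = [responders_a + responders_b, total - responders_a - responders_b]
    if min(row_totals) <= 0 or min(col_totals) <= 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 30 of 50 responders in one arm versus 15 of 50 in the other.
toy_chi2, toy_p = chi_square_2x2(30, 50, 15, 50)
```

With the same trial data, a response level that separates the arms more clearly yields a larger χ² and a smaller p value, which is the sense in which the cutpoints are compared here.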
For all our analyses we used the R software.28
Table 1 shows the baseline characteristics of patients identified in NOR-DMARD and in the two arms each of the ASPIRE and the ATTRACT trials. From NOR-DMARD we included all DMARD therapies that were initiated, mostly methotrexate (37.2%) and biological DMARDs (40.8%).
Identification of the best response cutpoints
Figure 1 displays the κ values for increasing levels of relative SDAI responses (% SDAI change) compared with ACR20/50/70 responses. It can be seen that the optimal cutpoints for SDAI minor response (ie, the one best corresponding to ACR20 responses) were 50%, 45% and 50% changes in SDAI in the three active treatment arms of the trials, respectively. The best cutpoints for SDAI moderate response (corresponding to ACR50) were 70%, 70% and 65%; for SDAI major response (corresponding to ACR70) they were 85% in all three arms.
Based on these results, the SDAI minor response level was defined as a relative change of SDAI of 50%, SDAI moderate as a relative change of 70% and SDAI major response as a change in SDAI of 85%. The agreement with ACR response levels was very similar when the CDAI responses were tested (data not shown), and we therefore proceeded to identical cutpoints for SDAI and CDAI, as planned.
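For illustration, the resulting definitions can be written as a short calculation. The sketch assumes the usual summation of the index components (28 tender and 28 swollen joint counts, patient and evaluator global assessments on 0–10 scales, plus C-reactive protein in mg/dL for the SDAI only) and treats an improvement at or above a cutpoint as meeting it; the text does not say whether the boundary is inclusive. The function names and example values are hypothetical.

```python
# Cutpoints from the text: relative improvement of 50%, 70% and 85%.
CUTPOINTS = ((85.0, "major"), (70.0, "moderate"), (50.0, "minor"))


def sdai(tjc28, sjc28, patient_global, evaluator_global, crp_mg_dl):
    """SDAI as the plain sum of its five components."""
    return tjc28 + sjc28 + patient_global + evaluator_global + crp_mg_dl


def cdai(tjc28, sjc28, patient_global, evaluator_global):
    """CDAI: the same sum without C-reactive protein."""
    return tjc28 + sjc28 + patient_global + evaluator_global


def percent_improvement(baseline, follow_up):
    """Relative improvement from baseline in per cent (positive = better)."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (baseline - follow_up) / baseline


def response_category(baseline, follow_up):
    """Highest response level reached: 'major', 'moderate', 'minor' or 'none'."""
    improvement = percent_improvement(baseline, follow_up)
    for threshold, label in CUTPOINTS:
        if improvement >= threshold:  # inclusive boundary is an assumption
            return label
    return "none"


# Hypothetical patient: SDAI falls from 35.0 at baseline to 9.0 at follow-up.
baseline_score = sdai(12, 10, 6.0, 5.0, 2.0)
follow_up_score = sdai(3, 2, 2.0, 1.5, 0.5)
category = response_category(baseline_score, follow_up_score)
```

In this example the improvement is about 74%, which meets the moderate but not the major cutpoint. The same `response_category` function applies unchanged to CDAI values, in line with the identical cutpoints used for both indices.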
In a sensitivity analysis, we evaluated all patients for whom data were available, including those with negative SDAI changes, and the total area under the ROC curve was 0.732, that is, virtually identical with the 0.752 that we obtained after excluding patients with negative SDAI changes.
We next tested the new response definitions in several different ways.
Clinical face validity
Figure 2 depicts the disease activity states that patients with a minor, moderate or major response have reached at 6 months, using the three active arms of ASPIRE and ATTRACT. Achieving a minor (SDAI50) response led to the virtual absence of high disease activity, while in moderate (SDAI70) responders low disease activity was the dominating state outcome. In major (SDAI85) responders, remission rates reached approximately 40%–60% and very few patients remained in moderate disease activity.
There was reasonable consistency of the new SDAI criteria in comparison with the EULAR response criteria (online supplementary table S1).
Tables 2 and 3 summarise the results of the comparisons of HAQ changes, endpoint HAQ values and 1 year radiographic progression with 6 month SDAI response status for the different cohorts used. It can be seen that functional responses and the HAQ scores reached were associated with the level of SDAI response (table 2). As expected, functional states that were reached even with SDAI85 response in the ATTRACT trial were worse than functional states seen in the ASPIRE trial. This relates to the greater damage-related irreversible functional disability in long-standing RA that remains unaffected by improving disease activity.29 Average radiographic progression was much lower in patients with greater levels of response by the SDAI in both the ASPIRE and the ATTRACT trials (table 3). When progression of joint damage was compared across patients with different ACR response levels (ACR non-responders, ACR20, ACR50 and ACR70), the association was similar (data not shown). The lack of significant effects on radiographic progression in the SDAI85 responders of ATTRACT was also seen with the ACR70 responders. This needs to be viewed in the context of the smaller sample size and of the completers analysis used in the present study, compared with the original trial.
Validity regarding patient perspective
The proportion of patients in NOR-DMARD who rated themselves much worse, worse, unchanged, improved and much improved were 0.5%, 6.6%, 19.2%, 43.7% and 30%, respectively, at 6 months, and 0.8%, 9.8%, 21.3%, 39.4% and 28.7%, respectively, at 12 months.
The expectation would be that the minor response cutpoint is sensitive in identifying patients who consider themselves improved (ie, be inclusive, few false negatives), while the major response should be specific by only identifying those who also subjectively feel improved (ie, be strict; few false positives). The sensitivities and specificities of relative changes of SDAI for patients' perception of improvement (improved/much improved vs rest) are shown in the ROC curve depicted in figure 3. The minor (SDAI50) response cutpoint showed a good sensitivity of 72.5% and specificity of 61.4% regarding the patients' perception of improvement. The moderate (SDAI70) cutpoint showed a moderate sensitivity of 39% paired with a high specificity of 89%. Finally, the major (SDAI85) cutpoint identified patient-reported responses with a low sensitivity of 21.6% but an excellent specificity of 95.8%.
Thus, about three out of four patients with a subjective response are identified with the minor response definition. Conversely, the major cutpoint is highly specific, as it classifies as responders only about 4% of the patients who would not consider themselves improved.
Table 4 shows the response rates in the different arms of the ASPIRE and the ATTRACT trials, and provides the p values resulting from the comparison of response rates between the active and the comparator arm in each trial when the different response levels are used. In the ASPIRE trial of early RA, the SDAI minor, moderate and major response levels discriminated response to infliximab plus MTX compared with active MTX at p values of 1.4 × 10⁻³, 3.4 × 10⁻⁴ and 5.2 × 10⁻⁵, respectively. In the ATTRACT trial of established RA, infliximab was discriminated from placebo therapy (both with background MTX) at p values of 2.3 × 10⁻⁴, 7.9 × 10⁻³ and 4.2 × 10⁻¹, respectively. The somewhat higher p values (ie, lower levels of significance) than in the original publication have to be viewed in the context of the smaller sample size and the completers analysis used in the present study compared with the original trial.
Results for the CDAI
Using the same cutpoints as for the SDAI, the four response categories of SDAI and CDAI showed an excellent agreement: looking at the ASPIRE trial, the ATTRACT trial and the NOR-DMARD dataset, each at the 6- and 12-month time points, the weighted κ values ranged between 0.94 and 0.98. We subsequently repeated all validation analyses shown above for the SDAI also for the CDAI, which, as expected, led to the same conclusions (data not shown).
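The agreement between the four ordered response categories of the two indices can be sketched with a weighted κ. The weighting scheme is not reported in the text; the version below assumes linear disagreement weights, and the coding of the categories as 0 (none) to 3 (major) as well as the toy data are hypothetical.

```python
def weighted_kappa(a, b, n_categories=4):
    """Linearly weighted kappa for two ratings coded 0..n_categories-1."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    k = n_categories
    # Cross-tabulate the two ratings.
    table = [[0] * k for _ in range(k)]
    for x, y in zip(a, b):
        table[x][y] += 1
    row = [sum(table[i]) for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) for j in range(k)]
    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 at the extremes
            observed_disagreement += weight * table[i][j]
            expected_disagreement += weight * row[i] * col[j] / n
    if expected_disagreement == 0.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical toy data: response categories by SDAI and by CDAI for 8 patients.
toy_sdai = [0, 0, 1, 1, 2, 2, 3, 3]
toy_cdai = [0, 0, 1, 1, 2, 2, 3, 2]
toy_kappa = weighted_kappa(toy_sdai, toy_cdai)
```

In the toy data a single patient differs by one category between the two indices, giving a weighted κ of about 0.89; disagreements by more than one category would be penalised more heavily.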
Many instruments are currently available for measuring disease activity in RA. These include, typically, the DAS28, but more recently also the SDAI, which had been introduced to provide a score that is simple to calculate and avoids weighting and transformation. Finally, the CDAI was developed to make formal disease activity assessment possible in situations when acute phase measures were not available. For all these indices, disease activity states have been defined.30 While the DAS28 is the basis for the EULAR response criteria,31 which require change and status achievement, no formal criteria are available to evaluate achievement of a response for the SDAI and the CDAI.
In the present study, we introduce response criteria for the SDAI and CDAI. They are graded into minor, moderate and major responses, which will allow their application depending on the clinical question or the specific purpose of a clinical trial. Both indices have recently been used in major clinical trials,32 33 and the new ACR/EULAR remission criteria also use an SDAI-based definition of remission.34 35 Moreover, current recommendations on data reporting in clinical trials suggest presenting both disease activity states and responses.20 Therefore, the elaboration of SDAI- and CDAI-based definitions of response is certainly overdue.
The anchor for determining response levels for the SDAI was the currently most established response classification system, namely the ACR improvement definition.8 This resulted in response requirements of 50% for minor, 70% for moderate and 85% for major improvement. In the datasets analysed, the minor SDAI response excluded almost all patients with a high disease activity state, while patients fulfilling the major SDAI response criteria were almost all in low disease activity or remission, an observation that lends a great deal of face validity to these response and improvement levels.
Of note, in contrast to the SDAI, the calculation of response in the ACR definition is based on a Boolean approach. In addition, the ACR response definition requires improvement in both swollen and tender joints, while in the SDAI, as a summative index, smaller improvements in one can be compensated by larger improvements in the other. All this may explain why higher relative changes of SDAI corresponded to lower levels in the ACR (ACR20➔ SDAI50; ACR50➔ SDAI70; and ACR70➔ SDAI85). The similar proportions and similar outcomes in the validation analysis between corresponding ACR and SDAI response levels (as partly shown in the results section) further support the notion that the differences in cutpoints are merely related to conceptual differences in calculation of a response. It has to be emphasised that, similar to the ACR responses, SDAI responses are inclusive, that is, the group of SDAI50 responders always also comprises the SDAI70 and SDAI85 responders.
A strength of the present study is that the newly derived criteria were analysed and validated in several cohorts. Each of the datasets provided a unique aspect to the validation, with ASPIRE providing trial data on early RA patients and ATTRACT on established RA, both with the possibility to also investigate structural outcomes. Finally, the NOR-DMARD dataset allowed us to investigate the concordance of response definitions with the perception of improvement in a real-life cohort.
A limitation of a secondary data analysis like the present study is that the data have not been prospectively collected for the purpose of the study. Furthermore, it has to be borne in mind that the efficacy data from the trials (eg, radiographic progression or functional data) may greatly differ compared with the original reports of these trials. This is partly due to the 80% data cut, but particularly related to the fact that here data were analysed ‘as observed’, and not imputed or carried forward as in the main reports.
The aim of our analysis was to derive response criteria, which not only comprise all aspects of validity but also conform to the conceptual background of the SDAI and CDAI, namely simplicity and ease of use. Although these response definitions may still require the use of a calculator, the mode of calculation is much simpler than the Boolean-based integration of individual response calculations, as needed in the ACR response evaluation or other response assessments.
In summary, the new response criteria for the SDAI and CDAI conform to all aspects of validity: face, construct, correlational and discriminative validity. Moreover, they pay tribute to prior findings that patients judge improvement on therapies in a relative manner, comparing it with where they came from when treatment was initiated (the ‘baseline’). Thus, achievement of a relative change by 50%, 70% or 85% can be easily calculated, can be used to summarise efficacy data for a group of patients across all levels of baseline disease activity and, therefore, constitutes an easily useable tool. These definitions expand the convenience and reliability of the SDAI and CDAI to the level of response assessment.
The authors thank Centocor for providing the random sample of the original data from the ASPIRE and ATTRACT trials, and Dr Daniel Baker (Centocor) for the thoughtful discussions.
Handling editor Johannes WJ Bijlsma
Funding This study was supported through Coordination Theme 1 (Health) of the European Community's FP7; Grant Agreement number HEALTH-F2-2008-223404 (Masterswitch). This is a publication of the Joint and Bone Center for Diagnosis, Research and Therapy of Musculoskeletal Disorders of the Medical University of Vienna.
Competing interests DA was involved in the development of SDAI and CDAI and received consulting and speaking honoraria from MSD. JM-A and TKK declare no competing interests. JSS was involved in the development of SDAI and CDAI and received consulting and speaking honoraria from Centocor and MSD.
Provenance and peer review Not commissioned; externally peer reviewed.