Objective: To evaluate concordance and agreement of the original DAS44/ESR-4 item composite disease activity status measure with nine simpler derivatives when classifying patient responses by European League of Associations for Rheumatology (EULAR) criteria, using an early rheumatoid factor positive (RF+) rheumatoid arthritis (RA) patient cohort.
Methods: Disease-modifying anti-rheumatic drug-naïve RF+ patients (n = 223; mean duration of symptoms, 6 months) were categorised as ACR none/20/50/70 responders. One-way analysis of variance and two-sample t tests were used to investigate the relationship between the ACR response groups and each composite measure. EULAR reached/change cut-point scores were calculated for each composite measure. EULAR (good/moderate/none) responses for each composite measure and the degree of agreement with the DAS44/ESR-4 item were calculated for 203 patients.
Results: Patients were mostly female (78%) with moderate to high disease activity. A centile-based nomogram compared equivalent composite measure scores. Changes from baseline in the composite measures in patients with ACRnone were significantly less than those of ACR20/50/70 responders, and those for ACR50 were significantly different from those for ACR70. EULAR reached/change cut-point scores for our cohort were similar to published cut-points. When compared with the DAS44/ESR-4 item, EULAR (good/moderate/none) percentage agreements were 92 with the DAS44/ESR-3 item, 74 with the Clinical Disease Activity Index, and 80 with the DAS28/ESR-4 item, the DAS28/CRP-4 item and the Simplified Disease Activity Index.
Conclusion: The relationships of nine different RA composite measures against the DAS44/ESR-4 item when applied to a cohort of seropositive patients with early RA are described. Each of these simplified status and response measures could be useful in assessing patients with RA, but the specific measure selected should be pre-specified and described for each study.
Composite measures of disease activity in rheumatoid arthritis (RA) were developed in the 1990s to minimise measurement error and enhance the analysis/interpretation of clinical trials for the evaluation of new disease-modifying anti-rheumatic drugs (DMARDs) and biological agents. No single laboratory, clinical, radiographic or functional disability measure comprehensively defines all aspects of RA disease activity.1 2
Various composite “status” measures have been proposed3–6 to assess RA disease activity7 (table 1). The original Disease Activity Score (DAS, DAS44/erythrocyte sedimentation rate (ESR)-4) synthesises important clinical and laboratory measures to define a patient’s disease activity status (table 1).5 6 Several permutations of the DAS44/ESR-4 were created by using the 28-joint count, C-reactive protein (CRP) instead of ESR, and/or substituting a constant for patient global health. Development of the Simplified Disease Activity Index (SDAI) and Clinical Disease Activity Index (CDAI) simplified these measures further.4 8 Two types of “response” measure are in current use: American College of Rheumatology (ACR) 20/50/70% improvement criteria and European League of Associations for Rheumatology (EULAR) improvement criteria.9–12
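To make the simplifications concrete, four of the measures can be sketched as below. Table 1 is not reproduced in this text, so the coefficients are the widely published ones from the general literature and should be checked against table 1 before use. All function and parameter names are illustrative assumptions.

```python
import math

# Coefficients are the commonly published ones (assumption: not copied from table 1).
# DAS formulas take patient global health on a 0-100 mm visual analogue scale;
# SDAI and CDAI take the global assessments on 0-10 cm scales and CRP in mg/dl.

def das44_esr4(ritchie_index, swollen44, esr, global_health):
    """Original DAS: Ritchie Articular Index, 44 swollen joint count, ESR, global health."""
    return (0.54 * math.sqrt(ritchie_index) + 0.065 * swollen44
            + 0.33 * math.log(esr) + 0.0072 * global_health)

def das28_esr4(tender28, swollen28, esr, global_health):
    """DAS28 with ESR and four variables (28-joint counts)."""
    return (0.56 * math.sqrt(tender28) + 0.28 * math.sqrt(swollen28)
            + 0.70 * math.log(esr) + 0.014 * global_health)

def sdai(tender28, swollen28, patient_global, physician_global, crp_mg_dl):
    """Simplified Disease Activity Index: a plain sum of five variables."""
    return tender28 + swollen28 + patient_global + physician_global + crp_mg_dl

def cdai(tender28, swollen28, patient_global, physician_global):
    """Clinical Disease Activity Index: SDAI without the CRP term."""
    return tender28 + swollen28 + patient_global + physician_global
```

The contrast between the weighted square-root and logarithm terms of the DAS formulas and the plain sums of SDAI and CDAI is the simplification referred to in the text.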
The objective of this study was to determine the effects of the various simplifications of the original DAS on the good/moderate/none classification of a cohort of patients with early RA using the EULAR method of calculating response.11 12 We selected the DAS44/ESR-4 item as the referent measure (gold standard) because it is more comprehensive and antedates the other DAS-derived continuous composite measures.
In this paper, we apply nine composite criteria sets to a well-characterised longitudinal observational study cohort of seropositive patients with early RA, whose dataset includes all of the elements needed to calculate and compare the values of those composite measures in the same patients at the same time points with the DAS44/ESR-4 item. We also correlate the disease activity measures with each other and calculate the degree of agreement when the status measurements are used to classify patient responses using the EULAR criteria.
PATIENTS AND METHODS
Patients included in this study were those with early RA participating in an observational study initiated in 1993. The data used in this analysis were collected in the pre-biological era, but many of the patients are still being followed today. The participating rheumatologists in this subset study were from community and university practices in Western USA and Mexico.14–20
Patients were diagnosed according to the 1987 ACR criteria and were required to have <15 months since symptom onset, no previous DMARD treatment, rheumatoid factor seropositivity, ⩾6 swollen joints and ⩾9 tender joints. Assessment at study entry, 6 months, 1 year, and yearly thereafter included the core set measures required to calculate the ACR response criteria and the composite criteria evaluated in this report (table 2).9 Joint counts were performed as previously described.14–18 During the first 2 years, the patient-years of treatment with single or combination DMARDs were: methotrexate, 221; prednisone, 175; anti-malarial agents, 115; sulfasalazine, 44; gold, 13; other agents, <5. Tumour necrosis factor α inhibitors were not available during the study period.
Blood specimens were collected for CRP; Westergren ESRs were determined when clinically indicated. Patients completed a detailed self-report mailed questionnaire at study entry and every 6 months thereafter that included demographics, health, pain, detailed drug use, global visual analogue scale, and the Health Assessment Questionnaire HAQ-DI.21
There were sufficient paired data to calculate ACR response criteria for 223 patients: 203 patients had complete data to calculate DAS measurements at two time points (from baseline to follow-up within 2 years), and 195 patients had sufficient data to calculate both the SDAI and CDAI (the physician global assessment was missing for eight patients). Patients were excluded if the measures could not be calculated from the available data; no data imputation was performed. On the basis of formulas described in table 1, values for the 10 composite measures and the proportions of patients reaching ACRnone/20/50/70 were calculated (table 3). Patients grouped in the ACRnone category did not achieve ACR20, ACR50 or ACR70 at any time during the 2-year follow-up. Similarly, the ACR20 category included patients with ACR20 to ACR49 responses, and ACR50 included ACR50 to ACR69 responses. For convenience and consistency, the changes in status measure scores were calculated using the baseline score and the score at the time the patient first achieved the maximum response during the 2 years of follow-up with the ACRnone/20/50/70 criteria. For example, if the patient was categorised as ACRnone, then the composite disease activity change score was calculated using the values at baseline and at the last visit within the 2-year follow-up period. If the best response was ACR50 at 1 year, then the 1-year visit value was used; if it was ACR20 at 6 months, then the 6-month value was used. During the 2-year follow-up period, the 223 patients had 705 distinct observations with complete data to calculate all 10 composite disease activity measures at the same time points, and these were the basis for the nomogram (fig 1).
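The visit-selection rule described above can be sketched as follows. This is a minimal illustration, assuming each follow-up visit is recorded as a (months, ACR category) pair; the data layout and names are hypothetical and not taken from the study database.

```python
# Ordering of ACR response categories, from worst to best.
ACR_ORDER = {"none": 0, "ACR20": 1, "ACR50": 2, "ACR70": 3}

def select_followup_visit(visits):
    """Pick the follow-up visit used for the change score.

    visits: iterable of (months_since_baseline, acr_category) pairs.
    Returns the visit at which the best ACR response was first achieved,
    or the last visit if the patient never reached ACR20.
    """
    ordered = sorted(visits)  # chronological order by months
    best_level = max(ACR_ORDER[category] for _, category in ordered)
    if best_level == 0:
        return ordered[-1]  # ACRnone: use the last visit within follow-up
    for visit in ordered:
        if ACR_ORDER[visit[1]] == best_level:
            return visit  # first visit reaching the maximum response

def change_score(baseline_score, followup_score):
    """Change value, with improvement counted as positive (assumed sign convention)."""
    return baseline_score - followup_score
```

For a patient with ACR20 at 6 months and ACR50 at both 12 and 24 months, `select_followup_visit` returns the 12-month visit.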
EULAR criteria for good, moderate or none response were determined using cut-points for reached values and change scores (fig 2, table 4).10 11 “Reached values” are those obtained at the follow-up observation. “Change values” are the difference between the baseline score and the follow-up score. Table 4 shows how the EULAR response categories are determined on the basis of two cut-points for the reached and two cut-points for the change values which are used to separate the disease response into nine regions, each of which is then classified as a good, moderate or none response. These cut-points (threshold values) are essential for calculating the EULAR response category of each patient; changes in the cut-points will change the proportions of a cohort who qualify as good, moderate or none responders. We had planned to use accepted published values for the cut-points for the nine DAS-derived measures, but published cut-points were available only for the DAS44/ESR-4 item and DAS28/ESR-4 item.10–12 Reached values (but not change values) have been published for SDAI and CDAI.13 Therefore, it was necessary to calculate the missing cut-points for the remainder of the status measures and use them to estimate the various EULAR response rates from our cohort. Using the 203 patients who had enough paired data, we determined cut-points for reached and change scores for the 10 status measures of disease activity, using an approach similar to that used to develop the original EULAR scores.11 Firstly, each subject was categorised into one of the nine possible regions of the response space (fig 2) using the original published EULAR categorisation and cut-points (DAS44/ESR-4 item; gold standard). Next, we wished to establish cut-points for each of the other disease activity measures. 
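The nine-region classification can be illustrated with a short sketch. Table 4 is not reproduced in this text, so the default cut-points below are the commonly published values for the original DAS (reached 2.4 and 3.7; change 0.6 and 1.2), stated here as an assumption. The function name and the handling of values exactly on a boundary are likewise illustrative.

```python
def eular_response(baseline, followup,
                   reached_cuts=(2.4, 3.7), change_cuts=(0.6, 1.2)):
    """Classify a response as 'good', 'moderate' or 'none'.

    reached_cuts and change_cuts default to the commonly published cut-points
    for the original DAS (assumption: not copied from table 4). Pass other
    cut-points to apply the same grid to a different measure.
    """
    low_reached, high_reached = reached_cuts
    small_change, large_change = change_cuts
    change = baseline - followup  # improvement counted as positive

    if change <= small_change:
        return "none"
    if change > large_change:
        # Large improvement: good only if low disease activity was reached.
        return "good" if followup <= low_reached else "moderate"
    # Intermediate improvement: no response if disease activity remains high.
    return "none" if followup > high_reached else "moderate"
```

As an example, a fall from 4.9 to 2.0 is classified as good, whereas a fall from 4.9 to 4.0 is classified as none because the change is intermediate and the reached value remains above the upper cut-point. Supplying `reached_cuts=(3.2, 5.1)` applies the commonly published DAS28 reached values.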
Using the baseline and follow-up observations used to classify the patients with the DAS44/ESR-4 criterion, we calculated the change and reached values for each subject using each of the nine status measure formulas in table 1. For each status measure, we calculated the 75th centile of the change scores of the subjects who were classified in column 2 of table 4 by the gold standard. Next, we computed the 25th centile of the change scores of the subset classified in column 3 of table 4 by the gold standard. We then established the cut-point B as the median of those subjects between the 75th centile of column 2 and the 25th centile of column 3. If there were no subjects between the two centiles, then we established the cut-point B as the midpoint value between these two centiles. The cut-point A was defined similarly by computing the 75th centile for the subjects in column 1 and the 25th centile of those in column 2. This process was repeated for each of the reached values (C and D) using rows rather than columns. The 25th and 75th centiles were chosen because they provide a stable representation of the population and are unlikely to be influenced by high or low outlier values. Using this method, the cut-point values based on our cohort were similar to the published values11 13 22 that had been calculated using a different cohort; therefore we considered it reasonable to use the cut-points calculated from our cohort to calculate EULAR responses with the DAS derivatives for which no published cut-points were available (fig 2, table 4). Compared with the EULAR categorisation for published cut-points for DAS44/ESR-4, the percentage agreement using the Western Consortium cohort’s calculated cut-points for DAS44/ESR-4 was 95.
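The centile-based rule for a single cut-point can be sketched as follows. Several details are assumptions made for illustration: the centiles use Python's "inclusive" interpolation rather than the SAS definition used in the study, the subjects "between" the two centiles are taken from the two adjacent groups pooled together, and the comparison is strict.

```python
from statistics import median, quantiles

def cut_point(lower_group, upper_group):
    """Estimate the cut-point separating two adjacent gold-standard groups.

    lower_group, upper_group: scores on the new measure for subjects placed
    in adjacent columns (or rows) of the response grid by the gold standard.
    """
    # 75th centile of the lower group and 25th centile of the upper group.
    p75_lower = quantiles(lower_group, n=4, method="inclusive")[2]
    p25_upper = quantiles(upper_group, n=4, method="inclusive")[0]
    low, high = sorted((p75_lower, p25_upper))

    # Subjects from either group whose scores fall strictly between the centiles.
    pooled = list(lower_group) + list(upper_group)
    between = [score for score in pooled if low < score < high]

    if between:
        return median(between)
    return (low + high) / 2  # no subjects in the gap: use the midpoint
```

With `lower_group=[1, 2, 3, 4, 5]` and `upper_group=[5, 6, 7, 8, 9]`, the two centiles are 4 and 6 and the only scores between them are the two values of 5, so the cut-point is 5.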
In addition, we wished to evaluate how much agreement there was between the composite disease activity measures when using the EULAR criteria. Using DAS44/ESR-4 published reached and change values as the gold standard and based on the nine-box grid (fig 2), 75 patients (37%) were good, 84 patients (41.4%) were moderate and 44 patients (21.6%) were non-responders (table 5). We then recategorised subjects with each composite disease activity measure to determine the percentage agreement in classification (none, moderate or good) between DAS44/ESR-4 and each composite disease activity measure using the cut-points indicated by bold type in table 4. Published change and reached values10 were used for the DAS28/ESR-4 item, and published reached values for SDAI and CDAI.10 13 Cut-points calculated from our cohort (table 4) were used for SDAI and CDAI change values, and for the remaining composite measures. Because the DAS28 is so widely used, we recalculated the percentage agreements using the DAS28/ESR-4 item as the gold standard instead of the DAS44/ESR-4 item.
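Percentage agreement here is the proportion of patients placed in the same EULAR category by both measures. A minimal sketch, with hypothetical names and example data:

```python
def percentage_agreement(reference, comparison):
    """Percentage of patients given the same category by both measures.

    reference, comparison: equal-length sequences of category labels
    ('good', 'moderate' or 'none'), one entry per patient, in the same order.
    """
    if len(reference) != len(comparison):
        raise ValueError("both classifications must cover the same patients")
    matches = sum(ref == other for ref, other in zip(reference, comparison))
    return 100.0 * matches / len(reference)

# Hypothetical example: four patients, one of whom is classified differently.
gold = ["good", "moderate", "none", "good"]
other = ["good", "moderate", "moderate", "good"]
print(percentage_agreement(gold, other))  # 75.0
```

This is raw agreement; it is not corrected for agreement expected by chance.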
We used one-way analysis of variance to test for differences between means of the various composite disease activity measures among the ACR groups (ACR20/50/70/none), the two-sample t test for pairwise differences between ACR groups for each composite measure, and Spearman correlation coefficients for the overall relationship between composite measures. The nomogram was created using SPSS V12 (SPSS Inc, Chicago, Illinois, USA). All other statistical computations used SAS V9 (SAS Institute Inc, Cary, North Carolina, USA). p<0.05 was considered significant.
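For readers unfamiliar with the first of these tests, the F statistic of a one-way analysis of variance can be computed as in the sketch below. It is an illustration in plain Python, not the SAS procedure used in the study, and the example values are hypothetical. Obtaining the p value additionally requires the F distribution from a statistics package.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the F statistic comparing the means of several groups.

    groups: list of lists, eg the change scores of one composite measure
    for the ACRnone, ACR20, ACR50 and ACR70 groups.
    """
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    k = len(groups)       # number of groups
    n = len(all_values)   # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)

    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

For three hypothetical groups `[1, 2, 3]`, `[2, 3, 4]` and `[5, 6, 7]`, the between-group sum of squares is 26 and the within-group sum of squares is 6, giving F = 13 on 2 and 6 degrees of freedom.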
RESULTS
Table 1 compares the formulas used to calculate the different composite measures. Table 2 describes the baseline characteristics of the 223-patient early-RA cohort. The patients are similar to those in other early-RA cohorts: mostly female (78%), with a mean disease duration of 6 months and moderate to high disease activity (mean DAS44/ESR-4 item, 4.9; mean DAS28/ESR-4 item, 6.1). The average swollen joint count and tender joint count were 12 and 13 (using the 28-joint count), the average Ritchie Articular Index was 17.8 (out of 78), and the swollen 44-joint count was 18.7.
The centile distributions of the scores of 10 composite status measures (calculated for each of 705 distinct observations) are plotted in fig 1. Scores connected by each horizontal centile line are equivalent to each other in this population—for example, equivalent scores on the 20th centile line are DAS44/ESR-4 (2.4), DAS44/ESR-3 (2.4), DAS44/CRP-4 (2.2), DAS44/CRP-3 (2.2), DAS28/ESR-4 (3.3), DAS28/ESR-3 (3.4), DAS28/CRP-4 (3.1), DAS28/CRP-3 (3.1), SDAI (10) and CDAI (9). Thus, if the value of one of these status measures is known, the scores of the other measures can be approximated. For example, given a known CDAI value, equivalent values of DAS and SDAI can be easily estimated from this nomogram.
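The centile matching that underlies the nomogram can be sketched as follows. This is an illustration only: it uses a simple nearest-rank rule on raw observations rather than the SPSS procedure used for fig 1, and the observation lists are hypothetical.

```python
from bisect import bisect_right

def equivalent_score(score, from_scores, to_scores):
    """Map a score on one measure to the same-centile score on another.

    from_scores, to_scores: observed scores of the two measures in the same
    population (in the study, the same 705 observations for every measure).
    """
    source = sorted(from_scores)
    target = sorted(to_scores)
    # Number of source observations at or below the given score.
    count = bisect_right(source, score)
    # Matching rank in the target distribution (ceiling division, at least 1).
    rank = max(1, -(-count * len(target) // len(source)))
    return target[rank - 1]

# Hypothetical observations for two measures in the same five patients.
cdai_observed = [2, 9, 14, 22, 30]
das28_observed = [2.1, 3.3, 4.0, 5.2, 6.4]
print(equivalent_score(9, cdai_observed, das28_observed))  # 3.3
```

Any such translation applies only within the population from which the centiles were derived.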
Correlations between the disease activity composite measures (table 1) ranged from 0.85 for DAS44/CRP-3 with the DAS28/ESR-3 item to 0.99 comparing DAS44/ESR-4 with DAS44/ESR-3 and SDAI with CDAI (data not shown).
For each of the 223 patients, the ACR20/50/70 response from baseline to the best follow-up observation point was calculated. The maximum improvement attained at any time point during the 2-year follow-up was ACR70 in 54 patients, ACR50 in 36 patients and ACR20 in 46 patients; 87 patients did not satisfy ACR20 criteria at any evaluation during the 2-year period (table 3). For each of the 10 status measures, the change in score between baseline and the observation used to calculate the ACR20/50/70/none was determined. For all of the status measures, the mean change scores for the ACRnone group were significantly different from the change scores for the ACR20, ACR50 and ACR70 groups. The baseline to ACR50 change scores were significantly different from the baseline to ACR70 change scores, but the change scores of the ACR20 and ACR50 groups were not significantly different (table 3).
Comparing change scores of the DAS44/ESR-4 item with the DAS44/ESR-3 item, DAS44/CRP-4 item, DAS44/CRP-3 item, DAS28/ESR-4 item, DAS28/ESR-3 item, DAS28/CRP-4 item, DAS28/CRP-3 item, SDAI and CDAI, the correlation coefficients of these change scores were respectively 0.99, 0.97, 0.96, 0.87, 0.85, 0.82, 0.81, 0.83 and 0.81 (all p<0.001) (data not shown).
To evaluate how well the different composite disease activity measures categorised patients into the EULAR good, moderate and none criteria, we evaluated the percentage agreement between DAS44/ESR-4 item categories (using published EULAR cut-points) with categorisation of the same patients using each composite measure. The goal was to use the most recognised values for this analysis; thus, the published reached and change values were preferentially used if available, otherwise the calculated values from the Western Consortium cohort (table 4) were used to calculate the percentage agreement. When the DAS44/ESR-3 item was compared with the DAS44/ESR-4 item published cut-point values, there was a percentage agreement of 92. Similarly, for the DAS44/CRP-4 item, DAS44/CRP-3 item, DAS28/ESR-4 item, DAS28/ESR-3 item, DAS28/CRP-4 item, DAS28/CRP-3 item, SDAI and CDAI, the percentage agreements with the DAS44/ESR-4 item were calculated to be 87, 89, 80, 78, 80, 80, 80 and 74, respectively (table 5). The highest agreement is between the DAS44/ESR-4 item and other DAS44 measures. Less agreement is seen with the DAS28 measures, and least with CDAI.
When we used the DAS28/ESR-4 item as the gold standard in comparisons with the DAS44/ESR-4 item, DAS44/ESR-3 item, DAS44/CRP-4 item, DAS44/CRP-3 item, DAS28/ESR-3 item, DAS28/CRP-4 item, DAS28/CRP-3 item, SDAI and CDAI, the percentage agreements were 77, 76, 72, 73, 88, 83, 76, 73 and 74, respectively (data not shown).
DISCUSSION
This paper compares the DAS44/ESR-4 item with nine composite disease activity status measures and response measures using a well-characterised longitudinal observational study cohort of seropositive patients with early RA. Many modifications of the original DAS (DAS44/ESR-4 item) have been developed, including: DAS28/ESR-4 item, DAS28/CRP-4 item, DAS28/ESR-3 item, DAS28/CRP-3 item, DAS44/ESR-3 item, DAS44/CRP-4 item, DAS44/CRP-3 item, SDAI and CDAI. The original DAS was modified by using CRP instead of ESR, by using 28-joint counts instead of the Ritchie Articular Index and 44 swollen joint count, by substituting a constant for the patient global assessment, and by changing the formula to simply add the variables rather than using multiplication, square roots and natural logarithms.
The correlation among scores obtained using the different disease activity status measures is good, with the smallest correlation coefficient being 0.85. Within our cohort, the nomogram (fig 1) illustrates the translation between comparable values of the different composite disease activity status measures. Thus, there is concordance between all 10 disease activity status measures.
This study also evaluates the ACR20/50/70 response and shows that the mean change from baseline for each composite measure was able to detect a difference between ACRnone and ACR20/50/70 as well as between ACR50 and ACR70, but could not detect a significant difference between the ACR20 and ACR50 responder groups. The 10 RA disease activity status measures behave similarly in this respect. These data are similar to previously published data.13 Change values and reached values were calculated for our cohort using a method similar to that reported when the original EULAR values were calculated.10 11 When our calculated cut-points are compared with published cut-points, the values are generally similar. However, there is substantial variation among the EULAR response criteria cut-points that we calculated for the 10 different composite disease activity measures, suggesting that it may not be appropriate to use the cut-points published for one measure (eg, DAS28/ESR-4) with a different measure (eg, DAS28/CRP-4). Calculated cut-points may also vary to some extent depending on the characteristics of the RA population being studied. For example, our calculated cut-points for the DAS28/ESR-4 item agree more closely with those more recently reported23 24 than with the original published values.22 We do not consider that these findings justify changing the original published cut-points that have been used for many years.11 22 However, for those composite disease activity measures with no available published values, the cut-point values calculated for our cohort might be used as preliminary approximations until consensus can be reached with additional studies.
When the percentage agreement with the gold standard (DAS44/ESR-4 item) was calculated for each of the other composite disease activity status measures, CDAI (74%) and SDAI (80%) were in the same general range as the DAS28-based measures (78–80%). This suggests that the accuracy of the EULAR categorisation as none, moderate or good response is similar when CDAI, SDAI or one of the DAS28 measure scores is substituted for the original DAS44/ESR-4 item score. Specifically, the percentage agreement is 92 between the DAS44/ESR-4 item and DAS44/ESR-3 item measures, and 88 between the DAS28/ESR-4 item and DAS28/ESR-3 item when they are used to categorise the responses of our cohort.
The major limitation of this paper is that our cohort was not developed for the purpose of studying the stated objective, and the results presented may not be directly applicable to other cohorts. The purpose of this study was to describe the similarities and differences in scores obtained with the different composite disease activity measures, not to improve the measures. Secondly, ours is an early, seropositive cohort, different from the cohorts used to develop the original DAS, ACR response, SDAI and CDAI criteria. In addition, some may argue that there is a sense of circularity to the paper. However, to justify the use of change and reached values calculated with our cohort for those measures without published values, we felt that it was necessary to compare the published change and reached values of the DAS44/ESR-4 gold standard with those calculated using our cohort. As these values were similar and there was 95% agreement in the categorisation of the response of subjects as good/moderate/none, we considered our method for calculating the cut-points to be valid (table 5).
In conclusion, this paper has investigated the application of 10 different composite disease activity measures and how they relate to one another, in a longitudinal observational study cohort of seropositive patients with early RA. It is apparent that there are many similarities in the way the measures behave, suggesting that each could be useful in assessing RA disease activity at a point in time (status) and in analysing change in status over time (response)—for example, as in a controlled clinical trial or in the analysis of a longitudinal observational study—and that some may be suitable candidates for measuring desired responses in the aggressive treatment of patients in clinical practice.
Acknowledgements: We thank Dr Janet Elashoff for her valued contribution to this paper.
These authors contributed equally to this work.
The Western Consortium of Practicing Rheumatologists: Robert Shapiro, Maria W Greenwald, H Walter Emori, Fredrica E Smith, Craig W Wiesenhutter, Charles Boniske, Max Lundberg, Anne MacGuire, Jeffry Carlin, Robert Ettlinger, Michael H Weisman, Elizabeth Tindall, Karen Kolba, George Krick, Melvin Britton, Rudy Greene, Ghislaine Bernard Medina, Raymond T Mirise, Daniel E Furst, Kenneth B Wiesner, Robert F Willkens, Kenneth Wilske, Karen Basin, Robert Gerber, Gerald Schoepflin, Marcia J Sparling, George Young, Philip J Mease, Ina Oppliger, Douglas Roberts, J Javier Orozco Alcala, John Seaman, Martin Berry, Ken J Bulpitt, Grant Cannon, Gregory Gardner, Allen Sawitzke, Andrew Lun Wong, Daniel O Clegg, Timothy Spiegel, Wayne Jack Wallis, Mark Wener, Robert Fox
Funding: The study was supported by Specialty Laboratories (Valencia, CA, USA), with previous support from NIH Multipurpose Arthritis and Musculoskeletal Disease Center Grant P60 AR 36834. VR was supported in part by ASP/REF/ACR Junior Career Development Award in Geriatrics and the OAIC Career Development Award. DK’s work was supported in part by the Arthritis and Scleroderma Foundations (Physician Scientist Development Award), and a National Institutes of Health BIRCWH award (grant No HD051953).
Competing interests: None.
Abbreviations:
- ACR: American College of Rheumatology
- CDAI: Clinical Disease Activity Index
- CRP: C-reactive protein
- DAS: Disease Activity Score
- DMARD: disease-modifying anti-rheumatic drug
- ESR: erythrocyte sedimentation rate
- EULAR: European League of Associations for Rheumatology
- RA: rheumatoid arthritis
- SDAI: Simplified Disease Activity Index