
American College of Rheumatology/European League Against Rheumatism Provisional Definition of Remission in Rheumatoid Arthritis for Clinical Trials
1. David T Felson1,2,
2. Josef S Smolen3,
3. George Wells4,
4. Bin Zhang5,
5. Lilian H D van Tuyl1,
6. Julia Funovits6,
7. Daniel Aletaha6,
8. Cornelia F Allaart7,
9. Joan Bathon8,*,
10. Stefano Bombardieri9,
11. Peter Brooks10,
12. Andrew Brown11,
13. Marco Matucci-Cerinic12,
14. Hyon Choi4,
15. Bernard Combe13,
16. Maarten de Wit14,
18. Paul Emery16,
19. Daniel Furst17,
20. Juan Gomez-Reino18,
21. Gillian Hawker19,
22. Edward Keystone20,
23. Dinesh Khanna17,
24. John Kirwan21,
25. Tore K. Kvien22,
26. Robert Landewé23,
27. Joachim Listing24,
28. Kaleb Michaud25,
29. Emilio Martin-Mola26,
30. Pamela Montie27,
31. Theodore Pincus28,
32. Pamela Richards29,
33. Jeffrey N Siegel30,
34. Lee S Simon31,
35. Tuulikki Sokka32,
36. Vibeke Strand33,
37. Peter Tugwell3,
38. Alan Tyndall34,
39. Desirée van der Heijde7,
40. Suzan Verstappen35,
41. Barbara White36,
42. Frederick Wolfe37,38,
43. Angela Zink24,
44. Maarten Boers5
1. 1Boston University School of Medicine, Boston, Massachusetts, USA
2. 2University of Manchester, Manchester, UK
3. 3Medical University of Vienna and Hietzing Hospital, Vienna, Austria
4. 4University of Ottawa, Ottawa, Ontario, Canada
5. 5VU University Medical Center, Amsterdam, The Netherlands
6. 6Medical University of Vienna, Vienna, Austria
7. 7Leiden University Medical Center, Leiden, The Netherlands
8. 8Johns Hopkins University School of Medicine, Baltimore, Maryland
9. 9University of Pisa, Pisa, Italy
10. 10University of Melbourne, Melbourne, Victoria, Australia
11. 11University of York, York, UK
12. 12University of Florence, Florence, Italy
13. 13Montpellier University Hospital, Montpellier, France
14. 14Amsterdam, The Netherlands
15. 15Paris-Descartes University, UPRES-EA, Assistance Publique Hôpitaux de Paris, and Cochin Hospital, Paris, France
16. 16University of Leeds and NIHR Leeds Musculoskeletal Biomedical Research Unit, Leeds, UK
17. 17David Geffen School of Medicine, University of California, Los Angeles, California, USA
18. 18Hospital Clinico Universitario and Universidad de Santiago de Compostela, Santiago, Spain
19. 19Women's College Hospital and University of Toronto, Toronto, Ontario, Canada
20. 20University of Toronto, Toronto, Ontario, Canada
21. 21University of Bristol, Bristol, UK
22. 22Diakonhjemmet Hospital, Oslo, Norway
23. 23University Hospital Maastricht, Maastricht, The Netherlands
24. 24German Rheumatology Research Center, Berlin, Germany
25. 25National Data Bank for Rheumatic Diseases, Wichita, Kansas and University of Nebraska Medical Center, Omaha
26. 26Hospital Universitario La Paz, Madrid, Spain
28. 28New York University School of Medicine, New York, New York
29. 29Bristol, UK
30. 30US Food and Drug Administration, Washington, DC, USA
31. 31SDG LLC, Cambridge, Massachusetts, USA
32. 32Jyväskylä Central Hospital, Jyväskylä, Finland
33. 33Stanford University School of Medicine, Palo Alto, California, USA
34. 34University of Basel, Basel, Switzerland
35. 35University of Manchester, Manchester, UK
36. 36MedImmune, Gaithersburg, Maryland, USA
37. 37National Data Bank for Rheumatic Diseases, Wichita, Kansas, USA
38. 38University of Kansas, Wichita, Kansas, USA
1. Correspondence to Maarten Boers, Department of Epidemiology and Biostatistics, VU University Medical Center, PK 6Z 165, PO Box 7057, 1007 MB Amsterdam, The Netherlands; eb{at}vumc.nl

## Abstract

Objective: Remission in rheumatoid arthritis (RA) is an increasingly attainable goal, but there is no widely used definition of remission that is stringent but achievable and could be applied uniformly as an outcome measure in clinical trials. This work was undertaken to develop such a definition.

Methods: A committee consisting of members of the American College of Rheumatology, the European League Against Rheumatism, and the Outcome Measures in Rheumatology Initiative met to guide the process and review prespecified analyses from RA clinical trials. The committee requested a stringent definition (little, if any, active disease) and decided to use core set measures including, as a minimum, joint counts and levels of an acute-phase reactant to define remission. Members were surveyed to select the level of each core set measure that would be consistent with remission. Candidate definitions of remission were tested, including those that constituted a number of individual measures of remission (Boolean approach) as well as definitions using disease activity indexes. To select a definition of remission, trial data were analysed to examine the added contribution of patient-reported outcomes and the ability of candidate measures to predict later good radiographic and functional outcomes.

Results: Survey results for the definition of remission suggested indexes at published thresholds and a count of core set measures, with each measure scored as 1 or less (eg, tender and swollen joint counts, C reactive protein (CRP) level, and global assessments on a 0–10 scale). Analyses suggested the need to include a patient-reported measure. Examination of 2-year follow-up data suggested that many candidate definitions performed comparably in terms of predicting later good radiographic and functional outcomes, although 28-joint Disease Activity Score–based measures of remission did not predict good radiographic outcomes as well as the other candidate definitions did. Given these and other considerations, we propose that a patient's RA can be defined as being in remission based on one of two definitions: (1) when scores on the tender joint count, swollen joint count, CRP (in mg/dl), and patient global assessment (0–10 scale) are all ≤1, or (2) when the score on the Simplified Disease Activity Index is ≤3.3.

Conclusion: We propose two new definitions of remission, both of which can be uniformly applied and widely used in RA clinical trials. The authors recommend that one of these be selected as an outcome measure in each trial and that the results on both be reported for each trial.


With the advent of new therapies and treatment strategies for rheumatoid arthritis (RA), remission has become a realistic goal1–3 and has recently become a secondary or even primary end point for clinical studies and trials.4–8 Remission is also regarded as a major therapeutic target in clinical practice9–12 and can be achieved in a significant proportion of patients receiving routine follow-up care.13–15 However, the formal definition of RA remission differs between studies.

The current American College of Rheumatology (ACR) definition of remission in RA16 was developed in 1981, prior to the introduction of the RA core set measures.17 In this classic article, Pinals et al stated: “… ‘complete remission’ implies the total absence of all articular and extra-articular inflammation and immunologic activity related to rheumatoid arthritis (RA).” Recognising that detecting such a state could entail documentation by ‘extraordinary measures,’ they settled on the concept of ‘complete clinical remission,’ aiming to achieve ‘uniformity in clinical application using generally acceptable and convenient measures.’ Even though this concept is of considerable value as a therapeutic target in trials and clinical practice, the 1981 ACR definition has not been widely used in clinical trials in RA because it contains some elements not in the core set (morning stiffness, swelling in tendon sheaths) and a time requirement. Also, this original version was so stringent that few patients met the criteria. Subsequently, many modifications of the ACR criteria were developed, usually omitting one or more of the measures as well as the time requirement.

The development of composite indices of disease activity allowed definition of cut point values representing remission,18–20 but their validation was often limited to comparisons with such modified ACR criteria. For instance, we now know that the widely used definition of remission based on a 28-joint Disease Activity Score (DAS28)21 of <2.6 (ref. 18) better represents minimal disease activity than remission, since multiple joints can remain swollen or tender at that score.19 22–24 This is further exemplified by the fact that in many recent clinical trials the proportion of patients with an ACR70 response (ie, improvement based on the ACR preliminary definition of improvement25 but applying 70% improvement instead of 20%) is similar to, or even lower than, the proportion of patients attaining remission as assessed by the DAS28 (refs. 5, 26–28). Thus, the 1981 statement by Pinals et al16 remains relevant today: “Substantial variation appears to exist in the concept of remission within the group of participating rheumatologists.”

In the meantime, effective treatments for RA have led to more exacting criteria for improvement (eg, ACR 50%, 70%, and even 90% improvement) and have led to recently proposed definitions of minimal disease activity.29 In light of the heterogeneity of definitions of remission, the time has come for consensus on a new, uniform remission definition. Therefore, the ACR and the European League Against Rheumatism (EULAR), together with the Outcome Measures in Rheumatology Initiative (OMERACT), jointly convened a committee to redefine remission in RA. This committee subsequently published a systematic review of the prognostic validity of current remission definitions30 as well as an outline of the goals of redefining remission and the methods by which the goals would be attained.2

As noted in this already published outline of our goals,2 the committee decided by consensus to create a stringent definition for remission and agreed that any definition should include, as a minimum, tender and swollen joint counts and levels of an acute-phase reactant. Excluded were treatment, duration of remission (the committee believed this should be specified in each trial report), and measures of physical function and radiographic damage. The latter two were to be used to validate candidate remission definitions: the chosen definition should predict future good functional outcomes and absence of radiographic damage progression. Remission should also predict future remission and minimal disease activity, that is, show stability. Finally, the requirement for full or 28-joint counts had to be studied. The committee suggested that core set measures should be used to define remission and that any definition of remission in clinical trials should look toward and make possible a similar definition in clinical practice.

The selection of the optimal definition of remission was guided by the research agenda as put forth by the committee at the beginning of our deliberations. In general, the evidence-based consensus method was in accordance with similar activities previously performed by OMERACT as well as ACR and EULAR31–33 with the intent of deriving a definition that would pass the OMERACT filter of truth, discrimination, and feasibility.34 Herein we present the results of analyses addressing this research agenda, report on later meetings of the committee in which the results were evaluated, and present a consensus definition of remission.

## Methods

### General aspects

The initial committee was formed by inviting members of the ACR committee that had previously formulated the new ACR response criteria,35 principal investigators of recent clinical trials, methodologists and patient experts from the OMERACT community, and ACR, EULAR, and OMERACT leaders, with a view to being inclusive and geographically representative. All members present at one of the committee meetings were asked to consider becoming authors of the present report. A patient expert (or research partner) can be described as a patient involved in research based on personal experience of disease that is not available to most researchers, but that complements researchers' analytical skills and scientific perspective.36 During the development of the remission definition, six patient experts were involved, three of whom (MdW, PM, PR) are authors. Using the specifications provided by the committee, a steering group (DTF, JSS, GW, BZ, LHDvT, JF, MB) designed and performed the necessary investigations; this included a survey as well as analyses of clinical trial data. Clinical trial data banks with slightly different total patient numbers and data composition were created at different centers. Because industry-funded clinical trials have been the largest RA trials carried out and data collected in these trials would be useful for testing hypothesised remission definitions, we solicited industry-funded RA clinical trials data, with the companies' approval. Industry had no role in data analysis, criteria development, testing or evaluating the process, or final choices made by the committee, nor were they consulted or involved in manuscript development. These data were analysed by the steering group according to the committee's specifications and methodologic discussions in the steering group.

### Survey

Our first goal in selecting candidate criteria for remission was to define what cut points in each of the core set measures might constitute remission. In a survey we asked committee members (experienced RA clinical researchers and patient experts) what level of residual activity in individual core set measures would constitute remission. For all measures except joint counts, we used a 0–10 scale. For C reactive protein (CRP), we used a scale in milligrams per deciliter. For initial analyses of joint counts, we used a 28-joint count and then examined its validity for remission (see below). We asked committee members to state the highest level of each core set measure that would be compatible with remission if it were the only measure assessed, and also asked for the highest level of a particular core set measure that would be compatible with remission if all other measures suggested remission.

### Value of patient-reported outcomes

The committee raised the question as to whether patient-reported outcomes should be included in the definition of remission. We addressed this issue by asking whether patient-reported outcomes at the level of remission discriminated between active versus control treatment in trials. In a subset of our data bank, which comprised core set data from four clinical trials,37–40 we performed two sets of analyses in each trial. In both analyses the dependent variable was treatment assignment. First, we carried out a logistic regression with each of the core set measures as predictors (recoded as remission level, eg, swollen joint count ≤1: yes or no). Second, we performed recursive partitioning by classification and regression tree (CART) analysis on data from the four clinical trials, in which we ranked core set measures at remission level based on the tree created from a series of binary splits. Recursive partitioning is a statistical method for multivariable analysis, creating a tree with branches that strives to correctly classify members of the population based on a dichotomous dependent variable. If the patient-reported outcomes helped differentiate active treatment from control (either by being a significant predictor in the regression analysis or by having a high rank in the classification tree), then these outcomes would be said to contribute importantly to defining remission. Patient-reported outcomes tested in this analysis were patient global assessment and patient pain. Functional status measurement was not included, for reasons outlined above.

### Assessment of predictive validity

Once we had decided that patient-reported outcomes were to be included and had determined the cut points to be used to define remission, we undertook the analysis of predictive validity. To this end we evaluated various 2-year data sets from randomised clinical trials (patient-level data on 80–90% of patients selected randomly) kindly provided by the sponsors of these studies35 37 40–43 and obtained permission to use these data for the present analysis. The data are described in more detail in the original publications. For the present analyses, we evaluated only patients for whom all pertinent data over a 2-year period were available.

We initially defined good outcome for radiographic damage and physical function separately. For radiographic damage, the definition comprised stable radiography scores over 1 year (defined as change of ≤0 in Sharp scores44 or modified Sharp/van der Heijde scores45 during the second year of the trial). For physical function, it comprised stable and low scores on the Health Assessment Questionnaire (HAQ)46 (change of ≤0 and HAQ score consistently ≤0.5 during the second year of the trial). We then tested whether patients who met a particular definition of remission at 6 months or 12 months were more likely to have a good outcome in the subsequent period, that is, between 1 and 2 years after trial initiation. Likelihood ratios were used to compare the proportion of patients having the good outcome whose RA was in remission to the proportion of patients having the good outcome whose RA was not in remission. To rank candidate definitions of remission, we used the P value from the logistic regression χ² test. As has been reported,47 most patients in trials who are followed up long term do not show radiographic progression. This limited our capacity to discriminate between candidate definitions of remission. Moreover, intensive therapy with tumor necrosis factor (TNF) inhibitors plus methotrexate (MTX) dissociates clinical disease activity from progression of joint damage, since, unlike patients treated with MTX alone, those receiving aggressive treatments have no or minimal radiographic progression irrespective of their disease activity.48–50 Therefore, we primarily performed the analyses on patients treated with MTX monotherapy; however, we also evaluated TNF inhibitor monotherapy and combination therapy in sensitivity analyses. To assess the robustness of the results, we also performed the analyses on a subset of trial patients with an especially poor prognosis in terms of radiographic disease, that is, presence of rheumatoid factor and presence of radiographic damage at baseline. Finally, we tested an additional definition of a good outcome, that is, stability of both radiographic damage and HAQ score.
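The outcome definitions and the comparison described above can be sketched in Python. This is a minimal illustration, not the steering group's analysis code: the function names, the input layout (one score per visit), and the reading of "likelihood ratio" as a positive likelihood ratio (the proportion in remission among patients with a good outcome divided by the proportion in remission among patients without one) are assumptions made for the example.

```python
from typing import Sequence


def radiographic_good_outcome(score_start_year2: float, score_end_year2: float) -> bool:
    """Stable radiographs: change of <= 0 in the (modified) Sharp score during year 2."""
    return score_end_year2 - score_start_year2 <= 0


def haq_good_outcome(haq_scores_year2: Sequence[float]) -> bool:
    """Stable and low HAQ: net change <= 0 and every year-2 score <= 0.5."""
    change = haq_scores_year2[-1] - haq_scores_year2[0]
    return change <= 0 and all(score <= 0.5 for score in haq_scores_year2)


def positive_likelihood_ratio(
    in_remission: Sequence[bool], good_outcome: Sequence[bool]
) -> float:
    """P(remission | good outcome) / P(remission | no good outcome).

    This is an assumed reading of the likelihood ratio. Raises ValueError
    when either proportion cannot be computed or the denominator is zero.
    """
    pairs = list(zip(in_remission, good_outcome))
    good = [rem for rem, ok in pairs if ok]
    poor = [rem for rem, ok in pairs if not ok]
    if not good or not poor or sum(poor) == 0:
        raise ValueError("likelihood ratio is undefined for these data")
    return (sum(good) / len(good)) / (sum(poor) / len(poor))


# Hypothetical patients (illustrative values only, not trial data).
remission_at_12_months = [True, True, True, False, False, False, False, False]
good_radiographic_outcome = [True, True, False, True, False, False, False, False]
lr = positive_likelihood_ratio(remission_at_12_months, good_radiographic_outcome)
```

A ratio above 1 would indicate that remission is more common among patients who go on to have the good outcome than among those who do not.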

### Selection of candidate definitions

Candidate definitions of remission were selected from two general categories: first, indices that have been widely used, including the DAS28, the Simplified Disease Activity Index (SDAI), and the Clinical Disease Activity Index (CDAI),18 19 21 32 33 51 52 and second, definitions including one or more core set measures at cut points previously defined by the survey, but requiring all included measures to be at or below that cut point. For example, to meet remission defined as low scores on tender and swollen joint counts and physician and patient global assessments, the patient must have had low scores on all 4 measures. These measures are referred to below as the Boolean measures based on their approach, which is to define each core set measure as in remission or not (values of 0 and 1) and use possible combinations of the patient's core set measure remission status to determine the patient's overall remission status (also 0 or 1).*
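The Boolean approach described above amounts to a logical AND over the included measures. The following Python sketch illustrates it under stated assumptions: the function name, the dictionary keys, and the patient values are hypothetical, and the default cut point of 1 follows the survey thresholds reported in the Results.

```python
from typing import Iterable, Mapping


def boolean_remission(
    measures: Mapping[str, float],
    required: Iterable[str],
    cut_point: float = 1.0,
) -> bool:
    """True only if every required core set measure is at or below the cut point."""
    return all(measures[name] <= cut_point for name in required)


# Hypothetical patient: joint counts are counts, CRP is in mg/dl,
# global assessments are on a 0-10 scale.
patient = {
    "tjc": 1,
    "sjc": 0,
    "crp": 0.4,
    "physician_global": 1.0,
    "patient_global": 3.0,
}

joints_and_globals = ("tjc", "sjc", "physician_global", "patient_global")
joints_and_crp = ("tjc", "sjc", "crp")

# Fails: the patient global assessment (3.0) exceeds the cut point.
meets_four_measure_rule = boolean_remission(patient, joints_and_globals)
# Passes: all three included measures are <= 1.
meets_three_measure_rule = boolean_remission(patient, joints_and_crp)
```

Because a single measure above the cut point rules out remission, the choice of which measures to include determines how stringent a given Boolean definition is.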

### Further evaluations of candidate definitions including assessment of face validity

After completing the analysis of predictive validity, we tested our candidate definitions for face validity. Since we had decided that any definition of remission must be stringent with respect to not allowing much disease activity, we studied whether patients could meet a definition of remission yet still have moderate to high levels of disease activity in any core set measure. To do this, in the group of patients meeting a certain definition of remission, we studied the 90th percentile and maximum level of disease activity observed in each core set measure. Last, we looked at recent trial data to determine what proportion of patients met each remission definition. It was our goal not to have an undetectably low percentage of patients meeting the definition of remission, or one so high as to be unreasonable given clinical experience with these treatments.
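The residual-activity check described above can be sketched as follows. This is an illustrative Python example rather than the analysis actually run: the helper names and patient records are hypothetical, and the nearest-rank percentile is only one of several percentile conventions, chosen here as an assumption.

```python
from typing import Callable, Dict, List, Mapping, Sequence, Tuple

Patient = Mapping[str, float]


def nearest_rank_percentile(values: Sequence[float], percent: int) -> float:
    """Nearest-rank percentile, computed with integer arithmetic."""
    ordered = sorted(values)
    rank = max(1, (percent * len(ordered) + 99) // 100)  # ceil(percent * n / 100)
    return ordered[rank - 1]


def residual_activity(
    patients: Sequence[Patient],
    in_remission: Callable[[Patient], bool],
    measures: Sequence[str],
) -> Dict[str, Tuple[float, float]]:
    """For patients meeting the definition: (90th percentile, maximum) per measure.

    Assumes at least one patient meets the definition.
    """
    remitters: List[Patient] = [p for p in patients if in_remission(p)]
    return {
        m: (
            nearest_rank_percentile([p[m] for p in remitters], 90),
            max(p[m] for p in remitters),
        )
        for m in measures
    }


def tjc_sjc_crp_rule(p: Patient) -> bool:
    """Hypothetical candidate definition: TJC, SJC, and CRP all <= 1."""
    return all(p[m] <= 1 for m in ("tjc", "sjc", "crp"))


# Ten hypothetical remitters with patient global scores 0..9, plus one active patient.
patients = [
    {"tjc": 0, "sjc": 0, "crp": 0.2, "patient_global": g} for g in range(10)
] + [{"tjc": 6, "sjc": 4, "crp": 2.5, "patient_global": 9}]

summary = residual_activity(patients, tjc_sjc_crp_rule, ["patient_global", "sjc"])
```

In this made-up data set, a definition that omits the patient global assessment still admits patients with high symptom scores, which is the kind of finding the face validity check is designed to expose.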

We also examined two related issues for our candidate definitions. First, we wished to select a definition(s) that was reliable, and we determined this by analysing, in one trial with monthly visits, whether a patient whose RA was defined as being in remission at one visit attained the same status at adjacent visits ≤1 month from the first; if the disease was not in remission at the adjacent visit, we assessed whether disease activity remained at minimal levels.29 Second, we were concerned that a 28-joint count might not capture actively involved joints outside these 28; to address this, we reviewed literature and analysed trial data to determine whether we should define remission differently when using 28 versus, for example, 66 joints. For the latter, we evaluated data from a set of trials that included tenderness and swelling counts of individual joints. In these we assessed residual disease activity in ankles and feet in patients with 28-joint counts of ≤1 and determined what proportion of such patients would satisfy the other requirements of our candidate definitions. These patients would represent real misclassification (‘false-positive’ remissions). In the same data set we subsequently investigated whether such misclassification could materially affect the predictive validity of the remission definitions. For this purpose we compared the prevalence of good outcome (damage or function) in patients with ‘true remission’ (ie, based on full joint counts at ≤1) with that in all patients with remission based on 28-joint counts (ie, ‘true’ plus ‘false-positive’ remissions).
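The misclassification analysis for reduced joint counts can be illustrated with a short Python sketch. The field names, the choice of CRP and patient global assessment as the "other requirements", and the patient records are assumptions made for the example, not the trial data.

```python
from typing import Mapping, Sequence

Patient = Mapping[str, float]

# Assumed non-joint requirements of the candidate definition.
OTHER_MEASURES = ("crp", "patient_global")


def in_remission(p: Patient, tjc_key: str, sjc_key: str) -> bool:
    """Boolean rule: the chosen joint counts and the other measures are all <= 1."""
    return all(p[k] <= 1 for k in (tjc_key, sjc_key, *OTHER_MEASURES))


def false_positive_share(patients: Sequence[Patient]) -> float:
    """Share of 28-joint remissions that are not remissions on full joint counts."""
    by_28 = [p for p in patients if in_remission(p, "tjc28", "sjc28")]
    if not by_28:
        raise ValueError("no patient is in remission by 28-joint counts")
    false_pos = [p for p in by_28 if not in_remission(p, "tjc_full", "sjc_full")]
    return len(false_pos) / len(by_28)


# Hypothetical patients (illustrative values only).
patients = [
    # 'True' remission: quiet on both 28-joint and full counts.
    {"tjc28": 0, "sjc28": 0, "tjc_full": 0, "sjc_full": 0, "crp": 0.3, "patient_global": 1},
    # 'False-positive' remission: active ankle/foot joints missed by the 28-joint count.
    {"tjc28": 1, "sjc28": 0, "tjc_full": 3, "sjc_full": 2, "crp": 0.5, "patient_global": 0},
    # Active feet, but excluded anyway by a high patient global assessment.
    {"tjc28": 0, "sjc28": 1, "tjc_full": 4, "sjc_full": 1, "crp": 0.2, "patient_global": 4},
    # Clearly active disease.
    {"tjc28": 5, "sjc28": 3, "tjc_full": 8, "sjc_full": 5, "crp": 1.8, "patient_global": 6},
]

share = false_positive_share(patients)
```

The third record shows why the other requirements matter: a patient with residual activity outside the 28 joints may already fail the definition on the patient global assessment.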

## Results

### Survey

Twenty-seven committee members, including two patients, completed the survey on threshold levels for remission (table 1). In the scenario in which only one variable was available, the responses clustered around core set disease activity levels of 1, such that, for example, the swollen or tender joint count should be 1 or less, the CRP level should be 1 mg/dl or less, and patient and physician global assessments as well as patient pain assessment should be 1 or less on a 10-point scale. The question on which was the highest level of a particular core set measure compatible with remission if all other measures suggested remission yielded more varied answers, with thresholds ranging from 2 for swollen joint count (SJC) and CRP level to 4 for tender joint count (TJC). Since this did not provide us with a single threshold value that was uniform across core set measures, we focused on the more stringent cut points.

Table 1

Threshold levels for remission in the RA core set measures according to the survey of committee members*

### Patient-reported outcomes

We then proceeded with an analysis of clinical trial data on active treatment versus control to help determine whether patient-reported outcomes, namely patient global assessment or patient-reported pain, should be incorporated into our definition of remission. In an analysis of four clinical trials, both logistic regression and CART analysis demonstrated that these measures added important information to physician-linked measures. In other words, in these trials, patient global assessment and patient-reported pain were statistically significant predictors that discriminated between treatments after controlling for physician-reported measures (TJC and SJC) and a laboratory measure (CRP). For example, in the CART analysis, among the four trials, patient global assessment was the best predictor of treatment assignment among all outcomes in one trial and the fourth best of core set measures in another. Patient-reported pain was the second best predictor (SJC was the best) in a third trial.

Based on these preliminary analyses, we developed a list of candidate remission definitions to test for predictive validity. When presented with the more stringent definitions versus the more relaxed definitions, our committee selected those in the more stringent category and as a consequence, we present results only for these. In accordance with the committee's charge and the assessment of the contribution of patient-reported outcomes, we mainly focused on measures that comprised TJC, SJC, CRP level, and patient global assessment. We tested combinations of these and other core set measures to determine if any group of measures would have important advantages.

### Predictive validity

We then tested whether patients whose RA was in remission according to one of these definitions had a higher likelihood of a good outcome. We focused on patients receiving MTX monotherapy, although we obtained similar results (not shown) when we analysed data from all patients. We found that patients whose RA was in remission by several of the Boolean candidate definitions, as well as by the traditional SDAI definition (≤3.3) and CDAI definition (≤2.8), had an increased likelihood of radiographic stability during the subsequent year (table 2). However, this was not the case for the DAS28 definition, either at the traditional cut point (<2.6) or at a more stringent cut point (<2.0). In contrast, being in remission by any of the definitions increased the likelihood of stability on HAQ scores, without important differences between definitions (data not shown). When we defined good outcome as the combination of radiographic and HAQ stability, we again found that being in remission by any of the candidate definitions increased the likelihood of a good outcome (table 3). As expected, the performance of the DAS28 at either cutoff was not as good as that of the other definitions. Similar data were also obtained in an additional data set from the COBRA (Combinatietherapie Bij Reumatoïde Artritis) study42 (data not shown). However, reaching remission according to the DAS28, both at the traditional cut point (<2.6) and at a more stringent cut point (<2.0), was associated only with the likelihood of HAQ stability, and not radiographic stability. Candidate definitions of remission did not differ in their prediction of HAQ stability (data not shown). Additional definitions were tested, including incorporating either remission level pain or patient global assessment and other variations, and results were similar. Apart from the DAS28 result, the analyses did not help to distinguish between definitions. This was also the case in the analysis using a stricter definition of good outcome, and when we studied only patients with a poor prognosis (data not shown).

Table 2

Predictive validity of candidate remission definitions for good outcome in radiographic damage*

Table 3

Predictive validity of candidate remission definitions for good outcome in both radiographic damage and HAQ*

### Face validity

Face validity of the different candidate definitions, expressed as residual disease activity in the presence of remission, is shown in table 4. For the Boolean definitions, the high values, as expected, tended to be for core set measures that were not prespecified by the rule. For example, if we used the definition of TJC, SJC, CRP, and pain all ≤1, we found that 10% of patients (90th percentile) had physician and patient global assessment scores compatible with active disease. If we used TJC, SJC, and CRP all ≤1, then the patient-reported outcomes often suggested high levels of symptoms. For the traditional DAS28 definition (<2.6), we found that many of the core set measures remained at levels that would be incompatible with remission. This was even the case for DAS28 <2.0, which was a threshold few patients reached. It was not the case for other index measures that defined remission, such as the SDAI or CDAI, where results were closely aligned with the Boolean definitions and the results of our survey.

Table 4

Face validity expressed as residual disease activity in the presence of remission*

When we examined the proportion of patients in trials who met candidate definitions of remission (table 5), we felt that the high prevalence of remission according to the current DAS28 definition lacked face validity. Otherwise, 18–26% of patients receiving combination therapy with TNF inhibitors and MTX met most of these definitions, compared to only 5–10% of those receiving either monotherapy. We believe these percentages reflect face validity.

Table 5

Face validity expressed as the prevalence of remission (%) in recent trials of patients with rheumatoid arthritis*

### Consensus activity

The committee met prior to the ACR Annual Scientific Meeting in October 2009 to discuss the analyses described above. As noted, the committee did not select, in any case, a more relaxed definition of remission, consistent with its earlier directive. During the committee meeting two subgroups were formed to discuss the tabular results presented, especially including results regarding predictive validity. Both groups voted that there should be both a Boolean approach and an index-based definition. One group voted among individual definitions of remission, and in doing so, the highest vote was received for the Boolean definition that included TJC, SJC, CRP, and patient global assessment, all at levels ≤1. The index definition with the highest vote count was SDAI ≤3.3. In the other subgroup, after a discussion involving all study group members, the same conclusion was reached without a formal vote.

Members of this subgroup noted that in the clinical setting an acute-phase response measure is often not available at every visit and the subgroup suggested that a definition of remission be developed for clinic-based practice that would not require an acute-phase reactant, as long as it would capture remission as stringently as the measure used for clinical trials. Indeed, a Boolean measure comprising TJC, SJC, and patient global assessment provided statistical results similar to those obtained with the same measures encompassing CRP and those obtained with the CDAI, which does not include CRP (table 2). Thus, these definitions of remission may be used in clinical practice until better measures for that purpose become available.
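The index-based thresholds and the CRP-free Boolean rule discussed above can be compared in a short Python sketch. The SDAI and CDAI sums used here follow the published index definitions (the sum of the 28-joint tender and swollen counts, the patient and physician global assessments on a 0–10 scale, and, for the SDAI only, CRP in mg/dl); they are not spelled out in this text, and the class, function names, and assessment values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    tjc28: int               # tender joint count, 28 joints
    sjc28: int               # swollen joint count, 28 joints
    patient_global: float    # 0-10 scale
    physician_global: float  # 0-10 scale
    crp_mg_dl: float         # C reactive protein in mg/dl


def sdai(a: Assessment) -> float:
    """Simplified Disease Activity Index: simple sum including CRP."""
    return a.tjc28 + a.sjc28 + a.patient_global + a.physician_global + a.crp_mg_dl


def cdai(a: Assessment) -> float:
    """Clinical Disease Activity Index: the same sum without CRP."""
    return a.tjc28 + a.sjc28 + a.patient_global + a.physician_global


def sdai_remission(a: Assessment) -> bool:
    return sdai(a) <= 3.3


def cdai_remission(a: Assessment) -> bool:
    return cdai(a) <= 2.8


def clinic_boolean_remission(a: Assessment) -> bool:
    """CRP-free Boolean rule: TJC, SJC, and patient global assessment all <= 1."""
    return a.tjc28 <= 1 and a.sjc28 <= 1 and a.patient_global <= 1


# Hypothetical assessments (illustrative values only).
quiet = Assessment(tjc28=1, sjc28=0, patient_global=1.0, physician_global=0.5, crp_mg_dl=0.6)
two_swollen = Assessment(tjc28=0, sjc28=2, patient_global=0.5, physician_global=0.5, crp_mg_dl=0.2)
```

In this made-up example the first assessment satisfies all three rules, whereas the second, with two swollen joints, satisfies only the SDAI threshold, so the definitions can disagree near their cut points.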

In a trial with monthly visits we found that our selected definitions of remission showed good reliability. Specifically, among patients whose RA was in remission at one time point, the disease remained in remission 1 month later in 66%, and all the rest met criteria for minimal disease activity.29
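One way to compute this kind of visit-to-visit persistence is sketched below in Python. The unit of analysis (every in-remission visit that has a following monthly visit) and the input layout are assumptions for illustration; the text does not specify how the 66% figure was derived.

```python
from typing import Sequence


def one_month_persistence(monthly_status: Sequence[Sequence[bool]]) -> float:
    """Fraction of in-remission visits followed by remission at the next monthly visit."""
    followed = 0
    persisted = 0
    for visits in monthly_status:
        for current, following in zip(visits, visits[1:]):
            if current:
                followed += 1
                persisted += int(following)
    if followed == 0:
        raise ValueError("no remission visit has a follow-up visit")
    return persisted / followed


# Hypothetical remission status per patient, one entry per monthly visit.
statuses = [
    [True, True, False],
    [True, False, False],
    [False, True, True],
]
persistence = one_month_persistence(statuses)
```

A companion check, not shown here, would test whether the visits that fall out of remission still meet criteria for minimal disease activity.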

### Joint counts

We consulted published literature and our own data analysis to determine if remission thresholds for 28-joint counts should be the same as thresholds for counts with more joints assessed (such as 66 or 68 joints). One study53 examined whether adding ankles and metatarsophalangeal joints to the 28-joint count affected remission and showed that <10% of patients with no tender or swollen joints in a 28-joint count had tender or swollen ankles or metatarsophalangeal joints and that the average patient global assessment score in these latter patients was significantly higher, suggesting that they would not meet proposed definitions of remission. Landewé and colleagues24 also noted that defining a patient's disease as being in remission using a 28-joint count often concealed active joints elsewhere, especially in the feet and ankles. However, they also reported that global assessments for patients who had 28 joints in remission but actively involved joints elsewhere resembled those for patients whose disease was not in remission based on a 28-joint count, suggesting that requiring a low patient global assessment score will, to some extent, mitigate the limitation of using a 28-joint count.

In the two trials in our data set that included counts of individual joints with tenderness and swelling, remission prevalence using 66 and 68 joints was 4% and 9%, respectively. As in the studies cited above, we found that patients with 28-joint counts ≤1 often had residual tenderness or swelling in the ankles or feet. However, most of these patients did not satisfy the other requirements of our candidate definitions of remission. Nevertheless, the estimates of remission prevalence increased to 6% and 14%, respectively, of the total population when 28-joint counts were used. In another data set from two trials with 2-year follow-up data, we compared patients whose RA was in remission according to full (66/68) joint counts versus those whose RA was in remission according to only 28-joint counts (ie, with residual disease activity in joints not assessed). Among the patients with ‘full joint count remission,’ 80% had good outcomes in terms of radiographic damage (no change in Sharp score); this number decreased by 1% among patients with only ‘28-joint count remission.’ Likewise, among the patients with ‘full joint count remission,’ 90% had good HAQ outcomes; this number decreased by 1% and 4%, respectively, in the 2 trials, among patients with only ‘28-joint count remission.’ Based on these analyses we concluded that the overall impact of misclassification due to reduced joint counts is small.
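The comparison between full and 28-joint counts can be illustrated with a minimal sketch. It covers only the joint-count component of remission, not the complete definition. The joint names (such as `ankle_L` or `mtp2_R`), the function names, and the status labels are hypothetical and chosen for illustration; the threshold of ≤1 is taken from the 28-joint counts discussed above, and "full" simply means every joint that was examined and recorded.

```python
from itertools import product

# Sites in the 28-joint count, each assessed on both sides.
# "pip1" stands for the thumb interphalangeal joint (hypothetical naming).
_SITES_28 = (
    ["shoulder", "elbow", "wrist", "knee"]
    + [f"mcp{i}" for i in range(1, 6)]
    + [f"pip{i}" for i in range(1, 6)]
)
JOINTS_28 = frozenset(
    f"{site}_{side}" for site, side in product(_SITES_28, ("L", "R"))
)


def joint_counts(active_joints):
    """Return (28-joint count, full count) for a collection of active joint names."""
    active = set(active_joints)
    return len(active & JOINTS_28), len(active)


def joint_count_status(tender, swollen, threshold=1):
    """Classify the joint-count component only (not full remission).

    "full"    : tender and swollen counts are within threshold on all joints
    "28-only" : within threshold on the 28 joints, but not on all joints
    "none"    : above threshold even on the 28-joint count
    """
    t28, t_full = joint_counts(tender)
    s28, s_full = joint_counts(swollen)
    if t_full <= threshold and s_full <= threshold:
        return "full"
    if t28 <= threshold and s28 <= threshold:
        return "28-only"
    return "none"
```

For example, a patient with tenderness only in `ankle_L` and `mtp2_R` would be labelled `"28-only"`: the 28-joint count is 0, but two joints outside the 28 are active.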

The final recommended definitions of remission are presented in table 6. Specific suggestions on how to measure components of the definitions are also provided.

Table 6

American College of Rheumatology/European League Against Rheumatism definitions of remission in rheumatoid arthritis clinical trials*

## Discussion

Based on considerations of face and predictive validity, the need for stringency, and the need to include patient-reported outcomes, the ACR/EULAR committee charged with defining remission in RA has produced two definitions for evaluating remission in clinical trials. One is a Boolean-based definition, more categorical in structure than the traditional definition from Pinals et al,16 and the other is based on a composite index of RA activity, the SDAI.19 51

Ideally, we would have liked to select a candidate definition that clearly differentiated patients whose long-term course was without disease progression versus those whose disease continued to progress. Our analysis of long-term data confirmed the findings of our systematic review30 that most definitions of remission did well; that is, patients whose RA was in remission at any point during a clinical trial, based on any of the definitions we used, were likely to have long-term courses that were better than those of patients who did not meet the definition of remission. One exception was the DAS28, which has been shown previously to allow for significant residual disease activity.19 22 23 54 55 Thus, except for DAS28-based definitions, differences in predictive validity between candidate definitions were small (see tables 2 and 3), and it was difficult to differentiate the course of patients meeting any of these definitions of remission. Among the many definitions tested, none importantly exceeded the ability of the ultimately selected criteria to predict favorable long-term effects on radiographic progression and physical function. Although we can confirm the predictive validity of remission, the goal of the work was to define remission, not to develop a predictive marker.

In our data sets we assessed definitions of remission by 28-joint counts. When we examined more comprehensive counts among patients with disease remission in the 28 joints, we found that residual disease activity was frequently present in ankles and feet. However, most of these patients failed to meet other criteria in the remission definition (eg, their patient global assessments were often high). In other words, even when joints other than the 28 joints counted were swollen or tender, other measures of disease activity often prevented misclassification of these patients as having disease in remission. In addition, the impact of misclassification on long-term outcome proved to be small. We should also bear in mind that the assessment of ankles and forefeet is particularly limited and poorly reproducible.56 In line with this, the discordance between tenderness and swelling has proved to be greater in the joints of the feet than in other joints.57 Therefore, we do not require inclusion of ankles and forefeet in the assessment of remission but recommend that these joints are also included in the examination. Investigators should always report which joints were examined.

In 2008, EULAR and the ACR recommended that in each RA trial, the percentage of patients achieving a low disease activity state and remission should be reported.32 33 On the basis of the present analyses and consensus, we suggest that remission based on one of the definitions recommended here be reported as a preselected outcome measure in trials, and that results for both be included in trial reports. Of the approaches to defining low disease activity, the OMERACT definitions of ‘minimal disease activity,’ designed to reflect the ‘next best’ option apart from remission, have been the best vetted and were consensually developed.29

There are a few limitations to our approach, and possibly to the definitions produced as a consequence. First, we used a HAQ score of ≤0.5 as evidence for stability of the remission criteria; while this is a disability score that is essentially above values obtained in the general population,58 many of the studies evaluated were of patients with longstanding disease who are known to accumulate significant irreversible disability.59 However, we accounted for this potential contrast to the normal situation by also requiring that HAQ scores did not deteriorate at all over a full 1-year period (HAQ change ≤0 during the second year of observation).
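As a minimal sketch of the HAQ outcome described above, the check below combines the two conditions: a HAQ score of ≤0.5 and no deterioration over the second year (HAQ change ≤0). The function name and the choice of the month 12 and month 24 visits as the two time points are assumptions made for illustration; the text does not specify at which visits the ≤0.5 requirement was applied.

```python
def good_haq_outcome(haq_month12, haq_month24, max_score=0.5):
    """Return True if HAQ stays at or below max_score and does not worsen.

    Assumption: the score threshold is applied at both the start (month 12)
    and the end (month 24) of the second year of observation.
    """
    change = haq_month24 - haq_month12  # positive change means more disability
    within_limit = haq_month12 <= max_score and haq_month24 <= max_score
    return within_limit and change <= 0
```

A patient whose HAQ moves from 0.25 to 0.5 stays within the score limit but fails the check, because the score deteriorated during the year.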

Second, we have not yet validated the recommended definitions of remission in observational data sets. This is the next step in our work. In developing definitions, we anticipated clinic-based evaluations, trying to choose definitions of remission that would be easy to apply in an observational context and take advantage of variables that are probably already being measured. In clinical practice, data on acute-phase reactants are frequently not immediately available, and therefore, an additional pair of definitions, one Boolean and one index-based, that do not require acute-phase reactants is provided for that setting. Nevertheless, our preliminary suggestions for defining remission in clinical practice are still incomplete, as we did not test them in a clinic-based setting. While the remission definitions not requiring an acute-phase reactant performed comparably with those that do require this parameter, the committee believes that including an acute-phase reactant for reporting remission in clinical trials is preferable because acute-phase reactants are important predictors of later radiographic damage.60–62

Another limitation of the proposed definition is that the patient experience of remission may not have been adequately captured with only one element, the patient's global assessment of his or her disease activity. Indeed, an index based on patient measures alone may clinically discriminate between active and control treatment as well as do some of the indexes tested in this effort.63 64 However, the committee had stipulated that joint counts should be part of the remission criteria; moreover, joints are the ‘organ’ involved in RA, and in the context of assessing remission it was deemed advisable to assess that organ.

Further, fatigue was not evaluated.65 However, fatigue was not assessed in most trials published over the last decade, including those used here for the derivation of the remission criteria. We were also unable to procure data sets that contained information on other non-core set measures. As these data sets are likely to become available only over the course of several years, we decided not to postpone the development of the new remission definition. Indeed, we believe it was important to spend more time developing the concept of patient-assessed ‘absence of disease.’ This will require qualitative research involving focus groups, as well as quantitative research, for example, collection of patient-related outcome data in clinical trials, a task that will be taken forward within the OMERACT framework. Once a working definition of this concept is available, it can be compared with the proposed definition of remission.

Yet another theoretical limitation is that imaging results are not included in our definition of remission. Our goal was to use clinical parameters that are widely used and convenient to assess, but we recognise that residual synovitis may exist in many patients whose disease appears inactive based on conventional clinical evaluation.55 66 67 Importantly, however, our definitions of remission were associated with a retardation of radiographic progression, suggesting that the clinical definition has biologic meaning. Moreover, findings of a recent sonographic analysis of RA patients whose disease was in remission as defined by different means55 were in accordance with the present results. Thus, while our definitions permit a tender or swollen joint to be present, we require multiple pieces of evidence of inactive disease (1 or no tender or swollen joint, low acute-phase reactant level, and assessment by the patient that the disease is inactive) before a patient meets remission criteria. Since inactive disease may be accompanied by 1 residual swollen or tender joint and since the reliability of the examination diminishes with the number of joints with active disease, this procedure enhances the sensitivity of our definition of remission.
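The requirement for multiple pieces of evidence can be sketched as a Boolean AND over the individual measures. This is an illustration under stated assumptions, not the committee's implementation: the function and parameter names are hypothetical, the cutoff of 1 for every measure follows the article's stated preference for a single value across measures, and the patient global assessment is assumed to be recorded on a 0–10 scale. Passing no CRP value gives the clinic-based variant that omits the acute-phase reactant.

```python
from typing import Optional


def boolean_remission(
    tjc28: int,
    sjc28: int,
    patient_global: float,
    crp_mg_dl: Optional[float] = None,
) -> bool:
    """Return True only if every supplied measure is in the remission range.

    Assumptions: patient_global is on a 0-10 scale, CRP is in mg/dl,
    and the same cutoff of 1 applies to every measure.
    """
    in_range = [tjc28 <= 1, sjc28 <= 1, patient_global <= 1]
    if crp_mg_dl is not None:
        # Trial variant: the acute-phase reactant must also be low.
        in_range.append(crp_mg_dl <= 1)
    # Boolean AND: a single measure out of range means "not in remission".
    return all(in_range)
```

Because the measures are combined with AND, a patient with one tender joint, no swollen joints, and a low CRP level still fails the definition if the patient global assessment is 2.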

We should note that the trial data sets we tested included CRP more frequently than erythrocyte sedimentation rate (ESR), explaining why our definitions present thresholds for CRP. A similar ESR threshold for inactive disease might be <20 mm/h for men and <30 mm/h for women, or even lower, but this may require further testing.68 Our preference for CRP is in part because it can be standardised across centers, making it the preferred acute-phase reactant measure in multicenter trials. Also, while CRP levels may have different upper limits of normal in different laboratories, the test is widely standardised today, and a value of 1 mg/dl covers all of these upper limits; at 1 mg/dl or less the progression of joint damage is minimised.60 61 Given these findings, the practicality of using the same value, that is, 1, for all measures was deemed more important than searching for potential minimal differences between cut points of 1 mg/dl or slightly less.
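The acute-phase thresholds in this paragraph can be written as a short check. This is a sketch under assumptions: the function names and the sex labels are hypothetical, the unit conversion relies on 1 mg/dl being equal to 10 mg/l, and the ESR limits are the tentative values mentioned above, which the text notes may require further testing.

```python
def crp_to_mg_dl(value, unit="mg/dl"):
    """Convert a CRP value to mg/dl (1 mg/dl equals 10 mg/l)."""
    unit = unit.lower()
    if unit == "mg/dl":
        return value
    if unit == "mg/l":
        return value / 10.0
    raise ValueError(f"unsupported CRP unit: {unit}")


def crp_in_remission_range(value, unit="mg/dl"):
    """CRP of 1 mg/dl or less counts as being in the remission range."""
    return crp_to_mg_dl(value, unit) <= 1.0


# Tentative ESR limits in mm/h from the text; strict inequality is used.
ESR_LIMITS_MM_H = {"male": 20, "female": 30}


def esr_in_remission_range(esr_mm_h, sex):
    """Return True if ESR is below the tentative sex-specific limit."""
    return esr_mm_h < ESR_LIMITS_MM_H[sex]
```

Handling the unit explicitly matters in practice, since a laboratory value of 8 mg/l corresponds to 0.8 mg/dl and falls within the range, whereas reading it as 8 mg/dl would not.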

A ‘treat to target’ approach may yield better outcomes than a conventional approach to therapy for RA, and remission can serve as that target for some patients. However, remission according to the stringent definition presented here may not yet be a realistic goal for most patients.10

In conclusion, we present new definitions of remission for use as outcome measures in RA clinical trials: either the compilation of 4 individual measures or an index-based alternative. We hope that these new definitions will be adopted widely and can provide a uniform approach to assessing this increasingly important outcome.

## Acknowledgments

The authors are grateful to several colleagues who participated in the committee discussions but declined to be named as coauthors on this report, and to Abbott, Amgen, and Wyeth for sharing clinical trial data.

## Footnotes

• * Columbia University, New York, New York, USA

• Genentech, South San Francisco, California

• DTF and JSS contributed equally to this work.

• This criteria set has been approved by the American College of Rheumatology (ACR) Board of Directors and the European League Against Rheumatism (EULAR) Executive Committee as provisional. This signifies that the criteria set has been quantitatively validated using patient data, but it has not undergone validation based on an external data set. All ACR/EULAR-approved criteria sets are expected to undergo intermittent updates.

• The American College of Rheumatology is an independent, professional, medical and scientific society which does not guarantee, warrant, or endorse any commercial product or service.

• The views presented in this article do not necessarily reflect those of the United States Food and Drug Administration.

• * Boolean measure is the logic that computers use to determine if a statement is true or false. There are 4 main Boolean operators: AND, NOT, OR, and XOR (exclusive OR). Below is an example, from defining remission, of how one operator works:

Assume that x and y are both core set variables for RA, each of which is true when its value is in the range of remission. x AND y returns True if both x and y are true; otherwise the expression returns False. False means that the patient's RA is NOT in remission.

• Funding The American College of Rheumatology, the European League Against Rheumatism, and the NIH (grant AR-47785). Dr. Smolen has received consulting fees, speaking fees, and/or honoraria from Amgen, Abbott, Centocor, Schering-Plough, Wyeth, Bristol-Myers Squibb, Roche, UCB, and AstraZeneca (less than $10 000 each); he has received grants from Schering-Plough, Roche, UCB, Bristol-Myers Squibb, and Abbott. Dr. Aletaha has received consulting fees, speaking fees, and/or honoraria from Abbott, Roche, UCB, Bristol-Myers Squibb, Schering-Plough, and Wyeth (less than $10 000 each). Dr. Allaart has received consulting fees, speaking fees, and/or honoraria from Schering-Plough, Centocor, and UCB (less than $10 000 each). Dr. Bathon has received consulting fees, speaking fees, and/or honoraria from Crescendo Biosciences and Roche (less than $10 000 each); she has received research contracts from Biogen Idec. Dr. Brown has received speaking fees and/or honoraria from Schering-Plough, Abbott, and Pfizer (less than $10 000 each). Dr. Matucci-Cerinic has provided paid consultation to Pfizer, Actelion, and Schering (less than $10 000 each). Dr. Choi has received honoraria for serving on advisory boards for TAP Pharmaceuticals and Savient (less than $10 000 each). Dr. Furst has received consulting fees, speaking fees, and/or honoraria from Abbott, Actelion, Amgen, Bristol-Myers Squibb, Biogen Idec, Centocor, Genentech, Gilead, GlaxoSmithKline, Merck, Nitec, Novartis, UCB, Wyeth, and Xoma (less than $10 000 each); he has received research funding from Abbott, Actelion, Amgen, Bristol-Myers Squibb, Genentech, Gilead, GlaxoSmithKline, Nitec, Novartis, Roche, UCB, Wyeth, and Xoma. Dr. Gomez-Reino has received consulting fees, speaking fees, and/or honoraria from Schering-Plough, Bristol-Myers Squibb, Wyeth, Roche, and UCB (less than $10 000 each). Dr. Khanna has received consulting fees, speaking fees, and/or honoraria from Abbott and UCB (less than $10 000 each). Dr. Landewé has received consulting fees, speaking fees, and/or honoraria from Abbott, Centocor, Schering-Plough, Wyeth, Pfizer, UCB, Merck, Bristol-Myers Squibb, and Amgen (less than $10 000 each). Dr. Martin-Mola has received consulting fees from Merck, Sharp, and Dohme, Pfizer, and Roche (less than $10 000 each). Dr. Pincus has received consulting fees, speaking fees, and/or honoraria from Amgen, Abbott, Bristol-Myers Squibb, Centocor, UCB, Wyeth, and Genentech (less than $10 000 each) and research grants from Amgen, Bristol-Myers Squibb, UCB, and Centocor. Dr. Simon has received consulting fees from Affinergy, AstraZeneca, Abraxis, Alpha Rx, Nuvo/Dimethaid Research, Roche, Pfizer, Novartis, PLx Pharma, Hisamitsu, Dr Reddys, Avanir, Cerimon, Alimera, Paraexel, Nitec, Bayer, Rigel, Chelsea, Regeneron, Cypress Biosciences, Nicox, Biocryst, Extera, Wyeth, Solace, Puretechventures, White Mountain Pharma, Abbott, Omeros, Jazz, Takeda, Teva, Zydus, Proprius, Alder, Cephalon, Seprecor, Purdue, EMD Merck Serono, Altea, Talagen, TiGenix, Antigenics, Forest, Genzyme, CaloSyn, King, Pozen, IL Pharma, Analgesic Solutions, and US WorldMeds (less than $10 000 each) and from Savient and Horizon (more than $10 000 each); owns stock or stock options in Savient; and has provided paid consultation to Leerink Swann, Luxor, Nomura, and Fidelity, investment analysis firms. Dr. Sokka has received consulting fees, speaking fees, and/or honoraria from Abbott, Pfizer, and UCB (less than $10 000 each). Dr. Tugwell has received consulting fees from Bristol-Myers Squibb, Chelsea, and UCB (more than $10 000 each); he has received research grants from Aventis (HMR), Biomatrix, Cigna, Genzyme, Merck, Novartis, Parke Davis, Pfizer, Rhone-Poulenc, Sandoz, and SmithKline Beecham. Dr. Zink has received speaking fees from Abbott, Bristol-Myers Squibb, Roche, Wyeth, and UCB (less than $10 000 each).

• Provenance and peer review Not commissioned; externally peer reviewed.
