Objective: To make recommendations on how to report disease activity in clinical trials of rheumatoid arthritis (RA) endorsed by the European League Against Rheumatism (EULAR) and the American College of Rheumatology (ACR).
Methods: The project followed the EULAR standardised operating procedures, which use a three-step approach: (1) expert-based definition of relevant research questions (November 2006); (2) systematic literature search (November 2006 to May 2007); and (3) expert consensus on recommendations based on the literature search results (May 2007). In addition, since this is the first joint EULAR/ACR publication on recommendations, an extra step included a meeting with an ACR panel to approve the recommendations elaborated by the expert group (August 2007).
Results: Eleven relevant questions were identified for the literature search. Based on the evidence from the literature the expert panel recommended that each trial should report the following items: (1) disease activity response and disease activity states; (2) appropriate descriptive statistics of the baseline, the endpoints and change of the single variables included in the core set; (3) baseline disease activity levels (in general); (4) the percentage of patients achieving a low disease activity state and remission; (5) time to onset of the primary outcome; (6) sustainability of the primary outcome; (7) fatigue.
Conclusions: These recommendations endorsed by EULAR and ACR will help harmonise the presentations of results from clinical trials. Adherence to these recommendations will provide the readership of clinical trials with more details of important outcomes, while the higher level of homogeneity may facilitate the comparison of outcomes across different trials and pooling of trial results, such as in meta-analyses.
Patients with rheumatoid arthritis (RA) differ in the predominance of individual clinical manifestations. This phenotypic heterogeneity of the disease makes the evaluation of disease activity in RA complex and prone to misrepresentation if individual domains are used to evaluate disease activity.1 2 Therefore, over a decade ago, core sets of individual measures were defined that should be reported in clinical trials of RA.2 3 Subsequently, criteria for response to therapy have been defined and used for clinical trial reporting since the late 1990s; the American College of Rheumatology (ACR) criteria were based on relative changes in core set variables,4 while the criteria of the European League Against Rheumatism (EULAR) relied on an absolute change of a composite Disease Activity Score (DAS) and the attainment of a particular disease activity state.5 In addition, regulatory authorities have developed guidance documents for RA clinical trials.6 7
Since then, many clinical trials have been published. When evaluating the respective studies, it is surprising how heterogeneous reports on results of RA clinical trials have been, eg, while ACR response rates were regularly presented, composite indices reflecting disease activity were shown to a varying degree. Likewise, data on baseline and treatment associated changes in core set variables were often shown incompletely and disease activity levels and states during the trial and at trial endpoint were rarely reported in detail, although such information would be useful for patients and clinicians and necessary for the performance of meta-analyses. Moreover, the field of RA outcomes research has moved forward. New insights into the relationship between disease activity, joint damage and disability have been put forward,8–10 new indices or criteria have been proposed following patients in practice and/or trials11–13 and an expansion of patient-reported outcomes (PRO) has been suggested.14 15 Finally, new and highly effective recently approved drugs, such as inhibitors of tumour necrosis factor α (TNFα) and other biological agents in combination with methotrexate, have challenged outcomes assessment by their relatively rapid onset of action, their profound efficacy and other aspects related to their clinical effectiveness.16 These developments suggested that the recommendations for trial reporting should be revisited and renewed.
The aim of the current work was to derive evidence-based recommendations on key issues related to disease activity evaluation in RA, and to enable a more uniform presentation of clinical trial results. Although many of the recommendations on how to report trial results will require respective considerations a priori, this document will not deal with principal issues of trial design, such as inclusion criteria. The present recommendations are also distinct from previous publications as they represent a combined document from two international professional societies, EULAR and ACR. Here, we provide the recommendations derived in the course of this process, which are now published simultaneously in the Annals of the Rheumatic Diseases and in Arthritis & Rheumatism (Arthritis Care & Research).17 The detailed methodology of the project, ie, the evidence-based and consensus-based approach to elaborate the recommendations, is described in a separate paper.18
The general approach to this project followed the EULAR standardised operating procedures for the elaboration and implementation of evidence-based recommendations19 and was supplemented by an ACR expert panel of RA trialists. The process started with the formation of an expert panel that comprised the convenor (DA), a clinical epidemiologist (RL), a research fellow (TK) and 20 experts, 3 of them from North America, who had been invited on the basis of their expertise in the development of outcome measures and/or clinical trials. In addition, a patient representative (PR), nominated by the patient members of the EULAR Executive Committee, was part of the panel. Over the course of the first meeting, the target population for these recommendations was defined as the consumers of scientific literature as well as researchers and journal editors; 11 key questions related to the topic were identified by the experts using a modified Delphi technique (presented in detail in the accompanying publication).18 This was followed by the main element of the process: a systematic literature search on the key elements, which was performed by the research fellow. The results were fed back to the panel, and suggestions and comments by the experts complemented the results of the systematic literature search. At a second meeting, the results of the systematic review served as the basis for the formulation of a consensus statement on specific recommendations. The methodology for derivation of the questions, the detailed search algorithm and the retrievals are described elsewhere (see the accompanying publication).18 All retrieved studies were then categorised by grade of evidence and their main results were extracted.
At the second panel meeting, based on the availability and strength of the presented evidence for each of the initial research questions, the expert panel developed statements (“items”) that were categorised into three groups:
Items for the final recommendations. These items were preliminary recommendations for the final statement. After the assembly of these initial recommendations, items were ordered by logical sequence and reworded as needed.
Items to be considered as a research agenda. Complementary to the recommendations, the research agenda comprised items that were deemed important by the experts, but for which there is insufficient evidence in the literature or for which published information is controversial.
Text items for the recommendations manuscript. These elements have been regarded as important to be mentioned or discussed in the final manuscript. They are mostly of explanatory nature but do not necessarily constitute bullet points of recommendations.
Evidence and draft recommendations were then shared with the ACR (whose representatives had participated in the EULAR expert panel). The ACR panel on outcome measures in RA trials included experts in trial design and outcome measurements who met with representatives (DA, JSS) from the EULAR panel to review evidence, discuss additional evidence, evaluate specifics of the draft recommendations and modify these as needed. The result is the final set of recommendations presented here. Given the desire of both organisations to reach consensus on one document, this final set was presented to the Executive Committee of the EULAR and to the ACR Board of Directors and approved by both.
Specific references obtained from the literature search are provided separately in the context of results related to the individual questions.18 Here we focus on the presentation and discussion of the content of the individual recommendations (table 1).
Using a consensus approach, the EULAR/ACR task force formulated seven recommendations for reporting disease activity in clinical trials in RA. Since these recommendations are directed to reporting of trial results, all items start with the statement “Each trial should report…”. Since the focus of the present process is to make recommendations for reporting disease activity in RA trials and not reporting outcomes in general, the following preamble was added for clarification:
“There are several domains that are important in reporting clinical trials: disease activity, physical function and damage. For each of these domains state and response should be assessed and reported, where appropriate. However, the present recommendations will deal specifically with disease activity.”
It is important to mention that issues of principal study design (eg, inclusion criteria) and issues of detailed methodology for analysis were considered to be highly important by the expert panel, but are not within the purview of this document.
1. Each trial should report disease activity states and response
The essence of this point is the importance that a measure of response and a measure of a state be presented in the results of a clinical trial, while any primary outcome (response or state) needs to be defined in advance.
“Response” is defined as a change score in a continuous variable with cutpoints for various response levels, or predefined responder criteria, such as the EULAR response criteria (good response, moderate response, no response), as well as the ACR response criteria (ACR20, ACR50, ACR70, for 20%, 50% and 70% improvement as per ACR criteria, respectively). The ACR Hybrid measure could replace the ACR20/50/70 once successful prospective validation in clinical trials has been achieved.
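As an illustration of predefined responder criteria, the following minimal Python sketch assigns an ACR response level to a single patient. It assumes the commonly published form of the ACR criteria, which is not spelled out in this text: at least the threshold percentage improvement in both tender and swollen joint counts, plus the same improvement in at least three of the five remaining core set measures. The dictionary keys and function names are hypothetical, and all measures are assumed to be scaled so that lower values indicate less disease activity.

```python
from typing import Dict

# Hypothetical key names for the core set measures (assumption, not from the text).
JOINT_COUNTS = ("tender_joints", "swollen_joints")
OTHER_MEASURES = ("pain", "patient_global", "physician_global",
                  "function", "acute_phase")


def percent_improvement(baseline: float, endpoint: float) -> float:
    """Relative improvement in percent; positive values mean improvement."""
    if baseline <= 0:
        # A relative change cannot be expressed against a zero baseline;
        # this sketch treats such a measure as not improved.
        return 0.0
    return 100.0 * (baseline - endpoint) / baseline


def meets_acr(baseline: Dict[str, float], endpoint: Dict[str, float],
              threshold: float) -> bool:
    """Assumed ACR rule: both joint counts plus at least 3 of 5 other measures."""
    joints_ok = all(
        percent_improvement(baseline[k], endpoint[k]) >= threshold
        for k in JOINT_COUNTS
    )
    n_other = sum(
        percent_improvement(baseline[k], endpoint[k]) >= threshold
        for k in OTHER_MEASURES
    )
    return joints_ok and n_other >= 3


def acr_level(baseline: Dict[str, float], endpoint: Dict[str, float]) -> str:
    """Return the highest ACR level reached, checking 70, then 50, then 20."""
    for threshold in (70, 50, 20):
        if meets_acr(baseline, endpoint, threshold):
            return f"ACR{threshold}"
    return "non-responder"
```

Checking the thresholds from highest to lowest means that each patient is assigned exactly one level, which keeps the reported ACR20, ACR50 and ACR70 rates easy to derive as cumulative counts.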
“State” is defined as a measurable, cross-sectional level of disease activity. Typical disease activity states are remission, low, moderate and high disease activity. Continuous composite measures of disease activity can be used as state measures, if cut-points are applied. States based on the following measures could be presented: the DAS based on 28 joint counts (DAS28), the Clinical Disease Activity Index (CDAI), the Simplified Disease Activity Index (SDAI) and the DAS. For all composite indices the appropriate descriptive statistics of the baseline, endpoint and change scores should be reported (including for each component of these indices; see below).
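The application of cutpoints to a continuous composite measure can be sketched as follows, using the CDAI as an example. The formula (sum of 28 tender joint count, 28 swollen joint count, and patient and physician global assessments on 0 to 10 scales) and the cutpoints (2.8, 10 and 22) are the commonly cited published values and are assumptions here, since this document does not state them; they should be verified against the original index publications before use. The function names are hypothetical.

```python
# Assumed CDAI cutpoints (upper bounds, inclusive); not stated in this document.
CDAI_CUTPOINTS = (
    (2.8, "remission"),
    (10.0, "low"),
    (22.0, "moderate"),
)


def cdai(tender28: int, swollen28: int,
         patient_global: float, physician_global: float) -> float:
    """Assumed CDAI formula: simple sum of four components (globals on 0-10)."""
    return tender28 + swollen28 + patient_global + physician_global


def cdai_state(score: float) -> str:
    """Map a CDAI score to a disease activity state using the cutpoints above."""
    for upper_bound, state in CDAI_CUTPOINTS:
        if score <= upper_bound:
            return state
    return "high"
```

The same pattern would apply to the DAS, DAS28 or SDAI by substituting the respective formula and cutpoints.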
2. Each trial should report the appropriate descriptive statistics of baseline, endpoint and change scores for each component (single variable) included in the core set
This item was deemed important to provide an overall more accurate representation of disease activity as shown by individual components of the response criteria, which cannot be directly extracted from presentation of the composite indices. Similar to the recommendation in the original core set publications in the early 1990s, it is recommended that baseline, endpoint and change scores for all core set variables measured in a study be presented. The reporting should comprise a summary measure (eg, mean/median) together with a variability measure (eg, standard deviation). This will allow interpretation of data on individual and population-based levels and facilitate comparison of results across trials. However, it must be emphasised that single measures should not be used as a primary endpoint in clinical trials of RA. While change in the individual core set measures should be reported, statistical significance of these changes should be presented only if the statistical analyses account for multiple testing and are clearly presented in the interpretation of the results.
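A minimal sketch of such a summary for one core set variable is given below, using only the Python standard library. It assumes paired baseline and endpoint values per patient (same order, no missing data); the function names are hypothetical, and the handling of missing values, which a real trial analysis would need, is deliberately left out.

```python
from statistics import mean, median, stdev
from typing import Dict, List


def describe(values: List[float]) -> Dict[str, float]:
    """Summary and variability measures for one set of values (needs n >= 2)."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "sd": stdev(values),  # sample standard deviation
    }


def summarise_variable(baseline: List[float],
                       endpoint: List[float]) -> Dict[str, Dict[str, float]]:
    """Describe baseline, endpoint and per-patient change (endpoint - baseline)."""
    if len(baseline) != len(endpoint):
        raise ValueError("baseline and endpoint must be paired per patient")
    change = [e - b for b, e in zip(baseline, endpoint)]
    return {
        "baseline": describe(baseline),
        "endpoint": describe(endpoint),
        "change": describe(change),
    }
```

Computing the change per patient before summarising, rather than subtracting group means, yields the standard deviation of the change score as well as its mean.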
3. Each trial should report baseline disease activity levels, which have relevance when evaluating the results
This item reflects the literature that documents an association of baseline disease activity levels with degrees of improvement as well as achievement of favourable states (eg, remission): while responder criteria are more easily achieved in patients with higher baseline activity levels, remission criteria are harder to achieve in these patients. In addition to recommendation 2, to report appropriate baseline values, this recommendation relates to interpretation of trial results. This does not imply a stratification of study results by baseline activity levels, but rather evaluation of the impact of the inclusion criteria and patient characteristics at baseline in discussion of the results. Although currently most pivotal trials in RA employ comparable disease activity requirements for patient enrolment, future trials may investigate patients with less active disease and their results (remission rates, responder rates) may not be comparable to previous trials.
4. Each trial should report the percentage of patients achieving a low disease activity state and remission
In addition to item 1, where reporting of responses and disease activity states is recommended, this recommendation emphasises the importance of achieving remission or, at least, low disease activity, and of reporting the frequencies with which these important states are achieved in clinical trials. This acknowledges the increased interest in gaining these benefits now that highly effective therapies are available to treat RA.
There are multiple ways to define remission and low disease activity. Low disease activity definitions include the defined cutpoints for DAS, DAS28, CDAI and SDAI, and the definition of “minimal disease activity” according to OMERACT. Definitions that can be used for remission include preliminary ARA remission criteria and the defined cutpoints for DAS, DAS28, CDAI and SDAI. “Response measures” that do not include a definition of “state”, such as ACR70 responses, do not reflect remission. Efforts are currently underway to develop a new definition of remission, the results of which may be incorporated in an update of these recommendations once they are validated.
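For a cutpoint-based definition, the percentages called for in this recommendation could be derived as in the following sketch. The cutpoints are passed in as parameters because they differ between indices; the function name and the choice to treat cutpoints as inclusive upper bounds are assumptions (some indices use a strict inequality for remission). Because the text does not specify whether "low disease activity" includes remission, the sketch reports both readings.

```python
from typing import Dict, List


def state_percentages(scores: List[float], remission_cut: float,
                      low_cut: float) -> Dict[str, float]:
    """Percentage of patients in remission and in low disease activity.

    `scores` holds one endpoint index value per analysed patient; both
    cutpoints are treated as inclusive upper bounds (an assumption).
    """
    n = len(scores)
    if n == 0:
        raise ValueError("at least one patient is required")
    in_remission = sum(s <= remission_cut for s in scores)
    low_or_better = sum(s <= low_cut for s in scores)
    return {
        "remission_pct": 100 * in_remission / n,
        "low_excluding_remission_pct": 100 * (low_or_better - in_remission) / n,
        "low_or_remission_pct": 100 * low_or_better / n,
    }
```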
5. Each trial should report the time to onset of the primary outcome
Disease activity over time leads to radiographic progression. This is shown, for example, by the fact that time-averaged levels of disease activity correlate well with increases in radiographic damage, as well as loss of physical function. The result of therapy with TNF inhibitors is one exception to this association. Time to onset of benefit should be reported for achievement of the primary outcome variable of a trial. As mentioned above, the primary outcome can be attainment of a level of response or a certain disease activity state. In its simplest form, the average time to onset of a favourable outcome may be compared between the treatment groups in a clinical trial.
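The simplest form mentioned above, an average time to onset per treatment group, could be computed as in the sketch below. The data layout (a list of week and outcome pairs per patient) and the function names are hypothetical. Note that the mean is taken only over patients who reach the outcome, so the number of such patients is returned alongside it; patients who never reach the outcome are ignored by the average, which a formal time-to-event analysis would handle differently.

```python
from statistics import mean
from typing import List, Optional, Tuple

Visits = List[Tuple[int, bool]]  # (study week, primary outcome achieved?)


def time_to_onset(visits: Visits) -> Optional[int]:
    """First study week at which the outcome is achieved, or None if never."""
    for week, achieved in sorted(visits):
        if achieved:
            return week
    return None


def mean_time_to_onset(group: List[Visits]) -> Tuple[Optional[float], int, int]:
    """Return (mean onset week among achievers, n achievers, n patients)."""
    onsets = [t for t in (time_to_onset(v) for v in group) if t is not None]
    average = mean(onsets) if onsets else None
    return average, len(onsets), len(group)
```

Calling `mean_time_to_onset` once per treatment arm gives the values to be compared between groups.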
6. Each trial should consider and report the sustainability of the primary outcome
In addition to the recommendation to present the onset and time course of attainment of the primary endpoint (as indicated in item 5) and the presentation of response or state achievement rates at predefined time point(s) during the trial (item 1), it is important to report the proportion of patients that achieve sustained responses or a predefined state. For example, remission rates or ACR50 response rates presented at a given time point should be supplemented by presentation of the proportion of patients who continue to sustain this outcome/state. The beneficial effect of sustained responses in comparison to intermittent responses or improvements at a single time point is documented in numerous studies, in terms of reduction in radiographic progression as well as loss of physical function. However, since there is currently no best way to define sustainability, further research on the optimal definition of sustained response has been placed on the research agenda (see below). Conceptually, sustainability of disease activity state measures has a higher face validity compared with sustained response measures, which must be compared to baseline values. To avoid bias, the denominator for definition of sustainability should be the intention-to-treat population.
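The role of the denominator can be illustrated with a short sketch. Since no agreed definition of sustainability exists, the rule used here (the outcome is present at a given number of consecutive scheduled visits) is purely a hypothetical working definition, as are the function names and data layout. The point shown is that every randomised patient, including those with no or few post-baseline visits, remains in the denominator.

```python
from typing import Dict, List


def is_sustained(achieved_by_visit: List[bool], min_consecutive: int) -> bool:
    """True if the outcome is present at `min_consecutive` visits in a row.

    Hypothetical working definition; the text notes that no best
    definition of sustainability currently exists.
    """
    run = 0
    for achieved in achieved_by_visit:
        run = run + 1 if achieved else 0
        if run >= min_consecutive:
            return True
    return False


def sustained_proportion(randomised: Dict[str, List[bool]],
                         min_consecutive: int = 2) -> float:
    """Proportion with a sustained outcome among ALL randomised patients.

    Patients who dropped out keep their (possibly empty) visit list and
    therefore count in the denominator as not sustained.
    """
    if not randomised:
        raise ValueError("no randomised patients supplied")
    n_sustained = sum(
        is_sustained(visits, min_consecutive) for visits in randomised.values()
    )
    return n_sustained / len(randomised)
```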
7. Each trial should report on fatigue
There is evidence from double-blind, randomised controlled trials and observational studies that patient-reported measures correlate cross-sectionally and longitudinally with measures of disease activity (eg, disease activity indices). These measures include the Health Assessment Questionnaire Disability Index (HAQ), the Short Form 36 (SF-36) and patient assessments of pain and global disease activity. In addition to pain and disease activity, which are already part of the core sets of measures of disease activity, the most prominent patient derived outcome on the basis of working groups with patients involved, as well as the patient included in the present process, was fatigue. Some studies suggest that one limitation of the assessment of fatigue in clinical trials could be that fatigue is potentially secondary to other disease characteristics, and thus not an independent attribute. The importance of sleep (and sleep disorders) has also been brought forward by the patient participant of the meeting, and it is advisable to collect information on fatigue as well as on sleep. While sleep has also been suggested as an important symptom from the patient’s perspective, based on the obtained evidence it is, at this time, recommended only to report on fatigue using validated fatigue scales.
Several items for which there was insufficient or controversial literature to support recommendations have been added to the research agenda. This agenda is a list of items and issues which researchers are encouraged to address in the future.
To reach consensus on how to measure remission. There is a need to derive a uniform definition of remission. This could include clinical markers and imaging modalities (such as radiographs, MRI, or ultrasonography). A potential challenge will be to adopt the appropriate validation criteria for remission. For example, assessment of construct validity will likely include structural aspects, that is, testing the concept of little or no progression of radiographic damage in remission.
To investigate the Patient Acceptable Symptom State (PASS) for usefulness in clinical trials. The importance of patient-reported outcomes and of presentation of state outcomes in clinical trials has been emphasised in recommendations 1, 4 and 7 of this report. Furthermore, the PASS, as a combination of patient-reported outcome and state measures, should be tested in clinical trials, ideally for its association with more objectively assessed states, such as remission based on composite indices.
To investigate response levels for relevant clinical measures. The levels for minimal clinically important differences (MCID) and major responses are important thresholds to assess the ability of an intervention to improve patient conditions. For some measures, such as HAQ, SF-36 and different visual analogue scales (VASs), definitions of MCID are available, but it will be important to identify similar and higher and clinically potentially more meaningful thresholds of improvement for these and other potential outcomes that may be reported in clinical trials.
To test the usefulness of probability plots for disease activity measures. In the past, probability plots have been shown to be very helpful in the analysis of radiographic data from clinical trials. Their strength lies in the presentation of all data points by depicting the cumulative distribution of values observed in a group of patients, and is most advantageous if data are not normally distributed. In the case of disease activity measures, graphical comparisons of groups by probability plots may be valuable, but have not yet been tested.
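The construction of such a plot can be sketched as follows: the observed values of a group are sorted, and each is paired with the cumulative percentage of patients at or below it. The function name and the plotting position used, (rank / n) × 100, are assumptions, as conventions differ between publications; the resulting points can be passed to any plotting tool, with one series per treatment group.

```python
from typing import List, Tuple


def probability_plot_points(values: List[float]) -> List[Tuple[float, float]]:
    """Return (cumulative percentage, value) pairs for a probability plot.

    Every observation contributes one point, so the full distribution is
    shown regardless of whether the data are normally distributed.
    """
    ordered = sorted(values)
    n = len(ordered)
    return [(100.0 * (rank + 1) / n, value)
            for rank, value in enumerate(ordered)]
```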
To investigate the use of MRI and ultrasound to measure synovitis. It is currently unclear how MRI or ultrasound may be helpful in reliably determining synovial inflammation and its changes, and how the information provided by these imaging techniques will complement the clinical assessment of joint swelling. Based on the current literature it is unlikely that these imaging modalities can replace established clinical and radiographic methods in trials of RA. Further research is therefore needed to investigate how to best report the relationship between clinical disease activity and the various imaging techniques. It is recommended that information on imaging and clinical measures be included in the same report of a clinical trial, with analyses of the relationship between them.
To investigate the value of physical function as part of disease activity indices. As indicated in the preamble, disease activity and physical function can be regarded as two outcome domains of RA, although disease activity influences physical function and, thus, functional measures are sometimes also used as measures of disease activity. In the ACR response criteria, disease activity and functional measures are combined, but currently the value and pitfalls of including functional measures as part of disease activity indices are unclear. Simplistically, for example, this may be addressed by evaluating ACR responses with and without including physical function, or by adding a functional measure to a composite disease activity score, such as the DAS, in clinical trials.
To test the influence of baseline disease activity levels on response rates and on study power. Formal studies need to be performed with the objective of investigating the influence of inclusion criteria on response rates and whether lowering disease activity requirements decreases the power to detect differences between different therapeutic regimens. This may be performed in a trial specifically designed to address this question, but also in post hoc analyses of available databases.
To investigate the influence of chronicity of disease on discriminant capacity of disease activity measures. It has been shown that physical function levels and responses in clinical trials are clearly dependent on the chronicity of disease in the patient population of interest. It is currently unclear to what extent other measures of disease activity are influenced by disease chronicity, which would clearly affect the interpretation of response rates and comparability of results between trials.
To investigate the best way to define sustainability of response. Of particular importance is the question of how measures of sustainability of response can be included in clinical trials without radically altering current trial design (eg, if 6 months were required for sustainability, and the treatment required 4 months to show efficacy, trials would need to last at least 10 months).
Further items that were considered important for future research were:
To study the minimal threshold level of disease activity associated with an absence of disease progression.
To study the importance of sleep and sleep disorders (see recommendation 7).
To investigate how to best define and measure loss of response.
To study measures of prognosis.
To set up a biomarker database coupled with clinical trial data to enable prediction of responses to specific treatment regimens.
This collaborative effort of EULAR and ACR to develop recommendations in a particular aspect of rheumatological care and research has passed an evidence-based filter and a consensus of experts in the field. While these recommendations, like those of all similar activities, are not binding, their consideration will harmonise the presentations of results from clinical trials. Adherence to these recommendations will provide more comprehensive information for the following reasons: (1) more details of important outcomes reflecting a large spectrum of disease characteristics will be reported, maximising the interpretability by the consumers of trial reports; (2) a higher level of homogeneity may facilitate comparison of outcomes across treatment groups and trials even in the absence of head to head comparisons, although this will require testing; (3) outcomes researchers will be better able to perform meta-analyses that will potentially improve our understanding of therapeutic responses; and (4) complete pertinent information on individual patients may allow for new approaches to individualised therapies. These are expected to lead to better care of patients with RA. Here it should also be emphasised that clinical trials, which are the focus of this document, only partially reflect clinical practice. However, good clinical trial reporting will facilitate better translation of data to daily practice, again furthering better patient care.
As research continues to provide new insights into issues of outcomes assessment of RA, and as items presented in the research agenda are addressed, it is expected that the present recommendations will need to be updated and revised in the future.
Funding: This project was fully funded by EULAR and the ACR.
Competing interests: None.
The views presented in this presentation do not necessarily reflect those of the Food and Drug Administration or the National Institutes of Health.