Objective: To describe the methods and procedures used for the development of the European League Against Rheumatism (EULAR)/EULAR Scleroderma Trials and Research group (EUSTAR) recommendations for the treatment of systemic sclerosis. In particular, the results of a web-based Delphi exercise aimed at selection of research questions and evidence from systematic literature research, as parts of the development of these recommendations, are presented in detail.
Methods: In agreement with the EULAR standard operating procedures a Task Force was created that consisted of the EUSTAR board members, 10 systemic sclerosis (SSc) experts invited from outside the EUSTAR board and representing Europe, the USA and Japan, a clinical epidemiologist, 2 patients with SSc and 3 fellows for literature research. All EUSTAR centres were invited to contribute to the development of recommendations through submission and preliminary selection of the research questions. The systematic literature research was performed using the Pubmed, Medline, EMBASE and Cochrane databases. Retrieved trials were evaluated according to the Jadad classification, and the level of evidence was graded from 1 to 4. Outcome data for efficacy and adverse events were abstracted and effect size, number needed to treat (NNT) and number needed to harm (NNH) were calculated when appropriate.
Results: In all, 65 EUSTAR centres provided 304 research questions concerning SSc treatment. These questions were aggregated, subdivided into 19 treatment categories and then subjected to preliminary selection by a web-based Delphi technique. The final set of 26 research questions was created by the Expert Committee based on the results of the Delphi exercise and the experts’ experience.
Conclusions: This paper is a comprehensive summary of the methods we used to build recommendations for the drug treatment of systemic sclerosis, combining an evidence based approach and expert opinion.
Systemic sclerosis (SSc) is an infrequent1 and complex disease that affects the skin and several internal organs; it often has a poor outcome with significant morbidity and mortality risks. Treating SSc is a challenge because the disease is heterogeneous in expression and there are no specific drugs that are yet proven to be curative.
A major objective of the European League Against Rheumatism (EULAR) executive committee is to promote actions and/or projects that are aimed at improving the knowledge and/or the recognition of musculoskeletal disorders. In the case of SSc, the EULAR Scleroderma Trials and Research group (EUSTAR) was formed in 2002 to foster the study and the excellent care of patients with SSc and to achieve a consensus on evidence based standards for the management of patients with SSc throughout Europe.2 – 4 Thus, a recent aim of EUSTAR was to develop “evidence-based recommendations” for the drug treatment of SSc with two specific objectives: (1) to improve the clinical outcome of patients with SSc and (2) to help the practitioner to effectively manage patients with SSc. These recommendations were developed using information from two major sources: a systematic review of the literature, and an expert opinion based on clinical expertise and careful review of the research based evidence. This approach followed EULAR standard operating procedures in order to obtain and maintain a high level of quality and comparability.5
This manuscript details the methods and procedures used to obtain specific management recommendations. The methods described include the selection of the research question, the literature search strategy, the analysis of the manuscripts and the expert opinion approach.
Selection process of research questions
In order to create a comprehensive list of potential topics of interest, experts in the management of SSc from 105 EUSTAR centres were asked to contribute independently with questions relevant to the drug treatment of SSc. It is recognised that the management of patients with SSc in actual practice is much more complex than using drug therapy alone. However, due to difficulties in defining clinical care guidelines, a relative lack of appropriate literature, incomplete prognostic markers and imperfect criteria for diagnosis of early SSc, the present set of recommendations and questions focused only on drug treatment of SSc.
Initially, 304 questions were provided by 65/105 (62%) EUSTAR centres. In a pre-meeting, these questions were categorised by the nature of the drug and then aggregated to remove duplicates. Thus, 72 questions, split into 19 categories, were proposed for selection and subsequently used in a web-based Delphi exercise involving members of the EUSTAR centres. The Delphi exercise was Internet-based and was completed between July and October 2006. The Delphi method is a consensus method for medical and health service research.6 Such methods attempt to assess the extent of agreement (consensus measurement) and to resolve disagreement (consensus development). As opposed to the nominal group technique (expert panel) and to a consensus development conference, a web-based Delphi exercise enables the participation of experts without geographical limitations. In the Delphi procedure, participants can give their opinions independently and confidentially without the pressures of face-to-face meetings. Thus, many potential problems related to group dynamics are bypassed. Importantly, participants are allowed to change their opinion in consecutive stages of the process when given feedback from the results of the previous rounds.
To ensure security and confidentiality, each participant received a personal log-in code with the email invitation, allowing individual access to the questionnaire on a webpage specifically designed and programmed for the present Delphi study. The questionnaire was completed online by the participants. Questions were accepted automatically if selected by 80% or more of the participants in any round, whereas questions receiving less than 20% of the votes were rejected. EUSTAR centres were not authorised to participate in rounds 2 and 3 if they had not responded in round 1.
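The acceptance rule described above can be sketched in a few lines of Python. This is an illustration only, not the software used for the study: the function name and the "carried forward" label for questions falling between the two thresholds are assumptions, since the text states only the 80% acceptance and 20% rejection cut-offs.

```python
def classify_question(votes_for: int, n_participants: int,
                      accept_at: float = 0.80,
                      reject_below: float = 0.20) -> str:
    """Classify one question after one Delphi round.

    accept_at and reject_below are the cut-offs quoted in the text.
    The 'carried forward' outcome is an assumption about what happens
    to questions that are neither accepted nor rejected.
    """
    if n_participants <= 0:
        raise ValueError("n_participants must be positive")
    if not 0 <= votes_for <= n_participants:
        raise ValueError("votes_for must lie between 0 and n_participants")
    share = votes_for / n_participants
    if share >= accept_at:
        return "accepted"
    if share < reject_below:
        return "rejected"
    return "carried forward"
```

For example, a question selected by 60 of 65 responding centres would be accepted, whereas one selected by 10 of 65 would be rejected.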
After the first round of the Delphi exercise, 27 questions split into 9 categories were accepted. The second round led to the acceptance of 44 questions (17 additional questions) split into 11 categories. After the third round, 46 questions (2 additional questions) split into 12 categories (provided in table 1) were proposed to 18 experts in the field of SSc in a first meeting in October 2006 (14 rheumatologists, 1 dermatologist, 1 internist and 2 patients with SSc) and 1 clinical epidemiologist (RL) representing 10 countries. This first meeting was organised to select (or create) and validate a list of research questions based on the results of the Delphi process and expert opinion. To support the content validity of the process, these experts had to be professors recognised as specialists in SSc with several years of experience in diagnosing and treating patients with this disease, had to be involved in SSc clinical and basic research, had to have published on SSc in peer-reviewed journals or had to have presented at major meetings. These experts declared no relevant conflicts of interest at any point in the recommendation process. Industry likewise did not influence the selection of questions or any part of the Delphi exercise.
Based on the analysis of the results of the web-based Delphi exercise and discussion among experts, 26 research questions were formulated (provided in table 2) divided into 11 categories.
Systematic literature search
The first step, prior to the bibliographic search, was to rephrase the questions to facilitate the interpretation of the literature research. Based on this rephrasing, we anticipated the different inclusion/exclusion criteria and data to be extracted in every manuscript analysed.
Studies concerning Raynaud phenomenon or organ manifestations not specific to SSc (eg, pulmonary arterial hypertension) were included only if the study included a subgroup of patients with SSc (more than 10% of the whole population). These studies were considered eligible for review even if they did not provide a specific subgroup analysis for the patients with SSc. However, the experts were advised that such results had to be interpreted cautiously because they combined data from patients with and without SSc.
The main reports of interest were meta-analyses, systematic reviews, randomised controlled trials (RCTs)/controlled trials, uncontrolled trials/cohort studies, case–control studies and cross sectional studies. Case reports and case series with fewer than five patients with SSc were excluded.
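The eligibility rules above can be expressed as a simple filter. The following Python sketch is illustrative only: the `Study` record, its field names and the design labels are hypothetical and are not taken from the extraction sheet used by the authors.

```python
from dataclasses import dataclass


@dataclass
class Study:
    # All field names and design labels are hypothetical.
    design: str          # e.g. "rct", "cohort", "case_series", "case_report"
    n_total: int         # all patients enrolled
    n_ssc: int           # patients with SSc among them
    ssc_specific: bool   # False for mixed populations (eg, Raynaud phenomenon)


def is_eligible(study: Study) -> bool:
    """Apply the inclusion/exclusion rules stated in the text."""
    if study.design == "case_report":
        return False
    if study.design == "case_series" and study.n_ssc < 5:
        return False
    if not study.ssc_specific:
        # The SSc subgroup must exceed 10% of the whole population.
        # Integer arithmetic avoids rounding issues at exactly 10%.
        return study.n_ssc * 10 > study.n_total
    return True
```

A mixed-population trial with 30 patients with SSc out of 200 would pass this filter, whereas one with exactly 20 out of 200 would not, because the subgroup must be larger than 10%.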
Each category of drug treatment was selected by the experts (table 2).
To assess safety, we extracted, from each study, the number of discontinuations secondary to adverse events in the treated group compared to the placebo group and the number of significant adverse events reported in each group.
The articles that fulfilled the inclusion criteria underwent quality appraisal. To assess the quality of RCTs, we used the Jadad scale, the impact factor of the journal in which the trial was published and whether an intention-to-treat analysis was reported. The Jadad scale11 contains two questions to determine appropriate randomisation and study masking and one question evaluating the reporting of withdrawals and dropouts. Each question requires a yes or no response. Five total points can be awarded with a higher score indicating superior quality.
Evidence was categorised according to study design using a hierarchy of evidence in descending order of quality12 (table 3), and the highest level of available evidence for each intervention was reviewed in detail. Even when the highest-level studies were available, all the remaining categories were also reviewed. The most recent meta-analysis of RCTs (level Ia) was reviewed for each intervention, if available, and any RCTs published since the meta-analysis was conducted were also considered.
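As an illustration of how a five-point Jadad score can be assembled, the Python sketch below follows the commonly described form of the scale. That form is an assumption here, as the text does not spell out the point allocation: one point each for a trial being described as randomised and as double blind, one further point (or one point deducted) depending on whether the method is appropriate, and one point for describing withdrawals and dropouts. The function and parameter names are hypothetical.

```python
from typing import Optional


def jadad_score(randomised: bool,
                randomisation_method_ok: Optional[bool],
                double_blind: bool,
                blinding_method_ok: Optional[bool],
                withdrawals_described: bool) -> int:
    """Return a Jadad score between 0 and 5.

    For the two *_method_ok arguments: True means the method is described
    and appropriate, False means described but inappropriate, and None
    means not described. This allocation is assumed, not quoted.
    """
    score = 0
    for stated, method_ok in ((randomised, randomisation_method_ok),
                              (double_blind, blinding_method_ok)):
        if stated:
            score += 1
            if method_ok is True:
                score += 1
            elif method_ok is False:
                score -= 1
    if withdrawals_described:
        score += 1
    return score
```

Under these assumptions, a double-blind RCT with appropriate methods for both randomisation and masking and a full account of dropouts scores 5.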
To measure the magnitude of the treatment effect for a continuous outcome variable, the effect size was calculated. The effect size is a standard way to determine the degree of improvement (or otherwise) of a particular therapy after any placebo effect has been accounted for. The effect size used here was calculated as the ratio of the treatment effect (mean difference in the treatment group minus mean difference in placebo group) and the pooled standard deviation of the differences in both groups.13 This calculation entails the use of means, for baseline and final data (or baseline and change during study) with a measure of variability such as standard deviation (SD). Every effort was made to calculate the effect size in all studies. If the SD was given in only one group, it was used as baseline SD for both groups. However, if no measure of variability was given the effect size could not be calculated. By convention, an effect size <0.2 is considered as trivial; >0.2–0.5 as small; >0.5–0.8 as moderate; >0.8–1.2 as important and >1.2 as very important.14 Minus or plus signs indicate the direction of the difference, not the magnitude of the difference. An effect size is considered as statistically significant if its confidence interval does not include zero.
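The calculation described above can be sketched as follows. This is a minimal illustration rather than the authors' own implementation: the function names are hypothetical, the pooled SD is assumed to be weighted by degrees of freedom, and a value of exactly 0.2 (which the quoted thresholds leave unassigned) is placed in the "small" band.

```python
import math


def pooled_sd(sd_a: float, n_a: int, sd_b: float, n_b: int) -> float:
    """Pooled SD of two groups, weighted by degrees of freedom (assumed form)."""
    return math.sqrt(((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2)
                     / (n_a + n_b - 2))


def effect_size(change_treatment: float, change_placebo: float,
                sd_treatment: float, n_treatment: int,
                sd_placebo: float, n_placebo: int) -> float:
    """Difference in mean change between groups divided by the pooled SD."""
    return ((change_treatment - change_placebo)
            / pooled_sd(sd_treatment, n_treatment, sd_placebo, n_placebo))


def magnitude_label(es: float) -> str:
    """Map the absolute effect size to the descriptive bands in the text."""
    size = abs(es)  # the sign gives direction only, not magnitude
    if size < 0.2:
        return "trivial"
    if size <= 0.5:
        return "small"
    if size <= 0.8:
        return "moderate"
    if size <= 1.2:
        return "important"
    return "very important"
```

With hypothetical inputs of a mean change of -6 in the treatment group and -2 in the placebo group, an SD of 8 and 20 patients in each group, the effect size is -0.5, which falls at the top of the "small" band.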
The number needed to treat (NNT) was estimated in case of binomial outcome variables,15 eg, a positive treatment response (yes vs no). The NNT is the number of patients who need to be treated with a given intervention instead of placebo in order to achieve one additional patient with a positive response. The NNT is calculated as the inverse of the absolute treatment effect. If, for example, the response rate is 10% (0.10) in the placebo group and 30% (0.30) in the intervention group, the absolute treatment effect is 20% (0.20) and the NNT is 5.
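The worked example above translates directly into code. The short Python sketch below uses a hypothetical function name and returns the unrounded value, so any rounding convention is left to the reader.

```python
def number_needed_to_treat(response_rate_control: float,
                           response_rate_treatment: float) -> float:
    """Inverse of the absolute treatment effect (difference in response rates)."""
    absolute_effect = response_rate_treatment - response_rate_control
    if absolute_effect <= 0:
        raise ValueError("NNT is undefined when the treatment shows no benefit")
    return 1.0 / absolute_effect


# Example from the text: 10% response on placebo, 30% on the intervention.
nnt = number_needed_to_treat(0.10, 0.30)  # approximately 5
```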
Results from the latest systematic review were used if there was more than one systematic review for the same intervention. Statistical pooling was undertaken if appropriate16 when a systematic review was not available, or when more recent RCTs could be included.
The number needed to harm (NNH) was used to express safety of an intervention. The NNH is an epidemiological measure (defined as the inverse of the absolute risk increase) that reflects the number of patients who should be treated with the intervention instead of placebo (or control treatment) in order to obtain one additional patient with a predefined adverse event. “Harm” was primarily defined as the discontinuation of the study drug because of toxicity. The advantage of the NNH is that it reflects an absolute risk increase and, because it is related to the control event rate, it reflects the true baseline or underlying risk of the study population.17 For rational decision making in daily clinical practice, absolute measures such as the NNH are more meaningful than relative measures.18
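As a sketch of how the NNH can be derived from trial counts, the Python function below takes the number of discontinuations because of toxicity in each arm. The function name, the argument names and the choice to return `None` when there is no excess risk are illustrative assumptions.

```python
from typing import Optional


def number_needed_to_harm(events_treated: int, n_treated: int,
                          events_control: int, n_control: int) -> Optional[float]:
    """Inverse of the absolute risk increase, or None if there is no excess risk.

    'Events' are discontinuations because of toxicity in each arm.
    """
    if n_treated <= 0 or n_control <= 0:
        raise ValueError("group sizes must be positive")
    risk_increase = events_treated / n_treated - events_control / n_control
    if risk_increase <= 0:
        return None  # no excess harm observed in the treated arm
    return 1.0 / risk_increase
```

With hypothetical counts of 6 discontinuations among 50 treated patients and 2 among 50 controls, the absolute risk increase is 8% and the NNH is about 12.5.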
Because of the large confidence intervals around drug efficacy/safety, confidence intervals were not reported for the NNT/NNH, as proposed by McQuay and Moore.19
The systematic literature research was performed independently by four Task Force members (OKB, JA, SC, IM), guided by the scientific organiser (MMC) and the clinical epidemiologist (RL). Assessors who performed the systematic literature research had no conflict of interest that may have influenced their search or the interpretation of the results of their analysis. The literature search covered all articles published between 1966 and February 2007 and was run on Medline, Pubmed, EMBASE, the Cochrane Controlled Trials Register for RCTs and the EULAR/American College of Rheumatology (ACR) congress abstract archives. All languages were eligible for inclusion. We limited our search to studies in humans and in adult populations. The search strategy included all relevant terms for SSc (“systemic sclerosis” OR “scleroderma” OR “CREST”) combined with different sets of keywords specific for each question (the combination for each question is provided in table 4).
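The structure of such a search string can be illustrated with a short Python sketch. The SSc terms are those quoted above; the question-specific keywords shown in the usage note are placeholders, not the actual sets listed in table 4, and joining them with OR inside a single AND clause is an assumption about how the combination was made.

```python
from typing import Sequence

# Disease terms quoted in the text.
SSC_TERMS = ("systemic sclerosis", "scleroderma", "CREST")


def or_block(terms: Sequence[str]) -> str:
    """Join quoted terms with OR inside parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(question_keywords: Sequence[str]) -> str:
    """Combine the SSc terms with one question's keyword set (assumed AND)."""
    if not question_keywords:
        raise ValueError("at least one question-specific keyword is required")
    return or_block(SSC_TERMS) + " AND " + or_block(question_keywords)
```

Calling `build_query(["example drug", "example outcome"])` returns a string that pairs the three SSc terms with the two placeholder keywords, ready to be adapted to the syntax of each database.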
Medical subject heading (MeSH) search was used for all databases and a keyword search was used if the MeSH search was not available. All MeSH search terms were exploded. The reference lists of reviews or systematic reviews were examined and any additional studies meeting the inclusion/exclusion criteria were included.
The summary results of this search were reported to the expert committee at the beginning of the recommendation development process.
After the titles and abstracts had been read, 313 manuscripts remained. Finally, 281 articles were retained after reading the full text.
Of these 281 studies, 4 were meta-analyses, 51 were RCTs, 8 were controlled studies without randomisation, 49 were quasiexperimental studies and 169 were descriptive studies (such as comparative studies, correlation studies or case–control studies).
The results of article selection and the highest evidence rating for each category of drug treatment are provided in table 5.
Analysis of manuscripts
Data extraction was performed by the members of the literature review team on the full texts, not blinded to author and journal, using a predefined extraction sheet, available from the authors. The information extracted included first author, publication year, quality assessment of the manuscript, mean age of participants, sex proportion, trial duration, characteristics of patients with SSc, type of treatment, type of comparator, drug dose, number of patients in the active and control groups and the outcome measure used to assess efficacy, defined a priori.
For quantitative variables, the mean and standard deviation of the parameter at baseline and at the end of the study were collected in order to calculate the effect size. For binary variables, we collected the number of improved patients in the active and placebo groups in order to calculate the NNT. The number of withdrawals because of toxicity was collected in the active and placebo groups to calculate the NNH.
Expert opinion approach
It is recognised that publication of the evidence based approach alone may be too complicated to be fully used by the target population. For example, interpretation of the effect size of a treatment modality requires specific knowledge. For clarity, therefore, the recommendations included summary statements from the experts based on a combination of the reported evidence plus clinical judgment and experience. Contents derived from expert opinion were clearly identified, together with the reasons for that approach.
A set of 12 draft recommendations was prepared by 3 members of the committee (OKB, RL, MMC), as a compilation of the research questions and the results of the literature research. This set of draft recommendations formed the basis for discussion during a second meeting in March 2007, which resulted in the formulation of the final set of 14 recommendations.
The same group of experts was present in the second meeting. Agreement was achieved after the presentation of the results of the systematic literature research by each assessor, followed by a discussion between experts to reach a consensus. In case of discrepancies, the nominal group technique was applied to obtain a consensus. This final set of recommendations is presented and discussed in a separate manuscript.
This paper is a comprehensive summary of the methods we used to build recommendations for the drug treatment of SSc, combining an evidence based approach and expert opinion. This systematic approach allowed the development of recommendations, detailed in a specific manuscript,19 for clinicians treating patients with SSc. These recommendations aim to improve the outcome of patients with SSc and may help define directions for future clinical research in SSc.