Objectives The rarity of early diffuse cutaneous systemic sclerosis (dcSSc) makes randomised controlled trials very difficult. We aimed to use an observational approach to compare effectiveness of currently used treatment approaches.
Methods This was a prospective, observational cohort study of early dcSSc (within three years of onset of skin thickening). Clinicians selected one of four protocols for each patient: methotrexate, mycophenolate mofetil (MMF), cyclophosphamide or ‘no immunosuppressant’. Patients were assessed three-monthly for up to 24 months. The primary outcome was the change in modified Rodnan skin score (mRSS). Confounding by indication at baseline was accounted for using inverse probability of treatment (IPT) weights. As a secondary outcome, an IPT-weighted Cox model was used to test for differences in survival.
Results Of 326 patients recruited from 50 centres, 65 were prescribed methotrexate, 118 MMF, 87 cyclophosphamide and 56 no immunosuppressant. 276 (84.7%) patients completed 12 and 234 (71.7%) 24 months follow-up (or reached last visit date). There were statistically significant reductions in mRSS at 12 months in all groups: −4.0 (−5.2 to −2.7) units for methotrexate, −4.1 (−5.3 to −2.9) for MMF, −3.3 (−4.9 to −1.7) for cyclophosphamide and −2.2 (−4.0 to −0.3) for no immunosuppressant (p value for between-group differences=0.346). There were no statistically significant differences in survival between protocols before (p=0.389) or after weighting (p=0.440), but survival was poorest in the no immunosuppressant group (84.0%) at 24 months.
Conclusions These findings may support using immunosuppressants for early dcSSc but suggest that overall benefit is modest over 12 months and that better treatments are needed.
Trial registration number NCT02339441.
The diffuse cutaneous subtype of systemic sclerosis (dcSSc) is rare (SSc incidence is around 10–20/million/year,1 of whom approximately 25% will have diffuse disease) but carries high morbidity and mortality due to early internal organ involvement and rapidly progressive, painful skin thickening. Also, 5-year and 10-year survival rates, although improving, are in the order of 68% and 50%, respectively.2 ,3
At present, there is no drug known to favourably influence disease course. Randomised controlled trials (RCTs) have historically been confounded by disease rarity (only small numbers of patients are recruited, often over long periods) and strict entry criteria meaning that severe cases are often excluded.4 These strict criteria further restrict sample sizes and limit generalisability. Therefore, although RCTs represent a gold standard for assessing drug efficacy, results may not be applicable to real-life clinical settings.5 Small trials run the risk of being underpowered, thus potentially yielding false-negative results.6 The past three decades have seen a number of promising treatments for early dcSSc failing to meet efficacy end points in RCTs: examples include methotrexate (multinational, 71 patients)7 and anti-transforming growth factor β1 antibody therapy (multinational, 45 patients).8
A further difficulty in recruiting into RCTs of early dcSSc is that many clinicians have reservations about placebo therapy in a potentially life-threatening disease and favour immunosuppression, consistent with the European League Against Rheumatism (EULAR) recommendations, which advocate methotrexate for skin manifestations9 in early dcSSc, although this agent has been shown to be of only limited efficacy.7 Immunosuppressants are potentially hazardous, especially in patients prone to internal organ disease and infection.
Against this background, our aim was to compare, using an observational approach, the effectiveness of standard treatment approaches (mainly immunosuppressant treatments but including a ‘no immunosuppressant’ option to reflect that some patients or clinicians may choose this approach) in the early management of patients with dcSSc, capturing entry and outcome data in a systematic way. Modern statistical approaches allow robust interrogations of prospective observational studies, as an adjunct to, or even substitute for, RCTs in rare diseases,10 although the potential of these novel approaches has not yet been realised.11
The European Scleroderma Observational Study (ESOS) was a prospective, observational cohort study (ClinicalTrials.gov identifier: NCT02339441), in which standardised data were collected at study entry and at follow-up visits, and entered electronically by investigators at each centre into an electronic case record form. All data were checked by the project coordinator and any inconsistencies were discussed with the chief investigator and (if appropriate) the local principal investigator. The main inclusion criteria were early dcSSc (skin involvement proximal to elbow, knee, face, neck12 and within three years of the onset of skin thickening) and age >18 years. Exclusion criteria were previous stem cell transplantation, previous immunosuppressant treatment for >4 months or use of any immunosuppressant drug other than methotrexate, mycophenolate mofetil (MMF) or cyclophosphamide within the month prior to study entry.
Clinicians selected the protocol of their choice for each patient. The recommended treatment protocols, as decided by the Steering Committee to reflect international best clinical practice, were:
Methotrexate (oral or subcutaneous with a target dose of 20–25 mg weekly).
MMF (500 mg twice daily for 2 weeks increasing to 1 g twice daily).
Cyclophosphamide. Possible regimens included:
Intravenous. Minimum monthly dose 500 mg/m2 with a recommended duration of 6–12 months.
Oral. 1–2 mg/kg/day with a recommended duration of 12 months. Patients treated with cyclophosphamide were then usually ‘transferred’ to a maintenance immunosuppressive drug (methotrexate, MMF or azathioprine) as per the treating clinician's choice.
No immunosuppressant treatment, to give the option of including patients in whom immunosuppression was not felt indicated or appropriate (or declined by the patient).
Patients were assessed at baseline, with subsequent visits scheduled three-monthly for 24 months (or between 12 and 24 months for those patients recruited after September 2013).
To have 80% power to detect a difference between two treatment arms of five modified Rodnan skin score (mRSS) units at 12 months would require 63 patients per protocol. Allowing 20% loss to follow-up, and varying numbers recruited to the different protocols, recruitment target was 316 patients.
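The sample size reasoning above can be sketched with the standard normal-approximation formula for comparing two means. The text does not state the SD or significance level used, so the SD of 10 mRSS units and the two-sided alpha of 0.05 below are assumptions, chosen only because they reproduce the reported 63 patients per protocol; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Patients per arm to detect a mean difference `delta` between two arms.

    Normal-approximation formula: n = 2 * ((z_alpha + z_beta) * sd / delta)^2.
    `sd` and `alpha` are assumptions; the paper reports only delta and power.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


# Assumed SD of 10 units, 5-unit difference, 80% power
print(n_per_group(delta=5, sd=10))
```

Four protocols at this size give 252 patients before any allowance for loss to follow-up or unequal allocation, which is broadly in line with the stated recruitment target.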
Patients were recruited between July 2010 and September 2014. Demographic characteristics including age, gender, smoking habit, ethnicity, antibody status (anti-topoisomerase-1 (anti-Scl70), anti-RNA III polymerase, anticentromere) and presence of visceral organ involvement were recorded for all patients. The algorithms to determine the presence of different types of organ involvement are summarised in online supplementary table S1.
The primary outcome measure, assessed at each visit, was the change in mRSS over time. All mRSS assessments were performed by those experienced in skin scoring. The mRSS is assessed clinically at 17 body sites on a 0–3 scale (maximum score 51) and measures the extent of skin thickening.13 It is the most commonly used primary outcome measure in RCTs of dcSSc,4 ,7 ,8 reflecting disease severity and predicting mortality.14 All other outcomes/recorded variables were mainly part of routine clinical practice and are summarised in online supplementary table S2. Secondary end points included pulmonary function (forced vital capacity (FVC: % predicted) and carbon monoxide diffusing capacity (DLCO: % predicted)), quality of life15–18 (including the Health Assessment Questionnaire Disability Index (HAQ-DI)15 and Cochin Hand Function Scale18), occurrence of side effects and survival.
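As a minimal illustration of how the score is built up, the sketch below sums 17 site scores, each graded 0 to 3, into a total between 0 and 51. The function name and the validation rules are illustrative assumptions, not part of the study's data capture system.

```python
def mrss_total(site_scores):
    """Sum 17 site scores (each 0, 1, 2 or 3) into a total mRSS (0 to 51)."""
    if len(site_scores) != 17:
        raise ValueError("the mRSS is assessed at exactly 17 body sites")
    if any(score not in (0, 1, 2, 3) for score in site_scores):
        raise ValueError("each site is scored on a 0-3 scale")
    return sum(site_scores)


# Hypothetical patient: 10 sites scored 1 and 7 sites scored 2
print(mrss_total([1] * 10 + [2] * 7))
```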
In an observational study, patient characteristics differ between groups and any differences in outcomes might be driven by those characteristics rather than the treatments (confounding by indication). In each of the analyses (for the different outcome measures), all variables associated with the outcome were considered as confounders.19 ,20
Differences between protocols at baseline
The Kruskal-Wallis test was applied for continuous variables and Fisher's test for categorical variables.
Influence of baseline characteristics on mRSS at baseline and over time
The association between baseline variables and mRSS was assessed by simple linear regressions, entering each characteristic separately as a predictor of mRSS. To examine how each variable affected the progression of mRSS, the regression equation was modified by adding a term for time and its interaction with the baseline predictor value.
Differences in the changes between groups for all outcomes
Inverse probability of treatment (IPT) weights equalise the distributions of confounders between the treatment groups, thus removing confounding by indication.21 Treatment probabilities were computed using multinomial logistic regressions, with the baseline values of the selected confounders as predictors.22 Censoring weights rebalance the data such that the distributions of confounders remain unchanged throughout the study. For each observation, the probability of remaining uncensored given the baseline values of the confounders, the initial protocol and a cubic spline for time was calculated using a pooled logistic regression model.23 Multiplying both weights yielded the IPT and inverse probability of censoring (IPTC) weights. Weights >20 were truncated at that value.24
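The weighting step can be sketched as follows, taking the modelled probabilities as given rather than fitting the multinomial and pooled logistic models. This is an illustration under stated assumptions: the weights are unstabilised, truncation is applied to the combined weight, and the function and argument names are hypothetical. The text does not specify these details.

```python
def iptc_weight(p_treatment, p_uncensored, cap=20.0):
    """Combined treatment and censoring weight for one observation.

    p_treatment:  modelled probability of the protocol actually received.
    p_uncensored: modelled probability of still being in the study at this visit.
    cap:          truncation value (20 in the text); applying it to the product
                  of the two weights is an assumption.
    """
    for p in (p_treatment, p_uncensored):
        if not 0.0 < p <= 1.0:
            raise ValueError("probabilities must lie in (0, 1]")
    weight = (1.0 / p_treatment) * (1.0 / p_uncensored)
    return min(weight, cap)


# A patient whose protocol was likely given their baseline profile gets a small weight
print(iptc_weight(0.25, 0.8))
# An unlikely protocol choice would give a very large weight, so it is truncated
print(iptc_weight(0.02, 0.9))
```

Truncation trades a small amount of residual confounding for stability, since a handful of very large weights would otherwise dominate the weighted regression.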
Treatment effects were assessed using IPTC-weighted linear regression models, which include an intercept, a time term, indicator variables for treatment groups and interactions between time and treatments. The model followed an intention-to-treat approach. Differences in the interaction terms reflected differences in the evolution of outcome.
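One way to write the model described above is shown below. The notation is an assumption introduced for illustration: $Y_{it}$ is the outcome for patient $i$ at time $t$, $T_{ik}$ indicates that patient $i$ started on protocol $k$, and one protocol (here $k=1$) is taken as the reference.

```latex
\mathrm{E}[Y_{it}] = \beta_0 + \beta_1 t
  + \sum_{k=2}^{4} \gamma_k T_{ik}
  + \sum_{k=2}^{4} \delta_k \,(t \times T_{ik})
```

Under this parameterisation, the slope over time is $\beta_1$ in the reference protocol and $\beta_1 + \delta_k$ in protocol $k$, so a between-group difference in the evolution of the outcome corresponds to the joint hypothesis $\delta_2 = \delta_3 = \delta_4 = 0$.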
Cochin hand function data were log-transformed (after adding one to each value) to correct for a highly right-skewed distribution. CIs for the difference of logs were back-transformed, yielding a percentage difference between predicted baseline and 12-month levels.
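The transformation and back-transformation can be illustrated as below. This is a sketch under the assumption that natural logarithms were used; the function names are hypothetical, and the resulting percentage strictly applies to the shifted score (value plus one).

```python
import math


def log_plus_one(score):
    """Transform a Cochin score as log(score + 1)."""
    return math.log1p(score)


def percent_change(log_difference):
    """Back-transform a difference of logs into a percentage change."""
    return math.expm1(log_difference) * 100.0


# Hypothetical predicted scores: 19 at baseline, 9 at 12 months
diff = log_plus_one(9) - log_plus_one(19)
print(round(percent_change(diff), 1))
```

The same back-transformation applied to each CI limit gives an interval on the percentage scale.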
Because of missing data at baseline for confounders, multiple imputation by chained equations was applied with STATA V.13.1. Imputations were performed separately for each different outcome model. Moreover, each analysis was restricted to the subset of patients with available outcome data at baseline.
Kaplan-Meier curves, adjusted using IPT weights, provide estimates of the cumulative probability of surviving in each of the protocols. An IPT-weighted Cox regression, including indicator variables for the protocols, was used to test for differences in survival between protocols. Both overall and adverse event-free survival were examined.
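A weighted Kaplan-Meier estimate can be sketched as below: each patient contributes their weight, rather than a count of one, to the numbers of deaths and patients at risk. This is a minimal illustration with hypothetical names and data, not the study's implementation (the analysis was run in STATA).

```python
def weighted_km(times, events, weights):
    """Weighted Kaplan-Meier curve as a list of (time, survival) pairs.

    times:   follow-up time for each patient
    events:  1 if the patient died at that time, 0 if censored
    weights: IPT weight for each patient
    """
    data = sorted(zip(times, events, weights))
    at_risk = sum(w for _, _, w in data)  # weighted number at risk
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0.0
        leaving = 0.0
        # Gather everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            _, event, w = data[i]
            leaving += w
            if event:
                deaths += w
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical data: months of follow-up, death indicator, equal weights
print(weighted_km([1, 2, 3, 4], [1, 0, 1, 0], [1.0, 1.0, 1.0, 1.0]))
```

With all weights equal to one, this reduces to the ordinary Kaplan-Meier estimator.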
In total, 326 patients from 50 centres (19 countries) were recruited into the study (figure 1): 160 from mainland Europe and the Middle East, 134 from the UK, 15 from Australia and 17 from North America (six centres from Australia and North America joined after the initial recruitment wave). Not being a randomised study, the number of patients starting on each protocol differed: 65 (19.9%) methotrexate, 118 (36.2%) MMF, 87 (26.7%) cyclophosphamide and 56 (17.2%) no immunosuppressant treatment. Median (IQR) doses are shown in online supplementary table S3.
Baseline characteristics of patients
The median mRSS (21, IQR 16–27) and its distribution did not differ across the four treatment groups (p=0.306) (table 1). There were significant differences between treatment groups in gender (patients in the cyclophosphamide group were less likely to be female, p=0.003) and duration of skin thickening (longest in the ‘no immunosuppressant’ group, p=0.001). Also, patients in the cyclophosphamide group were more likely to have had previous immunosuppression (p=0.007) or steroid treatment (p=0.001). At baseline, 94 (28.8%) patients were taking oral corticosteroids, with a median dose of 10 mg/day (range 2.5–60 mg/day).
There were significant differences between groups for presence of pulmonary fibrosis, cardiac, renal and muscle involvement. Patients on cyclophosphamide were more likely to have pulmonary fibrosis (p=0.036 across groups) or cardiac involvement (p=0.009 across groups). Patients in the ‘no immunosuppressant’ group were more likely to have renal involvement (p=0.039), and the methotrexate group had more frequent muscle involvement (p=0.002).
Scores for the HAQ-DI, Functional Assessment of Chronic Illness Therapy (FACIT) fatigue and Short-Form 36 (SF36) physical and mental indexes did not differ significantly between groups. However, there were significant differences across groups in the Cochin Hand Function Scale (CHFS), which was poorest in the cyclophosphamide group (p=0.025).
As anticipated in a study of patients with early dcSSc, there was substantial use of concomitant medications (see online supplementary table S4).
Progression through the study
Figure 1 shows how patients progressed through the study. Overall, 276 patients (84.7%) remained in the study at 12 months of follow-up and 234 (71.7%) completed 24 months (or reached the last study visit date of 30 September 2015).
Changes in protocol
A total of 60 (18.4%), 12 (3.7%) and 1 (0.3%) patients changed protocol one, two or three times during the study, respectively. Among patients still in the study, adherence to initial protocol at 24 months for the different cohorts was 76.2% (methotrexate), 79.7% (MMF), 79.2% (cyclophosphamide) and 73.3% (no immunosuppressant) (see online supplementary figure S1). In the no immunosuppressant cohort, 10 out of 56 patients commenced an immunosuppressant (figure 1).
Withdrawals and deaths
In total, 35 patients (10.7%) died and 42 (12.9%) withdrew from the study (including lost to follow-up). Of the 35 deceased patients, 31 cases were primarily attributed to SSc-related causes (26 most likely primarily cardiorespiratory, 2 renal crises, 2 gastrointestinal (one aspiration) and 1 peritonitis (on peritoneal dialysis following renal crisis)), 3 died of cancer (1 nasopharyngeal, 1 rectal, 1 colorectal) and in 1 case the cause was unknown.
Influence of baseline variables on the initial skin score and on skin score trajectory
Table 2 summarises the effect of different characteristics on the initial mRSS and its subsequent trajectory, as analysed with linear regression.
Using the associations described by table 2, the confounders identified for the skin score were age, duration of skin thickening, current or previous steroid use, anti-topoisomerase, anti-RNA polymerase III, pulmonary fibrosis, pulmonary hypertension, cardiac, renal and muscle involvement, as well as HAQ-DI, Cochin hand function and FACIT fatigue scores (see online supplementary table S5 for lists of confounders and online supplementary tables S6– S13 for each model's confounder selection process).
Changes in skin score over time in the different treatment groups
The mean change in mRSS after 12 and 24 months was −2.9 and −6.7 units, respectively. Based on a weighted regression model, there were statistically significant reductions in mRSS in all four treatment groups at 12 months (−4.0 (−5.2 to −2.7) units for methotrexate, −4.1 (−5.3 to −2.9) for MMF, −3.3 (−4.9 to −1.7) for cyclophosphamide and −2.2 (−4.0 to −0.3) for the no immunosuppressant group), but the differences between treatments were not significant (p=0.346) (table 3 and figure 2).
Changes in secondary outcomes over time in the different treatment groups
After adjusting for potential confounders, the rates of change of FVC and DLCO were not significantly different between the four treatment groups (p=0.460 and 0.505, respectively) (table 3).
However, in a subset of patients with pulmonary fibrosis or suspected pulmonary fibrosis (cases confirmed on high-resolution CT (HRCT) irrespective of FVC or DLCO, or with one of the following if HRCT not performed: FVC or DLCO under 55% predicted or definite bibasal shadowing on X-ray), there was a significant difference in the change rate of FVC over time (p=0.035). Patients initially prescribed cyclophosphamide demonstrated 7.4% absolute increase in FVC (% predicted) compared with 2.0% decrease for methotrexate, 3.2% increase for MMF and 4.0% increase for the ‘no immunosuppressant’ group (table 3).
Functional ability and hand function
Changes over time for the HAQ-DI and CHFS did not differ between protocols (p=0.130 and 0.073, respectively), regardless of adjustment (table 3).
Development of internal organ involvement
This is described in online supplementary figure S2.
Comparison of survival between treatment protocols
Survival was lowest in the no immunosuppressant group at both 12 and 24 months but differences between protocols were not statistically significant either before (p=0.389) or after weighting (p=0.440). In the adjusted model, at 24 months, those in the no immunosuppressant group had a predicted survival rate of 84.0% compared with 94.1% for methotrexate, 88.8% for MMF and 90.1% for cyclophosphamide (figure 3). Patients with lung involvement (pulmonary fibrosis and/or hypertension) at baseline had significantly poorer survival than those without: at 24 months, their predicted survival rate was 74.6% versus 91.7% (p<0.0005) and similarly for cardiac involvement, 71.6% versus 90.7% (p<0.0005).
Of the 75, 182 and 101 patients who were ever on methotrexate, MMF or cyclophosphamide, respectively, 29 (38.7%), 40 (22.0%) and 23 (22.8%) were reported to have had side effects, necessitating drug discontinuation in 9 (12.0%), 14 (7.7%) and 5 (4.5%) patients, respectively. A survival analysis on protocol exits due to adverse effects showed no differences in the tolerability of the three treatments (p=0.212) (see online supplementary figure S3).
Our main findings were, first, that there were no significant differences in outcome between the four treatment protocols (methotrexate, MMF, cyclophosphamide, no immunosuppression), although there may be a signal in favour of immunosuppression for early dcSSc. Although skin score improved in all treatment groups, the improvement was smallest in the no immunosuppressant group, which also had the highest mortality. Second, ESOS confirms the relative effectiveness of cyclophosphamide in patients with pulmonary fibrosis.25 ,26
An important point when interpreting our findings (and therefore a note of caution) is that the ‘no immunosuppressant’ group was not a control group. Patients in this group had a longer disease duration than the other three groups and were more likely to have renal involvement.
Our findings lend support to two recently published studies, the Autologous Stem Cell Transplantation International Scleroderma (ASTIS) trial of autologous stem cell transplantation27 and the Scleroderma Lung Study (SLS) II (comparing MMF and cyclophosphamide),26 which suggest benefit, including in mRSS, from immunosuppression (as did SLS 125). In ASTIS, those patients randomised to cyclophosphamide had an 8.8 unit fall in mRSS (from 25.8) at 24 months (compared with 3.3 in ESOS over 12 months), but the cyclophosphamide protocol was more intense, and the patients had more severe disease (patients with the highest mRSS at baseline tend to improve most quickly,4 as also demonstrated by our own findings (table 2)). The mRSS fell by 19.9 units in those patients randomised to stem cell transplantation27 (and therefore intensive immunosuppression). In SLS 1,25 patients with dcSSc randomised to cyclophosphamide experienced a 5.3 unit fall in mRSS at 12 months (compared with 3.3 in ESOS), whereas mRSS fell by 1.7 on placebo (compared with 2.2 units in the ESOS ‘no immunosuppressant’ group). In SLS II,26 mRSS at 24 months fell by 4.9 units on MMF (compared with 4.1 units in ESOS at 12 months) and by 5.4 after 12 months' treatment with cyclophosphamide, although these values are not directly comparable because they relate to patients with limited cutaneous SSc and dcSSc combined.
The methodological strength of ESOS, which built upon experience gained in a previous, smaller observational study,28 was its design: its standardised protocols emulated the conditions of a clinical trial, and although not randomised, patients were enrolled into four homogeneous treatment arms with well-defined interventions and a systematic record of protocol changes and exits. Entry criteria were deliberately inclusive: RCTs often exclude patients with internal organ involvement, for whom immunosuppression is most likely to be beneficial. By recruiting 326 patients from 50 centres, ESOS represents a large cohort of patients with very early dcSSc (median duration of skin thickening 11.9 months): its data will serve as a benchmark when designing and interpreting future clinical trials. This is especially relevant with a number of novel treatment approaches currently being explored, including biological agents. For example, in a recent RCT of tocilizumab,29 mRSS fell over 24 weeks by 3.9 units from 26 in the 43 tocilizumab-treated patients and by 1.2 units from 26 in the 44 placebo-treated patients, this latter fall being comparable to the ESOS ‘no immunosuppressant’ response. When comparing these studies, the higher baseline mRSS in the tocilizumab study should be borne in mind.
The main weakness of observational studies is that each patient's outcome on her/his treatment arm cannot be completely disentangled from her/his initial characteristics. For instance, ESOS has verified that patients with lung and cardiac involvement tend to be prescribed cyclophosphamide. However, adjusting using IPT weights minimises the problem of confounding by indication.
In conclusion, observational studies offer a rich population-wide perspective assessing treatment effects in a real-world setting. ESOS achieved its aim of following a large international cohort of patients with early dcSSc over 2 years, each of whom was treated according to one of four protocols. The message for clinicians is that there is a weak signal to support using immunosuppressants for early dcSSc (and in particular cyclophosphamide for patients with pulmonary fibrosis). However, it is clear that there remains a pressing need for the development of more effective and targeted treatments.
The authors are grateful to Dr Holly Ennis for study set-up and to her and Dr Graham Dinsdale for project coordination during the earlier phases of the study. Thanks also to members of the independent oversight board: Stephen Cole, Dinesh Khanna and Frank Wollheim.
Handling editor Tore K Kvien
Contributors ALH, ML, RH, LM, AS, EB, LC, JHWD, OD, KF, WJG, RO, MV and CPD were members of the Steering Committee and designed the study. ALH, RH, LM, LC, JHWD, OD, MV, CA, VHO, DF, MH, MM-C, AB-G, OM, ACJ, PJ, WS, PM, FCH, CA, MEA, ED, RM, MA, MHB, LC, ND, HG, PL, YA, KC, SJ, AJM, NM, UM-L, GR, MB, JR, PEC, AF, EH, JH, MI, JSM, J van L, SP, SP, AR, JS, BC, CS, TS, DJV, CG, GT and CPD were principal investigators at the different sites and recruited patients. XP was study coordinator. SP and ML were responsible for the statistical analysis. ALH, XP, SP, ML, RH, LM, AS and CPD wrote the draft report, and all authors reviewed the report, provided comments and approved the final report.
Funding ESOS was funded by a grant from the European League Against Rheumatism (EULAR) Orphan Disease Programme. Additional funding from Scleroderma and Raynaud's UK allowed a 1-year extension of the study.
Competing interests ALH has done consultancy work for Actelion, served on a Data Safety Monitoring Board for Apricus, received research funding and speaker's fees from Actelion, and speaker's fees from GSK. JHWD has consultancy relationships and/or has received research funding from Actelion, BMS, Celgene, Bayer Pharma, Boehringer Ingelheim, JB Therapeutics, Sanofi-Aventis, Novartis, UCB, GSK, Array Biopharma, Active Biotech, Galapagos, Inventiva, Medac, Pfizer, Anamar and RuiYi and is stock owner of 4D Science GmbH. OD has received consultancy fees from 4D Science, Actelion, Active Biotech, Bayer, Biogenidec, BMS, Boehringer Ingelheim, EpiPharm, Ergonex, espeRare Foundation, Genentech/Roche, GSK, Inventiva, Lilly, Medac, Medimmune, Pharmacyclics, Pfizer, Serodapharm, and Sinoxa and received research grants from Actelion, Bayer, Boehringer Ingelheim, Ergonex, Pfizer and Sanofi, and has a patent mir-29 for the treatment of systemic sclerosis licenced. WG has received teaching fees from Pfizer. FH has received research funding from Actelion. MEA has undertaken advisory board work and received honoraria from Actelion, and received speaker's fees from Bristol-Myers Squibb. LC has done advisory board work for Gilead and served on Data Safety Monitoring Boards for Cytori and Reata. HG has done consultancy work and received honoraria from Actelion. UM-L is funded in part by EUSTAR/EULAR. JMvL has received honoraria from Eli Lilly, Pfizer, Roche, MSD and BMS. AR receives funding from AstraZeneca. CPD has done consultancy for GSK, Actelion, Bayer, Inventiva and Merck-Serono, received research grant funding from GSK, Actelion, CSL Behring and Inventiva, received speaker's fees from Bayer and given trial advice to Merck-Serono.
Patient consent Obtained.
Ethics approval The Ethics Committee of each centre approved the study.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement At present, unpublished data from the study are not available for sharing. This position may change in 6–12 months' time.