Dobrota et al1 have analysed the EUSTAR (EULAR Scleroderma Trials and Research) systemic sclerosis (SSc, scleroderma) database using a subset of patients with diffuse cutaneous SSc (dcSSc) to determine predictors of skin improvement over 1 year in randomised trials. The idea is to enrol more informative patients in a randomised controlled trial (RCT), as incident dcSSc is rare. However, you could want patients who progress/worsen (if not on effective treatment) or those who regress/improve more when on treatment, or some of each subgroup in the sample studied. More than 900 patients were included in the EUSTAR early dcSSc study,1 and, like a bell curve with a smaller tail, a quarter improved, two-thirds had no change and one-tenth worsened. If the sample is enriched for patients who may improve with effective treatment compared with a control, or conversely may worsen without effective treatment, then more informative patients are included and the study size can be smaller. They found that by varying the baseline modified Rodnan skin score (mRSS), the proportion of those who regressed went from 13% to 19% when the mRSS was <18 and 25, respectively. Therefore, the range of mRSS from 18 to 25 was most likely to enrich for those who would progress over an observation period. This may simply be because patients with mid-range skin scores in early dcSSc are less subject to a floor or ceiling effect; that is, very low scores may worsen but are unlikely to improve (floor), and very high mRSS scores are unlikely to progress as they already reflect a high amount of skin involvement (ceiling). Conversely, they observed that 44% improved if the mRSS was more than 25. The mean mRSS was 16 at baseline in the EUSTAR analysis,1 so not all patients would be eligible for a trial. Using data from the Canadian Scleroderma Research Group (CSRG), Choy et al2 found that the baseline mRSS for patients with dcSSc initiating immune suppressive treatment was 21. This is aligned with the data from EUSTAR.
However, the generalisability of the findings must be interpreted within the context of different continents/regions, whose patients have varying rates of positive autoantibodies such as Topoisomerase 1 and RNA polymerase 3.3
Figure 1 demonstrates that a sample of patients with early dcSSc may shift towards mild improvement on a certain treatment compared with mild worsening on, for example, a placebo, but sometimes the outliers (a lot better or somewhat worse) inform efficacy results. Selecting patients within a narrower range of skin scores lessens the variability of patients (a smaller SD of mRSS translates to lower sample sizes needed in trials). Based on the current data, a study that investigates a treatment with only mild effect should not include patients with very high mRSS (ie, >25). However, the ASTIS (Autologous hematopoietic stem cell transplantation vs. intravenous pulse cyclophosphamide in diffuse cutaneous systemic sclerosis) trial for stem cell transplantation versus cyclophosphamide had no upper limit of mRSS (inclusion criteria of mRSS of at least 15) and the mean baseline mRSS was 25–26.4 So, in my opinion, it depends on what you think will yield the most efficient sample size. In figure 1, a population with early dcSSc, as an example, starts with baseline skin scores in A; patients worsen on average in the control arm and shift to B, or improve on average in the treatment arm and shift to C. The efficacy is determined by the between-group difference in the mean change in mRSS, that is, the difference between the mean changes leading to B and to C, or (C − A) − (B − A). The figure demonstrates that most of the patients overlap with the baseline mRSS (ie, they do not change very much). Merkel et al5 have studied trajectories of individual patient data from several dcSSc trials, showing large variability of mRSS over follow-up but major changes in only a minority of subjects.
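The between-group comparison described for figure 1 can be written out as a short worked equation. This is only a sketch: it assumes that both arms share the same mean baseline score A (as drawn in the figure), and the symbols with bars are notation introduced here for the group means rather than taken from the figure itself.

```latex
\Delta_{\text{efficacy}}
  = (\bar{C} - \bar{A}) - (\bar{B} - \bar{A})
  = \bar{C} - \bar{B}
```

With purely illustrative numbers (not data from any trial), a mean baseline of 20, a control arm that drifts to 22 and a treatment arm that falls to 17 would give (17 − 20) − (22 − 20) = −5 points, the same as 17 − 22. When the two arms start from the same mean, the baseline term cancels and efficacy reduces to the difference between the follow-up means.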
In the EUSTAR database, it was previously found that short disease duration, low mRSS at baseline and the presence of synovitis were predictive of patients who would progress over time.6 Interestingly, predictors of improvement are not simply the opposite of predictors of worsening. So, one cannot necessarily anticipate that those who improve would have longer disease duration, higher mRSS and no inflammatory arthritis. Predictors of improvement will obviously depend on what variables are collected and added to the statistical model. Data from two randomised trials in early dcSSc suggested that improvement of at least 20% in the mRSS over the next 1–2 years was strongly correlated with low baseline functional impairment (ie, a low baseline Health Assessment Questionnaire Disability Index (HAQ-DI)) and lower pain scores, but not with baseline mRSS.7
The data from this EUSTAR study appear to be generalisable to other early dcSSc populations. Approximately 60% were positive for Topoisomerase 1 and nearly all were positive for antinuclear antibody (ANA). One-fifth of the included population had synovitis, which is close to the 15% expected in an SSc cohort.8
So, who should be included in a trial to improve skin in early dcSSc? It depends on what you think a treatment can do. If you think there will be only a mild improvement in mRSS, maybe you want to study patients with high mRSS, so that the regression will be greater in the active group than in the control group. Alternatively, you may want to include subjects with a low mRSS if you think that these patients are more apt to worsen on background care, so that you can show a difference if a drug is effective.
In reality, sample size calculations depend on the distribution of the patients (variability, SD), so a more homogeneous baseline group with respect to the variable of interest will reduce sample size irrespective of which part of the spectrum you want to study. It also depends on the expected difference in how treatment groups will perform on average and, sometimes, on the distribution of that finding.9
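To make the link between variability and sample size concrete, the following is a minimal sketch using the standard normal-approximation formula for comparing two means. It assumes equal arm sizes, a common SD and a two-sided test; the function name `n_per_arm` and all input values are hypothetical illustrations, not figures from the EUSTAR analysis or any trial.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(sd, delta, alpha=0.05, power=0.80):
    """Approximate patients per arm for a two-arm comparison of mean change.

    Normal-approximation formula: n = 2 * (z_alpha + z_beta)^2 * (sd / delta)^2,
    assuming equal arm sizes, a common SD and a two-sided test.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_beta = NormalDist().inv_cdf(power)           # desired power
    return ceil(2 * (z_alpha + z_beta) ** 2 * (sd / delta) ** 2)


# Hypothetical inputs: the same expected between-group difference (4 mRSS
# points) with a broad versus a narrower spread of change in skin score.
broad = n_per_arm(sd=8.0, delta=4.0)
narrow = n_per_arm(sd=6.0, delta=4.0)
print(broad, narrow)
```

Under these assumed values, narrowing the SD from 8 to 6 points lowers the requirement from 63 to 36 patients per arm, because the sample size scales with the square of the ratio of SD to expected difference.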
Similarly, when planning a trial in rheumatoid arthritis, if the outcome is an American College of Rheumatology (ACR) response or change in the Disease Activity Score using a 28-joint count (DAS28), then enrolling patients with high disease activity will yield the best responses, whereas if the outcome is the proportion that achieves remission, then treating patients with lower baseline DAS28 scores will result in more patients attaining the outcome.
It may also be important in SSc trial design to consider what a desirable label will state for an effective drug in SSc: improving mRSS in early dcSSc versus less worsening (slowing progression). In reality, with a mean between-group difference of active versus control, both ends of the population are likely informing efficacy.
Also confounding how patients will perform in a trial is the natural history of early dcSSc, in which the skin score peaks in the first few years and the mRSS then improves, although early damage reflects the skin involvement in recent-onset dcSSc.10
The mRSS is a validated outcome.11,12 In this study, worsening was defined as mRSS worsening by 5 or more points and also by 25% over baseline skin score. Experts would agree that this defines worsening, as even a change of 3–7.5 points on mRSS is thought to be clinically relevant.13
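The worsening definition above can be expressed as a simple rule. This is a sketch of one reading of the text, in which both thresholds must be met (an absolute increase of at least 5 points and a relative increase of at least 25% over baseline); the function name `is_worsened` and the example scores are hypothetical.

```python
def is_worsened(baseline_mrss, followup_mrss, min_points=5, min_fraction=0.25):
    """Return True if the mRSS rose by at least `min_points` AND by at least
    `min_fraction` of the baseline score (assumed reading of the definition)."""
    change = followup_mrss - baseline_mrss
    return change >= min_points and change >= min_fraction * baseline_mrss


# Hypothetical patients: the same 5-point rise counts as worsening from a
# baseline of 16 (25% of 16 is 4) but not from a baseline of 24 (25% is 6).
print(is_worsened(16, 21), is_worsened(24, 29))
```

The example illustrates why a combined criterion matters: at higher baseline scores, a fixed 5-point rise is a smaller relative change and no longer satisfies the percentage threshold.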
Other SSc trials have tried to enrich the included patients so that they would change, such as the Scleroderma Lung Study and the digital ulcer prevention study (RAPIDS-2; RAndomized, double-blind, Placebo-controlled study with bosentan on healing and prevention of Ischemic Digital ulcers in patients with systemic Sclerosis (second trial)).14,15 In the former, the vast majority did not worsen or improve their lung function in either treatment arm, despite the inclusion of relatively early patients with interstitial lung disease (ILD), but patients with more fibrosis on lung imaging and higher baseline mRSS (23 or more) were more likely to benefit from cyclophosphamide treatment (as these patients were more apt to worsen on placebo).14 In the latter trial, in which all patients had to have a baseline digital ulcer, two-thirds had a recurrent ulcer over the 24-week study,15 whereas in an earlier trial of patients with digital ulcers over the previous year, but not currently, 60% had a subsequent ulcer over the study period, so enriching the population led to only slightly more events.16
In conclusion, enriching a population of patients with early dcSSc who may change over 1 year of follow-up depends on how you predict the population will shift on the variable of interest, such as the mRSS. Enriching a population as a strategy may reduce sample size (less variability of the included patients) and/or increase power, but the assumptions that are made will affect future labelling claims.
Competing interests JEP has had research grants and/or honoraria from Actelion, Bayer, Merck, Pfizer and Roche.
Provenance and peer review Commissioned; externally peer reviewed.