‘All other things being equal, we may assume the superiority of the demonstration which derives from fewer postulates or hypotheses’—Aristotle
Therapeutic options for systemic lupus erythematosus (SLE) are still limited. A number of conventional immunosuppressives are used along with glucocorticoids and antimalarials, but the successes of biological therapies in the arthritides have not been duplicated for SLE. To date, only one biologic has succeeded in phase III trials1,2 and been approved for use in SLE, but its proper role in the overall management of the patient with SLE has remained incompletely defined.3 Rituximab is used off-label on the strength of many observational studies,4,5 but two large trials failed to demonstrate efficacy.6,7 A distressingly large number of biologics have failed in late-stage trials. Some have completely disappeared from further consideration (the B-cell tolerising oligonucleotide construct abetimus,8 the anti-CD20 monoclonal ocrelizumab9); for some others, the future looks bleak (the BLyS antagonist tabalumab,10,11 the anti-CD22 monoclonal epratuzumab) and for yet others, investigators and sponsors are struggling to find the best way forward (the T-cell costimulation blocker abatacept,12,13 the BLyS/APRIL antagonist atacicept14,15).
The failure of many SLE trials may have been due, at least in part, to the simple fact that the drugs being studied were not very effective, but it has also been suggested that clinical trial methodology for SLE has been suboptimal. Any positive or successful trial could therefore give us the dual benefit of moving the field forward with a new compound, while also providing us with useful lessons for the design of clinical trials.
The interferon (IFN) system has been identified as a key pathway in the immunopathogenesis of SLE.16–22 It was therefore logical to develop specific IFN antagonists in order to try to control the immunological activation in this disease. A phase II clinical trial with the anti-IFN monoclonal antibody rontalizumab failed to demonstrate overall efficacy, but, unexpectedly, seemed to benefit patients with a low IFN signature.20
Khamashta et al23 present the results of a randomised clinical trial of the anti-IFNα monoclonal antibody sifalimumab in SLE. In this multicentre trial, 431 patients with active SLE were randomised to one of four arms, and treated with one of three dosages of sifalimumab or placebo, all added to stable conventional background therapy. After 52 weeks, the patients were assessed using the SLE response index (SRI-4; an index that requires the patient to have improved by at least four points on the systemic lupus erythematosus disease activity index (SLEDAI), without worsening on the British Isles Lupus Assessment Group (BILAG) index or the physician global assessment). Based on this result, the authors conclude that the drug was more effective than placebo in achieving the prespecified primary outcome; multiple secondary outcomes were also achieved. Without any doubt, this is good news, and further development of this compound will hopefully confirm that a new mechanism of action for the treatment of this disease has indeed been identified. Further research will be needed to understand why sifalimumab seemed to perform better in this trial than rontalizumab had done previously, how the results compare with those of other ongoing trials with agents targeting the same pathway, such as the anti-IFN receptor monoclonal antibody anifrolumab, and what role the IFN signature has in identifying patients more or less likely to respond to such treatments. Thus, the trial speaks to a number of interesting issues for the specific development of sifalimumab, and it also addresses some points that may be of importance for the entire field of clinical trials in SLE.
First about sifalimumab. The clinical trial reported here was fully sponsored by the manufacturer and represents a very solid piece of work. The authors correctly identify that this is the first trial formally to demonstrate the clinical efficacy of targeting the IFNα pathway in SLE. The authors can also be commended for having found an appropriate balance between enthusiasm over these positive results and restraint given the limitations of this trial. The latter are discussed to some extent in the paper, but it may be useful to look at them more specifically:
The absolute difference in the response percentages between the active treatment arms and the placebo group was rather small: 45.4% for placebo, 56.5%–59.8% for the three treatment groups. If a difference in the response rate of 11.1–14.4 percentage points represents the true difference between treatment and placebo, an unbiased observer might ask about the clinical significance. This same question has also been asked about belimumab, which had similarly small effect sizes in its two phase III trials.2 My answer to this particular critique is that even a small effect size could very well be clinically relevant for several reasons. First of all, the effect size is obtained on a background of conventional ‘standard-of-care’ treatment; so, it is an added effect to what is achievable without the new drug. Second, the small effect size might indicate that only a subset of patients will benefit, but in that case each of the individual patients who do respond has a (much) bigger effect. Most importantly, the measures used in SLE clinical trials are not particularly sensitive or responsive, and the ‘noise’ generated by their complexity could easily drown out a proper ‘signal’.
In the trial reported here, there was no very clear relationship between the three dosages of sifalimumab and the clinical responses. The absence of a dose–response curve is a slight concern in phase II trials, but could of course indicate that maximal competition with the target molecule was already achieved at the lowest dosage. Likewise, the difference in the placebo response seen in this trial between the different geographical regions raises a slight concern.
Some of the analyses in the trial by Khamashta et al may suggest tachyphylaxis with this treatment, that is, the loss of efficacy over time. This is most notably seen in the figure showing the cutaneous lupus activity and severity index (CLASI) response, which seems to come and go, especially for the 600 mg group. Tachyphylaxis is always a concern for medications intended for long-term use, but it is especially worrisome with a monoclonal antibody that could, in theory, give rise to antidrug antibodies. Reassuringly, direct testing did not reveal anti-sifalimumab antibodies, and it can be hoped that the pattern observed in the figure and in some of the other outcomes was a spurious finding.
While randomised clinical trials (RCTs) are ideal for establishing efficacy, they cannot always address safety concerns sufficiently. With IFN-targeting treatments, one of the main a priori concerns could be that inhibition of the IFNα pathway would jeopardise the host response to viruses such as influenza, and even host defences against malignancies. Fortunately, no signals indicating such risks emerged in this trial, but vigilance must be maintained and long-term follow-up will be essential.
The CLASI response, which the authors identify as the response with the most convincing treatment effect for sifalimumab, was also the outcome that had a placebo response of about 60%—which is very high by any standard. This might suggest that there is something inherently ‘easy’ about this particular outcome. It would have been of interest to determine if a high-threshold outcome for cutaneous manifestations would have been equally convincing.
And then second: can this trial teach us something about how to do clinical trials in SLE? I believe it can, but the lesson I would highlight, which has to do with the manner in which the primary outcome was defined, is perhaps not one that everyone would agree on.
Choosing a primary outcome in an RCT is no simple matter, especially in a relatively early phase of development when it is hard to know what to expect from the new treatment. Nevertheless, prespecifying the primary outcome of a clinical trial even in phases I and II has become an almost sacrosanct part of trial design, driven by a variety of motives. Statistical purity is one, regulatory authorities have been outspoken in this regard, and it has also become clear that the investment community reacts very strongly (probably too strongly) to the news of whether a trial ‘achieved’ or ‘missed’ its primary endpoint, and tens or hundreds of millions may be riding on whether the p value was 0.04 or 0.06.
To be sure, when a phase III trial is launched with the express purpose of getting a drug approved, it is entirely reasonable to insist on having a prespecified primary outcome. But the stated purpose of phase II trials is not to demonstrate the efficacy of the drug beyond any doubt, but to be informed about the optimal way to move forward in terms of dosages, dosing schedules, patient characteristics and, yes, outcomes. I find it very peculiar that so much attention is being paid to the primary outcomes of phase II trials, when in fact it is perfectly legitimate to use the results of the phase II trials to decide on a better outcome for the phase III trials that follow.
In defining the outcome of a trial, some choices have to be made upfront. For example, if there are more than two groups, as in this trial, will the primary comparison be made between all groups (which could then be followed by post-hoc testing for pairs of groups), or should the primary testing be done between two of the four groups? Clinical trials also face the important problem of how to deal with patients who do not complete the trial as intended. There has been a notable trend towards more and more complex statistical methodologies, and this was also the case in the current trial. The prespecified statistical analysis plan (SAP), as described in the paper, is complex indeed. It entails a trend test of the four arms of the trial followed by pairwise comparisons between each sifalimumab group and placebo. Perhaps most notably, the SAP specified an interim analysis that was used to determine at which level of the p value the final analysis would be ‘declared statistically significant’. And as it happened, this level was set at almost exactly 0.10, a marked deviation from conventional practice. As is widely known, by convention, p values of <0.05 are considered significant, because the risk of mistakenly asserting a true difference when the observed difference is due to chance is <1 in 20. By setting this cut-off at 0.1, one accepts a risk of 1 in 10 that the trial suggests an effect when in fact there is none. Khamashta et al explain clearly why they took this approach, and as the conventional cut-off of 0.05 is not based on anything other than, well, convention, it seems to me that we can accept this for now, as future phase III trials will of course be held to the higher standard.
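The practical meaning of moving the cut-off from 0.05 to 0.10 can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions and not the analysis used in the trial: it simulates hypothetical two-arm trials in which the drug truly has no effect, tests each with a plain Pearson χ2 test, and counts how often the p value falls below each cut-off. The arm size (100), the shared response rate (45%), the number of simulated trials and the function names are all assumptions chosen for illustration.

```python
import math
import random


def chi2_p_value(resp_a, n_a, resp_b, n_b):
    """Two-sided p value of an uncorrected Pearson chi-square test on a 2x2 table."""
    total = n_a + n_b
    resp = resp_a + resp_b
    non = total - resp
    if resp == 0 or non == 0:
        return 1.0  # every patient had the same outcome, nothing to test
    # Expected cell counts under the null hypothesis of equal response rates.
    expected = [n_a * resp / total, n_a * non / total,
                n_b * resp / total, n_b * non / total]
    # In a 2x2 table, |observed - expected| is identical in all four cells.
    diff = resp_a - expected[0]
    chi2 = diff ** 2 * sum(1 / e for e in expected)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def false_positive_rate(alpha, n_trials=4000, n_per_arm=100, true_rate=0.45, seed=0):
    """Fraction of simulated null trials (no real effect) declared significant."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        a = sum(rng.random() < true_rate for _ in range(n_per_arm))
        b = sum(rng.random() < true_rate for _ in range(n_per_arm))
        if chi2_p_value(a, n_per_arm, b, n_per_arm) < alpha:
            hits += 1
    return hits / n_trials


rate_05 = false_positive_rate(0.05)
rate_10 = false_positive_rate(0.10)
print(f"cut-off 0.05: {rate_05:.3f} of null trials declared significant")
print(f"cut-off 0.10: {rate_10:.3f} of null trials declared significant")
```

Because both calls use the same seed, they evaluate the same simulated trials, so the two fractions differ only through the cut-off. They should come out near 1 in 20 and 1 in 10 respectively, with some simulation noise.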
So, it would seem that for the authors and sponsors of the sifalimumab trial, their approach worked out very well. The statistical machinery churned out a p value of 0.053 for the trend test, allowing the authors to declare significance in this case. The investment community will no doubt react positively, and the sponsor is continuing the development of this monoclonal that will, hopefully, become a new effective therapy for SLE one day. All is well that ends well.
However, it seems ironic that a simpler statistical approach would have given a clearer result. Specifically, one could have reasoned that the first analysis looks at whether all the patients on sifalimumab taken together as one group did better than those on placebo—certainly a reasonable clinical question. Thus, one would compare the aggregate of the three sifalimumab groups (treating them as one group) with placebo, and this comparison would be, for the SRI-4 outcome, 49 responders out of 108 on placebo versus 188 out of 323 on sifalimumab. Using a very simple test (eg, χ2 or Fisher's exact), this would give a p value of about 0.02. Or, one could also think that the dosing levels should all be considered as separate treatments, and ask the question how each of them did compared with placebo. That could also be done in a simple way, namely, to test each of the three dosages (treating them as though they were independent) separately, and this would yield ‘conventional’ significance (ie, p<0.05) only for the highest dose level. This approach could be criticised for involving ‘multiple comparisons’, but there is compelling biological plausibility in all of these, because if any of a range of dosages of a specific cytokine antagonist is going to work, it stands to reason that it will be the highest one.
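The pooled comparison described above can be checked in a few lines. The following is a minimal sketch, assuming an uncorrected Pearson χ2 test on the 2×2 table built from the counts quoted in the text (49 of 108 on placebo, 188 of 323 on sifalimumab); the function name is illustrative, and the p value is computed from the χ2 distribution with 1 degree of freedom using only the standard library.

```python
import math


def pearson_chi2_2x2(resp_a, n_a, resp_b, n_b):
    """Uncorrected Pearson chi-square statistic and two-sided p value for a 2x2 table."""
    total = n_a + n_b
    resp = resp_a + resp_b
    non = total - resp
    # Expected cell counts if both groups shared one response rate.
    expected = [n_a * resp / total, n_a * non / total,
                n_b * resp / total, n_b * non / total]
    # In a 2x2 table, |observed - expected| is identical in all four cells.
    diff = resp_a - expected[0]
    chi2 = diff ** 2 * sum(1 / e for e in expected)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Counts as quoted in the text: SRI-4 responders / patients per group.
chi2, p = pearson_chi2_2x2(49, 108, 188, 323)
print(f"placebo: {49 / 108:.1%} responders, pooled sifalimumab: {188 / 323:.1%}")
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```

A continuity-corrected χ2 test or Fisher's exact test would give a somewhat different p value, but one of the same order.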
So, in my common-sense way of looking at it, this trial could have achieved its objective by either of two simple but reasonable statistical approaches, and I therefore think it is a real pity that the authors chose a more complex and sophisticated but less intuitive approach. It is an after-the-fact construction, but the data suggest that, had the authors chosen a simpler and more straightforward statistical approach, they would have had a beautiful and very convincing result. As reported, this study, while positive on the whole, may also contain an important warning to investigators to be cautious when deciding on sophisticated but unintuitive statistical approaches. Of course, this concern applies to all trials and not only to those in SLE, but because many lupus trials have failed, some might hope that this could be solved by resorting to fanciful and complex statistical analyses. I do not believe that is the way to go.
Having said that, Khamashta et al report a positive trial in SLE, something that has been an uncommon occurrence for too long. After everything is said and done, patients with SLE and the specialists taking care of them may be heartened by a trial demonstrating that targeting the IFN pathway may be a way forward in the treatment of this elusive disease.
Footnotes
Competing interests: Research support and grants: AbbVie, Amgen, BMS, GSK, Pfizer, Roche and UCB. Consultancy; honoraria: AbbVie, Biotest, BMS, Celgene, Crescendo, GSK, Janssen, Lilly, Merck, Novartis, Pfizer, Roche, UCB and Vertex.
Provenance and peer review: Commissioned; externally peer reviewed.