Abstract
Current drug development programmes produce high-quality data on the efficacy of new drugs and substantial data on safety, but little data on the actual applicability of the new product compared with standard of care. After successful registration and launch, such data require years to accumulate and often remain incomplete.
This viewpoint proposes a new trial design for phase 2 and 3 drug trials in rheumatoid arthritis. In this design the trial starts conventionally: patients who are inadequate responders to standard treatment (usually methotrexate) are randomised to receive the experimental drug or placebo on the background of continued (methotrexate) treatment. However, after 3 months all patients in the placebo group are additionally treated with one and the same standard of care treatment (usually an inhibitor of tumour necrosis factor), and all patients in the experimental group are additionally treated with a placebo corresponding to the chosen standard of care treatment.
This design allows primary assessment of efficacy and safety of the experimental drug compared with placebo at the ethically acceptable limit of 3 months, followed by secondary assessments of efficacy (including durability) and safety compared to standard of care. The secondary assessments are observational and thus more prone to bias, but it is argued that the potential for bias is limited in this setting. Widespread adoption of the design will greatly help to determine the place of a new product in the spectrum of treatment possibilities of rheumatoid arthritis.
Clinical drug development in rheumatoid arthritis (RA) is facing increasing challenges as standard of care improves, especially in developed countries. Increasingly, patients and their doctors refuse to participate in placebo-controlled trials in which inadequate treatment is continued for long periods of time. Additionally, fewer patients are available with high levels of disease activity.1 2 Recent trials have introduced a shorter placebo phase and more rigorous escape rules in case of lack of efficacy. A meta-analysis submitted to the Annals of the Rheumatic Diseases suggests that a 3-month comparison with placebo (on the background of methotrexate (MTX)) is indeed sufficient for proof of efficacy of current biologicals (including rituximab and abatacept).3 However, decreasing trial duration also decreases information on long-term exposure to the experimental drug. Therefore, in some designs non-responders or even all patients on placebo are offered the experimental drug after 12–20 weeks (eg, as has been done with tocilizumab,4 certolizumab5 and golimumab6 in recent trials), but this could introduce new ethical problems because the effect of the experimental drug is not yet known.
Another, increasingly pressing problem of current drug research is the lack of direct (“head-to-head”) randomised comparisons between the new drug and standard of care. Registration trials are designed to prove to the authorities that the new drug has intrinsic antirheumatic effects and an acceptable safety profile. Subsequent trials to define the place of such a new drug in the spectrum of possibilities are rarely performed, mainly because there is no independent funding mechanism to support such studies. On the side of industry there is little impetus to perform a head-to-head trial against an active comparator, as such trials pose several problems: (1) in the study question (ie, should one aim for superiority, equivalence or non-inferiority, and how is this defined?), (2) in the execution, with larger sample sizes due to smaller expected contrasts between the treatment groups and (3) in the interpretation, which may affect the label if the study is performed in the registration phase. Thus, during the lifetime of a drug little or no high-quality information is available to directly compare one drug with another. This viewpoint introduces a trial design fit for registration purposes that combines a short placebo phase with a high-quality observational head-to-head comparison.
A new design
Starting with a classical design, patients with “inadequate response” to MTX are randomised to receive the experimental drug or placebo (control group) on top of their background MTX therapy (fig 1). At 3 months (the maximum acceptable duration of placebo treatment for most patients with active RA) the primary endpoint in signs and symptoms, and perhaps radiographs,7 is compared between the two groups (primary comparison). Thereafter, all patients on placebo are switched to standard of care therapy, usually a biological. What this entails will depend on the characteristics of the included patients and the inclusion criteria. Therapy could be the same for all, or variable, as long as it is the treatment that would normally be given in this situation, even including non-biologicals. Obviously such treatment should not include modalities that have already failed in these patients. To retain blinding, all patients in the experimental arm should receive a placebo matching the standard treatment.
When we follow the patients for a total of 9 months, we can make the following secondary comparisons for efficacy (including durability):
change in disease activity of experimental group between 0 and 3 months vs change in control group between 3 and 6 months (comparison 1);
as above, experimental (0–6 months) vs control (3–9 months) (comparison 2);
as above, experimental (0–9 months) vs control (0–9 months) (comparison 3);
as above, control (0–3 months) vs control (3–6 months) (comparison 4).
The same comparisons can be made for safety, and these are potentially even more valuable because safety signals are less likely to be influenced by a preceding 3-month placebo phase. Strictly speaking, comparisons 1 and 2 should be regarded as observational because the prognostic similarity of the randomised groups at baseline is lost: in fact the control patients can now be regarded as a historical control group. Comparison 3 retains the internal validity of the original trial because both groups are followed from randomisation, but its underlying study question is somewhat artificial (ie, “is treatment with the experimental agent for 9 months better than treatment with placebo for 3 months, followed by control treatment for 6 months?”). Finally, comparison 4 is a within-group comparison where patients serve as their own control.
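As a minimal sketch of how the four secondary comparisons line up, the Python fragment below encodes each comparison as a pair of (arm, start month, end month) windows and computes the difference in mean change between them. The patient records, the scores and the function names are hypothetical and serve only to illustrate the bookkeeping; no real trial data are used and no statistical testing is attempted.

```python
from statistics import mean

# Each secondary comparison pairs two windows: (arm, start month, end month).
COMPARISONS = {
    1: (("experimental", 0, 3), ("control", 3, 6)),
    2: (("experimental", 0, 6), ("control", 3, 9)),
    3: (("experimental", 0, 9), ("control", 0, 9)),
    4: (("control", 0, 3), ("control", 3, 6)),
}

# Hypothetical disease activity scores by month (illustrative values only).
patients = [
    {"arm": "experimental", "score": {0: 6.0, 3: 4.0, 6: 3.5, 9: 3.5}},
    {"arm": "experimental", "score": {0: 5.0, 3: 3.0, 6: 3.0, 9: 2.5}},
    {"arm": "control", "score": {0: 6.0, 3: 5.5, 6: 4.0, 9: 3.5}},
    {"arm": "control", "score": {0: 5.0, 3: 4.5, 6: 3.0, 9: 3.0}},
]


def mean_change(records, arm, start, end):
    """Mean change in disease activity between two visits for one arm."""
    changes = [
        r["score"][end] - r["score"][start] for r in records if r["arm"] == arm
    ]
    return mean(changes)


def secondary_comparison(records, number):
    """Difference in mean change: first window minus second window."""
    (arm_a, a0, a1), (arm_b, b0, b1) = COMPARISONS[number]
    return mean_change(records, arm_a, a0, a1) - mean_change(records, arm_b, b0, b1)


for number in sorted(COMPARISONS):
    print(number, secondary_comparison(patients, number))
```

A negative value means a larger fall in disease activity in the first window of the pair. Only comparison 3 uses the same window from randomisation in both arms; in comparisons 1, 2 and 4 the control window starts at month 3.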
Randomised trials are designed to minimise the threat of bias inherent in observational studies, including studies that use historical control groups. Most forms of bias can be summarised under the headings of selection bias (comparability of populations, circumstances and treatments) and information bias (comparability of information on exposure and outcome in the groups studied). So how large is the threat of bias with these secondary comparisons? To address the potential for selection bias first: in the proposed design, successful initial randomisation will have created groups that are similar for all measured and unmeasured prognostic factors. By 3 months into the trial, this similarity will most likely be intact for all factors except those influenced by the effects of the experimental drug and by time. Disease activity (the primary outcome) will probably be lower in both arms after 3 months: in the placebo arm due to regression to the mean and in some cases due to ongoing response to the background MTX;8 in the experimental arm due to all of the above plus the specific effects of the experimental drug. So the baseline situation for the placebo group starting control treatment at 3 months is not the same as the baseline situation of the experimental group at 0 months. However, as we are comparing changes, the baseline situation need not be exactly the same, and we can correct for these differences in the analysis.
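One common way to make such a correction is to adjust the change score for disease activity at the start of each patient's own treatment window. The model below is only a sketch under that assumption: it is not an analysis specified in this viewpoint, and all symbols are introduced here for illustration.

```latex
\Delta_i = \beta_0 + \beta_1 T_i + \beta_2 B_i + \varepsilon_i
```

Here $\Delta_i$ is the change in disease activity of patient $i$ over the window being compared, $T_i$ is 1 for the experimental drug and 0 for standard of care, $B_i$ is disease activity at the start of that window (month 0 in the experimental group, month 3 in the control group) and $\varepsilon_i$ is residual error. Under this model $\beta_1$ estimates the difference in change between the treatments at equal starting disease activity. It adjusts only for the measured starting value, not for other consequences of the preceding 3 months.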
It is important to note that the potential for bias is greatly increased if not all patients are switched to standard of care therapy. Many recent trials have included a “rescue” pathway for patients with inadequate response; internal validity is maintained if the primary outcome assessment is placed before the rescue point, or if rescue is regarded as failure in the primary analysis. However, the subsequent outcome of treatment in these patients should be analysed with great caution. If rescue is optional, only patients with a poor prognosis (high disease activity and lack of response) will elect to be rescued, and this subgroup is by definition greatly different from the whole group at baseline. In the situation where the experimental drug works, most rescue patients will come from the placebo group. Compared with the results of the (less depleted) experimental group, any treatment will look bad if given only to patients with a poor prognosis: a nice example of what is called confounding by indication.
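The selection mechanism described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of any real trial: the score distribution, the effect sizes, the rescue threshold of 5.1 and all function names are hypothetical, and rescue is simplified to "every patient above the threshold at 3 months opts for rescue".

```python
import random
from statistics import mean


def simulate_arm(n, effect, rng):
    """Month-3 disease activity for one arm under a hypothetical model.

    Each score is a random baseline plus the arm's mean change plus noise.
    """
    scores = []
    for _ in range(n):
        baseline = rng.gauss(6.0, 1.0)
        scores.append(baseline + effect + rng.gauss(0.0, 0.5))
    return scores


def rescued(scores, threshold=5.1):
    """Patients who opt for rescue: those still above the threshold."""
    return [s for s in scores if s > threshold]


rng = random.Random(0)  # fixed seed so the run is reproducible
placebo = simulate_arm(500, -0.5, rng)       # assumed small placebo response
experimental = simulate_arm(500, -2.0, rng)  # assumed effective drug

placebo_rescued = rescued(placebo)
experimental_rescued = rescued(experimental)

print("rescued from placebo arm:", len(placebo_rescued))
print("rescued from experimental arm:", len(experimental_rescued))
print("mean month-3 score, whole placebo arm:", round(mean(placebo), 2))
print("mean month-3 score, rescued placebo patients:", round(mean(placebo_rescued), 2))
```

Under these assumptions most rescued patients come from the placebo arm, and the rescued subgroup starts its new treatment with higher disease activity than the arm it was drawn from, which is why its subsequent outcome cannot be compared directly with that of the experimental group.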
The trial setting of standardised data collection and procedures greatly reduces the potential for information bias compared with regular observational studies. However, if the group allocation becomes known at 3 months, external circumstances and the quality of information might start to differ, creating bias. For example, concomitant treatment might be altered preferentially in one treatment group, and beneficial and adverse effects might be attributed differently once the treatment strategy has been revealed. Therefore I would advocate keeping all parties involved in the trial (doctors, patients and assessors) blinded by starting placebo “control” treatment in the experimental group.
Obviously MTX is not the only possible background therapy: it could be replaced by another drug or left to the discretion of the treating doctor. Another variant to the design would be to re-randomise patients in the control group at 3 months to either experimental therapy or standard active therapy. This would create a real head-to-head trial within the main trial and increase exposure to the new drug, albeit at the expense of a lower number of patients studied with standard therapy. Another disadvantage in this variant is that it would also potentially expose patients to at least 6 months of inactive therapy if the experimental drug were found to be ineffective.
The proposed design is not a panacea for all problems encountered in RA trials. Patients will still prematurely discontinue trial treatment for lack of efficacy or for safety reasons, although hopefully the expected treatment step at 3 months will serve to keep many patients in the trial up to that point. This was nicely demonstrated in a recent trial where most patients on placebo stayed on for 14 weeks, then opted for treatment with the active drug.9 Any patient who stops trial treatment altogether will have to be unblinded to be able to start alternative therapy, although this could be standardised per protocol. Also, the design would not work well for agents with a slow onset of action. However, such agents are currently unlikely to be commercially successful.
We should realise that in an ideal world, a control group treated with placebo for, say, a year, would yield much more “pure” data on long-term efficacy and safety of the new product. However, such a trial design has become unethical and therefore unrealistic, and even 6-month placebo trials now have such significant numbers of dropouts that no technique of data imputation can correct for them. Note that even the best of these techniques can only work in situations where the cause of dropout is not related to the outcome. Unfortunately, most patients on placebo will drop out because of lack of effect, a reason closely related to outcome, whichever way it is defined. When data on these patients continue to be collected (“intention to treat”) even though they are offered alternative treatments outside of trial control, we are in fact getting a poor surrogate of the proposal described above. I have discussed these and other problems with the “inadequate responder” design elsewhere.8 10
There is a potential for a negative backlash for the sponsor when the comparison with standard treatment shows little difference (or perhaps even slight inferiority vs standard of care). By contrast, the data can also provide an early impetus for a real head-to-head comparison when the experimental product actually looks better. If regulators were to ask all companies for this trial design, preferably early in drug development, it would keep the playing field level.
The current proposal is aimed at registration trials, currently constituting the bulk of research in terms of patients studied. It can be placed within a group of more generic proposals aimed at improving trial design and optimising the amount of relevant information that can be obtained. In this group the major advance has been the appearance of trials comparing strategies of drug treatment rather than single drugs. Prominent examples include the Tight Control Of Rheumatoid Arthritis (TICORA) trial, the BeSt trial and, more recently, SWEFOT.11 12 13 These were active control trials because they were investigator initiated (and thus aimed at answering questions relevant to the clinician), and open label for feasibility reasons. However, in principle strategy trials could be double blind and also include a placebo phase. To increase efficiency in the setting of dwindling patient numbers, Honkanen et al proposed a more complicated version of the current proposal: a three-stage design where responders in the (experimental) treatment group would be re-randomised to treatment or placebo, and non-responders in the placebo group would be placed on the treatment; subsequently responders in this group would also be re-randomised to treatment or placebo.14 This design appears attractive in the proof of concept stage, but less so in phase 3 where large numbers of patients need to be enrolled and potentially exposed to placebo on the background of inadequate treatment.
In conclusion, the present viewpoint suggests a feasible design innovation for registration trials. It can be refined through expert discussions and adapted to meet specific needs. Widespread adoption of this design will result in a priceless body of data that will help clinicians and patients, but also regulators and industry itself, to determine, already at launch, the place of a new product in the spectrum of treatment possibilities of RA.
Acknowledgments
I am grateful to Peter Tugwell for his review of the manuscript.
REFERENCES
Footnotes
Competing interests None.
Provenance and Peer review Not commissioned; externally peer reviewed.