Number needed to treat (NNT): implication in rheumatology clinical practice
  1. M Osiri1,
  2. M E Suarez-Almazor2,
  3. G A Wells3,
  4. V Robinson4,
  5. P Tugwell4
  1. M Osiri, Department of Medicine, Chulalongkorn University Hospital, Bangkok, Thailand
  2. M E Suarez-Almazor, Health Services Research, Baylor College of Medicine, Veteran Affairs Medical Center, Houston, Texas, USA
  3. G A Wells, Faculty of Epidemiology and Community Medicine, University of Ottawa, Ontario, Canada
  4. V Robinson, P Tugwell, Center for Global Health, Institute of Population Health, University of Ottawa, Ottawa, Ontario, Canada
  1. Correspondence to:
    Dr P Tugwell, Center for Global Health, University of Ottawa, Institute of Population Health, 1 Stewart Street, Room 312, Ottawa, Ontario, Canada, K1N 6N5;


Objective: To calculate the number needed to treat (NNT) and number needed to harm (NNH) from the data in rheumatology clinical trials and systematic reviews.

Methods: The NNTs for the clinically important outcome measures in the rheumatology systematic reviews from the Cochrane Library, issue 2, 2000 and in the original randomised, double blind, controlled trials were calculated. The measure used for calculating the NNT in rheumatoid arthritis (RA) interventions was the American College of Rheumatology 20% improvement or Paulus criteria; in osteoarthritis (OA) interventions, the improvement of pain; and in systemic sclerosis (SSc) interventions, the improvement of Raynaud's phenomenon. The NNH was calculated from the rate of withdrawals due to adverse events from the treatment.

Results: The data required for the calculation of the NNT were available in 15 systematic reviews and 11 original articles. For RA interventions, etanercept treatment for six months had the smallest NNT (1.6; 95% confidence interval (CI) 1.4 to 2.0), whereas leflunomide had the largest NNH (9.6; 95% CI 6.8 to 16.7). For OA treatment options, only etodolac and tenoxicam produced significant pain relief compared with placebo (NNT=4.4; 95% CI 2.4 to 24.4 and 3.8; 95% CI 2.5 to 7.3, respectively). For SSc interventions, none were shown to be efficacious in improving Raynaud's phenomenon because the 95% CI of the NNT was infinite.

Conclusions: The NNT and NNH are helpful for clinicians, enabling them to translate the results from clinical trials and systematic reviews to use in routine clinical practice. Both NNT and NNH should be accompanied by a limited 95% CI and adjusted for the individual subject's baseline risk.

  • number needed to treat
  • systematic reviews
  • rheumatic diseases
  • randomised controlled trials
  • ACR, American College of Rheumatology
  • CI, confidence interval
  • CsA, cyclosporin A
  • DMARD, disease modifying antirheumatic drug
  • MTX, methotrexate
  • NNH, number needed to harm
  • NNT, number needed to treat
  • NSAIDs, non-steroidal anti-inflammatory drugs
  • OA, osteoarthritis
  • RA, rheumatoid arthritis
  • RCT, randomised controlled trial
  • RRR, relative risk reduction
  • SSc, systemic sclerosis
  • SSZ, sulfasalazine

In rheumatology clinical practice, one of the major decisions confronting rheumatologists is the choice of treatment for their patients. Although there is increasing appreciation of evidence based medicine, the data sources for this are still in their infancy. Guidelines and algorithms have been developed to help determine the appropriate choices of treatment, but they are not applicable to every patient. Moreover, new information from clinical trials is being published at too fast a rate for textbooks to remain current. The challenge is to translate the clinical research data into a format suitable for use by busy clinicians in practice. One key item of information needed for an informed decision is an easily understood estimate of the magnitude of benefit (and risk of adverse effects) that can be used by doctors and other care givers.1 The estimates most commonly used in rheumatology include event rates, relative risk, relative risk reduction, absolute risk reduction or risk difference, and odds ratios. In addition, a number of rheumatological measures are based on continuous outcomes, for example, the number of tender or swollen joints. Many of these are difficult for clinicians to use in practice for reasons that include their complexity, the difficulty in assessing the clinical importance of a specific result, and the challenge in comparing benefits with risks or adverse effects. One approach that is increasingly used in other disciplines is the “number needed to treat” (NNT). Our review aims to describe the concept, method of calculation, and interpretation of the NNT and its use in decision making in rheumatology clinical practice.


The most common measure used in reporting clinical trials is the relative risk reduction (RRR). In brief, consider a parallel group, randomised trial comparing an active treatment with placebo; the outcome of interest is dichotomous (event/no event) and assumed to be prevented by the active treatment. The probability of the event in the treatment group (X) is supposed to be less than the probability of the event in the placebo group (Y). Relative risk is defined as the ratio of the probability of the event in the treatment group to that in the control group (X/Y). RRR is defined as the reduction of the probability of the event in the treatment group in proportion to that in the placebo group, (Y−X)/Y, and it can be calculated from relative risk as RRR = 1 − relative risk = 1 − X/Y. RRR is usually expressed as a percentage, (1 − X/Y) × 100%. The problem with the RRR is that it fails to discriminate between very large and very small treatment effects in absolute terms. Table 1 shows an example of this. The relative risk and RRR calculated from the trial by Moreland and colleagues2 and from a hypothetical trial were similar.
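As a minimal illustration of these definitions, the Python sketch below computes the relative risk and RRR for two trials. The event rates are hypothetical values chosen for illustration (they are not the data in table 1), and the function names are illustrative only.

```python
def relative_risk(x, y):
    """Ratio of the event probability on treatment (x) to that on placebo (y)."""
    return x / y


def relative_risk_reduction(x, y):
    """Proportional reduction in event probability: (y - x) / y = 1 - x / y."""
    return (y - x) / y


# Hypothetical event rates as (treatment, placebo); NOT taken from table 1.
trials = {
    "common event": (0.25, 0.50),
    "rare event": (0.001, 0.002),
}

for name, (x, y) in trials.items():
    print(
        f"{name}: RR={relative_risk(x, y):.2f}, "
        f"RRR={relative_risk_reduction(x, y):.0%}, "
        f"absolute difference={y - x:.3f}"
    )
```

Both hypothetical trials give the same relative risk and RRR, although the absolute difference in event rates differs by more than two orders of magnitude.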

Table 1

Measures of treatment efficacy from the randomised controlled trials in rheumatoid arthritis (RA)*

In contrast, the absolute difference in rates between the experimental and control groups does discriminate between low and high magnitudes of benefit and harm. Absolute risk reduction is the difference in the event rates between the placebo and treatment groups (Y−X).1,3–5 The converse applies when estimating the increase in absolute risk of a side effect (absolute risk increase), which is the event rate in the treatment group minus that in the placebo group (X−Y). Because relative risk and RRR are expressed relative to the probability of the event in the control group, they provide less information on the clinical benefit. Absolute risk reduction is considered a better measure of treatment effect than relative risk or RRR because it expresses the consequences of giving no treatment.1,3 However, absolute risk reductions and increases are often <1 and have to be expressed as decimal fractions, which are difficult to incorporate into clinical practice. Table 1 shows that the difference between the two trials is much larger in terms of absolute risk reduction than in terms of relative risk or RRR. However, it may still be difficult to understand the clinical benefit of an absolute reduction of 0.45 compared with that of 0.0024.

The risk reduction obtained from any of these measures of treatment effect is regarded as a “point estimate” because the precise value of the risk reduction is unknown but lies within a certain range, the confidence interval (CI). Point estimates are provided by most clinical trials, and the 95% CI is frequently used.3

The advantages of the absolute risk reduction can be retained, and the measure made much easier to use, by taking its reciprocal; this is called the NNT. It is clinically useful because it is the number of patients who must be treated in order to obtain the benefit of interest in one additional patient.3 The point estimate of the NNT should be accompanied by its 95% CI, the limits of which are simply the reciprocals of the limits of the 95% CI of the absolute risk reduction.1,3 The advantage of the NNT over the relative risk and RRR is that it expresses both the risk without treatment and the risk reduction with treatment. In addition, the NNT informs clinicians and patients how much effort they must spend to prevent one event and allows comparison of the amount of effort needed to prevent the same event with other treatment options.3
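The calculation described above can be sketched in Python as follows. The sketch assumes that the 95% CI of the absolute risk reduction is obtained with the usual normal (Wald) approximation for a difference of two proportions; the article does not state which method was used, so this is an assumption. The counts are hypothetical, and the function name is illustrative only.

```python
import math


def nnt_with_ci(events_ctrl, n_ctrl, events_trt, n_trt, z=1.96):
    """Return (NNT, lower limit, upper limit) from event counts.

    The CI of the absolute risk reduction (ARR) uses a Wald approximation
    (an assumption of this sketch); the NNT limits are the reciprocals of
    the ARR limits, in reversed order.
    """
    y = events_ctrl / n_ctrl          # event rate in the control group (Y)
    x = events_trt / n_trt            # event rate in the treatment group (X)
    arr = y - x                       # absolute risk reduction (Y - X)
    se = math.sqrt(y * (1 - y) / n_ctrl + x * (1 - x) / n_trt)
    arr_lower, arr_upper = arr - z * se, arr + z * se
    if arr_lower <= 0:
        # The ARR interval includes zero, so the NNT interval is not finite.
        raise ValueError("95% CI of the ARR includes zero")
    return 1 / arr, 1 / arr_upper, 1 / arr_lower


# Hypothetical trial: 60/100 events on placebo, 30/100 on active treatment.
nnt, lower, upper = nnt_with_ci(60, 100, 30, 100)
print(f"NNT = {nnt:.1f} (95% CI {lower:.1f} to {upper:.1f})")
```

Note that the upper limit of the ARR gives the lower limit of the NNT, and vice versa.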

An example for calculation of the NNT is shown in table 1.

Mathematical calculation of the number needed to harm (NNH) is similar to that for the NNT but differs in that the experimental treatment increases the probability of a bad outcome compared with placebo or other comparator.3,5,6 This is useful when an adverse event is caused by the active treatment. The NNH is defined as the number of patients who receive the active treatment that will lead to one additional patient being harmed compared with those receiving placebo.5 The 95% CI for the NNH is calculated as for the NNT. The NNH should be considered together with the NNT because an experimental treatment may help decrease the probability of one event, but may increase the probability of another adverse event, which might exceed the beneficial effect of the active treatment. The NNH calculation is also shown in table 1.
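A corresponding sketch for the NNH is given below. It takes the absolute risk increase as the adverse event rate on active treatment minus that on the comparator and returns its reciprocal. The withdrawal rates are hypothetical, and the function name is illustrative only.

```python
def number_needed_to_harm(harm_rate_treatment, harm_rate_control):
    """NNH = 1 / absolute risk increase (treatment rate minus control rate)."""
    ari = harm_rate_treatment - harm_rate_control
    if ari <= 0:
        raise ValueError("treatment does not increase the adverse event rate")
    return 1 / ari


# Hypothetical withdrawal rates due to adverse events: 15% v 5%.
nnh = number_needed_to_harm(0.15, 0.05)
print(f"NNH = {nnh:.1f}")
```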

Among the measures of treatment effect, the NNT seems to be the most helpful tool for therapeutic decision making and bedside teaching, as it is easier to interpret and can be compared among different treatment alternatives.1 The NNT describes the difference in the outcome of interest achieved between treatment and control. For therapeutic trials, small NNTs (with the numbers close to 1) indicate a favourable effect of the treatment.7

The NNT for a certain intervention in an individual patient depends on both the nature of treatment and the baseline risk of that patient—that is, the event rate in the control group.3,6–9 If the baseline risk is high, even a small RRR will produce a low NNT. On the other hand, if the event rate in the control group is low, the RRR must be large to produce a low NNT. For example, in a disease with a baseline risk of 90%, an RRR of only 15% will yield an NNT of 7. But if the baseline risk of another disease is 30%, an RRR of 50% is needed to yield the same NNT. Thus, an NNT provided in the literature must be adjusted for a patient's risk at baseline.7,8 Table 2 shows the effect of the baseline risk and RRR on the NNT. To evaluate and compare the effectiveness of various interventions in rheumatology, we conducted this study to assess the NNTs and NNHs of the clinically important outcomes provided in the systematic reviews in the Cochrane Library.
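The dependence of the NNT on baseline risk and RRR described above can be checked with a short Python sketch. The absolute risk reduction is the baseline risk multiplied by the RRR, and the NNT is its reciprocal. The first two cases reproduce the worked example in the text; the grid values at the end are arbitrary illustrative choices and are not the entries of table 2.

```python
def nnt_from_baseline(baseline_risk, rrr):
    """NNT = 1 / (baseline risk x relative risk reduction)."""
    return 1 / (baseline_risk * rrr)


# Worked example from the text: both combinations give an NNT of about 7.
for baseline, rrr in [(0.90, 0.15), (0.30, 0.50)]:
    nnt = nnt_from_baseline(baseline, rrr)
    print(f"baseline {baseline:.0%}, RRR {rrr:.0%}: NNT = {nnt:.1f}")

# Illustrative grid (arbitrary values, not those of table 2).
for baseline in (0.10, 0.30, 0.60, 0.90):
    row = [round(nnt_from_baseline(baseline, rrr)) for rrr in (0.15, 0.30, 0.50)]
    print(f"baseline {baseline:.0%}: {row}")
```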

Table 2

Effect of the baseline risk and relative risk reduction on the number needed to treat.3,8 The results are shown as the number needed to treat


We used the key words musculoskeletal, rheumatology, and systematic reviews to search for eligible reviews. Eligible reviews were rheumatology systematic reviews from the Cochrane Database of Systematic Reviews in the Cochrane Library, issue 2, 2000 for rheumatoid arthritis (RA), osteoarthritis (OA), and systemic sclerosis (SSc). These were supplemented by data from reviews in progress. Additional data were obtained from personal communications with the authors. Where sufficient data on the outcomes were not available in the reviews, we searched for the data from the original articles (randomised, double blind, controlled trials (RCTs) only). We then calculated the NNTs from the absolute risk differences of the outcomes of interest. These outcomes had to be primary, dichotomous outcomes. For disease modifying antirheumatic drug (DMARD) treatment for RA the outcome was whether or not the patients met the American College of Rheumatology 20% improvement (ACR20)10 or Paulus criteria11; for OA interventions, whether or not the improvement of the joint pain met the predefined criteria; for SSc, whether or not the improvement of Raynaud's phenomenon met the predefined study criteria. For trials of RA interventions conducted before the ACR20 or Paulus criteria were developed, we used the approach of Norman et al12 to calculate the NNT from the mean improvement in the tender joint count, which was the most commonly used continuous outcome in these studies. For the calculation of NNH from these interventions, we used the dichotomous outcome of whether or not the patients withdrew from the studies owing to adverse events.

The NNTs and NNHs were estimated to provide the current best estimate of their positive and negative effects—they were not compared with each other, because these trials vary in the population studied, the activity and severity of the disease, and the period of treatment.


From the Cochrane Database of Systematic Reviews, we retrieved 63 complete reviews of musculoskeletal disorders. From these, nine systematic reviews in RA,13–21 two in OA,22,23 and four in SSc24–27 contained the categorical outcomes required for NNT and NNH calculation and were included in this study. Pooled estimates were available for NNT and NNH calculation in 13 systematic reviews. In two reviews, azathioprine for RA13 and penicillamine for RA,15 a continuous outcome was used for calculating the NNT. Eleven original articles contributed the additional data for NNT calculation from their dichotomous outcomes.2,28–37

Figure 1 shows the point estimates and 95% CI of the NNTs and NNHs calculated from the RA systematic reviews and original articles. For the NNT, the outcome was whether or not the clinical improvement of the patients met the ACR20 or Paulus criteria. If the studies were published before the time of these two criteria, the outcome measures used for calculating the NNTs were the criteria for clinical improvement defined in each study.36,37 In the study by Townes et al the criteria for clinical improvement used for calculating the NNT were whether or not the patient improved in five of the following measures of disease activity: tender joint count, swollen joint count, grip strength, 50 foot (15 m) walk time, duration of morning stiffness, and erythrocyte sedimentation rate.36 In the study by Williams et al, the NNT was calculated from whether or not the number of tender joints improved by 30% or more.37 If the ACR20 was not available and a dichotomous measure of clinical improvement was not specified in the trial report, the NNTs were calculated using the mean change in the number of tender joints.13,15 This methodology was derived from a study by Norman and colleagues on the relationship between effect size and proportion benefiting from treatment.12 From this study, the association between effect size and proportion benefiting from treatment for parallel group studies was a near-linear curve and nearly independent of minimally important difference.12 These authors showed that the NNTs derived from continuous outcomes using this curve approximated those calculated from dichotomous outcomes. The 95% CIs of the NNTs calculated from the mean change were also obtained by the same method.

Figure 1

Number needed to treat (NNT) and number needed to harm (NNH) in RA clinical trials and systematic reviews. *95% CI not calculated for non-significant results. SSZ, sulfasalazine; MTX, methotrexate; CYC, cyclophosphamide; AZA, azathioprine; d-Pen, d-penicillamine; CsA, cyclosporin A; Pred, prednisolone; TJC, tender joint count.

Figure 1 shows the NNTs and NNHs of different DMARDs and biological agents. To recap, the NNT is defined as the number of patients with RA who need to receive treatment for one additional patient to achieve a treatment response. In most RA studies the efficacy of the active treatment was compared with placebo. The exceptions were the trials of cyclosporin A (CsA)30 and infliximab,31 which studied the additional efficacy over methotrexate (MTX), and the COBRA trial,35 which compared the efficacy of triple therapy (MTX+prednisolone+sulfasalazine (SSZ)) with that of SSZ alone.

For the treatment of active RA at 24 weeks with different agents compared with placebo, using the ACR20 response criteria to calculate the NNTs, etanercept treatment for six months2 had an NNT of 1.6 (95% CI 1.4 to 2.0). SSZ had an NNT of 3.7 (95% CI 2.5 to 6.7) and the NNT for leflunomide was 3.6 (95% CI 2.9 to 4.8). In the trials comparing the efficacy of MTX with that of placebo, the NNT calculated from the 18 week trial33 was 2.9 (95% CI 2.1 to 4.9) and that from a more recent trial with a longer treatment duration of 12 months32 was higher (5.0; 95% CI 3.3 to 11.0). These differences were not statistically significant.

The NNT for combined MTX, prednisolone, and SSZ compared with SSZ alone was 4.8 (95% CI 2.1 to 21.3) at 28 weeks.35 The NNT for combined CsA+MTX treatment compared with MTX alone at 24 weeks was 3.2 (95% CI 2.2 to 5.7),30 whereas the NNT for infliximab+MTX compared with MTX alone at 30 weeks of treatment was 3.2 (95% CI 2.3 to 5.9).31

Figure 1 shows the NNHs for these treatments of RA. The point estimates demonstrate a wide range and a number of these estimates have infinitely broad CIs—for example, antimalarial drugs and etanercept. The point estimates of the NNTs and NNHs without 95% CI are not of proven benefit or harm, but they may still be clinically important.7 Further studies are required to achieve a finite 95% CI to confirm the benefit or harm of these interventions.

Table 3 shows the results for OA, listing the NNTs to improve pain or patient global assessment of disease severity from the systematic reviews in OA of the hip and knee. For the treatment of hip OA, different analgesics and non-steroidal anti-inflammatory drugs (NSAIDs) were compared in a single systematic review.22 A significant improvement in pain was seen when etodolac and tenoxicam were compared with placebo. The NNT for a four week treatment with etodolac was 4.4 (95% CI 2.4 to 24.4) and that for an eight week treatment by tenoxicam was 3.8 (95% CI 2.5 to 7.3). The NNTs of the other NSAIDs and analgesics are shown as point estimates only.

Table 3

Number needed to treat (NNT) from the systematic reviews in osteoarthritis (OA)

For knee OA, only low level laser treatment, which is one of the modalities used in physiotherapy, was systematically reviewed.23 The point estimate of the NNT for the low level laser treatment to improve pain in one patient was 11.6. The NNH results of both OA interventions were not available owing to insufficient data in the reviews.

Table 4 shows the NNTs for different drugs in patients with SSc to improve digital ulcers or patient global assessment of disease severity.24–27 Although the NNTs calculated for certain drugs were small, for example, intravenous iloprost (NNT=1.2)25 and ketanserin (NNT=2.3),26 their efficacy is still inconclusive because the 95% CIs are large and include 1.

Table 4

Number needed to treat (NNT) and number needed to harm (NNH) from the systematic reviews for Raynaud's phenomenon in systemic sclerosis (SSc)

Table 4 also shows the NNHs of the agents used in SSc. The calculated NNHs of iloprost and ketanserin were small, approximately similar to their NNTs. Although these two drugs may be efficacious in the treatment of Raynaud's phenomenon in SSc, their small NNHs indicate that half of the treated patients would develop adverse events requiring the treatment to be stopped.

The NNTs and NNHs in these tables may need to be adjusted to match the individual patients' baseline risk; as mentioned earlier, this will depend upon factors such as the severity of the presenting disease and whether in primary or tertiary referred care; most of the trials included here are derived from the tertiary practices of rheumatologists. Health professionals in other situations, such as those in primary care, will need to adjust the NNTs and NNHs using table 2.
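One simple way to make such an adjustment, offered here as a sketch and not as the method used in this study, is to assume that the RRR is constant across baseline risks. Under that assumption the NNT for an individual patient is the published NNT divided by f, where f is the ratio of the patient's baseline risk to the control event rate in the trial. The numbers below are hypothetical, and the function name is illustrative only.

```python
def adjust_nnt(trial_nnt, f):
    """Adjust a published NNT for a patient whose baseline risk is f times
    the control event rate of the trial (assumes a constant RRR)."""
    if f <= 0:
        raise ValueError("f must be positive")
    return trial_nnt / f


# Hypothetical: published NNT of 4; a patient in primary care is judged to
# have half the baseline risk of the trial population (f = 0.5).
patient_nnt = adjust_nnt(4, 0.5)
print(f"Adjusted NNT = {patient_nnt:.0f}")
```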


This study demonstrates the use of the NNT and NNH as one approach that should be considered for communicating the clinical importance of the results of intervention studies in rheumatological disorders; they can be used as a guideline for decision making in rheumatology clinical practice.

Systematic reviews are increasingly used for evaluating the efficacy and safety of interventions in clinical trials. They also explicitly provide information on the effectiveness and safety of the experimental treatment when applied to patients in routine clinical practice.7 However, the numerical and statistical results in systematic reviews are difficult to understand and use in daily practice. A measure is needed that translates these results into a more understandable form that can easily be used.7

The NNT is a measure of clinical benefit that is useful in clinical practice because it translates the abstract terms used in clinical trials into a more concrete term for routine practice decision making.3 Although many clinical trials of rheumatology interventions have been conducted and published, the main focus is usually upon statistical significance rather than clinical importance. The objective of this review was to provide the best current estimates of the clinically important benefit and adverse effects using the NNT and NNH metric for rheumatology interventions on the basis of published data from systematic reviews and RCTs. These estimates are intended to demonstrate the NNTs calculated for different interventions in rheumatological disorders and might be used as a guideline for decision making in rheumatology clinical practice.

The NNH should be considered along with the NNT to assess the benefit and harm of an intervention.3,7 For example, the NNT for a 12 month treatment with leflunomide was 4 for patients with RA to achieve the ACR20 response.32 The NNH was 10 for withdrawal due to adverse events caused by leflunomide.32 Thus, if 20 patients with active RA are treated with leflunomide for a year, five will achieve the ACR20 response while two are expected to withdraw owing to the adverse events induced by this drug.
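The arithmetic behind this example can be written out as a short Python sketch: the expected number of patients who benefit is the number treated divided by the NNT, and the expected number harmed is the number treated divided by the NNH. The function name is illustrative only.

```python
def expected_outcomes(n_treated, nnt, nnh):
    """Expected numbers benefiting and harmed among n_treated patients."""
    return n_treated / nnt, n_treated / nnh


# Leflunomide example from the text: NNT = 4, NNH = 10, 20 patients treated.
benefit, harm = expected_outcomes(20, nnt=4, nnh=10)
print(f"Expected ACR20 responders: {benefit:.0f}")
print(f"Expected withdrawals due to adverse events: {harm:.0f}")
```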

The NNT can be markedly affected by the patient's baseline risk and by the risk reduction achieved with the intervention. The higher the event rate in the control group, the smaller the NNT will be.3 An adjustment for treatment duration is also required for comparisons among different interventions. The NNT for the same intervention will be smaller if the treatment duration is longer.

Although the NNTs in fig 1 might be considered to be comparable among different treatments because most of the studies on RA treatment were conducted in white patients with active RA that could not be adequately controlled by conventional treatment (analgesic drugs and/or NSAIDs), the NNT tables and figure presented in this study are not intended to be used as “league tables” for a comparison across different interventions or diseases. Even in the same disease condition, adjustments for baseline risk and time period of treatment may invalidate such comparisons.

Other limitations of the NNT/NNH include the following7: (a) The 95% CI of the NNT/NNH must be taken into consideration. For example, if the 95% CI of an NNT estimate is infinite, it is possible that the experimental treatment has no benefit or causes harm even if the point estimate of the NNT shows a beneficial effect; the inverse applies for the NNH. Inclusion of the 95% CI makes the point estimate of the NNT/NNH more clinically relevant and interpretable. (b) The NNTs can be compared among different interventions for the same condition and severity with the same outcome and period of treatment, but because the concept of the NNT is one of frequency, not of utility or importance, it is inappropriate to compare the NNTs across different disease outcomes or treatment duration.
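Point (a) can be made concrete with a short Python sketch. When the 95% CI of the absolute risk reduction includes zero, taking reciprocals of its limits gives an interval that runs from a finite NNT through infinity to a finite NNH. The way this interval is reported below follows one commonly used convention and is an assumption of this sketch, not a method described in this article; the input values are hypothetical.

```python
import math


def describe_nnt_ci(arr_lower, arr_upper):
    """Describe the NNT interval implied by a CI for the absolute risk reduction."""
    if arr_lower > 0:
        # Whole interval indicates benefit: a finite NNT interval.
        return f"NNT {1 / arr_upper:.1f} to {1 / arr_lower:.1f}"
    if arr_upper < 0:
        # Whole interval indicates harm: a finite NNH interval.
        return f"NNH {1 / (-arr_lower):.1f} to {1 / (-arr_upper):.1f}"
    # The interval includes zero, so it passes through infinity.
    benefit = 1 / arr_upper if arr_upper > 0 else math.inf
    harm = 1 / (-arr_lower) if arr_lower < 0 else math.inf
    return f"NNT {benefit:.1f} to infinity to NNH {harm:.1f}"


# Hypothetical CIs for the absolute risk reduction.
print(describe_nnt_ci(0.05, 0.25))    # benefit established
print(describe_nnt_ci(-0.05, 0.25))   # benefit not established
```

In the second case the treatment might help as few as one in four patients, might have no effect, or might harm one in 20.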


The NNT translates the less understandable results of clinical trials and systematic reviews into a term that helps clinicians in routine practice decision making. It provides information about treatment benefit by incorporating both the baseline risk without treatment and the risk reduction with treatment. The NNT is the number of patients who need to be treated in order to prevent one event, and NNTs can be compared among different treatment options for the same disease and outcome. The point estimates of the NNT should be presented with a 95% CI. The NNH can be used to describe the harm caused by the intervention in the same context as the NNT. In this study, the NNTs and NNHs for the interventions of RA, OA, and SSc from clinical trials and systematic reviews are presented. Both the NNTs and NNHs should be adjusted for the baseline risk and treatment duration in individual patients.


The authors thank Dr Geoffrey Norman, Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, Canada, for providing the method of calculation of the NNT from continuous outcomes; and Dr Maarten Boers, Department of Clinical Epidemiology and Biostatistics, Free University Hospital, Amsterdam, the Netherlands, for the additional data on the COBRA study and invaluable comments.

