Evaluation of clinically relevant changes in patient reported outcomes in knee and hip osteoarthritis: the minimal clinically important improvement
- F Tubach1,
- P Ravaud1,
- G Baron1,
- B Falissard2,
- I Logeart3,
- N Bellamy4,
- C Bombardier5,
- D Felson6,
- M Hochberg7,
- D van der Heijde8,
- M Dougados9
- 1Institut National de la Santé et de la Recherche Médicale (INSERM) E 0357; Département d’Epidémiologie, Biostatistique et Recherche Clinique; Groupe Hospitalier Bichat-Claude Bernard (Assistance Publique—Hôpitaux de Paris); Faculté Xavier Bichat (Université Paris 7), Paris, France
- 2Faculté de Médecine Paris-sud, Département de Santé Publique, Hôpital Paul Brousse (Assistance Publique—Hôpitaux de Paris), Villejuif, France
- 3Merck, Sharp & Dohme Chibret Laboratories, Paris, France
- 4Department of Medicine, University of Queensland, Royal Brisbane Hospital, Brisbane, Queensland, Australia
- 5Institute for Work and Health, Toronto, Ontario, Canada
- 6Boston University School of Medicine, Boston, Massachusetts, USA
- 7University of Maryland, Baltimore, Maryland, USA
- 8University Hospital, Maastricht, The Netherlands
- 9Service de Rhumatologie B, Hôpital Cochin (Assistance Publique—Hôpitaux de Paris), Paris, France
- Correspondence to:
Dr F Tubach
Département d’Epidémiologie, Biostatistique et Recherche Clinique, INSERM E0357, Hôpital Bichat, 46 rue Henri Huchard, 75018 Paris, France
- Accepted 9 May 2004
- Published Online First 18 June 2004
Background: In clinical trials, at the group level, results are usually reported as mean and standard deviation of the change in score, which is not meaningful for most readers.
Objective: To determine the minimal clinically important improvement (MCII) of pain, patient’s global assessment of disease activity, and functional impairment in patients with knee and hip osteoarthritis (OA).
Methods: A prospective multicentre 4 week cohort study involving 1362 outpatients with knee or hip OA was carried out. Data on assessment of pain and patient’s global assessment, measured on visual analogue scales, and functional impairment, measured on the Western Ontario McMaster Universities Osteoarthritis Index (WOMAC) function subscale, were collected at baseline and final visits. Patients assessed their response to treatment on a five point Likert scale at the final visit. An anchoring method based on the patient’s opinion was used. The MCII was estimated in a subgroup of 814 patients (603 with knee OA, 211 with hip OA).
Results: For knee and hip OA, MCII for absolute (and relative) changes were, respectively, (a) −19.9 mm (−40.8%) and −15.3 mm (−32.0%) for pain; (b) −18.3 mm (−39.0%) and −15.2 mm (−32.6%) for patient’s global assessment; (c) −9.1 (−26.0%) and −7.9 (−21.1%) for WOMAC function subscale score. The MCII is affected by the initial degree of severity of the symptoms but not by age, disease duration, or sex.
Conclusion: Using criteria such as MCII in clinical trials would provide meaningful information which would help in interpreting the results by expressing them as a proportion of improved patients.
- MCID, minimal clinically important difference
- MCII, minimal clinically important improvement
- NSAID, non-steroidal anti-inflammatory drug
- OA, osteoarthritis
- VAS, visual analogue scale
- WOMAC, Western Ontario McMaster Universities Osteoarthritis Index
The choice of an outcome measure is a major step in the design of clinical trials. In evaluating the symptomatic severity of osteoarthritis (OA) of the lower limbs, scientific groups such as OMERACT (Outcome Measures in Rheumatology Group),1 GREES (Group for the Respect of Ethics and Excellence in Science),2 and OARSI (OsteoArthritis Research Society International)3 have stressed the importance of evaluating at least three dimensions: pain, patient’s global assessment of disease status, and functional impairment. At the individual level, determining the minimal meaningful change in a score by use of a structured instrument is a challenge. Are changes in self reported levels of pain of 10 mm on a 0–100 mm visual analogue scale (VAS) clinically important? Does the change reflect meaningful improvement for the patient? The concept of the minimal clinically important difference (MCID)4–6 could help in interpreting changes in scores at the individual level. However, the MCID, which can reflect either an improvement or a worsening, has not been used here, because in clinical trials we are always interested in improvement and not worsening. Furthermore, it has been shown that the MCID could be different for improvement and worsening.7 The minimal clinically important improvement (MCII), defined as the smallest change in measurement that signifies an important improvement in a patient’s symptom, seems more appropriate and, in clinical trials, provides readers with additional information on the effect size by expressing the results more meaningfully (that is, as a percentage of improved patients).
This prospective cohort study aimed at estimating the MCII from the patient’s perspective for three main patient reported outcomes used in OA trials: pain, patient’s global assessment of disease activity, and functional impairment.
MATERIALS AND METHODS
We conducted a prospective 4 week cohort study.
This study involved 1362 outpatients with knee or hip OA, as defined by the American College of Rheumatology,8,9 included by 399 rheumatologists. Each rheumatologist had to recruit four patients, three with knee OA and one with hip OA. To be included in the study, patients had to experience pain from OA (⩾30 mm on a VAS varying from 0 to 100), require treatment with a non-steroidal anti-inflammatory drug (NSAID), and be able to complete questionnaires in French. Inclusion could begin with the onset of treatment or a switch from one NSAID to another. Patients were excluded if they had a prosthesis on the assessed joint or if they had been given an intra-articular injection in the 4 weeks before the study began. All patients initially visited the rheumatologist in charge of their case, and an NSAID was prescribed (the drug and its dosage were chosen by the physician). A final visit to the same rheumatologist was scheduled 4 weeks later.
At the baseline visit, demographic and disease data were collected. Patients assessed their OA status at baseline and final visit. They assessed the following patient reported outcomes: (a) pain on movement during the 48 hours before the visit, measured on a 0–100 mm VAS; (b) global assessment of disease activity measured on a 0–100 mm VAS; and (c) physical function, measured on the Western Ontario McMaster Universities Osteoarthritis Index (WOMAC) function subscale (17 items, five point Likert scale for each item; high scores indicate high degree of functional impairment; total score normalised to a 0–100 score).
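The normalisation of the WOMAC function subscale to a 0–100 score can be illustrated with a minimal sketch. It assumes that each of the 17 items is scored from 0 to 4 on the five point Likert scale, which is not stated explicitly above; the function name is hypothetical.

```python
def womac_function_score(item_scores):
    """Normalise 17 WOMAC function items to a 0-100 score.

    Assumption: each item is scored 0 (no difficulty) to 4 (extreme difficulty),
    so the raw total ranges from 0 to 68. Higher scores mean greater impairment.
    """
    if len(item_scores) != 17:
        raise ValueError("the WOMAC function subscale has 17 items")
    if any(not 0 <= s <= 4 for s in item_scores):
        raise ValueError("each item is assumed to be scored from 0 to 4")
    raw_total = sum(item_scores)
    return 100.0 * raw_total / (17 * 4)
```

With this convention, a patient answering the middle category on every item would obtain a normalised score of 50.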
At the final visit, a random sample of two thirds of the patients (n = 923) assessed their response to NSAID treatment on a five point Likert scale (none = no good at all, ineffective drug; poor = some effect but unsatisfactory; fair = reasonable effect but could be better; good = satisfactory effect with occasional episodes of pain or stiffness; excellent = ideal response, virtually pain free). The other third of the patients assessed their response to treatment on a 15 point Likert scale (from −7, a very great deal worse, to +7, a very great deal better, with 0, no change).
All the analyses considered patients with knee and hip OA separately.
The MCII was determined in a subgroup of 814 patients (603 with knee and 211 with hip OA) whose assessment of response to treatment was measured on a five point Likert scale and who had completed the final visit.
An anchoring method based on the patient’s assessment of response to treatment was used.
The MCII was estimated for both the absolute (final value−baseline value) and the relative ((final value−baseline value)/baseline value) changes in each patient reported outcome. It was estimated by constructing a curve of cumulative percentages of patients as a function of the change in score (for example, difference in pain score) among patients whose final evaluation of response to treatment was “good, satisfactory effect with occasional episodes of pain or stiffness”, because we wanted to focus on the improvement that was clinically important. Logistic regression was used to model the observations (fig 1). We targeted the point at which the curve flattens and at which most subjects stated they had improved. To determine the change in score corresponding to this point, we first identified the two parameter logistic model that best fitted the data. We then determined the root of the third derivative of this logistic function, which corresponds to the MCII. One can demonstrate that this point corresponds by construction to the 78.9th centile of the change in score; we therefore propose to define the MCII as the 75th centile of the change in score, because it is very close to the point defined above and easier to derive. The model permitted us, firstly, to verify that the target point was correctly approximated by the 75th centile and, secondly, to estimate the 95% confidence intervals.
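The 78.9th centile can be recovered with a short derivation. The sketch below assumes the usual two parameter logistic form for the cumulative proportion p of patients as a function of the change in score x; the symbols a, b, p, and x are introduced here for illustration only.

```latex
% Two-parameter logistic model (a, b are the fitted parameters)
\[
  p(x) = \frac{1}{1 + e^{-(a + b x)}}, \qquad
  \frac{dp}{dx} = b\,p\,(1 - p).
\]
% Higher derivatives, expressed in terms of p
\[
  \frac{d^{2}p}{dx^{2}} = b^{2}\,p\,(1 - p)\,(1 - 2p), \qquad
  \frac{d^{3}p}{dx^{3}} = b^{3}\,p\,(1 - p)\,(1 - 6p + 6p^{2}).
\]
% For 0 < p < 1 the third derivative vanishes where 6p^2 - 6p + 1 = 0
\[
  p = \frac{3 \pm \sqrt{3}}{6} \approx 0.211 \quad \text{or} \quad 0.789 .
\]
```

The upper root, p ≈ 0.789, does not depend on a or b, which is why the target point always falls at the 78.9th centile of the fitted distribution.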
In a second step, we stratified the analysis on the baseline score of interest (divided into tertiles) to assess whether the level of pain, the patient’s assessment of disease activity, and functional impairment had a modifying effect on the MCII. That is we stratified (a) on the baseline pain score to estimate the MCII for pain; (b) on the baseline assessment of disease activity to estimate the MCII for patient’s assessment of disease activity; (c) on the baseline WOMAC function score to estimate the MCII for functional impairment.
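The centile based definition and the stratification on baseline tertiles can be sketched as follows. This is an illustration only, not the SAS or S plus code used in the study: the function names, the linear interpolation rule for centiles, the computation of tertile cut points on all patients supplied, and the small synthetic dataset are all assumptions.

```python
def centile(values, q):
    """q-th centile (0-100) by linear interpolation between order statistics.
    Assumption: other software may use a different centile definition."""
    xs = sorted(values)
    if not xs:
        raise ValueError("no values")
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def mcii(patients, anchor="good", q=75):
    """75th centile of the absolute change (final - baseline) among patients
    whose rating of response to treatment equals the anchor category."""
    changes = [p["final"] - p["baseline"] for p in patients
               if p["response"] == anchor]
    return centile(changes, q)


def mcii_by_baseline_tertile(patients, anchor="good"):
    """MCII within each tertile of the baseline score of interest."""
    baselines = [p["baseline"] for p in patients]
    cut1 = centile(baselines, 100.0 / 3)
    cut2 = centile(baselines, 200.0 / 3)
    strata = {"low": [], "intermediate": [], "high": []}
    for p in patients:
        if p["baseline"] <= cut1:
            strata["low"].append(p)
        elif p["baseline"] <= cut2:
            strata["intermediate"].append(p)
        else:
            strata["high"].append(p)
    return {name: mcii(group, anchor) for name, group in strata.items()
            if any(p["response"] == anchor for p in group)}


# Synthetic pain scores (mm on a 0-100 VAS); NOT data from the study.
synthetic_patients = [
    {"baseline": b, "final": f, "response": r}
    for b, f, r in [
        (30, 25, "good"), (35, 25, "good"), (40, 25, "good"),
        (50, 40, "good"), (55, 35, "good"), (60, 30, "good"),
        (70, 50, "good"), (80, 40, "good"), (90, 30, "good"),
        (45, 44, "fair"), (65, 5, "excellent"),
    ]
]
```

Changes are kept signed, so an improvement is negative and the 75th centile is the value that three quarters of the "good" responders reach or exceed in magnitude. Relative changes can be handled in the same way by replacing the difference with the difference divided by the baseline value.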
In a third step, to investigate the effect of covariates (other than location of OA) on the MCII, we stratified the analysis successively by age, disease duration (both divided into tertiles), and sex.
Statistical analyses were performed with the SAS Release 8.2 statistical software package and the S plus 4.5 statistical software package.
Compliance with research ethics standards
This study was conducted in compliance with the protocol, good clinical practices, and the Declaration of Helsinki principles.
RESULTS
A total of 1362 patients were enrolled in the study: 1019 (75%) had knee and 343 (25%) hip OA; 913 (67%) were female; and the mean (SD) age was 67.2 (10.5) years. A total of 914 (90%) patients with knee and 310 (90%) with hip OA completed the final visit. Patients lost to follow up were excluded from the analysis and did not differ from completers in their baseline characteristics. Among the completers, 603 patients with knee and 211 with hip OA assessed their response to treatment on a five point Likert scale.
Table 1 shows the descriptive statistics on clinical and demographics variables. Figure 2 shows patients’ rating of response to treatment.
Table 2 lists the MCII values for the three patient reported outcomes, according to location of OA. These values were estimated in the 265 patients with knee and the 87 patients with hip OA who completed the final visit and assessed their response to treatment as “good”. For instance, patients with knee OA considered themselves clinically improved if the decrease in pain exceeded 19.9 mm on the 0–100 mm VAS. We used the data from the five point, not the 15 point, Likert scale mentioned in the “Methods” section.
Table 3 shows the estimates of the MCII (for absolute change) stratified on the baseline score in patients with knee or hip OA. The higher the baseline score, the larger the MCII. Patients who have a severe symptom need a higher level of change to consider themselves clinically improved than those with less severe symptoms. For instance, patients with severe pain (a high tertile of baseline pain score) considered themselves clinically improved if the decrease in pain exceeded 36.6 mm on the 0–100 mm VAS. Patients with less pain (low tertile of baseline pain score) needed a lower level of change (−10.8 mm on the VAS) to consider themselves clinically improved. The estimates of the MCII for relative change also varied across tertiles of the baseline score (data not shown).
The estimates of the MCII do not vary across age, disease duration tertiles, or sex (data not shown).
DISCUSSION
This study dealt with the clinical meaningfulness of changes observed for patient reported outcome measures. Because a statistically significant difference is mostly a matter of sample size, the most difficult issue is whether an observed or estimated difference is clinically important.10 In other words, statistical significance is not equivalent to clinical significance. Reporting results of a trial using the MCII (that is, as a percentage of improved patients) provides readers with values which are more easily understood and additional information to help them decide whether a treatment should be used. This threshold also allows for monitoring of individual response to treatment over time and adapting treatment to individual patients (for example, determining whether to start or interrupt a treatment). Furthermore, the designation and use of MCII in clinical trials is critical for meaningful systematic reviews and combining results from different studies in meta-analyses.11 This concept aims at complementing, not replacing, information on the effect size, because the effect size remains a more powerful approach.12
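Reporting a trial result as a percentage of improved patients can be illustrated with a minimal sketch. It uses the MCII for absolute change in knee pain reported in this study (−19.9 mm); the function name, the two hypothetical treatment arms, and the treatment of a change exactly equal to the threshold as an improvement are assumptions.

```python
KNEE_PAIN_MCII_MM = -19.9  # MCII for absolute change in pain, knee OA (this study)


def proportion_improved(changes, mcii):
    """Proportion of patients whose signed change (final - baseline) is at
    least as large an improvement as the MCII (that is, change <= mcii).
    Assumption: a change exactly equal to the MCII counts as improved."""
    if not changes:
        raise ValueError("no patients")
    improved = sum(1 for c in changes if c <= mcii)
    return improved / len(changes)


# Hypothetical changes in pain (mm on a 0-100 VAS); NOT data from the study.
arm_a = [-30.0, -25.0, -10.0, 0.0, -22.5, 5.0, -40.0, -12.0]
arm_b = [-5.0, -21.0, -8.0, 2.0, -15.0, -3.0, 0.0, -18.0]

share_a = proportion_improved(arm_a, KNEE_PAIN_MCII_MM)
share_b = proportion_improved(arm_b, KNEE_PAIN_MCII_MM)
```

In this made-up example, half of the patients in the first arm and one in eight in the second arm reach the threshold, which is the kind of summary that complements the mean change and the effect size.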
The MCII is the smallest change in measures that signifies an important improvement in a patient’s symptom. Thus, the MCII can undoubtedly be considered as a treatment target from the patient’s perspective. It is based on the patient’s opinion as an external anchor and contrasts changes within patients at the individual level (proportion of improved patients) instead of at the group level (mean change in a variable).
Approaches such as investigator defined (expert consensus) or statistically defined methods have been used to determine this threshold.13,14 Despite the absence of a criterion measure, establishing the meaning of changes in a measure requires an independent standard. Patient global ratings are recommended as an external anchor for evaluating the clinical significance of individual change.13 The large sample of patients as experts in determining improvement is a good indicator of representativeness.
To determine the MCII, the external criterion was the patient’s assessment of response to treatment as assessed on a five point Likert scale. We defined MCII in the group of patients whose evaluation of response to treatment was “good”, because one is always looking for clinically important differences. We did not include patients whose evaluation of response to treatment was “excellent,” because our target was the minimal change important from the patient’s perspective. This choice was, however, arbitrary and affects the results (data not shown). The group of patients in whom MCII is determined and the wording of the items in the questionnaire to assess response to treatment should be chosen with the help of experts; in our study, the group of patients was chosen by the experts NB, CB, DF, MH, DvdH, and MD.
In a previous study,15 a three round Delphi method involving six academic rheumatologists experienced in OA trials was used to define the MCID for some outcome measures used in OA trials (not specifically focusing on hip or knee OA). The MCID for patient pain on movement (measured on a 0–100 VAS) was 17.5 mm and that for patient global assessment of disease activity (measured on a 0–100 VAS) was 15 mm. Although this method differs from that used in our study, the values are very close to our estimates of MCII for these patient reported outcomes. The only study dealing with meaningful change for the WOMAC dealt with the minimal clinically perceptible difference not the MCID.16
Our study has demonstrated that the MCII varies depending on the baseline state. Patients who have the most severe symptoms have to experience a greater change to consider themselves improved. Riddle and colleagues also found this effect in their investigation of low back pain,17 where the MCID varied between 3 and 13 depending on the baseline range of scores (on the Roland-Morris Back Pain Questionnaire,18 total score varying from 0 to 24 points, with baseline scores divided into five approximately equal sized intervals). However, the precision of their estimates may have been compromised by the small sample size, especially for patients with high levels of disability.
The variation of MCII across tertiles of baseline scores in our study cannot be attributed to the size of the sample, as confirmed by the narrowness of the 95% confidence intervals. We believe that this variation depending on the baseline score may preclude the use of the crude MCII. The patient’s initial or previous score should be taken into account when making decisions about important change. We propose to use three estimates of MCII, corresponding to the tertiles of each baseline score, to express the changes in terms of important improvement. This meets the recommendation of Crosby and associates13 for estimating MCID in health related quality of life criteria: to anchor baseline severity of individual patients.
We believe this is the first study to investigate the effect of several covariates such as age, sex, OA location, and disease duration on patient responses. It is interesting to observe that these factors do not consistently modify the estimates of MCII.
In conclusion, use of the concept MCII facilitates the presentation and interpretation of results obtained in clinical trials and the transposition of trial results into practice. However, the baseline score should be taken into account. Further studies involving different datasets, clinical environments, languages, and countries are necessary to validate these observations prospectively.
This study was supported by an unrestricted grant from Merck, Sharp & Dohme Chibret Laboratories, France.