Symposium on Quality of Life in Cancer Patients
Methods to Explain the Clinical Significance of Health Status Measures
THE PROBLEM OF MEANINGFULNESS
Those responsible for making treatment recommendations, such as clinicians for individual patients or experts and health policymakers for groups of patients, must weigh the expected benefits of a treatment against its adverse effects, toxic effects, inconvenience, and cost. This process requires a reasonably accurate understanding of the benefits and risks of alternative treatments. Acquiring this understanding presents a significant problem even for dichotomous clinical outcomes, such as
THE TARGET AUDIENCES FOR CLINICAL SIGNIFICANCE
The intended audience for our discussion on clinical significance includes patients, clinicians, and policymakers. Increasing awareness that value judgments are implicit in every clinical management decision [7] has focused more attention on the role of the patient in the decision-making process [8, 9]. For patients who desire major involvement in decision making, one approach involves presenting patients with the options and eliciting their choice. Using this approach requires that patients
THE PROBLEM OF MEANINGFULNESS IN QOL MEASURES
We have noted a problem in presenting results of studies using binary outcomes: the different meaning conveyed by relative and absolute risk reduction, NNT, and life-years gained. The complexity increases with the realization that no binary outcome is truly unambiguous. Deaths can be painful or painless, strokes can be mild or severe, and myocardial infarctions can be large and complicated or small and uncomplicated. In fact, severity of stroke and myocardial infarction are continuous in
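The contrast between relative and absolute measures can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name `risk_summary` and the event rates are hypothetical and do not come from any study discussed here. It shows how the same relative risk reduction corresponds to very different absolute risk reductions and numbers needed to treat (NNT) when baseline risk differs.

```python
def risk_summary(control_risk: float, treated_risk: float) -> dict:
    """Express one binary-outcome result as ARR, RRR, and NNT."""
    if not (0 < control_risk <= 1 and 0 <= treated_risk <= 1):
        raise ValueError("risks must be probabilities, with control_risk > 0")
    arr = control_risk - treated_risk  # absolute risk reduction
    rrr = arr / control_risk           # relative risk reduction
    # NNT is the reciprocal of the ARR; it is undefined (infinite) when ARR is 0
    # and negative when the treated group fares worse.
    nnt = float("inf") if arr == 0 else 1 / arr
    return {"arr": arr, "rrr": rrr, "nnt": nnt}


# Hypothetical event rates: a high-risk and a low-risk population.
high_risk = risk_summary(control_risk=0.20, treated_risk=0.15)
low_risk = risk_summary(control_risk=0.02, treated_risk=0.015)

for label, result in (("high risk", high_risk), ("low risk", low_risk)):
    print(f"{label}: RRR={result['rrr']:.0%}, "
          f"ARR={result['arr']:.3f}, NNT={result['nnt']:.0f}")
```

In both hypothetical populations the relative risk reduction is 25%, yet roughly 20 patients must be treated to prevent one event in the first and roughly 200 in the second, which is why the choice of summary measure can change how a result is perceived.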
INFERENCES CONCERNING INDIVIDUALS AND INFERENCES CONCERNING GROUPS
Observers frequently distinguish between the significance of a particular change in score in an individual and a change of the same magnitude in the mean score of a group of patients [12]. A change in mean blood pressure of a magnitude that would be trivial in an individual (eg, 2 mm Hg) may, across a population, translate into a substantial reduction in the number of strokes. There are 2 reasons for the
ANCHOR-BASED METHODS
Investigators have used 2 easily separable strategies to achieve an understanding of the meaning of scores on a given instrument [12]. The first relies on anchor-based methods and examines the relationship between scores on the instrument whose interpretation is under question (the target instrument) and some independent measure (an anchor). For instance, we might examine the relationship between scores on a QOL measure for heart failure and the New York Heart Association (NYHA) functional
APPROACHES FOR IDENTIFYING CLINICAL SIGNIFICANCE
We have not conducted a systematic search for approaches to clinical significance. Thus, our examples are neither comprehensive nor representative. Rather, we have attempted to provide a broad sample of approaches investigators have used, focusing on those we believe are both well done and instructive. However, we have surveyed the entire group of participants in this conference to ensure that we have not omitted any salient methods.
Similarly, we have not tried to be systematic in our critique.
ANCHOR-BASED METHODS OF ESTABLISHING INTERPRETABILITY: REQUIREMENTS
Whether relying on a single anchor or multiple anchors, anchor-based methods have 2 requirements. First, the anchor must be interpretable. It would be of little use to tell clinicians that a 2-point change per item in the fatigue scale (range, 1-7) in the Chronic Heart Failure Questionnaire (CHQ) [16] is equivalent to a 30-point change in the Medical Outcome Study physical function scale if they had no idea how to interpret the Medical Outcome Study instrument. On the other hand, if they use the
CLINICIANS’ TRADITIONAL APPROACHES AND INTERPRETABILITY
Experienced clinicians show little hesitation in acting on the clinical measures, yielding continuous scores, by which they judge their patients’ status. Hemoglobin concentration, platelet count, creatinine level, and treadmill exercise capacity constitute a few examples. How does the process of establishing interpretability occur? How, for instance, do chest physicians decide that a change in forced expiratory volume in 1 second (FEV1) of 15% approximates a minimum important change?
Chest
MULTIPLE ANCHORS
Ware and Keller [18], with the 36-Item Short-Form Health Survey (SF-36), have accomplished extensive and comprehensive work using multiple anchors, and we rely to a large extent on their studies to provide examples of this approach. In our discussion, we deal initially with anchors that involve concurrent measurement of the target and anchor and subsequently discuss anchors that involve monitoring patient outcome over time (health care utilization, job loss, and death).
SINGLE-ANCHOR METHODS
The Minimum Important Difference
Single-anchor methods generally aim to establish differences in score on the target instrument that constitute trivial, small but important, moderate, and large changes in QOL. However, they generally put great emphasis on a threshold that demarcates trivial from small but important differences: the minimum important difference (MID). One popular definition of the MID is “the smallest difference in score in the domain of interest which patients perceive as beneficial and which would mandate, in
ANALYTIC STRATEGIES FOR SINGLE-ANCHOR APPROACHES
Having chosen a single-anchor approach, investigators may use alternative analytic strategies that will lead to different estimates of the MID [54]. The simplest and so far most widely used approach is to specify a result or range of anchor instrument results that corresponds to the MID and calculate the target score corresponding to that value. For example, investigators have examined the mean change in QOL score corresponding to global ratings of change that included “hardly any better,” “a
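The simplest strategy described above, averaging the change in target score among patients whose anchor result falls in the range taken to represent the MID, might be sketched as follows. Everything concrete here is an assumption made for illustration: the function name `mid_from_anchor`, the rating labels, and the patient data are hypothetical, and which anchor categories represent a small but important change is a judgment the investigator must make.

```python
from statistics import mean


def mid_from_anchor(records, mid_ratings):
    """Mean target-score change among patients whose anchor rating is in mid_ratings.

    records: iterable of (anchor_rating, change_in_target_score) pairs.
    mid_ratings: set of anchor categories assumed to represent the MID.
    """
    changes = [change for rating, change in records if rating in mid_ratings]
    if not changes:
        raise ValueError("no patients fall in the anchor categories chosen for the MID")
    return mean(changes)


# Hypothetical data: (global rating of change, change in target score per item).
records = [
    ("no change", 0.0), ("no change", 0.2),
    ("a little better", 0.4), ("a little better", 0.6), ("a little better", 0.5),
    ("much better", 1.2), ("much better", 1.5),
]

mid_estimate = mid_from_anchor(records, {"a little better"})
print(f"Estimated MID: {mid_estimate:.2f} points per item")
```

Widening or narrowing `mid_ratings` changes the estimate, which is one reason different analytic choices can yield different values of the MID for the same instrument.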
SINGLE-ANCHOR APPROACHES AND CLINICAL TRIALS INTERPRETATION
Once one has established the MID for a patient, one must decide how to use this information in clinical trials. A naive approach would assume that if the mean difference between treatment and control was less than the MID, the treatment effect would be trivial, and if greater than the MID, the treatment effect would be important. This ignores the distribution of the results. For example, assume a MID of 0.5. A mean difference of 0.25 (trivial in a naive interpretation) could be achieved if 25%
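A small numerical sketch shows why the mean difference alone can mislead. The MID of 0.5 and the mean difference of 0.25 are taken from the text; the two patient-level distributions, the assumption that control patients do not change on average, and the names `summarize`, `uniform`, and `concentrated` are hypothetical choices made for this illustration.

```python
from statistics import mean

MID = 0.5  # minimum important difference assumed in the text


def summarize(changes, mid=MID):
    """Return (mean change, proportion of patients improving by at least the MID)."""
    proportion = sum(change >= mid for change in changes) / len(changes)
    return mean(changes), proportion


# Two hypothetical treated groups of 100 patients with the same mean change.
# The control group is assumed to have a mean change of 0, so the mean
# difference between groups equals the treated group's mean change.
uniform = [0.25] * 100                    # everyone improves slightly
concentrated = [1.0] * 25 + [0.0] * 75    # a minority improves substantially

for label, group in (("uniform", uniform), ("concentrated", concentrated)):
    avg, proportion = summarize(group)
    print(f"{label}: mean change={avg:.2f}, proportion reaching MID={proportion:.0%}")
```

Both groups have a mean change of 0.25, below the MID, yet in the second group a quarter of patients experience an improvement at least as large as the MID. Reporting the proportion of patients who reach the MID alongside the mean is one way to keep that information visible.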
BETWEEN-PERSON STANDARD DEVIATION UNITS
The most widely used distribution-based method to date is the between-person standard deviation. The group from which this is drawn is typically the control group of a particular study at baseline or the pooled standard deviation of the treatment and control groups at baseline. As we have mentioned herein, an alternative is to choose the standard deviation for a sample of the general population or some particular population of special interest, rather than the population of the particular
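Expressing a difference in between-person standard deviation units amounts to dividing it by a baseline standard deviation. The sketch below uses the pooled baseline standard deviation of the treatment and control groups, one of the options mentioned above. The function names, baseline scores, and mean difference are hypothetical and serve only to illustrate the arithmetic.

```python
from math import sqrt
from statistics import variance


def pooled_sd(group_a, group_b):
    """Pooled standard deviation of two groups, weighted by degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return sqrt(pooled_variance)


def in_sd_units(mean_difference, sd):
    """Express a mean difference as a multiple of a standard deviation."""
    return mean_difference / sd


# Hypothetical baseline scores for the two arms of a trial.
baseline_treatment = [4.0, 5.0, 6.0]
baseline_control = [3.0, 5.0, 7.0]

sd = pooled_sd(baseline_treatment, baseline_control)
standardized = in_sd_units(mean_difference=0.5, sd=sd)
print(f"pooled SD={sd:.2f}, difference={standardized:.2f} SD units")
```

Substituting the standard deviation of a general-population sample for `sd` gives the alternative scaling described in the text; the resulting number of standard deviation units will differ whenever the two populations differ in variability.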
STANDARD ERROR OF MEASUREMENT
The standard error of measurement is defined as the variability between an individual's observed score and the true score and is computed as the baseline standard deviation multiplied by the square root of 1 minus the reliability of the QOL measure. Theoretically, a QOL measure's standard error of measurement is sample independent, whereas its component statistics, the standard deviation and the reliability estimate, are sample dependent and vary around the standard error of measurement [64]. For
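The computation described above, the baseline standard deviation multiplied by the square root of 1 minus the reliability, can be written directly. The function name and the example values (a standard deviation of 10 and a reliability of 0.84) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from math import sqrt


def standard_error_of_measurement(baseline_sd: float, reliability: float) -> float:
    """SEM = baseline SD * sqrt(1 - reliability)."""
    if baseline_sd < 0:
        raise ValueError("baseline_sd must be non-negative")
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must lie between 0 and 1")
    return baseline_sd * sqrt(1 - reliability)


# Hypothetical instrument: baseline SD of 10 points, reliability of 0.84.
sem = standard_error_of_measurement(baseline_sd=10.0, reliability=0.84)
print(f"SEM = {sem:.1f} points")
```

With these values the standard error of measurement is about 4 points. A perfectly reliable measure (reliability of 1) would have a standard error of measurement of 0, and a measure with no reliability would have one equal to the baseline standard deviation.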
RECONCILIATION OF ANCHOR-BASED AND DISTRIBUTION-BASED METHODS
Investigators are adducing increasing evidence concerning the relationship between statistical measures of patient variability and anchor-based estimates of small, moderate, and large differences in QOL. To the extent that standard deviations across QOL studies using the same instruments are consistent, one will see a consistent relationship between the standard deviation and the MID. If this relationship were also consistent across instruments, this area of investigation would become much
CONCLUSIONS
This review reflects both the considerable work that has been done to establish the interpretability of QOL measures in the last 15 years and the enormous amount left to do. The field remains controversial, and there are many alternative approaches, each with its advocates. The following conclusions, however, may be relatively safe. First, distribution-based methods will not suffice on their own but will be useful to the extent that they bear a consistent relationship with anchor-based methods.
REFERENCES (67)
- Can there be a more patient-centered approach to determining clinically important effect sizes for randomized treatment trials? J Clin Epidemiol (1994).
- Completeness of reporting trial results: effect on physicians' willingness to prescribe. Lancet (1994).
- Quality of life on angina therapy: a randomised controlled trial of transdermal glyceryl trinitrate against placebo. Lancet (1988).
- Measurement of health status: ascertaining the minimal clinically important difference. Control Clin Trials (1989).
- Determining a minimal important change in a disease-specific Quality of Life Questionnaire. J Clin Epidemiol (1994).
- Interpretation of rhinoconjunctivitis quality of life questionnaire data. J Allergy Clin Immunol (1996).
- Methodological problems in the retrospective computation of responsiveness to change: the lesson of Cronbach. J Clin Epidemiol (1997).
- Assessing the minimal important difference in symptoms: a comparison of two techniques. J Clin Epidemiol (1996).
- On the debate over methods for estimating the clinically important difference. J Clin Epidemiol (1996).
- Assessing the responsiveness of functional scales to clinical change: an analogy to diagnostic test performance. J Chronic Dis (1986).
- Identification of clinically important changes in health status using receiver operating characteristic curves. J Clin Epidemiol.
- Economic analysis of respiratory rehabilitation. Chest.
- Further evidence supporting an SEM-based criterion for identifying meaningful intra-individual changes in health-related quality of life. J Clin Epidemiol.
- Setting the minimal metrically detectable change on disability rating scales. Arch Phys Med Rehabil.
- Indexes of contrast and quantitative significance for comparisons of two groups. Stat Med.
- Measured enthusiasm: does the method of reporting trial results alter perceptions of therapeutic effectiveness? Ann Intern Med.
- Prescribing propensity: influence of life-expectancy gains and drug costs. J Gen Intern Med.
- Discrepancy between medical decisions for individual patients and for groups. N Engl J Med.
- Users' guides to the medical literature, XVI: how to use a treatment recommendation. JAMA.
- Decision aids for patients facing health treatment or screening decisions: systematic review. BMJ.
- Moving from evidence to action: incorporating patient values.
- Measuring health-related quality of life. Ann Intern Med.
- Interpretation of quality-of-life outcomes: issues that affect magnitude and meaning. Med Care.
- Interpretation of quality of life changes. Qual Life Res.
- Interpreting treatment effects in randomised trials. BMJ.
- The clinical meaning of Rankin “handicap” grades after stroke. Stroke.
- Thrombolysis for acute ischaemic stroke. Cochrane Database Syst Rev.
- Development and testing of a new measure of health status for clinical trials in heart failure. J Gen Intern Med.
- Approaches to the interpretation of quality-of-life scales. Med Care.
- Interpreting general health measures.
- The cost effectiveness of auranofin: results of a randomized clinical trial. J Rheumatol.
- The impact of psychologic factors on measurement of functional status: assessment of the sickness impact profile. Med Care.
- Measuring functional outcomes in chronic disease: a comparison of traditional scales and a self-administered health status questionnaire in patients with rheumatoid arthritis. Med Care.
A complete list of other Clinical Significance Consensus Meeting Group contributors to this article appears at the end of the article.
This project was supported in part by Public Health Service grants CA25224, CA37404, CA15083, CA35269, CA35113, CA35272, CA52352, CA35103, CA37417, CA63849, CA35448, CA35101, CA35195, CA35415, and CA35103.
Individual reprints of this article are not available. The entire Symposium on the Clinical Significance of Quality-of-Life Measures in Cancer Patients will be available for purchase as a bound booklet from the Proceedings Editorial Office at a later date.