Minimally important difference in diffuse systemic sclerosis: results from the d-penicillamine study
  1. D Khanna1,
  2. D E Furst2,
  3. R D Hays3,5,
  4. G S Park2,4,
  5. W K Wong4,
  6. J R Seibold6,
  7. M D Mayes7,
  8. B White8,*,
  9. F F Wigley9,
  10. M Weisman10,
  11. W Barr11,
  12. L Moreland12,
  13. T A Medsger Jr13,
  14. V D Steen14,
  15. R W Martin15,
  16. D Collier16,
  17. A Weinstein14,
  18. E V Lally17,
  19. J Varga11,
  20. S R Weiner2,
  21. B Andrews18,
  22. M Abeles19,
  23. P J Clements2
  1. 1Division of Immunology, Department of Medicine; Institute for the Study of Health, University of Cincinnati, Cincinnati, Ohio, USA; Veterans Affairs Medical Center, Cincinnati
  2. 2Department of Medicine, Division of Rheumatology, David Geffen School of Medicine, Los Angeles, California, USA
  3. 3Division of General Internal Medicine and Health Services Research, David Geffen School of Medicine
  4. 4Department of Biostatistics, David Geffen School of Medicine
  5. 5RAND, Los Angeles
  6. 6University of Michigan Scleroderma Program, Ann Arbor, Michigan, USA
  7. 7Division of Rheumatology and Clinical Immunogenetics, The University of Texas—Houston Medical School, Houston, Texas, USA
  8. 8Department of Rheumatology and Clinical Immunology, University of Maryland School of Medicine, Baltimore, Maryland, USA
  9. 9Rheumatology Division, The Johns Hopkins University, Baltimore, Maryland, USA
  10. 10Division of Rheumatology, Cedars-Sinai Medical Center, Los Angeles
  11. 11Division of Rheumatology, Northwestern University Medical School, Chicago, Illinois, USA
  12. 12Division of Clinical Immunology and Rheumatology, University of Alabama at Birmingham, Birmingham, Alabama, USA
  13. 13Division of Rheumatology and Clinical Immunology, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
  14. 14Division of Rheumatology, Georgetown University Medical Center, Washington, DC, USA
  15. 15Division of Rheumatology, College of Human Medicine, Michigan State University, Grand Rapids, Michigan, USA
  16. 16Division of Rheumatology, University of Colorado Health Sciences Center, Denver, Colorado, USA
  17. 17Division of Rheumatology, Brown Medical School, Providence, Rhode Island, USA
  18. 18Division of Rheumatology, University of California, Irvine, California, USA
  19. 19Division of Rheumatic Diseases, University of Connecticut Health Center, Farmington, Connecticut, USA
  1. Correspondence to:
    D Khanna
    Division of Immunology, Department of Medicine, University of Cincinnati, ML 0563, Cincinnati, OH 45267-0563, USA; dinesh.khanna{at}uc.edu

Abstract

Objective: To estimate minimally important differences (MIDs) in scores for the modified Rodnan Skin Score (mRSS) and Health Assessment Questionnaire—Disability Index (HAQ-DI) in a clinical trial on diffuse systemic sclerosis (SSc).

Participants and methods: 134 people participated in a 2-year, double-blind, randomised clinical trial comparing efficacy of low-dose and high-dose d-penicillamine in diffuse SSc. At 6, 12, 18 and 24 months, the investigator was asked to rate the change in the patient’s health since entering the study: markedly worsened, moderately worsened, slightly worsened, unchanged, slightly improved, moderately improved or markedly improved. Patients who were rated as slightly improved were defined as the minimally changed subgroup and compared with patients rated as moderately or markedly improved.

Results: The MID estimates for the mRSS improvement ranged from 3.2 to 5.3 (0.40–0.66 effect size) and for the HAQ-DI from 0.10 to 0.14 (0.15–0.21 effect size). Patients who were rated to improve more than slightly were found to improve by 6.9–14.2 (0.86–1.77 effect size) on the mRSS and 0.21–0.55 (0.32–0.83 effect size) on the HAQ-DI score.

Conclusion: MID estimates are provided for improvement in the mRSS and HAQ-DI scores, which can help in interpreting clinical trials on patients with SSc and be used for sample size calculation for future clinical trials on diffuse SSc.

  • d-Pen, d-penicillamine
  • HAQ-DI, Health Assessment Questionnaire—Disability Index
  • HRQOL, health-related quality of life
  • MID, minimally important difference
  • mRSS, modified Rodnan Skin Score
  • SSc, systemic sclerosis

Chronic diseases often have a relapsing and remitting course, with substantial effect on functioning and well-being or health-related quality of life (HRQOL). Systemic sclerosis (SSc) or scleroderma, a multisystem autoimmune disorder with no effective treatment or cure, is an example of a disease in which patients must cope with pain, disfigurement, disability and feelings of helplessness, each of which can affect their HRQOL and outlook on the future.1 The ultimate goal of healthcare is to improve, restore or preserve HRQOL.2 An important advance in research on HRQOL is the concept of minimally important difference (MID), defined as “the smallest difference in score of a HRQOL [or measure of interest] instrument that patients perceive as beneficial and that would mandate, in the absence of troublesome side-effects and excessive cost, a change in the patient’s management”.3 For clinicians, it has been defined as “the smallest effect size that would lead them to recommend a therapy to their patients”.4 In short, “MID is the smallest difference in a score that is considered to be worthwhile or important”.5

Estimation of the MID requires identification of an external anchor that provides an indication of underlying change: a clinically relevant indicator or pointer to which a change in HRQOL can be tied.6 An anchor is of clinical relevance and can be “subjective”, such as self-reports of change as assessed by the patient or the investigator, or “objective”, such as clinical indicators of response to treatment (skin thickness or disease severity). Self-reports of change require the respondents to evaluate whether they have experienced change in a domain of health from a previous time point to the present.7

In this report, we estimate MIDs for the modified Rodnan Skin Score (mRSS) and Health Assessment Questionnaire—Disability Index (HAQ-DI) in a 2-year, double-blind, randomised clinical trial comparing efficacy of low-dose and high-dose d-penicillamine (d-Pen) in patients with diffuse SSc.8,9,10 The investigator’s assessment of the patient’s disease was used as the anchor at time points of 6, 12, 18 and 24 months.

PARTICIPANTS AND METHODS

Participants

The d-Pen study has been detailed elsewhere.8 Briefly, this was a 2-year, multicentre US study that recruited and randomised 134 people with diffuse SSc (disease duration ⩽18 months) to receive either 750–1000 mg/day of d-Pen or 125 mg of d-Pen every other day. Diffuse SSc was characterised by the presence of skin thickening, proximal as well as distal to the elbows and knees, with or without involvement of the face.11 No statistically significant differences were observed between treatment groups at baseline and at 2 years in the three primary outcomes in this trial: change in the skin score, incidence of renal crisis and survival. Therefore, for the present analysis, the data from the two groups were pooled.

The skin score was assessed using the mRSS (referred to as skin score from this point), a clinical measure of the extent and severity of skin thickening.12,13 In addition to serving as the primary basis for disease classification, skin thickening in SSc is viewed as a clinical surrogate of disease progression, at least in early disease (usually considered to be <3 years from the onset of the first typical sign or symptom of SSc, not including Raynaud’s phenomenon).14 Skin thickening is assessed on 17 body areas: fingers, hands, forearms, arms, feet, legs and thighs (bilaterally), and the face, chest and abdomen (singly). Each area is scored from 0 (normal skin) to 3 (severe thickening), giving a total score that ranges from 0 (no thickening) to 51 (severe thickening in all 17 areas).

The HAQ-DI is a self-administered arthritis-targeted measure intended for assessing activities of daily living in arthritis.15 It assesses a patient’s level of functional ability and includes questions on fine movements of the upper extremity, locomotor activities of the lower extremity, and activities that are associated with both upper and lower extremities. In the original HAQ-DI, an additional grade of difficulty was added for patients using assistive or adaptive devices (such as a cane or walker). In our study, the patient responses were not modified for patient use of assistive or adaptive devices.9,10 The standard HAQ-DI is scored from 0 (no disability) to 3 (severe disability), representing the average of the worst scores in each of eight domains of daily function.

At 6, 12, 18 and 24 months, the investigator was asked “Since entering the study, the patient is: markedly worsened, moderately worsened, slightly worsened, unchanged, slightly improved, moderately improved, markedly improved.” In the question, there is no direct reference to the patients’ health or their disease. Patients who improved slightly were defined as the minimally changed subgroup. The changes in the skin and HAQ-DI scores for the group that “slightly improved” at 6, 12, 18 and 24 months were determined in order to estimate the MID. These were compared with the change scores for the group that moderately or markedly improved.

Only improvement is considered in this article because the number of patients who slightly worsened was too small to make a statistically or clinically robust conclusion (n = 5–6 at different time points) and the data are not presented.

Statistical analysis

The normality of the change in skin score and HAQ-DI score for patients who slightly improved (ie, skin score at month x − skin score at month 0, where x = 6, 12, 18 or 24 months) was assessed using the Shapiro–Wilk test. The data were normally distributed for the change in skin score and HAQ-DI score except for the 18-month change in HAQ-DI score; therefore, the MIDs are presented as mean (standard deviation (SD)). As pointed out by Hays and colleagues,16


... [an] anchor should have a non-trivial association with change in HRQOL measure [or measure of interest, eg, skin score here]. If the correlation between the anchor and HRQOL [or measure of interest] change is zero, the anchor is not useful for establishing MID (pp 64–5).

To assess the usefulness of an anchor, change in the anchor and change in the HRQOL scores should have a correlation coefficient of at least 0.37.16 A correlation coefficient of 0.37 corresponds to an effect size of 0.80 (large effect as proposed by Cohen17). This was assessed using the Spearman correlation coefficient (as change in anchor is an ordinal variable) between the investigator global assessment (the anchor) and change in skin and HAQ-DI scores. MID was estimated by examining change in skin score and HAQ-DI score in patients who had slightly improved. These estimates were compared with those of patients who were rated to improve more than slightly. Responsiveness to change was evaluated using the effect size.18 Effect size is the mean change in the skin score or HAQ-DI from the baseline to month x (where x = 6, 12, 18 or 24 months) divided by the SD at baseline (8.0 for the skin score and 0.66 for HAQ-DI). Cohen’s rule of thumb for interpreting effect size is that a value of 0.20–0.49 represents a small change, 0.50–0.79 a medium change and ⩾0.80 a large change.17,19–21
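The effect size calculation described above can be illustrated with a minimal Python sketch. The function names are hypothetical and are not taken from the study; the baseline SDs (8.0 and 0.66) and the mean changes are the values reported in this article, and the label "negligible" for values below 0.20 is an assumption that follows the wording used in the Discussion.

```python
# Illustrative sketch only: effect size = mean change / baseline SD,
# labelled with Cohen's rule of thumb as quoted in the text.

BASELINE_SD = {"skin score": 8.0, "HAQ-DI": 0.66}  # values reported in the article


def effect_size(mean_change, baseline_sd):
    """Magnitude of the mean change expressed in baseline SD units."""
    return abs(mean_change) / baseline_sd


def cohen_label(es):
    """Cohen's thresholds; 'negligible' (<0.20) is an assumed label."""
    if es >= 0.80:
        return "large"
    if es >= 0.50:
        return "medium"
    if es >= 0.20:
        return "small"
    return "negligible"


if __name__ == "__main__":
    # Lower and upper MID estimates reported for each measure.
    for measure, changes in {"skin score": (3.2, 5.3), "HAQ-DI": (0.10, 0.14)}.items():
        for change in changes:
            es = effect_size(change, BASELINE_SD[measure])
            print(f"{measure}: change {change} -> effect size {es:.2f} ({cohen_label(es)})")
```

Dividing by the baseline SD, rather than the SD of the change scores, follows the definition given in the text; other responsiveness statistics use different denominators and would give different values.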

As an exploratory analysis, the effect of baseline HAQ-DI and skin scores was assessed on the MID estimates.6 People with different baseline HAQ-DI or skin scores may require different amounts of improvement to consider a change to represent an MID. We divided the baseline HAQ-DI and skin scores into three groups: between 0 and 25th centile, group 1; between 26th and 74th centile, group 2; and ⩾75th centile, group 3. We estimated MID for 6-month change score; the numbers of participants were too small at other time points to estimate changes for the three groups.
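A minimal sketch of the three-group split described above is given here. The scores are made-up illustrative values, the function name is hypothetical, and the quartile cut-offs come from Python's `statistics.quantiles` with its default interpolation, which may differ from the centile method used in the study.

```python
# Illustrative sketch only: assign baseline scores to three severity groups
# using the 25th and 75th centiles as cut-offs.
from statistics import quantiles


def severity_groups(baseline_scores):
    """Return 1 (<=25th centile), 2 (between) or 3 (>=75th centile) per score."""
    q25, _median, q75 = quantiles(baseline_scores, n=4)
    groups = []
    for score in baseline_scores:
        if score <= q25:
            groups.append(1)
        elif score >= q75:
            groups.append(3)
        else:
            groups.append(2)
    return groups


if __name__ == "__main__":
    # Hypothetical baseline skin scores, not study data.
    example_scores = [5, 10, 15, 20, 25, 30, 35, 40, 45]
    print(severity_groups(example_scores))
```

The MID would then be estimated separately within each group, as the mean change among patients rated as slightly improved.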

RESULTS

Table 1 shows the baseline characteristics of the 134 patients. The mean (SD) values were 43.7 (12.4) years for age, 1.0 (0.7) for HAQ-DI and 21 (8) for skin score. At baseline, the 66 patients who completed the study had a significantly higher haematocrit (p<0.05) than non-completers (n = 68); other baseline variables were similar between the two groups. MID analysis is based on completers at 6 (n = 104), 12 (n = 97), 18 (n = 72) and 24 months (n = 68; table 2).

Table 1

 Baseline characteristics of all patients, classified by completion status of the 2-year study

Table 2

 Minimal important difference estimates for the skin score and Health Assessment Questionnaire—Disability Index

Table 3 shows the Spearman’s correlation coefficient between the anchor (investigator assessment of change) and change in the HAQ-DI and skin score in the slightly improved group from baseline at four time points. The correlation coefficient was >0.37 at all four time points for both measures, except for change in the HAQ-DI score at month 6 (r = 0.35), which fell just below the threshold. With this one exception, the requirement that the correlation be ⩾0.37 was satisfied.

Table 3

 Spearman’s correlation coefficient between the anchor and change in the Health Assessment Questionnaire—Disability Index and skin scores
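The anchor check summarised in table 3 can be sketched as follows. This is a pure-Python illustration, not the study's analysis code: the coding of the seven anchor categories as −3 to +3, the function names and the example data are all assumptions, and improvement is expressed as baseline minus follow-up so that larger values mean greater improvement.

```python
# Illustrative sketch only: Spearman correlation between an ordinal anchor
# and a change score, with ties given average ranks.

THRESHOLD = 0.37  # minimum correlation quoted in the text


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def is_useful_anchor(anchor, improvement):
    return abs(spearman(anchor, improvement)) >= THRESHOLD


if __name__ == "__main__":
    # Hypothetical data: anchor coded -3 (markedly worsened) to +3 (markedly improved).
    anchor = [0, 1, 1, 2, 3, 0, -1, 2, 1, 3]
    improvement = [0.5, 3.0, 4.5, 7.0, 12.0, -1.0, -2.0, 9.0, 2.5, 15.0]
    print(f"rho = {spearman(anchor, improvement):.2f}, useful = {is_useful_anchor(anchor, improvement)}")
```

A rank-based coefficient is used because the anchor is ordinal, as the Statistical analysis section notes.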

Table 2 shows the mean (SD) MID scores for improvement in the skin score and HAQ-DI. The MID estimate for the skin score ranged from 3.2 to 5.3 (0.40–0.66 effect size), except at month 12 (MID = 1.9). The MID estimate for the HAQ-DI ranged from 0.10 to 0.14 (0.15–0.21 effect size), except at month 12 (MID = 0.03). Patients who were rated to improve more than slightly were found to improve by 6.9–14.2 (0.86–1.77 effect size) on the skin score and 0.21–0.55 (0.32–0.83 effect size) on the HAQ-DI. Table 2 also provides mean (SD) scores of patients who had not changed according to the doctors (<0.20 effect size at all four time points for skin score and HAQ-DI). The changes in those who were rated to improve more than slightly were consistently larger than those of the unchanged group and in the expected direction.

Table 4 shows the mean (SD) MID scores at 6 months for improvement in the skin and HAQ-DI stratified by severity of skin involvement at baseline. Although exploratory, the data suggest that patients with very high baseline scores require a larger improvement to be characterised by their doctors as minimally improved as exemplified by the skin score (MID for the minimally improved group for low-score group, −2.6; medium-score group, −1.8; and high-score group, −6.1) and HAQ-DI (MID for the minimally improved group for low score group, +0.01; medium score group, −0.11; and high score group, −0.21).

Table 4

 Minimal important difference estimates for the skin score and Health Assessment Questionnaire—Disability Index at 6 months stratified by severity at the baseline

DISCUSSION

MID estimates of HRQOL measures have propelled the development of new drugs for the treatment of different arthritides and the successful design of controlled studies to improve HRQOL.22 Rapid advances in the field of immunology have spawned novel treatments targeting the cytokines associated with the pathogenesis of SSc. In addition, MID estimates provide a benchmark for future design of clinical trials on patients with SSc by helping researchers and clinicians understand whether differences in HRQOL scores between two treatment groups are meaningful or whether changes within one group over time are meaningful.6 MIDs are also useful for determining sample size for future studies.23
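As a rough illustration of how an MID can feed into a sample size calculation, the sketch below applies the standard normal-approximation formula for comparing two group means, with the MID as the difference to be detected. It is not a method described in this article. The two-sided α of 0.05, the power of 80%, the function name and the use of the baseline SD of the skin score (8.0) as a stand-in for the outcome SD are all assumptions.

```python
# Illustrative sketch only: patients per group needed to detect a
# between-group difference equal to the MID (normal approximation).
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """n = 2 * ((z_(1-alpha/2) + z_power) * sd / delta)^2, rounded up."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)


if __name__ == "__main__":
    skin_score_sd = 8.0  # baseline SD reported in the article (assumed outcome SD)
    for mid in (3.2, 5.3):  # lower and upper MID estimates for the skin score
        print(f"MID {mid}: about {n_per_group(mid, skin_score_sd)} patients per group")
```

The smaller the difference a trial is designed to detect, the more patients are required, so the lower end of the MID range gives the more conservative estimate. A real trial design would also need to allow for dropout and for the SD of the change scores.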

We used data from the d-Pen study to derive the MIDs for skin scores and HAQ-DI. We used investigator global assessment of change as our anchor. In this study, the correlation coefficient between the anchor and change in skin score and HAQ-DI was ⩾0.37. Investigator global assessment of change could capture changes in skin score and HAQ-DI.

Although a previous study21 and consensus statement24 by the SSc experts suggest that a change of ⩾30% is clinically meaningful, to our knowledge, no study has estimated MID using a data-driven approach. Our MID estimates of skin score were 3.2–5.3 (or 15–25% of the baseline skin score of 21), which are smaller than the proposed consensus improvement of ⩾30%. Also, the range of MID estimates of skin score (3.2–5.3) is greater than the reported intraobserver skin score variability of 2.45,25 making these data-driven estimates statistically credible as well.
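The arithmetic behind this comparison can be checked with a short sketch. All numbers are taken from the text; the variable and function names are hypothetical.

```python
# Illustrative sketch only: MID expressed as a percentage of the mean baseline
# skin score, compared with the 30% consensus threshold and with the reported
# intraobserver variability.

BASELINE_MEAN_SKIN_SCORE = 21.0
MID_RANGE = (3.2, 5.3)
CONSENSUS_PERCENT = 30.0
INTRAOBSERVER_VARIABILITY = 2.45


def percent_of_baseline(change, baseline_mean):
    return 100.0 * change / baseline_mean


low_pct, high_pct = (percent_of_baseline(m, BASELINE_MEAN_SKIN_SCORE) for m in MID_RANGE)
consensus_points = CONSENSUS_PERCENT / 100.0 * BASELINE_MEAN_SKIN_SCORE

if __name__ == "__main__":
    print(f"MID range as % of baseline: {low_pct:.0f}% to {high_pct:.0f}%")
    print(f"30% of baseline corresponds to {consensus_points:.1f} points")
    print(f"Lowest MID exceeds intraobserver variability: {min(MID_RANGE) > INTRAOBSERVER_VARIABILITY}")
```

At the mean baseline score of 21, the 30% consensus threshold corresponds to 6.3 points, which lies above the whole MID range.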

MID for improvement in HAQ-DI of patients with rheumatoid arthritis has been estimated to be 0.22,26 and our MID estimate (0.10–0.14) is lower. The difference might be explained by the fact that MID in this study is based on the doctor’s assessment rather than the patient’s assessment of the disease course. Also, MID estimates are reported as a range rather than a point estimate, as MID may change with different anchors.16

The effect size was consistently lower for HAQ-DI than for skin score. For HAQ-DI, it was negligible to small for the slightly improved group and small to medium (except month 18, which was large) for the moderately and markedly improved groups. By comparison, for the skin score it was small to medium for the slightly improved group and large for the other two groups.

MID estimates in the other groups should be larger than those in the no change group and in the expected direction because “if it turns out that the change for the no change group is similar to that of the minimally changed group, then the MID estimate is suspect” (Hays et al,16 p 65). MID estimates for skin score and HAQ-DI were consistently greater than the changes in the no change group and in the expected direction, with the exception of month 12. A review of the 12-month data did not find any non-random pattern of missingness, and we excluded these values from our proposed estimates.

As previously noted,6 MID estimates may depend on baseline scores. This trend was seen in our analysis (table 4), where people with higher baseline scores (defined as baseline score ⩾75th centile) required a larger change in their skin score and HAQ-DI for them to be considered as minimally improved by their doctors.

Our study has certain limitations. Firstly, the global ratings of change asked doctors to remember how their patient’s health was a year ago, and such retrospective reports are subject to recall bias. The HRQOL experts suggest choosing multiple and different kinds of anchors owing to inherent uncertainty associated with MID estimates,16 and future clinical trials should incorporate prospective or concurrent anchors.7 Secondly, MID estimates for the HAQ-DI were based on the doctor’s assessment rather than the patient’s assessment; the patient’s assessment was not considered in this study. This assumes that the doctor is able to assess SSc-related disability as reflected by the HAQ-DI. Future studies should incorporate the patient’s assessment to confirm our findings. Thirdly, the trial cohort is not reflective of a general SSc population as the patients had early diffuse SSc.

In conclusion, using a data-driven approach, we provide MID estimates (generated by the patient’s doctor) for improvement in the skin score and HAQ-DI, which can be used in interpreting clinical trials on patients with SSc and which can help in sample size calculation for future clinical trials on patients with diffuse SSc.

Footnotes

  • * Current address: Amgen, Thousand Oaks, California, USA

  • Published Online First 15 March 2006

  • Funding: This study was supported in part by grants from the Scleroderma Federation; United Scleroderma Foundation; FDA Orphan Drug Program; Arthritis Foundation; CRC grant numbers M01RR00865 and M01RR00827; WH Conzen Endowment in Clinical Pharmacology of the Schering-Plough Foundation; and bequests from the estates of Winifred Krause, Morris Goldsmith and Takako Ito. DK was supported by the Arthritis and Scleroderma Foundations (Physician Scientist Development Award), the Scleroderma Foundation (New Investigator Award), a National Institutes of Health BIRCWH Award (grant number HD051953) and a grant from the Scleroderma Clinical Trial Consortium. RDH was supported in part by the UCLA/DREW Project EXPORT, National Institutes of Health, National Center on Minority Health & Health Disparities (P20-MD00148-01), the UCLA Center for Health Improvement in Minority Elders/Resource Centers for Minority Aging Research, National Institutes of Health, and National Institute of Aging (AG-02-004).

  • Competing interests: None declared.
