Abstract
Background Joint damage is an important outcome in trials of rheumatoid arthritis (RA), usually assessed by Total Sharp Score (TSS). It is currently unknown how joint damage translates numerically into disability as measured by the Health Assessment Questionnaire (HAQ).
Objective To determine the units of HAQ score corresponding to one TSS unit.
Methods A short-term observational trial of glucocorticoids in RA (the ‘BEst LIfe with Rheumatoid Arthritis’ (BELIRA) trial) was evaluated, using randomised controlled clinical trial (RCT) data for confirmation. For each trial arm HAQ, TSS and the Simplified Disease Activity Index (SDAI) were assessed. Based on the hypothesis that short-term HAQ changes will mostly be due to changes of disease activity, activity HAQ (ACT-HAQ) at end point (EP) was determined and remaining disability defined as damage related (DAM-HAQ). Using TSS at EP, the HAQ units corresponding to a TSS unit were estimated.
Results In BELIRA, one TSS unit corresponded to a mean of 0.017 HAQ units; to account for other causes of irreversible disability, the 25th percentile was used: 0.011 HAQ units/TSS unit. In RCT trial arms, the HAQ/TSS were similar (0.013 and 0.015 in established and early RA, respectively; 25th percentile: 0.010). The correlation between DAM-HAQEP and TSS was r=0.829. Over 5 years, damage would amount to an increase of irreversible HAQ of 0.33 on placebo, 0.13 on disease-modifying antirheumatic drugs (DMARDs) and 0.03 on TNF inhibitors+methotrexate (MTX).
Conclusion An approach to estimate the numerical relationship between HAQ and damage as 0.01 HAQ points/TSS unit is presented, although the linear relationship may not be generally valid. This allows the assessment of functional correlates of radiographic changes in trials.
Introduction
Reversal of acute and prevention of chronic disability are major therapeutic goals in rheumatoid arthritis (RA). RA activity elicits signs and symptoms, such as pain, swelling and stiffness, which interfere with the joints' function, while structural joint damage likewise impedes their normal functioning, although this may take a longer time.1 2 Indeed, in early RA, disability, measured by the Health Assessment Questionnaire Disability Index (HAQ-DI), is correlated with disease activity, whereas correlation with joint damage increases with time.3–7 However, disability is also related to other factors, such as age, psychosocial aspects or comorbidities.8–10 Nevertheless, in clinical trial patients attaining stringent remission, who have no activity governing disability, the residual HAQ is closely related to joint damage11 and apparently not importantly influenced by other factors. Thus, the HAQ comprises mainly a reversible activity-related component and an irreversible one due to joint damage. With increasing damage the achievable floor of disability rises, and in longstanding disease this may not allow treatment effects to be distinguished from placebo effects.12
Disease activity can be best assessed using composite indices; joint damage is usually measured by the Total Sharp Score (TSS) and its derivatives.13 However, the numerical relationship between damage and HAQ is still enigmatic. This quantification is the main objective of this study, searching for an answer to the question of what an increase in TSS of 1, 10 or 100 units may mean in terms of units of physical disability.
Patients and methods
Patient datasets
The ‘BEst LIfe with Rheumatoid Arthritis’ (BELIRA) trial
Between 2002 and 2004 we performed an observational trial in patients attending our clinic using 250 mg oral methylprednisolone over 1 week (see supplementary methods). This study, called BELIRA, was designed to address the numerical relationship between joint damage and disability based on glucocorticoid-induced rapid disease activity reduction. X-rays of 77 patients were read by a single author (JTS) according to the modified TSS.14 For the present analyses, these 77 patients were used. Some aspects of BELIRA have been published,15 but results on the primary hypothesis are provided here.
Sources of data from randomised controlled clinical trials
To validate the BELIRA trial findings, we used data from pivotal, double-blind, randomised controlled clinical trials (RCTs) in RA, which were kindly provided by the sponsors of these studies. We received an 80% to 90% random sample of the following trials: ATTRACT (for ‘Anti-TNF Trial in Rheumatoid Arthritis with Concomitant Therapy’) and ASPIRE (for ‘Active-controlled Study of Patients receiving Infliximab for the treatment of RA of Early onset’) studying infliximab16 17; DE019 and PREMIER studying adalimumab18 19; ERA (for ‘Early Rheumatoid Arthritis’) and TEMPO (for ‘Trial of Etanercept and Methotrexate with radiographic Patient Outcomes’) studying etanercept20 21; and trials evaluating leflunomide (LEF).14 22 23 Further details are given in the supplementary methods. Only patients with complete data sets as needed for the present analyses were used.
Core set variables and composite index
For all patients, swollen and tender joint counts (SJC, TJC), patient and evaluator global assessments (PGA, EGA) on 100 mm visual analogue scales (VAS), erythrocyte sedimentation rate (ESR), C reactive protein (CRP), HAQ and modified Sharp or van der Heijde modified Sharp score were available. Using the respective data, we calculated the Simplified Disease Activity Index (SDAI),24 an untransformed score ranging from 0 to about 100 (see supplementary material).
Theoretical approach
Based on several hypotheses, we developed a series of equations enabling approximation of the numerical relationship between TSS units and units of HAQ. First, we reasoned that HAQ change between baseline and end point would be primarily related to the change in disease activity (equation 1) and that effects of a potential increase in joint damage would be negligible or very low over the short course of the BELIRA trial, and even over the 1 year of the RCTs. The change in HAQ seen per change in SDAI was assumed to be linear and constant (equation 2). This relationship can be represented as the factor fACT:
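$$f_{\mathrm{ACT}} = \frac{\Delta\mathrm{HAQ}}{\Delta\mathrm{SDAI}} \qquad (1)$$
where ΔHAQ and ΔSDAI are the changes in HAQ and SDAI between baseline and end point.
Based on these hypotheses, activity-related disability at study end point (ACT-HAQEP) was estimated by multiplying fACT (units of HAQ per SDAI unit) by the SDAI at end point (SDAIEP):
$$\mathrm{ACT\text{-}HAQ}_{\mathrm{EP}} = f_{\mathrm{ACT}} \times \mathrm{SDAI}_{\mathrm{EP}} \qquad (2)$$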
Although physical disability is multifactorial,8 10 25 26 its components could be reasonably simplified for the next step to comprise mainly a damage-related (DAM-HAQ) score and the mentioned activity-related component (ACT-HAQ) (equation 3). The reasons for this simplification were the following: major comorbidities and age over 70 influence the HAQ,25 26 but in the context of the trials studied here major comorbidities had been excluded and mean age ranged from 49 to 58 years, with only few patients above 70 years; other aspects, such as psychosocial and educational factors, may also matter for the HAQ,9 27 but in the current trial dataset patients with no joint damage attained a mean HAQ of 0 when achieving clinical remission,11 which speaks against a major influence of these factors. All these considerations suggested that the mentioned simplification is sound in the populations studied here. Therefore, in the next step, DAM-HAQEP was estimated as:
$$\mathrm{DAM\text{-}HAQ}_{\mathrm{EP}} = \mathrm{HAQ}_{\mathrm{EP}} - \mathrm{ACT\text{-}HAQ}_{\mathrm{EP}} \qquad (3)$$
After calculation of the DAM-HAQ, we estimated a factor linking DAM-HAQ with the structural damage score (TSS); this factor, fDAM, is also assumed to be linear and constant in a given setting (equation 4) and constitutes:
$$f_{\mathrm{DAM}} = \frac{\mathrm{DAM\text{-}HAQ}_{\mathrm{EP}}}{\mathrm{TSS}_{\mathrm{EP}}} \qquad (4)$$
The quantification of fDAM as the link between radiographic progression and worsening of disability is the major aim of the present study. However, since, as broadly discussed earlier, residual disability may partly be related to reasons other than joint damage8–10 25 27 and fDAM may, therefore, overestimate disability if mean values were used, we applied a more conservative approach, selecting the value of the first quartile of the data as the final fDAM. In this situation, where the contribution of other factors is unknown but must be expected to influence the irreversible HAQ, and where a strictly linear relationship as modelled may not be fully valid, the lower quartile was deemed an intuitive and straightforward midpoint between the mean or median, which might exaggerate the relationship, and the minimum or the 10th percentile, which might underestimate it.
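The four steps of this derivation can be sketched in a few lines of Python. This is a minimal illustration under the stated assumptions (fACT is the change in HAQ per change in SDAI, ACT-HAQEP is fACT times SDAI at end point, DAM-HAQEP is the remainder of HAQEP, and fDAM is DAM-HAQEP per TSS unit). The `ArmSummary` container, the function name and the input values in the usage lines are hypothetical and not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class ArmSummary:
    """Group-level means for one trial arm (hypothetical container)."""
    delta_haq: float   # mean HAQ improvement, baseline to end point
    delta_sdai: float  # mean SDAI improvement, baseline to end point
    sdai_ep: float     # mean SDAI at end point
    haq_ep: float      # mean HAQ at end point
    tss_ep: float      # mean Total Sharp Score at end point


def decompose_haq(arm: ArmSummary) -> dict:
    """Apply equations 1-4 to one arm and return all intermediate values."""
    if arm.delta_sdai == 0 or arm.tss_ep == 0:
        raise ValueError("delta_sdai and tss_ep must be non-zero")
    f_act = arm.delta_haq / arm.delta_sdai   # equation 1: HAQ units per SDAI unit
    act_haq_ep = f_act * arm.sdai_ep         # equation 2: activity-related HAQ
    dam_haq_ep = arm.haq_ep - act_haq_ep     # equation 3: damage-related HAQ
    f_dam = dam_haq_ep / arm.tss_ep          # equation 4: HAQ units per TSS unit
    return {
        "f_act": f_act,
        "act_haq_ep": act_haq_ep,
        "dam_haq_ep": dam_haq_ep,
        "f_dam": f_dam,
    }


# Illustrative input only; these numbers are NOT study data.
example = ArmSummary(delta_haq=0.2, delta_sdai=10.0, sdai_ep=15.0,
                     haq_ep=0.9, tss_ep=40.0)
for name, value in decompose_haq(example).items():
    print(f"{name}: {value:.4f}")
```

Keeping the intermediate values in the result makes it easy to check each step against the corresponding column of a results table.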
Analytical implementation
We assessed all variables needed (HAQ at baseline, end point and change; SDAI at baseline, end point and change; and TSS at end point) for application of equations 1–4 to all BELIRA patients (end point: 1 week) and, using group level data, to the patients of each RCT arm (end point: 1 year) of established (ATTRACT, DE019, LEF and TEMPO) and early RA (ASPIRE, ERA and PREMIER).14 16–23
Additional analyses
We tested construct validity by correlating the damage-related component (DAM-HAQ) with the TSS. The equations to derive fDAM were developed from the disease activity perspective; therefore this validation is not circular. In addition, we calculated the increase of functional disability associated with radiographic progression among patients on different types of treatments.
Results
Baseline data used for the current analyses are shown in table 1.
Estimation of fDAM
BELIRA trial
At baseline, the mean SDAI of BELIRA patients was 25.9 ± 13.7 (DAS28: 4.9 ± 4.7) and mean HAQ 1.1 ± 0.7 (table 1). After glucocorticoid treatment, SDAI fell by 11.3 ± 9.9 (DAS28: 1.3 ± 1.0) and HAQ by 0.23 ± 0.13. Thus, according to equation 1, fACT was 0.021 HAQ units per SDAI unit (HAQ/SDAI); with a mean SDAIEP of 14.4, the ACT-HAQEP was 0.3 (table 2). Given that mean HAQEP was 0.93, the DAM-HAQEP was calculated as 0.63 (equation 3). According to equation 4, each TSS point corresponded to an estimated 0.0165 HAQ points (fDAM) on the group level (table 2). Among all individual BELIRA patients who upon glucocorticoid treatment had an improved SDAI and an improved or stable HAQ, the median fDAM, that is, HAQ points related to one Sharp score unit, amounted to 0.013 (quartiles: 0.0108 and 0.0610).
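As a check on the arithmetic, the BELIRA group-level figures quoted above can be chained through the derivation. The ratio of the rounded mean changes gives about 0.020, close to the reported fACT of 0.021; the remaining steps use the reported value. The last line is a back-calculation from the reported fDAM, not a TSS figure stated in the text.

$$
\begin{aligned}
f_{\mathrm{ACT}} &\approx \frac{0.23}{11.3} \approx 0.020 \quad (\text{reported: } 0.021)\\
\mathrm{ACT\text{-}HAQ}_{\mathrm{EP}} &= 0.021 \times 14.4 \approx 0.30\\
\mathrm{DAM\text{-}HAQ}_{\mathrm{EP}} &= 0.93 - 0.30 = 0.63\\
f_{\mathrm{DAM}} = \frac{0.63}{\mathrm{TSS}_{\mathrm{EP}}} = 0.0165 \;&\Rightarrow\; \mathrm{TSS}_{\mathrm{EP}} \approx 38
\end{aligned}
$$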
Analyses on RCTs in patients with longstanding RA
When obtaining the estimated equivalent of HAQ associated with each SDAI point, the mean fACT values were relatively similar across trial arms within and also between the trials on longstanding RA: 0.022 ± 0.004 (range 0.017–0.029). Interestingly, these data were similar to those for BELIRA (fACT=0.021, shown earlier), which also comprised patients with longstanding RA. In accordance with equation 2, these values allowed calculating ACT-HAQEP. While HAQEP was 0.69–1.33 in these patients with longstanding RA, the range of ACT-HAQEP was 0.27–0.61 and of DAM-HAQEP 0.36–0.75 (mean 0.53 ± 0.13).
We then calculated the estimated HAQ values related to 1 TSS unit (fDAM). As shown in table 2, mean fDAM of all clinical trial arms of longstanding RA was 0.013 ± 0.004 (range 0.007–0.018); the median (quartiles) amounted to 0.011 (0.011–0.016). Importantly, these values were again similar to those in BELIRA, confirming these results.
Analyses in patients with early RA
Performing the same analyses for early RA trial arms, mean fACT values were significantly higher than in longstanding RA as shown earlier: 0.029 ± 0.003 (range 0.025–0.034, p=0.001). Consequently, as depicted in table 2, the DAM-HAQEP values were significantly lower in early than in longstanding RA (mean 0.24 ± 0.07, range 0.18–0.33, p<0.0001). Nevertheless, fDAM values were very similar (mean 0.015 ± 0.005, p=NS). Median fDAM in early RA cohorts amounted to 0.016 (0.010–0.018), also very similar to that of the longstanding RA cohort. Because of these similarities, we calculated fDAM across all trials, obtaining a mean of 0.014 ± 0.004 and a median of 0.013 (0.010–0.018).
The DAM-HAQ comprises all disability components that are not activity related. Therefore, it likely includes also other, unmeasured causes of disability.8–10 Consequently, mean or median fDAM in HAQ/TSS units will overestimate the damage/TSS effects on HAQ. To account for such systematic error, which is currently not measurable, we used the values of the first quartile as the final fDAM for the estimation of DAM-HAQ/TSS, amounting to 0.01 for the BELIRA data and the trial arms of the RCTs.
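Taking the first quartile of a set of fDAM estimates can be sketched as follows. The list of values is purely hypothetical (it is not the per-arm data of table 2), and the choice of the "inclusive" quantile method is an assumption, since the text does not state which percentile convention was used; other conventions can give slightly different values for small samples.

```python
import statistics


def conservative_f_dam(f_dam_values):
    """Return the first quartile (25th percentile) of a set of fDAM estimates."""
    if len(f_dam_values) < 2:
        raise ValueError("need at least two estimates to compute quartiles")
    # statistics.quantiles with n=4 returns the three cut points Q1, Q2, Q3.
    q1, _median, _q3 = statistics.quantiles(f_dam_values, n=4, method="inclusive")
    return q1


# Hypothetical per-arm estimates, for illustration only (NOT study data).
hypothetical_f_dam = [0.008, 0.010, 0.012, 0.014, 0.016]
print(f"first quartile: {conservative_f_dam(hypothetical_f_dam):.3f}")
print(f"mean:           {statistics.mean(hypothetical_f_dam):.3f}")
```

The lower quartile sits below the mean and median by construction, which is the sense in which it is the more conservative choice.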
Additional analyses
Correlation of the estimated damage-related HAQ (DAM-HAQEP) with actual joint damage at end point
The ACT-HAQEP equation was not derived using radiographic scores. Therefore, the DAM-HAQEP values, obtained by subtracting ACT-HAQEP from the measured HAQEP, were likewise not derived from radiographic scores. Nevertheless, there was a strong correlation of the DAM-HAQEP with TSS (r=0.829; Pearson correlation coefficient; figure 1C). This was the case for both populations, late (r=0.721) and early RA (figure 1A; r=0.480). In contrast to these considerable correlations for DAM-HAQ, the total HAQEP was well correlated with joint damage only in late (r=0.663) but not early RA (figure 1B; r=−0.291), in line with previous observations.3 These data together suggest that DAM-HAQ has higher correlational validity with joint damage than total HAQ.
Estimation of the increase of DAM-HAQ over time
Using the fDAM of 0.01, we estimated the increase of HAQ due to joint damage over 1 year (figure 2). On tumour necrosis factor (TNF) inhibitors (TNFi) plus methotrexate (MTX), the calculated DAM-HAQ increase was 0.006 ± 0.07 points; on TNFi monotherapy (ERA, PREMIER, TEMPO trials) it was 0.011 ± 0.012; and on synthetic disease-modifying antirheumatic drugs (DMARDs) including MTX (ASPIRE, ERA, PREMIER, TEMPO, LEF trials) it was 0.025 ± 0.012 (p=0.006 compared with TNFi+MTX); on placebo (ATTRACT), however, the calculated increase was as high as 0.067 points. Extrapolating these data to 5 years, the increase of irreversible disability on placebo would be approximately 0.3 HAQ points, 0.125 on synthetic DMARDs and 0.03 on TNFi+MTX. This latter very low increase of irreversible DAM-HAQ is a consequence of the well established dissociation between disease activity and joint damage progression on TNFi+MTX.28–30 Moreover, while the hierarchy of these treatments is established, it has not previously been related numerically to the interference of synthetic or biological DMARDs with the increase of irreversible disability.
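The 5-year figures follow from multiplying the 1-year DAM-HAQ increases by five. The short sketch below reproduces that step; it assumes strictly linear progression (the same increase every year), and the dictionary and function names are illustrative rather than part of the study.

```python
# One-year DAM-HAQ increases as quoted in the text (fDAM = 0.01 already applied).
ANNUAL_DAM_HAQ_INCREASE = {
    "placebo": 0.067,
    "synthetic DMARDs": 0.025,
    "TNFi+MTX": 0.006,
}


def extrapolate_dam_haq(annual_increase: float, years: int = 5) -> float:
    """Linear extrapolation: assumes the same DAM-HAQ increase in every year."""
    if years < 0:
        raise ValueError("years must be non-negative")
    return annual_increase * years


five_year = {
    treatment: round(extrapolate_dam_haq(increase), 3)
    for treatment, increase in ANNUAL_DAM_HAQ_INCREASE.items()
}
for treatment, increase in five_year.items():
    print(f"{treatment}: {increase} HAQ points over 5 years")
```

For placebo this gives 0.335, which the text rounds to approximately 0.3.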
Discussion
Impairment of physical function is governed by various underlying causes.8 10 31 In chronic diseases like RA, signs and symptoms as well as organ (joint) damage constitute major determinants of disability.3 32 While it is impossible to distinguish between these underlying causes using available generic instruments for functional assessment, the relationship between disability and damage increases over time,3 5–7 11 whereas the responsiveness of HAQ to treatment fades because of irreversible joint damage and a raising of the base level that can be reached.12 33
Here, we were able to derive a numerical value for the functional equivalent of joint damage. We started by hypothesising that, on the group level, the change of HAQ would mainly be due to change in disease activity during a short-term therapeutic intervention. By relating change of HAQ to change of SDAI and using SDAI at study end point for further analyses, we estimated the ACT-HAQ at end points. The residual HAQ (non-ACT-HAQ) was assumed related to joint damage (and therefore named DAM-HAQ). The strong correlation (r>0.8) of this mathematically derived DAM-HAQ with the actual TSS suggested that this residual HAQ was, indeed, damage associated. Finally, by relating TSS to DAM-HAQ, we could estimate the HAQ units corresponding to one TSS unit (HAQ/TSS, fDAM). Importantly, using fDAM to calculate DAM-HAQ, the ratio between DAM-HAQ and ACT-HAQ would be highly in favour of ACT-HAQ in the presence of little damage as in early RA or over a short period of time, while it would shift in favour of DAM-HAQ with increasing damage, in line with previous reports on the increasing association of damage with disability over time.3 4
For these analyses, we first evaluated data of an observational study employing glucocorticoids, subsequently performing the same analyses in 16 therapeutic arms of RCTs. The results for fDAM (HAQ/TSS) were similar irrespective of the trial types (observational or RCTs), treatments (glucocorticoids, DMARDs, TNF inhibitors) or patient populations (early or longstanding RA), ranging from 0.007–0.020 HAQ units per TSS unit with a median of 0.0133 (mean: 0.0137). Our method might have overestimated the effect of TSS on DAM-HAQ, because the latter is also influenced by causes other than joint damage.8–10 25 Therefore we obtained a more conservative estimate of fDAM, using the lower quartile of data from BELIRA patients, which corresponded well with the lower quartile obtained in RCT treatment arms, providing the final fDAM of 0.01. Thus, a 10-point increase of TSS, a possible annual change with inadequate treatment, would irreversibly increase the HAQ by approximately 0.1 units (1% of the TSS); over 5 years, the HAQ would increase irreversibly by 0.5. Indeed, trial patients in clinical remission, where activity was not a cause of functional impairment,11 had a (residual) mean HAQ of 0.4 in the highest TSS quartile (TSS range 22.9–278.5; median 49),11 giving a HAQ/TSS of 0.008, well in line with the estimated HAQ/TSS presented here. This also supports the use of the conservative lower quartile.
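The figures in this paragraph can be verified directly, taking fDAM as 0.01 HAQ units per TSS unit, an annual progression of 10 TSS units, and a median TSS of 49 in the highest quartile of the remission patients:

$$
\begin{aligned}
\Delta\mathrm{HAQ}_{1\,\mathrm{year}} &= 0.01 \times 10 = 0.1\\
\Delta\mathrm{HAQ}_{5\,\mathrm{years}} &= 5 \times 0.1 = 0.5\\
\frac{\mathrm{HAQ}}{\mathrm{TSS}}\bigg|_{\mathrm{remission}} &= \frac{0.4}{49} \approx 0.008
\end{aligned}
$$

The 5-year value again assumes that progression continues linearly at the same annual rate.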
Analyses of relationships between TSS and functional impairment by general linear mixed modelling arrived at inconsistent conclusions.4 6 7 In contrast, we obtained our data not by statistical modelling but by stepwise hypothesis-driven mathematical derivation of an activity-related and a damage-related HAQ component and ultimately by relating DAM-HAQ to TSS in order to obtain the HAQ/TSS, the fDAM. Although DAM-HAQ is a latent variable (ie, usually disguised by activity effects and thus not readily measurable), the current approach allowed reproducible estimation of the relationship between actual damage and disability.
Using fDAM, radiographic progression in RA now has a quantifiable functional meaning, allowing economical valuation of radiographic progression.
There are various limitations of our study. One relates to assessing group level data from RCTs rather than individual patient data. We chose this approach for several reasons. First, measures may be erratic on individual levels as shown for HAQ, but this may also be so for composite activity scores, where individual variables may distort the measure.10 31 34–36 Damage scores may also be erratic, since a TSS of 20 can mean 20 joints each affected by a TSS of 1, or 4 totally eroded joints, the latter likely leading to more disability if active disease were absent. Thus, just like some patients with active disease at treatment start may have HAQ scores of 0,11 some patients with high TSSs may have HAQ scores of 0 even in remission. Second, clinical trials report on group level data of radiographic scores; thus, it is of particular importance to understand the consequences of damage on the group level. Moreover, since the majority of patients have no or little radiographic progression irrespective of treatment and highly effective treatments affect only 20% to 30% of the patients differently from less effective treatments,6 37 the results presented relate to group rather than patient levels. Also, radiographic scoring is not performed routinely in clinical practice and, therefore, our results will likely not be applicable to the individual patient. Nevertheless, when we looked at the individual patient data from BELIRA, the lower quartile of HAQ/TSS matched that from the RCT arms.
Another limitation relates to the simplicity of the hypotheses presented and that not all relationships are as linear as assumed. However, at least for x-ray changes, roughly linear progression occurs on MTX treatment or placebo16 17 19 21; for example, in PREMIER changes from baseline to 12 and 24 months, respectively, were 5.7 and 10.4 for MTX, 3.0 and 5.5 for adalimumab and 1.3 and 1.9 for the combination treatment19; TEMPO showed almost linearity for MTX and etanercept monotherapies.21 Moreover, when patients drop out in clinical trials, x-ray progression is usually estimated by linear extrapolation.21 Likewise, the correlation coefficient between joint damage and functional impairment increases in a relatively linear fashion over time.3 5 In line with this, the similarities of the results obtained here across patient populations (in early as well as longstanding RA) and treatments (from placebo to TNFi plus MTX) support the nature of the hypotheses. However, data for structural damage are highly skewed37 and assuming that TSS and HAQ always correspond with each other directly may not be appropriate.
A further limitation relates to the danger of using a factor with a fixed value. This approach bears the risk of overestimating the effects, especially since TSS and its modifications may range to above 400. A fixed HAQ/TSS value of 0.01 by itself would then allow for a HAQ of ≥4, while the HAQ range ends at 3. However, the relationship between TSS and DAM-HAQ likely has a sigmoid shape, since in low TSS ranges joint damage may not affect disability at all, while a maximal effect of damage on HAQ is likely reached already at submaximal TSS levels (putative ceiling effect). Thus, the data shown primarily relate to the TSS range studied here, namely about 11–75. With early institution of effective treatments and rapid switching of treatment, this range is most pertinent for our understanding of therapeutic effects on joint damage and its consequences on physical disability. The relevance of the results for higher scores will have to be investigated in appropriate patient populations. Importantly, the information presented does not claim any kind of exactness that should be used in the care of individual patients, but rather provides a guidance to facilitate interpretation of clinical trial results.
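The ceiling problem described here is easy to demonstrate. The sketch below applies the fixed factor of 0.01 and flags estimates that fall outside the TSS range studied (about 11–75) or exceed the HAQ maximum of 3. It deliberately does not model the putative sigmoid relationship, whose shape is not specified in the text; all names are illustrative.

```python
HAQ_MAX = 3.0                      # upper end of the HAQ scale
F_DAM = 0.01                       # HAQ units per TSS unit (final estimate)
STUDIED_TSS_RANGE = (11.0, 75.0)   # approximate TSS range covered by the data


def linear_dam_haq(tss: float) -> float:
    """Damage-related HAQ under the fixed linear rule of thumb."""
    if tss < 0:
        raise ValueError("TSS cannot be negative")
    return F_DAM * tss


def check_estimate(tss: float) -> dict:
    """Return the linear estimate plus flags showing where it is unreliable."""
    low, high = STUDIED_TSS_RANGE
    estimate = linear_dam_haq(tss)
    return {
        "dam_haq": estimate,
        "in_studied_range": low <= tss <= high,
        "exceeds_haq_scale": estimate > HAQ_MAX,
    }


for tss in (50, 150, 400):
    print(tss, check_estimate(tss))
```

At a TSS of 400 the linear rule yields a HAQ of about 4, beyond the scale maximum, which is why the estimate should be read as applying to the studied range only.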
HAQ, SDAI and TSS constitute continuous variables that are untransformed and unweighted. Using this set of scores may be one of the reasons why we were able to derive a value that has construct, correlational and face validity in various scenarios. The range of the HAQ is 0 to 3, of SDAI 0 to approximately 100 and of TSS 0 to approximately 400/440. Because of the high scores achievable with TSS, the value of HAQ points (0.01) obtained for one TSS point appears very small. However, proportionally it amounts to 0.33% of the maximal HAQ or 4% of a normal HAQ (approximately 0.25) for each TSS point. Interestingly, when testing a similar approach using DAS28 as disease activity measure, the data obtained were inconsistent (not shown), likely due to the transformed and weighted nature of DAS28.36 In contrast, employing the Clinical Disease Activity Index (CDAI), which is similar to SDAI but lacks CRP, gave similar results as for the SDAI (data not shown).
In conclusion, using clinical trial data, it was possible to estimate the level of disability related to one Sharp score unit. This estimate allows us to answer the frequently asked question as to the clinical and functional meaning of TSS increases in trials, and thus enhances the interpretation of trial results. The data presented reveal that for every 10 TSS units the HAQ will increase by 1/10th of a unit. It should be borne in mind that the linearity suggested by this rule of thumb may not be valid under all circumstances. Moreover, this information is not meant to be employed on an individual patient basis; rather, it is helpful for groups of patients. Additionally, it may be particularly pertinent for health planners and pharmacoeconomic analyses. However, when judging joint damage and functional impairment from an economic perspective, the considerable residual disability observed on any treatment, including TNFi+MTX in early RA, is mostly due to residual disease activity38; this has to be taken into account upon therapeutic or economic inferences. Nevertheless, for health planners, and also rheumatologists, these data reinforce the fact that joint damage is still irreversible and that, therefore, the damage-related increase in disability is also irreversible, supporting the necessity to treat RA early and vigorously by appropriate therapeutic strategies to prevent accrual of joint damage. The data presented provide an initial estimate of the relation between joint damage and physical disability, but this might not be the final word on this topic and other estimates and models might be developed in the future.
Acknowledgments
We thank Abbott, Amgen, Centocor, Sanofi-Aventis and Wyeth for kindly providing us with data from their clinical trials. This study is dedicated to the memory of Dr John T Sharp. JTS visited Vienna twice to perform the x-ray readings on one of the trials presented here, but his death did not allow him to perform the radiographic analyses of a final small subset of study patients.
Footnotes
JTS is deceased.
Funding While we received data from the pharmaceutical industry, this study was not funded by any of the companies involved.
Competing interests JSS received honoraria and grant support from Abbott, Centocor, Schering-Plough, Wyeth and honoraria from Amgen and Sanofi-Aventis. DA received honoraria from Abbott, Schering-Plough, Sanofi-Aventis and Wyeth.
Ethics approval This study was conducted with the approval of the individual study centres and Medical University of Vienna.
Provenance and peer review Not commissioned; externally peer reviewed.