Background Validity of European Scleroderma Study Group (EScSG) activity indexes currently used to assess disease activity in systemic sclerosis (SSc) has been criticised.
Methods Three investigators assigned an activity score on a 0–10 scale for 97 clinical charts. The median score served as gold standard. Two other investigators labelled the disease as inactive/moderately active or active/very active. Univariate and multivariate linear regression analyses were used to identify the variables predicting the ‘gold standard’, to determine their weights and to derive an activity index. The cut-off point of the index best separating active/very active from inactive/moderately active disease was identified by receiver operating characteristic (ROC) curve analysis. The index was validated on a second set of 60 charts assessed by three different investigators on a 0–10 scale and defined as inactive/moderately active or active/very active by two other investigators. One hundred and twenty-three patients were investigated for changes over time in the index and their relationships with those in the summed Medsger severity score (MSS).
Results A weighted 10-point activity index was identified and validated: Δ-skin=1.5 (Δ=patient assessed worsening during the previous month), modified Rodnan skin score (mRss) >18=1.5, digital ulcers=1.5, tendon friction rubs=2.25, C-reactive protein >1 mg/dL=2.25 and diffusing capacity of the lung for CO (DLCO) % predicted <70%=1.0. A cut-off ≥2.5 was found to identify patients with active disease. Changes in the index paralleled those of MSS (p=0.0001).
Conclusions A preliminarily revised SSc activity index has been developed and validated, providing a valuable tool for clinical practice and observational studies.
- Systemic Sclerosis
- Disease Activity
- Autoimmune Diseases
The assessment of patients with systemic sclerosis (SSc) should address different disease aspects: diagnosis and fulfilment of classification criteria, extent of organ involvement, activity (the reversible part of the disease process), damage (the irreversible part of the disease process), prognosis prediction, outcome and response to treatment.1 Defining disease activity in SSc cannot be done using a single variable and it is challenging for a number of reasons: first, patients can present with an indolent course, irrespective of whether or not they belong to either of the two disease subsets, that is, diffuse cutaneous SSc (dcSSc) and limited cutaneous SSc (lcSSc);2–6 second, SSc flares can be difficult to separate from quiescent disease;1 third, the two main morphological manifestations of the disease (interstitial fibrosis and vascular occlusion) may reflect both activity and damage and, finally, validated biological markers reflecting disease activity are still lacking.7
In 2001, the European Scleroderma Study Group (EScSG), in an analysis of the clinical charts of 290 patients from 19 European SSc centres, identified 11 disease activity variables and developed a preliminary activity index.8,9 The construct validity of the index was first verified by the jackknife technique (ie, assessing the dispersion of the correlation coefficients calculated by removing one patient at a time),10 and then confirmed by calculating the correlations between the index and the rank of disease activity assigned by four experts to 30 charts, selected to represent different degrees of disease activity.11
This index was subsequently endorsed by European Scleroderma Trials and Research group (EUSTAR) and has been used to assess disease activity in 149 studies.12 Its criterion validity is supported by its correlation with the physician global assessment of activity of the Canadian Scleroderma Research Group (CSRG),13 its association with antitopoisomerase 1 titre14 and its role as the main predictor of the scleroderma phenotype (presenting a higher procollagen transcription) of skin fibroblasts from patients with SSc.15
However, it has some limitations due to the procedure underlying its development. In fact, most patients had long disease duration and the number of missing values was high. Moreover, the face validity of hypocomplementaemia has been questioned since complement fixation is not thought to be important in SSc.16,17 Finally, there was no validation in an independent sample. Here, we present the results of a EUSTAR study aimed at revising the original activity index in order to improve and validate it.
Materials and methods
The study coordinator selected 97 clinical charts from patients included in the EUSTAR database.5 The selection aimed to identify patients who fulfilled the 1980 American College of Rheumatology (ACR) criteria for the classification of SSc,18 were followed in SSc referral centres (in order to reduce the number of missing values) and represented one of the following disease subgroups: early dcSSc, late dcSSc, early lcSSc and late lcSSc. Early disease was defined as a disease duration ≤3 years from the onset of the first non-Raynaud symptom.19
Three clinical investigators (YA, CPD and OK-B) from centres other than those from which the patients' charts were derived assigned a disease activity score on a 0–10 scale to each chart. The reliability of this scoring system was assessed by the evaluation of the intraclass correlation coefficient (ICC). The median disease activity score was used as the ‘gold standard’ to identify items significantly associated with disease activity.
To this end, the study coordinator selected, from the items listed in the EUSTAR chart,5 the following 19 thought to have face validity as activity variables: (1) anamnestic: Δ-skin, Δ-vascular, Δ-heart/lung worsening (ie, worsening, as evaluated by the patient during the month before enrolment, of skin induration, Raynaud's phenomenon and/or digital ischaemic ulcers and dyspnoea and/or palpitations, respectively); (2) clinical: active digital ulcers, modified Rodnan skin score (mRss), tendon friction rubs (TFR), muscle weakness, arthritis; (3) laboratory: C-reactive protein (CRP) elevation, erythrocyte sedimentation rate (ESR, mm/hour), hypocomplementaemia, creatine kinase (CK) elevation, proteinuria; (4) functional and imaging: systolic pulmonary arterial pressure (sPAP) and pericardial effusion at echocardiography; ground glass and lung fibrosis at lung high-resolution CT (HRCT); forced vital capacity (FVC); diffusing lung capacity for carbon monoxide-single breath (DLCO).
Subsequently, we performed a univariate linear regression analysis to search for significant associations between each of the selected items and the median disease activity score given by the three experts. Cut-off values for sPAP, FVC and DLCO were derived from literature.20–22
The items significantly associated with the gold standard in univariate analysis were all entered in a multivariate linear regression analysis to identify the set of variables independently associated with the ‘gold standard’. For the remaining two continuous variables (mRss and ESR), we tested a number of candidate cut-off points, both to identify the one most significantly associated with the gold standard (highest R²; lowest p) and to construct a model with the highest sum of sensitivity and specificity. Each variable found to be significantly associated in multivariate analysis was assigned a weight corresponding to its β coefficient, rescaled in order to construct a 10-point weighted activity index.
Two other investigators (LC and GV), who were unaware of the values assigned on the 0–10 scale, evaluated each of the 97 charts as inactive (corresponding to no need to change treatment and requiring a follow-up after 6 months to 1 year), or moderately active (corresponding to no need to change treatment and a 3 to 6 monthly follow-up), or active (needing treatment intensification and 1 to 3 monthly follow-up), or very active (requiring hospitalisation for active disease). The reliability of this system was assessed by the evaluation of Cohen's κ. The charts that had received a discordant evaluation by the two investigators were returned to them for reassessment in order to reach an agreement.
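To make the agreement statistic concrete, the following minimal Python sketch computes Cohen's κ for two raters. The function name and the ten example ratings are hypothetical and are not data from the study; the sketch only illustrates how observed agreement is corrected for chance agreement.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same charts."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of charts")
    n = len(rater_a)
    # Proportion of charts on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings for 10 charts (illustration only, not study data).
rater_1 = ["active"] * 4 + ["inactive"] * 6
rater_2 = ["active"] * 3 + ["inactive"] * 7
kappa = cohens_kappa(rater_1, rater_2)
print(round(kappa, 3))
```

In this toy example the raters agree on 9 of 10 charts, but part of that agreement is expected by chance, so κ is lower than the raw agreement of 0.9.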
For each patient, the overall disease activity was calculated by summing the scores of the new index. The cut-off value presenting the highest sum of sensitivity and specificity in separating patients with active/very active disease from those with inactive/moderately active condition was identified by receiver operating characteristic (ROC) curve analysis.
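The cut-off selection described here can be sketched in a few lines of Python. Maximising the sum of sensitivity and specificity is equivalent to maximising Youden's J statistic (sensitivity + specificity − 1). The function name and the example scores and labels are hypothetical; the sketch assumes that a chart is called active when its index is greater than or equal to the candidate cut-off.

```python
def best_cutoff(scores, is_active):
    """Return (cutoff, sensitivity, specificity) maximising sensitivity + specificity.

    A chart is classified as active when its score is >= cutoff.
    """
    n_pos = sum(is_active)
    n_neg = len(is_active) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one active and one inactive chart")
    best = None
    # Every observed score is tried as a candidate cut-off.
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, a in zip(scores, is_active) if a and s >= cutoff)
        tn = sum(1 for s, a in zip(scores, is_active) if not a and s < cutoff)
        sens = tp / n_pos
        spec = tn / n_neg
        # On ties, the lowest cut-off is kept.
        if best is None or sens + spec > best[1] + best[2]:
            best = (cutoff, sens, spec)
    return best


# Hypothetical index values and expert labels (illustration only, not study data).
scores = [0.0, 1.0, 1.5, 2.0, 2.5, 2.5, 3.0, 4.0, 5.5]
is_active = [False, False, False, False, False, True, True, True, True]
cutoff, sens, spec = best_cutoff(scores, is_active)
print(cutoff, sens, spec)
```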
The new index was validated on 60 patients recruited from the same database and selected in order to satisfy the following aspects: (1) fulfilling 2013 ACR/European League Against Rheumatism (EULAR) criteria for the classification of SSc;23 (2) belonging to patients recruited at SSc centres with SSc expertise in whom capillaroscopy had been performed and the pattern defined according to Cutolo et al;24 (3) being representative of one of the following disease subgroups, as described above: early dcSSc, late dcSSc, early lcSSc and late lcSSc. The 60 charts were assessed for disease activity on a 0–10 scale by three different investigators (PC, AH and JP) and defined as inactive, moderately active, active and very active by two other investigators (MB and EH), all of whom were unaware of the derived index. The reliability of the two scoring systems was assessed by ICC and κ statistics, respectively.
Changes of the index over time and its relationships to summed Medsger severity score
In order to validate the index further, we assessed the changes in the activity index detected in patients from either the derivation or the validation cohort at a follow-up visit at least 6 months later and compared them with the changes in the summed Medsger severity score (MSS),25 which is a validated measure of disease severity in observational studies.26 We adopted this approach on the grounds that severity reflects both activity and damage and that, since damage is irreversible, a change in severity can only depend on changes in the activity part of the disease process.
Results
Table 1 lists the main epidemiological, serological and clinical features of the 97 patients considered in the derivation part of this study. All the patients also satisfied 2013 ACR/EULAR criteria. The ICC among the activity scores given by the three clinical experts was 0.786, indicating that either the median or the mean value could be considered consistent measures of disease activity and supporting the use of one of them as a gold standard.
Table 2 lists the items, out of the 19 selected, that were associated with the median value of the three 0–10 scores in univariate linear regression analysis.
It is notable that no finding detected at HRCT of the lung was associated with the gold standard. However, the extent of lung involvement is not defined in the EUSTAR chart.
With single exceptions (eg, TFR for lcSSc), the same items were also associated with the gold standard in each of the two subsets as well as in early and late disease (data not shown). After a number of attempts, we identified an mRss>18 and an ESR>50 mm/hour as the cut-off points most significantly associated with the gold standard (highest R² and lowest p), that is, the value corresponding to the highest association of the variable in univariate analysis. These values were entered in multivariate analysis along with the other items found to be associated with the gold standard.
Table 3 lists the items found to be associated with the gold standard in multiple regression analysis and the respective weight that was assigned depending on the β values of the regression model, in order to construct a 10-point index.
Since an mRss lower than 18 can also reflect active disease, the final index uses a formula in which the highest possible value of the item is attained at the cut-off point found to be most significantly associated with the gold standard. In detail, each mRss score lower than 18 contributes to the overall activity score according to the following formula: mRss score×0.084.
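As an illustration, the weighted items reported in the abstract and the mRss formula above can be combined in a short Python sketch. The function and argument names are hypothetical. Two details are assumptions rather than statements from the study: the mRss contribution at or below the cut-off is capped at the item weight of 1.5, and an mRss of exactly 18 is handled by the formula (18×0.084=1.512, capped to 1.5).

```python
def revised_activity_index(delta_skin, mrss, digital_ulcers, tfr,
                           crp_mg_dl, dlco_pct_predicted):
    """Sum the weighted items of the revised index (maximum 10 points).

    delta_skin, digital_ulcers and tfr are booleans; mrss is the modified
    Rodnan skin score; crp_mg_dl is C-reactive protein in mg/dL;
    dlco_pct_predicted is DLCO as a percentage of the predicted value.
    """
    if mrss < 0:
        raise ValueError("mRss cannot be negative")
    score = 0.0
    if delta_skin:                      # patient-assessed skin worsening, last month
        score += 1.5
    if mrss > 18:
        score += 1.5
    else:
        # Assumption: the proportional contribution is capped at the item weight.
        score += min(mrss * 0.084, 1.5)
    if digital_ulcers:
        score += 1.5
    if tfr:                             # tendon friction rubs
        score += 2.25
    if crp_mg_dl > 1.0:
        score += 2.25
    if dlco_pct_predicted < 70.0:
        score += 1.0
    return score


# Hypothetical patient: mRss 10, digital ulcers, DLCO 65% of predicted.
example = revised_activity_index(False, 10, True, False, 0.4, 65.0)
print(round(example, 2))
```

For this hypothetical patient the items contribute 0.84 (mRss), 1.5 (digital ulcers) and 1.0 (DLCO), giving a total of 3.34.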
The assessment of disease activity on the Likert scale was the same (ie, either inactive/moderately active or active/very active) in 91 patients and differed in 6 (Cohen's κ=0.851). In these six patients, an agreement was reached after a re-evaluation of each chart that had given a discrepant evaluation. Out of the 97 patients with SSc, 57 were considered to be inactive to moderately active and 40 active to very active (table 4).
Figure 1 shows the ROC curve exploring the best cut-off value discriminating between inactive/moderately active disease (no treatment change needed) and active/very active disease. A value ≥2.5 had the maximal sum of sensitivity (80.0%; 95% CI 64.4 to 90.9) and specificity (91.2%; 95% CI 80.7 to 97.1) and was used to validate the index in the validation cohort.
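The reported sensitivity and specificity can be checked against the group sizes given above (40 active/very active and 57 inactive/moderately active patients). In the Python sketch below, the counts of correctly classified patients (32 and 52) are not stated in the text; they are back-calculated here as the only whole numbers consistent with the reported percentages, and should be read as an assumption.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from the four cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)   # active patients with an index at or above the cut-off
    specificity = tn / (tn + fp)   # inactive patients with an index below the cut-off
    return sensitivity, specificity


# Derivation cohort: 40 active/very active, 57 inactive/moderately active.
# tp=32 and tn=52 are back-calculated from the reported percentages (assumption).
sens, spec = sensitivity_specificity(tp=32, fn=8, tn=52, fp=5)
print(f"{sens:.1%} {spec:.1%}")
```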
Table 5 shows the epidemiological, serological, capillaroscopic and clinical features of the additional set of 60 patients selected for the validation cohort. Out of the 60 patients, 47 also satisfied 1980 ACR criteria for the classification of the disease.
The scores given by the three raters were significantly correlated (ICC=0.749). Moreover, their median values were significantly correlated with the respective calculated indices (r=0.772, 95% CI 0.644 to 0.857; p<0.0001). The early capillaroscopic pattern was associated with the gold standard (R²=0.07; p=0.029). Nevertheless, adding it to the other items (Δ-skin, digital ulcers, mRss>18, TFR, CRP>1 mg/dL, DLCO<70% of the predicted value) in multivariate regression analysis did not improve the performance of the index.
The evaluations on the Likert scale were consistent (ie, either inactive/moderately active or active/very active) in 46 patients and differed in 14 (Cohen's κ=0.525). In these 14 patients, an agreement was reached. Out of the 60 patients, 37 were considered to be inactive to moderately active and 23 active to very active. An index ≥2.5 identified active/very active disease as defined by MB and EH with a 73.9% (95% CI 51.6 to 89.8) sensitivity and 78.3% (95% CI 61.8 to 90.2) specificity. Performing the validation process in the 47 patients also satisfying 1980 classification criteria gave very similar results. In this cohort, an EScSG activity index ≥3 identified active disease with 52.2% sensitivity and 89.1% specificity.
Changes in the index over time and its relationships to summed MSS
A follow-up visit made after 6–38 months (median 13) was available in 123 out of the 157 patients from the derivation and validation cohorts. The calculated index was unchanged in 36 patients, decreased in 59 and increased in 28. The changes in the activity index were significantly correlated with those in the MSS in the 123 patients with a follow-up visit (r=0.330; 95% CI 0.162 to 0.479, p=0.0002), pointing to a significant relationship between the index and the course of disease severity. In particular, at baseline, 43 out of the 123 patients had an activity index ≥2.5. Of these, 22 had an activity index <2.5 at the end of follow-up; 18 of them experienced a decrease (≥1 point) in the severity score and 4 had a stable severity score. On the other hand, among the remaining 80 patients with a baseline activity index <2.5, 8 had an activity index ≥2.5 at the end of follow-up; in 5 of them the severity score increased by ≥1 point and in 3 it remained stable.
Discussion
Using the multinational EUSTAR database, we have identified a preliminarily revised set of weighted items correlated with disease activity in patients with SSc. The 2001 EScSG study8,9 was based on the analysis of 290 patients, most of whom had longstanding disease, and was affected by a high number of missing values, resulting in a low number of patients evaluable for most items. In order to overcome these limitations, we only relied on charts from centres with a large and scientifically supported expertise and included a high proportion of patients with early disease.
The 97 patients selected for the derivation cohort present some aspects that deserve to be discussed. First, 21 out of the 48 patients with lcSSc were anti-Scl-70 positive. Differences in the prevalence of anti-Scl-70 positivity among patients from different geographical regions have long been known: 29% of French patients with lcSSc versus 15% of American patients.27 Since all our patients came from European centres, this is an expected result. Second, two patients with dcSSc were anticentromere antibody (ACA) positive. However, this figure does not differ from the 5% prevalence of ACA in dcSSc reported by Steen.28 Third, five patients with lcSSc presented TFR. Again, TFR have been detected in 5% of patients with lcSSc, supporting the absence of any derived generalisability issues.29
The revised EUSTAR activity index differs from the original EScSG index in several aspects.
Hypocomplementaemia and arthritis were not associated with disease activity in the present study, even in univariate analysis. The role of hypocomplementaemia in assessing SSc activity has been widely debated.16,17 Hudson et al30 investigated 321 patients from the CSRG and found that hypocomplementaemia was significantly associated with inflammatory myositis and vasculitis, and concluded that it may identify a subgroup of patients with SSc who have overlap disease. These data suggest that some patients enrolled in the EScSG study31 were affected by SSc (all of them satisfied the 1980 ACR criteria)18 in overlap with other autoimmune systemic rheumatic diseases. This aspect might also justify the exclusion of arthritis.
The revised EUSTAR index contains TFR and increased serum CRP. TFR were associated with diffuse disease and reduced survival in 1301 patients with SSc.32 This item was predictive of worsening of skin fibrosis and scleroderma renal crisis in the EUSTAR cohort.32 CRP levels were increased in early disease and were associated with activity, skin, lung, kidney disease and poor survival in 1043 patients with SSc from the CSRG Registry.33
Similarly to the EScSG activity index, the revised EUSTAR index contains mRss, digital ulcers and DLCO. mRss reflects the degree of skin sclerosis and has long been considered a measure of disease activity in SSc.34 One could argue that a decreasing mRss (eg, from 24 to 18) might represent a reduced disease activity. Nevertheless, the persistence of defined skin sclerosis is not consistent with inactive disease. Digital ulcers are clearly related to vascular disease activity and have been recently found to predict the occurrence of new digital ulcers during follow-up and to be associated with cardiovascular morbidity and decreased survival.35 A decreased DLCO can depend on both vascular and interstitial lung disease. In the absence of pulmonary hypertension, however, it has been found to provide the best overall estimate of HRCT-measured lung fibrosis.36
Similarly to the EScSG activity index, the revised EUSTAR index contains Δ-factors (namely Δ-skin). Δ-items had been criticised because they can fail to capture persistent activity and are influenced by depression.16 Recently, however, patient assessment has been reported to be significantly correlated with mRss, the Short Form 36 health survey physical component and skin involvement in the last month.37 In any case, the present index is less influenced by Δ-items, which represented 45% of the 2001 index compared with 15% of the present one.
In the present study, three patients of the derivation cohort and one of the validation cohort had previously presented with scleroderma renal crisis, preventing the use of the revised EUSTAR activity index in that context.
Following the publication of the EScSG activity criteria, several attempts have been made to identify a set of criteria with an improved performance. Diaconu et al38 asked 6 SSc experts to evaluate 40 charts completed by clinical investigators from Nijmegen; 20 patients had early disease, not yet satisfying 1980 ACR classification criteria,18 and 20 had established disease. They derived an unweighted eight-item index (scleroderma, mRss, fatigue, exertional dyspnoea, DLCO, musculoskeletal symptoms, ESR and digital ulcers), performing similarly to the EScSG activity index9 in patients with either early or late disease. Furthermore, Minier et al39 identified two activity indexes (a 12-point extended index including Δ variables and a simplified 8.5-point index devoid of them) by investigating 131 consecutive patients at enrolment and 1 year later. These patients were assessed using a standardised protocol including HRCT of the lung and echocardiography. The authors confirmed the good construct validity of the original EScSG activity index and found a very good correlation both at baseline and after 1 year between both the extended and the simplified score and the original EScSG activity index.
The SSc activity index reported herein represents a step forward with respect to the EScSG activity index.9 Unlike the EScSG activity index, it was validated on an independent cohort. Moreover, the lower number and value of Δ-factors as well as the exclusion of disputable items such as hypocomplementaemia give it a greater face validity. In addition, the greater sensitivity detected in the validation cohort is valuable in better characterising the series investigated in observational studies. Finally, the revised EUSTAR activity index was found to parallel MSS over time.
Our study has some limitations. First, the evaluation of predefined EUSTAR charts did not allow capturing either the extent of lung involvement, which has been found to be related to disease activity,40 or any change in laboratory, physical, physiological or radiological parameters, preventing any consideration of the changes of parameters like FVC/DLCO, Δ-fibrosis at lung HRCT or acute-phase reactants. This aspect may have prevented the inclusion of these items. In that regard, however, one should consider the possible unavailability of some previous values and the need to assess disease activity at the first patient visit. Second, no relevant biomarker was investigated. This limitation could be approached in the future by a collaborative multicentre study including the assessment of parameters not included in the EUSTAR chart. Finally, the lower specificity with respect to the EScSG activity index in the validation cohort requires a careful evaluation in the clinical setting, for example, the patient with a respiratory infection presenting high CRP and low DLCO, who would be considered active according to the index, but is suffering from an unrelated condition.
In conclusion, the revised EUSTAR activity index is feasible, presents face, construct and content validity, and represents a step forward with respect to the so far widely used EScSG activity index. Future collaborative, prospective studies are needed to further improve its performance.
Handling editor Tore K Kvien
Contributors Design of the study: GV and YA. Acquisition of data: GV, MI, UAW, VKJ, PC, LC, CPD, OD, EH, AH, OK-B, JP, UM-L, GR, JA, MF, SJ, TM, ES, VHO, SV and YA. Data interpretation and analysis: GV, MI, UAW, VKJ, MB, PC, LC, CPD, OD, EH, AH, OK-B, JP, UM-L, GR and YA. Drafting and revisiting the manuscript: GV, MI, UAW, VKJ, PC, LC, CPD, OD, EH, AH, OK-B, JP, UM-L, GR and YA. Final approval of the manuscript: GV, MI, UAW, VKJ, MB, PC, LC, CPD, OD, EH, AH, OK-B, JP, UM-L, GR, JA, MF, SJ, TM, ES, VHO, SV and YA.
Competing interests GV has received research funding in the area of systemic sclerosis from Abbvie, Actelion, Bayer, BMS, Merck SD, Pfizer and Roche. CPD has been a consultant to Roche, GSK, Actelion, Inventiva, CSL Behring, Takeda, Merck-Serono, MedImmune and Biogen. He has received research grants from Actelion, GSK, Novartis and CSL Behring. AH has undertaken consultancy work and received speaker's fees and research funding from Actelion. She has undertaken consultancy work for Apricus. OD has/had a consultancy relationship and/or has received research funding in the area of systemic sclerosis and related conditions from 4 D Science, Actelion, Active Biotec, BMS, Boehringer Ingelheim, EpiPharm, BiogenIdec, Genentech/Roche, GSK, Inventiva, Lilly, Medac, Pfizer, Serodapharm, Sinoxa, Ergonex, Pharmacyclics and Sanofi. In addition, OD has a patent mir-29 for the treatment of systemic sclerosis licensed. YA has/had consultancy relationship and/or has received research funding in relationship with the treatment of systemic sclerosis from Actelion, Bayer, Biogen Idec, Bristol-Myers Squibb, Genentech/Roche, Inventiva, Medac, Pfizer, Sanofi/Genzyme, Servier and UCB. JP has consulted for Actelion, Bayer, BMS, Merck, Pfizer and Roche.
Patient consent Obtained.
Ethics approval IRB approval was obtained previously as EULAR/EUSTAR data were used that were collected previously on patients who had signed informed consent.
Provenance and peer review Not commissioned; externally peer reviewed.