Primary Sjögren's syndrome (SS) is a systemic disorder primarily characterised by lymphocytic infiltration of exocrine glands, resulting in functional impairment of salivary and lachrymal glands. However, the inflammatory process extends beyond the exocrine glands and can potentially affect any organ. Therefore, along with dryness features, systemic manifestations such as synovitis, vasculitis, and skin, lung, renal and neurological involvement may occur.
As a result, clinical features can be divided into two facets: (1) benign but disabling patient symptoms, such as dryness, pain and fatigue, that affect almost all patients and (2) systemic, potentially severe manifestations that affect 20–40% of patients. For evaluation of both disease facets, two disease activity indexes have been recently developed by the European League Against Rheumatism (EULAR) SS task force: the EULAR SS Patient Reported Index (ESSPRI)1 for patients’ symptoms (figure 1) and the EULAR SS Disease Activity Index (ESSDAI)2 for systemic features (table 1). Preliminary data from the development study3 and from a small randomised controlled trial suggested that ESSDAI had a good sensitivity to change.4 However, to date, these two indexes had not been prospectively validated in a large cohort to assess their psychometric properties and, in particular, their sensitivity to change.
This study aimed to validate the EULAR indexes, ESSDAI and ESSPRI, for assessment of primary SS in a large international multicentre cohort of patients, by analysing their psychometric properties (ie, validity, reliability and sensitivity to change) and comparing them with other existing tools: SS disease activity index (SSDAI),5 Sjögren's Systemic Clinical Activity Index (SCAI),6 Profile of Fatigue and Discomfort (PROFAD) and Sicca Symptoms Inventory (SSI).7 ,8
Patients and methods
Between May 2009 and July 2011, 395 patients from 14 countries (Argentina, Brazil, France, Germany, Greece, Italy, Japan, The Netherlands, Norway, Slovenia, Spain, Sweden, UK and USA) were included by 30 investigators experienced in primary SS who were participating in this international EULAR collaborative project (project code CLI 010).
To be included, patients had to fulfil American-European Consensus Group (AECG) criteria.9 Additionally, investigators were asked to include approximately half the patients with systemic features. Patients were prospectively followed, and two visits were planned: at inclusion and at 6 months. No therapeutic intervention was planned in this observational study, and therapeutic management was left to the discretion of the treating physician. This study was conducted with the approval of the institutional review board of GHU Paris Nord (IRB0006477). Depending on local rules, ethical approval was obtained in other countries whenever necessary. In each country, local ethical requirements were observed.
Disease activity indexes
At enrolment and at 6 months, physicians completed the ESSDAI, the SCAI and the SSDAI, and assessed systemic disease activity with a 0–10 physician global assessment (PhGA) scale. They also evaluated, separately, the severity of patients’ symptoms with a 0–10 scale (PhGA of patient symptoms).
At 6 months, physicians also had to evaluate the change in disease activity by answering the question ‘Compared with the previous visit, is this patient’s primary Sjögren's Syndrome activity now…’ according to a 5-point Likert scale (much worse, worse, the same, better, much better). Three groups of patients were defined according to change in disease activity: (1) improved, if considered ‘better’ or ‘much better’; (2) stable, if considered ‘the same’ and (3) worsened, if considered ‘worse’ or ‘much worse.’
At enrolment and at 6 months, all patients completed the ESSPRI (mean score of 0–10 numerical scales for pain, fatigue and dryness features, including oral, ocular and global dryness), SSI, PROFAD questionnaires and a 0–10 patient global assessment (PGA).
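As an illustration of how a mean-of-scales score such as the ESSPRI can be computed, the following is a minimal sketch. It assumes one 0–10 rating per domain (dryness, fatigue and pain) with equal weights; the function name and the example ratings are hypothetical and are not study data.

```python
from statistics import mean


def esspri(dryness: float, fatigue: float, pain: float) -> float:
    """Return the mean of three 0-10 numerical scales (equal weights assumed)."""
    scores = (dryness, fatigue, pain)
    for score in scores:
        if not 0 <= score <= 10:
            raise ValueError("each scale must lie between 0 and 10")
    return mean(scores)


# Hypothetical patient: dryness 7, fatigue 6, pain 5
example_score = esspri(dryness=7, fatigue=6, pain=5)
```

Because the score is a simple mean, it stays on the same 0–10 range as its component scales.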
At 6 months, patients also had to evaluate the change in their state by answering the question ‘Compared to the beginning of the study (6 months ago) how do you evaluate the severity of your Sjögren's syndrome now …’ according to a 5-point Likert scale (very importantly improved, importantly improved, slightly improved, no change, worsened). Three groups of patients were defined according to change in symptom state: (1) improved, if considered ‘very importantly’, ‘importantly’ or ‘slightly improved’; (2) stable, if considered ‘no change’ and (3) worsened, if considered ‘worsened’.
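The grouping of the physician and patient Likert answers into three anchor groups can be sketched as a simple lookup. The dictionaries below restate the answer categories given in the text; the function and variable names are hypothetical.

```python
# Physician question: 5-point scale collapsed into three groups
PHYSICIAN_GROUPS = {
    "much worse": "worsened",
    "worse": "worsened",
    "the same": "stable",
    "better": "improved",
    "much better": "improved",
}

# Patient question: 5-point scale collapsed into three groups
PATIENT_GROUPS = {
    "very importantly improved": "improved",
    "importantly improved": "improved",
    "slightly improved": "improved",
    "no change": "stable",
    "worsened": "worsened",
}


def change_group(answer: str, rater: str) -> str:
    """Map a Likert answer to 'improved', 'stable' or 'worsened'."""
    tables = {"physician": PHYSICIAN_GROUPS, "patient": PATIENT_GROUPS}
    if rater not in tables:
        raise ValueError("rater must be 'physician' or 'patient'")
    return tables[rater][answer.strip().lower()]
```

Note that the two scales are asymmetric: the patient scale has three grades of improvement but a single grade of worsening, whereas the physician scale has two grades of each.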
Objective measures of dryness
At enrolment and at 6 months, objective measures of dryness included Schirmer's dye scores of both eyes, which were considered abnormal if ≤5 mm in 5 min, and unstimulated salivary flow (USF), which was considered abnormal if ≤0.15 mL/min.
Definition of systemic involvement
Systemic involvement was recorded as the presence and/or a past history of the following manifestations: arthritis, myositis, purpura, peripheral or central nervous system, pulmonary or renal involvement, lymphoma or other B-cell proliferative disorder. All these items were prospectively collected in a standardised case report form. Glandular swelling was not considered as a systemic involvement, and was recorded separately.
Continuous data are presented as medians with IQR. We used non-parametric tests to analyse continuous variables because the data were not normally distributed.
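For readers who want to reproduce this style of summary, a minimal sketch of a median with IQR follows. It uses Python's standard library and the "inclusive" quartile method, which is an assumption: statistical packages differ in their default quartile definitions, so results may not match SAS or R exactly. The scores shown are hypothetical.

```python
from statistics import median, quantiles


def median_iqr(values):
    """Return (median, first quartile, third quartile) of a sample."""
    # quantiles() sorts the data itself; n=4 yields the three quartile cut points
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


# Hypothetical scores, not study data
summary = median_iqr([10, 2, 6, 4, 8])
```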
Correlations between scores
Construct validity was assessed by the correlation between the disease-specific indexes and their respective gold standard: the PhGA for systemic disease activity indexes and the PGA for patient-centred measures. To assess convergent and divergent validity of disease-specific indices, Spearman's r correlation coefficients were computed between all disease activity scores and all patient scores. Higher correlations should be observed between scores measuring the same construct (convergent validity), whereas lower correlations might be observed between scores measuring different constructs (divergent validity).
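Spearman's coefficient is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below implements that definition with the standard library only; the function names are hypothetical and the example scores are invented for illustration.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical index scores and global assessments for five patients
r = spearman([2, 5, 9, 12, 20], [1, 3, 2, 6, 8])
```

Working on ranks rather than raw values is what makes the coefficient suitable for scores that are not normally distributed.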
Reliability of scores
Reliability was assessed on a subsample of patients with the intraclass correlation coefficient (ICC),10 as follows:
For physician measures: inter-rater reliability was assessed between the scores of two physicians who independently assessed the same patient on the same day.
For patient measures: intra-rater reliability was assessed between two completions of the same score by the same patient, performed 2 days apart, without any therapeutic modification.
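The text does not state which form of the ICC was used. As one possible illustration, the sketch below computes the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1); this choice is an assumption, as are the function name and the example ratings.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a table with one row per patient and one column per rating."""
    n = len(ratings)        # number of patients
    k = len(ratings[0])     # number of ratings per patient (raters or occasions)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical scores given by two physicians to five patients
reliability = icc_2_1([[4, 5], [10, 9], [2, 2], [15, 13], [7, 8]])
```

This form penalises systematic differences between raters as well as random disagreement, so it is stricter than a consistency-only ICC.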
Evaluation of sensitivity to change and accuracy in detection of change
The sensitivity to change was assessed between the baseline and the 6-month visits. Since no therapeutic intervention was systematically applied to the study population, the disease activity, or patient's symptoms, might improve, worsen or stay the same. Therefore, to evaluate the accuracy of the patient's and physician's indexes in detecting change, sensitivity to change was evaluated in each subgroup of patients considered as (1) improved, (2) stable, or (3) worsened. Evaluation of change (improvement, worsening or stability), used as an external anchor, was assessed by patients for patient scores, and by physicians for disease activity measures. Sensitivity to change was assessed with the standardised response mean (SRM), which is the mean change in score between two visits divided by the SD of the change in score.13 If indexes correctly detected changes, SRMs should be (1) <0 for patients with improved condition, (2) around 0 for patients with stable condition and (3) >0 for patients with worsened condition. Therefore, the larger the absolute value of the SRM for improved/worsened disease activity, the greater the sensitivity to change of the instrument. Absolute SRM values can be considered large (>0.8), moderate (0.5–0.8) or small (<0.5).14–16 An SRM closer to zero, when disease activity is unchanged, indicates that the assessment of stability is more accurate.
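The SRM definition above translates directly into a few lines of code. This sketch assumes that change is computed as follow-up minus baseline (so that improvement on a score where higher means worse gives a negative SRM) and that the SD is the sample SD; the function name and the example scores are hypothetical.

```python
from statistics import mean, stdev


def srm(baseline, follow_up):
    """Standardised response mean: mean change divided by the SD of the change."""
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow-up must be paired")
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)  # stdev() is the sample SD (n - 1)


# Hypothetical scores for four patients rated as improved
value = srm(baseline=[10, 12, 8, 14], follow_up=[6, 9, 7, 9])
```

Because the denominator is the SD of the change rather than of the baseline score, the SRM rewards consistent change across patients, not just a large average change.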
We compared the SRMs for the SSDAI and SCAI to that of the ESSDAI, and the SRMs of SSI and PROFAD to that of ESSPRI. The estimation of CI and comparisons of SRMs were performed using bootstrapping methods, with 1000 replications.12
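A bootstrap comparison of two SRMs can be sketched as follows. The sketch resamples patients with replacement, keeps each patient's two change scores paired, and reports a percentile 95% interval for the difference in SRMs. The percentile method, the fixed seed, the function names and the example data are all assumptions; the text specifies only that bootstrapping with 1000 replications was used.

```python
import random
from statistics import mean, stdev


def srm_from_changes(changes):
    """SRM computed directly from a list of change scores."""
    return mean(changes) / stdev(changes)


def bootstrap_srm_difference(changes_a, changes_b, replications=1000, seed=0):
    """Percentile 95% CI for SRM(a) - SRM(b) using a paired bootstrap."""
    if len(changes_a) != len(changes_b):
        raise ValueError("both indexes must be measured in the same patients")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(changes_a)
    differences = []
    for _ in range(replications):
        # Resample patients, keeping each patient's pair of changes together
        picked = [rng.randrange(n) for _ in range(n)]
        a = [changes_a[i] for i in picked]
        b = [changes_b[i] for i in picked]
        if stdev(a) == 0 or stdev(b) == 0:
            continue  # skip degenerate resamples where the SRM is undefined
        differences.append(srm_from_changes(a) - srm_from_changes(b))
    if len(differences) < 2:
        raise ValueError("too few valid bootstrap resamples")
    differences.sort()
    lower = differences[int(0.025 * len(differences))]
    upper = differences[max(int(0.975 * len(differences)) - 1, 0)]
    return lower, upper


# Hypothetical change scores for two indexes in the same ten patients
index_a = [-4, -3, -1, -5, -2, -6, -3, -4, -2, -5]
index_b = [-1, 0, 1, -2, 0, -1, 1, -1, 0, -2]
interval = bootstrap_srm_difference(index_a, index_b)
```

If the resulting interval excludes zero, the two indexes differ in responsiveness at roughly the 5% level under this approach.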
For all statistical analyses, a p value less than 0.05 was considered statistically significant. All statistical analyses involved use of SAS release V.9.3 (SAS Institute, Cary, North Carolina, USA) and R release 2.2.7 (The R Foundation for Statistical Computing, Vienna, Austria) statistical software packages.
Of the 395 patients, 145 (37%) had current systemic manifestations and 251 (64%) had either current or past systemic manifestations. Their characteristics are reported in table 2. The median ESSDAI score was 6 (IQR=2–12). The most frequently involved domains were the biological (n=219 (55.4%)), articular (n=136 (34.4%)), glandular (n=113 (28.6%)) and haematological (n=95 (24.1%)) domains. The median ESSPRI score was 6 (IQR=4.3–7.3). Dryness was considered the most important symptom by 178/392 (45.4%) patients, whereas fatigue (either physical or mental) and pain were considered a priority by 143/392 (36.5%) and 71/392 (18.1%) patients, respectively. Three hundred and fifty patients (88.6%) were followed until the 6-month visit.
Comparison of systemic and non-systemic patients
Patients with current systemic complications had significantly higher systemic disease activity scores (SSDAI, SCAI, ESSDAI and PhGA) than patients without current systemic complications (table 3).
Patients with current systemic complications also had higher ESSPRI and PROFAD scores than patients without current systemic complications, whereas the SSI dryness score did not differ between groups (table 3).
Correlation between scores
Construct validity and convergent validity
EULAR scores had higher correlations with their respective gold standard than other scores (correlation of ESSDAI with PhGA: r=0.59; correlation of ESSPRI with PGA: r=0.70) (table 4). ESSPRI also correlated well with the other patient scores (with SSI: r=0.59; with PROFAD: r=0.68). Correlations between ESSDAI and the other systemic scores were lower (with SCAI: r=0.33; with SSDAI: r=0.39).
Correlations between patient and systemic scores were very low (r ranging from 0.07 to 0.29). Correlation between ESSDAI and ESSPRI was low (r=0.20), as was the correlation between change in ESSDAI and change in ESSPRI (from the baseline to the 6-month visit) (r=0.14).
Reliability was assessed in a subgroup of 47 patients for systemic scores and 62 patients for patient scores. ICC was >0.85 for all scores, and reliability was considered as very good for all scores (table 4).
Sensitivity to change
Since no therapeutic change was made for most of the patients, the majority of them had stable disease (either for systemic activity or for patient symptoms). Changes in patients’ symptoms and in disease activity are represented in figure 2. All systemic scores had similar large responsiveness in improved patients (table 3). Responsiveness of patient scores was low in patients experiencing improvement of their symptoms, but was significantly higher for ESSPRI compared to SSI and PROFAD (p=0.006 and 0.049 for SRM comparisons).
This study is the first to compare psychometric properties, including responsiveness, of all existing disease-specific indexes in a large cohort of primary SS patients. The results showed that the EULAR scores and all other disease-specific scores were highly reproducible. Also, sensitivity to change of all systemic scores was similar and large in patients whose disease improved. By comparison, sensitivity to change in patient scores was low, reflecting the stability of these symptoms over time. However, among the measures used, the ESSPRI had a significantly higher sensitivity to change than SSI and PROFAD. Also, we found that correlations between systemic activity scores and patient-reported scores were low, attesting that these two components are different facets of the disease.
Our study has some limitations. First, no therapeutic intervention was systematically applied to the study population, which might have affected the evaluation of sensitivity to change. However, since to date no treatment has clearly been shown to improve patient features, this approach was neither unethical nor illogical. Furthermore, this limitation was addressed by the use of an external anchor to assess whether the patient's condition had changed. Second, since we required physicians to include approximately half the patients with systemic complications, the population of this study may include more systemic patients than a real-life patient cohort. However, we believe that this is more a strength than a limitation, since the objective of this study was to validate the ESSDAI systemic score and to compare it to previous systemic scores. Additionally, our results of responsiveness are in accordance with those previously found in patient profiles3 and in real patients treated with rituximab.4
As previously shown,17 we found that correlations between patient and systemic scores were low. However, we found that patients with systemic complications had higher levels of symptoms, as reflected by higher ESSPRI and PROFAD scores but similar levels of dryness. Also, patient scores demonstrated a low sensitivity to change, lower than previously shown in a small study involving patients treated with rituximab.4 This could be linked to the fact that the study protocol did not require a systematic therapeutic intervention. However, using the same methodology (with an external anchor to assess change), we did manage to show that systemic scores had large sensitivity to change. It is therefore likely that systemic scores are much more sensitive to change than patient scores, reinforcing the value of using a systemic score to detect improvement of patient status. This observation has major implications for the conduct of clinical trials, particularly when assessing the effect of immunosuppressive therapy. First, this might explain why previous trials failed to demonstrate any effect of immunosuppressive or biologic treatments.18 ,19 Also, when evaluating such treatments, it might be preferable to assess their effect on the systemic manifestations for which they have been prescribed.
The next steps in the methodological improvement of the therapeutic evaluation of primary SS patients will be to define the thresholds of disease activity states and to suggest a minimal disease activity level as an entry criterion for future trials. Defining the minimal clinically important improvement (MCII) and deriving response criteria would also help to assess the effectiveness of a new treatment over placebo. This work is ongoing.
In conclusion, this study, which was performed in a new large cohort of patients from different continents, allows the validation of ESSDAI and ESSPRI. Both scores had good construct validity and had correlations with PhGA and PGA that were, respectively, higher than those of other scores. All disease-specific scores were highly reliable. Systemic and patient scores correlated poorly, confirming that they are two complementary components that should both be evaluated, but separately. Even though ESSPRI had a significantly better sensitivity to change than other patient scores, sensitivity to change of patients’ scores was low. Systemic scores had much larger sensitivity to change in patients whose disease activity improved. All these results, along with ongoing work on the definition of activity thresholds and response criteria, will considerably help in designing clinical trials for therapeutic evaluation in primary SS.
We thank Maxime Dougados, Alan Tyndall, Iain McInnes and Daniel Aletaha for their guidance and support. We thank the EULAR house in Zurich for their hospitality and outstanding organisation (Ernst Isler, Anja Schönbächler and their associates). We also thank physicians who helped in patient recruitment (John Hamburger and Andrea Richards, Birmingham Dental Hospital & School, Saaeha Rauz, Academic Unit of Ophthalmology, University of Birmingham, Birmingham UK; Emmanuel Chatelus, Strasbourg University Hospital, Strasbourg, France, Frederic Desmoulins, Bicêtre University Hospital, Le Kremlin Bicêtre, France, Petra Meiners, University Medical Center Groningen, Groningen, The Netherlands, Roberto Gerli, Perugia University Hospital, Perugia, Italy), and all members of the EULAR Sjogren's Task force.