Background: Enthesitis is a recommended core domain for assessment of ankylosing spondylitis (AS), but no measurement has yet been validated according to Outcome Measures in Rheumatoid Arthritis Clinical Trials (OMERACT) criteria.
Objective: To validate an enthesitis index for patients with AS according to OMERACT criteria.
Methods: An enthesitis index was validated in two AS patient cohorts: (1) a longitudinal cohort (n = 223) and (2) 22 patients from three Canadian sites participating in a 24-week randomised placebo-controlled trial of adalimumab in AS. Construct validity was evaluated by correlation analysis with the Bath AS Disease Activity Index (BASDAI), the Bath AS Functional Index (BASFI) and quality of life instruments. Reproducibility was assessed by intraclass correlation coefficient (ICC), and responsiveness was assessed by Guyatt’s effect size and standardised response mean.
Results: The most frequently affected sites were the greater trochanter and supraspinatus insertion (∼20%). Patients with enthesitis had significantly greater scores for the BASDAI, BASFI, patient global, AS-specific quality of life index (ASQOL) and the Short Form 36 (SF-36) General Health Survey (p<0.001). The enthesitis score contributed significantly to variance in the BASDAI and BASFI. Interobserver ICCs were 0.96 in the longitudinal cohort and 0.89 and 0.77 in the adalimumab clinical trial cohort (for status and change score, respectively). Significant differences in change scores were evident for all patients after 24 weeks of adalimumab treatment (p = 0.04), this being more significant when a subset of the most commonly affected entheses was analysed (p = 0.01).
Conclusion: AS patients with enthesitis constitute a more severe subset of disease, and the Spondyloarthritis Research Consortium of Canada (SPARCC) Enthesitis Index is feasible and reliable for measurement of this condition. Discrimination requires further study in larger trials.
Enthesitis is a fundamental clinical and pathophysiological feature of spondyloarthritis (SpA) that is evident clinically and using various imaging techniques.1–3 Moreover, the Assessments in SpondyloArthritis International Working Group (ASAS) has recommended that enthesitis be considered a core outcome domain for the evaluation of AS.4 However, there is no agreement on which measure should be used.
The first published instrument for scoring enthesitis required the assessment of 66 entheses graded according to severity of tenderness.5 This instrument is widely regarded as being non-feasible because it is time consuming and not all enthesitic sites are readily identifiable on physical examination (eg, the seventh costochondral junction). Other instruments that have been used only in clinical trials include the Major Enthesis Index,6 which focuses entirely on assessment of entheses in the lower limbs, and a modified Mander Index, which assesses 14 entheses in the spine and lower limbs.7 Another group has developed an instrument based on the Mander Index—the Maastricht AS Enthesitis Score (MASES).8 It comprises the most commonly affected entheses, as noted upon successive examinations in patients with AS. These are the bilateral first and seventh costochondral joints; the anterior and posterior superior iliac spines; the iliac crests; the insertion of the Achilles tendon into the posterior surface of the calcaneus; and the fifth lumbar spinous process. None of these instruments has been entirely validated from the perspective of the essential criteria comprising the Outcome Measures in Rheumatology (OMERACT) filter—truth, discrimination and feasibility.9
Guided by the OMERACT filter, investigators for the Spondyloarthritis Research Consortium of Canada (SPARCC) conducted clinical validation of an enthesitis instrument, for which selection of entheses sites was based primarily on published sonographic and MRI studies.10 11
Selection of entheses according to imaging studies (truth criterion)
The first step in the selection of entheses was based on published findings from two imaging studies. The first study evaluated 2952 entheses from 164 patients with SpA (64 control volunteers and 30 patients with rheumatoid arthritis), using power Doppler ultrasound.10 The most frequent enthesitic sites were the Achilles (79%) and plantar fascia (74%) insertions into the calcaneum; the patellar tendon insertion into the apex of the patella (59%); the greater trochanter (44%); the quadriceps insertion into the superior border of the patella (28%); and the medial (25%) and lateral (24%) epicondyles. The distribution of affected entheses did not vary by SpA subtypes or by whether inflammation was predominantly axial or peripheral. The shoulder was not assessed in this study, although symptoms have been reported in as many as 33% of AS patients.12 13 The second study was a systematic MRI evaluation of the shoulder in patients with AS. This study identified supraspinatus enthesitis in two-thirds of patients with shoulder pain.11
Further development of an enthesitis index focused on clinical evaluation and validation of these 16 sites: the greater trochanter (right/left (R/L)), quadriceps tendon insertion into the patella (R/L), patellar ligament insertion into the patella and tibial tuberosity (R/L), Achilles tendon insertion (R/L), plantar fascia insertion (R/L), medial and lateral epicondyles (R/L) and the supraspinatus insertion (R/L) (box 1). Tenderness at each site was quantified on a dichotomous basis: 0 = non-tender and 1 = tender.
Box 1 Entheses sites comprising the total Spondyloarthritis Research Consortium of Canada (SPARCC) Enthesitis Index
Medial epicondyle (left/right (L/R))
Lateral epicondyle (L/R)
Supraspinatus insertion into greater tuberosity of humerus (L/R)
Greater trochanter (L/R)
Quadriceps insertion into superior border of patella (L/R)
Patellar ligament insertion into inferior pole of patella or tibial tubercle (L/R)
Achilles tendon insertion into calcaneum (L/R)
Plantar fascia insertion into calcaneum (L/R)
Range of scores: 0–16.
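The scoring rule described above (16 sites, each graded 0 = non-tender or 1 = tender, summed to a total of 0–16) can be sketched in a few lines of Python. This is an illustration only, not software used in the study: the site identifiers, the function name `sparcc_score` and the convention that unrecorded sites count as 0 are all assumptions made for the example.

```python
# Site names follow box 1; the identifiers themselves are illustrative.
SPARCC_SITES = (
    "medial_epicondyle",
    "lateral_epicondyle",
    "supraspinatus_insertion",
    "greater_trochanter",
    "quadriceps_insertion_patella",
    "patellar_ligament_insertion",
    "achilles_tendon_insertion",
    "plantar_fascia_insertion",
)
SIDES = ("L", "R")


def sparcc_score(tender):
    """Sum dichotomous tenderness (0 = non-tender, 1 = tender) over 16 sites.

    `tender` maps (site, side) -> 0 or 1. Sites that are not listed are
    counted as 0, which is an assumption of this sketch.
    """
    total = 0
    for site in SPARCC_SITES:
        for side in SIDES:
            value = tender.get((site, side), 0)
            if value not in (0, 1):
                raise ValueError(f"{site} {side}: expected 0 or 1, got {value!r}")
            total += value
    return total


# Hypothetical examination, not patient data from the study.
exam = {
    ("supraspinatus_insertion", "L"): 1,
    ("greater_trochanter", "L"): 1,
    ("greater_trochanter", "R"): 1,
}
print(sparcc_score(exam))  # 3
```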
Clinical assessment of entheses (feasibility criterion)
In an attempt to achieve consensus for and standardisation of clinical evaluation, the approach to clinical assessment of these entheses was demonstrated on a normal patient by the principal investigator (WPM) and discussed by SPARCC investigators and nurse coordinators at an investigators’ meeting for a clinical trial of adalimumab in AS. The sites selected by imaging studies can be readily located anatomically, and most physicians can complete the entire assessment of all 16 entheses within 5 min. The video files demonstrating the approach to clinical assessment can be viewed at www.arthritisdoctor.ca.
Clinical validation studies: patient cohorts and methods
We studied entheses in two cohorts of patients. In the first cohort the frequency and distribution of enthesitis were evaluated in 223 consecutive outpatients followed by rheumatologists in the city of Edmonton, Alberta, Canada, at both tertiary (University of Alberta Hospital) and community-based sites. All patients met the modified New York criteria for AS. Data were systematically collected on patient demographics, and disease status was evaluated every 6 months with the Bath AS Disease Activity Index (BASDAI),14 the Bath AS Functional Index (BASFI),15 the patient global visual analogue scale (VAS 0–10 cm), the Bath AS Metrology Index (BASMI),16 swollen joint count, enthesitis, the Edmonton AS Metrology Index (EDASMI),17 AS-specific quality of life index (ASQOL)18 and the Short Form 36 (SF-36) General Health Survey. One-year follow-up data were available for 72 patients who had not changed treatment, which consisted of physiotherapy and non-steroidal anti-inflammatory drugs (NSAIDs). Data collection for this cohort was approved by the ethics committee of the University of Alberta. Interobserver reliability was assessed in 42 consecutive patients recruited to this cohort. Observers were two clinician nurse coordinators who had received special training in the assessment of enthesitis conducted by the principal investigator (WPM). The order of assessment was randomised between the two observers. Construct validity including data from all 16 entheses was also examined in this cohort.
Interobserver reliability and responsiveness were studied in a second AS patient cohort, which included 22 patients from Edmonton (n = 11), Vancouver (n = 6) and Saskatoon (n = 5) who were enrolled in a randomised controlled trial of adalimumab. This was a substudy of the Canadian AS study (M03-606), a randomised, multicentre, double-blind, placebo-controlled comparison of adalimumab with placebo for the treatment of patients with active AS. Patients were randomised in a 1:1 ratio to receive either adalimumab 40 mg every other week (eow) or placebo during an initial 24-week double-blind period. At week 24, all patients began receiving adalimumab 40 mg eow. The primary endpoint was at 12 weeks. Patients who failed to achieve an ASsessment in Ankylosing Spondylitis International Working Group 20% response (ASAS20) at week 12, week 16 or week 20 could initiate open-label adalimumab 40 mg eow (an early-escape option). Investigators and nurse coordinators from sites participating in the trial attended an investigators’ meeting, at which enthesitis assessment was demonstrated on a normal patient by the principal investigator (WPM). Two observers at each of three sites (Edmonton, Vancouver and Saskatoon) conducted assessments of enthesitis at baseline, and at 12 and 24 weeks. Interobserver reliability was assessed using baseline scores. The mean of baseline, 12-week and 24-week scores for the two observers was used to calculate responsiveness. Entheses comprising both the SPARCC and MASES indices were evaluated. Ethics approval for this study was obtained from the ethics committees at the participating sites.
Descriptive statistics (mean, median and SD) were used to describe the overall distribution of enthesitis scores in the FORCAST (Follow-Up Research Cohort of AS longitudinal study) cohort. Comparisons of disease status for FORCAST patients with and without enthesitis were conducted using t tests (two-tailed). Construct validity was assessed by analysing correlations (Pearson’s for normally distributed data, and Spearman’s rho for non-normally distributed data, two-tailed test) between enthesitis scores and disease activity (BASDAI), function (BASFI), patient global and quality of life (ASQOL). Multiple linear regression was used to assess the contributions of the enthesitis score to the variance in the BASDAI (adjusted for age, sex and disease duration), and to the variance in the BASFI (adjusted for age, sex, disease duration and the BASDAI). In all analyses, two-sided p values <0.05 were considered statistically significant.
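As an illustration of the correlation step, the sketch below computes Pearson's coefficient and Spearman's rho (Pearson's coefficient applied to average ranks, so that tied scores such as the many zero enthesitis scores are handled). It is a minimal sketch under stated assumptions: the function names and the example values are hypothetical, p values are not computed, and nothing here reproduces the software actually used for the analyses.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical enthesitis scores and BASDAI values, not study data.
enthesitis = [0, 0, 2, 5, 1, 0]
basdai = [2.1, 3.0, 5.5, 7.2, 4.0, 2.8]
print(round(spearman(enthesitis, basdai), 2))
```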
Reproducibility was calculated using analysis of variance (ANOVA) for partitioning of the variance to estimate the intraclass correlation coefficient (ICC). The ICC for interobserver reliability was estimated from the two measurements recorded by two different observers for the same patient. A two-way mixed-effects ANOVA model was employed, and the fixed factor designated was the observer. ICC values >0.6, 0.8 and 0.9 represented good, very good and excellent reproducibility, respectively. Reliability of assessment of individual entheses was assessed using kappa statistics (unweighted Cohen’s kappa).
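The two reliability statistics can be sketched as follows. The ICC function partitions variance by two-way ANOVA and returns the single-measure coefficient under the consistency definition; the text does not state whether a consistency or an absolute-agreement form was used, so that choice is an assumption of this sketch. The kappa function is the standard unweighted Cohen's kappa. Function names and example ratings are hypothetical.

```python
def icc_consistency(scores):
    """Single-measure ICC from a two-way ANOVA (consistency definition).

    `scores` is a list of rows, one per patient, each holding one score
    per observer.
    """
    n = len(scores)      # number of patients
    k = len(scores[0])   # number of observers
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Partition the total sum of squares into patients, observers and error.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two observers' categorical ratings.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    categories = set(a) | set(b)
    expected = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    return (observed - expected) / (1 - expected)


# Hypothetical tenderness ratings (1 = tender) at one site for ten patients.
observer_1 = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
observer_2 = [1, 0, 0, 0, 1, 0, 0, 0, 1, 0]
print(round(cohen_kappa(observer_1, observer_2), 2))  # 0.52
```

In this example the observers agree on 8 of 10 patients, but because most ratings are "non-tender" the chance agreement is 0.58, which is why kappa (0.52) is well below the raw agreement.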
Guyatt’s effect size was used to assess responsiveness at the 12-week primary endpoint in the adalimumab clinical trial cohort and was calculated by dividing the mean change in the adalimumab group by the SD of the change in the placebo group. Values of 0.20, 0.50 and ⩾0.80 were considered to represent small, moderate and large degrees of responsiveness, respectively. The standardised response mean was used to assess responsiveness at 24 weeks since a majority of placebo patients (9 of 13) received early-escape, open-label adalimumab between 12 and 24 weeks. Differences between pretreatment and post-treatment scores were assessed by the paired t test, and treatment group differences in change scores at 12 weeks were analysed by the unpaired t test.
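The two responsiveness statistics follow directly from the definitions above and can be sketched as below. The change scores are hypothetical, change is taken as follow-up minus baseline (so negative values indicate improvement and the magnitude is what is compared with the 0.20, 0.50 and 0.80 thresholds), and the use of the sample SD (n − 1 denominator) is an assumption, since the text does not specify it.

```python
from statistics import mean, stdev


def guyatt_effect_size(treated_changes, placebo_changes):
    """Mean change in the treated group / SD of change in the placebo group."""
    return mean(treated_changes) / stdev(placebo_changes)


def standardised_response_mean(changes):
    """Mean change / SD of change within the same group."""
    return mean(changes) / stdev(changes)


# Hypothetical change scores (follow-up minus baseline), not trial data.
adalimumab = [-2, -1, -3, 0, -2]
placebo = [0, 1, -1, 0, 0]

print(round(guyatt_effect_size(adalimumab, placebo), 2))   # -2.26
print(round(standardised_response_mean(adalimumab), 2))    # -1.4
```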
We analysed data from 223 patients in the FORCAST cross-sectional cohort (table 1). Enthesitis in one or more of these 16 sites was clinically documented in a total of 80 patients (35.9%). The most common sites were the supraspinatus and greater trochanter (25.1% and 19.6% of all patients, respectively) (table 2). Involvement of only one enthesis was noted in 27 patients (12.1%), and involvement of two or more entheses was noted in 53 patients (23.8%). Peripheral joint synovitis occurred significantly more frequently in those with versus those without enthesitis (21.3% vs 10.9%, p = 0.03). When compared with patients without any peripheral inflammation (neither enthesitis nor synovitis), patients with enthesitis included statistically significantly more females (p<0.001), had a significantly greater frequency of hip involvement (Δ = 18.1%; odds ratio (OR) = 2.6 (95% CI 1.1 to 4.0), p = 0.02), had significantly greater scores for the BASDAI, BASFI, patient global and ASQOL (all p<0.001) and lower scores for the SF-36 Physical and Mental Component Summary scores (both p<0.001) (table 1). Moreover, statistically significantly more patients with enthesitis had a BASDAI score of ⩾4, compared with those without any peripheral inflammation (80% vs 53%, respectively; p<0.001).
One-year follow-up data were available for 72 of 223 patients (32.3%), all of whom were on standard therapies (table 2). The mean change (SD) in the total enthesitis score from baseline was –0.07 (1.97) (range –12 to 4) (p>0.05). Increases in enthesitis score were recorded for 14 patients (19.4%), no changes for 42 patients (58.3%) and decreases for 16 patients (22.2%). Enthesitis developed in 21% of patients with zero scores at baseline, and at 1 year it was no longer present in 39.3% of patients who had enthesitis at baseline. There was little change in the distribution of enthesitis; the supraspinatus (18.1%) and the greater trochanter (13.9%) were still the most frequently involved sites.
Substantial correlations between the total enthesitis score and scores for the BASDAI, BASFI, ASQOL, patient global and presence of hip involvement were observed (table 3). Moreover, a significant correlation was noted between the total enthesitis score and Item 4 of the BASDAI, the self-reported item that measures the degree of discomfort from areas tender to touch or pressure. The total enthesitis score contributed significantly to the variance in the BASDAI (adjusted for age, sex and disease duration) (r² = 13%, p<0.001); and to the variance in the BASFI (adjusted for age, sex, disease duration and the BASDAI) (r² = 3%, p<0.001).
We first studied interobserver reliability between two trained clinician nurses for a subset of 42 consecutive patients from the longitudinal cohort. The ICC for the enthesitis score was 0.96 (p<0.001) (table 4). For the 22 patients at the three sites participating in the clinical trial cohort, interobserver ICC for the status score at baseline was 0.89, and interobserver ICC for the 12-week change score was 0.77. Reliability for the MASES was comparable for status score but lower for change score. Interobserver kappa values for assessment of individual entheses at baseline varied from 0.22 to 0.80; assessment of the left supraspinatus enthesis was the most reliable and assessment of the left quadriceps insertion into the patella was the least reliable (data not shown). For the most frequently affected sites (ie, the supraspinatus and the greater trochanter), mean kappa values (left and right) were 0.80 and 0.54, respectively.
A non-significant reduction in SPARCC enthesitis score was recorded at 12 weeks for the 9 patients randomised to adalimumab, and this decreased further by 24 weeks (p = 0.04 vs baseline) (table 5). Four of these patients received early-escape, open-label treatment from 12–24 weeks. No significant treatment group differences were evident at 12 weeks, and Guyatt’s effect size indicated a low level of responsiveness for both the SPARCC and MASES indices. Significant reduction in enthesitis score (p = 0.01) was evident by 24 weeks for the 13 placebo patients, of whom 9 received early-escape, open-label adalimumab therapy from 12–24 weeks.
We re-analysed data on responsiveness in two sets of exploratory analyses (table 5). In the first set, we analysed data for only those entheses for which enthesitis was present in ⩾10% of patients in the FORCAST prospective cohort (ie, L/R supraspinatus, L/R greater trochanter, L/R Achilles tendon, L/R plantar fascia, SPARCC 8/16 score). Guyatt’s effect size at 12 weeks was greater with this more limited index than for the total SPARCC Enthesitis Index. The standardised response mean (SRM) at 24 weeks was also greater than for the total SPARCC Enthesitis Index for patients randomised to adalimumab.
For the second set of analyses, we evaluated only those entheses that discriminated between treatment groups for the clinical trial patients at 12 weeks (ie, L/R greater trochanter, L/R Achilles tendon, L/R plantar fascia, SPARCC 6/16 score). Guyatt’s effect size at 12 weeks was greater with this index than for the total SPARCC Enthesitis Score, as was the SRM at 24 weeks for patients randomised to adalimumab.
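The two reduced indices used in these exploratory analyses are simply the same dichotomous sum restricted to a subset of sites, which the sketch below makes explicit. The site identifiers, the function name `subset_score` and the example examination are hypothetical; only the composition of the 8-site and 6-site subsets is taken from the text.

```python
SIDES = ("L", "R")

# Subsets named in the exploratory analyses; identifiers are illustrative.
SPARCC_8 = ("supraspinatus", "greater_trochanter", "achilles_tendon", "plantar_fascia")
SPARCC_6 = ("greater_trochanter", "achilles_tendon", "plantar_fascia")


def subset_score(tender, sites):
    """Sum 0/1 tenderness over left and right for the given sites only.

    `tender` maps (site, side) -> 0 or 1; unrecorded sites count as 0
    (an assumption of this sketch).
    """
    return sum(tender.get((site, side), 0) for site in sites for side in SIDES)


# Hypothetical examination, not patient data from the study.
exam = {
    ("supraspinatus", "L"): 1,
    ("greater_trochanter", "R"): 1,
    ("medial_epicondyle", "L"): 1,  # not part of either reduced index
}
print(subset_score(exam, SPARCC_8))  # 2
print(subset_score(exam, SPARCC_6))  # 1
```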
We demonstrate that an enthesitis index that assesses enthesitis at 16 sites is both feasible and reproducible. The selected entheses are readily localised and often evaluated in routine clinical practice. Analysis of construct validity demonstrated a strong association between enthesitis and both disease activity and functional disability. Moreover, patients with enthesitis define a category with more severe disease. Responsiveness to change was demonstrable after 24 weeks for patients receiving antitumour necrosis factor (TNF) therapy. Preliminary data suggest that an index based on a more limited subset of entheses that are more commonly inflamed in AS may be even more responsive.
Our clinical data indicate that the greater trochanter and supraspinatus entheses were the most frequently affected, which differs from the findings of previous sonographic examinations indicating that the Achilles and plantar fascia insertions into the calcaneum were most frequently affected.10 However, the latter report did not study the supraspinatus enthesis and demonstrated a major discrepancy between the frequency of sonographic and clinical involvement of entheses. By clinical examination, the most frequently detected enthesitis was at the patellar tendon insertion (25%) and the greater trochanter (24%), while Achilles (13%) and plantar fascia (16%) involvement was much less common. With the exception of patellar insertion enthesitis, this is consistent with our observations. This finding primarily reflects a lack of sensitivity of clinical examination. However, sonography is operator dependent, and relevant expertise is not readily available.
Our finding that enthesitis defines a subset of patients with more severe disease has not been reported previously. The association with hip involvement is also novel. A previous longitudinal study reported more severe disease for AS patients with peripheral arthritis, although enthesitis was not assessed.19 This is why comparisons of patients in the longitudinal cohort were conducted between those with enthesitis but without swollen joints and those without any peripheral inflammation whatsoever. In fact, 80% of patients with enthesitis would probably meet current inclusion criteria for clinical trials of anti-TNF therapies, which typically require a BASDAI ⩾4. It is not known whether enthesitis may also be a risk factor for structural damage progression,20 and this should be evaluated through further study.
Estimates of reliability obtained in these analyses may be conservative, since formal standardisation and training of observers in the adalimumab trial were not conducted. Only two studies have assessed the reliability of enthesitis instruments. One study demonstrated good intraobserver reliability for the Mander Index, as assessed by a single observer at each of four investigating sites, with ICC scores ranging from 0.84 to 0.96.21 On the other hand, it was not possible to demonstrate any treatment group differences in a large placebo-controlled trial of infliximab in AS using the Mander Index, suggesting that discrimination was poor because of low interobserver reliability.22 In a second study, all reported enthesitis instruments were compared for interobserver reliability in a Latin square design that included 20 observers with special expertise in SpA and 19 patients (10 with psoriatic arthritis, 9 with AS).23 The SPARCC Enthesitis Index compared favourably with other instruments for the assessment of both AS and psoriatic arthritis.
Because of the relatively small patient numbers, estimation of responsiveness in this study should be regarded as preliminary. Not surprisingly, an index based on a more limited number of entheses that are more commonly inflamed in patients with AS was more discriminatory than both the total SPARCC Enthesitis Index and the MASES index. This would be consistent with data from the pivotal trial of adalimumab in AS, for which treatment group differences in enthesitis scores were evident using the MASES, a scale that selects entheses from the Mander Index according to their frequencies of involvement in patients with AS.24 The complete Mander Index, on the other hand, did not discriminate between treatment groups in a placebo-controlled trial of infliximab.22 The modified Mander Index developed in San Francisco also discriminated between patients receiving etanercept and patients receiving placebo.7 Comparative studies are necessary to determine the relative responsiveness of each instrument in different SpA patients.
In conclusion, we have validated an index of enthesitis in AS patients according to the three essential criteria of the OMERACT filter: truth, discrimination and feasibility. The selected entheses are those that are most familiar to clinicians in routine clinical practice and reflect findings of imaging studies. Analysis of construct validity indicates that patients with enthesitis constitute a patient subset with more severe disease. Further, most of the patients in this important subset meet current criteria for anti-TNF therapy. Further analyses are required to assess discrimination in controlled clinical trials.
Lead author WPM is a senior scholar of the Alberta Heritage Foundation for Medical Research. The authors thank Rebecca Hill and Helene Ostojic for their assistance with implementation of this work. They also thank Michael Nissen, ELS, of Abbott Laboratories for his editorial assistance in the development of this manuscript.
Description of clinical evaluation of the SPARCC Enthesitis Index
At each site, pressure was exerted using the dominant thumb until the nail bed blanched. With the patient’s arm flexed at the elbow and extended posteriorly, the supraspinatus enthesis was palpated directly inferior to the acromion. The medial and lateral epicondyle entheses were palpated just distal to the bony prominence of the epicondyles. The greater trochanter enthesis was palpated with the patient lying on the side, uppermost leg flexed at the hip and knee, and uppermost knee resting on the examination couch. The insertion of the quadriceps into the superior pole of the patella is readily palpable. The insertion of the patellar ligament into the inferior pole of the patella and the tibial tubercle, as well as the Achilles tendon and plantar fascia insertions into the calcaneum, were all readily palpable. The video files demonstrating the approach to clinical assessment may be viewed at www.arthritisdoctor.ca.
Competing interests: The study was fully funded by Abbott Laboratories. RLW is an Abbott employee.
Ethics approval: Ethics approval was obtained from the ethics committee of the University of Alberta and from ethics committees at the participating sites.