Objectives To evaluate construct validity, interpretability, reliability and responsiveness, and to determine cut-off points for good and poor health, within the original English version and the 18 translations of the disease-specific Assessment of Spondyloarthritis international Society Health Index (ASAS HI) in 23 countries worldwide in patients with spondyloarthritis (SpA).
Methods A representative sample of patients with SpA fulfilling the ASAS classification criteria for axial (axSpA) or peripheral SpA was used. The construct validity of the ASAS HI was tested using Spearman correlation with several standard health outcomes for axSpA. Test–retest reliability was assessed by intraclass correlation coefficients (ICCs) in patients with stable disease (interval 4–7 days). In patients who required an escalation of therapy because of high disease activity, responsiveness was tested after 2–24 weeks using standardised response mean (SRM).
Results Among the 1548 patients, 64.9% were men, with a mean (SD) age 42.0 (13.4) years. Construct validity ranged from low (age: 0.10) to high (Bath Ankylosing Spondylitis Functioning Index: 0.71). Internal consistency was high (Cronbach’s α of 0.93). The reliability among 578 patients was good (ICC=0.87 (95% CI 0.84 to 0.89)). Responsiveness among 246 patients was moderate-large (SRM=−0.44 for non-steroidal anti-inflammatory drugs, −0.69 for conventional synthetic disease-modifying antirheumatic drug and −0.85 for tumour necrosis factor inhibitor). The smallest detectable change was 3.0. Values ≤5.0 have balanced specificity to distinguish good health as opposed to moderate health, and values ≥12.0 are specific to represent poor health as opposed to moderate health.
Conclusions The ASAS HI proved to be valid, reliable and responsive. It can be used to evaluate the impact of SpA and its treatment on functioning and health. Furthermore, comparison of disease impact between populations is possible.
- ankylosing spondylitis
- outcomes research
Spondyloarthritis (SpA) is characterised by inflammation and new bone formation in the axial skeleton and joints.1 Patients with SpA suffer from axial and peripheral symptoms resulting in pain, spinal stiffness, sleep problems and fatigue.2–4 Peripheral manifestations (arthritis, dactylitis or enthesitis) and extra-articular manifestations such as uveitis, psoriasis and inflammatory bowel disease may add to the burden of disease in a substantial number of patients but are less well studied.5 6 The course of disease varies, but many patients experience functional disability and limitation in activities and social participation. The influence of the disease on health-related quality of life and functional status has been well characterised in patients with ankylosing spondylitis (AS), and to a lesser extent for patients with non-radiographic axial SpA (nr-axSpA), but there are little data relating to patients with peripheral SpA (pSpA).4 6
The Assessment of Spondyloarthritis international Society Health Index (ASAS HI) has been developed to measure functioning and health in patients with SpA with the aim of defining and comparing the impact of the disease and health in this patient group.7 Initial phases in the development of the ASAS HI focused on investigating functional impairments from the patients’ perspective using both qualitative and quantitative approaches.
The biopsychosocial model of disease proposed by the International Classification of Functioning, Disability and Health (ICF) was used as the basis for the development of the ASAS HI. The ICF is accompanied by a classification of categories, called factors, that allow description of functioning, disability and health in individuals in a systematic and inclusive way.8 The comprehensive ICF Core Set for AS is a disease-specific selection of the ICF factors that are typical and relevant for AS, and has served as the underlying construct of the ASAS HI since the whole range of functioning, disability and health of patients with AS was captured.9 Patients, rheumatologists and methodologists were involved in the further reduction of categories using qualitative and quantitative methods and resulting finally in the ASAS HI.7 The 17 dichotomous items of the ASAS HI address aspects of pain, emotional functions, sleep, sexual functions, mobility, self-care and community life representing a wide spectrum of different levels of functioning, disability and health in patients with SpA. The sum score of the ASAS HI ranges from 0 to 17, with a lower score indicating a better health status. Preliminary validity and feasibility (time of completion) have already been assessed in a field test during the final steps of the development phase.7 Cognitive debriefing was undertaken in patients with AS and nr-axSpA and patients with peripheral manifestations aiming at assessment of a broad impact of health on all patients with SpA.7 10 The ASAS HI was originally developed in parallel in English-speaking countries (Australia, Canada, Ireland, UK, USA), and it has later been translated and cross-culturally adapted into 18 languages worldwide.10
The objective of the current paper is to evaluate construct validity, interpretability, reliability and responsiveness, and to determine cut-off points for good and poor health, within the original English version and the 18 translations of the disease-specific ASAS HI in 23 countries worldwide.
A cross-sectional international observational study with a longitudinal component for reliability and responsiveness of the ASAS HI was performed in 23 countries during 2014 and 2015.
A representative sample of patients with SpA fulfilling the ASAS classification criteria for either axial (axSpA) or pSpA were recruited.11 12 Each centre was asked to recruit a sample of patients, 80% of whom were to have axSpA and 20% pSpA with no more than 10% of all recruits having coexistent psoriasis. Of the axSpA subset, 40% were to have nr-axSpA and 60% AS. There was a target of 50–100 recruits per country to reach an overall sample size of 1700 patients. The aim was to include patients with a broad range of disease severity and a variety of treatments. Patients with severe concomitant diseases that may influence their functional status were excluded from participation together with patients who were unable to understand the objectives of the study or the various questionnaires. Centres were asked to include at least 25% of their sample in the reliability arm and 25% in the responsiveness arm. All centres received approval from their local ethics committee. Written informed consent was obtained from all respondents prior to the start of their participation.
Demographic and clinical information was collected including age, gender, predominant presentation, presence of extra-articular manifestations, years of education and employment. C reactive protein levels, imaging results and current medications were also recorded. Physician’s judgement of patients’ overall functioning and health was assessed by a single global question (“Please score the overall status of the subject’s signs and symptoms and the functional capacity of the subject”) on a 0–10 numerical rating scale (NRS) (10 representing severe impairment) and a Likert scale (“How do you rate the health of your patient today?”) on a 4-point scale ranging from very poor to very good. Physician’s opinion on the level of disease activity was recorded by answering the question “How active was the spondyloarthritis of your patient during the last week?”.
Patients completed a series of self-reported questionnaires: ASAS HI,7 Bath Ankylosing Spondylitis Disease Activity Index (BASDAI),13 Bath Ankylosing Spondylitis Functioning Index (BASFI),14 EuroQol five dimensions questionnaire (EQ-5D-5L index and thermometer),15 Short Form Survey Instrument 36-Item (SF-36),16 Hospital Anxiety and Depression Scale (HADS),17 work productivity and activity impairment questionnaire (WPAI)18 and pain and spinal pain NRS (0–10 NRS; 10 representing severe pain). Patient’s opinion on the level of disease activity was recorded by a single patient global question (“How active was your rheumatic disease on average during the last week?”) on an NRS (0–10) and on the health status (“How do you rate your health today?”) on a 4-point Likert scale ranging from very poor to very good. Based on collected data, the Ankylosing Spondylitis Disease Activity Score (ASDAS) sum score was calculated and patients were categorised into ASDAS status groups.19 20 EQ-5D index was calculated using the value set for UK except for France, Germany, Netherlands, Spain, Thailand and USA for which country-specific value sets were used.
ASAS Health Index
The ASAS HI contains 17 items (dichotomous response option: ‘I agree’ and ‘I do not agree’) addressing different aspects of functioning. A sum score is calculated by counting all ‘I agree’ responses, giving a total ASAS HI score ranging from 0 to 17, with a lower score indicating a better and a higher score indicating an inferior health status (see also user’s manual for handling missing items; online supplementary file 1).7
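The scoring rule described above can be sketched as follows. This is an illustrative sketch only: the function name and constants are hypothetical, and the proportional rescaling used when items are missing is an assumption, since the text defers the handling of missing items to the user’s manual (online supplementary file 1). The 20% limit on missing items is taken from the Results.

```python
from typing import Optional, Sequence

N_ITEMS = 17                 # number of ASAS HI items (from the text)
MAX_MISSING_FRACTION = 0.20  # a total score is only computed if <=20% of items are missing


def asas_hi_score(responses: Sequence[Optional[bool]]) -> Optional[float]:
    """Return the ASAS HI sum score (0-17), or None if too many items are missing.

    responses: 17 entries; True = 'I agree', False = 'I do not agree', None = missing.
    """
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")

    answered = [r for r in responses if r is not None]
    n_missing = N_ITEMS - len(answered)
    if n_missing / N_ITEMS > MAX_MISSING_FRACTION:
        return None  # more than 20% missing: no total score

    agreed = sum(1 for r in answered if r)
    if n_missing == 0:
        return float(agreed)

    # ASSUMPTION: rescale proportionally to 17 items when some items are missing.
    # The actual rule is defined in the user's manual, not in this text.
    return agreed / len(answered) * N_ITEMS
```

With 17 items, the 20% limit means that at most three items may be missing; a questionnaire with four unanswered items yields no total score.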
Variables were collected at baseline and longitudinally in patients who were in a stable disease state (reliability arm) or in patients who required a therapeutic change because of high disease activity (responsiveness arm) (flow chart and patients’ disposition in online supplementary file 2). Patients in the reliability arm were eligible for the analyses when they considered themselves in a stable disease state while on stable treatment (no change in non-steroidal anti-inflammatory drugs (NSAIDs) over the preceding week, with no change in conventional synthetic disease-modifying antirheumatic drug (csDMARD) or tumour necrosis factor inhibitor (TNFi) therapy over the last 4 weeks). These patients were invited to complete the questionnaire at home after an interval of 4–7 days to evaluate reproducibility. In the responsiveness arm, the therapeutic change could include initiation of NSAIDs, a csDMARD or a TNFi. Patients were reassessed 12–24 weeks (for NSAIDs 2–24 weeks) after the treatment change had been implemented. The patients with longitudinal assessments (reliability and responsiveness) were asked to answer a global question at the second assessment and respond as to whether their condition was stable, improved or had worsened compared with baseline assessments. Only those patients reporting improvement in response to the global change question were analysed to assess responsiveness. Results of the validation process and the psychometric properties of the ASAS HI were presented at various ASAS meetings. Votes were taken from ASAS members to confirm the thresholds of ASAS HI.
COSMIN recommendations were followed to test and report measurement properties.21 Psychometric properties were examined according to the OMERACT filter.22 Descriptive statistics were used to characterise the sample. According to the COSMIN checklist, interpretability is summarised as information about the percentage of missing items, a description of how missing items were handled and the distribution of the (total) ASAS HI score including floor and ceiling effects. Distributions of scores were examined for identification of floor and ceiling effects. Construct validity was evaluated against other health outcomes (including patient and physician global assessment, ASDAS, BASDAI, BASFI, HADS, WPAI, SF-36 summary values (physical component summary score (PCS) and mental component summary score (MCS)) and EQ-5D) in a cross-sectional analysis using Spearman correlation. Prior to the analysis, we hypothesised magnitude and direction of correlations, and correlations were considered low if ≤0.30, moderate if >0.30 and ≤0.50, high if >0.50 and <0.80, and very high if ≥0.80.23 Internal consistency was evaluated using Cronbach’s α coefficient (adequate: ≥0.70). Test–retest reliability was assessed by intraclass correlation coefficient (ICC) (two-way model, single measure) with a 95% CI. An ICC of ≥0.8 was considered to indicate excellent reliability. Agreement across the scale of the ASAS HI was visualised by Bland and Altman plot. Measurement error was assessed by analysing the smallest detectable change (SDC) based on the 95% limits of agreement by using the formula: SDC = 1.96 × SD/√2, where SD is the SD of the differences in ASAS HI between the two assessments in the reliability sample.24 Responsiveness was tested with standardised response mean (SRM) after 2–24 weeks depending on the type of medication. SRM was assessed by using the following formula: SRM = mean change in ASAS HI/SD of the change in ASAS HI.
An SRM with an absolute value <0.4 was considered to represent a low effect, 0.4–0.79 a moderate effect and ≥0.8 a large effect. The discriminant ability of the ASAS HI was assessed by calculating ASAS HI mean scores for predefined status groups (ASDAS status groups (inactive, moderate, high and very high), BASDAI and BASFI thresholds (<2.0, 2.0–3.99, 4.0–5.99, ≥6.0)) by analysis of variance. To distinguish between relevant health states (an additional relevant aspect of interpretability), two different methods were applied: fixed 90% specificity and the closest point to (0,1).25 26 We used the patient global assessment at predefined levels (<3 and >6 on NRS and cut-off between good and poor on Likert scale) as external constructs for ‘poor’, ‘moderate’ and ‘good’ health status. We used a global rating of change question (Likert scale) as external construct to assess change perceived by the patient. A cut-off between ‘improved’ versus ‘no change’ or ‘worse’ was used to determine minimal clinically important improvement. The final choice of cut-offs was based on a consensus during the ASAS meeting in June 2017 (74 participants, 100% agreement). A p value ≤0.05 was considered significant. Statistical analyses were performed using SPSS V.23.
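The SDC and SRM formulas given in this section can be illustrated with a short sketch. It is not the authors’ SPSS analysis: the function names are hypothetical, the use of the sample SD (n−1 denominator) is an assumption, and the SRM labels are applied to the absolute value because the reported SRMs are negative (a decrease in ASAS HI means improvement).

```python
import math
from statistics import mean, stdev


def change_scores(first, second):
    """Paired differences: second assessment minus first assessment."""
    return [b - a for a, b in zip(first, second)]


def smallest_detectable_change(first, second):
    """SDC = 1.96 x SD of the paired differences / sqrt(2), as stated in the text."""
    return 1.96 * stdev(change_scores(first, second)) / math.sqrt(2)


def standardised_response_mean(baseline, follow_up):
    """SRM = mean change / SD of the change."""
    d = change_scores(baseline, follow_up)
    return mean(d) / stdev(d)


def srm_label(srm):
    """Thresholds from the text: <0.4 low, 0.4-0.79 moderate, >=0.8 large."""
    magnitude = abs(srm)  # ASSUMPTION: thresholds apply to the absolute value
    if magnitude < 0.4:
        return "low"
    if magnitude < 0.8:
        return "moderate"
    return "large"
```

Applied to the values reported in this study, `srm_label(-0.44)` and `srm_label(-0.69)` return "moderate" and `srm_label(-0.85)` returns "large".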
In total, 1593 patients participated in the international validation study (sample size per country varied between 15 and 130) (see online supplementary file 3). Of these, 1548 had analysable data (45 patients were excluded because of major incomplete data): 64.9% were men, mean age was 42.0 (SD 13.4) years and mean symptom duration was 14.5 (11.4) years (table 1). There were 1292 (83.5%) patients with axSpA (375 patients (29.0%) with nr-axSpA and 917 (71.0%) with AS) and 256 (16.5%) patients with pSpA. Patients had, on average, moderate disease activity as measured by ASDAS and BASDAI; 64.2% were treated with NSAIDs and 38.2% with TNFi (table 1; additional detailed patients’ characteristics of the whole cohort are presented in online supplementary file 4). As expected, the patients in the responsiveness sample had a higher level of disease activity at baseline.
Psychometric properties of the ASAS HI
The mean total score in the population sampled for the ASAS HI was 6.7 (SD 4.3). A total score was calculated for respondents in which not more than 20% of the data were missing (see also user’s manual published in online supplementary file 1). Numbers of missing values were limited and occurred between 0.1% and 0.3% (online supplementary file 5). Floor (percentage of the respondents who had the lowest possible (total) score) or ceiling effects (percentage of the respondents who had the highest possible (total) score) of the ASAS HI in this analysis were acceptable (6.9% and 0.8%, respectively) (figure 1).
Construct validity showed Spearman correlation coefficients ranging from moderate (WPAI absenteeism: 0.38) to high (BASFI: 0.71 or SF-36 PCS: 0.73).
As hypothesised, the ASAS HI had high correlation with patient global (r=0.57), pain (r=0.60), spinal pain (r=0.54), SF-36 MCS (r=0.59), HADS (r=−0.55 and −0.57), ASDAS (r=0.61), presenteeism (r=0.60), BASDAI, BASFI, EQ-5D and SF-36 PCS (r>0.70). The correlation of ASAS HI with physician global (r=0.49) and absenteeism (r=0.38) was moderate (table 2). Of note, correlation of ASAS HI with age was low (r=0.10). Hypotheses about magnitude and direction of correlation were confirmed in 68.7% of variables.
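The correlation categories defined in the statistical methods can be written as a small helper, shown here as a sketch. The function name is hypothetical, and applying the thresholds to the absolute value of the coefficient is an assumption made so that negative coefficients can be classified by magnitude.

```python
def correlation_strength(r: float) -> str:
    """Classify a correlation coefficient using the thresholds stated in the methods:
    low <=0.30, moderate >0.30 to <=0.50, high >0.50 to <0.80, very high >=0.80."""
    magnitude = abs(r)  # ASSUMPTION: the sign only indicates direction
    if magnitude <= 0.30:
        return "low"
    if magnitude <= 0.50:
        return "moderate"
    if magnitude < 0.80:
        return "high"
    return "very high"


# Coefficients reported in this section
reported = {"age": 0.10, "absenteeism": 0.38, "physician global": 0.49,
            "patient global": 0.57, "ASDAS": 0.61}
labels = {name: correlation_strength(r) for name, r in reported.items()}
```

Under these thresholds, age falls in the low category, absenteeism and physician global in the moderate category, and patient global and ASDAS in the high category, matching the description in the text.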
ASAS HI scores showed a high Cronbach’s α of 0.93. Internal consistency of ASAS HI did not vary much across different disease groups (0.93 for AS, 0.94 for nr-axSpA and 0.91 for pSpA).
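For readers unfamiliar with Cronbach’s α, the following sketch shows the standard formula, α = k/(k−1) × (1 − Σ item variances / variance of the sum score), applied to dichotomous items coded 0/1. It is a generic illustration, not the authors’ SPSS procedure; the function name and the toy data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    n_items = len(rows[0])
    if n_items < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    if any(len(row) != n_items for row in rows):
        raise ValueError("all respondents must have the same number of items")

    # Sample variance of each item across respondents
    item_variances = [variance([row[i] for row in rows]) for i in range(n_items)]
    # Sample variance of the sum score
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("sum scores do not vary; alpha is undefined")

    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Toy data: three items that always agree give alpha = 1
perfectly_consistent = [[0, 0, 0], [1, 1, 1], [0, 0, 0], [1, 1, 1]]
alpha = cronbach_alpha(perfectly_consistent)
```

Values of α close to 1 indicate that the items vary together; the study prespecified ≥0.70 as adequate.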
Reliability and measurement error
A total of 770 patients had a second assessment for reliability. Of these, 192 patients had to be excluded because of missing data (n=54), patients not being stable (n=74) or second assessment performed outside the time frame (n=64). Finally, 578 (75.1%) patients who considered themselves to be in a stable state were analysed (table 1). The mean (SD) baseline ASAS HI was 6.2 (4.2) and the second ASAS HI was 6.0 (4.2). Reliability was excellent with an ICC of 0.87 (95% CI 0.84 to 0.89) and ICCs were comparably high in all disease subtypes (AS 0.87 (95% CI 0.84 to 0.89); nr-axSpA 0.89 (95% CI 0.85 to 0.93); pSpA 0.83 (95% CI 0.75 to 0.88)). The Bland-Altman plot showed good agreement between the ASAS HI sum scores at the first and second assessments. No systematic differences in sum score for the two measurement time points were found. Calculation of the limits of agreement (and the SDC) was based on the assumption that measurement error was homoscedastic over the entire range of ASAS HI although this was not completely the case as the variation was somewhat more pronounced in the middle of the range (figure 2). The SDC was calculated as 3.0, which corresponds to the minimum change beyond measurement error that can be detected in an individual patient over time.
A total of 353 patients were allocated to the sensitivity to change arm because of initiation of a new treatment. Of these, 107 patients were excluded because of missing data (n=47), deterioration during the time interval (n=12), no reported change in disease state (n=47) or a second assessment performed outside the time frame (n=1). Finally, 246 (69.7%) estimated themselves to have improved between visits and were analysed. Seventy-eight patients started NSAIDs, 41 patients a csDMARD and 127 patients TNFi. The SRM was −0.44 for NSAIDs (moderate), −0.69 for csDMARDs (moderate) and −0.85 for TNFi (large).
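The test–retest statistics in this paragraph can be sketched for two assessments per patient. The text specifies a two-way model with single measures but not whether absolute agreement or consistency was used; the sketch below assumes the absolute-agreement form, ICC(2,1), and the function names are hypothetical. CIs are not computed here.

```python
from statistics import mean, stdev


def icc_two_way_single(first, second):
    """ICC(2,1) for two assessments per patient.

    ASSUMPTION: two-way random effects, absolute agreement, single measure.
    """
    n, k = len(first), 2
    rows = list(zip(first, second))
    grand = mean(list(first) + list(second))
    row_means = [mean(r) for r in rows]
    col_means = [mean(first), mean(second)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between patients
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between time points
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def limits_of_agreement(first, second):
    """Bland-Altman 95% limits: mean difference +/- 1.96 x SD of the differences."""
    d = [b - a for a, b in zip(first, second)]
    bias, sd = mean(d), stdev(d)
    return bias - 1.96 * sd, bias + 1.96 * sd
```

A constant shift between the two assessments lowers the absolute-agreement ICC even when patients keep their rank order, which is why the absence of a systematic difference between time points matters for interpreting the ICC.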
The ASAS HI discriminated well between patients with different disease activity states (measured by ASDAS and BASDAI) and function (measured by BASFI) (table 3). The groups with greater disease activity and more impaired functioning had higher mean ASAS HI scores (indicating impaired health) than those with lower disease activity.
Cut-off values for interpreting health status based on ASAS HI scores
Final cut-offs for ASAS HI scores to distinguish poor versus moderate, and moderate versus good health are presented in table 4. All analysed scenarios with application of different external anchors and different methodological approaches are presented in online supplementary file 6. In order to balance sensitivity and specificity, a threshold of ASAS HI, which differentiated patients with ‘good/very good’ health from those with ‘moderate’ health state, was identified as being 5.0. In contrast, the 90% specificity criterion was considered to be the most clinically relevant threshold of ASAS HI for ‘moderate’ versus ‘poor/very poor’ health identified as a score of 12.0 or above.
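The two cut-off selection methods named in the statistical methods (fixed 90% specificity and the closest point to (0,1) on the ROC curve) can be sketched as below. The exact procedure used by the authors is not described in the text, so the tie-breaking rules, the function names and the toy data are assumptions for illustration only. Because a lower ASAS HI means better health, "good health" is flagged by scores at or below the cut-off and "poor health" by scores at or above it.

```python
def roc_points(scores, is_positive, positive_if_low):
    """Return (cut-off, sensitivity, specificity) for every observed score."""
    n_pos = sum(1 for p in is_positive if p)
    n_neg = len(is_positive) - n_pos
    points = []
    for cutoff in sorted(set(scores)):
        if positive_if_low:
            flagged = [s <= cutoff for s in scores]
        else:
            flagged = [s >= cutoff for s in scores]
        true_pos = sum(1 for f, p in zip(flagged, is_positive) if f and p)
        true_neg = sum(1 for f, p in zip(flagged, is_positive) if not f and not p)
        points.append((cutoff, true_pos / n_pos, true_neg / n_neg))
    return points


def closest_to_top_left(points):
    """Cut-off minimising the distance to (1 - specificity, sensitivity) = (0, 1)."""
    return min(points, key=lambda p: (1 - p[1]) ** 2 + (1 - p[2]) ** 2)[0]


def fixed_specificity(points, target=0.90):
    """Most sensitive cut-off among those with specificity >= target (None if none)."""
    eligible = [p for p in points if p[2] >= target]
    if not eligible:
        return None
    return max(eligible, key=lambda p: p[1])[0]


# Toy data (hypothetical): ASAS HI scores and an external anchor for good health
toy_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_good = [True, True, True, True, False, True, True, False, False, False]
toy_points = roc_points(toy_scores, toy_good, positive_if_low=True)
balanced_cutoff = closest_to_top_left(toy_points)
specific_cutoff = fixed_specificity(toy_points)
```

The closest-point method trades sensitivity against specificity, whereas the fixed-specificity method accepts lower sensitivity to keep false positives rare, which is the rationale given in the text for using different methods for the two thresholds.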
Attempts to define a clinically important improvement proved an elusive target since scores were too heterogeneous. We therefore recommend using the SDC value of 3.0 to determine change in ASAS HI in individual patients and present the percentage of patients with a change of ≥3.0.
Applying these thresholds within the validation cohort, we were able to show that the three defined health status groups within ASAS HI could discriminate with respect to disease activity, functioning and health measures (table 5). The two cut-off values delineating the three health statuses were agreed on after discussion and voting by 74 ASAS members during their European League Against Rheumatism meeting 2017 (74 approval, 0 decline, 0 abstention).
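The agreed thresholds and the SDC-based reporting recommendation can be applied as in the following sketch. The function and constant names are hypothetical, and counting only decreases in ASAS HI as a detectable improvement is an assumption (lower scores indicate better health); the text itself speaks of a change of ≥3.0.

```python
GOOD_MAX = 5.0   # ASAS HI <= 5.0: good health (from the text)
POOR_MIN = 12.0  # ASAS HI >= 12.0: poor health (from the text)
SDC = 3.0        # smallest detectable change (from the text)


def health_status(score: float) -> str:
    """Map an ASAS HI sum score (0-17) to one of the three health states."""
    if score <= GOOD_MAX:
        return "good"
    if score >= POOR_MIN:
        return "poor"
    return "moderate"


def percent_improved_beyond_sdc(baseline, follow_up) -> float:
    """Percentage of patients whose ASAS HI decreased by at least the SDC.

    ASSUMPTION: improvement is a decrease in score between the two visits.
    """
    improved = sum(1 for a, b in zip(baseline, follow_up) if a - b >= SDC)
    return 100.0 * improved / len(baseline)
```

For example, a patient moving from 10 to 6 has a detectable improvement (4 points) but stays in the moderate group, so the two summaries answer different questions and can be reported side by side.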
The manuscript presents the psychometric properties of the original English ASAS HI and its different translations, as obtained in a large international cohort. We show that the ASAS HI is a valid, reliable and responsive measure of functioning and health in patients with SpA on a global level. Interpretability was good as has been shown for different aspects. In this paper, we report the values for the entire cohort and country-specific results will be published separately in the language of the specific country. Generally, the results were similar in the various countries (data not shown).
Since the ASAS HI contains only 17 items with a dichotomous response option addressing all important aspects of patient complaints, administration of the questionnaire is feasible, as shown in a previous field test.10 The calculation to obtain a single sum score is simple and quick to undertake. Floor effect was acceptable with almost no ceiling effect observed in our study. The scores have good face validity and the ASAS HI correlated highly with other measures covering a range of health outcomes. Analysis of construct validity demonstrated a strong association between ASAS HI sum score and both disease activity and functional disability, indicating that the ASAS HI is measuring a broader concept than just disease activity or physical functioning. In addition, the high correlation between ASAS HI and patient global assessment as well as generic health measures (such as SF-36) suggests that patients do not make substantial distinctions between disease-specific and more generally worded questionnaires. We noted in our cohort a weaker correlation between ASAS HI and physician global as well as discordance between physician and patient global scores at baseline. However, the discordance between patient and physician responses is very small and not comparable with that reported in the literature.27 28
We were able to show that the ASAS HI is applicable in all patients with SpA irrespective of the disease subgroup. Similar results in internal consistency between AS, nr-axSpA and pSpA provide support for the use of these questionnaires in the whole group of SpA. This is an important finding as the ASAS HI was originally developed in patients with AS. However, use of the ASAS HI in patients with pSpA should be carefully checked and its applicability should be further investigated to gain more insights into this subgroup of patients.
There is a debate about which measure is suitable for assessing responsiveness. Our choice is SRM, which is not recommended according to the COSMIN guidelines.21 However, SRM is one of the most widely used responsiveness measures and critique of this part of the COSMIN guidelines has also been published.29 One of the arguments is that the SRM reflects the magnitude of the event more than it provides information on the measure. Indeed, we do show that the SRM is larger for start of biological DMARDs than for NSAIDs. However, the SRM is useful information for researchers who want to use the ASAS HI as an outcome measure in a trial.
This study has clear strengths and weaknesses. Strengths include the involvement of 23 countries with 18 country-specific translations with different cultures and socioeconomic backgrounds within the validation process.10 Thus, the domains of functioning and health assessed in the questionnaire are likely to be relevant across countries and cultures. However, qualitative research about this issue is lacking. One relative weakness of this international validation study is the small sample size in some countries, especially in the longitudinal arm. However, the results of the study do show that the psychometric properties are robust and meaningful. The ASAS HI can be used in clinical trials to evaluate the impact of SpA and its treatment on overall functioning and health in patients with SpA and also to compare disease impact in cohorts and populations. Further research is needed to address the question whether and how the ASAS HI is applicable in daily routine care to guide treatment decisions.
In conclusion, the ASAS HI proved to be a valid, interpretable, reliable and responsive questionnaire to assess overall functioning and health in this global international validation study including 19 languages.
The authors thank all patients who participated in the study. We would like to thank all national collaborators for their tremendous endeavour to recruit patients and to document the results. We thank S Kemmerling in Austria; K Elgarf, M Shaaban and A Ahmed in Egypt; M Flörecke, J Meier and J Winter in Germany; G Marsico and I Olivieri in Italy; L Ward in New Zealand; A Akulova and A Rebrov in Russia; R Ortega (University Hospital Reina Sofía, Córdoba), M Aparicio and X Juanola (University Hospital Bellvitge, Barcelona), R Almodóvar and P Zarco (University Hospital Foundation Alcorcón, Madrid), and C Rodríguez (University Hospital Dr Negrín, Gran Canaria) in Spain; E Kilic, G KIlic, D Yıldız and G Kenar in Turkey; Frane Grubišić and Hana Skala Kavanagh from Croatia, S Bhalara (West Hertfordshire Hospitals NHS Trust), K Gaffney (Norfolk and Norwich University Hospital), P Helliwell (Bradford Institute for Health Research and University of Leeds), J Packham (Keele University) and CS Yee (Doncaster Royal Infirmary) in UK; and G Yoon and The Russell Engleman Rheumatology Research Center (UCSF) in USA for local data collection. We thank Joachim Listing from the German Rheumatism Research Centre Berlin (DRFZ) Germany, who did the threshold analysis and who was of enormous help in discussing the detailed analysis. This work was previously presented at the following conferences and published as a conference abstract:
Handling editor Tore K Kvien
Contributors Study concept and design: UKU, DvdH, AB, JB. Acquisition of data: all authors. Analysis and interpretation of data: all authors. Writing of the manuscript: UKU. Critical revision of the manuscript for important intellectual content: all authors. All authors had access to the data, commented on the report drafts and approved the final submitted version.
Funding This study was funded by the Assessment of Spondyloarthritis international Society (ASAS).
Competing interests None declared.
Patient consent Obtained.
Ethics approval All centres received approval from their local ethics committee.
Provenance and peer review Not commissioned; externally peer reviewed.
Correction notice This article has been corrected since it published Online First. The acknowledgements statement has been updated.