Aim: Currently, several different instruments are used to measure disease activity and extent in clinical trials of anti-neutrophil cytoplasmic autoantibody (ANCA)-associated vasculitis, leading to division among investigative groups and difficulty comparing study results. An exercise comparing six different vasculitis instruments was performed.
Methods: A total of 10 experienced vasculitis investigators from 5 countries scored 20 cases in the literature of Wegener granulomatosis or microscopic polyangiitis using 6 disease assessment tools: the Birmingham Vasculitis Activity Score (BVAS), the BVAS for Wegener granulomatosis (BVAS/WG), BVAS 2003, a Physician Global Assessment (PGA), the Disease Extent Index (DEI) and the Five Factor Score (FFS). Five cases were rescored by all raters.
Results: Reliability of the measures was extremely high (intraclass correlations for the six measures all = 0.98). Within each instrument, there were no significant differences or outliers among the scores from the 10 investigators. Test/retest reliability was high for each measure: range = 0.77 to 0.95. The scores of the five acute activity measures correlated extremely well with one another.
Conclusions: Currently available tools for measuring disease extent and activity in ANCA-associated vasculitis are highly correlated and reliable. These results provide investigators with confidence to compare different clinical trial data and help form common ground as international research groups develop new, improved and universally accepted vasculitis disease assessment instruments.
Wegener granulomatosis (WG) and microscopic polyangiitis (MPA) are forms of small–medium vessel vasculitis that are commonly associated with positive tests for anti-neutrophil cytoplasmic autoantibodies (ANCA). The two diseases have overlapping clinical and pathological manifestations.1 2 They are complex, multisystem processes that usually threaten vital organs and are associated with substantial morbidity and increased mortality.3 The similarities of these two diseases in clinical presentation and treatment regimens have led to their being commonly studied together in clinical therapeutic trials and routinely classified as “ANCA-associated” vasculitides (AAV), despite the lack of perfect accuracy of this label.
Assessment of disease activity in AAV is a significant challenge. There are currently several different instruments used to measure disease activity and extent in clinical trials of AAV.4 Some measures aim to quantify disease activity, others aim to predict long-term outcome and some aim to do both. Most of these measures are derived from the original Birmingham Vasculitis Activity Score (BVAS)5 6 and attempt to catalogue organ system involvement in similar fashions. Although similar in many ways, the three BVAS tools differ in several important respects, including item selection, method of attribution, weighting and application,4 which makes comparison of cohorts and study results difficult and has led to divisions among investigative groups.4 7–10 We performed an exercise comparing six different instruments in terms of reliability, correlation of disease activity scores, ease of use, and other features. This study is part of an effort to develop new international consensus- and data-driven vasculitis disease assessment tools.
Multiple investigators scored the same set of 20 case summaries in the literature using 6 different vasculitis disease assessment instruments; 5 cases were scored twice by each rater. Results were compared across instruments and among investigators. The study was conducted by the Vasculitis Clinical Research Consortium (VCRC).
The study personnel included 10 experienced vasculitis investigators from 4 countries in the EU and the USA (5 countries in total). The group included specialists from internal medicine, rheumatology, nephrology and pulmonology.
A total of 20 case summaries were written in a standardised fashion by 4 different investigators based on actual patients with WG or MPA evaluated in their respective internal medicine, rheumatology, nephrology and pulmonology clinics.
Six different disease assessment tools were completed by each investigator for each case. The activity measures used were:
The BVAS.5 The weighted total score was used for analysis.
The BVAS for Wegener granulomatosis (BVAS/WG).6 The weighted total score was used for analysis.
The BVAS 2003. The weighted total score was used for analysis.
The Physician Global Assessment (PGA) that is included in BVAS/WG.6 This PGA is a visual analogue scale (VAS).
The Disease Extent Index (DEI).11 The total score was used for analysis.
The disease prognosis tool used was the Five Factor Score (FFS),12 (modified to allow for scoring a patient whose disease is already established). The total FFS score was used for analysis.
For each instrument all investigators reviewed a manual of operations and an item glossary. Each investigator completed a training exercise that included scoring three cases with all six instruments and reviewed annotated scoring explanations. Copies of all material are available on the VCRC website (http://www.rarediseasesnetwork.org/vcrc/investigators/outcomes).
Each investigator scored all 20 cases with all instruments. The case order was randomised for each investigator. Order of instruments scored was the same for all investigators.
Each investigator rescored five of the cases using all instruments 1–2 weeks after the initial exercise without reference to prior scoring.
Inter-rater and intrarater reliabilities were measured using intraclass correlations (ICC) fixed between observer pairs. General linear models were used to test for differences between reviewers. Correlations were measured using the Pearson method. All analyses were based on a two-sided significance level of 0.05. Data analysis was conducted using the SAS system (Cary, North Carolina, USA).
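To make the reliability measure concrete, the following is a minimal sketch of an intraclass correlation computed from a cases-by-raters table of scores. The source does not state which ICC form was used, so the two-way random effects, absolute agreement, single-rater form (often written ICC(2,1)) is an assumption here, as are the function name `icc_2_1` and the toy numbers, which are invented for illustration and are not study data. The study itself used SAS, not this code.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `scores` is a list of rows, one row per case; each row holds one
    score per rater. This ICC form is an assumption, not the paper's
    stated method.
    """
    n = len(scores)       # number of cases (rows)
    k = len(scores[0])    # number of raters (columns)

    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares (no replication).
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between-case mean square
    ms_cols = ss_cols / (k - 1)                 # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Invented illustrative scores (NOT study data): 4 cases x 3 raters.
toy_scores = [
    [12, 13, 12],
    [25, 24, 26],
    [5, 6, 5],
    [18, 18, 17],
]
print(round(icc_2_1(toy_scores), 3))
```

For inter-rater reliability, each row would be one case and each column one investigator. For test/retest reliability, the same function could be applied per investigator with two columns, the first and second scoring of the rescored cases.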
Reliability of vasculitis disease assessment instruments
Inter-rater reliability of the measures was extremely high for all six measures (ICC = 0.98; table 1). Within each instrument, there were no significant differences or outliers among the scores from the 10 investigators.
Intrarater (test/retest) reliability was also high for each of the measures, with ICCs ranging from 0.77–0.95 (table 1).
Correlations among vasculitis disease assessment instruments
Table 2 and fig 1 list and depict the correlations among the various disease assessment instruments. The scores of the five acute activity measures (BVAS, BVAS/WG, PGA, BVAS 2003, DEI) were highly correlated with one another. As expected, the correlations between FFS (prognostic score) and other measures were poor.
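As an illustration of how a pairwise correlation table of this kind can be produced, the sketch below computes Pearson correlations between every pair of instruments from per-case total scores. The function names `pearson_r` and `correlation_matrix` and all numbers are hypothetical and invented for illustration. They are not the values reported in table 2.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_matrix(scores_by_instrument):
    """Return {(instrument_a, instrument_b): r} for every ordered pair."""
    names = list(scores_by_instrument)
    return {
        (a, b): pearson_r(scores_by_instrument[a], scores_by_instrument[b])
        for a in names
        for b in names
    }


# Invented per-case scores (NOT study data), one value per case.
toy = {
    "BVAS": [12, 25, 5, 18, 9],
    "BVAS/WG": [6, 11, 2, 8, 4],
    "FFS": [1, 0, 2, 1, 0],
}
for (a, b), r in correlation_matrix(toy).items():
    if a < b:  # print each unordered pair once
        print(f"{a} vs {b}: r = {r:.2f}")
```

In practice the input to each correlation would be one score per case and instrument, for example the mean across raters. How the scores were pooled before correlation is not stated in the source, so that choice is left open here.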
Operational characteristics of the disease activity measures
Following the data collection phase of the exercise, investigators held scheduled freeform discussions regarding the various instruments studied. All investigators agreed that each of the tools was easy to learn and use.
Assessing disease activity in vasculitis involves several major challenges including recognising that vasculitis: (a) affects multiple organ systems; (b) produces manifestations ranging from fulminant and acute disease to chronic, smouldering problems; and (c) results in tissue damage leading to chronic signs and symptoms of disease similar to, but not caused by, active disease. Furthermore, the lack of reliable biomarkers means clinician assessments remain the mainstay of activity measurement. The introduction of validated measures of disease burden in vasculitis has been instrumental in conducting clinical trials of promising new therapies, particularly for WG and MPA.7 10 However, the existence of multiple instruments to measure disease activity in clinical trials of WG and MPA is divisive for the vasculitis research community, is a source of confusion for the wider medical community, and complicates the comparison of results from trials using different disease measures.4
This study, comparing six different vasculitis disease assessment instruments, demonstrates that the currently available tools for measuring disease extent and activity in AAV are highly reliable and correlated. These results, therefore, give investigators more confidence that some comparisons across trials are possible.
This study also demonstrates that differences among the BVAS versions may not be as great as previously perceived. Although the BVAS/WG was designed to be more disease specific,6 it correlated highly with BVAS and BVAS 2003, at least in the one-time measures of disease activity included in this exercise.
This study has several notable strengths. The investigators were highly experienced in caring for patients with vasculitis as well as participating in clinical research in these diseases. Additionally, the investigator group included representatives from multiple medical specialties and several countries. While investigators were each familiar with at least one of the instruments under study, no investigator had used all of the tools and this enhanced the study of reliability of new measures. The 20 cases used in this study were chosen to represent a wide spectrum of clinical manifestations and disease activity states.
This study has some limitations to consider. First, the exercise used paper-based cases; however, the logistics and expense of having 10 investigators question and examine the same 20 patients as well as review chart material are formidable. Second, 20 cases may not represent the full spectrum of these diseases and cannot include all of the manifestations an investigator might see within the context of a clinical trial. The main aim of this study, however, was to compare the instruments against one another and not to revalidate their utility in WG and MPA.
The results of this study apply only to AAV and to investigators trained in the use of these instruments. The study did not attempt to rank the utility of these different measures. The results do support the idea that these measures, despite the stated goals associated with their creation, are perhaps less different than anticipated. The strong correlations between different activity measures also illustrate that even if the use of a particular instrument is not intuitive, the careful training of investigators in the use of the instrument prior to its application may result in excellent reliability.
The data and insight gained from this exercise provide clinical investigators studying vasculitis with confidence to compare different clinical trial data in WG/MPA. This study also helps form common ground as international research groups go forward with developing new, improved and universally accepted vasculitis disease assessment instruments.
Competing interests: None declared.
Funding: Grants were received from the National Institutes of Health/National Institute of Arthritis and Musculoskeletal and Skin Diseases (UO1 AR51874) and the National Center for Research Resources/NIH: U54 RR01949703. The Vasculitis Clinical Research Consortium is part of the NIH Rare Diseases Clinical Research Network (http://www.rarediseasesnetwork.org/vcrc). PAM, ELM and JHS were supported by Mid-Career Development Awards in Clinical Investigation (NIH-NIAMS: K24 AR02224, AR047578 and AR049185). None of the funding sources for this study had any role in any aspect of conducting this research, drafting the manuscript, or the decision to publish the data.