Kaoru Takase-Minegishi, Nobuyuki Horita, Kouji Kobayashi, Ryusuke Yoshimi, Yohei Kirino, Shigeru Ohno, Takeshi Kaneko, Hideaki Nakajima, Richard J Wakefield, Paul Emery, Diagnostic test accuracy of ultrasound for synovitis in rheumatoid arthritis: systematic review and meta-analysis, Rheumatology, Volume 57, Issue 1, January 2018, Pages 49–58, https://doi.org/10.1093/rheumatology/kex036
Abstract
To evaluate diagnostic test accuracy of US compared with MRI for the detection of synovitis in RA patients.
A systematic literature search was performed in the PubMed, EMBASE, Cochrane Library and Web of Science Core Collection databases. Studies evaluating the diagnostic test accuracy of US for synovitis detected by MRI as the reference standard for wrist, MCP, PIP and knee joints were included. To assess the overall accuracy, we calculated the diagnostic odds ratio using a DerSimonian–Laird random effects model and the area under the curve (AUC) for the hierarchical summary receiver operating characteristics using Holling’s proportional hazards models. Summary estimates of the sensitivity and specificity were obtained using the bivariate model.
Fourteen of 601 identified articles were included in the review. The diagnostic odds ratio was 11.6 (95% CI 5.6, 24; I2 = 0%), 28 (95% CI 12, 66; I2 = 11%), 23 (95% CI 6.5, 84; I2 = 19%) and 5.3 (95% CI 0.60, 48; I2 = 0%) and the AUC was 0.81, 0.91, 0.91 and 0.61 for wrist, MCP, PIP and knee joints, respectively. The summary estimates of sensitivity and specificity were 0.73 (95% CI 0.51, 0.87)/0.78 (95% CI 0.46, 0.94), 0.64 (95% CI 0.43, 0.81)/0.93 (95% CI 0.88, 0.97), 0.71 (95% CI 0.33, 0.93)/0.94 (95% CI 0.89, 0.97) and 0.91 (95% CI 0.56, 0.99)/0.60 (95% CI 0.20, 0.90) for wrist, MCP, PIP and knee joints, respectively.
US is a valid and reproducible technique for detecting synovitis in the wrist and finger joints. It may be considered for routine use as part of the standard diagnostic tools in RA.
US seems to be a valid and reproducible technique for detecting synovitis in the wrist and finger joints.
Power Doppler US showed better overall diagnostic test accuracy than greyscale US.
Further US quality assessment is necessary to determine diagnostic test accuracy for synovitis in RA.
Introduction
RA is a chronic inflammatory disease characterized by autoimmunity and polyarticular synovial inflammation; it subsequently causes bone destruction. For patients with RA, the current concept is treat-to-target, with clinical remission as the primary treatment goal, to be achieved as soon as possible [1]. Clinical trials have demonstrated that early treatment reduces inflammation, resulting in limited structural change and better long-term outcomes [2–6]. Therefore, early diagnosis of RA is essential for initiation of treatment. Recently, advances in the field of imaging techniques have resulted in US and MRI being recommended for diagnosing and monitoring disease activity in RA patients [7]. US and MRI have been shown to be more sensitive than clinical examination in detecting synovitis, both in active disease and in remission [8–10]. The predictive value of evaluating subclinical synovitis by imaging techniques was first described by Brown et al. [11], and it has been demonstrated that US-detected subclinical synovitis can lead to radiographic progression, even in clinical remission [12]. Moreover, the presence of inflammation observed with US or MRI can be used to predict the progression from undifferentiated inflammatory arthritis to clinical RA [13–16].
Although MRI is capable of directly visualizing joint inflammation, there are difficulties in performing MRI as an initial test because of limited resources. The assessment of multiple joints with MRI is time consuming and expensive for routine use. In contrast, US is relatively low cost, non-invasive and has real-time capabilities and portability. Despite these advantages, there are some limitations of this technology. While several studies have highlighted the use of US in the detection of joint inflammation as compared with MRI, there were considerable discrepancies in the results in these previous studies, and US is considered to be an operator-dependent technology. To assist in resolving these discrepancies, this systematic review and meta-analysis was conducted.
Methods
Overview
The study protocol followed the Cochrane Handbook for Diagnostic Test Accuracy Reviews and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement, and has been registered on the international prospective register of systematic reviews (no. 42016033912) [17–20]. Institutional review board approval and patient informed consent were waived because of the review nature of this study.
Both case–control and cohort studies were included when they provided sufficient data for both sensitivity and specificity of US for the detection of MRI-judged synovitis in human RA. However, no eligible case–control study was found. Here, single-gate and two-gate studies are termed cohort and case–control studies, respectively, following common usage. Studies covering only sensitivity or only specificity were excluded. Non-English written reports and conference abstracts were allowed in the protocol, although none of them were eventually eligible.
Search strategy
In the electronic search, we systematically searched the PubMed, EMBASE, Cochrane Central Register of Controlled Trials and Web of Science Core Collection databases. Search formulas are presented as supplementary data, section Search Formulas, available at Rheumatology Online. References of previously published reviews and those of included original studies were checked through hand searching. Two investigators (K.M., N.H.) independently screened the candidate articles by checking the title and abstract after uploading the citation list into the Endnote X7 software (Thomson Reuters, Philadelphia, PA, USA). After independent screening, articles still regarded as candidates by at least one investigator were then scrutinized independently through full-text reading. Final inclusion was decided after resolving discrepancies between the two investigators.
Participants
We included patients with a diagnosis of RA defined by the 2010 ACR/EULAR classification criteria or 1987 ACR criteria for RA [21, 22]. Synovitis in RA at the wrist, MCP, PIP or knee joints was the target pathology. Neither bone erosion nor synovitis that was caused by CTDs other than RA was included in this study.
Index and reference test
The index test was US in any mode, including colour Doppler US, power Doppler US, B-mode US, greyscale US, two-dimensional US, three-dimensional US and contrast-enhanced US [23]. Positive and negative results for US were determined based on judgement by the authors of the original research. When a report presented diagnostic test accuracy of two US modes separately, we used only the data of PD to avoid duplicate use of the data from the same subject. In such a case, we selected PD rather than greyscale, because recent data suggested that PD can provide more accurate data than greyscale for synovitis in RA [7].
Reference tests were MRI in any mode, including non-enhanced MRI, enhanced MRI, dynamic MRI, 1.5T MRI and 3T MRI, compact MRI, low-field extremity MRI and 0.2T MRI [24]. Positive and negative results in MRI were also determined based on judgement by the authors of the original research. We categorized the quality of MRI based on MRI mode as follows: high = high field contrast-enhanced MRI, moderate = high agreement was confirmed compared with high field contrast-enhanced MRI and low = low field extremity MRI. Four cohorts, using MRI without contrast enhancement, evaluated the ability to detect synovitis compared with conventional 1.5T contrast-enhanced MRI in RA patients in a preliminary study [33, 35, 41].
Primary outcome
Primary outcomes were diagnostic test accuracy of US for synovitis diagnosed by MRI using the following statistics: diagnostic odds ratio (DOR), the hierarchical summary receiver operating characteristic (HSROC) area under the curve (AUC), the summary estimates of sensitivity and specificity, the positive likelihood ratio (PLR) and the negative likelihood ratio (NLR). Wrist, MCP, PIP and knee joints were evaluated separately [17, 18].
Risk of bias
The two investigators independently evaluated each study by scoring the seven domains of the revised Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool [25]. Any discrepancies were resolved through discussion.
Data synthesis
Data were crosschecked after extraction by the two investigators independently. We then composed a 2 × 2 contingency table for each cohort.
All analyses were done based on the number of joints, not on the number of patients.
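As a minimal sketch of how the per-cohort statistics follow from a joint-level 2 × 2 table (US result against MRI result), the Python snippet below computes sensitivity, specificity, PLR, NLR and DOR. The class name, function name and counts are hypothetical and chosen only for illustration; they are not taken from any included study. Zero cells would cause a division by zero here and would need a continuity correction, which this sketch does not apply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Joint-level 2 x 2 table: US (index test) against MRI (reference)."""
    tp: int  # US positive, MRI positive
    fp: int  # US positive, MRI negative
    fn: int  # US negative, MRI positive
    tn: int  # US negative, MRI negative


def accuracy_statistics(t: TwoByTwo) -> dict:
    """Return sensitivity, specificity, PLR, NLR and DOR for one cohort."""
    sens = t.tp / (t.tp + t.fn)
    spec = t.tn / (t.tn + t.fp)
    plr = sens / (1 - spec)      # positive likelihood ratio
    nlr = (1 - sens) / spec      # negative likelihood ratio
    return {
        "sensitivity": sens,
        "specificity": spec,
        "PLR": plr,
        "NLR": nlr,
        "DOR": plr / nlr,        # equals (tp * tn) / (fp * fn)
    }


# Hypothetical counts for illustration only; not data from the review.
example = TwoByTwo(tp=40, fp=5, fn=10, tn=45)
stats = accuracy_statistics(example)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Because every joint contributes one row to the table, a cohort of few patients can still supply many observations, which is the joint-based unit of analysis described above.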
We used both the HSROC model and bivariate model. To determine the overall diagnostic test accuracy, we calculated the DOR using the DerSimonian–Laird random effects model and the AUC using the bivariate model of Reitsma [26, 27]. Heterogeneity was indicated by I2, wherein 0% means no heterogeneity and 100% means the strongest heterogeneity. We obtained a paired forest plot, HSROC curve and summary estimates of the sensitivity and specificity using the bivariate model. PLR and NLR were obtained using the summary estimate of the sensitivity and specificity [26, 27]. The DOR, AUC and HSROC were obtained from all the cohorts, regardless of the cut-off value. Summary estimates of the sensitivity and specificity, PLR and NLR were obtained from cohorts that used US cut-off values between negative and positive. According to the authors, five adaptive cut-off scores of US were used: score 1 = 0 grouped as negative, 1 grouped as positive; score 2 = 0 grouped as negative, 1–2 grouped as positive; score 3 = 0 grouped as negative, 1–3 grouped as positive; score 4 = 0–1 grouped as negative, 2–3 grouped as positive; score 5 = 0–1 grouped as negative, 2–4 grouped as positive. We conducted subgroup analyses based on US modes and MRI modes.
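The five cut-off scores listed above can be sketched as a lookup from semi-quantitative US grade to a positive or negative call. This is an illustrative Python reading of the text, not code from the original analysis; the names `CUTOFF_SCHEMES`, `is_positive` and `uses_absence_presence_cutoff` are hypothetical. The last helper assumes that "cut-off values between negative and positive" refers to schemes in which only grade 0 counts as negative (scores 1 to 3).

```python
# Cut-off schemes as listed in the text:
# scheme number -> (grades counted as negative, grades counted as positive).
CUTOFF_SCHEMES = {
    1: ({0}, {1}),
    2: ({0}, {1, 2}),
    3: ({0}, {1, 2, 3}),
    4: ({0, 1}, {2, 3}),
    5: ({0, 1}, {2, 3, 4}),
}


def is_positive(grade: int, scheme: int) -> bool:
    """Dichotomize a US grade under one of the five cut-off schemes."""
    negative, positive = CUTOFF_SCHEMES[scheme]
    if grade in positive:
        return True
    if grade in negative:
        return False
    raise ValueError(f"grade {grade} is not defined for cut-off scheme {scheme}")


def uses_absence_presence_cutoff(scheme: int) -> bool:
    """True when only grade 0 is negative (assumed to mean schemes 1 to 3)."""
    negative, _ = CUTOFF_SCHEMES[scheme]
    return negative == {0}


# A grade of 1 is positive under scheme 3 but negative under scheme 4.
print(is_positive(1, 3), is_positive(1, 4))
```

Under this reading, cohorts using schemes 4 or 5 contribute to the DOR and AUC but not to the summary sensitivity, specificity, PLR and NLR.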
We used the following commands in the mada package of the statistics software R: the madauni command for DOR and the reitsma command for the AUC, HSROC curve and the summary estimates of sensitivity and specificity [26, 27]. Review Manager 5.3 (Cochrane, London, UK) was used to draw the paired forest plot and the Cochrane risk of bias graph.
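For readers who want to see the arithmetic behind a DerSimonian–Laird random effects pooling of the DOR, a minimal Python sketch follows. It is not the mada implementation and is not guaranteed to reproduce its output; the function names, the choice to add a 0.5 continuity correction to every cell, and the cohort counts are all assumptions made for illustration.

```python
import math


def log_dor(tp, fp, fn, tn, correction=0.5):
    """Log DOR and its approximate variance for one cohort.

    Assumption: the continuity correction is added to every cell of every
    table, which is one common convention and keeps zero cells finite.
    """
    a, b, c, d = (x + correction for x in (tp, fp, fn, tn))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def dersimonian_laird(effects, variances):
    """Return (pooled effect, standard error, tau^2, I^2 in percent)."""
    k = len(effects)
    if k < 2:
        raise ValueError("at least two cohorts are needed")
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # method-of-moments between-cohort variance
    w_re = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    i2 = 100.0 * max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0
    return pooled, se, tau2, i2


# Hypothetical joint-level counts (TP, FP, FN, TN); not data from the review.
cohorts = [(40, 5, 10, 45), (18, 3, 7, 32), (25, 6, 9, 60)]
effects, variances = zip(*(log_dor(*t) for t in cohorts))
pooled, se, tau2, i2 = dersimonian_laird(effects, variances)
dor = math.exp(pooled)
ci = (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se))
print(f"pooled DOR {dor:.1f} (95% CI {ci[0]:.1f}, {ci[1]:.1f}), I2 = {i2:.0f}%")
```

Pooling is done on the log scale and back-transformed, which is why the reported confidence intervals for the DOR are asymmetric around the point estimate.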
Interpretation of diagnostic test accuracy statistics
The AUC was interpreted in a four-grade scale: <0.75, not accurate; 0.75–0.92, good; 0.93–0.96, very good; ≥0.97, excellent [28]. PLR values <2, 2–5, 5–10 and >10 were recognized as a not meaningful, small, moderate and large increase in probability, respectively [29]. NLR values >0.5, 0.2–0.5, 0.1–0.2 and <0.1 were recognized as a not meaningful, small, moderate and large decrease in probability, respectively [29].
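The grading scales can be written as three small functions. This is a hedged Python sketch: the function names are hypothetical, the AUC is assumed to be rounded to two decimals with the top grade read as 0.97 and above, and the handling of values that fall exactly on a shared boundary (for example a PLR of exactly 5) is an assumption, because the text does not say which grade such values receive.

```python
def grade_auc(auc: float) -> str:
    """Four-grade AUC scale, applied to the value rounded to two decimals."""
    a = round(auc, 2)
    if a < 0.75:
        return "not accurate"
    if a <= 0.92:
        return "good"
    if a <= 0.96:
        return "very good"
    return "excellent"


def grade_plr(plr: float) -> str:
    """Size of the increase in probability after a positive test."""
    if plr < 2:
        return "not meaningful"
    if plr < 5:          # assumption: exactly 5 is graded moderate
        return "small"
    if plr <= 10:
        return "moderate"
    return "large"


def grade_nlr(nlr: float) -> str:
    """Size of the decrease in probability after a negative test."""
    if nlr > 0.5:
        return "not meaningful"
    if nlr > 0.2:        # assumption: exactly 0.2 is graded moderate
        return "small"
    if nlr >= 0.1:
        return "moderate"
    return "large"


print(grade_auc(0.81), grade_plr(9.1), grade_nlr(0.35))
```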
Results
Study search and study characteristics
Of the 601 candidate articles, we finally identified 14 eligible reports [30–43]. Three of these presented 2 cohorts, thus we included 17 independent cohorts (Fig. 1). To obtain data that were not presented in each original report, we tried to contact the authors of 18 reports; authors of 3 original reports provided additional information [32–34].
Among the 14 reports included, 6 were from Japan, 4 from Denmark and 1 each from Belgium, China, Germany and the UK. Publication dates ranged from 2001 to 2014. All reports used a one-gate cohort recruiting method (Table 1). One was a letter and the others were full articles. Seven, three and one study were conducted in a single university hospital, in multicentre hospital-based arthritis clinics and in a single hospital, respectively, while three reports did not provide specific information about the facility. To diagnose RA, 1 used both 1987 ACR criteria and 2010 ACR/EULAR criteria, 12 used 1987 ACR criteria only and 1 did not provide information on diagnostic criteria. The number of patients in each study ranged from 6 to 77, with a median of 18 and a total of 376 (Table 1). Concerning the Cochrane risk of bias evaluation, one study had a high risk of index bias due to an arbitrary US cut-off [36]. No other report had a high risk of bias or any high applicability concerns (supplementary Fig. S1, available at Rheumatology Online).
Cohort | Country | Facility | Patient and joint condition | Design | Risk^a | US: Mode | US: Cut-off | MRI: Machine | MRI: Mode | MRI: Cut-off | MCP/PIP joints: MCP | MCP/PIP joints: PIP | Patients, n | Number of joints: Wrist | Number of joints: MCP | Number of joints: PIP | Number of joints: Knee |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Beckers et al. [30] | Belgium | NA | Serious side, before TNFi | IC + | No | GSUS + PDUS | 0/1 | NA | DCE-MRI | 0/1 | 16 | 16 | |||||
Freeston et al. [31] | UK | H | Serious side | —b | No | PDUS | 01/23 | NA | eMRI | 01/23 | 2–3 | 64 | 124 | ||||
Fukuba et al. [32] | Japan | NA | IC + | No | PDUS | 0/123 | 1.5T | GD-DTPA | 0/123 | 1–5 | 1–5 | 10 | 20 | 100 | 100 | ||
Horikoshi et al. [33], GSUS | Japan | UH | No | GSUS | 0/123 | 0.3T | STIR | 0/123 | 1–5 (D) | 2–5 (D) | 6 | 12 | 60 | 48c | |||
Horikoshi et al. [33], PDUS | Japan | UH | No | PDUS | 0/123 | 0.3T | STIR | 0/123 | 1–5 (D) | 2–5 (D) | 6 | 12 | 60 | 48c | |||
Kamishima et al. [34] | Japan | UH | Before tocilizumab | IC + | No | PDUS | 0/123 | 1.5T | GD-DTPA | 0/123 | 2–5 (D) | 29 | 232 | ||||
Ogishima et al. [35] | Japan | UH | (i)d | No | PDUS | 0/123 | 0.3T | STIR | 0/1 | 1–5 (D) | 2–5 (D) | 77 | 154 | 770 | 616 | ||
Scheel et al. [36] | Germany | UH | IC + | Yes | GSUS | 0/123 | 0.2T | GD-DTPA | 0/123 | 2–5 (V) | 2–5 (V) | 10 | 80 | 80 | |||
Szkudlarek et al. [37] | Denmark | AC | On MTX | IC + | No | PDUS | 0/1 | 1.0T | DCE-MRI | ESE <1%/s | 2–5 | 15 | 54 | ||||
Szkudlarek et al. [38], CE-PDUS | Denmark | AC | On MTX | IC + | No | CE-PDUS | 0/123 | 1.0T | DCE-MRI | ESE <1%/s | One of 2–5 | 15 | 15 | ||||
Szkudlarek et al. [38], PDUS | Denmark | AC | On MTX | IC + | No | PDUS | 0/123 | 1.0T | DCE-MRI | ESE <1%/s | One of 2–5 | 15 | 15 | ||||
Szkudlarek et al. [39] | Denmark | AC | IC + | No | GSUS | 01/234 | 1.0T | GD-DTPA | 01/234 | 2–5 (DV) | 2–5 (DV) | 40 | 154 | 123 | |||
Takase et al. [40], GSUS | Japan | UH | Before knee operation | IC + | No | GSUS | 01/23 | 1.5T | GD-DTPA | 01/23 | 15 | 15 | |||||
Takase et al. [40], PDUS | Japan | UH | Before knee operation | IC + | No | PDUS | 01/23 | 1.5T | GD-DTPA | 01/23 | 15 | 15 | |||||
Taniguchi et al. [41] | Japan | UH | IC + | No | PDUS | 0/123 | 1.5T | Maximum intensity projection | 0/12 | 1–5 (D) | 30 | 60 | 300 | ||||
Terslev et al. [42] | Denmark | NA | IC + | No | PDUS | 0/1 | 1.5T | GDD | 0/123 | 1–5 (D) | 1–5 (D) | 29 | 29 | 91 | 74 | ||
Xiao et al. [43] | China | UH | Serious side | IC − | No | PDUS | 0/123 | 3.0T | GD-DTPA | 0/1 | 2–5 (DV) | 2–5 (DV) | 20 | (i) 20c | 80 | 80 |
If a report presented diagnostic test accuracy of two US modes separately, we treated them as two independent cohorts.
^a Cochrane risk of bias was evaluated; when one or more of the seven domains were scored as high risk, we entered ‘yes’ in the cell.
^b Freeston et al.’s report was a letter. The other reports were full articles. Freeston et al. did not present RA criteria.
^c Data were excluded from diagnostic test accuracy estimation since all joints were either reference positive or negative.
^d Ogishima et al. used both 1987 ACR and 2010 ACR/EULAR criteria. All reports except for Freeston et al. and Ogishima et al. used 1987 ACR criteria. NA: data not available; H: hospital; UH: university hospital; AC: hospital-based arthritis clinic; serious side: more seriously affected side only; TNFi: TNF inhibitor; IC: informed consent; GSUS: greyscale US; PDUS: power Doppler US; CE-PDUS: contrast-enhanced PDUS; T: Tesla; DCE-MRI: dynamic contrast-enhanced MRI; eMRI: low field extremity MRI; GD-DTPA: gadolinium-diethylenetriaminepentacetate-enhanced MRI; STIR: short tau inversion recovery MRI; GDD: gadodiamide-enhanced MRI; D: dorsal; V: volar; ESE <1%/s: rate of early synovial enhancement <1.0%/s.
Among 17 cohorts, 12 used non-enhanced power Doppler, 3 used greyscale US, 1 used contrast-enhanced power Doppler and 1 used both greyscale US and power Doppler. Wrist, MCP, PIP and knee joints were evaluated in 5, 12, 6 and 2 cohorts, respectively, and were evaluated for 275, 2060, 1073 and 31 joints, respectively (Table 2). The median sensitivities/specificities were 0.66/0.90, 0.77/0.96, 0.80/0.91 and 0.77/0.55 for wrist, MCP, PIP and knee joints, respectively (Fig. 2).
Statistic | Wrist | MCP | PIP | Knee |
---|---|---|---|---|
Overall diagnostic value | | | | |
DOR (95% CI) | 11.6 (5.6, 24) | 28 (12, 66) | 23 (6.5, 84) | 5.3 (0.60, 48) |
I², % | 0 | 11 | 19 | 0 |
AUC | 0.81 (good) | 0.91 (good) | 0.91 (good) | 0.61 (not accurate) |
Cohorts (joints), n | 5 (275) | 12 (2060) | 6 (1073) | 2 (31) |
Cut-off absence/presence | | | | |
Sensitivity (95% CI) | 0.73 (0.51, 0.87) | 0.64 (0.43, 0.81) | 0.71 (0.33, 0.93) | 0.91 (0.56, 0.99) |
Specificity (95% CI) | 0.78 (0.46, 0.94) | 0.93 (0.88, 0.97) | 0.94 (0.89, 0.97) | 0.60 (0.20, 0.90) |
PLR (95% CI) | 3.3 (1.3, 12)^b | 9.1 (4.2, 19)^c | 11.8 (4.3, 24)^d | 2.3 (0.91, 15)^b |
NLR (95% CI) | 0.35 (0.16, 0.75)^b | 0.39 (0.21, 0.62)^b | 0.31 (0.077, 0.72)^b | 0.15 (0.02, 1.3)^c |
Cohorts (joints), n | 5 (275) | 10 (1782) | 5 (950) | 1 (16) |
The AUC was interpreted in a four-grade scale as follows: <0.75, not accurate; 0.75–0.92, good; 0.93–0.96, very good; ≥0.97, excellent. PLR: values in the range of <2, 2–5, 5–10 and >10 are recognized as a ^a not meaningful (N), ^b small (S), ^c moderate (M) or ^d large (L) increase of probability. NLR: values in the range of >0.5, 0.2–0.5, 0.1–0.2 and <0.1 are recognized as a ^a not meaningful (N), ^b small (S), ^c moderate (M) or ^d large (L) decrease of probability. Sensitivity, specificity, PLR and NLR were obtained from studies with cut-off values between absence/presence or 0/1.
Wrist
Five cohorts with 275 wrist joints yielded a DOR of 11.6 (95% CI 5.6, 24; I2 = 0%) and an AUC of 0.81. This AUC suggested that US had good diagnostic test accuracy for wrist synovitis (Fig. 3 and Table 2).
Using the cut-off value between absence and presence, the summary estimates of sensitivity and specificity were 0.73 (95% CI 0.51, 0.87) and 0.78 (95% CI 0.46, 0.94), respectively. Based on a PLR of 3.3 and an NLR of 0.35, both positive and negative US results suggested a small change in synovitis probability (Table 2).
MCP
Data for 2060 MCP joints from 12 cohorts suggested a DOR of 28 (95% CI 12, 66; I2 = 11%) and an AUC of 0.91, which means that US had good diagnostic test accuracy for MCP synovitis (Fig. 3 and Table 2).
When applying the cut-off value between absence and presence, the summary estimates of sensitivity and specificity were 0.64 (95% CI 0.43, 0.81) and 0.93 (95% CI 0.88, 0.97), respectively (Table 2). The PLR was 9.1 (95% CI 4.2, 19), suggesting a moderate increase in MCP synovitis probability when detected by US.
PIP
Six cohorts of 1073 PIP joints yielded a DOR of 23 (95% CI 6.5, 84; I2 = 19%) and an AUC of 0.91. This AUC value suggests that US had good diagnostic test accuracy for PIP synovitis (Fig. 3 and Table 2). Using the data from five cohorts that used a cut-off value between absence and presence, the summary estimates of sensitivity and specificity were 0.71 (95% CI 0.33, 0.93) and 0.94 (95% CI 0.89, 0.97), respectively. Positive and negative US results suggested large and small changes in synovitis probability, respectively (Table 2).
Knee
The diagnostic test accuracy for the knee was investigated in fewer cohorts and joints than for the other joint sites. The DOR was 5.3 (95% CI 0.60, 48; I2 = 0%) and the AUC was 0.61, which indicated that US did not have good diagnostic test accuracy for knee synovitis (Fig. 3 and Table 2). The 95% CI of both PLR and NLR included 1.0, which meant no diagnostic value (Table 2).
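The likelihood ratios quoted for the four joint sites follow directly from the summary sensitivity and specificity, since PLR = sensitivity / (1 − specificity) and NLR = (1 − sensitivity) / specificity. The short Python check below recomputes them from the rounded values in Table 2; the variable and function names are hypothetical, and small differences from the published figures are expected because the inputs are rounded to two decimals.

```python
# Summary (sensitivity, specificity) per joint site, as reported in Table 2.
SUMMARY = {
    "wrist": (0.73, 0.78),
    "MCP": (0.64, 0.93),
    "PIP": (0.71, 0.94),
    "knee": (0.91, 0.60),
}

# (PLR, NLR) per joint site, as reported in Table 2.
REPORTED = {
    "wrist": (3.3, 0.35),
    "MCP": (9.1, 0.39),
    "PIP": (11.8, 0.31),
    "knee": (2.3, 0.15),
}


def likelihood_ratios(sens: float, spec: float) -> tuple:
    """Return (PLR, NLR) from a sensitivity and specificity pair."""
    return sens / (1 - spec), (1 - sens) / spec


recomputed = {joint: likelihood_ratios(*pair) for joint, pair in SUMMARY.items()}
for joint, (plr, nlr) in recomputed.items():
    rep_plr, rep_nlr = REPORTED[joint]
    print(f"{joint}: PLR {plr:.2f} (reported {rep_plr}), NLR {nlr:.2f} (reported {rep_nlr})")
```

The high specificity at the MCP and PIP joints is what drives their larger PLR values, whereas the knee estimate rests on a single cohort with a wide confidence interval.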
MRI mode subgroup analysis
We carried out subgroup analyses focusing on studies with high-quality MRIs and those with moderate- or high-quality MRIs. These analyses almost replicated the results from studies with any MRI modes (supplementary Table S1, available at Rheumatology Online).
US mode subgroup analysis
Based on the US mode subgroup analysis, power Doppler US showed better overall diagnostic test accuracy than greyscale US (supplementary Table S2, available at Rheumatology Online). Notably, power Doppler US had a very good AUC to detect MRI-proven synovitis in MCP and PIP joints. Positive power Doppler US with a cut-off value between absence and presence or 0–1 largely increases the probability of MRI-proven synovitis in MCP and PIP joints (supplementary Table S2, available at Rheumatology Online).
Discussion
US is widely used in daily practice and in clinical trials for the evaluation of RA inflammatory activity. Despite the increasing availability of US, there remains a lack of quality validation studies. The OMERACT group has proposed definitions for SF and synovial hypertrophy [44]. US allows visualization of the pannus developing in the inflamed joint. Greyscale and Doppler US are capable of measuring synovial proliferation and vascularity, respectively. Several approaches for assessing synovitis in RA patients have been described in published studies. Qualitative, semi-quantitative and scoring systems have been used for assessing synovitis by greyscale and/or Doppler US.
Our systematic review and meta-analysis provided evidence supporting the use of US for evaluating synovitis in RA patients. MRI mode–based subgroup analyses suggested the robustness of our analysis. We showed that the diagnostic test accuracy of US was good for detecting synovitis at the joint level using MRI as the reference standard, especially with regard to MCP and PIP joints. The data suggest that US of the wrist joints is less accurate than US of the MCP and PIP joints. The diagnostic test accuracy for knee joints was low, but was based on a small number of cohorts. Although it has limited resolution for deeper joints and the patient’s body habitus may sometimes make examination difficult, US has been shown to be more sensitive than clinical examination in determining synovitis for large joints such as the shoulder and knee [45, 46]. The small sample size widened the CIs, so the results carry greater statistical uncertainty, even when the diagnostic test had a high sensitivity.
This meta-analysis has several limitations. The number of papers qualifying for the analysis is small, and because we used data obtained through direct communication with the original authors, recall bias may have occurred. Our systematic review focused on wrist, finger and knee joints. As noted above, only two reports representing three cohorts compared the ability of US and MRI to detect synovitis for knee joints. However, the small joints of the hands and feet play a central role in the diagnosis of RA. Our systematic review shows that US can be recommended as a reliable diagnostic tool for synovitis in RA. A previous systematic review suggested that the wrist, MCP and MTP joints should be scanned in the diagnostic process of RA [47]. Although the feet were not evaluated in this study, similar results may be obtained for MTP joints. As MRI without contrast enhancement is not a gold standard for detecting synovitis, we carried out subgroup analyses based on MRI quality. MRI is also reader dependent, particularly when an established scoring method such as the Rheumatoid Arthritis MRI Score is not used. Furthermore, subgroup analysis based on US mode showed that Doppler US had very good diagnostic test accuracy and was more accurate than greyscale US in detecting MCP synovitis. Some subgroup analyses provided imprecise estimations of test accuracy due to the limited number of studies.
Our systematic review did not distinguish early and established RA. In this meta-analysis, all of the identified eligible studies were performed in patients with established RA. Harman et al. [48] assessed the efficacy of US compared with contrast-enhanced MRI in patients with newly diagnosed RA. However, as they showed only sensitivity and specificity data, this study was excluded. Another issue is that the techniques used for scoring are operator dependent. Although US examination for synovitis is mostly carried out from the dorsal aspect of the finger joint, several studies have addressed volar synovitis. Moreover, our systematic review revealed a lack of consensus regarding a standardized US scoring system for synovitis. Positive and negative US-determined synovitis were defined with different cut-off values across studies. The sensitivity and specificity of a quantitative test depend on the cut-off value chosen, and there is a trade-off between sensitivity and specificity. We chose the bivariate model to determine the overall diagnostic test accuracy of US. This model takes into account the potential trade-off between sensitivity and specificity by explicitly incorporating this negative correlation in the analysis, with the result that it could calculate the DOR/AUC. However, the reliability of the estimated accuracy is limited, especially for the knee, where only a limited number of studies are available. In addition, the optimal cut-off value was not determined in this study. Although five adaptive cut-off scores were used, the small sample size did not allow accuracy to be distinguished across the various cut-off scores.
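The dependence of sensitivity and specificity on the chosen cut-off can be illustrated with a toy example. The sketch below assumes a semi-quantitative US grade of 0 to 3 per joint and an MRI reference result. All data are invented for illustration and do not come from any study in this review, and the function name is hypothetical.

```python
def sens_spec_at_cutoff(us_grades, mri_positive, cutoff):
    """Sensitivity and specificity when US grade >= cutoff counts as positive."""
    tp = fp = tn = fn = 0
    for grade, reference in zip(us_grades, mri_positive):
        test_positive = grade >= cutoff
        if test_positive and reference:
            tp += 1
        elif test_positive and not reference:
            fp += 1
        elif not test_positive and reference:
            fn += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Invented data: twelve joints, six MRI-negative followed by six MRI-positive.
us_grades = [0, 0, 0, 1, 1, 2, 0, 1, 2, 2, 3, 3]
mri_positive = [False] * 6 + [True] * 6

results = {c: sens_spec_at_cutoff(us_grades, mri_positive, c) for c in (1, 2, 3)}
for cutoff, (sens, spec) in results.items():
    print(f"cut-off >= {cutoff}: sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

In this toy data set, raising the cut-off from 1 to 3 lowers sensitivity while raising specificity. This is the trade-off that makes estimates from studies using different cut-offs difficult to pool directly.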
In summary, this systematic review and meta-analysis suggests that US, especially power Doppler US, is a valid and reproducible technique for detecting synovitis in the wrist and finger joints. US has certain advantages over MRI, including low cost, portability and lack of contraindications. However, it does require consideration of appropriate training and quality assessment. Nevertheless, US may become more widely adopted for routine use as part of the standard diagnostic tools in RA.
Acknowledgements
The authors would like to thank Erika Ota, St Luke’s International University Graduate School of Nursing Science, for help with the literature search. The authors declare no conflicts of interest regarding this work. K.T.-M. is supported by the Japan Society for the Promotion of Science Grant-in-Aid for Scientific Research (grant no. 15K19578). R.Y. is supported by the Japan Society for the Promotion of Science Grant-in-Aid for Scientific Research (grant no. 26461468) and the Yokohama Foundation for the Advancement of Medical Science. Y.K. is supported by the Japan Society for the Promotion of Science Grant-in-Aid for Scientific Research (grant no. 26713036 and 15K15374).
Funding: No specific funding was received from any bodies in the public, commercial or not-for-profit sectors to carry out the work described in this article.
Disclosure statement: R.J.W. has received a speaker’s honorarium from General Electric. P.E. has undertaken clinical trials and provided expert advice to Pfizer, MSD, AbbVie, Bristol-Myers Squibb, USB, Roche, Novartis, Samsung, Sandoz and Lilly. All other authors have declared no conflicts of interest.
Supplementary data
Supplementary data are available at Rheumatology Online.