Extended report
Imaging modalities for the classification of gout: systematic literature review and meta-analysis
Alexis Ogdie (1), William J Taylor (2), Mark Weatherall (2), Jaap Fransen (3), Tim L Jansen (3), Tuhina Neogi (4), H Ralph Schumacher (1), Nicola Dalbeth (5)

  1. Division of Rheumatology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  2. Department of Medicine, University of Otago, Wellington, New Zealand
  3. Department of Rheumatology, Radboud University Medical Center, Nijmegen, The Netherlands
  4. Sections of Epidemiology and Rheumatology, Boston University School of Medicine, Boston, Massachusetts, USA
  5. Department of Medicine, University of Auckland, Auckland, New Zealand

Correspondence to Dr Alexis Ogdie, Division of Rheumatology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; alexis.ogdie{at}uphs.upenn.edu

Introduction

Classification criteria are necessary to ensure relative homogeneity of participants in clinical research, including clinical trials and epidemiological studies.1 The definitive classification of gout relies on the microscopic identification of monosodium urate (MSU) crystals in synovial fluid or from tophi.2 However, examination of synovial fluid may not be practical for all studies, such as those with an epidemiological focus. Therefore, clinical classification criteria also exist for gout. The most widely used clinical classification criteria are the 1977 American Rheumatism Association (ARA) preliminary classification criteria of acute arthritis of primary gout.3 ,4

The 1977 ARA clinical criteria included two plain radiography features: asymmetric swelling within a joint, and subcortical cysts without erosions.4 Since 1977, major advances have been made in the imaging of gout, and new imaging modalities have become more widely available and commonly used in clinical practice.5 Inclusion of such imaging tests, if they can distinguish gout from not-gout, may be helpful in the clinical classification of gout. However, it remains unclear how accurate and useful available imaging modalities are for the classification of gout, particularly when compared to the microscopic confirmation of MSU crystals as the gold standard test.

The objective of this study was to examine the usefulness of imaging modalities in the classification of symptomatic gout when compared to MSU crystal confirmation as the gold standard. We systematically reviewed the published literature concerning the diagnostic performance of plain film radiography (X-ray), MRI, ultrasound (US), conventional CT and dual energy CT (DECT). This systematic review was performed to inform the development of new classification criteria for gout.2

Methods

Literature search

A systematic search was performed by a medical librarian using Ovid Medline, PubMed, Embase and Cochrane databases from January 1946 to March 2014. Search terms included gout, podagra, crystal arthrop$, toph$, imaging, arthrography, radiography, ultrasound, radiograph, plain x-ray, MRI, tomography, CT, dual energy CT and DECT. (The complete search strategy is listed in online supplementary file 1.) Articles were excluded from the search if they were not published in the English language, did not involve human subjects or were case reports (as these reports did not include comparator patients and thus would not meet the inclusion criteria as described below). We also searched the American College of Rheumatology (ACR) and European League Against Rheumatism (EULAR) meetings for relevant abstracts from 2007 to 2013. All abstracts with ‘gout’ in the title or body were reviewed.

Review of literature

After the initial searches were completed, AO reviewed all the resulting titles and abstracts. Citations were excluded if the title or abstract was not relevant to the goals of the review. Full manuscripts of the remaining citations were reviewed by AO. Review articles were excluded but references within review articles were searched to ensure adequate capture of all relevant articles. When not enough information was provided in the abstract or manuscript, authors were emailed to obtain further data.

Selection criteria

Inclusion criteria were: (a) studies examining the diagnostic performance of an imaging modality (X-ray, MRI, US, CT or DECT) in gout; (b) inclusion of at least two groups of patients where one group had gout; and (c) gout was confirmed by the presence of MSU crystals in joint fluid. The article or abstract also had to include either the raw results (positive vs negative imaging features for each group), or specificity and sensitivity. Exclusion criteria were: (a) use of clinical criteria or physician- or patient-report for classification of gout instead of MSU crystal confirmation; (b) lack of a control or comparison group; (c) cases with asymptomatic hyperuricaemia; or (d) insufficient information provided to calculate sensitivity and specificity.
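The raw results required above (positive vs negative imaging features in each group) are exactly what is needed to compute sensitivity and specificity. The following is a minimal sketch of that calculation; the class name `TwoByTwo` and the counts are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from one study, with MSU crystal confirmation as the reference test."""
    tp: int  # imaging positive, crystal-proven gout
    fn: int  # imaging negative, crystal-proven gout
    fp: int  # imaging positive, comparator group
    tn: int  # imaging negative, comparator group

    def sensitivity(self) -> float:
        # Proportion of crystal-proven gout cases with a positive imaging feature
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of comparator patients with a negative imaging feature
        return self.tn / (self.tn + self.fp)


# Hypothetical counts, for illustration only
example = TwoByTwo(tp=40, fn=10, fp=5, tn=45)
print(f"sensitivity = {example.sensitivity():.2f}")
print(f"specificity = {example.specificity():.2f}")
```

A study that reports neither the four counts nor both proportions cannot contribute to a pooled estimate, which is the reason for exclusion criterion (d).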

Data extraction and quality assessment

Data were extracted from manuscripts by AO and ND using a standardised data abstraction tool. The Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool was then applied by AO and ND. When there were differences in QUADAS scores between AO and ND, WT served as a third reviewer to settle discrepancies. The QUADAS is a 14-item scale designed to assess the quality of studies of diagnostic accuracy included in systematic reviews.6

Meta-analysis

When more than one study examined the same imaging feature, the data were pooled and summary test characteristics were calculated from the hierarchical summary receiver operating characteristic (HSROC) curve model of Rutter and Gatsonis implemented in R software V.0.5.5.7 ,8 Summary ROC curves were then generated. In generating the HSROC, we did not assume similar thresholds across studies since a ‘positive’ test depended on observer judgment rather than objective measurement.
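For orientation, the HSROC model of Rutter and Gatsonis is commonly written in the following general form. This is a sketch of the standard formulation, not a description of the specific settings or priors used in this analysis, which are not stated here. The symbols are assumptions of this sketch: y_ij is the number of positive imaging results in study i and group j, n_ij is the group size, θ_i is the study-specific positivity threshold, α_i is the study-specific accuracy, and β is a shape parameter that allows an asymmetric curve.

```latex
\begin{align}
  y_{ij} &\sim \mathrm{Binomial}(n_{ij}, \pi_{ij}), \\
  \operatorname{logit}(\pi_{ij}) &= (\theta_i + \alpha_i X_{ij})\, e^{-\beta X_{ij}},
  \qquad
  X_{ij} =
  \begin{cases}
    +\tfrac{1}{2} & \text{gout group} \\
    -\tfrac{1}{2} & \text{comparator group}
  \end{cases} \\
  \theta_i &\sim N(\Theta, \sigma_\theta^2), \qquad
  \alpha_i \sim N(\Lambda, \sigma_\alpha^2)
\end{align}
```

Letting θ_i vary between studies is what corresponds to not assuming a common threshold for a ‘positive’ test.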

Results were compiled using Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) and Standards for Reporting of Diagnostic Accuracy (STARD) guidelines.9 ,10

Results

Study identification

A total of 1171 manuscripts and 88 abstracts were reviewed (figure 1). Among manuscripts identified, 884 were excluded after review of the title and abstract, 338 were excluded after review of the paper, and one duplicate was excluded. Among ACR and EULAR meeting abstracts identified, 88 were excluded after review and additional information was sought in three. Of these, only one response was received; this abstract was excluded as the classification of gout cases was based on 1977 ARA clinical criteria rather than MSU crystal confirmation. A total of 11 studies were included in the analysis: nine full length manuscripts11–19 and two meeting abstracts.20 ,21 Seven studies examined US, three studies examined DECT and one examined X-ray features of the sternomanubrial joint.

Figure 1

Search results. Ovid Medline, PubMed, Embase and Cochrane databases were searched using the search strategy in the online supplementary file 1. In addition, proceedings from the American College of Rheumatology (ACR) and European League Against Rheumatism (EULAR) annual meetings from 2007–2012 were searched for relevant abstracts.

Quality assessment

Overall, the included studies met most of the quality indicators of the QUADAS tool (figure 2 and online supplementary table S1). The most common quality issues were failure to report the time between the arthrocentesis confirming MSU crystals (reference test) and the imaging (index) test, and failure to report withdrawals or uninterpretable results.

Figure 2

Methodological quality as assessed using the QUADAS tool. The vertical axis lists the individual quality metrics and the horizontal axis shows the proportion of studies meeting each criterion (in green). Yellow signifies that it was unclear whether the study met the quality metric (usually because it was not reported) and red signifies that the study specifically did not meet that metric.

Patient and study characteristics

Study characteristics are shown in table 1. Most studies were single centre (with the exception of Naredo et al14) case–control or cross-sectional studies comparing gout to other types of arthritis. Patients were generally referred to the study with joint swelling and were recruited from secondary care clinics. In the four studies that reported disease duration, the mean duration of gout ranged from 7 to 13 years. However, half of the patients in one study (Bongartz et al19) had symptom duration of <6 weeks. In most studies, both active joints and inactive joints were included in the analysis. Arthrocentesis was performed in all patients with gout, although it was often not clear when the arthrocentesis occurred relative to the imaging test. Only half of the studies reported performing arthrocentesis in the control/comparator patients.

Table 1

Patient characteristics

Imaging features

A variety of imaging features were examined in the studies included (table 2). There was also substantial variation in the joints examined in each study (table 2). In the studies examining US, most of the sonographers were rheumatologists with training in musculoskeletal US (5/7 studies; two studies did not report the sonographer's training). Four of seven US studies utilised sonographers blinded to the patient's diagnosis, one study had one blinded and one unblinded sonographer, and two studies did not report whether the sonographer was blinded. In all three DECT studies, the images were interpreted by musculoskeletal radiologists who were blinded to the diagnosis.

Table 2

Study characteristics

Pooled results

Only three imaging features were examined in more than one study: the double contour sign (DCS) on US, presence of tophus on US, and MSU crystal deposition on DECT. Pooled results are presented in table 3. The pooled (95% CI) sensitivity and specificity of DCS were 0.83 (0.72 to 0.91) and 0.76 (0.68 to 0.83), respectively. The pooled (95% CI) sensitivity and specificity for tophus on US were 0.65 (0.34 to 0.87) and 0.80 (0.38 to 0.96), respectively. DECT had pooled (95% CI) sensitivity and specificity of 0.87 (0.79 to 0.93) and 0.84 (0.75 to 0.90). The summary ROC curves are shown in figure 3.
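Pooled sensitivity and specificity can also be expressed as likelihood ratios, which some readers find easier to interpret. The sketch below derives them from the pooled point estimates quoted above; the likelihood ratios are not reported in this review, the function name is hypothetical, and the calculation ignores the confidence intervals and the correlation between sensitivity and specificity.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (positive LR, negative LR) from point estimates."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Pooled point estimates as quoted in the text (sensitivity, specificity)
pooled = {
    "US double contour sign": (0.83, 0.76),
    "US tophus": (0.65, 0.80),
    "DECT": (0.87, 0.84),
}

for feature, (sens, spec) in pooled.items():
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    print(f"{feature}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

These values are rough derived quantities only; they should not be read as results of the meta-analysis.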

Table 3

Meta-analysis results

Figure 3

Hierarchical summary receiver operating characteristic (HSROC) curves for (a) ultrasound double contour sign, (b) tophi on ultrasound, and (c) DECT. The closed points represent the individual studies in the review. The open point represents the pooled sensitivity and specificity estimate, and the enclosed shape represents the bivariate 95% CI for that estimate.

Discussion

In this systematic review and meta-analysis, we found 11 studies examining the accuracy of imaging features for the classification of gout. Relatively few studies met the inclusion criteria requiring MSU crystal confirmation as the gold standard and the inclusion of a comparison group without gout. The three imaging findings examined in the pooled analysis had similar pooled specificity, and pooled sensitivity was high for both DCS and DECT but lower for US identification of tophi. The results available suggest that US and DECT may be useful to include in revised gout clinical classification criteria.

The value of each modality for classification of gout in terms of sensitivity and specificity in comparison to MSU crystal proven gout as the gold standard (rather than ACR criteria or physician diagnosis) has not previously been explored in a meta-analysis. Three previous systematic reviews have examined the usefulness of ultrasound as an outcome tool in gout. Chowalloor et al22 and Ottaviani et al23 provided an extensive review of the features of gout reported in US studies to date but did not focus on the diagnostic or classification properties of these features and did not perform a meta-analysis. Mathieu et al performed a systematic literature review and meta-analysis of the prevalence of ultrasound characteristics in gout.24 However, in examining the test properties of ultrasound, none of these reviews specifically restricted the gold standard to demonstration of MSU crystals. This is important because comparison of a new test to a reference standard that may or may not be accurate can lead to inflation or deflation of the sensitivity and specificity of the index test.
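The effect of an imperfect reference standard can be illustrated numerically. The sketch below is not part of the authors' analysis: it assumes that the index test and the reference standard err independently given true disease status, and all input values are hypothetical. Under that assumption the apparent sensitivity and specificity are pulled away from the true values (generally downwards); correlated errors between the two tests can push them in the other direction.

```python
def apparent_accuracy(
    prevalence: float,
    se_index: float,
    sp_index: float,
    se_ref: float,
    sp_ref: float,
) -> tuple[float, float]:
    """Apparent (sensitivity, specificity) of an index test judged against an
    imperfect reference, assuming conditionally independent errors."""
    # Probability that the reference classifies a patient as positive
    ref_pos = prevalence * se_ref + (1 - prevalence) * (1 - sp_ref)
    ref_neg = 1 - ref_pos
    # Joint probabilities of agreement between index test and reference
    both_pos = (prevalence * se_index * se_ref
                + (1 - prevalence) * (1 - sp_index) * (1 - sp_ref))
    both_neg = ((1 - prevalence) * sp_index * sp_ref
                + prevalence * (1 - se_index) * (1 - se_ref))
    return both_pos / ref_pos, both_neg / ref_neg


# Hypothetical scenario: a truly 90%/90% imaging test judged against
# clinical criteria that are themselves only 80% sensitive and specific.
se_app, sp_app = apparent_accuracy(0.5, 0.9, 0.9, 0.8, 0.8)
print(f"apparent sensitivity = {se_app:.2f}, apparent specificity = {sp_app:.2f}")
```

With a perfect reference (`se_ref = sp_ref = 1`) the function returns the true values, which is the rationale for restricting the gold standard to MSU crystal confirmation.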

Interpretation of the results reported in this study requires some important considerations. First, the patients studied had been diagnosed with gout for an average of at least 7 years in those studies reporting length of disease. These imaging modalities may perform differently in patients with early gout. It is this population of patients with earlier gout, most often without tophi, for which an accurate imaging technique would be most useful. Thus, further studies are needed to address this population. It is also important to note that we excluded studies examining the use of imaging modalities in patients with asymptomatic hyperuricaemia only, as the proposed new classification criteria will apply to people with symptomatic disease, rather than those with asymptomatic hyperuricaemia and/or asymptomatic MSU crystal deposition.2 Therefore, studies examining the use of imaging modalities to determine risk of symptomatic gout or the presence of subclinical gout in patients with asymptomatic hyperuricaemia were beyond the scope of this review.

A further issue when considering imaging for gout classification is the observation that all the studies involved patients in secondary care rheumatology clinics. Patients recruited from secondary care settings may have more complex and severe gout than those treated in primary care. Gout is mostly managed within primary care, and a key property of new classification criteria for gout is that they should be applicable to patients within a range of research settings, including primary care.2 ,25

We used MSU crystal identification as the gold standard, but even this test has some variability when performed by different investigators.26 However, this is the best gold standard available. Additionally, not all joints included in these imaging analyses were sites at which arthrocentesis had been performed. We do not believe this should substantially affect the results, particularly as this mirrors current clinical practice in which a patient is diagnosed or classified as having gout when multiple joints are inflamed but MSU crystals are identified on arthrocentesis from one joint. Finally, there may be a risk of misclassification bias in that not all comparator patients underwent arthrocentesis to confirm their ‘control’ status.

The methods employed by the included studies were, in general, satisfactory. However, the majority of studies utilised a case–control design. Such designs may exaggerate the diagnostic properties (sensitivity and specificity). Future studies may consider cross-sectional designs in which patients for whom the clinical question ‘does this patient have gout?’ arises are referred for participation. This type of design was implemented in some of the studies included.12 ,18 ,20 ,21 Finally, there was great variability in the study protocols used and the sites that were imaged. Standardisation of the methodology used for both ultrasound and DECT is needed. One of the goals of Naredo et al was to examine optimum sites for inclusion in US studies.14 At present, it is similarly unclear which sites are optimal for DECT imaging, and also which scanner settings are most appropriate to achieve optimal sensitivity and specificity for urate deposition.27

In summary, although imaging modalities such as ultrasound and DECT show promise in the classification of symptomatic gout, the studies to date have been small and have primarily involved people with longstanding, established disease. Determination of whether these imaging modalities should be included in the revised ACR/EULAR classification criteria for gout will occur at a consensus meeting adjacent to EULAR in Paris, France in June 2014. Future studies aiming to determine the usefulness of imaging modalities in the diagnosis of symptomatic gout should focus on patients with recent onset joint pain and swelling, and should use MSU crystal identification as the gold standard when determining test characteristics. Additional studies are also needed to determine which imaging modalities are optimal and to examine the relative contribution of imaging modalities over clinical elements to the classification of gout in clinical situations including primary care.

Acknowledgments

We thank Janet Joyce for performing the literature search and Yihui Connie Jiang for administrative support.
