OBJECTIVE To compare the reliability of quantitative measurement of minimum hip joint space with a qualitative global assessment of radiological features for estimating the prevalence of primary osteoarthritis (OA) of the hip in colon radiographs.
METHODS All colon radiographs from patients aged 35 or older, taken at three different radiographic departments in Iceland during the years 1990–96, were examined. A total of 3002 hips in 638 men and 863 women were analysed. Intraobserver and interobserver reliability was assessed by measuring 147 randomly selected radiographs (294 hips) twice by the same observer, and 87 and 98 randomly selected radiographs (174 and 196 hips) by two additional independent observers. Minimum hip joint space was measured with a millimetre ruler, and global assessment of radiological features by a published atlas.
RESULTS With a minimum joint space of 2.5 mm or less as definition for OA, 212 hips were defined as having OA. When the global Kellgren and Lawrence assessment with grade 2 (definite narrowing in the presence of definite osteophytes) or higher as definition for OA was used, 202 hips showed OA. However, only 166 hips were diagnosed as OA with both systems. With 2.0 or 3.0 mm minimum joint space as cut off point, the difference between the two methods increased. Both intrarater and interrater reliability were significantly higher with joint space measurement than with global assessment.
CONCLUSIONS Overall prevalence of radiological OA was similar with the two methods. However, the quantitative measurement of minimum hip joint space had a better within-observer and between-observer reliability than qualitative global assessment of radiographic features of hip OA. It is thus suggested that minimum joint space measurement is a preferable method in epidemiological studies of radiological hip OA.
Most population studies of osteoarthritis (OA) prevalence have used the radiographic classification developed by Kellgren and Lawrence.1 The system is based on qualitative assessment and grading of joint space narrowing and osteophytes, generating a composite global score. This scoring system has, however, shown a low interrater reliability.1-3 Further, the emphasis on osteophytes as the initial defining feature of hip OA has been questioned.4 Other investigators have used assessment of individual radiological joint features, such as minimum joint space, to define OA.2,5-7
We have previously estimated the prevalence of primary OA of the hip in Iceland, using colon radiographs and a cut off point of 2.5 mm or less of minimum hip joint space as the definition of hip OA.8 The purpose of the present investigation was to compare the Kellgren and Lawrence system and the minimum joint space classification system for the assessment of primary hip OA.
Material and methods
For this study the same radiographs were used as in our previous study of hip OA in an Icelandic population.8 Thus all colon radiographs (double contrast, barium enema) taken at three different radiographic departments in Iceland during the years 1990–96 were examined. Only radiographs from patients aged 35 or older at the time of the colon examination were used. In total, radiographs from 1530 patients (653 men, 877 women) were analysed.
The patients were referred for radiography from four different hospitals, as well as from the primary healthcare system. They were from both rural and urban areas. The radiograms examined represent approximately 40% of all colon radiographs taken in Iceland during this seven year period.
Of the 1530 radiographs, 29 were excluded. In eight, radiographs of both hips were not clearly visualised and in five, signs of secondary OA were seen. Forty nine hips were operated on with hip arthroplasty, and for 34 of these preoperative x rays were available. Fifteen hips in 13 patients were thus excluded because of a lack of preoperative radiograms. Three patients had been operated on with arthroplasty because of hip fracture and were excluded. A total of 3002 hips in 638 men and 863 women therefore remained for analysis. Intraobserver and interobserver reliability was assessed by measuring 147 randomly selected radiographs (294 hips) twice by the same observer (TI), and 87 and 98 randomly selected radiographs (174 and 196 hips) by two independent observers (reader 1 and reader 2). Data reported in this study for joint space measurements and Kellgren and Lawrence gradings are from a single observer (TI).
The double contrast (barium enema) colon radiographs included at least two supine anteroposterior (AP) and several oblique exposures. The hip joints in this study were assessed from an AP control radiograph, which was taken with the same tube to film distance of 100 cm that is used in a standard AP view of the pelvis. To be included in this investigation both hips had to be clearly visualised on an AP film. The age of the patient at the time of the colon examination, and signs of secondary OA and hip operations were registered. Hips with signs of secondary OA (congenital dislocation or dysplasia, Perthes' disease, slipped epiphysis) were excluded from further analysis. Clinical information was sought in hospital records for patients who had been operated on with total hip replacement, and their primary diagnosis was established.
RADIOGRAPHIC CLASSIFICATION SYSTEMS
Minimum hip joint space was measured on the AP film with a ruler divided in millimetres.2 A minimum joint space of ⩽2.5 mm was used as a definition of OA of the hip.2,8 Global joint assessment was done according to Kellgren and Lawrence as described in the Atlas of Standard Radiographs of Arthritis.1,9 Hips classified as grade 2 (definite narrowing in the presence of definite osteophytes) or higher were defined as having OA.
Non-parametric statistical methods were used for group comparisons. For estimates of interobserver and intraobserver reliability, the κ statistic was used for categorical variables, and the intraclass correlation coefficient for continuous variables.
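As an illustration of the κ statistic mentioned above, the following is a minimal sketch of unweighted Cohen's κ for two sets of categorical readings. The function name `cohen_kappa` and the example readings are hypothetical and are not taken from the study data; the exact κ variant and software used by the authors are not specified in this section.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length sequences of labels."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of items on which the two readings agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from the marginal label frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[label] * count_b[label] for label in labels) / n**2
    if expected == 1.0:
        return 1.0  # both readings use a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical example: 1 = OA, 0 = no OA, ten hips read twice.
first_reading = [1, 1, 0, 0, 0, 0, 1, 0, 0, 0]
second_reading = [1, 0, 0, 0, 0, 0, 1, 0, 0, 0]
example_kappa = cohen_kappa(first_reading, second_reading)
print(f"kappa = {example_kappa:.3f}")
```

κ corrects the raw proportion of agreement for the agreement expected by chance, which matters here because most hips are classified as not having OA by either reading.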
PREVALENCE OF OA
The mean minimum joint space in the 3002 hips was 3.97 (SD 0.68) mm.8 With a minimum joint space of ⩽2.5 mm as a cut off point for the presence of radiological hip OA, 212 hips in 151 patients (71 men, 80 women) were diagnosed as having OA (table 1). The mean age at colon examination for these patients was 68.0 years (range 35–89).
With the use of the Kellgren and Lawrence system with grade 2 or higher as a cut off point for the presence of radiological OA, 202 hips in 137 patients (63 men, 74 women) were diagnosed as having OA (table 1). The mean age at colon examination for these patients was 67.5 years (38–88).
The overall prevalence of hip OA in the population was thus 10.0% using the minimum joint space criterion and 9.2% using the Kellgren and Lawrence system.
COMPARISON OF METHODS
With a 2.5 mm minimum joint space as the cut off point for OA, 166 hips were classified as having OA by both methods, while 46 hips showed OA with the minimum joint space criterion only and 36 with the Kellgren and Lawrence system only (fig 1A). By using 2.0 or 3.0 mm as cut off points, the difference between the two techniques increased (figs 1B and 1C) (table 1).
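The overlap between the two methods at the 2.5 mm cut off point can be summarised as a 2 × 2 table. The sketch below uses the hip counts reported above (166, 46, and 36); the number of hips negative by both methods is an inference, obtained by subtraction from the 3002 hips analysed. The derived agreement figures, including κ, are illustrative calculations and are not values reported by the authors.

```python
# Counts reported in the text (hips, 2.5 mm cut off point).
both_positive = 166      # OA by both methods
joint_space_only = 46    # OA by minimum joint space only
kl_only = 36             # OA by Kellgren and Lawrence only
total_hips = 3002

# Inferred by subtraction (assumption: every hip was graded by both methods).
both_negative = total_hips - (both_positive + joint_space_only + kl_only)

# Marginal totals should reproduce the reported 212 and 202 hips.
joint_space_positive = both_positive + joint_space_only
kl_positive = both_positive + kl_only

# Raw agreement and chance-corrected agreement (unweighted Cohen's kappa).
observed = (both_positive + both_negative) / total_hips
expected = (
    joint_space_positive * kl_positive
    + (total_hips - joint_space_positive) * (total_hips - kl_positive)
) / total_hips**2
method_kappa = (observed - expected) / (1.0 - expected)

# Share of hips positive by either method that are positive by both.
positive_overlap = both_positive / (both_positive + joint_space_only + kl_only)

print(f"observed agreement: {observed:.3f}")
print(f"kappa between methods: {method_kappa:.3f}")
print(f"overlap among positives: {positive_overlap:.3f}")
```

Because the great majority of hips are negative by both methods, raw agreement is high even though only about two thirds of the hips flagged by either method are flagged by both.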
Both the intraobserver and interobserver reliability were higher for all readers with the minimum joint space criterion than with the Kellgren and Lawrence grading system (table 2). When joint space width was used there was intraobserver agreement for OA classification in 296/296 radiographs, whereas for the Kellgren and Lawrence grading there was agreement in 286/296 radiographs (p<0.004). For joint space width there was interobserver agreement in classification of OA in 171/174 radiographs for readers 1 v 2, while for Kellgren and Lawrence classification the corresponding agreement was 162/174 (p<10⁻⁷). For readers 1 v 3, the corresponding agreement was 188/196 and 181/196, respectively (p<0.01).
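For readers who prefer percentages, the agreement counts quoted above can be converted as follows. This is a small arithmetic sketch only: the counts are copied from the text, the reader labels follow the text as written, and the helper name `percent_agreement` is hypothetical. It does not attempt to reproduce the reported p values.

```python
def percent_agreement(agreed, total):
    """Percentage of readings on which the OA classification agreed."""
    if total <= 0 or not 0 <= agreed <= total:
        raise ValueError("need 0 <= agreed <= total and total > 0")
    return 100.0 * agreed / total


# (agreed, total) pairs as quoted in the text.
reported_counts = {
    ("intraobserver", "joint space"): (296, 296),
    ("intraobserver", "Kellgren and Lawrence"): (286, 296),
    ("readers 1 v 2", "joint space"): (171, 174),
    ("readers 1 v 2", "Kellgren and Lawrence"): (162, 174),
    ("readers 1 v 3", "joint space"): (188, 196),
    ("readers 1 v 3", "Kellgren and Lawrence"): (181, 196),
}

percentages = {
    key: round(percent_agreement(*counts), 1)
    for key, counts in reported_counts.items()
}

for (comparison, method), value in percentages.items():
    print(f"{comparison:14s} {method:22s} {value:5.1f}%")
```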
A gold standard is lacking for the radiographic definition of hip OA. At present, several different radiographic classification systems are used. Comparison and harmonisation of these systems is desirable to facilitate comparison between prevalence studies.
The qualitative grading system based on the original suggestions by Kellgren and Lawrence has in several investigations shown a high within-observer and between-observer variability.1,2,5,7 Several alternatives to the Kellgren and Lawrence system have therefore been proposed.10-12 Grading scales using individual radiographic features have been developed to assess the prevalence, progression, and ultimately, the significance of individual radiographic features singly, and in combination, for OA of the hip. Most of these grading systems for OA of the hip are based on individual features that together contribute to the Kellgren and Lawrence global score, that is, osteophytes, joint space narrowing, subchondral sclerosis, cysts, and deformity.
A previous study showed that the quantitative measurement of hip joint space was more reproducible than the qualitative assessment of osteophytes, sclerosis, or an overall qualitative assessment.2 Minimum joint space measurement and the overall qualitative assessment showed similar association with pain. This association was stronger than between hip pain and hip osteophytes, suggesting that joint space may be the better surrogate measure for hip OA.
We have in this study extended previous observations by directly comparing the two most commonly used methods for the assessment of radiological hip OA: the quantitative measurement of minimum joint space by ruler, on the one hand, and the qualitative grading of joint space and osteophytes by atlas, on the other. Our comparison focused on the prevalence of radiological hip OA resulting from the alternative use of the two methods, and their reliability and agreement.
METHODOLOGY AND RELIABILITY OF COLON RADIOGRAPHY FOR HIP ASSESSMENT
Radiography of the colon might underestimate minor structural changes compared with radiographs that are optimally exposed for the hip joint. Earlier studies, however, have found good agreement between colon and hip joint radiographs both for prevalence and degree of hip OA.1,13 Recent studies further suggest that obesity is linked to colon cancer and adenoma.14 If it is assumed that these subjects more commonly undergo colon radiography, obese subjects may be overrepresented in the group studied here. However, the linkage between obesity and hip OA is uncertain.15 In any case, similar methods (assessment of minimum hip joint space on AP colon radiograms) were used in our work and in the studies in Sweden and Denmark, which form the main basis for our comparisons with Icelandic hip OA prevalence.8
PREVALENCE OF OSTEOARTHRITIS OF THE HIP
We found an almost equal prevalence of OA when comparing quantitative joint space measurement with a cut off point of 2.5 mm with the Kellgren and Lawrence qualitative grading system with grade 2 (definite narrowing in the presence of definite osteophytes) or more as cut off point. However, when these criteria for radiological OA were used, only 166 of 248 hips were defined as having OA with both systems (fig 1A). A change in cut off point for joint space width to either 2.0 or 3.0 mm decreased the level of agreement between the two methods (figs 1B and 1C).
RELIABILITY OF THE TWO MEASURING METHODS
In this study, quantitative measurement of joint space showed a significantly higher intrarater and interrater reliability than the qualitative Kellgren and Lawrence global assessment for identifying radiological hip OA (table 2). This was true for all three of the readers of radiographs who participated in this study. For this comparison we used weighted κ values for categorical values, and intraclass correlation coefficients for continuous values. These statistical measures of reliability have been shown to be equivalent.16
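For the continuous joint space measurements, an intraclass correlation coefficient can be computed from repeated readings of the same hips. The sketch below implements the one-way random effects form, ICC(1,1). This particular form is an assumption, since the text does not state which ICC variant was used, and the measurements shown are hypothetical values in millimetres rather than study data.

```python
def icc_one_way(measurements):
    """One-way random effects ICC(1,1).

    `measurements` is a list of rows, one per subject (hip), each holding
    the same number k >= 2 of repeated readings.
    """
    n = len(measurements)
    if n < 2:
        raise ValueError("need at least two subjects")
    k = len(measurements[0])
    if k < 2 or any(len(row) != k for row in measurements):
        raise ValueError("every subject needs the same number (>= 2) of readings")

    subject_means = [sum(row) / k for row in measurements]
    grand_mean = sum(subject_means) / n

    # Between-subject mean square.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square (disagreement between repeated readings).
    ms_within = sum(
        (value - mean) ** 2
        for row, mean in zip(measurements, subject_means)
        for value in row
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical minimum joint space readings (mm): four hips, two readings each.
example_readings = [
    [4.0, 4.0],
    [3.0, 3.5],
    [2.0, 2.5],
    [5.0, 4.5],
]
example_icc = icc_one_way(example_readings)
print(f"ICC(1,1) = {example_icc:.3f}")
```

The coefficient approaches 1 when differences between hips dominate the differences between repeated readings of the same hip.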
JOINT SPACE WIDTH
The average minimum hip joint space width in this study was 3.97 mm (SD 0.68 mm).8 The repeated measures SD for reader 1 was 0.38 mm (data not shown). Thus the estimated measurement error was 0.68 × 0.38, or 0.25 mm. Using these results, a cut off point of 2.5 mm for the presence of OA, and assuming a normal distribution of hip joint space width,8 we estimate that the chance that a hip radiograph found to have a minimum joint space width ⩽2.5 mm does not have OA is 3.6%.
We have confirmed and extended previous studies by directly comparing two methods for assessment of OA in hip radiographs. Although the two methods resulted in a similar overall prevalence of radiological OA, our results show that the quantitative measurement of minimum hip joint space has a significantly better within-observer and between-observer reliability than a qualitative global assessment. We thus suggest that a minimum joint space measurement is a preferable method in epidemiological studies of radiological hip OA. The most suitable method for studies of hip OA defined by a combination of radiological signs and symptoms has not been determined, and may differ for different contexts. However, it was shown that minimum joint space is as well correlated with symptoms as a qualitative global assessment.2
The authors thank Hjörtur H Jónsson at the statistical department of DeCode Genetics for help and statistical advice. We also thank the staff at the radiological department at the Central Hospital of Akureyri, and the staff at the radiological department of Domus Medica, Reykjavík, Iceland, for their assistance.
Supported by the Scientific Foundation of Akureyri Central Hospital, the Swedish Medical Research Council, Lund Medical Faculty and University Hospital, the King Gustaf V 80-year Fund, and the Kock Foundations.