Background: Doubts have been expressed about the performance of the American College of Rheumatology (ACR) clinical classification criteria for osteoarthritis when applied in the general population.
Objective: To investigate whether the distribution of population subgroups and underlying disease severity might explain the performance of these criteria in the population setting.
Methods: Population-based cross-sectional study. 819 adults aged ⩾50 years reporting knee pain in the last 12 months were clinically assessed by research therapists using standardised protocols and blinded to radiographic status. All participants underwent plain radiography of the knees, scored by a single reader blinded to clinical status. The relationship between fulfilling the ACR clinical classification criteria for knee osteoarthritis and the presence of symptomatic radiographic knee osteoarthritis was summarised for the sample as a whole and within subgroups.
Results: Radiographic osteoarthritis was present in 539 participants (68%) and symptomatic radiographic knee osteoarthritis in 259 (33%). 238 participants (30%) fulfilled the ACR clinical criteria for knee osteoarthritis. Agreement between the ACR clinical criteria and symptomatic radiographic knee osteoarthritis was low (sensitivity 41%; specificity 75%; positive predictive value 44%; negative predictive value 72%). Sensitivity and specificity did not vary markedly between population subgroups, although they were influenced by the underlying severity of radiographic osteoarthritis.
Conclusion: The ACR clinical criteria seem to reflect later signs in advanced disease. Other approaches may be needed to identify early, mild osteoarthritis in the general population and primary care.
- ACR, American College of Rheumatology
- BMI, body mass index
- CAS(K), clinical assessment study of the knee
- K&L, Kellgren and Lawrence
- NPV, negative predictive value
- PPV, positive predictive value
The clinical classification criteria developed by the American College of Rheumatology (ACR)1 remain a popular method of classifying knee osteoarthritis, recommended for clinical and epidemiological studies2 and the practice of primary care.3
However, caution is needed when applying classification criteria in circumstances different from those in which they were derived. Summary statistics like positive and negative predictive values change from one setting to another as a function of disease prevalence and so do the sensitivity, specificity and likelihood ratios, which were once believed to be invariant across settings and subgroups.4 These parameters are often conditional on other factors in the patient profile in addition to being related to the underlying spectrum of disease severity in any given population.
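The dependence of predictive values on prevalence can be written out explicitly. The block below is a standard restatement using Bayes' theorem, not a formula taken from this study; the symbols $p$ (prevalence of the reference condition), $Se$ (sensitivity) and $Sp$ (specificity) are notation introduced here for illustration.

```latex
\[
\mathrm{PPV} = \frac{Se \cdot p}{Se \cdot p + (1 - Sp)(1 - p)},
\qquad
\mathrm{NPV} = \frac{Sp \cdot (1 - p)}{Sp \cdot (1 - p) + (1 - Se) \cdot p}
\]
```

With $Se$ and $Sp$ held fixed, PPV rises and NPV falls as $p$ increases. If $Se$ and $Sp$ themselves shift with patient profile or disease severity, as noted above, the predictive values change for that reason as well.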
Doubts have been expressed about the validity of the ACR clinical criteria in the general population and primary care.5,6 In a population-based sample of symptomatic adults aged ⩾50 years, we examined how often the ACR clinical criteria (tree method) are satisfied in those with and without symptomatic radiographic knee osteoarthritis. We aimed to understand why the performance of these clinical classification criteria may be poorer in the general population. Specifically, we investigated the extent to which parameters such as sensitivity and specificity, used for summarising the “diagnostic accuracy” of the ACR clinical criteria, change between population subgroups and as a function of the underlying spectrum of disease severity.
The clinical assessment study of the knee
The clinical assessment study of the knee (CAS(K)) is a population-based prospective observational cohort study of 819 symptomatic patients, aged ⩾50 years, registered with three general practices (irrespective of their actual consultation pattern). The North Staffordshire local research ethics committee approved the study. Full details of the study design and methods have been previously presented.7,8 Between August 2002 and September 2003, respondents to two postal questionnaires who reported knee pain in the past 12 months were invited to attend a research clinic that included a standardised clinical interview and examination, and plain radiographs of both knees. Participants with “red flags” (recent trauma likely to be associated with considerable tissue damage; acute, hot, swollen joint) were excluded. Participants were asked for permission to review their general practice medical records. From this review, it was determined whether participants had consulted their general practitioner as a consequence of knee pain or osteoarthritis in the 18 months before attending the clinical assessment.
Three views of the knees were obtained for each participant at clinic; the weight-bearing posteroanterior semiflexed/metatarsophalangeal view according to the Buckland–Wright protocol,9 a skyline view and a lateral view. The skyline and lateral views were obtained in the supine position, with the knee flexed to 45°.
A single reader (RD) scored all films and was blinded to all questionnaire and clinical data. The tibiofemoral joint was assessed using a posteroanterior view and, for the posterior compartment, a lateral view. The patellofemoral joint was assessed using a skyline and a lateral view. A Kellgren and Lawrence (K&L) score was assigned to the posteroanterior and skyline views using the original written description.10 For the lateral view a standard atlas11 was used to score superior and inferior osteophytes. Posterior tibial osteophytes do not appear in this atlas, but were judged on the same basis of severity as those osteophytes shown in the lateral view.
Defining participants with “symptomatic radiographic knee osteoarthritis”
The presence of any radiographic osteoarthritis in the knee joint was defined as: K&L score ⩾2 in the posteroanterior or K&L score ⩾2 in the skyline or the presence of superior or inferior patella osteophytes in the lateral view or the presence of posterior tibial osteophytes in the lateral view. The definition of moderate or severe radiographic osteoarthritis was based on the worst score at any location within each knee—for example, if a participant scored posteroanterior K&L = 3, skyline K&L = 2, lateral osteophytes = 0 and posterior osteophytes = 2, he or she was assigned to the moderate or severe group. Table 1 shows the definitions of radiographic severity used for the whole knee joint.
All the participants recruited to this study had knee pain during the past year, but they were also asked to report whether they had experienced knee pain, aching or stiffness on most days in the past month. A positive response to this question and the presence of radiographic osteoarthritis in their index knee defined the participant as having “symptomatic radiographic OA”. The combination of symptoms and pathology has been proposed as the most compelling clinical definition of osteoarthritis12 and an operational definition similar to that applied in this study has been used as a reference standard in other population-based studies.13 In the current analysis, this definition of symptomatic radiographic knee osteoarthritis was treated as the “reference standard” for comparing with the ACR clinical classification.
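The two definitions above can be sketched as a pair of small functions. This is an illustrative encoding only, not the study's own analysis code: the class and field names are hypothetical, and osteophytes on the lateral view are passed in as present/absent because the grading thresholds in Table 1 are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class IndexKneeFilms:
    """Radiographic findings for one index knee (hypothetical field names)."""
    pa_kl: int                          # K&L score, posteroanterior view
    skyline_kl: int                     # K&L score, skyline view
    lateral_patellar_osteophyte: bool   # superior or inferior patellar osteophyte
    posterior_tibial_osteophyte: bool   # posterior tibial osteophyte, lateral view


def has_radiographic_oa(films: IndexKneeFilms) -> bool:
    # Any one of the four conditions is sufficient.
    return (
        films.pa_kl >= 2
        or films.skyline_kl >= 2
        or films.lateral_patellar_osteophyte
        or films.posterior_tibial_osteophyte
    )


def has_symptomatic_radiographic_oa(
    films: IndexKneeFilms, symptoms_most_days_past_month: bool
) -> bool:
    # Reference standard: frequent symptoms AND radiographic osteoarthritis.
    return symptoms_most_days_past_month and has_radiographic_oa(films)
```

For example, a knee with posteroanterior K&L = 1, skyline K&L = 2 and no lateral osteophytes counts as radiographic osteoarthritis, but meets the reference standard only if the participant also reported symptoms on most days in the past month.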
The clinical interview included a question about the presence and duration of morning stiffness. The presence of definite palpable bony enlargement at the knee and definite palpable coarse crepitus on transferring from sitting to standing were rated by the assessor on examination. Assessments were conducted by one of the six research therapists blinded to the findings from radiography, postal questionnaires and medical records. Training of the assessors took place before the study and was updated throughout the study period after every 100 participants were recruited. This took the form of comparisons against rheumatologists, open and blinded comparisons against each other using “expert patients” and peer observation. Assessors were issued a manual of detailed protocols for assessing each sign and symptom.
Defining participants with clinical knee osteoarthritis using the ACR criteria
The decision tree format of the ACR classification criteria for clinical knee osteoarthritis was applied to the collected examination data.1 Although the traditional “3 out of 6” format has been the more widely promulgated version, the decision tree format was recommended by the authors in their original publication. As all participants in our study satisfied the age criterion, the ACR clinical criteria for knee osteoarthritis were fulfilled in the following circumstances:
crepitus + and morning stiffness >30 min and bony enlargement +; or
crepitus + and morning stiffness ⩽30 min; or
crepitus − and bony enlargement + (where + = present, − = absent).
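The three branches listed above can be expressed as a short function. This is a minimal sketch under the same assumption made in the text, namely that the age criterion is already satisfied; the function and parameter names are hypothetical.

```python
def fulfils_acr_clinical_criteria(
    crepitus: bool,
    morning_stiffness_minutes: float,
    bony_enlargement: bool,
) -> bool:
    """Decision tree format of the ACR clinical criteria for knee osteoarthritis.

    Assumes the age criterion is met, as it was for every participant
    in this study.
    """
    if crepitus:
        if morning_stiffness_minutes > 30:
            # Crepitus with prolonged stiffness also needs bony enlargement.
            return bony_enlargement
        # Crepitus with morning stiffness of 30 min or less is sufficient.
        return True
    # Without crepitus, bony enlargement alone fulfils the criteria.
    return bony_enlargement
```

Under this encoding a participant with crepitus, 45 minutes of morning stiffness and no bony enlargement does not fulfil the criteria, whereas the same participant with 10 minutes of stiffness does.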
Weight and height data were measured to calculate body mass index (BMI), which was classified according to the World Health Organization criteria.
In this study only one knee per participant was analysed, the “index knee”. In patients with unilateral knee pain the “index” was the single painful knee; in those with bilateral knee pain it was the most painful knee. In situations where participants felt both knees were similarly painful, the index knee was selected at random. The prevalence of symptomatic radiographic knee osteoarthritis along with the positive predictive value (PPV), negative predictive value (NPV), sensitivity, specificity, and positive and negative likelihood ratios (LR+ and LR−) of the ACR clinical classification criteria were calculated (with 95% confidence intervals (CIs)) for the whole sample and then for each subgroup (age, sex, BMI and consulting status). To determine whether the performance of the ACR clinical criteria was dependent on the underlying spectrum of radiographic disease, we compared the sensitivity of the ACR criteria in those with isolated patellofemoral joint disease versus those with involvement of the tibiofemoral joint, and in those with mild versus severe disease. Both of these comparisons were restricted to those with knee symptoms on most days in the previous month.
Complete radiographs and clinical data were available for 788 of 819 participants. Table 2 shows their descriptive characteristics.
ACR clinical criteria versus symptomatic radiographic knee osteoarthritis
In all, 105 (41%) of the 259 participants classified as having symptomatic radiographic knee osteoarthritis (symptoms on most days in the past month and radiographic evidence of definite osteoarthritis) fulfilled the ACR clinical criteria for knee osteoarthritis. This proportion was slightly lower among those with radiographic osteoarthritis but less frequent symptoms (90/278; 32%). In those participants with no radiographic osteoarthritis the proportion fulfilling the ACR criteria was even lower (43/251; 17%). Fulfilling the ACR clinical criteria was related more to the presence of radiographic osteoarthritis than to the frequency of symptoms in the past month.
Performance of the ACR clinical criteria in population subgroups
Table 3 presents the correspondence between symptomatic radiographic knee osteoarthritis (dichotomised as present or absent) and the ACR clinical criteria (fulfilled or not fulfilled).
The first line of the table shows that for the whole sample, 259 (33%) of the 788 had symptomatic radiographic knee osteoarthritis. The PPV for ACR clinical criteria was 44% and NPV was 72%. Sensitivity was 41% and specificity was 75%; that is, applying the ACR clinical criteria results in a high proportion of “false negatives”. The positive and negative likelihood ratios were small (ie, applying the ACR clinical criteria did little to change the pre-test probability of symptomatic radiographic knee osteoarthritis). When the sample was divided into subgroups according to sex, age, BMI and consultation, the PPV and NPV varied in a predictable fashion: in groups with higher prevalence of symptomatic osteoarthritis (eg, obese), PPV increased and NPV decreased. Sensitivity, specificity and likelihood ratios were largely invariant across each of the subgroups.
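The whole-sample figures can be checked from a 2×2 table. The sketch below is illustrative and not the authors' code. The cell counts are reconstructed from proportions reported in the text (105/259 fulfilling the criteria among those with symptomatic radiographic osteoarthritis, and 90/278 plus 43/251 among those without), not read from Table 3, and the function name is hypothetical.

```python
def diagnostic_summary(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Summary statistics for a 2x2 table of index test versus reference standard."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "prevalence": (tp + fn) / (tp + fp + fn + tn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


# Counts reconstructed from the text (an assumption, see the note above):
# 105 of 259 reference-positive participants fulfilled the ACR criteria;
# 90 + 43 = 133 of the 529 reference-negative participants also fulfilled them.
whole_sample = diagnostic_summary(tp=105, fp=133, fn=259 - 105, tn=529 - 133)

for name, value in whole_sample.items():
    print(f"{name}: {value:.2f}")
```

On these counts the likelihood ratios come out at roughly 1.6 and 0.8, which is why a positive or negative classification shifts the pre-test probability so little.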
Performance of the ACR clinical criteria by disease severity and compartmental distribution
The sensitivity of the ACR criteria was noticeably higher in those with involvement of the tibiofemoral joint and in those with more severe radiographic knee osteoarthritis (table 4). However, the likelihood ratios even in symptomatic severe radiographic knee osteoarthritis were still small.
In developing the ACR classification criteria for knee osteoarthritis, Altman et al recognised that classification criteria are often criticised for their imperfections.1 We sought to understand why caution is needed when applying them in different settings and populations, and to explore some of the underlying reasons.
Our findings suggest that much of what might be classified as symptomatic radiographic knee osteoarthritis (frequent symptoms and radiographic evidence of disease) in the general population and primary care does not fulfil the ACR clinical criteria. Conversely, not fulfilling the ACR clinical criteria does little to help rule out the presence of symptomatic osteoarthritis in older adults with knee pain. Interestingly, both approaches yielded essentially the same prevalence estimate of knee osteoarthritis in this sample (33% and 30%, respectively), although clearly they did not pick up the same participants.
Contrary to what we had expected from studies of other conditions, we found little variation in the relationship between the ACR clinical criteria and the presence of symptomatic radiographic knee osteoarthritis across different population subgroups. The performance of the ACR clinical criteria was, however, linked to the underlying disease severity. Crepitus, morning stiffness and bony enlargement were found more often in those with more advanced disease. Our finding of increased sensitivity with more severe disease is consistent with findings from other diagnostic studies.14
The estimate of sensitivity for the whole sample in the current study (41%) is markedly lower than that reported in the original development of the criteria (89%). A direct comparison of the two estimates is difficult as the reference standard used in the original study was clinical diagnosis of knee osteoarthritis verified by three independent expert panel members. Our study used the combination of frequent knee symptoms and radiographic evidence of definite osteoarthritis. Our definition of radiographic osteoarthritis applied the same principle as earlier studies (ie, definite osteophyte15), but extended this to all three views of the knee.16 Although some previous definitions have also included cases with joint space narrowing in the presence of doubtful osteophytes,13 there were few people in our study who satisfied this definition (only 10/245 people with no definite osteophyte had evidence of joint space narrowing). We therefore did not incorporate this additional group. In the original study by Altman et al,1 108 of 115 cases with knee pain and radiographic evidence of an osteophyte received a clinical diagnosis of knee osteoarthritis (22/122 who did not have an osteophyte received the clinical diagnosis of knee osteoarthritis). Given the high agreement between these two reference standards, it seems unlikely that the poorer performance of the clinical classification criteria in our study is explained by the use of a different reference standard. However, the possibility remains that the selection of patients in the original ACR study was influenced by unmeasured clinician perceptions related to the criteria. These would not have applied to the sample described here.
Observer variability either in assessing the ACR clinical criteria or in the reference standard (ie, scoring radiographs and reporting the frequency of knee symptoms in the previous month) would be expected to reduce our ability to observe any true association between these two. Intra-reader and inter-reader reliability for posteroanterior K&L score, skyline K&L score and lateral osteophytes were both good (κ 0.81–0.98 and 0.49–0.76, respectively). In our earlier pilot studies, with three of the assessors, interobserver and intraobserver reliability for applying the ACR clinical classification criteria were lower (observed agreement 78% and 87%; κ 0.30 and 0.66, respectively).17 The component with lowest reliability was palpation of coarse crepitus (observed agreement 61% and 77%; κ 0.22 and 0.53).18 When we re-analysed our results separately for each of the six assessors, the sensitivities (and specificities) of the ACR clinical criteria were 6.7% (95.6%), 41.4% (64.3%), 43.3% (63.1%), 47.2% (76.1%), 54.0% (72.0%) and 57.1% (74.1%). We found evidence of a gradual improvement over the course of the study period (first quarter sensitivity = 0.36 and last quarter sensitivity = 0.50), but at the expense of a reduction in specificity (0.81 and 0.72, respectively). Observer variability in assessing the ACR clinical criteria is thus a contributing factor to our findings. However, rather than being viewed as a major limitation, this may actually give a better indication of what happens when these criteria are applied in non-specialist settings by clinicians with average capabilities.19
Throughout, we have used traditional diagnostic terminology (sensitivity, specificity and so forth) although it should be recognised that our analysis is more a comparison of two available alternatives for classifying knee osteoarthritis than a diagnostic study in which we have a clear gold standard (hence we use the term “reference standard” to describe the combination of frequent symptoms with definite radiographic osteoarthritis). Both approaches have their weaknesses in identifying mild, early osteoarthritis in the general population and primary care. The ACR clinical classification criteria seem to reflect later signs in advanced disease, and, in our experience, may be difficult to assess reliably. The co-occurrence of frequent symptoms with definite osteophyte may be coincidental in many cases. Restricting the definition of symptomatic radiographic osteoarthritis to those with frequent symptoms also excludes people with obvious structural disease but intermittent symptoms. We have argued, as have others, that after excluding red flags, specific inflammatory disease and other non-articular and extra-articular causes, the more important distinctions may be on the basis of the persistence and severity of pain and associated disability and on the basis of the presence or absence of modifiable risk factors for symptom and disease progression.20 Descriptive studies aimed at characterising “early stages” in both clinical and disease terms should complement classifications used for case definition in osteoarthritis research.
We acknowledge the contributions of Krysia Dziedzic, June Handy, Jonathan Hill, Helen Myers and Ross Wilkie to data collection and study design and thank the administrative and health informatics staff at Keele University’s Primary Care Sciences Research Centre, and the staff of the participating general practices and Haywood Hospital, especially Dr Jackie Saklatvala, Carole Jackson and the radiographers at the Department of Radiology.
Published Online First 20 April 2006
Ethical approval: The study was approved by the North Staffordshire Local Research Ethics Committee (Project No 1430).
Funding: This study is supported financially by a Programme Grant awarded by the Medical Research Council, UK (grant code: G9900220) and Support for Science funding secured by the North Staffordshire Primary Care Research Consortium for NHS service support costs. The funding sources for this study had no involvement in the study design, collection, analysis and interpretation of data, the writing of this report or the decision to submit this paper for publication.
Competing interests: None.