Background: The WOMAC (Western Ontario and McMaster Universities) function subscale is widely used in clinical trials of hip and knee osteoarthritis. Reducing the number of items of the subscale would enhance efficiency and compliance, particularly for use in clinical practice applications.
Objective: To develop a short form of the WOMAC function subscale based on patients’ and experts’ opinions (WOMAC function short form).
Methods: WOMAC function subscale data (Likert version) were obtained from 1218 outpatients with painful hip or knee osteoarthritis. These patients and their rheumatologists selected the five items that they considered most in need of improvement. The rheumatologists were asked to select the five items for which patients in general are the most impaired. Items that were least important to patients and experts, those with a high proportion of missing data, and those with a response distribution showing a floor or ceiling response were excluded, along with one of a pair of items with a correlation coefficient >0.75.
Results: The WOMAC function short form included items 1, 2, 3, 6, 7, 8, 9, and 15 of the long form. The short form did not differ substantially from the long form in responsiveness (standardised response mean of 0.84 v 0.80).
Conclusions: A short form of the WOMAC function subscale was developed according to the views of patients and rheumatologists, based on the responses of 1218 patients and 399 rheumatologists. The clinical relevance and applicability of this WOMAC function subscale short form require further evaluation.
- ICC, intraclass correlation coefficient
- SRM, standardised response mean
- WOMAC, Western Ontario and McMaster Universities osteoarthritis index
- treatment outcome
One of the major uses of health measurement scales is detecting health status change over time, either in the context of clinical trials or epidemiological studies or as a strategy for monitoring the outcomes and making decisions about the care of individual patients in daily clinical practice. In all situations, a priority may be efficiency, achieved by the shortest possible questionnaire.1 To date, methods of shortening questionnaires have focused on approaches that maintain the greatest internal consistency.2 However, in the context of health measurement scales targeting a relatively heterogeneous disorder, it may be advantageous to sacrifice internal consistency for content validity.3
The Western Ontario and McMaster Universities (WOMAC) osteoarthritis index is a valid, reliable, and responsive measure in hip and knee osteoarthritis.4,5 Its function subscale is self administered and comprises 17 items addressing the degree of difficulty in accomplishing 17 activities of daily life. While the mean importance score of the 17 items is similar at a group level, the importance attached to particular items varies between patients.4,5 The WOMAC function subscale is short and can be completed quickly. Nevertheless, an even shorter version would further enhance its applicability in epidemiological studies and for use in routine clinical practice.2
Our aim in this study was to specify a short form of the WOMAC function subscale dedicated to all patients with hip or knee osteoarthritis, by preserving the most important items for patients and rheumatologists (WOMAC function subscale short form).
We conducted a prospective cohort study of four weeks’ duration, involving 1362 outpatients with hip or knee osteoarthritis as defined by the American College of Rheumatology,6,7 and 399 private rheumatologists in France. Each rheumatologist was required to include four patients, three with knee osteoarthritis and one with hip osteoarthritis. To be included in the study, patients had to experience pain from the osteoarthritis (⩾30 mm on a visual analogue scale (VAS) ranging from 0 to 100 mm) and to require treatment with a non-steroidal anti-inflammatory drug (NSAID). Inclusion could begin with the onset of treatment or with a switch from one NSAID to another. Patients were excluded if they had a prosthesis on the assessed joint or if they had been treated with intra-articular injection in the four weeks before the study began. All patients initially visited the rheumatologist in charge of their case, and an NSAID was prescribed (the drug and its dosage were chosen by the physician). A final visit to the same rheumatologist was scheduled four weeks later.
Patients and rheumatologists assessed the patient’s status with respect to the osteoarthritis at the baseline visit and at week 4. Patients completed the French Canadian version of the WOMAC physical function subscale8 (17 items, five point Likert scale version, total score varying between 0 and 68; high scores indicate a high degree of functional impairment).
Patients were also asked to select the five items of the WOMAC function subscale that they considered most in need of improvement.
The rheumatologists were asked on one occasion to select the five items on the subscale which they consider result in the greatest impairment in patients with knee and hip osteoarthritis (not the specific patients they had included in the study).
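The top-five selections described above lend themselves to a simple vote count. The following is a minimal sketch of how such selections could be tallied into an importance ranking; it is not the authors' SAS code, and the function name `rank_items` and the alphabetical tie-break are assumptions made for illustration.

```python
from collections import Counter


def rank_items(selections, all_items):
    """Rank items by how many respondents selected them.

    selections: iterable of per-respondent collections of chosen items
    all_items:  every item on the subscale (so unselected items are ranked too)
    Returns {item: rank}, where rank 1 is the most frequently selected item.
    Ties are broken alphabetically (an assumption, not taken from the study).
    """
    votes = Counter({item: 0 for item in all_items})
    for chosen in selections:
        # set() ensures a respondent counts at most once per item
        votes.update(set(chosen))
    ordered = sorted(all_items, key=lambda item: (-votes[item], item))
    return {item: rank for rank, item in enumerate(ordered, start=1)}
```

Running the same function separately on the patients' and the rheumatologists' selections would give the two rankings that are compared in the first reduction step.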
To assess the test–retest reliability of the resulting WOMAC function subscale short form, a subsample of 86 patients was asked to complete the full WOMAC function subscale again, 48 hours after the baseline visit. These patients began taking NSAIDs only after completing the WOMAC function subscale this second time, that is, 48 hours after the baseline visit.
First, we computed descriptive statistics on clinical and demographic variables. Then we used a four step procedure to eliminate items as follows:
Step 1. We ranked the 17 items of the complete WOMAC function subscale from highest to lowest importance according to the patients’ and rheumatologists’ opinions, excluding the five items that were least important for both patients and rheumatologists. The whole sample was then divided into tertiles of the WOMAC function subscale score to investigate the potential impact of the level of functional impairment on the patients’ ranking.
Step 2. We ranked the 17 items by the proportion of missing data per item. Items with a high proportion of missing data were excluded.
Step 3. Items whose distribution of answers showed a floor or ceiling response were excluded. This response is present when most of the answers are clustered in only a few response options at one extreme—that is, when most of the subjects attest to having no difficulty (floor response) or extreme difficulty (ceiling response) in the activity. For floor response items, it is impossible to detect improvement, while for ceiling response items, it is not possible to distinguish among various grades of difficulty, as most of the subjects answer the same way.
Step 4. We tested for potentially redundant items. Inter-item correlation coefficients were computed. When the correlation coefficient was greater than 0.75, the least important item of the pair in the patients’ ranking was excluded.
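Steps 3 and 4 can be sketched in a few lines of code. This is an illustrative sketch only, assuming item scores coded 0 to 4 and a patient ranking where 1 is the most important item. The 0.75 correlation threshold comes from the text; the `extreme_cutoff` of 0.70 is a hypothetical value, since no numerical cut-off for a floor or ceiling response is stated. All function names are made up for the example.

```python
from itertools import combinations
from math import sqrt


def extreme_share(scores, low=True):
    """Share of answers in the two lowest (or two highest) options of a 0-4 scale."""
    target = {0, 1} if low else {3, 4}
    return sum(s in target for s in scores) / len(scores)


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either item is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def flag_items(data, patient_rank, extreme_cutoff=0.70, r_cutoff=0.75):
    """Return the items flagged for exclusion by steps 3 and 4.

    data:         {item: list of 0-4 scores, one per patient, same patient order}
    patient_rank: {item: rank in the patients' ranking, 1 = most important}
    """
    flagged = set()
    # Step 3: floor or ceiling response (answers clustered at one extreme)
    for item, scores in data.items():
        if (extreme_share(scores, low=True) >= extreme_cutoff
                or extreme_share(scores, low=False) >= extreme_cutoff):
            flagged.add(item)
    # Step 4: for each highly correlated pair, drop the less important item
    for a, b in combinations(data, 2):
        if pearson(data[a], data[b]) > r_cutoff:
            flagged.add(a if patient_rank[a] > patient_rank[b] else b)
    return flagged
```

Step 4 deliberately examines every pair, including items already flagged in step 3, so an item can be flagged in more than one step.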
Responsiveness was assessed by use of the standardised response mean (SRM) for the complete WOMAC function subscale and the WOMAC function short form. SRM is the mean change in score between the baseline and the final visit divided by the standard deviation of the change in score. Test–retest reliability was assessed using the intraclass correlation coefficient (ICC). Construct validity of the WOMAC function short form was assessed using the correlation between scores of the long and short forms, as recommended when the original scale cannot be considered a gold standard (that is, the reference measurement instrument).2 Internal consistency was assessed using Cronbach’s α.9
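The two simplest of these statistics can be written out directly from their definitions. The sketch below is an illustration, not the SAS code used in the study. It assumes change is computed as baseline minus final score, so that improvement (a lower score) gives a positive SRM; the sign convention is not stated in the text. Cronbach's α uses the standard formula based on item variances and the variance of the total score.

```python
from statistics import mean, stdev, variance


def standardised_response_mean(baseline, final):
    """Mean change in score divided by the standard deviation of the change.

    Change is taken as baseline - final (assumed sign convention).
    """
    change = [b - f for b, f in zip(baseline, final)]
    return mean(change) / stdev(change)


def cronbach_alpha(items):
    """Cronbach's alpha for a list of per-item score lists.

    items: one list of scores per item, all in the same patient order.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per patient
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))
```

With two items that give identical answers for every patient, `cronbach_alpha` returns 1, the extreme case of redundant items.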
Statistical analyses involved use of the SAS Release 8.2 statistical software package.
In all, 1362 patients were enrolled in the study: 1019 (75%) with knee osteoarthritis and 343 (25%) with hip osteoarthritis. At the baseline visit, 1218 patients (89%) completed the full WOMAC function subscale without any missing data. The derivation process is based on these 1218 patients, described in table 1.
Ranking of the 17 items of the complete WOMAC function subscale
Patients and rheumatologists were consistent in ranking the importance of items (table 2). The four most important items for rheumatologists were among the five most important items for patients. The five least important items for rheumatologists were among the six least important items for patients. The ranking of item importance was similar between patients with hip osteoarthritis and those with knee osteoarthritis (data not shown), except for “descending stairs” (ranked sixth and first, respectively), and “putting on socks/stockings” (first and 12th, respectively). As these items are relatively specific to the location of the osteoarthritis (hip or knee), this discrepancy was expected. The ranking of the items’ importance was similar between men and women (data not shown), except for “going shopping” (ranked 10th and fourth, respectively) and “performing light domestic duties” (ranked 13th and seventh, respectively).
The items excluded as least important for both patients and experts were “lying in bed,” “bending to the floor,” “rising from bed,” “sitting,” “taking off socks/stockings,” and “standing.”
Results of dividing the whole sample into tertiles of the WOMAC function subscale score showed exactly the same items being selected by patients in the three subgroups.
Ranking of the 17 items by the proportion of missing data
Three items generated notably more missing data than the others. These items may have been interpreted too literally and considered not to be relevant—for example, domestic duties may have been interpreted only as cleaning the house and therefore probably of more concern to women, while respondents answering the getting in/out of the bath question may not have appreciated that this question can alternatively be considered relevant to getting in/out of the shower.
The items excluded were “performing heavy domestic duties,” “performing light domestic duties,” and “getting in/out of the bath.”
Items for which the distribution of answers showed a floor or ceiling response
Almost all the items of the complete WOMAC function subscale had a good distribution of answers among response modes. However, two had a saturation point in one or two response modes: for “bending to the floor” and “lying in bed,” 74% and 75% of the answers, respectively, were “no difficulty” or “slight difficulty.”
The items excluded were “bending to the floor” and “lying in bed.” Both items had already been excluded in a previous step.
Pairs of highly correlated items (r>0.75) were “putting on socks/stockings” with “taking off socks/stockings” (r = 0.85) and “performing light domestic duties” with “performing heavy domestic duties” (r = 0.78).
The items excluded were “taking off socks/stockings,” and “performing heavy domestic duties.” Both items had been excluded in a previous step.
Summary of the reduction procedure
The eight items of the WOMAC function subscale short form derived by the above mentioned methods are shown in the appendix. These items were the eight most important in the patients’ opinion.
When summarising the different steps in the reduction procedure, it can be seen that six of the nine excluded items were excluded in at least two steps (two steps for four of the items and three for two of the items).
The WOMAC function subscale short form did not differ substantially from the complete WOMAC function subscale either in responsiveness (SRM = 0.84 (n = 1169) and 0.80 (n = 1048), respectively) or in test–retest reliability (ICC = 0.75 (0.65 to 0.83) and 0.79 (0.69 to 0.87), respectively).
Construct validity of the WOMAC function subscale short form was excellent (r = 0.95 between the long and short forms). Internal consistency was good in the WOMAC function subscale short form and the complete WOMAC function subscale (α = 0.84 and α = 0.93, respectively).
Using patients’ and rheumatologists’ opinions in France, and based on the Likert version of the French Canadian WOMAC function subscale, we have specified a short form of this subscale for patients with hip or knee osteoarthritis (including a broad spectrum of disease severity). To address recent recommendations for shortening composite measurement scales,2 we have ensured that the original scale was valid, relevant in the context of hip and knee osteoarthritis, and had satisfactory measurement properties.4,5
The WOMAC function subscale short form contains only eight items. It was derived by preserving face validity (patients’ and rheumatologists’ opinions) and quality of the items (few missing data, no redundancy, good distribution of the answers across response modes). Preserving face validity is important because it increases the acceptance of the instrument by those who will ultimately use it and thus decreases the amount of missing data.3 This short form has good responsiveness, good test–retest reliability, and good construct validity for this sample, but these parameters should be validated in an independent sample of subjects from the target population.10 Our reduction procedure involved deleting items that were highly correlated, and thus a lower internal consistency was expected for the short form than for the complete subscale (a Cronbach’s α of 1 indicates complete redundancy among items).
As the WOMAC subscale is dedicated to patients with hip or knee osteoarthritis, our sample reflects this target population well. The proportion of patients with hip and knee osteoarthritis (three quarters knee and one quarter hip) is close to the distribution in the community.11 As shown in table 1, our sample is similar to samples included in trials on osteoarthritis treatment and represents a large spectrum of disease severity. Inclusion criteria, especially the requirement for NSAID treatment, were the same as those in the validation study of the WOMAC scale by Bellamy and associates.4 In our sample, the same items were selected by patients across the range of osteoarthritis severity: the results of dividing the sample into tertiles of the WOMAC function subscale score showed that the five least important items to patients (those to be excluded) were exactly the same in the three tertiles.
It has been assumed that items for assessing knee osteoarthritis may be somewhat different from those required for hip disease. In fact, when we evaluated the ranking of the 17 items of the complete WOMAC function subscale according to their importance to patients with hip or knee disease, the five least important items (those to be excluded) were the same for patients with both types of osteoarthritis.
According to previous recommendations, when the original scale cannot be considered a gold standard (the reference measurement instrument), an expert based approach to item reduction may be preferable to a statistical approach.2 This situation is far more likely in the patients’ self assessment of symptoms. An expert based approach has been employed in very few studies that involved reducing indices, and mainly served to help choose among several solutions provided by statistical methods.2 We chose the other route. We used an expert based reduction procedure, and statistical analyses of the quality of the items were secondary criteria. To reduce information bias in the reduction process, we combined two types of expert: patient experts, who had experience of the problems (representatives of the target population), and rheumatologist experts, using their knowledge of a broad spectrum of the disease.
The originality of our approach lies in the large number of experts involved in the study. Expert based approaches usually rely on the authors’ own judgment of redundancy and insufficient face validity, or on the use of consensus methods with relatively small panels of experts. For instance, Guillemin and colleagues12 used two panels when shortening the arthritis impact measurement scales 2 (AIMS2): one of 19 experts (rheumatologists, rehabilitation specialists, and methodologists) and another of 12 patients. Whitehouse et al13 used a panel of 36 experts (orthopaedic surgeons, rheumatologists, nurses, physiotherapists, and research personnel). The large sample of patient experts (n = 1218) and rheumatologist experts (n = 399, approximately 15% of the rheumatologists in France) in our study is a good indicator of its representativeness and of the validity of the results.
The relevance of our reduction procedure is reinforced by the outcome of the procedure. The remaining items are the eight most important in the patients’ opinion, and most of the excluded items were excluded in at least two steps of the reduction procedure. Taking patients’ opinion into account in deriving short forms of validated questionnaires could improve the clinical relevance of such methods.
Whitehouse et al13 proposed a seven item short form of the WOMAC function subscale, but the derivation process involved only a subgroup of patients with severe disease (patients undergoing hip or knee arthroplasty). In this context, the short form should be dedicated to assessing the outcome of total joint arthroplasty, as Whitehouse indicated. However, five items are shared between the Whitehouse form and our own. The particular population in Whitehouse’s study may explain some of the discrepancies between the two short forms, especially the fact that activities such as “sitting” or “rising from bed,” which are more likely to be impaired in severe disease, are two of the seven items included in Whitehouse’s version but excluded from our version (because they were ranked 14th and 15th, respectively, by the patients).
The assessment of the performance characteristics of the WOMAC function subscale short form, its clinical relevance, and its acceptability require further studies in independent samples. Such studies should involve different versions of the WOMAC function subscale, as well as different language translations and different scaling formats, and should be conducted in different countries, in different clinical environments (for example, rheumatology, orthopaedic surgery, physiotherapy, rehabilitation), and with different interventions.
Proposed WOMAC function subscale short form (eight items)
Rising from sitting
Walking on flat
Getting in/out of a car
Putting on socks/stockings
Getting on/off the toilet
The WOMAC function subscale gradations in the Likert-scaled French Canadian 3.0 version are: 0 = none, 1 = slight, 2 = moderate, 3 = severe, 4 = extreme.
The WOMAC function subscale short form comprises a total of 32 possible points, with 0 being the best and 32 being the worst.
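The scoring rule above amounts to summing eight Likert answers. The following is a minimal sketch of that calculation, keyed by the long-form item numbers given in the abstract (1, 2, 3, 6, 7, 8, 9, and 15). The function name and the choice to reject incomplete or out-of-range answers are assumptions; the text does not specify how missing answers should be handled when scoring.

```python
# Long-form item numbers retained in the short form, as listed in the abstract
SHORT_FORM_ITEMS = (1, 2, 3, 6, 7, 8, 9, 15)


def short_form_score(answers):
    """Total short-form score from Likert answers (0 = none ... 4 = extreme).

    answers: {long-form item number: score}; extra long-form items are ignored.
    Returns a total between 0 (best) and 32 (worst).
    Raises ValueError on a missing or out-of-range answer (assumed policy).
    """
    total = 0
    for item in SHORT_FORM_ITEMS:
        if item not in answers:
            raise ValueError(f"missing answer for item {item}")
        score = answers[item]
        if score not in (0, 1, 2, 3, 4):
            raise ValueError(f"item {item}: score {score!r} is outside 0-4")
        total += score
    return total
```

Because extra keys are ignored, a complete 17 item response can be passed in unchanged to obtain the short-form total.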
This study was supported by an unrestricted grant from Merck, Sharp & Dohme Chibret Laboratories, France.