Objective: To compare the value of various IgM and IgA rheumatoid factor (RF) tests for the diagnosis of rheumatoid arthritis (RA).
Methods: Firstly, the latex test, one global assay (for IgM, IgA, and IgG RF), six IgM, and four IgA RF assays were compared in a particularly challenging situation—that is, with 67 patients with RA, many of whom were latex negative, and 91 non-RA controls, many of whom were latex positive. More detailed evaluation followed with three IgM RF tests (two commercially available kits and one assay developed in our laboratory) and two IgA RF tests (one commercially available and one from our laboratory) in two more representative samples of rheumatological patients (146 RA and 75 non-RA controls).
Results: Diagnostic performance differed considerably between the assays. For IgM RF detection the highest sensitivity (88%) was obtained with the Diamedix kit (specificity 67%) and for IgA RF with the Inova kit (sensitivity 65%, specificity 88%). Combining one IgM and one IgA RF test improved diagnostic performance when both tests were in agreement, but at the cost of yielding 15–27% of discrepant results which did not help in ruling RA in or out. Mean concentration values differed significantly among IgM RF tests, and in most cases concentrations were not correlated.
Conclusions: Available tests for IgM RF isotype vary in accuracy, and none is uniformly better than all the others. For IgA RF isotype, the Inova kit appears to be the best. Quantitative results cannot be compared across tests. Combination of one IgM and one IgA RF test may improve diagnostic accuracy.
- rheumatoid arthritis
- rheumatoid factors
- diagnostic tests
- ACR, American College of Rheumatology
- ELISA, enzyme linked immunosorbent assay
- OD, optical density
- PBS, phosphate buffered saline
- RA, rheumatoid arthritis
- RF, rheumatoid factor
The presence of rheumatoid factors (RFs) is included among the original and revised criteria for the classification of rheumatoid arthritis (RA) as accepted by the American College of Rheumatology (ACR).1 However, the proportion of patients with RA positive for RF has ranged from 30% to more than 90% in various studies.2 These discrepancies may result from several factors, such as differences in patient selection, study design, and techniques used to measure RFs. Indeed, different laboratory methods such as agglutination, radioimmunoassay, or enzyme linked immunosorbent assay (ELISA) have been used, but these methods do not have equivalent sensitivity, specificity, and reproducibility.
Most routine laboratory tests in current use (haemagglutination, latex agglutination, nephelometry, and turbidimetry assays) are based on the ability of RFs to agglutinate sheep red blood cells, latex, or similar particles (polystyrene, bentonite, acryl) coated with IgG. They detect mainly the IgM isotype of RFs (IgM RF), which are more efficient in agglutination reactions owing to their polyvalency, but the exact contributions of IgM, IgG, and IgA RFs are not known.3–5 Furthermore, with very common techniques such as haemagglutination or latex agglutination titration, end points are not objectively determined because an observer decides when agglutination is present and this judgment will vary among observers. Enzyme immunoassay is therefore a more objective method for determining the levels of different RF isotypes, particularly IgM and IgA RFs. IgM RF concentration can be expressed in IU/ml on the basis of the World Health Organisation reference RF serum. Moreover, IgM RF measurement by ELISA has also been shown to be more sensitive than latex agglutination or nephelometry.6
Today, numerous ELISA tests for detecting RF isotypes are commercially available, using either human or rabbit IgG as antigen. In this study we compared the sensitivity, specificity, and quantitative values obtained with seven such tests. Four commercial kits were used to detect IgM RF (Immco, Sigma, Diamedix, and Inova), two to detect IgA RF (Immco and Inova), and one to detect IgM, IgA, and IgG RF (Immco). Finally, we included the latex test as well as the IgM and IgA RF determinations by ELISA (with Fc of either human or rabbit IgG as antigen) developed in our laboratory. As we planned to compare many different assays, the evaluation was conducted in two steps. To select the most sensitive tests we used 67 serum samples from patients with RA, of whom 41 (61%) were latex negative. To select the most specific assays, we tested 91 sera from non-RA control patients, of whom 62 (68%) were latex positive. After this first screening, we retained the following tests for further evaluation: two commercially available kits and one ELISA from our laboratory to detect IgM RF, and one commercially available kit and one ELISA from our laboratory to detect IgA RF. The second comparative evaluation was performed in two larger groups of rheumatological patients (146 RA and 75 non-RA patients). We did not include IgG RF assays because the characteristics of these RFs have prevented them from becoming a routine test in the evaluation of patients with RA.7–9
PATIENTS AND METHODS
The initial evaluation of immunoassays was performed on samples from two groups of patients: 67 patients with RA diagnosed according to the revised criteria formulated by the ACR,1 median age 64 years (range 22–87), of whom 50 (75%) were women; and 91 non-RA control patients, median age 58 years (range 4–91), of whom 53 (58%) were women. Among controls, latex testing was performed during January 1999 in the laboratory of the Division of Rheumatology (University Hospital, Geneva). After subsequent review of medical records, one year after sample collection, it was established that none of these patients had RA. About half of the controls (n=47) were rheumatological patients. Diagnoses were crystal-induced arthritis, transient hip synovitis, osteoarthritis, seronegative spondyloarthropathies, connective tissue diseases (systemic lupus erythematosus, dermatomyositis, polymyositis, scleroderma), and other inflammatory diseases, including Crohn's disease, polymyalgia rheumatica, and sarcoidosis. Non-rheumatological control patients (n=44) had infectious, cardiac, cerebrovascular, or neurological diseases.
The main evaluation was performed with two larger groups of patients: 146 patients with RA diagnosed according to the revised criteria formulated by the ACR,1 median age 62 years (range 22–87), of whom 107 (73%) were women; and 75 non-RA control patients, median age 62 years (range 4–101), of whom 49 (65%) were women. All controls had a rheumatological disease other than RA. The diagnoses were similar to those of the first control group (see above).
A serum sample was obtained from each patient; aliquots were taken and stored at –80°C until use.
Serum samples were tested for RF by agglutination of latex particles coated with human immunoglobulin (Difco Laboratories, Detroit, MI, USA). Titrations were performed in tubes. Agglutination titres ≥1/80 were regarded as positive.
IgM and IgA RF determinations
Flat well microtitre plates (Immunoplate I, cat. No 439454 from Nunc, Life Technologies, Basel, Switzerland) were used as the solid phase and “coated” overnight at 4°C with 100 μl/well of affinity purified human IgG Fc fragment (5 μg/ml) (Organon Teknika Cappel, Durham, NC) dissolved in carbonate “coating” buffer (0.05 M, pH 9.6). After four washes with phosphate buffered saline (PBS) with 0.05% Tween 20, non-specific binding sites were blocked with 1% fetal calf serum from Gibco (Life Technologies, Basel, Switzerland) for one hour at 37°C. After the washes, appropriate dilutions of serum (usually 1/25, 1/50, 1/100 for patient serum and 13 serial twofold dilutions from 1/20 for standard serum) in PBS with 0.05% Tween 20 and 0.25% bovine serum albumin (Boehringer, Heidelberg, Germany) were incubated for 90 minutes at 37°C. The reactivity of each serum dilution was also tested in a non-coated well. After the washes, bound IgM and IgA RF were detected with the alkaline phosphatase labelled F(ab′)2 fragment of polyclonal goat IgG antihuman IgM (μ chain specific) or IgA (α chain specific) from Organon Teknika Cappel for one and a half hours at 37°C. Substrate (1 mg/ml p-nitrophenyl phosphate (Fluka, Buchs, Switzerland) in 10% diethanolamine) was added for 45 minutes at 37°C. Colour development was stopped by the addition of 25 μl/well of 3 M NaOH. The optical density (OD) value at 410 nm was determined with a Dynatech MR 5000 microplate reader (Dynatech, Alexandria, VA) linked to a Macintosh computer equipped with the Biocalc program (Dynatech). The net OD values (OD value obtained for the Fc coated well minus the non-specific OD value obtained for the non-coated well) were transformed automatically into units, after curve fitting with the four parameter log-logistic model,10 by using the standard dilution curve (13 serial twofold dilutions of a pool of sera).
For IgM RF, results were expressed in IU/ml because our pool of sera was calibrated with the WHO RF reference serum.11 For IgA RFs, as no recognised international standard exists for expressing their immunoreactivity, results were expressed in arbitrary units/ml according to our own standard.
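The conversion from net OD to units can be illustrated with a minimal sketch. It assumes the standard curve has already been fitted to a four parameter logistic model; the fitting step itself is not shown, and the parameter values (`bottom`, `top`, `ec50`, `slope`) and function names are hypothetical, not those produced by the Biocalc program.

```python
def net_od(od_coated, od_uncoated):
    """Net signal: OD of the Fc coated well minus OD of the non-coated well."""
    return od_coated - od_uncoated


def four_pl(conc, bottom, top, ec50, slope):
    """Four parameter logistic curve: expected net OD at a given concentration."""
    return top + (bottom - top) / (1.0 + (conc / ec50) ** slope)


def inverse_four_pl(od, bottom, top, ec50, slope):
    """Concentration that gives `od` on the fitted standard curve."""
    if not (min(bottom, top) < od < max(bottom, top)):
        raise ValueError("OD lies outside the range of the standard curve")
    return ec50 * ((bottom - top) / (od - top) - 1.0) ** (1.0 / slope)


def sample_concentration(od_coated, od_uncoated, dilution, params):
    """Units/ml in undiluted serum, from one diluted sample."""
    return inverse_four_pl(net_od(od_coated, od_uncoated), **params) * dilution


# Hypothetical fitted parameters, for illustration only.
params = {"bottom": 0.05, "top": 2.5, "ec50": 40.0, "slope": 1.2}

# A well reading that corresponds to 10 units/ml on the curve,
# with a non-specific background of 0.1 OD, at a 1/50 serum dilution.
od = four_pl(10.0, **params)
print(sample_concentration(od + 0.1, 0.1, 50, params))
```

An OD outside the range of the curve raises an error instead of returning a value, which mirrors the point that a concentration can only be read within the usable part of the standard curve.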
Commercially available kits for RF determinations
One ELISA kit, called “screen ELISA” for IgM, IgA, and IgG RF determinations, was purchased from Immco Diagnostics, Inc (Buffalo, NY). Kits for IgM were purchased from Immco Diagnostics Inc, Inova Diagnostics Inc (San Diego, CA), Sigma (Saint Louis, MO), and Diamedix (Miami, FL) and those for IgA RF determinations from Immco and Inova Diagnostics Inc. Depending on the kit, the antigens used were either rabbit or human IgG. Some kits used five different calibrators, whereas the “screen ELISA” from Immco and the IgM RF test from Diamedix used only one concentration of calibrator. Sample dilutions varied from 1/21 (IgM RF test from Sigma) to 1/201 (“screen ELISA” from Immco) and cut off values from ≥6 to ≥20 IU/ml for IgM RF measurements. The assays and calculations were performed according to the manufacturers' instructions. Table 1 gives the characteristics of the different immunoassays.
The sensitivity (among patients with RA) and specificity (among other patients) were computed for each test, with 95% confidence intervals.12 Differences in sensitivity or specificity between assays were tested using the McNemar test for paired data.
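These calculations can be sketched as follows. The Wilson score interval and the exact (binomial) form of the McNemar test are assumptions made for illustration; the confidence interval method of reference 12 and the McNemar variant actually used are not specified here. The counts in the usage lines are hypothetical.

```python
from math import comb, sqrt


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity among patients with RA, specificity among controls."""
    return tp / (tp + fn), tn / (tn + fp)


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (assumed method)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p value.

    b and c are the discordant pair counts: samples positive with
    assay A only, and samples positive with assay B only.
    """
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts, for illustration only.
sens, spec = sensitivity_specificity(tp=88, fn=12, tn=67, fp=33)
print(sens, spec, wilson_interval(88, 100))
print(mcnemar_exact(b=0, c=5))
```

Only the discordant pairs enter the McNemar test, which is why it suits comparisons where every assay was run on the same serum samples.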
Furthermore, to examine the clinical utility of positive or negative test results, we computed the likelihood ratio for each result.13 The likelihood ratio is the ratio of probabilities of a test result in patients with RA versus controls—that is, for a positive test, sensitivity/(1−specificity), and for a negative test, (1−sensitivity)/specificity. The likelihood ratio indicates by how much the odds of disease should change after the test (by application of Bayes theorem). If the likelihood ratio is >1 the odds of disease increase, if it is equal to 1 the odds do not change, and if it is <1 the odds of disease decrease (note that a likelihood ratio of disease of 1/k is equivalent to a likelihood ratio of no disease of k). For instance, if the likelihood ratio of a positive test is 4 and the odds of RA for a given patient before the test are even (odds 1:1, or 50% probability), then after a positive test the odds of RA become 4:1, or a probability of 80%. Confidence intervals for likelihood ratios were computed according to Koopman.14 We also examined the likelihood ratios for combinations of two tests—that is, for two positive results, discrepant results (one test positive, the other negative), and two negative results.
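The arithmetic described above can be written as a short sketch. The function names are illustrative; the formulas are those given in the text, and the worked figures are the 65% sensitivity and 88% specificity reported for the Inova IgA RF kit and the example of a likelihood ratio of 4 at even pre-test odds.

```python
def likelihood_ratios(sensitivity, specificity):
    """Likelihood ratios for a positive and for a negative test result."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


lr_pos, lr_neg = likelihood_ratios(sensitivity=0.65, specificity=0.88)
print(round(lr_pos, 1), round(lr_neg, 2))

# Even pre-test odds (50%) and a likelihood ratio of 4 give 80%.
print(round(post_test_probability(0.50, 4), 2))
```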
Finally, we examined the agreement between assays, separately for IgM RF tests and for IgA RF tests. To examine the ability of assays to label a patient as having a positive or negative RF, we computed κ statistics. The κ statistic measures agreement beyond chance; values >0.75 are commonly taken to denote excellent agreement, 0.40–0.75 fair to good agreement, and <0.40 poor agreement.15 To examine the consistency of the assays in ranking patients, we computed Spearman correlation coefficients. Then, we tested the pairwise differences between test values, in the whole sample, and separately in patients with RA and in controls.
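As an illustration of the κ statistic for two assays that each label a sample positive or negative, a minimal sketch follows. The counts are hypothetical, and the verbal labels simply apply the thresholds quoted above.

```python
def cohen_kappa(both_pos, only_a_pos, only_b_pos, both_neg):
    """Agreement beyond chance between two binary assays A and B."""
    n = both_pos + only_a_pos + only_b_pos + both_neg
    observed = (both_pos + both_neg) / n
    a_pos = (both_pos + only_a_pos) / n
    b_pos = (both_pos + only_b_pos) / n
    # Agreement expected by chance if the two assays were independent.
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    return (observed - expected) / (1 - expected)


def describe_agreement(kappa):
    """Verbal label using the thresholds quoted in the text."""
    if kappa > 0.75:
        return "excellent"
    if kappa >= 0.40:
        return "fair to good"
    return "poor"


# Hypothetical counts, for illustration only.
k = cohen_kappa(both_pos=40, only_a_pos=10, only_b_pos=10, both_neg=40)
print(round(k, 2), describe_agreement(k))
```

The statistic is undefined when chance agreement equals 1 (for example, when both assays give the same result for every sample), so this sketch would raise a division error in that case.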
RESULTS
Initial evaluation of immunoassays
The first step was performed with selected serum samples. To better determine the assays with the best sensitivity, a high percentage of samples from patients with RA were chosen because they were latex negative, and to better determine the assays with the best specificity, a high percentage of samples from non-RA controls were chosen because they were latex positive. In these samples, a sensitivity of 66% and a specificity of 47% were obtained for the “screen ELISA” from Immco (table 2). Important differences in sensitivity (up to 25%) and specificity (up to 30%) were observed between the different IgM RF or IgA RF assays. For IgM RF detection, the best sensitivity (66%) was obtained for the Diamedix test. The best specificity (65%) was obtained for the Inova test. The best agreement (sum of true positive and true negative samples divided by total number of observations) (59%) was found for the Sigma test. For IgA RF detection, the best sensitivity (34%) was obtained for the Inova test, and the best specificity (96%) for the Immco test. The best agreement (64%) was found for the Inova test.
Because the methods developed in our laboratory differed only in the target antigen, it was possible to compare results obtained with Fc fragments of rabbit or human IgG. Higher specificities were obtained when Fc fragments of rabbit IgG were used instead of human IgG, but these differences were not significant. Sensitivities of IgM RF determinations were similar, but the IgA RF test based on human IgG was more sensitive than the test based on rabbit IgG (21% v 9%, p=0.02).
Test results did not agree well. Only 33% of samples yielded the same qualitative results for the six IgM RF ELISA tests, and 70% for the four IgA RF ELISA tests. When results were classified according to the coated antigen, 51% of samples gave the same results for the three IgM RF tests using rabbit IgG and 61% for those using human IgG. In the same way, 75% of samples produced the same results for the three IgA RF ELISA tests using rabbit IgG. When results obtained for the three Immco tests were examined, 26 samples (16%) were found to be IgM RF positive when the “screen ELISA”, which is supposed to detect IgM, IgA, and IgG RF, was negative.
From these results, six tests were chosen to pursue the evaluation, four being commercially available. The “screen ELISA” from Immco and the Diamedix kit were further evaluated because they had the highest sensitivities. The Inova IgA RF and the Sigma kits were also kept because they had the highest agreements in their respective group. The most sensitive of our methods (using human IgG Fc fragment as antigen) were also tested in the second part of this comparative study.
Main evaluation of immunoassays
For IgM RF detection, the best sensitivity (88%) was again obtained for the Diamedix test, and the best specificity (81%) for the Sigma test (table 3). For IgA RF detection, the Inova test had the best sensitivity (65%), while our assay using human IgG Fc fragment as antigen had the best specificity (91%) but with a much lower sensitivity. Specificity was unacceptably low for the “screen ELISA” from Immco (48%).
Examination of the likelihood ratios for various test results confirmed the differences between the tests (table 4). A positive result was most helpful in establishing a diagnosis of RA for the Inova IgA test and for the latex agglutination test: a likelihood ratio of 5.4 implies that the probability of RA would increase from 50% before the test to 84% after the test, or from 80% to 96%, or from 20% to 57%. In contrast, the likelihood ratio of 1.6, seen for the “screen ELISA” test, would move pre-test probabilities of RA from 50% to only 62%, or from 80% to 86%, or from 20% to 29%. The most helpful negative result was produced by the Diamedix test: a likelihood ratio of 0.18 implies that probabilities of RA would change from 50% to 15% after the test, or from 80% to 42%, or from 20% to 4%. The least helpful was the IgA RF test from our laboratory: a likelihood ratio of 0.61 would change pre-test probabilities from 50% to 38% after the test, or from 80% to 71%, or from 20% to 13%.
Combining tests for two isotypes
Because no single test displayed near optimal properties, we examined the utility of combinations of two RF tests. To enhance interpretability, each combination paired one IgM RF test (Sigma, Diamedix, or the test from our laboratory) with one IgA RF test (Inova only). The resulting likelihood ratios were strengthened for consistent results, whether positive or negative, but were generally unhelpful for discrepant results, with likelihood ratios in the vicinity of 1 (table 5). The most balanced test combination was that of the Diamedix IgM RF with the Inova IgA RF, with nearly equivalent likelihood ratios for positive (6.8) and negative (0.17=1/5.8) results. For situations where confirmation of RA is the chief concern, the combination of the IgM RF from our laboratory and Inova IgA RF performed best, with a likelihood ratio for a positive result >11. Finally, the combination of the Sigma IgM RF and Inova IgA RF yielded the lowest proportion of discrepant results.
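The likelihood ratio for each of the three possible outcomes of a test pair can be sketched as below. The counts are hypothetical and chosen only to show the pattern described here (strong ratios for concordant results, a ratio near 1 for discrepant results); they are not the study data.

```python
def combined_likelihood_ratios(ra_counts, control_counts):
    """Likelihood ratio for each outcome of a pair of tests.

    Each argument maps an outcome name to the number of patients with
    that outcome, in patients with RA and in controls respectively.
    """
    n_ra = sum(ra_counts.values())
    n_control = sum(control_counts.values())
    return {
        outcome: (ra_counts[outcome] / n_ra) / (control_counts[outcome] / n_control)
        for outcome in ra_counts
    }


# Hypothetical counts, for illustration only.
ra = {"both_positive": 60, "discrepant": 20, "both_negative": 20}
controls = {"both_positive": 10, "discrepant": 20, "both_negative": 70}

for outcome, lr in combined_likelihood_ratios(ra, controls).items():
    print(outcome, round(lr, 2))
```

An outcome that never occurs among controls would give a division by zero, so real data with empty cells would need separate handling.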
Agreement between the different IgM RF tests
The various IgM RF assays yielded quantitatively different results. Although Spearman rank correlations between quantitative test results were rather high (between 0.74 and 0.84), absolute concentration values were much less consistent. Mean concentrations of IgM RF were 30 for the Sigma test, 88 for the Diamedix test, and 154 for the test from our laboratory (all pairwise differences significant at p<0.01). Similar results were obtained after stratification of these analyses by diagnosis of RA. The κ statistics between the three IgM RF assays ranged from 0.60 to 0.71, indicating good (but not excellent) agreement.
DISCUSSION
Determination of RF is a common analysis in routine clinical laboratories and is often ordered for non-rheumatological patients. Shmerling and Delbanco reported that, of 563 requests for RF determination received in their institution, only 5.1% concerned patients found to have RA or another systemic rheumatic disease when the medical charts were consulted at least two months after RF determination.16 To evaluate the diagnostic value of RF tests it is therefore important also to take into account samples from a large group of non-rheumatological patients; this was done in our initial evaluation.
Positive RF results are currently used to establish a diagnosis of RA, following accepted criteria for the classification of this disease.1 It is therefore crucial to get accurate results, comparable between different laboratories. However, although many commercial ELISA kits for detecting RF isotypes are available, no study has been performed to compare their performance in the diagnosis of RA. The present study shows that Sigma, Diamedix, and our laboratory IgM RF and Inova IgA RF tests had the best sensitivity or agreement, or both. IgA RF determination was the most specific (88%) but the least sensitive (65%) for RA while IgM RF determinations appeared to be more sensitive (77–88%) but less specific (67–81%). In any case, we should remember that RFs are not exclusive to RA. RFs are found in a number of other connective tissue diseases, such as systemic lupus erythematosus and Sjögren's syndrome, and in chronic infectious diseases.16–19 They can also occur in 3% of the general population and in about 10–15% of healthy elderly (>60 years of age) people.4, 20, 21 More importantly, we found substantial discrepancies between IgM RF tests which should provide the same information to clinicians. The κ statistics suggested good but not excellent agreement between tests. One way of improving diagnostic accuracy may be to combine two RF tests. When one IgM and one IgA RF were combined, diagnostic performance was improved, as long as the two tests yielded the same result. For instance, the combination of the IgM RF from our laboratory and Inova IgA RF had a likelihood ratio for a positive result >11, which may be sufficient to confirm the diagnosis of RA.
Previous studies have already reported an increase of diagnostic specificity when the presence of IgM + IgA RF is considered instead of that of IgM RF only.6, 22 Nevertheless, there are always situations where results of the two tests are discrepant (more than 20% in our study) and for these patients it is not possible to rule RA in or out.
The discrepancies we noted probably reflect the fact that the characteristics of ELISA tests for detecting RF isotypes vary considerably: they use different antigens, different sample dilution, and different cut off points. A calibration curve is available for most of them but not all. Some uncertainty remains about the choice of the best target antigen. The use of human IgG was first considered to be more relevant than rabbit IgG,23 but the latter has been preferred by other authors who consider that rabbit IgG yields more specific tests8, 24–26 because IgM RF has a high affinity for the Ga determinant present on the Fc part of rabbit IgG.7, 27 The use of Fc fragments instead of complete IgG has also been reported to improve assay specificity.28 Commercially available kits examined in this study differ not only in their target antigen but also in several other aspects, so it is not possible to attribute their different specificities to the differences in antigens. In contrast, methods developed in our laboratory are comparable because they only differed by the target antigens. We found a similar specificity when Fc fragments of rabbit IgG were used instead of human IgG, but the sensitivity of IgA RF measurement was significantly lowered when Fc fragments of rabbit IgG were used. A similar observation has been reported by Tuomi.29
Apart from the different target antigens used, other differences in laboratory methods, such as sample dilutions, non-specific background evaluation, calibration or detecting antibodies, may explain the high variability seen in IgM RF concentrations. In principle, no quantitative comparison is possible for IgA RF, as no recognised international standard is available. In all commercial kits, only one sample dilution is tested and the non-specific binding of each sample is never evaluated. However, when assays are performed on serial dilutions of serum samples, the signal is linear only within a small range of serum dilutions and an accurate concentration value can only be determined in this range. If only one dilution is tested it is impossible to know if it corresponds to the linear part of the dose-response curve. Moreover, if the analyte concentration is very high, a hook effect is also possible, resulting in a gross underestimation.
Furthermore, to make a quantitative measurement, it is important to consider “non-specific background” signals due to a high absorbance of a negative patient sample. Such signals may arise from heterophile antibodies, which are a common cause of erroneous results, particularly when sera are not highly diluted,30 because at high serum concentrations there is a greater tendency for low affinity antibodies to bind. Quantitative results also require a standard curve. No such curve is provided in two of the seven kits tested.
Finally, no information is given by the companies about the nature of the detecting antibody. In principle, only the F(ab′)2 portion of the antihuman IgM or IgA should be used, as free antibody combining sites on RF can react with the Fc region of these conjugate antibodies. All these factors may contribute to the large differences found in the quantitative results between the different assays. It is certainly more appropriate to consider them as semiquantitative results.
Another subject of concern is the lack of coherence observed between the Immco “screening” and the IgM RF detecting tests. Although we found that the screening test was more sensitive (66%) than the IgM RF detecting test (60%), a substantial number of samples (16%) were found to be negative with the screening but positive with the IgM RF test. Moreover, no information is given about IgG RF detection by the screening test, which is especially difficult to perform accurately owing to the chemical characteristics of these factors. Indeed, in the binding reaction used for the IgG RF specific immunoassay, IgG is not only the antigen but also the antibody (RF) and the detecting reagent (second antibody). In solid phase assays multivalent IgM RF may interfere with the measurement of IgG RF, as it can bind to the immobilised target IgG and to IgG in the serum. As a consequence, non-RF polyclonal IgG present in the serum sample may be detected by the antibody conjugate, leading to fallacious IgG RF estimation.
In conclusion, for the diagnosis of RA, substantial differences were seen between the different tests, and the reliability of some RF assays can be questioned. All commercially available tests provide only semiquantitative determinations. Therefore, quantitative RF value comparisons will be accurate only if data were obtained by the same method. The assumption that rabbit IgG would be more specific for RA than human IgG was not verified in our study. Combination of one IgM and one IgA RF test may improve diagnostic accuracy.
The technical assistance of Ursula Spenato and Madeleine Vuillet is gratefully acknowledged.