Variations in performance characteristics of commercial enzyme immunoassay kits for detection of antineutrophil cytoplasmic antibodies: what is the optimal cut off?
  1. J U Holle,
  2. B Hellmich,
  3. M Backes,
  4. W L Gross,
  5. E Csernok
  1. Department of Rheumatology, University of Schleswig-Holstein Campus Lübeck and Rheumaklinik Bad Bramstedt, Germany
  1. Correspondence to:
    Dr Elena Csernok
    Rheumaklinik Bad Bramstedt, Oskar-Alexander-Str 26, 24576 Bad Bramstedt, Germany; csernok@rheuma-zentrum.de

Abstract

Background: Previous studies have shown considerable variation in diagnostic performance of enzyme linked immunosorbent assays (ELISAs) for measuring antineutrophil cytoplasmic antibodies (ANCA) specific for proteinase 3 (PR3) and myeloperoxidase (MPO).

Objective: To analyse the performance characteristics of different commercially available direct ANCA ELISA kits.

Methods: ELISA kits for detecting PR3-ANCA and MPO-ANCA from 11 manufacturers were evaluated. Serum samples were taken from patients with Wegener’s granulomatosis (15), microscopic polyangiitis (15), other vasculitides (10), and controls (40). Results were compared with data obtained by indirect immunofluorescence (IFT). The diagnostic performance of the tests was analysed and compared by receiver operating characteristic (ROC) curve analysis.

Results: Applying the manufacturers’ cut off resulted in great variation in sensitivity of the commercial PR3-ANCA kits for diagnosing Wegener’s granulomatosis (ranging from 13.3% to 66.7%), and of the MPO-ANCA kits for diagnosing microscopic polyangiitis (ranging from 26.7% to 66.7%). Specificities were relatively constant (from 96.0% to 100%). IFT was superior to all ELISAs (C-ANCA for Wegener’s granulomatosis: sensitivity 73.3%, specificity 98%; P-ANCA for microscopic polyangiitis: sensitivity 86.7%, specificity 98%). The sensitivities of PR3-ANCA and MPO-ANCA ELISA kits were increased by lowering the cut off values. This reduced specificity but increased overall diagnostic performance.

Conclusions: The low sensitivity of some commercial kits reflects the high cut off levels recommended rather than methodological problems with the assays. Comparative analyses using sera from well characterised patients may help identify optimum cut off levels of commercial ANCA ELISA tests, resulting in better comparability of results among assays from different manufacturers.

  • ANCA, antineutrophil cytoplasmic antibodies
  • AUC, area under the curve
  • BVAS, Birmingham vasculitis activity score
  • ELISA, enzyme linked immunosorbent assay
  • IFT, indirect immunofluorescence technique
  • ROC, receiver operating characteristic
  • ANCA
  • Wegener’s granulomatosis
  • proteinase 3

The detection of proteinase 3 and myeloperoxidase antineutrophil cytoplasmic antibodies (PR3-ANCA, MPO-ANCA) is a helpful tool for establishing the diagnosis of Wegener’s granulomatosis and microscopic polyangiitis (MPA), respectively. Moreover, follow up measurement of both PR3-ANCA and MPO-ANCA may be important, as there is evidence that increases in titre may precede a relapse of the disease.1 For routine detection of ANCA, a consensus statement on ANCA testing has been published by an international group of ANCA researchers. According to this statement, a further test (for example, a direct enzyme linked immunosorbent assay (ELISA)) is required if the indirect immunofluorescence technique (IFT, used as a screening method) successfully detects ANCA. This is because the value of IFT can be greatly enhanced by additional direct ELISA testing.2

There are various commercially available ELISA kits for detecting ANCA directed against proteinase 3 and myeloperoxidase. The manufacturing companies apply their own arbitrary units to express ANCA concentrations in a patient’s serum, which therefore makes it impossible to compare the concentrations measured by different kits. Moreover, the various kits do not have equal performance.3–6 Consequently, a clinician can only interpret the evolution of ANCA concentrations in a patient’s serum by sticking to one assay.

Our aim in this study was to compare the performance of 11 commercially available direct ELISA kits, an in-house direct ELISA, and IFT for detecting PR3-ANCA and MPO-ANCA in well characterised patients with ANCA associated vasculitides in order to identify factors associated with poor diagnostic performance.

METHODS

Patients

We studied 40 consecutive patients diagnosed with Wegener’s granulomatosis (n = 15), microscopic polyangiitis (n = 15), Churg-Strauss syndrome (n = 3), giant cell arteritis (n = 1), Takayasu syndrome (n = 1), cryoglobulinaemic vasculitis (n = 1), unclassified small vessel vasculitis (n = 2), and connective tissue diseases with secondary vasculitis (n = 2) in the Department of Rheumatology in Lübeck (University Schleswig Holstein, Campus Lübeck) and Rheumaklinik Bad Bramstedt. Serum samples were obtained from well defined patients with a known clinical diagnosis of Wegener’s granulomatosis or microscopic polyangiitis, where the diagnosis was made irrespective of serology (the presence of ANCA was not included as a criterion for the diagnosis in these patients). Classification of patients with primary systemic vasculitides was based on American College of Rheumatology (ACR) classification criteria7 and Chapel Hill consensus conference definitions.8 All the sera were taken at the time of diagnosis.

A prospective cohort of 30 disease controls was created, consisting of patients who were admitted consecutively to the medical clinic of the University of Schleswig-Holstein, Campus Lübeck, over a period of one month with severe symptoms of pulmonary disease (unilateral or bilateral nodes or infiltration), renal disease (acute renal failure), or ENT disease (recurrent sinusitis) but without vasculitis. Sera from 10 healthy volunteers were also tested in all assays.

The diagnosis was biopsy proven in each patient. Biopsies had been examined by two different pathologists in the German reference centre for vasculitis (Department of Pathology, University of Schleswig-Holstein, Campus Lübeck). The disease activity was assessed by the Birmingham vasculitis activity score (BVAS)9 at the time of serum collection.

The study was carried out according to the 1997 Declaration of Helsinki of the World Medical Association. The design of the work was approved by the ethics committee of the University of Schleswig-Holstein, Campus Lübeck, and each patient gave informed consent before participation in the study.

Methods of ANCA detection

Indirect immunofluorescence

ANCA detection by IFT was carried out on ethanol and formaldehyde fixed leucocytes, as described previously.10 In our laboratory, a positive ANCA is defined as a titre of antibodies higher than 1:16.

ELISA for PR3- and MPO-ANCA

In-house direct ELISAs were undertaken as described earlier.5

Commercial ELISA kits to detect PR3-ANCA and MPO-ANCA

For testing, each company was designated by a letter (A to M) and their kits are identified in this way in the results section and the tables. Participating companies were: AESCU Diagnostics, Axis Shield, Binding Site, Biorad, Euro Diagnostica, Euroimmun, IBL, Inova, Pharmacia, Trinity Biotech, and Wieslab. Serum levels of PR3-ANCA and MPO-ANCA were assessed according to the manufacturers’ instructions. The results were regarded as positive or negative according to the scale provided by each manufacturer.

To confirm the test results, all sera were analysed by capture ELISA and immunoblotting (data not shown).

Samples from the patients with Wegener’s granulomatosis and microscopic polyangiitis were tested in both assays (PR3-ANCA and MPO-ANCA ELISAs) and were not used as controls for PR3 or MPO assays.

Statistical analysis

The diagnostic sensitivity, specificity, and receiver operating characteristics (ROC) were analysed for the different ELISAs and the IFT to estimate the diagnostic performance of the respective tests. The optimal cut off levels for calculation of sensitivity and specificity were determined by ROC curves. Sensitivity and specificity were calculated by 2×2 tables. ROC curves were analysed as described previously.11 In order to assess overall diagnostic performance, the area under the ROC curve (AUC) was calculated and AUC values of all different assays were compared. A difference in AUC between two assays was considered significant at a probability (p) value of <0.05. The MedCalc® software package (Medcalc® Software, Mariakerke, Belgium) was used for data analysis.
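As an illustration of the 2×2 table calculation, the following minimal sketch computes sensitivity and specificity from counts. The function name is hypothetical, and the counts are back-calculated from percentages reported in the results (PR3-ANCA kit B at the manufacturer's cut off: 13.3% of 15 patients with Wegener’s granulomatosis, 100% of 50 controls); they are not taken from the study's raw data.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from the four cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)  # proportion of patients testing positive
    specificity = tn / (tn + fp)  # proportion of controls testing negative
    return sensitivity, specificity


# Assumed counts: 2 of 15 patients positive, 0 of 50 controls positive.
sens, spec = sensitivity_specificity(tp=2, fn=13, tn=50, fp=0)
print(f"sensitivity {sens:.1%}, specificity {spec:.1%}")
```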

RESULTS

Patient characteristics are outlined in table 1. Patients were categorised into two groups according to disease activity at the time of sampling. Thirteen of the Wegener’s patients were classified as having active disease and 10 of the microscopic polyangiitis patients had active disease. Their median BVAS values were 17 and 14, respectively.

Table 1

 Clinical findings and organ involvement in 15 patients with Wegener’s granulomatosis and 15 patients with microscopic polyangiitis

Fifty patients served as controls. Ten of these were diagnosed with vasculitides other than Wegener’s granulomatosis and microscopic polyangiitis (three with Churg-Strauss syndrome, one with giant cell arteritis, one with Takayasu syndrome, one with cryoglobulinaemic vasculitis, two with unclassified small vessel vasculitis and two with connective tissue diseases with secondary vasculitis). In this group, there was one false positive result among the PR3-ANCA tests (kit H) and no false positive results among the MPO-ANCA tests. Ten other subjects served as healthy controls. In all tests, these controls gave a negative result when tested for PR3-ANCA and MPO-ANCA, respectively.

Thirty patients suffering from non-vasculitic disorders were also tested (disease controls). Seven of these had renal insufficiency or glomerulonephritis, 10 had sinusitis, six had bronchial carcinoma and seven had various other disorders. In the disease control group, there were three false positive results (kits F and M) and one false positive IFT result in the PR3-ANCA tests; in the MPO-ANCA tests there were two false positive IFT results and one false positive result with kit B.

Sensitivity and specificity

Sensitivity and specificity varied among the different kits and were dependent on the cut off that had been provided by the manufacturer. Concerning the tests for the detection of C-ANCA and PR3-ANCA, the sensitivity varied greatly if the given cut off values were applied, ranging from 13.3% (kit B) to 73.3% (IFT) (table 2). The in-house ELISA performed second best for sensitivity (66.7%), followed by kits A, G, H, I, J, and K (60%) and kit E (46.7%). Specificity varied much less when the given cut off levels were used: it was 96% for kit M, 98% for kits F and H and the IFT, and 100% for most of the assays (in-house ELISA, A, B, E, G, I, J, K, and L).

Table 2

 Diagnostic performance of commercially available enzyme linked immunosorbent assay kits for detection of PR3-ANCA compared with indirect immunofluorescence using cut off levels provided by the manufacturer

The test for the detection of P-ANCA and MPO-ANCA gave similar results for sensitivity and specificity. There was a wide range of sensitivity, from 26.7% (kit A) to 86.7% (IFT) (table 3). Most of the kits showed a sensitivity in the range of 46% to 67% (kits G, I, and K, 46.7%; kits F, J, and M, 53.3%; kits E and L, and in-house ELISA, 60%; kits B and H, 66.7%). Again specificity was more uniform. All assays except IFT and kit B reached 100% specificity (IFT and kit B, 98%).

Table 3

 Diagnostic performance of commercially available enzyme linked immunosorbent assay kits for detection of MPO-ANCA compared with indirect immunofluorescence using cut off levels provided by the manufacturer

Change of cut off points and its influence on specificity

Sensitivity and specificity could be altered to improve the diagnostic power of the test by changing the cut off levels. Using the optimum cut off for the individual tests, which can be calculated by ROC analysis, sensitivity of the PR3-ANCA tests increased to a range from 53.3% (kits L and M) to 86.7% (kit I). Kit A gave the second highest sensitivity (80%). Most of the kits reached a sensitivity of 66.7% (kits B, E, F, G, H, and K). The gain in sensitivity was achieved at the cost of a minor loss in specificity, which now ranged from 90% (kits F and J) to 100% (kits K and L). Kits A and I gave the best diagnostic performance (AUC 0.900).

In all kits testing for the presence of PR3-ANCA, except for IFT and the in-house ELISA, a change in cut off values was able to increase their overall performance, as shown in tables 4 and 5.

Table 4

 Diagnostic performance of commercially available enzyme linked immunosorbent assay kits for detection of PR3-ANCA compared with indirect immunofluorescence using optimum cut off levels calculated by ROC curve analysis

Table 5

 Diagnostic performance of commercially available enzyme linked immunosorbent assay kits for detection of MPO-ANCA compared with indirect immunofluorescence using optimum cut off levels calculated by ROC curve analysis

By changing cut off levels, sensitivity of MPO-ANCA assays increased to a range from 66.7% (kits A and J) to 93.3% (kits G and K), while specificities were reduced slightly (ranging from 90% (kits E and G) to 100% (kit L)). Kit K had the highest diagnostic power (AUC 0.956). In all of the MPO-ANCA assays, performance (measured by AUC) could be increased by changing the cut off levels.

In a few cases optimum performance (measured by AUC) was accompanied by a loss in specificity to below 90% (kits F and J with PR3-ANCA tests, and kits E and G with MPO-ANCA tests). As specificity should not be lower than 90%, AUC values were calculated on the basis of a minimum specificity of 90%.
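The selection of a cut off under a minimum specificity constraint can be sketched as follows. This is only an illustration: the text does not state which criterion defined the "optimum" point on the ROC curve, so the sketch assumes Youden's index (sensitivity + specificity − 1) as the criterion. The function name and the serum values (arbitrary units) are invented for the example.

```python
def best_cutoff(patient_values, control_values, min_specificity=0.90):
    """Scan candidate cut offs; a value >= cut off counts as a positive test.

    Returns (cutoff, youden_index, sensitivity, specificity) for the candidate
    with the highest Youden index among those meeting min_specificity,
    or None if no candidate qualifies.
    """
    candidates = sorted(set(patient_values) | set(control_values))
    best = None
    for cutoff in candidates:
        sens = sum(v >= cutoff for v in patient_values) / len(patient_values)
        spec = sum(v < cutoff for v in control_values) / len(control_values)
        if spec < min_specificity:
            continue  # enforce the minimum specificity
        youden = sens + spec - 1
        if best is None or youden > best[1]:
            best = (cutoff, youden, sens, spec)
    return best


# Invented example data, not study data.
patients = [3, 8, 12, 15, 22, 30, 41, 55, 60, 80]
controls = [1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 7, 8, 9, 14]

best = best_cutoff(patients, controls)
print(best)
```

With these invented values, lower cut offs that would raise sensitivity further are rejected because they push specificity below 90%.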

As stated above, lowering cut off points to give a better overall diagnostic performance gives a higher sensitivity but means a reduction in specificity. The effect of lowering cut off points is shown in tables 6 and 7. Applying the cut off points provided by the manufacturer, there were three tests among the PR3-ANCA detecting kits that gave false positive test results in the control groups (n = 50), ranging from one to two false positive results per test; healthy controls were all tested as being negative for PR3-ANCA (by all tests). Using optimum cut off levels, nine of the tests gave false positive results in the disease control group, ranging from one to nine false positive results per test. False positive disease controls were distributed relatively equally among the different disease groups (renal, pulmonary, or ENT disease), showing one or two false positive results in one or more of the disease groups. In one case there were four false positive results in the ENT disease control group. Moreover, this kit showed two false positive results in both the renal and the pulmonary disease control groups (PR3-ANCA kit F).

Table 6

 Comparison of false positive and true positive results applying the provided and optimum cut off points in enzyme linked immunosorbent assay kits for detection of PR3-ANCA

Table 7

 Comparison of false positive and true positive results applying the provided and optimum cut off points in enzyme linked immunosorbent assay kits for detection of MPO-ANCA

Moreover, among the healthy controls, there were now five tests giving false positive results (ranging from one to two per test). Kits K and L did not give false positive results in the control groups irrespective of the cut off level and IFT had one false positive result.

However, the reduction in specificity was outweighed by the increase in sensitivity, as shown in table 6. This shows the increase in false positive results among the controls (absolute value and percentage) and the increase of correct results among the disease group (Wegener’s granulomatosis). In most cases, the increase in true positive results exceeded the increase in false positive results (for example, kit A of the PR3-ANCA ELISAs, with 10% additional false positive results and 20% additional true positive results), which indicates an increased performance. The improvement in the tests by changing cut off values was variable: in a few cases, changing cut offs did not change false or true positive test results at all (in-house ELISA, IFT), which means that these assays could not be improved by changing cut off points, whereas in other cases there was a substantial increase in true positive results and only a moderate increase in false positive results (kit B: additional false positive results 4%, additional true positive results 53.4%).
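The comparison for kit A can be reproduced with a short calculation. In this sketch the counts are back-calculated from figures given in the text (sensitivity rising from 60% to 80% of 15 patients, and false positives rising from none to five among 50 controls); the function name is hypothetical.

```python
def incremental_change(tp_before, tp_after, n_patients,
                       fp_before, fp_after, n_controls):
    """Return the additional true and false positive fractions after a cut off change."""
    extra_tp = (tp_after - tp_before) / n_patients
    extra_fp = (fp_after - fp_before) / n_controls
    return extra_tp, extra_fp


# PR3-ANCA kit A, counts assumed from the reported percentages.
extra_tp, extra_fp = incremental_change(
    tp_before=9, tp_after=12, n_patients=15,
    fp_before=0, fp_after=5, n_controls=50,
)
print(f"additional true positives {extra_tp:.0%}, "
      f"additional false positives {extra_fp:.0%}")
```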

In some cases, companies provided a “borderline positive” scale and these results were counted as negative. After lowering cut off points (by applying the optimum cut off), these borderline results then gave a definite positive test result. In the disease group (Wegener’s granulomatosis), kits E, H, and L all gave positive “borderline” results (kits E = 1, H = 1, L = 2), which became positive after lowering the cut off values. These results did not appear as additional false positive or true positive results among the control or disease groups. Similar results were obtained for the MPO-ANCA ELISA tests (table 7).

To summarise, lowering cut off points is in most cases effective in improving diagnostic performance. However, one should be aware of the reduction of specificity (increased false positive results).

ROC analysis

In order to determine the diagnostic performance of the different ELISAs and the IFT, ROC curves were analysed for the respective diagnostic tests. According to a method described by Hanley and McNeil,12 the 95% confidence interval for the area under the ROC curve was used to test the hypothesis that the theoretical area is 0.5. So long as the confidence interval does not include the value 0.5, there is evidence that the test under investigation has the ability to distinguish between disease and controls.
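A minimal sketch of this check, using the standard error approximation published by Hanley and McNeil, is given below. The group sizes (15 patients, 50 controls) and the AUC of 0.900 are taken from this study for illustration only; the intervals actually produced by the MedCalc software are not reported here and may differ. The function name is hypothetical.

```python
import math


def hanley_mcneil_ci(auc, n_patients, n_controls, z=1.96):
    """Approximate standard error and confidence interval for an AUC."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_patients - 1) * (q1 - auc ** 2)
        + (n_controls - 1) * (q2 - auc ** 2)
    ) / (n_patients * n_controls)
    se = math.sqrt(variance)
    lower = max(0.0, auc - z * se)  # an area cannot lie outside [0, 1]
    upper = min(1.0, auc + z * se)
    return se, lower, upper


se, lower, upper = hanley_mcneil_ci(0.900, n_patients=15, n_controls=50)
print(f"SE {se:.3f}, 95% CI {lower:.3f} to {upper:.3f}")
print("discriminates better than chance:", lower > 0.5)
```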

The value of the AUC depends on two factors: first, sensitivity/specificity should be high; second, differences in specificity and sensitivity should be slight. The higher the AUC value, the greater the diagnostic power.
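For readers unfamiliar with the measure, the empirical AUC equals the probability that a randomly chosen patient has a higher test value than a randomly chosen control, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name and values are invented for illustration and are not study data.

```python
def empirical_auc(patient_values, control_values):
    """Area under the empirical ROC curve by pairwise comparison."""
    score = 0.0
    for p in patient_values:
        for c in control_values:
            if p > c:
                score += 1.0   # patient ranked above control
            elif p == c:
                score += 0.5   # tie counts as one half
    return score / (len(patient_values) * len(control_values))


# Invented values in arbitrary units.
auc = empirical_auc([12, 30, 8, 55], [3, 9, 5, 12, 2])
print(auc)
```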

We calculated AUC values for all the tests. In general, the results showed that each of the tests had a high diagnostic power for detecting their respective target (PR3-ANCA or MPO-ANCA). Concerning kits testing for C-ANCA/PR3-ANCA, kits A and I had the highest diagnostic power (AUC 0.900), followed by kit J (AUC 0.896), kit K (AUC 0.875), and IFT (AUC 0.863); however, these differences in AUC among the five best performing kits were not statistically significant (p>0.05). IFT did not show any significant differences in overall diagnostic performance (measured by AUC) compared with any of the ELISA kits. However, the four best performing kits were significantly different from the four worst performing kits (kits F, M, B, and L; p<0.05).

With respect to assays detecting P-ANCA/MPO-ANCA, kit K gave the highest diagnostic power (AUC 0.956), followed by kits F and G (AUC 0.933) and IFT (AUC 0.931). There was no statistically significant difference between the performances of the tests except for kit K (best performance, AUC 0.956) and in-house ELISA (AUC 0.800; p = 0.025). It was remarkable that there was no statistically significant difference between IFT and any of the ELISA kits, as shown for the PR3-ANCA ELISAs.

Inter-test agreement

The inter-test agreement among the different PR3-ANCA kits was between 55% and 100% and the concordance among the MPO-ANCA tests was between 18% and 100%.
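The text does not state how agreement was computed. One simple possibility, shown here purely as an assumption, is the percentage of sera on which two kits give the same positive or negative call. The function name and the calls are invented for the example.

```python
def percent_agreement(calls_a, calls_b):
    """Percentage of samples on which two assays give the same call."""
    if len(calls_a) != len(calls_b):
        raise ValueError("both assays must be run on the same samples")
    matches = sum(a == b for a, b in zip(calls_a, calls_b))
    return 100 * matches / len(calls_a)


# Invented calls for ten sera (True = positive).
kit_x = [True, True, False, False, True, False, False, True, False, False]
kit_y = [True, False, False, False, True, False, True, True, False, False]

agreement = percent_agreement(kit_x, kit_y)
print(f"{agreement:.0f}% agreement")
```

Other definitions, such as agreement restricted to positive results or a chance-corrected statistic, would give different figures.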

DISCUSSION

Along with the clinical presentation and the histological picture, the detection of ANCA is one of the cornerstones of the diagnosis of ANCA associated vasculitis. Current guidelines on ANCA testing recommend dual testing—first by IFT and then by ELISA or other tests.2 Many previous studies have shown that the sensitivity of most direct PR3-ANCA and MPO-ANCA ELISAs is equivalent to standard IFT for ANCA detection using neutrophil cytospin preparations. However, in these studies the specificity of the assays was lower than that of IFT.12–14

The current problem in ANCA testing is the lack of international standardisation of ANCA assays to allow comparison of test results from different laboratories or hospitals. Commercially available kits, as well as in-house tests from different hospitals, have different performance characteristics. This may affect the diagnosis and management of patients with ANCA associated vasculitides.

We have previously undertaken three different studies to evaluate the performance characteristics of commercially available direct and capture ELISA kits for detecting PR3-ANCA and MPO-ANCA in patients with ANCA associated vasculitides.4–6 Direct PR3-ANCA and MPO-ANCA ELISAs have not shown an improvement in performance over the years, although the numbers of test kits have increased. In 1997, we tested eight direct PR3-ANCA and MPO-ANCA ELISA kits, finding sensitivities between 44% and 84% and specificities of 90% to 100%.5 In 2002, we tested 11 direct PR3 and MPO ELISA kits. Sensitivities ranged from 22% to 77% and 45% to 67%, respectively.4 Specificity was 93–100% and 97–100%. The most recent evaluation of six in-house direct PR3-ANCA ELISAs from academic laboratories in 2003 showed sensitivities between 53% and 80% and specificities from 95% to 100%.6

In the present study, we analysed the performance of commercially available direct ELISA kits to detect autoantibodies to PR3 and MPO in patients with ANCA associated vasculitides. In a relatively large number of clinically well defined patients (n = 70) we analysed their serum samples to compare the results of in-house IFT and direct ELISA. In general, if the cut off levels provided by the manufacturer were applied, the tests for direct PR3-ANCA and MPO-ANCA ELISA both showed a wide range of sensitivities, whereas specificity was high among all the tests.

Comparing the test results with previous studies, the performance of direct PR3-ANCA ELISAs was not improved by applying the cut off points provided (sensitivity ranging from 13.3% to 66.7%, specificity from 96% to 100%). Moreover, both sensitivity and specificity decreased. Applying lower cut off points, the sensitivity was similar to the 1997 results (53.3% to 86.7%), but the specificity was lower (82% to 100%). With respect to direct MPO-ANCA ELISAs, sensitivity could not be improved (26.7% to 66.7%); however, specificity has increased over recent years (now, 98% to 100%; in 1997, 90% to 100%). By applying lower cut off points, sensitivity could be improved (60% to 93.3%), but specificity was reduced substantially (90–100%). Specificity was kept at a minimum of 90%. Theoretically, in some cases it would have been necessary to lower specificity below 90% to achieve an optimum AUC. The overall diagnostic performance of the tests could be improved in all tests except for two (PR3-ANCA in-house ELISA and IFT) by lowering the cut off levels. This is generally achieved by an improvement in sensitivity and diagnostic performance, but may sometimes be accompanied by a loss of specificity.

The AUC value served to assess the overall diagnostic performance of a test independent of the manufacturer-provided cut off levels. As stated above, among the PR3-ANCA kits, kits A and I had the best diagnostic performance (AUC 0.900) followed by kit J (AUC 0.896), kit K (AUC 0.875), and IFT (AUC 0.863); however, these differences were not statistically significant. A significant difference in performance was shown by comparing the four best kits with the four weakest kits.

With respect to MPO-ELISA, after lowering cut off levels, there was a relatively uniform performance, with a significant difference occurring only between the best and worst performing kit. Kits F, G, and K gave a marginally better performance than IFT; this, however, was not statistically significant. None of the ELISA kits showed a significant difference in the AUC values, which means that they had equivalent performance after lowering the cut off points.

Previous studies have shown that IFT is superior to direct ELISA. In this study, we found that by changing cut off levels to optimise overall diagnostic performance, all ELISAs (for PR3-ANCA and MPO-ANCA) had a performance equal to IFT, and a few even had a minimally better overall diagnostic performance (measured by AUC) than IFT (kits A, I, J, and K among the PR3-ANCA ELISA kits, and kits F, G, and K among the MPO-ANCA kits); however, these differences were not statistically significant. Comparing the different ELISA kits, there was a significant difference between the four best and the four worst performing PR3-ANCA ELISA kits, while among MPO-ANCA ELISAs there was a more uniform performance, with a significant difference only between kit K and the in-house ELISA.

It is, however, necessary to mention that by lowering cut off points specificity is in some cases substantially reduced. Although by optimising cut off points the percentage of additional true positive results usually exceeded the percentage of additional false positive results, it is important to be aware that the number of false positive results is greater when optimum cut off points are applied. For example, regarding the direct PR3-ANCA ELISAs, the best performing kits A and I had no false positive results when the manufacturer-provided cut off points were applied, while the optimum cut off resulted in five and three false positive results, respectively. Consequently, companies need to be critical when planning to lower cut off points to increase overall diagnostic performance. If a test has a sufficiently high sensitivity, a negative result would rule out the diagnosis of ANCA associated vasculitides. If the test has high specificity, a positive result would support the diagnosis. Thus the cut off level can be set to make the test highly sensitive or highly specific, depending on the target disease. It must be emphasised that the rank orderings of sensitivity and specificity do not imply a definitive overall qualitative ranking of the test kits relative to one another.

Conclusions

Our data show that the diagnostic performance of commercially available direct PR3-ANCA and MPO-ANCA ELISA kits has not improved over the years. Thus the clinician needs to be aware of the different performance characteristics of the various ELISA kits. We propose that by optimising cut off levels using ROC curve analysis, the overall diagnostic performance of many ELISA tests can be improved. Manufacturers of ANCA ELISA kits need to consider the correct cut off level to optimise a test. More studies will be needed to optimise cut off levels and to compare the performance of ELISAs using optimum cut offs.

Acknowledgments

This study was supported by BMBF, grant No 01 G1 9951, Competence Network systemic inflammatory rheumatic diseases.

REFERENCES

Footnotes

  • Published Online First 20 April 2005
