Background Rheumatoid arthritis (RA) disease activity measures in the Veterans Affairs RA (VARA) registry are extracted automatically through natural language processing (NLP). This system extracts data effectively when templated notes are used properly, but it has lacked an error detection and feedback mechanism. Accuracy of the registry is essential for credible epidemiological research and patient care. We report a new automated approach with an active error monitoring and reporting system that alerts providers to missing or potentially erroneous elements, which can be easily corrected using standardized addendums available in the electronic medical record. The automated NLP system was revised to identify, extract, and integrate these updates to support the calculation of DAS28 and other composite outcome measures for the VARA database.
Objectives 1. To describe the systems that identify needed corrections to VARA data.
2. To outline the procedures that allow providers to enter corrections into the medical record through addendums, which are then automatically captured and loaded into the VARA database.
Methods Procedures were developed and tested at a single pilot VARA site using data available in the Corporate Data Warehouse (CDW) from 01/01/2016 to 12/31/2016. A Java program was designed to retrieve Rheumatology notes and their corresponding addendums based on “local” and “national” note titles. Notes were then processed to extract defined elements of RA disease activity listed in the table below. After each scheduled NLP run, the system generates a log file that provides a summary and a patient-level report of completed and missing data elements. Providers receive the report and are asked to review the clinical notes of patient visits with missing data elements and to follow simple procedures that use addendums to add or correct data elements when template violations occur. Addendums are also used to clear the flag and the request for review when the items are not available in the notes. The VARA database is updated from the addendums during the next NLP run.
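The flag-and-correct loop described above can be sketched in Java. This is a minimal illustration under stated assumptions, not the VARA implementation: the class name, method names, and the four element names (`TJC28`, `SJC28`, `ESR`, `PATIENT_GLOBAL`) are hypothetical, and the note is assumed to have already been reduced to a map of extracted values.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NoteCompletenessReport {

    // Hypothetical element names, chosen for illustration only;
    // the registry's actual element list is not reproduced here.
    static final List<String> REQUIRED = List.of("TJC28", "SJC28", "ESR", "PATIENT_GLOBAL");

    /** Returns the required elements that were not extracted from a note. */
    static List<String> missingElements(Map<String, Double> extracted) {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED) {
            if (extracted.get(name) == null) {
                missing.add(name);
            }
        }
        return missing;
    }

    /** Overlays addendum values on the note's values; addendum entries win on conflict. */
    static Map<String, Double> applyAddendum(Map<String, Double> note, Map<String, Double> addendum) {
        Map<String, Double> merged = new HashMap<>(note);
        merged.putAll(addendum);
        return merged;
    }

    public static void main(String[] args) {
        // First run: the note is flagged because two elements are missing.
        Map<String, Double> note = Map.of("TJC28", 6.0, "ESR", 30.0);
        System.out.println("First run, missing: " + missingElements(note));

        // The provider supplies the missing values in an addendum;
        // the next run merges them and re-checks the note.
        Map<String, Double> addendum = Map.of("SJC28", 3.0, "PATIENT_GLOBAL", 40.0);
        System.out.println("Next run, missing: " + missingElements(applyAddendum(note, addendum)));
    }
}
```

Letting the addendum override the original value covers both uses named in the text, adding an element that was missing and correcting one that was entered wrongly. Clearing a flag when an item is genuinely unavailable would need a separate marker, which this sketch omits.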
Results During the pilot testing phase, the automated system processed 516 notes: 489/516 (94.8%) loaded successfully and 27/516 (5.2%) were flagged as problematic because one or more data elements were missing. Misapplication of the template accounted for 21/27 (77.8%) of the notes flagged by the monitoring system, and these notes were corrected with addendums. A subsequent NLP run produced 510/516 (98.8%) completed assessments with calculated DAS28 scores. Specific elements recovered using this process are presented in the table below.
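The counts reported above are internally consistent: the 489 notes that loaded on the first run plus the 21 notes corrected with addendums give the 510 completed assessments. A short Java check of the arithmetic follows; the class and method names are hypothetical and only the counts come from the text.

```java
import java.util.Locale;

public class PilotCounts {

    /** Formats n/d as a percentage with one decimal place. */
    static String percent(int n, int d) {
        return String.format(Locale.ROOT, "%.1f%%", 100.0 * n / d);
    }

    public static void main(String[] args) {
        int total = 516;      // notes processed
        int loaded = 489;     // successful loads on the first run
        int flagged = 27;     // notes with one or more missing elements
        int corrected = 21;   // flagged notes corrected with addendums

        System.out.println("Loaded:    " + percent(loaded, total));
        System.out.println("Flagged:   " + percent(flagged, total));
        System.out.println("Corrected: " + percent(corrected, flagged));
        System.out.println("Completed: " + (loaded + corrected) + "/" + total
                + " = " + percent(loaded + corrected, total));
    }
}
```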
Conclusions The addition of this error monitoring system provides an efficient means of data correction and is expected to motivate and reinforce the use of RA templates. The implications may be profound as we transition from traditional epidemiological research to a more active learning healthcare enterprise. This pilot study established “proof of concept”; the next challenge is to adapt the technology to other VARA and non-VARA sites. This technology and framework could enable collaborative clinical research networks that are committed to large-scale pragmatic and observational effectiveness studies.
Acknowledgements Work sponsored by VA Specialty Care Centers of Innovation, VA Health Services Research and Development.
Disclosure of Interest G. Cannon Grant/research support from: Amgen, S. Mehrotra Grant/research support from: Amgen, B. Sauer Grant/research support from: Amgen