
M. Jani (1,2,3), G. Alfattni (4), M. Belousov (4), Y. Zhang (1), M. Cheng (5), K. Webb (5), L. Laidlaw (1), A. Kanter (6,7), W. Dixon (1,2,3), G. Nenadic (4)

  1. The University of Manchester, Centre for Epidemiology Versus Arthritis, Manchester, United Kingdom
  2. The University of Manchester, NIHR Manchester Biomedical Research Centre, Manchester, United Kingdom
  3. Salford Royal Hospital, Northern Care Alliance, Department of Rheumatology, Salford, United Kingdom
  4. The University of Manchester, Department of Computer Science, Manchester, United Kingdom
  5. Salford Royal Hospital, Northern Care Alliance, Data Science, Salford, United Kingdom
  6. Columbia University, Department of Bioinformatics, New York, United States of America
  7. Intelligent Medical Objects (IMO), Rosemont, United States of America


Background Efficient pandemic planning is key to providing a timely response to any developing disease outbreak. For example, at the beginning of the current Coronavirus disease 2019 (COVID-19) pandemic, the UK’s Scientific Committee issued extreme social distancing measures, termed ‘shielding’, aimed at a subset of the UK population deemed especially vulnerable to infection. In April 2020 the British Society for Rheumatology (BSR) issued a risk stratification guide to identify the patients at highest risk from COVID-19 who required shielding. This guidance was based on patients’ age, comorbidities, and immunosuppressive therapies, including biologics that are not captured in primary care records. As a result, rheumatologists needed to manually review outpatient letters to score patients’ risk. The process required considerable clinician time, and shielding decisions were not always communicated transparently.

Objectives Our aim was to develop an automated shielding algorithm by text-mining outpatient letter diagnoses and medications, reducing the need for future manual review.

Methods Rheumatology outpatient letters dated 2013 to 2020 were retrieved from Salford Royal Hospital, a large UK tertiary hospital. For each patient, the two most recent letters created before 01.04.2020, when the BSR guidance was published, were extracted. Free-text diagnoses were processed using Intelligent Medical Objects software [1] (Concept Tagger), which used interface terminology for each condition mapped to a SNOMED CT code. We developed the Medication Concept Recognition tool (MedCore Named Entity Recognition) to retrieve medication type, dose, duration and status (active/past) at the time of the letter. Medication status was established from the heading under which each medication appeared (e.g. past medications, current medications), and also incorporated additional information such as medication stop dates. The age, diagnosis and medication variables were then combined to output the BSR shielding score. The algorithm’s performance was calculated using clinical review as the gold standard.
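
The heading-based status rule described above can be illustrated with a minimal sketch. This is not the MedCore tool itself: the heading names, the stop-word pattern, the function name `medication_status`, and the sample letter text are all hypothetical, chosen only to show how a section heading might set a default status that a stop mention then overrides.

```python
import re

# Hypothetical heading names mapped to a default status; real letters will vary.
HEADING_STATUS = {
    "current medications": "active",
    "past medications": "past",
}

# Hypothetical cue words suggesting a medication has been stopped.
STOP_PATTERN = re.compile(r"\b(stopped|discontinued|ceased)\b", re.IGNORECASE)


def medication_status(letter_text):
    """Return (line, status) pairs for lines found under a medication heading.

    The status defaults to that of the most recent recognised heading and is
    overridden to "past" when the line itself mentions a stop cue.
    Simplification: any line after a medication heading is treated as a
    medication until the next recognised heading.
    """
    status = None
    results = []
    for raw in letter_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key = line.rstrip(":").lower()
        if key in HEADING_STATUS:
            status = HEADING_STATUS[key]
            continue
        if status is None:
            # Not yet inside a medication section; ignore the line.
            continue
        line_status = "past" if STOP_PATTERN.search(line) else status
        results.append((line, line_status))
    return results


# Made-up example text, not taken from any real letter.
example_letter = """Diagnosis
Rheumatoid arthritis
Current medications:
Methotrexate 15 mg weekly
Adalimumab 40 mg, stopped March 2019
Past medications:
Sulfasalazine
"""

for entry in medication_status(example_letter):
    print(entry)
```

In this sketch the second medication is listed under the current heading but is classed as past because of the stop mention, which is the kind of case where heading information alone would misclassify status.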

Results To allow comparison with manual decisions, we focused on all 895 patients who were reviewed clinically. 64 patients (7.1%) had not consented to their data being used for research as part of the national opt-out scheme. After removing duplicates, the algorithm was run on 803 patients. In total, 5,942 free-text diagnoses were extracted and mapped to SNOMED CT, along with 13,665 free-text medications. The automated algorithm demonstrated a sensitivity of 80.3% (95% CI: 74.7, 85.2%) and a specificity of 92.2% (95% CI: 89.7, 94.2%). The positive likelihood ratio was 10.3 (95% CI: 7.7, 13.7), the negative likelihood ratio was 0.21 (95% CI: 0.16, 0.28), and the F1 score was 0.81. The false positive rate was 7.9%, whilst the false negative rate was 19.7%. Further evaluation of false positives and false negatives revealed that clinician interpretation of the BSR guidance and misclassification of medication status were important contributing factors.
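
For readers less familiar with these measures, the sketch below shows how the likelihood ratios follow from sensitivity and specificity using the standard definitions, and how all of the metrics derive from a 2x2 confusion matrix. The function names are illustrative, and the counts passed to `metrics_from_counts` are hypothetical because the abstract does not report the underlying confusion matrix.

```python
def likelihood_ratios(sensitivity, specificity):
    """Standard definitions: LR+ = sens / (1 - spec), LR- = (1 - sens) / spec."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def metrics_from_counts(tp, fp, fn, tn):
    """Compute diagnostic accuracy metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_positive_rate": 1.0 - specificity,
        "false_negative_rate": 1.0 - sensitivity,
        "f1": f1,
    }


# Using the sensitivity and specificity reported in the abstract.
lr_pos, lr_neg = likelihood_ratios(0.803, 0.922)
print(round(lr_pos, 1), round(lr_neg, 2))

# Hypothetical counts, for illustration only (not the study's data).
print(metrics_from_counts(tp=8, fp=2, fn=2, tn=8))
```

Applied to the reported sensitivity and specificity, the first call gives values that round to the likelihood ratios stated above.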

Conclusion An automated algorithm for risk stratification has several advantages, including reducing the clinician time spent on manual review to allow more time for direct care, improving efficiency, and transparently communicating decisions based on individual risk. With further development, it has the potential to be adapted for future public health initiatives that require prompt automated review of hospital outpatient letters.

Acknowledgements MJ is funded by a National Institute for Health Research (NIHR) Advanced Fellowship [NIHR301413]. The views expressed in this publication are those of the authors and not necessarily those of the NIHR, NHS or the UK Department of Health and Social Care.

Disclosure of Interests Meghna Jani: None declared, Ghada Alfattni: None declared, Maksim Belousov: None declared, Yuanyuan Zhang: None declared, Michael Cheng: None declared, Karim Webb: None declared, Lynn Laidlaw: None declared, Andrew Kanter Employee of: AK is a senior advisor and previous Chief Medical Officer at IMO, William Dixon Consultant of: WGD has received consultancy fees from Google unrelated to this work, Goran Nenadic: None declared.

  • Health Services Research
  • Safety
  • Disease-modifying Drugs (DMARDs)
