Lupus or not? SLE Risk Probability Index (SLERPI): a simple, clinician-friendly machine learning-based model to assist the diagnosis of systemic lupus erythematosus
  1. Christina Adamichou¹,
  2. Irini Genitsaridi¹,
  3. Dionysis Nikolopoulos²,
  4. Myrto Nikoloudaki¹,
  5. Argyro Repa¹,
  6. Alessandra Bortoluzzi³,
  7. Antonis Fanouriakis²,⁴,
  8. Prodromos Sidiropoulos¹,⁵,
  9. Dimitrios T Boumpas²,⁶,
  10. George K Bertsias¹,⁵
  1. Rheumatology, Clinical Immunology and Allergy, University of Crete School of Medicine, Heraklion, Crete, Greece
  2. Rheumatology and Clinical Immunology Unit, 4th Department of Internal Medicine, Attikon University Hospital, National and Kapodistrian University of Athens, Athens, Greece
  3. Section of Rheumatology, Department of Medical Sciences, Azienda Ospedaliero Universitaria di Ferrara Arcispedale Sant'Anna, Cona, Emilia-Romagna, Italy
  4. Rheumatology, “Asklepieion” General Hospital, Athens, Greece
  5. Institute of Molecular Biology and Biotechnology, Foundation of Research and Technology-Hellas, Heraklion, Crete, Greece
  6. Laboratory of Immune Regulation and Tolerance, Autoimmunity and Inflammation, Biomedical Research Foundation of the Academy of Athens, Athens, Attica, Greece

Correspondence to Dr George K Bertsias, Rheumatology, Clinical Immunology and Allergy, University of Crete School of Medicine, Heraklion 700 13, Greece; gbertsias{at}uoc.gr

Abstract

Objectives Diagnostic reasoning in systemic lupus erythematosus (SLE) is a complex process reflecting the probability of disease at a given timepoint against competing diagnoses. We applied machine learning in well-characterised patient data sets to develop an algorithm that can aid SLE diagnosis.

Methods From a discovery cohort of 802 randomly selected adults with SLE or control rheumatologic diseases, clinically selected panels of deconvoluted classification criteria and non-criteria features were analysed. Feature selection and model construction were done with Random Forests and Least Absolute Shrinkage and Selection Operator-logistic regression (LASSO-LR). The model that performed best in 10-fold cross-validation was then tested in a validation cohort (512 SLE, 143 disease controls).
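
As a rough illustration of the approach named above (not the authors' implementation, which the abstract does not detail), the following minimal sketch fits an L1-penalised (LASSO) logistic regression and scores it by 10-fold cross-validation. Everything in it is an assumption made for illustration: the data are synthetic, the penalty strength, step size and iteration count are arbitrary, the solver is a plain proximal gradient descent, and all function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, amount):
    """Shrink value towards zero by amount (proximal step for an L1 penalty)."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def fit_lasso_lr(X, y, lam=0.05, step=0.5, n_iter=150):
    """Fit L1-penalised logistic regression by proximal gradient descent.

    lam, step and n_iter are illustrative values, not taken from the study.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        b -= step * grad_b / n  # the intercept is left unpenalised
        w = [soft_threshold(wj - step * gj / n, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def predict(w, b, x):
    """Return 1 when the modelled probability is at least 0.5, else 0."""
    return 1 if sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) >= 0.5 else 0


def cross_validated_accuracy(X, y, k=10, seed=0, **fit_kwargs):
    """Mean held-out accuracy over k shuffled folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        w, b = fit_lasso_lr([X[i] for i in train], [y[i] for i in train],
                            **fit_kwargs)
        hits = sum(predict(w, b, X[i]) == y[i] for i in fold)
        scores.append(hits / len(fold))
    return sum(scores) / len(scores)


def make_synthetic_data(n=120, seed=1):
    """Synthetic binary features; only the first one carries signal."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [rng.randint(0, 1) for _ in range(4)]
        p = 0.9 if x[0] else 0.1
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic_data()
    print("10-fold CV accuracy:", round(cross_validated_accuracy(X, y), 3))
    print("weights:", [round(v, 2) for v in fit_lasso_lr(X, y)[0]])
```

The L1 penalty shrinks the weights of uninformative features towards zero, which is why LASSO can serve for feature selection and model fitting at once. In practice an established library solver would normally be preferred over this hand-written loop.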

Results A novel LASSO-LR model had the best performance and included 14 variably weighted features, with thrombocytopenia/haemolytic anaemia, malar/maculopapular rash, proteinuria, low C3 and C4, antinuclear antibodies (ANA) and immunologic disorder being the strongest SLE predictors. The model produced SLE risk probabilities (depending on the combination of features) that correlated positively with disease severity and organ damage and allowed the unbiased classification of the validation cohort into diagnostic certainty levels (unlikely, possible, likely, definitive SLE) based on the likelihood of SLE against other diagnoses. Operating the model as binary (lupus/not-lupus), we noted excellent accuracy (94.8%) for identifying SLE, and high sensitivity for early disease (93.8%), nephritis (97.9%), neuropsychiatric (91.8%) and severe lupus requiring immunosuppressives/biologics (96.4%). The model was converted into a scoring system, whereby a score >7 has 94.2% accuracy.
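
To make the binary use of the scoring system concrete, the sketch below sums the weights of the features that are present, applies the >7 cut-off quoted above, and computes accuracy and sensitivity on a toy set of cases. Only the threshold comes from the abstract. The feature names, weights and cases are hypothetical placeholders, deliberately non-clinical, and are not the published SLERPI weights.

```python
from typing import Dict, Iterable, List, Tuple

SCORE_THRESHOLD = 7.0  # from the abstract: a score >7 is read as lupus

# Hypothetical placeholder weights; NOT the published SLERPI weights.
PLACEHOLDER_WEIGHTS: Dict[str, float] = {
    "feature_a": 4.5,
    "feature_b": 3.0,
    "feature_c": 2.0,
    "feature_d": -1.5,
}


def total_score(present: Iterable[str], weights: Dict[str, float]) -> float:
    """Sum the weights of the features that are present."""
    return sum(weights[name] for name in present)


def is_lupus(score: float, threshold: float = SCORE_THRESHOLD) -> bool:
    """Binary call: strictly greater than the threshold."""
    return score > threshold


def accuracy_and_sensitivity(predicted: List[bool],
                             actual: List[bool]) -> Tuple[float, float]:
    """Accuracy over all cases and sensitivity over the true positives."""
    pairs = list(zip(predicted, actual))
    correct = sum(p == a for p, a in pairs)
    true_pos = sum(p and a for p, a in pairs)
    positives = sum(a for _, a in pairs)
    return correct / len(pairs), true_pos / positives


if __name__ == "__main__":
    # Toy cases: (features present, reference diagnosis is SLE)
    cases = [
        ({"feature_a", "feature_b"}, True),
        ({"feature_a", "feature_c"}, True),
        ({"feature_b", "feature_c"}, False),
        ({"feature_a", "feature_b", "feature_d"}, False),
    ]
    calls = [is_lupus(total_score(f, PLACEHOLDER_WEIGHTS)) for f, _ in cases]
    print(accuracy_and_sensitivity(calls, [truth for _, truth in cases]))
```

Because the comparison is strict, a score of exactly 7 is not classified as lupus in this sketch, matching the ">7" wording.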

Conclusions We have developed and validated an accurate, clinician-friendly algorithm based on classical disease features to support early SLE diagnosis and treatment and thereby improve patient outcomes.

  • lupus erythematosus, systemic
  • autoantibodies
  • autoimmune diseases

Data availability statement

Data are available upon reasonable request.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

Footnotes

  • Handling editor Josef S Smolen

  • Twitter @george_bertsias

  • CA and IG contributed equally.

  • Correction notice This article has been corrected since it was published Online First. The provenance and peer review statement has been included.

  • Contributors CA, DN and MN collected data from patient medical charts and also performed data entry. IG designed and implemented the machine learning (ML) methodology, constructed and evaluated the ML models and drafted the relevant methodology sections on feature selection, model construction, evaluation and statistical analysis. AB organised the RedCap database. AR and AF assessed patients enrolled in the study and collected data from patient medical charts. PS and DTB assisted in patient recruitment and critically reviewed the manuscript. GKB conceived and supervised the study, performed statistical analyses and drafted the manuscript.

  • Funding The study was funded by the Hellenic Society of Rheumatology & Professionals Union of Rheumatologists of Greece (protocol number: 644), the Pancretan Health Association and the Foundation for Research in Rheumatology (FOREUM; protocol number: 016BertsiasPrecl), and by the European Research Council (ERC) under the European Union’s Horizon 2020 Research and Innovation programme (grant agreement number 742390) to DB.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.