Machine learning algorithms reveal unique gene expression profiles in muscle biopsies from patients with different types of myositis
  1. Iago Pinal-Fernandez (1,2,3,4)
  2. Maria Casal-Dominguez (1,2)
  3. Assia Derfoul (1)
  4. Katherine Pak (1)
  5. Frederick W Miller (5)
  6. Jose César Milisenda (6)
  7. Josep Maria Grau-Junyent (6)
  8. Albert Selva-O'Callaghan (7)
  9. Carme Carrion-Ribas (3)
  10. Julie J Paik (8)
  11. Jemima Albayda (8)
  12. Lisa Christopher-Stine (2,8)
  13. Thomas E Lloyd (2)
  14. Andrea M Corse (2)
  15. Andrew L Mammen (1,2,8)
  1. 1 Muscle Disease Unit, Laboratory of Muscle Stem Cells and Gene Regulation, National Institute of Arthritis and Musculoskeletal and Skin Diseases, National Institutes of Health, Bethesda, Maryland, USA
  2. 2 Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
  3. 3 Faculty of Health Sciences, Universitat Oberta de Catalunya, Barcelona, Spain
  4. 4 Faculty of Computer Science, Multimedia and Telecommunications, Universitat Oberta de Catalunya, Barcelona, Spain
  5. 5 Environmental Autoimmunity Group, National Institute of Environmental Health Sciences, National Institutes of Health, Bethesda, Maryland, USA
  6. 6 Internal Medicine, Hospital Clinic de Barcelona, Barcelona, Catalunya, Spain
  7. 7 Internal Medicine, Vall d'Hebron General Hospital, Universitat Autonoma de Barcelona, Barcelona, Spain
  8. 8 Division of Rheumatology, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
  Correspondence to Dr Andrew L Mammen, NIAMS/NIH, Bethesda, Maryland, USA; andrew.mammen{at}


Objectives Myositis is a heterogeneous family of diseases that includes dermatomyositis (DM), antisynthetase syndrome (AS), immune-mediated necrotising myopathy (IMNM), inclusion body myositis (IBM), polymyositis and overlap myositis. Additional subtypes of myositis can be defined by the presence of myositis-specific autoantibodies (MSAs). The purpose of this study was to define unique gene expression profiles in muscle biopsies from patients with MSA-positive DM, AS and IMNM as well as IBM.

Methods RNA-seq was performed on muscle biopsies from 119 myositis patients with IBM or defined MSAs and 20 controls. Machine learning algorithms were trained on transcriptomic data and recursive feature elimination was used to determine which genes were most useful for classifying muscle biopsies into each type and MSA-defined subtype of myositis.
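As a rough illustration of the approach described in the Methods, the sketch below runs recursive feature elimination around a linear support vector machine on toy data. It is not the authors' pipeline: the data are synthetic, the problem is reduced to two classes, the classifier is a plain hinge-loss model fitted by subgradient descent, and all names (`train_linear_svm`, `recursive_feature_elimination`, `make_toy_expression`) and parameter values are hypothetical choices made for this example.

```python
import random


def train_linear_svm(X, y, features, epochs=200, lr=0.05, lam=0.01):
    """Fit a linear SVM (hinge loss + L2 penalty) by full-batch subgradient descent.

    Only the columns listed in `features` are used. Returns (weights, bias),
    with one weight per entry of `features`.
    """
    w = [0.0] * len(features)
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            score = sum(wj * xi[f] for wj, f in zip(w, features)) + b
            if yi * score < 1:  # margin violated: hinge subgradient is active
                for j, f in enumerate(features):
                    grad_w[j] -= yi * xi[f] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def recursive_feature_elimination(X, y, n_keep):
    """Refit the classifier repeatedly, dropping the smallest-weight feature each round."""
    features = list(range(len(X[0])))
    while len(features) > n_keep:
        w, _ = train_linear_svm(X, y, features)
        weakest = min(range(len(features)), key=lambda j: w[j] ** 2)
        del features[weakest]
    return features


def make_toy_expression(n_per_class=20, n_genes=6, seed=0):
    """Synthetic 'expression' matrix: genes 0 and 1 differ between classes, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (1, -1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 0.5) for _ in range(n_genes)]
            row[0] += 2.0 * label
            row[1] += 2.0 * label
            X.append(row)
            y.append(label)
    return X, y


X, y = make_toy_expression()
selected = recursive_feature_elimination(X, y, n_keep=2)
print(sorted(selected))  # column indices of the genes that survive elimination
```

In this toy setting the surviving columns should be the two that were constructed to differ between classes. A real analysis would involve several disease classes, thousands of genes, and held-out data for estimating accuracy.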

Results The support vector machine learning algorithm classified the muscle biopsies with >90% accuracy. Recursive feature elimination identified genes that are most useful to the machine learning algorithm and that are only overexpressed in one type of myositis. For example, CAMK1G (calcium/calmodulin-dependent protein kinase IG), EGR4 (early growth response protein 4) and CXCL8 (interleukin 8) are highly expressed in AS but not in DM or other types of myositis. Using the same computational approach, we also identified genes that are uniquely overexpressed in different MSA-defined subtypes. These included apolipoprotein A4 (APOA4), which is only expressed in anti-3-hydroxy-3-methylglutaryl-CoA reductase (HMGCR) myopathy, and MADCAM1 (mucosal vascular addressin cell adhesion molecule 1), which is only expressed in anti-Mi2-positive DM.

Conclusions Unique gene expression profiles in muscle biopsies from patients with MSA-defined subtypes of myositis and IBM suggest that different pathological mechanisms underlie muscle damage in each of these diseases.

  • dermatomyositis
  • polymyositis
  • autoantibodies
  • autoimmune diseases
  • autoimmunity

  • Handling editor Josef S Smolen

  • Contributors All authors have met these four criteria: Substantial contributions to the conception or design of the work, or the acquisition, analysis or interpretation of data. Drafting the work or revising it critically for important intellectual content. Final approval of the version published. Agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

  • Funding This research was supported in part by the Intramural Research Programme of the National Institute of Arthritis and Musculoskeletal and Skin Diseases and the National Institute of Environmental Health Sciences of the National Institutes of Health. The Myositis Research Database and Dr LC-S are supported by the Huayi and Siuling Zhang Discovery Fund. IPF's research was supported by a Fellowship from the Myositis Association. The authors also thank Dr Peter Buck for support.

  • Competing interests None declared.

  • Patient consent for publication Not required.

  • Ethics approval This study was approved by the Institutional Review Boards at participating institutions.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement Data are available upon reasonable request. De-identified RNA-seq data will be made available upon request to Dr Andrew Mammen at
