Table 2

A comparison of machine learning models to classify muscle biopsies based on gene expression data. Values are accuracy (%) with 95% CI across the 1000 test sets for each machine learning model classifying muscle biopsies as normal muscle tissue (NT), dermatomyositis (DM), antisynthetase syndrome (AS), inclusion body myositis (IBM) or immune-mediated necrotising myopathy (IMNM).

| Model | NT | DM | AS | IBM | IMNM |
| --- | --- | --- | --- | --- | --- |
| Linear SVM | 94.7 (87.2 to 100.0) | 92.0 (85.1 to 97.9) | 91.0 (85.1 to 95.7) | 95.0 (91.5 to 100.0) | 92.0 (85.1 to 97.9) |
| AdaBoost | 91.5 (83.0 to 97.9) | 89.6 (80.9 to 95.7) | 89.1 (83.0 to 93.6) | 91.9 (80.9 to 97.9) | 85.8 (76.6 to 93.6) |
| Gaussian process | 94.2 (87.2 to 100.0) | 82.9 (74.5 to 91.5) | 87.2 (80.9 to 91.5) | 91.0 (85.1 to 95.7) | 79.6 (68.1 to 89.4) |
| Nearest neighbours | 91.5 (85.1 to 97.9) | 87.8 (80.9 to 95.7) | 87.2 (83.0 to 89.4) | 90.6 (89.4 to 93.6) | 77.4 (66.0 to 87.2) |
| Random forest | 89.7 (83.0 to 95.7) | 85.6 (76.6 to 93.6) | 85.7 (78.7 to 91.5) | 90.4 (87.2 to 93.6) | 78.3 (68.1 to 87.2) |
| Neural network | 89.1 (72.3 to 97.9) | 83.5 (44.7 to 95.7) | 87.4 (74.4 to 93.6) | 91.1 (89.4 to 97.9) | 71.6 (36.2 to 95.7) |
| Decision tree | 87.8 (76.6 to 95.7) | 86.5 (76.6 to 93.6) | 85.0 (74.5 to 91.5) | 85.7 (76.6 to 93.6) | 76.1 (57.4 to 89.4) |
| RBF SVM | 85.1 (85.1 to 85.1) | 82.6 (76.6 to 87.2) | 87.2 (87.2 to 87.2) | 89.4 (89.4 to 89.4) | 64.0 (63.8 to 66.0) |
| Gaussian Naïve Bayes | 85.1 (85.1 to 85.1) | 80.2 (70.2 to 89.4) | 86.4 (83.0 to 89.4) | 89.3 (87.2 to 91.5) | 66.1 (53.2 to 78.7) |
| QDA | 86.5 (78.7 to 93.6) | 63.5 (48.9 to 76.6) | 75.5 (61.7 to 87.2) | 80.4 (68.1 to 89.4) | 63.1 (46.8 to 76.6) |

  • The models are sorted based on the average accuracy of all the groups.

  • AdaBoost, adaptive boosting; QDA, quadratic discriminant analysis; RBF, radial basis function; SVM, support vector machines.
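
The row order described in the first footnote can be checked with a short calculation. The sketch below is illustrative only: it assumes that "average accuracy of all the groups" means the unweighted mean of the five point estimates in each row, which the table does not state explicitly. The names `accuracy`, `mean_accuracy` and `ranking` are hypothetical, and the numbers are the point estimates copied from Table 2 with the confidence intervals omitted.

```python
# Point estimates of accuracy (%) from Table 2, in column order
# NT, DM, AS, IBM, IMNM. Confidence intervals are left out.
accuracy = {
    "Linear SVM":           [94.7, 92.0, 91.0, 95.0, 92.0],
    "AdaBoost":             [91.5, 89.6, 89.1, 91.9, 85.8],
    "Gaussian process":     [94.2, 82.9, 87.2, 91.0, 79.6],
    "Nearest neighbours":   [91.5, 87.8, 87.2, 90.6, 77.4],
    "Random forest":        [89.7, 85.6, 85.7, 90.4, 78.3],
    "Neural network":       [89.1, 83.5, 87.4, 91.1, 71.6],
    "Decision tree":        [87.8, 86.5, 85.0, 85.7, 76.1],
    "RBF SVM":              [85.1, 82.6, 87.2, 89.4, 64.0],
    "Gaussian Naïve Bayes": [85.1, 80.2, 86.4, 89.3, 66.1],
    "QDA":                  [86.5, 63.5, 75.5, 80.4, 63.1],
}


def mean_accuracy(values):
    """Unweighted mean over the five groups (an assumption, see lead-in)."""
    return sum(values) / len(values)


# Sort model names by mean accuracy, highest first.
ranking = sorted(accuracy, key=lambda m: mean_accuracy(accuracy[m]), reverse=True)

for model in ranking:
    print(f"{model:<22}{mean_accuracy(accuracy[model]):6.2f}")
```

Under this assumption the computed ranking matches the row order of the table, with linear SVM first and QDA last.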