Rheumatoid arthritis subtypes identified by genomic profiling of peripheral blood cells: assignment of a type I interferon signature in a subpopulation of patients
  1. T C T M van der Pouw Kraan1,*,
  2. C A Wijbrandts2,*,
  3. L G M van Baarsen1,
  4. A E Voskuyl3,
  5. F Rustenburg1,
  6. J M Baggen1,
  7. S M Ibrahim4,
  8. M Fero5,
  9. B A C Dijkmans3,
  10. P P Tak2,
  11. C L Verweij1
  1. 1VU Medical Center, Department of Molecular Cell Biology & Immunology, Amsterdam, The Netherlands
  2. 2Academic Medical Center, Division of Clinical Immunology and Rheumatology, University of Amsterdam, Amsterdam, The Netherlands
  3. 3VU Medical Center, Department of Rheumatology, Amsterdam, The Netherlands
  4. 4Section of Immunogenetics, University of Rostock, Rostock, Germany
  5. 5Stanford Functional Genomics Facility, Stanford University, Stanford, California, USA
  1. Correspondence to:
    Dr C L Verweij
    VU University Medical Centre, Department of Molecular and Cellular Biology & Immunology, J295, PO Box 7057, 1007 MB Amsterdam, The Netherlands;c.verweij{at}


Background: Rheumatoid arthritis (RA) is a heterogeneous disease with unknown cause.

Aim: To identify peripheral blood (PB) gene expression profiles that may distinguish RA subtypes.

Methods: Large-scale expression profiling by cDNA microarrays was performed on PB from 35 patients and 15 healthy individuals. Differential gene expression was analysed by significance analysis of microarrays (SAM), followed by gene ontology analysis of the significant genes. Gene set enrichment analysis was applied to identify pathways relevant to disease.

Results: A substantially raised expression of a spectrum of genes involved in immune defence was found in the PB of patients with RA compared with healthy individuals. SAM analysis revealed a highly significant elevated expression of interferon (IFN) type I regulated genes in patients with RA compared with healthy individuals, which was confirmed by gene ontology and pathway analysis, suggesting that this pathway was activated systemically in RA. A quantitative analysis revealed that increased expression of IFN-response genes was characteristic of approximately half of the patients (IFNhigh patients). Application of pathway analysis revealed that the IFNhigh group was largely different from the controls, with evidence for upregulated pathways involved in coagulation and complement cascades, and fatty acid metabolism, while the IFNlow group was similar to the controls.

Conclusion: The IFN type I signature defines a subgroup of patients with RA, with a distinct biomolecular phenotype, characterised by increased activity of the innate defence system, coagulation and complement cascades, and fatty acid metabolism.

  • DC, dendritic cell
  • FLS, fibroblast-like synoviocytes
  • GSEA, gene set enrichment analysis
  • IFN, interferon
  • MTX, methotrexate
  • PB, peripheral blood
  • RA, rheumatoid arthritis
  • SAM, significance analysis of microarrays
  • SLE, systemic lupus erythematosus
  • SS, Sjögren’s syndrome

Rheumatoid arthritis (RA) is a systemic autoimmune disease characterised by chronic inflammation of the joints. There is growing evidence that patients with RA, as defined by the American College of Rheumatology Classification Criteria,1 represent a highly heterogeneous group. However, a clinical approach to disease classification could erroneously suggest that the criteria be applied to classify one disease entity. The heterogeneity of RA is indicated by notable variability in clinical presentation, and the presence of distinct autoantibody specificities, such as rheumatoid factor and anticyclic citrullinated peptide antibodies in the serum.2,3

Disease heterogeneity is also apparent in histological features of the synovium, which shows different complexity levels of lymphoid organisation in subsets of patients.4,5 Moreover, gene expression profiling of synovial tissue from patients with RA and those with osteoarthritis revealed marked variation in gene expression profiles that allowed us to identify molecularly distinct forms of RA synovium.6,7

The wide variation in responsiveness to virtually any treatment in RA is also consistent with the heterogeneous nature of the disease.8,9,10 Together, these findings suggest that distinct disease mechanisms are at play in RA pathology. The relative contribution of the different mechanisms may vary between patients and, perhaps, in different stages of disease.

The heterogeneity most likely has its origin in the multifactorial nature of the disease, whereby specific combinations of environmental factors and a varying polygenic background are likely to influence not only susceptibility but also the severity and disease outcome. Findings from genetically identical twins, in whom the concordance rate is far less than complete, are indicative of a major role of an environmental factor in the risk of developing RA.11,12

Given the heterogeneous nature of RA, and its systemic features, we investigated whether this heterogeneity is reflected in peripheral blood (PB) cells, because it can be anticipated that (aetio)pathogenic events in the host are reflected as phenotypic changes in the host cells. Large-scale gene expression profiling of PB cells from patients with RA could thus provide a molecular portrait that reflects the contributions of diverse cellular responses associated with RA, in general, and with disease subtypes, and could thus define the samples’ unique biology.


Patients and controls

PB was obtained in PAXgene RNA isolation tubes (PreAnalytix GmbH, Hombrechtikon, Switzerland) from 35 patients. Of these, 25 used methotrexate (MTX), and 10 patients were MTX and other DMARD naïve. All patients fulfilled the revised American College of Rheumatology 1987 criteria for RA,1 except for three patients in the MTX naive group, who were diagnosed as having probable RA with a disease duration of 3–12 months, as having mono-arthritis and as being positive for anticyclic citrullinated peptide antibodies. Two patients were diagnosed with RA after 6 and 12 months. Table 1 summarises the characteristics of these patients.

Table 1

 Patient characteristics

The control group consisted of 15 healthy individuals (nine women, six men, mean age: 43 years, range 27–63). In all comparisons mentioned, the groups were age- and sex-matched. All patients and controls gave their informed consent, and the study protocol was approved by the medical ethics committees from the Academic Medical Center and VU Medical Centre.

Sample preparation, labelling and hybridisation

This procedure was performed as described previously.13 In brief, total RNA was isolated from PB using the PAXgene RNA isolation kit. Amplified RNA was labelled with aminoallyl-deoxyuridine-5-triphosphate during cDNA synthesis, followed by chemical coupling of the aminoallyl group to Cy3 or Cy5 for the experimental and reference samples, respectively. The labelled cDNA transcripts were hybridised together on human cDNA microarrays with 43 000 elements, representing about 26 000 genes, generated at Stanford University, as described.14

Data filtering and analysis

Data were stored and preanalysed in the Stanford Microarray Database.15 Data are expressed as log2 ratios of fluorescence intensities of the experimental and the common reference sample. Intensity-dependent normalisation using local estimation (Loess) was performed separately on each sector of the array. Spots were included in the analysis when in at least 80% of the microarrays a reliable data point was obtained for that element (defined by a regression correlation coefficient R>0.6 for Cy3 and Cy5 pixel intensities, and a signal intensity of 2.5 times the local background for both Cy3 and Cy5). The use of a common reference allowed comparison of the expression levels across all samples.13 Therefore, the expression levels (as log2 ratios) were median centred; that is, each spotted element was expressed relative to the median expression level of that element across all samples. We corrected for array batch differences by applying “singular value decomposition”.16 Genes represented more than once on the microarrays were averaged in the Stanford Microarray Database from sequences with the same Unigene identifier.
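The spot filter and median centring described above can be illustrated with a minimal sketch. It is not the Stanford Microarray Database procedure itself: the function names, the use of None for an unreliable data point, and the toy values are all assumptions made for illustration.

```python
from statistics import median

def filter_spots(matrix, min_fraction=0.8):
    """Keep spots (rows) with a reliable value in at least min_fraction of arrays.

    Assumption: an unreliable data point is encoded as None.
    """
    kept = []
    for row in matrix:
        present = sum(v is not None for v in row)
        if present / len(row) >= min_fraction:
            kept.append(row)
    return kept

def median_centre(row):
    """Express each log2 ratio relative to the row median over reliable values."""
    med = median(v for v in row if v is not None)
    return [None if v is None else v - med for v in row]

# Hypothetical toy data: rows are spotted elements, columns are samples,
# values are log2 ratios against the common reference.
log2_ratios = [
    [1.0, 2.0, 3.0, 4.0, 5.0],    # reliable in 5 of 5 arrays
    [0.5, None, 1.5, 2.5, 3.5],   # reliable in 4 of 5 arrays (80%), kept
    [None, None, 0.2, 0.4, 0.6],  # reliable in 3 of 5 arrays, dropped
]
centred = [median_centre(row) for row in filter_spots(log2_ratios)]
```

After centring, a value of 0 means that a sample sits at the median expression of that element across all samples, which is the scale used for the colours in the cluster diagrams.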

Statistical analysis

Statistical analysis on microarray data was performed using significance analysis of microarrays (SAM).17 Genes that were expressed at significantly different levels between patients and controls, defined by a q-value of <5%, were analysed by supervised hierarchical clustering18 to visualise the correlation of co-expressed genes in Treeview.

For an interpretation of the biological processes that are represented by the genes that show a significantly different level of expression in patients with RA compared with the controls, we applied gene ontology analysis in the PANTHER database. PANTHER uses the binomial statistics tool to compare our gene list with a reference list (NCBI: Homo sapiens genes) to determine the statistically significant over-representation of functional groups of genes. A Bonferroni correction was applied to adjust for multiple comparisons. p Values <0.05 were considered significant.
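The over-representation test can be sketched as a one-sided binomial test followed by a Bonferroni adjustment. This is a simplified illustration, not PANTHER's implementation: it assumes the expected probability of a category is its share of the reference list, and the function names and arguments are hypothetical.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def overrepresentation(hits, list_size, category_size, reference_size, n_tests):
    """Raw and Bonferroni-adjusted p value for one functional category.

    hits: genes from the submitted list that fall in the category
    list_size: number of genes in the submitted list
    category_size / reference_size: assumed expected fraction under the null
    n_tests: number of categories tested (the Bonferroni multiplier)
    """
    p_expected = category_size / reference_size
    p_raw = binomial_upper_tail(hits, list_size, p_expected)
    p_adjusted = min(1.0, p_raw * n_tests)
    return p_raw, p_adjusted
```

A category would then be reported as significant when the adjusted p value is below 0.05, matching the threshold stated above.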

For pathway analysis, we used gene set enrichment analysis (GSEA).20 Like SAM, it utilises data permutation to adjust for multiple testing, indicated by a false discovery rate. A total of 408 pathways from the Kyoto Encyclopedia of Genes and Genomes and Biocarta were applied in this analysis. The same gene may be present in more than one pathway or biological process. In addition, we incorporated several interferon (IFN)-response gene sets from published data.21,22 A minimal gene set size of 20 genes per pathway was applied, and pathways with a p value <0.05 and a false discovery rate of <0.25 were considered significant, according to the authors’ suggestions.20 For the comparison of mean gene expression levels in different gene sets, a Student’s t test was used.
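The selection criteria for pathways can be written down as a small filter. The sketch below only applies the three thresholds quoted above to results that are assumed to have been computed already; it does not perform GSEA, and the record type, field names and example gene sets are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class GeneSetResult:
    name: str       # pathway or gene set label
    size: int       # number of genes in the set
    p_value: float  # nominal p value from the enrichment analysis
    fdr: float      # false discovery rate from data permutation

def significant_sets(results, min_size=20, max_p=0.05, max_fdr=0.25):
    """Names of gene sets that meet the size, p value and FDR thresholds."""
    return [
        r.name
        for r in results
        if r.size >= min_size and r.p_value < max_p and r.fdr < max_fdr
    ]

# Hypothetical results, invented for illustration only.
example_results = [
    GeneSetResult("set_A", 45, 0.001, 0.02),  # passes all thresholds
    GeneSetResult("set_B", 12, 0.001, 0.02),  # too few genes
    GeneSetResult("set_C", 60, 0.040, 0.30),  # FDR too high
    GeneSetResult("set_D", 30, 0.080, 0.10),  # p value too high
]
selected = significant_sets(example_results)
```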


Gene expression profiling in PB cells of patients with RA

Gene expression profiling of PB cells from 32 patients with RA, 3 patients with probable RA, and 15 age- and sex-matched healthy controls was performed on microarrays with a complexity of ∼26K unique genes (43K elements). Data were analysed as unpaired data belonging to two classes using SAM.17 A total of 577 genes, of which 259 were upregulated and 318 downregulated, were selected because their transcript levels differed significantly between the two groups. The significant gene expression differences between patients with RA and healthy controls were visualised in a heatmap (fig 1A).18

Figure 1

 (A) Cluster diagram of the expression of 577 significantly differentially expressed genes in 35 patients and 15 healthy individuals. Genes are organised by hierarchical clustering, based on the overall similarity in expression patterns.18 Red represents relative expression greater than the median expression level across all samples, and green represents an expression level lower than the median. Black indicates intermediate expression. Grey indicates missing data. Coloured bars to the right identify the locations of a category of clustered genes, with a correlated expression profile and related function. (B) Representation of the interferon (IFN)-response gene cluster with an enhanced expression in the group of patients with rheumatoid arthritis (RA). An expanded view of the genes in the IFN-response cluster of (A) is shown. Genes are either known genes with a unigene symbol characteristic for the defined gene cluster, or unknown, indicated by an accession number or unigene cluster ID. (C, D) The IFN-response programme is present in patients with RA irrespective of treatment. Representation of genes that are expressed at significantly different levels between patients with RA undergoing (C) or patients not undergoing (D) methotrexate (MTX) treatment and age- and sex-matched healthy controls. A selection of genes with a correlated expression profile that are indicative of an IFN-response programme is shown.

Genes upregulated in RA

A global view of the significantly differentially expressed genes revealed a prominent cluster of IFN-inducible genes that was upregulated in patients with RA. This cluster, highlighted in fig 1B, contains highly correlated genes such as IFRG28 (28 kDa interferon responsive protein), IFI35 (interferon-induced protein 35), IFI44L (interferon-induced protein 44-like), IFIT1 (interferon-induced protein with tetratricopeptide repeats 1), IFIT2, IRF2 (interferon regulatory factor 2), IRF7, GIP2 (interferon α-inducible protein 2), GIP3, SERPING1 (serine proteinase inhibitor clade G member 1, C1 inhibitor), OAS1 (2′-5′-oligoadenylate synthetase 1), OAS2, MX1 (Myxovirus resistance 1), G1P2/ISG15 (interferon-induced protein 15) and RSAD2 (radical S-adenosyl methionine domain containing 2).

In addition, all patients showed increased expression of several inflammatory mediators, including the chemokines CXCL12, CXCL9, CCL15, CCL19, CCL7, CXCL3 and CCL8, as well as interleukin (IL)19 and the S100 family proteins S100 calcium-binding protein A8 (S100A8), S100A11 and S100A12. Other genes that were upregulated in patients with RA were members of the antioxidant metallothionein family and the anti-inflammatory IL1 receptor antagonist.

Genes downregulated in RA

Genes that showed a lower expression in RA included CD3ζ, TCRβ chain, TARP/TCRγV9, granzyme M, runx3 and KLRB1, which are involved in cytotoxic functions, and many other genes with unknown function.

Gene ontology analysis of genes with significant differential expression in RA

To systematically categorise the 577 genes with significant differential expression into functional groups, we used the PANTHER database consisting of a large collection of protein families that have been subdivided into functionally related subfamilies.19 The differentially upregulated genes represented seven significant functional biological processes (table 2). There were no significant downregulated ontology groups. The “immunity and defense” ontology group represented a broad composite family that consists of more specified ontology subgroups. Within these subgroups, the most significant upregulated process that distinguished patients with RA from controls was “interferon-mediated immunity”.

Table 2

 Ontology analysis of the genes that were expressed at significantly different levels between all 35 patients, or the subgroups consisting of 20 IFNhigh and 15 IFNlow patients, and healthy controls

Pathway analysis

In addition, we performed GSEA20 to identify pathways relevant to RA. In contrast to ontology analysis, this algorithm is based on the usage of all available gene expression data and derives its power from the analysis of sets of genes that are coordinately regulated in a defined biological process or pathway, while it uses data permutation to adjust for multiple testing. In addition to the intrinsic GSEA pathway gene sets, we included previously reported IFN-response sets in our analysis.21,22 The results revealed that, besides the five GSEA intrinsic gene sets (table 3), the previously described type I IFN-induced genes by Baechler et al22 (in their supplementary data) and the IFNα-induced genes were both significantly increased in patients with RA.

Table 3

 Pathways which are overexpressed in all patients, and in the subgroups of IFNhigh and IFNlow patients, all compared with healthy controls, analysed by gene set enrichment analysis

IFN-induced genes in RA

We confirmed expression of key genes of the IFN pathway, RSAD2 and G1P2, in all samples by real-time PCR, which showed a high correlation with the microarray data (r = 0.78 and 0.87, respectively, p<0.001 in both cases, data not shown). To rule out an effect of MTX treatment on the IFN-induced genes, we compared patients undergoing (n = 25) and not undergoing (n = 10) MTX treatment with the appropriate age- and sex-matched controls. SAM revealed that both groups of patients showed a prominent IFN-induced gene expression signature (fig 1C, D). Thus, the IFN expression signature was present in patients with RA irrespective of MTX treatment.

The expression profiles of the three patients with probable RA, within the MTX-naive group, did not differ from those of the other MTX-naive patients. Meanwhile, two of the patients with probable RA were diagnosed with “definite” RA at 6 and 12 months after PAXgene blood sampling, respectively, suggesting that the RA signature was present in the blood before diagnosis.

Selective upregulation of type I IFN-response genes in RA

Type I IFNs are mainly produced directly after viral infection, whereas type II IFNs are secondarily produced by activated T cells and natural killer cells. Type I and type II IFN response programmes share many of their genes. To disclose information on the inducing type of IFN, we obtained a specific type I IFN- and type II IFN-response gene set21 (supplementary table 1). The type I IFN-response set consists of five genes that respond to both IFNα and IFNβ, but not to IFNγ. The type II IFN-response set consists of 13 genes responding specifically to IFNγ.21 To investigate the relative contribution of either gene set to the RA gene expression profile, we calculated for each gene set the mean gene expression level (log2 ratio) per patient and per healthy control, and compared the two groups with each other (fig 2). This analysis showed that the mean gene expression level of the type I IFN gene set was significantly higher in the group of patients with RA (p<0.001), whereas the mean gene expression level of the type II IFN genes was similar between patients and controls. Hence, these findings provide evidence that type I IFNs rather than type II IFNs are responsible for the increased expression of IFN-induced genes.
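The comparison of gene set scores between groups can be sketched as follows: a per-individual mean log2 ratio over the genes of a set, followed by a two-sample Student's t statistic with pooled variance. The function names and the dictionary layout are assumptions for illustration, and the gene labels in the usage example are placeholders rather than members of the published gene sets.

```python
from math import sqrt
from statistics import mean, variance

def gene_set_score(expression, gene_set):
    """Mean log2 ratio over the genes of one set for a single individual.

    expression: dict mapping gene label to log2 ratio (assumed layout).
    """
    return mean(expression[gene] for gene in gene_set)

def student_t(group_a, group_b):
    """Two-sample Student's t statistic (pooled variance) and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df

# Hypothetical individual with placeholder gene labels.
individual = {"gene_a": 1.0, "gene_b": 3.0, "gene_c": 10.0}
score = gene_set_score(individual, ["gene_a", "gene_b"])  # mean of 1.0 and 3.0
```

The p value would be obtained from the t distribution with the returned degrees of freedom, for example with a statistics package; that step is left out here to keep the sketch free of external dependencies.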

Figure 2

 Type I interferon (IFN)-induced genes are overexpressed in rheumatoid arthritis (RA). Each square indicates the mean expression levels of genes known to be specifically induced by either type I (13 genes) or type II IFN (5 genes) per individual patient or per healthy control (HC). These genes are extracted from the gene sets used for pathway analysis in table 3.21

The IFN signature defines a subgroup of patients with RA

Consistent with the heterogeneous nature of RA, we observed that the IFN response showed a large variation between patients with RA (figs 2 and 3). To obtain more insight into the differential expression of IFN-induced genes in individual patients, we calculated for each individual the average expression of the IFN-cluster genes that were upregulated in patients with RA, as described in fig 1B. Next, we defined which patients show an altered IFN response, by calculating the 95% limits of the controls (normal values, defined as the mean (SD) expression of the 43 IFN genes, ±1.96 SD). We identified 20 patients with an average expression level above normal values, further defined as the IFNhigh group, while the remainder of the patients, with an expression level equal to that of controls, was defined as the group IFNlow (fig 3).
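The stratification rule described above can be sketched in a few lines: the normal range is taken as the control mean ± 1.96 SD, and a patient whose average IFN-cluster expression lies above the upper limit is labelled IFNhigh. The sketch assumes the sample standard deviation and treats every patient not above the upper limit as IFNlow; the function names and the toy scores are hypothetical.

```python
from statistics import mean, stdev

def normal_range(control_scores, z=1.96):
    """Lower and upper 95% limits of the controls (mean ± z × SD).

    Assumption: SD is the sample standard deviation of the control scores.
    """
    centre = mean(control_scores)
    spread = stdev(control_scores)
    return centre - z * spread, centre + z * spread

def classify(patient_scores, control_scores):
    """Label each patient IFNhigh if above the upper control limit, else IFNlow."""
    _, upper = normal_range(control_scores)
    return ["IFNhigh" if score > upper else "IFNlow" for score in patient_scores]

# Hypothetical average IFN-cluster expression (log2 ratio) per individual.
controls = [-0.5, 0.0, 0.5]
patients = [1.2, 0.3, -0.1]
labels = classify(patients, controls)
```

With these toy values the control mean is 0 and the SD is 0.5, so the upper limit is 0.98 and only the first patient exceeds it.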

Figure 3

 A subgroup of patients with rheumatoid arthritis (RA) show increased expression of interferon (IFN)-response genes (IFNhigh). Each square represents a single individual with the average expression ratio of all 43 IFN-response genes, which are shown as a distinct cluster in fig 1A and B. The shaded box indicates the normal range within the 95% confidence limits. Patients with RA outside the shaded box are defined as the IFNhigh group. HC, healthy control.

Distinct characteristics of the IFNhigh group

To further characterise the IFNhigh group, we performed SAM analysis, which revealed that 484 genes were upregulated in IFNhigh patients, compared with the healthy controls, while 229 genes were downregulated. The same analysis for the IFNlow patients revealed only 57 upregulated genes and 93 downregulated genes. These data indicate that, within the patients with RA, the patients with an IFN signature represent the most distinct group compared with normal controls.

When we applied gene ontology analysis, we found that nearly all of the processes that were identified as upregulated in the whole RA group were also upregulated in the IFNhigh group. Moreover, an additional group of 10 biological processes were selectively upregulated in the IFNhigh group (table 2). No downregulated processes were identified.

Gene ontology analysis of the IFNlow group revealed no significant downregulated or upregulated processes.

In accordance with gene ontology analysis, pathway analysis by GSEA revealed that the IFNhigh patients were responsible for the upregulated pathways in the overall RA group (table 3). This was particularly clear for the IFN-type I induced gene sets, complement and coagulation cascades. On the other hand, the IFNlow group was associated with increased expression of the “neuroactive ligand–receptor interaction” pathway. We did not identify any downregulated pathways in either group of patients with RA. Overall, these analyses indicate that, within the whole group of patients, the IFNhigh group is more distinct from controls than the IFNlow group.

The molecular stratification of RA was not associated with clinical parameters described in table 1.


Since RA is a systemic disease, several investigators addressed the question whether disease characteristics are reflected by changes in gene expression levels in PB cells. Whereas these studies provided insight into the genes that were differentially expressed between patients with RA and healthy controls, the issue of transcript-based disease heterogeneity has not been addressed so far, except for a comparison between recent-onset arthritis and longstanding disease.23

Using large-scale gene expression profiling, we identified a large number of genes, including genes involved in the immune/inflammatory response, such as the previously described calcium-binding proteins S100A8, S100A12 and IL1RA.24,25 Pathway level analysis was used to classify gene expression data in biological processes and pathways. The clear induction of IFN-response genes in patients with RA prompted us to incorporate several IFN-response gene sets from published data21,22 in the analysis. This analysis revealed that the type I IFN-mediated immunity was the most significantly upregulated pathway in patients with RA, independent of MTX treatment. Although inclusion of the type I IFN gene set is a biased decision, this approach provides a method to demonstrate the significance of the type I IFN response programme in RA.

Most interestingly, our analysis revealed a striking heterogeneity between patients with RA on the basis of the differential expression of genes involved in the innate defence system—in particular, the type I IFN system. These findings suggest that different pathogenic mechanisms may contribute to the disease. The IFNhigh group was further characterised by gene sets reflecting increased activity of complement and coagulation cascades.26 Next to complement activation, the other pathways associated with the IFN type I signature, such as “fatty acid metabolism” and “coagulation” may all contribute to the increased risk for cardiovascular disease in a subgroup of patients with RA.27

The most significant genes from the complement and coagulation pathway are indicated in fig 1, including complement subcomponent C1q chain B (C1QB), coagulation factor XII (F12), tissue plasminogen activator (PLAT) and serpin peptidase inhibitor, clade G (C1 inhibitor), member 1 (SERPING1). These genes encode both activating and inhibitory components of the pathways.

Upregulation of IFN-induced genes has also been observed in PB cells of (a subset of) patients with other autoimmune diseases like systemic lupus erythematosus (SLE),22,28 systemic sclerosis,29 multiple sclerosis,13 and in tissues from patients with Sjögren’s syndrome (SS),30 type I diabetes mellitus31 and dermatomyositis.32 These findings suggest that an activated IFN gene expression programme is a common hallmark in chronic autoimmune diseases.

Type I IFNs, which are the early mediators of the innate immune response that influence the adaptive immune response through direct and indirect actions on dendritic cells (DCs), T- and B cells, and natural killer cells, could affect the initiation or amplification of autoimmunity and tissue damage through their diverse and broad actions on almost every cell type and promotion of T helper 1 responses.33 This appears to be the case for SLE, but for RA both clinical and pathophysiological data have suggested that tumour necrosis factor alpha (TNFα) rather than type I IFN is essential for persistence of the disease. Hence, it is believed that mutually exclusive cytokine expression patterns are characteristic for distinct autoimmune diseases. However, since we observed an IFN type I signature in the PB of a subgroup of patients with RA this could mean that cytokine profiles are a patient-specific rather than a disease-specific phenomenon.

In patients with SLE, the IFN signature is related to disease severity.22 It is at present unclear what the role of type I IFNs in RA pathogenesis could be. In analogy to systemic sclerosis29 and multiple sclerosis,13 no clinical associations were found for RA so far. We have previously suggested that IFN/STAT-1 activation in RA synovium could be a reactive attempt to limit inflammation.34 This suggestion was recently supported by a study showing that IFNβ deficiency could prolong experimental arthritis and resulted in increased activation of fibroblast-like synoviocytes (FLS) in vitro.35 In addition, IFNβ-competent fibroblasts were able to ameliorate arthritis in IFNβ-deficient recipients. It should be noted, however, that systemic administration of IFNβ was unsuccessful in the treatment of RA, which may be due to pharmacokinetic issues.36

Concerning the origin of type I IFNs, infectious and endogenous agents, such as viruses, bacteria, unmethylated CpG DNA, single- or dsRNA, heat shock protein 60, or fibrinogen fragments, could all be proximal mediators of type I IFN production and thus lead to the more downstream activation programme. In sera from patients with SLE, IFNα levels correlate with IFN-response gene expression levels of PB cells.37 For SLE and SS, it has been demonstrated that immune complexes of autoantibodies and DNA- or RNA-containing autoantigens can induce type I IFN production by PB plasmacytoid DCs.28,30,38,39 This response depends on interaction with FcγRIIa and Toll-like receptors.30,39 Further studies are needed to determine whether the increased type I IFN-response genes in RA are the result of endogenous or infectious factors.

Besides a role for PB cells as producers of type I IFNs, cells at the site of inflammation may also be responsible for production. Cells with morphological and phenotypic characteristics of plasmacytoid DCs were shown to infiltrate skin lesions in SLE and actively produce type I IFN locally.40 In patients with SS, numerous IFNα-producing cells were detected in the affected salivary gland biopsies.30 In RA, IFNβ protein has been detected in the synovium.41 Moreover, FLS were responsible for increased levels of IFNβ in the RA synovium.42 The endogenous TLR3 ligand, dsRNA, derived from necrotic synovial fluid cells, has been shown to stimulate the production of IFNβ in RA FLS.43

In conclusion, we demonstrated that genomic profiling powers disease subclassification and has led to the identification of subgroups of patients, on the basis of differential expression of genes involved in non-specific immunity.



  • * These authors contributed equally to this work.

  • Published Online First 18 January 2007

  • This work was supported in part by the EURO-RA Marie Curie Trainings network, the European Community’s FP6 integrated program funding (AUTOCURE), the Innovation Oriented research Program (IOP) on Genomics and the Centre for Medical Systems Biology (a centre of excellence approved by the Netherlands Genomics Initiative/Netherlands Organization for Scientific Research). This publication reflects only the author’s views. The European Community is not liable for any use that may be made of the information herein. These sponsors had no involvement in the study design, analysis or interpretation of the data and publications.

  • Competing interests: None.