
Concise report
Genetic variants in the prediction of rheumatoid arthritis
  1. Annette H M van der Helm-van Mil,
  2. René E M Toes,
  3. Tom W J Huizinga
  1. Department of Rheumatology, Leiden University Medical Center, Leiden, The Netherlands
  1. Correspondence to Annette H M van der Helm-van Mil, Leiden University Medical Center, Department of Rheumatology, C1-44, P O Box 9600, Leiden 2300 RC, The Netherlands; avdHelm{at}


Objective To determine whether the currently known genetic risk factors for rheumatoid arthritis (RA) improve the prediction of the development of RA compared to prediction using clinical risk factors alone in patients with undifferentiated arthritis (UA).

Methods Five hundred and seventy early UA-patients included in the Leiden Early Arthritis Clinic cohort, previously used to derive a clinical prediction rule, were used to explore the additional value of genetic variants. The following genetic variants were assessed: HLA-DRB1 shared epitope (SE) alleles, rs2476601 (PTPN22), rs10818488 (TRAF1-C5), rs7574865 (STAT4), rs3087243 (CTLA4), rs4810485 (CD40), rs1678542 (KIF5A-PIP4K2C), rs2812378 (CCL21), rs42041 (CDK6), rs4750316 (PRKCQ), rs6684865 (MMEL1-TNFRSF14), rs2004640 (IRF5), rs6920220 and rs10499194 (TNFAIP3-OLIG3), and the interactions between HLA-SE alleles and rs2476601 (PTPN22) and between HLA-SE alleles and smoking. The area under the receiver operating characteristic curve (AUC) was used as a measure of the discriminative ability of the models.

Results The AUC of a model consisting of genetic variants only was low, 0.536 (95% CI 0.48 to 0.59). The AUC of the model including genetic and clinical risk factors (0.884, 95% CI 0.86 to 0.92) was not superior to the AUC of the clinical prediction rule (0.889, 95% CI 0.86 to 0.95).

Conclusion In a population at risk, information on currently known genetic risk factors for RA does not improve prediction of risk for RA compared to a prediction rule based on common clinical risk factors alone.



Rheumatoid arthritis (RA) is a multifactorial disease; genetic variants account for 50–60% of the total phenotypic variation. For a long time, the HLA-DRB1 shared epitope (SE) alleles were the only known genetic risk factor for RA. Owing to recent technical advances, >10 new genetic risk factors have been identified and widely replicated.1 These findings fuel studies of the underlying pathophysiological mechanisms in RA, but it is equally important to explore their predictive value in clinical practice. Such investigations are preferably conducted in populations that are representative of the setting in which genetic testing will be applied. In an early phase, RA often presents as undifferentiated arthritis (UA). Initiation of disease-modifying antirheumatic drugs in this phase results in less joint destruction during the disease course.2 However, the course of UA is variable; although one-third of patients progress to RA, 40–50% remit spontaneously.3 This underlines the need for tools that estimate the chance of RA in patients with UA. It also makes the UA population well suited to evaluating the predictive ability of genetic variants.

This study aimed to determine the ability of genetic variants that were recently identified as RA-susceptibility factors to predict RA-development in UA-patients. Recently, a prediction rule for RA-development was derived and validated.4 It consists of nine commonly assessed clinical factors: age, gender, morning stiffness, distribution of involved joints, number of swollen and tender joints, C reactive protein (CRP), rheumatoid factor (RF) and antibodies to cyclic-citrullinated peptides (CCPs).5–8 The second aim was to investigate whether genetic risk factors improve the ability to predict RA-development compared to classic risk factors alone.

Patients and methods


Five hundred and seventy early UA patients included in the Leiden Early Arthritis Clinic cohort,9 previously used to derive a prediction rule,4 were now included to explore the value of genetic variants. Patients were included in this cohort if synovitis of recent onset was confirmed by physical examination. At the first visit, questionnaires were completed and a physical examination was performed. Baseline blood samples were taken for determination of CRP, IgM RF and anti-CCP2, and for DNA extraction.9 After 1 year of follow-up, the disease status of all UA patients was examined to determine whether they had developed RA according to the 1987 American College of Rheumatology (ACR) criteria.


The following genetic variants were typed: HLA-SE alleles, rs2476601 (PTPN22), rs10818488 (TRAF1-C5), rs7574865 (STAT4), rs3087243 (CTLA4), rs4810485 (CD40), rs1678542 (KIF5A-PIP4K2C), rs2812378 (CCL21), rs42041 (CDK6), rs4750316 (PRKCQ), rs6684865 (MMEL1-TNFRSF14), rs2004640 (IRF5), rs6920220 and rs10499194 (TNFAIP3-OLIG3). HLA-DRB1 (sub)typing and typing of the single-nucleotide polymorphisms were performed as previously described; error rates were ≤1%.10–16

Statistical analysis

The clinical prediction rule was derived by identifying independent variables using logistic regression analysis with a backward selection procedure and p>0.10 as the removal criterion. Subsequently, these factors were entered in a logistic regression analysis using the enter method and weighted for their predictive ability using the regression coefficients. Combining these factors formed a prediction rule.4 To determine the predictive ability of the genetic variants, the same procedure was performed using the 14 genetic variants and the known gene–gene interaction between HLA-SE alleles and PTPN22.17 Additionally, a logistic regression analysis was performed including both the traditional risk factors and the genetic variants. Here the interactions HLA-SE alleles*smoking and HLA-SE alleles*PTPN22 were also included.17 To evaluate the diagnostic performance of these models, receiver-operating characteristic (ROC) curves were constructed. The area under the ROC curve (AUC) provided a measure of the overall discriminative ability, and the AUCs of the models were compared. The Statistical Package for Social Sciences (SPSS), version 16.0 (SPSS, Chicago, Illinois, USA) was used.
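
As an illustration of what the AUC measures, the following is a minimal sketch, not the SPSS procedure used in the study. It computes the AUC as the proportion of case/non-case pairs in which the case receives the higher model score, with ties counted as one half. The function name `auc` and the toy scores and labels are hypothetical and are not study data.

```python
def auc(scores, labels):
    """AUC as the fraction of (case, non-case) pairs ranked correctly.

    scores: predicted risk per patient (higher = more likely to develop RA)
    labels: 1 if the patient developed RA, 0 otherwise
    Ties between a case and a non-case count as half a concordant pair.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                concordant += 1.0
            elif c == n:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Hypothetical toy data (not from the study): four patients, two of whom develop RA.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auc = auc(toy_scores, toy_labels)
print(toy_auc)  # 0.75: three of the four case/non-case pairs are ranked correctly
```

An AUC of 0.5 corresponds to a model that ranks cases and non-cases no better than chance, and 1.0 to perfect discrimination.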


Results

After 1 year, 177 patients had progressed to RA. Univariate analysis of the individual genetic variants showed a significant association only for HLA-SE (OR 1.7, 95% CI 1.2 to 2.6, p=0.005). A logistic regression on the 14 genetic risk variants and the interaction HLA-SE*PTPN22 yielded a significant result only for the interaction term HLA-SE*PTPN22 (OR 1.89, 95% CI 1.07 to 3.25, p=0.027). The fraction of variance explained by this model was low (Nagelkerke R² 0.05). The AUC of a prediction model consisting of genetic factors was 0.536 (95% CI 0.48 to 0.59).
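
The width of the confidence interval around an AUC depends on the number of patients with and without the outcome. As a rough check, the sketch below applies the Hanley and McNeil approximation for the standard error of an AUC to the counts reported here (177 patients who progressed to RA, 570 − 177 = 393 who did not). This is an assumption for illustration: the text does not state which method SPSS used for the interval, and the function name `hanley_mcneil_ci` is hypothetical.

```python
import math


def hanley_mcneil_ci(auc, n_cases, n_non_cases, z=1.96):
    """Approximate 95% CI for an AUC (Hanley and McNeil approximation).

    This is an illustrative approximation, not necessarily the method
    behind the intervals reported in the article.
    """
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_cases - 1) * (q1 - auc ** 2)
        + (n_non_cases - 1) * (q2 - auc ** 2)
    ) / (n_cases * n_non_cases)
    se = math.sqrt(variance)
    return auc - z * se, auc + z * se


# Counts taken from the text: 570 UA patients, 177 of whom progressed to RA.
low, high = hanley_mcneil_ci(0.536, n_cases=177, n_non_cases=570 - 177)
print(round(low, 2), round(high, 2))  # 0.48 0.59
```

Under this approximation the interval for the genetic-only model agrees with the reported 0.48 to 0.59 and includes 0.5, the value expected for a model without discriminative ability.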

The AUC of the prediction model with only clinical and serological risk factors has been described previously and is 0.889 (95% CI 0.86 to 0.95).

Subsequently, a logistic regression with backward selection procedure was performed with the classic risk factors, the genetic risk factors and the two interactions as possible explanatory variables. The same clinical risk factors as in the prediction rule, with the exception of gender, symmetry and involvement of upper/lower extremities, together with the interaction HLA-SE*smoking, were independently associated with RA-development (table 1). These factors were weighted using the regression coefficients as described. The discriminative ability (AUC) of this model was 0.884 (95% CI 0.86 to 0.92), which was not significantly different from that of the model consisting of classic risk factors (figure 1).

Figure 1

Receiver-operator-characteristic curves of the prediction rule consisting of clinical risk factors and the prediction rule with both clinical and genetic risk factors.

Table 1

Results of logistic regression analysis with backward selection procedure to identify risk factors that are independently associated with RA-development

As it is conceivable that the genetic contribution to the prediction of RA-development differs between patients with a negative and a positive family history, analyses were repeated separately in UA patients with a negative and with a positive family history of RA in first-degree relatives. In patients with a negative family history (n=435), combined information on genetic variants and classic risk factors did not increase the AUC compared to the classic risk factors alone (AUC 0.89, 95% CI 0.86 to 0.92 and 0.86, 95% CI 0.82 to 0.90). The same observation was made in patients with a positive family history (n=135, AUC 0.85, 95% CI 0.79 to 0.92 and 0.84, 95% CI 0.72 to 0.92).


Discussion

New genetic variants associated with RA-susceptibility have been identified. The clinical relevance of these findings is unclear, and it has not been explored whether information on these genetic variants helps to estimate which early UA patients will progress to RA. The present study observed that the known genetic risk factors for RA had insufficient discriminative ability to identify the patients who will develop RA. Additionally, adding information on these genetic variants to a prediction rule consisting of classic risk factors did not improve the predictive ability of this clinical model.

There are several explanations for this finding. First, the effect size of these genetic variants is low.1 Second, some genetic factors may predispose to the development of UA and do not provide additional information for RA-development in UA patients. In addition, several genetic factors are associated with risk factors that were already included in the prediction rule. For example, anti-CCP antibodies are part of the clinical prediction rule, and several genetic variants are specifically reported to be associated with anti-CCP-positive RA and proposed not to be associated with anti-CCP-negative RA (HLA-SE, PTPN22, TRAF1-C5, CTLA4) or have not been tested in anti-CCP-negative RA (CD40, KIF5A-PIP4K2C, CCL21, CDK6, PRKCQ, MMEL1-TNFRSF14). Theoretically, genetic variants may particularly improve prediction beyond traditional risk factors when they are involved in pathways with no measurable clinical or serological factors.18

Genetic variants may improve a clinical prediction rule that contains fewer clinical risk factors. Given that clinical factors are regularly assessed in daily practice, this does not seem cost-effective.

The strength of the current study is the longitudinal design. Following patients with and without risk factors allows direct assessment of the absolute chances of RA. Such chances may also be estimated using data from case–control studies and Bayes' theorem. One reason why such calculations are less accurate than measurements in longitudinal cohort studies is that the case–control studies that detected new genetic variants often contain hyperselected cases (for instance, anti-CCP-positive patients with longstanding RA). Consequently, these results cannot be easily extrapolated to all patients.
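
The Bayes' theorem calculation mentioned above can be sketched as follows: a pre-test probability is converted to odds, multiplied by a likelihood ratio for the test result, and converted back to a probability. The pre-test probability below is the observed 1-year progression rate in this cohort (177 of 570); the likelihood ratio of 1.5 is a purely hypothetical value chosen for illustration and is not an estimate for any variant in this study.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Bayes' theorem in odds form: posterior odds = prior odds x likelihood ratio."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre-test probability must lie strictly between 0 and 1")
    prior_odds = pre_test_probability / (1.0 - pre_test_probability)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Pre-test probability from the text: 177 of 570 UA patients progressed to RA.
pre_test = 177 / 570

# Hypothetical likelihood ratio for a positive genetic test (assumption, not study data).
hypothetical_lr = 1.5

post_test = post_test_probability(pre_test, hypothetical_lr)
print(round(pre_test, 3), round(post_test, 3))
```

The result is only as good as its inputs: a likelihood ratio derived from hyperselected cases need not apply to unselected UA patients, which is the limitation described above.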

The present study has several limitations. First, it was not explored whether gene–gene interactions other than that between the HLA-SE alleles and PTPN22 altered the predictive ability. No other gene–gene interactions for RA-susceptibility have been described. Exploring 14 genetic variants yields 2^14 (16 384) genetic combinations. Entering this number of combinations in the regression analysis would result in overfitting of the data. Second, in inception cohorts the duration of follow-up differs within the study population; at the moment of analysis 94% had been followed for ≥1 year (mean follow-up 8 years, SD 3 years). The 1-year time-point was chosen in order to have a similar duration of follow-up for all studied UA-patients. Studying the total available follow-up revealed that 4.4% of UA patients developed RA later than 1 year. However, although some misclassification may be present, this would not affect the results of the present study, in which the predictive abilities of classic risk factors alone or combined with genetic risk factors were compared.
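
The arithmetic behind the overfitting concern can be sketched briefly. The snippet assumes, as the figure of 16 384 implies, that each of the 14 variants is treated as simply present or absent; the variable names are illustrative.

```python
# Each of the 14 variants is assumed to be coded as present/absent (2 states),
# which is what the figure of 16 384 combinations implies.
n_variants = 14
n_combinations = 2 ** n_variants

# Number of outcome events in the cohort, taken from the text.
n_events = 177

events_per_combination = n_events / n_combinations
print(n_combinations)                    # 16384
print(round(events_per_combination, 4))  # 0.0108
```

With roughly one outcome event per hundred possible combinations, almost all combinations would contain no patient who developed RA, so a model with a term for each combination could not be estimated reliably.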

The assessed genetic variants may not be causally related to RA. Identification of causal variants is mandatory to understand the contribution of a genetic region to RA, but will have limited effect on the predictive ability.

The present study contained 570 patients and is not optimally powered to detect small differences between AUCs. Nevertheless, the differences in AUCs observed in the present study were marginal. Even in the case of a type II error, the difference would not be of clinical relevance.

In conclusion, in a population at risk, information on the currently known genetic risk factors for RA did not result in a better prediction of risk for RA compared to a prediction rule based on common clinical risk factors alone. Therefore at present it is not advocated to determine genetic variants in early UA patients in clinical practice.




  • Competing interests None.

  • Patient consent Obtained.

  • Ethics approval This study was conducted with the approval of the Ethical Committee of the Leiden University Medical Center.

  • Provenance and peer review Not commissioned; externally peer reviewed.