T cell subsets: an immunological biomarker to predict progression to clinical arthritis in ACPA-positive individuals

Objectives: Anticitrullinated protein antibody (ACPA)+ individuals with non-specific musculoskeletal symptoms are at risk of inflammatory arthritis (IA). This study aims to demonstrate the predictive value of T cell subset quantification for progression towards IA and to compare it with previously identified clinical predictors of progression.

Methods: 103 ACPA+ individuals without clinical synovitis were observed 3-monthly for 12 months and then as clinically indicated. The end point was the development of IA. Naïve T cells, regulatory T cells (Treg) and inflammation-related cells (IRCs) were quantified by flow cytometry. Areas under the ROC curve (AUC) were calculated. Adjusted logistic regressions and Cox proportional hazards models for time to progression to IA were constructed.

Results: Compared with healthy controls (age adjusted where appropriate), ACPA+ individuals demonstrated reduced naïve (22.1% of subjects) and Treg (35.8%) frequencies and elevated IRC (29.5%). Of the 103 subjects, 48 (46.6%) progressed. Individually, T cell subsets were weakly predictive (AUC between 0.63 and 0.66), although the presence of 2 T cell abnormalities had high specificity. Three models were compared: model-1 used T cell subsets only, model-2 used previously published clinical parameters and model-3 combined clinical data and T cell data. Model-3 performed the best (AUC 0.79 (95% CI 0.70 to 0.89)) compared with model-1 (0.75 (0.65 to 0.86)) and particularly with model-2 (0.62 (0.54 to 0.76)), demonstrating the added value of T cell subsets. Time to progression differed significantly between the high-risk, moderate-risk and low-risk groups from model-3 (p=0.001; median 15.4 months, 25.8 months and 63.4 months, respectively).

Conclusions: T cell subset dysregulation in ACPA+ individuals predates the onset of IA and predicts both the risk of and faster progression to IA, with added value over previously published clinical predictors of progression.


Reference limits for T-cell subsets
Multiple linear regression was used to assess whether T-cell subset frequencies varied by age or sex. Where associations with age were found, one-sided 95% prediction limits were obtained by calculating two-sided 90% prediction intervals and discarding the upper or lower limit as appropriate; a 90% confidence interval around the resulting limit was calculated. Otherwise, the 5th or 95th centile, estimated assuming a normal distribution, and a robust 90% confidence interval around it were calculated. T-cell subset frequencies found to be skewed were ln-transformed prior to analysis; back-transforming to the original units yielded asymmetric confidence intervals.
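For subsets without an age association, the centile-based limits can be illustrated with a minimal sketch. It assumes normally distributed (or ln-transformed) control data and uses the normal quantile 1.645; the function name and the control values are hypothetical, and the robust 90% confidence interval around the centile is not reproduced here because the source does not specify how it was computed.

```python
import math
from statistics import mean, stdev

Z_ONE_SIDED_95 = 1.645  # standard normal quantile for a one-sided 95% limit


def normal_reference_limits(values, log_transform=False):
    """Return (5th centile, 95th centile) assuming normally distributed data.

    If log_transform is True, the limits are computed on ln-transformed values
    and back-transformed to the original units.
    """
    data = [math.log(v) for v in values] if log_transform else list(values)
    centre = mean(data)
    spread = stdev(data)  # sample SD (divisor n - 1)
    lower = centre - Z_ONE_SIDED_95 * spread
    upper = centre + Z_ONE_SIDED_95 * spread
    if log_transform:
        return math.exp(lower), math.exp(upper)
    return lower, upper


# Hypothetical control frequencies (%), for illustration only
controls = [8.0, 10.0, 12.0]
low, high = normal_reference_limits(controls)
low_ln, high_ln = normal_reference_limits(controls, log_transform=True)
```

Only one of the two limits would be used for a given subset: the lower limit where reduced frequencies are abnormal, the upper limit where elevated frequencies are abnormal.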

Solid line = Lower limit of normal, Dashed line = 90% CI
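To calculate a one-sided 95% prediction interval, a two-sided 90% interval was calculated and the upper limit discarded. The lower reference limit and its 90% confidence interval were calculated as

$$\text{reference limit} = \hat{y}(x) - 1.645\, s \sqrt{1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}}$$

$$\text{90\% CI} = \text{reference limit} \pm 1.645 \times \mathrm{SE}(\text{reference limit})$$

where $\hat{y}(x)$ is the fitted value at age $x$, $s$ is the root mean square error, $n$ is the number of controls, and $\bar{x}$ is the mean age.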
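The age-adjusted lower limit can be sketched as follows, assuming a simple linear regression of (ln-transformed) frequency on age and the standard prediction-interval standard error. The data, function names and the use of the normal quantile 1.645 in place of a t quantile are assumptions for illustration; the confidence interval around the limit is not computed.

```python
import math

Z_ONE_SIDED_95 = 1.645  # two-sided 90% interval, upper limit discarded


def fit_line(ages, values):
    """Least-squares fit of value = intercept + slope * age.

    Returns (intercept, slope, rmse, mean_age, sxx).
    """
    n = len(ages)
    mean_age = sum(ages) / n
    mean_val = sum(values) / n
    sxx = sum((a - mean_age) ** 2 for a in ages)
    sxy = sum((a - mean_age) * (v - mean_val) for a, v in zip(ages, values))
    slope = sxy / sxx
    intercept = mean_val - slope * mean_age
    residual_ss = sum((v - (intercept + slope * a)) ** 2
                      for a, v in zip(ages, values))
    rmse = math.sqrt(residual_ss / (n - 2))  # root mean square error
    return intercept, slope, rmse, mean_age, sxx


def lower_reference_limit(age, ages, values):
    """One-sided 95% lower prediction limit for a new individual of a given age."""
    intercept, slope, rmse, mean_age, sxx = fit_line(ages, values)
    n = len(ages)
    fitted = intercept + slope * age
    se_pred = rmse * math.sqrt(1 + 1 / n + (age - mean_age) ** 2 / sxx)
    return fitted - Z_ONE_SIDED_95 * se_pred


# Hypothetical ln-transformed frequencies in five controls, illustration only
ages = [20, 30, 40, 50, 60]
ln_freq = [1.0, 1.3, 1.1, 1.5, 1.4]
limit_at_40 = lower_reference_limit(40, ages, ln_freq)
limit_at_20 = lower_reference_limit(20, ages, ln_freq)
limit_at_40_percent = math.exp(limit_at_40)  # back-transform to original units
```

The limit sits further below the fitted line at ages far from the mean age, because the $(x - \bar{x})^2$ term widens the prediction interval there.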

Inflammation related cells (IRC)
Data were ln-transformed prior to analysis.
Supplementary Figure 1S: Scatter plot of ln-transformed Treg frequency (%) and age (solid line = lower limit of normal; dashed line = 90% CI).
There was no evidence that T-regulatory cell frequency differed between males and females [age-adjusted geometric mean ratio 1.03 (0.82, 1.22); p=0.976], or that its association with age differed by sex [ratio of differences in slope 1.00 (0.99, 1.02); p=0.677], but there was a statistically significant tendency for Treg cell frequency to be higher in older individuals [by 1.22% (0.50%, 1.94%) per year; p=0.001]. The reference range was therefore adjusted for age but was not stratified by sex. Treg cell frequency was available for 98 controls; mean (SD) age 44.09 (12.30), range 19 to 69.
The Treg lower limit of normal and corresponding 90% confidence interval around it were calculated as = ( (0.

T-cell model of progression to inflammatory arthritis
Binary logistic regression models of the occurrence of progression to IA, and Cox proportional hazards models of time to progression, were constructed. Models were produced sequentially to investigate the effects of adding covariates. Having obtained unadjusted odds ratio estimates, we first specified an adjusted model containing only the T-cell subsets and age (model-1). We then compared results for the variables from the published clinical model (model-2) with a model that added the T-cell subsets (model-3). Analyses were first performed in the subset of patients with full data to permit model performance to be tested.
Link tests were performed to check for specification error in the logistic regression models, and Hosmer and Lemeshow goodness-of-fit tests were performed. Concordance was assessed for the Cox regression models and the proportional hazards assumption was tested. To account for missing data, multiple imputation using chained equations was then used to produce 20 complete datasets, the results from which were combined according to Rubin's rules.
Intermediate models were first constructed to investigate the effects of genetic (SE) and environmental (smoking) factors and to build the final model, from which some clinical parameters had to be eliminated to respect the limitation imposed by our relatively small sample size.
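As an illustration of the pooling step, the sketch below applies Rubin's rules to one coefficient estimated in several imputed datasets. The function name and the example numbers are hypothetical; the source does not state which software was used, and for odds ratios the pooling would normally be done on the log-odds scale before exponentiating.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates and their squared standard errors.

    Returns (pooled estimate, pooled standard error) following Rubin's rules:
    total variance = within + (1 + 1/m) * between.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical log odds ratios and variances from three imputed datasets
log_or = [0.5, 0.7, 0.6]
var_log_or = [0.04, 0.05, 0.03]
pooled_log_or, pooled_se = pool_rubin(log_or, var_log_or)
pooled_or = math.exp(pooled_log_or)  # back-transform to an odds ratio
```

The same function would be called once per model coefficient, with 20 entries per list when 20 imputed datasets are used.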
Model-1: When all three subsets were included in a model with age (Figure 2 and Table 5S), naïve and Treg frequencies were independently associated with progression, notably so when compared with the unadjusted ORs (Table 4S), while the effect of IRC was less prominent. The area under the ROC curve for the predicted probability of progression from this model was 0.75 (95% CI 0.65, 0.85), which represents an improvement over prediction by the three subsets individually (Table 4S).
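The area under the ROC curve equals the probability that a randomly chosen progressor receives a higher predicted probability than a randomly chosen non-progressor, with ties counted as one half. A minimal sketch of that calculation, using hypothetical predicted probabilities rather than study data:

```python
def roc_auc(progressor_scores, non_progressor_scores):
    """AUC as the proportion of progressor/non-progressor pairs ranked correctly."""
    concordant = 0.0
    for p in progressor_scores:
        for q in non_progressor_scores:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5  # ties contribute half a pair
    return concordant / (len(progressor_scores) * len(non_progressor_scores))


# Hypothetical predicted probabilities of progression, illustration only
progressors = [0.8, 0.6, 0.4]
non_progressors = [0.5, 0.3, 0.2]
auc = roc_auc(progressors, non_progressors)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of progressors from non-progressors.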
Model-2: The clinical model consisted of antibody status (RF and/or ACPA titre 3× the upper limit of normal), EMS >30 minutes and physician-assessed small joint symptoms [14].
Within this patient group (n=95), EMS was not independently associated with the odds of progression to IA. When the variables from model-1 and model-2 were combined, with SE and smoking also considered, EMS (p=0.553) and smoking (p=0.627) were the least significant and were therefore removed.
Age was retained (p=0.668) because its removal affected the ORs for naïve and Treg. Having adjusted for age, SE, autoantibody status and joint counts, naïve and Treg frequencies remained independently associated with the odds of progression (Table 5S).
Although this is a relatively large study of ACPA+ at-risk individuals, the sample size has limited the robustness of the statistical modelling. It is recommended that there should be at least 10 cases in the smallest outcome category ('events') per variable (EPV), although it has been shown that valid results can be obtained with EPVs between 5 and 9 provided the results are interpreted cautiously. In model-3 the EPV was 6.9; we therefore feel that these are promising preliminary results, but this model must be considered exploratory until it is validated in a second cohort.
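The EPV check is simple arithmetic and can be sketched as below. The source does not state the counts behind the reported 6.9; 48 events with seven covariates is one combination that reproduces it and is used here purely as an assumption for illustration. The function names and the wording of the categories are likewise hypothetical.

```python
def events_per_variable(n_events, n_variables):
    """Events in the smallest outcome category divided by the number of covariates."""
    if n_variables <= 0:
        raise ValueError("n_variables must be positive")
    return n_events / n_variables


def interpret_epv(epv):
    """Map an EPV onto the thresholds quoted in the text (10, and 5 to 9)."""
    if epv >= 10:
        return "meets the recommended minimum"
    if epv >= 5:
        return "interpret cautiously"
    return "too few events"


# Assumed counts for illustration: 48 events, 7 covariates
epv = events_per_variable(48, 7)
verdict = interpret_epv(epv)
```

Under these assumed counts the EPV rounds to 6.9 and falls in the range where results are usable but should be treated as exploratory.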