Objectives Clinical heterogeneity is a cardinal feature of systemic sclerosis (SSc). Hallmark SSc autoantibodies are central to diagnosis and associate with distinct patterns of skin-based and organ-based complications. Understanding molecular differences between patients will benefit clinical practice and research and give insight into pathogenesis of the disease. We aimed to improve understanding of the molecular differences between key diffuse cutaneous SSc subgroups as defined by their SSc-specific autoantibodies.
Methods We have used high-dimensional transcriptional and proteomic analysis of blood and the skin in a well-characterised cohort of SSc (n=52) and healthy controls (n=16) to understand the molecular basis of clinical diversity in SSc and explore differences between the hallmark antinuclear autoantibody (ANA) reactivities.
Results Our data define a molecular spectrum of SSc based on skin gene expression and serum protein analysis, reflecting recognised clinical subgroups. Moreover, we show that antitopoisomerase-1 antibody and anti-RNA polymerase III antibody specificities associate with remarkably different longitudinal change in serum protein markers of fibrosis and divergent gene expression profiles. Overlapping and distinct disease processes are defined using individual patient pathway analysis.
Conclusions Our findings provide insight into clinical diversity and imply pathogenetic differences between ANA-based subgroups. This supports stratification of SSc cases by ANA antibody subtype in clinical trials and may explain different outcomes across ANA subgroups in trials targeting specific pathogenic mechanisms.
- systemic sclerosis
Data availability statement
Data are available upon reasonable request. All data, code and materials used in the analysis are available to any researcher for purposes of reproducing or extending the analysis upon request to the corresponding author.
What is already known about this subject?
Linking skin and protein expression to clinical differences between subgroups in systemic sclerosis (SSc) has been challenging.
The hallmark antinuclear autoantibodies (ANA) used to diagnose SSc also predict clinically important differences in skin and internal organ disease.
What does this study add?
This study uses clinical and ANA heterogeneity across a well-characterised broad SSc cohort to better understand the molecular architecture of early diffuse cutaneous SSc.
We demonstrate for the first time striking differences in longitudinal patterns of serum protein markers between ANA subgroups in SSc.
High-dimensional analysis of skin gene expression with patient-level pathway analysis suggests a biological basis for differences between ANA-based subgroups.
How might this impact on clinical practice or future developments?
Defining the molecular basis for clinical diversity gives insight into SSc disease biology relevant to clinical practice and trial design.
Systemic sclerosis (SSc) patients are characterised by antinuclear autoantibodies (ANA), including antitopoisomerase-1 antibodies (ATA, also known as anti-Scleroderma (Scl)-70), anticentromere antibodies or anti-RNA polymerase III antibodies (ARA).1 Different major organ-based complications link with ANA. For example, ATA is associated with significant interstitial lung fibrosis,1 2 while ARA carries a tenfold increased risk of scleroderma renal crisis.3 These strong associations with specific disease manifestations suggest that there are pathobiological differences beyond ANA underlying diverse clinical outcomes.
The skin and blood are readily accessible to compare gene and protein expression in SSc subgroups to better understand molecular correlates of clinical phenotypes. Skin analysis may be especially informative to understand differences between ANA subgroups because skin changes over time have been linked to ANA reactivities. ARA generally has a higher peak skin score than ATA in early diffuse cutaneous SSc (dcSSc) but faster improvement, whereas ATA may show slower regression.4 5
With the objective of understanding the molecular basis for heterogeneity in SSc, we undertook a detailed longitudinal analysis of skin and blood samples from a cohort of early-stage dcSSc followed over 12 months. This included measurement of serum proteins reflecting pathogenesis or extracellular matrix turnover and with genome-wide assessment of gene expression. To put our findings in the broader context, we also studied late-stage dcSSc and limited cutaneous systemic sclerosis (lcSSc) and have compared our findings with matched healthy control subjects. We have specifically tested the hypothesis that hallmark ANA specific to SSc associate with different patterns of gene expression and proteins reflecting fundamental differences in pathogenesis in dcSSc. Our results strongly suggest that ANA specificity defines distinct biological subgroups within SSc with implications for case selection for clinical trials and therapeutic strategies in clinical practice.
This was a single-centre, prospective observational study comprising four distinct participant cohorts: early dcSSc (<5-year duration), established dcSSc (>5-year duration), lcSSc and healthy volunteers (HC). Blood samples for serum and plasma and in PAXtubes were collected with concomitant 4 mm skin biopsies in RNAlater.
The early dcSSc cohort were reviewed every 3 months for a 12-month period, with blood sample collection and clinical assessment at each visit, and a 4 mm skin biopsy at baseline, month 3 and month 12.
Serum was analysed for soluble markers associated with collagen synthesis and degradation and fibrosis, including the constituents of the enhanced liver fibrosis (ELF) test. RNA expression analysis was performed on the skin and whole blood.
Additional methodology is described in the online supplemental material.
Patients in the prospective cohort were classified as ‘improver’, ‘progressor’ or ‘stable’ according to the change in Modified Rodnan Skin Score (MRSS) at the 12-month time point: a change of at least four points and at least 20% from baseline defined improvers and progressors, and the remaining patients were considered stable. For soluble markers, analysis of variance (ANOVA) with Tukey post hoc analysis or Kruskal-Wallis with post hoc Mann-Whitney U test was used. The Benjamini-Hochberg false discovery rate (FDR) was used to correct for multiple comparisons. All statistical analysis was performed using the software R. Gene set enrichment analysis (GSEA) was used for pathway analysis.
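The skin score trajectory rule described above can be sketched in a few lines. This is a minimal illustration in Python rather than the R code used in the study; the function name `classify_mrss_trajectory` is hypothetical, and the sketch assumes that both thresholds must be met in the same direction and that every other patient is labelled stable.

```python
def classify_mrss_trajectory(baseline: int, month12: int) -> str:
    """Label a patient by the 12-month change in MRSS.

    Assumption (not spelled out in the text): a patient is an 'improver'
    or 'progressor' only if the absolute change is >= 4 points AND
    >= 20% of the baseline score; everyone else is 'stable'.
    """
    if baseline <= 0:
        raise ValueError("baseline MRSS must be positive to compute a percentage change")

    change = month12 - baseline
    abs_change = abs(change)

    # Integer arithmetic avoids floating-point edge cases at exactly 20%.
    meets_points = abs_change >= 4
    meets_percent = abs_change * 100 >= 20 * baseline

    if meets_points and meets_percent:
        return "progressor" if change > 0 else "improver"
    return "stable"


if __name__ == "__main__":
    # Toy scores for illustration only; these are not patient data.
    for baseline, month12 in [(20, 14), (20, 23), (30, 35), (18, 24)]:
        print(baseline, month12, classify_mrss_trajectory(baseline, month12))
```

Requiring both criteria means that a five-point rise from a baseline of 30 (a change of about 17%) is still labelled stable, which is the practical effect of combining an absolute and a relative threshold.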
Patients and HC provided informed consent and attended visits as part of routine care or for purposes of research sampling.
The BIOlogical Phenotyping of diffuse SYstemic sclerosis (BIOPSY) dataset was generated to provide a platform for the integrated analysis of skin and blood samples, together with detailed clinical phenotyping (online supplemental figure 1). The study recruited 52 patients with SSc (21 early dcSSc, 15 established dcSSc and 16 lcSSc) and 16 HC gender-matched to the early dcSSc cohort. Thirty-six (69%) of the patients with SSc were women. Baseline characteristics are summarised in table 1. Mean disease duration in the early dcSSc cohort was 24 months (SD 12 months). ANA frequency in BIOPSY reflected the overall dcSSc population: ATA n=14 (27%), ARA n=12 (23%) and others n=26 (50%), in line with other recent large SSc cohorts.1 5
One patient died during the study period from cardiac complications. Patients were managed in line with current treatment guidelines in the UK.6 As expected, all patients with early dcSSc were on immunosuppression by 12 months, most often (85%) including mycophenolate mofetil (MMF). The doses of corticosteroids used in the prospective dcSSc group did not exceed 10 mg prednisolone a day, and patients on corticosteroids were evenly distributed between the different autoantibody subtypes.
Median MRSS for early dcSSc at baseline was 18 (IQR 19). At a group level, MRSS peaked at 21 (22) at 3 months and fell to 16 (14.25) at 12 months (figure 1). The median MRSS for the established patients with dcSSc was 10 (6.5) and for the lcSSc was 4 (1.25). Lower skin scores were seen in subjects with more established disease of greater than 36-month duration and in cases of early disease with less than 20-month duration. There was no significant relationship between disease duration and baseline MRSS (r=0.133, p=0.575).
For around half of the BIOPSY cohort, MRSS was clinically stable over 12 months. The remaining cases were split between those that significantly worsened (n=4) and those that showed clinically significant improvement (n=5). For the purposes of analysis, prospective dcSSc cases were divided into the three most recognised ANA-based subgroups, namely, ARA, ATA or ‘others’ (which includes ANA positive, extractable nuclear antigen (ENA) negative or alternative ENAs). Group-level change in MRSS for the ANA subgroups is shown in figure 1. Autoantibody subsets (specifically ATA and ARA) were equally distributed across the skin trajectory cohorts.
There was no significant difference in group-level skin score change between different immunosuppressive treatments or between those that were already on immunosuppression and those that started during the first 3 months of follow-up.
Differential longitudinal change in serum protein markers between ANA subgroups
Baseline serum protein marker analysis
At baseline, markers of collagen synthesis discriminated early dcSSc from HCs (online supplemental table 1 and figure 2). Composite fibrotic indices (C3 fibrotic index and C6 fibrotic index) did not outperform the markers of protein synthesis. The ELF composite score discriminated early dcSSc from HCs and was driven largely by type III procollagen peptide (PIIINP) and tissue inhibitor of metalloproteinases-1 (TIMP-1) (figure 2).
Longitudinal serum protein marker analysis in the early dcSSc cohort
Analysis of longitudinal changes in serum proteins over 12 months in early dcSSc explored differences based on skin score trajectory and ANA-defined subgroups.
Only ProC1 showed association longitudinally with skin progression (online supplemental figure 3). There were consistent and remarkable differences in the change in serum proteins between the major ANA-based subgroups (figure 3 and online supplemental figure 3). This was most evident for ELF and its three constituents (PIIINP, hyaluronic acid and TIMP-1) and for ProC1, where there was a linear increase overall for both ARA and ‘other’ groups, whereas ATA showed a decline over time from baseline values.
Integrated transcriptional analysis of the skin
Baseline transcriptional analysis of the skin
To better understand the molecular basis for longitudinal, clinical and serum protein differences between subgroups of SSc, a detailed analysis of global gene expression was undertaken across the BIOPSY cohort for skin and blood RNA.
There was clear differentiation between early dcSSc and HC by principal component analysis (PCA) and unsupervised clustering of significantly differentiated genes (731 genes; FDR <0.001) on baseline samples (figure 4B,C), with established dcSSc and lcSSc having a more similar transcriptional phenotype in the skin.
A focused analysis of early dcSSc and HC baseline skin biopsy samples identified 491 differentially expressed genes (fold change (FC) ≥1.5 and FDR<0.001) that separated these subpopulations and indicated a very distinct molecular signature shared by most cases of early dcSSc (online supplemental figure 4A,B).
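The gene selection step above combines a fold-change cut-off with a Benjamini-Hochberg FDR cut-off. The sketch below shows how such a filter works, in Python rather than the R used in the study. The function names, the gene labels and the p values are all hypothetical toy values, and the sketch assumes that the fold change is a ratio applied symmetrically (at least 1.5-fold up or down), which the text does not state.

```python
from typing import List, Sequence, Tuple


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        candidate = pvalues[idx] * n / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


def select_genes(
    records: Sequence[Tuple[str, float, float]],
    fc_threshold: float = 1.5,
    fdr_threshold: float = 0.001,
) -> List[str]:
    """Keep genes passing both the fold-change and the FDR threshold.

    Each record is (gene label, fold-change ratio, raw p value).
    Assumption: a ratio of 0.5 counts as a 2-fold change (downregulation).
    """
    fdr = benjamini_hochberg([p for _, _, p in records])
    selected = []
    for (gene, ratio, _), q in zip(records, fdr):
        magnitude = max(ratio, 1.0 / ratio)  # symmetric fold change
        if magnitude >= fc_threshold and q < fdr_threshold:
            selected.append(gene)
    return selected


if __name__ == "__main__":
    # Toy records for illustration only; not results from the study.
    toy = [
        ("gene_a", 2.0, 0.0001),
        ("gene_b", 1.2, 0.0001),
        ("gene_c", 0.5, 0.0002),
        ("gene_d", 3.0, 0.5),
    ]
    print(select_genes(toy))
```

In this toy input, `gene_b` fails the fold-change criterion and `gene_d` fails the FDR criterion, so only genes that are both strongly and reliably changed are retained.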
Next, we explored differences in skin gene expression within the patients with early dcSSc based on ANA status. PCA and unsupervised cluster analysis of differentially expressed genes (n=384, p<0.01; FDR 0.4) clearly separated ARA and ATA patients (online supplemental figure 4C,D), with ‘other ANA’ patients being intermediate between ARA and ATA in some cases.
Analysis of ARA and ATA patients with early dcSSc revealed 61 differentially expressed genes at baseline (FC ≥1.5 and FDR<0.1) that fully differentiated these ANA subgroups (figure 4D and online supplemental table 3). These include genes previously associated with fibrosis and SSc showing significant difference between autoantibodies within the early dcSSc subgroup and across the whole SSc spectrum. Examples include inhibin subunit beta A (INHBA), interleukin 6 signal transducer (IL6ST), apelin (APLN) and complement 6 (C6) (figure 5D–G).
Similar analysis was performed on whole blood baseline samples, although we could not identify any genes that would directly differentiate ARA+ and ATA+ cases (online supplemental figure 4E,F).
Longitudinal transcriptional analysis of paired early dcSSc samples at 3 months and 12 months
Longitudinal sampling of the early dcSSc cases at 3 and 12 months showed stability of the gene expression profiles in the skin and blood over time (online supplemental figure 5) suggesting that gene-expression-based classification is a robust assessment that changes relatively little at a global level over 12 months.
SSc-specific gene expression in the skin shows relevant changes across the disease spectrum
To compare our findings with previously reported SSc-associated gene expression signatures in the skin, we used a robust SSc-associated composite signature of SSc-specific genes identified from publicly available gene expression datasets for whole skin.7–11 Our analysis replicated this SSc-associated signature across different time points for the BIOPSY cohort of early dcSSc and showed consistent relevant differences across the BIOPSY cohort for both upregulated and downregulated SSc signature scores (figure 4A). Both the upregulated and downregulated genes of the SSc signature showed differences from healthy controls for all SSc subgroups. The global differences reflected a spectrum of the disease, with greatest difference observed in the baseline early dcSSc and least in established dcSSc and lcSSc. Notably, the signature became attenuated at later time points in the longitudinal cohort and in late-stage dcSSc and lcSSc, in contrast to the relative stability of overall gene expression in BIOPSY for individual patients. This suggests that the global expression signature of SSc reflects stage and severity of skin disease. Overall, the composite disease-associated signature analysis provides strong external validation of our cohort compared with other datasets although likely to be less informative about patient-level MRSS change than our analysis of the prospectively collected and rigorously phenotyped BIOPSY dataset.
Differences in gene expression for ARA-positive and ATA-positive dcSSc compared with healthy controls
To explore similarities and differences between gene expression profiles for the two major ANA antibody subtypes of early dcSSc, we compared the baseline differences between ARA and ATA subgroups and HC in the skin. In the skin, 664 and 903 differentially expressed genes were identified between ATA versus HC and ARA versus HC, respectively, with only 386 transcripts shared between the two disease subpopulations (figure 5A–C). This further suggests meaningful differences between gene expression profiles in the skin of the two ANA-based subgroups.
The same analysis was performed on the transcripts from blood: 430 genes were differentially expressed between ARA and HC and 313 between ATA and HC, with only 59 genes shared between the two disease subpopulations. Unlike the direct comparison between the autoantibody subsets in blood, the analysis including HC allowed us to identify shared upregulated genes in blood.
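The gene counts reported above for the skin and blood can be broken down into subgroup-specific and shared portions with simple set arithmetic. The Python sketch below uses only the counts given in the text; the function name `overlap_summary` is hypothetical, and the Jaccard index (shared genes divided by the union) is added here purely as a convenient summary that the study itself does not report.

```python
def overlap_summary(n_first: int, n_second: int, n_shared: int) -> dict:
    """Split two differentially expressed gene lists into unique and shared parts."""
    if n_shared > min(n_first, n_second):
        raise ValueError("shared count cannot exceed either list size")
    only_first = n_first - n_shared
    only_second = n_second - n_shared
    union = only_first + only_second + n_shared
    return {
        "only_first": only_first,
        "only_second": only_second,
        "shared": n_shared,
        "union": union,
        "jaccard": n_shared / union,  # illustrative summary, not from the paper
    }


# Counts taken from the text: ATA vs HC, ARA vs HC, shared transcripts.
skin = overlap_summary(664, 903, 386)
blood = overlap_summary(313, 430, 59)

if __name__ == "__main__":
    print("skin:", skin)
    print("blood:", blood)
```

On these counts, 278 genes are specific to the ATA comparison and 517 to the ARA comparison in the skin, against 254 and 371 in blood, so fewer than half of the genes in either skin list are shared.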
Patient-level pathway analysis differentiates autoantibody subsets
To better understand the functional significance of differentially expressed genes in skin at baseline, GSEA was used for individualised pathway analysis across the BIOPSY cohort (figure 6A).
The comparison of differentially expressed Hallmark pathways for ATA and ARA versus healthy controls for the skin showed both overlapping pathways and clear differences between the two major ANA subgroups (figure 6B and online supplemental figure 6A–C). None of the gene sets for the parallel whole blood analysis passed the threshold for difference on GSEA. Overlapping Hallmark pathways are linked to aspects of SSc pathobiology that are likely to be shared across dcSSc cases. These pathways include allograft rejection, inflammation, IL6 signalling, transforming growth factor (TGF)-beta signalling, angiogenesis and complement as well as upregulation of interferon (IFN) alpha response. Conversely, oestrogen response and Myc targets are increased for ATA-positive skin but downregulated in ARA compared with HC, while adipogenesis, ultraviolet response down and androgen response are increased in ARA but downregulated in ATA. These data provide insight into differences that could be highly relevant to the clinical, biomarker and gene expression features of these ANA-based subgroups.
In the present study, we have used the intrinsic clinical diversity across the SSc spectrum to help interpret molecular phenotypes and elucidate differences in potential transcriptional drivers in different stages and subsets of disease. This has important implications for both clinical practice and research, especially early-stage drug trials that will necessarily include relatively small numbers of patients, and risk being confounded by clinical and molecular imbalance between treatment arms. By demonstrating for the first time clear differences in serum proteins and skin gene expression between ANA subgroups of early dcSSc, our findings begin to explain how ANA reactivities are such strong predictors of clinical outcome and internal organ involvement.1
The results of serum protein analysis provide an anchor for our findings. We show that serum markers that have been validated as cross-sectional markers of skin fibrosis8 have remarkably different trajectories of change between ANA subgroups, specifically the two dominant ANA reactivities, ATA and ARA. Given our findings, interpretation of the ELF test at a group level in early dcSSc with a mixed ANA profile, and especially over time, may be misleading, despite its well-established correlation with MRSS and forced vital capacity (FVC).12 Unlike previous work on circulating markers of collagen turnover,13 14 we did not identify clear differences in markers of collagen degradation (C1M, C4M and C6M) between disease subgroups.15 One explanation is that while ELF reflects important pathological events in the skin that drive fibrosis, skin score trajectory is also influenced by processes that resolve fibrosis and that are not captured by ELF. Alternatively, it may be that ELF levels in blood reflect multisystem disease outside the skin compartment that is not captured by serial measurement of skin score. At a practical level, our findings highlight the importance of taking antibody subtype into consideration when interpreting potential biomarkers, as the natural trajectory may be intrinsically different.
Taken together, whole skin gene expression analysis differentiates stage and subset of SSc and gives robust insight into the differential gene expression between SSc and HC. Differential gene expression resulted in complete separation of early dcSSc and HC (similar to Skaug et al 16), with limited and established dcSSc also forming moderately distinctive subgroups. As previously reported,8 9 we observed relative stability in gene expression profiles over 3 months and 12 months. Skin transcriptomic differences between ATA and ARA patients with early dcSSc are especially relevant in the context of the contrasting longitudinal changes in serum markers of fibrosis observed in the BIOPSY Study. This implies fundamental differences in skin biology and possibly pathogenic mechanism between ARA and ATA subgroups. This is supported by a recently published analysis of data from the Genetics vs Environment in Scleroderma Outcome Study (GENISOS) cohort, which suggests distinct gene expression differences between major ANA reactivities.17
Our data suggest that a relatively small number of transcripts clearly separate ARA and ATA skin gene expression. Many of these genes have already been shown to have altered expression in SSc (IL6ST (gp130), APLN and C6 18–21) or other fibrotic diseases (INHBA 22). We found shared signatures across these autoantibody subsets, as well as differences that likely contribute to the distinctive clinical phenotype of these autoantibody profiles.
The absence of differentiating transcripts in the blood between ARA+ and ATA+ patients suggests that the key differences reside in the skin and are relevant to the skin pathology and clinical diversity of skin disease associated with these autoantibodies.
Hallmark ANA-associated differences may offer insight into diversity in outcome and treatment response within early dcSSc, including clinical trials. Some recent studies have analysed intrinsic subset gene sets and found that patients who responded to MMF or abatacept (a CTLA4-Ig fusion protein) tended to be in the inflammatory subset11 23 whereas those who responded to dasatinib (a tyrosine kinase inhibitor with antifibrotic potential) were in the fibroproliferative group.24 However, these studies did not look at the differential response to targeted therapies based on antibody subtype. It is possible that the intrinsic gene subsets9 are differentially represented between hallmark ANA subgroups in early-stage SSc and that future classification approaches incorporating both molecular and serological aspects may provide further opportunities for case stratification.
However, molecular differences between ATA and ARA identified in the present study may have relevance to treatment response for skin or internal organ disease in SSc based on other recent trial data. For example, subgroup analysis of recent phase two and phase three trials of tocilizumab in dcSSc suggests treatment benefit was much more marked in ATA-positive patients, where prevention of decline in lung function on tocilizumab was highly significant in ATA-positive subjects but not statistically significant in ATA negative.25–27 In contrast, the RIociguat Safety and Efficacy in patients with diffuse cutaneous Systemic Sclerosis (RISE-SSc) trial of riociguat showed a major benefit preventing MRSS progression in the ARA subgroup and no benefit for the ATA subgroup.28 Finally, the large SENSCIS trial of nintedanib showed a numerically greater preservation of lung function in ATA-negative compared with ATA-positive cases. This is notable because the ATA-negative group also demonstrated numerically greater improvement in MRSS.29 These are consistent with our hypothesis that ANA subgroups may respond differently to therapies targeting specific pathogenic mechanisms in the skin and lung.
These clinical associations raise the possibility that some of the SSc-specific autoantibodies may have a direct role in pathogenesis and that it may differ between ARA and ATA. The strongest evidence is for ATA, where ATA immune complexes (ICs) have a greater effect than ARA-ICs and controls on the IFN mRNA signature in fibroblasts,30 31 a key cell type mediating skin fibrosis in SSc and contributing to the heterogeneity seen in SSc.
Taken together, our findings support the overarching hypothesis that there are distinct but overlapping pathogenic processes linking immunity and fibrosis in the skin in all dcSSc, especially the interplay between adipocyte function, immunity and fibrosis. Thus, in ARA-positive cases, local connective tissue/adipocyte biology may be key to the severity and progression of skin change, and this may be independent of immune cell drive. In contrast, ATA-positive dcSSc may reflect more persistent or refractory immune-cell-driven skin fibrosis that is less dependent on local factors and adipocyte biology. In addition, these observations may fit with novel mechanisms proposed by Lerbs et al 32 linking fibrosis to failed elimination of myofibroblasts. It is plausible that this mechanism is more relevant in ARA-positive cases of dcSSc than ATA.
There are notable strengths to this study. First, this is a well-characterised cohort of patients, prospectively collected with only two assessors performing MRSS (minimising interobserver variability). We present a real-life treated cohort of patients with dcSSc who, as would be expected, developed complications during the study period and had medications changed. By including a broad spectrum of patients with SSc, we can interpret any findings in the context of all patients with SSc.
There are also limitations. Being a single-centre study requiring significant time commitment of subjects meant that it comprised a relatively small cohort of patients. Within the prospective cohort of patients, there are only small numbers of progressors or improvers, so these findings should be interpreted with caution. There were also some missing samples, due to patient refusal or technical difficulties. Although we have speculated about treatment effects, this was an observational study, unable to formally compare treatments between patients.
In conclusion, BIOPSY provides a template for translational research that can integrate clinical observation and modern integrative molecular methods. In this way, we have been able to better understand biological differences between subsets of SSc and the relationship between skin disease, autoantibody subgroup and candidate molecular markers. Our results have implications for clinical practice, trial design and basic science studies of SSc.
Patient consent for publication
Handling editor Josef S Smolen
Contributors All authors contributed significantly to the study design and manuscript and reviewed and edited the final manuscript. Individual contribution as set out below. Conceptualisation: CPD and NW. Methodology: ED-S, NW, CPD, JS and YVT. Investigation: KENC, CC, JS and YVT. Data analysis: KENC, AT, NG and YVT. Supervision: KENC, CC and CPD. Writing—original draft: KENC and CPD. Writing—review and editing: KENC, CC, EC, AT, KN, NG, MAM, JS, YVT, VO, ED-S, NW, SMF and CPD.
Funding This work was funded by a research grant to UCL from GlaxoSmithKline and Medical Research Council UK grant MR/T001631/1 (fellowship to KENC).
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.