
Antifibrotic factor KLF4 is repressed by the miR-10/TFAP2A/TBX5 axis in dermal fibroblasts: insights from twins discordant for systemic sclerosis
  1. Maya Malaab1,
  2. Ludivine Renaud1,
  3. Naoko Takamura1,
  4. Kip D Zimmerman2,
  5. Willian A da Silveira3,
  6. Paula S Ramos1,4,
  7. Sandra Haddad5,
  8. Marc Peters-Golden6,
  9. Loka R Penke6,
  10. Bethany Wolf4,
  11. Gary Hardiman3,
  12. Carl D Langefeld2,
  13. Thomas A Medsger7,
  14. Carol A Feghali-Bostwick1
  1. 1Medicine, Medical University of South Carolina, Charleston, South Carolina, USA
  2. 2Biostatistical Sciences and Center for Public Health Genomics, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA
  3. 3School of Biological Sciences, Institute for Global Food Security, Queen's University Belfast, Belfast, UK
  4. 4Public Health Sciences, Medical University of South Carolina, Charleston, South Carolina, USA
  5. 5Science, Bay Path University, Longmeadow, Massachusetts, USA
  6. 6Internal Medicine, University of Michigan Michigan Medicine, Ann Arbor, Michigan, USA
  7. 7Medicine, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
  1. Correspondence to Dr Carol A Feghali-Bostwick, Medicine, Medical University of South Carolina, Charleston, South Carolina, USA; feghalib{at}


Objectives Systemic sclerosis (SSc) is a complex disease of unknown aetiology in which inflammation and fibrosis lead to multiple organ damage. There is currently no effective therapy that can halt the progression of fibrosis or reverse it, thus studies that provide novel insights into disease pathogenesis and identify novel potential therapeutic targets are critically needed.

Methods We used global gene expression and genome-wide DNA methylation analyses of dermal fibroblasts (dFBs) from a unique cohort of twins discordant for SSc to identify molecular features of this pathology. We validated the findings using in vitro, ex vivo and in vivo models.

Results Our results revealed distinct differentially expressed and methylated genes, including several transcription factors involved in stem cell differentiation and developmental programmes (KLF4, TBX5, TFAP2A and homeobox genes) and the microRNAs miR-10a and miR-10b which target several of these deregulated genes. We show that KLF4 expression is reduced in SSc dFBs and its expression is repressed by TBX5 and TFAP2A. We also show that KLF4 is antifibrotic, and its conditional knockout in fibroblasts promotes a fibrotic phenotype.

Conclusions Our data support a role for epigenetic dysregulation in mediating SSc susceptibility in dFBs, illustrating the intricate interplay between CpG methylation, miRNAs and transcription factors in SSc pathogenesis, and highlighting the potential for future use of epigenetic modifiers as therapies.

  • systemic sclerosis
  • fibroblasts
  • scleroderma, systemic

Data availability statement

Data are available in a public, open access repository. Data are available upon reasonable request. All data relevant to the study are included in the article or uploaded as supplementary information. The data supporting the findings of this study are available within the paper and its supplementary information files. Normalized or raw intensity data of the HM450K BeadChips are available upon request from the authors on a collaborative basis. RNAseq data have been deposited on the NCBI GEO under access # GSE153880.


Key messages

What is already known about this subject?

  • Systemic sclerosis (SSc) is a complex connective tissue disease of unknown aetiology.

  • Disease concordance in twins with SSc is low, implicating a potential role for epigenetics in disease manifestation.

What does this study add?

  • Using global DNA methylation and gene expression, we identified novel programmes deregulated in dermal fibroblasts of a unique cohort of twins discordant for SSc.

  • Using gain and loss of function studies, we confirmed the functional impact of differentially expressed and methylated transcription factors and miRNAs in mediating dermal fibrosis in SSc.

How might this impact on clinical practice or future developments?

  • Since no effective therapy that can halt the progression of fibrosis exists, novel insights into disease pathogenesis and identification of novel potential therapeutic targets are critically needed.


Systemic sclerosis (SSc) is a complex connective tissue disease whose hallmarks include autoimmunity, inflammation, fibrosis and vasculopathy. SSc predominantly affects women and is the connective tissue disease with the worst survival.1 Diffuse cutaneous SSc (dcSSc) is characterised by rapid development of skin, lung and other organ fibrosis within 3–5 years, after which skin fibrosis regresses but damage to internal organs persists. Limited cutaneous SSc (lcSSc) shows gradual skin progression with delayed internal organ involvement. Most research studies and clinical trials have focused on dcSSc,2 although dermal fibrosis is a hallmark of both dcSSc and lcSSc. The aetiology and molecular mechanisms underlying SSc remain elusive and no effective therapeutic treatment exists.

In monozygotic (MZ) twins, the rate of disease concordance is low (4.2%),3 suggesting an important role for epigenetic and environmental factors in SSc susceptibility in addition to a genetic predisposition.4 Environmental factors can influence a trait through epigenetic regulation, and epigenetic marks can impact gene expression, thus governing cell function and response to environmental stimuli.5 Additionally, early mutations that occurred before primordial germ cell specification have recently been shown to contribute to the phenotypic discordance observed in MZ twins,6 a contribution previously underestimated. Due to replication errors, mutations specific to one twin occurred in 15% of MZ twin pairs, revealing the importance of early cell lineages and mutations that are dependent on DNA methylation. Epigenetic mechanisms play a role in the pathogenesis of autoimmune diseases, and epigenome-wide association studies revealed the existence of differentially methylated (DM) regions associated with systemic lupus erythematosus7 and psoriasis,8 two other autoimmune rheumatic diseases. Inhibition of DNA methyltransferases has shown promising antifibrotic properties in SSc fibroblasts.9–11 Epigenetic changes have been reported in SSc fibroblasts12 13 and are emerging as important mediators of disease processes including fibrosis and angiogenesis.11 As a result, epigenetic modifying drugs may provide viable therapies in SSc.9

To our knowledge, our study is the first to combine RNA sequencing (RNAseq) and genome-wide DNA methylation analyses in dermal fibroblasts (dFBs), the effector cells in fibrosis,14 of twins discordant for SSc. This unique cohort is the ideal design to assess the role of epigenetic variations in disease aetiology.15

Our study identified several transcription factors (TFs) involved in stem cell differentiation that were DM and differentially expressed (DE) in SSc, including homeobox (HOX) genes, TBX5 and TFAP2A, suggesting that SSc is mediated by an aberrant resumption of developmental pathways. MicroRNAs (miRNAs) of the miR-10 family encoded within HOX gene clusters have a direct regulatory role on these TFs, and are also DE, highlighting the deregulation of complex mRNA–miRNA regulatory networks in SSc. We also identified KLF4 as a major regulator of fibrosis, cell differentiation and extracellular matrix (ECM) accumulation, and determined that TBX5 and TFAP2A regulate KLF4, which in turn regulates WNT and HOX genes. Mice with conditional fibroblast-targeted loss of KLF4 showed increased dermal hydroxyproline levels, and dFBs from these mice had increased expression levels of fibrotic genes. Over-expression of KLF4 in dFBs and human skin tissues in organ culture prevented the TGFβ-induced fibrotic response. Our data suggest that epigenetic events shape the fibrosis in dFBs that leads to SSc.


See online supplemental file 1 for details of the methods.


Gene expression profiling of twins discordant for SSc

The DE analysis ‘lcSSc twins versus healthy twins’ (n=8) returned 897 DE genes in dFBs of lcSSc twins (online supplemental figure 1A and table 1), 511 upregulated (q<0.1, log2FC>0.6) and 386 downregulated (q<0.1, log2FC<−0.6) genes. The heatmap showed two clear clusters for healthy and lcSSc samples (online supplemental figure 1B). To validate these results, another DE analysis was computed for ‘lcSSc twins versus unrelated controls’ (n=8) and revealed 2164 DE genes in lcSSc twins (online supplemental figure 2A,B and table 2) out of which 308 DE genes are in common with DE genes from the ‘lcSSc twins versus healthy twins’ comparison (online supplemental figure 2C and table 3). One striking feature of the lcSSc twins’ dataset is that 12 homeobox genes are present, 5 of which are significantly upregulated (HOXB3, HOXB7, HOXB8, HOXC10, PITX1) and 7 are significantly downregulated (HOXB13, HOXD3, HOXD9, HOXD10, HOXD11, HOXD13, HOXD-AS2) (figure 1A; online supplemental figure 2D). Of note, WNT5A was significantly upregulated in dFBs of lcSSc twins (online supplemental figure 2E).
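The significance criteria quoted above (q<0.1, log2FC>0.6 for upregulation, log2FC<−0.6 for downregulation) can be illustrated with a minimal Python sketch. The function name `classify_de` and the toy gene names and values are assumptions introduced for illustration only; they are not taken from the study data or from the authors' analysis pipeline.

```python
def classify_de(q_value, log2_fc, q_cutoff=0.1, lfc_cutoff=0.6):
    """Label one gene using the thresholds quoted in the text."""
    if q_value < q_cutoff and log2_fc > lfc_cutoff:
        return "up"
    if q_value < q_cutoff and log2_fc < -lfc_cutoff:
        return "down"
    return "ns"  # not significant under these criteria


# Hypothetical (q-value, log2FC) pairs, for illustration only.
toy_results = {
    "GENE_A": (0.01, 1.2),   # passes both cutoffs, positive change
    "GENE_B": (0.04, -0.9),  # passes both cutoffs, negative change
    "GENE_C": (0.30, 2.0),   # large change but q-value too high
    "GENE_D": (0.02, 0.3),   # significant q-value but change too small
}

labels = {gene: classify_de(q, lfc) for gene, (q, lfc) in toy_results.items()}

# A log2FC cutoff of 0.6 corresponds to roughly a 1.5-fold change.
fold_change_cutoff = 2 ** 0.6
```

On a linear scale, the log2FC cutoff of 0.6 therefore selects genes whose expression changes by about 1.5-fold or more in either direction.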

Figure 1

RNAseq and DNA methylation results. (A) RNAseq revealed that several homeobox genes are DE in dFBs of lcSSc twins compared with healthy twins and validated with unrelated controls. Significance criteria in RNAseq analysis: q-value <0.1, log2FC>0.6 for upregulation, log2FC<−0.6 for downregulation. (B) Validation of several genes of interest by qPCR in lcSSc twins (lcSSc) and unrelated controls (Ctrl). (C) Homeobox genes DE in dcSSc twins compared with healthy twins. Significance criteria in RNAseq analysis: q-value <0.1, log2FC>0.6 for upregulation, log2FC<−0.6 for downregulation. (D) Venn diagram for the comparison of DE genes in dcSSc and lcSSc twins. (E) Genes with increased and reduced methylation status identified in the DNA methylation analysis of dFBs. Significance criteria in DNA methylation analysis: p value <10⁻⁴, W Beta >0.20 for increased methylation, W Beta <−0.20 for reduced methylation. (F) Genes that are simultaneously DE and differentially methylated in dFBs of lcSSc (validated dataset) and dcSSc twins. dcSSc, diffuse cutaneous SSc; DE, differentially expressed; dFBs, dermal fibroblasts; lcSSc, limited cutaneous SSc; SSc, systemic sclerosis.

The systems level analysis showed enrichment of pathways and ontologies pertaining to ECM components, organ morphogenesis, regulatory region involved in DNA binding and integral component of the plasma membrane (table 1; online supplemental table 4). Note that the HOX genes are hit in the query list for ‘animal organ morphogenesis’ and ‘transcription regulatory region sequence-specific DNA binding’ along with several T-box genes (TBX2, TBX3, TBX15) and TFs that bind the consensus sequence 5'-GCCNNNGGC-3' (TFAP2A, TFAP2B) (online supplemental figure 2E). We used qPCR to validate several genes of interest (figure 1B). The expression levels of TFAP2A were increased in lcSSc samples compared with unrelated controls while HOXD10 and HOXD11 levels were reduced, in agreement with the RNAseq data. A significant decrease in expression levels of KLF4 was observed, consistent with DE analysis ‘lcSSc twins versus unrelated controls’ (online supplemental table 2). We also detected an increase in HOXA13 levels and a decrease in HOXC5 levels in lcSSc samples (figure 1B).

Table 1

Functional enrichment in lcSSc twins

The comparison of ‘dcSSc twins versus healthy twins’ (n=7) returned 76 DE genes (online supplemental figure 3A and table 5). The heatmap revealed that the transcriptomic signatures of dcSSc and healthy samples were not markedly different, therefore no clear clustering of the samples was observed; dcSSc and healthy samples appear to be randomly intertwined (online supplemental figure 3B), suggesting that the transcriptomic signature of dFBs in patients with dcSSc is not as deep and well-defined as in their healthy twins. Three HOX genes are present in this dataset (figure 1C: HOXB-AS3 is downregulated and HOXD10/HOXD11 are upregulated in dFBs of dcSSc twins) as well as the T-box gene TBX5 (upregulated; online supplemental table 5). The systems level analysis did not identify any pathways, likely due to a limited number of DE genes (table 2; online supplemental table 6).

Table 2

Functional enrichment in dcSSc twins

By comparing the DE genes in lcSSc and dcSSc twins, we identified nine genes in the intersect (figure 1D): CDK18, IL-6, KLF5, LIF, MST1R and RGCC were commonly upregulated in both disease subtypes while HOXD10, HOXD11 and PRDM6 were upregulated in dcSSc twins but downregulated in lcSSc twins.

Methylation profiling of twins discordant for SSc

Methylation profiling of 10 twin pairs discordant for SSc identified 174 DM CpG sites (p<10⁻⁴) between healthy and SSc twins. Of these, 67% (116 CpG sites) mapped to gene bodies, corresponding to 83 distinct genes, while the remainder mapped to intergenic regions (online supplemental table 7). A total of 55 CpG sites showed a large reduction (W Beta <−0.20) and 16 showed a large increase (W Beta >0.20) in DNA methylation status in the SSc twins. At the gene level, 35 DM genes were identified (p<10⁻⁴) in dFBs of SSc twins, of which 13 had increased methylation and 22 had decreased methylation (figure 1E).

Next, we sought to identify genes that are both DM and DE (figure 1F). In lcSSc twins, HOXB3 and TFAP2A showed an increase in both gene expression and DNA methylation while HOXB8 and HOXC10 had an increase in gene expression but a reduction in DNA methylation status as compared with healthy twins (online supplemental figure 4A). In dcSSc twins, TBX5 and UNC5B were upregulated and less methylated compared with their healthy twins (online supplemental figure 4B).
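The relationship between methylation and expression for these six genes can be tabulated in a short Python sketch. The directions of change are those reported in the paragraph above; the dictionary layout, the function name `relationship` and the labels "inverse" and "concordant" are assumptions introduced here for illustration.

```python
# Direction of change in SSc twins versus healthy twins, as reported in the text.
# +1 = increased, -1 = decreased.
dm_de_genes = {
    "HOXB3":  {"subtype": "lcSSc", "expression": +1, "methylation": +1},
    "TFAP2A": {"subtype": "lcSSc", "expression": +1, "methylation": +1},
    "HOXB8":  {"subtype": "lcSSc", "expression": +1, "methylation": -1},
    "HOXC10": {"subtype": "lcSSc", "expression": +1, "methylation": -1},
    "TBX5":   {"subtype": "dcSSc", "expression": +1, "methylation": -1},
    "UNC5B":  {"subtype": "dcSSc", "expression": +1, "methylation": -1},
}


def relationship(expression, methylation):
    """Return 'inverse' when the two changes have opposite signs, else 'concordant'."""
    return "inverse" if expression * methylation < 0 else "concordant"


by_relationship = {"inverse": [], "concordant": []}
for gene, change in dm_de_genes.items():
    label = relationship(change["expression"], change["methylation"])
    by_relationship[label].append(gene)

inverse_genes = sorted(by_relationship["inverse"])
concordant_genes = sorted(by_relationship["concordant"])
```

Four of the six genes show the inverse pattern (less methylation, more expression), while HOXB3 and TFAP2A are upregulated despite increased methylation.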

miR-10a and miR-10b mediate HOXD10, TFAP2A, TBX5 and COL1A1 dysregulation

The HOX gene clusters have coevolved with and contain many miRNAs,16 including miR-10a and miR-10b (figure 2A). miR-10a is located upstream of HOXB4 on chromosome 17, while miR-10b is located upstream of HOXD4 on chromosome 2. Both miR-10a-5p and miR-10b-5p expression levels were significantly reduced in lcSSc dFBs compared with healthy twins (figure 2B–C). In the dcSSc subtype, both miRNAs were also decreased, although not significantly for miR-10b-5p. Since miR-10a-5p and miR-10b-5p are downregulated, we expect a negative correlation with their target genes. Accordingly, TFAP2A and TBX5 levels were upregulated in lcSSc and dcSSc dFBs, respectively (figure 1B; online supplemental table 5). This negative correlation was also observed with HOXD10 in dcSSc dFBs (figure 1C) but not in the lcSSc dFBs in which HOXD10 is downregulated (figure 1A,B), suggesting that other regulatory mechanisms govern HOXD10 in lcSSc dFBs.

Figure 2

Both miR-10a and miR-10b mediate the dysregulation of TFAP2A, TBX5 and COL1A1 in SSc dFBs. (A) miR-10a and miR-10b are encoded from the HOX clusters, upstream of HOXB4 (chromosome 17) and HOXD4 (chromosome 2) respectively. (B) Levels of expression of miR-10a-5p in lcSSc and dcSSc dFBs compared with healthy dFBs. (C) Levels of expression of miR-10b-5p in lcSSc and dcSSc dFBs compared with healthy dFBs. (D) Comparison of predicted miR-10a/miR-10b target genes (obtained from TargetScan database) and validated DE genes in lcSSc twins. Genes in red font are upregulated, and genes in blue are downregulated. Note that HOXD10 is downregulated in lcSSc twins while upregulated in dcSSc twins. (E) Effect of antagomiR-mediated miR-10a-5p (inhib miR-10a) and miR-10b-5p (inhib miR-10b) silencing on HOXD10 (n=7), TFAP2A (n=10) and TBX5 (n=7) expression levels in normal dFBs transfected for 48 hours. Combined inhibition (inhib combined) was induced by simultaneous delivery of both miR-10a-5p and miR-10b-5p antagomiRs. SCR: scramble antagomiR. (F) Effect of miR-10a and miR-10b mimics in SSc dFBs on HOXD10, TFAP2A, TBX5 and COL1A1 protein abundance by immunoblotting (n=5). The concentration of mimic transfection was 50 mM for individual mimic and 25 mM for each mimic when transfected together (mimic combined). *p<0.05, **p<0.01, ***p<0.001, ****p<0.0001. Error bars=SEM. dcSSc, diffuse cutaneous SSc; DE, differentially expressed; dFBs, dermal fibroblasts; SCR, scramble mimic control; SSc, systemic sclerosis.

These two miRNAs share many target genes because they have seed sequences that only differ by one base. By comparing the list of predicted gene targets obtained from TargetScan for miR-10a-5p and miR-10b-5p to the DE genes lists in lcSSc (validated) and dcSSc twins, we determined that 10 predicted targets were present as DE genes in our RNAseq dataset: in lcSSc twins ALPL, ANK1, CADM1, HOXB3, HOXD10, SOBP, TFAP2A and VASH1, and in dcSSc twins HOXD10, TBX5 and UNC5B (figure 2D). Silencing of both miR-10a-5p and miR-10b-5p in normal dFBs (single or combined) significantly increased TFAP2A and TBX5 mRNA levels (figure 2E), while only the silencing of miR-10b-5p induced a significant increase in HOXD10 expression levels. This is consistent with previous reports stating that miR-10b targets HOXD10 for silencing.17–19 In a complementary approach, transfection of SSc dFBs with miR-10a-5p and miR-10b-5p mimics (individually or combined) significantly reduced HOXD10 protein abundance (figure 2F), confirming the efficacy of the mimics and showing that miR-10a is also able to decrease HOXD10 in SSc dFBs. The combination of both mimics also reduced the TFAP2A, TBX5 and COL1A1 protein levels, and the miR-10a-5p and miR-10b-5p mimics individually exerted a significant translational silencing of TBX5. These results suggest that miR-10a-5p and miR-10b-5p, whether individually or in combination, modulate HOXD10, TFAP2A, TBX5 and COL1A1 levels in SSc.
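The comparison of predicted targets with the DE gene lists amounts to a set intersection, sketched below in Python. The gene symbols in `lcssc_hits` and `dcssc_hits` are the overlaps reported in the paragraph above. The full TargetScan prediction list and the full DE lists are not reproduced here, so the `HYPOTHETICAL_*` entries are placeholders added only to make the example self-contained.

```python
# Predicted miR-10a-5p/miR-10b-5p targets reported as DE in the text.
lcssc_hits = {"ALPL", "ANK1", "CADM1", "HOXB3", "HOXD10", "SOBP", "TFAP2A", "VASH1"}
dcssc_hits = {"HOXD10", "TBX5", "UNC5B"}


def overlap(predicted_targets, de_genes):
    """Return the predicted targets that also appear in a DE gene list."""
    return set(predicted_targets) & set(de_genes)


# Stand-ins for the real inputs: each set contains the reported genes plus
# one hypothetical placeholder that should NOT survive the intersection.
predicted = lcssc_hits | dcssc_hits | {"HYPOTHETICAL_TARGET_1"}
de_lcssc = lcssc_hits | {"HYPOTHETICAL_DE_GENE_1"}
de_dcssc = dcssc_hits | {"HYPOTHETICAL_DE_GENE_2"}

hits_lc = overlap(predicted, de_lcssc)
hits_dc = overlap(predicted, de_dcssc)

all_hits = hits_lc | hits_dc   # distinct predicted targets that are DE
shared = hits_lc & hits_dc     # targets that are DE in both subtypes
```

Counting distinct symbols across both subtypes gives 10 genes, with HOXD10 as the only target that is DE in both lcSSc and dcSSc twins.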

In early dcSSc, TBX5 and TFAP2A regulate HOX genes via KLF4, NANOG and POU5F1 pathway

Given that HOX genes encode TFs involved in cellular differentiation during embryogenesis,20 and that myofibroblast differentiation is central to SSc and fibrosis pathogenesis,14 a review of TGFβ and HOX gene regulation led us to examine the embryonic stem cell (ESC) TFs SOX2, KLF4, NANOG and POU5F1 (aka OCT4).21 In lcSSc twins, SOX2 was significantly upregulated while KLF4 was downregulated compared with unrelated controls (figure 1B; online supplemental figure 5A and table 2). Additionally, POU5F1 was downregulated in several samples, although not significantly, and NANOG was not detected. In dcSSc and healthy twins, the expression levels of these four ESC TFs were different by samples, not by experimental groups (online supplemental figure 5B and table 5). Since the patients with lcSSc and dcSSc in the twin cohort had variable disease duration (online supplemental file 1), we quantified these ESC TFs in non-twin patients with early dcSSc with disease duration of 8–24 months. Upregulation of SOX2 was observed in early dcSSc dFBs along with downregulation of KLF4, NANOG and POU5F1 (figure 3A). TFAP2A and TBX5 were also upregulated in these cells.

Figure 3

Validation in early disease and regulation of KLF4 function. (A) Expression levels of ESC pluripotency TFs and other genes of interest in early dcSSc dFBs (n=8, disease duration 8–24 months) compared with levels in normal dFBs from healthy twins (n=7, age-matched to patients with early dcSSc). (B) KLF4 expression levels in dcSSc discordant twins (n=7). (C) Disease duration correlation in dcSSc for NANOG and POU5F1 expression levels. All dcSSc twins (n=7) and non-twin patients with dcSSc selected for the early disease study (n=5) were used in this linear regression analysis. (D) Effect of TFAP2A and TBX5 silencing (siRNA) on KLF4 expression levels in early dcSSc dFBs (n=7) transfected for 24 hours. (E) Effect of KLF4 silencing (siRNA) on ACTA2, CTGF, HOXB5, WNT2, WNT4 and WNT16 expression levels in normal dFBs (n=5–6) transfected for 72 hours. (F) Effect of adenoviral overexpression of KLF4 (ad-KLF4) in normal dFBs infected for 24 hours and stimulated with TGFβ (TGF) or VC for 72 hours on protein abundance of fibrotic genes (n=5). *p<0.05, **p<0.01, ***p<0.001. Error bars=SEM. ad-NULL, null adenovirus; CTGF, connective tissue growth factor; dcSSc, diffuse cutaneous SSc; dFBs, dermal fibroblasts; ESC, embryonic stem cell; SCR, scramble siRNA; SSc, systemic sclerosis; TFs, transcription factors; VC, vehicle control.

In discordant dcSSc twins, KLF4 expression levels were noticeably, although not significantly, reduced as compared with healthy twins (figure 3B). We performed a disease duration correlation analysis (figure 3C and online supplemental figure 5C) using data from all dcSSc twins (representing intermediate and late disease stages) and patients with early dcSSc and identified a strong positive correlation between disease duration and relative mRNA levels of NANOG (p=0.0129, R2=0.4769) and POU5F1 (p=0.0236, R2=0.4159).
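For a simple linear regression with one predictor, the p value of the slope follows from R2 and the sample size through a t statistic with n−2 degrees of freedom. The Python sketch below shows that relationship. It assumes n=12 (7 dcSSc twins plus 5 non-twin patients, as stated in the figure 3C legend) and a two-sided test; the software and exact test used by the authors are not stated in this section, so this is an illustrative check rather than a reproduction of their analysis. The function names are introduced here.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=10000):
    """Two-sided tail probability, integrating the density on [0, |t|] with Simpson's rule."""
    t = abs(t)
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 1 - 2 * area


def p_from_r2(r2, n):
    """Slope p value for a one-predictor linear regression, from R^2 and sample size."""
    df = n - 2
    t = math.sqrt(r2 * df / (1 - r2))
    return two_sided_p(t, df)


# R^2 values from the text; n=12 is an assumption taken from the figure legend.
p_nanog = p_from_r2(0.4769, 12)
p_pou5f1 = p_from_r2(0.4159, 12)
```

Under these assumptions, the computed values come out close to the reported p=0.0129 for NANOG and p=0.0236 for POU5F1.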

KLF4 is protective against fibrosis

KLF4 is a master regulator of skin and stem cell biology.22 Silencing of TFAP2A and TBX5 in early dcSSc dFBs increased KLF4 expression levels (figure 3D). In turn, KLF4 silencing in normal dFBs (figure 3E) significantly increased smooth muscle actin alpha 2 (ACTA2), an indicator of myofibroblast differentiation,14 and connective tissue growth factor (CTGF), a central profibrotic factor,23 while decreasing HOXB5 levels, an expression profile mirroring that in SSc. We also examined the effect of KLF4 silencing on the expression levels of several key molecules in the WNT pathway and determined that KLF4 positively regulates WNT2 and WNT16 and negatively regulates WNT4 (figure 3E). Normal dFBs overexpressing KLF4 and stimulated with TGFβ showed a significant decrease in COL1A1, FN, αSMA and CTGF protein abundance as compared with TGFβ alone, confirming that KLF4 protects against TGFβ-induced fibrogenesis (figure 3F). These findings were further validated in human skin in organ culture, the tissue most directly relevant to the human disease. As in dFBs, KLF4 reduced the TGFβ1 response in human skin (online supplemental figure 6). To extend our findings in vivo, we induced fibroblast-specific knockout of KLF4 in mice (KLF4-KO). In KLF4-KO dFBs, fibrotic genes were highly expressed (figure 4A). Furthermore, hydroxyproline content was significantly higher in the skin of KLF4-KO mice as compared with control mice (figure 4B). Since KLF4 regulated WNT gene expression (figure 3E), we examined its effect on β-catenin. Loss of KLF4 resulted in a significant increase in nuclear chromatin levels of β-catenin, a key downstream component of the canonical Wnt signalling pathway, in dFBs of KLF4-KO mice (figure 4C). This suggests that the imbalance of WNT gene expression resulted in activation of canonical Wnt signalling in mouse dFBs.

Figure 4

Antifibrotic properties of KLF4. (A) Effect of KLF4 conditional knock-out (KLF4-KO) on the expression levels of fibrotic genes in murine dFBs. Control KLF4-flox mice (Ctrl) received corn oil. (B) Hydroxyproline levels in 3 mm skin punches of control and inducible KLF4-KO mice. Representative images of skin sections stained with Masson’s trichrome showing collagen in blue. The scale bar indicates 100 µm. (C) Protein abundance of β-catenin in the nuclear chromatin fraction of control and KLF4-KO dFBs relative to histone H3. *p<0.05, **p<0.01, ***p<0.001, ****p<0.0001. Error bars=SEM.

Together our data highlight KLF4 as a key antifibrotic factor that is downregulated by TFAP2A and TBX5 early in disease progression, leading to fibroblast activation and fibrosis. KLF4 could be a candidate for the development of targeted therapies against fibrosis that would aim to maintain increased levels of this antifibrotic gene.


This study is the first to examine genome-wide DNA methylation profiling in combination with RNAseq in dFBs from a unique cohort of twins discordant for SSc categorised by disease subtypes. HOX genes belong to a family of evolutionarily highly conserved genes that encode TFs involved in stem cell differentiation.20 In addition to driving stem cells towards their corresponding lineages, the HOX code is also important during adulthood to repair and maintain proper tissue and organ function, and provide cellular positional memory that is crucial to inform the location and cell lineage of new epidermal cells during turnover, a process that is mediated in adult dFBs via HOXA13–WNT5A pathway.24 Our data show that both HOXA13 and WNT5A are upregulated in lcSSc dFBs, implicating this pathway in SSc. Even though HOXA13 function is required for WNT5A expression, it does not directly target WNT5A, which led Rinn et al to speculate that HOXA13 activates WNT5A directly via distant enhancer sequences, or indirectly via HOXB6.24 Rinn et al25 examined the HOX signature in primary human dFBs that delineates distal and proximal sections of limbs and showed that HOXA13 is exclusively expressed in adult dFBs isolated from distal sites such as hands, feet and foreskin. Our qRT-PCR data are consistent with Rinn et al25 in that HOXA13 was upregulated in the lcSSc dFBs as compared with unrelated controls. Our data suggest that deregulation of the HOX zip code, including HOXA, HOXB, HOXC and HOXD genes, is dependent on SSc skin involvement and independent from biopsy location since the expression of HOXB genes is limited to the trunk and non-dermal samples while HOXD4 and HOXD8 are expressed exclusively in the trunk and proximal samples in healthy donors.25

Decreased promoter DNA methylation often leads to increased gene expression, as is the case with HOXB8 and HOXC10 in lcSSc twins, and TBX5 and UNC5B in dcSSc twins. By the same reasoning, gene silencing is expected with increased methylation of CpG islands; however, this classic paradigm does not always apply, as illustrated by HOXB3 and TFAP2A, which were more methylated yet upregulated in lcSSc twins. Mucientes et al also observed differential expression of HOX genes independent of their methylation status in stem cells of patients with osteoarthritis.26 This suggests that other regulatory and epigenetic mechanisms are implicated in the regulation of these HOX genes, including sequential activation of HOX genes with 3′ to 5′ polarity.27–29 Our data highlight that perturbation of homeotic genes drives developmental reprogramming in SSc dFBs. Interestingly, TBX5 is located on chromosome 12, the same chromosome as the HOXC cluster (online supplemental figure 4A), and is a direct target of HOX genes.30 TBX5 showed reduced methylation status and upregulated gene expression in dcSSc dFBs in our study (online supplemental figure 4B), a state also observed in the activation of rheumatoid arthritis synovial fibroblasts.31

Our data showed that TFAP2A and TBX5 negatively regulate KLF4, an upstream regulator32 of the ESC factors NANOG, SOX2 and POU5F1 that control cellular differentiation into extraembryonic, endodermal, mesodermal and ectodermal lineages, and also regulate HOX genes33 (figure 5). We determined that KLF4 is a negative regulator of ACTA2, CTGF and WNT4 in dFBs, indicative of an active fibrotic cascade, and a positive regulator of HOXB5, WNT2 and WNT16. Interestingly, binding motifs for KLF4, POU5F1 and SOX2 exist in the WNT2 promoter region, and WNT2 has been shown to play a role in the nuclear accumulation of β-catenin and cell proliferation in early stages of embryonic fibroblast reprogramming.34 Beazley et al35 showed that TGFβ inhibits WNT16 expression, leading to the deactivation of Notch signalling in vascular smooth muscle cells. Considering that inhibition of Notch signalling limits both fibrosis and autoimmune activation36 and even reversed established fibrosis in a murine model of SSc,37 the role of downregulation of the KLF4–WNT16–Notch signalling axis in dFBs in the development of SSc warrants additional investigation. Due to functional redundancy among HOX5 paralogs, it is important to consider HOXA5 and HOXC5 when assessing the role of HOXB5. Loss of HOX5 leads to downregulation of WNT2,38 consistent with our KLF4 silencing assay in which both WNT2 and HOXB5 were downregulated. Together this suggests that both KLF4 and HOXB5 regulate WNT/β-catenin signalling. Additionally, HOX5 paralogs modulate Th2 inflammation during chronic allergic reaction,39 a role that is unexplored in SSc.

Figure 5

Summary model of the regulatory network. In SSc dFBs, miR-10a and miR-10b expression levels are low, allowing for the upregulation of two silencing targets TFAP2A and TBX5. KLF4 levels are repressed by TFAP2A and TBX5, which leads to high expression of ACTA2, COL1A1, CTGF and FN, enhancing myofibroblast differentiation and fibrosis. Low levels of KLF4 also correlate with low levels of HOXB5, WNT2 and WNT16, and high levels of WNT4, suggesting that KLF4 regulates the Wnt signalling pathway. HOXB5 is positively correlated with KLF4 levels, and has been shown to play a role in myofibroblast adhesion to FN. CTGF, connective tissue growth factor; dFBs, dermal fibroblasts; FN, fibronectin.

The fact that TFAP2A overexpression suppressed the expression of the matrix genes collagen types II and X, and aggrecan,40 and KLF4 negatively regulated the expression of mesenchymal markers ACTA2 and fibroblast specific protein 1,41 suggests that the TFAP2A–KLF4 axis likely regulates myofibroblast differentiation and fibrosis. Furthermore, the decreased expression levels of POU5F1 and NANOG in early dcSSc parallel those of KLF4 and may reflect a state of reduced plasticity that points to cell fate reprogramming in dFBs.42

MiRNAs of the miR-10 family target TFAP2A,43 TBX5,44 KLF4 and several HOX genes for silencing.16 Both miR-10a-5p and miR-10b-5p are active elements in mediating cancer metastasis45 46 and the fibrogenic response.47 Our silencing assay showed that they mediate TFAP2A, TBX5 and HOXD10 dysregulation in dFBs. Our data suggest that a combination of decreased DNA methylation and reduced miRNA silencing maintains high levels of TBX5 in dcSSc dFBs. However, the epigenetic regulation of TFAP2A, which was upregulated despite increased methylation status, may be driven more by miRNAs. Even though miR-10a/miR-10b target several DE genes in our dataset, not all of these targets exhibited increased expression levels when miR-10a and miR-10b were downregulated, as is the case for HOXD10 and SOBP in lcSSc twins, which could be due to the complex way these genes are epigenetically or post-transcriptionally regulated.

Our findings suggest that developmental programmes deregulated in lcSSc are similarly observed in the early phase of dcSSc. This may be due to the fact that skin involvement in lcSSc, although minimal, remains active as the disease progresses, whereas skin involvement regresses in dcSSc approximately 3 years after disease onset. This highlights the importance of including patients with lcSSc in studies involving gene expression profiling, whether in response to potential therapies or to monitor disease progression.

Some genes linked to DM CpG sites reported in individual SSc studies did not show evidence of differential methylation in this discordant twin design. This may be due to simple random sampling variation in combination with limited power to detect more modest effects. Some may reflect differences in tissues assayed (type, physical location). But some of these genes may not replicate because of the robustness of a discordant twin study design, where the test of association is between the twins, eliminating the confounding effects of genetic heterogeneity across samples that inflate the test statistics (ie, lead to false positives).

This study provides insights into the interplay between different epigenetic mechanisms in SSc-associated dermal fibrosis and implicates TFs and miRNAs in disease pathogenesis. Given that epigenetic regulation is reversible, this suggests that epigenetic modifying agents or CRISPR-based epigenome editing to ameliorate fibrosis may be of relevance in SSc.48




The authors would like to acknowledge Debby Hollingshead in the Genomics Research Core at the University of Pittsburgh and Dr Timothy Wright for contributing to the recruitment of the twins, as well as Dr Steven W Kubalak for providing skin of cadaveric donors.




  • MM, LR and NT are joint first authors.

  • Handling editor Josef S Smolen

  • Twitter @PSRamos_PhD

  • MM, LR and NT contributed equally.

  • Contributors CAF-B is the guarantor for this study. CAF-B designed and coordinated the study. CAF-B and TAM recruited the patients. TAM did history and physical examinations and disease subtyping of patients. CAF-B and NT processed the skin biopsies, cultured fibroblasts and extracted DNA and RNA. SH confirmed the zygosity of the twins. MM and NT conducted the qPCR and gain/loss of function experiments. KDZ, PSR, WAdS, GH, NT, LRP, BW and CDL analysed the data. MP-G and LRP generated the KLF4 conditional knock out mice. MM, LR, PSR, and CFB wrote the manuscript. All authors were involved in critical review, editing, revision and approval of the final manuscript.

  • Funding This study was supported by the US National Institute of Arthritis and Musculoskeletal and Skin Diseases of the National Institutes of Health (NIH) under Award Numbers K24 AR060297 (CFB), K01 AR067280 (PSR), T32 AR050958 (LR), and R35 HL144979 (MPG), by the Parker B. Francis Fellowship (LRP), and by the Scleroderma Foundation (CFB).

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.
