Extended report
Transethnic meta-analysis identifies GSDMA and PRDM1 as susceptibility genes to systemic sclerosis
  1. Chikashi Terao1,2,3,4,5,
  2. Takahisa Kawaguchi1,
  3. Philippe Dieude6,
  4. John Varga7,
  5. Masataka Kuwana8,
  6. Marie Hudson9,
  7. Yasushi Kawaguchi10,
  8. Marco Matucci-Cerinic11,
  9. Koichiro Ohmura12,
  10. Gabriela Riemekasten13,14,
  11. Aya Kawasaki15,
  12. Paolo Airo16,
  13. Tetsuya Horita17,
  14. Akira Oka18,
  15. Eric Hachulla19,
  16. Hajime Yoshifuji12,
  17. Paola Caramaschi20,
  18. Nicolas Hunzelmann21,
  19. Murray Baron9,
  20. Tatsuya Atsumi17,
  21. Paul Hassoun22,
  22. Takeshi Torii23,
  23. Meiko Takahashi1,
  24. Yasuharu Tabara1,
  25. Masakazu Shimizu1,
  26. Akiko Tochimoto10,
  27. Naho Ayuzawa24,
  28. Hidetoshi Yanagida24,
  29. Hiroshi Furukawa15,25,
  30. Shigeto Tohma25,
  31. Minoru Hasegawa26,
  32. Manabu Fujimoto27,
  33. Osamu Ishikawa28,
  34. Toshiyuki Yamamoto29,
  35. Daisuke Goto30,
  36. Yoshihide Asano31,
  37. Masatoshi Jinnin32,
  38. Hirahito Endo33,
  39. Hiroki Takahashi34,
  40. Kazuhiko Takehara35,
  41. Shinichi Sato31,
  42. Hironobu Ihn32,
  43. Soumya Raychaudhuri3,4,5,36,
  44. Katherine Liao3,
  45. Peter Gregersen37,
  46. Naoyuki Tsuchiya15,
  47. Valeria Riccieri38,
  48. Inga Melchers39,
  49. Gabriele Valentini40,
  50. Anne Cauvet41,
  51. Maria Martinez42,
  52. Tsuneyo Mimori12,
  53. Fumihiko Matsuda1,
  54. Yannick Allanore43
  1. 1Department of Center for Genomic Medicine, Kyoto University Graduate School of Medicine, Kyoto, Japan
  2. 2Center for the Promotion of Interdisciplinary Education and Research, Kyoto University Graduate School of Medicine, Kyoto, Japan
  3. 3Division of Rheumatology, Immunology, and Allergy, Brigham and Women's Hospital, Boston, Massachusetts, USA
  4. 4Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
  5. 5Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, USA
  6. 6Rheumatology Bichat Hospital, Paris 7 University, Paris, France
  7. 7Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
  8. 8Division of Rheumatology, Department of Internal Medicine, Keio University School of Medicine, Tokyo, Japan
  9. 9Jewish General Hospital and Lady Davis Research Institute, Montreal, Quebec, Canada
  10. 10Institute of Rheumatology, Tokyo Women's Medical University, Tokyo, Japan
  11. 11Division of Rheumatology AOUC, Department of Experimental and Clinical Medicine, Department of Medical & Geriatrics Medicine, University of Florence, Firenze, Italy
  12. 12Department of Rheumatology and Clinical Immunology, Kyoto University Graduate School of Medicine, Kyoto, Japan
  13. 13Clinic for Rheumatology, University of Lübeck, Lübeck, Germany
  14. 14German Lung Center Borstel, Leibniz Institute, Germany
  15. 15Molecular and Genetic Epidemiology Laboratory, Faculty of Medicine, University of Tsukuba, Tsukuba, Japan
  16. 16Rheumatology Unit, Spedali Civili, Brescia, Italy
  17. 17Division of Rheumatology, Endocrinology and Nephrology, Hokkaido University Graduate School of Medicine, Sapporo, Japan
  18. 18The Institute of Medical Science, Tokai University, Isehara, Japan
  19. 19Internal Medicine Department, FHU Immune-Mediated Inflammatory Diseases and Targeted Therapies, Lille University, Lille, France
  20. 20Rheumatology Department, University of Verona, Azienda Ospedaliera Universitaria Integrata, Italy
  21. 21Dermatology Department, University of Koln, Koln, Germany
  22. 22Division of Pulmonary and Critical Care Medicine, Department of Medicine, Johns Hopkins University, Baltimore, Maryland, USA
  23. 23Torii Clinic, Kyoto, Japan
  24. 24Department of Rheumatology, National Hospital Organization, Utano National Hospital, Kyoto, Japan
  25. 25Clinical Research Center for Allergy and Rheumatology, Sagamihara Hospital, National Hospital Organization, Sagamihara, Japan
  26. 26Division of Medicine, Faculty of Medical Sciences, Department of Dermatology, University of Fukui, Fukui, Japan
  27. 27Department of Dermatology, Faculty of Medicine, University of Tsukuba, Tsukuba, Ibaraki, Japan
  28. 28Department of Dermatology, Gunma University Graduate School of Medicine, Gunma, Japan
  29. 29Department of Dermatology, Fukushima Medical University, Fukushima, Japan
  30. 30Department of Internal Medicine, Faculty of Medicine, University of Tsukuba, Tsukuba, Japan
  31. 31Department of Dermatology, University of Tokyo Graduate School of Medicine, Tokyo, Japan
  32. 32Department of Dermatology and Plastic Surgery, Faculty of Life Sciences, Kumamoto University, Kumamoto, Japan
  33. 33Division of Rheumatology, Department of Internal Medicine, School of Medicine, Toho University, Tokyo, Japan
  34. 34Department of Rheumatology and Clinical Immunology, Sapporo Medical University School of Medicine, Sapporo, Hokkaido, Japan
  35. 35Department of Dermatology, Faculty of Medicine, Institute of Medical, Pharmaceutical and Health Sciences, Kanazawa University, Kanazawa, Ishikawa, Japan
  36. 36Arthritis Research UK Centre for Genetics and Genomics, Centre for Musculoskeletal Research, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, The University of Manchester, Manchester, UK
  37. 37Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institute for Medical Research, Manhasset, New York, USA
  38. 38Sapienza University of Rome, Rome, Italy
  39. 39University Medical Center, Freiburg, Germany
  40. 40Department of Clinical and Experimental Medicine, Rheumatology Section, Second University of Naples, Naples, Italy
  41. 41INSERM U1016/UMR 8104, Cochin Institute, Paris Descartes University, Paris, France
  42. 42INSERM U1220—IRSD—Batiment B Purpan Hospital Toulouse, Paris, France
  43. 43Rheumatology A Department, INSERM U1016/UMR 8104, Cochin Institute, Paris Descartes University, Paris, France
  1. Correspondence to Dr Yannick Allanore, INSERM U1016/UMR 8104, Cochin Institute, Rheumatology A Department, Paris Descartes University, 75014 Paris, France; yannick.allanore{at} or Dr Chikashi Terao, Center for Genomic Medicine, Kyoto University Graduate School of Medicine, Shogoin-Kawahara-cho 54, Sakyo-ku, Kyoto 606-8507, Japan; a0001101{at}


Objectives Systemic sclerosis (SSc) is an autoimmune disease characterised by skin and systemic fibrosis culminating in organ damage. Previous genetic studies including genome-wide association studies (GWAS) have identified 12 susceptibility loci satisfying genome-wide significance. Transethnic meta-analyses have successfully expanded the list of susceptibility genes and deepened biological insights for other autoimmune diseases.

Methods We performed transethnic meta-analysis of GWAS in the Japanese and European populations, followed by a two-staged replication study comprising a total of 4436 cases and 14 751 controls. Associations between significant single nucleotide polymorphisms (SNPs) and neighbouring genes were evaluated. Enrichment analysis of H3K4Me3, a representative histone mark for active promoters, was conducted with an expanded list of SSc susceptibility genes.

Results We identified two significant SNPs in two loci, GSDMA and PRDM1, both of which are related to immune functions and associated with other autoimmune diseases (p=1.4×10−10 and 6.6×10−10, respectively). GSDMA also showed a significant association with limited cutaneous SSc. We also replicated the associations of previously reported loci including a non-GWAS locus, TNFAIP3. PRDM1 encodes BLIMP1, a transcription factor regulating T-cell proliferation and plasma cell differentiation. The top SNP in GSDMA was a missense variant and correlated with gene expression of neighbouring genes, and this could explain the association in this locus. We found different human leukocyte antigen (HLA) association patterns between the two populations. Enrichment analysis suggested the importance of CD4-naïve primary T cells.

Conclusions GSDMA and PRDM1 are associated with SSc. These findings provide enhanced insight into the genetic and biological basis of SSc.

  • Systemic Sclerosis
  • Gene Polymorphism
  • Autoimmune Diseases



Systemic sclerosis (SSc) is an orphan disease with high morbidity and mortality. It is composed of two main subsets, a limited cutaneous form (lcSSc) and a diffuse cutaneous form (dcSSc).1 SSc is also characterised by production of specific autoantibodies, anticentromere antibody (ACA) and anti-Scl70 antibody. Severe complications in SSc include interstitial lung disease (ILD), digital ulcers (DU), renal crisis and pulmonary hypertension (PH), where fibrosis in tissues and vessel remodelling play fundamental roles.1 Genetic and environmental elements are associated with the development of SSc.1,2 While SSc is a heterogeneous disease, it has a significant genetic component.3 A total of 12 non-HLA loci with genome-wide significant associations (p<5.0×10−8) have been reported4–11 (table 1).

Table 1

The results in the current study for the previous GWAS loci and TNFAIP3

In spite of a paradigm shift in the treatment of autoimmune diseases by biological agents,12 treatment of SSc remains challenging and new molecular targets are still under investigation. Results of previous genome-wide association studies (GWAS) in other autoimmune diseases have successfully identified important pathways as molecular targets, leading to effective treatments.13–15 Similarly, GWAS on SSc may suggest novel targets for treatment.

To this end, transethnic meta-analysis of GWAS would be a promising way to identify unknown susceptibility genes that are difficult to detect in a single population due to lack of statistical power, different structure of linkage disequilibrium (LD) or different allele frequencies between populations. In fact, transethnic meta-analyses of GWAS for another autoimmune disease, rheumatoid arthritis (RA), have expanded lists of susceptibility genes and led to candidates for target cell types and molecules.16 However, most previous GWAS for SSc have been reported from European populations, with only one GWAS from an Asian population, which used 137 Korean patients.17 Thus, we performed GWAS for SSc using 716 Japanese cases,2 and 1797 controls and performed transethnic meta-analysis of GWAS using the previous GWAS from the French population,5 which had numbers of subjects comparable to the Japanese GWAS (see online supplementary figure S1).

Materials and methods

Study design

The schematic view of the study design is illustrated in online supplementary figure S1. In brief, after a first analysis of Japanese and French GWAS data, we performed two replication studies. In the first replication study, we used a Japanese cohort and a European cohort originating from several European countries. In the second replication study, we used Canadian and North American populations of European descent. As for markers, we selected a total of 33 single nucleotide polymorphisms (SNPs) for the first replication study based on the criteria for selection of candidate SNPs. We further selected seven SNPs fulfilling the selection criteria for the second replication study.


A total of 1280 cases and 3660 controls in the Japanese population and 3156 cases and 11 091 controls in the European population were recruited. A breakdown of subjects is shown in online supplementary tables S1 and S2. All case samples fulfilled the American College of Rheumatology classification criteria for SSc.18 Written informed consent was obtained from all the participants. This study was approved by local ethical committees.

Clinical information

We collected clinical information on the subtypes of SSc defined by LeRoy et al,19 as well as on ILD, PH, renal crisis, DU and possession of ACA, anti-Scl70 antibody and anti-RNA polymerase III antibody. The clinical information was selected based on its importance for SSc outcome and on previous genetic studies identifying specific associations with SSc subtypes or phenotypes. Due to the very low prevalence of renal crisis, PH and anti-RNA polymerase III antibody, we did not include these phenotypes in the subtype-specific analysis. The availability of clinical information is shown in online supplementary table S1.


The French GWAS data were published previously and the methods are described elsewhere.5 Genotyping with a competitive allele-specific PCR system for replication in the European samples was performed at LGC Genomics (Hoddesdon, UK). A part of the replication data in the European samples was obtained by imputation based on GWAS (see online supplementary tables S1 and S2). The Japanese samples in the GWAS and the first replication study were genotyped at Kyoto University and University of Tsukuba, Japan.


Imputation and phasing were performed by MaCH software,20 using the East Asian panel and European panel in the 1000 Genomes Project,21 as references for Japanese and French populations, respectively. After imputation, we performed quality control (see below). Imputation for the Japanese and French population was performed separately at Kyoto University in Japan and the INSERM UMR 1220 in France, respectively, and only summary statistics for the French imputation data were available due to restriction of data sharing policy of the control samples.

Quality control

We applied different quality control criteria in the two GWAS. The details are shown in online supplementary table S1. Since the current study, especially GWAS, had limited power to find signals in SNPs with low allele frequency (see online supplementary table S3), we filtered SNPs in each data set again after imputation based on allele frequency and used SNPs showing r2>0.5 in the output of MaCH for the subsequent analyses (see online supplementary table S4). Since information of variants in the sex chromosomes was not available in the French GWAS, we focused on variants in the autosomal chromosomes.

Linkage disequilibrium between SNPs

LD structure was evaluated based on the 1000 Genomes data and our genotyping data. LD statistics were calculated with Haploview22 or PLINK.23
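
As a rough illustration of the LD statistic referred to here, the r2 between two biallelic SNPs can be computed from phased haplotypes. This is only a sketch under stated assumptions: the function name `ld_r2` and the 0/1 allele coding are hypothetical, and it does not reproduce how Haploview or PLINK estimate LD from unphased genotype data.

```python
def ld_r2(haplotypes):
    """r^2 between two biallelic SNPs from phased haplotypes.

    haplotypes: list of (a, b) tuples, each allele coded 0 or 1
    (a hypothetical coding chosen for this sketch).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n      # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n      # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                         # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        # r^2 is undefined when either SNP is monomorphic in the sample
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom
```

A monomorphic SNP raises an error rather than returning a value, since r2 is undefined in that case.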

Selection of SNPs for the first replication study

For the first replication study, we selected SNPs for which neither their own associations nor those of other SNPs in the same region had been previously reported, and which satisfied one of the following criteria: (1) p values for SSc susceptibility <1.0×10−5 in the meta-analysis of the two GWAS, (2) p values for SSc susceptibility <2.0×10−4 in the meta-analysis and p values <0.05 in the text-based in silico analysis using the Gene Relationships Across Implicated Loci (GRAIL) programme,24 with previously reported SSc genes as seeds or (3) p values for SSc subtypes <1.0×10−6. When multiple SNPs in the same region (r2>0.5) satisfied the above criteria, we selected the SNPs showing the best p values or SNPs for which probe and primer design for replication studies was not technically difficult.

Selection of SNPs for the second replication study

For the second replication study, we selected SNPs (1) whose p values in the meta-analysis of the GWAS and first replication studies were <5.0×10−6 or (2) whose p values in the GWAS were <2.0×10−5 and whose p values were <0.05 by GRAIL.
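
The selection rules for the two replication stages can be sketched as simple predicates. The function and parameter names below are hypothetical, and the sketch omits the within-region step (choosing one SNP per region with r2>0.5) and the practical consideration of probe and primer design.

```python
def select_first_replication(p_meta, p_grail=None, p_subtype_min=None,
                             previously_reported=False):
    """Return True if a SNP meets the first-stage criteria.

    p_meta: p value for SSc in the meta-analysis of the two GWAS
    p_grail: GRAIL p value, or None if not available
    p_subtype_min: smallest p value across SSc subtype analyses, or None
    previously_reported: True if the SNP or its region was already reported
    """
    if previously_reported:
        return False
    if p_meta < 1.0e-5:                                   # criterion (1)
        return True
    if p_meta < 2.0e-4 and p_grail is not None and p_grail < 0.05:
        return True                                       # criterion (2)
    if p_subtype_min is not None and p_subtype_min < 1.0e-6:
        return True                                       # criterion (3)
    return False


def select_second_replication(p_combined, p_gwas, p_grail=None):
    """Return True if a SNP meets the second-stage criteria.

    p_combined: p value in the meta-analysis of GWAS and first replication
    p_gwas: p value in the GWAS
    """
    if p_combined < 5.0e-6:                               # criterion (1)
        return True
    # criterion (2): GWAS p value plus GRAIL support
    return p_gwas < 2.0e-5 and p_grail is not None and p_grail < 0.05
```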

Calculation of variance explained by susceptibility SNPs

We evaluated the variance explained by the new susceptibility SNPs based on a liability-scale threshold model. We assumed that there are underlying liability scores following a normal distribution and that subjects with a liability score over a predefined threshold develop SSc. We set the prevalence of SSc as 0.05%. The OR in the overall study was used as an approximation of the common relative risk between populations. We performed this estimation separately in each population using control allele frequencies in GWAS.
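
The text does not give the formula used, so the following is only a sketch of one common formulation of the liability threshold calculation for a single biallelic SNP. It assumes a multiplicative genotype risk model (relative risk squared for homozygotes), unit residual variance within each genotype, and the OR used in place of the relative risk; the function name and arguments are hypothetical.

```python
from statistics import NormalDist


def liability_variance_explained(risk_allele_freq, relative_risk,
                                 prevalence=0.0005):
    """Approximate liability-scale variance explained by one SNP.

    prevalence defaults to 0.05%, the value stated in the text.
    """
    std_normal = NormalDist()
    p = risk_allele_freq
    # Genotype frequencies for 0, 1 and 2 risk alleles (Hardy-Weinberg assumed)
    geno_freq = [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]
    # Multiplicative model: an assumption of this sketch
    geno_rr = [1.0, relative_risk, relative_risk ** 2]
    # Baseline penetrance chosen so the population prevalence is reproduced
    f0 = prevalence / sum(f * rr for f, rr in zip(geno_freq, geno_rr))
    penetrance = [f0 * rr for rr in geno_rr]
    # Threshold relative to the reference genotype (liability mean 0, variance 1)
    threshold = std_normal.inv_cdf(1 - penetrance[0])
    # Mean liability shift of each genotype that reproduces its penetrance
    means = [threshold - std_normal.inv_cdf(1 - f) for f in penetrance]
    overall_mean = sum(f * m for f, m in zip(geno_freq, means))
    v_g = sum(f * (m - overall_mean) ** 2 for f, m in zip(geno_freq, means))
    return v_g / (1 + v_g)
```

Under these assumptions a SNP with a relative risk of 1 explains no variance, and the explained variance grows with the effect size for a fixed allele frequency.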

Associations between clinical manifestation and associated SNPs

After confirming the associations of the two SNPs with SSc, the associations between the two SNPs and SSc subtypes or clinical manifestations including ILD and DU were estimated. We did not perform GWAS for these clinical manifestations due to the limited number of subjects who were positive for these manifestations. We further defined two phenotypes, fibrotic and vascular, and performed association studies with these two phenotypes. The fibrotic phenotype includes dcSSc or severe lung disease, defined by forced vital capacity (FVC) <70% or ILD in combination with FVC <75%. The vascular phenotype is DU, pulmonary arterial hypertension (PAH) or renal crisis.
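
The two phenotype definitions can be written as simple rules. In this sketch the `Patient` record and its field names are hypothetical, FVC is assumed to be expressed as a percentage of the predicted value, and severe lung disease is read as "FVC <70%, or ILD together with FVC <75%".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    dcssc: bool = False
    ild: bool = False
    fvc_percent: Optional[float] = None  # None if FVC was not measured
    digital_ulcers: bool = False
    pah: bool = False
    renal_crisis: bool = False


def is_fibrotic(pt):
    """dcSSc, or severe lung disease by the FVC-based rule."""
    severe_lung = pt.fvc_percent is not None and (
        pt.fvc_percent < 70 or (pt.ild and pt.fvc_percent < 75)
    )
    return pt.dcssc or severe_lung


def is_vascular(pt):
    """Digital ulcers, pulmonary arterial hypertension or renal crisis."""
    return pt.digital_ulcers or pt.pah or pt.renal_crisis
```

A patient with missing FVC is classified on the remaining criteria only, which is a design choice of this sketch rather than something stated in the text.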

Amino acid conservation search

We assessed amino acid conservation for the residue of GSDMA altered by rs3894194 across vertebrates in combination with Genomic Evolutionary Rate Profiling (GERP) score by using UCSC Genome Browser. GERP score was calculated for the single amino acid residue in which positive score indicated conservation.

HLA imputation and analyses

HLA alleles and amino acids were imputed by SNP2HLA.25 We used Asian reference panel and European reference panel for Japanese and French samples, respectively.

Functional annotation and biological insights

HaploReg V.4.0 was used to assess functional annotation of the significant SNPs. The programme of functional enrichment by Trynka et al,15 was used for the enrichment analysis. We looked up the effects of SNPs on gene expression by previous expression quantitative trait loci (eQTL) data.26 ,27

To gain biological insight into SSc, we also searched all the SSc-associated loci for: (1) missense mutations and functional annotation signals in SNPs in strong LD (r2>0.8) with top SNPs in the loci, (2) associations with other diseases in the GWAS catalogue using gene names, (3) cis-eQTL signals based on the largest eQTL study,26 with p values <1.0×10−5, (4) H3K4me3 signals in CD4-naïve primary T cells with scores more than 0 calculated by the method mentioned above and (5) promoter histone marks in skin tissues based on results of HaploReg V.4.0.

Statistical analysis

Logistic regression model was used for association studies. We used the first three principal components as covariates for the Japanese GWAS since additional PCs did not further improve the results. No inflation of p values was observed in French GWAS. We performed association studies using SSc, dcSSc, lcSSc, anti-Scl70 antibody(+)SSc and ACA(+)SSc as dependent variables. Inverse-variance method assuming fixed effects was applied to integrate the different association studies. Hardy-Weinberg equilibrium was assessed for the SNPs across the studies. SNPs showing association p values <5.0×10−8 in the overall study were regarded as significant. Heterogeneity was evaluated for SNPs showing significant results using Cochran Q test. Interactive effects of two SNPs were evaluated by multiplicative model. Power calculation of the current study was conducted with use of ‘Genetics Design’ package of R software.
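
The inverse-variance fixed-effects method and the Cochran Q statistic mentioned above can be illustrated with a short sketch. It takes per-study log odds ratios and their standard errors; the function name and the returned dictionary layout are hypothetical, and no p value is computed for Q here.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effects pooling of per-study log odds ratios."""
    weights = [1.0 / se ** 2 for se in ses]          # weight = 1 / variance
    total_w = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_w
    se = math.sqrt(1.0 / total_w)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))           # two-sided normal p value
    # Cochran Q: weighted squared deviations from the pooled estimate,
    # compared against a chi-square distribution with k - 1 df
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    return {"beta": beta, "se": se, "or": math.exp(beta),
            "p": p, "q": q, "q_df": len(betas) - 1}
```

With two studies of equal precision and identical effect estimates, the pooled effect equals the common estimate, the pooled standard error shrinks by a factor of the square root of two, and Q is zero.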

Since the first replication study in the European population was composed of French, Italian and German populations, we put covariates of the three populations as indicator variables. When we excluded imputation data to confirm the results by avoiding batch effects derived from different genotyping methods, German cases were combined with French cohorts because imputation data were used for all German controls.

Since individual imputation data for the French GWAS were not fully available due to restriction of data sharing policy in the control samples, we performed conditional analysis and HLA imputation using all of the case samples and a part of the control samples whose genotyping data were available.

Omnibus test to assess critical amino acid positions in the HLA region was conducted in each population and in the combined set as previously described.28 ,29 When we analysed the combined set, the indicator variable of population was added as a covariate.

Statistical analysis was performed by PLINK or R statistical software. LocusZoom30 was used to draw regional plots.

Results


Japanese GWAS of SSc

We genotyped the Japanese cases and controls with five different Illumina Infinium arrays (see online supplementary table S1). After filtering samples based on quality control criteria, 700 cases and 1797 controls remained (see the Materials and methods section). To maximise the power to find new susceptibility loci, we performed imputation for this dataset with the East Asian panel in the 1000 Genomes Project21 as a reference. We identified rs12612769 in STAT4 and rs9268636 near HLA-DRA showing significant associations (p=4.7×10−8 and 9.6×10−10, respectively, online supplementary figure S2A).

European GWAS for SSc and meta-analysis

Next, we used the previously published French GWAS containing 564 cases and 1776 controls and performed imputation with the European panel in the 1000 Genomes Project as reference (see online supplementary figure S2B). We conducted a transethnic meta-analysis of the two GWAS by the inverse-variance method assuming fixed effects for SNPs satisfying criteria of quality control (see online supplementary table S4). Since no evidence of population structure was obtained (lambda=1.05, figure 1), we did not apply genomic control31 to correct statistics. As a result, we identified the STAT4 region showing a significant association (p=3.0×10−11, figure 1 and see online supplementary table S5). The HLA locus did not show a significant association (p≥1.3×10−7, figure 1) in spite of significant associations of the HLA locus in both populations (see online supplementary figure S2), suggesting different causative variants between the two populations. In fact, when we conducted HLA imputation using SNP2HLA, different association patterns of amino acid positions were observed (see online supplementary table S6). The results of the susceptibility loci in the previous studies are shown in table 1. The risk alleles for all of the SNPs in the previous studies were the same in the meta-analysis or the French GWAS, suggesting replication of the previous findings and validity of the current study. In addition, we found an association in the TNFAIP3 region whose association was previously reported without satisfying genome-wide significance level.32 All the variants in non-HLA region showing p value <1.0×10−5 are shown in online supplementary table S5. We also performed SSc subtype GWAS according to the previous GWAS,6 namely, lcSSc, dcSSc, ACA(+)SSc and anti-Scl70(+)SSc (see online supplementary figure S3).
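
The lambda value quoted above is the genomic inflation factor. A minimal sketch of the usual median-based estimate is given below, assuming per-SNP association z scores are available; the function name is hypothetical and the source does not state which software computed lambda.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(z_scores):
    """Median-based genomic inflation factor from association z scores."""
    chi2 = [z * z for z in z_scores]   # 1-df chi-square statistics
    return statistics.median(chi2) / CHI2_1DF_MEDIAN
```

Values close to 1 indicate that the bulk of test statistics follow the null distribution, which is the reasoning behind not applying genomic control.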

Figure 1

Transethnic meta-analysis of genome-wide association studies (GWAS) revealed multiple susceptibility loci to systemic sclerosis (SSc). The results of the transethnic meta-analysis of GWAS are shown in the Manhattan plot and quantile-quantile (QQ) plot. The newly identified loci and previously reported loci with strong p values are indicated in the Manhattan plot. The horizontal line indicates the genome-wide significance level.

Selection of SNPs for the replication studies

We identified 33 SNPs in 33 candidate susceptibility loci (see the Materials and methods section or see online supplementary figure S1). Twenty-seven of the 33 SNPs were novel candidates for susceptibility to SSc overall. Among the remaining six SNPs, one and two were specific for limited and diffuse types, respectively, and two and one for possession of ACA and anti-Scl70 antibody only, respectively (see online supplementary table S7).

The two-staged replication studies

We recruited 564 cases and 1863 controls in the Japanese population and 1582 cases and 6694 controls in the European population for the first replication study (see online supplementary table S1 and S2). We found that rs3894194 in GSDMA showed an association beyond the significance level in the combined population. All the results for the 33 SNPs are shown in online supplementary table S7. We further recruited a total of 1010 cases and 2621 controls in the European population for the second replication study to validate the associations of the seven SNPs showing possible associations (see the Materials and methods section or see online supplementary figure S1). As a result, rs3894194 in GSDMA kept its association (overall p=1.4×10−10, table 2). rs4134466 in PRDM1 on chromosome 6 also showed an association beyond the significance level (overall p=6.6×10−10, table 2). The two SNPs showed no significant deviation from Hardy-Weinberg equilibrium (p≥0.037) or heterogeneity (p≥0.011) across the studies. When we assessed the liability-scale variance explained by these two SNPs,33 a total of 0.2% was explained in each population (see the Materials and methods section).
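
For readers unfamiliar with the Hardy-Weinberg check reported here, a simple chi-square version is sketched below. The source does not say which test was used (PLINK, for example, provides an exact test that gives somewhat different p values), so the choice of the 1-df chi-square test and the function name are assumptions of this example.

```python
import math


def hwe_chi2_test(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for deviation from Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb: observed genotype counts.
    Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)    # frequency of allele A
    q = 1 - p
    if p == 0 or q == 0:
        raise ValueError("SNP must be polymorphic")
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    observed = [n_aa, n_ab, n_bb]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function, 1 df
    return chi2, p_value
```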

Table 2

The results of the seven SNPs selected for the second replication study

PRDM1 as a novel locus for SSc

rs4134466 is located 20 kbp downstream of PRDM1, also known as BLIMP1, encoding a transcription factor regulating T-cell proliferation and plasma cell differentiation.34 The LD block spanning rs4134466 does not contain any other genes (figure 2A). The previous GWAS reported that this region was associated with other inflammatory conditions including RA,35 systemic lupus erythematosus (SLE)36 and inflammatory bowel disease (IBD).37 When we searched for SNPs in the exonic region of PRDM1 in strong LD with rs4134466, we could not find any coding variants in either the Japanese or the European population. While PRDM1 on chromosome 6 was adjacent to ATG5, a previously reported susceptibility gene to SSc,8 rs4134466 in PRDM1 was not in strong LD with rs9373839 in ATG5 showing the strongest susceptibility association in the previous study (r2<0.15 in our study and the 1000 Genomes Project). In addition, rs9373839 was not polymorphic in the Japanese population. Thus, the association of rs4134466 was not driven by rs9373839. In fact, when we conditioned the association of rs4134466 on rs9373839 using imputation data of French GWAS, the effect size of the rs4134466 risk allele did not change before and after conditioning (OR 1.102 and 1.105, before and after conditioning, respectively). Since the previous study of SLE GWAS36 reported that rs65684331 in PRDM1 is associated with SLE independently from rs2245214 in ATG5,38 SSc seems to have multiple hits in this region as in SLE.

Figure 2

Detailed plots for the two loci found in the current study. Regional plots for chromosomes 6 and 17 are shown in (A) and (B), respectively. The purple plots indicate the top SNPs in the combined results and GWAS meta-analysis for the upper and lower plots, respectively. The plots are drawn with LocusZoom, using the linkage disequilibrium (LD) structure of East Asians as a representative.

GSDMA as a novel locus for SSc

rs3894194 is a missense mutation of GSDMA altering an arginine residue to glutamine (p.R18Q). This amino acid residue is conserved across species with GERP score 3.34 (see online supplementary table S8). Estimation by PolyPhen-2 software39 suggests a benign effect of this variant. The LD block containing SNPs in LD with rs3894194 (r2>0.8) harboured LRRC3C and this region is neighbouring ORMDL3 and GSDMB (figure 2B). This region is a gene-rich region and reported to be associated with various immune-related diseases including RA16 and IBD.37,40 However, SNPs located in the LD block tagged by rs3894194 have not been reported to be associated with other diseases. The RA-associated SNP (rs59716545) is in low LD with rs3894194 (r2=0.25). GSDMA is associated with IBD in the previous study and the associated SNPs are in low LD with rs3894194 (rs2872507 or rs12946510, r2<0.38). This region is also associated with asthma,41 but the effect of this SNP on asthma is opposite to that on IBD.41 This opposing effect seems to be true for asthma and SSc (OR of risk allele of this region: 1.26 and 1.18 in asthma and SSc, respectively).

Functional annotation of the two SNPs

Next, we assessed the effects of the two SNPs and the neighbouring SNPs on gene expression and functional annotation. We went through GTEx,26 and found that rs3894194 in GSDMA showed a strong association with expression of GSDMB and ORMDL3, neighbouring genes to GSDMA (p≤2.6×10−12, figure 3A) whose expression levels were strongly correlated with each other. The association between gene expression and rs3894194 was also confirmed in the largest eQTL data27 (see online supplementary table S9). We found that the associations between SSc and SNPs in the GSDMA locus correlated well with the associations of the SNPs with gene expression of GSDMB and ORMDL3 (figure 3B). Thus, the effect of the SNP on gene expression of GSDMB and ORMDL3 in combination with amino acid alteration of the GSDMA protein seems to explain the association of this locus. HaploReg V.4.042 revealed that rs3894194 showed enhancer activity and enrichment of histone marks (see online supplementary table S10). While the previous eQTL studies27 did not show associations between rs4134466 and gene expression, rs4134466 showed DNase hypersensitivity and methylation in various kinds of cells (see online supplementary table S10).

Figure 3

Correlation between the associations of variants in GSDMA region with systemic sclerosis (SSc) susceptibility and gene expression. (A) rs3894194 is associated with gene expression of GSDMA-neighbouring genes GSDMB (left) and ORMDL3 (right). The box plots were obtained from GTEx data. (B) The associations between SSc susceptibility and single nucleotide polymorphisms (SNPs) in chromosome 17 GSDMA locus are plotted together with the associations between the variants and expression of GSDMB (left) and ORMDL3 (right). The gene expression data were obtained from Blood eQTL Browser. The correlation plots are indicated in the lower panels. The black diamonds indicate rs3894194. (C) rs3894194 is associated with limited SSc. The associations between the two SNPs and the two subtypes of SSc are indicated. lcSSc, limited cutaneous SSc; dcSSc, diffuse cutaneous SSc.

When we assessed interactive effects of the two SNPs on SSc susceptibility, we did not observe a significant effect (p=0.57).

Subtype analyses for the two SNPs

When the associations of these two SNPs and the subtypes of SSc were analysed, rs3894194 in GSDMA showed a significant association with lcSSc (figure 3C). No other significant associations were observed (see online supplementary figure S4), but this study was underpowered to detect phenotype-specific associations. When we focused on SSc subtypes showing extreme phenotypes of fibrosis and vasculopathy (see the Materials and methods section), we did not find enhanced associations between the two SNPs and the subtypes (data not shown).

Enrichment analysis of histone modification

Next, based on the expanded list of susceptibility genes to SSc, we performed enrichment analysis of H3K4Me3, a representative histone modification mark that was shown to be enriched in autoimmune disease-related variants.15 We found that the susceptibility SNPs and the neighbouring SNPs in LD with them (r2>0.8) showed suggestive enrichment of H3K4Me3 signal in CD4-naïve primary T cell or CD4 memory T cell (see online supplementary figure S5A). We also found that the suggestive enrichment signal in CD4-naïve primary T cell was mainly brought about by the three SNPs in GSDMA, PRDM1 and TNFAIP3 found in the current study (see online supplementary figure S5B).

Functional annotation of susceptibility loci

The significant SSc-associated genes including the current results and TNFAIP3 are summarised in online supplementary figure S6. We combined information of protein alteration, associations with other diseases and functional annotations. The development of promising drug targets by enrichment analysis based on the list may be challenging, but this table would be useful for candidates of future functional analyses and further expansion of SSc-associated loci.

Discussion


This is the largest SSc GWAS from non-European populations and the first transethnic meta-analysis of SSc GWAS. We identified two novel susceptibility loci, namely GSDMA and PRDM1. Both loci were associated with other autoimmune diseases, consistent with overlapping susceptibility genes among various autoimmune diseases. We also replicated the associations of previously reported GWAS variants and provided evidence of association with TNFAIP3. To avoid possible batch effects due to different genotyping methods, we excluded all European subjects in the first replication study whose genotypes were imputed. The associations of the two SNPs remained significant (p≤9.8×10−9, data not shown).

We did not find a significant multiplicative interaction between the two SNPs. Since a previous study showed substantial interactive effects limited to the HLA loci,43 it would be interesting to expand SSc cohorts and assess HLA interaction.

The enrichment analysis suggested possible involvement of CD4-naïve primary T cells with SSc. However, further expansion of susceptibility loci and convincing evidence of cell-type-specific enrichment are essential. We did not observe a suggestive enrichment signal in CD19 primary cells, representing B cells. Interestingly, both SNPs showed evidence of association with gene expression in cell types including fibroblasts or keratinocytes. Since previous loci were associated with gene expression especially in immune-related cells, the current findings would suggest the importance of skin-residing cells in SSc pathophysiology. Cell-specific gene expression profiles of fibroblasts, keratinocytes or other fibrosis-related cell types including endothelial cells, in combination with genetic data, would be useful to address the importance and involvement of these cells and genes in SSc.

PRDM1, also known as B lymphocyte-induced maturation protein 1, is a transcription factor influencing a broad range of genes involved with cell proliferation and the immune system. PRDM1 is critical for epithelial and B cell differentiation,34 and is associated with other autoimmune diseases and haematopoietic malignancies. The association of this locus with SSc suggests a critical role of lymphocytes in SSc susceptibility. In fact, rs4134466 provided the highest score of H3K4me3 in CD4(+)-naïve primary T cells among the SSc susceptibility variants. The first European replication study might suggest heterogeneity of this allele within the European population. Further expansion of subjects in subpopulations would clarify this point.

rs3894194 is a missense variant of GSDMA and is associated with neighbouring gene expression. While it is not easy to pinpoint a causative variant, rs3894194 is a promising candidate causative SNP. GSDMA and GSDMB are strongly expressed in the skin, and functional annotation revealed that rs3894194 has a regulatory effect on gene expression in various cell types including skin fibroblasts. While rs3894194 also showed a histone methylation signal in CD4(+)-naïve primary T cells, this locus may mainly exert its susceptibility effect in the skin. The GSDMA locus showed a significant association with limited cutaneous SSc in spite of the reduced number of case subjects. This may suggest that this locus plays a more important role in the development of lcSSc than of dcSSc. However, since this locus also showed substantial associations with dcSSc, the results were inconclusive.

TNFAIP3 encodes A20, which regulates the tumour necrosis factor response by inhibiting nuclear factor-κB (NF-κB) activation. A20 also suppresses profibrotic signalling, relevant to SSc pathogenesis.44 rs2230926 is a missense variant of TNFAIP3 and is associated with other rheumatic diseases.45 The association of TNFAIP3 as well as TNIP1 supports NF-κB involvement with SSc. However, we did not observe a significant interactive effect of the two SNPs (data not shown).

Since the two populations substantially contributed to both the associations found in this study, the current findings indicate that transethnic meta-analysis is effective for identifying unreported susceptibility loci to SSc, which have moderate effect sizes in each population. Furthermore, the current findings, especially rs4134466, would suggest that transethnic meta-analysis is effective by taking advantage of different allele frequencies and LD structure between the populations to discern unreported susceptibility signals from previously reported loci.
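
To illustrate how moderate effects in each population can combine into a stronger overall signal, the following is a minimal sketch of an inverse-variance fixed-effect meta-analysis. The choice of this model is an assumption made for illustration (the meta-analysis method is not restated in this section), and the function name, effect sizes and standard errors are hypothetical, not values from this study.

```python
import math

def fixed_effect_meta(studies):
    """Inverse-variance fixed-effect meta-analysis.

    studies: list of (log odds ratio, standard error), one per population.
    Returns (pooled log OR, pooled SE, z, two-sided p, Cochran's Q).
    """
    weights = [1.0 / se ** 2 for _, se in studies]
    total_weight = sum(weights)
    pooled = sum(w * beta for w, (beta, _) in zip(weights, studies)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled / pooled_se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
    # Cochran's Q measures heterogeneity of effects between populations
    q = sum(w * (beta - pooled) ** 2 for w, (beta, _) in zip(weights, studies))
    return pooled, pooled_se, z, p, q

# Hypothetical per-population estimates (for example, one Japanese and one European set)
example_studies = [(0.20, 0.05), (0.20, 0.05)]
pooled_beta, pooled_se, z_meta, p_meta, q_stat = fixed_effect_meta(example_studies)
print(math.exp(pooled_beta), p_meta, q_stat)
```

In this constructed example each population alone gives z = 4, which falls short of the conventional genome-wide threshold of 5×10−8, whereas the pooled estimate passes it.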

HLA and STAT4 loci showed different association patterns between subtypes of SSc, suggesting genetic heterogeneity in SSc. While the association between STAT4 and SSc was mainly driven by ACA(+) SSc, intracase analysis did not reveal a significant difference in STAT4 between ACA(+) and ACA(–) SSc (p>0.01, data not shown). The HLA locus showed strong associations with antibody-positive SSc subtypes in spite of the reduced sample numbers, even in intracase analyses. The associations of the HLA locus were attenuated in overall SSc, and this could be explained by different associations of the HLA locus between different SSc subtypes or different antibodies.46 Our results also suggested different association patterns of the HLA locus between Japanese and European populations. It would be feasible to expand SSc cohorts to compare the genetic architectures between populations or subtypes.

The different arrays between cases and controls in the Japanese subjects reduced the number of preimputed and postimputed markers. It would be feasible to rescan the control samples using the same arrays as the cases or take advantage of other controls which have used the same arrays to maximise power to find significant signals in future studies.

While it is still challenging to pinpoint a specific cell type contributing to SSc based on genetic findings, most of the susceptibility genes are immune-related and enrichment analysis suggested the importance of immune-related cells. Increasing samples for genetic studies, especially from non-European populations, would increase the number of known SSc susceptibility loci, identify population-specific susceptibility loci, narrow down candidates of causative variants and clarify genetic architecture. Exome,47 whole-genome or targeted deep sequencing might also be helpful. Clarification of the genetic background of SSc by multiple approaches in combination with functional analyses would lead to the identification of possible therapeutic targets.


We thank the investigators of the French Three-City (3C) cohort and in particular, Drs Philippe Amouyel, Christophe Lambert (Lille, France) and Luc Letenneur (Bordeaux), who gave us access to data from controls. We also thank Drs Monique Hinchcliffe and Benjamin Korman for sample collection and Mr Takeshi Iino for performing the replication in the Japanese population.



  • Handling editor Tore K Kvien

  • Contributors Wrote the paper: CT, YA. Performed imputation and analyses: CT, TK, MM. Performed the experimental work: CT, AK, AO, MS, AC. Conceived and designed the study: CT, TM, FM, YA. Substantial contribution to acquired samples and creation of data in GWAS: CT, TK, PD, MK, YK, KO, TH, HY, TA, TT, MT, YT, MS, AT, AC, MM, TM, FM, YA. Substantial contribution to acquired samples and creation of data in replication study: CT, PD, JV, MH, MMC, KO, GR, AK, PA, TH, AO, EH, HY, PC, NH, MB, TA, PH, TT, MT, YT, MS, NA, HY, HF, ST, MH, MF, OI, TY, DG, YA, MJ, HE, HT, KT, SS, HI, SR, KL, PG, NT, VR, IM, GV, AC, TM, FM, YA. Contribution to collection of clinical information: CT, PD, JV, MK, MH, YK, MMC, KO, GR, PA, TH, AO, EH, PC, NH, MB, TA, PH, YT, AT, NA, HY, HF, ST, MH, MF, OI, TY, DG, YA, MJ, HE, HT, KT, SS, HI, VR, IM, GV, AC, TM, FM, YA. All authors revised and approved the manuscript to be published.

  • Funding This study was supported by JSPS KAKENHI Grant Number JP16H06251, KANAE foundation for the promotion of medical science, Research Project of Genetic Studies for Intractable Diseases, Nagao Memorial Fund, The Uehara Memorial Foundation, The John Mung Advanced Program, Kyoto University and Association des Sclerodermie de France, INSERM, CNRS, ATIP AVENIR Programme, Agence Nationale pour la Recherche (Project ANR-08-GENO-016-1).

  • Competing interests None declared.

  • Ethics approval This study was approved by local ethical committees.

  • Provenance and peer review Not commissioned; externally peer reviewed.