Introduction Endoplasmic reticulum aminopeptidase-1 (ERAP1) protein is highly polymorphic with numerous missense amino acid variants. We sought to determine the naturally occurring ERAP1 protein allotypes and their contribution to Behçet's disease.
Methods Genotypes of all reported missense ERAP1 gene variants with 1000 Genomes Project EUR superpopulation frequency >1% were determined in 1900 Behçet's disease cases and 1779 controls from Turkey. ERAP1 protein allotypes and their contributions to Behçet's disease risk were determined by haplotype identification and disease association analyses.
Results One ERAP1 protein allotype with five non-ancestral amino acids was recessively associated with disease (p=3.13×10−6, OR 2.55, 95% CI 1.70 to 3.82). The ERAP1 association was absent in individuals who lacked HLA-B*51. Individuals who carry HLA-B*51 and who are also homozygous for the haplotype had an increased disease odds compared with those with neither risk factor (p=4.80×10−20, OR 10.96, 95% CI 5.91 to 20.32).
Discussion The Behçet's disease-associated ERAP1 protein allotype was previously shown to have poor peptide trimming activity. Combined with its requirement for HLA-B*51, these data suggest that a hypoactive ERAP1 allotype contributes to Behçet's disease risk by altering the peptides available for binding to HLA-B*51.
- Behçet's disease
- Gene Polymorphism
The endoplasmic reticulum aminopeptidase-1 (ERAP1) protein trims intracellular proteasome-processed peptides prior to their loading onto nascent HLA class I molecules in the endoplasmic reticulum (ER). Peptides that are efficiently bound by the human leucocyte antigen (HLA) class I molecules are transported to and displayed on the surface of nearly all cell types, where they play an important role in immune surveillance and in the function of cytotoxic T and natural killer (NK) cells. Variants of the ERAP1 gene have been associated with three polygenic inflammatory diseases with strong HLA class I associations. In all three diseases, the ERAP1 association is found only among individuals carrying the disease-associated HLA type, HLA-B*27 in ankylosing spondylitis,1 HLA-Cw6 in psoriasis2 and HLA-B*51 in Behçet's disease.3 Interestingly, the ERAP1 variant (p.Arg725Gln) that is associated with Behçet's disease risk is protective for both ankylosing spondylitis and psoriasis. Variants that influence ERAP1 activity and peptide specificity are likely to influence the ER peptidome by producing and/or destroying peptides that can be efficiently bound by disease-specific HLA class I molecules.
The ERAP1 protein is highly polymorphic with 10 missense amino acid variants reported with >1% minor allele frequency in the 1000 Genomes Project EUR superpopulation. The enzymatic activity and peptide specificity of the ERAP1 protein are likely dependent on its complete combination of variant amino acids, that is, its protein allotype, but ERAP1 disease associations have been reported for individual variants or for haplotypes composed of only two to five variants.4–7 Furthermore, ERAP1 activity and specificity in individuals may ultimately be dependent on allotype combinations as individuals carry a pair of codominantly expressed haplotypes.8 To determine the common protein allotypes present in the Turkish population and evaluate their contributions to Behçet's disease risk, we genotyped ERAP1 coding region SNPs, imputed additional marker genotypes and estimated coding variant haplotypes in 1900 Behçet's disease cases and 1779 controls from Turkey. We found a single ERAP1 allotype with a large contribution to disease risk in HLA-B*51 carriers.
Materials and methods
Patients and controls
Nineteen hundred unrelated Behçet's disease cases and 1779 unrelated controls (from our previous Turkish genome-wide association study (GWAS) and Turkish replication collections3) that passed stringent quality controls applied after genotyping with the Illumina Immunochip were included in the study. For further information and patient characteristics, see online supplementary text and table S1.
SNP genotypes and imputation
Nine hundred and seventy-three Immunochip SNP markers from the ERAP1 genomic region (hg build 19, chr5:95 970 970–96 427 776) that passed stringent quality controls were used as input for regional imputation with IMPUTE2 software9 with the 1000 Genomes Project phase 1 integrated dataset haplotypes as the reference. Imputed markers with infoscore >0.8 and predicted genotypes with probability >0.9 were included.
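The two thresholds act at different levels: the info score filters whole markers, while the genotype probability filters individual calls. The following is a minimal sketch of that two-stage filter. The `ImputedMarker` record and the `call_genotypes` helper are hypothetical illustrations, not part of IMPUTE2, and the example probabilities are invented.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

INFO_MIN = 0.8  # marker-level threshold stated in the text (info score > 0.8)
PROB_MIN = 0.9  # genotype-level threshold stated in the text (probability > 0.9)


@dataclass
class ImputedMarker:
    """Hypothetical container for one imputed marker."""
    rsid: str
    info: float
    # One (P(AA), P(AB), P(BB)) triple per sample.
    genotype_probs: List[Tuple[float, float, float]]


def call_genotypes(marker: ImputedMarker) -> Optional[List[Optional[int]]]:
    """Return hard calls (copies of allele B) per sample.

    The whole marker is dropped (None) if its info score is not above
    INFO_MIN. A single genotype is set to None (missing) if its best
    posterior probability is not above PROB_MIN.
    """
    if marker.info <= INFO_MIN:
        return None
    calls: List[Optional[int]] = []
    for probs in marker.genotype_probs:
        best = max(range(3), key=lambda i: probs[i])
        calls.append(best if probs[best] > PROB_MIN else None)
    return calls


# Invented example values, for illustration only.
good = ImputedMarker("rs_example_1", 0.95,
                     [(0.97, 0.02, 0.01), (0.50, 0.45, 0.05), (0.01, 0.04, 0.95)])
poor = ImputedMarker("rs_example_2", 0.60, [(0.99, 0.01, 0.00)])

print(call_genotypes(good))  # [0, None, 2]
print(call_genotypes(poor))  # None
```

Treating low-confidence genotypes as missing rather than forcing a best guess keeps uncertain calls out of the downstream haplotype estimation.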
Analysis of ERAP1 coding haplotypes
Genotypes of 10 missense SNPs were used for prediction of coding haplotypes with SNP & Variation Suite 8.4 (SVS) (Golden Helix, Bozeman, Montana).
HLA-type imputation was performed with SNP2HLA software with Immunochip HLA region marker genotypes and a reference of 5225 individuals collected, HLA-typed and SNP genotyped by the Type I Diabetes Genetics Consortium.10 In 2213 samples typed for HLA-B*51, there were 24 individuals with discordant imputed types (98.9% concordance rate).
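The concordance rate follows directly from the two counts given above. This short check uses only those numbers:

```python
typed_samples = 2213  # samples with directly typed HLA-B*51 status
discordant = 24       # samples whose imputed type disagreed with the typed one

concordance = (typed_samples - discordant) / typed_samples
print(f"{concordance:.1%}")  # 98.9%
```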
Association testing of ERAP1 coding haplotypes with Behçet's disease
Associations of the ERAP1 coding haplotypes with disease were evaluated under a recessive model by Pearson's χ2 test in SVS or by the exact unconditional χ2 test11 in StatXact11 software (Cytel, Cambridge, Massachusetts, USA); ORs were also calculated under the recessive model using SVS. The recessive model was applied because the previously reported single-marker associations were found only under that model. The two risk factors (carriage of HLA-B*51 and homozygosity for the disease-associated haplotype) were analysed jointly with 2×2 contingency tables: ORs compared the case and control frequencies of each single-risk-factor group, and of the two-risk-factor group, with those of individuals with neither risk factor. The significance threshold for p values was 0.05 divided by the number of comparisons made (Bonferroni correction).
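As an illustration of the 2×2 contingency-table calculation and the Bonferroni threshold, the sketch below computes an OR with a Wald-type 95% CI. The Wald interval is an assumption, since the text does not state how SVS derived its CIs. The counts and the number of comparisons are hypothetical and are not taken from the study tables.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """OR and Wald CI for a 2x2 table.

    a: cases in the risk group        b: controls in the risk group
    c: cases in the reference group   d: controls in the reference group
    """
    or_ = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log_or)
    upper = math.exp(math.log(or_) + z * se_log_or)
    return or_, lower, upper


def bonferroni_threshold(n_comparisons, alpha=0.05):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_comparisons


# Hypothetical counts, for illustration only.
or_, lower, upper = odds_ratio_ci(a=20, b=10, c=80, d=90)
print(f"OR {or_:.2f}, 95% CI {lower:.2f} to {upper:.2f}")

# Hypothetical number of comparisons.
print(bonferroni_threshold(8))
```

In the two-risk-factor analysis, the reference group (`c`, `d`) would be the individuals with neither risk factor, and the function would be applied once per risk group.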
Molecular modelling of ERAP1
The structure of ERAP1 was evaluated using the Protein Data Bank protein structure summary 3MDJ and the UCSF Chimera package (http://www.cgl.ucsf.edu/chimera). Polymorphic amino acid positions (table 1) were altered to demonstrate the ERAP1 molecules encoded by Hap1 and Hap10.
Results
Of the 10 common ERAP1 missense variants identified in the 1000 Genomes Project EUR superpopulation (see online supplementary table S2), 8 were genotyped successfully on the Immunochip and 2 (rs3734016/Glu56Lys and rs2287987/Met349Val) were successfully imputed. All 10 missense variants had minor allele frequencies >1% in the Turkish control population (see online supplementary table S2). Strong linkage disequilibrium was found among these 10 variants (see online supplementary figure S1). Haplotype prediction in the 3679 Turkish samples identified a pair of haplotypes with probability >0.8 in 3637 of the samples (98.9%). Eight of the 10 haplotypes reported with >1% frequency in HapMap Centre d'Etude du Polymorphisme Humain (Utah residents with northern and western European ancestry) (CEU) samples12 were found at >1% frequency, and these 8 haplotypes accounted for 98.8% of all haplotypes identified in the Turkish control population (see online supplementary table S3). Results of recessive genotypic association tests for all the imputed SNPs are shown in online supplementary figure S2.
Association testing under the recessive model revealed that only Hap10 of ERAP1, which constitutes 14.3% of all coding region haplotypes and bears five non-ancestral alleles, Met349Val, Lys528Arg, Asp575Asn, Arg725Gln and Gln730Glu (where the first amino acid is the ancestral amino acid, defined as the one present in chimpanzees), was significantly associated with Behçet's disease (table 1). Similar to the reported association of individual ERAP1 SNPs with Behçet's disease,3 this haplotypic association was only significant under the recessive model (non-significant associations based on haplotype frequency are shown in online supplementary table S3). Thus, homozygosity for the ERAP1 Hap10 haplotype or for any of the SNPs that specifically tag Hap10 is a risk factor for Behçet's disease.
A combinatorial analysis integrating the presence of HLA-B*51 and homozygosity for ERAP1 Hap10 demonstrated a strong interaction between these two Behçet's disease risk factors (table 2). Compared with individuals with neither risk factor, individuals homozygous for Hap10 but without HLA-B*51 had no detectable increase in disease odds. Individuals who carried HLA-B*51 but were not homozygous for Hap10 had 3.6-fold increased disease odds. However, individuals who carried HLA-B*51 and were also homozygous for ERAP1 Hap10 had 10.96-fold increased disease odds (table 2). The disease odds in individuals heterozygous for HLA-B*51 (OR 10.07, 95% CI 5.41 to 18.74) were similar to those obtained when individuals homozygous or heterozygous for HLA-B*51 were analysed together.
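A quick internal check on the reported intervals: if the CIs are symmetric on the log scale (an assumption, as the text does not state the CI method), the geometric mean of the two bounds should reproduce the OR, and the width of the interval gives the implied standard error of the log OR. The sketch below applies this to the ORs quoted in the abstract and in the paragraph above.

```python
import math

# (OR, lower 95% bound, upper 95% bound) as reported in the text.
reported = {
    "Hap10 homozygosity (abstract)": (2.55, 1.70, 3.82),
    "HLA-B*51 carrier and Hap10 homozygous": (10.96, 5.91, 20.32),
    "HLA-B*51 heterozygous": (10.07, 5.41, 18.74),
}


def summarize(or_, lower, upper, z=1.96):
    """Return the geometric midpoint of the CI and the implied SE of log(OR)."""
    geometric_mid = math.sqrt(lower * upper)
    se_log_or = (math.log(upper) - math.log(lower)) / (2 * z)
    return geometric_mid, se_log_or


for label, (or_, lower, upper) in reported.items():
    mid, se = summarize(or_, lower, upper)
    print(f"{label}: reported OR {or_}, CI midpoint {mid:.2f}, SE(log OR) {se:.3f}")
```

For all three estimates the geometric midpoint agrees with the reported OR to within rounding, which is what a log-scale interval would produce.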
Discussion
Our haplotype analysis of ERAP1 missense variants identified eight ERAP1 protein allotypes with >1% frequency in the Turkish population. Only one of these allotypes, Hap10, was associated with Behçet's disease risk. This association was detected only under the recessive model, and it influenced disease risk only in individuals who also carried the Behçet's disease-associated HLA type, HLA-B*51. Individuals carrying HLA-B*51 who are also homozygous for Hap10 have nearly 11-fold increased disease odds compared with individuals with neither genetic risk factor. Although homozygosity for Hap10 has a large effect size, particularly in HLA-B*51 carriers, it does not make a large contribution to the overall population risk because Hap10 homozygosity is uncommon. Hap10 represents 14.3% of the ERAP1 gene coding region haplotypes in the Turkish population and therefore, as expected, only about 2% of controls were homozygous for Hap10. Despite this low contribution to the overall population risk, the large effect size conferred by the combination of both risk factors suggests an important mechanism by which their combination contributes to Behçet's disease risk.
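The "about 2%" figure is what Hardy-Weinberg proportions predict from the 14.3% haplotype frequency. The short calculation below assumes random mating and no selection, which the text implies with "as expected" but does not state explicitly.

```python
hap10_freq = 0.143  # Hap10 frequency among coding haplotypes, from the text

# Under Hardy-Weinberg proportions, homozygotes occur at frequency p^2
# and carriers of exactly one copy at 2p(1 - p).
expected_homozygous = hap10_freq ** 2
expected_heterozygous = 2 * hap10_freq * (1 - hap10_freq)

print(f"Expected Hap10 homozygotes: {expected_homozygous:.1%}")      # 2.0%
print(f"Expected Hap10 heterozygotes: {expected_heterozygous:.1%}")  # 24.5%
```

Only the roughly 2% of individuals who are homozygous carry the recessive risk genotype, which is why a large OR translates into a small contribution to overall population risk.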
The Hap10 allotype bears five non-ancestral amino acids (Met349Val, Lys528Arg, Asp575Asn, Arg725Gln and Gln730Glu), three of which (Met349Val, Asp575Asn and Arg725Gln) are good tags for the haplotype itself. The individual SNPs encoding these haplotype-tagging variants, Met349Val (rs2287987), Asp575Asn (rs10050860) and Arg725Gln (rs17482078), were previously reported to be recessively associated with Behçet's disease in Turkish3 and Iranian13 studies. Their genotype frequencies were also consistent with recessive association in the Spanish and Chinese populations, but the associations did not reach statistical significance.5,14
The positions of the ERAP1 variant amino acids and the surface electrostatic potential of the Hap1 and Behçet's disease-associated Hap10 allotypes are shown in structural models of the ERAP1 protein in figure 1. The altered surface electrostatic potential could result in different characteristics of substrate peptides bound. The Hap10 allotype of the ERAP1 protein was previously found to have poor peptide trimming activity,15–17 thus homozygosity for Hap10 could greatly alter the composition of the peptidome available for binding to HLA-B*51. Recent work by Guasp and colleagues suggests that low ERAP1 activity would lead to a peptidome with low affinity for HLA-B*51, which could contribute to Behçet's disease risk by enhancing NK cell lytic activity.17
Thus, the ERAP1 Hap10 allotype could either inefficiently produce disease-protective peptides by failing to trim precursor peptides, or alternatively it could fail to digest/destroy disease-promoting peptides. Identifying the nature and source of such peptides, for example, whether they are self-derived or originate in pathogenic or commensal organisms, would be an important step towards elucidating the mechanism by which HLA-B*51 contributes to Behçet's disease risk.
This study used the high-performance computational capabilities of the Helix Systems (http://helix.nih.gov) and the Biowulf cluster (http://hpc.nih.gov) at the National Institutes of Health, Bethesda, Maryland. Molecular graphics and analyses were performed with the UCSF Chimera package. Chimera is developed by the Resource for Biocomputing, Visualization and Informatics at the University of California, San Francisco (supported by NIGMS P41-GM103311).
Handling editor Tore K Kvien
Contributors MT, MJO, YK, AG, DLK and EFR: study design. BE, I-TT, ES, YO and AG: patient recruitment. MT, EFR, MJO and NW: data collection. NW and MJO: protein modelling. MT, MJO, YK, BE, I-TT, ES, YO, AG, DLK and EFR: writing and approving the paper.
Funding This study was supported by the Intramural Research Programs of the National Human Genome Research Institute and the National Institute of Arthritis and Musculoskeletal and Skin Diseases. Dr Kirino is supported by grants from Japan Society for the Promotion of Science Grant-in-Aid for Scientific Research (Grant No. 26713036), the Naito Memorial Foundation, the Kanae Foundation for the Promotion of Medical Science and Yokohama Foundation for Advancement of Medical Science.
Competing interests None declared.
Ethics approval Istanbul Faculty of Medicine Ethics Committee.
Provenance and peer review Not commissioned; externally peer reviewed.