Objectives To perform fine mapping of the autoimmunity susceptibility gene BLK and identify functional variants involved in systemic lupus erythematosus (SLE).
Methods Genotyping of 1163 European SLE patients and 1482 controls and imputation were performed covering the BLK gene with 158 single-nucleotide polymorphisms. Logistic regression analysis was done using PLINK and conditional analyses using GENABEL's test score. Transfections of BLK constructs on HEK293 cells containing the novel mutation or the wild type form were analysed for their effect on protein half-life using a protein stability assay, cycloheximide and western blot. CHiP-qPCR for detection of nuclear factor κ B (NFkB) binding.
Results Fine mapping of BLK identified two independent genetic effects with functional consequences: one represented by two tightly linked associated haplotype blocks significantly enriched for NFκB-binding sites and numerous putative regulatory variants whose risk alleles correlated with low BLK mRNA levels. Binding of NFkBp50 and p65 to an associated 1.2 Kb haplotype segment was confirmed. A second independent genetic effect was represented by an Ala71Thr, low-frequency missense substitution with an OR=2.31 (95% CI 1.38 to 3.86). The 71Thr decreased BLK protein half-life.
Conclusions These results show that rare and common regulatory variants in BLK are involved in disease susceptibility and both, albeit independently, lead to reduced levels of BLK protein.
This paper is freely available online under the BMJ Journals unlocked scheme, see http://ard.bmj.com/info/unlocked.dtl
Statistics from Altmetric.com
The BLK gene located on chromosome 8p23-p22 encodes a tyrosine kinase of the SRC family of proto-oncogenes. A genome-wide association study identified a single-nucleotide polymorphism (SNP) rs13277113 in the 5′ upstream region associated with systemic lupus erythematosus (SLE),1 a chronic autoimmune disease with complex inheritance. To date, multiple studies have confirmed the association of BLK with SLE in European and Asian populations as well as with other autoimmune disorders.2,–,10 BLK protein has 505 amino acids with SH3 (58–118), SH2 (124–220) and protein kinase domains (241–494). Although BLK is expressed in various tissues such as pancreatic β cells, thymus, hair follicles and salivary ducts,11 12 it is preferentially expressed in B lymphoid cells and has a role in B cell receptor signalling and B cell development.13
The BLK-rs13277113 SNP is part of a group of common variants across several genes that have been associated with SLE through genome-wide association study.14 Together, they account for 15% of the disease's familial aggregation and few have been demonstrated to be functional.14 The SLE-risk allele rs13277113-A correlated with decreased expression of BLK mRNA1 but the mechanism by which this or another SNP in linkage disequilibrium (LD) with it affects gene expression has not been characterised yet.
To address which and if common and/or low frequency functional variants are involved in SLE susceptibility at the BLK locus, we performed fine mapping by genotyping tag SNPs aided by imputation.
Patients and controls
After quality control of the data, the study sample consisted of 1163 patients with SLE and 1482 ethnicity-matched healthy control subjects from the European multicentre collaboration network BIOLUPUS (supported by the European Science Foundation), comprising individuals from Argentina (60 patients and 109 controls), Belgium (70 patients and 59 controls), Germany (284 patients and 176 controls), Hungary (38 patients and 35 controls), Italy (264 patients and 320 controls), Portugal (163 patients and 165 controls), Spain (206 patients and 227 controls) and Sweden (78 patients and 391 controls). A completely new and independent set of 1103 healthy Spanish controls was also used. All patients fulfilled at least four of the American College of Rheumatology 1982 criteria for the classification of SLE.15 Subjects with an individual genotyping rate lower than 90% or duplicated and/or related samples were excluded. We removed individuals with <90% of European ancestry, as estimated with STRUCTURE v2.3.3 and 350 ancestry informative markers.
Tag SNP selection
From the LD structure of the BLK locus in the HapMap CEU population (data release 27; available at http://hapmap.ncbi.nlm.nih.gov/), tag SNPs selection capturing more than 90% of common variation at the gene locus (minor allele frequency of greater than or equal to 10%) including 20 kb upstream and downstream of the gene, at a r2 threshold of greater than or equal to 0.8, were selected using the tagger algorithm implemented in Haploview, V.4.1.16 After quality control of the data, 27 tag SNPs were used for genotyping and to guide the imputation.
All samples were genotyped at the Feinstein Institute for Medical Research (Manhasset, New York, USA) using a GoldenGate Custom Genotyping Assay and a BeadXpress Reader from Illumina (San Diego, California, USA). Two SNPs (rs1478895 and rs2248932) were genotyped by TaqMan 5′-exonuclease assay (ABI, Foster City, California, USA). SNP rs55758736 was genotyped by RFLP and sequencing for verification in 178 individuals. An Illumina custom BeadArray was used for genotyping this SNP in 1103 healthy controls.
We used IMPUTE217 across a 250 kb genomic region (chr8:11,288,930-11,559,517) of the BLK locus. As reference panels we used haplotypes from all HapMap Consortium18 19 populations together (phase 3, release 2) and haplotypes of the CEU population from the pilot phase of 1000 Genomes Project20 (downloaded from http://mathgen.stats.ox.ac.uk/impute/impute_v2.html#filtered_1kg_hm3_haps on May 13th, 2011). HapMap haplotypes duplicated with 1000 Genomes data were removed. Default parameters were used and the Ne argument set to 15 000. Only imputed genotypes with a probability equal to or greater than 0.9, a minor allelic frequency (MAF) equal to or greater than 0.005, following Hardy–Weinberg equilibrium (p>0.001), and a genotyping rate equal to or greater than 0.95 were retained for association.
Genetic association analysis was done using PLINK v1.0721 and GenABEL v1.6-5.22 All SNPs with a genotype call rate <95% or not in Hardy–Weinberg equilibrium (p<0.001) were excluded. For single-marker tests, we performed a Cochran-Mantel-Haenszel (CMH) meta-analysis, using the country of origin as the stratification variable, and a Breslow-Day test to assess homogeneity of the ORs. p Values were adjusted using false discovery rate. We also performed an omnibus haplotype test for all windows of 2, 3 and 4 SNPs across the gene. In this test, the alternate hypothesis of each haplotype having a unique effect is compared with the null hypothesis (df=number of haplotypes-1).21 Omnibus as well as haplotype-specific tests for haplotype blocks were conducted by testing each haplotype against all others pooled together (1 df). A Bonferroni correction using the number of windows tested was applied (p<0.05/323 windows tested).
Conditional analysis was done by logistic regression using GenABEL's score test. By specifying the country of origin as the strata argument, we performed a structured test of association for all SNPs. In a stepwise manner, the SNP with higher p value was identified in each step and then used as covariate for the next association step until no SNP remained associated. GenABEL package was run using R version 2.9.2.
Genome sequence track analysis for epigenomic marks and transcription factor binding sites
Associated and non-associated SNPs within haplotype blocks were analysed for enrichment with eight epigenomic marks using GenomeRunner.23 For SNPs within each haploblock the same number of randomly selected nucleotides from the whole range of haploblocks within chr8:11374997-11475653 coordinates (NCBI36/hg18 genome assembly) was sampled over the course of 100 Monte-Carlo simulations. The average expected frequency of their association with a given genomic feature (eg, nuclear factor κ B (NFkB) binding site, epigenomic marks) was compared with the observed frequency using a χ2 test. Specifically, SNPs within haploblocks were tested for enrichment against experimentally validated NFκBp65, IRF4 and PAX5 transcription factor binding sites annotated in the wgencoderegtfbs clustered UCSC table of the genome browser.24 Other tables included histone acetylation and methylation marks (Regulation/Broad histone tables for Gm12878 lymphoblastoid cell line). p Values were –Log10 converted and a – was included for under-represented p values. The values are shown as a heatmap plotted using Matlab (Natick, Massachusetts, USA) (online figure S2). p Values are presented in the online table S3.
We used SIFT25 to predict the status of rs55758736. SIFT predicts whether an amino acid substitution affects protein function based on similarity to related sequences and physical properties of amino acids (http://sift.jcvi.org/).
CHiP-qPCR for NFkB p50 and p65 on the 1.2 Kb associated haplotype genomic segment
CHiP experiments were performed according to the manufacturer's recommendations (EZ-Magna CHiP A, Millipore, Billerica, California, USA). In brief, 4×10−6 Daudi cells were incubated with or without 10 mg/ml anti-immunoglobulin M (IgM) F(ab′)2 fragments (Jackson Immunoresearch, West Grove, Pennsylvania, USA) for the indicated time. Cells were fixed with 1% formaldehyde and nuclei were isolated and sonicated within 300 ml of lysis buffer. An amount of 50 ml of chromatin–protein complexes were immunoprecipitated overnight at 4°C by mild agitation with 5 mg of pAb specific for p65 (Santa Cruz, California, USA) or p50 (Abcam, Cambridge, Massachusetts, USA) or normal rabbit IgG (Millipore) as negative control. After elution from the immunoprecipitated chromatin complexes followed by reverse-crosslink and cleanup, DNA samples were subjected to real-time PCR analysis using primers as follows: rs2248932 neighbouring region within Blk intron 1 primer set: forward – TGCATCAGCATCACTGGG; reverse – GGCACAATAGTTCTGAAACCTC. IkBa was used as control for NFkB binding with the following primer set: forward – CACTTGCAGAGGGACAGGAT; reverse – GAGAAACTCCCTGCGATGAG.
Cloning and expression of BLK constructs with the Ala71Thr substitution
BLK sequence was PCR amplified using cDNA from a BJAB cell line and primers f-BLK: 5′-CACCatggggctggtaagtagc-3 and r-BLK 5′-gggctgcagctcgtactgcc-3. The start codon in the forward primer is underlined. The open reading frame was cloned in pcDNA3.1D/V5-His (Invitrogen, Life Technologies, Grand Island, New York, USA) and confirmed by sequencing. The protein tagged with the V5 epitope at the C terminal was produced by deletion of the stop codons. The BLK 71Thr isoform was produced by mutagenesis PCR with pfu-Ultra II fusion HS polymerase (Stratagene, Agilent Technologies, Santa Barbara, California, USA) and the primers BLKm71f: 5′-TATGACTACACCaCTATGAATGATCGGGACCT-3 and BLKm71r: 5′-ATCATTCATAGtGGTGTAGTCATACAGAG-3. Expression of the constructs was assayed by western blot with anti-V5 antibody (online figure S4).
Protein stability assay
Embryonic kidney HEK293T cells grown in RPMI 1640 medium supplemented with 10% fetal calf serumwere seeded on six-well plates and transfected with 4 µg of expression plasmids V5-tagged BLK isoforms using Lipofectamine 2000. At 22 h cells were harvested and distributed in six wells of 12-well plates. After 2 h cells were treated with 20 µg/ml cycloheximide in dimethyl sulphoxide (DMSO) or DMSO alone (time 0) for various times. Cellular extracts were prepared in lysis buffer: 1% NP40, 150 mM NaCl, 10 mM Tris (pH 7.5), 2 mM sodium orthovanadate, 1 mM EDTA, 10% glycerol, 1 mM phenylmethylsulfonyl fluoride and protease and phosphatases inhibitor cocktails (Roche). A fifth of the extract was resolved on SDS-PAGE and immmunoblotting carried out using standard protocols with anti-V5 antibodies (Invitrogen), b-tubulin (Sigma-Aldrich) and antimouse IgG-HRP (Zymed, San Francisco, California, USA). Quantification of the bands was done using ImageJ.
Fine mapping and conditional analysis of BLK reveals two independent effects
A total of 158 SNPs with MAF>0.005 were tested for association in 1163 European patients with SLE and 1482 age- and country-of-origin-matched controls. A meta-analysis demonstrated that the strongest association signal was in the BLK promoter region (rs998683, PCMH-corrected=1.05×10−4 and OR 1.38 (1.21 to 1.58)) (online table S1). However, even after correction for multiple testing, multiple association signals across the gene remained significant. To refine the BLK association, we tested sliding windows, which narrowed it down to three blocks (denominated B1, B3 and B5 in figure 1). The best model of association was observed for a 1.2 kb haplotype window between SNPs rs2248932 and rs9329246 located at the end of BLK's first intron (p=1.58×10−10, omnibus haplotype test).
The first block (B1) covered BLK 5′ upstream region, exon 1 (untranslated-UTR) and beginning of intron 1, which correspond to the promoter. This block comprised a set of highly correlated SNPs including all variants previously associated with SLE (rs2409780, rs2618444, rs2736336, rs2736337, rs2736338, rs2736340, rs13277113) and our strongest hit (rs998683) (figure 2A and online figure S1). Haplotype-specific analysis demonstrated a common, risk haplotype tagged by the minor alleles of these SNPs, including the rs13277113-A variant (omnibus test p value=1.61×10−5, OR=1.32 (1.16 to 1.49)) (table 1).
The second block (B3) was at the end of intron 1 (figure 1) and displayed moderate LD with B1 (figure 2A). Consistent with this observation, a risk haplotype correlated with the promoter risk haplotype and with similar haplotype frequency was over-represented in cases, although only with nominal association (p=0.013) (table 1).
In addition, an independent and low-frequency-haplotype within B3 (1.1% in controls) was strongly associated and had a large effect size (p=5.16×10−7, OR=2.75 (1.80 to 4.19)) (table 1). This haplotype correlated completely with a risk haplotype observed on the third associated block B5 (p=1.57×10−4, OR=2.48 (1.54 to 4.00)), which covered the protein-coding exons 4, 5 and 6 (figure 1). The minor alleles of six low-frequency variants occurred in this haplotype: rs55758736, a non-synonymous SNP in exon 4 resulting in an alanine to threonine substitution at amino acid position 71 (Ala71Thr; GCT>aCT); chr8:11443817 (intron 4); chr8:11444706 (intron 5), chr8:11446229 (intron 5); and chr8: 8-11446791 and rs2255227 (intron 6) (figure 2A).
Conditional analysis using multivariate logistic regression demonstrated that the association of SNPs in B3 disappeared after adjusting by the strongest hit located in B1 (rs998683) suggesting that the B3 and the B1 association was not independent (online table S2).
In contrast, the low-frequency associated SNPs located in B5 remained associated after adjustment to rs998683 supporting two independent effects in BLK. Further, we observed that SNP rs55758736 had an r2≈0 with SNPs tagging B1 and B3 (figure 2B,C and online table S2), although this may be an underestimation due to the low frequency of the SNP.
Analysis of epigenetic marks and transcription factor binding sites shows an over-representation of NFkB and IRF4 along the promoter region associated blocks
We first attempted the identification of functional regulatory variants across the promoter blocks B1 and B3. Bioinformatic analysis revealed that B1 was significantly enriched for H3K4me3 sites, a hallmark of active promoters (figure 1, online figure S2 and online table S3). Furthermore, SNPs in B1 and B3, when compared with randomly chosen nucleotides, were enriched for binding sites for the transcription factor NFκBp65 and IRF4 but not PAX5 (using data from ENCODE project Chip-Seq for lymphoblastoid cell lines), while SNPs in the non-associated blocks B2 and B4 were not (figure 1, online figure S2 and online table S3). However, we were unable to single out a unique variant to explain the correlation between low BLK gene expression and risk alleles of SNPs in B1 and B3 due to the strong LD. We hypothesised that multiple sites would therefore explain a coordinated regulatory effect on the expression of BLK. Nevertheless, the 1.2 Kb haplotype window with the strongest genetic association and within B3 contained SNP rs2248932 that overlapped with a DNAse hypersensitive cluster and with a strong NFκB binding site (online figure S3 and online table S4). To confirm if NFκB binds to this segment of BLK we performed a CHiP-qtPCR experiment in a human Daudi cell line (figure 3A–C). We observed that NFκBp50 binds somewhat earlier than NFκBp65 (30 min post-BcR stimulation). The IkBa promoter served as control (figure 3C). Simultaneously, expression of BLK is reduced after activation (figure 3A).
The 71Thr mutation leads to reduced half-life of BLK protein
The independent Ala71Thr substitution is located on the SH3 (Src homology 3) domain of the BLK protein, which is of importance in protein–protein interactions. The SIFT algorithm25 26 predicted this substitution to be potentially damaging. As rs55758736 was identified through imputation, we verified the presence of the mutation in 178 selected samples of cases and controls. Of these, we genotyped all of those individuals predicted by imputation to have the mutation and a group of individuals predicted not to have it. Verification was done using PCR, RFLP and sequencing (figure 4) reaching 92% concordance. The mutation was observed in 2.17% of SLE patients and 0.92% of controls, all in the heterozygous state (OR=2.31; 95% CI 1.38 to 3.86 with the Pearson's goodness-of-fitχ2 (df=1)). Adding 1103 new controls, the OR was adjusted to 1.79 (95% CI 1.15 to 2.51).
At the translational level, the substitution of alanine for threonine implies the formation of a putative phosphorylation site that could lead to the reduction of the half-life of the protein. We investigated if this was the case. Using constructs containing each allele of rs55758736 in transfection experiments we observed that, indeed, 71Thr had an accelerated degradation after treatment with the protein synthesis inhibitor cycloheximide as compared with the Ala71 allele (figure 5 A,B and figure S4).
We have herein shown that blocks of associated SNPs within the promoter regions of the BLK gene were significantly enriched for NFκB and IRF4 binding sites, potentially involved in regulating levels of BLK mRNA and the related reduction in BLK gene expression levels correlating with the risk haplotypes. Also, through conditional analysis we have identified a potentially deleterious, low-frequency mutation leading to reduced BLK protein half-life. Unlike the wild type allele, the threonine residue of the mutated variant is capable of being phosphorylated, and there is evidence that while phosphorylation sites are important for activation of proteins in signal transduction pathways phosphorylation sites can act at the same time as signals for degradation by ubiquitin ligases.27 28
Protein stability has been extensively studied in the cycling protein family. The abundance of cycling E is controlled by phosphorylation-triggered ubiquitin-dependent proteolysis.29 30 Phosphorylation of the Thr380 allows cycling E to be recognised by Fbw7, an F box protein substrate of related E3 ubiquitin ligases.31 Mutation of this Thr380 to Ala prevents degradation of cyclin E.32 We are currently investigating if the reduction of the half-life of the BLK 71Thr protein is due to an active process of degradation mediated by phosphorylation or just the result of a general decrease in protein stability.
As the LD within the promoter blocks of BLK was very tight, it was not possible for us to define a single variant that could explain the correlation between low BLK levels and a polymorphism. Instead, we observed a clear correlation between associated blocks and transcription factor binding sites for NFκB as experimentally determined by ChIP-Seq and contained in the ENCODE site. However, the database contains only NFkBp65 experiments. NFkBp65 is known to have a after BcR activation. Alternatively, NFkBp50 homodimers that lack a transactivating domain bind to the human BLK promoter, as has been shown in mice.33 Here, homodimers of NFκB/p50 were shown to compete with Pax5 and reduce Blk gene expression upon mature B cell activation and development to antibody-producing plasma cells,33 opposite to what occurs during early B cell differentiation, when Blk expression plays an important role.33 We also do not know the role of IRF4 in this context and a much more detailed analysis of the regulatory landscape of BLK is required to understand the correlation between the promoter risk haplotypes and lowered BLK gene expression. Also, due to the strong LD within the promoter haplotypes, it has also not been possible to discard the region proximal to the C8orf13 or FAM167A region lying 5′ of BLK. In this context, the same associated SNPs correlate with increased levels of FAM167A gene expression.1 The meaning of this is currently unknown and there is no knowledge that this gene would have any role in immunity in general. In this context, we chose not to include the FAM146A gene in our fine mapping, as ours and the actual evidence in other autoimmune diseases support BLK as the susceptibility gene in this locus.
We propose that subtle quantitative differences between risk and protective haplotypes occur during mature B cell stimulation leading to lower BLK levels, and that epigenetically several sites show a variegated effect of NFκBp50 homodimers on the promoter region of BLK. Nevertheless, our genetic analysis did show a potentially unique location within the highly associated 1.2 Kb haplotype window where a single binding site for NFκBp65 was shown to be particularly strong in the ENCODE data and confirmed here for NFκBp50 as well.
In summary, we identified two independent effects within the BLK locus one of which covers the promoter region of the gene where the associated haplotype blocks are enriched for transcription factor binding sites for NFκB and IRF4, providing a potential mechanism for the correlation of the risk alleles and reduced levels of BLK expression. We also identified a novel mutation where the SLE-associated 71Thr leads to the reduced half-life of BLK protein.
What effect reduced BLK expression has on the function of human B lymphocytes is at present under investigation.
The authors would like to thank all SLE patients who consented to their recruitment in this project. Funding for the project was provided by The Swedish Research Council, the Gustaf Ve:80-Års Fond, the Swedish Association against Rheumatism, the Instituto de Salud Carlos III (PS09/00129) that cofinanced partly through FEDER funds of the European Union, the Research Network BIOLUPUS with support from the European Science Foundation, the Consejería de Salud de Andalucía (PI0012) and grants from the NIH RR020143, the Alliance for Lupus Research and OCAST to MEAR.
This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.
Files in this Data Supplement:
- Web Only Data - This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.
Funding Funding provided by FCT (Fundação Ciencia e Tecnologia, Portugal), DFG WI 1031/6-1 (Germany), Grupo Italiano LES and the CVDIMMUNE Project supported by the European Commission.
Competing interests None
Ethics approval Approval provided by the IRB from each participating institution.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement We will share data related to this article.
If you wish to reuse any or all of this article please use the link below which will take you to the Copyright Clearance Center’s RightsLink service. You will be able to get a quick price and instant permission to reuse the content in many different ways.