In the relatively short time since the first GWAS successes in identifying variants associated with complex traits, accumulating evidence has shown, perhaps unexpectedly, that the majority of disease-associated SNPs lie outside traditional protein-coding genes. SNPs associated with susceptibility to complex genetic diseases are enriched in long "super-enhancer" regions, which are both cell type and stimulus specific, overlap histone marks of active DNA and are associated with transcriptional activity (enhancer RNA). The non-trivial task now is to define how variants in these enhancers increase the risk of disease: which genes are implicated, which cell types are involved and, ultimately, the mechanism by which the biological pathway is perturbed.
To begin unravelling which genes are implicated by GWAS, we sought evidence of the physical chromosomal interactions made by the regions containing associated variants. It is becoming well established that enhancers regulate gene transcription through physical interactions, and that these interactions can act over large genomic distances and do not necessarily regulate the closest gene. Annotating disease-associated variants with the nearest plausible candidate gene may therefore prove erroneous, leading to expensive and time-consuming efforts to define the function of non-causal genes.
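The contrast between nearest-gene annotation and interaction-based annotation can be illustrated with a minimal Python sketch. Everything in it is hypothetical and chosen for illustration only: the gene names, coordinates, the `Gene` and `Interaction` records and both helper functions are assumptions, not data or software from this study.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    tss: int  # transcription start site (bp), hypothetical coordinate


class Interaction(NamedTuple):
    region_start: int  # start of the interacting (enhancer) fragment, bp
    region_end: int    # end of the interacting (enhancer) fragment, bp
    gene: str          # gene whose promoter is at the other end of the contact


def nearest_gene(snp_pos: int, genes: list[Gene]) -> str:
    """Annotate a SNP with the gene whose TSS is closest in linear distance."""
    return min(genes, key=lambda g: abs(g.tss - snp_pos)).name


def interacting_genes(snp_pos: int, interactions: list[Interaction]) -> list[str]:
    """Annotate a SNP with genes whose promoters contact the fragment holding it."""
    hits = {i.gene for i in interactions
            if i.region_start <= snp_pos <= i.region_end}
    return sorted(hits)


# Hypothetical example: the SNP sits 10 kb from GENE_A, but its fragment
# contacts the promoter of GENE_B, about 1.3 Mb away.
genes = [Gene("GENE_A", 1_010_000), Gene("GENE_B", 2_300_000)]
interactions = [Interaction(990_000, 1_005_000, "GENE_B")]
snp = 1_000_000

print(nearest_gene(snp, genes))              # GENE_A
print(interacting_genes(snp, interactions))  # ['GENE_B']
```

In this toy case the two annotations disagree, which is the situation in which a nearest-gene assignment would direct follow-up work to the wrong gene.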
Expanding on recent advances, we used established whole-genome chromatin interaction methods (HiC) together with capture technology (capture-HiC), applying these methodologies to complex autoimmune diseases for the first time. We investigated the complete known genetic landscape, capturing both the GWAS-implicated chromosomal regions and, in separate validation experiments, a targeted range of candidate promoters. All experiments were performed in duplicate in two relevant cell lines, one T-cell and one B-cell. Importantly, capturing the associated regions and, in separate experiments, the promoters within 500 kb of a lead associated SNP corroborates the significance of the interactions detected and validates the findings in a hypothesis-free approach.
We defined the associated regions to capture from the latest meta-analysis of each disease, using LD correlation (r2 > 0.9). We selected a T-cell line and a B-cell line for the experiments because evidence from several epigenetic studies, which overlaid specific marks of DNA activity with GWAS signals, suggests that these cell types are the most likely to be involved in the genetic susceptibility to these diseases. Performing the experiments in duplicate, from the viewpoint of both region and promoter, across the complete known genetic landscape for each disease provides a robust platform on which to generate hypotheses that can be tested in future targeted experiments.
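One way to picture an LD-defined capture region is as the interval spanned by a lead SNP and every proxy SNP correlated with it above the threshold. The sketch below is a simplified illustration under that assumption; the `Proxy` record, the `ld_region` helper, the SNP identifiers and the coordinates are all hypothetical, and the actual region-definition procedure used in the study may differ.

```python
from typing import NamedTuple


class Proxy(NamedTuple):
    snp_id: str
    position: int  # bp coordinate on the same chromosome as the lead SNP
    r2: float      # LD (r squared) between this SNP and the lead SNP


def ld_region(lead_position: int, proxies: list[Proxy],
              r2_threshold: float = 0.9) -> tuple[int, int]:
    """Return (start, end) spanning the lead SNP and proxies with r2 > threshold."""
    positions = [lead_position]
    positions += [p.position for p in proxies if p.r2 > r2_threshold]
    return min(positions), max(positions)


# Hypothetical proxies: snpC is weakly correlated and is excluded.
proxies = [
    Proxy("snpA", 1_000_500, 0.95),
    Proxy("snpB", 998_000, 0.92),
    Proxy("snpC", 1_200_000, 0.40),
]

print(ld_region(1_000_000, proxies))  # (998000, 1000500)
```

A lead SNP with no proxies above the threshold yields a region of a single base, so a practical design would probably pad the interval to cover at least one restriction fragment; that padding is not modelled here.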
Our findings confirm that interactions between enhancers and promoters are likely to be complicated: many enhancers interact with many promoters, the distances involved can be long, and the interaction is not necessarily with the closest gene. We also show that many of the interacting enhancers lie within lncRNA regions and overlap cell type specific enhancer marks. Intriguingly, we show evidence that GWAS hits from different diseases, often separated by large genomic distances, interact with the same gene promoter; that GWAS-associated SNPs, from the same and from different diseases, physically interact with each other; and that potential candidate genes can be situated at large distances (>1Mb) from the detected lead associated variant.
Disclosure of Interest: None declared