Pairwise coexpression networks derived from GWAS results. Each coloured ball indicates a transcription start region containing a GWAS-associated variant. Red: significantly coexpressed by network density analysis. Light blue: all other transcription start regions containing GWAS-associated variants for this phenotype. (3D visualisation by vasturiano)

Network Density Analysis


Our new post-GWAS analysis method (network density analysis; NDA) reveals new biological features of numerous disease states and traits. It works by examining a coexpression network of transcription start sites (discovered in FANTOM5). We find that transcripts containing GWAS hits for a given trait tend to fall into denser groupings in the coexpression network than randomly selected transcripts.

NDA demonstrates that GWAS hits for a given disease tend to be near promoter/enhancer elements with similar expression profiles. This enables us to find more hits, fine-map probable causative SNPs, and implicate cell types in pathogenesis. Surprisingly, for some diseases, the underlying variants fall into distinct functional groups, suggesting either dual mechanisms of disease or distinct disease endotypes.

Run an analysis or view the published results.

Baillie JK et al. “Shared Activity Patterns Arising at Genetic Susceptibility Loci Reveal Underlying Genomic and Cellular Architecture of Human Disease.” PLOS Computational Biology 14, no. 3 (March 1, 2018): e1005934. PMC5849332.

Submit a job

Upload a list of SNPs (or genomic locations) in the format described below. The SNPs should share some common feature, such as putative association with a given phenotype at a permissive p-value threshold (e.g. 5e-6), so that some of the SNPs in the input set might reasonably be expected to share an expression profile across the FANTOM5 expression atlas.

Number of permutations: (integer between 0 and 1000)
Email address: (required)
Identifier: (Max 10 alphanumeric characters)
Upload data file:
Optional background file:

Submission format (BED):

chr start end [optional_snp_id]

  • tab or space-delimited
  • coordinates must be hg19 - use LiftOver if needed
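As an illustration of the submission format, the following is a minimal sketch of turning a list of SNP positions into BED lines and checking them before upload. It is not part of the NDA code. The function names, the example SNP identifiers and the tuple layout are hypothetical, and the sketch assumes the standard BED convention (0-based start, exclusive end) and 1-based hg19 input positions.

```python
# Hypothetical helper names; not part of the NDA code base.

def snp_to_bed_line(chrom, pos, snp_id=None):
    """Format a 1-based SNP position as a tab-delimited BED line.

    Assumes the standard BED convention: 0-based start, exclusive end.
    """
    fields = [chrom, str(pos - 1), str(pos)]
    if snp_id:
        fields.append(snp_id)
    return "\t".join(fields)


def is_valid_bed_line(line):
    """Check for 3 or 4 whitespace-delimited fields with integer start < end."""
    fields = line.split()  # splits on tabs or spaces
    if len(fields) not in (3, 4):
        return False
    if not (fields[1].isdigit() and fields[2].isdigit()):
        return False
    return int(fields[1]) < int(fields[2])


# Made-up example records: (chromosome, 1-based hg19 position, id, p-value).
hits = [
    ("chr1", 1000, "rs_example1", 3e-9),
    ("chr2", 5000, "rs_example2", 2e-6),
    ("chrX", 700, None, 0.04),
]

THRESHOLD = 5e-6  # permissive p-value threshold, as suggested above

bed_lines = [
    snp_to_bed_line(chrom, pos, snp_id)
    for chrom, pos, snp_id, pval in hits
    if pval <= THRESHOLD
]

for line in bed_lines:
    print(line)
```

Only the first two records pass the threshold here; the third is dropped before any BED line is written.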
Network density analysis method for detecting significant coexpression among GWAS hits. (a) A subset of regulatory elements is identified containing disease-associated SNPs. (b) The strength of the links between pairs of these regulatory regions is quantified, first as the Spearman correlation, then as the -log10 p-value quantifying the probability, specific to this regulatory region, of a Spearman correlation of at least this strength arising by chance. This is determined from the empirical distribution of correlations between this regulatory region and all other regulatory regions in the entire network of all regulatory regions in the genome. (c) The subset of regulatory regions containing disease-associated SNPs forms an unexpectedly dense grouping in the network. The NDA score assigned to any one node is the sum of the links it shares with other nodes in the chosen subset. (d) NDA scores from the input subset of regulatory elements are compared with NDA scores from permuted subsets of regulatory elements in order to quantify the false discovery rate (FDR).
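Steps (b) to (d) can be sketched on toy data as follows. This is a simplified illustration under stated assumptions, not the published implementation: the function names are hypothetical, the expression matrix is simulated, rank ties are ignored, the link strength is taken directly as the node-specific -log10 empirical p-value, and the FDR estimate is a basic ratio of permuted to observed counts above a threshold. See the paper and the released code for the actual method.

```python
import numpy as np


def spearman_matrix(expr):
    """Spearman correlation between rows (regions); ties are not handled."""
    ranks = expr.argsort(axis=1).argsort(axis=1).astype(float)
    return np.corrcoef(ranks)


def link_strengths(corr):
    """-log10 empirical p-value of each link, specific to the row's region.

    For region i, p(i, j) is the fraction of all other regions whose
    correlation with i is at least as strong as corr[i, j].
    """
    n = corr.shape[0]
    strengths = np.zeros((n, n))
    for i in range(n):
        others = np.sort(np.delete(corr[i], i))
        n_ge = len(others) - np.searchsorted(others, corr[i], side="left")
        p = np.maximum(n_ge, 1) / len(others)
        strengths[i] = -np.log10(p)
        strengths[i, i] = 0.0  # no self-links
    return strengths


def nda_scores(strengths, subset):
    """Score of each subset node: sum of its links to the other subset nodes."""
    return strengths[np.ix_(subset, subset)].sum(axis=1)


def permuted_scores(strengths, subset_size, n_perm, rng):
    """Scores for random subsets of the same size (one row per permutation)."""
    n = strengths.shape[0]
    return np.array([
        nda_scores(strengths, rng.choice(n, size=subset_size, replace=False))
        for _ in range(n_perm)
    ])


def estimated_fdr(observed, permuted, threshold):
    """Mean count of permuted scores >= threshold over the observed count."""
    n_obs = np.sum(observed >= threshold)
    if n_obs == 0:
        return float("nan")
    expected_false = np.mean(np.sum(permuted >= threshold, axis=1))
    return min(1.0, float(expected_false / n_obs))


# Simulated data: 60 regions x 30 samples; regions 0-5 share one profile.
rng = np.random.default_rng(0)
expr = rng.normal(size=(60, 30))
shared = rng.normal(size=30)
expr[:6] = shared + 0.2 * rng.normal(size=(6, 30))

strengths = link_strengths(spearman_matrix(expr))
subset = np.arange(8)  # six coexpressed regions plus two unrelated ones
scores = nda_scores(strengths, subset)
perm = permuted_scores(strengths, len(subset), n_perm=200, rng=rng)
fdr = estimated_fdr(scores, perm, threshold=scores[:6].min())

print("NDA scores:", np.round(scores, 2))
print("Estimated FDR at threshold:", fdr)
```

Because each p-value comes from the empirical distribution of the row's own region, the link from i to j need not equal the link from j to i in this sketch; the score of a node sums only the links computed from its own perspective.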

Instructions

If you submit a correctly formatted file using the form above, your job will be entered into our queue for running on the Roslin Institute servers. Very large jobs (those with more than 1000 SNPs, or more than 400 SNPs mapping to FANTOM5 TSS) may take a long time. They will be pushed down the queue during busy periods and may be cancelled if they run for too long. If you have a very large job, please contact us or download our code below and run it on your own server.

For a full explanation of the network density analysis method, see Baillie JK et al. “Shared Activity Patterns Arising at Genetic Susceptibility Loci Reveal Underlying Genomic and Cellular Architecture of Human Disease.” PLOS Computational Biology 14, no. 3 (March 1, 2018): e1005934. PMC5849332.

Code availability

The code used here is available from our GitHub page.

View results of example analyses:

Height: 8882 SNPs searched; 471 promoters hit; 166 distinct regions mapped; 29 significantly-coexpressed regions
Total Cholesterol: 6421 SNPs searched; 519 promoters hit; 128 distinct regions mapped; 29 significantly-coexpressed regions
Low-density lipoprotein: 4644 SNPs searched; 321 promoters hit; 92 distinct regions mapped; 19 significantly-coexpressed regions
High-density lipoprotein: 5410 SNPs searched; 450 promoters hit; 101 distinct regions mapped; 17 significantly-coexpressed regions
Triglycerides: 4863 SNPs searched; 437 promoters hit; 97 distinct regions mapped; 23 significantly-coexpressed regions
Ulcerative colitis: 2162 SNPs searched; 234 promoters hit; 83 distinct regions mapped; 20 significantly-coexpressed regions
Crohn's disease: 1924 SNPs searched; 217 promoters hit; 70 distinct regions mapped; 23 significantly-coexpressed regions
Systolic Blood Pressure: 417 SNPs searched; 25 promoters hit; 13 distinct regions mapped
Diastolic Blood Pressure: 711 SNPs searched; 26 promoters hit; 14 distinct regions mapped

Funding

We are very grateful to receive funding from the following sources: Wellcome Trust, BBSRC, Intensive Care Society, MRC, NIH.


Contact us

Email: