Category Archives: Software

GenoWAP website is ready

Genome-wide association studies (GWAS) have been a great success in the past decade, identifying tens of thousands of loci associated with many complex diseases in humans. However, challenges remain in both identifying new risk loci and interpreting results. The Bonferroni-corrected significance level is a conservative threshold for high-dimensional hypothesis testing, leading to insufficient statistical power when the effect size at each risk locus is moderate. The complex structure of linkage disequilibrium also makes it challenging to distinguish causal variants within large haplotype blocks. We propose GenoWAP (Genome-Wide Association Prioritizer), a post-GWAS prioritization method that integrates genomic functional annotation and GWAS test statistics. After prioritization, real disease-associated loci become easier to identify. Within each locus, GenoWAP is also able to identify real functional spots among correlated markers. GenoWAP has the potential to be widely used to reveal functional spots at disease-associated risk loci and to guide further studies such as resequencing analysis.
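To illustrate why the Bonferroni threshold is conservative, here is a minimal Python sketch. The SNP names, p-values, and the figure of one million tests are hypothetical, chosen only for illustration; this is not part of GenoWAP.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Hypothetical p-values for three markers (made up for illustration).
p_values = {"snp_a": 3e-9, "snp_b": 2e-7, "snp_c": 0.04}

# Assuming one million tests, the per-test threshold is about 5e-8.
threshold = bonferroni_threshold(0.05, 1_000_000)

# Keep only the markers that pass the corrected threshold.
significant = [snp for snp, p in p_values.items() if p < threshold]
```

In this toy example, a marker with a p-value of 2e-7 would be discarded even though it may reflect a real locus with a moderate effect size. That is the loss of power described above.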

The GenoWAP paper will be submitted soon. The website is now officially online at genocanyon.med.yale.edu/GenoWAP. Feel free to check out the software and let us know what you think.

GenoCanyon server is online

GenoCanyon is a genomic functional annotation method based on unsupervised statistical learning. It integrates genomic conservation measures and biochemical annotations to predict the functional potential of each nucleotide in the human genome. The pre-calculated GenoCanyon scores for the entire human genome (hg19) are now available for download. We have also developed a Shiny application that visualizes the functional prediction.

Check out the GenoCanyon server and web application. Please contact Qiongshi (qiongshi.lu@yale.edu) if you have any questions about the method or the server. Our paper is currently under review. The preprint is available on bioRxiv.

FacPad: Bayesian Sparse Factor Analysis model for the inference of pathways responsive to drug treatment

FacPad infers pathways responsive to drug treatment via Bayesian sparse factor modeling.

It requires two matrices as input. The first matrix, Y, contains the gene expression ratios before and after drug treatment. It has dimension G×J, where G is the number of probesets measured by the specific microarray platform and J is the number of different treatments (usually, a treatment is a combination of a particular drug, concentration, and treatment time). The second matrix, L, is the binary pathway structure matrix of dimension G×K, where K is the number of pathways associated with the probesets, with L_{g,k}=1 indicating that the g-th probeset is mapped to the k-th pathway and L_{g,k}=0 otherwise. FacPad models each pathway as a latent factor that is a weighted combination of its associated probesets, and decomposes the matrix Y into a loading matrix W (G×K) and a factor activity matrix X (K×J).


Illustration of the Bayesian sparse factor model used for the analysis of treatment response data.

The sparse structure of the loading matrix W is determined by the prior binary matrix L.

Microarray expression data before and after drug treatment were downloaded from the Connectivity Map database.

Pathways associated with the probesets were derived using the functional annotation function of the Database for Annotation, Visualization and Integrated Discovery (DAVID).

The Bayesian sparse factor model is implemented as the R package “FacPad” on CRAN.
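To make the matrix shapes described above concrete, here is a minimal Python sketch of the forward direction of the decomposition: a loading matrix W whose nonzero pattern follows L, multiplied by a factor activity matrix X. It is only an illustration under stated assumptions. The toy dimensions, the Gaussian draws, and the helper names `masked_loadings` and `matmul` are hypothetical, any noise term is omitted, and no Bayesian inference is performed. It does not use or reproduce the FacPad package's interface.

```python
import random


def masked_loadings(L, rng):
    """Draw a G x K loading matrix W that is zero wherever L is zero."""
    return [
        [rng.gauss(0.0, 1.0) if l_gk == 1 else 0.0 for l_gk in row]
        for row in L
    ]


def matmul(A, B):
    """Plain-Python product of an (n x m) matrix and an (m x p) matrix."""
    n_inner, n_cols = len(B), len(B[0])
    return [
        [sum(row[k] * B[k][j] for k in range(n_inner)) for j in range(n_cols)]
        for row in A
    ]


rng = random.Random(0)   # fixed seed so the sketch is reproducible
G, K, J = 5, 2, 3        # toy sizes: probesets, pathways, treatments

# Binary pathway structure matrix L (G x K): probeset g belongs to pathway k.
L = [[1, 0], [1, 0], [1, 1], [0, 1], [0, 1]]

W = masked_loadings(L, rng)                                         # G x K, sparse
X = [[rng.gauss(0.0, 1.0) for _ in range(J)] for _ in range(K)]     # K x J
Y_hat = matmul(W, X)                                                # G x J
```

Because the first probeset maps only to the first pathway, its row of `Y_hat` depends only on the first row of `X`. This is how the prior matrix L constrains which pathways can explain each probeset's response.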