A statistical framework to predict functional non-coding regions in the human genome through integrated analysis of genomic conservation measures and biochemical annotations.
Tissue-specific functional annotation through integrative analysis of Roadmap epigenomic data.
GenoSkyline-Plus is a comprehensive update of GenoSkyline that incorporates more annotation data into the framework and extends to 127 integrated annotation tracks covering a spectrum of human tissue and cell types.
UTMOST (Unified Test for MOlecular SignaTures) is a principled method to perform cross-tissue expression imputation and gene-level association analysis.
GWAS.PC (GWAS Power Calculation) is an R package for power analysis in genome-wide association studies. In particular, the power calculation accounts for genotyping error.
Post-GWAS prioritization through integrated analysis of GWAS summary statistics and GenoCanyon genomic functional annotation.
A statistical approach to prioritizing GWAS results by integrating pleiotropy information and annotation data.
An R package implementing a post-GWAS prioritization algorithm that uses rewiring information from co-expression networks to prioritize GWAS signals.
A program implementing the Markov Random Field (MRF) method, which incorporates pathway topology into genome-wide association studies. (Example.R, fun_network.R, network.csv, pval.txt)
A Low-Rank representation and Sparse regression for eQTL mapping. This algorithm accounts for confounding factors such as unobserved covariates, experimental artifacts, and unknown environmental perturbations.
R code for pathway-based classification and regression using Random Forests.
GRAPE is a template method that allows for identification of perturbed pathways in individual tumor samples relative to a reference collection of samples (e.g., matched healthy tissue). GRAPE is sensitive to biological variability, robust to batch effects and can be applied to any gene expression platform.
iPAC (Identification of Protein Amino acid mutation Clustering) finds mutation clusters on the amino acid level while taking into account the protein structure.
A Bioconductor R package for identifying mutational clusters of amino acids in a protein while utilizing the protein tertiary structure via a graph theoretical model.
Variable importance-weighted Random Forests (viRandomForests) is an R package that samples features according to their variable importance scores and then selects the best split from the sampled features, to improve prediction accuracy in the presence of weak signals and large noise.
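The importance-weighted sampling step described for viRandomForests can be illustrated with a minimal Python sketch. This is not the package's code or API: the function names `sample_features` and `best_split`, the use of Gini impurity, the restriction to binary 0/1 labels, and the toy data are all assumptions made for illustration. It only shows the general idea of drawing candidate features with probability proportional to an importance score and then picking the best split among the drawn candidates.

```python
import random


def sample_features(importances, mtry, rng):
    """Draw up to `mtry` distinct feature indices, each draw weighted by importance.

    Hypothetical helper: negative scores are clipped to zero, and if every
    remaining weight is zero the draw falls back to uniform sampling.
    """
    candidates = list(range(len(importances)))
    weights = [max(w, 0.0) for w in importances]
    chosen = []
    for _ in range(min(mtry, len(candidates))):
        if sum(weights) <= 0:
            idx = rng.randrange(len(candidates))
        else:
            idx = rng.choices(range(len(candidates)), weights=weights, k=1)[0]
        # Remove the drawn feature so it cannot be selected twice.
        chosen.append(candidates.pop(idx))
        weights.pop(idx)
    return chosen


def gini(labels):
    """Gini impurity for binary 0/1 labels (assumption: two classes only)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split(X, y, features):
    """Return (feature, threshold, impurity_decrease) for the best split
    among the given candidate features, or (None, None, 0.0) if none helps."""
    n = len(y)
    parent = gini(y)
    best = (None, None, 0.0)
    for j in features:
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            decrease = parent - child
            if decrease > best[2]:
                best = (j, t, decrease)
    return best


# Toy data: feature 0 separates the classes, features 1 and 2 are noise.
X = [[0.1, 5.0, 1.0], [0.2, 3.0, 0.0], [0.9, 4.0, 1.0], [0.8, 6.0, 0.0]]
y = [0, 0, 1, 1]
importances = [0.7, 0.2, 0.1]  # made-up importance scores

rng = random.Random(0)
features = sample_features(importances, mtry=2, rng=rng)
feature, threshold, decrease = best_split(X, y, features)
print(features, feature, threshold, decrease)
```

In a standard random forest the candidate features at each node are drawn uniformly; weighting the draw by importance makes informative features more likely to be considered, which is the motivation stated above for settings with weak signals and large noise.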
© 2022 Hongyu Zhao, Ph.D.
Created by Eddie, Chen and Wangjie.