Hongyu Zhao, Ph.D.

Department Chair and Ira V. Hiscock Professor of Biostatistics, Professor of Genetics and Professor of Statistics and Data Science

Software

Human Genome Annotations
Analysis of Genome Wide Association Study Data
eQTL
Pathway Analysis
Cancer Mutation Cluster Identifications

Human Genome Annotations

  • GenoCanyon (Server) (Web Application) A statistical framework to predict functional non-coding regions in the human genome through integrated analysis of genomic conservation measures and biochemical annotations.
  • GenoSkyline (Website) Tissue-specific functional annotation through integrative analysis of Roadmap epigenomic data.
  • GenoSkyline-Plus (Website) A comprehensive update of GenoSkyline that incorporates more annotation data into the framework and extends to 127 integrated annotation tracks covering a spectrum of human tissue and cell types.

Analysis of Genome Wide Association Study Data

  • GWAS.PC (User Manual) (Code) GWAS.PC (GWAS Power Calculation) is an R package that performs power analysis for genome-wide association studies. In particular, it accounts for genotyping error in the power calculation.
  • GenoWAP (Website) Post-GWAS prioritization through integrated analysis of GWAS summary statistics and GenoCanyon genomic functional annotation.
  • GPA (Website) A statistical approach to prioritizing GWAS results by integrating pleiotropy information and annotation data.
  • GBR (Download) (Example) An R package implementing a post-GWAS prioritization algorithm that uses rewiring information from co-expression networks to prioritize GWAS signals.

eQTL

  • LORS (Download) A low-rank representation and sparse regression method for eQTL mapping. This algorithm accounts for confounding factors such as unobserved covariates, experimental artifacts, and unknown environmental perturbations.

Pathway Analysis

  • GRAPE (Source code and sample datasets) GRAPE is a template method that allows for identification of perturbed pathways in individual tumor samples relative to a reference collection of samples (e.g., matched healthy tissue). GRAPE is sensitive to biological variability, robust to batch effects and can be applied to any gene expression platform.
  • COSINE (Related Paper) (Software) An R package to extract the globally most discriminative sub-network from multiple gene expression data sets by integrating protein-protein interaction data.

Cancer Mutation Cluster Identifications

  • iPAC (More information) iPAC (Identification of Protein Amino acid mutation Clustering) finds mutation clusters at the amino acid level while taking the protein structure into account.
  • GraphPAC (More information) A Bioconductor R package for identifying mutational clusters of amino acids in a protein, using the protein's tertiary structure via a graph-theoretical model.