Our group develops novel statistical and computational methods to address important problems in biology and medicine. We collaborate with research groups at Yale and other institutions to apply our methods to study the biological basis of, and treatment strategies for, many diseases, including lung diseases, cancer, substance dependence, psychiatric disorders (major depression, schizophrenia, bipolar disorder, post-traumatic stress disorder), cardiovascular diseases, autism, congenital heart diseases, aneurysm, hypertension, and others. We also collaborate with plant geneticists and evolutionary biologists using next-generation sequencing and microarrays to identify genetic associations with various traits.
Some of our current research topics are:
We develop statistical methods to annotate the human genome using diverse data sources (e.g., sequence, ENCODE, Roadmap Epigenomics, GTEx) and apply these annotations to better analyze and interpret the association signals identified in numerous genome-wide association studies. We also develop methods to integrate different data types and results from different phenotypes to better identify functional genes and variants. In addition, we explore ways to repurpose approved drugs for new indications.
Leveraging annotation information and shared genetic factors across multiple diseases, we develop computationally efficient and statistically powerful methods to identify high-risk individuals for better disease monitoring and prevention. We also investigate how non-genetic factors interact with genetic susceptibility to affect disease risk.
We develop methods to analyze whole-exome and whole-genome sequencing data, including de novo variants and rare variants, to identify genes for congenital and developmental diseases as well as complex diseases.
We develop statistical methods to integrate different types of –omics data to infer perturbed pathways in cancer, infer microenvironments in cancer patients, and identify drug targets from high-throughput screening data. We also develop tools to identify biomarkers that are predictive of patient treatment response.
We study both genomic and imaging data to address problems in neuroscience. We develop statistical methods to characterize spatial-temporal expression patterns during human brain development and across different diseases, and use these inferred patterns to better identify genes for neurodegenerative and psychiatric disorders. We perform integrated analyses of genetic, genomic, and imaging data to delineate the biological basis of neurodegenerative and psychiatric disorders.
We develop methods to capitalize on the longitudinal information in rich health records and –omics data to better predict patient disease progression and select more effective treatments.
We develop statistical methods to address the unique challenges of single-cell –omics data, including clustering, trajectory analysis, spatial-temporal analysis, regulatory network inference, and joint analysis with bulk sequencing data.
We develop methods to better analyze microbiome data, especially by leveraging the relationships among operational taxonomic units (OTUs) and addressing the compositional nature of the data.
We have a long-standing interest in inferring biological networks from –omics data and integrating the inferred network information into studies of disease mechanisms and treatments. Our publications cover theoretical and methodological aspects of network reconstruction and their applications to disease mechanism studies.
It is well recognized that there are major sex and ethnic differences for many complex diseases. We develop statistical methods to characterize the genetic basis of these differences.
Please go to the Research section to browse representative publications.
We also develop software that implements the methods we have developed. Please go to the Software section to browse packages developed by our group.