Research in our lab falls under four broad areas:

Gene network inference and dynamics


We develop methods to infer regulatory networks from genomics data for different normal and cancer cell types, to compare networks, and to model their interaction dynamics in order to identify signature pathways or modules. Here we have identified a set of genes that form a module only in HepG2 (cancer) cells and that respond to treatment only in HepG2 cells (not in normal hepatocytes).

The repertoire of genes and regulatory elements that control their expression is typically fixed in all of the different cells of an individual human. However, these genes and regulatory elements are expressed and interact with each other in a highly dynamic fashion, which depends on factors such as cell or tissue type, age, environmental stimuli and disease status. At any given time or context, the expression of, and interactions between, genes and regulatory elements in the genome can be represented as a network (graph), where nodes are genes and elements, and edges are interactions between these entities.
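As a minimal sketch of this representation, the snippet below stores each context-specific network as an adjacency mapping and lists the edges that are shared or specific to each context. The node names (`TF1`, `Enh1`, `GeneA`, ...) and the edges are invented for illustration and are not taken from our data; real networks would also carry edge weights or confidence scores.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]  # (regulator or regulatory element, target gene)


def edge_set(network: Dict[str, Set[str]]) -> Set[Edge]:
    """Flatten an adjacency mapping into a set of directed edges."""
    return {(src, dst) for src, targets in network.items() for dst in targets}


def compare_networks(
    a: Dict[str, Set[str]], b: Dict[str, Set[str]]
) -> Dict[str, Set[Edge]]:
    """Split edges into those shared by both networks and those unique to each."""
    ea, eb = edge_set(a), edge_set(b)
    return {"shared": ea & eb, "only_a": ea - eb, "only_b": eb - ea}


# Hypothetical networks for two contexts (names and edges are made up).
normal = {"TF1": {"GeneA", "GeneB"}, "Enh1": {"GeneA"}}
cancer = {"TF1": {"GeneA"}, "Enh1": {"GeneA"}, "TF2": {"GeneB", "GeneC"}}

diff = compare_networks(normal, cancer)
for label in ("shared", "only_a", "only_b"):
    print(label, sorted(diff[label]))
```

Comparing plain edge sets is the simplest possible notion of a topology change; it assumes the nodes of the two networks can be matched by name, which does not hold for cross-species comparisons.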

We are developing methods to address several challenges:

  • Inferring gene and chromatin networks from epigenomics and transcriptomics profiles
  • Identifying topology changes and differentially active pathways during cellular differentiation and cancer
  • Aligning networks between species that contain ambiguous mappings between nodes in each network

Genetics of complex diseases

One of the central goals of human genetics is to determine how variation at the DNA sequence level can impact variation at the organism level (e.g. complex traits, disease onset).


Variation at the DNA level can impact phenotype through many mechanisms; we are building models to predict which paths are used by which variants to impact disease risk.

Two directions we are currently pursuing are:

  • Prediction of disease incidence from genetic sequence
  • Prediction of the mechanism of action of genetic variation on phenotype

Much of the DNA sequence variation (both germline and somatic) tied to complex diseases is located in non-coding regions of the genome, which until recently have been poorly understood. We leverage data from projects such as ENCODE, the Roadmap Epigenomics Project and GTEx to build models relating the non-coding and coding regions of the genome, in order to better understand the mechanisms of action of genetic variants and how they impact molecular and cellular function.


We have constructed regulatory networks linking non-coding DNA elements (e.g. enhancers) to coding elements (target genes) and upstream regulators in 111 different cell types. We are now mapping genetic variants associated with different complex diseases (Alzheimer’s, type 2 diabetes) onto these networks to predict how the variants act. In this example, a genetic variant associated with high total cholesterol levels (red circle) maps to a non-coding region and is revealed to have six possible mechanisms of action (enhancers in its region that it may disrupt) in liver tissue. We need a statistical model to disambiguate which enhancer is the true target.
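The first step in the example above, listing the enhancers a variant might disrupt, can be sketched as an interval lookup. Everything in the snippet is an assumption made for illustration: the `Enhancer` record, the `candidate_enhancers` helper, the `window` parameter and all coordinates are hypothetical. The statistical disambiguation between candidates is not shown.

```python
from typing import List, NamedTuple


class Enhancer(NamedTuple):
    name: str
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    target_gene: str


def candidate_enhancers(
    chrom: str, pos: int, enhancers: List[Enhancer], window: int = 0
) -> List[Enhancer]:
    """Return enhancers whose interval, padded by `window` bp, contains `pos`."""
    return [
        e
        for e in enhancers
        if e.chrom == chrom and e.start - window <= pos < e.end + window
    ]


# Made-up enhancer annotations for one tissue.
annotations = [
    Enhancer("E1", "chr1", 1000, 1500, "GeneA"),
    Enhancer("E2", "chr1", 1400, 2000, "GeneB"),
    Enhancer("E3", "chr1", 5000, 5500, "GeneC"),
    Enhancer("E4", "chr2", 1000, 2000, "GeneD"),
]

direct = candidate_enhancers("chr1", 1450, annotations)
nearby = candidate_enhancers("chr1", 1450, annotations, window=4000)
print([e.name for e in direct])
print([e.name for e in nearby])
```

A linear scan is enough for a handful of annotations; genome-scale lookups would normally use an interval tree or a sorted index instead.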

Genomics-based personalized medicine


One application of genomics to personalized medicine is to predict and distinguish those patients at low or high risk of death without further intervention, in order to decide the amount of therapy to provide.

A central goal of our work is to develop genomics-based computational tools and bring them to the clinic, for example through collaborations at the UC Davis Comprehensive Cancer Center. We are undertaking several collaborative projects, including:

  • Predicting cancer patient response to therapy and outcome
  • Predicting combinations of compounds that are more effective than expected based on their individual efficacies
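One common way to make "more effective than expected" concrete is the Bliss independence model, in which two compounds acting independently with fractional effects a and b are expected to give a combined effect of a + b - ab. The sketch below uses that model purely as an illustration; it is an assumption for this example, not a statement of the reference model used in our projects, and the effect values are invented.

```python
def bliss_expected(effect_a: float, effect_b: float) -> float:
    """Expected fractional effect of two independently acting compounds."""
    for effect in (effect_a, effect_b):
        if not 0.0 <= effect <= 1.0:
            raise ValueError("effects must be fractions between 0 and 1")
    return effect_a + effect_b - effect_a * effect_b


def bliss_excess(observed: float, effect_a: float, effect_b: float) -> float:
    """Observed minus expected effect; positive values suggest synergy."""
    return observed - bliss_expected(effect_a, effect_b)


# Invented example: 30% and 40% inhibition alone, 75% in combination.
expected = bliss_expected(0.3, 0.4)
excess = bliss_excess(0.75, 0.3, 0.4)
print(f"expected={expected:.2f} excess={excess:.2f}")
```

In practice the excess would be assessed across a dose matrix and against measurement noise before calling a combination synergistic.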

Our projects focus first on testing predictions against cell lines using high-throughput assays, with future follow-up studies in animal models and, ultimately, clinical trials.

Gene-environment interactions

While the genetics of complex diseases is a focal point of the lab, we are also interested in the complementary question: to what extent, and through what mechanisms, does the environment impact both organism-level phenotypes and intermediary phenotypes such as epigenetics and transcription? Using data from both unrelated individuals and twin studies, we are interested in novel methods for identifying genetic loci that exhibit non-additive interactions with known and unknown environmental factors, and for understanding which pathways these loci target and how they interact with other loci involved in complex diseases.
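To illustrate what "non-additive" means here, the sketch below computes a simple interaction contrast for a binary genotype (carrier or non-carrier) and a binary, known exposure: the genotype effect among exposed individuals minus the genotype effect among unexposed individuals. Under a purely additive model this contrast is zero. The binary coding, the `interaction_contrast` helper and the phenotype values are simplifying assumptions for illustration; they do not cover unknown environmental factors or twin designs.

```python
from statistics import mean
from typing import Iterable, Tuple

Sample = Tuple[int, int, float]  # (genotype 0/1, exposure 0/1, phenotype)


def interaction_contrast(samples: Iterable[Sample]) -> float:
    """Difference in genotype effect between exposed and unexposed groups."""
    cells = {(g, e): [] for g in (0, 1) for e in (0, 1)}
    for genotype, exposure, phenotype in samples:
        cells[(genotype, exposure)].append(phenotype)
    # mean() raises StatisticsError if any genotype/exposure cell is empty.
    m = {key: mean(values) for key, values in cells.items()}
    return (m[(1, 1)] - m[(0, 1)]) - (m[(1, 0)] - m[(0, 0)])


# Invented data: genotype adds 1.0 and exposure adds 0.5 in both sets,
# but the second set has an extra 1.5 only when both are present.
additive = [(0, 0, 1.0), (1, 0, 2.0), (0, 1, 1.5), (1, 1, 2.5)]
non_additive = [(0, 0, 1.0), (1, 0, 2.0), (0, 1, 1.5), (1, 1, 4.0)]

print(interaction_contrast(additive))
print(interaction_contrast(non_additive))
```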