Research

Most human traits and susceptibility to diseases are “complex”, influenced by a multitude of genetic and environmental factors. Examples include traits such as body weight and diseases such as cardiovascular disease, cancer, and psychiatric disorders.

Despite the success of genome-wide association studies (GWAS) in identifying hundreds of thousands of genetic variants underlying a wide range of traits and diseases, we still lack a fundamental understanding of how these variants mechanistically lead to phenotypic variation and why this heritable component arises and is maintained.

In our group, we combine modeling and analysis of genetic and genomic datasets to enhance our understanding of trait biology and evolution, as well as the practical application of this knowledge in clinical settings.

Our approach is interdisciplinary, integrating molecular and cellular perspectives with insights from population genetics, considering that most human complex traits, and hence their genetic basis, are shaped by natural selection.

Below are the current areas of active research in the lab. We have a broad interest in human genetics and welcome collaborations and new ideas to expand our research scope.

Genetic architecture of complex traits

Trait-associated variants are typically (i) spread across the genome, each with a very small effect, and (ii) non-coding, likely influencing traits through gene regulatory mechanisms. In our lab, we develop approaches to conceptualize these observations and to extract biological insights from genetic association studies.

Integration of genetic and functional data

One strategy we pursue is the integration of genetic associations with functional data, such as genetic effects on gene expression (eQTLs) and gene regulatory elements like enhancers. In previous work, we have shown that eQTLs are biased away from top trait-relevant genes, partly due to natural selection, resulting in limited overlap with trait-associated variants. More generally, we are interested in understanding, and making use of, the genetic basis of intermediate features along the pathway from genotype to phenotype, from molecular DNA-proximal features such as chromatin accessibility to trait-proximal endophenotypes.

A gene view of complex traits

GWAS variants are typically non-coding, and their target genes, which are crucial for understanding the underlying biology, are usually unknown. A major issue is that different methods for identifying putative causal genes often overlap poorly. Another key shortcoming is that most of these methods do not provide a quantitative measure of the relative importance of genes. We address these issues by combining population genetics modeling, analysis of protein-altering variants, and transcriptome-wide association studies (TWAS). Our goal is to understand which types of genes have the greatest impact on specific traits and why, and to identify which genes are suitable for follow-up analysis, such as for drug development.

Characterizing genetic interactions

While examples of gene-by-gene (GxG) and gene-by-environment (GxE) interactions are common in non-human organisms, their contribution to human complex trait variation is debated. This is partly due to the challenge of quantifying the "environment" and the multiple testing problem without prior knowledge of relevant genotypes or exposures. To address these challenges, we aim to develop innovative strategies to characterize and conceptualize genetic interactions. This involves developing models of genetic interactions funneled through gene regulatory networks and studying admixed populations such as African Americans.

Genomic risk prediction

The use of genetic data in clinical settings, particularly for disease classification, is fundamentally different between Mendelian (monogenic) diseases and complex (polygenic) diseases. For complex diseases, like breast cancer and cardiovascular disease, a common approach is aggregating the effects of disease-associated variants into polygenic scores (PGS). These scores can aid in preemptive measures before disease onset or in determining screening frequency. 
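As a rough illustration of what "aggregating the effects" means, a PGS in its simplest form is a weighted sum of an individual's allele dosages, with the weights taken from GWAS effect estimates. The sketch below assumes additive effects and uses made-up variants, dosages, and effect sizes; the function name `polygenic_score` is hypothetical and not taken from any specific tool.

```python
def polygenic_score(dosages, effect_sizes):
    """Return the weighted sum of allele dosages.

    dosages: copies of the effect allele carried at each variant (0, 1, or 2).
    effect_sizes: per-variant weights, e.g. GWAS log odds ratios (assumed here).
    """
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


# Hypothetical example: three variants with illustrative effect sizes.
effect_sizes = [0.5, -0.25, 0.125]
individual = [2, 1, 0]  # effect-allele copies at each variant
score = polygenic_score(individual, effect_sizes)
print(score)
```

In practice, scores like this are usually interpreted relative to a reference distribution rather than as absolute values, and the weights are often adjusted for linkage disequilibrium before use.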

However, PGS often exhibit limited predictive accuracy. Additionally, they are less accurate for individuals who differ from the GWAS samples, notably in ancestry or, as we demonstrated in a previous study, in characteristics such as sex, age, and socioeconomic status.

We are interested in developing approaches to improve the accuracy, equitable use, and biological interpretability of PGS, and genomic risk modeling in general.

Boosting PGS with functional data

The limited accuracy of PGS is primarily due to noisy genetic effect estimates from GWAS. One strategy to address this is to utilize functional knowledge to prioritize causal variants over background variation. We aim to improve PGS by pooling information from various sources, at the variant level (e.g., regulatory activity and distance to genes) and at the gene level (e.g., biological pathways, expression profiles, and burden of rare variants).

Biological decomposition of PGS

Complex traits and diseases typically arise from biological processes in multiple tissues or cell types. For effective disease prevention and treatment, it is crucial not only to identify individuals at risk but also to discern the specific biological components underlying the risk, such as distinguishing between eosinophilic and non-eosinophilic asthma. We are interested in identifying biologically meaningful ways of decomposing PGS, such as by cell type or pathways, to enable more informed and interpretable disease risk assessment.
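One simple way to picture such a decomposition, under an additive model, is to split the overall score into partial scores, each summing only the variants annotated to a given component. The sketch below is a minimal illustration of that idea, not a description of a specific method: the component labels, dosages, and effect sizes are invented, and assigning each variant to exactly one component is a simplifying assumption.

```python
from collections import defaultdict


def decompose_score(dosages, effect_sizes, annotations):
    """Split an additive score into per-component partial scores.

    annotations: one label per variant (e.g. a cell type or pathway);
    here each variant is assumed to belong to exactly one component.
    """
    if not (len(dosages) == len(effect_sizes) == len(annotations)):
        raise ValueError("all inputs must have the same length")
    parts = defaultdict(float)
    for dosage, beta, label in zip(dosages, effect_sizes, annotations):
        parts[label] += dosage * beta
    return dict(parts)


# Hypothetical example: four variants assigned to two components.
dosages = [2, 1, 1, 0]
effect_sizes = [0.5, 0.25, -0.125, 0.75]
annotations = ["immune", "immune", "epithelial", "epithelial"]

partial_scores = decompose_score(dosages, effect_sizes, annotations)
total = sum(partial_scores.values())  # equals the undecomposed score
print(partial_scores, total)
```

Because the model is additive, the partial scores sum to the overall score, so two individuals with the same total risk can still be distinguished by which component contributes most.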
