The Genetics Podcast
EP 160: Artificial Intelligence, GWAS in Drug Discovery, and Career Insights with Dr. Eric Fauman, Executive Director and Head of Computational Biology in the Internal Medicine Research Unit at Pfizer
Episode notes
0:00 Introduction
1:30 The power of social media: How Eric published 10 papers based on ideas that he discussed on Twitter
5:50 Explanation of The Table of Everything, an internal database at Pfizer that catalogs nearly 20,000 human genes and their associated diseases and traits
13:20 How Eric’s team works to correlate genome-wide association study (GWAS) results to real biological phenotypes and outcomes
18:10 Introduction to protein quantitative trait loci (pQTLs), including their importance in biological and genetic data
25:10 Examining the evolving bottlenecks in drug development and the challenges of validating genetic targets
28:30 Navigating the gap between genetic hits and biological understanding, and how AI or functional studies could bridge this in target discovery
32:20 Linus Pauling's mentorship of Eric and how he might react to AlphaFold2’s breakthroughs in structural biology
35:15 Eric's take on using AI and how he's experimenting with it on trusted datasets
41:00 An introduction to Mendelian randomization, as well as its strengths and limitations
47:00 How Eric uses the TOP Model (Talent, Opportunity, and Passion) to guide his career choices and path
52:00 Diversity and collaboration in genetics research and implementation
55:00 Closing remarks
Resources mentioned throughout the episode:
Mendelian Randomization with Proxy Biomarkers
Explores proxy biomarkers as a method to assess in vivo activity of a protein target.
Trait Colocalization and Causal Genes
Demonstrates how traits with opposing effects on a genetic variant may suggest a causal gene sits between them.
Metabolite Profiling in Human Knockouts
Community Workshop on Effector Gene Standards
Presentation: Watch on YouTube
TOP Model for Career Guidance
The Table of Everything
Overview: Read more on Pfizer’s site
UK Biobank Protein QTL Study
Eric’s First GWAS Contribution
Every Gene Ever Annotated (EGEA)
Public Resource: View annotations on GitHub
Nine reasons not to use eQTLs to identify causal genes from GWAS:
Random Sequences Can Create Regulatory Elements
- “~83% of random promoter sequences yielded measurable expression” - de Boer CG, Nat Biotechnol, 2020
- “Recently evolved enhancers are formed predominantly by exaptation of ancestral DNA” - Villar D, Cell, 2015
- “Extensive co-regulation of neighboring genes complicates the use of eQTLs in target gene prioritization” - Tambets R, et al., HGG Adv., 2024
Enhancer Variants and Buffering in Important Genes
- “eQTLs at GWAS loci are more likely to point to genes with low enhancer redundancy not associated with disease” - Wang X, Goldstein DB, Am J Hum Genet., 2020
- “GWAS and eQTL studies are systematically biased toward different types of variants” - Mostafavi H, et al., Nat Genet., 2023
- “CNVs are buffered by post-transcriptional regulation in 23%-33% of proteins significantly enriched in protein complex members” - Gonçalves E, et al., Cell Systems, 2017
eQTL Data Limitations vs. Proximity Information
- “cis-eQTL target genes are relatively poor indicators of ‘true positive’ causal genes” - Stacey D, et al., NAR., 2018
- “When molecular QTL colocalization evidence was removed, we saw similar classification results” - Mountjoy E, et al., Nat Genet., 2021
- “Key predictive features included coding or transcript-altering SNVs, distance to gene, and open chromatin-based metrics” - Forgetta V, et al., Hum Genet., 2022