Inference of Host–Pathogen Interaction Matrices from Genome-Wide Polymorphism Data

Bibliographic Details
Main Authors: Märkle, H, John, S, Metzger, L, STOP-HCV Consortium, Ansari, MA, Pedergnana, V, Tellier, A
Format: Journal article
Language: English
Published: Oxford University Press 2024
Description
Summary: Host–pathogen coevolution is the reciprocal evolutionary change in both species driven by genotype × genotype (G×G) interactions, which determine the outcome and severity of infection at the genetic level. While co-analyses of host and pathogen genomes (co-genome-wide association studies) allow us to pinpoint the interacting genes, they do not reveal which host genotypes are resistant to which pathogen genotypes. Knowledge of this so-called infection matrix is important for agriculture and medicine. Building on established theories of host–pathogen interactions, we here derive four novel indices capturing the characteristics of the infection matrix. These indices can be computed from full-genome polymorphism data of randomly sampled uninfected hosts, as well as infected hosts and their pathogen strains. We use these indices in an approximate Bayesian computation method to pinpoint loci with relevant G×G interactions and to infer their underlying interaction matrix. In a combined single nucleotide polymorphism dataset of 451 European humans and their infecting hepatitis C virus (HCV) strains and 503 uninfected individuals, we reveal a new human candidate gene for resistance to HCV and new virus mutations matching human genes. For two groups of significant human–HCV (G×G) associations, we infer a gene-for-gene infection matrix, which is commonly assumed to be typical of plant–pathogen interactions. Our model-based inference framework bridges theoretical models of G×G interactions with host and pathogen genomic data. It therefore paves the way for understanding the evolution of key G×G interactions underpinning HCV adaptation to the European human population after a recent expansion.
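To make the notion of an infection matrix concrete, the following minimal Python sketch encodes two textbook 2 × 2 matrices: the gene-for-gene matrix mentioned in the summary and, for contrast, a matching-allele matrix. It is illustrative only and is not the article's method: the labels `H1`, `H2`, `P1`, `P2`, the 0/1 encoding, and the helper functions are assumptions introduced here, and the article's four indices are not reproduced.

```python
# Illustrative sketch only: names and the 0/1 encoding are assumptions,
# not taken from the article.
# Rows are host genotypes, columns are pathogen genotypes.
# Entry 1 = infection succeeds, 0 = the host resists.
HOSTS = ["H1", "H2"]
PATHOGENS = ["P1", "P2"]

# Gene-for-gene: H1 is susceptible to everything, H2 is resistant to the
# avirulent pathogen P1, and the virulent pathogen P2 infects both hosts.
GENE_FOR_GENE = [
    [1, 1],  # H1
    [0, 1],  # H2
]

# Matching-allele: each pathogen infects only the host it matches.
MATCHING_ALLELE = [
    [1, 0],  # H1
    [0, 1],  # H2
]


def resistant_pairs(matrix, hosts, pathogens):
    """List the (host, pathogen) combinations in which the host resists."""
    return [
        (host, pathogen)
        for i, host in enumerate(hosts)
        for j, pathogen in enumerate(pathogens)
        if matrix[i][j] == 0
    ]


def has_universal_pathogen(matrix):
    """Return True if at least one pathogen genotype infects every host."""
    n_pathogens = len(matrix[0])
    return any(all(row[j] == 1 for row in matrix) for j in range(n_pathogens))


if __name__ == "__main__":
    print(resistant_pairs(GENE_FOR_GENE, HOSTS, PATHOGENS))
    print(has_universal_pathogen(GENE_FOR_GENE))
    print(has_universal_pathogen(MATCHING_ALLELE))
```

The two helpers highlight the qualitative difference between the models: a gene-for-gene matrix contains a pathogen genotype that infects all hosts, whereas a matching-allele matrix does not.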