UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Coevolutionary epidemiology : a population genetic exploration of evolutionary interactions between hosts… MacPherson, Ailene 2020

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Notice for Google Chrome users:
If you are having trouble viewing or searching the PDF with Google Chrome, please download it here instead.

Item Metadata


24-ubc_2020_may_macpherson_ailene.pdf [ 27.07MB ]
JSON: 24-1.0389723.json
JSON-LD: 24-1.0389723-ld.json
RDF/XML (Pretty): 24-1.0389723-rdf.xml
RDF/JSON: 24-1.0389723-rdf.json
Turtle: 24-1.0389723-turtle.txt
N-Triples: 24-1.0389723-rdf-ntriples.txt
Original Record: 24-1.0389723-source.json
Full Text

Full Text

Coevolutionary EpidemiologyA population genetic exploration of evolutionary interactionsbetween hosts and their infectious pathogensbyAilene MacPhersonB.Sc., The University of Idaho, 2013M.Sc., The University of Idaho, 2015A THESIS SUBMITTED IN PARTIAL FULFILLMENT OFTHE REQUIREMENTS FOR THE DEGREE OFDOCTOR OF PHILOSOPHYinThe Faculty of Graduate and Postdoctoral Studies(Zoology)THE UNIVERSITY OF BRITISH COLUMBIA(Vancouver)April 2020© Ailene MacPherson, 2020ii  The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, the dissertation entitled:  Coevolutionary Epidemiology:  A population genetic exploration of evolutionary interactions between hosts and their infectious pathogens  submitted by Ailene MacPherson  in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Zoology  Examining Committee: Sarah Otto, Professor, Zoology, UBC Supervisor  Matthew Pennell, Assistant Professor, Zoology, UBC Supervisory Committee Member  Jonathan Davies, Associate Professor, Botany, UBC University Examiner Jeffrey Joy, Assistant Professor, Medicine, UBC University Examiner   Additional Supervisory Committee Members: Daniel Coombs, Professor, Mathematics, UBC Supervisory Committee Member Michael Doebeli, Professor, Zoology and Mathematics, UBC Supervisory Committee Member  AbstractCoevolution between hosts and their parasites is widespread with important emergent consequences fornatural systems from across the tree of life. The Red Queen Hypothesis suggests coevolution should main-tain genetic variation in hosts and favour the evolution of sexual reproduction. Mathematical models havedemonstrated that coevolutionary dynamics and the resulting effects on genetic variation and evolution ofsex depend fundamentally on the genetic basis and life-history of the host-parasite interaction. Our under-standing of the interaction genetics in natural systems is, however, still limited. In Chapter 2 I develop astatistical method based on genome-wide association studies (GWAS) to identify the genetic interactionsbetween hosts and their parasites, demonstrating that inference of these genetic interactions is essential for arobust understanding of epidemiological traits.Classic models, including the one used in Chapter 2, consider the interaction between hosts and theirfree-living pathogens. Many pathogens are, however, directly transmitted between hosts and hence subjectto epidemiological dynamics. In the Chapter 3, I consider the effects of these epidemiological dynamics oncoevolution. We find, that epidemiological dynamics disrupt classic “Red Queen allele frequency cycles”observed in free-living pathogens, a change in dynamics that may limit the ability of coevolution to favourthe evolution of sexual reproduction. Chapters 4 and 5 extend this by exploring the effects of epidemiologicaldynamics on the maintenance of genetic variation. Chapter 4 develops a baseline for the effect, examiningthe stochastic dynamics of heterozyogsity in a free-living pathogen population of constant size. In contrastto existing hypotheses, we find that coevolution in this classic model does not maintain genetic variation.In Chapter 5 we show that epidemiology can maintain genetic variation in hosts of directly transmittedpathogens, due in part to associated changes in population sizes. My thesis, therefore, demonstrates that, likeother aspects of host and pathogen life-history, disease epidemiology fundamentally affects coevolutionarydynamics with implications for the evolution of sexual reproduction and the maintenance of genetic variation.iiiLay SummaryInfectious diseases are pervasive and possibly one of the most important agents of natural selection acrossthe tree of life. Coevolution between hosts and their pathogens occurs when hosts vary in their susceptibilityto different pathogen genotypes. This thesis begins (Chapter 2) by developing a method for identifying hostand pathogen genotypes underlying susceptibility. Although the genetic basis of susceptibility is unknownin many systems, one genetic basis that is often used in the theoretical literature is a “matching-alleles”interaction. In Chapter 3 we explore how such matching-alleles coevolution between a host and a directly-transmitted infection shapes disease epidemiology and vice versa. Chapter 4 and 5 build upon these resultsexamining how matching-alleles coevolution across a range of epidemiological models leads to the mainte-nance or more often the depletion of genetic variation in host susceptibility.ivPrefaceA version of Chapter 2 has been published [MacPherson, A., S.P. Otto, and S.L. Nuismer. 2018. Keepingpace with the Red Queen: Identifying the genetic basis of susceptibility to infectious disease. Genetics]. Theproject was conceived by S.L. Nuismer and together with S.P. Otto we developed the statistical methods,mathematical models, and performed the GWAS on previously published data. I wrote the manuscript andS.L. Nuismer, and S.P. Otto contributed to manuscript edits.A version of Chapter 3 has been published [MacPherson, A., and S.P. Otto. 2018. Joint coevolution-ary–epidemiological models dampen Red Queen cycles and alter conditions for epidemics. Theoreticalpopulation biology]. The project was conceived together with S.P. Otto. I developed and analysed the math-ematical models with input from S.P. Otto. I wrote the manuscript and S.P. Otto contributed to manuscriptedits.Chapters 4 and 5 are both in preparation for submission with co-authors M.J. Keeling and S.P. Otto. Theproject was conceived by me and developed and analysed with input from M.J. Keeling, S.P. Otto. I wrotethe manuscripts and M.J. Keeling and S.P. Otto contributed to manuscript edits.vTable of ContentsAbstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiiLay Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ivPreface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vTable of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viList of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ixList of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xList of Supplementary Materials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiAcknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiiDedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1 May and Anderson and the History of Coevolutionary-Epidemiology . . . . . . . . . . . . 11.2 The Effect of Interaction Genetics on Coevolution . . . . . . . . . . . . . . . . . . . . . . 21.2.1 The gene-for-gene model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21.2.2 The matching-alleles model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31.2.3 The phenotypic-difference model . . . . . . . . . . . . . . . . . . . . . . . . . . . 41.2.4 The phenotypic-matching model . . . . . . . . . . . . . . . . . . . . . . . . . . . 51.2.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51.3 The Effect of Host-Parasite Life History on Coevolution . . . . . . . . . . . . . . . . . . . 61.3.1 Time scale: discrete vs. continuous time . . . . . . . . . . . . . . . . . . . . . . . 61.3.2 Ecology: constant vs. variable population size . . . . . . . . . . . . . . . . . . . . 81.3.3 Epidemiology: indirect vs. direct transmission . . . . . . . . . . . . . . . . . . . . 91.4 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91.4.1 Chapter 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101.4.2 Chapter 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101.4.3 Chapter 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111.4.4 Chapter 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12vi1.5 A Beginning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122 Identifying the Genetic Basis of Susceptibility to Infectious Disease . . . . . . . . . . . . . . 142.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142.2 The Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152.3 Analytical Approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172.3.1 Single-species GWAS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182.3.2 Two-species co-GWAS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202.4 Host-Parasite Coevolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222.5 Daphnia-Pasteuria GWAS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253 A Generalized Deterministic Compartmental Model with Coevolution . . . . . . . . . . . . . 283.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283.2 Theoretical Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293.3 Background: Matching-Alleles Coevolution Model . . . . . . . . . . . . . . . . . . . . . . 323.4 Background: SIRS Model Without Coevolution . . . . . . . . . . . . . . . . . . . . . . . . 333.5 SIRS Model With Coevolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363.6 Model Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383.7 Coevolution in an SEIRS Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413.8 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 434 Coevolution Does Not Slow the Rate of Loss of Heterozygosity in a Stochastic Host-ParasiteModel with Constant Population Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454.2 Theoretical Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 474.3 The Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 484.3.1 Deterministic dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 514.3.2 Ensemble moment dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 514.3.3 Individual-based simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 524.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545 Feedback Between Coevolution and Epidemiology Can Help or Hinder the Maintenance ofGenetic Variation in Host-Parasite Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . 575.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 575.2 Theoretical Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 585.3 Deterministic Dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 605.3.1 Constant-size model: review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 615.3.2 Ecological model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 615.3.3 Epidemiological model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 635.4 Stochastic Dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64vii5.4.1 Constant model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 655.4.2 Ecological model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 685.4.3 Epidemiology model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 695.5 Model Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 705.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 726 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 746.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 746.2 Context, Conclusions, and Caveats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 756.3 Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 776.3.1 Genetic basis of coevolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 776.3.2 Multiple coevolving traits and epidemiological trade-offs . . . . . . . . . . . . . . 786.3.3 Spatial coevolutionary-epidemiology . . . . . . . . . . . . . . . . . . . . . . . . . 786.3.4 (Co)evolutionary (un)rescue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 796.4 A Final Thought . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80Appendices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91A Appendix for Chapter 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92A.1 Supplementary Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92B Appendix for Chapter 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94C Appendix for Chapter 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96C.1 The Ensemble Moment Approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96C.1.1 General approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96C.1.2 Dynamics of host heterozygosity . . . . . . . . . . . . . . . . . . . . . . . . . . . 97C.2 Maintenance of Genetic Variation in Discrete Time . . . . . . . . . . . . . . . . . . . . . . 97C.2.1 Model specification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97C.2.2 Deterministic dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98C.2.3 Simulations of a finite population . . . . . . . . . . . . . . . . . . . . . . . . . . . 98C.3 Seed Dormancy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99C.4 Supplementary Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101D Appendix for Chapter 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103D.1 Stability and Transient Dynamics of the Epidemiological Model . . . . . . . . . . . . . . . 103D.2 Neutral Genetic Drift in the Individual-Based Simulations . . . . . . . . . . . . . . . . . . 104D.3 Effect of Ne in the Ecological Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105D.4 Supplementary Figures and Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106viiiList of Tables5.1 Individual-based simulation parameter selection. . . . . . . . . . . . . . . . . . . . . . . . 67D.1 Variation inH across parameter space versus among replicates. . . . . . . . . . . . . . . 106D.2 Linear model fits shown in Figure 5.2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107ixList of Figures1.1 Allele-frequency dynamics in the matching-alleles model. . . . . . . . . . . . . . . . . . . 72.1 Host-parasite interaction models. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172.2 Host-only model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192.3 Host-pathogen Model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212.4 Allelic effects over coevolutionary time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232.5 GWAS of Daphnia magna susceptibility. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253.1 MAM dynamics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333.2 SIRS epidemiological dynamics without coevolution. . . . . . . . . . . . . . . . . . . . . . 363.3 SIRS epidemiological dynamics with coevolution. . . . . . . . . . . . . . . . . . . . . . . . 383.4 Comparison of epidemiological dynamics with and without coevolution. . . . . . . . . . . . 393.5 Effect of SIRS epidemiological dynamics on allele-frequency cycles. . . . . . . . . . . . . . 403.6 Effect of SEIRS epidemiological dynamics on allele-frequency cycles. . . . . . . . . . . . . 424.1 Deterministic dynamics of matching-alleles coevolution. . . . . . . . . . . . . . . . . . . . 494.2 Schematic of the Moran MAM. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 504.3 The effect of early perturbations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 534.4 Individual-based simulations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545.1 Model design and deterministic dynamics of the three models explored in this paper. . . . . 605.2 Observed heterozygosity with host-parasite coevolution relative to the neutral expectation . . 665.3 Maintenance of genetic variation relative to overdominance. . . . . . . . . . . . . . . . . . 705.4 Comparison of observed heterozygosity across models. . . . . . . . . . . . . . . . . . . . . 71A.1 Matching-alleles model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92A.2 GWAS of Daphnia magna susceptibility. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93C.1 Comparison between discrete- and continuous-time models. . . . . . . . . . . . . . . . . . 101C.2 Life-cycle diagram and numerical stability analysis of MAM with seed dormancy. . . . . . . 102D.1 Epidemiological model: deterministic model stability and the role of R0(). . . . . . . . . . 104D.2 Variation in effective population size. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106D.3 Effect of directional selection across models. . . . . . . . . . . . . . . . . . . . . . . . . . 108xList of Supplementary MaterialsChapter 2: CoGWAS.nbChapter 3: SIRS_Model.nbChapter 3: SEIRS_Model.nbChapter 3: Morand_Model.nbChapter 4: Moran_Model.nbChapter 4: Wright_Fisher_Model.nbChapter 4: Seed_Dormancy.nbChapter 5: Constant_Size_Model.nbChapter 5: Ecological_Model.nbChapter 5: Epidemiological_Model.nbxiAcknowledgementsMy deepest thanks to Sally Otto for her endless support and guidance. Sally–thanks for everything you havetaught me, from the tiniest technical detail to the biggest existential perspective. Thanks for giving me anappreciation of the natural world and showing me how to live life with never-ending curiosity and grace.Everyone needs a role model, you will always be mine.Thanks to Linnea Sandell and Gil Henriques for your friendship and support from day 1.Thanks to the many Otto lab members who have and continue to teach and inspire me. During my time atUBC I benefited enormously from the help of Nathaniel Sharp, Matt Pennell, Michael Scott, Jasmine Ono,Matt Osmond, and Liz Kleynhans, Alirio Rosales, Linnea Sandell, Shahab Zaryan, Freek de Haas, ShingZhan, Seth Watt, and Niki Love. Thanks to the many visiting students and researchers to the Otto lab and allthe insights you have brought.A special thanks to my co-authors, Sally Otto, Scott Nuismer, and Matthew Keeling. What I have learnedfrom you is unmeasurable.Thanks to my supervisory committee (Dan Coombs, Michael Doebeli, and Matt Pennell) for your supportand perspectives. Thanks to my examining committee (Matt Pennell, Jonathan Davies, Jeff Joy, and AurélienTellier) for your comments and suggestions. They have added context and depth.Thanks to the BRC community, you have added so much depth and joy to my PhD.Thanks to UBC Zoology department, the European Society of Evolutionary Biology, and the AmericanAssociation for University Women for your support of my research.xiiDedicationFor my family: Papa, Mama, Blaise, Lily, Quinn, Chiara, Jenna, and Luis.xiiiChapter 1Introduction1.1 May and Anderson and the History of Coevolutionary-EpidemiologyInfectious diseases are pervasive with enormous impacts on both human and wildlife populations. Accordingto the World Health Organization (WHO), lower respiratory infections, diarrhoeal diseases, and tuberculosisremain three of the top 10 leading causes of death worldwide. In wildlife populations, infectious diseasescontribute to species endangerment and extinction (Smith et al., 2006). In addition to, and as a direct resultof, their effect on host survival, infectious diseases exert strong selective forces on their hosts providingthe possibility, in the presence of genetic variation for rapid coevolution between hosts and their infectiouspathogens.Mathematical models have always played a fundamental role in our understanding of both infectious dis-eases and coevolution. This thesis continues in this tradition exploring coevolutionary-epidemiology throughpopulation genetic models. In their seminal paper “Epidemiology and genetics in the coevolution of para-sites and hosts” May and Anderson (1983) were the first to unite the fields of epidemiology and host-parasitecoevolution by developing the first genetically explicit model of a directly-transmitted pathogen. Set againsta backdrop of “unsound theoretical predictions” based on group selection arguments that natural selectionshould favour the evolution of benign pathogens, May and Anderson set out to develop a rigorous modelfor how parasites 1) regulate the density and geographic distributions of their hosts, 2) maintain geneticpolymorphisms at coevolving loci, and 3) select for the evolution and maintenance of sexual reproduction.Although not set against the premise of evolution towards avirulence, the questions and aims of this thesisremain remarkably close to those of May and Anderson. As they did, here we develop simple theoreticalmodels to understand 1) the genetic basis of host-parasite coevolution, 2) how epidemiological dynamics in-fluence coevolutionary dynamics, and 3) when and how coevolution maintains or depletes genetic diversityin small populations.Focused on coevolution between hosts and micro-parasites, May and Anderson’s model differed fromother coevolutionary models of that time that, in the spirit of Lotka-Volterra models, considered interac-tions between hosts and their free-living macro-parasites (Levin, 1983). Despite being cited extensively insubsequent coevolutionary literature, the study of host-parasite coevolution, particularly on the theoreticalside, has remained largely an exploration of macro-parasitic infections. As we will emphasize throughoutthis thesis, the distinguishing characteristic of macro-parasites from a coevolutionary perspective is that theymay live, infect new hosts, and die independently of their hosts. The fact that macro-parasitic infections havedominated the coevolutionary literature is no coincidence as many coevolutionary model systems fall withinthis category. For example Microphallus, a trematode parasite of the New Zealand mud snail Potamopyr-1gus antipodarum, has a complex multi-host species life cycle external to this focal host species. Similarly,the bacterial pathogen Pasturia ramosa is transmitted environmentally through the water column betweenindividuals of it’s host, the water flea Daphnia magna. Determining when and how our understanding ofcoevolution from this ecological perspective differs from an epidemiological one is therefore a key aim ofthis thesis.In addition to their focus on disease epidemiology, May and Anderson emphasize that they consider asimplistic model of coevolution containing only “the essential features of the evolutionary relations betweenparasites and hosts.” They note that the simplistic single-locus genetic basis of coevolution used in theirmodel, although likely unrealistic, is essential for mathematical progress. This trade-off between biologicalrealism and mathematical tractability remains in the theoretical models explored here. Although we toofocus primarily on single-locus models for mathematical simplicity, in the intervening four decades, ourtheoretical understanding of the effect of the genetic basis of coevolution has matured. Within this broadercontext we attempt to understand how our simplistic models can inform our understanding of coevolution insystems with more complex genetic bases. An emergent theme of this thesis, therefore, is that the dynamicsof coevolution are determined and hence characterized by the underlying genetics of the interactions and,as emphasized above, the life history of the host-pathogen interactions. We begin with a review of thetheoretical literature on how these two determining factors, in general, shape the coevolution of hosts andtheir pathogens. In an attempt to draw general conclusions and connections this introduction steps awayfrom the mathematical details and rigour of subsequent chapters. More in depth and nuanced discussionsof the individual models are presented, when needed, in the introduction and background sections to eachindividual chapter.1.2 The Effect of Interaction Genetics on CoevolutionFrom a theoretical perspective there are two general approaches to modelling host-parasite interactions. Inthe style of population genetics, one approach is to model coevolution with an explicit genetic basis consist-ing either of a single or a small number of major-effect loci. The two classic examples of such models arethe "gene-for-gene" (GFG) and the “matching-alleles-model” (MAM). The alternative, quantitative-geneticsstyle approach, considers interactions based on the phenotypic traits of the host and parasite. Within this veinare the “phenotypic-matching model” and the “phenotypic-difference model” both of which are discussed inmore detail below. The modelling approach of Nuismer and Otto (2005) also used in Chapter 2, provides aninteresting intermediate between these two where, by developing statistical methods for identifying the ge-netic basis of phenotypic traits, we model “phenotypic-matching” and “phenotypic-difference” interactionswith an explicit two locus basis. Whether genotypic or phenotypic, these four interaction models have beenexplored extensively. The biological motivation and general dynamical behaviour of each is outlined below.1.2.1 The gene-for-gene modelThe fundamental requirement for the genetic basis of coevolution is specificity. Working in flax (Linummarginale) and rust (Melampsora lini), Flor (1955, 1956) was the first to demonstrate the existence of such2genetic specificity between a host and its pathogen. Since Flor’s initial work, flax-rust has become a modelsystem of coevolutionary biology and displays a form of genetic specificity found in many plant systems,known now as the gene-for-gene (GFG) interaction (Flor, 1971). Subsequent genetic analysis of the flax-rust system (Barrett et al., 2009) and other plant-pathogen systems has shown similar patterns of geneticspecificity governed by plant resistance (R) genes and corresponding host avirulence (Avr) genes (Van derBiezen and Jones, 1998).From a mathematical perspective, in the simplest formulation as a haploid single-locus GFG model, thehost can carry either a “susceptible” allele or a “resistant” allele whereas the pathogen is either “virulent”or “avirulent”. Avirulent pathogens can infect only susceptible hosts whereas virulent pathogens can infecteither host type. The result is an inherently asymmetric interaction with a universally virulent pathogengenotype and a universally susceptible host. In the absence of additional fitness costs to virulence andresistance alleles, the presence of parasitism favours the evolution of resistance in the host, which in turnfavours the evolution of pathogen virulence. The result is an increase in the frequency and the ultimatefixation of the host resistance and pathogen virulence genotypes (Jayakar, 1970). This dynamic is oftencalled a “coevolutionary arms race” as it favours the evolution of sequentially more resistant hosts and morevirulent pathogens (Dawkins et al., 1979; Woolhouse et al., 2002). Longer term maintenance of geneticvariation (Brown and Tellier, 2011) and complex cyclic dynamics arise in this model only when implicit(Salathé et al., 2005) or explicit costs (Sasaki, 2000; Bergelson et al., 2002) of resistance and virulencegenes are included.1.2.2 The matching-alleles modelThe matching-alleles model (MAM) of host-pathogen genetic specificity is characterized by the absence ofa universally resistant host genotype and a universally virulent pathogen genotype. In other words, everyhost is susceptible to at least one pathogen genotype, and every pathogen genotype fails to infect at leastone host genotype. This is in contrast to the gene-for-gene model where the “virulent” pathogen is able toinfect all host genotypes. The lack of universally resistant/virulent types inherently leads to the trade-offbetween susceptibility and resistance amongst strains. In its simplest haploid single-locus two-allele form(Segar and Hamilton, 1988), the result is that one host genotype is susceptible to exactly one, “matching”,parasite genotype and resistant to the other “mis-matching” genotype. Extensions of this model to diploids,multiple loci, and multiple alleles have been studied extensively (for example see Otto and Nuismer, 2004;Agrawal and Lively, 2002; Frank, 1993; Gandon et al., 1996; Hamilton, 1993).Mentioned above, parasitism of the water flea Daphnia magna by the bacterium Pasturia ramosa isone of the most well documented model systems of host-parasite coevolution (Ebert, 2005). In particular,genetic specificity in this system has a well determined MAM type genetic basis. When exposed to twodifferent parasite strains, Daphnia are susceptible to either one or both strains. The segregation of resistancein the host is consistent with a single-locus three-allele MAM (Luijckx et al., 2013). Subsequent genomicanalyses reveal a more complex multi-locus genetic basis involving extensive structural linkage and epistaticinteractions (Bento et al., 2017; Bourgeois et al., 2017). Yet the explanatory power of a single-locus MAMsuggests that, although simplistic, the dynamical behaviour of this model may accurately capture evolution3within this system.In contrast to the rapid fixation of host and pathogen genotypes in the GFG model, the single locusMAM results in cycles in host and pathogen allele frequency. The long term persistence of these allelefrequency cycles is sensitive to the life-cycle assumptions of the underlying model as discussed in moredetail below. These cycles have been given a variety of names, often related to their emergent evolutionaryeffects. For example they are often called “Red Queen cycles” due to their role in Red Queen hypothesis forthe evolution and maintenance of sexual reproduction (van Valen, 1973; Hamilton, 1980; Bell, 1982). Othernames include “trench-warfare dynamics” (Stahl et al., 1999) or a dynamic polymorphism (Woolhouse et al.,2002) in reference to their putative effects on the maintenance of genetic variation (Haldane, 1949). Thecontrast between the trench-warfare dynamics of the classic MAM model and the arms race dynamics ofthe classic GFG has established a dichotomy in coevolutionary theory to which more complex models areroutinely compared as being “MAM like” or “GFG like” (Agrawal and Lively, 2001; Holub, 2001; Bergelsonet al., 2002).In this thesis we focus almost solely on single locus matching-allele models in haploid hosts and par-asites. Previous theoretical studies have considered extensions of this model to diploids and polyploids(Lively, 1999; Nuismer and Otto, 2004) and to multiple loci (Frank, 1993; Gandon et al., 1996; Agrawaland Lively, 2002). In their unifying model, Agrawal and Lively (2002) present a coevolutionary model witha flexible genetic basis, ranging across parameter space from a MAM to a GFG model. They observe thatthe dynamical behaviour of the MAM is observed across much of the continuum between the two models.These two models and the continuum between them, however, only represent a small subset of the possiblegenetic architectures underlying host-parasite coevolution. In the classic manner of theoretical evolution analternative to the single or few locus genetic models is a phenotypic approach. As we have done here forthe genetic approaches, in the following subsections I discuss two such models, their biological motivations,and characteristic behaviour.1.2.3 The phenotypic-difference modelRather than based on genetic specificity, host susceptibility and pathogen virulence in the phenotypic modelsbased on the interaction between a host phenotype zH and a pathogen phenotype zP . Although presum-ably the interaction could depend on a multi-variate phenotype in each species, for simplicity we will onlyconsider the univariate case here. Two such models are used widely in coevolutionary theory. The firstis the phenotypic-difference model (Nuismer et al., 2007), where the probability that a pathogen success-fully infects the host is determined by the difference in their phenotypes. In particular, pathogens are mostsuccessful at infecting hosts with phenotypes less than or equal to their own.Examples of biological systems with phenotypic-difference type specificity include many host-parasiteand predator-prey interactions mediated by parasite/predator toxin and host/prey resistance (often in the formof an anti-toxin). The rough-skinned newt (Taricha granulosa) produces the neurotoxin tetrodotoxin (TTX)to deter predation by the common garter snake (Thamnophis sirtalis). In regions of high TTX production,garter snakes exhibit a corresponding increase in their resistance to TTX (Brodie et al., 2002). The sharpdecline in snake fitness with increasing dose of TTX is well captured by the S-shaped curve between pheno-4typic difference, zH  zP and fitness used in the phenotypic-difference model.Much like the GFG model, the characteristic dynamic of the phenotypic-difference model is a coevolu-tionary arms race, albeit a phenotypic one rather than a genetic one. Specifically, in the absence of abioticfactors, the presence of parasitism favours the evolution of toxic hosts, which in turn favours the evolutionof resistant parasites. This leads to an escalation in host toxicity and parasite resistance. The production oftoxins and anti-toxins is, however, costly putting constraints on the evolution of these phenotypes in naturalsystems. The ensuing arms race of host and parasite toxicity/resistance is, for example, eventually limitedby these abiotic selective effects and/or genetic constraints on the phenotypes.1.2.4 The phenotypic-matching modelAn alternative to the phenotypic-difference model is the “phenotypic-matching” model where infection de-pends on the relative difference between host and parasite but rather the absolute distance such that hosts aremost susceptible to pathogens with similar phenotypic values. Phenotypic-matching coevolution can arisefrom both direct and indirect interactions between species as. In the case of brood parasitism, for example,direct interactions between cuckoos (family: Cuculidae) who lay their eggs in the nests of conspecifics orheterospecifics. The success of brood parasitism in this clade, which has nearly 60 parasitic species, is deter-mined by the match in egg colouration between parasite and host (foster) (Kilner, 2006; Honza et al., 2007).In contrast to the direct interaction between parasite and foster eggs, phenotypic-matching specificity canarise in indirect species interactions between mimics and their models in systems with Batesian mimicry.As with the phenotypic-difference model in the absence of constraints, namely in the absence of abioticselection and with a constant genetic variance over time, the trait values of both host and pathogen willbecome more extreme over time. Introduction of constraints, for example from abiotic stabilizing selection,produces limit cycles in host and pathogen phenotypes (Gavrilets, 1997). Similarly, limit cycles arise whenthe phenotypes are constrained by explicitly modelling their genetic bases (Nuismer and Otto, 2005). In thiscase the allele frequency at each locus underlying the host and pathogen trait reach their own limit cycles.In contrast to the neutrally stable Red Queen cycles of the MAM, these allele frequency limit cycles are lesssensitive to perturbations and should be robust in the face of changes in host and pathogen population size.1.2.5 SummaryThe four classes of models summarized above illustrate the dependence of coevolutionary dynamics onthe underlying genetic architecture of the specificity between the interacting species. These dynamics canbe classified roughly into two groups: “Red Queen cycles” in which host and pathogen allele frequencyor phenotype and hence the underlying allele frequencies oscillate over time and “Arms race dynamics”in which the host and pathogen allele frequencies monotonically approach some given value (often allelefixation/loss). Which dynamic arises has importance consequences for the emergent evolutionary effectsof coevolution, for example the evolution of sexual reproduction (Reviewed by Salathé et al., 2008) ormaintenance of genetic variation.The models summarized so far, however, have focused almost exclusively on evolution, specificallythe effects of species interactions on the underlying allele frequencies. Coevolution is however inherently an5eco-evolutionary process. The changes in allele-frequencies in these models can not necessarily be separatedfrom their effects on the number of hosts and pathogens. In the following section we consider three featuresof what I have broadly labelled “life-history”. As we have done above summarize what each of these featuresmeans from a modelling perspective, examples of them in natural systems, and their general effects oncoevolutionary dynamics.1.3 The Effect of Host-Parasite Life History on CoevolutionA parasite’s life-cycle is an essential element of its natural history and the primary emphasis of any intro-ductory course in parasitology. From a theoretical perspective there are three defining characteristic of host-parasite life-history in relation to the resulting coevolutionary dynamics. The first of these is the time scale,specifically whether host and parasite generations are the result of a discrete temporal sequence of events orthe continual occurrence of events. The second characteristic is the system’s ecology, namely whether or notthe parasite controls the population density of the host and vice versa. The third and final characteristic isthe disease epidemiology. In particular we focus on whether the parasite is transmitted directly or indirectlyfrom one host to the next. In addition to these three elements of life history, one more life history feature thatis known to have qualitative effects on coevolutionary dynamics is the mode of reproduction (e.g., sexual,asexual, facilitative sexual etc.). Specifically, sexual reproduction and the associated processes of recom-bination and segregation can alter the coevolutionary dynamics (Segar and Hamilton, 1988; Engelstädter,2015). As this thesis focuses almost exclusively on coevolution at single loci between haploid hosts andparasites, where the dynamics are the same for both sexuals and asexuals. Hence, as with the models of thegenetic basis of coevolution described above, we will review the biological motivations for the theoreticalassumptions underlying the three relevant life-history factors introduced above and their effects on generalcoevolutionary dynamics.1.3.1 Time scale: discrete vs. continuous timeOne of the defining features of a life-history models with respect to coevolutionary dynamics is whether thebiological system is characterized by events occurring in discrete versus continuous time. From a purelymathematical perspective organisms with discrete generation life-histories are modelled with systems of re-cursion (difference) equations where evolution proceeds in a series of steps. In contrast to discrete-time, ifbirth and death and other life-history events are distributed continuously throughout time, the life-historyand evolution of the system can be modelled using systems of differential equations. Both discrete- andcontinuous-time models are used throughout coevolutionary theory reflecting the vast diversity of the bio-logical systems involved. For example, consider the interaction between the parasitoid wasp (Aphidius spp.)and their aphid hosts (Acyrthosiphon spp.) within whom they lay their eggs. Emerging as adults from theirhosts approximately 20 days later, these wasps mate and oviposit within subsequent aphid generations cre-ating a discrete non-overlapping life-history. In contrast, over the course of one’s life span, human hostsmay be infected repeatedly with Plasmodium (the causative agent of malaria), the resulting disease and(co)evolutionary dynamics therefore are more conveniently described on a continuous-time scale.6A. B. C. D. Ecology EpidemiologyTime-scaleFigure 1.1: Allele-frequency dynamics in the matching-alleles model. Panel A. Phase plane diagram ofthe neutrally stable allele-frequency cycles characteristic of the continuous-time MAM with a constant hostand pathogen population size. Red point indicates the initial allele frequency in the host and pathogen. PanelB. Unstable allele-frequency cycles in the discrete-time MAM. Panel C. Second-order stable cycles in thecontinuous-time MAM with changing population sizes. Panel D. First-order stable allele frequency cyclesin an SI compartmental model.7Although not a formal property of discrete-time models, relative to continuous-time, the stepwise mannerof these models introduces time lags that often have two dynamical effects. First is the increased tendencyfor producing cyclic or periodic dynamics. Second is the tendency to reduce the stability of an equilibrium.A classic example of this is the discrete-time model of logistic growth (May, 1974). For biological sys-tems where the intrinsic growth rate is between 0 and 2 the system approaches a stable equilibrium at thepopulation carrying capacity, as is always true for the continuous-time model. If the intrinsic growth rateexceeds 2 however, oscillatory dynamics arise. Finally, even further increases in the intrinsic growth rate be-yond roughly 2.5 produce chaotic dynamics. Whether a mathematical artefact or a truly relevant biologicalphenomenon, this simple model exemplifies some general effects of discrete-time dynamics.The differing effects of discrete- versus continuous-time life histories on the stability and oscillatory be-haviour of models has consequences in many of the coevolutionary models explored throughout this thesis.In the single-locus MAM model discussed above, if modelled in continuous time the dynamics are char-acterized by neutrally stable Red Queen allele frequency cycles (see Figure 1.1A). Starting from an initialperturbation from the internal-polymorphic equilibrium of this model, the characteristic allele frequency cy-cles neither grow nor shrink in amplitude over time. In contrast, when modelled in discrete time these cyclesare no longer neutrally stable but rather grow in amplitude over time, as shown in Figure 1.1B, until thesystem eventually becomes fixed for one host and one pathogen allele (Leonard, 1977; M’Gonigle et al.,2009).1.3.2 Ecology: constant vs. variable population sizeCoevolution is an inherent intersection between the fields of evolution and ecology. A key element of thisintersection is the effect of an interacting species on the population size and dynamics of the focal species.Whether or not parasites play a dominant role in determining the population size of their hosts has profoundconsequences on ensuing coevolutionary dynamics. Many parasites and infectious diseases have substantialeffects on the size and population dynamics of their hosts (Papkou et al., 2016), so much so that parasitismcan be a useful method of biological control of pests and invasive species (Myers and Bazely, 2003). Ex-amples of this include the population control of pea aphids with parasitoid wasps, mentioned in the previoussection, and the attempted eradication of the invasive European rabbit in Australia through the spread ofmyxoma virus. In contrast to the devastating effects of myxoma virus on European rabbits, infection in itsnative host (the American rabbit) leads only to a mild infection with little consequence to overall survivaland population density. Despite the inherent link between these ecological dynamics and coevolutionarydynamics, traditional models of host-parasite coevolution, including many of those mentioned above, do notexplicitly model host and pathogen population dynamics and hence do not fully capture the eco-evolutionaryfeedbacks of coevolution. Given the role that these changes in host and pathogen population size will play inthe models throughout this thesis, for lack of a more concise term, I will use the general term “ecology” torefer to the coevolutionary impacts on population dynamics in hosts and parasites and their role in shapingthe dynamics of coevolution.Traditionally, coevolutionary models, such as the gene-for-gene (GFG) and matching-alleles model(MAM) introduced above, focus exclusively on the allele frequency dynamics at coevolving loci. In doing8so these models ignore the effect of ecological dynamics either implicitly, or in the case of individual-basedsimulations explicitly, assuming that the population size of the host and pathogen remain constant over time.One justification for this assumption is the focus on endemic steady state infections where host and para-site population sizes remains approximately constant. Indeed the effect of including ecological dynamicson qualitative coevolutionary dynamics is often small (Nuismer, 2017). For example when modelled incontinuous-time for GFG coevolution, ecological dynamics influence the rate at which interactions occur.While this can influence the tempo of evolution and hence the transient dynamics it does not qualitativelyinfluence the long-term dynamics or equilibria. This is also largely true for the MAM. As with the allelefrequencies, the ecological dynamics of this model are characterized by cycles in host and pathogen pop-ulation size. While these fluctuations in population size change the tempo of coevolution to leading order(Nuismer, 2017) their stabilizing affect on coevolutionary allele frequency dynamics is of smaller secondorder (see Figure 1.1C). Hence the effects of these ecological dynamics emerge only in the very long term(Frank, 1993). Nevertheless, as we will emphasize with our “ecological” model of a free-living parasite inChapter 5, these small ecological effects on allele frequency dynamics can still be incredibly relevant.1.3.3 Epidemiology: indirect vs. direct transmissionThe third and final element of host and pathogen life-history impacting coevolutionary dynamics is dis-ease epidemiology. A general term applied primarily to the many distinguishing life-history characteristicsof transmissible pathogens, here we will use it primarily to distinguish between pathogens that are eitherdirectly transmitted from one host to the next versus free-living parasites. Although not the only two pos-sibilities (consider for example vector-borne diseases) they represent important epidemiological extremesfrom a coevolutionary perspective.Explored initially by Beck (1984) and May and Anderson (1983) the effect of epidemiology on hostparasite coevolution remains largely under-explored. As described in the preceding sections, in the inter-vening years, our understanding of host-parasite coevolution in free-living parasites has grown considerably.Through advances in genetics we now have a far better understanding of the genetic basis of host-pathogeninteractions and a theoretical understanding of how those genetic interactions affect coevolutionary dynam-ics. Similarly with 40 years of insights into the diversity of host/pathogen life-histories we can now re-visit questions posed by May and Anderson to begin developing one cohesive theory of coevolutionary-epidemiology.1.4 OverviewAs outlined above this thesis builds on existing coevolutionary theory by applying mathematical and statis-tical methods to understand the effect of disease epidemiology on coevolutionary dynamics. The primaryaims of this thesis are three-fold. My first aim is to develop statistical methods for identifying gene-by-gene interactions between coevolving species. My second aim is to identify the fundamental role ecologicaland epidemiological feedbacks play in shaping the outcome of coevolutionary interactions. Lastly, I aimto quantify the rate of loss of genetic variation in coevolutionary models, providing a framework for un-9derstanding if and when coevolutionary interactions are likely to persist over time coevolution can maintaingenetic variation across a range of ecological and epidemiological contexts.1.4.1 Chapter 2A fundamental component of any coevolutionary system, and hence an assumption of any coevolutionarymodel, is the genetic basis of the interaction. Whether consisting of multiple loci, as in the case of thephenotypic-difference and the phenotypic-matching models, or a single locus, as in the case of the MAand GFG models, the nature of the genetic interaction has, as discussed in detail in section 1.2, importantconsequences on the coevolutionary outcome. Nevertheless, the genetic basis of interactions is rarely knownin natural systems. The development of novel methods for identifying the genes underlying coevolutionaryinteractions is therefore necessary to fully understanding coevolution across a diversity of systems.One popular method for identifying candidate genes underlying any phenotype is genome-wide-associationstudies (GWAS). With the increase in whole-genome sequencing it has been increasingly common to ap-ply these methods to identify the genetic basis of interaction traits such as “host resistance” or “parasitevirulence” (Rowell et al., 2012; Newport and Finan, 2011). Although successful in some cases, like anyGWAS, these methods are most adept at identifying candidate genes with large phenotypic effects (Manolioet al., 2009), and the results may be sensitive to the environmental context due to the presence of gene-by-environment (GxE) interactions.In addition the these general considerations, as a result of gene-by-gene (GxG) interactions GWAS per-formed on traits in one of a pair of interacting species may depend on the genetics of the interacting partner(Newport and Finan, 2011). To explicitly account for these GxG interactions, Wang et al. (2018) developedstatistical methods to perform full genome-by-genome GWAS. Developed concurrently with Wang et al.(2018), Chapter 2 develops the statistical methods for performing such “Co-GWAS” inference. While fo-cusing only on inference at a few loci, in comparison to Wang et al.’s (2018) genome-wide approach, weapply the statistical inference to an array of theoretical systems enabling us to assess the general robustnessof single-species versus Co-GWAS inference. To tie these statistical methods to results from classic coevo-lutionary systems, we conclude by reanalyzing previously published data on Daphnia magna susceptibilityto infection by Pasteuria ramosa (Bourgeois et al., 2017), identifying genomic regions consistent with GxGinteractions.1.4.2 Chapter 3Developed in section 1.3, a key aim of this thesis is to understand how disease life-history, in the form ofepidemiological structure, affects coevolutionary dynamics and conversely how coevolutionary interactionsalter epidemiological dynamics. Compartmental models, such as the SIRS and SEIRS models in this chapter,are used widely throughout epidemiology and are crucial for prediction and control of infectious diseases.Incorporating population heterogeneities into these compartmental models is essential for their predictivepower and biological realism. Heterogeneities commonly considered include host age/risk structure or theinclusion of multiple pathogen types (Keeling and Rohani, 2008). Despite a general focus on the impor-tance of these heterogeneities, relatively few epidemiological models include genetic variation in hosts and10pathogens and specificity amongst these groups.While few, the exploration of such coevolutionary-epidemiological models is by no means non existent.As discussed in more detail in the theoretical background section of Chapter 3 to be analytically tractable,however, these models are often forced to make strong assumptions about the form of specificity (e.g., Mayand Anderson’s (1983) assumption of perfect matching) and/or assumptions of strong selection (e.g., Ashbyand Gupta (2014); Morand et al. (1996)). To avoid such assumptions other coevolutionary-epidemiologicalmodels (e.g., Boots et al. (2014)) take an adaptive dynamics approach, which, while useful for exploringepidemiological trade-offs between such quantities as host resistance and tolerance, does not allow us toassess the dynamical properties of a system (e.g., stable versus cycling dynamics).The aim of Chapter 3 is to describe the effect of coevolution on epidemiological dynamics and vice versaacross a broad range of coevolutionary and epidemiological scenarios. Specifically, we focus on the effect ofdisease epidemiology on the persistence of Red Queen allele frequency cycles and the effect of coevolutionon the emergence of cyclic epidemic dynamics. Facilitated by explicit comparisons between models withand without coevolution/epidemiology we are able to broaden the scope of previous models to cases withoutextreme specificity or strong selection. We find that Red Queen cycles are not robust in an epidemiologicalframework and that coevolutionary interactions can alter the conditions under which epidemic cycles arise.Incorporating both explicit epidemiology and genetic diversity, therefore, may have important implicationsfor the maintenance of sexual reproduction as well as disease management.1.4.3 Chapter 4Given the relevance of coevolution in directly transmitted infections it is important to understand the like-lihood of finding ongoing coevolution in natural systems. Long-term coevolutionary dynamics, like thoseexplored in Chapter 3, require the persistence of segregating genetic variation at coevolving loci. Negativefrequency-dependent selection as a result of coevolution has been hypothesized to maintain genetic vari-ation in host and parasites (Haldane, 1949; Clarke, 1979). Despite the extensive literature pertaining tohost-parasite coevolution, the effect of matching-alleles (MAM) coevolution on the maintenance of geneticvariation has not been explicitly modelled in a finite population. Hence, before addressing the maintenanceof genetic variation in coevolving systems in a broader ecological and epidemiological context (see Chapter5) we begin by examining the maintenance of genetic variation in the classic MAM (Segar and Hamilton,1988).While the dynamics of the MAM are reminiscent of negative frequency-dependent selection and henceexpected to maintain genetic vairation, analyses of the GFG model by Tellier and Brown (Tellier and Brown,2007) suggest that not all negative frequency-dependence is alike. Specifically, the GFG model results inindirect negative frequency-dependent selection (iFDS) where alleles that are rare in one interacting partnerare favoured in the other interacting partner. In contrast to direct frequency-dependence that arises withina single species, iFDS, and hence the GFG model, does not favour the maintenance of genetic variation.Tellier and Brown (2007), however, consider coevolutionary dynamics only in an infinitely large coevolvingpopulation and hence are unable to fully address the maintenance of genetic variation in combination with,and in contrast to, random genetic drift. The aim of Chapter 4 is to assess whether, in contrast to the long-11standing hypotheses, coevolution in the MAM does not maintain genetic variation due to a lack of directnegative-frequency dependence. We do so in a fully stochastic framework using both analytical ensemblemoment approximations (Keeling, 2000) and individual-based simulations.We find that, like for the GFG model, in the infinite-population the MAM behaves neutrally. Extendingthese results into a stochastic framework, we find that while this also is largely true in finite populations,two additional phenomena arise. The first of these effects is due to natural selection responding to stochasticperturbations in host and pathogen allele frequencies. While this may increase or decrease genetic variation,depending on the parameter conditions, the net effect is small relative to that of the second phenomena.Following fixation in the pathogen, the MAM exhibits directional selection, which in turn rapidly erodesgenetic variation in the host. Hence, rather than maintain it, we find that, on average, matching-allelescoevolution depletes genetic variation.1.4.4 Chapter 5As illustrated by the results of Chapter 3, epidemiological feedback can fundamentally alter coevolution-ary dynamics. Hence in Chapter 5 we extend the results of Chapter 4 to assess the maintenance of geneticvariation in host-parasite systems with explicit ecological and epidemiological feedbacks. Specifically wecontrast the maintenance of genetic variation in the classic coevolutionary model considered previously toan “ecological” model of a host and free-living pathogen with explicit population dynamics and finally toan “epidemiological” model of a directly-transmitted infection with an susceptible-infected (SI) epidemio-logical structure. As with Chapter 3, in the deterministic limit, we find that disease epidemiology stabilizesallele frequency dynamics. This is also true, although to a lesser second-order degree, for coevolutionarymodels with ecological feedback. To understand the implications of this deterministic stability in a finitepopulation we analyse the dynamics in a stochastic finite population size framework, as in Chapter 4, butallowing explicitly for changes in populations.Like the classic MAM considered in the previous chapter we find, in general, that coevolution rarelymaintains more host genetic variation than expected under neutral genetic drift alone. When and if coevo-lution maintains or depletes genetic variation relative to neutral drift is determined, predominantly, by twofactors: the deterministic stability of the Red Queen allele frequency cycles and the frequency at whichpathogen fixation occurs, as this results in directional selection and the depletion of genetic variation in thehost. Compared to purely coevolutionary models with constant host and pathogen population sizes, ecologi-cal and epidemiological feedbacks stabilize Red Queen cycles deterministically, but population fluctuationsin the pathogen increase the rate of pathogen fixation, especially in epidemiological models. Taken togetherour results illustrate the importance of considering the ecological and epidemiological context in which co-evolution occurs when examining the impact of Red Queen cycles on genetic variation.1.5 A BeginningDespite my naive vision that my thesis would be a concise, insightful, and complete exploration of sometopic, no matter how small, I now realize and hope that this thesis will be a beginning rather than an end.12Emphasized in this introduction and in the chapters to come, the models presented here cover only a smallslice of the vast diversity of coevolutionary life histories and interaction genetics both known and imagined. Ihave been fortunate, I believe, in that this thesis raises more questions than it answers, as these new questionshave been, for me, the greatest joy of this work. Hence before beginning, I want to end by hoping that in thiswork you too will find the joy of unanswered questions.13Chapter 2Identifying the Genetic Basis ofSusceptibility to Infectious Disease 12.1 IntroductionInfectious diseases are pervasive. So pervasive, in fact, that without effective mechanisms of resistance,host populations can be quickly reduced in size or even driven to extinction. For instance, chestnut blighteffectively wiped out the American chestnut, which had little if any resistance to this novel pathogen, afterits introduction to North America in the early 1900’s (Anagnostakis, 2000; Anderson et al., 2004). Simi-larly, when myxoma virus was introduced to Australia in the 1950’s, local rabbit populations were almostentirely susceptible, resulting in millions of deaths and the decimation of local populations (Ratcliffe et al.,1952). Human populations, too, have been heavily impacted by infectious disease in the past, perhaps mostnotably during the 1918 influenza pandemic that killed more than 50 million people before fading away in1920 (Johnson and Mueller, 2002; Taubenberger and Morens, 2006). Although these examples are strikingand demonstrate the impact of unchecked infectious disease, they are far from the norm. More commonlyhost populations have effective mechanisms of resistance against pathogens they encounter regularly (Re-vers et al., 2014), with significant variability between populations depending on their history of exposure(Bartholomew, 1998; Weatherall and Clegg, 2002).The existence of substantial variation in resistance to infectious disease within host populations hasgenerated hope that it may be possible to identify the genes conferring resistance. Identifying such resistancegenes would pave the way for genetic engineering of resistant crops and livestock, focus drug developmentefforts on likely targets, and open the door to gene therapeutic approaches within human populations. Asthe genomic revolution has progressed, it has become increasingly common to search for these “resistancegenes” using genome-wide association studies (GWAS) (Newport and Finan, 2011; Rowell et al., 2012).Loosely speaking, these studies compare the marker genotypes of individuals infected with disease and thoseuninfected and ask which loci predict an individual’s infection status. The GWAS approach has now beenused to successfully identify a range of candidate genes thought to be important in resistance to infectiousdisease in plants and animals (Wang et al., 2012; Zila et al., 2013; Chapman and Hill, 2012; Khor andHibberd, 2012; Gurung et al., 2014).Despite the successes of the GWAS approach in some cases, it is becoming increasingly recognized thatthe approach has significant limitations. For instance, GWAS are most powerful when resistance depends on1A version of this chapter has been published as MacPerson, A., S.L. Nuismer and S. P. Otto. 2018. Keeping pace with the RedQueen: identifying the genetic basis of susceptibility to infectious disease. Genetics. 208: 779-789, A supporting Mathematica fileis available for download from the Dryad Digital Repository (DOI: https://doi.org/10.5061/dryad.tb25q)14common genetic variants with relatively large phenotypic effects (Manolio et al., 2009). In addition, whichcandidate genes are identified by this method may depend on the environment in which the study is conducted(Thomas, 2010). These limitations apply to GWAS in general, not just those studies focused on infectiousdisease, and are widely recognized. When GWAS are used to understand the genetic basis of resistanceto infectious disease, however, a potentially more important problem arises. Specifically, the resistancegenes identified within the host population may depend on the genetic composition of the infectious diseaseitself (Newport and Finan, 2011). This sensitivity of the GWAS approach to the genetic composition of theinfectious disease becomes acute any time genotype-by-genotype interactions (GxG) exist; in other wordswhen particular combinations of host and pathogen genes yield resistance whereas other combinations lead tosusceptibility. These GxG interactions may have drastic effects on studies attempting to pinpoint the geneticbasis of disease resistance (Lambrechts, 2010), similar to the effects of gene-by-environment interactions.One particularly disconcerting possibility is that rapid pathogen evolution or host-pathogen coevolution willcause the host resistance genes that can be identified by GWAS studies to fluctuate rapidly over time.Here we quantitatively explore the performance of genome-wide association studies when resistance toinfectious disease involves GxG interactions between host and disease. We begin by presenting a generalmathematical model of an association study to study disease resistance and evaluate the role of GxG interac-tions for several forms of host-parasite interactions. We then simulate host-pathogen coevolution to illustratethe extent to which GxG interactions may vary across time and/or space. We conclude by reanalyzing pub-lished genome-wide association data (Bourgeois et al., 2017) of Daphnia magna resistance to its (Pasteuriaramosa) pathogen, distinguishing regions of the genome associated with overall health from those involvedin resistance specific to a particular P. ramosa strain.2.2 The ModelWe consider a scenario, common in practice, where host resistance is measured as a continuous quantitativetrait. This would be the case, for instance, if host resistance is assessed by measuring viral load, duration ofinfection, or damage to host tissues. Our model assumes host resistance depends on the value of a quantitativetrait in the host, zH , relative to the value of a quantitative trait in the pathogen, zP . Specifically, we assumehost susceptibility, S, is given by the following function:S = f(zH  zP ) (2.1)The function f is sufficiently general to accommodate many commonly observed resistance mechanisms.For instance, in the interaction between the snail, Biomphalaria glabrata, and its trematode parasites, resis-tance depends on the relative quantities of reactive oxygen molecules in the snail (zH ) and reactive oxygenscavenging molecules produced by the parasite (zP ) (Bayne, 2009; Mon et al., 2011). In cases like these, thefunction f may take a sigmoid form which we call the phenotypic-difference model (Figure 2.1A)15(Nuismer et al., 2007; Ashby and King, 2017).f(zH  zP ) = 11 + e↵(zHzP )(2.2)In contrast, in the interaction between the schistosome parasite, S. mansoni, and its snail host, B. glabrata,resistance depends on the degree to which the conformation of defensive FREP molecules produced by thesnail (zH ) match the conformation of parasite mucin molecules (zP ) and successfully bind to them (Mittaet al., 2012). In such cases, the function f may take a Gaussian form which we call a phenotypic-matchingmodel (Figure 2.1B)(Kopp and Gavrilets, 2006).f(zH  zP ) = e↵(zHzP )2 (2.3)In order to study the effects of genetic interactions on susceptibility to infection, S, we must integrategenetics into our phenotypic model. For a haploid host and pathogen where zH and zP depend on nH andnP bi-allelic loci respectively, we can write general expressions for these phenotypes as functions of allelespresent in each species:zH =bH0 +nHXi=1bHiXHi +nHXi,ji 6=jbHi,HjXHiXHj +nHXi,j,ki 6=j 6=kbHi,Hj,HkXHiXHjXHk + ...+ ✏HzP =bP0 +nPXi=1bPiXPi +nPXi,ji 6=jbPi,P jXPiXPj +nPXi,j,ki 6=j 6=kbPi,P j,PkXPiXPjXPk + ...+ ✏P(2.4)In these expressions XMi is an indicator variable describing the allelic state (0 or 1) of an individual ofspeciesM at locus i, bM0 is the phenotype of an individual of speciesM with all “0” alleles, bMi is the ad-ditive effect of carrying a “1” allele at locus i in speciesM . The remaining coefficients (bMi,Mj , bMi,Mj,Mk,etc.) describe epistatic interactions between loci. Finally, ✏M captures an environmental contribution to thephenotype of speciesM , which is assumed to have mean 0, a constant variance, and be uncorrelated with anindividual’s phenotype. Substituting equations (2.4) into equation (2.1) yields a model of host susceptibilityas a function of host and pathogen genotypes.Our goal now is to use this genetic model to predict the sensitivity of genome-wide association stud-ies to the genetic composition of the pathogen population. We will explore both traditional, single-speciesGWAS approaches and a novel approach that takes genetic information from both host and pathogen intoaccount (co-GWAS). Our investigation will rely on a pair of complementary approaches. First, we willdevelop and analyze analytical approximations that quantify the sensitivity of GWAS and co-GWAS ap-proaches to changes in pathogen genotype frequencies. These analytical approximations will rely on simpli-fied genotype-phenotype maps and will not explicitly integrate evolution and coevolution. Second, we willdevelop and analyze simulations that allow us to explore the consequences of rapid pathogen evolution andcoevolution between the species on the performance of both GWAS and co-GWAS approaches.16!" − !$ !" − !$Susceptibility %Phenotypic-difference% = 11 + )* +,-+. Phenotypic-matching% = )-* +,-+. /A. B.Figure 2.1: Host-parasite interaction models.Susceptibility to infection as a function of the distance be-tween host and pathogen phenotypes, zH  zP , for the phenotypic-difference (panel A) and phenotypic-matching model (panel B). Red curves show exact functions whereas black curves are the quadratic approx-imations.2.3 Analytical ApproximationTo simplify the genetic model of resistance developed in the previous section sufficiently for mathematicalanalysis, we begin by considering the case where nH = nP = 2. In addition, we assume that the phenotypesof host and pathogen are not too far from one another, such that the quantity zH  zP is small relative to theextent of phenotypic specificity (↵ in equations (2.2) and (2.3)). Under this assumption equation (2.1) canbe approximated by its second order Taylor series expansion. This allows the genetic model of susceptibilityto be simplified to the following approximate expression:S ⇡f(0) + f 0(0) ((bH0 + bH1XH1 + bH2XH2 + ✏H) (bP0 + bP1XP1 + bP2XP2 + ✏P ))+12f 00(0) ((bH0 + bH1XH1 + bH2XH2 + ✏H) (bP0 + bP1XP1 + bP2XP2 + ✏P ))2 +O((zH  zP )3)(2.5)where primes indicate derivatives with respect to the distance between host and pathogen phenotypes. With(2.5) in hand, we have a model that predicts host resistance as a function of host and pathogen genotypes.In the following two sections we will use (2.5) to investigate how the genetic composition of the pathogenpopulation influences the results of GWAS and co-GWAS studies. Extending these models to completegenome-by-genome association studies requires a large number of pathogen loci (nP  2) and thus maybe computationally prohibitive. For many pathogens, however, strain type or sub-type may be known andcapture much of the relevant genetic variation in the pathogen population. In these cases, tracking pathogentypes can greatly reduce the effective number of loci, even to nP = 2 as in equation (2.5). Such simplifica-tions should allow us to expand beyond two host loci to a whole host genome (nH  2) while avoiding the17computational complexity of tracking all possible genetic interactions between the full host genome and thefull parasite genome.2.3.1 Single-species GWASWe envision a standard genome-wide association study where susceptibility to infection has been measuredfor some number of host individuals, each of which has also been genotyped at a large number of markerloci. To focus our model on the effects of species interactions, we will assume this data accurately providesus with the genotype of individuals at the two host resistance loci. Using this data, the goal of the geneticassociation study is to partition host susceptibility between these genes relative to their effects. This can bedone by fitting susceptibility with a linear combination of the genetic indicator variables.S ⇡ H0 + H1XH1 + H2XH2 + H1,H2XH1XH2 (2.6)where the  coefficients can be found using least squares regression. The biological interpretation of thislinear model is straightforward. The intercept coefficient, H0, is the expected host resistance when both “0”host alleles are present. The coefficients H1 and H2 are the inferred additive effects of the “1” alleles atthe first and second loci respectively, and H1,H2 captures the epistatic interaction between the two host “1”alleles. Solving for the coefficients in (2.6) we have (see supporting Mathematica notebook):H0 =f(0) + f0(0)(b˜H0  (b˜P0 + bP1qP1 + bP2qP2)) + 12f 00(0)((b˜H0  b˜P0)2+b2P1qP1 + b2P2qP2 + 2⇣(bP1qP1 + bP2qP2)(b˜H0  b˜P0) bP1bP2(qP1qP2 +DP )⌘)Hi =f0(0)bHi +12f 00(0)b2Hi + 2bHi(bH0  bP0  qP1bP1  qP2bP2)for i = {1, 2}H1,H2 =f00(0)bH1bH2(2.7)where f(0), f 0(0) and f 00(0) are the resistance function and its first and second derivative evaluated at 0as in equation (2.5), and where b˜H0 = bH0 + ✏H and b˜P0 = bP0 + ✏P . Importantly these expressions forthe coefficients depend on the allele frequency at the pathogen loci, qP1 and qP2, as well as the linkagedisequilibrium between them, DP . Note that the relevant allele frequencies and linkage disequilibrium areamong pathogens to which the host is exposed, which may not be equivalent to the pathogen population as awhole.As a result of the dependence of the coefficients in (2.7) on the pathogen allele frequencies and linkagedisequilibrium, the allelic effects, s, inferred by a host-only GWAS can be quite sensitive to the geneticcomposition of the pathogen population (Figure 2.2). Changes in pathogen allele frequency can alter themagnitude and sign of the inferred effects. From a practical standpoint, if susceptibility is assayed in twohost populations that are exposed to pathogen populations that differ greatly in their allele frequencies, onemay find a host allele has a protective effect in one population but increases risk in the other. Similar tohidden host population structure, uncontrolled differences in the pathogen population can greatly alter theinferences of single-species GWAS.18Pathogen Allele Frequency !"# = !"%Variance ExplainedAllelic EffectsPhenotypic-difference Phenotypic-matchingA. B.C. D.Figure 2.2: Host-only model. Row 1: Allelic effects inferred using the host-only design from equation (2.6):0 (black), H1 and H2 (solid red lines), H12 (dashed red). Row 2: Variation explained by host additiveeffects only (solid red), and host additive and epistatic effects (dashed red) as given by the host-only modelin (2.6).A second result that can be drawn from equation (2.7) is that when the resistance function is approx-imately linear, f 00(0) = 0, inferred additive and epistatic effects, H1,H2, and H1,H2 are independentof the pathogen allele frequencies. For example, in contrast to the non-linear phenotypic-matching modelwhere the inferred effects vary with pathogen allele frequency, the inferred effects remain constant in the ap-proximately linear phenotypic-difference model (Figure 2.2). A third conclusion from equation (2.7) is that,at least under the assumption that zH  zP is small, the epistatic interaction between the host loci, H1,H2is independent of pathogen genetics. We will explore the consequences of this dependence on the pathogenallele frequencies for the stability of GWAS-inferred effects across evolutionary time (see Host-Parasite Co-evolution section below).In addition to identifying allelic effects on host resistance, an important metric of GWAS performanceis the proportion of phenotypic variation explained by the identified causative loci. Given the dependenceof the estimated allelic effects on pathogen allele frequencies, we calculated the total phenotypic variationexplained by the host loci across the range of pathogen allele frequencies (Figures 2.2C and 2.2D). When thepathogen population is monomorphic (qP1 = qP2 = 0 or 1) the host loci can explain 100% of the geneticvariation in the phenotype. If the pathogen population is polymorphic, however, the host-only approachmay explain as little as 10% of the variation. Partitioning the total variation explained into the additive19and epistatic contributions demonstrates that, due to changes in the additive effect size, bHi, the relativecontribution of additive and epistatic effects also varies with pathogen allele frequency and depends on theform of the host-parasite interaction.2.3.2 Two-species co-GWASThe results derived in the previous section demonstrate that traditional single-species GWAS studies may besensitive to the genetic composition of the pathogen population at loci involved in host-pathogen specificity.In this section, we attempt to overcome this problem by developing an alternative GWAS design in whichboth host and pathogen genetics are incorporated. In contrast to the traditional method where only hostgenotypes are recorded, this design requires that both host and pathogen genotypes are known. As withequation (2.6) we now attempt to fit host resistance as a linear function of the allelic indicator variables, butwe include pathogen indicators as well as interaction terms between host and pathogen loci:S ⇡0 + H1XH1 + H2XH2 + H1,H2XH1XH2 + P1XP1 + P2XP2 + P1,P2XP1XP2H1,P1XH1XP1 + H1,P2XH1XP2 + H2,P1XH2XP1 + H2,P2XH2XP2(2.8)As with equation (2.6) the coefficients of this equation have straightforward biological interpretations. Theintercept, 0, describes the expected host resistance when all host and pathogen loci have “0” alleles. Terms2,3,5 and 6 describe the additive effects of each individual host and pathogen “1” allele, terms 4 and 7describe the epistatic interactions between loci within the host and pathogen respectively. The remainingfour terms describe the GxG interactions between pairs of host and pathogen loci.Despite the complexity of equation (2.8), and hence the logistical and computational challenges of ap-plying it, the expressions for each of these coefficients in terms of the host and pathogen phenotypic effectsare simple (see supporting Mathematica notebook).H0 =f(0) + f0(0)⇣b˜H0  b˜P0⌘+12f 00(0)⇣b˜H0  b˜P0⌘2Hi =f0(0)bHi +12f 00(0)bHi(bHi + 2(b˜H0  b˜P0)) for i = {1, 2}H1,H2 =f00(0)bH1bH2Pi = f 0(0)bPi  12f 00(0)bPi(bPi + 2(b˜H0  b˜P0)) for i = {1, 2}P1,P2 =f00(0)bP1bP2Hi,Pj = f 00(0)bHibPj for i = {1, 2}, j = {1, 2}(2.9)Comparison of the equations in (2.9) with the coefficients in (2.7) reveals an important conclusion: the effectsizes no longer depend on the pathogen allele frequencies nor the linkage disequilibrium (Figure 2.3A andB). This result suggests that the two species co-GWAS approach is more robust to changes in the geneticcomposition of the pathogen population and thus may be much less sensitive to rapid evolution and spatialgenetic structuring within the pathogen population.In addition to stabilizing the estimated allelic effects across pathogen allele frequencies, the total pheno-20Pathogen Allele Frequency !"# = !"%Variance ExplainedAllelic EffectsPhenotypic-difference Phenotypic-matchingA. B.C. D.Figure 2.3: Host-pathogen Model. Row 1: Allelic effects inferred using the host-parasite design fromequation (2.8): 0 (black), Hi (solid red), H12 (dashed red), Pi (solid blue), P12 (dashed blue), andHiPj (dashed purple). Row 2: Variation explained by host additive effects only (solid red), host additiveand epistatic effects (dashed red), host and pathogen additive and epistatic effects (dashed blue), and a fullhost pathogen model as given in equation (2.8).typic variation explained by the co-GWAS greatly exceeds that of the host only GWAS. For the two-locuscase explored here, the co-GWAS approach can explain 100% of the variation regardless of pathogen allelefrequency (Figure 2.3C and D). The contributions of additive, epistatic, and GxG interactions do, however,vary with pathogen allele frequency. As with the host-only approach, when the pathogen population ismonomorphic the host effects explain all of the observed phenotypic variation. In summary, unlike the host-only model, the effect size coefficients (equation (2.9)) and the total variation explained, no longer vary withpathogen allele frequency. This contrast between the host-only and co-GWAS approaches is particularlyrelevant any time the composition of the pathogen population is likely to differ between the sample usedfor the association study and the population in which the resulting inferences are applied. In the followingsection we explore how temporal changes in the host and pathogen populations driven by coevolution affectsthe reproducibility of GWAS over time and by extension space.212.4 Host-Parasite CoevolutionTo simulate host-parasite coevolution we envision a system where each host comes into contact with asingle parasite each generation. The probability that this contact results in infection is determined by hostsusceptibility, S, which is a function of the host and parasite genotype. Infected hosts experience a fitnesscost, ⇠H , whereas their infecting parasites receive a fitness benefit ⇠P . In the absence of infection, both hostsand parasites have a fitness of 1. Together, these assumptions lead to the following fitness of a host withgenotype {XH1, XH2} that comes in contact with a pathogen with genotype {XP1, XP2}:WH = 1 ⇠HS(XH1, XH2, XP1, XP2) (2.10)whereas the pathogen has a fitness ofWP = 1 + ⇠PS(XH1, XH2, XP1, XP2) (2.11)Given these finesses we simulate allele frequencies and linkage disequilibrium over time assuming randommating, a per locus mutation rate of µ, and a recombination rate r (see supplementary Mathematica note-book). We then use equations (2.7) and (2.9) to calculate the inferred allelic effect sizes by using a host-onlyGWAS or co-GWAS each generation over the course of coevolution for both the phenotypic-difference andphenotypic-matching models (Figure 2.4).As expected, using the host-only GWAS approach, the inferred allelic effects can vary over time butonly under the quadratic shaped phenotypic-matching model. As noted above, the estimated effects caneven change sign having large positive values when sampled in one generation and large negative valueswhen sampled only a few generations later. In contrast, the inferred effects remain constant in the co-GWASapproach regardless of the coevolutionary model. In terms of the phenotypic variation explained, the host-only approach explains only a portion of genetically determined phenotypic variation whereas the co-GWASapproach can explain up to 100%. The contribution of different genetic components to the total variationexplained remains approximately constant under the phenotypic-difference model but varies rapidly as allelefrequency changes in the phenotypic-matching model.2.5 Daphnia-Pasteuria GWASTaken together, our analytical model and simulations illustrate that incorporating pathogen genetic informa-tion into the search for disease genes can greatly increase the explanatory power and repeatability of genomescans. Testing these theoretical predictions with biological data is a critical step in evaluating the powerof the Co-GWAS approach relative to a traditional single species GWAS. Analysis of biological data willinclude several complications which we ignored above including finite sample sizes, arbitrary forms of co-evolutionary interactions, and complex genomic architectures. Unfortunately, we know of no studies whichinclude full host and parasite genomic data as well as the outcome of infection experiments. Further, thecomputational tools to perform a Co-GWAS in the form of equation (2.8) do not yet exist. We can, however,22Phenotypic-difference Phenotypic-matching! "and ! #Allelic EffectsAllelic EffectsGenerations GenerationsHost-only modelHost-pathogen ModelFigure 2.4: Allelic effects over coevolutionary time. Row 1: Phenotypes zH (red) and zP (blue) simulatedover coevolutionary time in the phenotypic-difference and phenotypic-matching models. Row 2: Coefficientsestimated under the host only model (2.7) (black: 0, solid red: Hi, dashed red: H12). Row 3: Coefficientsestimated under the host-pathogen model (2.9) (black: 0, solid red: Hi, dashed red: H12, blue Pi, dashedblue: P12, purple dashed: HPij). Because epistatic and GxG interactions are absent in the phenotypic-difference model their allelic effects all overlap at 0 and hence are not all visible.use recently published data by Bourgeois et al. on the susceptibility of Daphnia magna to two Pasteuriaramosa strains, C1 and C19, as a preliminary test of our analytical predictions (Bourgeois et al., 2017). Inparticular we compare the results of genome scans for C1 and C19 susceptibility analysed separately to asingle genome scan for susceptibility using all the data but ignoring pathogen strain type. Our analyticalmodel predicts that, despite having half the sample size, the separate genome scans for C1 and C19 resis-tance should reveal loci that determine host-parasite specificity whereas the full data scan will have lowerpower to do so. Note that strain type captures almost all of the relevant genetic information in this case giventhat the parasite is clonal.23The original data set, provided on Dryad by original authors (Bourgeois et al., 2017), sampled 97 D.magna clones from three distinct geographic regions, one site in Germany, one in Switzerland, and elevensites in Finland and provided the sequence at 6403 SNPs. Host susceptibility (S:susceptible R: resistant)infection by each Pasteuria ramosa strain, C1 and C19, was determined by assessing whether flourescentlylabeled spores attached to the host’s esophagus (Duneau et al., 2011). All four possible combinations ofsusceptibility and resistance to the two strains (SS,SR,RS,and RR) were present. By performing two separateassociation studies, one for each strain, Bourgeois et al. (2017) used this experimental design to identifygenomic regions associated with susceptibility to a specific parasite strain. Following the methods in theoriginal work, we compare their results to a third genome scan including all the data, a total of 194 samples,ignoring the Pasteuria strain type tested. All genome scans were performed using the R package GenAbel,adjusting for population structure and repeated measures of the same host genotype using the Eigenstatmethod (Aulchenko et al., 2007).To accurately assess which genomic regions are associated with susceptibility to C1, C19, and/or “over-all” susceptibility from the complete data set, we used the Daphnia genetic map constructed by Dukic´ et al.(2016) to array the scaffolds into 10 linkage groups. To limit the detection of false positives, we followedan approach analogous to that used in Bourgeois et al. (2017) where SNPs were only considered signifi-cantly associated with a given susceptibility phenotype if there existed four SNPs in a 10cM region with alog-likelihood score above 2 (Figure 2.5). Multiple genomic regions are significantly associated with sus-ceptibility to C1, C19, and to disease susceptibility in the complete data set without strain information. Fourlinkage groups (4,5,7, and 9) with a total of 28 significant SNPs are associated with C1 susceptibility. Threelinkage groups (1,4,and 7) with 38 SNPs are associated with C19 susceptibility, and 2 linkage groups (4 and5) with 35 SNPs are associated with susceptibility in the complete data set. Thus, while the complete data sethas twice as many measures of disease susceptibility, it has less power to detect genetic regions underlyingdisease susceptibility because of the lack of parasite information.The contrast between the associations for C1 and C19 susceptibility to “overall” susceptibility in thecomplete data set provides additional information about the nature of the genetic basis to resistance. Genomicregions associated with the overall resistance regardless of parasite type, particularly when these regions arealso associated with C1 and C19 resistance, provide increased resistance regardless of the parasite straintested and are consistent with general host health and non-specific immune response. By contrast, sites thatare not associated with overall resistance, despite the data set having twice the size, but are associated witheither C1 or C19 are good candidates for loci that act in a parasite-specific manner. Examining Figure 2.5we therefore conclude that linkage group 4 and possibly 5 are involved in general health and resistance. Incontrast, the regions on the far left and right of linkage group 7 as well as the regions on linkage group 1 and9 which are associated only with C1 or C19 resistance, are indicative of parasite-specific resistance loci.These conclusions are in agreement with the hypothesized model and previous molecular work on Daph-nia resistance to Pasteuria. In particular, resistance to Pasteuria is hypothesized to be controlled by a three-locus matching alleles system. One of these loci (the C locus) determines overall host susceptibility regard-less of pathogen strain and is thought to reside on linkage group 4 (Bento et al., 2017). In the absence ofprotection from the C locus, a second “A locus” is thought to confer resistance to C1 when the dominant24Significance−"#$%cMFigure 2.5: GWAS of Daphnia magna susceptibility. Genetic associations of each SNP with C1 (redcircles), C19 (blue, squares), and “overall” susceptibility in the complete data set without parasite-typeinformation (green, triangles). Hence each SNP is represented three times, once for each genome wide scan.Note that closely linked SNPs often overlap with one another and are not all individually visible. SignificantSNPs are shown in color while those below the log-likelihood of 2 threshold or that are not clustered withina 10cM region of three other significant SNPs are shown in gray. The 10 linkage groups are delineated byvertical dashed lines.allele is present The regions detected on linkage groups 7 and 9 in the hosts exposed to C2 only may be can-didates for such C1-specific-resistance. Finally, if the C locus and A locus are both homozygous recessive,a third “B locus” determines susceptibility to the C19 strain. Such a locus would likely be hard to detect ina GWAS due to epistasis between the A,B and C loci, nevertheless, the regions associated with only C19resistance (on linkage groups 1 and 7) would be a candidates for such a B locus. Overall we conclude thatsignificant SNPs obtained without accounting for parasite type may signal general health status. Against thisbackground a co-GWAS can help identify genes which regions are likely critical to host-parasite specificityand variation in host susceptibility.2.6 DiscussionIdentifying genes that determine a host’s susceptibility to infection is a promising frontier with a wide rangeof applications including agriculture and human health. Yet, as our mathematical models demonstrate, asso-ciation studies focusing on identifying genes in a single species without accounting for the genetics of the25interacting species can drastically affect our ability to detect disease genes involved in host-pathogen speci-ficity and limit our ability to account for the genetic variation in disease susceptibility. When the geneticcomposition of the pathogen population varies over time and/or space, this can further lead to inconsis-tencies in the results of genetic association studies. Finally, using previously published data on D. magnaresistance to its Pasteuria parasite we illustrate that performing association studies with and without infor-mation about pathogen type can be used to distinguish genomic regions affecting general versus specificresistance to pathogens. Consistent with current models for Daphania–Pasteuria interactions, we identifyone region associated with general health as well as candidate regions more directly involved in mediatinghost-pathogen specificity.The mathematical analysis presented above focuses on host-pathogen interactions of a specific form,given by equation (2.1). Although we have relied on an approximation that assumes weak phenotype differ-ences, i.e., zH  zP is small, we postulate that the power to detect strain-specific resistance genes will beincreased whenever parasite information is incorporated, even when genes have major effects and phenotypicdifferences become large. Similarly, the methods used above can be extended to include alternative interac-tion types such as a “matching-alleles” interaction (see Mathematica notebook). The expressions for the coefficients under this interaction model are unruly and difficult to interpret. Using a numerical approach, weobserve that once again GxG interactions can explain a significant proportion of the variation in susceptibil-ity (Supplementary Figure A.1), particularly in highly variable pathogen populations. Unlike the phenotypicdifference and phenotypic matching models, however, the co-GWAS approach (equation (2.8)) no longerexplains all of the variation in susceptibility and the coefficients vary with pathogen allele frequency. Thisis a result of higher order interactions not included in our model. Hence although the co-GWAS approachperforms significantly better than a single-species approach, it will not always capture the full genetic basisof infection because of the second-order approximation used in equation (2.8).Regardless of the form of the interaction, our analytical models and simulations illustrate that incorpo-rating pathogen genetics into the search for disease genes can greatly increase the explanatory power andrepeatability of genome scans. Unfortunately, several logistical and computational challenges preclude ap-plying a full two-species GWAS. Most notably such a design requires additional genetic data that is notcurrently available. More specifically, this design requires genotyping all hosts and the pathogens to whichthey are exposed, not just the host-parasite combinations observed in infected individuals. Future explorationis warranted to determine whether uninfected individuals can simply be treated as unknown with respect topathogen exposure and what the consequences of doing so would be for the statistcal power of our approach.The complexity of the two-species design (equation (2.8)) relative to that of a single-species design(equation (2.6)) also introduces computational challenges. In addition to requiring larger sample sizes, esti-mating the effects of the large number of potential GxG interactions in a full host-genome by parasite-genomestudy is computationally unrealistic. In addition to the large number of pairwise interactions between hostsand pathogens, depending on the form of the interaction, higher order genetic interactions may be necessaryto fully explain the variation in susceptibility. These higher order interactions can be particularly importantas the number of loci underlying susceptibility, nH and nP , increases. Although incorporating completepathogen genetic data may be infeasible there often exists some form of pathogen typing, which is largely26indicative of the pathogen’s genotype and may be sufficient for the purposes of a host genome-wide scan.For example, despite its vast diversity, Hepatitis C Virus has been subdivided into 7 genotypes (Irvine et al.,1993; Murphy et al., 2015), which may capture much of the relevant variation in host susceptibility.The Daphnia–Pasteuria data set we analysed provides a valuable test case for a two-species Co-GWAS.In this study, we know exactly to which pathogen type individuals have been exposed, which is generallynot known in natural populations. This information may have increased the power of the study to detectloci underlying C1 and C19 susceptibility. Despite this increased power, we chose to use the arguably le-nient significance threshold of a log-likelihood score greater than 2 plus clustering of 4 or more SNPs, as inthe original paper. Requiring more stringent threshold corrections for multiple sampling, such as a Bonfer-roni correction, does not yield any significant SNPs. Given the correspondence between the GWAS resultsand those of functional studies (Bento et al., 2017), however, many of the observed SNPs are arguably notfalse positives. Using the log-likelihood of 2 and clustering threshold we observe fewer genomic regionsassociated with “overall” susceptibility when parasite information is not incorporated than when conductingGWAS with exposure to either C1 or C19, despite the complete data set containing twice the number of datapoints. As an alternative to analysing the complete data set, we could hold the sample size constant in acombined analysis by randomly choosing whether the host was exposed to C1 or to C19 for each host geno-type (Supplementary Figure A.2). Interestingly, this “mixed” GWAS not only identifies the same regions onlinkage group 4 and 5 but also identifies regions on linkage groups 1, 9, and 10, as were found in the singlepathogen-type GWAS. The fact that this mixed analysis picks up some of the potentially parasite-specific lociis likely due to randomly sampling an excess of C1 or C19 tested clones. Consistent with this interpretation,exactly which parasite-specific regions are identified varies with the random sample chosen. Nevertheless,as with the complete data set, a comparison between C1, C19, and mixed susceptibility provides additionalinformation about which genes are involved in general health versus parasite-specific susceptibility.The results presented here highlight several important avenues for future research. First and foremost,designing genome-wide association methods that allow for GxG interactions is critically important as is thecollection of genotypic data from hosts and pathogens. This could be approached, for example, by adapt-ing GWAS designs and analyses used to detect gene-by-environment interactions (Winham and Biernacka,2013). Recognizing the importance of host-pathogen genetic interactions is important for understandingthe applicability and limitations of single-species association scans. Developing metrics that capture rel-evant variability in host and pathogen populations may facilitate the application of these results. Finally,incorporating GxG interactions into our association studies will also enable us to understand what mathe-matical models of host-parasite interactions best predict the genetic interactions observed in natural systems,allowing further refinements of the models.27Chapter 3A Generalized DeterministicCompartmental Model with Coevolution23.1 IntroductionHost-parasite interactions and their relationship to infectious diseases is a topic of great interest in both publichealth and evolutionary biology. The cost of infection is evident from death tolls from the black death of 1346and the Spanish influenza in 1918. Indeed genomic data confirm that pathogen load is perhaps a key driver inhuman adaptation (Fumagalli et al., 2011). Similarly, as an analysis of the fitness landscape of P. falciparumreveals, pathogens experience strong selection from a variety of sources including host immune responses,host death, and vector availability Mackinnon andMarsh (2010). Strong selection on genetic variants in hostsand pathogens can have major implications, from the maintenance of deleterious recessive mutations such asthe Hbs allele for sickle cell anemia Allison (1956) to the maintenance of sexual reproduction Hamilton et al.(1990). Since the classic papers by Kermack and McKendrick (1927) and Hamilton (1980), epidemiologistsand evolutionary biologists have used mathematical models to understand and predict the outcomes of theseinteractions. While both coevolutionary and epidemiological models capture important dynamics of host-pathogen interactions they are often not integrated with one another to capture realistic feedback betweenevolutionary and population dynamics when both hosts and parasites are genetically variable.Most epidemiological models of pathogen spread are compartmental models in which a host population isdivided into a series of compartments depending on each individual’s disease status, for example susceptible(S), infectious (I), or resistant (R) in an SIR epidemiological model. Models tracking transitions amongcompartments can then be described by a set of differential equations, which can be analysed to predict theincidence of the disease over time. Such compartmental models have succeeded in producing a number ofinteresting and realistic phenomena, of particular interest here is the appearance of sustained cyclic outbreaks(epidemics). Many infectious diseases such as measles exhibit well characterized periodic outbreaks thoughtto be consistent with SIR dynamics (Fine and Clarkson, 1982; Hethcote and Van Ark, 1987; London andYorke, 1973).On the other hand, evolutionary models of host-pathogen interactions typically ignore population andepidemiological dynamics by assuming a constant rate of exposure, focusing instead on genetically vari-able host and pathogen populations. The simplest coevolutionary models consider a host and pathogen2A version of this chapter has been published as MacPerson, A. and S. P. Otto. 2018. Joint coevolutionary-epidemiological models dampen Red Queen cycles and alter conditions for epidemics. Theoretical Population Biology.122:137–148. A supporting Mathematica file is available at https://www.sciencedirect.com/science/article/pii/S004058091730032128interaction mediated by a single di-allelic locus from each species. In the matching-alleles model (MAM)each pathogen genotype preferentially infects a corresponding host genotype and either fails to infect orhas reduced infectivity on alternative host genotypes. Matching-allele models are motivated by systemsof self/non-self recognition present in many animals Luijckx et al. (2013), Grosberg and Hart (2000). Incontrast, in the gene-for gene model, which was developed to reflect plant resistance Flor (1955), there areresistant and susceptible host varieties interacting with virulent and avirulent pathogens. Susceptible hostssuccumb to infection by both pathogen types whereas resistant hosts resist infection by avirulent pathogensbut are susceptible to pathogens carrying the virulence gene. Here we focus on matching-allele type modelswhere, even in the simplest case when infection is determined by a single locus, important dynamics arise.Specifically, these models exhibit neutrally stable host and pathogen allele frequency cycles over time. Suchcycles are an important cornerstone of the Red Queen hypothesis for the evolution and maintenance of sex-ual reproduction (Hamilton, 1980; Lively, 1987). These cycles are, however, known to be very sensitiveto model assumptions (Agrawal and Lively, 2003) and are affected by the inclusion of realistic populationdynamics (Gokhale et al., 2013; Song et al., 2015).Despite the conceptual overlap between infectious disease spread and host-parasite coevolution, mod-els of coevolution with variable hosts and parasites in an epidemiologically realistic framework are lim-ited. Here we present such an integrated model focusing on how the combined dynamics differ fromthose of coevolutionary and epidemiological models considered separately. We begin by summarizing thefew previous coevolution-epidemiological models that have analysed the evolutionary dynamics of host-pathogen interactions. Building on this work, we analyse host-parasite interactions in a SIRS (Susceptible-Infectious-Resistant-Susceptible) compartmental model with the specific aim of understanding how epidemi-ology shapes coevolutionary dynamics and vice versa. We first review two models, a model of coevolutionin the absence of epidemiology and a model of epidemiological dynamics in the absence of coevolution thatdo not incorporate this feedback. Finally, we build upon these background models to analyse a combinedmodel of coevolution with explicit epidemiological dynamics. Using a combination of analytical and nu-merical techniques we characterize the dynamics of each model. In particular, we will focus on describingwhen epidemiological cycles and allele frequency cycles are maintained. This comparison across modelsreveals that incorporating genetic variability in epidemiological contexts can produce substantially differentdynamical behaviour than compartmental or coevolutionary models alone. In particular, we find that the RedQueen coevolutionary cycles present in matching-alleles models dissipate when accounting for epidemiolog-ical dynamics and that the conditions for epidemic versus endemic disease dynamics are altered by geneticvariability in hosts and parasites.3.2 Theoretical BackgroundMany compartmental models used in epidemiology incorporate either host variability or pathogen variabilityby following either multiple host types or multiple pathogen types. Few of these models, however, introduceboth the multiple host and multiple pathogen types necessary to capture coevolution in an epidemiologicalcontext. The few models that have done so have reached different conclusions about the nature of Red Queen29allele frequency dynamics, depending on the assumptions made, making it difficult to determine the generalimpact that incorporating epidemiological dynamics has on convolution.The earliest such model, by May and Anderson (1983), examined coevolution between two host andtwo pathogen types with a perfect matching-alleles infection where each pathogen type can infect onlyone host type. In each non-overlapping host generation, a pathogen epidemic occurs with infected hostsexperiencing increased mortality. Once the epidemic has subsided the remaining hosts give birth to the nextgeneration of susceptible hosts. Three different types of allele frequency dynamics can arise, depending onthe pathogen growth rates. First, for slowly growing pathogens, those with low transmissibility and shortduration infections, the evolutionary change per generation is small and as a result the host and pathogenallele frequencies gradually approach a polymorphic equilibrium. The value of this equilibrium is determinedby the relative growth rates of the two pathogens, favouring the faster growing pathogen. As the growth ratesof both pathogens increase, however, a second dynamic arises. The discrete host life history generates a twopoint allele frequency oscillation about the polymorphic equilibrium. Finally for rapidly growing pathogens,the allele frequency dynamics become chaotic.There are two important assumptions in May and Anderson’s model: first that pathogens are special-ists, leading to a perfect matching model, and second that each discrete host generation is long enoughfor the pathogen outbreak to completely subside. Together these two assumptions have several importantconsequences on the epidemiological and coevolutionary dynamics. First, long host generations favour thepathogen with the greatest long-term epidemic size, not necessarily the one with the fastest initial spread.Hence, unintuitively, the epidemiological and therefore coevolutionary dynamics are insensitive to the ini-tial number of infections per generation and the mode of transmission between discrete host generations.Together these two assumptions uncouple the evolution of the two host-pathogen pairs, in effect removingcoevolutionary feedback. This would no longer be the case if recovery were only temporary, if hosts hadoverlapping generations, or if any cross infection between host types were allowed. Many aspects of the ob-served allele frequency dynamics are a consequence of these assumptions, for example the two-point allelefrequency cycles, and may have few implications for continuously reproducing hosts.One such model of coevolution with explicit epidemiology in continuous time was proposed by Beck(1984). She considered a single locus model with a diploid host and haploid pathogen. Using a susceptible-infectious-susceptible (SIS) model, the virulence, infection, and recovery rates were dependent on the host-pathogen pair. However, to facilitate model analyses she assumed that the strain-specific effects were small.Under this assumption the system exhibits several types of equilibria: a set of equilibria where both the hostand pathogen populations fix one allele, a set of equilibria where one species remains polymorphic while theother fixes, and lastly an internal equilibrium at which both the host and pathogen are polymorphic. Usingperturbation theory (Karlin and McGregor, 1972) assuming the strain-specific effects are small, she wasable to conclude the system will approach the fully polymorphic equilibrium with damped cycles. Hence,coevolutionary allele frequency cycles do not exist when epidemiological dynamics are included. However,since this stability analysis only applies when strain-specific effects are small, the results may not hold incases where pathogens can differ markedly in their host preferences. In a rederivation of Beck’s results (stillassuming small differences between strains), Andreasen and Christiansen (1995) show that the dampening30of these coevolutionary cycles does not extend to more complex ecological models, for example when a freeliving pathogen population is introduced.Providing an important intermediate between May and Anderson’s (1983) model with long discretehost generations and Beck’s (1984) continuous-time model, Morand et al. (1996) present a discrete-timeepidemiological model in which multiple time steps occur within each host and parasite generation. Infectiondepends on a single diploid locus in the host and pathogen where each host genotype can only be infectedby the corresponding genotype in the pathogen. Importantly their susceptible-infected (SI) model, which ismotivated by the infection of snails of the genera Biomphalaria and Bulinus by their castrating Schistosomaparasites, assumes that hosts are castrated upon infection and never recover imposing a strong fitness cost toinfection. They show that, in comparison to Beck’s (1984) results, such castrating infections can generatesustained limit cycles in overall infection prevalence as well as host and pathogen allele frequencies. In thesupplementary Mathematica notebook we demonstrate that indeed this result holds only in epidemiologicalmodels with strong selection (castrating or highly-virulent infections).In addition to considering allele frequency dynamics, host-parasite interactions can also lead to cycles inhost genotype frequencies. In particular if host susceptibility is determined by its genotype at two or moreloci, coevolution can drive cycles in the linkage disequilibrium between loci even in the absence of changes inallele frequencies (Nee, 1989). Penman et al. (2013) analysed the effect of SI-type epidemiological dynamicson such genotype frequency cycles. Motivated by the evolution of HLA loci, the genotype of the two-locushaploid pathogen must differ at each locus from that of the host for infection to occur in their model. Thisin turn favours host genotype combinations that match the most abundant pathogen genotypes, favouringthe buildup of linkage disequilibrium. They observe that linkage disequilibrium between loci can cycleeven though allele frequencies need not. These linkage disequilibrium cycles are facilitated by epistasis infitness between loci (Gandon and Otto, 2007) and/or rare recombination (Kouyos et al., 2009), as well as theexistence of immune system memory (Penman et al., 2013).These epidemiologically explicit models of coevolution emphasize the need to better understand theinterplay between host-parasite coevolution, Red Queen dynamics, and epidemiology. Building on theseprevious models, we generalize previous results by using a continuous-time SIRS model that exhibits a widerange of epidemiological dynamics and a general coevolutionary model. Our coevolutionary model requiresneither the pathogens to be specialists (as in May and Anderson, 1983) nor the strain specific effects to besmall (as in the models by Beck (1984) and Andreasen and Christiansen (1995)). In addition our epidemio-logical model allows for arbitrary virulence from benign to complete loss of fitness for infected individuals,as in Morand’s model with parasitic castration (Morand et al., 1996). Although we focus on coevolutionarydynamics at a single locus, the results of Penman et al. (2013) suggests that linkage disequilibrium cyclescan be maintained when immune system memory is included, but future extensions are needed to determinethe nature of genotypic cycles across a variety of multi-locus epidemiological and coevolutionary models.We conclude with a discussion of the implications of our results on disease dynamics and on the evolutionof sex.313.3 Background: Matching-Alleles Coevolution ModelHere we present a classic coevolutionary model with a haploid host and pathogen interaction involving asingle di-allelic locus in each species. Such a model can be described by a 2x2 matrix where element ijdetermines the rate at which pathogens of type j infect hosts of type i: ="11 1221 22#(3.1)In particular, we will focus on a matching-alleles type interaction where pathogens preferentially infecthosts with the same genotype. For coevolution to ensue, infected hosts must experience a fitness cost due toinfection and successful parasites a relative fitness benefit, which we denote by ⇠H and ⇠P respectively. Theresulting allele frequency changes of type 1 hosts, pH , and type 1 pathogens, pP , are given bydpHdt= ⇠HpH(t)(1 pH(t)) (pP (t) (11  21) + (1 pP (t)) (12  22))dpPdt=⇠P pP (t)(1 pP (t)) (pH(t) (11  12) + (1 pH (t)) (21  22))(3.2)When 12 = 21 this system is a special case of Gavrilets and Hasting’s general coevolutionary model (seeequation 3 in Gavrilets and Hastings, 1998). Reiterating their results, system (3.2) has five equilibria. Fourof these equilibria describe the possible combinations of allele fixation in the host and pathogen. To simplifythe presentation of these equilibria we assume that the interaction is symmetric, 11 = 22 = X and 12 =21 = Y , where the matching alleles model assumes X > Y . Denoting the equilibrium allele frequenciesby pˆH and pˆp, the fifth equilibrium in this symmetric case is pˆH = 12 and pˆP =12 (See Mathematica file formore general results). The dynamics of system (3.2) are shown in Figure 3.1. Both species experience allelefrequency cycles about the polymorphic equilibrium. The frequency of these cycles is determined by twoquantities, ⇠H⇠P and (X  Y ). Specifically, the frequency increases as the interactions become more costlyto the host or more beneficial to the pathogen, in other words as ⇠H⇠P increases, and as pathogens becomemore specialized, as denoted by increases in (X  Y ). The amplitude of these cycles, on the other hand, isdetermined entirely by the initial conditions of the system.The properties of these cycles are evident from the linear stability analysis of the polymorphic equilib-rium. Specifically, the two eigenvalues of the Jacobian matrix of this system evaluated at the polymorphicequilibrium are both purely imaginary, = ± i2r⇣(X  Y )2 ⇠H⇠P⌘(3.3)Hence the system exhibits neutrally stable cycles with frequency12q((XY )2⇠H⇠P )2⇡ . As illustrated by Songet al. (2015) and Gokhale et al. (2013), such neutrally stable cycles are often sensitive to model assumptions,and the dynamics may be disrupted by the inclusion of population dynamics. In the following sections wedemonstrate that these cycles may also be disrupted by inclusion of epidemiological dynamics, which area natural consequence of disease spread. We show that the neutrally stable cycles of (3.2) no longer exist32even in cases where epidemiological dynamics approach a stable endemic equilibrium, a case that one mightinitially expect to leave the cycles of (3.2) untouched (Figure 3.5).CBA!− #$ %$ &'( ) $&$% = +. -! − # = +. ./0&$% = +. /! − # = +. 1/$&$% = +. 2! − # = +.+/TimeAllele FrequencyABCFigure 3.1: MAM dynamics. A Heat map showing the frequency of oscillations in the coevolutionarymatching-alleles model. Panels A,B, and C show allele frequency dynamics for the host (black solid) andpathogen (green dashed) for three combinations of ⇠H⇠P (A:0.9, B:0.5, C:0.1) and (X  Y ) (A:0.45, B:0.25,C:0.05). The initial host and pathogen allele frequencies, pH(0) = 0.2 and pP (0) = 0.5, determine cycleamplitude.3.4 Background: SIRS Model Without CoevolutionCompartmental models of pathogen spread can exhibit a wide range of dynamics. To begin with, there is thepossibility of disease extinction. If each initially infected host fails, on average, to spread the infection toat least one other host before the host recovers, then the basic reproduction number of the pathogen, R0, isless than 1, and the disease will die out. Long term pathogen extinction may occur even when R0 is greaterthan 1. This can occur if the susceptible host population is depleted through infection and/or immunity tosuch an extent that the reproductive capacity of the pathogen drops below 1 eventually leading to extinction.Compartmental models without life long immunity and/or with host birth, however, often sustain infectionsover the long term. In many cases such endemic infections persist at a constant level, a stable balance betweenthe depletion of susceptible hosts through infection and the restoration of the susceptible host populationthrough the loss of immunity and birth (see Hethcote and Van Ark, 1987; Keeling and Rohani, 2008, forreview of epidemiological models and their long-term dynamics).33Many diseases, as noted in the introduction, exhibit another long-term dynamic: cyclic disease outbreaks.The conditions for generating epidemiological cycles in compartmental models in continuous time havebeen explored rigorously (Hethcote and Van Ark, 1987). The first requirement is temporary immunity suchas in a “Susceptible-Infected-Recovered-Susceptible” (SIRS) model or a “Susceptible-Exposed-Infected-Recovered-Susceptible" (SEIRS) model. In addition, long-term cycles require stringent conditions on theduration of infection and immunity. Gonçalves et al. (2011) pinpoint these conditions using a SIRS modelwhere the length of infections and duration of immunity among individuals are described by two gammadelay distributions. By manipulating the shape parameter of these distributions, n, which ranges from 1to1, they captured a range of infection and immunity periods. When n = 1 the delays are shaped like anexponential distribution. This is analogous to the classic compartmental model where individuals recover andlose immunity at a constant rate. As n increases the variance in infection and immunity durations decrease.Finally, when n = 1 the gamma distribution becomes a fixed time delay, where all individuals experiencethe same duration of infection and recovery. As a broad generalization, Gonçalves et al. find that cyclesare facilitated by long immunity times relative to the length of infection and by delay distributions with lowvariance (i.e. n sufficiently large).As our goal in this paper is to understand the effect of coevolution on epidemiological dynamics (seenext section), we chose to build upon a model that can exhibit the full range of long-term dynamics, fromdisease extinction to an endemic equilibrium to stable limit cycles such as the one presented by Gonçalveset al. (2011). For coevolution to ensue, infection must come at a fitness cost to the host, resulting in anincreased death rate. Because Gonçalves et al. assumed the host was long lived relative to the pathogenand therefore that host birth and death were negligible, we begin by extending their results to include theseprocesses, which are necessary for coevolution. Distributed time delays from the infected to the recoveredcompartments and from the recovered to the susceptible compartments can be included by replacing thesingle infected and single recovered compartments by chains of nI and nR stages respectively. If individualstransition between adjacent stages in these chains at a constant rate of nI⌧I andnR⌧R, the resulting distributionof infection lengths and immunity times will be gamma distributed with shape parameter nI and nR andmeans ⌧I and ⌧R. New infections arise at a rate s (t)PnIj=1 i (j, t), where  is the transmission rate, s (t) isthe size of susceptible compartment at time t, and i (j, t) is the number of individuals in the jth stage in theinfected chain at time t. Host birth is included through logistic growth with a natural birth rate b and intensityof intraspecific competition  (all individuals are allowed to reproduce). Hosts die naturally at a rate d andat a rate  during infection,  > d. Because the average length of infection is ⌧I , the expected fitness costof infection is given by ⌧I (  d), a term that is equivalent to ⇠H in the matching-alleles coevolution model.34The resulting system of differential equations is:ddts (t) = b0@s (t) + nIXj=1(i (j, t)) +nRXj=1(r (j, t))1A| {z }Birth⇤0@1 0@s (t) + nIXj=1(i (j, t)) +nRXj=1(r (j, t))1A1A| {z }Density dependenceds (t)| {z }Death+nR⌧Rr (nR, t)| {z }Loss of immunitys (t)nIXj=1i (j, t)| {z }Infectionddti (1, t) =s (t)nIXj=1i (j, t)| {z }Infectioni (1, t)| {z }DeathnI⌧Ii (1, t)| {z }Transition to next stageddti (j, t) =nI⌧Ii (j  1, t)| {z }Transition from prior stagei (j, t)| {z }DeathnI⌧Ii (j, t)| {z }Transition to next stage8j > 1ddtr (1, t) =nI⌧Ii (nI , t)| {z }Recoverydr (1, t)| {z }DeathnR⌧Rr (1, t)| {z }Transition to next stageddtr (j, t) =nR⌧Rr (j  1, t)| {z }Transition from prior stagedr (j, t)| {z }DeathnR⌧Rr (j, t)| {z }Transition to next stage8j > 1(3.4)As designed, system (3.4) exhibits several types of long-term epidemiological dynamics. This systemhas three biologically valid equilibria, the first of which is host extinction. This equilibrium is only stablewhen the natural host death rate is greater than the birth rate, d > b. For the remainder of the analysis we willassume that this is not true. The second, disease free, equilibrium describes pathogen extinction. Letting sˆbe the equilibrium value of the susceptible compartment and iˆ (c) and rˆ (c) the value of the cth infected andrecovered stage at equilibrium, the disease-free equilibrium is given by iˆ(c) = 0, rˆ(c) = 0 8 c, and sˆ = bdb .Note that the disease free equilibrium population size ranges from 0 to 1 depending on the ratio of host birthand death rates. In the supplementary Mathematica notebook we show that this disease-free equilibrium isstable only when net disease spread is not possible, i.e. R0 = (bd)((nI+⌧I)nInInI )b(nI+⌧I)nI < 1. Finally, whenR0 > 1 there exists an equilibrium that allows host-pathogen coexistence.Focusing on the dynamics of the coexistence equilibrium, the system may either asymptotically approachthis equilibrium (endemic case) or exhibit stable limit cycles about it (epidemic case). Although there ex-ists other uses of the terms “epidemic” and “endemic”, we will use them throughout to distinguish betweencases where the interior equilibrium is stable (endemic) versus unstable with cyclic increases in the disease(epidemic). Given the high dimensionality of system (3.4), for the epidemic case the existence of the limit35cycles was confirmed numerically rather than through a formal proof (Figure 3.2). Some of the key parame-ters determining when limit cycles can arise are the host birth and death rates. There exists a critical birth ratebelow which cycles are not possible, similarly there exists a critical host death rate above which cycles arenot possible. Similarly, the parameter space cycles decreases with increasing host death through virulence.Lastly, cycles can only occur for highly infectious pathogens where the duration of infection ⌧I is relativelyshort compared to the length of the recovery period ⌧R.! = #.#%! = #.#&! = #.#' ( = #.###)( = #.##%( = #.##&* = #.'* = #.+* = #.) , = #. ##&, = #.##), = #. #%!! !!! "! "Limit cyclesDisease lossStable endemicLimit cyclesDisease lossStable endemicLimit cyclesDisease lossStable endemicLimit cyclesDisease lossStable endemicFigure 3.2: SIRS epidemiological dynamics without coevolution. Lines depict bifurcation points of sys-tem (3.4). The disease either goes extinct, reaches a stable endemic level, or exhibits stable limit cycles.Parameter values unless otherwise noted: nI = 20, nR = 20, b = 0.02,  = 1, d = 0.001,  = 0.01, = SIRS Model With CoevolutionUsing a similar epidemiological framework as in system (3.4) in the absence of coevolution, we now in-corporate genetic variation in both hosts and parasites exhibiting a MAM coevolutionary interaction. Eachof two host types, h = {1, 2}, can become infected by one of two pathogen strains, p = {1, 2}. Uponinfection a host of type h infected with a pathogen of type p progresses through a chain of infected stages,with each transition occurring at a rate nI⌧I . As with system (3.4) this results in a gamma distributed delay36between the time of infection and time of recovery, with the shape of the distribution given by nI and theexpected time to recovery ⌧I . We further assume that hosts currently infected with one parasite cannot beinfected by the other (no co-infection). Similarly, recovered hosts, which are assumed to have complete crossimmunity to both pathogen strains, progress through a chain of recovered stages each at a rate nR⌧R . As withthe matching-alleles coevolution model, host-pathogen specificity is captured by the infection matrix  (seeequation (3.1)). We once again focus on the symmetric case where 11 = 22 = X and 12 = 21 = Y > 0.The resulting system of (2 + 4nI + 2nR) differential equations is given by:ddts (h, t) = b0@s (h, t) + nI ,2Xj,li (h, l, j, t) +nRXjr (h, j, t)1A| {z }birth⇤0@1 0@ 2Xk0@s (k, t) + nI ,2Xj,li (k, l, j, t) +nRXjr (k, j, t)1A1A1A| {z }Density dependenceds (h, t)| {z }Death+nR⌧Rr (h, n,t)| {z }Loss of immunitys (h, t)2Xlh,l2,nXk,ji (k, l, j, t)| {z }Infectionddti (h, p, 1, t) = s (h, t)h,p2,nIXk,ji (k, p, j, t)| {z }Infectioni (h, p, 1, t)| {z }DeathnI⌧Ii (h, p, 1, t)| {z }Transition to next stageddti (h, p, j, t) =nI⌧Ii (h, p, j  1, t)| {z }Transition from previous stagei (h, p, j, t)| {z }DeathnI⌧Ii (h, p, j, t)| {z }Transition to next stage8j > 1ddtr (h, 1, t) =nI⌧I2Xli (h, l, n, t)| {z }recovery from infectiondr (h, 1, t)| {z }DeathnR⌧Rr (h, 1, t)| {z }Transition to next stageddtr (h, j, t) =nR⌧Rr (h, j  1, t)| {z }Transition from previous stagedr (h, j, t)| {z }DeathnR⌧Rr (h, j, t)| {z }Transition to next stage8j > 1(3.5)This system has four different types of equilibria. As with system (3.4), if d > b the host, and thereforethe pathogen, will go extinct. Assuming that this is not the case, if R0 < 1 the system will reach a disease-free state. Finally, there are two possible types of equilibria with host-pathogen coexistence. First, onehost and one pathogen type may go extinct leaving only one host and one pathogen type at equilibrium. Thismonomorphic equilibrium is, however, never stable as the other host or pathogen type can always invade. Thesecond coexistence equilibrium is characterized by a polymorphic host and both pathogens (see AppendixB). The stability of this equilibrium is similar to that of the coexistence equilibrium of system (3.4) andcan either be stable or approach a stable limit cycle. The conditions under which each of these equilibriumdynamics arises is shown in Figure 3.3 and shows similar characteristics to system (3.4). Once again, the37existence of limit cycles was confirmed through numerical integration of system (3.5) rather than through aformal proof.Epidemiologicallimit cyclesEpidemiologicallimit cyclesEpidemiologicallimit cycles!" !"! #! #Disease lossStable endemicDisease lossStable endemicLimit cyclesDisease lossStable endemicDisease lossStable endemicLimit cyclesLimit cyclesLimit cycles$ = &.&($ = &.&)$ = &.&* + = &.&&&,+ = &. &&(+ = &. &&)- = &.*- = &..- = &., / = &. &&)/ = &.&&,/ = &.&(Figure 3.3: SIRS epidemiological dynamics with coevolution. Lines depict bifurcation points of system(3.5). The disease either goes extinct, reaches a stable endemic level, or exhibits stable limit cycles. Param-eter values are the same as in Figure 3.2 except where X =  and Y = Model ComparisonComparing the dynamics of systems (3.2), (3.4), and (3.5) we can describe the effect of coevolution on theepidemiological dynamics and vice versa. The inclusion of host and pathogen variation into the SIRS epi-demiological model can alter the conditions under which the system will reach a stable endemic equilibriumor a limit cycle, as shown in Figure 3.4. When there is little difference between the two host or pathogenstrains, i.e. X ⇡ Y , coevolution has little effect on the epidemiological dynamics (Panels A,B). This is nolonger the case when large differences exist (Panel C). Depending on the original pair of host and pathogenstrains in the population (red for  = X and blue for  = Y curves), introduction of variation in bothhost and pathogen (black curve) can either stabilize the system at the endemic equilibrium or drive it towardan epidemic limit cycle. These changes to the dynamics illustrates the importance of incorporating genetic38diversity and host-pathogen coevolution into epidemiological models.Epidemiological dynamics have an even more drastic effect on the coevolutionary allele-frequency dy-namics. Specifically the neutrally stable MAM cycles are dampened with the inclusion of explicit epidemi-ology as, for example, shown in Figure 3.5. Although it is difficult to demonstrate analytically the generalityof this dampening, we confirmed numerically for one million randomly chosen parameter combinations that,except in the special case where Y = 0, the system either exhibits a stable endemic equilibrium with equalallele frequencies or, when this point is unstable, the system moves away from the equilibrium via shiftsin host numbers but remains on the plane where pH = 0.5 and pP = 0.5 (as determined by the eigen-vectors associated with any positive eigenvalue, see details in the Supplementary Mathematica file). In aperfect matching model where Y = 0, as studied by May and Anderson (1983), the dynamics of each hostand parasite pair are independent except for their interaction through the logistic growth of hosts. In thisparticular case allele frequency cycles can now occur but only when there are periodic disease outbreaks;these cycles are a consequence of the epidemic cycles for the host types remaining unsynchronized by crossinfection when Y=0. By contrast, whenever Y > 0, epidemics in the different host genotypes are temporallysynchronized by cross infection, which keeps the allele frequencies at 1/2. Of particular interest, for allvalues of Y  0 allele frequency cycles are damped when the endemic disease equilibrium is stable. Thisis surprising, given that the constant force of infection at an endemic equilibrium should closely reflect theassumptions of the constant infection rates made in a purely coevolutionary version of the matching-allelesmodel. The sensitivity of Red Queen allele-frequency cycles may be attributed to the fact that the cyclesare neutrally stable rather than limit cycles, arising in systems with a purely imaginary leading eigenvalue.In such cases, small perturbations to the system can lead to large shifts in the resulting dynamics (Karlinand McGregor, 1972). To test whether the disruption of allele-frequency cycles is a more wide-spread phe-nomenon we next assess the impact of epidemiological dynamics on an alternative coevolutionary modelthat exhibits allele-frequency stable limit cycles in allele frequencies.Limit cyclesDisease lossStable endemicDisease lossStable endemicLimit cyclesDisease lossStable endemic!!! "1. 2.3.A. B. C.Limit cyclesFigure 3.4: Comparison of epidemiological dynamics with and without coevolution. Lines depict bifur-cation points of system (3.4) (red-dotted:  = X and blue-dashed:  = Y ) and (3.5) (black-solid). Points1,2, and 3 correspond to columns of Figure 3.5. Parameters: nI = 20, nR = 20, b = 0.02, d = 0.001, = 0.01. Panel A: X = 0.5, Y = 0.45, B: X = 0.3, Y = 0.25, C: X = 0.5, Y = 0.25.39Allele FrequencyAbundanceAbundanceAbundanceAllele FrequencyTimeFigure 3.5: Effect of SIRS epidemiological dynamics on allele-frequency cycles. Allele-frequency dy-namics from the matching-alleles model (Row 1), epidemiological dynamics without (Row 2:  = X , Row3:  = Y ) and with coevolution (Row 4), and allele-frequency dynamics in an SIRS epidemiological model(Row 5) for three different parameter conditions as shown in Figure 3.4. In allele frequency plots, host-allelefrequencies are shown in green and pathogen allele frequencies in black. Epidemiological dynamics showthe abundance of susceptible hosts in red (solid), infected hosts in blue (dashed), and recovered hosts inpurple (dotted). Parameters: nI = 20, nR = 20, b = 0.02, d = 0.001,  = 0.01, X = 0.5, and Y = 0.25.Column 1: ⌧I = 7, ⌧R = 20, column 2: ⌧I = 7 and ⌧R = 35, and column 3: ⌧I = 15, ⌧R = 35. The initialconditions were chosen such that pH(0) = 0.55 and pP (0) = 0.49, in the epidemiological model this wasdone by adding a small perturbation to the endemic equilibrium.403.7 Coevolution in an SEIRS ModelAlthough there are several coevolutionary models that exhibit stable limit cycles in discrete time due tothe inherent time delays (for example Agrawal and Lively, 2002), these cycles are often not maintained incontinuous time. We therefore begin by deriving a coevolutionary model with which to test the effect ofcontinuous time epidemiological dynamics on limit cycles. We do so by introducing a time delay into thecoevolutionary model in (3.2). Upon infecting a host, pathogens must first pass through an immature stageduring which they are not infectious themselves. In addition to introducing this immature stage the mainte-nance of allele-frequency limit cycles requires that we introduce pathogen mutation at rate µ to prevent thesystem from cycling out until one allele fixes in each species. The resulting differential equations describingthe change of allele frequencies in the host pH , immature pathogen pP , and mature pathogen pM are givenby:ddtpH = ⇠HpH(t)(1 pH(t))((11  21)pM (t) + (12  22)(1 pM (t)))ddtpP =⇠P pM (t)(1 pM (t))((11  12)pH(t) + (1 pH(t))(21  22)) + (1 2pP (t))µddtpM =pP (t) pM (t)(3.6)The differential equation for the change in pathogen allele frequency is similar to that of system (3.2)except now infection depends only on the frequency of mature pathogens pM . Immature pathogens matureat a rate  and are replaced by new infections. Finally, the change in the frequency of mature pathogensis determined by pathogen maturation and mature pathogen death, both of which occur at rate  (held atthe same rate to avoid accumulation in one maturation phase or the other). Deriving this model as thecontinuous time limit of a discrete time model we must ensure that the change in allele frequencies overany one time step is small (See supplementary Mathematica notebook). This in turn requires that the rateof pathogen maturation and mature pathogen death, , is large. Once again assuming infection is sym-metric, 11 = 22 = X,12 = 21 = Y , this coevolutionary model without epidemiological dynam-ics exhibits stable allele-frequency limit cycles (Figure 3.6) as long as the mutation rate is not too high,µ < 14⇣ +p2 + (X  Y )2⇠H⇠P⌘.Introducing an immature pathogen stage is analogous to incorporating an “exposed” compartment intothe epidemiological model during which hosts are infected but not yet infectious themselves. The resultingsystem of differential equations without and with coevolution are very similar to systems (3.4) and (3.5)and exhibit the same range of dynamics (see Mathematica file). Comparing the allele-frequency dynamicsbetween the modified matching-alleles model and the SEIRS model with coevolution demonstrates that onceagain the allele-frequency cycles are disrupted by epidemiological feedback (Figure 3.6), confirming that theloss of these cycles extends beyond cases of neutrally stable coevolutionary cycles.41AbundanceAbundanceAbundanceAllele FrequencyAllele FrequencyTimeFigure 3.6: Effect of SEIRS epidemiological dynamics on allele-frequency cycles. Allele-frequency andepidemiological dynamics for the SEIRS model and alternative matching-allele model in section 7. Rows arethe same as in Figure 3.5. Parameters: nE = 1, nI = 2, nR = 60, b = 0.2, d = 0.001,  = 0.06, X = 0.5,Y = 0.35, ⌧I = 5, and µ = 0.001. Column 1:⌧R = 40, column 2: ⌧R = 50, and column 3:⌧R = 65. Theinitial conditions were chosen such that pH(0) = 0.55 and pP (0) = 0.49, in the epidemiological model thiswas done by adding a small perturbation to the endemic equilibrium. In epidemiological dynamic plots thedensity of susceptible hosts are shown in red (solid), exposed hosts in green (dash-dotted), infected hosts inblue (dashed), and recovered hosts in purple (dotted). Host allele frequencies are given in black (solid) whileparasite allele frequencies are shown in green (dashed).423.8 DiscussionDespite the conceptual overlap of epidemiology and host-pathogen coevolution, few models have includedboth. The few examples where this has been done (Andreasen and Christiansen, 1995; May and Anderson,1983; Beck, 1984; Morand et al., 1996) have made stringent assumptions that prevent the generalization oftheir results to a broad range of coevolutionary and epidemiological scenarios. By incorporating multipleinfectious and recovered stages into a SIRS model, we are able to explore the effect of coevolution on awide range of epidemiological scenarios from disease extinction to epidemic limit cycles. We find that in thepresence of matching-allele type coevolution, epidemiological dynamics are typically intermediate betweenthose expected in the absence of genetic variability when  = X and  = Y . Importantly however, thereexist parameter conditions under which the introduction of coevolution can lead to epidemic limit cyclesthat would not persist in the absence of genetic variability (Figure 3.4). In addition to affecting the type ofepidemiological dynamics, genetic variation in the host and parasite, along with the distribution of infectionand recovery times (as specified by nI , ⌧I , nR and ⌧R), determine the relative abundance of susceptible andinfected hosts at equilibrium (see Appendix B) extending the results obtained previously without geneticvariation (Gonçalves et al., 2011). Importantly however, when all host and pathogen types are present, theallele frequency of the host and the parasite at equilibrium is 1/2 regardless of infection and recovery times.Although the equilibrium allele frequencies are the same in the matching-alleles model with and withoutepidemiological dynamics, including epidemiology has a dramatic effect on the predicted allele frequencydynamics. In comparison to the neutrally stable allele frequency cycles observed in the classic coevolution-ary model (Figure 3.1), “Red Queen” cycles do not persist in the epidemiological models explored regardlessof the disease outcome (disease extinction, endemic equilibrium, or epidemic limit cycles). Instead the allelefrequency dynamics are forced to equilibrate by the epidemiological dynamics, quickly dampening to anallele frequency of 0.5 in both the host and pathogen. This finding is consistent with previous work showingthat Red Queen cycles are sensitive to other ecological effects, such as competition and predation (Gokhaleet al., 2013; Song et al., 2015) as well as epidemiological feedback such as parasite castration (Ashby andGupta, 2014). Importantly, this dampening of allele frequency cycles contrasts with the sustained allele fre-quency limit cycles found by Morand et al. (1996). A numerical analysis of the robustness of Morand et al.’smodel illustrates that allele frequency cycles are only maintained in cases of very strong selection, with par-asites that kill or castrate infected hosts at high rates. In addition, their model differs from ours in assuminga sexually reproducing hosts and parasites using a discrete-time model. In the supplementary Mathematicanotebook, we explore models that span the differences between the Morand et al. model and ours, findingthat sex and discrete-time steps both increased the parameter space in which cycles were observed; but inevery case strong selection (through castration and/or virulence) was required.One potential argument for the sensitivity of allele frequency cycles is that they are neutrally stablein the pure co-evolutionary model (Figure 3.1), with a purely imaginary leading eigenvalue (eg. Section3.3). Small perturbations to such systems can have drastic effects on the dynamics (Karlin and McGregor,1972). In contrast, a Red Queen model that generates stable limit cycles would be predicted to be morerobust to population or epidemiological dynamics. We thus developed a coevolutionary model that generatesstable limit cycles by incorporating an immature pathogen stage and pathogen mutation. Nevertheless by43comparing these coevolutionary limit cycles to a SEIRS epidemiological model in which newly infectedhosts must progress through an exposed but non-infectious stage we find that, at least in the cases tested,allele frequencies are damped by the inclusion of epidemiological dynamics, as shown in Figure 3.6. Thisconfirms that the disruption of allele-frequency cycles is not restricted to coevolutionary models with neutrallimit cycles.The sensitivity of Red Queen cycles has several important biological implications. For example thesecycles form the basis for the “Red Queen hypothesis” for the evolution and maintenance of sexual reproduc-tion (Hamilton, 1980). The inability of these cycles to persist in the face of ecological and epidemiologicaldynamics suggests that, at least in the absence of other cyclical drivers such as seasonal forcing of param-eters, other mechanisms are necessary to explain the abundance of sexual reproduction. For example allelefrequency cycles can sometimes be maintained with strong selection as with castrating parasites (Morandet al., 1996). Although we have demonstrated here that host epidemiological dynamics prevent the long-term persistence of “trench-warfare” type Red-Queen cycles, this result may not extend to other forms ofcoevolution such as host-pathogen arms races, where coevolution favours the fixation of progressively morevirulent pathogens and resistant hosts. If maintained despite epidemiological dynamics, the ensuing geneticarms race could provide one viable alternative mechanism selecting for sex. A second potential alternativeis where common pathogen types select for specific allelic combinations in hosts in a manner that fluctuatesover time and generates cycles in the frequency of each genotype, as seen in the explicitly epidemiologicalmodel of Penman et al. (2013). Addressing if and when such genotypic cycles can occur would requireextending model (3.5) to include multiple host and pathogen loci. Finally, sex may be maintained to relieveselective interference among loci, including loci subject to host-parasite interactions (Hodgson and Otto,2012).In addition to the evolution of sex, models of host-pathogen coevolution in the absence of explicit epi-demiological dynamics have been used to formulate predictions for a wide-range of phenomena includingthe evolution of self-fertilization (Agrawal and Lively, 2001), non-random mating (Nuismer et al., 2008),gene expression (Nuismer et al., 2005), and ploidy (Nuismer and Otto, 2004; M’Gonigle and Otto, 2011).Extending our results to determine whether or not these implications hold in the presence of epidemiologicaldynamics is an important next step. Nevertheless, we demonstrate that epidemiological processes can haveimportant implications on host-parasite coevolution. Hence to understand the interaction between hosts andtheir pathogens we need to consider both the genetic and epidemiological features of species interactions.44Chapter 4Coevolution Does Not Slow the Rate of Lossof Heterozygosity in a StochasticHost-Parasite Model with ConstantPopulation Size4.1 IntroductionThere is a rich history of evolutionary theory exploring the conditions under which genetic variation ismaintained or depleted in finite populations. The loss of genetic variation through drift is exacerbated, forexample, by fluctuations in population size (Crow and Kimura, 1970, eq., but variation is main-tained through balancing selection in the form of overdominance (Crow and Kimura, 1970, eq. 8.6.4) ornegative frequency-dependent selection (Takahata and Nei, 1990). First suggested by Haldane (1949), oneprocess that is often posited to maintain genetic variation is coevolution between hosts and their parasites.Coevolution, it is argued, should favour pathogens that are best at infecting the most common host geno-type. This in turn should favour the spread of rare host genotypes, a form of negative frequency-dependentselection (NFDS) believed to maintain genetic variation (Clarke, 1979). Despite the long-standing interestin coevolution as a mechanism maintaining variation, few studies have explicitly analysed the dynamics of ahost and parasite that are both polymorphic in a fully stochastic framework.Following Haldane’s initial hypotheses, balancing selection as a result of coevolution in the form ofNFDS and/or overdominance was suggested as a mechanism behind the extraordinary genetic diversity foundat mammalian Major Histocompatibility Complex (MHC) loci (Bodmer, 1972). These same arguments havebeen used recently to explain the diversity of anti-microbial peptides in Drosophila (Unckless et al., 2016;Chapman et al., 2019). MHC loci, and other immune defence genes, are notable not only for their high levelsof heterozygosity (> 200 alleles across three loci Klein and Figueroa, 1986; Zimmer and Emlen, 2013) butalso for the long-term trans-specific persistence of these polymorphisms (Lawlor et al., 1988; Klein, 1987).Using a coalescent approach Takahata and Nei (1990) found that heterozygote advantage and NFDS are bothcapable of generating the observed levels of polymorphism. Importantly, however, the model they used wasnot explicitly coevolutionary but rather based on NFDS within a single species.As exemplified by Takahata and Nei (1990), much of the literature on the maintenance of genetic varia-tion alludes to yet blurs the distinction between single-species and coevolutionary NFDS45(for example see Tellier et al., 2014; Otto and Michalakis, 1998; Zhao and Waxman, 2016; Llaurens et al.,2017; Ejsmond and Radwan, 2015; Rabajante et al., 2016). By the definition of NFDS in a single-speciesmodel (called direct-NFDS, Brown and Tellier, 2011), the fitness of an allele increases as its frequencydeclines, this in turn can favour the spread of rare alleles and the maintenance of genetic variation. In contrasttwo-species NFDS (called indirect-NFDS), the sort that commonly arises with host-parasite coevolution,favours alleles of the focal species that correspond to ones which are rare in the interacting species. As notedby Brown and Tellier for the gene-for-gene model (2011), we show that indirect-NFDS in matching-allelesmodel often has little if any impact on the maintenance of genetic variation.Hints that indirect-NFDS does not maintain genetic variation can be found throughout the theoreticalliterature on matching-alleles models. Simulating coevolution in a population where genetic variation isrepeatedly introduced through migration or mutation, Frank (1991, 1993) found that the dynamics weredominated by the fixation and loss of genetic variants. He attributed this effect to the repeated populationbottlenecks that occur from coevolutionary driven fluctuations in population size, but it was unclear whetherthis same behaviour would arise in a population of constant size. Similarly, although not explicitly discussed,many individual-based models of coevolution include mutation in either one (Agrawal and Lively, 2002) orboth species (Lively, 1999; Borghans et al., 2004; Ejsmond and Radwan, 2015) in order to maintain variationand hence coevolution over the long term. Modelling coevolution in an infinite population, M’Gonigle et al.(2009) found that genetic variation is not maintained at equilibrium except when mutation is very frequent.This is echoed in simulations in finite populations, where in the absence of mutation/migration, fixation ineither the host or pathogen is vary rapid (Gokhale et al., 2013; Schenk et al., 2018). In addition to includingmutation, there are several theoretical indications that, rather than being driven by NFDS, the emergenteffects of coevolution are dependent on the existence of heterozygote advantage in diploids. For example,M’Gonigle and Otto (2011) showed that the evolution of parasitism depends, not solely on NFDS, but onwhether the interaction induces heterozygote advantage, on average. Similarly, Nuismer and Otto (2004)showed that whether there is, on average, heterozygote advantage is the key determinant of how ploidylevels evolve in both hosts and parasites.Despite this long and extensive history of verbal and theoretical models, it is unclear whether indirect-NFDS can indeed maintain genetic variation and explain the extensive heterozygosity observed at immunedefence loci. Here we compare the maintenance of genetic variation in finite coevolving populations relativeto that expected under neutral drift. Previous studies of finite coevolving populations have focused instead onthe number of alleles maintained by mutation and coevolution (Borghans et al., 2004), the time until loss ofalleles (Gokhale et al., 2013), and the impact of different model assumptions on the time until loss (Schenket al., 2018). A direct comparison with the neutral expectation is, however, only possible analytically insimple models of host-parasite coevolution. We aim to understand the effect of host-parasite interactions onthe maintenance of genetic diversity by examining one such simple single-locus model of coevolution.464.2 Theoretical BackgroundThere are two classic models of coevolution involving a single locus major-effect genes in each species, thegene-for-gene (GFG) model, which was motivated by the genetic architecture of flax-rust interactions (Flor,1956), and the matching-alleles model (MAM), a form of host-parasite specificity that may arise from lockand key molecular interactions (Dybdahl et al., 2014). While GFG and MAM represent only two of the manypossible models of host-parasite specificity, their dynamics are representative of the full range that can arisein single-locus models (Agrawal and Lively, 2002; Engelstädter, 2015). In the GFG model, hosts carry ei-ther a “susceptible” or “resistant” allele. Susceptible hosts can be infected by both “virulent” and “avirulent”parasite genotypes whereas resistant hosts can only be infected by the virulent parasite genotype. This asym-metry favours fixation of resistant hosts followed by the fixation of virulent pathogens. Deterministic modelshave shown that this coevolutionary arms race does not in general maintain genetic variation (Leonard, 1977,1994). Rather stability of genetic polymorphism and the long-term maintenance of genetic variation requiresthe introduction of temporal delays and asynchrony between host and parasite life cycles (Brown and Tel-lier, 2011). As GFG is often used to model coevolution in plant-pathogen systems, this asynchrony can beintroduced, for example, through seed dormancy or perenniality (Tellier and Brown, 2009).In contrast to the coevolutionary arms-race dynamics with GFG the symmetry of MAM generates RedQueen allele frequency cycles (also known as trench-warfare dynamics), which are more amenable to themaintenance of genetic variation. MAM is often formulated, and simulated, as an interaction between ahost and pathogen with discrete non-overlapping generations (Nuismer, 2017). In each generation hosts areexposed to a single parasite. If host and parasite carry the same, “matching”, allele at the coevolutionary locusthen infection occurs with probability X whereas interactions between “mis-matching” host and parasitegenotypes occur at a reduced rate Y . Successful infection decreases host fitness by ↵H and increases parasitefitness by ↵P . The host and pathogen population sizes are often assumed to remain constant in these modelssuch that hosts and pathogens reproduce proportionally to their fitness creating a subsequent generation ofthe same size.In the deterministic (infinite population size) limit, the resulting recursion equations for the MAM arecharacterized by an unstable cyclic equilibrium about a host and pathogen allele frequency of 1/2 as shown inFigure 4.1A (M’Gonigle et al. (2009): haploid host, haploid pathogen, Leonard (1994): diploid host, haploidpathogen). The corresponding dynamics of host heterozygosity is shown in Figure 4.1B. Starting from asmall perturbation near the polymorphic equilibrium, heterozygosity in the host population begins near itsmaximum of 0.5 and declines, on average, as the allele frequencies cycle outward. As noted above, althoughNFDS generates Red Queen allele frequency cycles in the short term, these cycles grow in amplitude overtime and are not expected to maintain genetic variation in the long term except in the presence of very highmutation rates (M’Gonigle et al., 2009). Indeed, simulating the discrete-time MAM in a finite population, weconfirm that host heterozygosity in this model , defined asH = 2p(1p)where p is the host allele frequency,declines faster than expected under neutral genetic drift (see supplementary Mathematica notebook).In contrast to the unstable Red Queen cycles generated in discrete time, analogous continuous-timemodels create neutrally stable allele frequency cycles (Woolhouse et al., 2002). Perturbations from thepolymorphic equilibrium in these models lead to allele frequency cycles of constant amplitude, often referred47to as a “dynamic polymorphism” (Hamilton, 1993). This dynamic polymorphism gives rise to correspondingcycles in heterozygosity also of constant amplitude (Figure 4.1C). This neutral stability indicates that, whenaveraged over the cycle, genetic variation remains constant in an infinite population, much like the behaviourof a neutral locus in a single-species model. We thus hypothesize that drift in a finite population mayhave similar effects in the coevolutionary model as in the neutral single-species model. Our aim here istherefore to develop an appropriate continuous-time MAM of coevolution in a finite population of constantsize that, like the standard continuous-time MAM, does not lead to an inherent decline in heterozygosityin the deterministic limit. This model allows us to quantify the loss of heterozygosity with drift in thecoevolutionary MAM for comparison to the standard single-species neutral model.All the models discussed thus far make the traditional assumption that both host and parasite densities areinfinite and controlled by factors independent of the host-parasite interaction. This is the case, for example,when host and parasite population sizes are fixed by a hard carrying capacity that is very large. The numbersof hosts and parasites may, however, vary dramatically in response to coevolution (Papkou et al., 2016). Inaddition parasites that are transmitted directly between hosts may be subject to epidemiological dynamics.While we have reason to believe that SIR dynamics can stabilize allele frequency dynamics (Chapter 3:MacPherson and Otto, 2018), our goal in this work is to focus solely on the effects of NFDS that arises fromcoevolution. Unlike previous analyses, our goal here is to examine the stochastic nature of coevolutionarydynamics, allowing us to quantify rates of loss of genetic variation. To do so, we focus on the simplest case ofstrict external control of population size in both hosts and parasites without demographic or epidemiologicalfeedback.4.3 The ModelWe use a continuous-time birth-death model to describe the coevolutionary dynamics between a host anda free-living pathogen, as depicted in Figure 4.2. To keep the total host and pathogen population sizesconstant at the same fixed value , we use a Moran model design with coupled birth-death events. Extensionof the model to unequal host and pathogen population sizes is straightforward and the results do not differqualitatively from those presented here (see Chapter 5). Both host and pathogen are haploid, with coevolutiondepending on a single biallelic locus in each species. We represent the number of hosts and pathogens ofeach type by Hi and Pj where i 2 {1, 2} and j 2 {1, 2}. Hosts of type i come into contact with pathogensof type j at a density-independent rate HiPj . In keeping with MAM, upon contact the pathogen successfullyinfects the host with probability i,j . If the host and pathogen carry “matching” alleles (i = j) then i,j = X ,whereas “mis-matching” infection occurs with a reduced probability i,j = Y < X for i 6= j. If infectionoccurs, hosts die instantaneously with probability ↵. The resulting rate of fatal infection of host type i frompathogen type j is ↵i,jHiPj . Fatal infections result in four coupled events 1) the death of the infected hostof type i, 2) birth of a random host, 3) birth of the infecting pathogen of type j representing the release ofinfectious particles, and 4) death of a random pathogen. Non-fatal infections do not lead to the birth or deathof either the host or pathogen, effectively assuming that the pathogen mounts a very limited infection andreturns to the free-living pathogen population. In addition to these four events associated with infections,48●0.0 0.2 0.4 0.6 0.8 ●0 200 400 600 800 10000.●●●0.0 0.2 0.4 0.6 0.8 ●●●0 200 400 600 800 10000. BC DpP generationsp Hp HHeterozygosityHeterozygosityDiscreteTimeContinuousTimeFigure 4.1: Deterministic dynamics of matching-alleles coevolution. Panel A: Phase-plane diagram of theunstable allele frequency cycles of the discrete-time matching-alleles model given by system (C.1) startingfrom a small initial perturbation (red point) from the polymorphic equilibrium. Parameters: X = 0.8, Y =0.4,↵ = 0.4, tmax = 1000. Initial conditions: Black pH(0) = 0.54, pP (0) = 0.47. Panel B: Correspondingtemporal dynamics of host heterozygosity. Panel C: Phase plane of neutrally stable allele frequency cyclesarising in the continuous-time MAM given by system (4.3) starting form three different initial conditionsshown by the red points. Parameters: X = 0.8, Y = 0.4, = 100,  = 1,↵ = 0.2,  = 1, tmax = 1000.Initial conditions: Black pH(0) = 0.55, pP (0) = 0.45, Blue pH(0) = 0.7, pP (0) = 0.3, Green pH(0) =0.85, pP (0) = 0.15. Panel D: Corresponding temporal dynamics of host heterozygosity.49random host deathrandom host birthrandom pathogen deathrandom pathogen birthInfectious host death,pathogen birthrandom host birthrandom pathogen deathHostType i  {1, 2}HiPathogenType j  {1, 2}PiCoupledBirth-DeathEventCoupledBirth-DeathEventInfectionPathogen Birth/Host Death↵i,j Figure 4.2: Schematic of the Moran MAM. Coevolution consisting of three different types of coupledbirth-death events. Infection (green), natural host death (blue), and free-living pathogen death (purple).Solid lines denote events that occur at a rate that depends on the genotype whereas dashed arrows representbirth and death of random individuals irrespective of their genotype.natural host death-birth and free-living pathogen death-birth occur at per-capita rates  and , respectively.Both events are non-selective consisting of the death and birth of random host and pathogen genotypes.As stated above, our primary aim is to compare the maintenance of genetic variation in a finite coevolvingpopulation to that expected under neutral genetic drift alone. Focusing on genetic variation in the host, fora population of finite size , the expected decline in host-heterozygosity from neutral genetic drift in thehaploid Moran model is given by:Hneut(t) = H0e2t (4.1)whereH0 is the initial heterozygosity and time, t, is measured in units of host generations, (i.e. the expectedtime to  deaths, /(d) = 1/d where d is the death rate). To compare the neutral single-species modelto a host-parasite model, we need to define time in the host-parasite model in an equivalent way. In thecoevolutionary model, the per capita host death rate, D, hence the host generation time, changes over timeas the allele frequencies change.D =Pi,j i,j↵HiPj + (4.2)To measure time in terms of host generations we therefore rescale absolute time ⌧ such that t = D⌧ where tis again in units of host generations.504.3.1 Deterministic dynamicsIn the limit as the total host and pathogen population size goes to infinity (!1) the dynamics of the hostand pathogen allele frequencies pH = H1/ (1 pH = H2/) and pP = P1/ (1 pP = P2/) are givenby the following system of differential equations:dpH(t)dt=(X  Y )↵ (1 2pP (t)) pH(t) (1 pH(t))X↵+   (X  Y )↵ (pP (t) + pH(t) (1 2pP (t)))dpP (t)dt=(X  Y )↵ (1 2pH(t)) pP (t) (1 pP (t))X↵+   (X  Y )↵ (pP (t) + pH(t) (1 2pP (t)))(4.3)As shown in Figure 4.1, the behaviour of this Moran MAM in the deterministic limit is identical tothat of the traditional continuous-time MAM (Woolhouse et al., 2002). Specifically, there are five equilibriaof system (4.3). Four are unstable equilibria characterized by the fixation of one host and one pathogengenotype. The final polymorphic equilibria at pˆH = pˆP = 12 is neutrally stable with purely imaginaryleading eigenvalues, generating neutral-limit cycles in allele frequencies (see Figure 4.1C).As discussed previously, the dynamic polymorphism that arises in the deterministic continuous-timeMAM neither maintains nor depletes host heterozygosity (see Figure 4.1D). While the neutral stability ofthe MAM is well known, the consequences on heterozygosity are under-appreciated. Contrary to the pro-posed effect of coevolution and NFDS on genetic diversity (Haldane, 1949; Clarke, 1979), neutrally stableallele frequency cycles neither deplete nor restore genetic variation. Rather, in an infinite population, coevo-lution has no net effect on genetic diversity, averaged across a cycle. To see if this behaviour holds in finitecoevolving populations, we begin by describing an analytical approach to compare the dynamics of geneticdiversity in a finite population to those expected under neutral genetic drift. Because this analytical ap-proach only applies while the cycles are small we couple this approach with an individual-based simulationdescribing the loss of genetic diversity more generally.4.3.2 Ensemble moment dynamicsTo model the dynamics of genetic diversity in a finite population we express the model depicted in Figure 4.2as a continuous-time Markov chain. The complexity of the model limits the available analytical approaches.One approach we can use, however, is the ensemble moment approximation, the derivation of which is givenin the Appendix C. The result is an approximation for the expected host heterozygosity which we will callH˜(t). As with the neutral expectation in equation (4.1), this is the expectation of host heterozygosity overall realizations of the stochastic process, the approximation of which is given by a Taylor series assumingthe system is near the deterministic equilibrium (H0 ⇡ 1/2) and the population size large.H˜(t) ⇡ H0✓1 2t◆+O(✏3) (4.4)where ✏ is the deviation of the stochastic process from the deterministic equilibrium and ✏ = 1/. Onceagain t is measured in units of host generations. To this order, the ensemble moment dynamics are identicalto that of the neutral expectation given by (4.1). Hence, as expected from the deterministic dynamics, host51heterozygosity in the MAM declines as expected under neutral genetic drift. Although analytically unwieldy,numerical evaluation of H˜ to third order reveals slight deviations between the ensemble moment dynamicsand the neutral expectation that are also observed in the individual-based simulations presented in the fol-lowing section. Specifically, H˜ exceeds the neutral expectation when the free-living pathogen death rate is less than the death rate of the host  (negative values of    in Figure 4.3B) and falls below the neutralexpectation as this rate increases.The mechanism behind these lower-order deviations is shown schematically in Figure 4.3A. If we start,for example, at the equilibrium host and pathogen allele frequencies (pˆH = 0.5, pˆP = 0.5), infection, naturalhost death-birth, and free-living pathogen death-birth events perturb the allele frequencies. Selection thenacts on those perturbations to produce small counter-clock wise allele frequency cycles. Although over thelong term these allele frequency cycles have no net effect on heterozygosity, they do have an effect over theshort term. For example, if only the host allele frequency is perturbed by a random host birth event thenduring the first quarter cycle natural selection will increase host heterozygosity, because the parasite thatmatches the host that, by chance, became more common, will spread in turn reducing the more commonhost genotype. In contrast, if only the pathogen allele frequency is perturbed, selection will decrease hostheterozygosity by preferentially killing off the host that matches the parasite allele that increased by drift.The effect of these early quarter cycle responses to selection would average out if perturbations were dis-tributed uniformly about the equilibrium. However, natural host death affects only the host allele frequencywhereas free-living pathogen death only affects the pathogen allele frequency. Hence the relative rates ofthese two events determine whether these early responses to selection transiently increases or decreases het-erozygosity. Specifically, relatively high rates of natural host death and low rates of free-living pathogendeath (   < 0) lead to an increase in host heterozygosity relative to the neutral expectation (see Figure4.3B).4.3.3 Individual-based simulationsUsing a Gillespie algorithm we simulated host-parasite coevolution in a finite population. For each of one-hundred randomly drawn parameter combinations, we simulated coevolution for seven different populationssizes ranging on a log scale from  = 102 to  = 103.5 = 3162. The relative infection rates of matching andmis-matching genotypes were drawn such that 0  Y < X  1. We restricted the probability of dying frominfection, ↵, to lie between 0 and 1/2 as much stronger selection causes the rapid loss of alleles, limitingthe amount of coevolution. Host and pathogen free-living natural death rates  and  were both drawnbetween 0.1 and 1. For each of the 700 parameter combinations, we simulated 1000 replicate populationsfor tmax = 500 host generations.In tandem with each coevolutionary replicate population we simulated an analogous “neutral population”to check that it behaved correctly according to the neutral expectation given in equation (4.1). In thesesimulations, the exact same series of host and pathogen birth-death events occurred as in the coevolutionarysimulations, except that the individual who was born or died due to infection was drawn at random, thesesimulations captured random genetic drift in both host and pathogen. As shown in Figure 4.4B the meanheterozygosity of these neutral simulations (shown in gray) are identical to the analytical neutral expectation52pathogen allele frequency"̂#infectionhost allele frequency"̂#natural host death"̂#free-living pathogen deathSelection increases heterozygositySelection decreases heterozygosityChange in allele frequencyEffect of selection"̂ $-1.0 -0.5 0.0 0.5 1.0-0.0006-0.0004-0.00020.0000R2= 0.958AB  HEMHneutFigure 4.3: The effect of early perturbations. Shown schematically in panel A are the effects of the threeprocesses, infection, natural host death, and natural pathogen death on host and pathogen allele frequencyand the resulting effect of selection on host heterozygosity. H˜  Hneut is shown in panels B. Points givethe deviation at time t = 50 host generations for 200 randomly drawn combinations of  and , where0.1 <  < 1, 0.1 <  < 1, and X = 0.9, Y = 0.7,↵ = 0.2, = 105.given by equation (4.1), confirming that this is the appropriate neutral expectation for the two-species model.Figure 4.4 captures the four key results of our model. First, as predicted by the second-order ensemblemoment approximation given in equation (4.4) and shown in the inset of panel B the dynamics of heterozy-gosity in the coevolving populations behave neutrally, for approximately the first 10 generations, when thedynamics are very near the deterministic equilibrium. Second, as the distance from the deterministic equi-librium grows, the coevolutionary dynamics depart slightly from the neutral expectation due to the transienteffects of selection on perturbations. The resulting slight increase (decrease) in heterozygosity is shown bythe filled points in panel A that are either slightly greater (less) than the neutral expectation. The third keyresult is that heterozygosity is sometimes substantially less than expected from the neutral expectation. Fol-lowing fixation, the matching-alleles genetic mechanism becomes one of directional selection, favouring themis-matching host of the remaining pathogen genotype. This directional selection (shown by red trajectoriesin Figure 4.4C) in turn rapidly erodes genetic variation within the host population. Finally, as predictedby the deterministic model, host heterozygosity in most cases is near the neutral expectation. Specifically,open points in Figure 4.4A) indicate parameter conditions that did not differ significantly from the neutralexpectation.53○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○ ○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○ ○○○○○○○○○○○○○ ○○○ ○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●● ●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●0.0 0.1 0.2 0.3 0.4κ100178316562100017783162Neutral ExpectationIBS MeanBootstrap Means0 100 200 300 400 5000. 10 20 30 40 50 600.400.420.440.460.480.500 100 200 300 400 50000.250.50.751A BCHneutHcoev generationsHeterozygositypHgenerationsFigure 4.4: Individual-based simulations. Panel A: Heterozygosity in the coevolving populations versusthe neutral expectation at time t = 250 host generations. Black dashed line represents the expectation underneutrality. The red curve is the smoothed fit of the mean, light red lines are bootstrap means. The points arethe mean heterozygosities for each of the 420 parameter combinations averaged across the 1000 replicatepopulations simulated for each. Points are filled if the mean heterozygosity for the parameter combinationdiffers significantly from the neutral expectation (with 95% confidence with Bonferroni correction for the100 points for each ) versus empty when the replicate population means overlap neutrality. Panel B: Ex-ample dynamics for a single parameter condition (X = 0.29, Y = 0.14,↵ = 0.38,  = 0.99,  = 0.24,and  = 102.75 = 562). Red region represents CI of heterozygosity in the coevolving population,gray region is the CI in the analogous neutral simulation, Blue line is the neutral expectation from equa-tion (4.1) and green line is heterozygosity in the second order ensemble moment approximation. PanelC: Allele frequency dynamics for example simulation. Gray trajectories represent ongoing coevolution,red are periods of directional selection, while blue are fixed host replicate populations. Parameters:X = 0.26, Y = 0.25,↵ = 0.35,  = 0.36,  = 0.83, = 103.5 = 17784.4 DiscussionContrary to theories that posit that host-parasite coevolution and the associated negative frequency-dependentselection (NFDS) should maintain genetic variation (Haldane, 1949; Clarke, 1979; Takahata and Nei, 1990),we use stochastic methods, both describing ensemble moments and conducting individual-based simula-tions, to show that genetic variation is lost at nearly the same rate as in the neutral model (continuous time)or faster (discrete time). Although these results are consistent with stability analyses of the correspondingdeterministic models, such stability analyses only apply near the equilibrium with equal allele frequencies.Thus, in contrast to single-species NFDS, coevolutionary NFDS does not generally maintain genetic vari-ation. Specifically, the neutrality of the cycles ensures that for any given host allele frequency selection isas likely to be selecting for as against that allele. While the largely neutral effect of matching-alleles co-evolution is recovered in our models of finite-population size, we find that stochasticity introduces two new54effects. First is the effect of the initial response to selection following perturbations in allele frequency fromthe deterministic equilibrium, which may on average either maintain or deplete heterozygosity depending onthe relative death rates of the host versus the pathogen. The second non-neutral effect is that of directionalselection following pathogen fixation which rapidly erodes genetic variation.The contrasting effects of coevolutionary indirect-NFDS and single-species direct-NFDS have beenpointed out previously by Brown and Tellier (2011) in the context of the gene-for-gene coevolutionarymodel. Nevertheless, the distinction between these two processes and in particular their differing effectson the maintenance of genetic variation remains under-appreciated. Our finding that the MAM does not in-herently maintain genetic variation contrasts with and clarifies the findings of previous computational modelsof the maintenance of genetic variation at MHC loci (Ejsmond and Radwan, 2015; Borghans et al., 2004).Simulating matching-alleles coevolution at multiple loci in the presence of rapid host mutation, both Ejs-mond and Radwan (2015) and Borghans et al. (2004) find a weak signal of increased fitness of rare hostalleles. Our results clarify, however, that this is not in fact a true signal of negative frequency-dependentselection. Rather this is a result of the fact that novel, and hence rare, host alleles may be favoured tran-siently if there does not yet exist genetic variation at the corresponding pathogen locus. In contrast, longterm changes in host allele frequencies are not associated with corresponding changes in fitness as would berequired under NFDS (see Borghans et al. (2004) Fig 4A).Here we have focused on coevolution in a single population. Coevolution is however inherently a spatialprocess. Previous theoretical models suggest that spatial structure in combination with variability in thestrength and nature of host-parasite coevolution across space may promote local adaptation (Gandon et al.,1996; Nuismer, 2006). Our results illustrate, however, that this is not the result of coevolution within eachindividual deme but rather an emergent effect of spatial structure and gene flow. Whether these spatialmodels maintain more genetic variation than expected under neutral drift remains a valuable topic for futurework.In the main text we have focused on host-parasite coevolution in a continuous-time model as opposedto a model of coevolution in discrete time (see equation S9). This is because, as explored in the back-ground section, the continuous-time model is more likely to maintain genetic variation based on the be-haviour of the deterministic dynamics. We confirmed that this expectation holds when extended to finitepopulations by simulating host-parasite coevolution in a Wright-Fisher MAM (see Appendix C). As ex-pected host-heterozygosity in such a discrete-time MAM is significantly less than the neutral expectationand less than observed in the continuous-time model (see Figure C.1).The largely neutral effects of the continuous-time MAM and variation eroding effects in discrete-timemakes coevolutionary NFDS an insufficient explanation for the high levels of genetic diversity at hostimmuno-defence loci. The models presented here make several implicit assumptions that may influencethe maintenance of genetic diversity. In particular, we assume that the population size remains constant.Although this is a classic assumption in coevolutionary theory, coevolution is known to have drastic effectson population sizes of both host and pathogen in natural systems (Papkou et al., 2016). Indeed inclusion ofexplicit population size dynamics in the MAM results in fluctuations in host and pathogen population size(Nuismer, 2017; Frank, 1993). While these changes in population size are expected to have little to no effect55on the allele frequency dynamics, they are expected to have important consequences on the maintenanceof genetic variation (Crow and Kimura, 1970). We also assume that the host and parasite generations aresynchronized. The extensive literature on the deterministic stability of the polymorphic equilibrium has sug-gested that temporal asynchrony between the host and parasite, arising for example through seed dormancy,can stabilize polymorphisms and maintain variation (Brown and Tellier, 2011; Tellier and Brown, 2009).Developing an analogous MAM model, we find that here too including a temporal delay in the ability ofone species to respond to the other (here by including seed dormancy) has a stabilizing effect and can helpmaintain variation (see Appendix C).The maintenance of genetic diversity at human MHC loci may also be the result of coevolution betweenhosts and pathogens that are transmitted directly from one host to another and hence subject to epidemiolog-ical dynamics. Our previous work showed that these epidemiological dynamics stabilizes Red Queen allelefrequency cycles resulting in a stable polymorphic equilibrium (Chapter 3: MacPherson and Otto, 2018). Incontrast to the instability of the discrete-time MAM and the neutral stability of the continuous-time MAM,the stabilizing effects of these epidemiological dynamics are expected to maintain genetic variation. Whilethe work presented here was aimed solely at understanding the effects of coevolution on the maintenance ofgenetic variation in finite populations, extending these results to include more realistic population size andepidemiological dynamics is necessary to fully understand the maintenance of genetic variation in naturalsystems.56Chapter 5Feedback Between Coevolution andEpidemiology Can Help or Hinder theMaintenance of Genetic Variation inHost-Parasite Models5.1 IntroductionMathematical models play an important role in our understanding of infectious-disease epidemiology, allow-ing us to predict disease spread and design effective interventions. For many infectious diseases, accurateprediction of disease dynamics requires realistic modelling of host and pathogen heterogeneity (Woolhouseet al., 1997), such as variation in host behaviour or in epidemiological rates among pathogen strains. Geneticvariation in turn allows evolution in one species or coevolutionary responses if both host and parasite arevariable. Our previous work has demonstrated that coevolution can, in turn, influence the epidemiologicaldynamics (Chapter 3 MacPherson and Otto, 2018) and impact our ability to identify the genetic variationunderlying the host and pathogen interactions (Chapter 2 MacPherson et al., 2018).These potentially important impacts of coevolution on our understanding of infectious diseases are onlyrelevant, however, if we are likely to find genetic variation segregating in both host and pathogen. As firstsuggested by Haldane (1949), antagonistic coevolution between hosts and their parasites is thought to main-tain genetic variation through negative frequency- dependent selection. For example, in the matching-allelesmodel (MAM), parasites are better at infecting hosts that carry a “matching” genotype; natural selection willthus favour host and parasite genotypes that are rare in the interacting species. Despite resembling negativefrequency-dependent selection within a species, which does help to maintain genetic variation (Takahataand Nei, 1990), our recent work showed that coevolution between a host and free-living pathogen does notmaintain genetic variation relative to an equivalent model of neutral genetic drift when the host and pathogenpopulation sizes are forced to remain constant (Chapter 4). This previous chapter, however, ignored popula-tion dynamics that frequently accompany coevolutionary models, such as the SIR model (MacPherson andOtto, 2018; Penman et al., 2013).In this chapter, we investigate the loss of genetic variation in a stochastic co-evolutionary model in whichthe population sizes of host and parasite fluctuate over time in response to disease. Infectious diseases canhave drastic impacts on the population size of their hosts, as exemplified by the 1918 flu (Johnson andMueller, 2002; Taubenberger and Morens, 2006), myxoma virus, and rabbit haemorrhagic disease virus. We57consider both an ecological model involving a free-living parasite and an epidemiological (SIR) model wherepathogens are transmitted directly from an infected host to a susceptible host. Exemplified in part by ourprevious work (Chapter 3 MacPherson and Otto, 2018) and discussed in more detail in the following section,the feedback between coevolution and population size, as well as the free-living or non-free-living natureof the parasite, has important effects on coevolution dynamics. We thus set out to explore the stochasticdynamics and the maintenance of genetic variation in these coevolutionary models using individual-basedsimulations. We do so first by allowing density-dependent population growth in an ecological model withhosts and free-living pathogens. We then explicitly model SIR dynamics with a directly transmitted parasitein an epidemiological model. In both cases, we compare the maintenance of genetic variation to results of acoevolutionary model with finite but constant population sizes (Chapter 4).5.2 Theoretical BackgroundThere exists an extensive theoretical literature exploring when and how evolution either maintains or de-pletes genetic variation. In infinitely large populations, genetic variation is eroded by purifying and direc-tional selection. On the other hand, some forms of natural selection can help maintain genetic variation,for example heterozygote advantage (Crow and Kimura, 1970, sec. 9.7) or negative frequency-dependentselection (Takahata and Nei, 1990). In addition to these deterministic processes, genetic variation is lostcontinuously through genetic drift in finite populations. The rate at which drift occurs is proportional to theeffective population size, which in turn depends on the census population size, sex ratio (Crow and Kimura,1970), population subdivision (Whitlock and Barton, 1997), or age structure (Hill, 1972). In addition tothese effects on the maintenance of genetic variation within a single species, biotic interactions in the formof coevolution may influence the maintenance of genetic variation (Haldane, 1949; Clarke, 1979). As coevo-lution favours genotypes that are rare in the interacting species, these hypotheses are rooted in the view thatnegative frequency-dependent selection helps maintain variation.There exists an important distinction, however, between traditional direct frequency-dependent selection(dFDS) which favours genotypes when their own frequency is rare and indirect frequency-dependent selec-tion (iFDS) that arises via coevolution, as reviewed for models of gene-for-gene (GFG) coevolution betweenplants and their pathogens (Brown and Tellier, 2011). As in the GFGmodel, in Chapter 4 we found that iFDSin the MAM does not maintain genetic variation relative to neutral genetic drift. Specifically, we focused ex-clusively on the effects of coevolution (changes in allele frequency) on the maintenance of genetic variationby assuming that the host and parasite population sizes remained constant over time, a common assumptionof many MAM (Nuismer, 2017). Comparing heterozygosity under coevolution to the neutral expectationunder drift, the extent of genetic variation is almost always less than or equal to this expectation. While thedeterministic stability of this model would suggest that heterozygosity should evolve neutrally, decreases inheterozygosity relative to the neutral expectation occurs, in part, because chance fixation of the pathogen infinite populations turns symmetric coevolutionary selection into directional selection, which rapidly erodesgenetic variation in the host.While assuming that the host and parasite population size remain constant allowed us to focus on the58effects of coevolution, this assumption is violated in many systems (Papkou et al., 2016). Host populationsize fluctuations resulting from coevolution reduce effective population size, increasing the rate of geneticdrift. All else being equal, coevolution and the associated population dynamics are thus expected to depletegenetic variation. All else, is however, never equal. In particular, modelling host and parasite populationdynamics with density-dependent population growth allows the growth rate of the host to increase wheneverparasitism results in more deaths. This introduces an additional form of feedback between the host andparasite. What the net effect of population dynamics are, then, on the maintenance of genetic variationremains unclear. To disentangle these effects, we contrast the dynamics of the constant-size model usedpreviously (Chapter 4).Specifically, we explore two models that include density-dependent growth in the host. In the first,which we label the “ecological” MAM, the parasite is again free living, which a population size that variesdepending on the infection dynamics. While this assumption accurately captures the life history of manyhost-parasite systems, for example the infection of Daphnia magna by one of its many water-borne parasites(Ebert, 2005), many diseases are directly-transmitted and cannot live independently of their hosts. We thusalso develop a model of a directly-transmitted pathogen using compartmental models from epidemiology,where hosts are categorized by their infection status, for example as “susceptible” (S) or “infected” (I), whichwe refer to as the “epidemiological’; MAM. As a result the evolution of directly-transmitted pathogensis subject to yet another feedback mechanism between host and pathogen resulting from the depletion ofsusceptible hosts from infection.Epidemiological dynamics may also be expected to affect the maintenance of genetic variation. In ourprevious work we demonstrated that epidemiological dynamics disrupt the neutral stability of Red Queenallele frequency cycles observed in the constant-size model, resulting instead in a stable polymorphic equi-librium (Chapter 3 MacPherson and Otto, 2018). One consequence of the stabilizing effects of epidemiologyis that, in comparison to free-living parasites, directly-transmitted infections should be better able to main-tain genetic variation in their hosts. However, the effective population size of a directly-transmitted pathogenis restricted to being less than or equal to that of its host, because the total number of infected hosts mustbe less than or equal to the total host population size. As a result of having a smaller effective populationsize, parasites are more likely to reach fixation by random chance, increasing the likelihood of directionalselection and hence reducing the amount of genetic variation.Here we explore the maintenance of genetic variation in three models: a constant-size model as in Chap-ter 4, an ecological model, and a third SI epidemiological model, the latter two incorporating density de-pendence and allowing population size fluctuations in both the host and parasite. All the models exploredin this paper consider coevolution between a haploid host and haploid parasite with a single-locus bi-allelicmatching-alleles genetic basis. For each model we begin by exploring the deterministic coevolutionary dy-namics, using the stability of the polymorphic equilibrium to predict ability of each model to maintain geneticvariation in a finite population. We then test these predictions and compare the models using individual-basedsimulations of coevolution in finite populations. To evaluate the ability of different models to maintain vari-ation we compare host heterozygosity under coevolution Hcoev relative to the heterozygosity under neutraldriftHneut in the three models, first by fixing the fixed equilibrium host population size to a valueK (Section595.4), then in Section 5.5 by fixing both host and pathogen equilibrium population sizes across models.5.3 Deterministic DynamicsA.ConstantModelHostType i  {1, 2}HiPathogenType j  {1, 2}PjCoupledBirthDeathCoupledBirthDeathInfectionPath. Birth/Host Death i,jrandom host deathrandom host birthrandom path deathrandom path birthinfectious host deathpathogen birthrandom host birthrandom path. death●●●0.0 0.2 0.4 0.6 0.8 Allele Frequency pHPath.AlleleFrequencypP●●●0 200 400 600 800 10000. i  {1, 2}HiPathogenType j  {1, 2}PjInstantaniousInfectionInfectionb(1  H )↵ 1i,j●0.0 0.2 0.4 0.6 0.8 Allele Frequency pHPath.AlleleFrequencypP●0 100 200 300 400 5000.  {1, 2}ShInfectedh  {1, 2}p  {1, 2}Ih,pb1 S+I.i,j (↵+ )●0.0 0.2 0.4 0.6 0.8 Allele Frequency pHPath.AlleleFrequencypP●0 100 200 300 400 5000. 5.1: Model design and deterministic dynamics of the three models explored in this paper. Flowdiagrams illustrate the design of the A. constant-size, B. ecological, and C. epidemiological models. Greenlines indicate infection, blue lines host demographic dynamics, and purple pathogen demographic dynamics.Solid lines indicate processes that are genotype specific, whereas dashed lines indicate events that occur atrandom with respect to genotype. Top right hand panels show phase-plane dynamics of host and pathogenallele frequency for different initial conditions (shown by red points). Lower right-hand panels depict thecorresponding deterministic dynamics of heterozygosity.605.3.1 Constant-size model: reviewIn Chapter 4, we considered coevolution between a host and free-living pathogen, assuming that the popu-lation size of both species was constant over time and equal. We begin by generalizing the model allowingthe total host population size to be fH = K and pathogen population size fP while still remaining re-maining constant over time. The dynamics of this model are exclusively coevolutionary, characterized solelyby the dynamics of host and pathogen allele frequency, denoted by pH and pP , respectively. Hosts of typei = {1, 2} are infected by pathogens of type j = {1, 2} at a rate of i,j = X/ if host and pathogen are“matching” (i = j) and at a reduced rate Y/ if “mis-matching” (i 6= j). Shown graphically in Figure 5.1A,infection in the constant-size model leads to host death and pathogen birth. To keep host and pathogen pop-ulation sizes constant, we use a Moran model structure that couples host death and pathogen birth resultingfrom infection with the birth of a randomly drawn host genotype and the death of a random pathogen.In addition to infection, host and pathogen population turnover is determined by the death (and coupledrandom birth) of hosts, which occurs at rate , and free-living pathogen death (and coupled birth), whichoccurs at rate . The resulting dynamics of host and pathogen allele frequencies are given by:dpHdt=fP↵(X  Y )(1 2pP )pH(1 pH)dpPdt= fH↵(X  Y )(1 2pH)pP (1 pP )(5.1)System (5.1) is equivalent to the model presented in Chapter 4, except that time is no longer normalized interms of host generations (see supplementary Mathematica notebook). System (5.1) has two types of equi-libria: first, fixation of one host and one pathogen genotype and second, an internal polymorphic equilibriaat pH = pP = 1/2 (see supplementary Mathematica notebook). The fixation equilibria are unstable and thepolymorphic equilibria characterized by neutral stability. With a purely imaginary leading eigenvalue, thismodel is characterized by neutral Red Queen cycles in host and pathogen allele frequencies. Shown in Figure5.1A, these Red Queen allele frequency cycles correspond to cycles in host heterozygosity, the measure ofgenetic variation that we will use throughout. While heterozygosity fluctuates coevolution in this model nei-ther increases nor decreases host heterozygosity when averaged over the time. Hence, the neutral stability ofthe polymorphic equilibrium in the deterministic limit predicts that heterozygosity in MAM should behavelike neutral genetic drift in a finite population.5.3.2 Ecological modelWe explicitly model the effect of host-parasite coevolution on population size using a model first presentedby Frank (1993). Shown graphically in Figure 5.1B, hosts are born at a per-capita density-dependent rateb⇣1PiHi⌘where b is the intrinsic growth rate of the pathogen, Hi is the number of hosts of type i and is the “growth-limit” of the host, a quantity that is proportional to the carrying capacity. Hosts die fromnatural causes at rate . In the absence of parasitism the host reaches an equilibrium population size (carryingcapacity) of H⇤ = (b)b . We use {⇤} notation throughout to signify summation over the genotype indices.As in the constant-size model, infection of hosts of type i by pathogens of type j occurs at rate i,j as defined61above. Infections occur instantaneously in this model such that infected hosts die instantly with probability↵ (and survive with probability 1  ↵). All infections lead to the instantaneous birth of a pathogen, whilefree-living pathogens die at rate . The resulting dynamics are given by:dHidt=bHi✓1 H⇤◆ ↵HiXji,jPj  hidPjdt=PjXji,jHi  Pj(5.2)There are four types of equilibria of system (5.2): host extinction, disease-free hosts, disease persistencewith the fixation of one host and one pathogen type, and finally disease persistence with a polymorphic hostand polymorphic pathogen with allele frequencies at 1/2 (see supplementary Mathematica notebook). Hostextinction occurs whenever b < . The system reaches the disease-free equilibrium whenever b >  and > (XY )(b)2b ⌘ ⇤, defining a critical pathogen death rate, ⇤, above which the pathogen will not persist.As in the constant model, the equilibria with pathogen persistence and fixation of one host and one pathogentype are never stable, therefore when b >  and  < ⇤ the dynamics of system (5.2) is described by thestability of the following internal polymorphic equilibrium:Hˆi =X + Y= KPˆj =(b(2 +X + Y ) (X + Y ))↵(X + Y )2(5.3)The linear stability analysis gives two complex conjugate pairs of eigenvalues. The first, leading, pair ofeigenvalues is purely imaginary. Corresponding to eigenvectors in the direction of the host and pathogenallele frequencies (leaving the total population sizes constant), these leading eigenvalues predict neutrallystable allele frequency cycles. The second complex conjugate pair may be real or complex, but in eithercase, the real part is always negative when  < ⇤. Corresponding to eigenvectors altering the total host andpathogen population sizes (leaving allele frequencies constant), these eigenvalues describe the, often rapid,stabilization of host and pathogen population size.Despite the often drastic transient effects of coevolution on host and pathogen population dynamics, theneutral stability of both this ecological model and the constant population size model has been used to ar-gue that population dynamics have little effect on coevolutionary dynamics (Nuismer, 2017). Exploring themodel numerically, Frank (1993) reveals, however, that despite the neutral linear stability of system (5.2),the dynamics near the polymorphic equilibrium is characterized by stable cycles. Shown in Figure 5.1B, nu-merical analysis of this system reveals second-order stability not captured by the first-order (linear) stabilityanalysis. As a result of the second-order stability of the polymorphic equilibrium, mean host heterozygosityincreases, on average, over time in the deterministic limit, !1.In contrast to the constant-size model, the second-order stability arises from the additional feedbackbetween host and parasite dynamics that arises as a result of density-dependent population growth. This isillustrated by the contrast between the dynamics observed here relative to those of Song et al. (2015), whosemodel does not include explicit intra-specific density-dependence and does not exhibit second-order stability62(neutrally stable cycles are found around the equilibrium). The second-order stability of system (5.2) hasremained largely uncharacterised in previous explorations of similar models, as the stabilization is inherentlyslow and difficult to observe when dynamics are only examined numerically over short time scales (Nuismer,2017). In another similar ecological model, Rabajante et al. (2016) explore the effect of different types ofecological responses (e.g. form of the functional response) on the persistence of cycles involving multiplealleles. However, they do not explicitly quantify the maintenance of genetic variation.Unlike a first-order stable equilibrium, it is difficult to obtain an analytical expression for the magnitudeof this stabilizing effect. Instead we compute a numerical one by fitting the host allele frequency dynamicswith the sinusoidal function:pH(t) ⇡ aertSin r(b(2 +X + Y ) (X + Y ))X + Yt+ t0!(5.4)where a, r, and, t0 are fitted parameters, the quantity r approximating the rate of stabilization. The coeffi-cient of t in this expression is the period of the allele frequency cycles as predicted from the linear stabilityanalysis.5.3.3 Epidemiological modelTo model coevolution between a host and a directly-transmitted pathogen we use a susceptible-infected (SI)compartmental model. In this model, hosts of type h 2 {1, 2} are characterized by their infection status aseither susceptible, denoted by Sh or infected, Ih,p, where p 2 {1, 2} is the genotype of the infecting pathogen.Shown schematically in Figure 5.1C, all hosts give birth regardless of infection status at a per-capita density-dependent rate b⇣1 S⇤+I⇤,⇤⌘with all individuals susceptible at birth. Similarly, hosts die at of naturalcauses at a constant per-capita rate  regardless of their infection status.  is once again proportional to thehost carrying capacity in the absence of infection. Hosts of type h are infected with pathogens of type p atrate h,p defined as above. Once infected hosts die from virulent causes at a rate ↵. Finally, infected hostsrecover at a rate . As with the free-living death rate,  in the epidemiological model captures pathogendeath (turnover) that is independent of death (turnover) of the host. The resulting SI model is given by thefollowing system of differential equations:dShdt=b (Sh + Ih,⇤)✓1 S⇤ + I⇤,⇤◆ Sh +Xk,lh,pShIk,p + Ih,⇤dIh,pdt=h,pShXkIk,p  ( + ↵)Ih,p  Ih,p(5.5)System (5.5) has the same four types of equilibria as the ecological model; host extinction, disease-freehosts, endemic equilibria with host and pathogen fixation, and an endemic-polymorphic equilibrium with ahost and pathogen allele frequency of 1/2 (see supplementaryMathematica notebook add Appendix D.1 forthe complete equilibrium and stability analysis). The stability of each equilibrium, as well as the transientdynamics of system 5.5, is determined by the basic reproductive number, R0(), which is the number ofsecondary infections per infected host which in a coevolutionary model depends on the fraction of matching63hosts  (see appendix), where R0() is:R0() =(X+ Y (1 )) (b )b(↵+  + )(5.6)In particular, the fourth polymorphic equilibrium is determined by setting the frequency of matching hoststo 1/2 and is stable whenever R0(1/2) > 1. The stability of the endemic-polymorphic equilibrium of thisSI model reiterates the results of our previous work (Chapter 3 MacPherson and Otto, 2018), which foundthat epidemiological dynamics stabilize Red Queen cycles.For our purposes of exploring of the maintenance of genetic variation, we will focus exclusively onparameter conditions for which the endemic-polymorphic is stable, R0(1/2) > 1. The first-order stabilityof this equilibrium establishes an interesting contrast between the epidemiological model, the second-orderstability of the ecological model, and finally the neutral stability of the constant-size model. Developingpredictions for the maintenance of genetic variation in finite populations from the stability of the determin-istic models alone we should expect that each additional form of feedback between host and pathogen (i.e,coevolutionary feedback, density-dependence ecological feedback, and epidemiological feedback) would in-troduce an additional degree of stability to the polymorphic equilibrium, which in turn would help maintaingenetic variation. Before we can contrast the maintenance of genetic variation across models, however, wemust understand how the parameters of each individual model affect the dynamics of heterozygosity in finitepopulations.5.4 Stochastic DynamicsHere we explore the effect of coevolution, ecological, and epidemiological dynamics on genetic variationusing individual-based simulations. To do so we compare the expected heterozygosity under host-parasitecoevolution, Hcoev, measured as the mean heterozygosity averaged across 1000 replicate sample paths, tothe expected heterozygosity simulated under neutral genetic drift Hneut (see Appendix D.2). In contrastto the constant-size model for which there is an expression for neutral drift that allowed us to explore thedynamics of heterozygosity analytically (see Chapter 4), there is no such expression for the ecological andepidemiological birth-death processes. Given our previous results from the constant-size model, however,we expect the tendency to maintain genetic variation, measured asH = HcoevHneut, to be a function offour quantities: the total system size (K), the stability of the polymorphic equilibrium, the relative turnoverrates in the host versus the pathogen, and the probability of pathogen fixation. Summarized by the top row ofTable 5.1, for each model we then sample parameters pseudo-randomly across parameter space in a mannerdesigned to most effectively test the effect of these four factors. All simulations were run for 250 hostgenerations with the results presented in the main text focused on explaining variation inH at the last hostgeneration.When population size varies over time, the effective population size, Ne, is given by the harmonic meanof the total population size (Crow and Kimura, 1970). For the ecological and epidemiological model, theeffective population size is therefore determined by two factors, the equilibrium population size and variation64about this equilibrium due to stochasticity, density dependence, and Red Queen cycles. To separate thesetwo effects, we fix the expected equilibrium host and parasite populations sizes to a constant value, K, withthe remaining variation in effective population size due solely to variation about this equilibrium size (notethat for the epidemiological model, it is not possible to separately hold the pathogen population at K, asit equals the total number of infected hosts). Sampling 100 random parameter combinations for a givenvalue of K we expect a positive correlation between H and Ne, with the latter is estimated from thedynamics of the simulation. We then explore the effect of equilibrium population size by repeating all 100simulations for seven values ofK ranging on a log scale from 102 to 103.5. To facilitate comparisons acrossparameter space the individual-based simulations are analysed in terms of host generations where one hostgenerations is defined by the death of K individuals in the simulation. As emphasized in our exploration ofthe deterministic models, the stability of the polymorphic equilibrium differs across models but also variesacross parameter space. In particular, the constant size model is neutrally stable and hence exhibits novariability in stability across parameter space. The ecological model exhibits second-order stability withvariation in the degree of stability across parameter space approximated by the variation in the value of r(see equation (5.4)). Finally, the epidemiological model is first-order stable with variation in stability givenby the variation in the leading eigenvalue , which can be computed numerically.In addition to total host population size and stability, in our analysis of the maintenance of genetic vari-ation in the constant-size model, we identified two additional factors influencing the dynamics of heterozy-gosity in finite coevolving populations (Chapter 4). The first is the effect of coevolutionary natural selectionon the stochastic perturbations in host and pathogen allele frequency. Discussed in detail in Chapter 4, co-evolution increases genetic variation when turnover in the host population exceeds turnover in the pathogenpopulation and vice versa. The second effect of finite population size is the rapid erosion of genetic variationin the host following stochastic fixation in the pathogen. Specifically, following pathogen fixation, the MAMshifts from a model of symmetric coevolutionary natural selection to one of directional selection favouringthe “mis-matching” host of the remaining pathogen genotype. The importance of directional selection isdetermined primarily by the probability of pathogen fixation.5.4.1 Constant modelAs in the previous paper, we begin by considering the case where the host and pathogen population sizesare equal, fH = fP = 1 and  = K. The relative turnover in the host and parasite population is deter-mined by   . By drawing both  and  randomly between 0.1 and 1 we ensure that the turnover rate isnot exceedingly small and that the distribution of relative turnover rates in the host versus the pathogen issymmetric. Throughout we consider the strength of selection to be determined by the extent of matching0 < Y < X < 1 and pathogen virulence ↵, which in this case is a probability that we let vary between0 and 1. At generation 250 we use a LOESS smoothed mean fit and an accompanying 100 bootstrap fits(resampling with replacement from the 1000 sample paths per parameter set), to discern the average effect ofcoevolution on heterozygosity relative to the neutral expectation (see supplementary Mathematica file). Aspredicted by the neutral stability of the deterministic dynamics, the constant size model behaves by-and-largeneutrally (see Figure 5.2A).65A.ConstantModel●○○ ●●●●●●●●●●○○○○○○○○○○ ●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●○○○○○○○○○○○○○○○○○○○●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●○○○○○○○○○○○○○○○○○○○○○○○○○○●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●○○○○○○○○○○○○○○○○○○○○○○○○●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●○○○○○○○○○○○○○○○○●●●●●●●●●●●●●●●●●●●●○○○○○○○○○○○0.0 0.1 0.2 0.3 0.4 K100178316562100017783162Neutral HeterozygosityObservedHeterozygosity-0.5 0.0 0.5-0.2-0.100.1  H0 20 40 60 80 10000.250.50.751HProportionofreplicatesB.EcologicalModel●●●●●●●○○○○○○●●●●●●●●●●●●●●●●●●●●●●●●○○○○○○○○○○○○○○○○●● ●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●● ●●●○○○○○○○○○○○○○○○○○○○○○○○○●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●○○○○○○○○○○○○○○○●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●○○○○○○●●●●●●●●●●●●●●●●●●●●●●●●●●●●○○○●●●●●●●●●●●●●●●●●○○0.0 0.1 0.2 0.3 0.4 HeterozygosityObservedHeterozygosity-0.4 -0.2 0.0 0.2 0.4 0.6  H-8 -6 -4 -2●● ●●● ●●●●○○○○○○○○○○○○○●●●●●● ●●●●●●●●●●●●●●○○○○○○○○○○○○○○○○○○●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●● ●○○○○○○○○○○○○○○○○○○○○○○○○○○○●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●○○○○○○○○○○○○○○○○○○○○○○○○●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●○○○○○○○○○○○○○○○○●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●○○○○○○○○○○○○0.0 0.1 0.2 0.3 0.4 HeterozygosityObservedHeterozygosity0.00 0.05 0.10 0.15 0.20-0.4- []H0 20 40 60 80 1000246810HProportionofreplicatesFigure 5.2: Observed heterozygosity with host-parasite coevolution relative to the neutral expectation.Right hand panels show the observed heterozygosity versus the (simulated) neutral expectation for the A.constant-size, B. ecological, and C. epidemiological models, respectively. Black dashed line gives the neu-tral expectation, dark coloured lines gives LOESS smoothed fits to the observed data whereas light colouredlines are smoothed fit for 100 bootstrap samples. Colour of data points depicts their respective values of K(see table 5.1). Panel A: (Upper Left) Effect of relative host versus pathogen turnover on H . Slope andR2 values of linear model fits are given in table D.2. (Lower Left) Bar chart showing the relative frequencyof directional selection (red), host fixation (blue), or ongoing coevolution (gray) for the 100 parameter con-ditions for K = 102.5 arranged in order of increasing H . Panel B: (Upper Left) Effect of relative hostversus pathogen turnover on H , see table D.2. (Lower Left) Effect of the estimated second order stabiliz-ing effect, r (see equation 5.4), on H , see table D.2. Panel C: (Upper Left) Effect of the stabilizing force,the real part of the leading eigenvalue Re [], on H , see table D.2. (Lower Left) Bar chart showing therelative frequency of directional selection (red), pathogen fixation (green), host fixation (blue), or ongoingcoevolution (gray) for the 100 parameter conditions forK = 102.5 arranged in order of increasing H . Allplots are shown at generation 250.66Constant-size Ecological EpidemiologicalRandomSamplingTime scale and Population SizefH = fP = 1 equal host and path.population size = KTurnover = [0.1, 1]Generation time notinfinitesimallysmall = [0.1, 1] symmetricperturbationsSelection0 < Y <X  1↵ = [0, 1]Time scale and Population Sizeb =(X+Y )(↵Y+)X+Y2Hˆ⇤ = Pˆ⇤ = KX+Y2Turnover = [0.1, 1] comparable toconstant-size = [0.1, ⇤] stable equilibriumnote: perturbations are symmetric by inspectionSelection0 < Y < X  1↵ = [0, 1]Time Scale and Population Sizeb = 1 sets time scale  Sˆ⇤ + Iˆ⇤,⇤ = KTurnover = [0.1, 1] non-zero turnover =h0, X+Y2(X+Y )2i stable equilibrium:Ro(1/2) > 1note: perturbations are symmetric when  = 0Selection0 < Y < X  1↵ =h0, X+Y2(X+Y )2i stable equilibrium:Ro(1/2) > 1ModelComparisonTime scale and Population SizefH =Hˆ⇤fP =Pˆ ⇤ = KTurnover = ⇤ = ⇤ Hˆ⇤Pˆ ⇤equal perturbationsSelectionX = X⇤Y = Y ⇤↵ = ↵⇤↵⇤+⇤equal probability ofdeath frominfectionTime scale and Population Size =Hˆ⇤(X⇤Y ⇤)2 = KX⇤+Y ⇤2Turnover =Pˆ ⇤(X⇤Y ⇤)2equal perturbationsSelectionX = X⇤Y = Y ⇤↵ = ↵⇤↵⇤+⇤equal probability ofdeath frominfectionTime scale and Population Sizeb = 1 sets time scale  Sˆ⇤ + Iˆ⇤,⇤ = KSˆ⇤ + Iˆ⇤,⇤ ⌘ Hˆ⇤ Iˆ⇤,⇤ ⌘ Pˆ ⇤Turnover = [0.1, 1] ⌘ ⇤ non-zero turnover = 0 equal perturbationsSelection0 < Y < X  1 X = X⇤ Y = Y ⇤↵ =h0, X+Y2(X+Y )2i⌘↵⇤stable equilibrium:Ro(1/2) > 1Table5.1:Individual-basedsimulationparameterselection.67Given that host and parasite population sizes remain constant, there exists no variation in effective pop-ulation size for a given value ofK. Similarly, as the constant-size model exhibits true neutral stability, thereexists no variation in stability across parameter space, leaving only two factors to explain the variation inH , the relative turnover rates and the probability of pathogen fixation. Explored in greater analytical detailin Chapter 4, genetic variation increases as a function of   , with higher host turnover helping maintaingenetic variation (Figure 5.2A right panel). In addition to the effect of turnover, simulations that maintainmore variation (higher values ofH) tend to be those where sample paths tended not to fix for one parasitetype and instead are characterized by more ongoing coevolution.5.4.2 Ecological modelThe pseudo-random sampling of parameters in the ecological model closely resembles that of the constant-size model. In particular, we focus on the case when Hˆ⇤ = Pˆ⇤ = K. As with the constant-size modelwe draw  and  from the same distribution so that relative turnover rates in the host and pathogen are, onaverage, symmetric and not exceedingly small.We find that variation is maintained more often in the ecological model, relative to the constant-sizemodel (Figure 5.2B), although the total amount of variation maintained is only slightly greater than expectedin a neutral model with an equal host population size. The greater ability of the ecological model to maintainvariation is consistent with the behaviour of the deterministic models where, in contrast to the neutral stabilityof the constant-size model, the ecological model exhibits second-order stability.Relative to the constant-size model, there exists less variation in the amount of genetic variation main-tained (H) across parameter sets, especially when compared to the variation in genetic variation observedamong the 1000 stochastic sample paths simulated with the same parameters (see Figure D.1). This signifi-cantly reduces the power to identify correlations betweenH and factors such as the deterministic stabilityr or the relative turnover rates. Nevertheless the inferred stability of the internal equilibrium in the ecologicalmodel, r, is a significant predictor of how much variation is maintained. In addition to the effects of stability,relative turnover rates affect the extent to which genetic variation is maintained (see Figure 5.2B), as in theconstant-size model.Despite the a priori expectation that fluctuations in population size, and a corresponding decrease ineffective population size, would be a strong determinant of the maintenance of genetic variation, we in factobserve little variation in the effective population size due to the stabilizing effect of density dependenceon the host and consequently on the parasite population sizes (as captured by the strongly negative non-leading eigenvalues that involve the population sizes, see Appendix D.3). Finally, like the constant-sizemodel, pathogen fixation followed by directional selection can occur in the ecological model, although thelikelihood of this occurring is reduced due to the second order stability. When pathogen fixation did occur,it was often rapidly followed by pathogen extinction, which was not possible in the constant-size model andwhich resulted in neutral drift in the host rather than in directional selection. As a result, we observe littleeffect of directional selection onH in this model.685.4.3 Epidemiology modelThe key distinction between the ecological and epidemiological model is the free-living versus directly-transmitted nature of the pathogen. One consequence of direct-transmission is that the effective parasitepopulation size, I⇤,⇤, is restricted to being a subset of the host population, S⇤ + I⇤,⇤. Therefore, unlike theecological and constant population size model above we are unable to consider the case of equal host andpathogen population sizes. Instead we focus on the case where the equilibrium host population size is fixedatK, letting the equilibrium pathogen population size vary. A second consequence of direct-transmission isthat pathogen turnover is no longer independent of host turnover, which occurs at a rate . Specifically, therate of pathogen turnover can never be less than that of the host and will exceed that of the host if  > 0,with the special case of  = 0 and equal turnover rates explored in more detail in the next section.Relative to the previous models,H in the epidemiological model is highly variable for a given value ofK (Figure 5.2C). On average however, heterozygosity is reduced relative to the neutral expectation whenK issmall and above the neutral expectation whenK is large. This is a result of the balance between equilibriumstability and directional selection following pathogen fixation. In the absence of pathogen fixation, thedeterministic model predicts that heterozygosity should exceed the neutral expectation, which is borne outwhen K is large and when pathogen fixation is rare. As the host, and more importantly the pathogen,population size declines as K decreases, the probability of pathogen fixation followed by the loss of hostgenetic variation increases.These same processes explain variation across parameter space for a given value of K. Increasing thestability of the internal equilibrium increases genetic variation (Figure 5.2C). Similarly, cases for whichgenetic variation far exceeds the neutral expectation (large positive H) are characterized primarily bypathogen extinction whereas cases where genetic variation falls below the neutral expectation (negativeH)are dominated by pathogen fixation and sustained directional selection. As with the ecological model, theeffective population size is a weak predictor ofH as heterozygosity in both the neutral and coevolutionarymodels are impacted by the effective population size. The final potential determinant of H we consideredis the relative turnover rates in the host versus pathogen. Unlike the constant-size and ecological models wefind no significant effect of relative turnover. This is not surprising given the additional model complexity andmultiple feedbacks between host and pathogen as well as the first-order stability of the endemic equilibrium.In summary, host heterozygosity behaves by-and-large neutrally, particularly in contrast to other pro-cesses that maintain genetic variation within populations (see Figure 5.3). Nevertheless we find that geneticvariation in coevolving populations can be maintained slightly more often than expected in a neutral modelwhen host populations are large, as a result of the stabilizing processes such as density-dependent growth orepidemiological feedback. By contrast, genetic variation is often rapidly depleted when host populations aresmall, as a result of directional selection following fixation in the pathogen.69●○○ ●●●●●●●●●●○○○○○○○○○○ ●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●○○○○○○○○○○○○○○○○○○○●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●○○○○○○○○○○○○○○○○○○○○○○○○○○●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●○○○○○○○○○○○○○○○○○○○○○○○○●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●○○○○○○○○○○○○○○○○●●●●●●●●●●●●●●●●●●●●○○○○○○○○○○○0.0 0.1 0.2 0.3 0.4●●●●●●●○○○○○○●●●●●●●●●●●●●●●●●●●●●●●●○○○○○○○○○○○○○○○○●● ●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●● ●●●○○○○○○○○○○○○○○○○○○○○○○○○●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●○○○○○○○○○○○○○○○●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●○○○○○○●●●●●●●●●●●●●●●●●●●●●●●●●●●●○○○●●●●●●●●●●●●●●●●●○○0.0 0.1 0.2 0.3 0.4●● ●●● ●●●●○○○○○○○○○○○○○●●●●●● ●●●●●●●●●●●●●●○○○○○○○○○○○○○○○○○○●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○○●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●● ●○○○○○○○○○○○○○○○○○○○○○○○○○○○●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●○○○○○○○○○○○○○○○○○○○○○○○○●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●○○○○○○○○○○○○○○○○●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●○○○○○○○○○○○○0.0 0.1 0.2 0.3 0.4 Model Ecological Model Epidemiolgoical ModelNeutral ExpectationObservedHeterozygosityFigure 5.3: Maintenance of genetic variation relative to overdominance. The data in Figure 5.2 are com-pared to the expected amount of variation at generation 250 in a host-population subject to overdominancein the absence of parasites (calculated using the Moran model with exact matrix iteration, see supplementaryMathematica file). Overdominance model: WAA = 1  s, WAa = 1, Waa = 1  s where s = 0.01 lightgray, s = 0.05 gray, and s = 0.1 black.5.5 Model ComparisonIn this section we explore in more detail the effect of stability and directional selection by comparing themaintenance of genetic variation across models. Here our aim is to sample parameters in such a manner asto optimize the comparison across models (see Table 5.1). In particular,the host and pathogen populationsize are both important determinants of the maintenance of genetic variation, but noting that for directly-transmitted parasites (as in the epidemiological model) are restricted by host population size. Thus our modelcomparisons above differed in havingK constant for hosts and parasites, except in the epidemiological modelwhere the parasite population was always smaller than K. Hence we begin by first equating both the hostand parasite population sizes across the models, focusing solely on cases where the host population size isgreater than that of the pathogen.Another factor effecting the maintenance of genetic variation is the relative turnover rates in the hostand pathogen. Hence we eliminate this effect by setting the host and pathogen turnover rates equal to oneanother. Finally, whereas ↵ in the epidemiological model is the rate of host death from infection, ↵ in theecological and constant-size models is the probability of death from infection. Hence, we equate the value ofthis parameter across models accordingly such that the probability of death from infection remains constant.As we did in the previous section we then sampled randomly across this constrained parameter space runningsimulations for 100 random parameter combinations each replicated for 7 values of K. Once again each ofthese 700 simulations consists of 1000 replicate stochastic sample paths run over 250 host generations.Figure 5.4 gives an across model comparison of the smoothed mean fit (and bootstrap confidence in-tervals) for the observed heterozygosity versus the expectation under simulated neutral drift. The foremostresult is that deterministic stability is an overall good predictor of the relative maintenance of genetic vari-ation, with the constant-size model less able to maintain variation than the ecological model, which in turn70is less able to maintain variation than the epidemiological model. In addition, as we concluded previously,coevolution rarely significantly increases genetic variation above what would be expected in a neutral model,and when it does so this increase is small. Equated in this way, in fact, we only observe an increase in geneticvariation in the highly stable epidemiological model at large population sizes.The overall decrease in genetic variation in this model comparison is due to directional selection fol-lowing pathogen fixation (see Figure D.3). By constraining pathogen population size to be small relativeto that of the host (as is in the epidemiological model), the probability of pathogen fixation, and resultingdirectional selection, is increased. This is exemplified by the constant population size model for which thevast majority of simulations experience directional selection. This comparison across models demonstratestherefore the overall importance of directional selection and deterministic stability on the internal equilib-rium in determining the maintenance of genetic variation in systems exhibiting host-parasite coevolution.Model Comparison0.0 0.1 0.2 0.3 0.4 HeterozygosityObservedHeterozygosityConstant SizeEcologicalEpidemiologicalFigure 5.4: Comparison of observed heterozygosity across models. The observed heterozygosity underhost-parasite coevolution relative to the expectation under neutral genetic drift for the constant populationsize (red), ecological (green), and epidemiological (blue) models. Black dashed line shows neutral expecta-tion, dark coloured lines give the LOESS smoothed fit to the simulated data across parameters, whereas lightcoloured lines give smoothed fit to 100 bootstrap samples. Parameter conditions for simulations were drawnto optimize model comparison as given in the second row of table 5.1.715.6 DiscussionAnalysing the maintenance of genetic variation across ecological and epidemiological contexts allows usto consider if and when we are likely to observe coevolution in nature. One key finding of our models isthat, regardless of ecological and epidemiological context, coevolution, on average, behaves nearly neutrally.More specifically, summarized by the rough comparison to the maintenance of genetic variation under over-dominance (Figure 5.3), the average effect of coevolution on the maintenance of genetic variation is, at best,equivalent to very weak overdominance. Specifically, when variation was maintained at a higher rate thanneutral, the effect was equivalent to selective advantage of heterozygotes of < 1% across parameter sets, onaverage, despite the much stronger coefficients used in the coevolutionary models. Hence, like neutral drift,genetic variation and coevolution will only persist in populations on the order of 1/N generations, makingit unlikely to observe ongoing coevolution in small isolated populations.Despite this average neutral behaviour, there exists substantial variation in the maintenance of geneticvariation across models and across parameter space within each model allowing us to predict the conditionsunder which coevolution is most likely to persist. We are most likely to observe coevolution in systemswith large parasite effective population sizes, where the parasite is unlikely to fix and generate directionalselection in the host. We thus expect, in general, more variation to be maintained in both host and parasitepopulations where there are large free-living pathogen populations than with directly-transmitted diseaseswhose population sizes are limited by the host, all else being equal. For directly-transmitted infections,pathogen population size increases with transmission rate and decreases with virulence. Note however, thatreducing virulence reduces the strength of coevolutionary selection and subsequently the frequency of RedQueen allele-frequency cycles. As a result, even in populations where coevolution is likely to persist itmay be difficult to observe. To observe rapid coevolution in these systems there must be strong geneticspecificity, X  Y . For a given host and parasite effective population size, we are most likely to findongoing coevolution between hosts and their directly-transmitted infections. In contrast to the free-living orindirectly-transmitted parasites of most model systems of coevolution, this result provides an argument forwhy directly-transmitted infections may be especially informative examples of coevolution.Another key result of these models is that the outcome of host-parasite interactions depends qualitativelyon the ecological and epidemiological context in which that interaction occurs. Two coevolving systemswith the same equilibrium population sizes and allele frequencies can have fundamentally different effectson the long-term maintenance of genetic variation and the persistence of coevolution, depending on theeco-evolutionary processes that stabilize the small perturbations from that equilibrium.While the models presented here capture different host-parasite feedback, they represent only a smallset of all possible ecological and epidemiological contexts in which coevolution can occur. For examplevector-transmitted infections, such as malaria, or heteroecious infections, such as the trematode parasiteMicrophallus sp. whose life cycle requires both the infection of snails and herring waterfall (King et al.,2011). Understanding how our results extend to a broader range of life cycles would be valuable to predictingand understanding coevolution within these systems.As coevolution rarely persists over long periods of time within isolated populations, processes suchas mutation and migration likely play a predominant role in shaping patterns of coevolution and genetic72variation in nature. Consistent with the geographic mosaic theory of coevolution (Thompson, 1994, 2005),loss of variation due to genetic drift coupled with its reintroduction through migration will generate spatialvariability in the presence/absence of coevolution. Spatial patterns of host and parasite local adaptation willbe determined by the balance between coevolutionary selection, genetic drift and migration all of whichthemselves will depend on the ecological and epidemiological dynamics within individual demes. Takentogether our results reinforce that indirect negative frequency-dependent selection in the MAM is generallynot a strong force for maintaining variation in and of itself. Nevertheless, the contrast between the constant-size and ecological and epidemiological models exemplifies the importance of considering the full ecologicaland epidemiological context in which coevolution occurs.73Chapter 6Conclusion6.1 OverviewMy aim in thesis was to use population genetic models to build connections between the fields of coevolu-tionary biology and disease epidemiology. Hence, I conclude by exploring more generally how the proceed-ing chapters develop the connection between these fields and the questions that they pose. The connectionbetween coevolution and the epidemiology of infectious disease is bi-directional with coevolution havingimplications for our understanding of disease and vice versa. The growth of the “one health” concept (Cun-ningham et al., 2017; Daszak et al., 2000), a scientific paradigm exploring the shared underlying threats tohuman health and biodiversity, exemplifies the growing appreciation for the role of ecology and evolution onthe emergence and spread of infectious diseases. In the other direction, infectious diseases are widespread,often well studied, and insightful examples of host-parasite coevolution. Emphasized throughout this thesis,host-parasite interactions are extremely diverse. Spanning an enormous range of life-histories from oppor-tunistic pathogens, to obligate free-living pathogens, to vector-borne and directly-transmitted infections. Un-derstanding the role of epidemiology, as an element of host-parasite life-history, in shaping coevolutionarydynamics is essential for a general understanding of species interactions in nature.This thesis has explored the connection between epidemiology and coevolution with mathematical andstatistical population genetic models. As they are characterized by highly complex and non-linear systems,mathematical models have always played a fundamental role in both epidemiology (Keeling and Danon,2009) and (co)evolution (Servedio, 2019; Nuismer, 2017). I view the role of models in these fields as three-fold. First, they are an invaluable tool helping guide the simplification of biological complexity. Simplifyingcomplex systems through modelling allows us to explicitly characterize both the underlying commonalitiesand vast uniqueness of biological diversity. Second, mathematical models formalize verbal ones, formingconcrete descriptions that aid in the development of both quantitative and qualitative predictions. Finally,mathematical models are fundamental to the development and assessment of the methodologies used toanalyse and understand biological data. Here I will explore how the proceeding four chapters of this thesishelp establish and develop the two-way connection between coevolution and epidemiology. In particular, Iwill emphasize how they do so by simplifying biological complexity, the predictions they make, and whenand how they contribute to robust methodologies for exploring the natural world. I conclude by outliningopen questions and future directions.746.2 Context, Conclusions, and CaveatsChapter 2: Applied to epidemiological traits such as susceptibility, tolerance, or virulence, the co-GWASmethod developed in Chapter 2 emphasizes that epidemiological traits are inherently coevolutionary, arisingfrom the interaction between species. These traits should therefore not be considered as properties of ahost or parasite alone when attempting to discern the genetic basis of diseases. This chapter fits squarelyinto the extensive and growing literature on the robustness, limitations, and irreproducibility of genomescans (Marigorta et al., 2016; Sella and Barton, 2019). We demonstrate how between species gene-by-gene(GxG) interactions, like gene-by-environment (GxE) interactions (Lambrechts, 2010), can drastically alterthe results of genome scans. When not appropriately taken into account, the GxG interactions affect thereproducibility of GWAS (Marigorta et al., 2016). Despite its connection to this larger literature, a key aimof this work was to go beyond a cautionary tale by providing suggestions for pathways forward. First andforemost we suggest a complete two-species coevolutionary genome-wide association study (Co-GWAS)should be applied in the analysis of epidemiological traits. Published concurrently with our work, Wanget al. (2018) developed the statistical methodology and software for doing so. Exemplified by the analysis ofDaphnia–Pasturia interaction, in addition to performing Co-GWAS, our results exemplify the insights intogene function that can be gained through the comparison of single- and two-species genome scans.Exemplified by the interaction between Daphnia magna and Pasturia ramosa, the results of Chapter2 highlight the relationship between the genetic basis of coevolution in our mathematical models and thegenetic basis of coevolution in natural systems. As I have used it throughout this thesis, the Daphnia–Pasturia interaction is considered an example of matching-alleles coevolution in nature (Luijckx et al., 2013).Yet the complexity of the genetic basis of this interaction (Bento et al., 2017) goes beyond the single andmulti-locus models employed in theoretical coevolution. In reality, this system involves complex dominance-by-epistasis interactions, which create the “general health” locus characterized in Chapter 2. Given thesensitivity of coevolutionary dynamics on their genetic basis, the disconnect between the genetic bases ofcoevolution in mathematical versus natural systems begs the question of whether our mathematical modelsaccurately represent nature.Analysis and interpretation of biological data often occurs in the context of mathematical models. Thesemodels most likely make either overly simplistic, or even contradictory, assumptions about the genetic basisof coevolution to that of natural systems to which they are applied. For example, a matching-alleles anda gene-for-gene model have been used to interpret host allele-frequency dynamics and pathogen fitness intime-shift analyses in the experimental coevolution between C. elegans and its bacterial parasite Bacillusthuringiensis (Papkou et al., 2019). I often ponder how shoehorning coevolution observed in natural systemsinto mathematical categories limits our understanding of these systems.Chapter 3: Compartmental models, epidemiological models that categorize hosts by their disease sta-tus, exemplify the beauty of mathematical models in simplifying the complexity of natural systems. Viatheir design, the enormous diversity that characterizes infectious diseases is standardized by shared epidemi-ological form, for example into susceptible-infected, SI, or susceptible-exposed-infected-recovered, SEIR,infections. This standardization, in turn, allows us to apply knowledge to and draw inferences betweenwhat would otherwise be disparate diseases. One element of compartmental models that is known to have75important epidemiological implications is the presence and characterization of host and/or parasite hetero-geneity. For example, variability in transmission rates among hosts is known to be pervasive (Woolhouseet al., 1997), can contribute to disease spread (Hethcote and Van Ark, 1987; Diekmann et al., 1990), andmust be accounted for to effectively control infectious disease (Murphy et al., 2003). While the effect ofheterogeneities on epidemiology has long been recognized their effect on evolution (e.g. Yates et al., 2006)and coevolution remain understudied. Chapter 3 helps fill this gap by developing a compartmental modelwith both host and pathogen genetics underlying heterogeneity in transmission.The compartmental model used in Chapter 3 was designed to elucidate the two-way connection betweencoevolution and epidemiology. By using a fairly general susceptible-infected-recovered-susceptible (SIRS)model we were able to explore the interplay of coevolutionary dynamics and a range of epidemiologicaldynamics (disease extinction, endemic infection, and epidemic limit cycles). While the coevolutionary dy-namics had only a small effect on the epidemiological dynamics, slightly altering the conditions for diseasespread and for periodic outbreaks, we found that epidemiology fundamentally alters the stability of RedQueen allele frequency cycles.Despite its relative generality, the SIRS model used in Chapter 3 as well as the SI epidemiological modelused in Chapter 5, are only applicable to directly-transmitted infections. Addressing the generality of theseresults to alternative disease epidemiologies is an important future direction of this research. For exampleextensions to vector transmitted infections, such a malaria, or pathogens with complex heteroecious lifehistories, for exampleMicrophallus, the trematode parasite of the New Zealand mud snail, will be necessaryto address the role of epidemiology in these model (co)evolutionary systems.Chapter 4: This chapter exemplifies the role of mathematical models in formalizing verbal models inevolution. Long hypothesized to help maintain genetic variation in natural populations, the results of thischapter contribute to a shift in this narrative. Once considered analogous to negative frequency-dependentselection, this work explores the distinction between direct and indirect frequency-dependent selection, us-ing the terminology of Brown and Tellier (2011). As in the gene-for-gene model (Tellier and Brown, 2007),coevolution induces indirect frequency-dependent selection, favouring genotypes that are rare in the inter-acting species. In contrast to direct frequency-dependence within a species, coevolution ultimately does notmaintain genetic variation. In the matching-alleles model, in fact, pathogen fixation leads to directional se-lection in the host and a net decrease in genetic variation in these systems. The results of Chapter 4, and themore general shift in narrative about the effect of coevolution on the maintenance of genetic variation, raisequestions about the long-term persistence of coevolution in natural populations. Is long-term coevolutionin fact rare? How does spatial structure (Tellier and Brown, 2011) contribute to the maintenance of varia-tion and the persistence of coevolution in natural systems? These questions link the results of this thesis tothe broader framework of the geographic-mosaic theory and the general importance of spatial structure andmigration in coevolution (Thompson, 1994, 2005). If gene-for-gene or matching-alleles interactions do notinherently maintain variation, are there forms of genetic interactions that do?Chapter 5: This chapter provides a unifying conclusion to my thesis, placing the results of both Chapter3 and 4 into a broader context. Whereas Chapter 3 identifies the general stabilizing effect of epidemiologyon coevolutionary allele frequency dynamics, the broader implications of this stability are manifest in the76maintenance of genetic variation in the epidemiological model. Similarly, Chapter 5 demonstrates that wecan extend the results of Chapter 4 to a much broader range of coevolutionary systems even beyond the onesexplored explicitly in this thesis by considering the stability of the deterministic model. More generally,although deterministic models and associated stability analyses apply, technically, only to populations ofinfinite size, Chapter 5 demonstrates the insights that these analytical approaches can provide to drasticallymore complex systems, as well as their limitations. Due to non-linearities, it is difficult to intuit the dynamicsof models like those presented in Chapters 3,4 and 5. The results of this thesis exemplify the role thatmathematical modelling plays in understanding and predicting the dynamics of biological systems.In addition to extending and unifying many of the results of this thesis, Chapter 5 is replete with unan-swered questions. First and foremost, why does epidemiology stabilize Red Queen allele frequency cycles?Although we can conclude that the stabilizing effect of epidemiology is not due, exclusively, to density-dependent ecological feedbacks, what it is exactly about epidemiological structure that produces stabilityremains unclear. Is it, for example, the depletion of susceptible hosts? Or the time delays created by thenon-instantaneous nature of infections? Although, I have attempted to address this question both through theanalysis of the SI model presented in Chapter 5 and through its comparison with the ecological and constant-size models, I have yet to reach a completely satisfying explanation. Insights into the underlying mechanismwill likely be advanced by exploring coevolutionary dynamics in a broader range of epidemiological con-texts, for example modelling coevolution of hosts and vector-borne pathogens.6.3 Future DirectionsI conclude by outlining four broad directions for future research. First, is a critical evaluation of the ge-netic basis of host-parasite coevolution. While undoubtedly one of the most difficult challenges in the field,existing coevolutionary models have established how fundamental an understanding of the genetic basis ofcoevolution is for developing robust predictions for coevolutionary dynamics. In addition to their simplegenetic bases, the models presented in this thesis focus on coevolution of a single epidemiological trait, sus-ceptibility. The second direction for future research discussed below is to connect the models presented hereinto the broader context and literature on the evolution of epidemiological trade-offs, for example betweentransmission rates of disease virulence. The third direction for future research is to extend the above work,which has focused exclusively on single isolated populations, into a spatial context. Finally, I consider theimpact of coevolution and infectious disease in the conservation of wild populations.6.3.1 Genetic basis of coevolutionBeginning with the introduction and throughout this thesis I have emphasized the importance of the geneticbasis of coevolution. Whether gene-for-gene, matching-alleles, or polygenic, the underlying genetic basisof a host-parasite interaction has fundamental implications on the resulting coevolutionary dynamics, themaintenance of genetic variation, and even our ability to successfully identify and characterize that geneticbasis in the first place. Nevertheless, our understanding of coevolutionary genetics remains limited in nat-ural systems. While gene-for-gene interactions are thought to be a common hallmark of plant-pathogen77interactions and matching-alleles interactions the result of lock-and-key parasite-immune interactions in ver-tebrates, the generality of these “model interactions” in nature is relatively unexplored. For example howpolygenic is coevolution? Are epistasis, dominance, or epistasis-by-dominance interactions common, as inthe Daphnia–pasturia system (Bento et al., 2017)?6.3.2 Multiple coevolving traits and epidemiological trade-offsThe coevolutionary models addressed in this thesis focus on a single epidemiological trait, namely hostsusceptibility. Coevolution in natural systems is by no means limited to a single phenotypic trait. There existsan extensive theoretical literature exploring epidemiological models of (co)evolution of virulence, resistance,and tolerance (Cressler et al., 2016; Alizon et al., 2009; Carval and Ferriere, 2010). The breadth of thisliterature illustrates the many and diverse ways coevolution and epidemiology may interact. While extensiveand fascinating, to date the epidemiological trade-offs literature has centred primarily around the effectof evolution on epidemiological outcomes. By focusing on one direction of the two directional interplaybetween (co)evolution and epidemiology this literature complements that of my thesis, which has focusedinstead primarily on exploring the effect of epidemiological dynamics on coevolutionary outcomes. Buildingon the work of Best et al. (2010) additional research is necessary to fully understand the effect of multipleepidemiological traits on coevolutionary dynamics. Such explorations will shed light on the robustness andgenerality of the results presented in this thesis6.3.3 Spatial coevolutionary-epidemiologyFormalized and exemplified by the Geographic Mosaic Theory of Coevolution (Thompson, 2005), spatialstructure is recognized as playing an important role in coevolution from antagonisms to mutualisms. Brownand Tellier (2011) emphasize the role of migration and metapopulation structure on the promotion of “staticalpolymorphisms” that reduce fixation rate at coevolving loci. Specifically, in the GFG model Thrall andBurdon (2002) found that migration rate and distance were critical determinants of both the number of hostand pathogen genotypes maintained as well as the extent of genetic variation at individual resistance andvirulence loci. The near neutral behaviour of genetic variation in coevolving systems identified in Chapters4 and 5 highlights the role spatial structure and migration are likely to play in the maintenance of geneticvariation and the long-term persistence of coevolution in the MAM.Spatial heterogeneity is also a critical determinant of epidemiological dynamics. Additional researchis needed to fully understand the interplay between coevolution and epidemiology in spatially structuredpopulations. My preliminary attempt at addressing these questions, by extending the epidemiological modelpresented in Chapter 5 into a metapopulation framework, revealed, however, several challenges. For examplelocal adaptation, a widely used metric in spatial coevolutionary biology, is not easily defined in the context ofa directly-transmitted pathogen. Definitions of local adaptation based on reciprocal transplant experiments(Blanquart et al., 2013) are ambiguous when applied to directly-transmitted pathogens as the results of suchstudies will depend on the infection status of the transplanted individuals. How then can we quantify spatialadaptation in these systems? How do the metrics we use affect the inferences we make and do these metricsallow us to draw meaningful connections to spatial coevolution in free-living parasite systems?786.3.4 (Co)evolutionary (un)rescueAs a result of anthropogenic change and contact with domesticated animals, many wild populations arethreatened by the (re)emergence of infectious pathogens (Smith et al., 2006). Examples of wild speciesthat have become critically endangered or extirpated by infection include the black-footed ferret (Mustelanigripes), the Ethiopian wolf (Canis simensis), the Tasmanian Devil (Sarcophilus harrisii), and the thylacine(Thylacinus cynocephalus), a now extinct carnivorous marsupial. While this thesis addressed several ques-tions concerning host-pathogen coevolution in small populations, we have not yet considered the effect ofevolution on the dynamics of emergent infectious diseases or the role of mutation in these systems. Evo-lutionary rescue via the evolution of resistance in the host is certainly possible. Many invasive species, forexample, evolve resistance to their biocontrol agents. Despite this, infectious diseases continue to pose arecurrent threat to conservation at both the population and species level. This contrast raises the questionof what the likelihood of evolution of resistance is. How do epidemiology and host-pathogen (co)evolutionaffect the probability of population rescue. Does evolution always help host populations persist? Do coevo-lutionary responses promote or limit the success of population rescue?6.4 A Final ThoughtI would be remiss if I did not conclude this thesis by expressing what a privilege and honour it has beento explore these scientific questions. I recognize that this privilege is only possible because of the manyscientists and non-scientists alike who have, and continue to, express their wonder in nature. The most I cantherefore hope, is that this thesis has helped expand your own wonder of the natural world.79BibliographyAgrawal, A., and C. M. Lively. 2002. Infection genetics: Gene-for-gene versus matching-alleles models andall points in between. Evolutionary Ecology Research 4:79–90.Agrawal, A. F., and C. M. Lively. 2001. Parasites and the evolution of self-fertilization. Evolution 55:869–879.———. 2003. Modelling infection as a two-step process combining gene-for-gene and matching-allelegenetics. Proc. R. Soc. Lond. B 270:323–334.Alizon, S., A. Hurford, N. Mideo, and M. Van Baalen. 2009. Virulence evolution and the trade-off hypothe-sis: History, current state of affairs and the future. Journal of Evolutionary Biology 22:245–259.Allison, A. C. 1956. Population genetics of abnormal human haemoglobins. Acta Genetica Et StatisticaMedica 6:431–434.Anagnostakis, S. 2000. Revitalization of the majestic chestnut: Chestnut blight disease. APSnet FeatureArticles .Anderson, P. K., A. A. Cunningham, N. G. Patel, F. J. Morales, P. R. Epstein, and P. Daszak. 2004. Emerginginfectious diseases of plants: Pathogen pollution, climate change and agrotechnology drivers. Trends Ecol.Evol. 19:535–544.Andreasen, V., and F. B. Christiansen. 1995. Slow coevolution of a viral pathogen and its diploid host. Phil.Trans. R. Soc. B 348:341–354.Ashby, B., and S. Gupta. 2014. Parasitic castration promotes coevolutionary cycling but also imposes a coston sex. Evolution 68:2234–2244.Ashby, B., and K. C. King. 2017. Friendly foes: The evolution of host protection by a parasite. EvolutionLetters 1:211–221.Aulchenko, Y. S., S. Ripke, A. Isaacs, and C. M. van Duijn. 2007. GenABEL: An R library for genome-wideassociation analysis. Bioinformatics 23:1294–1296.Barrett, L. G., P. H. Thrall, P. N. Dodds, M. Van Der Merwe, C. C. Linde, G. J. Lawrence, and J. J. Burdon.2009. Diversity and evolution of effector loci in natural populations of the plant pathogen Melampsoralini. Mol. Biol. Evol. 26:2499–2513.80Bartholomew, J. L. 1998. Host resistance to infection by the myxosporean parasite Ceratomyxa shasta: Areview. Journal of Aquatic Animal Health 10:112–120.Bayne, C. J. 2009. Successful parasitism of vector snail Biomphalaria glabrata by the human blood fluke(trematode) Schistosoma mansoni: A 2009 assessment. Molecular and Biochemical Parasitology 165:8–18.Beck, K. 1984. Coevolution: mathematical analysis of host-parasite interactions. J. Math. Biol 19:63–77.Bell, G. 1982. The Masterpiece Of Nature: The Evolution and Genetics of Sexuality.Bento, G., J. Routtu, P. D. Fields, Y. Bourgeois, L. Du Pasquier, and D. Ebert. 2017. The genetic basisof resistance and matching-allele interactions of a host-parasite system: The Daphnia magna-Pasteuriaramosa model. PLoS Genetics 13:1–17.Bergelson, J., G. Dwyer, and J. Emerson. 2002. Models and Data on Plant-Enemy Coevolution. Annu. Rev.Genet. .Best, A., A. White, and M. Boots. 2010. Resistance is futile but tolerance can explain why parasites do notalways castrate their hosts. Evolution 64:348–357.Blanquart, F., O. Kaltz, S. L. Nuismer, and S. Gandon. 2013. A practical guide to measuring local adaptation.Ecology Letters 16:1195–1205.Bodmer, W. F. 1972. Evolutionary Significance of the HL-A System. Nature 237:139–146.Boots, M., A. White, A. Best, and R. Bowers. 2014. How specificity and epidmiology drive the coevolutionof static trait diveristy in hosts and parasites. Evolution 68:1594–1606.Borghans, J., J. Beltman, and R. De Boer. 2004. MHC polymorphism under host-pathogen coevolution.Immunogenetics .Bourgeois, Y., A. C. Roulin, K. Müller, and D. Ebert. 2017. Parasitism drives host genome evolution:Insights from the Pasteuria ramosa-Daphnia magna system. Evolution .Brodie, E. D. J., B. J. Ridenhour, and E. D. I. Brodie. 2002. The Evolutionary Response of Predatorsto Dangerous Prey : Hotspots and Coldspots in the Geographic Mosaic of Coevolution between GarterSnakes and Newts. Evolution 56:2067–2082.Brown, J. K. M., and A. Tellier. 2011. Plant-Parasite Coevolution: Bridging the Gap between Genetics andEcology. Annu. Rev. of Phytopathology 49:345–367.Carval, D., and R. Ferriere. 2010. A unified model for the coevolution of resistance, tolerance, and virulence.Evolution 64:2988–3009.Chapman, J. R., T. Hill, and R. L. Unckless. 2019. Balancing Selection Drives the Maintenance of GeneticVariation in Drosophila Antimicrobial Peptides. Genome Biol. Evol. 11:2691–2701.81Chapman, S. J., and A. V. S. Hill. 2012. Human genetic susceptibility to infectious disease. Nat. Rev. Genet.13:175.Clarke, B. 1979. The evolution of genetic diversity. Proc. R. Soc. Lond. B pages 19–40.Cressler, C. E., D. V. McLeod, C. Rozins, J. Van Den Hoogen, and T. Day. 2016. The adaptive evolution ofvirulence: A review of theoretical predictions and empirical tests. Parasitology 143:915–930.Crow, J. F., and M. Kimura. 1970. An Introduction to Population Genetics Theory. Harper and Row.Cunningham, A. A., P. Daszak, and J. L. Wood. 2017. One health, emerging infectious diseases and wildlife:Two decades of progress? Proc. R. Soc. Lond. B 372.Daszak, P., A. A. Cunningham, and A. D. Hyatt. 2000. Emerging infectious diseases of wildlife - Threats tobiodiversity and human health. Science 287:443–449.Dawkins, R., J. R. Krebs, J. Maynard Smith, and R. Holliday. 1979. Arms races between and within species.Proceedings of the Royal Society of London. Series B. Biological Sciences 205:489–511. Publisher:Royal Society.Diekmann, O., J. A. Heesterbeek, and J. A. Metz. 1990. On the definition and the computation of thebasic reproduction ratio R0 in models for infectious diseases in heterogeneous populations. Journal ofMathematical Biology 28:365–382.Dukic´, M., D. Berner, M. Roesti, C. R. Haag, and D. Ebert. 2016. A high-density genetic map revealsvariation in recombination rate across the genome of Daphnia magna. BMC Genetics 17:137.Duneau, D., P. Luijckx, F. Ben-Ami, C. Laforsch, and D. Ebert. 2011. Resolving the infection processreveals striking differences in the contribution of environment, genetics and phylogeny to host-parasiteinteractions. BMC Biology 9:11.Dybdahl, M. F., C. E. Jenkins, and S. L. Nuismer. 2014. Identifying the molecular basis of host-parasitecoevolution: Merging models and mechanisms. Am. Nat. 184:1–13.Ebert, D. 2005. Ecology, Epidemiology, and Evolution of Parasitism in Daphnia. National Library ofMedicine (US), National Center for Biotechnology Information, Bethesda (MD).Ejsmond, M., and J. Radwan. 2015. Red Queen Processes Drive Positive Selection on Major Histocompati-bility Complex (MHC) Genes. PLOS Comp. Biol. .Engelstädter, J. 2015. Host-Parasite Coevolutionary Dynamics with Generalized Success/Failure InfectionGenetics. The American Naturalist 185:E117–E129.Fine, P. E., and J. A. Clarkson. 1982. Measles in England and Wales–I: An analysis of factors underlyingseasonal patterns. Int J Epidemiol 11:5–14.82Flor, H. 1955. Host-parasite interactions in flax rust: its genetics and other implications. Phytopathology45:680–685.———. 1956. The Complementary Genic Systems in Flax and Flax Rust. Advances in Genetics 8:29–54.Flor, H. H. 1971. Current Status of the Gene-For-Gene Concept. Annu. Rev. of Phytopathology 9:275–296.Frank, S. A. 1991. Ecological and genetic models of host-pathogen coevolution. Heredity pages 73–83.———. 1993. Specificity versus detectable polymorphism in host-parasite genetics. Proc. R. Soc. Lond. B254:191–197.Fumagalli, M., M. Sironi, and U. Pozzoli. 2011. Signatures of Environmental Genetic Adaptation PinpointPathogens as the Main Selective Pressure through Human Evolution. PLoS Genetics 7:e1002355.Gandon, S., Y. Capowiez, Y. Dubois, Y. Michalakis, and I. Olivieri. 1996. Local adaptation and gene-for-gene coevolution in a metapopulation model. Proc. R. Soc. Lond. B 263:1003–1009.Gandon, S., and S. P. Otto. 2007. The Evolution of Sex and Recombination in Response to Abiotic orCoevolutionary Fluctuations in Epistasis. Genetics 175:1835–1853.Gavrilets, S. 1997. Coevolutionary chase in exploiter-victim systems with polygenic characters. J. Theor.Biol. 186:527–534.Gavrilets, S., and A. Hastings. 1998. Coevolutionary Chase in Two-species Systems with Applications toMimicry. J. Theor. Biol. 191:415–427.Gokhale, C. S., A. Papkou, A. Traulsen, and H. Schulenburg. 2013. Lotka-Volterra dynamics kills theRed Queen: population size fluctuations and associated stochasticity dramatically change host-parasitecoevolution. BMC Evol. Bio. 13.Gonçalves, S., G. Abramson, and M. F. Gomes. 2011. Oscillations in SIRS model with distributed delays.Eur. Phys. J. B 81:363–371.Grosberg, R. K., and M. W. Hart. 2000. Mate selection and the evolution of highly polymorphic self/nonselfrecognition genes. Science 289:2111–2114.Gurung, S., S. Mamidi, J. M. Bonman, M. Xiong, G. Brown-Guedira, and T. B. Adhikari. 2014. Genome-wide association study reveals novel quantitative trait loci associated with resistance to multiple leaf spotdiseases of spring wheat. PLoS ONE 9:e108179.Haldane, J. 1949. Disease and Evolution. Ricerca scient. Suppl. 19:68–76.Hamilton, W. D. 1980. Sex versus non-sex versus parasite. Oikos 35:282–290.———. 1993. Haploid Dynamic Polymorphism in a Host with Matching Parasites: Effects of Muta-tion/Subdivision, Linkage, and Patterns of Selection. Journal of Heredity 84:328–338.83Hamilton, W. D., R. Axelrod, and R. Tanese. 1990. Sexual reproduction as an adaptation to resist parasites(a review). PNAS 87:3566–73.Hethcote, H. W., and J. W. Van Ark. 1987. Epidemiological models for heterogeneous populations: propor-tionate mixing, parameter estimation, and immunization programs. Mathematical Biosciences 84:85–118.Hill, W. 1972. Effective Size of Populations with Overlapping. Theor. Popul. Biol. 289:278–289.Hodgson, E. E., and S. P. Otto. 2012. The red queen coupled with directional selection favours the evolutionof sex. J. Evol. Biol 25:797–802.Holub, E. 2001. The arms race is ancient history in Arabidopsis, the wildflower. Nat. Rev. Genet. .Honza, M., L. Ploacˇiková, and P. Procházka. 2007. Ultraviolet and green parts of the colour spectrum affectegg rejection in the song thrush (Turdus philomelos). Biological Journal of the Linnean Society 92:269–276.Irvine, B., P. L. Yap, J. Kolberg, S.-W. Chan, T.-A. Cha, P. Simmonds, E. Beall, M. S. Urdea, E. C. Holmes,and F. McOmish. 1993. Classification of hepatitis C virus into six major genotypes and a series of subtypesby phylogenetic analysis of the NS-5 region. J. Gen. Virol 74:2391–2399.Jayakar, S. D. 1970. A mathematical model for interaction of gene frequencies in a parasite and its host.Theor. Popul. Biol. 1:140–164.Johnson, N. P. A. S., and J. Mueller. 2002. Updating the accounts: global mortality of the 1918-1920"Spanish" influenza pandemic. Bulletin of the history of medicine 76:105–115.Karlin, S., and J. McGregor. 1972. Application of method of small parameters to multi-niche populationgenetic models. Theor. Popul. Biol. 3:186–209.Keeling, M. J. 2000. Metapopulation moments: coupling, stochasticity and persistence. J. Animal Ecol.69:725–736.Keeling, M. J., and L. Danon. 2009. Mathematical modelling of infectious diseases. British Medical Bulletin92:33–42.Keeling, M. J., and P. Rohani. 2008. Modeling Infectious Diseases: In Humans and Animals. PrincetonUniversity Press.Kermack, W. O., and A. G. McKendrick. 1927. Contribution to the mathematical theory of epidemics. PNAS115:700–721.Khor, C. C., and M. L. Hibberd. 2012. Host-pathogen interactions revealed by human genome-wide surveys.Trends in Genetics 28:233–243.Kilner, R. M. 2006. The evolution of egg colour and patterning in birds, vol. 81.84King, K. C., J. Jokela, and C. M. Lively. 2011. Trematode parasites infect or die in snail hosts. BiologyLetters 7:265–268.Klein, J. 1987. Origin of major histocompatibility complex polymorphism: The trans-species hypothesis.Human Immunology 19:155–162.Klein, J., and F. Figueroa. 1986. Evolution of the major histocompatibility complex. Crit. Rev. Immunol.6:295–386.Kopp, M., and S. Gavrilets. 2006. Multilocus genetics and the coevolution of quantitative traits. Evolution60:1321–36.Kouyos, R. D., M. Salathé, S. P. Otto, and S. Bonhoeffer. 2009. The role of epistasis on the evolution ofrecombination in host–parasite coevolution. Theor. Popul. Biol. 75:1–13.Lambrechts, L. 2010. Dissecting the genetic architecture of host-pathogen specificity. PLoS pathogens6:e1001019.Lawlor, D. A., F. E. Ward, P. D. Ennis, A. P. Jackson, and P. Parham. 1988. HLA-A and B polymorphismspredate the divergence of humans and chimpanzees. Nature 335:268–271.Leonard, K. J. 1977. Selection Pressures and Plant Pathogens. Ann. N. Y. Acad. Sci. 287:207–222.———. 1994. Stability of equilibria in a gene-for-gene coevolution model of host-parasite interactions,vol. 84.Levin, S. A. 1983. Coevolution. Pages 328–334 .Lively, C. M. 1987. Evidence from a New Zealand snail for the maintenance of sex by parasitism. Nature328:519–521.———. 1999. Migration, virulence, and the geographic mosaic of adaptation by parasites. Am. Nat. 153:34–47.Llaurens, V., A. Whibley, and M. Joron. 2017. Genetic architecture and balancing selection: the life anddeath of differentiated variants. Molecular Ecology 26:2430–2448.London, W. P., and J. A. Yorke. 1973. Recurrent Outbreaks ofMeasles, Chickenpox andMumps .1. Seasonal-Variation in Contact Rates. Am. J. Epidemiol. 98:453–468.Luijckx, P., H. Fienberg, D. Duneau, and D. Ebert. 2013. AMatching-Allele Model Explains Host Resistanceto Parasites. Curr. Biol. 23:1085–1088.Mackinnon, M. J., and K. Marsh. 2010. The Selection Landscape of Malaria Parasites. Science 328.MacPherson, A., and S. P. Otto. 2018. Joint coevolutionary–epidemiological models dampen Red Queencycles and alter conditions for epidemics. Theor. Popul. Biol. 122:137–148.85MacPherson, A., S. P. Otto, and S. L. Nuismer. 2018. Keeping Pace with the Red Queen : Identifying the.Genetics 208:779–789.Manolio, T. A., F. S. Collins, N. J. Cox, D. B. Goldstein, L. A. Hindorff, D. J. Hunter, M. I. McCarthy, E. M.Ramos, L. R. Cardon, A. Chakravarti, J. H. Cho, A. E. Guttmacher, A. Kong, L. Kruglyak, E. Mardis,C. N. Rotimi, M. Slatkin, D. Valle, A. S. Whittemore, M. Boehnke, A. G. Clark, E. E. Eichler, G. Gibson,J. L. Haines, T. F. C. Mackay, S. A. McCarroll, and P. M. Visscher. 2009. Finding the missing heritabilityof complex diseases. Nature 461:747–753.Marigorta, U. M., J. A. Rodriguez, and A. Navarro. 2016. GWAS replicability across time and space.Pages 53–66 inK. Appasani, ed. Genome-Wide Association Studies: From Polymorphism to PersonalizedMedicine. Cambridge University Press, Cambridge.May, R. M. 1974. Biological Populations with Nonoverlapping Generations: Stable Points, Stable Cycles,and Chaos. Science 186:645–647.May, R. M., and R. M. Anderson. 1983. Epidemiology and genetics in the coevolution of parasites and hosts.Proc. R. Soc. Lond. B 219:281–313.M’Gonigle, L. K., and S. P. Otto. 2011. Ploidy and the evolution of parasitism. Proc. R. Soc. Lond. B278:2814–2822.M’Gonigle, L. K., J. J. Shen, and S. P. Otto. 2009. Mutating away from your enemies: The evolution ofmutation rate in a host-parasite system. Theor. Popul. Biol. 75:301–311.Mitta, G., C. M. Adema, B. Gourbal, E. S. Loker, and A. Theron. 2012. Compatibility polymorphismin snail/schistosome interactions: From field to theory to molecular mechanisms. Developmental andComparative Immunology 37:1–8.Mon, Y., A.-C. Ribou, C. Cosseau, D. Duval, A. Thron, G. Mitta, and B. Gourbal. 2011. An exampleof molecular co-evolution: Reactive oxygen species (ROS) and ROS scavenger levels in Schistosomamansoni/Biomphalaria glabrata interactions. Int. J. Parasitol 41:721–730.Morand, S., S. D. Manning, and M. E. J. Woolhouse. 1996. Parasite-Host Coevolution and GeographicPatterns of Parasite Infectivity and Host Susceptibility. Proc. R. Soc. Lond. B 263.Murphy, B. M., B. H. Singer, and D. Kirschner. 2003. On treatment of tuberculosis in heterogeneous popu-lations. J. Theor. Biol. 223:391–404.Murphy, D. G., E. Sablon, J. Chamberland, E. Fournier, R. Dandavino, and C. L. Tremblay. 2015. HepatitisC virus genotype 7, a new genotype originating from central Africa. Journal of Clinical Microbiology53:967–72.Myers, J. H., and D. R. Bazely. 2003. Ecology and Control of Introduced Plants. Cambridge UniversityPress.86Nee, S. 1989. Antagonistic co-evolution and the evolution of genotypic randomization. J. Theor. Biol.140:499–518.Newport, M. J., and C. Finan. 2011. Genome-wide association studies and susceptibility to infectious dis-eases. Briefings in Functional Genomics 10:98–107.Nuismer, S. L. 2006. Parasite Local Adaptation in a Geographic Mosaic. Evolution 60:24.———. 2017. Introduction to Coevolutionary Theory. First. ed. Macmillan Learning, New York.Nuismer, S. L., M. Doebeli, and D. Browning. 2005. The Coevolutionary Dynamics of Antagonistic Inter-actions Mediated by Quantitative Trains with Evolving Variances. Evolution .Nuismer, S. L., and S. P. Otto. 2004. Host-parasite interactions and the evolution of ploidy. PNAS101:11036–11039.———. 2005. Host-parasite interactions and the evolution of gene expression. PLoS Biology 3:1283–1288.Nuismer, S. L., S. P. Otto, and F. Blanquart. 2008. When do host-parasite interactions drive the evolution ofnon-random mating? Ecol. Lett. 11:937–946.Nuismer, S. L., B. J. Ridenhour, and B. P. Oswald. 2007. Antagonistic coevolution mediated by phenotypicdifferences between quantitative traits. Evolution 61:1823–1834.Otto, S. P., and Y. Michalakis. 1998. The evolution of recombination in changing environments. Nat. Rev.Genet. 13:145–151.Otto, S. P., and S. L. Nuismer. 2004. Species interactions and the evolution of sex. Science 304:1018–20.Papkou, A., C. S. Gokhale, A. Traulsen, and H. Schulenburg. 2016. Host–parasite coevolution: why changingpopulation size matters. Zoology 119:330–338.Papkou, A., T. Guzella, W. Yang, S. Koepper, B. Pees, R. Schalkowski, M. C. Barg, P. C. Rosenstiel,H. Teotónio, and H. Schulenburg. 2019. The genomic basis of red queen dynamics during rapid reciprocalhost–pathogen coevolution. PNAS 116:923–928.Penman, B. S., B. Ashby, C. O. Buckee, and S. Gupta. 2013. Pathogen selection drives nonoverlappingassociations between HLA loci. PNAS 110:19645–19650.Rabajante, J. F., J. M. Tubay, H. Ito, T. Uehara, S. Kakishima, S. Morita, J. Yoshimura, and D. Ebert. 2016.Host-parasite Red Queen dynamics with phase-locked rare genotypes. Science Advances 2:e1501548.Publisher: American Association for the Advancement of Science Section: Research Article.Ratcliffe, F. N., K. Myers, B. V. Fennessy, and J. H. Calaby. 1952. Myxomatosis in Australia: A step towardsthe biological control of the rabbit. Nature 170:7–11.Revers, F., V. Nicaise, F. Revers, and V. Nicaise. 2014. Plant resistance to infection by viruses. In eLS. JohnWiley & Sons, Ltd, Chichester, UK.87Rowell, J. L., N. F. Dowling, W. Yu, A. Yesupriya, L. Zhang, and M. Gwinn. 2012. Trends in population-based studies of human genetics in infectious diseases. PLoS ONE 7:e25431.Salathé, M., R. D. Kouyos, and S. Bonhoeffer. 2008. The state of affairs in the kingdom of the Red Queen,vol. 23.Salathé, M., A. Scherer, and S. Bonhoeffer. 2005. Neutral drift and polymorphism in gene-for-gene systems.Ecol. Lett. 8:925–932.Sasaki, A. 2000. Host-parasite coevolution in a multilocus gene-for-gene system. Proc. R. Soc. Lond. B267:2183–2188.Schenk, H., H. Schulenburg, and A. Traulsen. 2018. Suicidal Red Queen: Population dynamics and geneticdrift accelerate diversity loss. bioRxiv page 490201.Segar, J., and W. D. Hamilton. 1988. Parasites and Sex. Pages 176–193 in The Evolution of Sex: Anexamination of current ideas.Sella, G., and N. H. Barton. 2019. Thinking About the Evolution of Complex Traits in the Era of Genome-Wide Association Studies. Annual Review of Genomics and Human Genetics 20:461–493.Servedio, M. R. 2019. An effective mutualism? The role of theoretical studies in ecology and evolution.Am. Nat pages 1–19.Smith, K. F., D. F. Sax, and K. D. Lafferty. 2006. Evidence for the Role of Infectious Disease in SpeciesExtinction and Endangerment. Conservation Biology 20:1349–1357.Song, Y. X., C. S. Gokhale, A. Papkou, H. Schulenburg, and A. Traulsen. 2015. Host-parasite coevolutionin populations of constant and variable size. BMC Evol. Bio. 15.Stahl, E. A., G. Dwyer, R. Mauricio, M. Kreitman, and J. Bergelson. 1999. Dynamics of disease resistancepolymorphism at the Rpm1 locus of Arabidopsis. Nature 400:667–671.Takahata, N., and M. Nei. 1990. Allelic Genealogy Under Overdominant and Frequency-Dependent Selec-tion and Polymorphism of Major Histocompatibility Complex Loci. Genetics pages 967–978.Taubenberger, J. K., and D. M. Morens. 2006. 1918 influenza: the mother of all pandemics. Emerg. Infect.Dis. 12:15–22.Tellier, A., and J. K. Brown. 2007. Stability of genetic polymorphism in host-parasite interactions. PNAS274:809–817.———. 2009. The influence of perenniality and seed banks on polymorphism in plant-parasite interactions.American Naturalist 174:769–779.———. 2011. Spatial heterogeneity, frequency-dependent selection and polymorphism in host-parasiteinteractions. BMC Evolutionary Biology 11.88Tellier, A., S. Moreno-Gámez, and W. Stephan. 2014. Speed of adaptation and genomic footprints of host-parasite coevolution under arms race and trench warfare dynamics. Evolution 68:2211–2224.Thomas, D. 2010. Gene-environment-wide association studies: emerging approaches. Nat. Rev. Genet.11:259–272.Thompson, J. 1994. The Coevolutionary Process.———. 2005. Geographic Mosaic of Coevolution.Thrall, P. H., and J. J. Burdon. 2002. Evolution of gene-for-gene systems in metapopulations: The effect ofspatial scale of host and pathogen dispersal. Plant Pathology 51:169–184.Unckless, R. L., B. P. Lazzaro, and R. L. Unckless. 2016. The potential for adaptive maintenance of diversityin insect antimicrobial peptides. Phil. Trans. R. Soc. B .Van der Biezen, E. A., and J. D. Jones. 1998. Plant disease-resistance proteins and the gene-for-gene concept.Trends Biochem. Sci. 0004:454–456.van Valen, L. 1973. A new evolutionary law. Evolutionary Theory .Wang, M., F. Roux, C. Bartoli, C. Huard-Chauveau, C. Meyer, H. Lee, D. Roby, M. S. McPeek, and J. Bergel-son. 2018. Two-way mixed-effects methods for joint association analysis using both host and pathogengenomes. PNAS 115:E5440–E5449.Wang, M., J. Yan, J. Zhao, W. Song, X. Zhang, Y. Xiao, and Y. Zheng. 2012. Genome-wide associationstudy (GWAS) of resistance to head smut in maize. Plant Science 196:125–131.Weatherall, D. J., and J. B. Clegg. 2002. Genetic variability in response to infection: malaria and after. Genesand Immunity 3:331–337.Whitlock, M. C., and N. H. Barton. 1997. The effective size of subdivided population. Genetics 146:427–441.Winham, S. J., and J. M. Biernacka. 2013. Gene-environment interactions in genome-wide associationstudies: current approaches and new directions. Journal of Child Psychology and Psychiatry 54:1120–1134.Woolhouse, M., C. Dye, J. Etard, T. Smith, J. Charlwood, G. Garnett, P. Hagan, J. Hii, P. Ndhlovu, R. Quin-nell, C. Watts, S. Chandiwana, and R. Anderson. 1997. Heterogeneities in the transmission of infectiousagents. PNAS 94:338–342.Woolhouse, M. E. J., J. P. Webster, E. Domingo, B. Charlesworth, and B. R. Levin. 2002. Biological andbiomedical implications of the co-evolution of pathogens and their hosts. Nature Genetics 32:569–577.Yates, A., R. Antia, and R. R. Regoes. 2006. How do pathogen evolution and host heterogeneity interact indisease emergence? PNAS 273:3075–3083.89Zhao, L., and D. Waxman. 2016. The influence of genetic drift on the formation and stability of polymor-phisms arising from negative frequency-dependent selection. J. Theor. Biol. 391:51–64.Zila, C. T., L. F. Samayoa, R. Santiago, A. Butrón, and J. B. Holland. 2013. A genome-wide associationstudy reveals genes associated with fusarium ear rot resistance in a maize core diversity panel. G3: Genes,Genomes, Genetics 3:2095–2104.Zimmer, C., and D. J. Emlen. 2013. Evolution: Making sense of life. 2nd ed. W.H. Freeman & Company.90Appendices91Appendix AAppendix for Chapter 2A.1 Supplementary FiguresHost-only model Host-pathogen modelAllelic EffectsVariation ExplainedPathogen Allele Frequency!"# = !"%Figure A.1: Matching-alleles model.Row 1: Estimated allelic effects as a function of pathogen allele fre-quency (black: 0, solid red: Hi, dashed red: H12, blue Pi, blue dashed: P12, purple dashed: HPij).Row 2: Variation explained by host additive effects only (solid red), host additive and epistatic effects (dashedred), host and pathogen additive and epistatic effects (dashed blue, overlaps with dashed red), and a full hostpathogen model as given in equation (2.8) (dashed purple).92cMSignificance−"#$(&)Figure A.2: GWAS of Daphnia magna susceptibility.Genetic associations of each SNP with C1 (red cir-cles), C19 (blue, squares), and “mixed” susceptibility (green, triangles). Here, the mixed case was obtainedby taking each host clone and randomly choosing the data upon exposre to C1 or C19 (not both, as incor-porated in the overall analysis in Figure 2.5). Hence each SNP is represented three times, once for eachgenome wide scan. Note that closely linked SNPs often overlap with one another and are not all individuallyvisible. Significant SNPs are shown in color while those below the log-likelihood of 2 threshold or that arenot clustered within a 10cM region of three other significant SNPs are shown in gray. The 10 linkage groupsare delineated by vertical dashed lines.93Appendix BAppendix for Chapter 3We begin by setting the equations in (3.5) to zero to give equilibrium conditions of the SIRS model withcoevolution. Denoting the value of the variables at equilibrium with hats we can derive expressions for eachrecovered class and each infected class in terms of only the first infected class iˆ(h, p, 1).iˆ(h, p, c) =✓nI⌧I + nI◆c1iˆ(h, p, 1) 8h, p, crˆ(h, c) =✓nRd⌧R + nR◆c1✓ nI⌧R⌧I(d⌧R + nR)◆✓nI⌧I + nI◆nI1Xpiˆ(h, p, 1) 8h, c(B.1)Given these expression we can reduce the equilibrium conditions of system (3.5) to the following 6 equations:0 =b sˆ(h) + (zI + zR)Xp(ˆi(h, p))! 1 Xk sˆ(k) + (zI + zR)Xp(ˆi(k, p))!! dsˆ(h) + nR⌧RznRXp(ˆi(h, p)) sˆ(h)zIXk,p⇣k,piˆ(k, p, 1)⌘8h0 =sˆ(h)zIh,pXk(ˆi(k, p, 1))✓ +nI⌧I◆iˆ(h, p, 1) 8h, p(B.2)wherezI =(nI + ⌧I)⇣1⇣nI⌧I+nI⌘nI⌘⌧IzR =nI⇣nI⌧I+nI⌘nI1 ⇣1⇣nRd⌧R+nR⌘nR⌘d⌧IznR =nI⌧R⇣nI⌧I+nI⌘nI1 ⇣ nRd⌧R+nR⌘nRnR⌧I.(B.3)As shown in the supplementaryMathematica file equations (B.2) can be satisfied if there is disease extinction(no infections present), extinction of one host and one pathogen type (a single combination of host andparasite present), extinction of one pathogen type (two host-parasite combinations present), or all typespresent (all four host-parasite combinations present). Here we will focus on this latter equilibrium. Althoughit was not possible to solve equations (B.2) in general we can do so if we make a number of assumptions.First we assume the infection matrix is symmetric, with equal transmission rates between matching hostsand parasites (and between non-matching hosts and parasites): 1,1 = 2,2 = X,1,2 = 2,1 = Y . Wethen inquire whether there is an equilibrium (B.4), where the density of susceptible hosts is given by Sˆ, the94density of infected hosts by Iˆ , that the host allele frequency is pH , and that the probability a host is infectedby the matching parasite type is pX .sˆ(1) =pH Sˆ, sˆ(2) = (1 p)H Sˆiˆ(1, 1, 1) =pXpH Iˆ , iˆ(2, 2, 1) = pX(1 pH)Iˆiˆ(1, 2, 1) =(1 pX)pH Iˆ , iˆ(2, 1, 1) = (1 pX)(1 pH)Iˆ(B.4)Constrained to this form there are only two equilibria of (B.2), which are the two roots of a quadratic in Iˆand satisfy:Sˆ =2(nI + ⌧I)(X + Y )zI⌧IpX =XX + YpH =12(B.5)Numerically we confirmed (see supplementary Mathematica notebook) that only a single one of these tworoots gives a biologically valid equilibrium. In addition, numerical analysis of cases where the infectionmatrix, , is neither symmetric nor the equilibrium constrained to the form in equations (B.4) confirmedthat all biologically valid equilibria of system (3.5) have host and pathogen allele frequencies of 12 . Finally,equilibrium (B.5) can be reduced to those given by Gonçalves et al. (2011) (equations 6 or 11) if there isonly one host/pathogen type and birth and death are negligible.95Appendix CAppendix for Chapter 4C.1 The Ensemble Moment ApproximationC.1.1 General approachWe consider the general stochastic process in Z+n. Specifically, for the MAM model presented here n = 2,one dimension for the number of hosts of type 1 and pathogens of type 1, respectively. Let ~z(t) denote thestate of the process at time t, for our case ~z = {H1, P1}. We begin by transforming the variables into theirdeviation from the deterministic equilibrium. Specifically if zˆi is the value of variable i at equilibrium, thenthe transformed variables are given by ~x = ~z  ~ˆz. In addition, we only consider the case where the initialcondition of the stochastic processes is the deterministic equilibrium, ~x = ~0. This transformation in variableswill then allow us to easily express the moments of the stochastic process in terms of Taylor series aroundthe deterministic equilibrium. In terms of the transformed variables, the stochastic process is described bymdistinct events that occur at rates fe(~x) where e = {1, 2, ...,m}. Finally, we denote the ensemble average,the average across all sample paths, by h⇤i.The aim of the ensemble moment approximation is to derive a system of ODEs giving the change in themoments of the stochastic process. Beginning with the first moment (mean) we have:dhxiidt=hmXe=1(event rates) (change in xi from event)i=hmXe=1fe (~x) ((xi +e,i) xi)i(S1)where e,i is the change in xi from event e (Keeling, 2000). To take the ensemble average of the sumwe express the argument as a polynomial by approximating it with a Taylor series around the deterministicequilibrium assuming xi 8 i is small. Equation (S1) is then approximated as:dhxiidt⇡ hmXjajxj +m,mXj,kbj,kxjxk +m,m,mXj,k,lcj,k,lxjxkxl + ...i (S2)where aj , bj,k, and cj,k,l are constants, which are themselves functions of the deterministic equilibrium ~ˆz .Rearranging we have:dhxiidt⇡mXjajhxji+m,mXj,kbj,khxjxki+m,m,mXj,k,lcj,k,lhxjxkxli+ ... (S3)96The result is an expression for the change in the first moment as a linear function of the first, second, third,and higher moments.We use the same technique to derive the second moment ODEs:dhxixjidt=hXefe(~x) ((xi +e,i) (xj +e,j) xixj)i=hXefe(~x) (xie,j + xje,i +e,ie,j)i(S4)Approximating the right-hand argument as a sum once again by a Taylor series expansion we have:dhxixjidt⇡mXjAjhxji+m,mXj,kBj,khxjxki+m,m,mXj,k,lCj,k,lhxjxkxli+ ... (S5)Similarly for the third moments:dhxixjxkidt⇡mXjAjhxji+m,mXj,kBj,khxjxki+m,m,mXj,k,lCj,k,lhxjxkxli+ ... (S6)where A,B,C,A,B, and C are all constants that depend on the deterministic equilibrium.C.1.2 Dynamics of host heterozygosityWe are interested in the dynamics of host heterozygosity and calculate the ensemble moment approximationfor the dynamics of heterozygosity, H˜:dH˜dt=ddt(E[2pH(1 pH)]) = 2 ddtE[pH  p2H ]=2✓ddthH1i ddthH21 i2◆ (S7)The dynamics of heterozygosity thus depends directly on that of the first two moments. If we substi-tute in equation (S3) and (S5) and assume that population size is large  = 1/✏, then approximated tomathcalO(✏2) the dynamics of host heterozygosity simplify to equation (4.4) given in the main text.C.2 Maintenance of Genetic Variation in Discrete TimeC.2.1 Model specificationWe model host-parasite coevolution in discrete time with a Wright-Fisher model with a constant host andpathogen population size . In each generation, hosts come into contact with a single random pathogen. Ifhost and pathogen have the same genotype i, the probability of successful infection is given by i,i = X ,whereas hosts of type i are infected by pathogens of mis-matching genotype j, with probability i,j = Y <X . In the absence of infections host’s and pathogen’s have a fitness 1, whereas infection decreases host97fitness by a factor ↵H and increases pathogen fitness by ↵P such that the expected fitness of host genotype iand pathogen genotype j are given by:WH(i, t) = 1 ↵HXji,jPjWP (j, t) = 1 + ↵PXij,iHi(S8)C.2.2 Deterministic dynamicsAs we do for the continuous-time model, we begin by developing an understanding of the effect of coevo-lution on genetic variation in a finite population by analysing the dynamics of heterozygosity in the limit asthe population size goes to infinity. In this deterministic case the coevolutionary dynamics are given by thefollowing difference equations:pH(t+ 1) =pH +↵HpH(1 pH)(1 2pP )(X  Y )↵H(pH(1 2pP ) + pP )(X  Y ) ↵HX + 1pP (t+ 1) =pP  ↵P pP (1 pP )(1 2pH)(X  Y )↵P (pH(1 2pP ) + pP )(X  Y ) + ↵PX + 1(S9)There are five equilibria of this model, four of these specify the fixation/loss of one host and one pathogentype, and a fifth internal equilibrium occurs at pH = pP = 1/2. Unlike the continuous time model exploredin the main text this internal equilibrium is unstable and cyclic. The dynamics of the allele frequency andheterozygosity are shown in Figure 4.1 panels A and B, respectively. The rate at which the amplitude of thecycles grow is given by the magnitude of the leading eigenvalue, which is given by kk, a result consistentwith those of (M’Gonigle et al., 2009).kk =s1 +↵H↵P (X  Y )2(2 (X + Y )↵H) (2 + (X + Y )↵P ) (S10)Due to the instability of the internal equilibrium host heterozygosity decays over time. Furthermore thelarger the value of kk the faster the heterozygosity should decay.C.2.3 Simulations of a finite populationIn analogy to the continuous-time model presented in the main-text where the death of the host and birth ofthe pathogen from infection are coupled, for our simulations in a finite population we assume ↵H = ↵P = ↵.As with the continuous-time model we simulate coevolution for 100 random parameter combinations ofX,Y, and,↵ (0 < ↵ < 0.5, 0 < Y < X < 1) and 7 values of  (ranging on a log scale between 102and 105) for a total of 700 simulations. In contrast to the continuous-time model natural host death andfree-living pathogen birth are incorporated directly in the Wright-Fisher model. For each of the simulations,sample paths were generated for 1000 independent replicate populations. All simulations are initialized atthe internal polymorphic equilibrium and run for t = 500 generations.98In the absence of coevolution, the dynamics of host heterozygosity are given by the neutral expectationfor the Wright-Fisher model:Hneut = H0✓1 1◆t(S11)The difference between the simulated heterozygosity and the neutral expectation at time t = 500 is shown inFigure C.1A. When population sizes is large ( = 105 upper right hand corner) the deviation in heterozygos-ity between the coevolutionary simulations and neutral expectation, H = Hcoev  Hneut, is statisticallynegative as expected due to the erosion of genetic variation by coevolution. The magnitude of this deviationis however small, so while coevolution does indeed erode genetic variation the process is slow with onlysmall effects over the parameter and time scale sampled.As with the continuous-time model presented in the main text, directional selection (see Figure C.1C)following fixation in the pathogen has a much larger effect than that of coevolutionary selection. When thepopulation size is small (e.g.  = 102), drift dominates over the effects of selection, and the heterozygositybehaves nearly neutrally. As the population size increases, the effect of this directional selection emerges.As predicted by the deterministic model variation in H for a given population size is explained largelyby variation in kk (see Figure C.1B). This is not only because kk determines the effect of coevolutionbut because it is highly correlated with the probability of pathogen fixation and the strength of directionalselection following pathogen fixation.C.3 Seed DormancyHere we extend the above discrete-time model to include the effects of seed dormancy, or any other processthat causes a subset, s, of the host population to avoid parasite induced selection, making the host lessresponsive during coevolution. We analyse the effect of the resulting temporal asynchrony between host andparasite on the stability of the polymorphic equilibrium. Given a life cycle shown if Figure C.2 we modelevolution by following the allele frequencies among the host seedlings pH , host juvenile seeds pJ , and withinthe parasite pP . The dynamics are given by:pH(t+ 1) =spJ(t) +(1 s)pH(t)(1 ↵HpP (t)(X  Y ) ↵HY )↵HpH(t)(1 2pP (t))(X  Y ) + ↵HpP (t)(X  Y ) ↵HX + 1pJ(t+ 1) =(1 s)pJ(t) + spH(t)(1 ↵HpP (t)(X  Y ) ↵HY )↵HpH(t)(1 2pP (t))(X  Y ) + ↵HpP (t)(X  Y ) ↵HX + 1pP (t) =pP (t)(↵P pH(t)(X  Y ) + ↵PY + 1)1 ↵P pH(t)(1 2pP (t))(X  Y ) ↵P pP (t)(X  Y ) + ↵PX(S12)There are five equilibria of system S12, four of which are trivial fixation/loss of the host and pathogen.The fifth internal equilibrium is at pˆH , pˆJ , pˆP = 12 . The Jacobian matrix at this internal equilibrium is givenby:J =26641 s s s(XY )↵H↵H(X+Y )2s 1 s (1s)(XY )↵H↵H(X+Y )20 s(XY )↵P↵P (X+Y )2 13775 (S13)99Focusing on the diagonal elements, we note that dpH(t+1)dpH(t) < 1 anddpJ (t+1)dpJ (t)< 1. As noted by Tellierand Brown (2009) this is indicative of direct negative frequency-dependent selection. We can evaluate thestability of system S12 by using the Routh-Hurwitz conditions on the transformed characteristic polynomialof S13. Specifically, although the Routh-Hurwitz conditions can only be applied to identify the stabilitycriteria in continuous time, if we transform the characteristic polynomial, p(), into a third order polynomialin z such that:(z  1)3p✓z + 1z  1◆= a0 + a1z + a2z2 + a3z3 (S14)then the equilibrium is stable (all eigenvalues have a magnitude less than 1) if and only if ai < 0 8i anda2a1  a3a0 > 0 assuming ↵H = ↵P = ↵. These criteria are satisfied in our model for a range of s,indicating the the internal equilibrium is stable under these condition:↵2(3X  Y )(X  3Y ) + 4B↵2XY  1 < s <↵2(3X  Y )(X  3Y ) + 4 +B↵2XY  1B =p(↵(X + Y ) + 2)(2 ↵(X + Y )) (↵2 (9X2  14XY + 9Y 2) 4)(S15)where B must be real.100C.4 Supplementary Figures○○○○○○○○○○○○○○○○○●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●0.0 0.1 0.2 0.3 0.4κ100316100031621000031623100000Neutral ExpectationMoran ModelWF Model1.000 1.001 1.002 1.003 1.004 1.005 1.006-0.5-0.4-0.3-0.2- 100 200 300 400 5000. BCHneutHcoevHcoevHneutpHgenerationsFigure C.1: Comparison between discrete- and continuous-time models. Panel A: Mean heterozygosityin a coevolving population (at t = 250 generations) versus the neutral expectation for the Moran model incontinuous time (red as in- see Figure 4.4) versus the Wright-Fisher model in discrete time (green), lightgreen lines give smoothed fits to 1000 bootstrap samples. Points depict heterozygosity for each of the 420parameter conditions simulated for the WF model averaged over the 1000 replicate populations. Panel B:Deviation in heterozygosity from the neutral expectation in the WF as a function of the magnitude of theleading eigenvalue. R2 values are given for the non-linear model fit of a+eb(x1). Panel C: Example allelefrequency dynamics in the WF model. Allele frequency trajectories are coloured gray when there is ongoingcoevolution, red when there is directional selection, and blue when the host population is fixed. Parameters:X = 0.9, Y = 0.26,↵ = 0.047, = 102.5 = 316101CENSUSpH , pPGERMINATIONINFECTIONpH = (1  ↵H)Pj i,j pHi pPjpP = (1 + ↵P )Pi i,j pHi pPj1REPRODUCTIONss(1  s)pJ(t)0.0 0.1 0.2 0.3 0.4↵H = ↵PsStabilityFigure C.2: Life-cycle diagram and numerical stability analysis of MAM with seed dormancy. Hostand pathogen life-cycle is shown in blue, flux in and out of seed bank is shown by green arrows. Shadedregion of right-hand plot gives region for which the polymorphic equilibrium is stable assuming ↵H = ↵Pand given X = 1, Y = 0.102Appendix DAppendix for Chapter 5D.1 Stability and Transient Dynamics of the Epidemiological ModelShown schematically in Figure D.1, and derived in the supplementary Mathematica notebook, R0() deter-mines the transient dynamics of system (5.5), given the frequency of matching-hosts, . As noted in the text,R)(1/2) determines the stability of the stability of the endemic equilibrium (stable when R)(1/2) > 1). Inaddition R0(0) and R0(1) determine the ability of the parasite to spread where no hosts match or all hostsmatch, respectively, as determined by the local stability analysis (see supplementary Mathematica file).When R0(1) < 1 (dark pink area), the pathogen is never able to invade even in a population consistingcompletely of matching susceptible hosts. When R0(1) > 1 but R0(1/2) < 1, at least one of the twopathogen types will be able to spread initially. This spread will, however, result in host evolution, leading toan increase in the number of resistant hosts and eventual extinction of the disease. This is in contrast to whenR0(1) and R0(1/2) are both greater than 1 but R0(0) < 1 (light green area) where, while only the matchingpathogen type will be able to spread initially, but resulting change in host allele frequency eventually allowsthe invasion of the second pathogen type resulting in a stable endemic equilibrium. Finally, whenR0() > 1for all phi both pathogen types are able to invade initially, and the endemic equilibrium is stable regardlessof the coevolutionary dynamics. While for the majority of parameter space the dynamics of system (5.5) arecharacterized by cyclic dynamics, when transmission rates are high and virulence low, the cyclic dynamicsdissipate resulting in a rapid and smooth approach to the endemic equilibrium. Given this behaviour is rarewith parameters we consider only parameters with cyclic dynamics.103Not Valid:X<Y Diseasealways spreadsCoevolution facilitateddisease spreadCoevolutiondrivendisease freeAlwaysdisease freeNo CyclesRo(0)=1Ro(1/2)=1Ro(1)=1Im[λmax]=0MatchingTransmissionRate,XMis-matchingTransmissionRate,YFigure D.1: Epidemiological model: deterministic model stability and the role of R0(). Transient andequilibrium dynamics of the deterministic epidemiological model (system 5.5) as a function of R0() andthe relative matching (X) and mis-matching (Y ) transmission rates. Considering only areas where X > Y(i.e. excluding black area), the pathogen is either lost (pink areas) or maintained (green areas) at equilibrium.Dark pink area: R0() < 1 for all , disease never spreads. Light pink area: R0() < 1 except when is near 1, disease never maintained. Light green area: R0() > 1 for all  greater than 1/2, polymorphicendemic sustained by coevolution. Dark green area: R0() > 1 for all , both pathogen types spread whenrare.D.2 Neutral Genetic Drift in the Individual-Based SimulationsWhile the single-species Moran model is a good approximation to neutral genetic drift in the ecological andepidemiological model, it is not in fact the exact true neutral expectation for the birth-death processes con-sidered here because of the changes in host population size. For the ecological and epidemiological modelsconsidered the Moran model provides a good approximation as the dynamics of host population size werenear constant (only slightly changing the geometric mean size, see Appendix D.3). Nevertheless to eliminate104any biases, rather than using a comparison between the observed heterozygosity and the expectation from theneutral Moran model, unless otherwise noted, we compare the heterozygosity in coevolutionary simulationsto the expected heterozygosity from an equivalent neutral simulation. Specifically, for every coevolutionarysimulation we ran a complementary neutral simulation with the same sequence of birth and death events ex-cept where the genotype of the individual who is born or died was drawn at random, allowing the dynamicsof the population to change accordingly. The one exception to this is the few rare cases where hosts wentextinct within a replicate, making it impossible to measure host heterozygosity. In order to eliminate possiblebias from excluding these replicates, in the few cases where host extinction occurs prior to generation 250we set the coevolutionary heterozygosity to 0 and the neutral expectation to that expected under the Moranmodel.In contrast to the model presented in Chapter 4 where time was normalized with respect to host gener-ations, the deterministic models above are specified in terms of absolute time. However to facilitate com-parisons across parameter space the individual-based simulations are analysed in terms of host generationswhere one host generations is defined by the death of K individuals in the simulation. Analytical normal-ization of the constant-size model in the Chapter 4 was required for us to compare the ensemble momentdynamics analytically to the expectation under the one-species neutral Moran model. While analogous en-semble moment dynamics can be obtained for the ecological and epidemiological model, there exists noconcise analytical neutral expectation with which to compare them.D.3 Effect of Ne in the Ecological ModelFluctuations and stochastic variation in population size in the ecological model is expected to decreaseeffective population size Ne, which, as mentioned above, is the harmonic mean population size over time(Crow and Kimura, 1970). These decreases in effective population size should increase the rate of loss ofgenetic variation via genetic drift and result in a reduction in the maintenance of genetic variation relative tothe expectation in the neutral Moran model. Due to the absence of an exact analytical neutral expectationfor the ecological and epidemiological models, in the main text we focused on the comparison between theobserved heterozygosity under coevolution and the simulated neutral expectation, H = Hcoev  Hneut.As the simulated neutral expectation is also subject to population fluctuations and stochasticity, we need toinstead compare the observed heterozygosity under coevolution to the expectation under the constant-sizeMoran model HMoran = Hcoev HMoran.HMoran =12✓1 2Hˆ⇤◆g(S1)where g is the host generation. We find, however, only a weak relationship between the relative effectivepopulation size N˜e = NeHˆ⇤Hˆ⇤and HMoran as shown in the supplementary Mathematica notebook. Thisis likely the result of low variation in Ne (shown in Figure D.2). As addressed in detail in the sectionon the deterministic ecological model the dynamics of population-size are determined by a pair of non-leading eigenvalues (see supplementaryMathematica notebook), which typically had substantially negative105real parts, leading to rapid stabilization of the total population size.100 316 1000 3162100316100031620.0890.0660.0340.0120.0040.0020.001equilibrium population size, Hˆeectivepopulationsize,NeFigure D.2: Variation in effective population size. Effective population size, Ne averaged across the 1000replicate demes for each of the 100 parameter combinations, as a function of equilibrium population sizeHˆ⇤. Numbers above each value of Hˆ⇤ give the coefficient of variation.D.4 Supplementary Figures and TablesLog(K) Constant-size Ecological EpidemiologicaW B Ratio W B Ratio W B Ratio2 0.0003 0.001 4.14 0.0011 0.003 2.27 0.0013 0.007 5.742.25 0.0013 0.011 8.84 0.0034 0.019 5.62 0.003 0.04 13.212.5 0.0029 0.04 13.57 0.0057 0.043 7.5 0.0047 0.08 17.052.75 0.0046 0.079 16.97 0.0059 0.048 8.11 0.0053 0.111 21.023 0.0054 0.103 19.06 0.0043 0.032 7.41 0.0047 0.123 26.433.25 0.0049 0.098 19.85 0.0027 0.017 6.47 0.0035 0.116 33.223.5 0.0038 0.073 19.52 0.0017 0.009 5.6 0.0024 0.091 37.29Table D.1: Variation in H across parameter space versus among replicates. Table: Column 1: Theaverage variation in the observed heterozygosity Hcoev among the 1000 replicates within (W) each of the100 simulations. Column 2: The variation between (B) the average observed heterozygosity for the 100simulations across parameter sets. Column 3: The ratio of the variation within versus between simulations.106Constant-size Ecological Epidemiological      ln (r) Re []R2 Slope CI R2 Slope CI R2 Slope CI R2 Slope CI0.12 0.002 0.001 0.25 0.005 0.002 0 0 0 0.06 -0.048 0.0390.24 0.016 0.006 0.14 0.025 0.012 0.03 0.002 0.002 0.1 0.346 0.2140.45 0.068 0.015 0.1 0.044 0.027 0.06 0.005 0.004 0.35 1.357 0.370.58 0.127 0.022 0.06 0.038 0.03 0.12 0.008 0.004 0.37 1.893 0.4990.69 0.131 0.018 0.04 0.019 0.02 0.21 0.007 0.003 0.28 1.795 0.580.7 0.086 0.011 0.0 -0.003 0.011 0.23 0.004 0.001 0.17 1.304 0.5690.72 0.052 0.006 0.01 -0.003 0.006 0.12 0.002 0.001 0.13 0.879 0.453Table D.2: Linear model fits shown in Figure 5.2. Fitted slope, associated confidence intervals, and R2values for linear model fits shown in Figure 5.2. Slopes that differed significantly from 0 (95% confidence)are given in bold.1070 20 40 60 80 1000. 20 40 60 80 1000. 20 40 60 80 10002468100 20 40 60 80 10002468100 20 40 60 80 10002468100 20 40 60 80 1000246810⇤ = 102.5 ⇤ = 103SimulationsEpidemiologicalEcologicalConstantFigure D.3: Effect of directional selection across models. Bar charts showing the frequency of directionalselection following pathogen fixation (red), pathogen extinction (green), host fixation (blue), and ongoingcoevolution (gray) for each model where parameters are drawn to optimize model comparison (see Table5.1). In all panels parameter sets are arranged in order of increasing H , with each vertical bar giving theproportion of the 1000 replicates exhibiting each type of behaviour at generation 250.108


Citation Scheme:


Citations by CSL (citeproc-js)

Usage Statistics



Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            async >
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:


Related Items