UBC Theses and Dissertations

Historicity in biology Cyr Desjardins, Eric 2009


HISTORICITY IN BIOLOGY

by

ERIC CYR DESJARDINS

B.Sc., Université du Québec à Montréal, 2000
M.Arts., Université du Québec à Montréal, 2004

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES (Philosophy)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

October 2009

© Eric Cyr Desjardins, 2009

Abstract

Biology is often characterized as an historical science. Why and to what extent does history matter in biology? Overall, this analysis shows that biology is deeply historical. Using examples from various fields (evolution, ecology and development), I develop a theory of biological historicity starting from the idea that the most fundamental property of historical processes is their capacity to retain information from the past, which ultimately depends on the existence of divergent outcomes and a causal dependence between a process’s outcome and its past. Yet the conditions generating divergent outcomes and information-preserving processes are numerous and can be illustrated in various ways. These are developed and clarified through the notions of historical contingency, path dependence, irreversibility, and generative entrenchment.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgements
Dedication
Chapter 1: Introduction
  1.1 Posing the Problem
  1.2 The Adaptationist Program
  1.3 The Eclipse of History in Ecology
  1.4 Overview
Chapter 2: Information-destroying and Information-preserving Processes
  2.1 Reconstructing the Past
  2.2 Deterministic and Stochastic Frameworks
  2.3 Concluding Remarks
Chapter 3: Historical Contingency I: Gould, Lewontin and the Long-Term Evolutionary Experiment
  3.1 Gould on Contingency
  3.2 Historical Contingency in the Long-Term Evolutionary Experiment
  3.3 Lewontin and the Role of Environmental History
  3.4 Concluding Remarks
Chapter 4: Historical Contingency II: Community Ecology
  4.1 Historical Contingency in the Process of Community Assembly
    4.1.1 The Priority Effect, Keystone Species and Predictability
    4.1.2 On Other Regional and Local Factors Affecting the Degree of Convergence/Divergence of the Community Assembly
  4.2 The Role of Other Histories: Geological and Evolutionary
    4.2.1 Why Evolutionary History Matters
    4.2.2 Why Geological History Matters
  4.3 Concluding Remarks
Chapter 5: Path Dependence
  5.1 Szathmáry on Path Dependence
  5.2 Some Problematic Aspects
Chapter 6: Path Dependence Revisited, Revised and Applied
  6.1 Lessons From the Social Sciences
  6.2 Redefining Path Dependence: Towards a General Theory of Stochastic Information Preserving Processes
    6.2.1 Definitions
    6.2.2 Applications
  6.3 Concluding Remarks
Chapter 7: Generative Entrenchment, Historicity and Irreversibility
  7.1 Phylogenetic Constraints and Historicity in Evolution and Ecology
  7.2 A Role For Generative Entrenchment
    7.2.1 Generative Entrenchment
    7.2.2 Generative Entrenchment and Developmental Historicity
    7.2.3 Generative Entrenchment as a Metaevolutionary Constraint
  7.3 Generative Entrenchment and (Ir)reversibility
    7.3.1 Generative Entrenchment and Irreversibility
    7.3.2 When Reversibility Relies on GE and Historicity
  7.4 Concluding Remarks
Chapter 8: Conclusion
Bibliography
Appendix 1: Urn Dynamics, Increasing Returns and Lock-In

List of Tables

Table 3.1: Genotypes fitness attribution and related dominance
Table 3.2: (A)historical nature of evolutionary outcome in different conditions
Table 4.1: Different outcomes resulting from alternative assembly histories
Table 4.2: Invasion sequence and final composition for a four-species community

List of Figures

Figure 1.1: Graphical representation of the equilibrium model of biogeography for a single island
Figure 2.1: Information-destroying processes
Figure 2.2: Information-preserving processes
Figure 2.3: A lineage branching into two and undergoing different mutational histories
Figure 2.4: Elements of an initial set mapping into elements of a final set
Figure 2.5: Flat fitness landscape
Figure 3.1: Effects due to adaptation, chance and history on evolutionary dynamics
Figure 3.2: Trajectories for mean fitness relative to the ancestors in 12 replicate populations during 10,000 generations
Figure 3.3: Example of parallel evolution, transiently divergent evolution, and sustained divergent evolution
Figure 3.4: Alternative hypotheses for the origin of the citrate-metabolizing (Cit+) phenotype
Figure 3.5: Evolutionary pathways with order of selection values reversed
Figure 3.6: Evolutionary pathways of haploid populations placed in reversed non-random environments
Figure 3.7: Evolutionary pathways of diploid populations placed in reversed non-random environments
Figure 3.8: Evolutionary pathways of diploid populations placed in reversed random environments
Figure 4.1: Environmental determinism and historical contingency in community ecology
Figure 4.2: Isoclines of three species feeding on two resources
Figure 4.3: Food web with parallel food chains rooted in a common nutrient pool
Figure 4.4: Trajectory of equilibrium nutrient as community is assembled from three simple food chains
Figure 4.5: Distribution of large flightless birds and phylogenetic relationship
Figure 4.6: Reconstruction of the successive stages in the break-up of the ancient supercontinent of Gondwanaland
Figure 5.1: Irreversibility of evolution
Figure 5.2: The formose reaction
Figure 6.1: The history of the QWERTY keyboard
Figure 6.2: A branching tree representing a partially ordered set of moments mj
Figure 6.3: Weak and strong path-dependent processes
Figure 6.4: Branching diagram showing the high probability of ichthyosaurus (and low probability of fish) to arise in aquatic life conditions after terrestrial probability
Figure 6.5: Branching trees representing sustained divergence produced by isolated-event collective replacement and parallel evolution produced by coincident-event collective replacement
Figure 6.6: Branching tree representing the assembly of a three-species community
Figure 7.1: A sawfly using its saw-like ovipositor to lay its eggs inside a plant
Figure 7.2: Different modes of oviposition resulting in alternative adaptive syndromes and emergent properties
Figure 7.3: Hierarchical structure comprising three levels of organization
Figure 7.4: A hierarchical structure developing in alternative outcomes

Acknowledgements

I am especially grateful to family members, friends and colleagues who made this project possible: to Karine Beaudoin for listening to extracts and offering enormous moral support during all these years, to Jocelyne Desjardins for taking such good care of our kids Nathan and Isha, to George Beliveau for his guidance and for all these hours spent in the trails of Pacific Spirit Park exchanging ideas, to Stephane Desjardins, Alirio Rosales, Yuitchi Amitani, and to all the students present in the philosophy of biology reading group for listening and reading draft chapters, to Eors Szathmáry for his recommendations and for offering the opportunity to share ideas with other brilliant biologists in Budapest, to Paul Bartha
and Chris Stephens for their careful reading of the whole dissertation and for their numerous, thoughtful comments, and to my supervisor John Beatty for his help, guidance and generous support.

Dedication

To Nathan and Isha

Chapter 1: Introduction

1.1 Posing the Problem

Dobzhansky (1973) suggests that nothing in biology makes complete sense except in the light of evolution. All organisms and the features they display have arisen at some point in the history of life and they have been transmitted (with modifications) across lineages through an evolutionary process. There exists therefore a profound historical link that runs through all life forms. So, saying that nothing in biology makes sense except in the light of evolution entails that everything in biology is the result of a historical process. But we know this story, which originated in the 18th- and 19th-century transformationist theories of nature, and acquired its modern meaning after the Darwinian revolution. So it may appear surprising that some prominent biologists during the last 50 years, long after the Darwinian revolution, have felt the need to remind us of the fact that history matters in biology. Take for instance Williams’ opening statement in his book, Natural Selection:

    Successful biological research in this century has had three doctrinal bases: mechanism (as opposed to vitalism), natural selection (trial and error, as opposed to rational plan), and historicity. This last is the recognition of the role of historical contingency in determining properties of the Earth’s biota. (Williams 1992)

As we will see, historical contingency is but one face of the notion of historicity. Nonetheless, Williams is right to consider that historicity is a realization of the 20th century, not of the 18th or 19th. This suggests that the phrase “history matters” must mean something other than the platitude that everything in biology has its origin in evolution. But what does it mean?
This dissertation will show that we are far from a clear and unified understanding of why and how history matters in biology. Many have come to agree that attending to the history of biological systems can sometimes provide increased understanding and explanatory power. But the reasons cited as to why history is an important element in biological processes are varied and scattered. We will see that answering the question “Why does history matter?” is not as straightforward as it may seem. Some important conceptual reflection remains to be done before we can establish a solid basis for historical biology. This dissertation is a step in that direction.

In order to understand the meaning of the concept of historicity, we need to look at episodes in the recent history of biology, when some evolutionary biologists and ecologists have criticized the lack of historical perspective in their fields. The reader may wonder at this point “How could evolutionary biology and ecology have become ahistorical?” A full answer would require that we attend to the historical and sociological details of this ahistorical movement. Unfortunately, I will not accomplish such a task in these pages. Instead, I will only present some fragments of this story, pointing in the direction of influential theories and philosophies, by way of which I will illustrate (but not fully describe) how evolutionary biology and ecology became momentarily ahistorical. Part of the answer resides in the explanatory approach favored in these fields, the models they developed and the philosophy of science that became important in biology during the early 20th century. We need to look at these episodes because they constitute the seeds of historicity. The brief remarks offered in this introduction should provide a sense of the conditions that triggered discussions about historicity in biology.
In developing critical responses to an approach that turned its back on historical explanations and that endorsed a philosophy favoring generalizations and predictability as overarching values of a good scientific enterprise, some biologists (and a few scholars) had to explain why and to what extent history matters in biological processes. Historicity therefore emerged as an important subject of an historical turn (Griffiths 1996) that was initiated in the 1960s-1970s.

1.2 The Adaptationist Program

In “The Spandrels of San Marco and the Panglossian Paradigm,” Gould and Lewontin criticized what they called “the adaptationist programme.” In essence, the latter is characterized by:

    the near omnipotence of natural selection in forging organic design and fashioning the best of all possible worlds. This programme regards natural selection as so powerful and the constraints upon it so few that direct production of adaptation through its operation becomes the primary cause of nearly all organic form, function, and behavior (Gould and Lewontin 1979, pp. 584-585).

Although it encompasses several dimensions, two aspects of the adaptationist program are especially worth emphasizing in connection with the importance (or unimportance) of history. First, there is the view that natural selection leads to the production of “optimal” organic form (“the best of all possible worlds,” Gould and Lewontin 1979, pp. 581, 584, 585, 593, 597). If organisms reach an optimal form, then considerations of (current) adaptive fit are sufficient in a way that they would not be if organic forms were suboptimal. Take the human eye for example. Historically, this organ has been taken as an example of perfect evolution and evidence for the existence of a Designer (Paley 1805). Modern biologists however have come to realize that its design bears some imperfections. The retina, the photoreceptive structure at the back of the eye, happens to be poorly organized in the human case.
The photoreceptors are oriented in such a way that photons have to go through a thin layer of tissues (nerves and blood vessels) before reaching the photosensitive end of the receptors. This reduces the quality of the image forming on the retina. One can further appreciate the suboptimality of such design by comparing it to the one that happened to have evolved in the squid eye. The latter resembles the human eye in many respects, except that the photons do not have to go through a layer of tissues before reaching the sensitive end of the photoreceptor. So it turns out that evolution got it right for squids, but not for us. Making sense of this difference requires that we invoke some historical factors. We have a sub-optimally designed retina because our ancestors happened to have evolved and integrated face-down receptors, whereas the squid’s ancestors happened to have evolved and integrated face-up receptors. Referring to the evolutionary history of species would be unnecessary if natural selection always produced the best of possible worlds. Historical details would not make a difference at the end of the day if we work from the assumption that all paths lead to a single common, optimal destination. Chapter 7 will show that most biologists now adopt a different position. They explain the occurrence of maladaptations by inferring that some past adaptations act as constraints on the evolution of organisms.

The second aspect of the adaptationist programme worth mentioning for the present project is the view that natural selection leads to the “direct production of adaptation,” or as Gould and Lewontin also put it, natural selection leads to “immediate adaptation to local conditions” (1979, pp. 583, 584, 587, 590, 593; emphasis added). By “immediate” they did not mean instantaneous. Rather, they were referring to what they believed was the all-too-common conviction that adaptation to the current environment determines organic form.
If current utility were sufficient to explain the present genetic or morphological state of a population, then evolutionary biologists could justifiably ignore the history of that population. Its previous environments would be irrelevant, as would the ancestral states that had prevailed in previous environments, and the evolutionary paths taken en route to the current form. Nothing other than the current form and the current environment, and the respects in which the former is suited to the latter, would bear on the explanandum. Thus Gould and Lewontin also objected to the exclusive consideration of “current utility” in explanations of form (1979, p. 581).

Has anyone ever defended such a position? Another prominent evolutionary biologist, John Maynard Smith, came close to endorsing it, at least on one (perhaps careless) occasion: “[…] most populations have had time to come close to the optimum for the environment in which they live” (Maynard Smith [1975] 1993, pp. 11-12; emphasis added; see also Maynard Smith et al. 1985). But Gould and Lewontin were for the most part not concerned to quote their colleagues. Their point was not that too many evolutionary biologists explicitly advocate the adaptationists’ conception of evolution by natural selection, but rather that too many evolutionary biologists practice evolutionary biology as if it were true, by ignoring history.

Lewontin (1974) very clearly expresses this point in a discussion about the problems in the theoretical structure of population genetics. He criticizes the fact that evolutionary geneticists are “[…] anxious to purge historical elements from their explanations” (p. 269). They accomplish this by making population genetics an equilibrium theory.
    Although we do have the dynamical laws of change in deterministic theory and are able to describe the time dependence of distribution functions in a few cases of stochastic theory, in application only the equilibrium states and steady states distributions are appealed to. Equilibria annihilate history. It is the nature of an equilibrium point that all paths in the dynamical space lead to it (at least locally), so that particular history of change is irrelevant and, once the system is at equilibrium, there is no trace left of historical information (Lewontin 1974, 269).

Lewontin’s characterization of the equilibrium theory is slightly incomplete. We will see in Chapter 2 that only a unique and globally stable equilibrium really can erase history in the long run. In fact, a process admitting of multiple equilibria can embody traces of historical information. Nevertheless, defenders of historicity in biology often developed models and theories that opposed equilibrium theory.

The development of an equilibrium theory in population genetics was partly motivated by the impossibility of accessing relevant historical details that could explain the state reached by populations. But endorsing an equilibrium theory made historical information irrelevant in the explanation of the unique stable (and optimal) state reached by populations. This point will be developed in detail in Chapter 2. Lewontin (1974) acknowledges that the lack of historical records makes it difficult to introduce historical details in explanations. But the problem is that in practice population geneticists often assume the existence of a global equilibrium, even in the absence of evidence. Take for instance the polymorphism and lack of uniformity in the distribution of blood types in humans. We know that the frequency of blood type A is over 50% in East Anglia, but declines markedly in the North and West of the island, reaching as low as 25% in Western Ireland.
The groups B and AB on the other hand show the opposite trend, reaching a higher proportion in Scotland (18%) than in the East and South (8%) (Lewontin 1974). Because we have historical records for our species we can explain the distribution of blood type in England and Ireland by the major invasions and displacements that occurred during the last 100 generations (Beakers, Vikings, etc.). But a biologist practicing according to the adaptationist programme would not try to look for historical records in attempting to explain this absence of stability. As a good selectionist, he or she would assume that blood types were adaptive, i.e., that they should have reached a unique stable equilibrium in given environmental and cultural conditions, but now that human populations are “off the hook” of natural selection, we notice a lack of stability, even throughout the same territory. According to Lewontin, too many population geneticists are making the mistake of blindly embracing the equilibrium framework. For Lewontin, the equilibrium, a-historical explanation may give biologists something to work on in the absence of historical records, but this path is dangerous because it can lead to false conclusions.

1.3 The Eclipse of History in Ecology

In a brief and insightful review of the fields of behavioral and evolutionary ecology, Brooks and McLennan (1991) notice a decrease in the number of “historically based studies” in the 1960s-1970s, during which period ecology was rapidly burgeoning. Kingsland ([1985] 1995) develops an interesting account for this movement away from a historical approach in ecology. In Modeling Nature, Kingsland suggests that the development of a more mathematical and theoretical ecology has contributed importantly to the eclipse of history in ecology. One of the most influential and engaged actors of an ahistorical approach in ecology (or science in general) was Robert MacArthur.
Like population geneticists, MacArthur developed models and theories that assumed the existence of a unique stable equilibrium. The equilibrium theory of insular biogeography, developed in collaboration with the biologist E.O. Wilson in the 1960s, is a striking example (MacArthur and Wilson 1963, 1967). In brief, the “equilibrium theory” assumes that the number of species on an island (or equally isolated area) depends essentially on immigration and extinction rates (Figure 1.1). The immigration rate increases as a function of the size of the area, but decreases as a function of the number of species already occupying the area and the distance from the mainland. On the other hand, the extinction rate increases as a function of the number of species and the distance from the mainland, but it decreases as the area of the island gets larger. The expected equilibrium corresponds to the point where these two curves intersect.

Figure 1.1 Graphical representation of the equilibrium model of biogeography for a single island (From MacArthur and Wilson 1967, 21. Reprinted with permission from Princeton University Press).

Once the equilibrium is reached, the number of species constituting a community remains relatively constant because the system tends to return to its equilibrium. This kind of theory can be described as ahistorical because the number of species occupying a given island does not depend on the history of colonization of the area. A given island has a certain capacity for containing species that depends merely on physical and biogeographical factors like size, altitude and distance from mainland. The type of species entering the community and the order or timing in which they do so are not determining factors of the richness of a community.
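The ahistorical character of such an equilibrium model can be sketched numerically. The following is only a minimal illustration, assuming linear immigration and extinction curves (a simplification of the curves drawn in Figure 1.1); the function names and parameter values are hypothetical and are not taken from MacArthur and Wilson. Islands that start with very different numbers of species end up at the same richness, so the final state carries no trace of the colonization history.

```python
# Illustrative sketch only: linear rate curves and all parameter values are
# assumptions made for this example, not part of the original theory's data.
POOL = 100.0    # number of species in the mainland pool (assumed)
I_MAX = 10.0    # immigration rate onto an empty island (assumed)
E_RATE = 0.2    # per-species extinction rate on the island (assumed)


def immigration(s):
    """Immigration rate: falls linearly to zero as the island fills up."""
    return I_MAX * (1.0 - s / POOL)


def extinction(s):
    """Extinction rate: rises linearly with the number of resident species."""
    return E_RATE * s


def equilibrium():
    """Species number at which the two linear curves intersect."""
    return I_MAX * POOL / (I_MAX + E_RATE * POOL)


def simulate(s0, steps=2000, dt=0.05):
    """Euler integration of dS/dt = immigration(S) - extinction(S)."""
    s = s0
    for _ in range(steps):
        s += dt * (immigration(s) - extinction(s))
    return s


if __name__ == "__main__":
    for start in (0.0, 5.0, 90.0):
        print(f"start = {start:5.1f}  final richness = {simulate(start):.3f}")
    print(f"predicted equilibrium = {equilibrium():.3f}")
```

Because the model has a single, globally stable equilibrium, knowing the final richness tells us nothing about the starting point. This is the sense in which the text calls the theory ahistorical.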
We will see in Chapter 4 that many ecologists have criticized this assumption and have come to realize that the assembly history of communities, i.e., the order, timing and rate at which species enter the community, makes a difference to the specific composition and richness.

This example is especially interesting because the equilibrium theory and the spirit in which it has been developed have influenced many other ecologists. In fact, some of the deep philosophical roots explaining a bias for general equilibrium models are apparent through many of MacArthur’s writings. Take for instance the opening words of his last publication, Geographical Ecology (1972).

    To do science is to search for repeated patterns, not simply to accumulate facts, and to do the science of geographical ecology is to search for patterns of plant and animal life that can be put on a map. The person best equipped to do this is the naturalist who loves to note changes [across different environments]. But not all naturalists want to do science; many take refuge in nature’s complexity as a justification to oppose any search for patterns. This book is addressed to those who wish to do science. … Science should be general in its principles. (MacArthur 1972, 1)

This passage gives a glimpse of the philosophy of science endorsed by MacArthur. An important purpose of science is to develop theories, which “identify the factors that determine a class of phenomena and … state the permissible relationships among the factors as a set of verifiable propositions” (MacArthur and Wilson 1967, 5). These factors and relationships are often hidden in the real world and it is the role of a good theory to shed light on them and generate new empirical research. MacArthur’s approach consisted of developing general models capable of making empirically testable predictions. Certain methodologies were clearly inappropriate for achieving this philosophical ideal.
For MacArthur and Wilson (181), ecology (biogeography in particular) has remained for a long time in a “natural history phase,” which consists essentially in “accumulating information about distribution of species and higher taxa and the taxonomic composition of biotas.” It is important to keep in mind that this is but one, rather simplistic, view of natural history. Moreover, the word “phase” is very important here. Island biogeography particularly needs this “descriptive activity,” which continues to be of “fundamental importance to the science (181).” But MacArthur and Wilson consider that stopping at this descriptive phase is incomplete. It would be like developing a portrait of the solar system describing how planets move, but without developing a theory that explains why planets behave as they do. They maintain that it is time for biogeography to enter an “equally interesting experimental and theoretical phase (181).” The equilibrium theory sketched above is an important step towards such a phase. As pointed out by Ishida (2007), some of the literature seems to forget that MacArthur did, on certain occasions, recognize that some problems are more fruitfully “interpretable in historical terms and not in terms of the machinery controlling species diversity” (MacArthur 1972, 64). Moreover, we just noted that, in his view, natural history is a very important phase in geographical ecology. So it would be unfair to depict MacArthur as being anti-historical. A more generous reading would rather suggest that he essentially objected to a certain limited form of historical approach to science.  11 Nevertheless, he does put forward an experimental and theoretical approach that remains to a large extent ahistorical. Even when he recognizes that history becomes important when a system admits of multiple equilibria, he claims that such influence is but transitory. History even leaves its mark on equilibria, although how long its influence will be felt is unknown. 
We have already seen (p. 91) how very hard it is for a second species to colonize an island containing a reasonably close competitor. In this sense whichever species arrives first is practically permanent, and the later arrival is virtually certain to remain missing. But this is not really stable. Given enough time, early species A will go extinct from some islands and B will certainly successfully invade some of them. By this time random processes will have erased most of the history. (MacArthur 1972, 247)

How exactly history leaves a mark on equilibria will be discussed more fully in Chapter 4. Suffice it for now to note that this influence is rapidly swept under the carpet when he considers what might happen in the long run.

The connection between MacArthur’s philosophy of science and his bias for equilibrium models remains somewhat unclear. As the history of community ecology has proven, there exists no necessary connection between the equilibrium model and a theory capable of stating testable general relationships between important determining factors. Yet, an answer may come from the importance put on predictability and on the intimate association between history and contingency.

“A perfect balance between immigration and extinction might never be reached, since it would be approached exponentially; but to the extent that the assumption of a balance has enabled us to make certain valid new predictions, the equilibrium concept is useful as a step beyond the more purely descriptive techniques of multiple regressions” (MacArthur and Wilson 1967, 21).

So the equilibrium model was providing something fundamental, something that previous approaches seemed to lack. History becomes a nuisance in this context because it is associated with the idea of contingency. Historical contingencies bring noise into the general patterns hidden in the data. These past vagaries blur the relationships and make the project of prediction more difficult.
MacArthur’s approach therefore advocates that we get rid of this noise, which gets in the way of predictability and slows down the quest for the true cause and machinery underneath general natural phenomena. As we will see, finding a role for historical contingency in models and explanations became very important in the historical turn that happened in community ecology.

1.4 Overview

The following pages present and reflect upon the work of biologists and philosophers who have offered interesting thoughts about historicity in biology. Many of these accounts are the result of a critical step back from the ahistorical theories summarized above. Although most of the examples and ideas discussed in the subsequent chapters will be familiar to many working in the fields of evolution, ecology and economics, their connection to a general theoretical framework on historicity has, to the best of my knowledge, never been made. So the reader should look at this project as an attempt to provide a general framework laying the basis for a better understanding of the phenomenon of historicity, and from which it is possible to assess and compare different claims to the effect that history matters in biological processes. My approach is thematic and not chronological.

Chapter 2 introduces and extends the notions of information-preserving and information-destroying processes addressed by Elliott Sober in Reconstructing the Past (1988). Sober’s account highlights the very fundamental requirement that a process can retain information from the past if and only if it admits of multiple outcomes. I extend his analysis to stochastic processes and argue that a process will be information-preserving if the outcome and/or the probability distribution of outcomes can change as a function of the past. I use these notions as a starting point because they are very intuitive and general.
I argue that “preserving information from the past” is a principle that runs through almost all accounts of historicity. Yet, there exist different ways in which processes can be information-preserving, different extents to which information is preserved, different reasons for divergence, and different contexts in which this general principle can apply.

Chapters 3 and 4 assess the notion of “historical contingency” as developed in evolution and community ecology. This notion is very important in the biological literature about historicity, but it is also very equivocal. I present the views of Gould and some other biologists, showing that there exist at least three important meanings attached to the expression “historical contingency”: unpredictability, causal dependence, and a third one integrating unpredictability and causal dependence. The Long-Term Evolution Experiment (LTEE) project conducted by Lenski and his collaborators has a central role in Chapter 3. This experiment makes it possible to test Gould’s ideas about historical contingency and it contains an interesting combination of meanings. It also shows very well how conclusions about the historical nature of a process can depend on the level of analysis.

While many have invoked historical contingency to emphasize the fact that the order of mutations can affect the evolutionary outcome, Richard Lewontin’s (1966, 1967) account of historicity instead emphasizes the importance of the order of environmental events. More precisely, he shows how modifying the order of selection events can affect the general behavior of a population and can lead to alternative outcomes. The idea that the order of events in a historical sequence can be a determining factor is also very prominent in community ecology. This will be the main focus of Chapter 4. Ecologists have come to realize that the final species composition and structure of a community can depend on which types of species enter the community and in what order.
Interestingly, this principle (often called the priority effect) was also developed by MacArthur in 1972.

Chapter 5 looks at another face of historicity, path dependence. The latter was first developed in social science, but it was recently introduced in biology by Szathmáry (2006). At its essence, path dependence involves the existence of contingent irreversibility and multiple outcomes. Although very rich and interesting, Szathmáry’s account faces some difficulties.

Chapter 6 offers a new account of path dependence with the objective of solving these difficulties and providing a concept in line with the general principle of stochastic information-preserving processes. I inform this revision by revisiting some of the paradigmatic examples and concepts developed in the social science literature. In a nutshell, I argue that a process is path dependent if and only if it admits of multiple outcomes at a given instant, and the probability of reaching one or another outcome changes when we go from one trajectory to another. Finally, I show that this new account can apply to landmark cases of historicity that were presented in the previous chapters.

Chapter 7 discusses the long-lasting effects of phylogenetic constraints and considers the role of path dependence and generative entrenchment in this phenomenon. I show that the notion of path dependence can be useful in interpreting the type of historicity found in the context where phylogenetic constraints affect the course of evolution and some emergent ecological patterns. The work of Peter Price on the sawfly species plays a central role here, for it illustrates very well how accidents of history can have a long-lasting and deep-reaching effect. I also devote special attention to the notion of generative entrenchment as developed by Wimsatt and Schank (1988, 2001), which happens to be an important source of path dependence in development and in micro- and macroevolution.
Finally, Chapter 7 also offers further reflections on irreversibility and historicity.

The main highlights of this research are that historicity has many faces in the biological literature, some more general than others. Information-preserving processes, historical contingency, path dependence and irreversibility are all at the heart of the view that history matters to biology. We will see through various examples and models that history matters a great deal in biology. But we will also see that it is useful to think in terms of degree of historicity. I articulate two interesting senses of degree, one related to the divergent vs. convergent nature of processes, and the other one related to the difference that taking one path instead of another makes to the probability of reaching a given outcome. There exists, however, a third interesting sense of degree of historicity that I have unfortunately not addressed in these pages.1 This third sense of degree relates to how similar or different the outcomes reached by alternative historical processes are. In other words, one could say that history matters more if changing the past leads to very large dissimilarities of outcomes, but that history matters to a lesser extent when changing the past leads to more similar outcomes. Finally, this project also shows that the state of knowledge right now does not favor absolutism: one cannot easily say that history always or never matters. The conclusion one reaches will greatly depend on the type of systems under investigation, the level of analysis, the features that were judged relevant, and to some extent the time scale considered.

1 Thanks to Chris Stephens for mentioning this aspect.

Chapter 2: Information-destroying and Information-preserving Processes

This chapter introduces one of the most central aspects of historicity.
I argue that a fundamental property of historical processes is their capacity to retain information from the past, which depends on the existence of alternative outcomes. Elliott Sober’s (1988) account of historicity will prove very useful in developing this idea. His distinction between information-preserving and information-destroying processes captures the divide between historical and ahistorical processes. In a nutshell, information-destroying processes are ahistorical because changes in the history of the process do not affect the final outcome. There is no way to know the initial state and/or the trajectory of the process by looking at the outcome of an information-destroying process. Information-preserving processes, on the other hand, admit of multiple outcomes, and the outcome (or the probability of each outcome) changes as a function of the history of the process. Thus, observing that a process results in a given outcome allows us to identify a range of past initial conditions and/or trajectories that could have yielded this outcome. This will be explained and illustrated in more detail below.

In the end, I shall reject the existence of a strong dichotomy between ahistorical and historical processes. I will argue instead that there exist various degrees of historicity and that it may in certain circumstances be difficult to draw a clear line between historical and ahistorical processes. Still, starting this analysis with a characterization of the opposite poles is very helpful for understanding the problem at hand. Moreover, the characterization of historical processes as information-preserving is perhaps the most general definition we can get. The notion of information-preserving processes runs through all accounts and examples of historicity.

2.1 Reconstructing the Past

In Reconstructing the Past, Elliott Sober addresses some fundamental problems related to phylogenetic inference (Sober 1988).
Evolutionary biologists, especially systematists, often face the difficult task of inferring the connections between different lineages. For example, one may want to know if the human species is more closely related to chimpanzees, bonobos or gorillas. In order to answer this question, it is necessary that evolution preserve information from the past. As nicely put by Sober, “[t]he knowability of the past depends on whether the physical processes linking past to present are information-preserving or information-destroying” (Sober 1988, 1).

Posing the problem in this way creates a need to specify the meaning of “information.” For now, let’s simply say that information in this context can be understood rather loosely as characteristics belonging to life forms. The word “information” simply stands for what is “in” these “forms,” which can be genetic, molecular, morphological or even behavioral. Thus, when we say that some information from the past is preserved, we mean that some past genetic, morphological or behavioral features are to a certain extent maintained and potentially traceable in the current life forms.

Sober illustrates nicely the difference between information-preserving and information-destroying processes with the example of a marble released from inside the edge of a bowl with one well (see Figure 2.1a). In normal terrestrial conditions, the marble will roll from the edge inscribing “great circles,” but the friction will cause it to slow down, and it will eventually end up at the bottom of the bowl. Note that without friction, the marble would simply oscillate from one side of the bowl to the other, and no stable equilibrium would be reached. The system would behave just like a pendulum in a vacuum. But since it is an open system, the marble can reach the stable equilibrium admitted by the system.
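The role of friction in the marble example can be sketched numerically. The Python fragment below is only an illustration under simplifying assumptions that are not Sober’s: the bowl is reduced to one dimension, the restoring force is linear, and the function name roll_marble and every parameter value are hypothetical choices made for the sketch.

```python
def roll_marble(x0, friction=0.5, stiffness=1.0, dt=0.01, steps=5000):
    """Positions of a marble released at rest from x0 in a one-well bowl.

    Hypothetical 1-D model: acceleration = -stiffness * x - friction * v,
    with the bottom of the bowl at x = 0.
    """
    x, v = x0, 0.0
    positions = [x]
    for _ in range(steps):
        a = -stiffness * x - friction * v  # pull toward the bottom, minus friction
        v += a * dt                        # update velocity first (semi-implicit Euler)
        x += v * dt                        # then move with the updated velocity
        positions.append(x)
    return positions


# Releases from different points on the rim, with friction.
for start in (-1.0, 0.3, 0.7):
    print(start, "->", round(roll_marble(start)[-1], 4))
```

With friction, releases from different starting points end at numerically the same resting place, so the final position carries no trace of where the marble began. Calling roll_marble with friction=0.0 instead yields a marble that keeps oscillating and never settles.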
The exact position of the marble during the process will vary as a function of the initial conditions, but “once the ball reaches equilibrium, nothing can be inferred about its starting point” (Ibid., 3). The reason for this incapacity is not our lack of intelligence or the great complexity of the process. We understand it perfectly, and can predict how the marble will behave. The difficulty lies not in us, but in the very nature of the process. There might be a very large set of possible initial conditions, and consequently alternative routes, but since all of them converge at some point, history makes no difference to the final position of the marble and we consequently cannot infer anything about the starting position.

One can always elaborate some narrative about the position of the marble at each time period and how the outcome in the next time period is determined by gravitation, the position and velocity of the marble at each time period and other external forces imposed by the shape of the bowl. Despite the fact that this approach will result in a correct and perhaps more complete explanation, it will not be the most parsimonious one.2 There exists a simpler way of accounting for the outcome which consists of merely referring to the fact that all histories will converge towards the same stable outcome. And this is what makes the process information-destroying: information from the past vanishes once the equilibrium state has been reached.

2 One may object here that we still need to justify why the value of parsimony should be more important than completeness.

In evolutionary terms, the bowl with a single well would correspond to situations where only one solution exists for a given adaptive problem. The situation can be somewhat similarly represented with an inverted Fujiyama-like “adaptive landscape” (see Figure 2.1b). There exist different types of adaptive landscapes (Gavrilets 2004), and the one produced in Fig.
1 tells how the fitness of organisms relates to genes or trait combinations.

Figure 2.1 Information-destroying processes. a) A marble dropped from the edge at different positions (a or a’) of a bowl with one well will always end up in the same final position. This final outcome retains no traces of its past. b) Adaptive landscape for which the outcome will be history independent (see text). (Landscape axes: fitness; genotypic/morphological value 1; genotypic/morphological value 2.)

The x and z axes represent the range of genotypes or morphologies under investigation, and the y axis represents the adaptive values (fitnesses) of the various genotypes or morphologies in a particular environment. Given sufficient time, a population described by these genotypic or morphological variables and adaptive values would ascend by natural selection to the peak of the landscape, where it would then rest, regardless of where it began its climb. If we had no record of the process, then we would have no idea what route it took to that outcome. This type of process is therefore information-destroying.

Single-peak adaptive landscapes need several conditions in order to obtain. Ronald A. Fisher’s view of evolution by natural selection, if put in the context of a fitness landscape, would imply a single-peak topography (Gavrilets 2004).3 In general, models of single-peaked landscapes assume that the fitness contributions of individual genes or traits are independent of each other.4

Now that we have a better sense of the qualities of information-destroying processes and why they can erase history, let’s look at processes that retain information from the past. After Sober (1988), I argue that historical processes are information-preserving. Imagine a bowl with multiple wells (see Figure 2.2a).
In this kind of situation, some information about the past is preserved because the final outcome depends on the marble’s initial position. As Sober says, “[t]he bowl divided into many wells represents any historical process in which there are multiple local equilibrium points” (Sober 1988, 4). Thus, multiple stable equilibria make the long-term preservation of information from the past possible.

3 Fisher believed that there exists one optimal combination of genes, as opposed to a series of more or less equivalent alternative combinations. He also worked with the assumption that this optimal combination can be found by selection without the intervention of random variations (drift). Although fairly common in population genetics, the notion of drift is not always interpreted in the same way. For the sake of simplicity, I mean by drift any stochastic variation in the genotype frequency between generations. In order to make the effect of drift negligible, Fisher also needed to presume a very large number of reproducing individuals in the population. The size of the population can eliminate the effect of drift because even if “bad luck” happens to certain fit individuals in the population, there will be enough individuals acquiring the right set of beneficial mutations and reproducing. Unless the population is extinct or at optimal equilibrium, there should always be organisms with a better combination of genes or traits for selection to act upon. However, needless to say, many populations will not be large enough to guarantee that drift will become a negligible evolutionary force.

4 This can be achieved with additive or multiplicative models (Gavrilets 2004, 37-38). In the additive model, the fitness of an organism can be found by summing up each individual contribution. Note that the landscape of additive models will have a unique optimal combination only if the contributions are small. In the multiplicative model, the fitness of an organism is found by multiplying the fitness contributions of genes. Interestingly, the existence of a unique peak in a finite population does not necessarily result in that population reaching or staying on top of the nearest fitness peak. A population’s fitness can decrease for instance because of the accumulation of detrimental mutations (Muller’s Ratchet Effect), which can in some conditions lead to a mutational meltdown, i.e., a nonviable population. Or it may simply keep the population in a steady fitness state away from the genotypic combination that would provide the highest fitness value (Woodcock and Higgs 1996). Woodcock and Higgs, after Kauffman (1993), describe the situation somewhat metaphorically: “If the fitness landscape is viewed as a mountain range, then the population is likely to be found hanging like a layer of cloud below the mountain peaks but above ground level.” (p. 62) Does the incapacity to reach the very top of the fitness peak change anything about the capacity of the process to erase history? I am inclined to say that it does not, or at least not necessarily. If changing the initial genetic combination or aspects of the trajectory taken by the population were to make a difference in the height of the cloud layer for example, then we could say that history matters, despite the existence of a unique domain of attraction, i.e., a unique genetic combination towards which the population tends to evolve but cannot totally reach.

Figure 2.2 Information-preserving processes. a) A marble dropped from the edge at different positions (a or a’) of this irregularly shaped bowl will stabilize in one of the two possible outcomes because of its initial position. Contrary to the system in Figure 2.1a, the final outcome retains traces of its past.
b) Adaptive landscape for which the outcome will be history-dependent. (Landscape axes: fitness; genotypic/morphological value 1; value 2.)

Wright’s typical vision of a fitness landscape offers an analogue to the bowl with multiple wells. For a particular population in a particular environment, there might be more than one solution, i.e., alternative evolutionary stable strategies. In the context of fitness landscapes, we can compare these stable strategies to separate equilibrium states: two genetic or morphological combinations of relatively high adaptive value, separated in genotypic or morphological space by forms of lower value, and not traversable by mere natural selection (see Figure 2.2b).5 The evolutionary outcome would depend on the stochastic variations in the genotype frequencies (drift) and history. The latter becomes important because the genotypic (or morphological) starting point, together with the mutational history, i.e., the sequence of mutations that happened to arise, can make a difference to which peak a population will climb. A population might come to rest on one rather than the other peak for the simple reason that mutational changes in that direction happened to arise and were selected, whereas mutational changes in the direction of the other peaks did not occur and could not be selectively accumulated.

5 Wright (1931) was perhaps the first one to argue that in finite populations, dependent fitness-contributions of different loci (a phenomenon called epistatic correlations) produce the conditions for the existence of multiple peaks. Wright thus proved that, in these conditions, natural selection is not the sole important force in evolution. As explained in the text, drift and the order of mutations also become relevant forces in the process.

To understand better how mutational history can be an important factor in evolution, let’s consider a fictional scenario of a lineage branching off into two (see Figure 2.3).

Figure 2.3 A lineage branching into two and undergoing different mutational histories. XmYn represents two loci in states m and n respectively. The numbers in subscript can be interpreted as a range of magnitude away from the ancestral genotype. (Diagram labels: X0Y0, X0Y6, X4Y0, X4Y6.)

Initially, all members of the lineage are in state X0Y0. At some point in the history of the lineage, some members acquire a series of mutations and stabilize in a new genotype X4Y0. A new niche opens to these organisms, which eventually results in reproductive isolation from the ancestral type. Responding to a new adaptive challenge created by this change, the derived population then acquires a second series of beneficial mutations and reaches a different stable adaptive peak with genotype X4Y6. In this hypothetical scenario, we can say that the first series of mutations, by creating new opportunities and potentials, also increased the probability of acquiring a new genotype. But this peak may not have been reachable with a different mutational history. This situation is represented on the bottom branch of Figure 2.3. If for instance the population had evolved genotype X0Y6 first, and this had taken the population into an evolutionary stable state, then it may have been virtually impossible to acquire the genotype X4Y6. This happens if going from state X0Y6 to state X4Y6 requires an intermediate series of changes that reduces fitness significantly. The lineage would not be able to go “down the hill” of the adaptive peak it was on because any changes in this direction would in fact be selected against. Therefore, the population would be locked into state X0Y6, and the subsequent evolution of genotype X4Y6 could not happen. Thus, even if the two lineages represented in Figure 2.3 lived in the same type of environment, their alternative mutational histories (and natural selection) would make it very unlikely for their genotypes to converge.
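The lock-in described in this fictional scenario can be sketched as a toy simulation. Everything in the Python fragment below is a hypothetical construction for illustration only: the fitness values, the rule that a proposed mutation is kept only if it strictly increases fitness, and the names fitness and evolve are assumptions, not part of the scenario itself. In particular, the sketch folds the opening of a new niche at X4Y0 into a single fixed landscape.

```python
X_MAX, Y_MAX = 4, 6  # highest states of the two loci (X4 and Y6)


def fitness(x, y):
    """Hypothetical fitness of genotype XxYy.

    Fitness rises along y == 0, along x == X_MAX and along x == 0;
    every other genotype lies in a low-fitness valley.
    """
    if x == X_MAX:
        return 5 + y
    if y == 0:
        return 1 + x
    if x == 0:
        return 1 + y
    return 0.5


def evolve(mutation_order, start=(0, 0)):
    """Apply proposed one-step mutations in order; keep only beneficial ones."""
    x, y = start
    for locus in mutation_order:
        nx, ny = (x + 1, y) if locus == "X" else (x, y + 1)
        if nx > X_MAX or ny > Y_MAX:
            continue  # no further state available at this locus
        if fitness(nx, ny) > fitness(x, y):
            x, y = nx, ny  # selection retains the mutation
    return x, y


# The same ten mutations, arising in two different orders.
print(evolve(["X"] * 4 + ["Y"] * 6))
print(evolve(["Y"] * 6 + ["X"] * 4))
```

In this sketch the lineage that happens to receive the X mutations first ends at X4Y6, whereas the lineage that receives the Y mutations first stops at X0Y6, because every step from X0Y6 towards X4Y6 lowers fitness and is rejected.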
Therefore, unlike the population evolving on a single-peaked landscape, the process of evolution in rugged landscapes can be information-preserving.

Although they can be defined in a precise mathematical way (Gavrilets 2004), these fitness landscapes are metaphors. The dimensionality of real fitness landscapes is much higher, and the genotypic and morphological values can be discrete instead of continuous. Nevertheless, the three-dimensional figures and simple branching diagrams capture some of the essential features of fitness landscapes and make it easier for us to understand how Sober’s example of information-preserving and information-destroying processes can apply to population genetics.

2.2 Deterministic and Stochastic Frameworks

Historicity is often associated with the ideas of stochasticity and unpredictability and commonly opposed to determinism. In a recent contribution, Fukami et al. (2005) open their article on the assembly of ecological communities by saying:

Whether the structure of ecological communities is deterministic or historically contingent has become increasingly controversial … The deterministic view, … suggests that communities converge towards a common structure determined by environmental conditions, irrespective of the history of community assembly. The alternative view, … suggests that community structure is historically contingent: stochastic forces producing variation in the sequence and timing of species arrivals can cause divergence in community structure among localities, even under identical environmental conditions and regional species pool (Fukami et al. 2005).

The details of this assertion will be explained more fully in Chapter 4. I nevertheless mention it here because it very eloquently illustrates a common way of conceiving the opposition between determinism and historicity and the association between stochasticity and history.
A similar claim appears about ecology in general in Ecological Understanding, a more philosophically inclined piece written by Pickett et al. (2007). According to them, a “new ecology” has emerged which embraces history and contingency over equilibrium and determinism. For them,

[t]he lack of rigid determinism is encapsulated in the term “contingent,” which connotes that the current state of an ecological or evolutionary system is dependent on the specific conditions that may have occurred from time to time, or on the order of events that affected the trajectory (Pickett, Kolasa, and Jones 2007, 181).

This passage suggests that “contingency” entails both a lack of determinism and that historical details significantly affect the states visited by ecological and evolutionary processes. Again, history finds itself intimately associated with chance (contingency per se) and opposed to determinism. Similar claims have been made in the biological literature (Chase 2003; Pickett, Kolasa, and Jones 1994; Belyea and Lancaster 1999; Weiher and Keddy 1999). But I want to keep it brief, so I will limit myself to the above passages and hope that they suffice to convince the reader of the presence of a strong association between history and stochasticity, and an opposition between historicity and determinism.

Although common, this pattern of association should not be generalized. Contrary to these analyses, I maintain that we can double-dissociate unpredictability and historicity. The bowl with multiple wells was treated deterministically and yet it provided a scenario where the outcome is sensitive to changes in initial conditions. There is a fascinating literature on extreme sensitivity to initial conditions in complex systems and chaotic dynamics. Most of us are familiar with the Butterfly Effect. Although this example is paradigmatic of chaos theory, and the idea of chaos is often associated with unpredictability, the Butterfly Effect assumes a deterministic world.
In such a world, physical objects are governed by deterministic laws: a process will always undergo the same series of events given the same initial conditions because the state of the world at time t0 will totally determine the state of the world in the subsequent time periods. So the Butterfly Effect does not mean that the world is indeterministic. It shows that some complex systems can be extremely sensitive to changes in initial conditions because what would have been thought of as insignificant changes in initial conditions can translate into increasingly divergent and almost unpredictable outcomes in the long run. Yet, this unpredictability must be practical if the deterministic assumption holds. It comes from our epistemic limits, not from the fact that the exact same initial conditions could possibly evolve into divergent states.

I do not wish to enter into metaphysical debates about the (in)deterministic nature of our universe. For now, my goal is to elaborate further the idea that we can interpret the Butterfly Effect as a form of historicity, namely deterministic historicity. The Butterfly Effect resembles the bowl with multiple wells. Both examples entail that history matters because we can observe changes in outcomes as a function of (past) initial conditions.

Historicity in deterministic processes will typically boil down to dependence on initial conditions. The only source of variability and divergence will be the wiggling of the starting point, and such variation will matter if and only if the system under investigation admits of multiple outcomes. This is one species of historicity. Another one is required for stochastic systems. I do not mean here that stochastic processes are insensitive to changes in initial conditions. The same dependence can apply to stochastic processes, but we need an additional kind of historicity for the latter type of processes.
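The deterministic form of historicity described above, where outcomes diverge without any chance element, can be illustrated with a standard textbook system. The logistic map used in the Python sketch below is not discussed in this chapter; it is brought in here only as an assumed stand-in for a deterministic system with extreme sensitivity to initial conditions, and the function name logistic_trajectory and the parameter values are hypothetical.

```python
def logistic_trajectory(x0, r=4.0, steps=100):
    """Iterate the deterministic rule x -> r * x * (1 - x), starting from x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


base = logistic_trajectory(0.2)
repeat = logistic_trajectory(0.2)          # exactly the same initial condition
nudged = logistic_trajectory(0.2 + 1e-10)  # a tiny change in the initial condition

# Same start, same history: the rule contains no chance.
print(base == repeat)

# Largest gap between the original and the nudged trajectory over the run.
print(max(abs(a - b) for a, b in zip(base, nudged)))
```

Running the rule twice from the same starting value reproduces the same trajectory exactly, which is the deterministic assumption. The trajectory started from the slightly shifted value stays close for the first few steps and then separates widely, so the later states depend on the past initial condition even though nothing in the process is random.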
Stochastic processes differ from deterministic ones because they admit of a more extended form of dependence on history. This is essential to understand since many accounts of historicity developed in the literature assume a stochastic framework. In fact, the fitness landscape examples were already assuming such a framework because I have been treating mutations as chancy, random events in the history of populations.6 What follows serves as a prolegomenon to some of the changes in perspective required by the introduction of chance into the process.

One can make the bowl example stochastic by imagining that the marble has a built-in device that randomly affects the marble’s trajectory inside the bowl. If we conceive of such changes as contingent, chancy events, then the process becomes better described in probabilistic terms. In contrast to the deterministic treatment provided before, this stochastic process comprises insufficiently caused events and the structure of the process is not linear, but rather branching like a tree. The outcome at any time period is not totally predetermined by the initial conditions and constraints acting on the process. If the marble is traveling in an environment with multiple equilibria, the final outcome can very well change even if we keep the initial position constant.

Note however that unpredictability does not entail historicity. In order to prove my point, I will re-describe the bowl example in a more formal way. We can easily imagine that the built-in device introduces different degrees of noise into the marble’s trajectory. This creates a continuum between a deterministic and a totally unpredictable situation. Determinism occurs in the absence of perturbations, a situation characterized by an extreme consistency of outcomes given initial conditions. At the opposite pole, we have total indeterminacy.
This occurs when the device creates a lot of noise and very poor predictability, characterized by the fact that any initial condition can yield any admitted outcome with equal probability.

6 Note that a stochastic framework may not be necessary if one conceives of mutations as merely chancy in an epistemic sense. Darwin for example thought that mutations are random with respect to what would be favored by natural selection. This does not preclude the possibility of a deterministic explanation of such events. A stochastic framework becomes useful though in the absence of a deterministic explanation.

According to this new description of the bowl example, we obtain a set of possible initial positions (initial set) and another set of possible final positions (final set), where each element of the initial set can map into the different elements of the final set with various transition probabilities ranging from 0 to 1.

Figure 2.4 Elements of an initial set mapping into elements of a final set with different probabilities pif (the probability that initial element i yields final element f). For the sake of simplicity, the probability distribution of only one element of the initial set is represented here.

If each element of the initial set maps onto only one element of the final set with a positive probability value – i.e. only one of the arrows leaving that element in the graph equals one and the rest equal zero – then we are in a deterministic system. For example, we might have the situation where:

p11 = 1, p12 = 0, p13 = 0;
p21 = 0, p22 = 1, p23 = 0;
p31 = 0, p32 = 0, p33 = 1.

Applied to the bowl example, this probability distribution corresponds to the situation where each initial position consistently yields a unique outcome. In this case, information about the initial situation is fully preserved in the outcome.
If, by contrast, at least one element of the initial set can be mapped with positive probability onto more than one element of the final set, then we are in a stochastic situation, as in the following probability distribution:

p11 = 0.8, p12 = 0.1, p13 = 0.1;
p21 = 0.2, p22 = 0.7, p23 = 0.1;
p31 = 0.05, p32 = 0.05, p33 = 0.9.

In the bowl example, this reflects the situation where the marble occasionally settles into alternative wells, even if initially placed in exactly the same position on the rim. This represents an unpredictable situation: each run of the experiment may produce a range of outcomes that one cannot foretell with certainty because of the chancy nature of variations.

Depending on how strongly the random factors affect the system, the stochastic situation admits of degrees of predictability. We maintain a fairly high power of prediction if the process is only a little noisy, as in the probability distribution depicted above. One can use that distribution to predict the outcome correctly most of the time. Increasing the level of noise would reduce this predictive power and flatten the probability distribution. With more noise, we will come to a point where each element of the initial set maps onto the different elements of the final set with an equal probability value. This extreme situation would produce a flat probability distribution of outcomes for all elements of the initial set:

p11 = 1/3, p12 = 1/3, p13 = 1/3;
p21 = 1/3, p22 = 1/3, p23 = 1/3;
p31 = 1/3, p32 = 1/3, p33 = 1/3.

This scenario, applied to the bowl example, corresponds to the situation in which each initial position shows no tendency to settle into one well over another. This can be described as extreme unpredictability, and it corresponds to the situation where the information about the initial situation is completely destroyed.
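The three probability distributions given above can be compared quantitatively. The sketch below is only one possible reading of "information preserved in the outcome": it computes the mutual information between the initial and the final position. That measure, the uniform prior over initial positions and all the names in the code are assumptions introduced for illustration; they are not part of the original formalism.

```python
import math


def mutual_information(transition, prior=None):
    """Mutual information (in bits) between initial and final states.

    transition[i][f] is the probability p_if that initial element i yields
    final element f. A uniform prior over initial elements is assumed when
    none is supplied (an illustrative assumption, not from the text).
    """
    n, m = len(transition), len(transition[0])
    prior = prior or [1.0 / n] * n
    # Marginal probability of each final state.
    final = [sum(prior[i] * transition[i][f] for i in range(n)) for f in range(m)]
    info = 0.0
    for i in range(n):
        for f in range(m):
            joint = prior[i] * transition[i][f]
            if joint > 0:
                info += joint * math.log2(joint / (prior[i] * final[f]))
    return info


DETERMINISTIC = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
NOISY = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.05, 0.05, 0.9]]
FLAT = [[1 / 3] * 3 for _ in range(3)]

if __name__ == "__main__":
    for name, matrix in [("deterministic", DETERMINISTIC), ("noisy", NOISY), ("flat", FLAT)]:
        print(name, mutual_information(matrix))
```

On this reading, the deterministic distribution retains all the information about the initial position (log2 3, about 1.585 bits), the slightly noisy distribution retains part of it, and the flat distribution retains none, even though all three admit the same three alternative outcomes.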
There exists no way in which we can rely on data for inferring some kind of preferred association between elements of the initial and final sets, and it is therefore impossible to say anything about the initial position given the knowledge of the final position.

This brings us to the alleged dissociation between unpredictability and historicity (which was the motivation for introducing this formalism). A process with a flat distribution of outcomes is truly unpredictable and information-destroying because there exists no dependence on the (remote) past here. When the noise becomes too prominent, the process becomes both maximally unpredictable and completely uninformative about the past. One could simply say that the outcome is the result of sheer chance, which is different from saying that the process is dependent on its history. This shows that unpredictability does not always go hand in hand with historicity.

Another important conclusion may be derived from this examination of the bowl example. I have said earlier that the existence of multiple outcomes is a necessary condition for a process to be information-preserving, and for history to matter. We often say that history matters when things could have been otherwise. Interestingly, the more formal examination of the bowl example has led us to conclude that the existence of alternative outcomes is not a sufficient condition for historicity. This follows from the fact that total unpredictability (with flat probability distributions) renders the final outcome independent of the past. We cannot say that the final state reveals anything about where the process has been. When every state in the past becomes equally compatible with every state in the future, and we have no record of what happened between the two, then the present cannot help us know what the past looked like, even if there exist alternate end results.
This does not mean that we can ignore the basic fact that each time we release the marble from the rim or repeat an experiment, a process is occurring and a certain outcome results as the unfolding of a particular history. Historicity should not be confounded with this basic (and some would say trivial) fact that all historical processes will have a starting point and an end point, connected by a temporally and causally ordered series of events. The valuable lesson we learn from the disconnection between historicity and unpredictability is that when past-present-future dependency vanishes, as in the extreme case of a flat distribution of outcomes for all elements of an initial set, then we cannot say that history matters in a non-trivial sense.

An interesting comparison can be drawn between the completely unpredictable situation and the adaptive landscape entailed by Kimura’s neutral theory (Kimura 1983). According to the latter, most mutations at the molecular level are neutral because they do not result in changes of fitness. Represented in the fitness landscape framework, this theory predicts that the relationship between the genotypic space (all the possible genetic combinations) and the fitness of organisms is flat, as represented in Figure 2.5.

Figure 2.5 Flat fitness landscape (axes: genotypic value 1, genotypic value 2, fitness).

If all genetic combinations produce the same fitness, then selection cannot lead to directional evolution towards local or global peaks and populations will present extensive divergence resulting from the action of stochastic factors. In such populations, the random mutations become the only source of evolution and the outcome becomes extremely unpredictable. Typically, a population with a flat fitness landscape will describe a random walk, showing no tendency to settle into one genetic combination or another. Now, can we say, like we did for the case where the marble’s trajectory is strongly affected by stochastic factors, that the process becomes information-destroying?
The answer is not straightforward. It depends on how much connection exists between the admitted states of the system. If any genetic combination can be reached from any other combination regardless of where the population has been before, then we shall say that the system is fully connected. Such a situation leads to the absence of the past-to-present dependency necessary for historicity. This could be the case if the population is very large, so that the entire genotypic space is visited at each generation. If this is the situation, then we could say that the population’s trajectory is both unpredictable and information-destroying because nothing in the history of the process enables or constrains the population to reach one state over another.

If, on the other hand, the population cannot “travel” across the genotypic space as thoroughly, so that only certain regions of the space can be explored at each instant, then history matters. In these cases, information from the past is preserved (at least the early past) since some regions of the genotypic space will be left out because of their extremely low probability of being reached at some point. Note that this asymmetry in the possibility of visiting certain states does not entail a process of selection. The latter would occur only if members of the population are attracted towards or repelled by certain regions of the genotypic space.7 The situation entertained here has no such attractors. There is no directionality in populations where mutations are neutral, so the asymmetry has to come from somewhere else. More will be said about the connectivity of different systems in relation to their tendency to converge or diverge in the next chapter, where we will consider the results obtained by Lenski (and collaborators) in the long-term experimental evolution project.
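The role of connectivity on a flat landscape can be illustrated with a toy simulation. Everything in the sketch below is an assumption made for illustration: genotypes are modeled as bit strings, a single lineage stands in for the population, "limited connectivity" means one random site changes per step, and "full connectivity" means any genotype can be reached in one step. No selection is involved in either case.

```python
import random


def agreement_after(steps, length, fully_connected, rng):
    """Fraction of sites still matching the starting genotype after `steps`.

    Hypothetical model: a neutral walk over bit-string genotypes. With
    fully_connected=True every step jumps to a uniformly random genotype;
    otherwise a step flips one randomly chosen site.
    """
    start = [rng.randint(0, 1) for _ in range(length)]
    state = list(start)
    for _ in range(steps):
        if fully_connected:
            state = [rng.randint(0, 1) for _ in range(length)]
        else:
            site = rng.randrange(length)
            state[site] = 1 - state[site]
    return sum(a == b for a, b in zip(start, state)) / length


def mean_agreement(steps, length, fully_connected, trials=2000, seed=0):
    """Average agreement with the starting genotype over many neutral walks."""
    rng = random.Random(seed)
    total = sum(agreement_after(steps, length, fully_connected, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    print("limited connectivity:", mean_agreement(5, 20, False))
    print("full connectivity:   ", mean_agreement(5, 20, True))
```

In the locally connected walk, each site still matches its starting value after t steps with probability 1/2 + 1/2 (1 - 2/L)^t for a genotype of length L, so the current state remains informative about the starting point for some time. In the fully connected case the agreement drops to the chance level of 1/2 after a single step. Both walks are equally free of selection, which matches the point that the asymmetry must come from connectivity rather than from attractors.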
7 This could happen if the population cannot travel in some regions due to constraints in variation, or due to the maladaptiveness of those regions.

For now, I would like to introduce the idea that different processes admit of different degrees of historicity. We agreed that information is destroyed when a system admits of a unique stable equilibrium or when the probability distribution of outcomes does not change as a function of the past. Conversely, we saw that a process admitting of multiple equilibria can retain information from the past if it exhibits sensitivity to changes in initial positions. But systems can retain information to different degrees. When all elements of the initial set lead to the same unique element of the final set with probability 1, then we have perfect convergence and an absence of historicity. At the opposite pole, we have an extremely high degree of historicity when each element of the initial set is associated with only one element of the final set, and no two initial elements share a final element. Here, the slightest change in past conditions would be preserved and would result in an alternative outcome that could not have been reached otherwise. This is perfect divergence. It requires as many final states as we have initial states, such that it is impossible to associate two or more elements of the initial set with one element of the final set. This degree of historicity entails some form of uniqueness, i.e., that each final state can only be realized by one historical past.

In between these two extremes, we have processes admitting of both convergence and divergence. This means that several elements of the initial set can be positively associated with one element of the final set, as with the bowl with two wells and the fitness landscape with few peaks (Figure 2.2). In fact, these systems have a very low degree of historicity, for half of the initial set yields one outcome, whereas the other half yields another outcome.
Note however that a much higher degree of historicity could be achieved if the bowl becomes rippled or if the fitness landscape becomes very rugged. Many wells/peaks produce a higher degree of historicity because less convergence happens, or conversely because the system becomes much more sensitive to changes in initial/past conditions.

2.3 Concluding Remarks

Briefly summing up, we have established in this chapter that historical processes are information-preserving if they are not totally convergent (which is to say that they admit of some alternative outcomes at some point), and if there exists some kind of systematic dependence between changes in the past and variations of outcomes.8 The latter condition eliminates from the range of historicity all purely random processes. We have seen that past-to-present dependency vanishes when a process becomes maximally unpredictable. At least some information must be preserved along the way for history to matter.

We also learned that there exists an interesting connection between the degree of historicity and the degree of convergence/divergence of a process. An absence of historicity is characterized by the fact that some outcome will obtain regardless of the past. All information from the past vanishes once this unique outcome is realized. For information to be preserved, we need at least a pair of alternative outcomes. Yet, it may be possible that a large proportion of the initial set remains associated with (converges towards) a particular element or domain of the final set. As the bowl with only two wells illustrates, certain outcomes have a poor power of discrimination between various past alternatives. When the initial set becomes related to the final set in a more unique way, i.e. each element of the initial set becomes specifically associated with distinctive elements of the final set, then the degree of historicity increases.
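The weaker formal condition attached to this summary (footnote 8: historicity requires an alternative final state F1 such that Pr(I|F2) ≠ Pr(I|F1)) can be checked mechanically for small examples. The sketch below is a hedged illustration: the use of Bayes' rule with a uniform prior over initial states, the decision to ignore final states that are never reached, and the function names are all assumptions added here.

```python
def posterior(transition, prior=None):
    """Pr(initial = i | final = f) by Bayes' rule, returned as post[f][i].

    A uniform prior over initial states is assumed by default. Final states
    that are never reached get None, since no posterior is defined for them.
    """
    n, m = len(transition), len(transition[0])
    prior = prior or [1.0 / n] * n
    post = []
    for f in range(m):
        joint = [prior[i] * transition[i][f] for i in range(n)]
        total = sum(joint)
        post.append([j / total for j in joint] if total > 0 else None)
    return post


def history_matters(transition, i, f2, tol=1e-12):
    """True if some reachable alternative F1 gives Pr(I=i|F1) != Pr(I=i|F2)."""
    post = posterior(transition)
    if post[f2] is None:
        return False
    return any(
        post[f1] is not None and abs(post[f1][i] - post[f2][i]) > tol
        for f1 in range(len(post))
        if f1 != f2
    )


# Illustrative cases: perfect convergence, a bowl with two wells, flat noise.
CONVERGENT = [[1, 0], [1, 0], [1, 0], [1, 0]]
TWO_WELLS = [[1, 0], [1, 0], [0, 1], [0, 1]]
FLAT = [[1 / 3] * 3 for _ in range(3)]

if __name__ == "__main__":
    print(history_matters(CONVERGENT, 0, 0))  # no reachable alternative outcome
    print(history_matters(TWO_WELLS, 0, 0))   # the outcome discriminates pasts
    print(history_matters(FLAT, 0, 0))        # outcomes carry no trace of the past
```

Under these assumptions, the condition fails both for perfect convergence and for the flat distribution, and holds for the bowl with two wells. With two wells the posterior only narrows the past down to half of the initial set, which is the sense in which such a system has a low degree of historicity.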
8 A weaker form of this principle can be expressed formally. Suppose a process with initial state I and final state F2. Historicity occurs iff there exists an alternative final state F1 and the following inequality is met: Pr(I|F2) ≠ Pr(I|F1).

Before we carry this analysis into the realm of historical contingency and path dependence, let me say a few words about the very loose usage I make of the terms “outcome” and “initial condition.” What counts as an outcome or an initial condition? I do not have a good answer to this very difficult question. Ultimately, initial conditions should qualify the state of an entity when it begins its existence and the outcome should characterize its state at the end of its existence. However, this ideal and absolute definition rarely applies in science. In many cases an outcome is what comes at the end of an observation or experiment. And initial conditions are often the set of conditions characterizing the state of a system at the beginning of an experiment or observation. Therefore, in practice, these notions are somewhat instrumental. In the marble example, I deliberately focused on the position of the marble on the rim and at the bottom of the bowl. In the fitness landscape example, I focused on the fitness value of a population as it enters certain conditions and once it has reached a stable strategy. But these observational or experimental starting/end points are instrumentally determined. They certainly depend on the nature of the process under investigation, but they are also justified by the goals I had. So they cannot be taken as absolute initial conditions or outcomes. The marble need not stay at the bottom of the bowl forever and a population will most likely undergo some changes even if it reaches some stable state.
The ultimate beginning is not when an experimenter decides to put together the elements and observe how they change (or not) with time, and the ultimate outcome is not when someone decides to immobilize time in an observation. Yet this instrumental way of using the expressions “initial” and “final” is common enough and I hope that it will not cause any problems of interpretation.

Chapter 3: Historical Contingency I: Gould, Lewontin and the Long-Term Evolutionary Experiment

This chapter examines different usages of the notion of “historical contingency” in evolutionary biology. Biologists typically refer to Stephen Jay Gould and his work on historical contingency when they emphasize that history matters. We will see that Gould uses two importantly different notions of historical contingency, namely causal dependence and unpredictability. Interestingly, very few have noticed the difference between the two notions, and most have used them interchangeably (but see Beatty 2006). A good example of this appears in the series of publications presenting results and reflections related to the “Long-Term Evolutionary Experiment” (LTEE) conducted by Lenski and collaborators since 1988. Nonetheless, these publications are very informative and achieve a deep understanding of the factors affecting the historical nature of evolutionary processes. As I will explain, the upshot of the long-term experiment is that Gould was right about the fact that evolution is historical, but perhaps not to the extent he had suggested.

Richard Lewontin also contributed importantly to the early debates about historicity in biology, with and without Gould (Gould and Lewontin 1979; Lewontin 1966, 1967, 1974, 1978). Although both authors agree on many points, their respective contributions display differences worth mentioning. Gould was more interested in defending historicity in macroevolution. Lewontin on the other hand developed most of his arguments using the tools of population genetics.
I will focus essentially on one of his original contributions, his proof that history can matter because the order of environmental events can affect the pathway taken by initially identical populations evolving in the same average environmental conditions.

3.1 Gould on Contingency

Gould famously alleged that “[a]lmost every interesting event of life’s history falls into the realm of contingency” (1989, 290). However, John Beatty rightly emphasizes that we find two predominant, non-equivalent notions of contingency in Gould’s writings, namely causal dependence and unpredictability (Beatty 2006). The purpose of this section is to present both notions and reflect on their connection to the notion of information-preserving processes. I will argue that the unpredictability version needs to be combined with or complemented by causal dependence (or a notion of this type) in order to result in historicity. Beatty (2006) discusses the possibility of combining both notions in one explanation, but not with a view to developing an account of historicity.

Contingency as causal dependence appears explicitly in one of the last thoughts expressed in Wonderful Life, where Gould addresses the debate related to the alleged inferiority of historical explanations in science. I will leave this debate for a future discussion. For now I simply wish to focus on the notion of contingency that arises in this section of Gould’s seminal book on the nature of history.

Historical explanations take the form of narrative: E, the phenomenon to be explained, arose because D came before, preceded by C, B, and A. If any of these earlier stages had not occurred, or had transpired in a different way, then E would not exist (or would be present in a substantially altered form, E′, requiring a different explanation). Thus, E makes sense and can be explained rigorously as the outcome of A through D … I am not speaking of randomness … but of the central principle of all history – contingency … A historical explanation [rests] on an unpredictable sequence of antecedent states, where any major change in any step of the sequence would have altered the final result. This final result is therefore dependent, or contingent, upon everything that came before – the inerasable and determining signature of history (Gould 1989, 283).

Note that the expression “causal dependence” must be interpreted carefully here, for Gould’s account requires a particular species of causal dependence. A more general sense of causal dependence would integrate cases of overdetermination, such as:

A → B → E ← Y ← X

When two independent processes result in the same outcome E, then the alteration of one chain of events, for example the absence of B, may not preclude the occurrence of E, for example if the route via X and Y remains active. Although this may sound counter-intuitive, Gould’s account of historical contingency as causal dependence means that some past events play a necessary role in bringing about the current state of a system. For instance, suppose that a process goes from initial state s_0 at instant i_0 to outcome s_1 at a later instant i_n. To say that s_1 is causally dependent on s_0 means that changing the initial state to some alternative s_0′ necessarily yields a different outcome, let’s say s_2, at i_n. So, s_0 is causally necessary for the occurrence of s_1.

Consider for a moment the evolution of humans. It is suggested that Homo sapiens branched off from Homo erectus and that the latter was the first hominid species to emigrate from Africa to Europe and Asia. Interestingly, Asian Homo erectus died out and it does not enter the history of modern humans. Migration to Europe was more successful, despite the existence there of the Neanderthals.
The explanation of this crucial moment in our evolution remains difficult. The Neanderthals had survived for over 200,000 years and were well adapted to the rough European winters. Moreover, fossil records indicate that the Neanderthal man was in fact much stronger, taller and had a bigger brain than Homo sapiens. Yet they did not last for longer than a few thousand years after the arrival in Europe of the modern man, who was adapted to the African environment. Why then did the Neanderthals become extinct but not Homo sapiens? The traditional explanation invokes the competitive exclusion of the Neanderthals, potentially due to their lower intelligence, poorer language skills, and the less efficient and ingenious tools they fabricated (Horan et al. 2005). However, recent studies show that the Neanderthals were probably as intelligent as modern humans and also as capable of speech and symbolic representation. So anthropologists and paleontologists considered other explanations and two new hypotheses have recently emerged, both having to do with the division of labor. These hypotheses, if right, suggest that lacking this capacity could have compromised our current existence, or at least changed considerably what the hominid species looks like.

One hypothesis, put forward by Shogren (2006) and Horan et al. (2005), alleges that long-distance trade could explain why H. sapiens survived but not H. neanderthalensis. Archeologists have found stone tools made of non-local materials and sea-shell jewelry far from the coast, which suggests long-distance exchanges in H. sapiens. Moreover, the data show that such trade was going on even 40,000 years ago, i.e., when H. sapiens started to occupy Europe. Shogren and collaborators made an additional interesting discovery, by means of simulation. To see if trade could explain the dominance of H.
sapiens, they developed a model in which two species compete for the same resources, but where populations can differ in various properties, among which is their capacity to trade. Basically, they allowed members of the population to differ in their hunting ability, and made possible the exchange of craft for meat, so that the poorer hunters could also have high-energy food despite their diminished skills for obtaining it directly. Such arrangements between hunters and craftsmen resulted in everyone getting more meat, which increased the overall survival and fertility of the population. Since the meat supply was finite, the population with lesser ability to trade declined. This simulation suggested that the Neanderthals became extinct between 2,500 and 30,000 years, a range that nicely reflects reality. Moreover, they showed that it is possible for H. sapiens to outcompete H. neanderthalensis even if the latter are advantaged in other biological attributes, such as hunting.

The second hypothesis, which in fact complements rather than competes with the one related to trade, looks at another form of division of economic labor in hominids, namely the labor division between women who took care of children, agriculture and small animals, and men who hunted for big game (Harford 2008; Kuhn and Stiner 2006). This relatively simple behavioral trait could have made a huge difference in our evolutionary history. Kuhn and Stiner (2006) suggest that the Neanderthal men, women and children were all behaving very similarly and were all involved in hunting big game. We have very little evidence of remains of small animals like rabbits and tortoises, or that Neanderthals were preserving seeds and nuts, and their tools were much less specialized. Evidence of such diversified and specialized behaviors in Europe coincides only with the arrival of H. sapiens, about 40,000 years ago.
A key aspect of their hypothesis is that the work of women gave our ancestors an advantage when big game was scarce. Agriculture and small game allowed the population to have quality food during non-successful periods of hunting. So H. sapiens grew while H. neanderthalensis shrank. Interestingly, we can relate this hypothesis to the other one. The wimpy/crafty hunters in Shogren’s model could very well have been women and children!

These recent hypotheses suggest that the exclusion of Neanderthals by H. sapiens was due to the fact that the Neanderthals were missing out on an important economic institution: the division of labor. The division of labor is ubiquitous in our societies and it comes quite naturally for us. From the coffee we drink in the morning to the technologies we use throughout the day, most products we are using are the result of the collaboration of several parties. Of course the level of industry and trade displayed by H. sapiens during the Upper Paleolithic was much more modest, and it is still very speculative to assert that they exhibited any behavioral regularities and patterns. Yet, it seems to have made a difference, even in the rudimentary forms described above. If they had displayed a better capacity for sharing tasks in a collaborative way, the Neanderthals could have outcompeted H. sapiens. Our existence would therefore be causally dependent on this aspect of our history.

One may wonder however if Gould’s notion of contingency as causal dependence is general enough to make sense of all forms of historicity. I believe the answer is no if we read Gould’s interpretation as portraying evolution as extremely idiosyncratic. To be sure, Gould’s notion of causal dependence is compatible with the idea of information-preserving processes. In fact, this passage would even suggest that the information from the past is sometimes impossible to erase.
But his account of causal dependence implies that each evolutionary outcome has a unique history attached to it. Changing any aspect of our evolutionary history would have resulted in a very different human species, or perhaps in the total absence of our species. Under this view, virtually everything in the realm of evolution should be unique and no convergence or multiple realization seems possible. Note however that not all notions of causal dependence are that strong, and one can go for a more generous interpretation of Gould’s notion of causal dependence. Gould does after all say that any major changes will lead to alternative results. This weaker notion of causal dependence can admit of some degree of convergence and becomes compatible with the simple examples used to illustrate information-preserving processes in Chapter 2 (see Figure 2.2). It is not true that any changes in the marble’s initial position or trajectory will necessarily result in different outcomes. Similarly, it may be exaggerated to claim that altering slightly our evolutionary past would have made any difference to the way human beings look and behave nowadays. Homo sapiens could have migrated to Europe via an alternate route and still have outcompeted the Neanderthal man and successfully come to occupy the rest of the globe because of its ingenious use of industry.

To my mind, the strong, idiosyncratic notion of causal dependence is too restrictive to serve as a stand-in for historicity. I believe that the weaker version and the framework developed in Chapter 2 are more general and more appropriate than strong causal dependence. Historicity entails that changing the initial conditions or the trajectory of the process could lead to a different (probability of) outcome(s), whereas strong causal dependence entails that changing one element of the sequence should or will necessarily yield a different end result.
As a consequence of this kind of uniqueness, historical contingency as (strong) causal dependence becomes a special case. I prefer the weaker form of causal dependence present in the notion of “information-preserving processes.” It admits of different degrees of multiple realizability and thus reflects the way in which historicity is understood by many biologists, as we will see throughout this dissertation.

Let us now attend to the second notion of historical contingency entertained by Gould, which refers to chance and unpredictability. Gould claimed that “[w]e are the accidental result of an unplanned process ... the fragile result of an enormous concatenation of improbabilities, not the predictable product of any definite process” (Gould 1983, 101–2). He explains this view with a thought experiment called “replaying life’s tape.” The scenario goes thus: suppose that we could stop the course of life, press the rewind button and go back to any time in the evolutionary past. Gould suggests:

[A]ny replay of the tape would lead evolution down a pathway radically different from the road actually taken … Each step proceeds for cause, but no finale can be specified at the start, and none would ever occur a second time in the same way, because any pathway proceeds through thousands of improbable stages (Gould 1989, 51).9

9 I have mentioned in the Introduction that there is a sense of degree of historicity that I have not addressed in my dissertation, which relates to how much of a difference history makes. We see in this quote that Gould expects that replaying life’s tape would result in “radically different” pathways, which suggests a very high degree of historicity, if we are not merely talking about chance variations, but actually considering a combination of causal dependence and chance (see text).

After Beatty (2006), we will call this the “unpredictability” notion of contingency, which differs from the “causal dependence” interpretation. The replaying-life’s-tape experiment suggests a probabilistic view of the world and requires a statistical approach, whereas causal dependence could very well be described deterministically. The unpredictability version of contingency means that the same initial conditions are compatible with multiple outcomes and that we have no way of predicting precisely which one will occur, even if we know the initial state of the system perfectly.10 Gould, and many after him, have related this notion to the idea that history matters. The question is how? In fact, mere unpredictability does not entail that history matters. As concluded in Chapter 2, historicity is lost in cases of extreme unpredictability. Even if it admits of multiple outcomes, a process is information destroying when any past is equally compatible with any future. Mere chance erases traces of history. So there must be something else, some extra element in their explanation, that makes history relevant. Otherwise, replaying life’s tape only emphasizes the role of chance, which is not history. I will further develop this claim in the chapter on path dependence. For now I simply want to emphasize that the unpredictability of any replay can come from the fact that events in the causal chain are conceived as historical accidents. But this alone falls short of historicity: for history to matter, historical accidents (events of low probability) must also have the capacity to causally affect the course of evolution, and consequently lead to alternative outcomes in a way that would let the past be inferred because information is preserved. Thus, Gould’s replay scenario entails historicity only when a notion of causal influence complements the unpredictability conferred by chance. This causal influence is present when altering these past vagaries would make the occurrence of the same outcome extremely unlikely, or the occurrence of an alternative outcome extremely likely.

10 Joel Cohen’s idea of an “irreproducible result” can reflect Gould’s hypothesis (Cohen 1976). Cohen uses Polya’s urn dynamics in order to explain the irreproducible nature of selection processes. He suggests that we do not have to explain the divergent or branching nature of evolution by inferring that the same kind of deterministic factors act differently. In fact, by treating evolution as a probabilistic process with positive feedback, we can easily explain how different outcomes in evolution obtain from the same initial conditions. This is also entailed by Gould’s unpredictability version of historical contingency. However, Cohen develops his example in connection with reproducibility, but says nothing about historical contingency or history dependence. I will elaborate further on this when we get to Chapter 6. We will see a striking resemblance between Polya’s urn experiment and Gould’s description of the replaying life’s tape experiment: the unpredictability version of historical contingency entails that the same initial conditions can yield different outcomes, and Polya’s urn dynamics admit of multiple equilibria that virtually never reoccur when we repeat the experiment.

Gould occasionally omitted this point and simply focused on the realm of chance. Applying historical contingency to our species for example, Gould argues that Homo sapiens isn’t an “anticipated result” but an “improbable and fragile entity fortunately successful after precarious beginnings as a small population in Africa …” (319). Yet, it is clear that the overall argument alleges that evolution does not progress linearly or converge towards a unique and inevitable outcome (what he calls the ladder and the cone iconographies). He argues for the existence of several “… alternative worlds that didn’t emerge, but might have arisen with slight and sensible changes in some early events” (Gould 1989, 309).
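The Polya urn dynamics that Cohen invokes (footnote 10) can be illustrated with a minimal sketch. It assumes the simplest textbook version of the urn (one red and one black ball to start, and each drawn ball returned together with one extra ball of the same colour); the function names, the number of draws and the seeds are hypothetical choices made for illustration, not Cohen’s own setup.

```python
import random


def polya_urn(n_draws, seed, red=1, black=1):
    """Return the final fraction of red balls after n_draws reinforced draws.

    Assumed rule (textbook Polya urn): draw a ball at random, put it back,
    and add one more ball of the same colour (positive feedback).
    """
    rng = random.Random(seed)  # each seed plays the role of one "replay"
    for _ in range(n_draws):
        if rng.random() < red / (red + black):
            red += 1
        else:
            black += 1
    return red / (red + black)


def replay(n_replays, n_draws=1000):
    """Replay the urn from the same initial conditions with different chance events."""
    return [polya_urn(n_draws, seed) for seed in range(n_replays)]


if __name__ == "__main__":
    # Same starting urn every time; only the sequence of chance draws differs.
    for fraction in replay(5):
        print(round(fraction, 3))
```

Each replay starts from the same urn, yet the early draws shift the probabilities of all later draws, so replays tend to settle near different proportions. This is only meant to make the idea of chance combined with positive feedback concrete.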
It is interesting to note that Simon Conway Morris, the same paleontologist who undertook the reinterpretation of the Burgess Shale fossils and who provided data and suggestions for Gould’s Wonderful Life, comes to a very different conclusion in his book Life’s Solution: Inevitable Humans in a Lonely Universe. Diametrically opposing Gould, Conway Morris wants to “… refute the notion of the ‘dominance of contingency’” (p. 297). Both authors agree that the evolutionary routes are many, but contrary to Gould, Conway Morris sees the destinations as limited. Taking a theological perspective, he claims that evolution is fundamentally convergent, that “[t]he science of evolution does not belittle us …[for] something like ourselves is an evolutionary inevitability, and our existence also reaffirms our one-ness with the rest of Creation” (pp. xv–xvi). His argument goes beyond theology, though, and relies primarily on empirical evidence of patterns and trends found in evolution. Given the type of universe we reside in, carbon-based self-organizing entities were almost inevitable. Significantly different forms of genetic code may have arisen in the origin of life, and researchers have found some of these alternatives in the laboratory, but only one seems to have persisted in nature. Eyes, ears and legs have evolved more than once. Were we to replay life’s tape, sentience and intelligence would most certainly evolve again (he argues). The next step follows almost too naturally: “As all the principle properties that characterize humans are convergent, then sooner or later, and we still have a billion years of terrestrial viability in prospect, ‘we’ as a biological property will emerge” (Conway Morris 2003, 96).11 Conway Morris’ book is welcome in this debate and his arguments are convincing in certain respects. Note however that the argument is aimed at unpredictability, not at historicity.
He would certainly agree that a different type of universe would have yielded different outcomes. So he is not attacking the notion of causal dependence. His target is rather the idea that we (or any other life forms) were unlikely in this universe. Although his claim is interesting and supplemented by much scientific evidence, I believe that it remains controversial. As established in Chapter 2, evolutionary outcomes most often depend significantly on the mutational order: changing the order of mutations can lead the same population to evolve very different phenotypes. The point will be discussed again later. Such sensitivity to mutational order allows us to predict that human-like creatures will possibly re-evolve only if the list of properties characterizing humans arises in the exact same order. This is a much harder case to make. Conway Morris can always claim that he is not arguing that something identical to us will evolve. The convergences highlighted in his book concern not the nitty-gritty details, but the general structures and patterns observed in the realm of life. The reign of contingency is not totally annihilated in his argument, but it is pushed back to the fine details of life forms. So the conclusion we reach may in fact depend on the level of organization we consider.

11 Note that Conway Morris is not alone in arguing that ‘we’ are inevitable on more empirical bases (de Duve 1995; Van Valen 1991).

Conway Morris could be criticized for taking an extreme, perhaps even provocative stance. We may observe important convergences in evolution that are likely to recur in many replays. But this seems insufficient for asserting that all replays will contain the same patterns, always! We can appreciate that not everything in the history of evolution must follow a different path if the ancestral conditions are slightly modified. As argued above, we don’t have to endorse an extreme version of causal dependence.
Nor do we have to agree that everything in evolution falls under the realm of contingency. But Conway Morris focuses mostly on the extreme case of information-destroying processes. What is lacking in this debate between Gould and Conway Morris is a better understanding of the ground existing in the middle. The following section offers, I believe, a moderate view of biological historicity. Yet, we will see that, like Gould, biologists do not always differentiate between the two forms of contingency and commonly go from one to the other.

3.2 Historical Contingency in the Long-Term Evolutionary Experiment

Because of the scale at which they have been imagined, Gould’s views on historical contingency are very difficult to test empirically. We cannot go back in time and remove or modify past events to see if the evolutionary process is causally dependent. Nor can we replay life’s tape and see if “we” would still be here displaying the same morphological and behavioral traits. As far as we can tell, life on earth has evolved only once and there are no other planets or universes offering the conditions for naturally testing such scenarios. As Conway Morris suggested, we are lonely in this universe.

Fortunately, we don’t have to wait for scientists to discover extraterrestrial life forms in order to determine if historical contingency has any grip on reality. There exist at least three ways in which we can elucidate questions about the role of history and test if evolution is unpredictable (i.e. divergent in an unexpected way) or not. One consists of using simulations and models. It is easy to modify the values of variables in a time series or to run a simulation twice and see if the system behaves in an information-preserving or information-destroying manner. However, mathematical models never achieve the complexity of living systems, and this lack of realism is a critique that any simulation must be ready to accept.
Nevertheless, we will see that mathematical models can be very useful in generating insights about what could be happening in natural populations. A second alternative is to use the comparative method, i.e. look in nature for conditions of the same type and see if species tend to evolve in the same way. This was Gould’s and Conway Morris’ method of choice. However, these conditions may be extremely rare, depending on how much similarity in the initial conditions and environments one is looking for. Another difficulty, if the purpose is to understand the relationship between changes in the historical series and outcomes in the long run, is that we cannot manipulate the variables. A third avenue, which I think offers a nice compromise between realism and the need to manipulate some parameters, consists in conducting lab experiments with organisms having short generation times, such as viruses and bacteria. This third alternative can at least partially solve the issue of time travel and multiple universes by creating replicated (cloned) populations that are subsequently and simultaneously propagated in identical environments. These lab experiments also offer better control over the conditions in which the populations evolve as well as over various parameters judged important in the evolutionary dynamics.

This section focuses on one lab experiment in particular, the so-called Long-Term Evolutionary Experiment (LTEE). This experiment started in 1988 and is still going on today. It started with twelve genetically identical populations of the bacterium E. coli derived from the same clone.12 These lines have been propagated in a new controlled milieu. Each of them has evolved for over 44,000 generations, and samples have been frozen every 500 generations. The frozen library provides a rich historical record allowing the team to go back in time and replay the experiment from different instants.
The goal of the experiment is to understand better the dynamics of adaptation and divergence in evolution. Over the years, the investigators have been closely following these populations, measuring various characteristics such as genotype, fitness, cell size, and other phenotypic traits. The paper by Travisano et al. (1995) is especially interesting for understanding the conceptual foundations of this experiment. They identify and explain three basic influences guiding the evolution of these lines: adaptation, chance and history. All three have different repercussions on evolution, and the authors developed systematic ways in which their effects can be measured numerically and compared.

12 To be more precise, the twelve populations were identical except for one neutral marker that distinguished six lines from six others.

Adaptation and chance can be detected by comparing the mean value of some trait in ancestral and derived populations. As explained above, the ancestral populations are genetically identical and they are put in a new medium, in which they will evolve new adaptations. If one finds that “… none of the derived populations has changed significantly relative either to their common ancestor or to one another” (87), it means that the measured traits have not evolved (represented in Figure 3.1A). If on the other hand one finds a significant variation among the derived populations, but the grand mean (over all populations) has remained roughly the same, then they infer that chance or drift had an impact (Figure 3.1B). Another possible result is that the grand mean of the derived populations changed significantly from the ancestor, but without significant variation among the populations (Figure 3.1C). This kind of effect is interpreted as evidence of adaptation to the new condition of existence. A fourth scenario (Figure 3.1D) combines the influence of chance and adaptation.
This is characterized by significant variation among the derived populations and a significant change in the grand mean of the trait.

The effect of history can also be visualized in this framework, but at least for this occasion, the authors suggest a slightly different setting than the one put forward for adaptation and chance. Instead of having twelve genetically identical populations introduced in a new medium, we need multiple groups (at least two) of different ancestral genotypes, also introduced into a new medium. If the derived populations tend to homogenize, then it suggests that historical differences have been erased by adaptation and/or chance (Figure 3.1E). If on the other hand the variation among sets of derived populations remains significantly different and similar in magnitude to the initial variation among the ancestral groups, then we can conclude that history has left a trace on the evolution of these populations (Figure 3.1F).

Figure 3.1 Effects due to adaptation, chance and history on evolutionary dynamics. A) No initial variations and no evolutionary changes. B) Effect due to chance only. C) Effect due to adaptation only. D) Effect due to both chance and adaptation. E) Initial effect due to history eliminated by subsequent effects due to chance and adaptation. F) Initial effect due to history is maintained, with subsequent effect due to chance and adaptation superimposed. (From Travisano et al. 1995, 87. Reprinted with permission from AAAS.)

More generally, the effect of history will be detected if the variations in traits between the different ancestral populations are conserved (or amplified) once they are transferred into the new medium. This form of historicity relates very well to the notion of information-preserving processes and the idea of dependence on initial conditions. We have a non-homogeneous set of initial populations and we are looking for the maintenance or amplification of ancestral differences.
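The three signals distinguished in this framework can be made concrete with a rough descriptive sketch. It is not the statistical procedure used by Travisano et al. (1995); it simply assumes that adaptation shows up as a shift in the grand mean, chance as variance among replicates derived from the same ancestor, and history as variance among the means of the different ancestral groups. The function name, the data layout and the toy numbers are hypothetical.

```python
from statistics import mean, pvariance


def decompose(ancestral, derived):
    """Crude summary of adaptation, chance and history for one trait.

    ancestral: {group name: ancestral trait value}
    derived:   {group name: [trait values of replicate derived populations]}
    """
    group_means = [mean(values) for values in derived.values()]
    # Adaptation: shift of the grand mean away from the ancestral mean.
    adaptation = mean(group_means) - mean(list(ancestral.values()))
    # Chance: average variance among replicates sharing the same ancestor.
    chance = mean([pvariance(values) for values in derived.values()])
    # History: variance among the means of the different ancestral groups.
    history = pvariance(group_means)
    return {"adaptation": adaptation, "chance": chance, "history": history}


if __name__ == "__main__":
    ancestors = {"A": 1.0, "B": 2.0}
    # Ancestral difference erased (compare Figure 3.1E).
    print(decompose(ancestors, {"A": [3.0, 3.0], "B": [3.0, 3.0]}))
    # Ancestral difference maintained (compare Figure 3.1F).
    print(decompose(ancestors, {"A": [2.0, 2.0], "B": [3.0, 3.0]}))
```

In the first toy data set the history term drops to zero while the adaptation term is positive; in the second, the difference between the two ancestral groups survives in the derived populations.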
When ancestral differences are preserved in this way, Travisano et al. infer that neither adaptation nor chance has erased traces of history. So we see in this interpretative framework a strong distinction between adaptation, chance and history. The idea that adaptation is opposed to historical contingency is relatively common in population genetics. Adaptation is the result of natural selection and it can erase history when it has a homogenizing effect on derived populations.13 Note however that adaptation rarely acts alone. As we will see below, we can have evidence of directional changes in the mean value of a trait and still have a certain degree of divergence among the derived populations. This is to say that adaptation does not always preclude chance and history. Another interesting and less commonly discussed aspect of their framework is that chance too can mask the importance of historical differences. The different ancestral populations form relatively homogeneous clusters. In a sense, one could see a kind of structure in the initial sets. If chance becomes very important, the derived populations will cease to exist as relatively isolated clusters of organisms; the initial structure will vanish and there will be no more variation between the groups than within them.

13 Travisano et al. (1995) acknowledge that a trait may not be targeted by natural selection and still present what would look like adaptation. This would be the case for instance if two traits are correlated.

Thus chance too has a homogenizing effect, but in a different way than adaptation. Adaptation to the same environment can homogenize the derived populations by making them more similar (convergent evolution). Chance on the other hand can make the derived set more homogeneous by making it more scattered (divergent evolution). Divergence in the derived population is therefore not sufficient to claim that history matters.
This can be related to the claim made in Chapter 2 that pure chance can lead to information-destroying processes and therefore can be opposed to historicity. We will see that a very similar understanding of historicity exists in the literature on phylogenetic constraints (Chapter 7). Unfortunately, this distinction between historicity and chance is often absent in the literature. It is more common to see a strong association between historical contingency and unpredictability. Perhaps this comes from the fact that it is very difficult to distinguish the two effects and that both factors seem to act in combination. In fact, many of the results in the LTEE suggest a combined effect of chance and history (Travisano et al. 1995), and we will see evidence of this combined action in the following examples. But before we go there, let me mention that the analytical framework put forward by Travisano et al. (1995) is not the only way in which the impact of history can be detected. In what follows, I will show that we can witness the role of history (and chance) in initially identical populations too.

One of the first results reported by the LTEE group clearly indicates that both adaptation and chance are playing a role (Lenski et al. 1991; Lenski and Travisano 1994; Travisano et al. 1995). Consider Figure 3.2, showing that the average fitness14 has increased in all twelve populations very early in the experiment. This result is not very surprising, at least in the case of fitness. An increase in average fitness was expected because the ancestral populations had evolved for many generations on a quite different medium. The bacteria did very poorly once put in their new environment, but with time the populations came to consist of bacteria more efficient at using the resources and reproducing. So the increase in mean fitness was a natural reaction to the fact that their environment had changed significantly and that they needed some time to adapt to it.
Figure 3.2 Trajectories for mean fitness relative to the ancestors in 12 replicate populations during 10,000 generations. (From Lenski et al. 1994, 6811. Reprinted with permission from National Academy of Sciences, U.S.A.)

14 The mean fitness of a derived population was obtained by allowing it to compete against the common ancestor. The relative fitness was then obtained by calculating the ratio of the competitors’ realized rates of increase.

Although adaptation occurred in all populations, we can nevertheless observe some divergence in the mean fitness of these populations. This divergence is in itself a surprising result given the conditions of the experiment: identical genetic materials in identical environments.15 Several things could be responsible for such divergence. Lenski et al. (1991, 1994) have considered three scenarios, and historical contingency is important in at least two of them.16 I will summarize these scenarios because they depict a form of historical influence that differs slightly from the one entertained above in Figure 3.1. We will invoke history as an explanatory element although all populations are initially identical.

One of the scenarios suggests that some divergence will occur along the way, because chance variations do not happen simultaneously in all populations, but in the limit, all populations would converge to the same fitness peak. So this scenario basically claims that divergence is essentially due to chance,17 and history has only a temporary effect. This scenario envisages an adaptive landscape with a global equilibrium towards which all populations converge, but via alternative routes. They all acquire the same range of benefits, but not simultaneously.

15 One could look at the extent to which these populations diverge and see different degrees of historicity. History matters more if the differences in fitness are greater, but it matters less if fitness values diverge to a lesser extent. See also Introduction and Footnote 9 of this Chapter.

16 One of the hypotheses, ruled out in their 1991 contribution (Lenski et al. 1991), suggested that divergence results from an unstable adaptive surface. The adaptive landscapes considered in Chapter 2 were always stable. However, it can happen that the ecological interactions among genotypes are such that the landscape becomes unstable. In particular, the fitness of some genetic combinations (in this case some genetic clones) will be non-transitive. This would occur if, for example, clone B outcompetes clone A, clone C outcompetes clone B, but clone A outcompetes clone C. This scenario implies that the mean fitness relative to the ancestral state will show some periods of decline. The data showed no periods of decline.

17 The effect of chance is detected from the presence of a significant amount of among-population variation. By chance they mean either random mutations, drift, or a combination of both. The absence of among-population variation in the mean value of some traits would have indicated that chance is not an important factor in the process. If all populations had reached the same level of average fitness for instance, then the dynamic of their evolution would have been parallel. But the significant amount of variation among the derived populations suggests that chance has affected the evolutionary dynamics. The question is to what extent? The scenario entertained here suggests a low and momentary influence.

This scenario was ruled out after 10,000 generations, as they realized that these divergences were maintained (see Figure 3.2 and Lenski et al. 1991 and 1994). But let’s forget for a moment that the divergences happen to be stable, and let’s suppose that we only have the results of 2,000 generations. Can we say that history has any role here? Some may say no, for the long-term result could have been a long-run convergence of mean fitness.
The different mutational histories only cause transient divergences in the trajectories of relative fitness. There is something true about this conclusion. But we can nevertheless appreciate that the history of beneficial mutations matters at least before the population reaches an equilibrium state. The end result may be reproducible, and hence predictable, but what happens along the way remains unpredictable and somewhat influential when the population is far enough from its equilibrium. I will argue in another chapter that we can perceive a milder degree of historicity in this type of scenario.

After observing the 12 populations for 10,000 generations, they concluded that the divergences were in fact stable because the among-population variation in mean fitness was sustained for a very long period of time (Lenski and Travisano 1994). They took this to confirm the scenario according to which derived populations approach separate peaks of unequal mean fitness on a stable adaptive surface. This supports the Wrightian model presented in Chapter 2, where the evolutionary dynamic is information preserving because of epistatic interactions and random mutational order. They claimed that their experiment “demonstrates the crucial role of chance events (historical accidents) in adaptive evolution” (Lenski et al. 1994, 6813). Chance most certainly plays a role, but I would like to add that history, or more precisely the history of mutations, is a key explanatory element. We see this more clearly in a paper by Johnson et al. (1995), who developed a model showing that initially identical populations in identical environments can subsequently diverge from one another not only because of genetic drift, but also by selection on different beneficial alleles arising in a random order. A very similar analysis was performed more recently by Wahl and Krakauer (2000).
They divide the evolutionary dynamic for initially identical populations evolving in identical environments into two classes. The coincident-event dynamic happens when all populations integrate the same mutation at the same time. This entails that populations with a coincident-event dynamic will display parallel evolution (Figure 3.3a).

Figure 3.3 Example of (a) parallel evolution, (b) transiently divergent evolution, and (c) sustained divergent evolution. See text for explanation. The x-axis represents time and the y-axis represents any mean trait value.

Alternatively, the isolated-event dynamic happens when different beneficial mutations are integrated in different populations at different instants. This entails that populations with an isolated-event dynamic will have a divergent evolution, a divergence that may or may not be sustained (Figure 3.3 b and c). The scenarios of sustained and transient divergence highlight the impact of mutational history. We can therefore observe the influence of history even if we do not have variation in initial (ancestral) populations. Although the starting point is identical, divergences are due to changes in the order and timing of historical accidents. I will argue in Chapter 6 that the notion of path dependence can be useful in interpreting this form of historicity, where we don’t have differences in initial conditions.

Interestingly, these mathematical models show that the type of dynamic is strongly influenced by the product Nu, where N is the size of the population and u is the rate at which beneficial mutations occur. When Nu is large, and selection is strong, then the entire genetic space is visited thoroughly and simultaneously at each generation and selection can favor the beneficial types coincidentally in all populations, thus resulting in coincident-event dynamic and parallel evolution.
This could characterize some but not all populations of small organisms such as viruses and some bacteria (Wahl and Krakauer 2000, 1447). On the other hand, when Nu is small, the waiting time for a beneficial mutation to occur tends to be longer, which leaves room for alternative mutations to become fixed at different moments in different populations. Some beneficial mutations may arise, but they become lost by genetic drift before being selected. The latter situation would therefore result in isolated-event dynamic and divergent evolution, either transient or sustained. The divergence will be transient if there exists but one combination with significantly higher fitness that all replays eventually acquire, but it will be sustained if the landscape is rugged.18

18 Note however that these models ignore the possibility of neutral mutations. If mutations are neutral, i.e., if they do not affect the fitness of organisms, then the populations will simply wander across the genetic space without showing a tendency to stabilize in one region.

Another interesting usage of “historical contingency” appears in a more recent contribution from the LTEE group (Blount et al. 2008). This one too recognizes the combined effect of chance and historical factors, and it adds an interesting twist to the interpretation of historicity.

At its core, evolution involves a profound tension between random and deterministic processes. Natural selection works systematically to adapt populations to their prevailing environments. However, selection requires heritable variation generated by random mutations, and even beneficial mutations can be lost by random drift. Moreover, random and deterministic processes become intertwined over time such that future alternatives may be contingent on the prior history of an evolving population.
For example, multiple beneficial mutations will arise in some unpredictable order … thus constraining some evolutionary paths while potentiating other outcomes (p. 7899).

Historical contingency as described in the second part of this passage emphasizes the fact that certain evolutionary histories can constrain or potentiate different outcomes. This view of historical contingency appears in the explanation of one of the most recent findings in the LTEE project. The bacterium used for this experiment, E. coli, does not usually use citrate as a source of energy in aerobic conditions. Citrate was nevertheless abundant in the medium from the very beginning because it is involved in the transport of iron into the cell. Despite the potential of using citrate in anaerobic conditions and the ecological opportunity to evolve a citrate-metabolizing phenotype in aerobic conditions, none of the 12 populations showed any sign of such a capacity for over 30,000 generations. This indicates that evolving a citrate-metabolizing phenotype (Cit+) is a hard thing to do (at least in the conditions set up by the LTEE). It was therefore a great surprise to discover that one of the twelve populations evolved that capacity after 31,500 generations. They claim that this is one of the most profound adaptations observed during the LTEE. This new phenotype has achieved a severalfold increase in size and it possesses a much higher fitness than the rest of the populations.19 Two hypotheses were formulated to explain the late arrival of such a beneficial mutation.

The long-delayed and unique evolution of the Cit+ phenotype might indicate that [1] it required some usually rare mutation … that does not scale with the typical mutation rate. [2] Alternatively, the occurrence or phenotypic expression of the mutation that generated the Cit+ function might depend on one or more earlier mutation, such that its evolution was contingent on the particular history of that population (Blount et al.
2008, 7900).

The first scenario is called the rare mutation hypothesis, whereas the second scenario is called the historical contingency hypothesis, and both are illustrated in Figure 3.4.

Figure 3.4 Alternative hypotheses for the origin of the citrate-metabolizing (Cit+) phenotype. The rare mutation hypothesis assumes a very low but constant probability of mutation from Cit- to Cit+. The historical contingency hypothesis assumes a shift in the probability of mutation from Cit- to Cit+ due to the acquisition of a certain set of mutations (From Blount et al. 2008, 7901. Reprinted with permission from National Academy of Sciences, U.S.A.).

19 Recall that the fitness is established by letting the derived population compete with the ancestral population and by measuring the ratio of ancestral/derived. In fact, it is interesting to note here that the derived population does not totally outcompete the ancestral type. They note for the first time a stable polymorphism with a Cit- minority coexisting with a Cit+ majority.

Because they keep a frozen fossil record of the different populations every 500 generations, the group was able to “replay” the evolution of this lineage before and after the arrival of the new phenotype. They observed a higher proportion of citrate-metabolizing phenotype in the later generations, thus “ … support[ing] the hypothesis of historical contingency, in which a genetic background arose that had an increased potential to evolve the Cit+ phenotype” (Blount et al. 2008, 7903). So the numbers suggest that one lucky population went through a series of contingencies that made possible the emergence of an ability to metabolize citrate in aerobic conditions, as eloquently expressed in their very literary conclusion:

[O]ur study shows that historical contingency can have a profound and lasting impact under the simplest, and thus most stringent, conditions in which initially identical populations evolve in identical environments.
Even from so simple a beginning, small happenstances of history may lead populations along different evolutionary paths. A potentiated cell took the one less traveled by, and that has made all the difference (Blount et al. 2008, p. 7905).

A fuller support of this hypothesis would require that they find the elements in the mutational history of that lineage that favored that capacity. Yet, the historical contingency hypothesis offers an interesting point of view on the role of history. The idea that a certain mutational history affects the probability of occurrence of certain traits is the fundamental element that allows us to relate the replaying life’s tape experiment with historicity. It is not only a matter of reaching different evolutionary states from the same starting point. This can be achieved by chance alone. Historical accidents lead to historicity only when they can shift the probability of occurrence of various evolutionary outcomes. As we will see later, this also constitutes the basis for the notion of path dependence.

In retrospect, the LTEE project and the theoretical models developed in order to understand the dynamics of divergent and parallel evolution show that Gould may be right, but not in all conditions or at all levels of analysis. They show that the same initial conditions can yield alternative and unpredictable evolutionary outcomes, but they also show that the same organisms can display convergent and/or parallel evolution, depending on which trait we are following. These analyses also advanced our understanding of the conditions that can lead to divergent or convergent evolutionary paths. But we still cannot predict exactly what these alternative trajectories should look like. There have also been interesting conceptual advances over the years in the literature surrounding the LTEE.
One of the most important, I believe, is the capacity to draw a distinction between mere chance and historical contingency, which often combines causal dependence and unpredictability. This is essential to the notion of historicity as a property of information-preserving processes, and we will see in Chapters 4 & 5 that it is also central to the notion of path dependence.

3.3 Lewontin and the Role of Environmental History

We have seen in the Introduction that Richard Lewontin also tackled the idea that the mutational order affects the evolutionary process and that the equilibrium point erases history. Here, I would like to highlight another interesting and original contribution by Lewontin, namely the proof of the importance of environmental history. The dynamics of gene-frequency change has a built-in historicity relying on the order of environmental events. He develops this argument in two early papers (Lewontin 1966, 1967), in which he demonstrates that, in a fluctuating environment (and what environment does not fluctuate?), the order of environmental changes affects the trajectory and final outcome of two populations starting at identical initial gene frequency and exposed to the same average environmental conditions. Imagine an environment in which some days/months/years are cooler than average, some warmer, some wetter, some drier, etc. Selection for or against some genotypes will therefore fluctuate accordingly, causing the gene frequency to respond to these environmental fluctuations. Now imagine another environment in which there are exactly the same numbers of cooler, warmer, wetter and drier days, but in which the fluctuations above or below the average occur in a different (reversed) order. Suppose also two initially identical populations inhabiting those two environments. As it turns out, their evolutionary trajectories and outcomes after 50 generations may diverge considerably, as Lewontin showed by way of a simulation.
Figure 3.5 shows a typical result obtained using his model (details of the model are found in his 1967 article). In the first run (full line), Lewontin randomly varied the environmental values associated with two genotypes, A and a. In the second run (dotted line), he applied the very same environmental values, but in reverse order. The difference is remarkable. For example, in the first run, the frequency of the charted allele was below 0.5 for most of the duration of the simulation, whereas in the second run, it was above 0.5 most of the time. Furthermore, not only were the pathways different, but so was the outcome after 50 generations.

Figure 3.5 Evolutionary pathways with order of selection values reversed (see text). Solid line: allele frequency (right-hand y axis) changes in response to fluctuating environments. Cross-marks represent selection values (left-hand y axis) for and against the allele. Dashed line: allele frequency changes in response to the reverse order of selection values (From Lewontin 1967, 86. Reprinted with permission from Wistar Institute).

The main reason behind this discrepancy of behaviors and outcomes can be found in the basic Mendelian models. Consider the following standard equation determining the allele frequency in the next generation for a population of haploid organisms with two possible alleles, A and a.

$$Q_{t+1} = \frac{W_A Q_t}{W_A Q_t + W_a (1 - Q_t)} \tag{3.1}$$

Here, $Q_t$ is the frequency of allele A at time period t, and $W_A$ and $W_a$ stand for the fitness of alleles A and a respectively. We can simplify this equation by introducing the relative fitness of allele A, $V_A$, given by the equation $V_A = W_A / W_a$. Dividing the numerator and denominator of Equation 3.1 by $W_a$ and substituting $V_A$, we obtain:

$$Q_{t+1} = \frac{V_A Q_t}{V_A Q_t + (1 - Q_t)} \tag{3.2}$$

We see that the frequency of allele A in the next generation depends on two variables, namely its relative fitness and the allele frequency in the current generation.
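The dependence on both variables can be checked numerically. The following is a minimal Python sketch of Equation 3.2, not Lewontin's own simulation; the function names and the relative fitness value 1.1 are illustrative assumptions. Subtracting $Q_t$ from both sides of Equation 3.2 gives the one-generation change $(V_A - 1) Q_t (1 - Q_t) / [V_A Q_t + (1 - Q_t)]$, which the sketch evaluates at a low, an intermediate and a high current frequency.

```python
def next_frequency(q, v):
    """One generation of Equation 3.2.

    q: current frequency of allele A (Q_t)
    v: relative fitness of allele A (V_A = W_A / W_a)
    """
    return v * q / (v * q + (1.0 - q))


def one_step_change(q, v):
    """Change in allele frequency over one generation, Q_{t+1} - Q_t."""
    return next_frequency(q, v) - q


def closed_form_change(q, v):
    """Same change, written as (V_A - 1) Q (1 - Q) / [V_A Q + (1 - Q)]."""
    return (v - 1.0) * q * (1.0 - q) / (v * q + (1.0 - q))


if __name__ == "__main__":
    v = 1.1  # hypothetical relative fitness, chosen only for illustration
    for q in (0.05, 0.5, 0.95):
        print(f"Q_t = {q:.2f}  change = {one_step_change(q, v):.4f}")
```

Under these assumed values the same selection pressure moves an intermediate frequency several times further than a frequency close to 0 or 1, while the sign of the change is set by whether $V_A$ lies above or below 1.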
The relative fitness gives an indication of the direction of selection at a certain time period. It indicates whether the environmental conditions favor one allele over another. If the relative fitness is higher than 1, the fitness of allele A is higher than the fitness of allele a, and the frequency of A in the next generation will increase. If, to the contrary, the relative fitness is lower than 1, then the frequency of allele A in the next generation will decrease. If the relative fitness equals 1, then the allele frequency in the next generation remains the same, meaning that there is no selective advantage for either allele. More importantly for the present discussion, Equations 3.1 and 3.2 entail that the magnitude (but not the direction) of change in allele frequency is also affected by the allele frequency at the current time period. To see this more easily, we can look at the following differential equation:

$$\frac{dQ}{dt} = s_c Q_t (1 - Q_t), \tag{3.3}$$

which describes how the allele frequency changes over time. Equation 3.3 can be derived from Equation 3.2 by factorizing and writing the proportional difference in fitness, $(W_A - W_a)/W_a$, as a selection coefficient $s_c$ (see Otto and Day 2007 for details). We see from 3.3 that the rate of change is proportional to the product $Q_t(1 - Q_t)$. Therefore, a high or a low allele frequency entails that the population will respond less effectively to a given selection pressure (i.e. will yield a lower rate of change), whereas a moderate allele frequency entails a higher sensitivity to the same selection pressure. The “built-in” historicity highlighted by Lewontin (1966, 1967) follows from the above dependence on allele frequency. If we look back at Figure 3.5, we notice that the starting frequency is 0.5, the value at which any selection pressure results in the greatest change in frequency.
The initial environmental values included many positive values, which increased Q. However, the subsequent selection values were negative, which pulled Q down to a level so low that it hardly responded to the positive selection regime that followed, and Q after 50 generations remained considerably below 0.5. If we compare this result with the population affected by the same average environmental conditions but where the selection values occur in the reverse order, we see that the allele frequency first dropped under the influence of very negative selection values. It then climbed in response to a mostly positive selection regime, up to a point where it could be only slowly dragged down by a sustained period of strong, negative selection. The discrepancies between the two trajectories result from the fact that a given selection pressure has a differential influence, depending on the gene frequencies at the time. This relatively simple model can therefore illustrate how the order of environmental values (which determine the selection coefficient) can have a considerable impact on evolutionary pathways, and to some extent on the outcome. In fact, even a single reversal in a pair of environmental values can lead to small but persisting divergences.

By exploring the consequences of Lewontin’s model for haploid organisms, one can discover that the stability of the environment can also have an impact on the convergent/divergent dynamic of the populations. The environmental values were generated randomly in Lewontin’s model. Interestingly, the divergence in the final gene frequency of the populations vanishes if the environment fluctuates to a lesser extent. Consider Figure 3.6, where the same experiment was repeated, but this time, instead of allowing the selective values to fluctuate randomly, I have kept them constant in one direction for 30 generations, and then reversed them for another 20 generations.
In the other run, I have started with 20 generations of negative selection of the same magnitude, followed by 30 generations of positive selection pressure.

Figure 3.6 Evolutionary pathways of haploid populations placed in reversed non-random environments (environmental values omitted).

Lewontin never mentioned the importance of changing the environment randomly in order to obtain divergence in the long run. This may have to do with the fact that he was mainly interested in the fact that the two reversed populations take dramatically different pathways. Yet this result shows that we need to be careful in how we interpret Lewontin’s experiment, because the conclusion can differ depending on whether we focus on the pathway or on the outcome of the process. As we see, it turns out that, at least in haploid populations, the divergence in outcome is affected by the stability of the environment.

The situation is slightly different when we try the same experiment with a population-genetic model of diploid organisms. Again, we have two possible alleles, A and a, with frequencies $Q_t$ and $P_t = 1 - Q_t$, but this time three possible genotypes: AA, Aa and aa. The model for predicting the frequency of one allele at the next time period is slightly more complex:

$$Q_{t+1} = \frac{Q_t^2 W_{AA} + Q_t P_t W_{Aa}}{\bar{W}} \tag{3.4}$$

where the average fitness $\bar{W}$ is defined as:

$$\bar{W} = Q_t^2 W_{AA} + 2 Q_t (1 - Q_t) W_{Aa} + (1 - Q_t)^2 W_{aa}. \tag{3.5}$$

As is the case with haploid models, the magnitude of the change from one generation to another depends on the relative fitness and the frequency of the allele at time t. However, an important difference between the haploid and diploid models, which makes a difference for the historicity of outcomes, is dominance. Dominance exists only in populations that are more than haploid. I will consider only the case of diploid populations.
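The diploid recursion of Equations 3.4 and 3.5 can be sketched in the same spirit. This is a hedged illustration rather than the code used to produce the figures in this chapter: the function names, the `run` helper and the fitness triple in the example are assumptions introduced here, and the fitness values are merely of the kind listed in Table 3.1.

```python
def mean_fitness(q, w_AA, w_Aa, w_aa):
    """Average fitness (Equation 3.5) for allele-A frequency q."""
    p = 1.0 - q
    return q * q * w_AA + 2.0 * q * p * w_Aa + p * p * w_aa


def next_frequency_diploid(q, w_AA, w_Aa, w_aa):
    """Frequency of allele A in the next generation (Equation 3.4)."""
    p = 1.0 - q
    return (q * q * w_AA + q * p * w_Aa) / mean_fitness(q, w_AA, w_Aa, w_aa)


def run(q0, regimes):
    """Iterate the recursion over a sequence of (w_AA, w_Aa, w_aa) triples.

    Returns the whole pathway, starting with q0, so that both the
    trajectory and the final outcome can be inspected.
    """
    path = [q0]
    for w_AA, w_Aa, w_aa in regimes:
        path.append(next_frequency_diploid(path[-1], w_AA, w_Aa, w_aa))
    return path


if __name__ == "__main__":
    # Hypothetical regime: 30 generations favoring A, then 20 favoring a.
    regime = [(1.0, 0.9, 0.8)] * 30 + [(0.8, 0.9, 1.0)] * 20
    forward = run(0.5, regime)
    backward = run(0.5, regime[::-1])
    print(forward[-1], backward[-1])
```

Running a regime and its reverse from the same starting frequency, as in the example, is one way to compare pathways and outcomes; the sketch makes no claim about which fitness assignments produce divergent outcomes.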
In the absence of dominance, the fitness value of the heterozygote type is exactly the midpoint between the two homozygote types. Dominance occurs when a bias in fitness value exists for genotypes including one or the other allele. Table 3.1 provides examples of fitness distribution and the corresponding type of dominance.

Table 3.1 Genotype fitness attribution and related dominance

Fitness Attribution              Dominance
AA: 1, Aa: 1, aa: 0.9            Dominance for allele A
AA: 1, Aa: 0.9, aa: 0.8          No dominance
AA: 0.9, Aa: 1, aa: 1            Dominance for allele a

We saw (Figure 3.6) that historicity does not apply to haploid models when we have long time periods with stable selection regimes. The same ahistorical dynamics obtain with diploid populations when there is no dominance. However, this is not true for diploid models when we have dominance for one allele (Figure 3.7) or when we have no dominance, but the selection regime fluctuates randomly from one generation to the other (Figure 3.8).

Figure 3.7 Evolutionary pathways of diploid populations placed in reversed non-random environments (environmental values omitted). Dominance for allele A (see text).

Figure 3.8 Evolutionary pathways of diploid populations placed in reversed random environments (environmental values omitted).

What should we conclude then about the claim that evolution is historical because the order of environments makes a difference to the outcome? Obviously, this claim is not absolutely correct and needs specification. First, it depends on the focus of the analysis. The claim is valid if we consider pathways. A different conclusion follows if we focus on the outcome. A summary of the conclusions we have reached regarding evolutionary outcomes is presented in Table 3.2.
Table 3.2 (A)historical nature of evolutionary outcomes in different populations and environmental conditions

                          Diploid with no dominance   Diploid with dominance    Haploid
Random Environment        Historical (Figure 3.8)     Historical                Historical (Figure 3.5)
Non-Random Environment    Ahistorical                 Historical (Figure 3.7)   Ahistorical (Figure 3.6)

The expression “Historical” means that changing (reversing) the order of selection regimes leads the populations to different outcomes. “Ahistorical” means that changing the order of the selection regimes makes no difference to the outcome, despite the fact that the pathways can differ between the original and reversed populations. By further exploring the behavior of populations governed by the haploid and diploid models, we can conclude that randomizing the selection regimes tends to make the evolutionary process more historical. However, changing the order of non-random environmental conditions takes the population along a different trajectory, but makes no difference to the final outcome for haploid populations and diploid populations with no dominance. Finally, introducing dominance also contributes to the historical nature of the evolution of diploid populations.

3.4 Concluding Remarks

This chapter has addressed some of the most important ways in which history is said to matter in evolutionary biology. Looking back at these influential accounts of historicity, we see that the expression “historical contingency” receives importantly different interpretations and that we need to be careful while reading biologists on this theme. Some studies will emphasize unpredictability, others will look more at causal dependence or dependence on initial/ancestral conditions, while others will look for a dependence on mutational or environmental histories. Moreover, the conclusion we reach about a particular system is contextual and depends on the level of analysis. The results of the LTEE clearly show that some traits in E.
coli tend to evolve in parallel or follow transiently divergent trajectories, whereas other traits clearly undertake sustained divergent evolutionary dynamics. This shows that fairly simple organisms can display sustained divergent dynamics, and as such display historicity. Moreover, as noted by Szathmáry (2006), the lessons obtained from mathematical modeling (Johnson et al. 1995) seem to indicate that historicity should be even more likely to occur for populations of large organisms, like humans (for whom the Nu product tends to be even smaller). To my mind, this shows that Conway Morris’s argument rests on shaky ground. We have evidence of convergent evolution, but we also have a lot of evidence supporting the view that evolution can be greatly affected by historical contingencies and that populations can reach alternative states because of their particular histories.

Note however that the examples discussed here do not exhaust all the ways in which historical contingency is emphasized in evolutionary biology. For the sake of simplicity and conciseness, I have omitted for now the discussion related to phylogenetic constraints and malfunctions. Biologists writing about these topics, such as Gould and Vrba (1982) and Williams (1992) for example, commonly explain their observations by invoking historical contingency. I will address this dimension of historicity in a subsequent chapter. Let us next assess the usage and understanding of “historical contingency” in community ecology.

Chapter 4: Historical Contingency II: Community Ecology

Although more commonly used in evolutionary biology, the notion of historical contingency has also become fairly important in the field of community ecology, in the explanation of the formation and persistence of ecological communities. The main phenomenon in question is community composition, understood either as the actual species constituting a community, or as the species diversity of a community.
The main ecological processes involved in determining community composition are migration and interspecific interactions. “Historical contingency” is said to play an important role in explaining differences in community composition when the latter are believed to be due to chance differences in which species migrated into the area in question, and/or chance differences in the order in which they initially arrived.

In discussions of the important factors determining community composition, ecologists, like evolutionary biologists, use the expression “historical contingency” without distinguishing between causal dependence and unpredictability. We will see that both notions commonly appear together. Something like causal dependence is inherent in many explanations of why the assembly history of a community matters, although it is somewhat hidden by a strong focus on unpredictability. I will suggest in Chapter 6 that this can be sorted out with a notion of probabilistic causal dependence. Here we will see that the notions of information preserving and information destroying processes and the idea of degree of historicity again prove useful in interpreting why and to what extent history matters in community ecology.

4.1 Historical Contingency in the Process of Community Assembly

The type of dynamics and the processes we are about to consider differ from the evolutionary processes discussed in the previous chapters. So let me begin by framing the different components at play in the process of community assembly, and at the same time clarify some of the technical jargon used by ecologists. Consider Figure 4.1, based upon Chase’s (2003) article, which contains the main elements and hypotheses entertained in the debate about historical contingency in community ecology.

Figure 4.1. Left side (environmental determinism): community assembly when there exists a single stable equilibrium for each of several environments (En).
Local diversity and composition are the result of regional species pool and environmental conditions. Right side (historical contingency): community assembly when there are multiple stable equilibria from different assembly histories (Hn) in the same environment.

[Figure 4.1 diagram labels: Regional pool (A B C D E F G H I J K …); Local communities; Environmental Determinism (E1, E2); Historical Contingency (E1 + H1, E1 + H2, E1 + H3, E1 + H4).]

Ecologists don’t always agree on the meaning of ‘community’ (Cooper 2003, Sterelny and Griffiths 1999, Keller and Golley 2000). This problem is beyond the scope of this chapter. For the present purpose, it suffices to conceive of ecological communities as associations of interacting species. The claim of importance/unimportance of history in community ecology often boils down to the importance/unimportance of assembly history, i.e., the order and timing in which species enter into a community. The assembly history can matter essentially in two ways. Either it affects the species diversity, i.e. how many species coexist in a local community, or it affects the species composition of local communities without necessarily changing the diversity from site to site.

When a given regional pool of species and environment tend to yield the same stable local composition, or the same species diversity, then ecologists typically conclude that history does not matter. Environmental determinism thus means that the environment acts as a filter determining both the diversity among and between communities. In contrast, historical contingency is invoked when different stable equilibria can be explained by alternative assembly histories, given the same environment and regional pool. One can see an analogy with evolutionary biology here.
Just as the order of mutations can affect which one of the alternative adaptive peaks a population will climb, the order of invasion can result in alternative community compositions. This is the big picture. Let us now explore the factors underlying these hypotheses.

Jared Diamond’s (1975) seminal paper on the assembly of ecological communities in the Bismarck islands is a landmark study for ecologists taking seriously the idea that history matters. The almost book-length article was published in a collection edited by Martin Cody and Jared Diamond, Ecology and Evolution of Communities. It shows that the structure of communities (i.e. the number of species) and the specific community composition of New Guinea and its satellite islands depend on various factors, such as the size of the island, the distance from the mainland, the dispersal capacity of species, the intensity of competition for resources and chance (historical accidents).²⁰ Diamond does not use the expression “historical contingency” in this paper and he is not the first to emphasize the role of invasion sequence in the process of community assembly (see Booth and Larson, 1999). Yet ecologists frequently refer to Diamond’s paper as one of the discussions that launched the topic. Consider for instance his concluding words in a section entitled “Chance or Predestination,” in which he expounds the two extreme possibilities summarized in Figure 4.1:

At one extreme, the species composition of an island fauna can be uniquely determined by an island’s physical properties. Combinations of colonists might be reshuffled through invasion and extinction until the best-suited groups of colonists had been assembled, and these would then persist. […] At the other extreme, chance in the form of random historical events might play a large role in building up nonidentical communities that represent alternative stable equilibria (Diamond 1975, 440-441).
The first extreme mentioned in this passage summarizes the left side of Figure 4.1. Through a process of trial and error, the best-suited group of species for a given environment will come together. There exists but one (optimal) equilibrium for each environment, an equilibrium determined by physical or environmental factors such as the distance from the mainland and the island’s area. This is equivalent to the equilibrium theory of island biogeography, except that it considers not only the diversity of species in a community, but also its specific composition. Because these theories entail that the formation process for a given environment will always tend towards the same equilibrium, we can qualify them as ahistorical.

20 Diamond formulated in this paper what he called “assembly rules.” The question of what constitutes an assembly rule has created a lively debate among ecologists (see Weiher & Keddy 1999 and references therein). If one attends to Diamond’s original formulation, assembly rules resemble generalizations on observed patterns. For example, he included among the rules statements such as “A combination [of species] that is stable on a large or species-rich island may be unstable on a small or species-poor island” (Diamond 1975, 423). In what sense this counts as a rule remains somewhat unclear. It does not impose any constraint on how species can or cannot come together in the process of community formation. Ecologists have since slightly modified their usage of the expression, and the more recently formulated assembly rules relate more to our common sense of what a rule is. Take for instance Keddy and Weiher’s definition: “An assembly rule specifies the values and domain of factors that either structure or constrain the properties of ecological assemblage” (1999, 8). This requires not only the finding of some patterns, but also the specification of mechanisms causing the patterns.
An assembly process following this kind of rule would in fact be information destroying. Regardless of what species comes first, second, third, etc., the final outcome will remain unchanged. This kind of physical determination (or predestination as Diamond’s title suggests) is similar to the bowl with one well, where information from the past can never be retrieved once the equilibrium has been reached.

Diamond also sees in his results that chance and history play at least some role. The other extreme mentioned in the quote is opposed to environmental determinism. Diamond says that history matters because some past vagaries (random historical events) can lead communities towards alternative equilibrium states. He observed for instance that the species composition wanders among adjacent stability maxima. He explains this in terms of coincidence or right timing for species that happened to be blooming. Despite the fact that the Lonchura grass finch species display a regular checkerboard distribution, he discovered that they occupy a wide array of altitudes, rainfall conditions and grassland types. He concludes that the successful colonists are selected on a first-come-first-served basis. Although Diamond mainly focuses on the unpredictability notion of contingency, as most ecologists seem to do, we see in his conclusion that causal dependence comes into play via the priority effect, which is the central topic of the next section.

4.1.1 The Priority Effect, Keystone Species and Predictability

The second type of situation described by Diamond (see last quote) corresponds to the priority effect and it falls on the right side of Figure 4.1. Loosely defined, the priority effect is the idea that the first species to arrive in a habitat sets the balance of resources available in the habitat, which in turn affects the chance of other species becoming successful colonists.
Thus, the priority effect entails that the order and timing of species arrival in a community can make a difference to the outcome because different species will use the resources available and change the suitability of the habitat in different ways.

Saying that the assembly history can be a determining factor in the properties of communities relates more to causal dependence than to chance, so why emphasize unpredictability? It is commonly assumed that the process of colonization is unpredictable because the order of the assembly history is random. For reasons that are more difficult to explain, ecologists have more often tied historicity to chance than to causal dependence. A solution to this problem will be discussed in Chapter 6. In a nutshell, one can integrate both the unpredictability of the process and causal dependence if we adopt the right notion of probabilistic causal dependence.

Interestingly, Robert MacArthur and E.O. Wilson, the two authors of the equilibrium theory of island biogeography (see introduction for a summary), eventually came to the conclusion that history matters in community ecology because of the priority effect (Wilson 1992, MacArthur 1972). They both acknowledged the relevance of the order of arrival of species and their competitive abilities. I will consider only MacArthur’s account of this phenomenon, which requires that I introduce some formalism and explain his graphical representation of the principle of competitive exclusion. Consider Figure 4.2.

Figure 4.2 Isoclines of three species feeding on two resources of abundance $R_1$ and $R_2$. Full-line isoclines represent the level at which species 1 and 2 can maintain themselves, i.e. $dX_1/dt = 0$ and $dX_2/dt = 0$. Dotted lines are hypothetical isoclines for a third species entering the community. The point (a, b) corresponds to the level of resources for which species 1 and 2 can coexist.
The point (c, d) corresponds to the case where species 2 and 3 coexist, whereas the point (e, f) corresponds to the case where species 1 and 3 coexist. (Adapted from MacArthur 1972, p.48.)

[Figure 4.2 diagram labels: Scenario 1, Scenario 2; points (c, d) and (e, f).]

Suppose two species (1 and 2) of respective abundance $X_1$ and $X_2$ feed on two resources 1 and 2 of abundance $R_1$ and $R_2$. Each species needs a certain level of both resources to maintain its population, here represented with isoclines: $dX_n/dt = 0$.²¹ A species will increase in abundance if and only if the level of resources remains above its isocline. Conversely, abundance will decrease if the level of resources falls under its isocline. The isoclines of two species will intersect if they have different capacities to survive and grow on the mixture of resources available. In the present example, species 1 does better than species 2 on resource 1 (it can maintain itself at a lower level of $R_1$), whereas species 2 does better than species 1 on resource 2. This results in their respective isoclines intersecting at point (a, b). The latter point corresponds to the level of resources where both species can coexist. If only these two species are introduced in a new habitat, the system will eventually stabilize at that point, where coexistence is possible.

21 Note that these isoclines do not have to be linear.

Now imagine that the system has stabilized at point (a, b) where species 1 and 2 coexist. Suppose furthermore that a third species joins the community, with still a different capacity to maintain itself on resources 1 and 2. The dotted lines represent two possible isoclines for this third colonist. According to Scenario 1, the third species’ isocline lies above the now realized point (a, b). In that case, the colonization attempt will fail because of the insufficient level of resources. In other words, there is an insufficient amount of resources 1 and/or 2 for that third species to establish itself.
But things could be different if the third colonist were using resources 1 and 2 in a different, more efficient way, as represented in Scenario 2 of Figure 4.2. According to this second scenario, the third species’ isocline lies under the equilibrium point (a, b). Thus the level of resources present in the habitat is sufficient for species 3 to increase in abundance and establish itself. Unlike Scenario 1, Scenario 2 would allow the system to reach a new equilibrium. The outcome of this scenario will depend on how that third species ends up using the resources. If for instance the resource level happens to shift from (a, b) to (c, d), then species 2 and 3 will coexist and species 1 will be eliminated from the community, unless it can maintain itself using a third resource or becomes more efficient at using resources 1 and 2. If on the other hand the system is dragged towards the equilibrium point (e, f), then species 1 and 3 will coexist and outcompete species 2.

In order to see how the assembly history can affect the community composition, we need to consider what would happen if we were to change the order in which each species enters the community. For the sake of argument, suppose that the third colonist is quite efficient at using the resources available, such that its isocline corresponds to the one lying below the point (a, b). We saw already that if the order corresponds to species 1 followed by species 2 and 3, then the system will stabilize at either point (c, d) or (e, f). If species arrive in the order 1, 3 and 2, then the system will stabilize at point (e, f), where species 1 and 3 coexist. This would happen because the level of resources will be too low for species 2 to establish itself. Following the same logic, Table 4.1 summarizes the alternative results that could obtain if we change the order of arrival of species into the community.
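The reasoning about arrival order in this example can be written down as a small rule. The sketch below is only a bookkeeping device under the assumption made in the text (species 3 is the efficient colonist of Scenario 2); it is not MacArthur's resource model, it contains no population dynamics, and the function names are introduced here for illustration. The rule is: if species 3 is among the first two arrivals, that resident pair holds resources too low for the latecomer; if species 3 arrives last, it displaces either species 1 or species 2.

```python
from itertools import permutations


def possible_communities(order):
    """Return the set of two-species communities that can result from an
    assembly history, given as a tuple such as (1, 3, 2).

    Assumes the three-species example in the text, with species 3 as the
    efficient colonist whose isocline lies below the point (a, b).
    """
    first, second, _third = order
    residents = frozenset((first, second))
    if 3 in residents:
        # The resident pair draws resources down to (c, d) or (e, f),
        # which is too low for the late arrival to establish itself.
        return {residents}
    # Species 1 and 2 sit at (a, b); species 3 invades and ends up
    # coexisting with one of them while the other is excluded.
    return {frozenset((1, 3)), frozenset((2, 3))}


def outcomes_by_history():
    """Map each of the six assembly histories to its possible outcomes."""
    return {order: possible_communities(order)
            for order in permutations((1, 2, 3))}


if __name__ == "__main__":
    for order, outcomes in outcomes_by_history().items():
        labels = sorted("-".join(map(str, sorted(c))) for c in outcomes)
        print(order, " or ".join(labels))
```

Enumerating the six permutations in this way gives, for each assembly history, the resulting community or pair of possible communities.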
Table 4.1 Alternative Outcomes Associated With Alternative Assembly History

Assembly History    Resource Level(s)     Resulting Community
1, 2, 3             (c, d) or (e, f)      (2-3) or (1-3)
1, 3, 2             (e, f)                (1-3)
2, 1, 3             (c, d) or (e, f)      (2-3) or (1-3)
2, 3, 1             (c, d)                (2-3)
3, 1, 2             (e, f)                (1-3)
3, 2, 1             (c, d)                (2-3)

This very simple system clearly shows how the outcome of the assembly process depends on the assembly history. The form of the isoclines and how the species joining the community ends up using the resources would also be important determinants. The multiple stable equilibria, and historicity, would vanish if the third species’ isocline lay outside (a, b). In this case, species 1 and 2 would be better competitors in any circumstances, and the only stable equilibrium would be given by the intersection of the isoclines of species 1 and 2. This shows that assembly history can be an important factor determining the structure of communities, but not in all circumstances.

The importance of competitive ability in the priority effect is tightly related to the doctrine of “keystone species.” In short, as the name suggests, a keystone species plays a fundamental role in the stability of communities; removing such a species would cause a cascade of effects on species abundance and eventually lead to a significant change in community composition, structure and stability. A keystone species will often be the natural enemy controlling the abundance of other highly competitive species (Paine 1966). If a species of plant is especially good at using the nutrients available and depletes them to a level where almost no other species can survive, then the keystone species can be an herbivore that stabilizes and reduces the abundance of that very competitive plant species, thus allowing the nutrients to rise to a level capable of supporting other plant and herbivore species (Crawley 1997).
However, the idea of a unique keystone species has been criticized on the grounds that we cannot claim that one species is responsible for the whole community’s richness and stability. In fact, any species can be attributed a degree of “keystone-ness” or a “keystone rank” by counting how many species would disappear after it is removed. Thus understood, the doctrine of keystone species entails that the species with the highest keystone rank puts a stronger constraint on the stability of a community, but also on the assembly process, because it can determine which histories are feasible. As we shall see shortly, an interesting consequence is that the doctrine of keystone species introduces some element of predictability in the process.

The connection between the doctrine of keystone species, historicity and predictability is very well illustrated by Grover (1994), who shows that we can predict to a certain extent the outcome of various assembly histories given the role of species in their local interactions. His model entertains a relatively simple food web containing n species organized in parallel food chains using the same nutrients:

Figure 4.3 Food web with parallel food chains (P1-H1, P2-H2, ..., Pn-Hn) rooted in a common nutrient pool, R. Pn stands for plant species, Hn for herbivore species.

In order to simplify matters, I will describe the result of this model for three food chains. Starting from the assumption that plant 1 is a better competitor than plants 2 and 3, Grover shows that the only assembly history resulting in a six-species community occurs when plant 1 comes first, followed by the herbivore feeding on plant 1, followed by plant 2 and its herbivore, and finishing with plant 3 and its herbivore. If this order is not respected, then only a limited number of species can become successful colonists. In order to understand this result, we need to pay attention to changes in the level of resources available after the introduction of species.
Consider Figure 4.4.

Figure 4.4 Trajectory of equilibrium nutrient as community is assembled from three simple food chains. R*(n;m): equilibrium nutrient level when the system is composed of plant species n and herbivore species m (From Grover 1994, 274. Reprinted with permission from the University of Chicago Press).

The model assumes that each species assemblage reaches an equilibrium nutrient concentration before the arrival of another species. Plant species 1 stabilizes at nutrient concentration R*(1) (the * means that this level is an equilibrium point). Then herbivore 1 is introduced and shifts the equilibrium level to R*(1;1), which happens to be much larger than R*(1). This means that herbivore species 1 is very efficient at feeding on plant species 1, such that there are more resources left when the community is composed of these two species. Then, a second plant can colonize the habitat, which will result in a stable three-species community iff R*(1) < R*(2) < R*(1;1). In words, the second plant species introduced must have enough resources to establish itself, but it must not lower the resource level to a point where plant species 1 cannot be maintained. If plant species 2 happens to be a better competitor for nutrients than plant species 1, then the latter will be outcompeted. The right-side inequality means that the second plant species cannot be too poor a competitor either, otherwise it will fail to become a successful colonist.

Once the second plant species has reached a stable density (i.e., number of individuals per unit of space), then the second herbivore species (feeding exclusively on plant 2) can be introduced. But again some restrictions apply for a stable four-species community to occur. The equilibrium for four species requires the following inequality: R*(1) < R*(2) < R*(1,2;1,2) < R*(j;j).
R*(1,2;1,2) represents the equilibrium nutrient concentration for two food chains, and R*(j;j) represents the equilibrium nutrient value for a community formed of a single food chain (either plant 1 and herbivore 1, or plant 2 and herbivore 2). The right-side inequality means that the equilibrium R*(1,2;1,2) must be smaller than the equilibrium nutrient level of either food chain on its own. If R*(1;1) < R*(1,2;1,2) or R*(2;2) < R*(1,2;1,2), then the system would support one food chain or the other, but not both. This shows well the effect of alternately introducing plant and herbivore species, and it helps us to understand better the constraints imposed by species’ respective competitiveness and keystone-ness in the process. The species with the highest keystone rank here is herbivore 1 because it reduces the abundance of the most competitive plant, shifting the resource availability to a level that makes it possible for other food chains to be added. Without herbivore 1, the equilibrium nutrient level is too low, and no other plant can become a successful colonist. Thus, if the assembly process does not follow the pattern discovered by Grover, then only a portion of the regional pool of species will become successful colonists, depending on the order in which the species are introduced. Results for four species are presented in Table 4.2 below.

Table 4.2 Invasion sequence and final composition for a four-species community. Xn and Hn stand for plant and herbivore species respectively. (From Grover 1994. Reprinted with permission of the University of Chicago Press.)

Grover’s model contains many assumptions that make it less realistic. We will return to it shortly. Yet, it offers interesting insights into the assembly process and it proves several interesting points. First, it strikingly illustrates that assembly history can matter in the process of community formation.
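The invasion logic described above can be made concrete with a toy sketch. It is not Grover's model and does not reproduce Table 4.2: the numerical values, the single-chance invasion rule, and the simplification that the nutrient level is set by the lowest draw-down among resident plants are all assumptions introduced here for illustration. The names `R_ALONE`, `R_GRAZED`, `nutrient_level`, `prune` and `assemble` are hypothetical.

```python
from itertools import permutations

SUPPLY = 10.0                      # nutrient level of an empty habitat (assumed value)
R_ALONE = {"P1": 1.0, "P2": 2.0}   # level an ungrazed plant draws nutrients down to (assumed)
R_GRAZED = {"P1": 3.0, "P2": 4.0}  # level left when the plant is held in check by its herbivore (assumed)
HOST = {"H1": "P1", "H2": "P2"}    # each herbivore feeds on exactly one plant
GRAZER = {plant: herb for herb, plant in HOST.items()}


def nutrient_level(community):
    """Toy rule: the lowest draw-down among resident plants sets the nutrient level."""
    levels = [
        R_GRAZED[p] if GRAZER[p] in community else R_ALONE[p]
        for p in community
        if p in R_ALONE
    ]
    return min(levels, default=SUPPLY)


def prune(community):
    """Drop plants that cannot persist at the current level, then orphaned herbivores."""
    while True:
        level = nutrient_level(community)
        survivors = {
            s for s in community
            if (s in R_ALONE and level >= R_ALONE[s])
            or (s in HOST and HOST[s] in community)
        }
        if survivors == community:
            return community
        community = survivors


def assemble(order):
    """Each species gets a single chance to invade, in the given order."""
    community = set()
    for species in order:
        if species in R_ALONE:
            can_invade = nutrient_level(community) > R_ALONE[species]
        else:
            can_invade = HOST[species] in community
        if can_invade:
            community = prune(community | {species})
    return frozenset(community)


# Group the 24 possible assembly histories by the community they produce.
outcomes = {}
for order in permutations(["P1", "H1", "P2", "H2"]):
    outcomes.setdefault(assemble(order), []).append(order)

for community, orders in outcomes.items():
    print(sorted(community), "reached by", len(orders), "assembly histories")
```

With these made-up numbers, the full four-species community is reached only when plant 1 is followed by its herbivore, then plant 2, then its herbivore. Every other order leaves a poorer community, which is the qualitative point of the passage: same species pool, same environment, different histories, different outcomes.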
Grover’s analysis shows that a community assembly governed by local processes does not always lead to convergent community structure for a given pool of species and environment. His mathematical analysis of the assembly process proves for example that the same four-species community will recur in these conditions only if the assembly history is the same. Note also that this model, unlike MacArthur’s, entails that the species richness (and not only the species composition) can be affected by the assembly history.

This is related to Figure 4.1 and to the idea that the formation of ecological communities exhibits the quality of an information-preserving process. Unlike the bowl example used in Chapter 2, Grover’s model does not implement changes in initial conditions. Every run of the model assumes the same pool of species and identical environmental conditions. Nevertheless it shows that changes in the past conditions, i.e., assembly history, create alternative (probability of) outcomes. The process is not totally information-preserving, though. Table 4.2 shows that different histories can yield the same final composition. For example, 50% of the final compositions are composed of plant species 1. There is nevertheless the possibility of affecting the outcome by changing the past.

The information-preserving nature of the process also manifests itself in the possibility of reconstructing the past. We can to a certain extent “retrodict” what the assembly history could have been given the outcome. However, since different assembly histories can yield the same outcome, as Table 4.2 shows very well, the reconstruction will be rather probabilistic.

Finally, the model also allows for a certain form of predictability. Even if the order of species remains in itself random, we can infer from the model that a higher richness and stability will possibly be achieved if the assembly history follows the pattern illustrated in Figure 4.4.
This does not mean, however, that we can predict the community structure without knowing the order of colonization. We still need to take history into consideration in order to make such a prediction.

4.1.2 On Other Regional and Local Factors Affecting the Degree of Convergence/Divergence of the Community Assembly

The divergent/convergent property of the assembly process can be interpreted both in reference to Figure 4.1 and in terms of information-preserving/destroying processes. A convergent community assembly process corresponds to environmental determinism (left side of Figure 4.1): the same community will form over and over in the same environmental conditions for a given regional pool of species. As suggested previously, this corresponds to an information-destroying process because changes in the assembly history will not make a difference to the final composition of local communities. A divergent community assembly process, on the other hand, corresponds to the historical contingency hypothesis (right side of Figure 4.1). It occurs when a certain regional pool of species forms different local communities in similar environments. We can interpret this divergence in terms of information-preserving processes insofar as the divergence in community composition results from differences in the assembly history of local communities.

Certain assumptions made in the models presented above lack realism, and many variables that could make assembly history more or less important are left unaddressed. In an interesting survey study, Chase (2003) identifies four factors that can affect the divergent/convergent tendency of communities. One of the factors capable of affecting the divergence/convergence of the assembly process is the inter-patch connectance, i.e., the dispersal rate of species among the various sites. Chase shows that a greater inter-patch connectance tends to increase the likelihood of convergence. The explanation for this comes in two parts.
First, “early invading species do not have time to grow high enough densities to dominate and preclude invasion by subsequently invading species …” (Chase 2003, 492). In other words, the priority effect becomes negligible when the dispersal rate is high. Second, “increased rates of connection among local communities with multiple equilibria can cause all but one of those to be eliminated” (Ibid.). If the dispersal rate is high and populations have a finite extinction probability, then the most competitive species will spread rapidly and eventually overcome other species through some kind of random walk. In the long run, this will tend to reduce the species diversity in the regional pool and homogenize the different local communities.

Since inter-patch connectance can come in different degrees, we can also expect various degrees of convergence/divergence between local communities. We should therefore observe a greater variety of species composition and/or richness between communities when it is more difficult for species to travel from one patch to another. Summarizing: the assembly process becomes increasingly convergent/information-destroying when the dispersal rate increases, whereas a process with a low dispersal rate is more likely to be information-preserving.

Tying this back to Grover’s model, one can say that the fundamental role of the assembly history found in his model partly comes from the fact that each species had a unique opportunity to colonize the habitat. This amounts to assuming extremely low connectivity, although this assumption has very low applicability in nature. A more realistic scenario consists of species having more than one chance (Chase 2003; Grover 1994). In terms of time, this means that species in nature have more time to migrate into different patches than Grover’s model assumed.

The size of the regional pool of species is another important determinant of the diversity within and between local communities.
In general, multiple equilibria should be more likely when the regional pool contains a greater variety of species. This follows from the fact that more species can coexist, but many species can have similar traits. If a species with certain traits and a certain ecological role enters a habitat, it usually reduces the chance of other similar species entering the same habitat (Fox 1999). These other similar species will then have to establish themselves somewhere else. The assembly history therefore has a greater likelihood of becoming important when the regional pool contains a greater diversity of species and when some species play a similar ecological role (like feeding on similar resources).22 Conversely, a regional pool with lower species diversity entails that fewer species can coexist locally, which in turn increases the chance that local communities will be similar. The process of community formation in a given environment therefore becomes increasingly information-preserving when the regional diversity increases, and vice versa.

22 Note however that this is true only if we consider colonies for their specific composition. The rule according to which the occurrence of a certain type of species (or guild) in a habitat reduces the chance of the same type to reoccur encompasses a decreasing return function, which in the long run can result in convergence of the number of species in colonies living in similar environments.

Local factors have also been identified as important predictors of divergence. One of them is the primary production of sites. It has been suggested that high-productivity environments are more likely to yield a greater diversity of communities (Booth and Larson 1999). The rationale is that in low-productivity situations, very few species can potentially persist, whereas increasing productivity leaves more room for alternative species to colonize and persist in available habitats. So primary productivity has a positive effect on site-to-site diversity.
Other local factors, such as perturbations, tend to reduce the divergence between local communities. This can be partly understood by the negative effect that perturbations have on the regional pool, the increased time necessary for early species to reach significant density, and the fact that fewer species have evolved traits to resist the perturbation. In a nutshell, the literature suggests that a higher frequency of perturbations tends to minimize the diversity between communities. Again, these positive/negative effects come in degrees. So one can expect various levels of convergence/divergence, and consequently different degrees of preservation of information from the past.

4.2 The Role of Other Histories: Geological and Evolutionary

So far I have talked about the extent to which different assembly histories can lead to divergent outcomes and thus render the colonization process information-preserving. Although assembly history constitutes the heart of the debate about historical contingency in community ecology, history can matter for other reasons. As Ricklefs and Schluter (1993, 241) express it, “[c]ommunities have histories of development. Their present properties reflect the phylogenetic origins of the taxa they contain, as well as the unique events and geological circumstances that occurred during their formation.” What follows explains these two other forms of historical effect.

4.2.1 Why Evolutionary History Matters

One of the essential reasons why evolutionary history matters in the community assembly process comes from the fact that competitive ability and other qualities of the species in the regional pool are themselves a legacy of evolutionary history. Imagine for example that we are studying a community of birds feeding on different species of insects.
Being insectivorous is already the product of a long chain of adaptations requiring the evolution of the proper digestive system, an agility to detect and catch prey, and sometimes coordination with the prey’s life cycle. The importance of evolutionary history in determining the ecology-based traits of species will be addressed more fully in Chapter 6, so I will not develop this point further here. Let me nevertheless stress the fact that even if we were to find out that species in a given regional pool tend to have convergent composition when placed in a given environmental setting, it does not mean that other historical factors would become irrelevant. It does not mean for instance that we will also observe convergence given different regional pools or given the same regional pool in which species have evolved different properties. It is not out of the question to suppose that changing the evolutionary history of species forming the regional pool would also modify their competitive and dispersing abilities, or the primary productivity of habitats. As we have seen in the previous section, this will most likely result in the formation of divergent communities. Such a process would therefore qualify as historical, but for reasons that are often left unaddressed in the debate on the role of assembly history.

4.2.2 Why Geological History Matters

Different factors can explain the distribution of species around the globe. According to the “best fit” hypothesis, species tend to occupy the environments to which they are better adapted. Thus, comparable selective forces, acting on organisms living in similar habitats but different parts of the world, can cause totally unrelated species to evolve similar morphologies. This phenomenon, known as “convergent evolution,” is very well illustrated by succulent plants, i.e., plants with juicy tissues specialized for the storage of water.
Most succulent plants, such as the cacti of the American deserts, the euphorbias of the African deserts and the century plant of Asian and African deserts, normally grow in arid regions where the ability to retain water is necessary for their survival (Raven, Evert and Eichhorn 1992, 518). The distribution of these plants around the globe can be explained by the fact that they are better adapted to arid regions. Explanation in terms of the best-fit hypothesis can be considered ahistorical because it does not refer to the evolutionary history of species and/or the history of the land. It simply assumes that species will occur in environments for which they are better fit.

But looking at the actual distribution of other species shows that the best-fit hypothesis falls short in some cases. Consider for instance the distribution of large flightless birds (Figure 4.5). Studying the degrees of genetic relatedness and spatial distribution of large flightless birds (ostrich, emu, kiwi, etc.), Jared Diamond (1983) emphasized that some closely related species (ostrich and rheas for instance) occupy different regions of the world (Africa and South America), and that less closely related species (tinamous and rheas) live in the same region (South America). Starting from the assumption that more closely related species tend to be more similar, Begon et al. (1990) suggest that the best-fit hypothesis cannot explain this distribution. The distribution of flightless birds does not suggest that species occupy certain environments because they are better adapted to these environments, since similar species are found in different regions and environments.

Instead, Diamond’s data suggest a correlation between the different episodes of continental drift and the distribution of large flightless birds. Consider Figures 4.5 and 4.6. The first divergence divided the tinamous from the rest of the species. This seems to have no connection with continental drift.
However, the subsequent divergences correlate well with the way in which the ancient “super-continent” (Gondwanaland) broke into parts. The first rift, between Australia and the other southern continents, explains the divergence between the group tinamous-ostrich-rheas (located in South America and Africa) and the rest of the species (found in Australia or New Zealand). The second rift, between Africa and South America, explains the second divergence, between the ostrich (Africa) and the rheas (South America). The third rift, the opening of the Tasman Sea, explains the third divergence, between the group emu-cassowary (located in Australia) and the kiwis (all located in New Zealand). This example shows that the distribution of flightless birds makes sense in light of the history of the land.

Figure 4.5 Distribution of large flightless birds and phylogenetic relationship. (From Begon et al. 1990, 11. Reprinted with permission of John Wiley & Sons, Inc.)

Figure 4.6 Reconstruction of the successive stages in the break-up of the ancient supercontinent of Gondwanaland. The red lines indicate the edges of the main tectonic plates in the area. (From Begon et al. 1990, 10. Reprinted with permission of John Wiley & Sons, Inc.)

So we see here that geological history can be important in explaining why certain species happen to exist in certain regions but not in others. In other words, changing the history of the land will also change the species composition in different regional pools. By extension from what we have seen in the previous section, we can also infer that the geological history will affect the result of various assembly processes. We saw that the composition and structure of communities are sensitive to species’ properties like their capacity to travel from one patch to another, their ability to use the resources available, their place in the food web and their keystone rank.
Since these properties will change from one species to another and from one community to another, we can infer that the geological history, which affects the species composition of regional pools, will also impact the species composition of local communities and the site-to-site diversity. Moreover, one can also infer that the geological history can become a determining factor of regional diversity (not only species composition). As discussed in section 4.1.2, species diversity of the regional pool is an important predictor of the divergent/convergent nature of community assemblies. We saw for instance that regions with higher species diversity tend to yield a greater diversity of local communities, which by extension favors the occurrence of information-preserving assembly processes.

4.3 Concluding Remarks

In this chapter, I have argued that the formation of ecological communities can be sensitive to three types of history: assembly history, evolutionary history and geological history. For reasons of clarity, I considered each type of historical influence in isolation, but it is important to keep in mind that they can have combined effects. This means that explanations in community ecology can sometimes require the consideration (and reconstruction) of several historical processes. An excellent example of such an approach is the paper by Jacques Blondel and Jean-Denis Vigne, “Space, Time, and Man as Determinants of Diversity of Birds and Mammals in the Mediterranean Region” (Blondel and Vigne 1993). Consider for instance their take on the origin and development of mammalian fauna.
The present diversity of this biota can be ascribed to three causes: (1) the multiple biogeographical origin of the mammals, which entered the Mediterranean from Europe, Asia and Africa [i.e., the assembly histories], (2) Pleistocene climatic changes, which produced periodic faunal turnovers, and hence large [site-to-site] diversities, over time because of large intercontinental faunal exchanges; and (3) severe and ancient (Neolithic) human pressures (modification of habitat, animal husbandry, hunting, and introductions of alien species) that modified natural distributional patterns and produced a decrease in the previous, much richer, glacial fauna. (Blondel and Vigne 1993, 135, italics added)

This is only a peek into a more comprehensive historical approach. Yet we can see from this passage that it adds several historical processes, which we do not typically find in classical explanations based on competitive abilities. It is also interesting to see an integration of human impact into this study, an element too often ignored in ecology. Humans have significantly shaped the diversity of the Mediterranean region by extensively hunting large hoofed mammals and by introducing species like the domestic sheep, the black rat and the domestic mouse. So Blondel and Vigne show that there exists a complex web of interactions between geographical, environmental and biological factors, and that explaining the diversity and composition of ecological communities in the Mediterranean region requires that we investigate systems at a large spatiotemporal scale. Although they do not address the notion of historicity explicitly, we nevertheless find in this contribution the belief that changing these historical elements would have yielded different species diversity.
Although I have been mostly emphasizing examples where the outcome of a process causally depends on its history, there exists a noticeable tendency to bind historical contingency and chance in the community ecology literature. I would like to close this chapter with a comment on what could have been the source of the dominance of the unpredictability notion of contingency in this literature. Kingsland (1995, pp. 225-232) attributes the origin of the promotion of historical contingency in ecology to evolutionary biologists like Gould. This is true in some cases. Take for example the following passage from Pickett et al. (2007):

The important concept of “contingency” emerged as an insight from historical sciences, such as paleontology and evolution (Gould 1989). … Contingency simply means that the course of the dynamics or the particular trajectory followed by entities depends on external constraints that operated singularly or episodically at some point in the past, or on constraints embodied in the evolved or accumulated current status of the entities. (Pickett et al. 2007, 18-119)

This passage clearly emphasizes the causal dependence reading of contingency. However, in another reference to contingency, the same authors suggest an interesting mixture of unpredictability and causal dependence:

The lack of rigid determinism is encapsulated in the term “contingency,” which means that the current state of an ecological or evolutionary system is dependent on specific conditions that may have occurred from time to time in the past or on the order of events that affected the trajectory (… Gould 1989…) (in Pickett et al. 2007, 181).

This striking example of Gould’s legacy on the notion of contingency clearly shows how causal dependence and unpredictability (or here indeterminism) are often intertwined. But I would like to point in the direction of another potential source for the strong association between history and unpredictability.
Recall that in the Introduction, we discussed MacArthur’s defense of a more scientific ecology. We saw that one of the fundamental aspects of his philosophy of science is the formulation of general, testable hypotheses. He wanted to go beyond a “natural-history phase” in which ecologists simply described what nature looked like by enumerating what and how many species were found in different regions or by simply drawing correlations between indirectly dependent variables. This was deemed insufficient for a true scientific ecology that encourages generalizations and mechanistic explanations. Historical contingency was in the way of this ideal in the sense that the occurrence (and presence in the data) of historical accidents made the discovery of general patterns difficult. Historical contingencies were a nuisance, hiding the true and more fundamental causes of natural phenomena. The equilibrium theory of island biogeography (discussed in the Introduction) avoided this problem:

The equilibrium model has the virtues of making testable predictions which were not immediately obvious and of making the individual vagaries of island history seem somewhat less important in understanding the diversity of the island’s species. (MacArthur and Wilson 1967, 64)

Since MacArthur was a central figure in the field of ecology (Pianka and Horn 2005), it would be interesting to see whether his way of defining the problems and goals of science influenced not only the approach taken by his peers, but also the way in which people have responded to his theory. In other words, I am suggesting that it is possible that the historical turn in ecology greatly emphasized the role of chance and the unpredictability of processes because this is how it has been cashed out by a central figure of ecology.
Looking at how MacArthur’s ideas about science and historical contingency have been received in his field would require an enormous amount of work, and it would take us beyond the scope of this dissertation. This is why I put this reasonable conjecture in a concluding remark, as something potentially interesting to look at.

Note however that MacArthur does not exclude all forms of historical approach. In a chapter entitled “The Role of History” (1972), he suggests that an historical approach can be made scientific when generalizations are made and tested against new information. The problem, he says, is that few persons are good at both unraveling histories and generalizing about the machineries governing natural phenomena. The ecologist (like the physicist) tends to be machinery oriented, whereas the paleontologists and biogeographers tend to be history oriented. At least on this particular occasion, MacArthur seems to contrast two approaches, favoring one of them without invalidating the other. People see different things because they have different ideals and explanatory goals. But this is not exactly discrediting history, especially when he claims that “many generalizations about machinery have been proposed by biogeographers with an historical outlook …” (p.239).

Chapter 5 : Path Dependence

Chapters 3 and 4 have discussed examples from evolutionary biology and community ecology where “historical contingency” is said to play an important role. We also saw that a few of these examples combine two interpretations of “contingency”: causal dependence and unpredictability. This was especially explicit in the Long-Term Evolutionary Experiment (LTEE), where initially identical populations reached alternative equilibria because different mutational histories caused the duplicated populations to take divergent trajectories. A similar phenomenon was discussed in community ecology with the priority effect.
Alternative communities can form from the same regional pool of species placed in similar environmental conditions because the assembly histories randomly change from site to site. Since events in mutational and assembly histories are chancy, we obtain a phenomenon of dependence on past vagaries. I believe that these correspond to stochastic information-preserving processes.

I have argued in Chapter 2 that historicity is a property of information-preserving processes. In a nutshell, a process is information-preserving when we have distinct probabilities for the past as a function of the present. This means that there are alternative outcomes at a given instant (present) and the (probability of) outcome(s) changes as a function of the past. We can find examples of information-preserving processes both in deterministic and stochastic settings. I suggested that deterministic information-preserving processes are essentially explained in terms of sensitivity to initial conditions. The situation is different for historicity in stochastic processes, where the dependence on the past is not limited to changes in initial conditions, but extends to the whole history of the process under consideration. A stochastic information-preserving process will admit of multiple outcomes, and the probability of outcomes should be sensitive to changes in the entire series of events preceding the outcome. This includes but is not limited to sensitivity to initial conditions.

This chapter and the next one discuss the notion of path dependence, a good candidate for explaining the form of historicity occurring in stochastic information-preserving processes. The notion of path dependence was initially introduced in the social science literature as a way of elaborating the extent to which history matters (Bassanini and Dosi 1999; Castaldi and Dosi 2006; David 2001, 2005; Hodgson 1993; Mahoney 2000, 2006; Pierson 2004).
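The idea that a process preserves information when the probability of outcomes changes as a function of the past can be given a small numerical illustration. The sketch below is an assumption-laden toy, not a model from the literature discussed here: it uses two-state Markov chains with made-up transition probabilities, conditions on a single past state for brevity (rather than on a whole history), and measures sensitivity to the past as the total variation distance between the two outcome distributions. The names `FORGETFUL`, `STICKY`, `LOCKED` and `history_sensitivity` are hypothetical.

```python
def step(dist, transition):
    """Advance a probability distribution over states by one time step."""
    n = len(dist)
    return [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]


def outcome_distribution(start, transition, steps):
    """Distribution over present states, given that the process started in `start`."""
    dist = [1.0 if i == start else 0.0 for i in range(len(transition))]
    for _ in range(steps):
        dist = step(dist, transition)
    return dist


def history_sensitivity(transition, steps):
    """Total variation distance between outcome distributions for two different pasts.

    0.0 means the past makes no difference (information-destroying);
    1.0 means the present fully reveals the past (information-preserving).
    """
    p = outcome_distribution(0, transition, steps)
    q = outcome_distribution(1, transition, steps)
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Illustrative transition matrices (rows: current state, columns: next state).
FORGETFUL = [[0.5, 0.5], [0.5, 0.5]]  # next state ignores the current one
STICKY = [[0.9, 0.1], [0.1, 0.9]]     # the process tends to stay where it is
LOCKED = [[1.0, 0.0], [0.0, 1.0]]     # the process never leaves its state

for name, chain in [("forgetful", FORGETFUL), ("sticky", STICKY), ("locked", LOCKED)]:
    print(name, history_sensitivity(chain, steps=5))
```

The three chains are meant to show that information preservation comes in degrees: the forgetful chain erases the past after one step, the locked chain never does, and the sticky chain retains a trace of the past that fades as more steps are taken.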
Like historical contingency, path dependence has received several interpretations. Recently, Eörs Szathmáry has introduced the notion of path dependence in biology (Szathmáry 2006). This chapter introduces and analyzes his notion, which invokes essentially multiple equilibria, irreversibility and the possibility for some stochastic process to stay away from a unique stable attractor. However, we will see that Szathmáry’s account faces some difficulties when considered under the perspective of a general notion of stochastic information-preserving processes. An alternative account will be elaborated in Chapter 6.

5.1 Szathmáry on Path Dependence

Szathmáry (2006) frames his discussion of path dependence as a response to DeDuve’s (1995) suggestion that the universe was pregnant with life as we know it on earth. Simon Conway Morris’s thesis, according to which human(ish) life is inevitable given the conditions on earth, is very similar (see Chapter 3). Although Conway Morris mostly grounds his argument in biology, whereas DeDuve bases his on physics, both take a very similar stance and allege that evolution is remarkably convergent, almost teleological. Regardless of how one articulates it, Szathmáry considers this kind of position untenable. For him, science, and especially evolutionary biology, tells us that life and its evolution could have been significantly different because of its tendency to be path dependent. Following others (Bassanini and Dosi 1999), Szathmáry distinguishes between two forms of path dependence:

The strong form is a result from a combination of irreversibility and the existence of multiple stable attractors for the dynamical system under consideration. In biology, multiple evolutionary stable strategies (see Maynard-Smith, 1982) serve as a good example. Depending on the initial conditions, one, or another, stable strategy may dominate the population, and be resistant against invasion by a mutant strategy. The weak form of path dependence does not invoke multiple stable attractors: if the dimensionality of the system is sufficiently high, and if the dynamics of the system is as described by a Markov process, then path dependence is ensured if the trajectories of two possible realizations do not cross with certainty. Population genetics of small populations is able to generate many such cases. (Szathmáry 2006, 140)

This account combines technical notions that merit careful introduction. To begin, this formulation admits of degrees. This will prove useful for capturing different magnitudes of historicity, although I will later suggest a slightly different way of framing the issue.

Let us start with the conditions characterizing strong path dependence. The reason why multiple stable attractors matter to path dependence should not be too difficult to understand by now. In general, an attractor is defined as a region in a phase space where the system tends to get trapped. The wells in the bowl example used in Chapter 2 are very powerful representations of attractors. There exist, however, various kinds of attractors. Some are “equilibrium points,” others are “dynamical,” and yet others are called “strange attractors,” in which case the trajectory in the phase space displays some patterns without exactly stabilizing into a given point or trajectory (Lorenz 1930). Szathmáry’s definition does not favor one type of attractor over another. We can better appreciate why the existence of multiple attractors contributes to path dependence by considering the case of multiple evolutionary stable strategies, which is analogous to the adaptive landscape with multiple peaks. We have elaborated and exemplified this in Chapters 2 and 3.
Translated into the language of adaptive landscapes, we can say that for a particular population in a particular environment, there might be two evolutionary stable strategies or fitness peaks: two genotypes or morphologies of relatively high adaptive value, separated in genotypic or morphological space by forms of lower value, and not traversable by mere natural selection (see Fig 2.2). The evolutionary outcome would then depend on the genotypic or morphological starting point, together with the mutational history. As explained in Chapter 2 and exemplified in Chapter 3 with the LTEE, some information from the past is preserved in the outcome because changing some aspects of the trajectory of the population will also potentially affect where the population will end up. Thus, multiple attractors make it possible for alternative pasts to yield different presents. In contrast, a system with a unique attractor will (possibly)23 display a convergent, information-destroying dynamic.

23 Please note that we will soon see that it is possible for a system to have a single attractor and still be considered path dependent.

Let us now consider “irreversibility,” the other putatively important characteristic of strong path dependence. Szathmáry states:

At the population level path dependence entails that the probability of going back to some previous state decreases with time, or that switching to a state that could have easily been reached, had the population taken a different turn previously, is becoming increasingly improbable as time goes on (Szathmáry 2006, 141, italics mine).

Although he does not specify it, I believe we can interpret “state” as meaning “equilibrium state.” The notion of irreversibility has received many interpretations (Denbigh 1989; Gould 1970), and resolving this issue would require more than a couple of paragraphs. Further reflections on the topic will be offered in Chapter 7.
But Szathmáry’s understanding is common and sensible enough: a process is irreversible if returning to a previous state becomes increasingly improbable with time due to the influence of conditions and preconditions that are individually or jointly improbable. This view we will call “contingent irreversibility.” Consider for instance the high degree of irreversibility of what Maynard Smith and Szathmáry (1995) called “major transitions in evolution,” such as the appearance of eukaryotes. “Eukaryotic” cells, which have a true nucleus, evolved from non-nucleated prokaryotic cells. The hypothetical explanation of the origin of eukaryotes invokes many specific conditions and preconditions including the loss of the outer cell wall, the evolution of a new organization and mechanisms of transmission of the genetic material, the symbiotic origin of the nucleus and other organelles like mitochondria and chloroplasts,24 and the evolution of a cytoskeleton.

24 Note that the authors remain uncertain about the symbiotic origin of the nucleus. They conclude that “the evidence for a symbiotic origin of the nucleus is at present weak, and, in contrast to an autogeneous origin, lacks a sensible scenario” (Maynard Smith and Szathmáry 1995).

Just consider the origin of mitochondria. Once bacteria had lost their outer wall, phagocytosis became possible (whereby the cell can engulf particles in its environment). This kind of mechanism allowed for the assimilation of some cells by others. Mitochondria, the energy producing organelles in eukaryotes, were previously prokaryotes that were assimilated by eukaryotes. The mitochondria retained some of their genetic material and the capacity to synthesize proteins and RNAs. However, they lost their reproductive autonomy because many of their genes were transferred to the nucleus.
Each step in the evolution of eukaryotes depended on specific environmental conditions and the occurrence of appropriate mutations for natural selection to accumulate. The product of the probabilities of all those events is exceedingly small, and the probability of the conditions that would lead to eukaryotes losing their mitochondria and to mitochondria (re)evolving the capacity to live autonomously is also staggeringly small.

I concur with Szathmáry that contingent irreversibility is fundamental to historicity and path dependence, but we need to strengthen the connection between these notions. We find an interesting expression of this connection in the work of the Belgian paleontologist Louis Dollo, who provided theoretical grounds and empirical evidence of the irreversibility of macroevolutionary changes. Typically, Dollo would emphasize irreversibility by arguing that species that have to adapt more than once to similar environmental conditions never totally re-evolve the previous adaptations. Suppose that a lineage existing in environment E evolves adaptation A. Then, the environment changes to E′ and, after a certain time, the lineage evolves a different character A′. What would happen if the environment were to return to state E? According to Dollo:

An organism never returns exactly to a former state, even if it finds itself placed in conditions of existence identical to those in which it previously lived. But by virtue of the indestructibility of the past … it always keeps some trace of the intermediate stages through which it has passed. (Dollo 1905, 443, cited in Gould 1970, 196)

For Dollo, and many others after him, the irreversible nature of the evolution of organisms is due to the irrevocability of the past.
Consider for instance the evolution of the marine reptilian ichthyosaurus, which presumably evolved from land reptiles, which were previously derived from fish (Figure 5.1).

Figure 5.1 Irreversibility of evolution results in alternative adaptations to aquatic life. *: a reversible evolutionary process would have resulted in the (re)evolution of fish, instead of ichthyosaurus.
[Diagram labels: Fish; Reptiles; Ichthyosaurus; *Fish; adaptation to aquatic life; adaptation to terrestrial life; adaptation to aquatic life after terrestrial life.]

For sure, the ichthyosaurus did look like a fish in many respects, but it still had lungs and four rudimentary limbs, all signs of the virtual indestructibility of its past reptilian stage.25

25 Note however that the later environmental conditions are not exactly as they were before the terrestrial stage. So an alternative explanation to the one provided here would be that the conditions for irreversibility are never totally met. In other words, we do not truly test Dollo’s law. Maybe the irreversibility of evolution is true; even if the environment were identical, one would not expect the new species to be just like the old one. But realistic examples show something weaker, since the early environmental conditions never recur exactly.

This example suggests that the evolutionary history taken by a species leaves some traces and acts as a constraint on future possibilities. The set of phenotypes available to species is a dynamic space, changing from time to time. What may have been likely to evolve in the past (e.g., fish) becomes extremely unlikely to come again, if the lineage has acquired new adaptations (e.g., reptiles) and therefore modified the raw material available for natural selection to act upon.

Under this account of irreversibility, a path-dependent process will be irreversible, whereas a path-independent process will be reversible. The latter means that the same life form can evolve by a different historical path. In other words, there would be convergent trajectories instead of multiple evolutionary stable strategies if evolution were reversible. In a reversible world, fish could have evolved de novo (see Figure 5.1*), which also corresponds to path independence: if reptiles had lost all traces of their reptilian stage and gone back to their ancestral fish form, then the path taken to the latter would not have made any difference. By contrast, an irreversible situation entails that the path taken makes a difference. The ancestral form becomes unlikely to (re)evolve because the lineage has taken a different (reptilian) path. More will be said about irreversibility and its relation to path dependence in the last chapter. We will see that both properties can in fact have a common cause: generative entrenchment.

We have established enough now to explain the fundamental elements of strong path dependence, namely multiple stable attractors and contingent irreversibility. Let us consider now Szathmáry’s notion of “weak path dependence.” Contrary to strong path dependence, the weak form does not require multiple stable attractors. As suggested in the definition, a small population with one domain of attraction could exhibit weak path dependence. This raises some difficulty for what has been said so far. We saw that the existence of multiple equilibria creates a situation where history can make a difference to the outcome, but history vanishes when a process repeatedly converges towards the same stable global equilibrium. In what sense then can we possibly say that we have path dependence and a unique global attractor? Should we not see here the perfect conditions for converging, information-destroying processes?
In order to understand how weak path dependence can happen, we need to focus on the following part of Szathmáry’s characterization: “if the trajectories of two possible realizations do not cross with certainty.” In other words, a process will be path dependent as long as we are not in a situation of complete convergence, despite the existence of a unique domain of attraction. How can this be possible? There might be more than one explanation here. Take Szathmáry’s example of drift in small populations. When the size of the population is small, the change in genotype frequency in subsequent generations has a stochastic component coming from variations in the number of offspring between individuals and from inherent randomness in the way in which alleles are sampled (Gavrilets 2004). This stochastic effect is generally referred to as random genetic drift. Even if the fitness landscape had but one domain of attraction, i.e., one region with the highest fitness value, genetic drift can possibly preclude the population from actually reaching it. This entails that the unique optimal (and stable) combination of genes or morphological characters does not always occur in a small population, and Szathmáry suggests that we can interpret such divergence as a weak form of path dependence because the trajectory of some realizations (or replays) will not converge towards the same final state.

We have seen another example of weak path dependence in Chapter 3. Consider again Figure 3.5 showing the results of Lewontin’s experiment on the importance of the order of environmental changes in the trajectories of populations. I believe that this too instantiates a weak form of path dependence. Although the environmental conditions are presented in reverse order in both replays, the average selection regime remains identical, thus creating a unique domain of attraction.
Yet, we clearly observe that both populations only partially converge towards the same final gene frequency.26 At least under the conditions imagined by Lewontin, this model yields results that fit Szathmáry’s definition of weak path dependence.

26 Lewontin wanted to emphasize that the two populations, although governed by the same average selection regime, took very different pathways. He therefore did not emphasize or explain why both populations terminate their journey in alternative states.

What about cases of transient divergence? Should they fall under the same category? Recall our discussion of the notion of historical contingency in the LTEE, where we saw that the divergent behavior of initially identical populations of bacteria evolving in the same type of environment is most likely explained by fluctuations in the order of beneficial mutations. Although the final explanation favored a scenario that fits a strong form of path dependence (the populations taking divergent evolutionary paths have remained in separate stable adaptive peaks), another hypothesis was also entertained (Lenski et al. 1991, see also Figure 3.3b of this dissertation), at least for a moment, which could resemble weak path dependence. According to that ultimately rejected scenario, all populations would in the limit converge towards a unique stable adaptive peak, but stochastic variations in the time of origin of particular classes of beneficial mutations would give rise to transient divergences. In other words, the adaptive landscape would be Fujiyama-like (see Figure 2.1b), but we would be recording some temporary divergences in the evolutionary trajectories among populations because they undergo different mutations at different times. So my question is: does this type of evolutionary dynamic count as weak path dependence? Szathmáry (personal communication) welcomes the idea that transient divergence could count as a weaker form of path dependence.
The problem is that his definition clearly stipulates that the trajectories of two alternative realizations should not cross with certainty. This excludes transient divergences. Unlike the situation where drift plays a role, these trajectories do cross/converge at the end of the day. One solution could accommodate the view that transient divergence is a weak form of path dependence by saying that weak path dependence includes processes governed by a unique domain of attraction if the trajectories of two possible realizations are not identical. Under this new description, identical evolutionary trajectories (what evolutionary biologists sometimes refer to as “parallel evolution”) would characterize path independence and the degrees of path dependence would go thus: transient divergence would characterize weak path dependence, whereas long lasting, stable divergence would characterize strong path dependence.

Adopting this slightly revised definition permits a more natural integration of the theoretical reflections generated around the LTEE (Johnson, Lenski, and Hoppensteadt 1995; Wahl and Krakauer 2000). We saw in Chapter 3 that when the product Nu (where N represents the size of the population and u represents the mutation rate) is large enough, then optimal mutations co-occur and become selected in parallel in each replicated population, thus repeatedly resulting in the same evolutionary trajectory and outcome. Conversely, if Nu is small, then different chance variations will happen at different times (i.e., the mutational histories of populations will differ), which in turn will most likely lead to divergent (transient or permanent) evolutionary outcomes. An important drawback however is that the bowl with a single well now counts as weakly path dependent. The account I develop in the next chapter will avoid this problem.
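The role of the product Nu described above can be illustrated with a toy simulation. What follows is only a minimal sketch under stated assumptions, not the models of Johnson, Lenski, and Hoppensteadt (1995) or Wahl and Krakauer (2000): it tracks a single class of beneficial mutation, assumes that individuals mutate independently, and uses hypothetical parameter values and function names (`p_new_mutation`, `first_arrival`, `mean_arrival_gap`). It compares the generation at which the mutation first appears in two replicate populations.

```python
import random


def p_new_mutation(pop_size, mut_rate):
    """Chance that at least one of pop_size individuals acquires the
    beneficial mutation in a given generation (independence assumed)."""
    return 1.0 - (1.0 - mut_rate) ** pop_size


def first_arrival(pop_size, mut_rate, rng):
    """Generation in which the beneficial mutation first appears."""
    p = p_new_mutation(pop_size, mut_rate)
    generation = 1
    while rng.random() >= p:
        generation += 1
    return generation


def mean_arrival_gap(pop_size, mut_rate, pairs=200, seed=1):
    """Average difference in first-arrival time between two replicate
    populations that start identical and share the same environment."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(pairs):
        a = first_arrival(pop_size, mut_rate, rng)
        b = first_arrival(pop_size, mut_rate, rng)
        gaps.append(abs(a - b))
    return sum(gaps) / pairs


# Hypothetical parameter values chosen only for illustration.
large_nu_gap = mean_arrival_gap(pop_size=100_000, mut_rate=1e-3)  # Nu = 100
small_nu_gap = mean_arrival_gap(pop_size=1_000, mut_rate=1e-5)    # Nu = 0.01
print(large_nu_gap, small_nu_gap)
```

With the large value of Nu the mutation arrives in the first generation of every replicate, so the mutational histories coincide; with the small value the replicates typically acquire it at different times, which is the kind of difference in mutational history that the text associates with divergent trajectories.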
If these models provide any insight about some of the deep structures of evolutionary processes, we can also conclude that the course taken by life has become increasingly path dependent. As noted by Szathmáry (2006, 143), species with large-size individuals usually present smaller N, thus suggesting a greater susceptibility to path dependence.

Szathmáry further develops his argument in favor of path dependence of complex adaptive systems by applying the notion of (un)limited heredity to different evolutionary processes.

The nature of inheritance is critical for cumulative selection and, hence, path dependence. The capacity of a hereditary system crucially influences the scope of evolution. … We have limited heredity when: number of individuals > (or ∝) number of potentially heritable states; and we have unlimited heredity when: number of individuals << number of potentially heritable states. … It is unlimited heredity that allows ongoing evolution and cumulative build-up of complex adaptations. (Szathmáry 2006, 143)

In other words, a system of heredity with a smaller scope of evolution has less potential for cumulative selection and less chance of being path dependent because there are fewer paths to explore and strategies to settle into. Conversely, a greater scope of evolution increases the chance of path dependence because of the existence of a greater number of paths to explore and strategies to select. The concepts of limited and unlimited inheritance provide a qualitative description of the scope of evolution, taking into account not only the number of heritable states for a population, but also the number of individuals. We say that inheritance is unlimited when the population can almost infinitely change because the range of possible heritable states outnumbers the individuals capable of carrying them (granting that they can actually explore different strategies and do not get trapped forever in a local equilibrium).
As Szathmáry says, unlimited inheritance makes it possible for a system to evolve continually and build up complex adaptations. Thus, unlimited inheritance creates conditions in which different populations can undergo different mutational histories and accumulate different sets of adaptations. Conversely, it makes parallel and convergent evolution very unlikely, for the range of possibilities is extremely high. Limited inheritance on the other hand occurs when the number of heritable states is proportional to the number of individuals exploring the adaptive space. Populations with limited inheritance will therefore quickly and thoroughly explore the totality of potentially heritable states, and thus find the optimal configuration(s) more easily. This also limits the build-up of adaptations and makes path independence more likely because different populations will tend to rapidly evolve in parallel or converge towards the same set of adaptations.

It is interesting to note that the limitedness of inheritance has changed throughout evolution, and with it the degree of path dependence. The first chemical systems of inheritance that appeared on earth were in fact very limited. These systems relied essentially on the existence of self-replicators and autocatalysis. Szathmáry alleges that one of the simplest self-replicators of biological relevance is glycolaldehyde, the autocatalytic seed of the formose reaction (see Figure 5.2).

Figure 5.2 The formose reaction. (a) The spontaneous generation of the autocatalytic seed; (b) the autocatalytic core of the network. Each circle represents a group with one carbon. (From Szathmáry 2006, 144. Reprinted with permission from Palgrave Macmillan)

The “autocatalytic seed,” glycolaldehyde, is the version of the molecule that arises spontaneously (5.2a). This process occurs very slowly. But once the milieu has enough of these seeds, their formation is accelerated via a chemical network (5.2b).
The existence of a certain type of molecule favors the formation of more molecules of the same type. Some may oppose the idea that we truly have inheritance here. There is no transfer of material, but merely replication of form. Still, the point is that these reactions tend to be fairly stable and very few alternative compositions seem possible. This system has very limited inheritance and Szathmáry (p.145) suggests that changes in the chemical identity of these molecules will be only transient fluctuations or they will drain the system. Consequently, such systems are more likely to be path independent.

The further we move away from chemical inheritance, the less limited inheritance becomes. Genetic inheritance for example is much less limited. Szathmáry uses a very telling image to explain the extent to which genetic evolution is unlimited.

Imagine … that initially the whole [genetic] space is dark. Then, sequence the genomes of all creatures, great and small, that have ever lived on this planet, and light a bulb in the corresponding part of sequence space. After lighting a bulb for every individual, the space will remain virtually dark. (Szathmáry 2006, 144)

Other systems (epigenetic inheritance, language, cultures) are even less limited, and the space should remain even darker for them. And if we grant that unlimited inheritance creates an important basis for path dependence, we should also observe more cases of path dependence in these more complex systems of inheritance.

Although I am about to criticize and revise his account of path dependence, I still agree with Szathmáry on several points, and I concur with him that life is on many occasions path dependent in his sense and deeply historical. Based on the reflections and observations regarding the type of inheritances displayed by various biological systems, I concur with Szathmáry that DeDuve must be wrong when suggesting that life as we know it on earth was inevitable.
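The contrast between limited and unlimited heredity discussed in this section can be made concrete with a rough counting sketch. Everything numerical here is an assumption introduced for illustration: the function names, the factor used to stand in for “<<”, the “intermediate” label for cases the quoted definition leaves open, and the population and sequence sizes. Only the comparison between the number of individuals and the number of potentially heritable states comes from Szathmáry’s characterization.

```python
def sequence_space_size(length, alphabet_size=4):
    """Number of distinct sequences of a given length, e.g. over the
    four nucleotides."""
    return alphabet_size ** length


def heredity_regime(n_individuals, n_states, much_less_factor=10**6):
    """Classify a hereditary system by comparing individuals to states.

    much_less_factor is a hypothetical threshold standing in for "<<".
    """
    if n_individuals * much_less_factor <= n_states:
        return "unlimited"
    if n_individuals >= n_states:
        return "limited"
    return "intermediate"  # not covered by the quoted definition


# A chemical replicator with a handful of stable compositions
# (hypothetical numbers).
print(heredity_regime(n_individuals=10**20, n_states=3))

# A short genetic sequence of 100 nucleotides carried by a very large
# population (hypothetical numbers).
print(heredity_regime(n_individuals=10**30,
                      n_states=sequence_space_size(100)))
```

Even a sequence of only 100 nucleotides yields a space of roughly 10^60 possible states, which is why lighting one bulb per individual leaves the space virtually dark in Szathmáry’s image.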
5.2 Some Problematic Aspects

Although Szathmáry’s account is consistent, I think that it has a few drawbacks. First, the distinction between path independence and path dependence remains to a certain extent unsatisfactory. In fact, Szathmáry does not provide a clear definition of path independence. One can extract from his paper, as I have done above, the idea that there exists a qualitative break between identical and divergent processes, and suggest that the latter corresponds to path dependence whereas the former corresponds to path independence, but nothing is clearly explicated. The absence of a qualitative break between path-independent and path-dependent processes is not detrimental, but it would have been helpful to be clear on what path dependence is opposed to.

Second, the gradation between strong and weak path dependence may not reflect the way in which biologists explain the existence of different degrees of historicity among the class of processes admitting of multiple attractors. I embrace Szathmáry’s idea that there exists a difference in degree of path dependence between systems admitting of one attractor and those admitting of many. However, the gradation established is more like unconnected qualitative levels: an irreversible system with more than one region of attraction qualifies as strong path dependence and some systems with a unique region of attraction can (in certain conditions) display a weak form of path dependence. But there exists a break between the “one-attractor” and “multiple-attractor” classes of dynamics, and nothing clearly indicates a form of gradation among the multiple-attractor systems. Should we conclude that a system with two attractors is less path dependent than the ones admitting three, four, etc., of them?
I believe that Szathmáry’s account would benefit from integrating the idea that a system with multiple attractors can be more or less sensitive to fluctuations in the historical series leading to alternative outcomes or probability distributions. Doing so would also be more representative of the degree of historicity found in the literature, which suggests the existence of a continuum between strong and weak historicity. Recall for instance Chase’s (2003) analysis of historical contingency in community ecology (see Chapter 4). He suggested that the assembly history is more or less important depending on the extent to which other factors allow the divergence in history to yield alternative equilibria. In brief, the more divergence we find between communities forming from a given regional pool in similar environmental conditions, the more history has played a role. Similarly, reflections on the results of the LTEE suggest that mutational history is more or less important depending on the degree of divergence of identical populations evolving in a given environment. Szathmáry’s account, although internally consistent, remains unrepresentative of this type of gradation.

A third difficulty arising with Szathmáry’s account comes from the presumption that all path-dependent processes should have at least one attractor. This seems to be common in the literature on path dependence (David 2001), but we will see in the next chapter that some systems do not have any attractor and still have path-dependent dynamics (Page 2006).

Finally, I believe that Szathmáry’s account faces another substantial problem because the category of strong path dependence could include some instances of information-destroying processes. Recall our discussion of systems with multiple equilibria in which we introduce noise in the process’s trajectory (Chapter 2).
Take for instance the bowl with multiple wells, but in which the marble is equipped with a device that makes it jump erratically from time to time, randomly with respect to direction. We concluded that if the frequency and magnitude of the perturbations are not too high, then the process could remain information preserving. It would also be irreversible. Despite the variability in the marble’s trajectory, the starting position will systematically be different from the end position (given that the system is far enough from equilibrium), and the probability of returning to a previously visited state would be smaller and smaller as the marble proceeds to the bottom of the bowl. So the stochastic version of the bowl example can qualify as path dependent. However, we have also concluded that the process becomes information destroying when the frequency and magnitude of the perturbations become extremely high, such that the probability distribution of outcomes becomes invariant. In this situation, wiggling (aspects of) the marble’s history would not change the probability of obtaining one outcome or another in multiple replays. Information from the past is destroyed, and it would be impossible to tell which path the marble has taken to a given outcome. Yet, the conditions of irreversibility and multiple attractors are met. Thus, under Szathmáry’s account, such a system would qualify as strongly path dependent, despite its information-destroying nature. This is problematic and I believe that this difficulty and the ones mentioned above call for a revision of the definition of path dependence, a journey that we will embark on in the next chapter.

Chapter 6: Path Dependence Revisited, Revised and Applied

Although Szathmáry’s (2006) analysis of path dependence is interesting and rich, we saw that it faces some important difficulties. I therefore suggest a revision of his account, informed by earlier sources, outside the field of biology.
Path dependence has become an increasingly popular notion in the social sciences, often in a context criticizing a lack of historical perspective in social and economic explanations. In this regard, path dependence has played a similar role to that of historical contingency in the historical turn in biology. Although it does not yet have a standard interpretation, as it is a relatively “young” technical notion, I believe it would be instructive to go back to some paradigmatic examples that started off discussions about and motivated the formulation of some principles of path dependence. Informed by these reflections, I will provide a new account of path dependence. My account will be presented as part of a general theory of stochastic information-preserving processes. I will then compare this new account with Szathmáry’s. Finally, I will show that it can successfully capture the form of historicity encountered in some examples presented in the previous chapters where we find a combination of unpredictability and causal dependence.

6.1 Lessons From the Social Sciences

Let us begin with a paradigm case of path dependence: the adoption and retention of the QWERTY keyboard (David 1985). The QWERTY keyboard was adopted in the 1890s and has dominated the North American market ever since, despite its actual suboptimality in terms of performance compared to some of the models that were launched in the early 20th century, such as the Dvorak Simplified Keyboard (DSK).27 The QWERTY keyboard was not the product of an irrational process, but rather an adjustment to some defects found in early models of typing machines. Consider Figure 6.1.

27 Please note that I am following the standard interpretation here, according to which QWERTY is suboptimal compared to DSK in the present context. This interpretation has been challenged by Liebowitz and Margolis (1990) in the article “The Fable of the Keys.” They argue that there are some myths about the history of QWERTY and it may very well turn out that the QWERTY layout may not be as suboptimal as suggested by David (1985). Even if Liebowitz and Margolis were right about the ambiguity regarding the suboptimality of QWERTY, this does not change the fact that the history of QWERTY is an example of path dependence. Although path dependence is often used in order to explain the fact that we are “stuck” with suboptimal technologies or policies, this is not a necessary condition (see David 2001 on this).

Figure 6.1. The history of the QWERTY keyboard represented with a branching tree. The time row represents the progression of time with different instants i_n. The context/environment row indicates which kind of typing machine existed at different instants. The probabilities p = high vs. low are probabilities of selection of a certain layout given contextual and economical considerations. Bold characters indicate the actual history.
[Diagram rows: Time (i0, i1, i2); Context/Environment (-, typebars, no typebars); layout branches (qwerty layout, quicker layout) with selection probabilities labeled high or low.]

Instant i0 corresponds to the time when no typing machine existed, hence the absence of layout at this point. Then came the typewriter (instant i1). Different models probably competed at the beginning, but for whatever reasons, the early models had typebars. This choice greatly influenced the kind of keyboard layout selected for typing machines. Typebar models originally had the unfortunate tendency to collide and jam if keys were struck in rapid succession. One solution consisted in a keyboard design that would slow down the writing process, and thus reduce the frequency of typebar jams. The QWERTY layout became that solution, but again alternative layouts had been considered. The typing process was slower with a QWERTY keyboard because some of the most frequently used letters in English occupy positions more difficult to reach.
So QWERTY slowed typing, but by preventing typebar jams it made the writing process more efficient overall, given the typewriters available at the time. It is important to note that initially QWERTY had other competitors performing the same speed-reduction function. But it gradually conquered the market and became widely adopted. David says: “This part of the story is likely to be governed by “historical accidents,” which is to say, by the particular sequencing of choices made close to the beginning of the process” (David 1985, 335).

Then the technology shifted to computerized machines (from instant i1 to i2), which finally solved the problem of colliding bars. At this point, there was no need for a keyboard layout that slows down the operator. Still, QWERTY dominates the market, despite the existence of superior (quicker) designs. As the diagram of Figure 6.1 indicates, once QWERTY had been selected in the typebar environment, its probability of selection in the non-typebar environment remained quite high. After QWERTY keyboards were selected to prevent typebar jams, they also became the standard for teaching typing skills. And as the number of QWERTY-trained typists increased, so did the desirability of the layout that best suited their skills. Thus selecting QWERTY keyboards created a kind of positive feedback where each adoption of a QWERTY-designed keyboard at an early instant increased the probability of its selection at later instants. The positive feedback created by the preferences of users explains why the probability of QWERTY remains high in the non-typebar environment, and hence why we are “stuck” with a now suboptimal technology. But had a quicker layout been selected in the typebar environment, then the probability of selecting the QWERTY layout in the non-typebar environment would have been low, and the probability of staying with the quicker layout would have been high.
Some have suggested that the retention of the QWERTY keyboard results from a sensitivity to initial conditions characteristic of deterministic chaos. But David (2001) rejects such a conclusion. The outcome is not merely a result of initial conditions, but a matter of the path taken during the early history of typewriter design. To be sure, the outcome depends on early conditions, but also on the series of choices that gradually led to the adoption of QWERTY. David (2001) argues that the notion of “path dependence” can better explain the history of QWERTY than extreme sensitivity to initial conditions. His account of path dependence refers to a dynamic property of stochastic processes. The concept is defined in terms of the relationship between a process’ trajectory and the outcome(s) towards which it tends to go, or the limiting probability distribution of the process. He says: “[a] path-dependent stochastic process is one whose [probability distribution on the long run] evolves as a consequence (function of) the process’s own history” (David 2001, 19).28

28 David also defines path dependence in negative terms: “Processes that are non-ergodic, and thus unable to shake free of their history, are said to yield path dependent outcomes” (p.19). I am not going to discuss this version of path dependence here because it is possible to do without it and because doing so would require lengthy clarifications of alternative ergodic theories. This would take us away from our main point. For those curious about the fascinating (and challenging) world of ergodicity, I recommend the work of Joel Cohen (for the Ergodic Theory developed in demography) and the excellent discussion of Lawrence Sklar in Physics and Chance (for the Ergodic Theory developed in thermodynamics).

This very broad definition is easier to interpret by focusing on the contrast with David’s notion of path independence. The latter should be familiar to the reader, for it coincides with the class of information-destroying processes defined in Chapter 2. A process is path independent when its “dynamics guarantee convergence to a unique, globally stable equilibrium configuration; or in the case of stochastic systems, those for which there exists an invariant (stationary) asymptotic probability distribution that is continuous over all the states that are compatible with the energy of the system” (David 2001, 18). Path dependence is the complementary concept, perhaps easiest to express by introducing a branching tree metaphor. A path-dependent process is captured by the image of an irreversible journey along a branching tree. From a given starting point, different paths can be taken towards two or more distinct, stable outcomes. Once the system has embarked on a path, the cost of returning to a previous branching point would be too high, so that the system gets trapped in a locally stable equilibrium. Referring to the QWERTY case again, we can understand that it would be too costly for users to learn different typing skills; hence the low probability of adopting a different, quicker layout.

This helps us to understand better the meaning of David’s definition, which says that the long-term probability distribution of path-dependent processes evolves as a consequence of the process’s own history. It means that the details of the history of the system can affect the probability distribution among feasible states in the long run. This applies to the history of the QWERTY keyboard. In brief, developing typebar typewriters and inventing the QWERTY layout shifted the probability distribution of outcomes by reinforcing the selection probability in favor of a QWERTY layout. But had the typebar arrangement not led to jamming, another keyboard design would have been adopted and I would most likely be typing these words with a different type of keyboard.

The two fundamental sources of path dependence in the QWERTY case are “increasing returns” and “lock-in.” These are easily explained with George Polya’s urn experiment. In order to perform this experiment, you need an urn, balls of different colors and a set of rules, summarized below.

Rules for the Polya Process:
• Initially, n1 = n2 = 1 (where ni is the number of balls of color i in the urn).
• In any period, if a ball of a certain color is selected, then it is put back in the urn together with one additional ball of the same color.

The first rule means that you initially place one ball of color 1 (c1) and one ball of color 2 (c2) in an urn. Then, you blindly draw one ball out of the urn. We assume that the balls are identical in size and texture, so that the probability of selecting one over the other is the same: 0.5. The second rule gives instructions about replacing the selected ball (replacement rule). In this case, the ball should be returned to the urn together with one additional ball of the same color. A ball is selected at every instant and the same replacement rule applies each time.

To understand the phenomenon of increasing returns, let’s consider the second rule. Let’s suppose that you pick a c2 ball at the first draw. After applying the replacement rule, you have three balls in the urn: one c1 and two c2. Thus, when you make the second draw, instead of having an equal chance of selecting each color, you have a 2/3 chance of drawing a c2 ball and a 1/3 chance of drawing a c1 ball. This change in selection probability may be said to result from the increasing returns implemented by the second rule. The selection of one type of ball at time t increases the probability for the same type to become selected at t + 1.
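The two rules can be made concrete with a short simulation. The following Python sketch is an illustration only and not part of the original analysis: the function names `prob_next` and `polya_urn`, the seed handling and the number of draws are assumptions introduced here. It reproduces the 1/2 and 2/3 selection probabilities computed above and allows the process to be replayed from the same initial urn.

```python
import random
from fractions import Fraction


def prob_next(counts, colour):
    """Probability that the next draw is `colour` (0 for c1, 1 for c2)."""
    return Fraction(counts[colour], sum(counts))


def polya_urn(draws, seed=None):
    """Replay the two-colour Polya process and return the final counts (n1, n2)."""
    rng = random.Random(seed)  # hypothetical seeding, only for repeatable replays
    counts = [1, 1]            # rule 1: n1 = n2 = 1
    for _ in range(draws):
        total = counts[0] + counts[1]
        # Blind draw: every ball in the urn is equally likely to be picked.
        colour = 0 if rng.randrange(total) < counts[0] else 1
        # Rule 2: return the ball together with one more of the same colour.
        counts[colour] += 1
    return tuple(counts)


if __name__ == "__main__":
    # Same initial conditions, same rules, several replays.
    for seed in range(5):
        n1, n2 = polya_urn(10_000, seed=seed)
        print(f"replay {seed}: share of c2 = {n2 / (n1 + n2):.3f}")
```

Replaying the process with different seeds typically leaves the share of c2 balls settling at a different value in each run, which is one way of seeing how heavily the early draws weigh on the long-run composition of the urn.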
Thus we see clearly with this example that increasing returns are like positive feedback (or self-reinforcement) created by the selection of one strategy. Increasing returns are for some the very essence of path dependence (Arthur 1994, Pierson 2004), and we find them operating in the QWERTY example. As mentioned above, the early selection of QWERTY contributed to the fact that typists became familiar with it, which created a kind of positive feedback where each adoption of a QWERTY-designed keyboard increased the probability of its selection at later instants.29 And this resulted in the second phenomenon noted above, which theorists call “lock-in.” This term refers to a state that is unlikely to change. The expression “lock-in” is simply a vivid way of saying that the system becomes trapped in an equilibrium state that may not be optimal, like QWERTY keyboards in a non-typebar environment.30

Another interesting aspect of Polya’s urn experiment is that from the same initial conditions and the same set of rules, the system can reach many different equilibria, i.e., long-term stable proportions of colors in the urn. As nicely shown by Joel Cohen (1976), a given Polya's urn result is virtually irreproducible because each repetition of the experiment is overwhelmingly likely to yield a different stable distribution of colors in the urn. The existence of multiple equilibria and the fact that the outcome depends on the series of events (especially the early ones) show well that Polya’s urn dynamics cannot shake free of history, and it has become for most a classical case of path dependence.

In summary, revisiting some of the classical examples and models of path dependence in the social sciences reveals that path dependence is a property of stochastic branching processes, that it entails some kind of unpredictability but differs from deterministic chaos, and finally that many instances are associated with increasing returns and lock-in.31

29 Note that appealing to increasing returns in economic explanations opposes the “conventional” way in which economists have been explaining market shares. The most common economic theory is built on the assumption of decreasing returns, or negative feedback (Arthur 1994). When products become available, their price and market shares are determined by availability of resources, geography, population density, customer tastes and other technologies available, but not by the history of choices made by agents. The particularity of processes driven by diminishing returns is the existence of a unique and predictable equilibrium. Any shift from this equilibrium is counteracted by opposing forces. For example, if the demand for a given product suddenly increases, but the production is kept constant, the cost for that product will eventually become higher. If an alternative product of similar quality is available at a lesser price on the market, then it will become more valuable to purchase that latter alternative. So at the end of the day, the market shares and the relative prices of items should return to what they were. Resource-based goods (e.g. agriculture or mining) are still, for the most part, governed by diminishing returns. In these cases, history does not matter very much. As Arthur (1994) puts it, “[history] merely delivers the economy to its inevitable equilibrium” (p.11). However, knowledge-based goods (e.g. computers, pharmaceuticals, automobiles, war weapons, telecommunication equipment, etc.) are largely subjected to increasing returns. For many of these goods, history matters because more than one alternative outcome exists, and the one that will obtain depends on the path taken by the process.

30 David (2001, p.22-25) warns us not to systematically associate path dependence with “inextricable inefficiencies.” Path-dependent processes can yield suboptimal outcomes, but this is not necessary. Take for instance the history of the videocassette recorder (VCR), which demonstrates path dependence but without a clear difference in optimality between the selected outcome and its alternative. The VCR market began with two competing technologies, VHS and Beta. Both technologies were sold at a similar price, and both could realize an increasing share of the market if they started to sell more. As it turned out, larger numbers of VHS recorders encouraged video outlets to stock more tapes compatible with VHS, which thereby increased the value of owning a VHS recorder, thus leading to more people buying this technology. So small gains in market share tilted the competition in favor of VHS, which helped increase its lead and domination in almost no time. Although some have claimed otherwise, it remains unclear whether the Beta technology was in fact much better than VHS. So path dependence does not necessarily yield suboptimal outcomes.

31 Note however that increasing returns and lock-in are not always associated with path dependence (see Appendix 1).

6.2 Redefining Path Dependence: Towards a General Theory of Stochastic Information-Preserving Processes

We now enter the second part of this chapter, in which I provide a new definition of path dependence. Recall that one of the motivations for doing so is that I want to solve some of the difficulties arising with Szathmáry’s definition. Briefly, I have argued at the end of Chapter 5 that Szathmáry’s account of path dependence could benefit from a clearer distinction between path-independent and path-dependent processes; that a more useful and representative sense of degree of path dependence would rely on the degree of convergence/divergence of outcomes, not on how many stable attractors there are; and finally that the notion of path dependence should not be compatible with instances of stochastic information-destroying processes.
Another important motivation was to develop an account in line with paradigmatic examples of path dependence in other fields, like QWERTY or Polya’s urn experiment. Thus my account should be compatible with the mechanisms of increasing returns, multiple equilibria and lock-in (although these may not be necessary conditions, see Appendix 1). I also wish to offer more definitional clarity about essential notions like “history” and “path,” which are commonly loosely defined and assumed unproblematic.

6.2.1 Definitions

A fundamental property of stochastic processes is their capacity to diverge towards the future. This results from the occurrence of some probabilistic events such that we can obtain alternative outcomes from a given initial point. It is useful to represent such processes with a branching tree. Let us represent such a “tree” as a partially ordered set of moments mi.32 Each moment can be pictured as a concrete event, and the tree is ordered by a “causal order,” <, which indicates the flow from past to future and can be understood as “earlier < later” or “backward < forward.” An important concept here is comparability. We say that two moments mx and my are comparable if one of them lies in the past of the other, i.e., (mx < my) or (my < mx). Branching time systems include incomparable pairs, and this is why we say that a tree is a partially ordered set of moments. In Figure 6.2 for instance, m1 and m2 form an incomparable pair, whereas m1 and m3 are comparable.

Figure 6.2 A branching tree representing a partially ordered set of moments mi.
We define a history h as a complete course of events or a process, i.e., as a maximal, totally ordered set of moments in a tree.33 That is:
(a) Totally ordered: Any two moments m1, m2 in h are comparable, i.e., either m1 < m2 or m2 < m1.
(b) Maximal: If m is any moment not in h, then h ∪ {m} is not totally ordered.

32 Please note that the following account of trees follows closely the one developed in Facing the Future (Belnap, Perloff, and Xu 2001), except that I replace the reflexive order ≤ with the strict, irreflexive causal order <.

33 Note that I am only considering a particular history here. One could also want to define The History H, which would correspond to the complete sequence of events of the entire universe.

Figure 6.2 encompasses four histories: h1 = [… m0, m1 and m3]; h2 = [… m0, m1 and m4]; h3 = [… m0, m2 and m5]; and h4 = [… m0, m2 and m6]. A fundamental element of branching trees is the branch point, i.e., a moment where histories bifurcate.34 We say that mi is a branch point if and only if it belongs to two histories, hx and hy, and there is no mj such that mj > mi and mj belongs to both hx and hy. In contrast to a branch point, a convergence point is a moment where histories fuse into one another and become indistinguishable. A convergence point occurs when two distinct histories meet, i.e., when histories containing incomparable moments in the past happen to have comparable moments in the future. We postulate that trees do not have convergence points. In other words, we say that there is no backward branching. Furthermore, we say that all moments in a tree have a historical connection: for any two moments mi and mj, there exists a moment mk such that mk < mi and mk < mj, and there is no later moment mn > mk such that mn < mi and mn < mj.

We can now define a path p as a complete, totally ordered set of branch points in a tree.
A path p is complete if, for any branch points mx, my in p, if mx < m < my, and m is a branch point, then m is also in p. Completeness makes sure that there are no in-between branch points missing, but it does not necessarily include the first branch point (if there is one). This opens the possibility of considering only a (complete) subset of a history, without having to take into consideration every event that came before. This definition of path also leaves out the last nodes on the tree (which are not branch points). Figure 6.2 encompasses two paths: p1 = [m0, m1] and p2 = [m0, m2]. This definition of path focuses our attention on the moments where something non-trivial (branching) occurs.

34 For simplicity, I shall only consider trees whose branch points split into two histories.

We also need to postulate that any moment is associated with one of several possible states. It is important to note that moments, unlike states, are concrete and non-repeatable events. States, on the other hand, can come again. We may have a situation where sx < sy < sx. Hence, sx < sx becomes admissible for states but not for moments, which cannot precede themselves. This also means that trees for which states (instead of moments) are the main constitutive elements do not have a “no backward branching” clause. We signify the state x at moment i as sxmi.

Finally, we introduce the notion of instant. A given instant is the result of a vertical cut in a tree, such that each instant intersects each particular history in a unique moment. For example in Figure 6.2, m1 and m2 are alternative moments for a given instant, whereas m3, m4, m5 and m6 are alternative moments at a later instant. This notion is relevant because, as we will see, I argue that inferences about path dependence consist of making same-instant comparisons: we consider what else could have happened at the instant at which a given moment occurs.
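The notions of history, branch point, path, comparability and instant can be checked against Figure 6.2 with a small data structure. The Python sketch below is only an illustration under stated assumptions: it treats m0 as the root of the tree (the histories in the text extend further into the past), it identifies an instant with a moment's depth in the tree, and all function and variable names are introduced here for the purpose.

```python
# The tree of Figure 6.2, written as "moment -> later moments it branches into".
CHILDREN = {
    "m0": ["m1", "m2"],
    "m1": ["m3", "m4"],
    "m2": ["m5", "m6"],
}
ROOT = "m0"  # assumption: everything before m0 is ignored


def histories(node=ROOT):
    """All maximal, totally ordered sets of moments starting at `node`."""
    later = CHILDREN.get(node, [])
    if not later:
        return [[node]]
    return [[node] + rest for child in later for rest in histories(child)]


def is_branch_point(moment):
    """A moment where histories bifurcate."""
    return len(CHILDREN.get(moment, [])) >= 2


def comparable(a, b):
    """Two moments are comparable when some history contains both."""
    return any(a in h and b in h for h in histories())


def paths():
    """The maximal paths: each history reduced to its branch points."""
    found = []
    for h in histories():
        p = [m for m in h if is_branch_point(m)]
        if p not in found:
            found.append(p)
    return found


def instants():
    """Alternative moments grouped by instant (here: depth in the tree)."""
    cut = {}
    for h in histories():
        for depth, moment in enumerate(h):
            if moment not in cut.setdefault(depth, []):
                cut[depth].append(moment)
    return cut


if __name__ == "__main__":
    print(histories())
    print(paths())
    print(instants())
```

Under these assumptions the sketch yields the four histories and the two paths listed above, treats m1 and m2 as incomparable and m1 and m3 as comparable, and groups m1 and m2 at one instant and m3 to m6 at the next.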
Note also that instants, unlike moments, are temporally but not causally ordered. We signify the occurrence of state x and moment i at instant z as sxmiiz. With these basic elements of ontology in place, we can define path dependence, which should convey that history matters.

Path dependence: a process (or a history) that leads to state sx at moment mi in instant iz is path dependent if and only if:
(i) There is a path p which has sx at mi as one possible outcome at iz, and some state other than sx as another possible outcome at iz;
(ii) There exists at least one alternative path p′ that intersects p such that either:
(a) sx does not occur as a possible outcome of p′ at iz; or
(b) sx does occur as a possible outcome of p′ at iz, but Pr(sxmiiz|p) ≠ Pr(sxmjiz|p′).

More informally, the two fundamental conditions stated in this definition are that alternative outcomes must exist at a given instant and that the probability of a particular outcome changes when we switch from one path to another.35 Szathmáry’s account also requires the existence of multiple outcomes. However, under his account these outcomes have to be equilibrium states or stable attractors. The latter condition is simply relaxed under my account. The indifference about the existence of stable equilibrium states also contrasts with many accounts of path dependence in the social sciences that focus on lock-in effects. Under my account, locked-in phenomena are possible but not necessary for path dependence to occur (see also Appendix 1).

Like Szathmáry’s, this account admits of degrees. However, the reason for the change in strength is different, for it may have nothing to do with whether there are stable attractors and whether there are many or one. In fact, my account admits of two senses of degrees. The first one has to do with the extent to which different paths yield divergent/convergent outcomes.
If an outcome sx occurs only once at a given instant, it means that the probability of reaching state sx at that instant via any alternative path is null. This case corresponds to the most extreme degree of path dependence, where every path diverges to a different outcome. It is certain that the system will end up in a different outcome each time it changes path (total divergence). A lesser degree of path dependence will admit multiple occurrences of a state at a given instant (partial convergence), but there will exist at least one path for which the probability of that outcome is different from all other paths leading to that outcome. This is expressed by the very last condition in the definition. This middle zone of path dependence includes a wide range of divergence. More divergence means more diversity and less repetition of states at a given instant. Pushing this difference in degree to its other extreme, we obtain path independence (complete convergence) when at a given instant the same state occurs in every history. So an interesting feature of this definition is that we have a continuum between historical and ahistorical processes.

35 The probabilities here are objective probabilities, understood in terms of propensities or long-term frequencies.

I believe that this way of accounting for the difference in degrees better suits the type of gradation found in the biological literature. We saw at the end of Chapter 4 that some biologists infer that history matters more or less depending on the extent to which the processes are divergent or convergent. This was especially explicit in Chase’s (2003) analysis of the factors that do or do not favor the formation of multiple equilibria in ecological communities. A similar notion of degree could be extracted from the long-term evolutionary experiment (LTEE). This sense of degree also reflects well the description of degrees of historicity sketched in Chapter 2.
I have argued that the extreme form of historicity entails that any change in the past leads to an alternative present/future. This translates well in the present framework as extreme path dependence, where reaching a different outcome when switching from one path to another is certain. The other extremes also coincide. Path independence and ahistoricity happen when a process totally converges towards the same state, regardless of how we change the past. In between, we find partially converging/diverging processes, where changes in path also affect the probability of being in a certain state at a later instant, but to a lesser extent than extreme path dependence because different paths can yield the same outcome, although with different probabilities. There is therefore a true sense of continuum between ahistorical and historical processes.

There is yet another sense of degree implicit in my account that is not typically discussed by theorists of path dependence. That sense of degree relates to just how small or large the difference is between conditional probabilities in the second condition. For example, the processes illustrated in Figure 6.3 have the same structure. They both are partially convergent because each of the two final states appears twice at the final instant. However, the processes in 6.3a are weakly path dependent, whereas the ones in 6.3b are strongly path dependent.

Figure 6.3. a) Weak and b) strong path-dependent processes. [Both trees have the nodes sim0; s1m1 and s2m2; s1m3, s2m4, s2m5 and s1m6. Branch weights shown: a) .5/.5 at the first branch point and .51/.49 at the later ones; b) .8/.2 at every branch point.]

This second sense of degree is important because it captures well the claim made in Chapter 2 and Chapter 5 that history sometimes does not matter even if there exist multiple possible outcomes.
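The contrast between weak and strong path dependence can be made concrete by comparing conditional probabilities directly. The Python sketch below is a hedged illustration rather than the author's own formalism. It assumes the tree structure of Figure 6.3 with the two paths ending at m1 and m2, it assumes that in each panel the larger weight sits on the branch that keeps the state unchanged (an assignment that cannot be fully recovered from the figure), and it adds a uniform weighting purely for contrast. All names are hypothetical.

```python
# Pr(state at the final instant | path), keyed by the last branch point of the path.
# Assumption: the larger weight goes to the branch that repeats the current state.
WEAK = {"m1": {"s1": 0.51, "s2": 0.49}, "m2": {"s2": 0.51, "s1": 0.49}}
STRONG = {"m1": {"s1": 0.8, "s2": 0.2}, "m2": {"s2": 0.8, "s1": 0.2}}
# Added for contrast: the same structure with 0.5 on every branch.
UNIFORM = {"m1": {"s1": 0.5, "s2": 0.5}, "m2": {"s2": 0.5, "s1": 0.5}}


def pr(tree, path_end, state):
    """Conditional probability of `state` given the path ending at `path_end`."""
    return tree[path_end].get(state, 0.0)


def difference(tree, state, p="m1", alt="m2"):
    """Gap between the two conditional probabilities (second sense of degree)."""
    return abs(pr(tree, p, state) - pr(tree, alt, state))


def is_path_dependent(tree, state, p="m1", alt="m2", tol=1e-9):
    """Check conditions (i) and (ii) of the definition for one pair of paths."""
    outcomes = {s for s, weight in tree[p].items() if weight > 0}
    # (i) the path p admits `state` and at least one other state as outcomes.
    cond_i = state in outcomes and len(outcomes) > 1
    # (ii) the alternative path excludes `state` or gives it another probability.
    cond_ii = difference(tree, state, p, alt) > tol
    return cond_i and cond_ii


if __name__ == "__main__":
    for name, tree in [("weak", WEAK), ("strong", STRONG), ("uniform", UNIFORM)]:
        print(name, is_path_dependent(tree, "s1"), round(difference(tree, "s1"), 2))
```

On these assumed weights both panels satisfy the definition, but the gap in conditional probabilities is far larger in the strong case, while the uniform weighting fails condition (ii) even though two outcomes remain possible.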
History failed to matter, despite multiple possible outcomes, in the case of the bowl with multiple wells in an extremely noisy environment, which causes the marble to jump erratically from one position to another and to display an extremely stochastic trajectory. I have concluded that this situation could lead to information-destroying processes when none of the admitted outcomes can be associated with a distinct (range of) past condition(s). This type of situation would also count as path independent under my account. If we suppose that the bowl has two wells, we can represent the situation with the same tree structure as in Figure 6.3, but we would have to replace the weights so that we have p = 0.5 on every branch. Although that tree has multiple outcomes at its final instant, there are no differences in conditional probabilities. This means that condition (ii) of the definition of path dependence is not met. This also shows that, unlike Szathmáry’s, my account would not let some instances of information-destroying processes count as path dependent.

Although I have not made it an explicit and fundamental characteristic of path dependence, I believe that this account encompasses the idea of contingent irreversibility as discussed in the previous chapter. Recall that contingent irreversibility essentially means that returning to a previous (equilibrium) state becomes increasingly improbable with time due to the influence of conditions and preconditions that are individually or jointly improbable. There can be different sources of contingent irreversibility, and we have discussed essentially two in the previous chapter: complexity and historical constraints. The second one is more closely related to path dependence. Consider again the case of ichthyosaurus, but this time represented with a branching tree diagram (Figure 6.4).

Figure 6.4 Branching diagram showing the high probability of ichthyosaurus (and low probability of fish) to arise in aquatic life conditions after terrestrial conditions.
We have established in Chapter 5 that this is an example of contingent irreversibility because of the very low probability of returning to a fish state after adaptation to terrestrial life. It is also path dependent. We have two alternative states at the last instant, and none occurs more than once. More importantly, the reason why fish are very unlikely to evolve at this stage is the terrestrial stage. Fish were likely to evolve after a long series of adaptations to aquatic life. But changing this history, by adding adaptations to terrestrial conditions, has produced different life forms and significantly lowered the probability of fish evolving (again) from that different path. So we have path dependence not only because the last instant admits of alternative life forms, but also because changing the path has affected the probability of certain outcomes. Different histories tend to result in different aquatic life forms.

Note however that it is not necessary for a process to be irreversible and path dependent at the same time. We can imagine a scenario where the probability of returning into a fish state is reasonably high, which would make the process reversible. There would be some lineages that actually returned to a fish state despite their reptilian history.36 Still, if the probability of outcomes had changed after a period of terrestrial adaptation, which is a very reasonable hypothesis, then the process would have remained path dependent, although perhaps to a lesser extent. Therefore, irreversibility is not a necessary condition for path dependence in my framework.

Another fundamental difference between our accounts comes from the fact that my account makes the property of path dependence instant-relative. Szathmáry, like most theoreticians of path dependence, does not integrate this type of relativity.
On such accounts, a process is either path dependent or it is not; it cannot be path dependent at some instant and path independent at another instant. I believe that this is a problem. Recall the example of a marble rolling in a bowl with only one well. We can say that information from the past vanishes only once the marble has reached the bottom of the bowl. We cannot say that history does not matter as long as the system is far enough from its global equilibrium. If only for making sense of this type of inference, I think that adopting an instant-relative approach has an important advantage over a more universal or omniscient one.

36 The next chapter will present a similar type of example with stick insects (phasmids).

Also related to this point, adopting an instant-relative approach solves the problem of what to do with evolutionary dynamics displaying transient divergences. I have suggested in the previous chapter that two initially identical populations undergoing different orders of mutations and taking temporarily different paths to the same genetic/morphological state could be more path dependent than processes in which no divergence occurs at all. This conclusion is difficult to endorse, but only because we were in a context in which we could “see” from the beginning that the fate of the populations was to converge in the long run. The same discomfort would arise with the bowl with one well, which is after all an exemplar of an information-destroying process. We come to this conclusion because we can predict that the marble will always end its ride at the bottom of the bowl. So we do not want to say that this example can count as a weaker form of path dependence. This problem does not arise within the context of an instant-relative framework because we compare outcomes at a given instant. Thus history can matter along the way, yet cease to have an influence once processes are compared at the instant when a global equilibrium is actually reached.
I also believe that adopting an instant-relative approach reflects well the epistemological state we are in when observing and comparing real processes. What we call “outcomes” in real life observations are always relative to an instant. Only mathematical models or simulations can give a global, omniscient view. These projection tools make possible inferences based on what comes at the end. The problem is that we seldom reach this end point, and observations and data are always instant relative. This may not cover all the differences between our accounts, but it highlights some fundamental discrepancies and shows that this revised notion solves the problems mentioned at the end of Chapter 5.

Before moving on to the application of this new account of path dependence, I would like to answer a potential objection. Although it may look as if path dependence simply reduces to probabilistic causation, this is not the case. Probabilistic causation may be defined in various ways, but the following condition is often essential to it: An event C is a probabilistic cause of another event E if the occurrence of C raises the probability of E.37 The principle of path dependence differs from this account of probabilistic causal dependence in at least two important ways. First, the inequality does not require that the occurrence of an event raises the probability of the outcome that will obtain. The requirement is simply that there exists an alternative path for which the conditional probability of outcomes differs. In this sense, it is more closely affiliated to the notion of causal dependence as developed by Woodward (2003), according to which the probability of a present event changes as a function of a certain past event. Note that this notion of probabilistic causal dependence (or some may prefer “causal relevance”) is also the basis of the definition of information-preserving processes.
Second, because “path dependence” requires the existence of intersecting paths, the probability of an outcome will depend on the occurrence of a set of at least two moments. Now, in many accounts of probabilistic causation, the relationship between cause and effect should be Markovian, which means that the probability that a system will be in a certain state at time t+1 will depend only on the state at time t, and not on what happened at any earlier time t-n. This Markovian assumption is denied here, for the probability of an outcome in general depends on the (complete) path leading up to the outcome.38 As shown with the QWERTY example (see Figure 6.1), the reason why we are stuck with a slower keyboard layout today has to do both with the fact that the early typing machines had jamming problems and with the series of choices that followed the selection of the QWERTY layout. There are at least two important branch points in this story: one when typebar machines were invented, and one when QWERTY became the solution (adopted by most typists) to the jamming problem. Essentially, the existence of alternative outcomes at a certain instant is a necessary but insufficient condition for path dependence. When speaking of path dependence, we refer to a sequence of branching points admitting of at least two stages of multiple realizations.

37 Even the Stanford Encyclopedia of Philosophy will do.

38 Note that I am not insinuating here that Markovian processes cannot be path dependent. I simply say that they do not have to be. Path dependent processes can have an extended temporal dependence, as is often the case with (random) nonlinear dynamics.

6.2.2 Applications

Let’s now look at some previously described scenarios to see if this principle of path dependence applies to them, starting with the QWERTY case. Figure 6.1 illustrates well the definition of path dependence.
There exist alternative outcomes at the last instant, and the probability of being in one or another of them changes as we switch paths. In the actual history, a sufficient number of users adopted QWERTY and developed skills in typewriting with that layout. This resulted in a greater market share for QWERTY and a lock-in, i.e., a high probability of being adopted in the future, despite the fact that the environmental conditions changed and a slower layout actually became a suboptimal strategy. But had another path been taken, things would have gone otherwise. Another outcome would have occurred (or would have had a high probability of occurring) if there had not been a jamming key problem and if users had initially adopted and developed typing skills on an alternative, quicker layout. Although this stands as an incomplete representation of the whole story, we nevertheless can appreciate how a QWERTY-type of example can be accounted for by the above notion of path dependence.

This account also applies to the biological examples of dependence on past vagaries presented in previous chapters. Chapter 3 for instance discussed the forms of historical contingency arising in the Long-Term Evolutionary Experiment (LTEE), in which initially identical populations put in a new environment would present either sustained divergence, transient divergence or parallel evolutionary dynamics. Using mathematical models, Johnson et al. (1995) have found that a large population size and high mutation rate favor the occurrence of “coincident-event” replacement, which means that the same beneficial mutations happen in the same order and at the same time in duplicated populations. Coincident-event replacement will thus result in the parallel evolution of these populations, and by extension the unimportance of history.
By contrast, a low mutation rate and small population size tend to favor “isolated-event” replacement, in which different populations undergo different beneficial mutations at different instants. Isolated-event replacement will therefore result in either sustained or transient divergent evolutionary trajectories. Johnson et al. (1995) make their point using a three-allele model. The initially identical populations are constituted of haploid organisms and the model assumes three possible alleles, or three possible states, at a given locus: s1, s2 and s3. All populations begin their journey in state s1. The authors further stipulate that each one of these genotypes has a different fitness value, w(sx), ordered thus: w(s1) < w(s2) < w(s3). This means that natural selection will favor the establishment of s3 over s2, and s2 over s1. In order to create two stable equilibria, i.e., two disconnected adaptive peaks, the authors further assume that it is impossible for a population to go from state s2 to s3 or from s3 to s2. More precisely, the mutation rate uij from si to sj satisfies u12 = u13 = u > 0, and all other uij = 0. Thus, if the mutation rates are sufficiently low so that collective replacement happens as an isolated event, then some populations will be stuck in a suboptimal peak, if s2 replaces s1 before a successful s3-mutation occurs. They conclude:

[W]hen collective replacement is isolated-event, the random origin of mutations, either along or in concert with genetic drift, can theoretically lead to sustained divergence of formerly identical populations in identical environments, even for selectively important traits. In contrast, if collective replacement is coincident-event, then the most fit allele will usually appear and win in every population, so among the populations of the metapopulation. (Johnson et al. 1995, 128-129)

Figure 6.5 summarizes the two situations contrasted in this passage and shows how path dependence can apply to sustained divergence, whereas path independence applies to evolution driven by coincident-event collective replacement. We can safely say that processes included in 6.5a are path dependent because multiple outcomes are possible at the latest instant, and none of the admitted states can actually recur. If an allele is selected at instant i1, then it has a probability p = 1 to remain in place at a later instant, and the alternative allele has a probability p = 0 to replace the previously selected allele. Thus, switching from one path to the other guarantees that the alternative equilibrium will be reached because populations will get locked into the first advantageous state they acquire.

Figure 6.5. Branching trees representing a) sustained divergence produced by isolated-event collective replacement and b) parallel evolution produced by coincident-event collective replacement. See text for details. States si are different alleles at a given locus.

Conversely, the model indicates that coincident-event collective replacement, produced by a high mutation rate and large population size, rapidly and synchronously converges towards the same evolutionary outcome. Because s3, the allele with the highest fitness value, will usually appear in all populations, there are no opportunities for the other allele to go to fixation in some populations. Processes in 6.5b represent that situation, which is not path dependent. Although this tree theoretically admits of alternative end states, there is but one admitted process. The evolutionary dynamic is therefore linear, but not path dependent.
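The contrast between the two regimes can be made concrete with a short simulation. The Python sketch below is not the model of Johnson et al. (1995) itself but a deliberately simplified illustration of it. The function names, the fitness values, the mutation rate and the starting frequencies are all assumptions introduced here; only the ordering w(s1) < w(s2) < w(s3) and the constraint u12 = u13 = u, with all other rates zero, come from the text. The isolated-event regime is reduced to a race between two equally likely mutations, with the winner locked in, and the coincident-event regime to deterministic selection among alleles that are all present from the outset.

```python
import random

# Assumed fitness values; only the ordering w(s1) < w(s2) < w(s3) comes from the text.
FITNESS = {"s1": 1.00, "s2": 1.05, "s3": 1.10}


def isolated_event_outcome(rng, u=1e-4):
    """Isolated-event regime, reduced to a race between two mutations.

    Waiting times for a successful s1->s2 and s1->s3 mutation are drawn from
    exponential distributions with the same rate u (u12 = u13 = u). Because
    u23 = u32 = 0, the allele that arrives first is locked in.
    """
    t_s2 = rng.expovariate(u)
    t_s3 = rng.expovariate(u)
    return "s2" if t_s2 < t_s3 else "s3"


def coincident_event_outcome(generations=2000, start=0.01):
    """Coincident-event regime: both mutants are present from the outset.

    Deterministic haploid selection: each allele frequency is multiplied by
    its fitness and renormalised by the mean fitness every generation.
    """
    freq = {"s1": 1.0 - 2 * start, "s2": start, "s3": start}
    for _ in range(generations):
        mean_w = sum(freq[a] * FITNESS[a] for a in freq)
        freq = {a: freq[a] * FITNESS[a] / mean_w for a in freq}
    return max(freq, key=freq.get)


def replicate_populations(n_pops, seed=0):
    """End states of n_pops initially identical populations (isolated-event)."""
    rng = random.Random(seed)
    return [isolated_event_outcome(rng) for _ in range(n_pops)]


if __name__ == "__main__":
    outcomes = replicate_populations(1000)
    print("isolated-event:", {a: outcomes.count(a) for a in ("s2", "s3")})
    print("coincident-event:", coincident_event_outcome())
```

Under these assumptions the replicate populations split between the two peaks in the isolated-event regime, whereas the coincident-event regime always ends in s3. The first case is the kind of sustained divergence that the account treats as path dependent; the second admits only one process.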
Although the divergence of initially identical populations evolving in identical environments has been observed in experiments with living organisms (fruit flies and bacteria), the model used by Johnson et al. (1995) makes several strong and unrealistic assumptions. These include the restriction on the mutation rates, for example, or the fact that populations are perfectly isolated from one another. The authors acknowledge that in nature there will be migration between populations and that some suboptimal local adaptations could be wiped out by the immigration of organisms with more advantageous traits. This only means though that the phenomenon of sustained divergence is going to be less frequent and more difficult to detect in nature. It does not invalidate the claim that path dependence can account for the situation where populations of the same species independently evolve different solutions to the same adaptive challenge.

This account of path dependence can also accommodate the priority effect in community ecology (discussed in Chapter 4). Recall that community ecologists typically invoke historical contingency when the assembly history of communities can yield alternative species composition and richness. In a nutshell, the process of community assembly is said to be contingent on history when changing the order of arrival of species creates multiple, divergent equilibria. This was very well illustrated in Grover (1994). Unfortunately, reproducing this model would take too much space here (even the simple model with four species would have 24 possible assembly histories). This is in fact a drawback of branching diagrams. Although they facilitate the understanding of path dependence, they can only be used for representing simple processes. Nonetheless, we can reproduce the simpler situation entertained by MacArthur (1972), which also shows the importance of the priority effect in community assemblies.
Recall that this example involves three species feeding on two resources. Each species uses each resource differently, which was represented in Figure 4.3 by their crossing isoclines. An isocline represents the level of resources at which a species can maintain itself. The fact that they are not juxtaposed means that each species needs different levels of resources to maintain itself. It also means that the system will allow for the co-existence of some species, but not others. If the system stabilizes at a level of resources that lies below a certain species’ isocline, then that species will not become (or will cease to be) a successful colonist because it will be outcompeted by the other species.

Figure 6.6 Branching tree representing the assembly of a three-species community (refer to Fig 4.3 for more details). States with two numerical indices indicate the species present after the community has reached equilibrium. For example, s1,2 means that species 1 and 2 have joined the community.

Since the order of arrival of species is conceived as random, it is possible to illustrate the situation using a branching tree diagram. Consider Figure 6.6. At instant i0 the community has no species. Then, any of the three species present in this regional pool can colonize the habitat with equal probability. This yields equal probability on the three branches leaving from m0 and three possible states at instant i1. We assume that each species colonizes only once, such that if a habitat is occupied by species 1, then the next instant can only correspond to the arrival of either species 2 or 3, but not 1. This results in three alternative states at instant i2, i.e. s1,2, s1,3 or s2,3. The numerical index indicates the species present once the community has reached a stable equilibrium. Note however that the state s1,2 is only stable at this particular instant because it will always be replaced by states s1,3 or s2,3 at the next instant, when the third species joins the community. Given the way in which the isoclines are arranged (Figure 4.3), the state s1,2 cannot be an equilibrium state in a three-species community.

Note also that some moments at instant i2 are not branching points. Alternative moments do not always leave the nodes in i2 because of the deterministic treatment of the local interactions in MacArthur’s model, and because the community composed of species 1 and 3 or 2 and 3 is stable and cannot change (given the assumptions of the model). The order in which species can enter the community is left undetermined, but the result of the competition is sometimes certain. This is the case for example when we go from s1,3m5 to s1,3m12. Although species 2 makes an attempt at joining the community, the resource level reached when species 1 and 3 are at equilibrium is below the minimum level at which species 2 can maintain itself, so this attempt does not change the state (i.e. species composition) of the community. MacArthur’s model does not have such a deterministic solution for all instances of colonization, and this is why some of the nodes in instant i2 are branch points. Take for instance the situation described at moment s1,2m4. The latter is a branching point because we don’t know how the resources will be used, and consequently which one of the two stable equilibria will actually occur.

Processes in this branching tree are path dependent. The fact that we have two alternative states at i3 means that condition i) of path dependence is met.
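The tree of Figure 6.6 can also be written down explicitly, which makes the probabilities involved in the two conditions easy to check. The Python sketch below is a reconstruction from the description above, not a reproduction of the figure. The assignment of moments m6 to m9 and m13 to m17 to particular parents, the weight of .5 on each branch leaving an undetermined node, and the names CHILDREN, STATE and end_state_distribution are assumptions made for illustration. The equal probability of each first colonist and the probability of 1 where the competition is settled deterministically follow the text.

```python
from collections import defaultdict

# Assumed layout of the branching tree: parent moment -> [(child moment, probability)].
CHILDREN = {
    "m0": [("m1", 1 / 3), ("m2", 1 / 3), ("m3", 1 / 3)],  # first colonist at i1
    "m1": [("m4", 0.5), ("m5", 0.5)],                     # second colonist at i2
    "m2": [("m6", 0.5), ("m7", 0.5)],
    "m3": [("m8", 0.5), ("m9", 0.5)],
    "m4": [("m10", 0.5), ("m11", 0.5)],                   # s1,2 is a branching point
    "m5": [("m12", 1.0)],                                 # s1,3 is stable
    "m6": [("m13", 0.5), ("m14", 0.5)],
    "m7": [("m15", 1.0)],                                 # s2,3 is stable
    "m8": [("m16", 1.0)],
    "m9": [("m17", 1.0)],
}

# State (species composition) realised at each moment.
STATE = {
    "m0": "s0", "m1": "s1", "m2": "s2", "m3": "s3",
    "m4": "s1,2", "m5": "s1,3", "m6": "s1,2",
    "m7": "s2,3", "m8": "s1,3", "m9": "s2,3",
    "m10": "s1,3", "m11": "s2,3", "m12": "s1,3", "m13": "s1,3",
    "m14": "s2,3", "m15": "s2,3", "m16": "s1,3", "m17": "s2,3",
}


def end_state_distribution(moment):
    """Probability of each state at i3, given that the process passes through `moment`."""
    if moment not in CHILDREN:  # a moment at i3 has no successors
        return {STATE[moment]: 1.0}
    dist = defaultdict(float)
    for child, p in CHILDREN[moment]:
        for state, q in end_state_distribution(child).items():
            dist[state] += p * q
    return dict(dist)


if __name__ == "__main__":
    print("from m0:", end_state_distribution("m0"))  # two alternative end states
    print("via m4:", end_state_distribution("m4"))
    print("via m5:", end_state_distribution("m5"))
```

Under these assumptions, a process passing through s1,2m4 ends in s1,3 with probability .5, whereas one passing through s1,3m5 ends in s1,3 with probability 1, so the probability of an end state changes with the path taken.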
Condition ii) is met as well because although these two stable states co-occur at this instant, the probability of reaching them changes depending on the path taken. For instance, Pr(s1,3m10i3|p1) < Pr(s1,3m12i3|p2), where p1 = [m0, m1, m4] and p2 = [m0, m1, m5]. Similarly, Pr(s2,3m11i3|p1) < Pr(s2,3m14i3|p3), where p3 = [m0, m1, m6]. Although simple, this illustration of the priority effect fits my account of path dependence very well.

6.3 Concluding Remarks

This chapter has developed a new account of path dependence that solves many of the difficulties we faced by using Szathmáry’s account. It has also shown how this notion can apply to some of the landmark cases of historicity in evolution and community ecology. I hope that the analysis provided here has convinced the reader that path dependence is a fundamental notion of historical biology. It clarifies the conditions in which history matters in (stochastic) biological processes and provides more substance and nuance to the notion of historicity. The next chapter will go further in the application of path dependence by looking at cases involving historical constraints in macroevolution and macroecology. I will also suggest that generative entrenchment (defined in the next chapter) can be an important cause of path dependence (and contingent irreversibility).

Chapter 7: Generative Entrenchment, Historicity and Irreversibility

The purpose of this chapter is threefold: 1) I want to discuss the role of phylogenetic constraints (defined later) in the production of path-dependent evolutionary and ecological processes, 2) I want to show how generative entrenchment (also defined later) can be an important cause of phylogenetic constraint and contribute to path dependence at the population and developmental levels, and 3) I want to discuss the role of generative entrenchment in the irreversibility of evolution.
Let’s start with the first goal, which is also the occasion for discussing further an important aspect of the historical turn in biology. We saw in the “Introduction” that one of the main objections deployed by Gould and Lewontin (1979) against the adaptationist program was directed towards the (too common) assumption that the process of evolution by natural selection results in the best of possible worlds. Evolution by natural selection tends to increase the average fitness of populations and as such is an optimizing process, but it is not the only force at work and there exist many examples of imperfect adaptations out there. One of the reasons for this is that natural selection has to work with what is already there. In fact biologists commonly explain the occurrence of imperfect adaptations by invoking some kind of historical constraint or legacy (Griffiths 1996; Williams 1992; Gould 1980, 1989, 1991; Gould and Lewontin 1979). Take for instance the following passage from Gould (1989):

Darwin recognized that the primary evidence for evolution must be sought in quirks, oddities, and imperfections that lay bare the pathways of history. Whales, with their vestigial pelvic bones, must have descended from terrestrial ancestors with functional legs. Pandas, to eat bamboo, must build an imperfect “thumb” from a nubbin of a wrist bone, because carnivorous ancestors lost the requisite mobility of their first digit. Many animals of the Galápagos differ only slightly from neighbors in Ecuador, though the climate of these relatively cool volcanic islands diverges profoundly from conditions on the adjacent South American mainland. If whales retained no trace of their terrestrial heritage, if pandas bore perfect thumbs, if the life on the Galápagos neatly matched the curious local environments – then history would not inhere in the production of nature.
But contingencies of “just history” do shape our world, and evolution lies exposed in the panoply of structures that have no other explanation than the shadows of their past. (Gould 1989, 300-301)

In Darwin’s time, imperfections were evidence against natural theology and intelligent design. Nowadays, these imperfections also indicate that natural selection does not act alone and freely in evolution. Natural selection is somewhat constrained by previous evolutionary history and it cannot guarantee that species will ever become perfectly adapted to their environments. The panda’s thumb, an appendage developed from a wrist bone in order to eat bamboo, is perhaps one of Gould’s favorite illustrations of this principle. It shows that evolution has to work with what is available from past evolution and that nothing guarantees that the available materials can yield the optimal solution.39 The wrist bone from which the panda’s thumb has arisen originally evolved for facilitating terrestrial locomotion. But now, at least in this lineage, it has acquired a different form and serves a different, alimentary purpose. Although it works relatively well, the panda’s “thumb” remains nevertheless an imperfect tool, much clumsier than a truly opposable thumb. More generally, we can say that this example shows that current forms and functions can carry a legacy from the distant past, which explains the occurrence of what look like arbitrary, maladaptive features.

39 Gould and Vrba (1982) have invented the notion of “exaptation,” which accounts for the fact that features sometimes serve a different function than the one they were selected for. The panda’s thumb would be an example of exaptation. The notion of exaptation has been criticized on the basis that it encompasses too much (see Griffiths 1996 for elaboration of this argument). If we consider that variations occur by chance, i.e., independently of the purpose they will eventually serve, then everything in Darwinian evolution becomes an exaptation. This is, in my view, not a charitable or even correct reading of the notion. What seems to have motivated Gould and Vrba (1982) is the fact that some traits, once they have originated, are reshaped by a change in selective forces. Perhaps many, but not all, traits become modified in order to serve a different function under different circumstances.

I will argue that historicity, and more particularly path dependence, is essential to this view of evolution. In fact, I believe that path dependence will be a general consequence of the existence of “phylogenetic constraints.” The latter notion will be developed with the help of the work of Peter Price and his collaborators on sawflies, in which they illustrate how some features of developmental histories are maintained ancestral characters around which adaptations are focused and as such become a source of patterns in macroevolution and macroecology. I will show how path dependence can be useful in interpreting this cascade of causal influence.

7.1 Phylogenetic Constraints and Historicity in Evolution and Ecology

There is a large body of literature on phylogenetic constraints, and covering all of it would take us beyond the scope of the discussion. For the purpose of this chapter, I will rely on the account given by Price and Carr (2000), which provides interesting details that will be useful for the rest of the discussion. According to this account, phylogenetic constraints should possess three essential properties. First, they should themselves be conserved evolutionarily, which means that they persist in lineages. The persistence of phylogenetic constraints may result from the fact that the character is “tightly integrated into developmental programs” (p. 646). Being tightly integrated results in some sort of phylogenetic inertia because changing such a character would most likely produce malfunctions.
We will come back to this hypothesis later, which closely resembles the notion of “generative entrenchment” developed by Wimsatt (1986, 2001) and Schank and Wimsatt (1988, 2001). Second, phylogenetic constraints should set key aspects of the ecological interactions and the selective regime of a taxon. For example, they can be features that determine species’ modes of alimentation, locomotion or reproduction. Third and finally, by virtue of the first two properties, phylogenetic constraints limit the major adaptive options available to a lineage. This is very abstract at this point, but it will become clearer when we get to a concrete example. With this view of phylogenetic constraints in hand, Price and Carr (2000) and Price (2003) formulate the “Phylogenetic Constraint Hypothesis”:

Macroevolutionary patterns provide the basis for understanding broad ecological patterns in nature involving the distribution, abundance and population dynamics of species. A phylogenetic constraint is a critical plesiomorphic character [a shared character, derived from a common ancestor], or set of characters, common to a major taxon …. Such characters limit the ecological and thus the major adaptive options in a lineage, but many minor adaptations are coordinated to maximize the ecological opportunities that can be exploited given the constraint. Such a set of adaptations is called the ‘adaptive syndrome.’ These characters in the adaptive syndrome, which evolve in response to the constraint, then result in inevitable ecological consequences, called ‘emergent properties.’ (Price and Carr 2000, 265)

They tested this hypothesis with sawflies.40 The latter are named after their morphology, which includes a saw-like, plant-piercing ovipositor (see Figure 7.1).

Figure 7.1 A sawfly using its saw-like ovipositor. (Drawing from Kabo 2009. Printed with permission from Karine Beaudoin).
40 They worked essentially with the species Euura lasiolepis, which is not the one represented in Figure 7.1.

Price and Carr (2000) suggest that the saw-like ovipositor is a phylogenetic constraint. This character is common to the entire family of sawflies and traces back to some of the earliest wingless insects, the silverfish, from Devonian time (more than 350 million years ago). The ovipositor’s architecture is complex and delicate and limits oviposition to the inside of soft plant tissues (endophytic oviposition), thus minimizing wear on the saw. So females tend to lay their eggs in the youngest shoots of their host plant. This further limits the evolution of alternative life cycles because females must emerge when their host plant phenology is appropriate, i.e., when the host plant has reached a certain stage of development relative to climatic conditions. The possession of a saw-like ovipositor thus generates a certain type of adaptive syndrome because it limits the range of options available for ecological interactions and some major adaptations.

An interesting and fairly novel aspect of Price and Carr’s project is the importance given to the impact of phylogenetic constraints on ecological phenomena, the so-called “emergent properties” that result from the adaptive syndrome evolved by the species. The close relationship between the endophytic oviposition and the host plant phenology limits the sawflies to certain patterns in abundance, distribution and population dynamics. The distribution of sawflies depends on the availability of young and vigorous shoots. Data show a strong correlation between perturbations (e.g., fire, flood, heavy mammal browsing) and population abundance. Disturbed sites tend to increase the availability of young plants and vigorous growth, which in turn increases the survival and reproduction rate of sawflies.
Conversely, if at some site populations of host plants become more stable and older, then sawflies become rare and may even go locally extinct. Thus, the main emergent ecological effects of having a saw-like ovipositor are: 1) very patchy distribution (on young, vigorous shoots), 2) generally low abundance at the landscape level, but rare dense populations in very favorable sites, and 3) relatively stable and predictable dynamics for many generations.

Price and his collaborators further support the phylogenetic constraint hypothesis by showing that species having evolved a different mode of oviposition do not display the same adaptive syndrome and consequently present different emergent properties. For instance, the spruce budworm, whose outbreaks in Canadian forests are well known, lacks a plant-piercing ovipositor and lays eggs on the surface of mature foliage. The spruce budworm’s mode of oviposition is also a phylogenetically primitive trait and is commonly found in the order Lepidoptera. But the constraint in this case has to be conceived in terms of behavior instead of organ. The posterior opening of the vagina (oviporus) of most lepidopterans is not an ovipositor per se, because it merely serves for discharging eggs. Price and Carr (2000) argue however that their mode of oviposition can nevertheless be a constraint in the sense that it limits the amount of information available to females. Because females simply lay their eggs on the surface, they receive less information from the host plant about its content in nutrients. The authors infer that natural selection tends to favor unspecific utilization of foliage as oviposition site.41 This in turn has important effects on the ecological emergent properties, which diametrically differ from the ones observed in sawflies. Resources are often abundant and not limiting for the spruce budworm, so populations can build to high densities and adopt non-patchy spatial distribution.
Moreover, because their abundance and density can reach very high proportions, the functional and numerical responses of natural enemies can be strong. That is, species feeding on the spruce budworms will be able to proliferate during periods of high budworm abundance. This, and the fact that epidemic disease can be common when abundance is high, will cause eruptive, as opposed to stable, population dynamics.

41 There is no clear evidence that this is a result of natural selection or a mere collateral effect of the oviposition mode. Nevertheless, the data clearly show no preference in oviposition site.

In summary, the Phylogenetic Constraint Hypothesis can be represented with the branching diagram presented in Figure 7.2. The states in this diagram represent primitive and derived characters, and the adaptive syndromes and emergent properties associated with the latter. The lower bound is the state of the common ancestor of the lineages leading to sawflies and spruce budworms. The exact state of the lower bound is not specified since we don’t know (from the story told by Price et al.) what the common ancestor looked like.

Figure 7.2 Different modes of oviposition resulting in alternative adaptive syndromes and emergent properties. sca: state of common ancestor, sendo: endophytic oviposition, sexo: exophytic oviposition, sendo(a&e): the adaptive syndrome and emergent properties associated with endophytic oviposition, sexo(a′&e′): the adaptive syndrome and emergent properties associated with exophytic oviposition. Branches leaving sendo and sexo are labeled with high and low probabilities.

At some point a lineage-branching event occurred, and the alternative lineages followed divergent paths, p1 = [sca-sendo] and p2 = [sca-sexo], leading to the evolution of endophytic oviposition in one lineage, and exophytic oviposition in the other lineage. These characters then acted as constraints, favoring the subsequent evolution of alternative adaptive syndromes and emergent properties (the adaptive syndromes and emergent properties associated with endophytic oviposition sendo(a&e), and those associated with exophytic oviposition sexo(a′&e′)). This is represented by the different probabilities of the subsequent evolutionary developments.

Figure 7.2 also helps us to see that the Phylogenetic Constraint Hypothesis entails historicity. Even when they live in similar environments, actual sawflies and spruce budworms have different adaptations and ecological responses because their distinct evolutionary paths have led to the accumulation of alternative modes of oviposition, which constrain differently their range of subsequent evolutionary possibilities. As Gould would have said, these life forms bear the signature of their past.

I believe that the form of historicity at play in this diagram is in fact path dependence. I argued in Chapter 6 that in order to qualify as path dependent, a process resulting in a certain state must have an alternative at a given instant and the probability of being in that state must change as a function of the path taken. Despite the lack of information regarding the precise weight on each branch, we can use the results obtained by Price and his collaborators to infer that the evolutionary processes resulting in sendo(a&e) and sexo(a′&e′) are path dependent. The probability of being in state “a&e” is high when species integrate phylogenetic constraint sendo, but it is low when species adopt phylogenetic constraint sexo, and vice-versa for the state “a′&e′.” So there exist alternative outcomes at a given instant, and the probability of being in one or the other of these states changes as a function of the evolutionary path. Thus, phylogenetic constraints can lead to path dependence because they shape the probability distribution of evolutionary and ecological outcomes.
Let me close this section by suggesting a rapprochement between the work of Price et al. and the research program in historical ecology initiated by Brooks and McLennan (1991). There was for a long time, and perhaps still is, a tendency to leave evolution out of explanation in ecology (Brooks and McLennan 1991; Kingsland 1995; Price 2003). Patterns in population dynamics and species diversity of current ecological associations are often explained by the distribution of habitats and local interactions (especially competitive exclusion, parasitism, predation, diseases). The explanatory role of the long-term evolutionary past is often limited to the residual variance, and it is seldom mentioned as a main causal factor (Ricklefs and Schluter 1993; Farrel and Mitter 1993).

The expression “historical ecology” (Brooks and McLennan 1991, 1994) refers to an approach emphasizing the role of phylogenetic constraints in explaining ecological phenomena, such as abundance, distribution, community composition, etc. More precisely, historical ecology claims that local interactions between organisms of various species are mediated via traits that have individual and evolutionary histories, so the patterns we observe at the population and community levels result not only from individual encounters in local environments, but also from evolutionary constraints. The intuition behind this kind of hypothesis is that species with a similar evolutionary past are more likely to show similarities in their morphological, behavioral and ecological characters. Clearly, this is in line with the Phylogenetic Constraint Hypothesis.

Historical ecology is based on a historical view of species. Species have to change in order to adapt to new environments, but they also tend to retain some of their evolutionary past in the form of shared and derived traits.
As Brooks and McLennan (1993) so poetically put it: “Species then are vessels of future potential, living legacies of past modifications and stasis shaped by millennia of biotic and abiotic interactions. They are history embodied” (Brooks and McLennan 1993, 267). This kind of spirit, which perhaps first appeared in the transformationist philosophy of the 18th century, still shows through in the work of contemporary biologists working in historical ecology, and I believe that it also forms the basis of the Phylogenetic Constraint Hypothesis. Finally, I hope that the above discussion has convinced the reader that the notion of path dependence can be useful in interpreting the nature of processes following from the existence of such constraints.

7.2 A Role For Generative Entrenchment

In this section, we go deeper into the cause of phylogenetic constraints, and by extension the cause of the historicity that follows from them. Recall that Price and Carr (2000) argue that features become phylogenetic constraints if they persist in lineages, which can happen if they become tightly integrated into the developmental program. I will show that this hypothesis can be spelled out with the notion of generative entrenchment (GE) put forward by Wimsatt (1986, 2001) and Shank and Wimsatt (1988, 2001).

Following Gould (1989), Wimsatt (2001) maintains that evolution is historical by virtue of its contingent nature. He, too, makes this claim by using (and combining) two notions of contingency resembling “causal dependence” and “unpredictability.” In order to mark history, he says: “an event must cause cascades of dependent events that affect evolution” (227). This clearly refers to some species of causal dependence. He also suggests that these events are contingent in the unpredictability sense and that history matters to evolution because “minor unrelated ‘accidents’ or ‘incidents’ can massively change evolutionary history” (226).
Both notions of contingency appear combined in this view, which is reaffirmed when he suggests that the evolution of life forms consists of a “successive layered patchwork of contingencies” (Ibid). So Wimsatt’s view of evolution clearly combines both senses of “contingency.” But he also adds an interesting explanation for such contingent evolution. He suggests that only GE can explain how this happens:

GE provides an explanation, perhaps the only possible explanation, for how and why this [successive layered patchwork of contingencies] is possible. In reproducing heritable systems, GE and selection may provide sufficient conditions for the incorporation and growing importance over time of contingency, and of history, in the explanation of form. (Wimsatt 2001, 226)

As discussed with the Phylogenetic Constraint Hypothesis, the evolutionary past can often become the framing principle for the subsequently acquired adaptations. The purpose of this section is to elaborate on the idea that GE makes it possible for some chance variations to accumulate, become integrated and have long-lasting effects on the evolution of life forms. If we can establish this, then we can also claim, building from what has been said in the previous section, that GE can be an important source of phylogenetic constraints and path dependence at the macroevolutionary and macroecological levels. Let us now clarify what GE amounts to.

7.2.1 Generative Entrenchment

The notion of “generative entrenchment” combines two elements: “generativity” and “entrenchment.” A generative structure typically possesses multiple elements that come together as a whole over a time period, and where later elements presuppose or depend on the presence and proper assemblage of earlier ones. To give a very simple example, one can imagine how we build a tower with blocks. The building process extends in time; even the simplest tower will need at least two steps.
In a metaphorical sense, we can say that our tower grows or develops each time a piece is added. Moreover, the position and stability of the pieces at a given instant depend on the position and stability of the pieces assembled earlier. A wide and solid basis offers the potential for a higher and more stable tower, whereas a narrow and weak basis will offer less support and less potential for development. So building a tower is a generative process: it takes place during an extended time period and later events causally depend on earlier ones.

The notion of generativity, because it entails such causal dependence, can resemble historicity. Note however that although generative entrenchment is compatible with historicity, it does not guarantee it. One can imagine a situation where it is only possible to obtain one type of tower. Depending on the shape and the number of the pieces available, there might be only one way of building a tower that will stand on its own. To speak metaphorically again, alternative assemblages would virtually always yield non-viable outcomes. However, such unique viable developmental outcomes are very unlikely in biological contexts, and I will argue in the next section that path dependence applies very well to most developmental processes.

Before going into detail about this, let me introduce the notion of “entrenchment.” Something is entrenched when it is difficult to change. In the present discursive context, this stability can be interpreted at the developmental and evolutionary scales. Take the tower example again. The pieces in the tower that come earlier in the building process and constitute the foundations will generally have the highest degree of entrenchment. Removing or changing them (even slightly) may be catastrophic for the whole tower, which would become highly unstable. The same is not true of the top and superficial pieces because they present a lesser degree of generative entrenchment.
Thus the architect has more freedom with the latter because the stability of the whole does not depend as much on them.

Even if it is much simpler than real biological populations, we can see with the tower example how differential entrenchment can impact the evolution of towers and create some evolutionary stability in certain parts. Imagine a city in which a population of towers is generated by replication of successful designs. When an architect finds a viable design, she or he gets more contracts and her or his plans are more likely to be copied by other architects. Some modifications will be added here and there, but we should observe in the long run that most towers have similar foundations. It will be easier to change the coat than the core of towers because changing the core will most likely result in an unstable construction. Thus, having a higher degree of entrenchment at the developmental level can have a stabilizing/constraining effect at the population level, creating a phenomenon of inertia for these parts.

A similar story can be told about biological organisms. Like building a tower, biological development is a generative process. The whole organism comes into being as a result of assembling various elements. As in the building of the tower, the proper development of an organism at a later stage causally depends upon earlier stages. This fundamental property of biological development is especially obvious in multicellular and hierarchical organisms. Each step of their development is causally connected to some antecedent events, and the form resulting from the whole process depends on the presence and proper assemblage of several developmental features.

It is important to note also that evolution is essential in this story because it is the process by which the integration of features takes place.
Shank and Wimsatt (2001) explain that certain features become increasingly integrated and entrenched with the mechanism of accretion, i.e., by the accumulation of new features at later developmental stages through evolution. The first forms of life were relatively simple, but they evolved into more complex species by accumulating new features on top of the ones they had previously integrated. With enough time, this process of accretion resulted in complex developmental networks with certain parts being more or less deeply integrated in a whole developmental process. Thus evolution by accretion will most likely result in organisms with elements presenting different degrees of entrenchment, with older features being (on average) more entrenched than more recent ones.

More generally, a feature will become more or less entrenched in a developmental process depending on the extent to which it is integrated into a complex generative structure. Wimsatt (1986, 2001) and Shank and Wimsatt (1988, 2001) explain that the degree of entrenchment of a given developmental feature depends on the magnitude of its “down stream effect.” A feature with many important downstream effects possesses a high degree of GE, and vice versa. Put in other terms, a feature has a high degree of GE if its modification produces (on average) a large impact on the developmental process. Moreover, these effects will most likely be detrimental: “If these [highly GE’d] features are absent or changed …, that has a high probability of causing a malfunction at one or more of the later stages” (Shank and Wimsatt 1988, 37). Conversely, a feature with a low degree of GE does not tend to produce important effects (or malfunctions) in the developmental process if modified.

This phenomenon of differential entrenchment is at the very basis of phylogenetic constraints because it explains why some features display more evolutionary inertia than others.
In a nutshell, if we grant the idea that modifications of highly GE’d features most likely result in malfunctions, and therefore reduce fitness, then GE implies that evolution should tend to be more conservative with highly GE’d features. Shank and Wimsatt (1988, 2001) have focused a lot on the higher degree of generative entrenchment of early developmental features and regulatory genes. Early developmental and regulatory features are like the foundations in a tower. Although they can undergo mutations, their important downstream effects will confer on them a higher evolutionary stability.

7.2.2 Generative Entrenchment and Developmental Historicity

In this section, I will show that GE can also contribute to path dependence at the developmental level. I will do so by applying the principles of generativity and differential entrenchment to a hierarchical developmental structure.

Imagine a hierarchical network composed of three levels of organization (see Figure 7.3). The bottom level represents regulatory genes, i.e., genes involved in the regulation of the expression of structural genes. For instance, they could be (highly conserved) homeotic genes involved in the regulation of the expression of other genes by binding to their promoters. The middle level represents structural genes. These are not directly involved in the regulation of expression of other coding sequences, but they code for a single chain of amino acids that constitutes either a protein or protein domain.42 Finally, the upper level represents phenotypic features, either morphological or behavioral, that causally depend on the level of expression and state of structural genes.

Figure 7.3. Hierarchical structure comprising three levels of organization. Arrows represent type-cause connections. Darker shade means a higher degree of entrenchment.
In this hypothetical structure, we see two regulatory genes (L1) controlling two structural genes each (L2), and three structural genes causally involved in the presence of two phenotypic features (L3). The central structural gene (the middle element of L2) is causally involved in two phenotypic features, whereas the two external genes (left and right elements of L2) are factors involved in only one phenotypic feature each. Differences in shade indicate differences in the degree of entrenchment of elements. Darker means that the magnitude of the downstream effect of an element is greater (more arrows leaving from it directly or indirectly), and paler means a lower magnitude of downstream effect.43 Thus, the two regulatory genes (5 arrows downstream) possess more entrenchment than the structural genes (2 or 1 arrows downstream). The phenotypic features (L3) are all white because they have the lowest degree of entrenchment.

42 Note that the chain of amino acids constitutes only the primary structure of the protein. A fuller description would include a tertiary structure resulting from the folding of the chain. In other words, the same sequence of amino acids will possibly turn into two different proteins and have different physiological functions if folded on itself in different ways.

43 Following the terminology in Shank and Wimsatt (2001), we could say that darker genetic elements have greater “pleiotropic entrenchment,” which is a type of generative entrenchment related to the interdependence of genetic constituents.

Notice that this graph contains only type-causation relationships and all these features are presented somewhat synchronically. In other words, arrows are all types of causal connections that occur at one time or another during the life history of an organism.
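The arrow counts given above (5 for each regulatory gene, 2 or 1 for the structural genes, none for the phenotypic features) can be reproduced by counting the distinct arrows reachable from each element. The sketch below is illustrative only: the node names (R1, R2, S1 to S3, P1, P2) and the exact wiring are assumptions reconstructed from the verbal description of Figure 7.3, and `downstream_arrows` is a hypothetical helper, not a measure defined by Shank and Wimsatt.

```python
def downstream_arrows(graph, node):
    """Count the distinct arrows leaving from node, directly or indirectly."""
    seen_edges = set()
    visited = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current in visited:
            continue
        visited.add(current)
        for target in graph.get(current, []):
            seen_edges.add((current, target))
            stack.append(target)
    return len(seen_edges)


# Assumed wiring: each regulatory gene controls two structural genes and
# the central structural gene (S2) contributes to both phenotypic features.
network = {
    "R1": ["S1", "S2"],
    "R2": ["S2", "S3"],
    "S1": ["P1"],
    "S2": ["P1", "P2"],
    "S3": ["P2"],
    "P1": [],
    "P2": [],
}

for element in network:
    print(element, downstream_arrows(network, element))
```

On this reading, the count serves as a crude proxy for the degree of entrenchment: the more arrows downstream of an element, the darker its shade in the figure.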
In order to show the impact of differential entrenchment, and why it makes development path dependent, we need to represent how this hierarchical structure develops over time. Consider Figure 7.4.

[Diagram not reproduced. Branch probabilities shown in the figure: .85, .05, .1, .15, .85, .85, .15, .99, .01. Instants are labeled i0, i1, i2.]

Figure 7.4. A hierarchical structure developing into alternative outcomes. Ordered series of dashed arrows are possible developmental paths. Changes in shape represent mutations: circle = no change, square = one change, triangle = two changes. Numbers on the branches are probabilities of survival to the next developmental stage.

At the earliest stage, only the genetic elements of the structure exist (i0). Only some of the causal connections are actually realized at the first developmental stages. The phenotypic components will only appear later in development. From the initial state, three probable intermediary states are illustrated. The probabilities represent the chances of survival to the next stage of development. The most likely of these three, perhaps corresponding to normal development, is represented on the left-hand side. The two others are alternatives resulting either from genetic mutations at the structural level (on the right-hand side), or, as is often the case in development, some changes in phenotypic features resulting from a change in the environment (in the middle). Each of these intermediate states then gives rise to two possible final states, for a total of four possible (although not equally probable) developmental outcomes. At this instant, all the genes and phenotypic features are expressed.

In order to represent the idea of differential entrenchment, I have indicated changes to genetic and phenotypic elements as follows: the initial (normal) state for a given element is a circle. One change transforms it into a square and two changes transform it into a triangle.
Also, I stipulate that a change in any element is transferred to every element causally affected by it (downstream effect). Thus, changing a regulatory gene will impact the three levels of organization at once (as in the top right corner), whereas a change in a phenotypic feature will only affect that feature, for nothing is immediately causally affected by it (as in the second structure starting from the left).

This branching diagram of a developing hierarchical structure clearly shows why development can be conceived as a path-dependent process. Not only do we have here a process with alternative states at the final instant, but the probability of reaching one state or another depends on the path taken by the developmental process. The graph admits of some convergence (compare for instance the structures in the middle in the last instant), but we still obtain path dependence because not all developmental alternatives are equally likely from any one of the trajectories. The second and third states from the left are identical, but their probability of occurrence changes if we take the path from the left or the one from the middle. The difference is even stronger when we compare the 4th and 5th states from the left.

Generative entrenchment plays a fundamental role here. Although the probabilities assigned to each branch are hypothetical, they nevertheless follow the general principle according to which elements with a higher level of GE have a greater chance of persisting. Integrating differential entrenchment this way, we obtain some kind of developmental constraint that shapes unevenly the probability of outcomes.

7.2.3 Generative Entrenchment as a Metaevolutionary Constraint

Although developmental processes constrained by GE can be path dependent, it does not immediately follow that a population of these developmental processes will also evolve in a path-dependent way. In fact, most developmental variations will never become important in framing future life forms.
Changes in developmental histories can be erased for different reasons. Not being inheritable is perhaps the most important of these reasons. But the capacity of being transmitted to future generations is not sufficient to mark history at the population level. Changes in developmental histories will not matter at the population level if the adaptive landscape has only one peak and organisms happen to reach that peak (as represented in Figure 2.1b). I have argued in Chapters 2 and 3 that this kind of situation results in an information-destroying process because changes in initial conditions or mutational history will not affect the evolutionary outcome. The same is true of variations in developmental histories. The fact that different organisms can develop into alternative forms during several generations will not matter once the population reaches a global equilibrium.

We saw however that history can matter when the adaptive landscape adopts a rugged topography (as in Figure 2.2b). A landscape is rugged when there exist multiple fitness peaks separated by valleys of low fitness values. As mentioned in Chapter 2, fitness valleys typically occur when the fitness contributions of loci are interdependent (a phenomenon called epistatic correlations). In a sense, epistatic correlations act as a metaconstraint on the genotype/phenotype set by creating multiple stable equilibria. Some forms will have a higher fitness than others. If a population reaches one of them, then it will tend to remain in that state because going away from it would require going through forms with significantly lower fitness. Natural selection will act against any such changes. Thus, epistatic correlations, by creating fitness valleys, also contribute to historicity at the population level because different initial conditions or mutational histories will lead to alternative peaks, to different evolutionarily stable strategies.
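The claim that a rugged landscape makes the outcome depend on the starting point can be illustrated with a toy two-locus example. This is a minimal sketch under stated assumptions: the fitness values are invented for illustration, and the greedy one-mutation walk in the hypothetical function `adaptive_walk` is a deliberately crude stand-in for selection, not a population-genetic model.

```python
def adaptive_walk(fitness, start):
    """Move to the fittest one-mutation neighbor until no neighbor is fitter."""
    current = start
    while True:
        neighbors = [
            current[:i] + (1 - current[i],) + current[i + 1:]
            for i in range(len(current))
        ]
        best = max(neighbors, key=lambda g: fitness[g])
        if fitness[best] <= fitness[current]:
            return current  # local peak reached
        current = best


# Assumed fitness values with epistasis: each mutation alone is harmful,
# but the double mutant is the fittest form, so two peaks are separated
# by a valley of lower fitness.
rugged = {(0, 0): 1.0, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 1.2}

print(adaptive_walk(rugged, (0, 0)))  # stays on the lower peak
print(adaptive_walk(rugged, (1, 1)))  # stays on the higher peak
```

A walk starting at (0, 0) never reaches the fitter form (1, 1), because every route passes through a genotype of lower fitness. The final state therefore carries information about where the walk began.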
I believe that GE can help us understand the existence of such fitness valleys. As mentioned in the previous section, changes in highly GE’d features most likely result in detrimental effects, that is, in developmental outcomes with extremely low fitness. So the accumulation and integration of different features into a complex developmental structure contributes to the ruggedness of the adaptive landscape; it creates regions of high and low fitness values.

Moreover, Wimsatt (2001) suggests that GE contributes to the evolutionary historicity of populations because it helps these alternative peaks to stand relatively still. Wimsatt acknowledges that the existence of multiple adaptive peaks is insufficient for history to matter. According to him, the optima must also stand still, or otherwise history will vanish. Unfortunately, Wimsatt does not provide a full explanation of how this can work.44 I believe that we can interpret this in the light of what has been said in Chapter 2 about stochastic processes that become extremely unstable. In the absence of fixed local equilibria, the system acquires a unique and stable distribution of outcomes in the long run. If the optima keep changing, then the entire system becomes unstable and the process will, under natural selection, track different regions of the phenotype space at different times. Thus, the average value of a randomly evolving feature will eventually converge towards a unique probability distribution of outcomes, and thus “forget” about its history. It will become impossible a fortiori to reconstruct the history of the process. We would know nothing about where it started from and what chance variations occurred along the way. If this is a correct interpretation45 of Wimsatt’s claim about equilibria that do not stand still, we see that he has in mind an extreme case of stochasticity. I have argued however that some lesser degree of instability does not necessarily erase history (see Chapter 2 on stochastic information-preserving processes). So one can imagine that the equilibria do not have to stand still absolutely.

Aside from this point, we can nevertheless appreciate that GE can install stability in the evolution of populations. In brief, GE prevents evolution from becoming information destroying by maintaining the ruggedness of the landscape. We can tie this back to the first section, in which I have suggested that different phylogenetic constraints lead to different adaptive syndromes and emergent ecological properties. One could explain the fact that GE is a source of phylogenetic constraints by its tendency to stabilize the topography of adaptive landscapes in the long run.

44 Wimsatt cites Lewontin’s (1966) paper in this section of the text, but the connection is difficult to establish. Lewontin’s model shows that history matters because the order of environmental events makes a difference to the pathway. Although not incompatible, this differs from Wimsatt’s view of historicity that focuses on the long-lasting effects of certain contingent events.

45 One may also read Wimsatt’s comment as making an epistemic claim. That is, in situations where the optima do not stand still it becomes almost impossible for us to know the trajectory of a species. But this does not mean that history does not make a difference to the outcome. As in Polya’s urn dynamics, the limit reached by the process is not static but evolves along the way (Cohen 1976; Page 2006). Each moment of the trajectory further defines the value towards which the distribution evolves. The equilibrium does not stand still and each replication of the experiment most likely produces a different outcome. Yet, the positive feedback mechanism implemented in the replacement procedure makes the process path dependent.
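The Polya urn dynamics invoked in footnote 45 can be simulated in a few lines. This is a minimal sketch, assuming the standard two-color urn that starts with one ball of each color and returns each drawn ball together with one extra ball of the same color; the function name `polya_urn`, the number of draws and the seeds are illustrative choices, not taken from Cohen (1976) or Page (2006).

```python
import random


def polya_urn(steps, seed):
    """Simulate a two-color Polya urn and return the final (red, blue) counts."""
    rng = random.Random(seed)
    red, blue = 1, 1  # assumed initial composition
    for _ in range(steps):
        # Draw a ball with probability proportional to the current composition,
        # then return it with one extra ball of the same color (positive feedback).
        if rng.random() < red / (red + blue):
            red += 1
        else:
            blue += 1
    return red, blue


for seed in (1, 2, 3):
    red, blue = polya_urn(10_000, seed)
    print(seed, round(red / (red + blue), 3))
```

Each run settles towards some proportion of red balls, but which proportion it approaches depends on the early draws. This is the sense in which the limit "evolves along the way" while the process remains path dependent.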
7.3 Generative Entrenchment and (Ir)reversibility

Although it has been addressed in Chapter 5, the topic of irreversibility is far from being exhausted and still deserves further attention. As Gould (1970) clearly shows in his discussion of Dollo’s law, hoping to find a clear and unified understanding of the notion of irreversibility throughout the biological literature would simply be naïve. Dollo himself offered three interpretations of his own formulation of the law of irreversibility, and the commentators of Dollo’s theory provided at least six additional ones (Gould 1970). I shall not review all versions of biological irreversibility and try to obtain a unified view. In these further reflections on historicity and irreversibility, I will suggest that GE can be a source of irreversibility and that it can help illuminate why certain types of (alleged) counter-examples to Dollo’s law may fall short as such. I will show that some ostensible instances of evolutionary reversibility, both in isolated organisms and in lineages, are not problematic for Dollo’s law to the extent that they are explained by the fact that organisms maintain the (dormant) information from the past. This is not a true challenge to Dollo’s law because only the morphological characters are reversed, but not the underlying developmental mechanisms. This response assumes a strong and more focused reading of irreversibility.

7.3.1 Generative Entrenchment and Irreversibility

Gould qualifies Dollo’s law as “a statement … of the nature of history; or put another way, it is an affirmation of the historical nature of evolutionary events” (Gould 1970, 208). I have defended a similar point of view in Chapters 4 and 5, where the notion of contingent irreversibility was developed in relation to path dependence.
In brief, contingent irreversibility means that the probability that an evolutionary process returns to a previously visited state is extremely low because of the tendency of biological systems to conserve traces from their evolutionary past. I have argued that this view of irreversibility can be associated with historicity as path dependence in a generic way. Reversibility requires that the exact same state can be reached from different starting points and a reversed order of events. This is opposed to path dependence, which entails that different (probabilities of) outcomes will result from different starting points and series of events. In this section, I will build upon this discussion and suggest (after Shank and Wimsatt 1988) that generative entrenchment (GE) can also shed some light on the possible association between irreversibility and historicity. In brief, GE adds to the explanation of why certain features can have long-lasting effects and as such prevents the system from going back to a previously visited evolutionary state.

It is common to explain the irreversibility of evolution by the fact that life forms are highly intricate and complex. This argument was central in Dollo’s account and it has been reasserted several times after him. When evolutionary biology became more genetically oriented, focusing on genes as units of inheritance, the discourse about reversibility also changed its focus from morphology to genes. Yet, complexity and the intricate relationship between parts remained a central cause of irreversibility. Simpson, in The Major Features of Evolution, provides a standard example of this:

Functional, adapted organisms noticeably different in structure have different genetic systems. They differ in tens, hundreds, and thousands of genes, and such genes as are the same have different modifiers and are fitted into differently integrated genetic backgrounds.
A single structure is likely to be affected by many different genetic factors, practically certain to be if its development is at all complex. The chances that the whole system will revert to that of a distinctly different ancestor, or even that this will happen for any one structure of moderate complexity, are infinitesimally small (Muller, 1939). The genetical factors are so complex and so constantly changing that the extensive reversion is almost impossible. (Simpson 1953, 311)

Dollo did not frame the explanation of irreversibility in terms of genetic complexity, but the logic of the arguments is very similar. Dollo and Simpson both refer to the very low probability of reversing the changes accumulated throughout evolution because of the complexity of biological systems.

To be sure, not all explanations of irreversibility appeal to complexity and interdependency of various parts. With the molecular revolution in biology, the argument for irreversibility became even more focused on the state of the genome, leaving aside the fate of morphologies. Even if biologists generally admitted the logical possibility of the evolutionary reversal of single gene changes, they nevertheless argued that the reversibility of genome evolution would be a very unlikely course of events. Jacques Monod for instance, in Chance and Necessity, agrees that point mutations are theoretically reversible, but he emphasizes that the accumulation of successive independent mutations makes the process of evolution irreversible. “Because of the number of independent events that produce [evolution], such phenomenon is for statistical reasons irreversible” (Monod 1971). A very similar line of argument is put forward by Szathmáry (2006).
It is not the case that reversal would be logically impossible, it is just too demanding on the side of the requisite heritable variation: the number of simultaneous, chance genetic changes enabling the reversal is so large (their joint probability is so small) that for all practical purposes we can assume that they will not happen. (Szathmáry 2006, 151)

These explanations are essentially probabilistic and refer to the extremely low probability that a large number of independent and random changes in the genetic material can co-occur. Genomes accumulate changes during their evolutionary history, and the complete reversal of those past stochastic modifications becomes very improbable when their number gets high.

Recall however that complexity and chance are but one part of Dollo’s explanation. As argued in Chapter 5, and emphasized by Gould (1970), Dollo’s law is an affirmation of the historical nature of evolution. The chance of going back to previously visited states is so low in virtue of the tendency of organisms to retain past acquired modifications. This form of explanation still persists in recent accounts of irreversibility. Brown for instance says: “because the direction and magnitude of any change is affected by preexisting conditions, the structure and dynamics of [complex adaptive] systems are effectively irreversible, and there is always a legacy of history” (Brown 1995, p. 14). Brown thus suggests, as did Dollo, that irreversibility in complex adaptive systems follows from the long-lasting effects of past conditions. One source of this legacy of history comes from the fact that large-scale events of speciation, colonization and extinction leave persistent marks on systems; they affect the structure and dynamics of ecological and evolutionary processes, making them “effectively irreversible.”

Shank and Wimsatt (1988) offer a similar line of argument by considering the role of generative entrenchment.
Among the numerous implications that GE has, Shank and Wimsatt mention that it can explain Dollo’s law:

Evolution is irreversible. … When features first occur and are incorporated thru selection, they presumably … have little if any generative entrenchment. The longer they persist, the greater the chance that they will become increasingly generatively entrenched … . If the feature then becomes selectively disadvantageous, it becomes increasingly likely that its presence will be in some way modified or compensated for, since its straightforward elimination would now be more costly, due to its greater generative entrenchment, than its modification. The resultant change is thus not a reversion to the original state. (Shank and Wimsatt 1988, 40-41)

We saw that GE can act as a source of path dependence at several levels, and we see here that it adds more substance to the idea that Dollo’s law expresses the historical nature of macroevolutionary processes. By locking in certain features, GE contributes to the path dependence of evolutionary processes, and to their irreversibility.

A similar association is put forward by Maynard Smith & Szathmáry (1995), who suggested that the major transitions that occurred throughout the evolution of life (i.e. the origin of chromosomes, the origin of eukaryotes, the origin of sex, the origin of multicellular organisms and the origin of social groups) were in part caused by cumulative adaptations and contingent irreversibility.46 They say that the latter becomes relevant in maintaining higher-level entities once they have arisen. This suggests that contingent irreversibility is responsible for the fact that higher-level entities are maintained. I think that this makes sense if we build GE into contingent irreversibility.47 Recall the explanation provided for the irreversible evolution of eukaryotic cells.
From a prokaryotic state, some lineages went through several evolutionary steps including the loss of their outer cell wall, the reorganization of the genetic material and the symbiotic origin of organelles like mitochondria. These evolutionary changes often presupposed or depended on one another, and the eukaryotic cell that resulted from this extended evolutionary history has acquired a high degree of GE. According to the narrative provided by Shank and Wimsatt, features become increasingly GE’d (and hard to modify) with time. The longer a feature sticks around, the more integrated it becomes and the less likely it is to change, because changing it would significantly reduce fitness. The eukaryotic cell has been around for a very long time and it has become increasingly compartmented, with different components performing essential, specialized functions. It would be very difficult for it to function well without all the specialized organelles and an isolated nucleus. The general structure of eukaryotic cells could perhaps have been different, but these features will most likely be maintained now that they have evolved because losing them will almost certainly result in detrimental effects. This also means that returning to a prokaryotic state would be extremely difficult. When complex and convoluted structures have accumulated and become locked in, the re-evolution of some remote and simpler historical states becomes very unlikely. So my interpretation of Maynard Smith’s and Szathmáry’s claim is: the maintenance of higher-level entities follows from GE, the progressive accumulation and integration of specialized features with time, which entails irreversibility.

46 In fact, they add a third mechanism: central control.
47 Note that contingent irreversibility does not require the conservation of certain forms. It only requires that returning to previously visited states is extremely improbable.
7.3.2 When Reversibility Relies on GE and Historicity

We just saw that GE and the lock-in phenomenon that results from it can be at the source of irreversibility. Interestingly though, it can also play a central role in some alleged instances of reversibility. In an insightful reflection on the usage of “irreversibility” in biology, Norman Macbeth (1980) identifies two claims on which biologists seem to agree: 1) evolution never goes backward in a big way (i.e. completely and on a large scale), and 2) evolution often goes backward in a small way (i.e., in isolated and limited cases). The different views of contingent irreversibility presented in the previous section deal with explanations of the first degree of irreversibility. In this section, I will consider a counter-example to (1) and an example of (2). We will see that in both cases, the reversibility of morphological features relies on the GE of some underlying mechanisms. I will use this to argue that neither case is problematic for a strong reading of the principle of irreversibility and that both are still compatible with path dependence.

The re-appearance of lost ancestral characters in single organisms, a phenomenon commonly called “throwback” or “atavism,” is perhaps the most common instance of small-scale reversibility. One of the long-noticed cases of throwbacks is the development of lateral digits in individual horses, a trait that existed during the Lower Pliocene about four million years ago, but disappeared during evolution (Macbeth 1980). Different instances of throwback may need different kinds of explanation. Nonetheless, the idea that some mechanisms can become suppressed and then redeployed seems a typical explanation.

[I]n mammals the capacity to develop atavistic structures of the limbs may be retained for 10^6 to 10^7 generations. The original limb structures are originally lost by accumulation through natural selection of modifiers which suppress their development.
The effect of the suppressor genes can be reversed by polygenic modifications or by a single gene which causes widespread disturbance of development, leading to various degrees of expression of the lost structures. (Lande 1978)

If throwbacks can be simply explained by the re-expression of suppressed genetic information, then they are not cases of evolutionary reversal. In fact, throwbacks are the very expression of the historical nature of biological systems. Ancestral characters reappear in some singular individuals because organisms have retained information from the past; they are the re-expression of dormant capacities. So the principle of historicity is the basis of why we observe instances of singular and small-scale reversibility.

Although not explicitly stated in Lande’s explanation, one can see a role for generative entrenchment here. The reason why these suppressed mechanisms are preserved may be that they are deeply entrenched. These mechanisms could have been suppressed but not erased because erasing them may have resulted in serious malfunctions. Hence, generative entrenchment becomes the reason why small and isolated cases of reversal are possible.

One could claim that throwbacks, if they count as evolutionary reversal at all, would not be very interesting because they occur only in a few organisms rather than in most or all the members of a lineage. Perhaps a more serious case of reversal involves the re-appearance of a lost character in a whole lineage. Such an example was reported recently in a letter to Nature. Whiting and his collaborators (2003) have investigated the evolution of wings in stick insects (phasmids) and discovered that their ancestral condition is wingless (after a wing loss) but that wings have reappeared on at least four occasions.
In other words, according to the most parsimonious phylogenetic reconstruction, phasmids evolved from an ancestor with wings, subsequently lost that character (very early in their history), and some lineages later re-evolved partially or fully winged morphologies. So the presence of wings seems to be a very plastic feature in phasmids, and because the reversal occurs in full lineages rather than in isolated individuals, it appears to pose a more serious challenge to Dollo’s law than throwbacks do. Throwbacks were in fact not a problem for Dollo, who was mostly interested in the complete reversal of forms in a lineage after a relatively long period of adaptation to different conditions (see Dollo 1922 on this). There might be a reoccurring resemblance in isolated individuals, but not a complete return.

Unfortunately, Whiting et al. (2003) do not compare the complete morphology of the winged ancestors with the one that re-evolved, so we cannot tell how similar these wings are. Recall also that Dollo’s law is not a universal claim, but rather a statistical regularity that one should observe across evolution. Thus, even if there were a few instances of complete reversal affecting a whole lineage, this would not undermine the claim that macroevolution is most likely irreversible on the large scale.

But I think that the return of winged phasmids is not a problem for Dollo for a more interesting reason. That is, the explanation for this example of large-scale reversibility has the same structure as the one provided for reversion in single individuals (throwbacks). The authors note that wing development depends on multiple genetic factors and developmental systems and that “mutations in any one of these factors may lead to winglessness” (Ibid, 266).
They observe that non-flying phasmids have “retained the neural structures and basic circuitry required for flight,” a result suggesting that “the developmental pathway for wing formation evolved only once in insect diversification, but that wings evolved many times by silencing and re-expressing this pathway in different lineages during insect evolution” (Ibid, 267). They come to this conclusion by looking at the homology in wing features shared by phasmids and other insects. They argue that wings did not “re-evolve de novo in phasmids, but are rather a re-expression of the basic insect wing which was lost in ancestral stick insects” (Ibid, 266). Thus, it is in virtue of the fact that a capacity from the past was retained that phasmids display such plasticity in their wing features.

Again, GE seems to be a very good candidate for explaining why we may encounter such reversibility. As Whiting et al. (2003) mention, “it is not surprising that the basic genetic instructions for wing formation are conserved in wingless insects, because similar instructions are required to form legs, and probably other critical structures” (p. 266). Such fundamental structures must have a very high degree of generative entrenchment, hence their highly conserved status throughout evolution.

This kind of explanation raises a serious problem for the idea that the examples discussed above are counter-examples to Dollo’s law. Although Dollo essentially talked about the reversal of morphological characters, it seems reasonable to raise the bar and demand that reversibility apply not only to morphological characters, but also to the mechanisms responsible for their expression. This would be in line with the idea that Dollo was mostly interested in the complete (ir)reversibility of species.
Moreover, if we are to include in the explanation the underlying developmental causes of morphologies, then perhaps we should also consider the extent to which these have re-evolved subsequently to their modification. The explanation put forward for small and large-scale cases of reversibility suggests that the mechanisms responsible for the expression of the (reversed) morphological features never changed or disappeared. The causes for the (re)expression of traits were simply “silenced,” but they never had to re-evolve from a different starting point. We may have reversibility of morphologies both in individuals and lineages, but we still don’t have reversibility of their deeper causes. If this reading of irreversibility is correct, then it turns out that throwbacks and plasticity in phasmids’ wings do not constitute counter-examples to Dollo’s law.

7.4 Concluding Remarks

Phylogenetic constraints are commonly (but not exclusively) used in explanations of suboptimal adaptations. This chapter has shown that one can extend their explanatory role to historicity. Biologists have long recognized the existence of phylogenetic constraints, which I suggest we conceive as entrenched past vagaries with long-lasting evolutionary effects. Moreover, we saw that there is a growing literature in ecology showing that these past vagaries, when they become entrenched and act as phylogenetic constraints, are more pervasive than one might have expected, for they also contribute to shifting the weights on some ecological alternatives as a function of a species’ evolutionary past. The effects are not only long-lasting but also deep-reaching in a wide array of phenomena, and I have shown that the notion of path dependence can also apply to this important episode of the historical turn in biology.
Finally, GE allowed us to pursue further the close connection between the properties of irreversibility and path dependence: at least in the domain of evolution, path dependence and irreversibility can share a common basis, GE. And to give more scope to this association between irreversibility, path dependence and entrenchment, let me mention that a similar claim was made in influential works in the social sciences (Levi 1997; Pierson 2004).

Path dependence has to mean, if it is to mean anything, that once a country or a region has started down a track, the costs of reversal are very high. There will be other choice points, but the entrenchments of certain institutional arrangements obstruct an easy reversal of the initial choice. Perhaps the better metaphor is a tree, rather than a path. From the same trunk, there are many different branches and smaller branches. Although it is possible to turn around or to clamber from one to the other – and essential if the chosen branch dies – the branch on which a climber begins is the one she tends to follow. (Levi 1997, 28, italics are mine)

This metaphor has important limitations in the realm of biology. For instance, evolutionary lineages are not climbing on branches but are themselves the different branches. Still, the point about entrenchment as a source of irreversibility and path dependence remains, in my view, valid.

Chapter 8: Conclusion

This dissertation has been a quest for a better understanding and definition of the notion of historicity, and more specifically of the extent to which history matters in biological processes. Many biologists, often citing Gould, have turned to the notion of historical contingency in order to articulate the sense in which history matters in the phenomena they study. We have seen however that there are different usages of this notion and that there is no general framework from which to interpret these claims.
This dissertation is a step forward in the direction of a more general understanding of historicity. Briefly, I have argued that historicity can be understood as a property of information-preserving processes, whereas ahistoricity is a property of information-destroying processes. Building from Sober’s (1988) account, I argued that a process is information preserving if (part of) the past can be inferred from the present state of a system. This happens if the present state of a system can be attributed to a distinct past, that is, if changing the past can affect the present and yield different outcomes (or the same outcomes with different probabilities). On the other hand, a process is information destroying and ahistorical if the present cannot be associated with a distinct past, that is, if changing the past has no effect on the present state of a system.

This view of historicity relies essentially on the existence of multiple outcomes and some form of extended causal dependence between past and present. This means then that it is not sufficient to say “things could have been otherwise” in order to prove that history matters. I have shown in Chapter 2 that a process admitting of multiple outcomes, but in which we introduce a lot of noise, can qualify as information destroying if the probability distribution of outcomes becomes flat, i.e., if any end state can be equally associated with any past/initial state. This is an important conclusion because it undermines the too common identification of historicity with contingency as unpredictability. A process can be unpredictable because of chance, and yet fail to retain information from the past. As the framework of Travisano et al. (1995; see Chapter 3) illustrates very well, sheer chance tends to erase history because it blurs the differences in structure that history has created in the initial set of populations. Chance does not conserve information from the past; it makes indistinguishable states that were initially different.
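The way noise flattens the distribution of outcomes can be made concrete with a toy model. The sketch below is only an illustration under stated assumptions, not a model taken from Chapter 2 or from Travisano et al. (1995): a system has two possible states, each step flips the state with probability `noise`, and the two initial states are assumed equally likely. All function names are hypothetical. The information that the end state retains about the initial state is measured as mutual information, in bits.

```python
import math


def binary_entropy(p):
    """Shannon entropy (in bits) of a two-outcome distribution (p, 1 - p)."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def prob_same_state(noise, steps):
    """Probability that the system is found in its initial state after
    `steps` steps, when each step flips the state with probability `noise`
    (assumed toy dynamics)."""
    return 0.5 + 0.5 * (1 - 2 * noise) ** steps


def retained_information(noise, steps):
    """Mutual information (bits) between the initial and the final state,
    assuming the two initial states are equally likely."""
    return 1.0 - binary_entropy(prob_same_state(noise, steps))


if __name__ == "__main__":
    for noise in (0.0, 0.01, 0.1, 0.3, 0.5):
        bits = retained_information(noise, steps=10)
        print(f"noise={noise:<4}  information retained after 10 steps: {bits:.4f} bits")
```

In this sketch, two outcomes remain possible at every noise level, so the process always "could have been otherwise." With `noise = 0` the end state identifies the initial state exactly, whereas with `noise = 0.5` every end state is equally associated with every initial state and nothing about the past can be inferred.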
This does not mean that chance should never appear as an important factor in processes displaying historicity. On the contrary, it can be the seed of divergence towards alternative outcomes. But it cannot be the whole story. Historicity entails that differences in initial conditions or modifications in the path taken by a process will affect the (probability of) outcomes at a later instant in a distinctive way.

Defining historicity as a property of information-preserving processes has the further advantage of being specifiable in deterministic and stochastic contexts. This flexibility reflects well the modeling and theorizing practices in biology, which encompass both deterministic and stochastic models. I have argued that historicity boils down to “dependence on initial conditions” in deterministic systems, whereas the notion of “path dependence” is more suitable for stochastic settings.

Path dependence was initially developed and applied in the social sciences, and despite alleged connections to the biological sciences, the expression “path dependence” has seldom appeared in the biological and philosophy of biology literature. In fact, biologists seem to have adopted the expression “historical contingency” instead. Why then use a different expression in a philosophical analysis of historicity in biology? My inclination to use and develop the notion of path dependence, rather than historical contingency, for explaining the nature of stochastic information-preserving processes is justified by the specificity and conceptual clarity it allows. I believe that the notion of historical contingency does not serve us as well for the distinction between historicity in deterministic and stochastic processes. The expression “path dependence” more explicitly suggests the idea that the outcome of a process depends on the entire path or trajectory of that process. The definition provided in Chapter 6 made this form of dependence more concrete.
I argued there that path dependence encompasses two essential conditions: (i) the existence of alternative outcomes at a given instant and (ii) a change in the probability of these outcomes as we switch from one path to another. This second condition is important because it precludes stochastic information-destroying processes from qualifying as path dependent. By definition, any stochastic process branches into alternative outcomes at some instant. However, as mentioned above, it is possible that every path results in the same probability of outcomes. History does not matter in such cases, and condition (ii) precludes them from qualifying as path dependent.

The various examples discussed throughout this dissertation clearly indicate that history matters a great deal in evolutionary and ecological processes. To be sure, we lack a twin Earth or a replay of the tape with which we could compare the roads taken by biological systems. Nevertheless, we saw that ingenious experimental setups and mathematical simulations clearly suggest that history matters in many circumstances. Moreover, the extent to which a process is influenced by its past depends on several conditions specific to the system. We have seen for instance that the size of populations and the mutation rates are fundamental determinants of historicity in population genetics, or that the degree to which colonies are connected or how large the regional pool is can also greatly affect the tendency of local ecological communities to display convergent or divergent outcomes. Furthermore, the conclusion we reach may as well depend on which part of a system we decide to investigate. One may find evidence of sustained differences resulting from alternative histories for some traits but not for others in the same species. Or a set of communities issued from the same regional pool and living in similar environments may differ in terms of species composition but not in terms of richness.
This suggests that general and non-specific claims about the relevance or irrelevance of history are to a large extent vacuous.

Another important and related point is that history can matter to various degrees. Processes can be more or less information preserving. This idea of degree of historicity is largely missing in some of the most influential writings on history and contingency in biology. I have presented arguments from Gould’s Wonderful Life and Conway Morris’ Life’s Solution as an example of how the debate can sometimes freeze into extreme positions, leaving us with the impression of a dichotomy. But there exists a whole range of states between the fully historical and the absolutely ahistorical poles. My analysis of path dependence provided two interesting and complementary senses of degree of historicity, each related to one of the conditions for path dependence.

The first sense of degree, related to condition (i), has to do with the extent to which processes are divergent or convergent at a given instant. Extreme divergence corresponds to the situation where each path yields a different outcome, partial convergence corresponds to the situation where the same outcome can occur more than once at a given instant, but with different probabilities, and finally complete convergence (or path independence) corresponds to the situation where there exists but one outcome at a given instant.

The second sense of degree, related to condition (ii), has to do with the extent to which the probabilities of outcomes are different when we switch from one path to another. Alternative processes can result in different outcomes, and thus meet condition (i), but the degree of path dependence will also depend on how big a difference taking alternative paths makes to the probability of occurrence of each outcome. If taking different paths does not affect the probability of outcomes at a given instant, then the processes will be path independent.
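The two conditions and the two senses of degree can be sketched numerically. The code below is a minimal illustration, not the formal definition given in Chapter 6: each path is represented, by assumption, as a table giving the probability of each outcome at a fixed instant, and the second sense of degree is measured by the total variation distance between such tables. That choice of measure, like the function names and the example numbers, is an assumption made for illustration only.

```python
from itertools import combinations


def total_variation(p, q):
    """Total variation distance between two outcome distributions,
    each given as a dict mapping outcome -> probability."""
    outcomes = set(p) | set(q)
    return 0.5 * sum(abs(p.get(o, 0.0) - q.get(o, 0.0)) for o in outcomes)


def alternative_outcomes(paths):
    """Condition (i): the set of outcomes with non-zero probability
    on at least one path."""
    return {o for dist in paths.values() for o, pr in dist.items() if pr > 0}


def degree_of_path_dependence(paths):
    """Condition (ii), as a matter of degree: the largest change in outcome
    probabilities obtained by switching between two paths (0 = none, 1 = maximal)."""
    dists = list(paths.values())
    return max((total_variation(p, q) for p, q in combinations(dists, 2)), default=0.0)


def is_path_dependent(paths):
    """Both conditions together: several outcomes, and probabilities that
    change with the path taken."""
    return len(alternative_outcomes(paths)) > 1 and degree_of_path_dependence(paths) > 0


# Hypothetical examples of the situations distinguished in the text.
extreme_divergence = {"path_1": {"A": 1.0}, "path_2": {"B": 1.0}}
partial_convergence = {"path_1": {"A": 0.8, "B": 0.2}, "path_2": {"A": 0.3, "B": 0.7}}
pure_noise = {"path_1": {"A": 0.5, "B": 0.5}, "path_2": {"A": 0.5, "B": 0.5}}
complete_convergence = {"path_1": {"A": 1.0}, "path_2": {"A": 1.0}}

if __name__ == "__main__":
    for name, paths in [("extreme divergence", extreme_divergence),
                        ("partial convergence", partial_convergence),
                        ("pure noise", pure_noise),
                        ("complete convergence", complete_convergence)]:
        print(name, is_path_dependent(paths), round(degree_of_path_dependence(paths), 3))
```

On this sketch, the `pure_noise` case meets condition (i) but not condition (ii): two outcomes are possible, yet switching paths changes nothing, so the measured degree is zero and the process counts as path independent.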
Conversely, if the probability of outcomes given different paths changes significantly, then the processes will be strongly path dependent. So, having condition (ii) in the definition of path dependence not only makes divergence a necessary but insufficient condition; it also provides more depth to the idea of degrees of historicity.

I have paid much less attention to deterministic historicity. The examples discussed and the conceptual reflections put forward in this dissertation focused primarily on the notion of path dependence in stochastic settings. This discrepancy does not mean that path dependence is more important than deterministic historicity. I devoted more energy to developing an account of path dependence because it is a relatively new notion compared to its deterministic sibling, which we saw boils down to dependence on initial conditions. Still, being able to relate “path dependence” and “dependence on initial conditions” to the idea of “information-preserving processes” contributes a cohesion and generality that were lacking up to now in the literature. Note however that theories of historicity remain at a very early stage and we may eventually find a more coherent way of presenting the conceptual network surrounding historicity.

I would like to close this dissertation by looking towards the future, at where we can go from here. First, I believe that we could elaborate further the reflections on irreversibility and historicity. Irreversible processes are allegedly historical because of the time arrow they entail. But not all arguments about irreversibility have the same structure and the same connection to historicity. I have argued in Chapters 4 and 6 that contingent irreversibility is encompassed by path dependence. Contingent irreversibility means that the probability of a process returning to a previously visited state is extremely low.
This very general principle also entails a time arrow, a difference between past and future states. However, we also saw that, at least in the context of macroevolution, one of the causes of contingent irreversibility is historicity: a tendency of biological systems to conserve traces from their evolutionary past. Moreover, we have seen in Chapter 6 (after Shank and Wimsatt 1988) that generative entrenchment (GE) can also shed some light on this type of association between irreversibility and historicity. In brief, GE adds to the explanation of why certain features can have long-lasting effects and as such prevents the system from returning to a previously visited evolutionary state. Thus, according to this analysis, contingent irreversibility displays two fundamental properties: a time arrow and divergence toward the future.

This view of irreversibility contrasts with the one elaborated in population genetics through the principle of natural selection developed by Fisher (1930), which does not entail divergence toward the future. By analogy to the second law of thermodynamics, according to which entropy should always tend to increase with time in specified conditions (Barrett and Sober 1995; Prigogine 1986), the average fitness of populations should also tend to increase with time, given that certain conditions are met. Both principles entail a similar built-in temporal asymmetry because they predict that some measurable quantity presents directional change. However, the analogy has important limitations (Barrett and Sober 1995) and the relationship with history does not entail historicity. In fact, entropic and average fitness irreversibility typically occurs when a system converges toward a unique and global equilibrium. In this context, irreversibility essentially follows from the existence of a time arrow, but it is not generated by path dependence.
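The contrast between a time arrow and path dependence can be illustrated with a deliberately simple selection model. The sketch below is an assumption-laden toy, not Fisher's own formulation: a haploid population with two types, constant fitnesses `w_a` and `w_b`, no mutation and no drift. The function names and parameter values are hypothetical. Mean fitness never decreases along a trajectory, yet populations that start from very different frequencies end up in the same state.

```python
def mean_fitness(p, w_a, w_b):
    """Average fitness of a population in which type A has frequency p."""
    return p * w_a + (1 - p) * w_b


def next_frequency(p, w_a, w_b):
    """Frequency of type A after one generation of selection
    (haploid model, constant fitnesses, no mutation, no drift)."""
    return p * w_a / mean_fitness(p, w_a, w_b)


def trajectory(p0, w_a=1.1, w_b=1.0, generations=300):
    """Frequencies of type A over time, starting from frequency p0."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], w_a, w_b))
    return freqs


if __name__ == "__main__":
    for p0 in (0.01, 0.5, 0.9):
        freqs = trajectory(p0)
        fitness = [mean_fitness(p, 1.1, 1.0) for p in freqs]
        print(f"start={p0:<4}  final frequency={freqs[-1]:.6f}  "
              f"mean fitness {fitness[0]:.4f} -> {fitness[-1]:.4f}")
```

In this toy model the directional change in mean fitness supplies a time arrow, but the final state carries no trace of the starting frequency: the process is irreversible in the entropic sense and convergent, hence not path dependent.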
Another project that I believe would be interesting to pursue in relation to historicity in biology has to do with the form of explanation provided when history matters, as compared to explanations in contexts in which it does not. Claiming that history matters suggests that a good explanation of historical processes requires the inclusion of historical factors. This raises two important related questions. First, what kind of explanation do we obtain when we invoke historicity? Second, do these explanations qualify as good scientific explanations?

The first question could be answered by comparing the inferences produced by biologists with different models of explanation found in the literature, especially in philosophy of science and philosophy of history. There was a time when philosophers thought that all legitimate explanations should comply with Hempel’s deductive-nomological (DN) model, according to which a particular event is explained when derived from a law and a set of specified initial conditions. For example, the fact that a particular raven is black can be explained by the general law “all ravens are black” and the fact “this is a raven.” Historical narratives don’t qualify as good DN explanations because they don’t make reference to laws. In fact, Hempel qualified them as explanatory “sketches.” Particular events in historiographic inferences are explained by reference to another particular (causal) event. Claims such as “the extinction of dinosaurs happened because a meteorite collided with the earth and caused a cascade of reactions that made the environment very unsuitable for dinosaurs to survive” are thus at the opposite pole from the DN model. They are neither nomological nor derivative.

At the time of the historic turn in biology, these were the main models of explanation, and it has been suggested that explanations in historical biology correspond more to historical narratives (Nitecki and Nitecki, 1992).
Reconciling biological inferences with historical narratives was a problem for some biologists because of the unscientific status attached to the narrative form (MacArthur 1972). One can see some physics envy underlying this position. Biologists want to acquire the same lettres de noblesse as physics. This could be achieved by the discovery of repeatable patterns in nature and the formulation of general principles or laws. Attending to the historical details in theories and explanations would take them away from that ideal of science. A true scientific explanation should not be historical, but should rather follow a DN model.

However, the DN model is no longer taken as the only form of scientific explanation. Some of the leading alternatives are the causal-mechanical model (Salmon 1989, 1997), the unification model (Kitcher 1989) and the counterfactual model (Woodward 2003). These were developed in close connection with different areas of science and they are less extreme than the other two models. The requirement for universal laws has been relaxed (for example, we no longer require laws, but only qualified invariant causal relationships) and some of these models are compatible with the narrative model. Moreover, historiographic inferences are not always considered to be historical narratives, but are sometimes treated rather as inferences to the best explanation.

In the face of all these changes in the literature on scientific explanation, it is time to reassess the claim that historical biology involves mainly historical narratives and is therefore less scientific. This work needs to be done and I cannot offer a definitive answer here. I can at least point in the direction of two possible scenarios. A very liberal hypothesis would say that the kinds of explanation produced in historical biology are compatible with all the models mentioned above, and that none of them seems to capture better than the others what is going on.
A less liberal and more interesting hypothesis would say that explanations in historical biology are theoretically compatible with all these models, but that some models are favored under different conditions. For example, cases where path dependence occurs would be more adequately accounted for by Woodward’s counterfactual model, because it pictures what would have happened (or would have been likely to happen) if another path had been taken. The unification model, on the other hand, seems to capture well the type of inference found in some of the work on phylogenetic constraints, where general and inclusive patterns of dependence between ancestral characters, adaptive responses and emergent ecological properties are at play. These are some of the issues that I believe are worth pursuing in the future.

Bibliography

Arthur, W. Brian (1994), Increasing returns and path dependence in the economy, Economics, cognition, and society. Ann Arbor: University of Michigan Press.
Barrett, Martin, and Elliott Sober (1995), "When and Why Does Entropy Increase?", in Steven F. Savitt (ed.), Time's Arrow Today: Recent Physical and Philosophical Work on the Direction of Time, Cambridge: Cambridge University Press, 230-255.
Bassanini, A., and G. Dosi (1999), "When and How Chance and Human Will Can Twist the Arms of Clio", LEM Working Paper Series 05. Pisa.
Beatty, John (1995), "The Evolutionary Contingency Thesis", in G. Wolters, J.G. Lennox and P. McLaughlin (eds.), Concepts, Theories, and Rationality in the Biological Sciences, Pittsburgh: University of Pittsburgh Press, 45-81.
——— (2006), "Replaying Life's Tape", The Journal of Philosophy 103 (7):336-362.
Begon, Michael, John L. Harper, and Colin R. Townsend (1990), Ecology: Individuals, Populations and Communities. Second ed. Cambridge: Blackwell Scientific Publications.
Belnap, Nuel D., Michael Perloff, and Ming Xu (2001), Facing the future: agents and choices in our indeterminist world. Oxford [England]; New York: Oxford University Press.
Belyea, Lisa R., and Jill Lancaster (1999), "Assembly Rules within a Contingent Ecology", Oikos 86 (3):402-416.
Blondel, Jacques, and Jean-Denis Vigne (1993), "Space, Time, and Man as Determinants of Diversity of Birds and Mammals in the Mediterranean Region", in Robert E. Ricklefs and Dolph Schluter (eds.), Species Diversity in Ecological Communities: Historical and Geographical Perspectives, Chicago: The University of Chicago Press, 135-146.
Blount, Zachary D., Christina Borland, and R. E. Lenski (2008), "Historical contingency and the evolution of a key innovation in an experimental population of Escherichia coli", PNAS 105:7899-7906.
Blum, Harold F. (1955), Time's arrow and evolution. 2nd ed. Princeton: Princeton University Press.
Boer, P. J. den, and Joannes Reddingius (1996), Regulation and stabilization paradigms in population ecology. 1st ed, Population and community biology series; 16. London; New York: Chapman & Hall.
Booth, Barbara D., and Douglas W. Larson (1999), "Impact of Language, History, and Choice of System on the Study of Assembly Rules", in Evan Weiher and Paul A. Keddy (eds.), Ecological Assembly Rules: Perspectives, Advances, Retreats, New York: Cambridge University Press, 206-229.
Breck, Allen D., and Wolfgang Yourgrau (eds.) (1972), Biology, History, and Natural Philosophy. New York: Plenum Press.
Brooks, D. R., and Deborah A. McLennan (1991), Phylogeny, ecology, and behavior: a research program in comparative biology. Chicago: University of Chicago Press.
——— (1993), "Historical Ecology: Examining Phylogenetic Components of Community Evolution", in Robert E. Ricklefs and Dolph Schluter (eds.), Species diversity in ecological communities: historical and geographical perspectives. Chicago: University of Chicago Press.
Brooks, Daniel R., and Deborah A. McLennan (1994), "Historical Ecology as a Research Programme: Scope, Limitations and the Future", in Paul Eggleton and Richard I. Vane-Wright (eds.), Phylogenetics and Ecology, London: Academic Press Limited, 1-27.
Brown, James H. (1995), Macroecology. Chicago: University of Chicago Press.
Bull, J.J., and E.L. Charnov (1985), "On Irreversible Evolution", Evolution 39 (5):1149-1155.
Cappuccino, Naomi, and Peter W. Price (1995), Population dynamics: new approaches and synthesis. San Diego: Academic Press.
Cartwright, Nancy (1983), How The Laws of Physics Lie. New York: Oxford University Press.
——— (1989), Nature's Capacities and their Measurement. Oxford: Clarendon Press.
Castaldi, Carolina, and Giovanni Dosi (2006), "The Grip of History and the Scope for Novelty: Some Results and Open Questions on Path Dependence in Economic Processes", in Andreas Wimmer and Reinhart Kössler (eds.), Understanding change: Models, Methodologies, and Metaphors, New York: Palgrave Macmillan, 99-128.
Cattin, Marie-France, Louis-Félix Bersier, Carolin Banasek-Richter, Richard Baltensperger, and Jean-Pierre Gabriel (2004), "Phylogenetic constraints and adaptation explain food-web structure", Nature 427 (6977):835.
Chase, Jonathan M. (2003), "Community Assembly: When Should History Matter?", Oecologia 136:489-498.
Churchill, Gary A. (2000), "Inferring ancestral character states", in Michael T. Clegg, Max K. Hecht and Ross J. MacIntyre (eds.), Limits to knowledge in evolutionary genetics, 117-134.
Coddington, Jonathan A. (1988), "Cladistic tests of adaptational hypotheses", Cladistics 4:3-22.
Cody, Martin L., and Jared M. Diamond (1975), Ecology and evolution of communities. Cambridge, Mass.: Belknap Press of Harvard University Press.
Cohen, Joel E. (1976), "Irreproducible Results and the Breeding of Pigs (Or Nondegenerate Limit Random Variables in Biology)", Bioscience 26 (6):391-394.
——— (1979), "Ergodic Theorems in Demography", Bulletin of the American Mathematical Society 1 (2):275-295.
Conway Morris, Simon (2003), Life's Solution: Inevitable Humans in a Lonely Universe. Cambridge, U.K.: Cambridge University Press.
Cooper, Gregory John (2003), The science of the struggle for existence: on the foundations of ecology, Cambridge studies in philosophy and biology. Cambridge, U.K.; New York: Cambridge University Press.
Crawley, Michael J. (1997), "Plant-Herbivore Dynamics", in Michael J. Crawley (ed.), Plant Ecology, Oxford: Blackwell Science.
Danto, Arthur C. (1965), Analytical Philosophy of History. London: Cambridge University Press.
Darwin, Charles (1859), The origin of species by means of natural selection, or The Preservation of favored races in the struggle for life. London: J. Murray.
——— (1862), On the various contrivances by which British and foreign orchids are fertilised by insects, and on the good effects of intercrossing. London: J. Murray.
David, Paul A. (1985), "Clio and the Economics of QWERTY", The American Economic Review 75 (2):332-337.
——— (2001), "Path Dependence, its Critics, and the Quest for 'Historical Economics'", in Pierre Garrouste and Stavros Ioannides (eds.), Evolution and Path Dependence in Economic Ideas, Northampton: Edward Elgar, 15-40.
——— (2005), "Path Dependence in Economic Processes: Implications for Policy Analysis in Dynamical System Context", in Kurt Dopfer (ed.), The Evolutionary Foundations of Economics, Cambridge: Cambridge University Press, 151-194.
de Duve, Christian (1995), Vital Dust: The Origin and Evolution of Life on Earth. New York: Basic Books.
Denbigh, K. G. (1989), "The Many Faces of Irreversibility", The British Journal for the Philosophy of Science 40 (4):501-518.
Diamond, Jared M. (1975), "Assembly of Species Communities", in Martin L. Cody and Jared M. Diamond (eds.), Ecology and Evolution of Communities, Cambridge: The Belknap Press of Harvard University Press, 342-444.
——— (1983), "Taxonomy by Nucleotides", Nature 305:17-18.
Dobzhansky, Theodosius (1973), "Nothing in Biology Makes Sense Except in the Light of Evolution", American Biology Teacher 35:125-129.
Dollo, Louis (1905), "Les Dinosauriens adaptés à la vie quadrupède secondaire", Bull. Soc. Belge Géol. Pal. Hydr. 19:441-448.
——— (1922), "Les Céphalopodes déroulés et l'irréversibilité de l'évolution", Bijdragen tot de Dierkunde, 215-227.
Drake, James A., Terry E. Flum, Gregory J. Witteman, Timothy Voskuil, Anne M. Hoylman, Chris Creson, David A. Kenny, Gary R. Huxel, Cheri S. Larue, and Jeffrey R. Duncan (1993), "The Construction and Assembly of an Ecological Landscape", The Journal of Animal Ecology 62 (1):117-130.
Drake, James A., Terry Flum, and Gary R. Huxel (1994), "On Defining Assembly Space: A Reply to Grover and Lawton", The Journal of Animal Ecology 63 (2):488-489.
Dray, William (1957), Laws and Explanation in History. London: Oxford University Press.
Dumouchel, Paul (1993), "The Role of Fiction in Evolutionary Biology", SubStance 22 (2/3):321-330.
Efferson, Charles, and Peter J. Richerson (2007), "A Prolegomenon to Nonlinear Empiricism in the Human Behavioral Sciences", Biology and Philosophy 22 (1):1-33.
Eldredge, Niles (1993), "History, function, and evolutionary biology", in, 33-50.
Farrell, Brian D., and Charles Mitter (1993), "Phylogenetic Determinants of Insect/Plant Community Diversity", in Robert E. Ricklefs and Dolph Schluter (eds.), Species Diversity in Ecological Communities, Chicago: The University of Chicago Press, 253-266.
Feder, Martin E. (1987), New directions in ecological physiology. Cambridge [Cambridgeshire]; New York: Cambridge University Press.
Fisher, R. A. (1956), The Genetical Theory of Natural Selection. 2nd ed. Oxford: Clarendon Press. Original edition, 1930.
Fox, Barry J. (1987), "Species Assembly and the Evolution of Community Structure", Evolutionary Ecology 1:201-213.
Fukami, Tadashi, T. Martijn Bezemer, Simon R. Mortimer, and Wim H. van der Putten (2005), "Species Divergence and Trait Convergence in Experimental Plant Community Assembly", Ecology Letters 8:1283-1290.
Gavrilets, Sergey (2004), Fitness landscapes and the origin of species, Monographs in population biology; v. 41. Princeton, N.J.; Oxford, England: Princeton University Press.
Ghiselin, Michael T. (1997), Metaphysics and the origin of species, SUNY series in philosophy and biology. Albany: State University of New York Press.
Ginzburg, Lev R., and Mark Colyvan (2004), Ecological orbits: how planets move and populations grow. New York: Oxford University Press.
Givnish, Thomas J. (1987), "Comparative Studies of Leaf Form: Assessing the Relative Roles of Selective Pressures and Phylogenetic Constraints", New Phytologist 106 (Supplement):131.
Goldstone, Jack A. (1998), "Initial Conditions, General Laws, Path Dependence, and Explanation in Historical Sociology", American Journal of Sociology 104 (3):829.
Goudge, Thomas Anderson (1961), The ascent of life: a philosophical study of the theory of evolution. [Toronto]: University of Toronto Press.
Gould, Stephen Jay (1970), "Dollo on Dollo's Law: Irreversibility and the Status of Evolutionary Laws", Journal of the History of Biology 3 (2):189-212.
——— (1980), The Panda's Thumb: More Reflections In Natural History. New York: W.W. Norton & Company.
——— (1983), "Extemporaneous Comments of Evolutionary Hopes and Realities", in Charles L. Hamrum (ed.), Darwin's Legacy, Nobel Conference XVIII, 101-102.
——— (1989), Wonderful life: the Burgess Shale and the nature of history. New York: W.W. Norton.
——— (1991), Bully for Brontosaurus: Reflections in Natural History. New York: W.W. Norton & Company.
——— (1991), "The Panda's Thumb of Technology", in Bully for Brontosaurus: Reflections in Natural History, New York: W.W. Norton & Company, 59-75.
Gould, Stephen Jay, and Richard C. Lewontin (1979), "The Spandrels of San Marco and the Panglossian Paradigm: A Critique of the Adaptationist Programme", Proceedings of The Royal Society of London. Series B, Biological Sciences 205 (1161):581-598.
Gould, Stephen Jay, and Elisabeth Vrba (1982), "Exaptation: A Missing Term in the Science of Form", Paleobiology 8 (1):4-15.
Griffin, David Ray, and Claremont Center for Process Studies (1986), Physics and the ultimate significance of time: Bohm, Prigogine, and process philosophy. Albany: State University of New York Press.
Griffiths, Paul E. (1996), "The historical turn in the study of adaptation", British Journal for the Philosophy of Science 47 (4):511.
——— (2006), "Function, Homology, and Character Individuation", Philosophy of Science 73 (1):1-25.
Grover, James P. (1994), "Assembly Rules for Communities of Nutrient-Limited Plants and Specialist Herbivores", The American Naturalist 143 (2):258-281.
Halmos, Paul R. (1956), Lectures on Ergodic Theory. New York: Chelsea Publishing Company.
Harford, Tim (2008), The Logic of Life: The Rational Economics of an Irrational World. New York: Random House.
Harvey, Paul H., and Mark D. Pagel (1991), The comparative method in evolutionary biology, Oxford series in ecology and evolution; 1. Oxford; New York: Oxford University Press.
Hempel, Carl Gustav (1942), "The Function of General Laws in History", The Journal of Philosophy 39:35-48.
——— (1965), Aspects of scientific explanation, and other essays in the philosophy of science. New York: Free Press.
Hodgson, Geoffrey M. (1993), Economics and Evolution: Bringing Life Back Into Economics. USA: The University of Michigan Press.
Horan, Richard D., Erwin Bulte, and Jason F. Shogren (2005), "How Trade Saved Humanity from Biological Exclusion: An Economic Theory of Neanderthal Extinction", Journal of Economic Behavior and Organization 58 (1):1-29.
Hull, David L. (1974), Philosophy of Biological Science. Englewood Cliffs: Prentice-Hall Inc.
——— (1992), "The Particular-Circumstance Model of Scientific Explanation", in Matthew H. Nitecki and Doris V. Nitecki (eds.), History and Evolution, New York: State University of New York Press, 69-80.
Humphreys, Paul (1989), The Chances of Explanation: Causal Explanation in the Social, Medical, and Physical Sciences. Princeton: Princeton University Press.
Ishida, Yoichi (2007), "Patterns, Models, and Predictions: Robert MacArthur's Approach to Ecology", Philosophy of Science 74:642-653.
Johnson, Paul A., Richard E. Lenski, and Frank C. Hoppensteadt (1995), "Theoretical Analysis of Divergence in Mean Fitness Between Initially Identical Populations", Proceedings of The Royal Society of London. Series B, Biological Sciences 259:125-130.
Keddy, Paul A., and Evan Weiher (1999), "Introduction: The Scope and Goals of Research on Assembly Rules", in Evan Weiher and Paul A. Keddy (eds.), Ecological Assembly Rules: Perspectives, Advances, Retreats, Cambridge: Cambridge University Press, 1-20.
Keller, David R., and Frank B. Golley (2000), The philosophy of ecology: from science to synthesis. Athens, Ga: University of Georgia Press.
Kimura, M. (1983), The Neutral Theory of Molecular Evolution. London: Cambridge University Press.
Kingsland, Sharon E. (1995), Modeling nature: episodes in the history of population ecology. 2nd ed, Science and its conceptual foundations. Chicago: University of Chicago Press.
——— (2005), The evolution of American ecology, 1890-2000. Baltimore, Md.: Johns Hopkins University Press.
Kitcher, Philip (1989), "Explanatory Unification and the Causal Structure of the World", in Philip Kitcher and Wesley C. Salmon (eds.), Scientific Explanation, Minneapolis: University of Minnesota Press, 410-505.
Kitcher, Philip, and Wesley C. Salmon (1989), Scientific Explanation. Vol. 13, Minnesota Studies in the Philosophy of Science. Minneapolis: University of Minnesota Press.
Kuhn, Steven L., and Mary C. Stiner (2006), "What's a Mother to Do?", Current Anthropology 47 (6):953-980.
Lande, R. (1978), "Evolutionary Mechanisms of Limb Loss in Tetrapods", Evolution 32:73-92.
Lange, Marc (2002), "Who's Afraid of Ceteris-Paribus Laws? (or: How I Learned to Stop Worrying and Love Them)", Erkenntnis 57:407-423.
Law, Richard, and R. Daniel Morton (1996), "Permanence and the Assembly of Ecological Communities", Ecology 77 (3):762-775.
Lenski, Richard E., Michael R. Rose, Suzanne C. Simpson, and Scott C. Tadler (1991), "Long-term Experimental Evolution in Escherichia coli. I. Adaptation and Divergence During 2,000 Generations", The American Naturalist 138 (6):1315-1341.
Lenski, Richard E., and Michael Travisano (1994), "Dynamics of Adaptation and Diversification: A 10,000-Generation Experiment with Bacterial Populations", Proceedings of the National Academy of Sciences of the United States of America 91 (15):6808-6814.
Levi, Margaret (1997), "A Model, a Method, and a Map: Rational Choice in Comparative and Historical Analysis", in Mark I. Lichbach and Alan S. Zuckerman (eds.), Comparative Politics: Rationality, Culture, and Structure, Cambridge: Cambridge University Press, 19-41.
Levine, Andrew, and Elliott Sober (1985), "What's Historical About Historical Materialism?", The Journal of Philosophy 82 (6):304-326.
Lewis, David (1973), "The Laws of Nature", in Counterfactuals, Oxford: Blackwell, 72-77.
Lewontin, Richard C. (1966), "Is Nature Probable or Capricious?", Bioscience 16:25-26.
——— (1967), "The Principle of Historicity in Evolution", in Paul S. Moorhead and Martin M. Kaplan (eds.), Mathematical Challenges to the Neo-Darwinian Interpretation of Evolution, Philadelphia: The Wistar Institute Press, 81-88.
——— (1970), "The Units of Selection", Annual Review of Ecology & Systematics 1:1-18.
——— (1974), The genetic basis of evolutionary change, Columbia biological series; no. 25. New York: Columbia University Press.
——— (1978), "Adaptation", Scientific American 239:212-230.
——— (1984), "Adaptation", in Elliott Sober (ed.), Conceptual Issues in Evolutionary Biology, Cambridge, MA: MIT Press, 235-251.
Liebowitz, Stan J., and Stephen E. Margolis (1990), "The Fable of the Keys", Journal of Law and Economics XXXIII:1-26.
Losos, Jonathan B., Todd R. Jackman, Allan Larson, Kevin de Queiroz, and Lourdes Rodriguez-Schettino (1998), "Contingency and Determinism in Replicated Adaptive Radiations of Island Lizards", Science 279 (5359):2115-2118.
MacArthur, Robert H. (1972), Geographical ecology: patterns in the distribution of species. New York: Harper & Row.
MacArthur, Robert H., and E.O. Wilson (1963), "An Equilibrium Theory of Insular Zoogeography", Evolution 17:373-387.
——— (1967), The Theory of Island Biogeography. Princeton: Princeton University Press.
Macbeth, Norman (1980), "Reflections on Irreversibility", Systematic Zoology 29 (4):402-404.
Mahoney, James (2000), "Path dependence in historical sociology", Theory & Society 29 (4):507.
——— (2006), "Analyzing Path Dependence: Lessons From the Social Sciences", in Andreas Wimmer and Reinhart Kössler (eds.), Understanding Change: Models, Methodologies, and Metaphors, New York: Palgrave Macmillan, 129-139.
May, Robert M. (1973), Stability and Complexity in Model Ecosystems. Princeton: Princeton University Press.
Maynard Smith, John (1993), The Theory of Evolution, 3rd edn. Cambridge: Cambridge University Press.
Maynard Smith, John, R. Burian, S. Kauffman, P. Alberch, J. Campbell, B. Goodwin, R. Lande, D. Raup, and L. Wolpert (1985), "Developmental Constraints and Evolution: A Perspective from the Mountain Lake Conference on Development and Evolution", The Quarterly Review of Biology 60 (3):265-287.
Maynard Smith, John, and Eörs Szathmáry (1995), The major transitions in evolution. Oxford; New York: W.H. Freeman Spektrum.
Mayr, Ernst (1983), "How To Carry Out the Adaptationist Program?", The American Naturalist 121 (March):324-333.
McDonald, Terrence J. (ed.) (1996), The Historic Turn in the Human Sciences. USA: The University of Michigan Press.
Meagher, Richard B. (1995), "The Impact of Historical Contingency on Gene Phylogeny: Plant Actin Diversity", in Max K. Hecht, Ross J. MacIntyre and Michael T. Clegg (eds.), Evolutionary Biology, New York: Plenum Press, 195-215.
Miles, Donald B., and Arthur E. Dunham (1993), "Historical Perspectives in Ecology and Evolutionary Biology: The Use of Phylogenetic Comparative Analyses", Annual Review of Ecology & Systematics 24:587.
Monod, Jacques (1971), Chance and necessity: an essay on the natural philosophy of modern biology. 1st American ed. New York: Knopf.
Murdoch, W.W. (1994), "Population Regulation in Theory and Practice", Ecology 75:271-287.
Nitecki, Matthew H., and Doris V. Nitecki (1992), History and evolution, SUNY series in philosophy and biology. Albany: State University of New York Press.
Otto, Sarah P., and Troy Day (2007), A Biologist's Guide to Mathematical Modeling in Ecology and Evolution. New Jersey: Princeton University Press.
Paley, William (1805), Natural Theology: or, Evidences of the Existence and Attributes of the Deity, Collected from the Appearances of Nature, The Online Book Page:
Page, Scott E. (2006), "Path Dependence", Quarterly Journal of Political Science (1):87-115.
Peters, Robert Henry (1991), A critique for ecology. Cambridge [England]; New York: Cambridge University Press.
Pickett, Steward T., Jurek Kolasa, and Clive G. Jones (1994), Ecological Understanding: The Nature of Theory and the Theory of Nature. San Diego: Academic Press, Inc.
Pierson, Paul (2004), Politics in time: history, institutions, and social analysis. Princeton, N.J.: Princeton University Press.
Price, Peter W. (2003), Macroevolutionary theory on macroecological patterns. Cambridge, UK; New York, NY: Cambridge University Press.
Price, Peter W., and T. G. Carr (2000), "Comparative ecology of membracids and tenthredinids in a macroevolutionary context", Evol. Ecol. Res. 2:645-655.
Prigogine, Ilya (1986), "Irreversibility and Space-Time Structure", in David Ray Griffin (ed.), Physics and the Ultimate Significance of Time: Bohm, Prigogine and Process Philosophy, Albany: State University of New York Press, 232-250.
Raven, Peter H., R. Evert, and S.E. Eichhorn (1992), Biology of Plants. New York: Freeman, Worth Publishers.
Richards, Robert J. (1992), "The Structure of Narrative Explanation in History and Biology", in Matthew H. Nitecki and Doris V. Nitecki (eds.), History and Evolution, New York: State University of New York Press, 19-54.
Ricklefs, Robert E. (1990), Ecology. 3rd ed. New York: W.H. Freeman.
Ricklefs, Robert E., and Dolph Schluter (1993), Species diversity in ecological communities: historical and geographical perspectives. Chicago: University of Chicago Press.
Rolston, Holmes (2005), "Inevitable Humans: Simon Conway Morris's Evolutionary Paleontology", Zygon 40 (1):221-230.
Rosenberg, Alex (2000), "Laws, history, and the nature of biological understanding", in Michael T. Clegg, Max K. Hecht and Ross J. MacIntyre (eds.), Limits to knowledge in evolutionary genetics, 57-72.
Ruse, Michael (2000), "Limits to our knowledge of evolution", in Michael T. Clegg, Max K. Hecht and Ross J. MacIntyre (eds.), Limits to knowledge in evolutionary genetics, 3-33.
Salmon, Wesley C. (1984), Scientific Explanation and the Causal Structure of the World. Princeton: Princeton University Press.
——— (1989), Four Decades of Scientific Explanation. Pittsburgh: The University of Pittsburgh Press.
——— (1998), Causality and Explanation. New York: Oxford University Press.
——— (1998), "Causality: Production and Propagation", in Wesley C. Salmon (ed.), Causality and Explanation, New York: Oxford University Press, 285-301.
Schank, Jeffrey C., and William C. Wimsatt (1988), "Generative Entrenchment and Evolution", PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association, Vol. 1986 2:33-60.
Shimeld, Sebastian M., Andrew G. Purkiss, Ron P.H. Dirks, Orval A. Bateman, Christine Slingsby, and Nicolette H. Lubsen (2005), "Urochordate βγ-Crystallin and the Evolutionary Origin of the Vertebrate Eye Lens", Current Biology 15 (18):1684-1689.
Simpson, George Gaylord (1953), The major features of evolution, Columbia biological series; no. 17. New York: Columbia University Press.
Sklar, Lawrence (1993), Physics and Chance: Philosophical Issues in The Foundations of Statistical Mechanics. Cambridge: Cambridge University Press.
Sober, Elliott (1988), Reconstructing the past: parsimony, evolution, and inference. Cambridge, Mass.: MIT Press.
——— (2000), Philosophy of Biology. San Francisco: Westview Press.
Sterelny, Kim, and Paul E. Griffiths (1999), Sex and Death: An Introduction to Philosophy of Biology. Chicago: University of Chicago Press.
Strong, Donald R. (1984), Ecological communities: conceptual issues and the evidence. Princeton, N.J.: Princeton University Press.
Szathmáry, Eörs (2006), "Path Dependence and Historical Contingency in Biology", in Andreas Wimmer and Reinhart Kössler (eds.), Understanding change: Models, Methodologies, and Metaphors, New York: Palgrave Macmillan, 140-157.
Temperton, Vicky M. (2004), Assembly rules and restoration ecology: bridging the gap between theory and practice, Science and practice of ecological restoration. Washington, D.C.: Island Press.
Travisano, Michael, Judith A. Mongold, Alfred F. Bennett, and Richard E. Lenski (1995), "Experimental Tests of the Roles of Adaptation, Chance and History in Evolution", Science 267:87-90.
Tucker, Aviezer (2004), Our Knowledge of the Past: A Philosophy of Historiography. New York: Cambridge University Press.
Turchin, Peter (1995), "Population Regulation: Old Arguments and New Synthesis", in Naomi Cappuccino and P.W. Price (eds.), Population Dynamics: New Approaches and Synthesis, San Diego: Academic Press, 19-41.
——— (2003), Complex population dynamics: a theoretical/empirical synthesis, Monographs in population biology; 35. Princeton, N.J.: Princeton University Press.
——— (2003), Historical Dynamics: Why States Rise and Fall. Princeton: Princeton University Press.
Van Valen, Leigh M. (1991), "How Far Does Contingency Rule?", Evolutionary Theory 10:47-52.
Wahl, Lindi M., and David C. Krakauer (2000), "Models of Experimental Evolution: The Role of Genetic Chance and Selective Necessity", Genetics 156 (November):1437-1448.
Weiher, Evan, and Paul A. Keddy (1999), Ecological assembly rules: perspectives, advances, retreats. New York: Cambridge University Press.
White, Morton (1965), Foundations of Historical Knowledge. New York: Harper and Row, Publishers.
Wiens, John A. (1984), "On Understanding a Non-equilibrium World: Myth and Reality in Community Patterns and Processes", in Donald R. Strong, Daniel Simberloff, Lawrence G. Abele, and Anne B. Thistle (eds.), Ecological Communities: Conceptual Issues and the Evidence, Princeton: Princeton University Press, 439-457.
Williams, George C. (1992), Natural selection: domains, levels, and challenges, Oxford series in ecology and evolution; 4. New York: Oxford University Press.
Wilson, Edward O. (1992), The Diversity of Life. Cambridge, Massachusetts: The Belknap Press of Harvard University Press.
Wimmer, Andreas, and Reinhart Kössler (2006), Understanding change: models, methodologies, and metaphors. Houndmills, Basingstoke, Hampshire; New York: Palgrave Macmillan.
Wimsatt, William C. (1986), "Developmental Constraints, Generative Entrenchment, and the Innate-Acquired Distinction", in W. Bechtel (ed.), Integrating Scientific Disciplines, Dordrecht: Martinus-Nijhoff, 185-208.
——— (2001), "Generative Entrenchment and the Developmental Systems Approach to Evolutionary Processes", in Susan Oyama, Paul E. Griffiths and Russell D. Gray (eds.), Cycles of Contingency: Developmental Systems and Evolution, Cambridge: MIT Press, 219-237.
Whiting, Michael F., Sven Bradler, and Taylor Maxwell (2003), "Loss and Recovery of Wings in Stick Insects", Nature 421:264-267.
Wolff, Jerry O. (1997), "Population regulation in mammals: An evolutionary perspective", Journal of Animal Ecology 66 (1):1-13.
Woodward, James (2003), Making Things Happen: A Theory of Causal Explanation. New York: Oxford University Press.
Wright, Sewall (1932), "The Roles of Mutation, Inbreeding, Crossbreeding and Selection in Evolution", Proc. 6th Int. Cong. Genet. 1:356-366.
——— (1968), Evolution and the Genetics of Populations. Chicago: University of Chicago Press.

Appendix 1: Urn Dynamics, Increasing Returns and Lock-In

Scott Page (2006), in an insightful analysis of the notion of path dependence, proves that increasing returns are neither necessary nor sufficient for path dependence, and that multiple equilibria and lock-in are not necessary conditions for it either. Page distinguishes between two forms of historicity: phat dependence and path dependence. The former applies to processes whose “outcome at any time period depends on the set [of] outcomes and opportunities that arose in a history but not upon their order” (p. 97). The word “phat,” which contains the same letters as the word “path” but in a different order, reminds us that the order of events is unimportant in such processes. A simple example of a phat-dependent process is the assembly of a jigsaw puzzle. We cannot replace the pieces of a puzzle with others and obtain the same result at the end, but the order in which we assemble them does not really matter. By contrast, a process is said to be path dependent if the outcome at any time period depends on the history of outcomes and on their order. The priority effect in ecological communities discussed in Chapter 4 is consistent with this definition.
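The contrast between the two forms of historicity can be made concrete with a small computational sketch. The Python fragment below is only an illustration and is not taken from Page's analysis: the names is_order_invariant, jigsaw and first_arrival are hypothetical, and the "first arrival wins" rule is a deliberately crude stand-in for a priority effect. The idea is simply to test whether an outcome changes when the same events are replayed in every possible order.

```python
from itertools import permutations

def is_order_invariant(outcome, history):
    """Return True if `outcome` is the same for every ordering of `history`."""
    results = {outcome(list(order)) for order in permutations(history)}
    return len(results) == 1

def jigsaw(history):
    # The finished puzzle depends on which pieces were placed, not on their order.
    return frozenset(history)

def first_arrival(history):
    # Crude stand-in for a priority effect: whatever arrives first determines the outcome.
    return history[0]

pieces = ["corner", "edge", "centre"]
print(is_order_invariant(jigsaw, pieces))         # True: order is irrelevant (phat dependence)
print(is_order_invariant(first_arrival, pieces))  # False: order matters (path dependence)
```

A check of this kind only probes one history at a time, so it can show that order matters for a given set of events but cannot by itself establish that a process is phat dependent in general.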
More examples are provided below, but first I want to note that phat dependence involves a lesser degree of historicity than path dependence. History matters to a greater extent when changing the order of events yields alternative outcomes. As Page (2006, p. 98) points out, some processes that have been described as path dependent would count as merely phat dependent in his framework. Polya’s version of the urn process is one of them. Because the return function is time invariant, the order in which balls are selected makes no difference to the outcome. The proportion of balls in the urn after three time periods is the same whether the order is c1 c2 c2, c2 c1 c2 or c2 c2 c1. Polya’s urn process is therefore phat dependent in Page’s sense. The same is not true, however, if the return function is not time invariant, as in the Strong Path-Dependent Process:

Rules for the Strong Path-Dependent Process:
• Initially, n1 = n2 = 1.
• In period t a ball is chosen and 2^(t-1) balls of the selected shade are added to the urn.

Now suppose that we have the following series of draws: c1 c2 c1 c1 c2. After the first draw, one c1 ball is added to the urn. After the second draw, two c2 balls are added. After draws 3 and 4, four and eight c1 balls are added. Finally, after the fifth draw, 16 c2 balls are added. We therefore obtain 14 c1 balls and 19 c2 balls in the urn. I leave to the reader the task of repeating the experiment with the same number of c1 and c2 draws but in a different order, to verify that c1 c2 c1 c1 c2 is the only history that yields this proportion. Interestingly, though, because the magnitude of the returns increases with time, the process does not stabilize into a limit in the long run. This means that lock-in is not a necessary property of path dependent processes. Most accounts of path dependence would not be able to reach this conclusion, for they assume that there must be a limit probability distribution towards which the system evolves.
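The exercise left to the reader can also be carried out mechanically. The following Python sketch is offered as an illustration under two assumptions: the Polya urn is taken in its classic form (one ball of each shade initially, one ball of the drawn shade added per period), and the strong path-dependent rule is read as adding 2^(t-1) balls in period t, which is the reading that reproduces the 14 and 19 balls computed above. The function names are hypothetical. Draw sequences are supplied by hand rather than sampled, since the point is to compare orderings of the same draws.

```python
from itertools import permutations

def polya(history):
    """Classic Polya urn: one ball of each shade, plus one ball of each drawn shade."""
    counts = {"c1": 1, "c2": 1}
    for shade in history:
        counts[shade] += 1
    return counts["c1"], counts["c2"]

def strong_path_dependent(history):
    """One ball of each shade initially; in period t add 2**(t-1) balls of the drawn shade."""
    counts = {"c1": 1, "c2": 1}
    for t, shade in enumerate(history, start=1):
        counts[shade] += 2 ** (t - 1)
    return counts["c1"], counts["c2"]

def outcomes_over_orderings(process, history):
    """Map each distinct ordering of `history` to the urn contents it produces."""
    return {order: process(order) for order in set(permutations(history))}

polya_outcomes = outcomes_over_orderings(polya, ("c1", "c2", "c2"))
strong_outcomes = outcomes_over_orderings(
    strong_path_dependent, ("c1", "c2", "c1", "c1", "c2")
)

print(set(polya_outcomes.values()))                            # {(2, 3)}
print(strong_path_dependent(("c1", "c2", "c1", "c1", "c2")))  # (14, 19)
print(len(strong_outcomes), len(set(strong_outcomes.values())))  # 10 10
```

Under these assumptions every ordering of the Polya draws gives the same urn, whereas each of the ten orderings of three c1 and two c2 draws gives a different urn in the strong process, because each period contributes a distinct power of two.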
Page (2006) also shows, with a biased version of Polya’s urn, that a process can have a built-in increasing-returns mechanism and yet fail to be path dependent.

Rules for the Biased Polya Process:
• Initially, n1 = 1 and n2 = 2.
• If a c1 ball is selected, it is put back in the urn together with another c1 ball and another c2 ball.
• If a c2 ball is selected in period t, it is put back in the urn together with 2^t additional c2 balls.

This process is biased in two ways. First, instead of starting with one ball of each color, the urn starts with n1 = 1 and n2 = 2. Second, the selected ball is still returned to the urn in each period, but the return function is now strongly biased towards c2: c2 benefits from a much greater boost once selected. This set of rules entails increasing returns for both shades. Selecting a color in a given time period always increases the probability of selecting that same color in the next time period. But it does not result in path dependent dynamics in the longer run. Eventually, a c2 ball will be selected. When this happens, the probability of selecting a c2 ball in the next time period becomes significantly higher (over 75%) than the probability of selecting a c1 ball. Thus, the proportion of c2 balls in the urn rapidly converges towards 100%. We therefore see that, although governed by an increasing returns mechanism, this process admits of a unique stable equilibrium, which proves that increasing returns do not guarantee path dependence. When one option benefits from much higher returns than the alternatives, it will always win.

Using another variant of the urn process, Page (2006, pp. 100-101) also shows that increasing returns are not necessary for multiple equilibria to occur. Instead of having two colors of balls in the urn, imagine that we have four and that the following set of replacement rules applies.
Rules for the Balancing Polya Process:
• Initially, n1 = n2 = n3 = n4 = 1.
• Pick c1 → add c2.
• Pick c2 → add c1.
• Pick c3 → add c4.
• Pick c4 → add c3.

In brief, if you select a c1 ball, you add a c2 ball and vice versa, but if you select a c3 ball, you add a c4 ball and vice versa. Note that there are no increasing returns in this set of rules. The selection of a given color does not increase the probability of selecting that same color in the next time period. Nevertheless, the proportion of balls in the urn will always depend on the history of selection, and a different equilibrium will be reached each time we repeat the experiment. Page suggests that this version of the urn dynamics can give some insight into how complementarities between outcomes can create alternative equilibria. See also Pierson’s (2004) excellent discussion of institutional development, which shows strong interlinkages among institutional arrangements.
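The last two urn processes can be explored in the same spirit. The Python sketch below is an illustration only, with hypothetical function names. It assumes that the c2 rule of the biased process adds 2^t balls in period t; the seed, the number of periods and the number of runs are arbitrary choices made so that the runs can be repeated. The biased process is simulated with random draws, while the balancing process is applied to draw sequences supplied by hand.

```python
import random

def biased_polya_share(periods, rng):
    """Simulate the biased urn and return the final share of c2 balls.

    Assumption: a c2 draw in period t adds 2**t further c2 balls.
    """
    n1, n2 = 1, 2
    for t in range(1, periods + 1):
        if rng.random() < n1 / (n1 + n2):
            n1 += 1        # c1 drawn: add one c1 ball ...
            n2 += 1        # ... and one c2 ball
        else:
            n2 += 2 ** t   # c2 drawn: add 2**t more c2 balls
    return n2 / (n1 + n2)

def balancing_polya(history):
    """Apply the balancing rules to a given sequence of drawn shades."""
    partner = {"c1": "c2", "c2": "c1", "c3": "c4", "c4": "c3"}
    counts = {"c1": 1, "c2": 1, "c3": 1, "c4": 1}
    for shade in history:
        counts[partner[shade]] += 1   # the partner shade is added, never the drawn one
    return counts

rng = random.Random(0)  # fixed seed so that the runs are repeatable
shares = [biased_polya_share(50, rng) for _ in range(5)]
print(shares)  # share of c2 balls after 50 periods, one value per run

print(balancing_polya(["c1"]))                    # c1 is now 1 ball out of 5 instead of 1 out of 4
print(balancing_polya(["c1", "c2", "c1", "c2"]))  # the c1/c2 pair has grown
print(balancing_polya(["c3", "c4", "c3", "c4"]))  # the c3/c4 pair has grown instead
```

In the biased process the share of c2 should come out close to 1 in every run, whatever the early draws were, which is what a unique equilibrium amounts to here. In the balancing process, drawing a shade lowers its own share, yet which pair of shades comes to dominate the urn still depends on the draws that happened to occur.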

