UBC Undergraduate Research

Effects of Evolution on a Nonzero-Sum Game Chen, Ruiyuan 2010-06-01

You don't seem to have a PDF reader installed, try download the pdf

Item Metadata

Download

Media
51869-Chen_Ruiyuan_Science_One_Program_2009-2010.pdf [ 1.52MB ]
Metadata
JSON: 51869-1.0107214.json
JSON-LD: 51869-1.0107214-ld.json
RDF/XML (Pretty): 51869-1.0107214-rdf.xml
RDF/JSON: 51869-1.0107214-rdf.json
Turtle: 51869-1.0107214-turtle.txt
N-Triples: 51869-1.0107214-rdf-ntriples.txt
Original Record: 51869-1.0107214-source.json
Full Text
51869-1.0107214-fulltext.txt
Citation
51869-1.0107214.ris

Full Text

E ects of Evolution on a Nonzero-Sum GameRuiyuan ChenScience One Program, University of British Columbia, Vancouver, CanadaMay 28, 2010AbstractAn N-player nonzero-sum game was modelled on a computer and evolutionary algorithmswere used to determine whether the evolutionarily stable strategy coincides with the groupoptimum behavior.IntroductionThis project aims to model and study the following game:N players sit in a line and take turns standing up. Each player seeks to stand up at aunique time. If two or more players stand up at the same time, they are all punished(for example, made to do pushups) and the game ends. If the game continues untilonly one player is left sitting, that last player is punished instead.This game is nonzero-sum, meaning that the total gain (or loss, in this case) of all the playersdepends on the choices that the players make: if all players stand up as soon as the game startsthen all will be punished, whereas if each player stands at a random time then probably at leastone player will pass unscathed. This is unlike a zero-sum game such as gambling, in which totalwealth is conserved and redistributed among the players. In contrast, the players of a nonzero-sumsuch as that described above together determine how much joy or pain is created and shared.The goal of this study is to determine the e ects of evolution on the players of the gamedescribed above. After having an initial population play a certain number of games, we mimicnatural selection, killing the losers and allowing the winners to reproduce. Many generations later,we expect to be left with a homogeneous population consisting of the evolutionarily stable strategy(ESS), that will be able to outcompete any minority groups in the population [1]. However, sucha selection procedure works only on an individual level. Well-known nonzero-sum games like theprisoner’s dilemma exemplify a key feature of this class of games: maximization of individual gainmay result in suboptimal group gain. Thus the e ective goal of this study is to determine whethersuch a con ict between individualism and collectivism exists in the game described above.MethodsIn order to simulate the game on a computer, we created the following simpli ed model. Thetimes at which players may stand are represented as nonnegative integers, and each strategy is1Type and Parameters pmf(k)Fixed(t): always stand at time t2f0;1;2;:::g 1 if k = t, 0 otherwiseUniform(n): stand before time n2f1;2;3;:::g 1=n if k < n, 0 otherwiseGeometric(p): at any time, if still seated, stand with prob-ability p2(0;1]p(1 p)kBell(q): probability of remaining seated starts at 1 anddecreases by a factor of q2[0;1) at each timeqk(k 1)=2(1 qk)Uniform2(a;b): stand at or after time a2f0;1;2;:::gbutbefore time b > a1=(b a) if a k < b, 0 otherwiseTable 1: The types of strategies used in the initial populations, their parameters, and their prob-ability mass functions (probabilities of standing at each possible time k)Figure 1: pmf curves for strategies (a) Fixed(10), (b) Uniform(4), (c) Geometric(0:5),(d) Bell(0:9), and (e) Uniform2(14;17)represented by a probability distribution on the set of nonnegative integers. In a game, each playerstands at a time randomly chosen according to his strategy’s distribution. If two or more playerschoose the same time, the earliest such group of players is \punished" by a single point added totheir score; otherwise the player who chooses the latest time gains a single point.Note that these assumptions fail to re ect many aspects of the real-life game, such as theability to change one’s strategy mid-game depending on when other players stand, the potential tolearn about other players’ strategies and improve one’s own strategy for future games, and variouspsychological e ects. While our assumptions were necessary to allow e cient simulation of thegame on a computer, they do mean that the results of this study are best interpreted only inthe context of our simpli ed model. Any extrapolation to the original game must take the abovefactors into account.For each simulation, we set up an initial population of strategies with simple discrete probabilitydistributions, listed in Table 1, with probability mass functions shown in Figure 1. Multiple(M 2 [100;1000]) games were then played in order to reduce the random e ects of the choicesof standing times. At the end of all the games, the total score of all the players was divided bythe number of games, M, to obtain the total loss per game L. The players were then ranked inorder of increasing score, and a certain fraction (E) of the highest-scoring strategies were removed.An evolutionary algorithm (with parameter P) was used to produce \children" of the survivingstrategies which  ll the vacancies in the population; the algorithms used are shown in Table 2. Allof this constitutes one generation of the simulation. We call the initial population \generation 0",the population after one evolution \generation 1", and so on. A minimum of 8 generations wererun, and then more generations were run until the population had \converged". \Convergence" isde ned as the earliest generation when the mean standing time of the entire population has notchanged by more than 0:01 over the last half of all generations up to that point.2Type and Parameter (P) DescriptionevolveP: # of parents per childP parents are selected randomly (with replacement) and com-bined into a child that plays according to each parent with auniformly randomly chosen probabilityevolve2P: # of children per parentParents’ pmfs are split at P-quantiles (median for P = 2, quar-tiles for P = 4, and so on) into P new pmfs; the lowest-scoringparent gives rise to the  rst P children, the second-lowest-scoringgives rise to the next P, etc. until the population is  lled againevolve3P: # of parents per childSame as evolve, except parents are not combined at random;rather, the  rst P are combined, the next P are combined, etc.such that each group of P gets to procreate at least once (assum-ing enough vacancies in the population)Table 2: The types of evolutionary algorithms usedIn order to determine the average total loss per game L of the convergent population, we tookthe mean and standard error of the L of the last half of all generations. There were two exceptions.First, if the last generation consisted of a single deterministic strategy (i.e. with a probability massfunction zero except at a single point), we took the L of that generation, with an uncertainty of 0,since all future generations will have the same L. Second, for each individual population we rana control simulation in which no evolution occurred; in these control simulations, 100 generationswere always run, and the L of all 100 generations were averaged.We conducted the simulations in two stages, with each simulation using an initial population ofsize 64. In stage 1, we conducted 8 simulations, consistently using evolutionary algorithm evolve(see Table 2) while varying the composition of the initial population. In any simulation, each typeof strategy present in the initial population was present in equal numbers, and the parameters of thestrategies of each type were distributed uniformly across the parameters’ domains. For example, aninitial population of Fixeds and Geometrics is the setfFixed(t) : t = 0;:::;31g[fGeometric(p) :p = 1=32;:::;32=32g. In stage 2, we conducted 12 simulations, keeping an initial population offUniform2(a;b) : 0 a < 8;a < b a + 8g while changing both the type and parameter P of theevolutionary algorithm used.Results and DiscussionThe results of the stage 1 simulations are shown in Table 3. A quick glance at the Ls at convergenceand the ratios to the Ls of the controls reveals that while evolution e ectively reduced the L valuein most of the simulations, there are three blatant exceptions to this trend. Note that these areexactly the simulations that began with Fixeds in the initial population. Figure 2 shows the pmfs ofthe strategies in the Fixed+Uniform simulation at four generations. In Fig. 2a we see the \spikes"that are the pmfs of Fixed(0);:::;Fixed(31), as well as the pmfs of the Uniforms clustered aroundtime 0. After 3 generations, as shown in Fig. 2b, the strategies that stand up early have killed eachother through competition, while the surviving strategies have successfully reproduced. However,because children inherit from parents, this reproduction is e ectively suicidal, for the parents endup competing with their own o spring. Meanwhile, the late-standing strategies proliferate, suchthat by generation 20 (Fig. 2c), the once-sparsest region on the pmf graph has become the densest.This continues until the latest-standing strategy (Fig. 2d) is the only strategy left.3Initial population L at convergence L of control L ratioFixed 44:8 0:5 1 0 44:8 0:5Uniform 2:413 0:004 4:805 0:006 0:5022 0:0010Geometric 2:394 0:004 32:50 0:02 0:07366 0:00013Bell 3:021 0:009 32:49 0:02 0:0930 0:0003Fixed + Uniform 40:4 1:5 5:053 0:005 8:0 0:3Uniform + Geometric 2:931 0:005 3:17 0:18 0:92 0:05Geometric + Bell 3:502 0:008 16:495 0:007 0:2123 0:0005all 4 types 39:5 1:4 12:884 0:006 3:07 0:11Table 3: Stage 1 simulation results, showing the types of strategies in the initial population, theaverage total loss L at convergence, the L of the control, and the ratio of these two Ls. The exactcompositions of the initial populations were as described in the Methods section.(a) 0 (b) 3 (c) 20 (d) 60Figure 2: pmfs of strategies in Fixed + Uniform at generations 0, 3, 20, and 60We can see, then, that in the simulations involving Fixed strategies, the ESS is the latest-standing strategy because such a strategy avoids competition until all other strategies have died.In fact, this also applies to the simulations in Table 3 not involving Fixeds, and explains thereduction of L in those simulations just as well: for Uniforms, Geometrics, and Bells are allstrategies where the latest-standing strategy also has the most spread-out pmf curve (see Figure1). Thus in an initial population lacking Fixeds, the ESS is far less likely to compete with itself,which explains the e ectiveness of evolution in reducing L.Suppose that in some initial population including Fixeds, a lone early-standing strategy some-how survives without competition until the rest of the population has converged to an ESS similarto that shown in Figure 2d. The early-standing strategy has an enormous advantage, a near-guarantee of neither competing with another strategy, nor being punished for being the last playersitting { for a single turn. As soon as the early-standing strategy is rewarded with like-mindedo spring, competition kicks in and kills both parent and children. This ability of the equilibriumto withstand invasion allows us to fully justify labelling the latest-standing strategy as the ESS ofany given initial population.The results of the stage 2 simulations are shown in Table 4. There are a number of trends, themost apparent being the common L value of 64 0 for all of the simulations using evolve2. Sincethe size of the population was 64, this was the maximum L possible. The uncertainty is 0 becausein each of the 4 simulations, the  nal population was completely homogeneous and consisted of asingle deterministic strategy. This is the worst possible collectivistic outcome, since the populationmust maintain the maximum L for all future generations. This result makes theoretical sense, forrecall the pmf-splitting mechanism of evolve2 (Table 2): the only populations on which evolve2does not act must consist of identical deterministic strategies.4Algorithm P E L at convergence L of control L ratioevolve20:59:90 0:053:279 0:0143:02 0:025 8:83 0:02 2:693 0:0138 9:26 0:02 2:824 0:0142 0:25 8:66 0:04 2:641 0:017evolve220:564 0 19:52 0:085 64 0 19:52 0:088 64 0 19:52 0:082 0:25 64 0 19:52 0:08evolve320:58:50 0:05 2:59 0:025 8:77 0:03 2:675 0:0158 10:61 0:05 3:24 0:022 0:25 8:51 0:04 2:595 0:016Table 4: Stage 2 simulation results, showing the evolutionary algorithm used and its parameterP, the selection rate E, the L at convergence, the control L, and the L ratio.Examining the L ratios for the evolve and evolve3 simulations, we see that these two evolu-tionary algorithms produced net increases in L that were largely independent of the evolutionaryparameters P and E. Furthermore, the e ect of evolve3 on the pmf curves of the population looksvery similar to the sequence shown in Figure 2: a gradual but steady recession of early-standingstrategies that leads to a late-standing ESS. Thus it appears that the key features of both algo-rithms are the same: the blessing/cursing of well-scoring parents with like-minded children. Andjust as the detailed composition of the initial populations in stage 1 did not a ect the general\evolutionarily stable" = \late-standing" trend, such quantitative  ne print as the parameter Por the selection rate E seem unimportant to describing the general e ects of evolution.ConclusionThe e ects of evolution on our modelled game were studied using di erent initial populationsand evolutionary algorithms and were found to optimize neither individualistic nor collectivisticbehavior. However, the latest-standing strategy became the ESS in all cases, irrespective of theinitial population and the evolutionary algorithm. It seems remarkable that a complex systemwith many variables always behaves in the same way, independently of those variables.AcknowledgmentsI would like to thank Fok-Shuen Leung for allowing me to proceed with this project and givingme some early feedback on my results, Josh Zukewich for a number of useful suggestions forimprovement, and Andy Tan for reviewing and commenting on my paper.References[1] Maynard Smith, J. and Price, G.R. The Logic of Animal Con ict. Nature 246, 15-18 (1973).5

Cite

Citation Scheme:

        

Citations by CSL (citeproc-js)

Usage Statistics

Share

Embed

Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                        
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            src="{[{embed.src}]}"
                            data-item="{[{embed.item}]}"
                            data-collection="{[{embed.collection}]}"
                            data-metadata="{[{embed.showMetadata}]}"
                            data-width="{[{embed.width}]}"
                            async >
                            </script>
                            </div>
                        
                    
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.51869.1-0107214/manifest

Comment

Related Items