Modeling the surge beds in the emergency department in a hospital by Markov decision process and queueing… Tirdad, Ali 2015

Modeling the Surge Beds in the Emergency Department in a Hospital by Markov Decision Process and Queueing Theory

by

Ali Tirdad

B.Sc., University of Tehran, 2006
M.Sc., Amirkabir University of Technology, 2008

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE COLLEGE OF GRADUATE STUDIES (Mathematics)

THE UNIVERSITY OF BRITISH COLUMBIA (Okanagan)

December 2015

© Ali Tirdad, 2015

Abstract

In this thesis, we apply Markov decision theory to a problem involving an M(t)/M/c/c queue in order to conduct a case study at Kelowna General Hospital (KGH) in British Columbia, Canada. Health-care systems have been challenged in recent years to deliver high-quality care with limited resources. Emergency departments (ED) are perhaps the most sensitive components of the health-care system due to their nature. KGH has extra beds in its ED in a unit called the surge section. They use this section in case the ED is overcrowded.
There is no systematic approach to when this section should be in use, and managerial decisions are made based on what seems necessary at the time. Obviously, these decisions are usually costly and not based on a careful analysis. Therefore, the hospital wants a policy indicating when to use the surge section.

In this thesis, we first adapt the fourth-order Runge-Kutta method (RK4) to obtain more accurate transient solutions for M(t)/M/c/c queues. We show numerically that our method works better than RK4 for our specific queue. Then we provide the Markov decision process (MDP) model for solving the problem. In this model, the arrival rate is time-dependent, and there are two levels for the number of servers. We prove that decisions for an MDP with periodic and time-dependent Poisson arrivals are periodic as well. Consequently, the contour control policies obtained from the optimal decisions show periodic behavior as well. The numerical results presented support this claim. Moreover, we model the seasonal flu epidemics and define a combined arrival rate for the ED. The results of this model also support the claim about the periodicity of the policies. In addition, we show that a type of linear extrapolation is a reliable way to obtain control policies for large-scale problems.

Preface

Some of the research results presented in this thesis have been published in journal articles. The co-authors for these publications were Dr. Winfried Grassmann and Dr. Javad Tavakoli, my research supervisors. The relations between the published papers and this thesis are summarized below.

Parts of chapters 2 and 3 have been published as a journal paper:

- Ali Tirdad, Winfried K. Grassmann, and Javad Tavakoli.
Optimal Policies of M(t)/M/c/c Queues with Two Different Levels of Servers. European Journal of Operational Research, accepted, 2015.

A part of chapter 4 has been submitted as a journal paper:

- Ali Tirdad, Winfried K. Grassmann, and Javad Tavakoli. Optimal Policies of an Emergency Department with Two Levels of Servers and Seasonal Flu Epidemic Arrivals. Journal of the Operational Research Society, submitted, 2015.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
Glossary of Notation
Acknowledgements
Dedication

Chapter 1: Introduction
  1.1 Kelowna General Hospital Case
  1.2 Markov Chains
  1.3 Markov Decision Processes
  1.4 M/M/c/c Queues
  1.5 Literature on Application of Queueing Theory in Hospital Bed Allocation
  1.6 Overview

Chapter 2: Transient Solutions for M(t)/M/c/c Queues
  2.1 Time-dependent Arrival Queues in the Literature
  2.2 Problem Description
  2.3 The Modified Fourth Order Runge-Kutta Method
  2.4 The Randomization Method
  2.5 Time Averages
  2.6 Chapter Summary

Chapter 3: Optimal Policies for M(t)/M/c/c Queues with Two Different Levels of Servers
  3.1 Two-level Service Queues in the Literature
  3.2 Markov Decision Process for KGH
    3.2.1 Backward Induction
  3.3 Periodicity of the Policies
  3.4 Numerical Results
  3.5 Extreme Case Analysis
    3.5.1 The Deterministic Case
    3.5.2 Total Randomness
  3.6 Chapter Summary

Chapter 4: Case Study: An Emergency Department with Two Levels of Servers and Seasonal Flu Epidemic
  4.1 Introduction
  4.2 Problem Description
    4.2.1 Seasonally Forced SIR Model
    4.2.2 Combining the Rates
  4.3 Periodicity of The Control Policies
  4.4 Extrapolation
  4.5 Chapter Summary

Chapter 5: Conclusions and Future Work

Bibliography

List of Tables

Table 1.1 Most common codes in Kendall's notation
Table 2.1 The blocking probability, b(t), with the modified RK4 for c = 100 and 2 ≤ t ≤ 16
Table 2.2 The blocking probability, b(t), with the modified RK4 for c = 100 and 14.8 ≤ t ≤ 53
Table 2.3 The blocking probability, b(t), with 5-term randomization for c = 100 and 2 ≤ t ≤ 16
Table 2.4 The blocking probabilities, b(t), with 15-term randomization for c = 100 and 2 ≤ t ≤ 16

List of Figures

Figure 1.1 Rate transition diagram for an M/M/c/c queue
Figure 1.2 Two-dimensional queueing model for emergency care chain [dBvRVK07]
Figure 3.1 Bi-level hysteretic service rate control [Geb67]
Figure 3.2 System with two modes of operation [AM89]
Figure 3.3 An M/M/c queue with two modes of operation [AM88]
Figure 3.4 Markov decision process diagram
Figure 3.5 N_t and m_t for three cycles, T = 364, with 52 decision epochs in each cycle
Figure 3.6 Number of arrivals, departures, and occupied beds over one cycle
Figure 3.7 q_t and demonstration of t̄_1, t_1, t̄_2, and t_2
Figure 3.8 N_t and m_t for three cycles with 20 decision epochs in each cycle, where β = 60 and α = 1
Figure 4.1 SIR model with: (a) constant, and (b) time-dependent transmission rates
Figure 4.2 N_t and m_t for three cycles, T = 364, with 52 decision epochs in each cycle
Figure 4.3 Changes in N_t/κ and m_t/κ by increasing the size of the problem, where κ_1 = 1, κ_2 = 2, κ_3 = 3 and κ_4 = 4
Figure 4.4 Linear extrapolation of N_t for different decision epochs
Figure 4.5 Linear extrapolation of m_t for different decision epochs
Figure 4.6 Comparing the extrapolation and MDP results of m_t and N_t for κ = 5
Glossary of Notation

p_j(t)          probability of being in state j at time t
P(t)            the row vector of the p_j(t)
A(t)            the transition matrix at time t
b(t)            the blocking probability at time t
m_1             the number of main beds
m_2             the number of stretcher beds
m_3             the number of surge beds
λ_t             the arrival rate at time t
μ               the service rate
T_H             the planning horizon
T               the length of the cycle
n_y             the number of cycles in the planning horizon
b_n             the number of patients in the system at stage n
x_n             the status of the surge section at stage n
a_n             the action at stage n
p^n_{ij}        the probability of having j patients in the system at stage n given that b_{n−1} = i and a_{n−1} = 0
q^n_{ij}        the probability of having j patients in the system at stage n given that b_{n−1} = i and a_{n−1} = 1
O_{ax}(n)       the immediate opening cost of the surge section at stage n
C_o             the constant start-up cost of the surge section
R_a(n)          the surge operating cost for the interval (t_n, t_{n+1})
C_r             the cost per unit of time of running the surge section
S_a(n, b)       the expected cost of having patients on stretchers for the interval (t_n, t_{n+1})
C_s             the cost of having a patient on a stretcher for a unit of time
C_rej           the cost of rejecting a patient
F_x(n, b)       the rejection cost for the interval (t_n, t_{n+1})
r_n(x, b, a)    the cost at stage n for state (x, b) and action a
u*_n(x_n, b_n)  the cumulative optimal cost function of the system from the decision epoch t̃_n
τ               the location of a decision epoch in the cycle of the arrival-rate function
γ^a_{ij}(t_i, t_f)  the probability of being in state j at time t_f with initial state i at time t_i and a as the latest action before t_f
N_t             the contour point at time t for opening the surge section
m_t             the contour point at time t for closing the surge section
q_t             the number of occupied beds in the ED at time t
d_t             the number of departures at time t
S               the proportion of susceptible
I               the proportion of infectious
R               the proportion of recovered
γ               the recovery rate
β               the transmission rate
N_p             the population size
N_R             the reproduction number

Acknowledgements

I
would like to offer my sincere gratitude to my supervisors, Dr. Javad Tavakoli and Dr. Winfried Grassmann, and thank them for their generous help and unconditional support throughout my studies.

Dedication

To my mom and dad, Sara and Hossein, who sacrificed a lot for our success! Thank you for your great support and continuous care. This humble work is a sign of my love for you.

To my brother Amir, whose beautiful soul makes everything better! Thank you for helping me through thick and thin, and for all the constructive and positive words.

To my brother Iman, who makes the impossible possible! You are the best thing in my life, and my inspiration for never giving up. Thank you for being there for me, and for taking charge of the matters that were my responsibilities.

Chapter 1

Introduction

Using mathematics to solve real-life problems often leads to interesting and difficult mathematical challenges. In this thesis, we apply Markov decision theory, a well-known operations research technique, to solve a case study at Kelowna General Hospital (KGH).

Our problem arises from a case study at KGH, in British Columbia, Canada. Health-care systems have been challenged in recent years to deliver high-quality care with limited resources. Emergency departments (ED) are perhaps the most sensitive components of the health-care system due to their nature. KGH has extra beds in its ED in a unit called the surge section. They use this section in case the ED is overcrowded. There is no systematic approach to when this section should be in use, and managerial decisions are made based on what seems necessary at the time. Obviously, these decisions are usually costly and not based on a careful analysis. Therefore, the hospital wants a policy indicating when to use the surge section.

For modeling the KGH problem, we should notice that the arrivals and departures in an ED form a stochastic process.
In other words, there is no deterministic pattern for the number of people showing up at an ED, and no fixed time that it takes to take care of a patient. Moreover, it is reasonable to say that the pace of receiving patients in an emergency department depends on the time of the year as well. Considering these characteristics, mathematics helps to model this problem. However, the problem is mathematically sophisticated and challenging. The complexity of dealing with the time-dependence of the model is discussed in chapter 2, and the mathematical model is provided in detail in chapter 3. In this chapter, we explain the problem at hand in section 1.1. An overview of Markov chains and Markov decision processes, the methods we use for modeling our problem, is provided in sections 1.2 and 1.3, respectively. We explore pertinent results from queueing theory in section 1.4 to familiarize ourselves with queueing theory concepts and notation. The literature on the application of queueing theory in health care, especially in bed allocation, is reviewed in section 1.5. An overview of the remainder of the thesis is provided in section 1.6.

1.1 Kelowna General Hospital Case

Surge beds (or, more generally, surge resources) can be temporary beds or a section which is only used when there is a crisis situation [GMM02], [PT12]. In our case, a crisis situation arises when the number of patients accommodated on inconvenient stretchers exceeds an acceptable threshold. In most of the current literature, EDs with surge beds have been studied using queueing theory. The key questions here are (1) when to decide to use the surge resources and (2) for how long to keep them open.

It is not economical to use the surge resources every time a crisis situation happens. For instance, consider the case when the surge resources consist of an additional section next to the ED.
Every time one uses the surge section, it needs to be prepared, and of course this process imposes a cost on the system. This cost should be considered as the start-up cost incurred each time the surge resources are opened. Accordingly, it is not economical to open and close the section frequently. On the other hand, it is not economical to keep the surge section open while it is not in use, or while it is working with a low rate of occupancy. So we have to find an optimal schedule that balances the start-up and running costs of the surge section.

Kelowna General Hospital (KGH) has already used the idea of surge beds in its ED. But there are differences between the KGH case and the general case; patients don't really "wait" at KGH to be admitted to the emergency department. Patients are admitted and put in a less-than-ideal bed; sometimes it is a hallway bed, other times a stretcher in the ED. Therefore, technically there is no waiting line. Often, patients will spend their entire length of stay in the hospital in one of the stretchers. Even if an ideal bed becomes free, hallway patients are not necessarily moved into that new bed; it may be that a "new" patient from the ED will be moved to that bed. The quality of the service has an inverse relation with the number of patients on the stretchers: the lower the number of patients on stretchers, the better the quality of service.

We want to obtain an optimal policy which is a trade-off between the quality of the service and the cost of the surge section. The problem addressed here is how one can manage the surge resources such that one has the maximum quality of service with minimum cost, given the start-up and running costs of a surge section of the ED.

In this work we plan to model this problem as an M(t)/M/c/c queue with a special control policy obtained from the results of a Markov decision process. Before that, let us familiarize ourselves with Markov chains, Markov decision processes, and queueing theory.
1.2 Markov Chains

For defining a Markov decision process, it is best to begin with the concepts of the stochastic process and the Markov chain (or process) first. A stochastic process is the mathematical abstraction of an experiment which is governed by probabilistic laws [GH85]. A stochastic process can be defined as a set of random variables over a range of time T, {X(t), t ∈ T}. X(t) is the state of the process at time t. Stochastic processes can be classified depending on the nature of the problem. For continuous-time stochastic processes, the index variable takes a continuous set of values. A discrete-time stochastic process, as contrasted with a continuous-time process, is a sequence of random variables and time series related to these random variables.

The Markov chain or Markov process is a type of stochastic process that has the "memoryless" property. A process is memoryless if, given the present state of the process, the future can be predicted independently of the past. Mathematically, a stochastic process is a Markov process if, for any set of n time points t_1 < t_2 < ... < t_n, and for any real numbers x_1, x_2, ..., x_n, we have

p{X(t_n) = x_n | X(t_1) = x_1, ..., X(t_{n−1}) = x_{n−1}} = p{X(t_n) = x_n | X(t_{n−1}) = x_{n−1}},   (1.1)

where the X(t) are discrete-time random variables, and p{A|B} is the conditional probability of A when B is given. Like any other stochastic process, a Markov chain can be continuous-time or discrete-time.¹

The states of a Markov chain can be periodic or aperiodic. State i has period d if, starting in i, state i can only be revisited after d, 2d, ... transitions. A state with period 1 is said to be aperiodic. In addition, states can be transient or recurrent. State i is recurrent if, starting in i, the process revisits state i again at some point almost surely; if not, state i is transient. Moreover, if the expected time for revisiting state i is finite, then state i is positive recurrent.
If a state is positive recurrent and aperiodic, then it is called ergodic. A Markov chain containing only ergodic states is called an ergodic Markov chain. Irreducibility is the property that, regardless of the present state, we can reach any other state in finite time. Hence, it can be said that all ergodic Markov chains are irreducible. For the analysis of our model in chapter 3, we need periodic Markov chains.

¹ Some authors use the term "chain" only for discrete processes [GH85].

An ergodic Markov chain always converges to an equilibrium (limiting probabilities) independent of the initial state. For writing the related equations, we need the Chapman-Kolmogorov equation. We assume a countable state space S = {0, 1, 2, ...}. Let p_{ij}(t) represent the probability that the process transits from state i to state j at time t, where i, j ∈ S. If these probabilities are homogeneous, we can represent them by p_{ij}. Let p^n_{ij} be the probability of going from state i to state j in n transitions in a discrete-time Markov process. The Chapman-Kolmogorov equation says

p^m_{ij} = ∑_{r∈S} p^{m−k}_{ir} p^k_{rj}   for 0 < k < m.   (1.2)

For continuous-time Markov chains, the Chapman-Kolmogorov equation is as follows:

p_{ij}(u, s) = ∑_{r∈S} p_{ir}(u, t) p_{rj}(t, s)   for any u < t < s,   (1.3)

where p_{ij}(u, s) is the probability of moving from state i to j in the time interval beginning at u and ending at s. Let us focus on discrete-time Markov chains for now. Let A denote the matrix of one-step transition probabilities, A = [p_{ij}], and let P(n) = [p_i(n)] be the probability row vector after n transitions from a starting point, where p_i(n) is the probability of being in state i after n transitions. Using equation (1.2), we can say

p_i(n+1) = ∑_{j∈S} p_j(n) p_{ji},   i ∈ S,
P(n+1) = P(n)A.   (1.4)

In the discrete case, the limiting probabilities are easy to obtain. A^n represents the n-step transition probabilities, A^n = [p^n_{ij}]. Let p_j be the probability of being in state j in the long run.
The limiting probabilities can be obtained by the following well-known equations [Ros10]:

p_j = lim_{n→∞} p^n_{ij},   j ∈ S,
p_j = ∑_{i∈S} p_i [A]_{ij},   j ∈ S,
∑_{j∈S} p_j = 1.   (1.5)

If A is given, obtaining the limiting probabilities is possible using the equations in (1.5). However, these solutions only work for the time-homogeneous cases.

For a continuous-time Markov chain, using equation (1.3) and letting u = 0 and s = t + Δt, we have

p_{ij}(0, t + Δt) = ∑_{r∈S} p_{ir}(0, t) p_{rj}(t, t + Δt).   (1.6)

For simplicity of notation in continuous-time cases, let p_i(t) denote the probability of being in state i at time t, knowing the initial condition p_i(0) for all i ∈ S; p_i(0) represents the initial probability of being in state i at time t = 0. Multiplying (1.6) by p_i(0) and summing over all states i, we have

∑_{i∈S} p_i(0) p_{ij}(0, t + Δt) = ∑_{r∈S} ∑_{i∈S} p_{ir}(0, t) p_i(0) p_{rj}(t, t + Δt),

and hence, with the new notation,

p_j(t + Δt) = ∑_{r∈S} p_r(t) p_{rj}(t, t + Δt).   (1.7)

If the transition probability function p_{ij}(t, t + Δt) can be written as a_{ij}(t)Δt + o(Δt), then we can rewrite equation (1.7) as

p_j(t + Δt) = ∑_{r∈S} p_r(t) a_{rj}(t)Δt + o(Δt).   (1.8)

Setting a_{ii}(t) := −∑_{j≠i} a_{ij}(t) for all i ∈ S, we have

p′_j(t) = ∑_{r∈S} p_r(t) a_{rj}(t),   j ∈ S,
P′(t) = P(t)A(t),   (1.9)

where A(t) = [a_{ij}(t)], P(t) is the probability row vector of the p_i(t), and p′_i(t) and P′(t) are the derivatives of p_i(t) and P(t) with respect to t, respectively. In general, p_j(t) can be obtained only numerically, for both homogeneous and non-homogeneous cases. We discuss two numerical methods for finding the solution of equation (1.9) in chapter 2 in detail.

The Poisson process is the most famous example of a continuous-time Markov chain. Many punctual phenomena have the Poisson distribution. The Poisson process is a stochastic process that counts the number of events in a given time interval, where the time between each pair of consecutive events has an exponential distribution. The inter-arrival times are independent of each other.
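The recursion P(n+1) = P(n)A in (1.4) and the balance equations in (1.5) are easy to verify numerically for a small chain. The following sketch uses a hypothetical three-state transition matrix; the numbers are illustrative only and are not from the thesis.

```python
# Hypothetical 3-state discrete-time chain (illustrative numbers only, not
# from the thesis), used to demonstrate equations (1.4) and (1.5).
A = [[0.5, 0.4, 0.1],
     [0.3, 0.4, 0.3],
     [0.2, 0.3, 0.5]]

def step(P, A):
    """One transition of the row vector: P(n+1) = P(n) A, equation (1.4)."""
    n = len(A)
    return [sum(P[i] * A[i][j] for i in range(n)) for j in range(n)]

P = [1.0, 0.0, 0.0]            # start in state 0
for _ in range(1000):          # iterate until (numerical) convergence
    P = step(P, A)

# The limit satisfies the balance equations (1.5):
# p_j = sum_i p_i [A]_ij and the probabilities sum to 1.
assert all(abs(p - q) < 1e-12 for p, q in zip(P, step(P, A)))
assert abs(sum(P) - 1.0) < 1e-12
print([round(p, 4) for p in P])
```

Because this chain is irreducible and aperiodic, the same limit is reached from any initial vector, which is the "independent of the initial state" property noted above.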
For a Poisson process and i, j ∈ S, we have

a_{i,j} =  λ   for j = i + 1,
          −λ   for j = i,
           0   otherwise.   (1.10)

The general solution of equation (1.9) for a Poisson process is as follows:

p_i(t) = ((λt)^i / i!) e^{−λt},   i ∈ S.   (1.11)

The birth-death process is another special case of homogeneous continuous-time Markov chains, and it is basic to queueing theory. The birth-death process is a continuous-time Markov chain with states S = {0, 1, ...} for which, from state i ∈ S, only transitions to states i − 1 and i + 1 are possible. The state of a birth-death process can be interpreted as the population in a system. In such a process, births and deaths at state i are Poisson processes with rates λ_i and μ_i, respectively. The transition matrix A in equation (1.9) for a birth-death process is as follows:

A = [ −λ_0        λ_0            0            0    ...
       μ_1    −(μ_1 + λ_1)      λ_1           0    ...
        0         μ_2       −(μ_2 + λ_2)     λ_2   ...
       ...        ...           ...          ...   ... ]   (1.12)

The birth-death process is the basis of much of the classic queueing models. The solution of a relevant case of it is provided in section 1.4. In chapter 2, we see a matrix similar to (1.12) for our problem, with a limited number of states and time-dependent rates for the arrivals' Poisson process.

1.3 Markov Decision Processes

A Markov decision process (MDP) is a Markov process with decision making at some or all of the states of the process. The decisions, or actions, can in general depend on the current state and stage of the system. The transition rates can depend on the actions, in addition to depending on the initial and final states of the transition. There can be a reward/cost² associated with a transition from one state to another. Moreover, choosing an action can result in a cost as well. Some costs depend on the initial and final states of the transition, and some are independent of the final state. Usually the latter is an immediate cost. We focus on a discrete-time MDP with a finite planning horizon to explain a basic MDP model.

² In our model, we have costs. Therefore, we only use the term "cost" instead of "reward/cost" from now on. The structure of the problem for a reward system is the same as for the cost analysis, except that in the optimization we maximize rewards rather than minimize costs.

Let X_n be the sequence of random variables making up a discrete-time Markov chain; n represents the stage of the process, or the number of the transition. Assuming a finite planning horizon, we have n ∈ {0, 1, 2, ..., N}. Let S = {0, 1, 2, ...} be the state space. Let A_n(i) be the set of actions at state i and stage n. Then a ∈ A_n(i) is a possible action in state i ∈ S at the n-th stage. Let u_n(i, a) be the cumulative expected total cost from the n-th stage to the end of the planning horizon if the system is in state i now and one chooses action a ∈ A_n(i). Let p^a_{ij} be the probability of going to state j in the next transition with action a, knowing the system occupies state i now.

There can be two types of costs in an MDP. The first type, denoted by r^a_{ij}, is the cost associated with the transition from i to j with action a ∈ A_n(i). However, costs can also depend only on the current state and decision. This type of cost is called an immediate cost, denoted by q^a_i. Let r^a_i be the expected cost of the next transition out of state i ∈ S with action a ∈ A_n(i). That is,

r^a_i = q^a_i + ∑_{j∈S} p^a_{ij} r^a_{ij}.   (1.13)

If u*_n(i) is the optimal expected total cost at stage n, then we can write the following simple recursive relation:

u*_n(i) = min_{a∈A_n(i)} { r^a_i + ∑_{j∈S} p^a_{ij} u*_{n+1}(j) }.   (1.14)

This equation is an application of dynamic programming, which can be solved recursively if an initial condition is provided. We explain the details of the MDP model for our problem in chapter 3 and provide backward induction as one of the common algorithms to solve this type of mathematical model. Our MDP model is more complicated than equation (1.14). For our MDP model, we need to obtain the transition matrix from the transient solution of the M(t)/M/c/c queue. Moreover, the structure of the costs in our model needs a time-average of transient probabilities, which is discussed in chapter 2. Before that, we review the basics of queueing theory in the following section.

1.4 M/M/c/c Queues

In queueing theory, a mathematical model is built to predict the characteristics of a problem. A queueing system can be described as customers³ arriving for service, waiting for service if there is no immediate service available, and leaving after receiving service. Kendall's notation is the standard system used to describe and classify a queue. The most common version of Kendall's notation is A/S/c, where A denotes the arrival pattern of customers, S the service pattern of the server(s), and c the number of servers. Sometimes, including in our case, a fourth parameter, k, is included in Kendall's notation as the capacity of the queue. The capacity of the system gives the maximum number of customers allowed into the queue, including the customers that are receiving service. When the last parameter in Kendall's notation is not provided, it is assumed that k = ∞. A summary of the most common codes used in Kendall's notation is provided in Table 1.1.

Table 1.1: Most common codes in Kendall's notation

Code | Name                                 | Description
M    | Markovian or memoryless              | Poisson process.
D    | Degenerate distribution              | A deterministic inter-arrival or service time.
G    | General distribution                 | For the service time.
GI   | General and independent distribution | For the arrivals.
PH   | Phase-type distribution              | A sequence of phases which define a Markov chain with an absorbing state.

In this section, we review the M/M/c/c queue to get a better understanding of this type of queue. Then we explain the difference between our problem and the traditional M/M/c/c queue.

An M/M/c/c queue has c parallel servers with a maximum capacity of c customers in the system.
This means that if all servers are busy and an arrival occurs, the arrival has to leave the system without receiving any service. Arrivals occur according to a Poisson process, say with rate λ. Let n, the number of customers in the system, be the state of the queue. Since no arrival is allowed once there are c customers in the system, we can say the arrivals depend on n. We represent the arrival rate at state n by

λ_n = λ   for 0 ≤ n < c,
λ_n = 0   for n ≥ c.   (1.15)

³ "Customer" is a general term here; it is not necessarily an individual human.

The service time for each server is exponential. Let μ be the service rate in terms of completions per unit of time. The total service rate depends on the number of customers in the system, since the service rate for an idle server is zero. Therefore, we can write the service rate at state n as

μ_n = nμ   for 0 ≤ n ≤ c.   (1.16)

Figure 1.1: Rate transition diagram for an M/M/c/c queue

Figure 1.1 shows the schematic of an M/M/c/c queue. This queue is an example of a birth-death process. We analyze this system with a stochastic-balance procedure, which is a consequence of the Chapman-Kolmogorov equation. It basically says that the total flow into a state should be equal to the total flow out of the state if the system is in steady state. Let p_n be the probability of having n customers in the system. By stochastic balance, we have

λp_0 = μp_1                                            for n = 0,
λ_n p_n + μ_n p_n = λ_{n−1} p_{n−1} + μ_{n+1} p_{n+1}   for 1 ≤ n < c,
λ_{c−1} p_{c−1} = μ_c p_c                               for n = c.   (1.17)

With simple iteration, we can find

p_n = (1/n!)(λ/μ)^n p_0   for 0 ≤ n ≤ c.   (1.18)

Using ∑_{n=0}^{c} p_n = 1, we can obtain p_0 as follows:

p_0 = [ ∑_{n=0}^{c} (1/n!)(λ/μ)^n ]^{−1}.   (1.19)

Equations (1.18) and (1.19) give a well-defined closed-form solution for the M/M/c/c queue. p_n is time-homogeneous, since it is independent of time. However, this is not the case for our model. In chapter 2 we will study M(t)/M/c/c queues, which have the same characteristics as the queue reviewed here except for the fact that the arrival rate is time-dependent.
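Equations (1.18) and (1.19) are straightforward to evaluate numerically; the blocking probability p_c that they yield is the well-known Erlang loss (Erlang B) formula. A minimal sketch, with illustrative parameters (λ = 8, μ = 1, c = 10) that are not taken from the thesis:

```python
from math import factorial

def mmcc_distribution(lam, mu, c):
    """Stationary probabilities p_0..p_c of an M/M/c/c queue, from
    equations (1.18) and (1.19): p_n = (1/n!)(lam/mu)^n * p_0."""
    a = lam / mu                                  # offered load
    weights = [a**n / factorial(n) for n in range(c + 1)]
    p0 = 1.0 / sum(weights)                       # normalization, equation (1.19)
    return [w * p0 for w in weights]

# Illustrative parameters (not from the thesis): 10 beds, offered load 8.
p = mmcc_distribution(lam=8.0, mu=1.0, c=10)
blocking = p[-1]                                  # p_c: probability an arrival is lost
assert abs(sum(p) - 1.0) < 1e-12
print(f"blocking probability p_c = {blocking:.4f}")
```

For large c the factorials grow quickly; in that case the standard Erlang B recursion B_0 = 1, B_n = aB_{n−1}/(n + aB_{n−1}) computes the blocking probability without large intermediate terms.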
When the arrival rate is time-dependent, p_n is no longer homogeneous, so we define p_n(t) as the probability of being in state n at time t. To write a differential equation for solving p_n(t), we consider all the possible ways that the system can occupy state n at time t + Δt. The probability that an arrival occurs in a small interval of length Δt which begins with the system in state n is assumed to be λ_n(t)Δt + o(Δt). For the probability of a departure within Δt, with the same assumptions, we have μ_nΔt + o(Δt). Using the Chapman-Kolmogorov equation for continuous-time Markov chains as in (1.3), we can write

p_0(t + Δt) = p_0(t)[1 − λ_0(t)Δt] + p_1(t)[μ_1Δt] + o(Δt)   for n = 0,
p_n(t + Δt) = p_n(t)[1 − λ_n(t)Δt][1 − μ_nΔt] + p_{n+1}(t)[μ_{n+1}Δt] + p_{n−1}(t)[λ_{n−1}(t)Δt] + o(Δt)   for 1 ≤ n < c,
p_c(t + Δt) = p_c(t)[1 − μ_cΔt] + p_{c−1}(t)[λ_{c−1}(t)Δt] + o(Δt)   for n = c.   (1.20)

Letting Δt → 0, we obtain the following differential equations from (1.20) by simple steps:

dp_0(t)/dt = −λ_0(t)p_0(t) + μ_1 p_1(t)   for n = 0,
dp_n(t)/dt = −[λ_n(t) + μ_n] p_n(t) + μ_{n+1} p_{n+1}(t) + λ_{n−1}(t) p_{n−1}(t)   for 0 < n < c,
dp_c(t)/dt = −μ_c p_c(t) + λ_{c−1}(t) p_{c−1}(t)   for n = c.   (1.21)

If there were no time-dependence, then dp_n(t)/dt would be zero, and we would reach the same equations as in (1.17). However, that is not the case for our problem. We need to solve the differential equations in (1.21) to find the p_n(t). That will be the main purpose of chapter 2.

1.5 Literature on Application of Queueing Theory in Hospital Bed Allocation

Bed allocation is one of the famous problems in the literature of health management. Different methods have been used for modeling and solving this type of problem. In this section, we focus on the application of queueing theory in this area. Most of these queueing models discuss the solutions at equilibrium; the ones that obtain the transient solution use simulation. The nature of the bed allocation problem for most of the body of work reviewed here is different from our problem.
However, awareness of this body of work helps us approach our problem with a better vision. Considering each bed as a server is a common assumption in these works, and the arrival process is usually modeled by a variation of the Poisson process.

Kolesar in [Kol70] presents several queueing models for hospital admissions scheduling. The first two models he used were developed by John P. Young in [You65] and [You62]. The first system consists of c beds, and the service time is exponentially distributed. Two types of independent arrivals are considered: the scheduled arrivals form an L-phase Erlang process⁴, and the emergency admissions form a Poisson process. The state of the system is given by the number of occupied beds. It is assumed that whenever the system is full, new arrivals are lost, i.e., sent to another hospital; therefore, there is no queue for admission. He describes the system as a finite-state birth-and-death process. Young obtains the steady-state behavior of the system, which is independent of the shape parameter of the Erlang process. The other system considered by Young has similar characteristics except for the distribution of the scheduled arrivals, which are assumed to be a deterministic stream that depends on the number of occupied beds. He obtains the steady-state probabilities for this model too.

Kolesar, in continuation of Young's work, analyzed the problem with a Markovian decision model and linear programming. He considers four variables for each period of time: the number of beds occupied, non-scheduled arrivals, scheduled arrivals, and the number of discharged and transferred patients, the last being dependent on the number of occupied beds.

⁴ The Erlang distribution can be seen as a sum of L independent exponential variables. L is a positive integer, and it is called the shape parameter of the Erlang process.
He examines the behavior of the number of occupied beds using a Markov chain. He introduces a variety of optimization problems and models them with linear programming, such as maximization of average occupancy with an overflow constraint, and minimization of overflow with constraints on utilization.

Worthington in [Wor87] considers beds as a fixed number of servers in hospitals and the service time as the length of stay (LOS) in the hospital bed. He considers the M(λ_q)/G/c queue for this purpose, where λ_q is the rate of arrival when there are q patients in the line; λ_q is a linear and decreasing function of q. He approximates the average waiting time by a Normal distribution and discusses a sensitivity analysis for the model.

Green in [Gre06] discusses cases where queueing theory has been used in health-care administration. In [Gre03], she provides an example of using the M/M/c model to determine the capacity of different units (obstetrics and ICU) in a hospital for target occupancy levels. Green and Nguyen in [VN01] use the M/M/c model to estimate the effect of reducing the number of beds in units of a hospital on the probability of delay in service for patients. Kim et al. in [KHYB99] also consider an M/M/c model for analyzing the capacity of an ICU in a hospital. They consider four types of patients for the ICU, with arrivals for each class following a Poisson process; the aggregate of the arrivals from these four classes is considered the arrival process for the ICU. They use the queueing model to validate their simulation model. Green et al. in [GSGG06] use an M/M/c queueing model to estimate the number of servers needed during each staffing interval.

De Bruin et al. in [dBvRVK07] study the impact of fluctuations in arrivals on LOS. First, they consider an M/M/∞ queue, and as the outcome they obtain the number of beds needed to accommodate all arrivals.
Then they consider a limited-capacity M/M/c/c queue, where blocking patients and sending them to another hospital is allowed when there are already c patients in the unit. The main contribution of this work is a two-dimensional queueing model used to obtain the optimal bed allocation over the emergency care chain, including the coronary care unit (CCU), the normal care clinical ward (NC) and first cardiac aid (FCA). The number of beds in the FCA is assumed to be fixed. There are three types of transitions in the model: from the FCA to the CCU, from the CCU to the NC, and departure from the NC. It is assumed that CCU patients can stay only in CCU beds; however, NC patients can stay in NC or CCU beds. N1 and N2 are the numbers of beds at the CCU and the NC, respectively, and x and y are the numbers of CCU and NC patients. A graphical representation of the model is provided in Figure 1.2.

Figure 1.2: Two-dimensional queueing model for the emergency care chain [dBvRVK07]

They wrote a computer program that gives the number of refused admissions for given arrival rates, numbers of beds at the CCU and NC, and LOS at these stations. The optimal numbers of beds in the CCU and NC are obtained by running the program for different combinations of bed allocations for the CCU and NC and finding the combination with minimum average LOS (ALOS) and refused admissions.

Palvannan and Teow in [PT12] discuss the significance of queueing theory in bed allocation. They provide two case studies. First, they use the M/G/c/c queue for analyzing the patient flow in an endoscopy centre and obtain the wait time for patients. They state that the relationship between capacity and patient wait time is not linear: the number of additional recovery beds required increases much more when one needs a lower probability of loss.
Moreover, when demand increases, the required increase in capacity is lower for the same probability of loss. As the second case, they use the M/M/c queue for modeling a partitioning of the ED. They cluster patients into two groups, clean patients and infected ones, and divide the ED into one unit for each cluster. Partitioned in this way, the average length of stay (ALOS) in the ED increases in comparison to the unpartitioned ED, and they want to decide how many beds have to be added to the ED to restore the original ALOS.

Teow et al. in [TCT+09] construct a series of M/G/c/c queues with pure loss to compute the probability that a patient cannot find a bed after endoscopy.

Gorunescu et al. in [GMM02] model geriatric departments as a queue with c staffed beds. They consider extra unstaffed back-up beds as a waiting room; N is the maximum capacity of the system, hence N − c is the maximum number of patients in the waiting room. The extra beds are only used in case of crises. The arrivals are Poisson, and the service-time distribution is phase-type⁵. The goal of their work is to determine the effect of variations in arrival rates in this setting on the rejection probability (an arriving patient is rejected if the system is full), emptiness, and costs, in order to analyze the advantages of using such extra beds. Fomundam and Herrmann in [FH07] review the application of queueing theory in health care up to 2007; we have used some of their references related to bed allocation.

⁵ A phase-type distribution is a mixture of exponential distributions. A special case of the phase-type distribution is the Erlang distribution.

To sum up, in the literature on bed allocation in hospitals, beds are generally considered the servers. We assume the same in our model; hence, we have two levels of servers, with and without surge beds. Assuming exponentially distributed service times is reasonable for this type of problem as well, and it has the advantage of the Markovian property, so we assume the same. Moreover, it is a common assumption to model the arrivals as a Poisson process; hence, we assume the same for our model.
To distinguish our model and make it more flexible, we consider time-dependence in the rate of the arrival process. This way, the MDP model produces time-dependent bi-level policies, which, to the best of our knowledge, have not been studied in the literature.

1.6 Overview

The rest of this thesis is organized as follows. The problem described briefly in section 1.1 is detailed in chapter 3, and a complex MDP is provided for modeling the problem. As we said, this MDP model requires the transient probabilities of the M(t)/M/c/c queue, which are obtained in chapter 2.

We have tried to prepare chapter 2 in such a way that it can be read and used separately from the rest of the text. One of the novelties of this work is the modified Runge-Kutta method introduced in chapter 2. We believe this method provides more accurate results for time-dependent queues in general; however, the results are only provided for M(t)/M/c/c queues. In addition to the modified Runge-Kutta method, we provide an adaptation of the randomization method for the M(t)/M/c/c queue, based on Grassmann's work in [Gra77]. However, it turns out that the modified Runge-Kutta method works more effectively for time-dependent queues.

Chapter 3 provides the main results for our problem. Using the results of the MDP model, we provide a time-dependent bi-level control policy for this problem. The time-dependent nature of the control policies is another novelty of this work. We show that when the arrival rate function is periodic, the policies are periodic as well, and we provide numerical results to support these findings.

Chapter 4 presents an application of the MDP model developed in chapter 3. We consider the ED of KGH described in section 1.1, with arrivals due to epidemics in addition to the regular arrivals.
We consider seasonality in the epidemics, which is a useful contribution for diseases like influenza. Moreover, linear extrapolation is used to approximate the control policies for large-size problems, and the results are promising.

Chapter 5 provides a summary of the results and ideas for future work based on this thesis.

Chapter 2
Transient Solutions for M(t)/M/c/c Queues

Before defining a Markov decision process, we need to obtain the transient solutions of an M(t)/M/c/c queue. There is an analytic solution, involving a complex integral, for this system of differential equations; however, this solution is not numerically tractable, as we will see in section 2.4. In fact, it is better to apply numerical methods to solve equation (1.9). These methods are categorized into exact and approximate methods. The exact methods do not involve any approximation other than those necessary for solving the set of differential equations, and the error in these methods can be controlled by choosing the parameters of the method appropriately. Exact methods for our problem are Runge-Kutta and the randomization method. There are approximate methods as well, such as the fluid approximation and the diffusion approximation; however, we focus on exact methods in this thesis and adapt them to our problem. Simulation is another technique for obtaining transient solutions, but it appears that simulation is not competitive for this problem: the exact methods provide better results in the same amount of time.

In this chapter, after exploring time-dependent queues in the literature, we start with the fourth-order Runge-Kutta method and modify it to get a more accurate answer from the model. Then we explore the randomization method and attempt to extend it to an M(t)/M/c/c queue with time-varying arrival rates.
At the end, based on the comparison made, we make a decision as to which method is more suitable for our problem.

2.1 Time-dependent Arrival Queues in the Literature

In practical queueing situations, arrival rates may depend on time, λ(t). For instance, at an intersection the arrival rate is higher during rush hour; in an ED, the arrival rate is higher on specific days; and at an Internet service provider (ISP), data package request arrivals are higher during specific hours of the day. A non-stationary queueing system is a system where the parameters of the arrival and service processes may be time-dependent. In this section, we review approximate and exact methods described in the literature for non-stationary queues, including methods using time-dependent arrival and service rates.

The fluid approximation [New82] is one of the simplest approaches for time-dependent queues. This method uses the actual data to approximate deterministic arrival and service rates for a deterministic queue. Let A(t) be the cumulative number of customers entering the system by time t, and D(t) be the cumulative number of customers leaving the queue by time t. The fluid approximation treats customers as a continuous fluid that is flowing into a tank at some rate dA(t)/dt, and flowing out at rate dD(t)/dt. When there is uncertainty in the system, the fluid method approximates A(t) and D(t) by their expected values,

\[
A(t) \approx E[A(t)], \qquad D(t) \approx E[D(t)],
\]

where E[·] is the expected value of the variable in question. The fluid approximation only requires the first moments of the arrival and service processes. The moments of a random variable can be obtained from the moment generating function. The moment generating function of a random variable X is M_X(t) := E[e^{tX}], t ∈ R, wherever the expectation exists. The expansion of e^{tX} can be used to find all the moments of the distribution as follows:

\[
M_X(t) = E[e^{tX}] = 1 + tE[X] + \frac{t^2 E[X^2]}{2!} + \frac{t^3 E[X^3]}{3!} + \cdots + \frac{t^n E[X^n]}{n!} + \cdots
= 1 + tM_1 + \frac{t^2 M_2}{2!} + \frac{t^3 M_3}{3!} + \cdots + \frac{t^n M_n}{n!} + \cdots,
\]

where M_n is the n-th moment.

Another method for solving time-dependent queues is the diffusion approximation. This method was initially introduced in physics: a diffusion equation is a partial differential equation that describes processes showing diffusive behavior, e.g., heat conduction, electric fields, etc., and physicists developed the diffusion approximation for solving these differential equations. Later, Newell used this approximation for non-stationary queues. If we consider a two-dimensional stochastic process for the joint distribution of the arrival and departure processes, then the diffusion equations can help us determine the properties of such a queue [New82]. If the mean and variance of A(t) and D(t) depend only on the queue length and time, then we can use the diffusion equation. The diffusion approximation requires the first and second moments of the arrival and service processes.

Newell uses the diffusion approximation to study time-dependent arrival queues in his triplet of papers published in 1968. In [New68a], he studies an M(t)/M/1 queue where the arrival rate is strongly time-dependent; it is assumed that arrivals increase at a constant rate, dλ(t)/dt = a, where a is positive. He assumes that the queue is long enough that the discrete nature of the arrivals can always be ignored. In other words, a time scale τ can be found such that during a time τ the change in the length of the queue is small while the number of arrivals and departures is sufficiently large. He analyzes the length of the queue, X(t), in under- and over-saturated conditions (saturation: λ(t) = µ) as well as in the transition between them, and provides a numerical calculation example. In [New68b], he considers a more generic case by assuming that the arrival rate, λ(t), increases to a maximum that is larger than the rate of service, µ, and subsequently decreases again.
He investigates the distribution function of the maximum queue and the evolution of the queue distribution after λ(t) drops below µ for long enough that the queue length reaches zero. In [New68c], he assumes the arrival rate has the form λ(t) = λ(0) − βt², where β is a positive constant, and he re-examines the diffusion method under this new condition. Keller in [Kel82] provides a contribution to Newell's work by assuming that the service rate changes over time as well. He considers an M(t)/M(t)/1 queue where the arrival and service rates change slowly over time, λ(εt) and µ(εt); here ε, the ratio of an average inter-arrival time to the time over which the rates change appreciably, is small. He analyzes the queue using a diffusion equation in five different time periods, consisting of the initial period; the period of light traffic, where λ/µ < 1; the saturation transition period (passing through λ/µ = 1); the over-saturated period, where λ/µ > 1; and the transition period at the end of the over-saturation, where the queue length drops down to the light-traffic condition. Duda in [Dud86] considers a GI(t)/G(t)/1 queue and uses the diffusion approximation to estimate the average number of customers in the system in different time-period situations. Yang and Knessl in [YK97] study an M(t)/G/1 queue where the arrival rate is weakly dependent on time, λ = λ(εt), where ε is a small parameter. They use the diffusion approximation as well and relate the various time scales by asymptotic matching. In separate work, Knessl in [Kne00] provides an asymptotic solution using diffusion processes for a general form of partial differential equations that arises from queues with time-dependent arrival rates. In continuation of his work, Knessl and Yang in [KY02] provide a solution for the probabilities p_n(t) of having n customers in the system at time t for an M(t)/M(t)/1 queue using the diffusion approximation.
They consider an M(t)/M(t)/c/c queue in [KY06], where λ(t) and µ(t) are smooth functions of time. They study the distribution p_n(t), the probability of having n (0 ≤ n ≤ c) servers occupied at time t, asymptotically, using singular perturbation, an asymptotic expansion technique.

The study of the effect of variation in arrival rates on performance measures in M(t)/M/c queues shows that stationary models under-estimate delays even when the arrival rate is only slightly varying [GKS91]. The pointwise stationary approximation (PSA) was introduced by Green and Kolesar [GK91] as an easy-to-compute approximation for determining long-run average performance measures, such as expected delay and probability of delay, for M(t)/M/c queues with periodic arrival rates. The idea is to segment the time interval and use a series of stationary models based on the average arrival rates. The solution of this method is a tight upper bound on the analytic value. Whitt in [Whi91] uses the PSA for the same purpose in M(t)/M(t)/c queues, and he shows that it works properly. Jennings et al. in [JMMW96] introduce a G(t)/G/c(t) queueing model where the number of servers is a function of time, which needs to be selected to meet workload requirements. They develop an approximation procedure and compare the results of the special case M(t)/M/c(t) to the exact solution for validation.

There are more analytic approaches for dealing with time-dependent queues as well; however, they have not succeeded in obtaining transient solutions and mostly address the asymptotic behavior of a time-dependent queue. Heyman and Whitt in [HW84] discuss the asymptotic behavior of an M(t)/G/c queue; they consider an embedded Markov chain with a periodic arrival process. Massey in [Mas85] uses operator analytic techniques to develop the non-stationary Markovian queueing theory for M(t)/M(t)/1. He assumes uniform acceleration for the queue length.
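The PSA idea described above can be sketched concretely. Applying it to the loss system studied in this thesis means evaluating a stationary formula (here the Erlang loss probability) at the arrival rate in effect at time t. This transfer to a loss system, the sinusoidal rate, and all parameter values below are our own illustration, not taken from [GK91]:

```python
import math

def erlang_b(offered_load, c):
    """Stationary blocking probability of an M/M/c/c queue (Erlang B recursion)."""
    b = 1.0
    for n in range(1, c + 1):
        b = offered_load * b / (n + offered_load * b)
    return b

def psa_blocking(lam_of_t, mu, c, t):
    """Pointwise stationary approximation: plug lambda(t) into the stationary formula."""
    return erlang_b(lam_of_t(t) / mu, c)

lam = lambda t: 10.0 + 5.0 * math.sin(t)      # illustrative periodic arrival rate
b_peak = psa_blocking(lam, mu=1.0, c=12, t=math.pi / 2)      # instantaneous rate 15
b_off = psa_blocking(lam, mu=1.0, c=12, t=3 * math.pi / 2)   # instantaneous rate 5
```

As expected, the approximation predicts much heavier blocking at the peak of the cycle than at its trough, since the stationary Erlang B value is increasing in the offered load.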
Mandelbaum and Massey in [MM95] derive period-dependent, pathwise asymptotic expansions for the M(t)/M(t)/1 queue length.

There are other methods in the literature for queues with time-dependent arrivals. For example, Fan assumes an M(t)/M/1 queue and introduces an algorithm for the simulation of this model; he shows that his algorithm is faster than Monte Carlo simulation methods [Fan76]. Worthington and Wall in [WW99] describe and review a discrete time modeling (DTM) approach for dealing with time-dependent queues. They conclude that DTM approaches often have dimensionality problems, but these can be handled by appropriate approximations, leading to feasible algorithms with high-quality results. Ingolfsson in [Ing02] models an M(t)/M/c(t) queue under a non-preemptive discipline. He uses mixed discrete-continuous time Markov chains (MDCTMC) to model the problem. An MDCTMC is a continuous-time stochastic process that behaves like a discrete-time Markov chain at some epochs t1, t2, . . . , and behaves like a continuous-time Markov chain between those epochs. He shows how to compute the waiting-time distribution for the model. Rothkopf and Oren in [RO79] study an M(t)/M/c queue with a non-stationary arrival rate using an approximation method for solving differential equations called the closure method.

We found few works that compare the performance of the different methods for solving time-dependent queues. Kivestu in [Kiv76] studies the M(t)/G/1 queue and compares the results of five different methods of approximation for solving this problem: simulation, fluid approximation, equilibrium analysis, diffusion approximation, and Koopman's model. Ingolfsson et al. in [IABL07] compare seven different methods for computing the service level for an M(t)/M/c(t) queue under a non-preemptive discipline. They show that the closure approximation method has a modest computation time and acceptable accuracy in comparison to the other methods.
Ingolfsson et al. in [ICWC10] model a scheduling system by an M(t)/M/c(t) queue under a preemptive service discipline. They provide an integer programming model to minimize the cost of scheduling such that the waiting time for customers is less than some specified threshold, τ, and they introduce a heuristic algorithm for solving the integer programming model.

As for the exact methods, Green et al. in [GKS91] and Odoni and Roth in [OR83] use the Runge-Kutta method for time-dependent single-facility queues. Randomization is used frequently for time-homogeneous queues; however, we found only one paper [IABL07] using randomization for non-stationary queues.

We want to obtain the transient solution of M(t)/M/c/c queues by exact algorithms. To provide an improved method for solving time-dependent queues, we work with two exact methods. First, in section 2.3 we modify the Runge-Kutta method, a well-known, reliable and accurate numerical method, for our problem. Second, in section 2.4 we adapt randomization, an accurate method that has not been used directly for time-dependent queues.

2.2 Problem Description

Consider the state of an M(t)/M/c/c queue given by the number of patients in the system, j = 0, 1, 2, . . . , c, and let p_j(t) be the probability of being in state j at time t. We want to calculate p_j(t) within the interval (0, T). Let P(t) be the row vector of the p_j(t), that is, P(t) = [p_0(t), p_1(t), p_2(t), . . . , p_c(t)]. We define the initial conditions P(0) = Q, where Q = [q_0, q_1, q_2, . . . , q_c] is given. Let A(t) = [a_{ij}(t)] be the transition rate matrix of an M(t)/M/c/c queue; A(t) is a square matrix of size c + 1, which depends on t through the arrival rate. Also we put

\[
a_{ii}(t) = -\sum_{j \ne i} a_{ij}(t).
\]

Therefore A(t) can be written as

\[
A(t) = \begin{pmatrix}
-\lambda_t & \lambda_t & 0 & \cdots & \cdots & 0\\
\mu & -(\mu+\lambda_t) & \lambda_t & 0 & \cdots & 0\\
0 & 2\mu & -(2\mu+\lambda_t) & \lambda_t & \ddots & \vdots\\
\vdots & 0 & \ddots & \ddots & \ddots & 0\\
\vdots & \ddots & 0 & (c-1)\mu & -[(c-1)\mu+\lambda_t] & \lambda_t\\
0 & \cdots & \cdots & \cdots & c\mu & -c\mu
\end{pmatrix}. \tag{2.1}
\]

We want to compute the p_j(t) from the following differential equations, as we saw in section 1.2, with the initial conditions given above:

\[
p_i'(t) = \sum_{j=1}^{c+1} p_j(t)\,a_{ji}(t), \qquad i = 1, 2, \ldots, c+1, \tag{2.2}
\]

or

\[
P'(t) = P(t)A(t). \tag{2.3}
\]

For the cost function, we also need the accumulated cost over time, which requires integrals of the probabilities over time. More precisely, we need to calculate the following integral:

\[
\int_{t_n}^{t_{n+1}} p_{ij}(t)\,dt, \tag{2.4}
\]

where p_{ij}(t) is the probability of going from state i to state j during time t, and t_n and t_{n+1} are the times at stages n and n + 1, respectively. For this purpose we need the transient solutions.

2.3 The Modified Fourth Order Runge-Kutta Method

For solving equation (2.3), one can use the Runge-Kutta method. The most common Runge-Kutta method is the fourth-order Runge-Kutta (RK4). Consider a differential equation y' = g(t, y), y(t_n) = y_n; it is worth noting that y is a function of t. Given y_n and the step size Δt, RK4 computes four quantities, κ_1, κ_2, κ_3 and κ_4, to find y_{n+1} as follows [Dav92]:

\[
\begin{aligned}
\kappa_1 &= g(t_n, y_n)\,\Delta t, &\text{(2.5)}\\
\kappa_2 &= g\!\left(t_n + \tfrac{\Delta t}{2},\; y_n + \tfrac{\kappa_1}{2}\right)\Delta t, &\text{(2.6)}\\
\kappa_3 &= g\!\left(t_n + \tfrac{\Delta t}{2},\; y_n + \tfrac{\kappa_2}{2}\right)\Delta t, &\text{(2.7)}\\
\kappa_4 &= g\!\left(t_n + \Delta t,\; y_n + \kappa_3\right)\Delta t, &\text{(2.8)}\\
y_{n+1} &= y_n + \tfrac{1}{6}\left(\kappa_1 + 2\kappa_2 + 2\kappa_3 + \kappa_4\right). &\text{(2.9)}
\end{aligned}
\]

One can consider P(t)A(t) as g(t, P) and simply apply RK4 directly to this function. To reduce round-off errors in the RK4 method, we use uniformization by letting B(t) = A(t)/f + I, where I is the identity matrix of the same size as A(t) and f is an arbitrary number such that f ≥ |a_{ii}(t)| for all i and all t ≥ 0. We can simply set f = max_{i,t}{−a_{ii}(t)} = λ_max + cµ, where λ_max = max_t λ_t. In this way, all elements of B(t) are non-negative. Substituting f[B(t) − I] for A(t) in the RK4 formulas, we have:

\[
\begin{aligned}
\kappa_1 &= f\,P(t_n)\left[B(t_n) - I\right]\Delta t, &\text{(2.10)}\\
\kappa_2 &= f\left[P(t_n) + \tfrac{\kappa_1}{2}\right]\left[B\!\left(t_n + \tfrac{\Delta t}{2}\right) - I\right]\Delta t, &\text{(2.11)}\\
\kappa_3 &= f\left[P(t_n) + \tfrac{\kappa_2}{2}\right]\left[B\!\left(t_n + \tfrac{\Delta t}{2}\right) - I\right]\Delta t, &\text{(2.12)}\\
\kappa_4 &= f\left[P(t_n) + \kappa_3\right]\left[B\!\left(t_n + \Delta t\right) - I\right]\Delta t, &\text{(2.13)}\\
P(t_{n+1}) &= P(t_n) + \tfrac{1}{6}\left(\kappa_1 + 2\kappa_2 + 2\kappa_3 + \kappa_4\right). &\text{(2.14)}
\end{aligned}
\]

We have compared the modified RK4 with other available methods, and it has been found that the modified RK4 is more accurate for the same amount of running time. For the comparison between the RK4 and the modified RK4, we use the numerical results in the paper of Knessl and Yang [KY06]. Consider an M(t)/M/c/c queue with a periodic time-dependent arrival rate as follows:

\[
\lambda_t = c\left(\alpha + a\sin(\omega_0 t + \phi)\right), \tag{2.15}
\]

where c = 100 is the number of servers. They choose α = 1.2, a = 0.5, ω_0 = 0.1 and φ = −2.0. The initial state probability vector at t = 0 is P(0) = (1, 0, 0, . . . , 0), and the rate of service for each server is µ = 1.

The blocking probability, b(t), is the probability that a customer shows up at time t and leaves the system without receiving any service. In M/M/c/c queues, b(t) equals the probability of being in state c at time t, p_c(t).

Table 2.1: The blocking probability, b(t), with the modified RK4 for c = 100 and 2 ≤ t ≤ 16

t  | Exact b(t)  | RK4         | Abs. Error  | Modified RK4 | Abs. Error
2  | 3.59999E-06 | 3.64584E-06 | 4.58519E-08 | 3.60006E-06  | 6.71181E-11
3  | 5.34454E-05 | 5.41670E-05 | 7.21582E-07 | 5.34469E-05  | 1.50448E-09
4  | 1.02495E-04 | 1.03897E-04 | 1.40206E-06 | 1.02494E-04  | 6.58520E-10
5  | 1.25536E-04 | 1.27258E-04 | 1.72210E-06 | 1.25531E-04  | 4.70825E-09
6  | 1.51806E-04 | 1.53895E-04 | 2.08851E-06 | 1.51803E-04  | 3.45188E-09
7  | 2.07856E-04 | 2.10723E-04 | 2.86717E-06 | 2.07857E-04  | 8.67902E-10
8  | 3.32638E-04 | 3.37230E-04 | 4.59245E-06 | 3.32642E-04  | 4.31086E-09
9  | 6.13348E-04 | 6.21818E-04 | 8.47005E-06 | 6.13357E-04  | 9.32078E-09
10 | 1.25605E-03 | 1.27337E-03 | 1.73194E-05 | 1.25604E-03  | 7.13503E-09
11 | 2.72464E-03 | 2.76232E-03 | 3.76813E-05 | 2.72473E-03  | 9.45772E-08
12 | 5.95564E-03 | 6.03863E-03 | 8.29913E-05 | 5.95646E-03  | 8.23840E-07
13 | 1.25268E-02 | 1.26997E-02 | 1.72893E-04 | 1.25269E-02  | 8.88180E-08
14 | 2.44133E-02 | 2.47498E-02 | 3.36457E-04 | 2.44130E-02  | 3.12181E-07
15 | 4.30498E-02 | 4.36445E-02 | 5.94709E-04 | 4.30506E-02  | 8.39452E-07
16 | 6.81984E-02 | 6.91393E-02 | 9.40900E-04 | 6.81985E-02  | 1.23130E-07
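The update (2.10)–(2.14) is easy to implement. The sketch below is our own minimal version for a small M(t)/M/c/c instance; the rate function, parameter values, and function names are illustrative choices, not those of the Knessl–Yang comparison above:

```python
import math

def b_matrix(lam, mu, c, f):
    """B(t) = A(t)/f + I for the M(t)/M/c/c queue at arrival rate lam."""
    B = [[0.0] * (c + 1) for _ in range(c + 1)]
    for n in range(c + 1):
        out_rate = (lam if n < c else 0.0) + n * mu
        B[n][n] = 1.0 - out_rate / f           # uniformized diagonal
        if n < c:
            B[n][n + 1] = lam / f              # arrival
        if n > 0:
            B[n][n - 1] = n * mu / f           # departure
    return B

def rk4_step(P, lam_of_t, mu, c, f, t, dt):
    """One modified RK4 step, eqs (2.10)-(2.14): kappa = f*(P + ...)(B - I)*dt."""
    def rhs(vec, time):
        B = b_matrix(lam_of_t(time), mu, c, f)
        return [f * dt * (sum(vec[i] * B[i][j] for i in range(c + 1)) - vec[j])
                for j in range(c + 1)]
    k1 = rhs(P, t)
    k2 = rhs([P[i] + k1[i] / 2 for i in range(c + 1)], t + dt / 2)
    k3 = rhs([P[i] + k2[i] / 2 for i in range(c + 1)], t + dt / 2)
    k4 = rhs([P[i] + k3[i] for i in range(c + 1)], t + dt)
    return [P[i] + (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6 for i in range(c + 1)]

# Small illustrative instance: c = 3 servers, sinusoidal arrivals, start empty.
c, mu = 3, 1.0
lam = lambda t: 2.0 + math.sin(0.5 * t)
f = 3.0 + c * mu                     # f >= lambda_max + c*mu
P, t, dt = [1.0] + [0.0] * c, 0.0, 0.01
while t < 10.0 - 1e-9:
    P = rk4_step(P, lam, mu, c, f, t, dt)
    t += dt
```

Because the rows of A(t) sum to zero, the step preserves the total probability up to round-off, which provides a quick sanity check on any implementation.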
For times in the range [2, 16], b(t) is given in Table 2.1 for the different methods⁶. The first column represents the time in seconds. The second column represents the best solution as claimed in Knessl and Yang's paper; they did not mention the actual method used to obtain the values in this column, but we believe they used a Runge-Kutta method with a very small step size, since there is no exact closed-form solution for b(t). The third and fifth columns are the results of the RK4 and the modified RK4 methods, respectively, with the same step size h = 0.005. The modified RK4 gives more accurate results in comparison to the RK4: the order of the error for the modified RK4 is significantly smaller than for the RK4.

⁶ We used MATLAB® for coding the RK4 and modified RK4. The CPU of the computer used for running the codes was an Intel® Core 2 Duo T9600.

We find similar results to Table 2.1 for longer runs. Table 2.2 shows the blocking probabilities for times in the range 14.8 ≤ t ≤ 53. Therefore, we use the modified RK4 for our analysis, since it showed better performance when compared to the RK4.

Table 2.2: The blocking probability, b(t), with the modified RK4 for c = 100 and 14.8 ≤ t ≤ 53

t    | Exact b(t)  | RK4         | Abs. Error  | Modified RK4 | Abs. Error
14.8 | 3.87645E-02 | 3.93000E-02 | 5.35500E-04 | 3.87650E-02  | 4.57137E-07
15   | 4.30498E-02 | 4.36450E-02 | 5.95200E-04 | 4.30506E-02  | 8.39452E-07
15.2 | 4.76027E-02 | 4.82600E-02 | 6.57300E-04 | 4.76030E-02  | 3.24357E-07
15.4 | 5.24109E-02 | 5.31330E-02 | 7.22100E-04 | 5.24101E-02  | 8.17373E-07
15.6 | 5.74576E-02 | 5.82490E-02 | 7.91400E-04 | 5.74569E-02  | 7.00288E-07
15.8 | 6.27252E-02 | 6.35910E-02 | 8.65800E-04 | 6.27261E-02  | 9.29096E-07
16   | 6.81984E-02 | 6.91390E-02 | 9.40600E-04 | 6.81985E-02  | 1.23130E-07
16.2 | 7.38551E-02 | 7.48720E-02 | 1.01690E-03 | 7.38535E-02  | 1.59216E-06
16.4 | 7.96697E-02 | 8.07690E-02 | 1.09930E-03 | 7.96698E-02  | 6.60809E-08
16.6 | 8.56253E-02 | 8.68070E-02 | 1.18170E-03 | 8.56258E-02  | 4.94799E-07
16.8 | 9.17000E-02 | 9.29650E-02 | 1.26500E-03 | 9.17004E-02  | 4.06798E-07
17   | 9.78735E-02 | 9.92230E-02 | 1.34950E-03 | 9.78732E-02  | 3.45577E-07
17.2 | 1.04125E-01 | 1.05561E-01 | 1.43600E-03 | 1.04125E-01  | 3.38707E-07
17.4 | 1.10437E-01 | 1.11960E-01 | 1.52300E-03 | 1.10437E-01  | 1.42826E-07
17.6 | 1.16793E-01 | 1.18404E-01 | 1.61100E-03 | 1.16793E-01  | 1.19425E-07
17.8 | 1.23178E-01 | 1.24878E-01 | 1.70000E-03 | 1.23178E-01  | 3.31078E-07
18   | 1.29579E-01 | 1.31366E-01 | 1.78700E-03 | 1.29579E-01  | 1.29812E-07
18.2 | 1.35982E-01 | 1.37858E-01 | 1.87600E-03 | 1.35983E-01  | 5.47240E-07
18.4 | 1.42378E-01 | 1.44343E-01 | 1.96500E-03 | 1.42379E-01  | 5.07500E-07
18.6 | 1.48757E-01 | 1.50809E-01 | 2.05200E-03 | 1.48757E-01  | 1.13528E-07
18.8 | 1.55110E-01 | 1.57250E-01 | 2.14000E-03 | 1.55110E-01  | 1.79615E-07
19   | 9.78735E-02 | 9.92230E-02 | 1.34950E-03 | 9.78732E-02  | 3.45577E-07
20   | 1.92312E-01 | 1.94965E-01 | 2.65300E-03 | 1.92312E-01  | 4.04803E-07
23   | 2.74270E-01 | 2.78054E-01 | 3.78400E-03 | 2.74271E-01  | 6.68177E-07
26   | 3.37081E-01 | 3.41731E-01 | 4.65000E-03 | 3.37081E-01  | 3.07410E-07
29   | 3.81090E-01 | 3.86347E-01 | 5.25700E-03 | 3.81090E-01  | 1.12472E-08
32   | 4.07990E-01 | 4.13618E-01 | 5.62800E-03 | 4.07990E-01  | 4.96976E-07
35   | 4.19203E-01 | 4.24986E-01 | 5.78300E-03 | 4.19203E-01  | 1.79639E-07
38   | 4.15407E-01 | 4.21138E-01 | 5.73100E-03 | 4.15407E-01  | 2.72002E-07
41   | 3.96428E-01 | 4.01897E-01 | 5.46900E-03 | 3.96429E-01  | 5.25599E-07
44   | 3.61326E-01 | 3.66310E-01 | 4.98400E-03 | 3.61326E-01  | 4.05741E-08
47   | 3.08773E-01 | 3.13033E-01 | 4.26000E-03 | 3.08774E-01  | 6.57755E-07
50   | 2.38302E-01 | 2.41590E-01 | 3.28800E-03 | 2.38302E-01  | 4.06168E-07
53   | 1.53899E-01 | 1.56022E-01 | 2.12300E-03 | 1.53899E-01  | 2.45400E-08

2.4 The Randomization Method

The randomization method was introduced by Grassmann for time-homogeneous Markovian queues [Gra77]. It is used frequently and is proposed as an efficient exact method for the transient analysis of stationary queues [Ros95]. It has been used for time-dependent queues as well, although not as a direct adaptation: Ingolfsson et al. [IABL07] approximate the arrival rate λ_t with a piecewise constant function λ̂(t).
They basically break downthe time horizon into small intervals and assume that those intervals aretime-homogeneous. In this section, we attempt to adapt randomization forqueues with time-dependent arrivals using different approaches and discusstheir performance accordingly.The solution of equation (2.3) can be written asP (t) = P (0) exp[∫ t0A(z)dz]= Q exp[∫ t0A(z)dz]. (2.16)The negative diagonal entries of A(t) (or of the integral A(z)) causesignificant round-off errors when we expand equation (2.16) by Maclaurinseries [Gra77]. To avoid this problem, the randomization method suggeststo use uniformization by letting B(t) = A(t)/f + I, similar to what we sawin section 2.3. In this way, all the elements of B(t) are positive. SubstitutingB(t) for A(t) we haveP (t) = Q exp[f∫ t0[B(z)− I] dz]= Q exp (−ft) exp[f∫ t0B(z)dz]= Q exp (−ft) exp[ftBˆ(t)](2.17)where Bˆ(t) =1t∫ t0B(z)dz. Now by expanding exp[ftBˆ(t)]with Maclaurinseries we have252.4. The Randomization MethodP (t) = Q∞∑n=0Bˆn(t)(ft)nn!exp (−ft) . (2.18)Due to the complexity of calculating P (t) by equation (2.18), especiallyas a result of time-dependence in the elements of matrix Bˆ(t), one canrewrite Bˆ(t) asBˆ(t) = B0 + λˆ(t)B1 (2.19)where λˆ(t) =1tt∫0λ(z)dz. In this case, B0 ≥ 0 while B1 < 0 which is notdesirable in accordance with the error analysis we mentioned before. Wemay consider the same procedure as for matrix A and replace matrix B1by C1 = B1 + I. However, since B1 should be multiplied by λˆ(t), it makesexpansion of the power element in equation (2.18) even more complicated.Therefore, we do not follow this path.Instead, we will try to break down matrix A in a different way. Let us goback to equation (2.16). Instead of replacing A with equation (??), considerthe following definition for B:B(t) =1f{[A(t) + [λt + cµ] I} . (2.20)Using equation (2.20), equation (2.16) can be rewritten as follows:P (t) = Q exp(−cµt) exp[−∫ t0λ(z)dz]exp[f∫ t0B(z)dz]. 
(2.21)

Now, by Maclaurin expansion of \exp\left[f\int_0^t B(z)\,dz\right], we have

P(t) = Q \sum_{n=0}^{\infty} \hat{B}^n(t)\frac{(ft)^n}{n!} \exp(-c\mu t)\exp\left[-t\hat{\lambda}(t)\right]   (2.22)

where \hat{B}(t) = \frac{1}{t}\int_0^t B(z)\,dz and \hat{\lambda}(t) = \frac{1}{t}\int_0^t \lambda(z)\,dz. The matrix \hat{B}(t) is

\hat{B}(t) = \frac{1}{f}
\begin{bmatrix}
c\mu & \hat{\lambda}(t) & 0 & \cdots & \cdots & 0 \\
\mu & (c-1)\mu & \hat{\lambda}(t) & 0 & \cdots & 0 \\
0 & 2\mu & (c-2)\mu & \hat{\lambda}(t) & \cdots & \vdots \\
\vdots & 0 & \ddots & \ddots & \ddots & 0 \\
\vdots & & \ddots & (c-1)\mu & \mu & \hat{\lambda}(t) \\
0 & \cdots & \cdots & \cdots & c\mu & \hat{\lambda}(t)
\end{bmatrix}.   (2.23)

\hat{B}(t) can easily be broken down into \hat{B}(t) = B_0 + \hat{\lambda}(t)B_1, where

B_0 = \frac{\mu}{f}
\begin{bmatrix}
c & 0 & 0 & \cdots & 0 & 0 \\
1 & c-1 & 0 & \cdots & 0 & 0 \\
0 & 2 & c-2 & 0 & \cdots & \vdots \\
\vdots & & \ddots & \ddots & & \vdots \\
\vdots & & & c-1 & 1 & 0 \\
0 & \cdots & & & c & 0
\end{bmatrix},   (2.24)

and

B_1 = \frac{1}{f}
\begin{bmatrix}
0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 & \cdots & 0 & 0 \\
0 & 0 & 0 & 1 & \cdots & \vdots \\
\vdots & & \ddots & \ddots & & \vdots \\
\vdots & & & 0 & 0 & 1 \\
0 & \cdots & & & 0 & 1
\end{bmatrix}.   (2.25)

From (2.24) and (2.25), it is obvious that B_0 and B_1 have no negative entries. Therefore, we will not have negative elements in the coefficients after replacing \hat{B}(t) by B_0 + \hat{\lambda}(t)B_1 in equation (2.22):

P(t) = Q \sum_{n=0}^{\infty}\left[B_0 + \hat{\lambda}(t)B_1\right]^n \frac{(ft)^n}{n!}\exp(-c\mu t)\exp\left[-t\hat{\lambda}(t)\right].   (2.26)

Table 2.3: The blocking probability, b(t), with 5-term randomization for c = 100 and 2 ≤ t ≤ 16

t   Exact b(t)  RK4        Abs. Error  Mod. RK4   Abs. Error  Rand.      Abs. Error
2   3.600E-06   3.646E-06  4.585E-08   3.600E-06  6.712E-11   3.599E-06  6.440E-10
3   5.345E-05   5.417E-05  7.216E-07   5.345E-05  1.504E-09   5.344E-05  5.257E-09
4   1.025E-04   1.039E-04  1.402E-06   1.025E-04  6.585E-10   1.025E-04  8.168E-09
5   1.255E-04   1.273E-04  1.722E-06   1.255E-04  4.708E-09   1.255E-04  7.578E-09
6   1.518E-04   1.539E-04  2.089E-06   1.518E-04  3.452E-09   1.518E-04  6.132E-10
7   2.079E-04   2.107E-04  2.867E-06   2.079E-04  8.679E-10   2.079E-04  1.610E-08
8   3.326E-04   3.372E-04  4.592E-06   3.326E-04  4.311E-09   3.327E-04  4.197E-08
9   6.133E-04   6.218E-04  8.470E-06   6.134E-04  9.321E-09   6.134E-04  9.732E-08
10  1.256E-03   1.273E-03  1.732E-05   1.256E-03  7.135E-09   1.256E-03  1.962E-07
11  2.725E-03   2.762E-03  3.768E-05   2.725E-03  9.458E-08   2.725E-03  5.511E-07
12  5.956E-03   6.039E-03  8.299E-05   5.956E-03  8.238E-07   5.957E-03  1.782E-06
13  1.253E-02   1.270E-02  1.729E-04   1.253E-02  8.882E-08   1.253E-02  1.885E-06
14  2.441E-02   2.475E-02  3.365E-04   2.441E-02  3.122E-07   2.442E-02  2.570E-06
15  4.305E-02   4.364E-02  5.947E-04   4.305E-02  8.395E-07   4.305E-02  4.664E-06
16  6.820E-02   6.914E-02  9.409E-04   6.820E-02  1.231E-07   6.820E-02  4.209E-06

This transformation solves the round-off error problem, since B_0 + \hat{\lambda}(t)B_1 ≥ 0. However, we lose the main expected property of the randomization method: memorylessness. When the process is time-homogeneous, \hat{B}^n(t) in equation (2.18) can be interpreted as a Markov process embedded in a Poisson process with rate f. In that process, the probability of having n events during a time interval t is (ft)^n e^{-ft}/n!, and the probability vector of being in state i of the discrete Markov process after n steps is Q\hat{B}^n(t). Hence, the whole expression in equation (2.18) provides the probability vector of having n Poisson events and being in state i after those events. However, we cannot draw the same conclusion here, since our parameters are time-dependent: \hat{B}(t) is not a well-defined transition probability matrix because t is unknown.
In this sense, \hat{B}^n(t) in equation (2.18) cannot be seen as a Markov process.

We can calculate P(t) numerically in the same way as we did for the modified RK4. In other words, we first break down the time range (0, T] into small intervals, say (0, t_1), [t_1, t_2), …, [t_{n-1}, t_n = T]. We then obtain P(t) on each interval, taking P(t_i) as the initial condition for calculating P(t) on the next interval [t_i, t_{i+1}). Since the transition probabilities are not well-defined, this assumption can be used only when the intervals are very small. The value of \hat{B}(t) for a given value of t is calculated and plugged into equation (2.18).

For comparative purposes, we apply the randomization method to the same numerical example as in section 2.3. We use only 5 terms of the Taylor expansion in equation (2.18) and consider the same step size as for the RK4 methods. The results are provided in Table 2.3. The 5-term randomization method shows more accurate results than the regular RK4, and it is comparable to the modified RK4 method, though the modified RK4 still provides a better approximation. By considering more terms in equation (2.18), the randomization approximation becomes more accurate; however, obtaining the results takes longer. With the current settings (step size h = 0.005 and a 5-term expansion), obtaining P(t) for 0 < t ≤ 16 takes 12.50 seconds with the randomization method. The running time is much shorter for the modified RK4: 1.071 seconds. Although the elapsed time does not look considerable for this example, it may become significant for larger problems. That said, the step size of the randomization method does not need to be as small as that of the RK4 when we consider more terms in the expansion. For instance, let the step size be h = 0.01 and expand 15 terms in equation (2.18). The elapsed time for calculating P(t) for 0 < t ≤ 16 is then 1.85 seconds, significantly smaller than with the previous setting. The results are provided in Table 2.4.

Table 2.4: The blocking probabilities, b(t), with 15-term randomization for c = 100 and 2 ≤ t ≤ 16

t   Exact b(t)  Rand.(15)  Rand. 15 Error  RK4 Error  Mod. RK4 Error  Rand. 5 Error
2   3.600E-06   3.586E-06  1.363E-08       4.585E-08  6.712E-11       6.440E-10
3   5.345E-05   5.332E-05  1.243E-07       7.216E-07  1.504E-09       5.257E-09
4   1.025E-04   1.024E-04  1.285E-07       1.402E-06  6.585E-10       8.168E-09
5   1.255E-04   1.255E-04  2.910E-08       1.722E-06  4.708E-09       7.578E-09
6   1.518E-04   1.519E-04  1.249E-07       2.089E-06  3.452E-09       6.132E-10
7   2.079E-04   2.082E-04  3.799E-07       2.867E-06  8.679E-10       1.610E-08
8   3.326E-04   3.335E-04  8.930E-07       4.592E-06  4.311E-09       4.197E-08
9   6.133E-04   6.154E-04  2.049E-06       8.470E-06  9.321E-09       9.732E-08
10  1.256E-03   1.261E-03  4.696E-06       1.732E-05  7.135E-09       1.962E-07
11  2.725E-03   2.735E-03  1.074E-05       3.768E-05  9.458E-08       5.511E-07
12  5.956E-03   5.979E-03  2.359E-05       8.299E-05  8.238E-07       1.782E-06
13  1.253E-02   1.257E-02  4.421E-05       1.729E-04  8.882E-08       1.885E-06
14  2.441E-02   2.449E-02  7.453E-05       3.365E-04  3.122E-07       2.570E-06
15  4.305E-02   4.316E-02  1.098E-04       5.947E-04  8.395E-07       4.664E-06
16  6.820E-02   6.834E-02  1.368E-04       9.409E-04  1.231E-07       4.209E-06
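The interval-by-interval use of the truncated series (2.26) can be sketched as follows. This is an illustrative reading of the method, not the thesis code: the interval length, the number of terms K, the uniformization rate f, and the small example parameters are all assumptions.

```python
# Sketch of the truncated randomization recursion (2.26), applied over small
# intervals [t_i, t_{i+1}) with P(t_i) as the initial condition, as described
# in the text.  All parameter values are illustrative, not the thesis settings.
import math

def randomization_step(P, lam_bar, mu, c, f, dt, K):
    """One interval of length dt with average arrival rate lam_bar."""
    m = c + 1
    # Bhat = B0 + lam_bar * B1  (equations (2.24)-(2.25))
    B = [[0.0] * m for _ in range(m)]
    for i in range(m):
        if i > 0:
            B[i][i - 1] = i * mu / f            # sub-diagonal of B0
        if i < c:
            B[i][i] = (c - i) * mu / f          # diagonal of B0
            B[i][i + 1] = lam_bar / f           # super-diagonal of B1
        else:
            B[i][i] = lam_bar / f               # (c, c) entry of B1
    # P(t+dt) ~ P(t) * sum_{n<=K} Bhat^n (f dt)^n/n! * e^{-c mu dt} e^{-lam_bar dt}
    term, total = list(P), list(P)
    for n in range(1, K + 1):
        term = [sum(term[i] * B[i][j] for i in range(m)) * (f * dt) / n
                for j in range(m)]
        total = [a + b for a, b in zip(total, term)]
    scale = math.exp(-c * mu * dt) * math.exp(-lam_bar * dt)
    return [x * scale for x in total]

c, mu = 3, 1.0
lam = lambda t: 2.0 + math.sin(t)
f = c * mu + 3.0                      # uniformization rate, here c*mu + max lam
dt, K = 0.01, 15
P = [1.0] + [0.0] * c                 # start from an empty system
t = 0.0
while t < 5.0 - 1e-12:
    lam_bar = 0.5 * (lam(t) + lam(t + dt))   # average rate on the interval
    P = randomization_step(P, lam_bar, mu, c, f, dt, K)
    t += dt
print("b(5) =", P[-1])
```

A useful check is that every row of B₀ + λ̄B₁ sums to (cµ + λ̄)/f, so the truncated step preserves the total probability up to the series truncation error.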
The 15-term randomization with step size h = 0.01 has a smaller error than the RK4 with h = 0.005; however, the modified RK4 with h = 0.005 still works better. The advantage of the randomization method here is that we can control the error by adjusting both the number of terms in the expansion and the step size. Notice that the larger the number of terms in the Maclaurin expansion, the larger the step size can be.

2.5 Time Averages

We will see in section 3.2 that in the cost function we need to solve the integral in equation (2.4). Generally, the integral

\frac{1}{t_{n+1}-t_n}\int_{t_n}^{t_{n+1}} p_{ij}(t)\,dt   (2.27)

represents the average probability of being in state j over the time interval (t_n, t_{n+1}) when the system is in state i at time t_n. We use this average probability to define the expected cost of a decision between times t_n and t_{n+1}.

We can obtain the transient solutions numerically using the modified RK4. To calculate the integral in (2.4), we approximate it with Simpson's rule, since it cannot be computed analytically. Let h be the step size of the modified RK4. That is, to calculate p^n_{ij}, the modified RK4 method computes p_{ij}(t_{n-1} + lh) for all l = 0, 1, 2, …, L = Δt/h, where Δt = t_n − t_{n-1}, with the initial condition P(t_{n-1}) = [0, 0, …, 0, 1, 0, …, 0], where the i-th element is 1. We tune the step size so that L is always an even integer. By Simpson's rule, we have

\int_{t_n}^{t_{n+1}} p_{ij}(t)\,dt \approx \frac{h}{3}\sum_{l=0}^{L/2-1}\left[p_{ij}(t_n + 2lh) + 4\,p_{ij}(t_n + (2l+1)h) + p_{ij}(t_n + (2l+2)h)\right].   (2.28)

It is obvious that the smaller the step size h, the better the approximation of the integrals. It is worth noting that the precision of the transition probabilities is more important than the accuracy of the Simpson approximation in (2.28). The reason is that in the MDP model, the integral in (2.28) is only one of the cost terms, whereas the transition probabilities multiply all the cost terms.
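The composite Simpson's rule of equation (2.28) can be sketched as follows. In the thesis the samples would come from the modified RK4 grid; here, for illustration only, we use an integrand with a known integral.

```python
# Sketch of the composite Simpson's rule (2.28).  The samples would be the
# transition probabilities p_ij(t_n + l h) on the modified-RK4 grid; for
# illustration we integrate sin on [0, pi], whose exact integral is 2.
import math

def simpson(samples, h):
    """Composite Simpson's rule; len(samples) = L + 1 with L even."""
    L = len(samples) - 1
    assert L % 2 == 0, "the number of steps L must be even"
    total = 0.0
    for l in range(0, L, 2):
        total += samples[l] + 4.0 * samples[l + 1] + samples[l + 2]
    return (h / 3.0) * total

h = math.pi / 100
samples = [math.sin(l * h) for l in range(101)]
print(simpson(samples, h))   # close to 2
```

The even-L requirement is exactly why the text tunes the RK4 step size so that L = Δt/h is an even integer.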
For this reason, we focused more on the accuracy of the transition probabilities.

2.6 Chapter Summary

We provided two methods for solving queues with time-dependent arrivals: the modified Runge-Kutta method and randomization. We modified the RK4 method and obtained more accurate results than the traditional fourth-order RK4 for the same time-frame and the same step size. The randomization method, however, was not tractable for time-dependent queues: homogeneity is a necessity for using randomization without any approximations. We applied the randomization method assuming that the Markov property holds for the initial probabilities of very small time intervals. Comparison with the modified RK4 showed that, for this type of problem, randomization is more time-consuming and less accurate. Therefore, we decided to use the modified RK4 for obtaining the transient solutions of M(t)/M/c/c queues in our model. In addition, we used Simpson's rule to approximate the time averages of the transient probabilities, which are required for our MDP model.

Chapter 3

Optimal Policies for M(t)/M/c/c Queues with Two Different Levels of Servers

In this chapter, we discuss the determination of the optimal time-dependent control policy of the M(t)/M/c/c queue. The (m,N) policy gives two contours for deciding the level of service based on the length of the queue or the number of customers in the system. If the number of customers reaches N from below, we use the higher level of service, and if it drops below m, we switch back to the lower level of service. We will show that for a periodic arrival rate, the decisions become independent of each other after a mixing time. We will also show that the policies are periodic with the same period as the arrival rates. In the literature, different levels of service rates have been considered for queues.
However, to the best of our knowledge, a change in the number of servers has not been studied.

Time-dependence is important because the seasonality or periodic behavior of a stochastic process can be observed in many real-world service systems, e.g. walk-in clinics, hospital emergency departments, road traffic systems, public transportation, call centres, and internet services. In order to understand such systems better, one can study the effects of time variation on the decisions for running a system optimally. Since the arrival rate in our work is time-dependent, a constant (m,N)-policy will not produce an optimal solution. The novelty of our work is to consider the effect of time-dependence on this category of control policies. That is, the values of m and N can vary depending on the decision-making time-point within the cycle of the arrival rate function.

We model and solve this problem with a Markov decision process (MDP) and use its results to obtain optimal control policies. In section 3.1, we review the literature on multi-level service queues, especially where there is time-dependence. Section 3.2 describes the problem in detail and provides an MDP model for obtaining the optimal decisions that minimize the cost function. In section 3.3, we prove that the policies are periodic for a finite-horizon MDP, and consequently that we can extend the results to an infinite planning horizon. In section 3.4, we give a numerical example of the (m_t, N_t)-policy based on the results of the MDP model, explore the existence of a phase difference between the results and the pattern of λ_t, and observe that, for our case, policies reach their periodic pattern quickly. Section 3.5 studies two extreme cases of our model: the deterministic case and total randomness.
Conclusions are provided in section 3.6.

3.1 Two-level Service Queues in the Literature

A variety of service control policies for queueing systems have been studied in the literature. Among these policies, bi-level hysteretic control based on the queue length is closest to what we are looking for. Gebhard [Geb67] introduces this policy and applies it to an M/M/1 queue. He defines a control doctrine that prescribes the average service rate depending on the queue length. He assumes a bi-level exponential service rate: the average service rate increases from µ_1 = µ to µ_2 = kµ whenever the queue length reaches an upper control level N_2 from below, and it switches back to µ_1 when the queue length drops below a lower control level N_1 (N_1 < N_2). The bi-level hysteretic control is illustrated in Figure 3.1. He considers two measures of the performance of the bi-level server: the rate of switching between service rates, and the proportion of time the server works at its higher service rate. He obtains explicit expressions for the steady-state probabilities and for these two performance measures. Tadj and Choudhury, in their review paper on the control of queues [TC05], mention Gebhard's work as a specific case of Yadin and Naor's paper [YN67].

Yadin and Naor [YN67] consider a set of possible service rates µ_0, µ_1, …, µ_k, …, where the rates are strictly increasing in their subscripts and µ_0 = 0. There are two sets of levels, {R_k} = {R_1, R_2, …, R_k, …} and {S_k} = {S_1, S_2, …, S_k, …}, where R_{k+1} > R_k, S_{k+1} > S_k and R_k > S_k. They define a policy such that whenever the queue length reaches a value R_k from below, the service rate is increased from µ_{k-1} to µ_k, and whenever it drops to S_k, the service rate is decreased from µ_{k+1} to µ_k. They apply the policy to an M/M/1 queue with state-dependent service.

[Figure 3.1: Bi-level hysteretic service rate control [Geb67] — service rate plotted against queue length, with switching levels N_1 and N_2.]

Lu and Serfozo [LS84] consider a finite set {(λ_1, µ_1), …, (λ_m, µ_m)} of arrival and service rates and study an M/M/1 queue under the following policy. Let A = {1, …, m} denote the set of all arrival-service-rate pairs. At each arrival or completion of service, suppose the queue length is i and the current rate-pair is d ∈ A. A rate-pair a is chosen from A with switching cost s(d, a), where s(d, d) = 0. The system operates with this rate-pair up to the next arrival or completion; the time until then is an exponential random variable with parameter λ_a + µ_a δ(i), where δ(i) = 0 if i = 0 and δ(i) = 1 if i > 0. A usage cost c(i, a) is associated with the pair a. A cost function is built from these two costs, with a discount rate, and they express the optimal control policy in different cases.

Alam and Mani [AM89] discuss an M/M/1/N queue with bi-level arrival and service rates, where N is the maximum number of customers in the system. The system has two modes, i = 1, 2. In mode i, λ_i is the arrival rate and µ_i is the service rate. They consider transition rates α_i from mode i to the other mode; the system is illustrated in Figure 3.2. They develop a recursive method to compute steady-state probabilities and performance measures. In a continuation of this work, they study an M/M/c queue with breakdown possibility [AM88]. They assume that only one server can break down at a time; therefore, they consider a bi-level number of servers, c and c − 1. The arrival rate λ, the service rate µ of a server, and the transition rates α_1 and α_2 are constant (Figure 3.3). They solve the model using a recursive method.

Gray and Wang [GWS92] consider an M/G/1 queue with a bi-level
service time distribution. They assume that if the queue length is no more than a value N, service times are i.i.d. with density f_1(x), and if, upon a service completion, the queue length is at least N + 1, the service time of the next customer has another distribution f_2(x). The service time distribution is switched back to f_1(x) when, upon completion of a service, the queue length drops to a value K (0 ≤ K ≤ N). They provide an algorithm to obtain the average queue length and some probabilities.

[Figure 3.2: System with two modes of operation [AM89] — two modes with rates (λ_1, µ_1) and (λ_2, µ_2) and mode-switching rates α_1 and α_2.]

Liu and Tseng [LT99] consider an M^[x]/M/1 queue with a bi-level service rate. Two thresholds K and N are defined in the same way as in [GWS92]. The service rate consists of two levels, µ and tµ. They derive the steady-state probabilities and introduce a cost model consisting of a service cost, a queueing cost, and a switching cost.

George and Harrison [GH01] discuss an M/M/1 queue with state-dependent service rates based on the queue length. Let µ_n be the service rate when there are n customers in the system; the rate can be chosen from a subset A of [0, ∞). They assume a cost function consisting of the customer holding cost and the service cost. The aim is to find the optimal policy minimizing the average cost per unit of time over an infinite horizon. A policy is a vector µ = (µ_1, µ_2, …) with µ_n ∈ A for all n. A policy µ is ergodic if there exists a probability distribution p = (p_0, p_1, …) satisfying the balance equations p_n = p_{n+1}µ_{n+1} for all n ≥ 0. They claim that the ergodic or stationary distribution p is unique if it exists, and that if µ_n > 0 for all n, then p_n = p_0 \prod_{i=1}^{n} µ_i^{-1} for n ≥ 0, where p_0 is the normalization constant. They provide a method to obtain the optimal policy which is
fast and transparent in comparison to the previous methods for solving service-rate control problems. They show that if the policy is non-ergodic, the optimal policy is a do-nothing policy.

[Figure 3.3: An M/M/c queue with two modes of operation [AM88] — c servers in one mode, c − 1 in the other, with mode-switching rates α_1 and α_2.]

Ata [Ata05] studies a finite-buffer Markovian queue with adjustable service rates. Arrivals are Poisson, and the amount of service needed by each arrival is exponentially distributed. Depending on the queue length, the service rate is chosen from a fixed set A = [0, µ̄], where µ̄ is a given constant. A policy is defined in the same way as in [GH01]. He obtains the steady-state probabilities for a given policy, and then introduces an optimization model for minimizing a long-run average cost function. Ata and Shneorson [AS06] study an M/M/1 queue with adjustable arrival and service rates. They assume that the system can choose λ_n and µ_n from the sets A = [0, Λ] and B = [0, M], respectively, for n = 0, 1, 2, …. The state of the system is the queue length. They define a cost function associated with the service rate and the customer holding cost. They first analyze the queue with an infinite buffer and then with limited buffer capacity, and obtain the optimal policy for the defined cost function as (\vec{λ}^*, \vec{µ}^*), where \vec{λ}^* = (λ^*_0, λ^*_1, …) with all components belonging to A, and \vec{µ}^* = (µ^*_0, µ^*_1, …) with all components belonging to B.

Mahabhashyam and Gautam [MG05] study a single-server queue with variable service capacity. The speed of the server changes according to a continuous-time Markov chain (CTMC) that is independent of the arrival process and of the service requirements of the customers. Arrivals are Poisson, and each customer needs a certain random amount of work, which is exponentially distributed.
The service rate is time-varying. They obtain the first and second moments of the service times, the average steady-state number of customers in the system, and the tail distribution of the workload of the system.

Kc and Terwiesch [KT09] study a model of service-worker productivity in which service rates depend on workload (within the capacity of the system) and overwork (beyond the capacity of the system). They study two cases, patient transport services and cardiothoracic surgery, show that the service rates in these cases are adaptive and sensitive to the level of workload, and test their hypotheses statistically.

The N-policy is a special case of the bi-level service rate where µ_1 = 0, µ_2 = µ, N_1 = 0 and N_2 = N. In other words, a stationary N-policy prescribes that the server turn on when the total number of customers in the queue reaches the value N, and turn off when the system becomes empty [Coo91]. There is a fair bit of literature on N-policy queues. However, this is not quite what we want for our model (we have two levels for the number of servers, which can be approximated by a bi-level service rate for a single-channel queue). When N_1 > 0 and µ_1 = 0, we have an (m,N)-policy, where the server stops working when the queue length drops to m (m < N). Sobel [Sob69] examines an (m,N)-policy for a GI/G/1 queue, defining a two-dimensional state space consisting of the queue length and the operational state of the server, and argues for the efficiency of the random-walk approach.

The (m,N)-policy studied in the literature focuses on one server with multiple levels of service. Heyman [Hey68] considers an M/G/1 queue with a server that can be turned off. Start-up, shut-down, and running costs for the server, and a holding cost for customers, are considered. Based on the similarity of this model to inventory models, Heyman suspects that the optimal policy is of the (s, S) form.
He uses dynamic programming to model the problem and obtain the optimal control policy. Bell [Bel71] provides the proof and improves the computational algorithm for Heyman's problem. Tadj and Choudhury [TC05] provide a review of control policies. To the best of our knowledge, time-dependence of the (m,N)-policy has not been studied in the literature. We use a Markov decision process (MDP) to model a bi-level number of servers, and we obtain the optimal hysteretic control policy based on its results.

3.2 Markov Decision Process for KGH

In this section, we first explain the problem and its objective, and we then provide a Markov decision model for solving it. The condition for using the model over an infinite horizon is discussed afterwards.

Consider an emergency department with three types of beds: main beds, stretchers, and surge beds. The service time for each bed, regardless of its type, is exponentially distributed with service rate µ. The surge beds are in an extra section that incurs an additional cost when in use: there is a start-up cost and a running cost for the surge section. On the other hand, having a patient on a stretcher imposes another cost on the system, due to the lower quality of service received by the patient. When the surge section is closed, a new arrival is allocated to the main beds first, and then to the stretchers. When the surge section is open, the priority of allocation is the main, surge, and stretcher beds, respectively.

Let m_1, m_2 and m_3 be the number of main beds, stretchers, and surge beds, respectively. There are two levels for the number of beds in the ED, excluding and including the beds in the surge section: c_1 = m_1 + m_2 and c_2 = m_1 + m_2 + m_3, respectively. The objective is to find an optimal policy for opening and closing the surge section such that the utility (cost) function is minimized, i.e.
the cost of the system related to the surge section and to the patients on stretchers is minimized.

The arrivals follow a Poisson process with a time-dependent arrival rate. The theory we develop works whenever the arrival rate varies in a periodic fashion. However, for the purpose of this thesis, we consider

λ_t = β + α \sin\left(ω_0 t − \frac{π}{2ω_0}\right)   (3.1)

as the time-dependent arrival rate. We chose the sine wave for our study since it has been considered before in [GK91] and [GKS91], and it seems a reasonable assumption for the arrival rate in health-care systems.

The model considered is a discrete-time Markov decision process [Put94]. Figure 3.4 illustrates the diagram of the MDP for our problem. The elements of the MDP are explained in detail in this section, and the backward induction algorithm for solving the model is provided in section 3.2.1.

[Figure 3.4: Markov decision process diagram — states (x, b) with the surge section closed (x = 0, b = 0, …, m_1 + m_2) or open (x = 1, b = 0, …, m_1 + m_2 + m_3), actions a_n ∈ {0, 1}, and transition probabilities p^n_{b,j} and q^n_{b,j} between stages n − 1 and n.]

Stages. Assume we would like to obtain the optimal policy for a time interval (0, T_H], where T_H is the planning horizon. Let T_H = n_y T, where T = 2π/ω_0 is the length of the cycle of the function λ_t and n_y is the number of cycles in the planning horizon. Decision points are assumed to be discrete and equidistant. This is a reasonable assumption, since decision-making in our problem might be done daily, weekly (as in KGH), biweekly, monthly, etc. We break T down into Ñ intervals. t_n is the decision epoch at t = nT/Ñ, where n ∈ {0, 1, …, Ñ − 1} and t_n − t_{n-1} = Δt for all n ∈ {1, …, Ñ − 1}.

States. Let b be the number of patients in the system, and let k_1, k_2 and k_3 be the number of patients on main, stretcher and surge beds, respectively.
Let m_1, m_2 and m_3 be the total number of beds in each section: main, stretcher, and surge, respectively. We assume that moving patients between sections at each decision epoch is allowed, with the following priority of the sections:

− when the surge section is closed, 0 ≤ b ≤ m_1 + m_2:
  – if b ≤ m_1, then k_1 = b, k_2 = 0, and k_3 = 0;
  – if b > m_1, then k_1 = m_1, k_2 = b − m_1, and k_3 = 0.
− when the surge section is open, 0 ≤ b ≤ m_1 + m_2 + m_3:
  – if b ≤ m_1, then k_1 = b and k_2 = k_3 = 0;
  – if m_1 < b ≤ m_1 + m_3, then k_1 = m_1, k_2 = 0 and k_3 = b − m_1;
  – if b > m_1 + m_3, then k_1 = m_1, k_3 = m_3 and k_2 = b − m_1 − m_3.

The value of b thus determines the number of patients in each section of the ED. Hence, two indices suffice to represent the state of the system, (x_n, b_n). x_n ∈ {0, 1} represents the status of the surge section at stage n: 0 for closed and 1 for open. b_n is the number of patients in the system at the n-th decision epoch. The possible values of b_n depend on the status of the surge section: b_n ∈ B_0 = {0, 1, …, m_1 + m_2} when the surge section is closed, and b_n ∈ B_1 = {0, 1, …, m_1 + m_2 + m_3} when the surge section is open.

Actions. There are two possible actions/decisions at each stage. Let A = {0, 1} be the set of possible actions. An action selects the surge status to hold until the next decision; that is, x_{n+1} = a_n, irrespective of the value of x_n. a_n = 0 (a_n = 1) means the surge section is closed (open) between stages n and n + 1. We cannot close the surge section when we have more than m_1 + m_2 patients in the system, because that would require evacuating patients, which is not allowed. Therefore, for states with b_n > m_1 + m_2, the only possible action is a_n = 1.
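The allocation rules above can be stated compactly in code. The sketch below is illustrative; the capacities m1, m2, m3 used in the example are hypothetical values, not KGH's actual figures.

```python
# Sketch of the bed-allocation priority rules above: given the number of
# patients b and the surge status, return (k1, k2, k3) = occupied (main,
# stretcher, surge) beds.  m1, m2, m3 are the section capacities.
def allocate(b, surge_open, m1, m2, m3):
    if not surge_open:
        assert 0 <= b <= m1 + m2
        k1 = min(b, m1)
        return (k1, b - k1, 0)                  # main beds first, then stretchers
    assert 0 <= b <= m1 + m2 + m3
    k1 = min(b, m1)
    k3 = min(b - k1, m3)                        # then surge beds
    return (k1, b - k1 - k3, k3)                # stretchers last

# Illustrative capacities (assumptions, not KGH's actual figures)
m1, m2, m3 = 10, 4, 6
print(allocate(12, False, m1, m2, m3))  # (10, 2, 0)
print(allocate(12, True, m1, m2, m3))   # (10, 0, 2)
print(allocate(18, True, m1, m2, m3))   # (10, 2, 6)
```

The two calls with b = 12 show the point of the priority ordering: opening the surge section moves patients off stretchers and into surge beds.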
In other words, no decision-making is needed in these states, as this action is mandatory.

Transition Probabilities. Let p^n_{ij} and q^n_{ij} be the probabilities of having j patients in the system at stage n, given that i beds were occupied at stage n − 1, when the action at stage n − 1 was a_{n-1} = 0 and a_{n-1} = 1, respectively. When the chosen action at stage n − 1 is 0, we have c_1 = m_1 + m_2 servers and the queueing model is M(t)/M/c_1/c_1. With the initial condition of having i patients at stage n − 1, p^n_{ij} can be calculated using the modified RK4 (explained in section 2.3) over the time interval (t_{n-1}, t_n]. When the action at stage n − 1 is 1, we have c_2 = m_1 + m_2 + m_3 servers and the queueing model is M(t)/M/c_2/c_2; in this case, q^n_{ij} can be obtained by running the modified RK4 for this queue.

Costs. The objective is to minimize the cost of operating an ED with a surge section. There are three different types of costs in this system: (1) a start-up cost for opening the surge section, (2) the cost of running the surge section per unit of time, and (3) the cost of a patient on a stretcher per unit of time. If O^a_x(n) is the immediate opening cost of the surge section at stage n when the status of the surge section is x ∈ {0, 1} and action a ∈ A has been selected, then

O^a_x(n) = \begin{cases} C_o & \text{if } x_n = 0 \text{ and } a_n = 1, \\ 0 & \text{otherwise,} \end{cases}   (3.2)

where C_o is the constant start-up cost of the surge section. Another cost applies between two decision epochs whenever the surge section is open. If the selected action at t_n indicates that the surge is open for the coming interval (t_n, t_{n+1}), then the surge operating cost R_a has to be considered; it depends on the action a at stage n and is independent of t_n:

R_a(n) = \begin{cases} C_r \Delta t & \text{if } a_n = 1, \\ 0 & \text{otherwise,} \end{cases}   (3.3)

where C_r is the cost per unit of time of running the surge section. The third cost term is an expected cost.
Depending on the current state (x, b) at stage t_n and the chosen action a, there is an expected cost S_a(n, b) of having patients on stretchers during the interval (t_n, t_{n+1}):

S_a(n,b) = \begin{cases} C_s \sum_{j=1}^{m_2} j \int_{t_n}^{t_{n+1}} p^n_{b,\,m_1+j}(t)\,dt & \text{if } a = 0, \\ C_s \sum_{j=1}^{m_2} j \int_{t_n}^{t_{n+1}} q^n_{b,\,m_1+m_3+j}(t)\,dt & \text{if } a = 1, \end{cases}   (3.4)

where C_s is the cost of having a patient on a stretcher for a unit of time, and p^n_{b,k}(t) and q^n_{b,k}(t) are the probabilities of having k beds occupied at time t after t_n, with b beds occupied initially, when the surge section is closed and open, respectively. The integrals in equation (3.4) can be calculated using Simpson's rule, as explained in section 2.5.

There is also an additional cost term for when the system is full and we reject arrivals. When the surge section is closed, the probability of rejecting a patient equals the probability of having m_1 + m_2 patients in the system. Let C_{rej} be the cost of rejecting a patient. We define the rejection cost F_x(n, b) as

F_x(n,b) = \begin{cases} C_{rej} \int_{t_n}^{t_{n+1}} \lambda_t\,dt & \text{if } x = 0,\ b = m_1 + m_2, \\ 0 & \text{otherwise.} \end{cases}   (3.5)

Summing all the costs defined above, the cost at stage n for state (x, b) and action a is

r_n(x, b, a) = O^a_x(n) + R_a(n) + S_a(n, b) + F_x(n, b).   (3.6)

At the last stage, t_{n_y Ñ}, the costs must be given. Suppose we want the surge section to be closed at the end of the planning horizon. Then we can set

r_{n_y Ñ}(x, b) = \begin{cases} 0 & \text{if } x_{n_y Ñ} = 0, \\ M & \text{otherwise,} \end{cases}   (3.7)

where M is a large number. We are now able to use backward induction to solve our MDP model. This algorithm provides an efficient method for solving finite-horizon discrete-time MDPs (for more details, see [Put94]).

3.2.1 Backward Induction

Let u^*_n(x_n, b_n) be the cumulative optimal cost of the system from the decision epoch t_n, when the state at t_n is (x_n, b_n), to the end of the planning horizon, t_{n_y Ñ}. The algorithm is as follows.

1. Set n = n_y Ñ and

u^*_{n_y Ñ}(x_{n_y Ñ}, b_{n_y Ñ}) = r_{n_y Ñ}(x_{n_y Ñ}, b_{n_y Ñ}), for all x_{n_y Ñ}, b_{n_y Ñ}.   (3.8)

2.
Substitute n − 1 for n and compute u^*_n(x_n, b_n) for each x_n and b_n by

u^*_n(x_n, b_n) = \min_{a_n \in A} \left\{ r_n(x_n, b_n, a_n) + \sum_{j \in B_{a_n}} \gamma^{a_n}_{b_n j}(n+1)\, u^*_{n+1}(a_n, j) \right\}   (3.9)

where

\gamma^{a_n}_{b_n j}(n+1) = \begin{cases} p^{n+1}_{b_n j} & \text{if } a_n = 0, \\ q^{n+1}_{b_n j} & \text{if } a_n = 1. \end{cases}

The optimal action is

\pi^*_n(x_n, b_n) = \arg\min_{a_n \in A} \left\{ r_n(x_n, b_n, a_n) + \sum_{j \in B_{a_n}} \gamma^{a_n}_{b_n j}(n+1)\, u^*_{n+1}(a_n, j) \right\}.   (3.10)

The sequence of actions is called a policy.

3. If n = 1, stop. Otherwise, return to step 2.

This algorithm provides the optimal policy for running the ED. We use these policies to obtain control policies for the infinite-planning-horizon problem. If the decisions in different cycles are independent, then we can use a general (m_t, N_t)-policy for all cycles. We show the condition for the independence of decisions in different cycles in the following section.

3.3 Periodicity of the Policies

In time-independent ergodic MDPs with an infinite planning horizon, decisions converge to a steady state; that is, in the long run, for a specific state there is a specific optimal action at all times. However, for Markov chains with periodically changing transition rates this is not necessarily true. We convert our Markov chain with periodically changing arrival rates into a periodic Markov chain. In this section, we first show that our MDP is periodic. Then, using the periodicity of the MDP and its cyclic classes, we prove that there exists a mixing time⁷ after which the probability of being in a state converges. We prove that the decisions are independent of the final decisions after going backward a finite amount of time, which shows how far back we must go in order to obtain the stationary policies at a specific point in time within the cycle of the arrival rates. Finally, we prove that the stationary policies at a fixed location of a cycle are periodic. Knowing that the policies are eventually periodic, we can extend our results from a finite to an infinite planning horizon.
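Before turning to periodicity, the backward induction (3.8)-(3.10) of section 3.2.1 can be made concrete with a toy sketch. This is an illustration only, not the thesis implementation: the transition matrices below stand in for the modified-RK4 transient solutions, and the horizon, bed counts, and costs are all assumed values.

```python
# Toy backward-induction sketch of (3.8)-(3.10).  The matrices p, q would
# come from the modified RK4 of chapter 2; here trans() is a placeholder
# stochastic matrix, and all costs/sizes are illustrative assumptions.
import math

N = 8                      # decision epochs in the horizon (illustrative)
c1, c2 = 2, 3              # bed counts excluding / including the surge section
Co, Cr, dt = 5.0, 1.0, 1.0 # start-up cost, running cost rate, epoch length

def trans(c, lam):
    """Placeholder one-epoch transition matrix on states 0..c (rows sum to 1)."""
    P = []
    for i in range(c + 1):
        row = [0.0] * (c + 1)
        row[i] = 0.5
        row[min(i + 1, c)] += 0.25 * min(lam, 1.0)
        row[max(i - 1, 0)] += 0.5 - 0.25 * min(lam, 1.0)
        P.append(row)
    return P

lam = lambda n: 1.0 + 0.5 * math.sin(2 * math.pi * n / N)  # periodic rate

def stage_cost(x, b, a):
    cost = Co if (x == 0 and a == 1) else 0.0    # opening cost, as in (3.2)
    cost += Cr * dt if a == 1 else 0.0           # running cost, as in (3.3)
    cost += 2.0 * max(b - 1, 0)                  # stand-in for the stretcher cost
    return cost

# Terminal condition (3.7): the surge section must be closed at the horizon.
M = 1e6
u = {(x, b): (0.0 if x == 0 else M) for x in (0, 1) for b in range(c2 + 1)}
policy = []
for n in range(N - 1, -1, -1):
    p, q = trans(c1, lam(n)), trans(c2, lam(n))
    u_new, pi_n = {}, {}
    for x in (0, 1):
        for b in range(c2 + 1):
            if x == 0 and b > c1:
                continue                          # unreachable state
            acts = [1] if b > c1 else [0, 1]      # a = 0 infeasible above c1
            vals = {}
            for a in acts:
                g = p if a == 0 else q
                vals[a] = stage_cost(x, b, a) + sum(
                    g[b][j] * u[(a, j)] for j in range(len(g)))
            a_star = min(vals, key=vals.get)       # equation (3.10)
            u_new[(x, b)], pi_n[(x, b)] = vals[a_star], a_star
    u, policy = u_new, [pi_n] + policy
print("optimal action at n = 0, state (0, 0):", policy[0][(0, 0)])
```

In this toy instance opening the surge section brings costs but no benefit, so the recursion keeps it closed; with the thesis costs (3.2)-(3.6) the same loop yields the time-dependent opening/closing policy.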
A related discussion can be found in Martin-Löf [ML67], who showed that there exists a periodic control providing an optimal expected future cost for a periodic continuous-time Markov chain, and that it can be approximated by discretizing in time.

(Footnote 7: A mixing time is a parameter measuring the time required by a Markov chain for its distance to stationarity to become small [LPW06].)

A periodic Markov chain is a recurrent chain with the property that if the system occupies some state at the present time, it can only occupy the same state again after $p, 2p, 3p, \dots$ transitions, where the positive integer $p$ is the period of the system [How60]. Taking $(x, b, \tau)$ as the state of the system, where $\tau$ is the location of the decision epoch within the cycle of the arrival rate function, it is evident that our model is a periodic MDP.

We now show how the probabilities converge for a periodic MDP. For this, let $\gamma^a_{ij}(t_i, t_f)$ be the probability of being in state $j$ at time $t_f$, given initial state $i$ at time $t_i$ and $a$ as the latest action before $t_f$.

Theorem 3.1. For any periodic MDP, for all $i_1, i_2, j \in B_a$ and $\varepsilon > 0$, there exists a mixing time $t$ such that
$$
\max_{i_1, i_2, j, a} \left| \gamma^a_{i_2 j}(t_i, t_i + t) - \gamma^a_{i_1 j}(t_i, t_i + t) \right| < \varepsilon. \tag{3.11}
$$

Proof. Suppose we have $N$ decision points in each cycle. Then we have a periodic MDP with period $N$. Let $S = \{(x, b, \tau)\}$ be the state space and $\{X_i : i = 0, 1, 2, \dots\}$ be the stochastic process of the MDP, where $X_i = (x_i, b_i, \tau_i)$. Since the period of the MDP is $N$, there exist $N$ cyclic classes⁸. We fix a reference state $u \in S$, and for $k = 0, 1, \dots, N-1$ we define
$$
S_k = \left\{ v \in S : p^{nN+k}_{uv} > 0 \text{ for some } n \in \mathbb{N} \right\}. \tag{3.12}
$$
Let $P_k$ be the transition matrix from stage $k$ to $k+1$. Then the transition matrix $P^k$ from stage $k + nN$ to $k + (n+1)N$, for all $n = 0, 1, 2, \dots$, is
$$
P^k = P_k P_{k+1} \cdots P_{k+N-1}. \tag{3.13}
$$
Consider the cyclic class $S_0$. We have an $N$-chain {X_0, X_N, X_{2N}, …
} with transition matrix $P^0$, which is a subset of the main periodic Markov chain. The $N$-chain is itself a Markov chain, since
$$
\mathbb{P}\left(X_{kN} = x_k \mid X_{(k-1)N} = x_{k-1}, X_{(k-2)N} = x_{k-2}, \dots\right) = \mathbb{P}\left(X_{kN} = x_k \mid X_{(k-1)N} = x_{k-1}\right).
$$
Moreover, the $N$-chain is positive recurrent because the expected hitting time of every state is finite. It is also aperiodic, since every state can be revisited at any $kN$-th stage for $k = 0, 1, 2, \dots$. Therefore, the $N$-chain is an ergodic Markov chain. The above argument holds for every cyclic class, so they are all ergodic. Hence, for each cyclic class there exists a $t_k$ such that
$$
\left| \gamma^a_{i_2 j}(\tau_k, \tau_k + t_k) - \gamma^a_{i_1 j}(\tau_k, \tau_k + t_k) \right| < \varepsilon.
$$
Choosing $t = \max_k \{t_k\}$, we have
$$
\max_{i_1, i_2, j, a} \left| \gamma^a_{i_2 j}(t_i, t_i + t) - \gamma^a_{i_1 j}(t_i, t_i + t) \right| < \varepsilon, \tag{3.14}
$$
which is what we wanted to prove.

(Footnote 8: The cyclic classes of a periodic Markov chain with period $d$ partition the state space $S$ into $d$ classes $(S_0, S_1, \dots, S_{d-1})$. In one transition we always go from a state in $S_k$ to a state in $S_{k+1}$; from $S_{d-1}$, the next transition returns to $S_0$.)

Using (3.11) in theorem 3.1, we obtain a condition for the independence of the decisions at stages $n$ and $n + l$ when $t_{n+l} - t_n \geq t$.

Theorem 3.2. For any periodic MDP, there exists a mixing time $t < \infty$ such that the decisions at stages $n$ and $n + l$ are independent whenever $t_{n+l} - t_n \geq t$.

Proof. First notice that any action at stage $n$ can affect only the probabilities at stage $n + l$. Since $t_{n+l+1} - t_n \geq t_{n+l} - t_n \geq t$, equation (3.11) gives
$$
\max_{i_1, i_2, j, a} \left| \gamma^a_{i_2 j}(t_n, t_{n+l+1}) - \gamma^a_{i_1 j}(t_n, t_{n+l+1}) \right| < \varepsilon. \tag{3.15}
$$
Let $\Delta u^*_{n+l}(x, b)$ be the variation in the cost function at stage $n + l$ and state $(x, b)$ caused by changing the policy at stage $n$. Then
$$
\max \Delta u^*_{n+l}(x_{n+l}, b_{n+l}) \leq \sum_{j \in B_{a_n}} \varepsilon\, u^*_{n+l+1}(x_{n+l+1}, j) < \varepsilon\, c_2 \max_{j \in B_{a_n}} u^*_{n+l+1}(x_{n+l+1}, j),
$$
where $c_2 = m_1 + m_2 + m_3$. Let $\delta$ be the maximum variation in the elements of $u^*_{n+l+1}$ such that the optimal policy does not change for the previous stage, $n + l - 1$.
For any $\delta > 0$, there exists $\varepsilon > 0$ such that
$$
\varepsilon\, c_2 \max_{j \in B_{a_n}} u^*_{n+l+1}(x_{n+l+1}, j) < \delta.
$$
This holds because $\varepsilon$ decreases as the mixing time $t$ increases. Hence, for any $\delta > 0$ there exists a $t$ such that whenever $t_{n+l} - t_n \geq t$, the actions at stages $n + l$ and $n$ are independent.

Now we discuss the behavior of the stationary policies. We show in the following theorem that the policies are periodic. We prove that, in general, a discrete-time MDP with a periodic parameter, such as the arrival rate, service rate, or number of servers, is a periodic Markov chain, and consequently the decisions are periodic as well.

Theorem 3.3. In a finite-state periodic Markov chain with time-homogeneous transition rates in which all states communicate, the optimal policies eventually repeat.

Proof. By theorem 3.1, a periodic Markov process has cyclic classes, each of which is an ergodic Markov chain. The number of cyclic classes equals the number of decision points in the cycle. By theorem 3.2, the policies are independent of previous actions after a mixing time. Therefore, in each cyclic class, the same decision will eventually be made. Cyclic classes are revisited periodically, so the policies repeat with period $d$ as well.

3.4 Numerical Results

In this section, we use numerical computations to discuss the periodic behavior of $N_t$ and $m_t$. We also show how fast the policies reach their equilibrium and that they exhibit a phase delay with respect to the arrival rate function. We performed a number of numerical experiments for the problem described here; whenever $1/\mu$ is small enough compared to $T$ (i.e. $1/(\mu T) < 0.1$), we found in all numerical examples that two cycles are sufficient to reach the equilibrium. The other parameters do not have a significant effect on the policies as long as the system is not saturated.

To obtain the contour points at each decision epoch, we use the outcomes of the MDP model.
To find $N_t$, we look at the decision at each stage $n$ for each state $(0, b)$, where $b \in B_0$, and find the minimum value of $b$ at which the surge is opened. This minimum value equals $N_{t_n}$, where $t_n$ is the time at stage $n$. For $m_t$, we examine the decision at each stage $n$ for each state $(1, b)$, where $b \in B_1$; $m_t$ is the maximum value of $b$ at which the surge section is closed at stage $n$. We set $m_{t_n} = -1$ if we do not close the surge even when the system is empty at stage $n$.

Figure 3.5 shows $N_t$ and $m_t$ for an example of the case study at KGH. We set $\mu = 0.25$ per day per bed, $m_1 = 12$, $m_2 = 28$, $m_3 = 20$, $\alpha = 5$ per day, $\beta = 10$ per day, $C_o = 200$, $C_s = 50$ per day per stretcher and $C_r = 100$ per day. Decisions are made weekly, so there are 52 decision points in each cycle. We set $T = 364$ days, equivalent to 52 weeks. The planning horizon is 3 cycles.

Figure 3.5: $N_t$ and $m_t$ for three cycles, $T = 364$, with 52 decision epochs in each cycle.

Figure 3.5 shows that, except for decisions close to the end of the planning horizon, both $N_t$ and $m_t$ exhibit periodic behavior with the same cycle length as the arrival rate function, but shifted by a certain amount. The policies match the periodic pattern after a short time moving away from the terminal stage, which shows that with this setting equilibrium is reached very quickly.

One might expect the minimum of $N_t$ and $m_t$ to occur at the maximum of $\lambda_t$, since the arrival rate is highest there and, intuitively, the surge section needs to be open. That is, as $\lambda_t$ grows, we potentially have more patients, the quality of service deteriorates, and the cost function increases if the surge section is closed. Hence, it makes sense to reduce $N_t$ and $m_t$ to increase the chance of having the surge open. However, the minimum of $N_t$ and $m_t$ occurs before the peak of $\lambda_t$.
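The contour-point extraction described above amounts to two scans over the solved policy table; a sketch (the policy mapping, action coding and bed counts here are hypothetical):

```python
# Extracting the contour points N_t and m_t from a solved policy at one stage
# (a sketch; `policy` maps states (x, b) to actions, with 1 = surge open).
def contours(policy, b_max):
    # N_t: minimum occupancy b at which the surge is opened from state (0, b);
    # b_max + 1 is used as a sentinel when the surge is never opened.
    opens = [b for b in range(b_max + 1) if policy[(0, b)] == 1]
    n_t = min(opens) if opens else b_max + 1
    # m_t: maximum occupancy b at which the surge is closed from state (1, b);
    # -1 means the surge is not closed even when the system is empty.
    closes = [b for b in range(b_max + 1) if policy[(1, b)] == 0]
    m_t = max(closes) if closes else -1
    return n_t, m_t
```

Repeating this for every stage yields the $N_t$ and $m_t$ curves plotted in the figures.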
The model predicts that the maximum arrival rate will occur in the near future, and it increases the chance of opening the surge and keeping it open before this peak is reached. In this way, it avoids the extra cost arising from the high chance of having many occupied stretchers. At the maximum point of the arrival rate function, the value of $N_t$ increases. The reason is that at this point the function turns from increasing to decreasing, and the worst case has already passed. So if we have not opened the surge by this point, we can decrease the chance of opening it by increasing $N_t$. However, if the surge is open, $m_t$ does not increase as early as $N_t$ does. It is worth noting that the maximum values of $N_t$ and $m_t$ occur where $\lambda_t$ is still decreasing. The reason is that a minimum of the arrival rate lies ahead, so it is safe to decrease the chance of opening the surge section and thereby reduce the cost of running it. This causes a phase delay in $N_t$ and $m_t$ compared to what would be expected from time-independent contour points.

3.5 Extreme Case Analysis

In this section, we study the two extreme cases of arrival rate randomness and discuss their effect on the $(m_t, N_t)$-policy. The first case is the situation where the variation in $\lambda_t$ is much larger than the variance of the exponentially distributed inter-arrival times; in this case, the problem becomes asymptotically deterministic. The exact opposite, the second case, approaches total randomness: the variance of the inter-arrivals is much greater than the amplitude of the arrival rate's sine wave, so that the change in the sine wave can be ignored. We expect the deterministic case to give a better understanding of the phase difference between $(m_t, N_t)$ and $\lambda_t$.
The total randomness case shows that the time-independent $(m, N)$-policy is a special case of our model.

3.5.1 The Deterministic Case

To study the deterministic case of our model, we assume that the arrival process is deterministic. For simplicity of analysis, the service time is taken to be constant for all beds. In this case, the optimal points for opening and closing the surge section can be obtained by straightforward optimization of the continuous time-dependent cost function.

We assume that we open and close the surge section only once per cycle $T$. This is a reasonable assumption since there is only one peak in each cycle. Let $t_1$ be the time we open the surge section in the first cycle, and $t_2$ the time we close it. The number of arrivals at time $t$ is given by
$$
\lambda_t = \beta + \alpha \sin(\omega_0 t + \phi), \tag{3.16}
$$
where $\phi = -\pi/2$ and $\omega_0 = 2\pi/T$. Each arriving patient occupies a bed for $1/\mu$ units of time, where $\mu$ is the deterministic service rate. As in the MDP model, we assume that there is always a bed available in the ED for an arriving patient. Therefore, there is a departure $1/\mu$ units of time after each arrival, and we can write the number of departures at time $t$ as
$$
d_t = \beta + \alpha \sin\!\left(\omega_0 t + \phi - \frac{\omega_0}{\mu}\right). \tag{3.17}
$$
Let $q_t$ be the number of occupied beds in the ED at time $t$. Knowing the number of arrivals and departures at any time, we easily obtain
$$
q_t = \begin{cases}
\displaystyle \int_0^t \lambda_s \, ds, & t < \dfrac{1}{\mu},\\[8pt]
\displaystyle \int_0^{1/\mu} \lambda_s \, ds + \int_{1/\mu}^t (\lambda_s - d_s)\, ds, & t \geq \dfrac{1}{\mu}.
\end{cases} \tag{3.18}
$$
Using equations (3.16) and (3.17) and trigonometric identities⁹, we have
$$
q_t = \begin{cases}
\beta t + \dfrac{\alpha}{\omega_0}\left[\cos\phi - \cos(\omega_0 t + \phi)\right], & t < \dfrac{1}{\mu},\\[8pt]
q_0 + \dfrac{2\alpha}{\omega_0}\sin\!\left(\dfrac{\omega_0}{2\mu}\right)\left[\sin\!\left(\omega_0 t + \phi - \dfrac{\omega_0}{2\mu}\right) - \sin\!\left(\phi - \dfrac{\omega_0}{2\mu}\right)\right], & t \geq \dfrac{1}{\mu},
\end{cases} \tag{3.19}
$$
where $q_0 = \frac{\beta}{\mu} + \frac{\alpha}{\omega_0}\left[\cos\phi - \cos\!\left(\frac{\omega_0}{\mu} + \phi\right)\right]$ is the number of patients in the system at time $1/\mu$, when the first departure occurs. It is worth noting that for $t \geq 1/\mu$, $\beta$ only shifts the queue length by a constant; therefore, it does not affect the optimal $t_1$ and $t_2$.
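The closed form (3.19) can be checked against direct numerical integration of (3.18); a sketch with hypothetical parameter values ($\beta = \alpha = 5$, $\mu = 0.25$, $T = 364$):

```python
import math

# Checking the closed form (3.19) for q_t against direct numerical
# integration of (3.18); parameter values are hypothetical (beta = alpha).
T, mu, alpha, beta = 364.0, 0.25, 5.0, 5.0
w0, phi = 2 * math.pi / T, -math.pi / 2

lam = lambda t: beta + alpha * math.sin(w0 * t + phi)            # (3.16)
dep = lambda t: beta + alpha * math.sin(w0 * t + phi - w0 / mu)  # (3.17)

def q_closed(t):
    """Occupied beds at time t, closed form (3.19)."""
    if t < 1 / mu:
        return beta * t + (alpha / w0) * (math.cos(phi) - math.cos(w0 * t + phi))
    q0 = beta / mu + (alpha / w0) * (math.cos(phi) - math.cos(w0 / mu + phi))
    th = w0 / (2 * mu)
    return q0 + (2 * alpha / w0) * math.sin(th) * (
        math.sin(w0 * t + phi - th) - math.sin(phi - th))

def q_numeric(t, n=20000):
    """Trapezoidal integration of (3.18): departures start at time 1/mu."""
    h = t / n
    f = lambda s: lam(s) - (dep(s) if s >= 1 / mu else 0.0)
    return h * (0.5 * f(0.0) + sum(f(i * h) for i in range(1, n)) + 0.5 * f(t))
```

The two agree to within the quadrature error, and the maximum of $q_t$ indeed falls at $t_e = T/2 + 1/(2\mu)$.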
To have an empty ED at the end of the cycle, we set $\beta = \alpha$. If $\beta$ takes some other value $\bar\beta$, then $q_t$ becomes $q_t + \bar\beta - \alpha$. Moreover, considering equation (3.19) for $t = 1/\mu$, $q_t$ is periodic with the same period as the arrivals and departures. Hence,
$$
q_t = q_{t + nT}, \quad \text{for } t \geq 1/\mu \text{ and } n = 0, 1, \dots. \tag{3.20}
$$

(Footnote 9: $\sin x \pm \sin y = 2\sin\left(\frac{x \pm y}{2}\right)\cos\left(\frac{x \mp y}{2}\right)$.)

Figure 3.6: Number of arrivals, departures, and occupied beds over one cycle.

Figure 3.6 shows the shape of the number of arrivals, departures, and occupied beds over a cycle. Because of the structure of the arrivals, and consequently of the departures, $q_t$ first increases up to the point $t_e$ where the numbers of arrivals and departures are equal, and then drops to almost zero at the end of the cycle. It is evident that the maximum of $q_t$ occurs at $t_e$. We can obtain $t_e$ by setting $\lambda_t = d_t$, or $dq_t/dt = 0$:
$$
t_e = \frac{1}{2\omega_0}\left(\pi - 2\phi + \frac{\omega_0}{\mu}\right) = \frac{T}{2} + \frac{1}{2\mu}. \tag{3.21}
$$
So the maximum of $q_t$ occurs $1/(2\mu)$ after the maximum of $\lambda_t$. With $q_t$ in explicit form, we can define the cost function for scheduling the surge section. Let $m_1$, $m_2$ and $m_3$ be the total numbers of beds in the main, stretcher, and surge sections, respectively. If $q_t > m_1$ while the surge section is closed, there are $q_t - m_1$ patients on stretchers. Similarly, if $q_t > m_1 + m_3$ while the surge section is open, there are $q_t - (m_1 + m_3)$ patients on stretchers. Let $\underline{t}_1$ and $\underline{t}_2$ be the times at which $q_t$ reaches $m_1$ and $m_1 + m_3$ from below, respectively. Similarly, let $\bar{t}_1$ and $\bar{t}_2$ be the times at which $q_t$ drops to $m_1$ and $m_1 + m_3$, respectively, as shown in Figure 3.7.

Figure 3.7: $q_t$ and the times $\underline{t}_1$, $\underline{t}_2$, $\bar{t}_2$, and $\bar{t}_1$.
These values can be obtained by setting $q_t = m_1$ and $q_t = m_1 + m_3$:
$$
\underline{t}_1 = \frac{1}{\omega_0}\left\{\sin^{-1}\!\left[\frac{(m_1 - q_0)\,\omega_0}{2\alpha \sin\theta} + \sin(\phi - \theta)\right] + \theta - \phi\right\}, \tag{3.22}
$$
$$
\underline{t}_2 = \frac{1}{\omega_0}\left\{\sin^{-1}\!\left[\frac{(m_1 + m_3 - q_0)\,\omega_0}{2\alpha \sin\theta} + \sin(\phi - \theta)\right] + \theta - \phi\right\}, \tag{3.23}
$$
$$
\bar{t}_1 = \frac{1}{\omega_0}\left\{-\sin^{-1}\!\left[\frac{(m_1 - q_0)\,\omega_0}{2\alpha \sin\theta} + \sin(\phi - \theta)\right] + \theta - \phi + \pi\right\}, \tag{3.24}
$$
$$
\bar{t}_2 = \frac{1}{\omega_0}\left\{-\sin^{-1}\!\left[\frac{(m_1 + m_3 - q_0)\,\omega_0}{2\alpha \sin\theta} + \sin(\phi - \theta)\right] + \theta - \phi + \pi\right\}, \tag{3.25}
$$
where $\theta = \omega_0/(2\mu)$. There is no point in having the surge section open before $\underline{t}_1$ or after $\bar{t}_1$, since there are no patients on stretchers then. Therefore,
$$
\underline{t}_1 \leq t_1 \leq t_e, \qquad t_e \leq t_2 \leq \bar{t}_1. \tag{3.26}
$$
Assume, as before, that the surge section is closed at $t = 0$. There are two cases to distinguish in order to formulate the cost function: the case $q_t > m_1 + m_3$ for some $t \in [0, T)$, and the case $q_t \leq m_1 + m_3$ for all $t$. In both cases $q_t > m_1$ for some $t \in [0, T)$; otherwise, we would never open the surge section.

In the first case, where $q_t > m_1 + m_3$ for $t \in (\underline{t}_2, \bar{t}_2)$, the cost function of the system for cycle $T$ is
$$
f_T(t_1, t_2) = C_o + C_r (t_2 - t_1) + C_s \int_{\underline{t}_1}^{t_1} (q_t - m_1)\, dt + C_s \int_{\underline{t}_2}^{\bar{t}_2} (q_t - m_1 - m_3)\, dt + C_s \int_{t_2}^{\bar{t}_1} (q_t - m_1)\, dt. \tag{3.27}
$$
The second integral in (3.27) is independent of $t_1$ and $t_2$. Therefore,
$$
\frac{\partial f_T(t_1, t_2)}{\partial t_1} = -C_r + C_s (q_{t_1} - m_1), \qquad
\frac{\partial f_T(t_1, t_2)}{\partial t_2} = C_r - C_s (q_{t_2} - m_1). \tag{3.28}
$$
To find the optimal opening and closing times, we set $\partial f_T/\partial t_1 = 0$ and $\partial f_T/\partial t_2 = 0$. The result is
$$
t_1^* = \frac{1}{\omega_0}\left[\sin^{-1}(\zeta) - \phi + \frac{\omega_0}{2\mu}\right], \tag{3.29}
$$
$$
t_2^* = \frac{1}{\omega_0}\left[\pi - \sin^{-1}(\zeta) - \phi + \frac{\omega_0}{2\mu}\right], \tag{3.30}
$$
where
$$
\zeta = \frac{\left(\dfrac{C_r}{C_s} + m_1 - q_0\right)\omega_0}{2\alpha \sin\!\left(\dfrac{\omega_0}{2\mu}\right)} + \sin\!\left(\phi - \frac{\omega_0}{2\mu}\right). \tag{3.31}
$$
Since $\zeta \geq 0$, we have $0 \leq \sin^{-1}\zeta \leq \pi/2$, and therefore $t_1^* < t_2^*$. To show that the pair $(t_1^*, t_2^*)$ is a local minimum of the cost function, we must demonstrate [HHMG+09]
$$
D = \frac{\partial^2 f_T(t_1^*, t_2^*)}{\partial t_1^2}\,\frac{\partial^2 f_T(t_1^*, t_2^*)}{\partial t_2^2} - \left(\frac{\partial^2 f_T(t_1^*, t_2^*)}{\partial t_1 \partial t_2}\right)^{\!2} > 0, \tag{3.32}
$$
and
$$
\frac{\partial^2 f_T(t_1^*, t_2^*)}{\partial t_1^2} > 0. \tag{3.33}
$$
It is easy to see that these conditions hold for $t_1^*$ and $t_2^*$.
Because $\partial f_T/\partial t_1$ is independent of $t_2$, and $0 \leq \sin^{-1}\zeta \leq \pi/2$, we have
$$
\frac{\partial^2 f_T(t_1^*, t_2^*)}{\partial t_1 \partial t_2} = 0,
$$
$$
\frac{\partial^2 f_T(t_1^*, t_2^*)}{\partial t_1^2} = 2\alpha C_s \sin\!\left(\frac{\omega_0}{2\mu}\right)\cos\!\left(\sin^{-1}\zeta\right) > 0,
$$
$$
\frac{\partial^2 f_T(t_1^*, t_2^*)}{\partial t_2^2} = -2\alpha C_s \sin\!\left(\frac{\omega_0}{2\mu}\right)\cos\!\left(\pi - \sin^{-1}\zeta\right) > 0.
$$
Hence, $t_1^*$ and $t_2^*$ are the locally optimal opening and closing times, respectively. Since $(t_1^*, t_2^*)$ is the only local minimum, and the boundary values do not give a lower cost, we can claim that $t_1^*$ and $t_2^*$ are the globally optimal opening and closing times.

In case $q_t \leq c_2$ for all $t$, the second integral in equation (3.27) disappears, and the cost function of the system for cycle $T$ becomes
$$
f_T(t_1, t_2) = C_o + C_r (t_2 - t_1) + C_s \int_{\underline{t}_1}^{t_1} (q_t - m_1)\, dt + C_s \int_{t_2}^{\bar{t}_1} (q_t - m_1)\, dt. \tag{3.34}
$$
The derivatives of $f_T(t_1, t_2)$ are exactly the same as in equation (3.28). Hence, the optimal opening and closing times for this case are the same as (3.29) and (3.30).

Therefore, in both cases $t_1^*$ and $t_2^*$ are the optimal opening and closing times, respectively. This result shows the importance of the ratio of the cost coefficients $C_r$ and $C_s$ for the optimal policies. We did not observe any symmetry of $t_1^*$ and $t_2^*$ with respect to the arrival rate function.

3.5.2 Total Randomness

For the total randomness case, let us return once more to the arrival rate function in equation (3.1). Let $X(t)$ be the sequence of inter-arrival times of this Poisson process; $X$ is exponentially distributed, $X \sim \exp(\lambda_t)$. The variance of the number of arrivals per unit of time is $\lambda_t$, which is time-dependent. We therefore consider the average of $\mathrm{Var}(X)$ over one cycle of $\lambda_t$ to see which parameters affect it. Since for the Poisson distribution the variance is approximately equal to the expectation, we approximate $\mathrm{Var}(X)$ by
$$
\mathrm{Var}(X) = \frac{1}{T}\int_0^T \lambda_t \, dt = \beta. \tag{3.35}
$$
For total randomness, $\mathrm{Var}(X)$ should be much larger than the amplitude of the change in $\lambda_t$, which is $\alpha$. Therefore $\mathrm{Var}(X) \gg \alpha$, which by (3.35) means $\beta \gg \alpha$. In this case, there is practically no time-dependence in the arrivals so, as expected, $N$ and $m$ are fixed numbers for all $t$.
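Returning briefly to the deterministic case, the switching times (3.29)–(3.31) are cheap to evaluate directly; a sketch with hypothetical parameter values (note that $t_1^*$ and $t_2^*$ come out symmetric about the occupancy peak $t_e$ of (3.21)):

```python
import math

# Optimal opening/closing times of the deterministic case, equations
# (3.29)-(3.31); all parameter values here are hypothetical.
T, mu, alpha, beta = 364.0, 0.25, 5.0, 5.0
w0, phi = 2 * math.pi / T, -math.pi / 2
Cr, Cs, m1 = 100.0, 50.0, 20.0
theta = w0 / (2 * mu)
q0 = beta / mu + (alpha / w0) * (math.cos(phi) - math.cos(w0 / mu + phi))

zeta = ((Cr / Cs + m1 - q0) * w0 / (2 * alpha * math.sin(theta))
        + math.sin(phi - theta))                       # (3.31)
t1 = (math.asin(zeta) - phi + theta) / w0              # (3.29)
t2 = (math.pi - math.asin(zeta) - phi + theta) / w0    # (3.30)
t_e = T / 2 + 1 / (2 * mu)                             # (3.21), occupancy peak
```

At $t_1^*$ the occupancy $q_{t_1^*}$ equals $m_1 + C_r/C_s$, which is exactly the first-order condition from (3.28).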
This case is basically a process with time-independent Poisson arrivals, for which we expect a constant $(m, N)$-policy. The numerical results agree with this expectation, as shown in Figure 3.8.

Figure 3.8: $N_t$ and $m_t$ for three cycles with 20 decision epochs in each cycle, where $\beta = 60$ and $\alpha = 1$.

3.6 Chapter Summary

In this chapter, we provided a control policy for a decision process with a periodic time-dependent arrival rate and two levels for the number of servers. An MDP model was built to obtain the optimal policies. We proved that for such an MDP, the transition probabilities converge at a fixed location of the cycle and, consequently, the policies become independent of prior ones after a mixing time. These results led us to the conclusion that the control policies are periodic as well, so we can extend the result of a finite planning horizon to an infinite planning horizon. The numerical results showed a phase difference between the control policies and the rate function. This helps in avoiding the crisis situation where the arrival rate reaches its maximum, and enables us to save resources at the minimum of the arrival rate.

Chapter 4
Case Study: An Emergency Department with Two Levels of Servers and a Seasonal Flu Epidemic

This chapter is an application of the model in chapter 3. We consider the same ED described in section 3.2 with two types of arrivals: regular arrivals and arrivals due to epidemics. We model the flu epidemic with a Poisson process with a time-dependent rate. The seasonal behavior of flu epidemics is reflected in our modeling, which is one of the contributions of this case study. Moreover, we provide an extrapolation method for obtaining the optimal policy for large-scale problems of this type.

4.1 Introduction

Epidemics are one common reason for overcrowded emergency rooms [Der02].
Common diseases such as influenza can easily cause a wide range of severities, from minor symptoms to severe pneumonia, encephalitis, or generalized infection, any of which can be life-threatening and lead patients to the emergency room. A flu epidemic occurs seasonally, so its occurrence is predictable. Therefore, with proper scheduling of resources, a decision maker can control the impact of the epidemic on service quality in EDs.

There are several well-known models for describing epidemiological phenomena; the SIR (Susceptible-Infected-Recovered) model has often been used to model flu epidemics and is claimed to be a better fit for a population with no age structure [Het00]. To the best of our knowledge, there has been relatively little work in the literature combining stochastic modeling and epidemiology. The classic way of dealing with epidemic arrivals in queueing theory is to consider them as batch arrivals [Bla72]. This approximation does not reflect the time-dependent nature of the epidemic and only addresses the steady-state behavior of the queue. Ball and Donnelly [BD95] use the M/G/1 queue to find the total cost of the early stages of an epidemic. Trapman and Bootsma [TB09] use M/G/1 queues with processor sharing to model SIR epidemics. Hernández et al. [HSCCLHC10] derive new approximations to the quasistationary distributions of SIS (Susceptible-Infected-Susceptible) and SEIS (Susceptible-Exposed-Infected-Susceptible) stochastic epidemic models using the M/G/c queue, treating each individual as a server that is either infected (busy) or susceptible (idle). While these endeavours are good examples of modeling an epidemic with stochastic processes, they involve entirely different approaches from ours. We use results from the SIR model, more specifically the infected population, to define an arrival rate for the M(t)/M/c/c queue.
In this manner, we can obtain the transient probabilities of a system with epidemic arrivals.

In this chapter, we study the influence of a seasonal flu epidemic on the resource allocation policies of an ED with two levels for the number of available beds. The main questions are when to decide to use the surge resources, and how long to keep them open. We establish a periodic time-dependent arrival rate by combining regular and epidemic arrival rates, and study the behavior of the optimal policies of the MDP. The structure of the ED is the same as in section 3.2, except for its arrival process.

We start by describing the model in section 4.2, showing the behavior of the SIR model with a time-dependent transmission rate and comparing it to the traditional constant-rate SIR model. We then discuss what the arrival rate due to the epidemic looks like and how it can be combined with regular arrivals. The periodicity of the policies is shown with numerical results in section 4.3. Subsequently, linear extrapolation for large-scale cases is explored in section 4.4. Conclusions are presented in section 4.5.

4.2 Problem Description

We give the general characteristics of the problem as it arose from KGH. Consider an emergency department with two arrival processes: a Poisson process with a constant rate for regular arrivals, and a non-homogeneous Poisson process with a time-dependent rate. There are two levels for the number of beds (servers), and the decision maker chooses which one to use in different situations. Let $m_1$, $m_2$ and $m_3$ be the total numbers of beds in the main, stretcher, and surge sections, respectively. Then the lower level for the number of servers is $c_1 = m_1 + m_2$ and the upper level is $c_2 = m_1 + m_2 + m_3$. The service time is exponentially distributed and the service rate of each bed is constant.
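The transient probabilities used throughout this thesis — $p^{\,n}_{b,k}(t)$ and $q^{\,n}_{b,k}(t)$ in chapter 3, and their analogues here — solve the Kolmogorov forward equations of the M(t)/M/c/c loss queue. A minimal sketch of integrating them with classical fourth-order Runge-Kutta (the rate function and sizes below are hypothetical, not the thesis's modified scheme of section 2.3):

```python
# Transient state probabilities of an M(t)/M/c/c loss queue, obtained by
# integrating the Kolmogorov forward equations with classical RK4 (a sketch).
def forward_rates(p, t, lam, mu, c):
    """dp_k/dt for k = 0..c; lam is a function of time, mu a per-bed rate."""
    dp = [0.0] * (c + 1)
    for k in range(c + 1):
        rate_out = (lam(t) if k < c else 0.0) + k * mu  # leave state k
        dp[k] = -rate_out * p[k]
        if k > 0:
            dp[k] += lam(t) * p[k - 1]                  # arrival k-1 -> k
        if k < c:
            dp[k] += (k + 1) * mu * p[k + 1]            # departure k+1 -> k
    return dp

def rk4_step(p, t, h, lam, mu, c):
    f = lambda q, s: forward_rates(q, s, lam, mu, c)
    k1 = f(p, t)
    k2 = f([x + h / 2 * k for x, k in zip(p, k1)], t + h / 2)
    k3 = f([x + h / 2 * k for x, k in zip(p, k2)], t + h / 2)
    k4 = f([x + h * k for x, k in zip(p, k3)], t + h)
    return [x + h / 6 * (a + 2 * b + 2 * g + d)
            for x, a, b, g, d in zip(p, k1, k2, k3, k4)]
```

As a sanity check, with a constant rate the solution relaxes to the truncated-Poisson stationary distribution of the ordinary M/M/c/c queue.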
There are two costs related to the surge section: the cost of opening it (applied each time it opens) and the cost of running it (a daily cost). There is a further cost related to the number of patients on stretchers. The quality of service is inversely related to the number of patients on stretchers: the fewer patients on stretchers, the better the quality of service.

In this section, we first compare the SIR model with a time-dependent transmission rate to the traditional model, and study the effect of time-dependence on the evolution of the proportion of infected people. We then define an arrival rate by combining the results of the time-dependent SIR model with regular arrivals and discuss its distribution. Finally, using fourth-order Runge-Kutta, we obtain the necessary probabilities of the M(t)/M/c/c queue for the MDP given in section 3.2.

4.2.1 Seasonally Forced SIR Model

We consider the SIR model for modeling an epidemic. The model consists of three non-linear ordinary differential equations:
$$
S' = -\beta S I, \tag{4.1}
$$
$$
I' = \beta S I - \gamma I, \tag{4.2}
$$
$$
R' = \gamma I, \tag{4.3}
$$
where $S$, $I$ and $R$ are the proportions of susceptible, infectious and recovered/immunized individuals, respectively, and the prime denotes the derivative with respect to time $t$. Here $\gamma$ is the recovery rate and $\beta$ is the transmission rate; $N_p$ denotes the population size. The SIR model assumes that the total population is fixed. This implies
$$
S + I + R = 1, \tag{4.4}
$$
$$
S' + I' + R' = 0. \tag{4.5}
$$
The reproduction number is defined as
$$
N_R = \frac{\beta}{\gamma}. \tag{4.6}
$$
If $N_R > 1$, the infection is capable of spreading in the population; otherwise it dies out. $N_R$ is 3-5 for diseases such as influenza and smallpox
Moreover, we assume there is one break-out in each year,and at the end of an epidemic year, all the parameters of the SIR model arereset to their initial values. Following Towers et al. in [TVGZF11], we setβ(t):β(t) = β0 (1 +  cos (2piωt)) (4.7)where ω = 1/364 per day. The maximum of β(t) happens at the beginningof an epidemic year. It has been estimated that  = 0.35 [TVGZF11]. Figure4.1 compares SIR model results for constant and time-dependent transmis-sion rates. The initial condition for both models are [S(0), I(0), R(0)] =[0.99, 0.01, 0] and γ = 0.04. β(t) = 0.1 (1 + 0.35 cos(2piωt)), and for the con-stant model β = 0.1 which is the average of β(t). It can be seen that for aslow-changing β(t) as in equation (4.7), S, I and R change smoothly. How-ever, the period of the epidemic is longer for the time-dependent model andhas a smaller maximum infected proportion when compared to the constantmodel. Furthermore, in the time-dependent model, S does not drop downto zero and R does not reach 1 by the end of epidemic, in contrast to theconstant model. This is more realistic for epidemic diseases because in suchdiseases, it is not necessary that the entire population gets infected by theepidemic.Since the SIR model with time-dependent transmission rate is a betterfit for flu epidemic curves, we use this model to approximate our epidemicarrivals.594.3. Periodicity of The Control Policies4.2.2 Combining the RatesTo define a total arrival rate for the ED, we consider arrivals due to aflu epidemic as a constant ratio of the infected people, say αNpI where α isbetween 0 and 1. Assume regular arrivals occur with a Poisson distributionwith rate λ.It is known from experience that if arrivals to a system come from manydifferent sources, the process will be Poisson [Khi60], [Cox62]. In an SIRmodel, assuming each individual as a source of arrival and the infection as asuccess, we can say that epidemic arrivals are Poisson. 
In addition, Tuckwelland Williams have shown in [TW07] that the number of susceptible, infectedand recovered people in an SIR model are binomial random variables in adiscrete time space and have the Markovian property for constant trans-mission rate. It is evident that with β(t), the Markovian property will bepreserved; however, the process would no longer be homogeneous. The sum-mation of two Markovian processes will be Markovian, so we can concludethe queueing model here has the Markovian property. We putλt = λ+ αNpI(t). (4.8)Now we can obtain the transition matrix and the transient solutionsfor the M(t)/M/c/c queue using the modified fourth-order Runge-Kuttamethod as described in section 2.3. We use the transient solutions to esti-mate integrals in the cost function as given in section 3.2.4.3 Periodicity of The Control PoliciesHaving the model constructed, we can now provide numerical resultsto observe the behavior of the control policies. For obtaining the contourpoints at each decision epoch, we use the outcomes of the MDP model. Tofind Nt, we look at the decisions at each stage n for each state (0, b), whereb ∈ B0 and find the minimum value for b at which one opens the surge. Thisminimum value is equal to Ntn , where tn is the time at stage n. For mt, weinvestigate the decisions at each stage n for each state (1, b), where b ∈ B1.mt will be the maximum value of b for closing the surge section at stage n.We put mtn = −1 if we do not close the surge even if the system is emptyat stage n.The arrival rate is periodic, hence it is expected that the policies will beperiodic as well based on our results in section 3.3. Figure 4.2 shows Nt andmt as an example of an ED with epidemic arrivals. Here, the transmission604.4. 
rate is $\beta(t) = 0.1(1 + 0.35\cos(2\pi\omega t))$, where $\omega = 1/364$, and the recovery rate is $\gamma = 0.04$. We set $\mu = 0.2$ per day per bed, $m_1 = 12$, $m_2 = 28$, $m_3 = 20$, $\lambda = 2$ per day, $T = 364$ days, $C_o = 2000$, $C_s = 500$ per day per stretcher and $C_r = 200$ per day. Decisions are made weekly, so there are 52 decision points in each cycle. The planning horizon is 3 cycles.

Figure 4.2: $N_t$ and $m_t$ for three cycles, $T = 364$, with 52 decision epochs in each cycle.

The periodic behavior of $N_t$ and $m_t$, with the same period as $\lambda_t$, is evident. Both $N_t$ and $m_t$ deviate from the pattern close to the terminal stage; this is expected, since terminal rewards can affect policies close to them. The policies match the periodic pattern after a short time moving away from the terminal stage, which shows that with this setting the equilibrium is reached very quickly. The periodic behavior of the policies allows us to extend the planning horizon to infinity: as long as we know where we are in the cycle, we can provide the control policy.

4.4 Extrapolation

Since execution time becomes substantial when the number of beds is large, we try to obtain the policies by extrapolating results from smaller problems. For this purpose, taking the above example, we multiply $\lambda_t$, $m_1$, $m_2$ and $m_3$ by a constant number $\kappa$ and solve the problem with the MDP for one cycle. In this way, the traffic intensity stays the same in all cases. We take $\kappa = 0.25, 0.5, \dots, 3.75, 4$.

Figure 4.3: Changes in $N_t/\kappa$ and $m_t/\kappa$ as the problem size increases, where $\kappa_1 = 1$, $\kappa_2 = 2$, $\kappa_3 = 3$ and $\kappa_4 = 4$.
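The scaling experiment produces, for each decision epoch, a set of $(\kappa, N)$ or $(\kappa, m)$ pairs; fitting and extrapolating them is ordinary least squares followed by rounding. A sketch with hypothetical data:

```python
# Per-epoch linear extrapolation of a contour value (a sketch with
# hypothetical data): ordinary least squares of N (or m) against kappa,
# then prediction rounded to an integer policy value.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def extrapolate(kappas, values, kappa_new):
    slope, intercept = fit_line(kappas, values)
    return round(slope * kappa_new + intercept)  # policies must be integers
```

Running this once per decision epoch, for $N$ and for $m$ separately, yields the predicted $(m_t, N_t)$-policy for a larger $\kappa$.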
Figure 4.3 shows the policies of the MDP model for four of the problem sizes, $\kappa_1 = 1$, $\kappa_2 = 2$, $\kappa_3 = 3$, $\kappa_4 = 4$, and the evolution of $N_t/\kappa$ and $m_t/\kappa$ as the size of the problem increases. We divide $N_t$ and $m_t$ by $\kappa$ in order to observe the in-scale evolution of the policies as the problem grows. The patterns are quite consistent: while $m_t/\kappa$ shows a tangible increase with problem size, $N_t/\kappa$ decreases at larger values. A coherent behavior exists for constant-rate models: in time-independent cases with fixed traffic intensity, a problem with a larger number of servers has a relatively smaller variance in its process, and therefore its upper/lower control value becomes proportionally smaller/larger.

We start with a simple linear extrapolation for each contour point, $N$ and $m$, at each decision point. For the extrapolation, let the dependent variable be $N$ (or $m$) at the $n$-th decision epoch and the independent variable be $\kappa$, the constant multiplying the size of the problem. Using all the data collected from the MDP model for $\kappa = 0.25, 0.5, \dots, 3.75, 4$, we fit a regression line for $N$ (and $m$) at the $n$-th decision point. We can then predict the value of $N$ (or $m$) for a larger $\kappa$. We repeat the procedure for all decision epochs in two cycles, where each cycle has 52 points in our problem, and obtain the predicted $(m_t, N_t)$-policies. The predicted values of $N$ and $m$ are not always integers, as desired, so we round them to the closest integer.

Figures 4.4 and 4.5 show the extrapolation of $N_t$ and $m_t$ at different decision epochs. Linear forecasting works reasonably for $m_t$ at all decision
Extrapolation   R² = 0.9951R² = 0.9864R² = 0.995102468101214160 0.5 1 1.5 2 2.5 3 3.5 4 4.5N(t)κWeek 3 Week 16 Week 55R² = 0.9993R² = 0.9985R² = 0.9982R² = 0.9951R² = 0.999201020304050600 0.5 1 1.5 2 2.5 3 3.5 4 4.5N(t)κWeek 25 Week 50 Week 52 Week 55 Week 77Figure 4.4: Linear extrapolation of Nt for different decision epochs  R² = 0.9673R² = 0.9858R² = 0.9858-5051015202530350 0.5 1 1.5 2 2.5 3 3.5 4 4.5m(t)κWeek 22 Week 23 Week 76R² = 0.9946R² = 0.9966R² = 0.9952-50510152025303540450 0.5 1 1.5 2 2.5 3 3.5 4 4.5m(t)κWeek 25 Week 50 Week 77Figure 4.5: Linear extrapolation of mt for different decision epochs634.4. Extrapolation  -100102030405060700 20 40 60 80 100 120N(t)Number of the weekExtrapolation MDP-1001020304050600 20 40 60 80 100 120m(t)Number of the weekExtrapolation MDPFigure 4.6: Comparing the extrapolation and MDP results of mt and Nt forκ = 5points since the coefficient of correlation is large enough. However, for Ntin weeks 13 to 15 of each cycle, this does not yield a good estimation. Thereason can be the quick change in Nt at this time in the cycle as it can beseen in figure 4.3. When we investigated the error of the cost function, itturned out that the largest error occurs between weeks 22 and 30 in eachcycle. This could be the result of the fluctuation in Nt and mt in thisinterval (figure 4.2), and it explains why the linear extrapolation does notwork as well as for the rest of the cycle. However, the large error of thisarea does not affect the total error at the beginning of the cycle. The reasonis that although there are large errors in some states in specific weeks, stillthe majority of the states in those weeks do not produce a large error. Themedians of error for those weeks are still around 10%. Therefore, consideringequation (3.9), having a large error in some states at some decision pointswill not affect the costs or policies in the long run. 
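The per-epoch regression just described can be sketched as follows. The `N_values` below are synthetic stand-ins for the MDP thresholds at one decision epoch; in the thesis these come from actually solving the scaled MDP instances for each κ.

```python
import numpy as np

# kappa values used to scale lambda_t, m1, m2, m3 (traffic intensity fixed)
kappas = np.arange(0.25, 4.01, 0.25)

# Hypothetical integer thresholds N at one decision epoch for each kappa;
# real values would come from solving the scaled MDP instances.
rng = np.random.default_rng(0)
N_values = np.round(3.1 * kappas + 1.4 + rng.normal(0, 0.3, kappas.size))

def extrapolate_threshold(kappas, values, kappa_new):
    """Fit N (or m) at a fixed decision epoch as a linear function of kappa,
    then predict the policy threshold for a larger problem, rounded to the
    nearest integer as in the text. Also returns R^2 as a fit diagnostic."""
    slope, intercept = np.polyfit(kappas, values, deg=1)
    r = np.corrcoef(kappas, values)[0, 1]
    return int(round(slope * kappa_new + intercept)), r ** 2

pred, r2 = extrapolate_threshold(kappas, N_values, kappa_new=5.0)
print(pred, r2)
```

Repeating this for each of the 52 decision epochs in a cycle, once for N and once for m, yields the full predicted (mt, Nt)-policy for a larger problem without solving its MDP.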
One might use a second-order extrapolation for the parts with large error; however, since there is no large error in the total cost function, we do not follow that path here and accept linear extrapolation as a reasonable method for obtaining the policies for large problems.

To obtain a visual sense of the quality of the extrapolation prediction, a comparison between the actual and predicted policies for κ = 5 is provided in Figure 4.6. The patterns of Nt and mt are nearly identical before week 20 in each cycle. For Nt, between weeks 20 and 40 of each cycle, the prediction is below the optimal result from the MDP; for mt, this difference starts from week 19 and persists until the end of each cycle. Overall, the predicted policies are very close to the optimal results, and the small error in the total cost of the system confirms this. The predictions also have a desirable property: the predicted Nt and mt are below the optimal values. This means that it is safe to use the predicted policies, since we always open the surge section earlier and close it later than the optimal policy would. This might entail somewhat larger costs, but it assures us that the quality of service will be no worse than under the optimal policy.

4.5 Chapter Summary

We used an SIR model with a time-dependent transmission rate to model the arrival rate during a flu epidemic. It turned out that this arrival process has the Markovian property. We combined these arrivals with constant-rate Poisson arrivals from the regular patients of an ED, and obtained the transient solutions of an M(t)/M/c/c queue. After defining a decision process, we obtained periodic control policies, since the arrival rate was periodic as well. Hence, the problem can easily be extended to an infinite horizon.
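The transient solutions of the M(t)/M/c/c queue mentioned above come from integrating its Kolmogorov forward equations. The sketch below does this with a classical fourth-order Runge-Kutta step (not the thesis's modified Runge-Kutta, which adapts this scheme), using an illustrative sinusoidal arrival rate and the example's µ = 0.2 per day per bed; the specific rate function and bed count here are placeholders.

```python
import math

def kolmogorov_rhs(t, p, lam, mu):
    """Right-hand side of the forward equations for an M(t)/M/c/c (Erlang
    loss) queue: p[k] is the probability of k busy beds, c = len(p) - 1."""
    c = len(p) - 1
    l = lam(t)
    dp = [0.0] * (c + 1)
    for k in range(c + 1):
        if k > 0:
            dp[k] += l * p[k - 1]             # arrival into state k
            dp[k] -= k * mu * p[k]            # departure out of state k
        if k < c:
            dp[k] -= l * p[k]                 # arrival out of state k
            dp[k] += (k + 1) * mu * p[k + 1]  # departure into state k
    return dp

def rk4_step(t, p, h, lam, mu):
    """One classical fourth-order Runge-Kutta step on the state vector p."""
    def add(v, k, w):
        return [vi + w * ki for vi, ki in zip(v, k)]
    k1 = kolmogorov_rhs(t, p, lam, mu)
    k2 = kolmogorov_rhs(t + h / 2, add(p, k1, h / 2), lam, mu)
    k3 = kolmogorov_rhs(t + h / 2, add(p, k2, h / 2), lam, mu)
    k4 = kolmogorov_rhs(t + h, add(p, k3, h), lam, mu)
    return [pi + h / 6 * (a + 2 * b + 2 * c_ + d)
            for pi, a, b, c_, d in zip(p, k1, k2, k3, k4)]

# Illustrative periodic arrival rate around lambda = 2 per day, c = 12 beds.
lam = lambda t: 2.0 * (1 + 0.3 * math.sin(2 * math.pi * t / 364))
c = 12
p = [1.0] + [0.0] * c          # start with an empty system
t, h = 0.0, 0.1
while t < 30.0:
    p = rk4_step(t, p, h, lam, 0.2)
    t += h
print(sum(p), p[c])            # distribution stays normalized; p[c] is the
                               # time-dependent blocking probability
```

Swapping in the combined periodic arrival rate of chapter 4 would give the time-dependent state distribution over a cycle, from which the blocking probability p[c](t) can be read off at each decision epoch.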
Moreover, we showed that linear extrapolation is a reasonable approximation for obtaining optimal policies for large-scale problems.

Chapter 5

Conclusions and Future Work

The primary goal of this thesis was to provide optimal policies for operating surge beds in the emergency department of a hospital. For this purpose, we provided various mathematical and practical contributions.

Chapter 3, the core of this thesis, presented the MDP model for solving the problem. The distinctive characteristics of the MDP model were the time-dependent arrival rate of the process and the two levels for the number of servers. We proved that decisions for an MDP with periodic, time-dependent Poisson arrivals are periodic as well. Consequently, the contour control policies obtained from the optimal decisions show periodic behavior too. The numerical results presented in chapter 3 support this claim.

Although we only proved the periodicity of the policies for periodic arrival rates, the result can be extended to periodic service rates as well. A periodic service rate is not an applicable or reasonable assumption for our problem; however, it might be useful for other types of problems. If the service and arrival rates are both periodic, we expect similar periodicity of the policies as long as the two periods are the same. Otherwise, we believe that the period of the policies will be the least common multiple of the period of the arrival rate and the period of the service rate. Proving this claim can be addressed in future work related to this thesis.

The acuteness of the conditions of patients arriving at an ED is not uniform, and some patients need to be served with higher priority. Despite the complexity of our model under the current assumptions, one contribution to this thesis could be considering priority for patients.
On the other hand, one might claim that scheduling the surge beds is not so directly related to the patients' conditions, and that defining a general arrival rate will suffice.

Another contribution to the model in chapter 3 would be to assume a longer service time for stretchers. However, this assumption can increase the time complexity of the problem substantially.

In chapter 2, we adapted the fourth-order Runge-Kutta method to obtain more accurate transient solutions for M(t)/M/c/c queues. We called our method the modified Runge-Kutta, and we showed numerically that it works better than RK4 for our specific queue. The method can be used for any other type of queue as well; however, we do not have enough evidence to claim it will always outperform RK4. In addition, in chapter 2, we tried to adapt the randomization method for M(t)/M/c/c queues with varying arrival rates. The original form of the randomization method is simple and useful for homogeneous queues, but it turned out not to work as well for time-dependent queues, where randomization becomes complex and time-consuming. Nevertheless, we can still obtain results comparable to the RK4 method by using an adaptation of randomization.

To demonstrate an application of the model provided in chapter 3, we modeled seasonal flu epidemics in chapter 4 and defined a combined arrival rate for the ED. The results of this model supported the claim regarding the periodicity of the policies discussed in chapter 3. In addition, we showed that a type of linear extrapolation is a reliable way to obtain control policies for large-scale problems.
