DYNAMIC SUBFRAME ALLOCATION IN ARESERVATION MULTIPLE ACCESS CHANNELbyPETER D. ROORDAB.A.Sc. The University of Waterloo, 1991A THESIS SUBMITTED IN PARTIAL FULFILLMENT OFTHE REQUIREMENTS FOR THE DEGREE OFMASTER'S OF APPLIED SCIENCEinTHE FACULTY OF GRADUATE STUDIES(Department of Electrical Engineering)We accept this thesis as conformingto the required standardTHE UNIVERSITY OF BRITISH COLUMBIAJuly 1993©Peter Roorda, 1993In presenting this thesis in partial fulfilment of the requirements for an advanceddegree at the University of British Columbia, I agree that the Library shall make itfreely available for reference and study. I further agree that permission for extensivecopying of this thesis for scholarly purposes may be granted by the head of mydepartment or by his or her representatives. It is understood that copying orpublication of this thesis for financial gain shall not be allowed without my writtenpermission.(Signature) Department of Electr. 42-0,..\ The University of British ColumbiaVancouver, CanadaDate ^, (113DE-6 (2/88)AbstractMultiaccess protocols allow multiple geographically diverse users to communicate with acentral station or each other over a single broadcast channel. Reservation protocols are a classof multiaccess protocols appropriate for satellite and radio networks. This thesis investigatesdynamic time slot access control schemes for reservation multiaccess protocols similar tothose investigated by Roberts [1] and others. Previous work has considered this model onlyfor the case of a static frame subdivision, the majority of the work relying on approximatemethods for analysis. In this thesis, schemes are considered where the subframe allocation iscontrolled on a frame by frame basis based on the current state of the finite user population.The controlled system is modelled as a Markov decision process with the subframeallocation as the decision variable. Optimality equations are provided and using infinitehorizon dynamic programming techniques, the control policy that maximizes the averagethroughput on the channel based on complete state information is found. In addition tothe optimal policy, we propose two heuristic dynamic allocation schemes that while beingsuboptimal, have implementation advantages over the optimal. Performance results indicatethat the implementation of dynamic subframe allocation into the model significantly improvesthe performance of the network over previously considered fixed allocation schemes, bothin terms of lowering the average message transmission delay and increasing the channelcapacity. Proposed are methods of estimating the unknown state using a variation of Rivest's[2] pseudo-Bayesian scheme to allow approximation of the dynamic control schemes whichdepend on full state information. Simulation results indicate that the implementation of stateestimation does not significantly erode the performance of the dynamic policies. Extendingthe model to incorporate multipacket messages is discussed and performance results indicatethat in that case also, protocol performance may be improved using dynamic allocation.iiContentsAbstract ^ iiList of Tables viList of Figures ^ viiAcknowledgment ixChapter 1 INTRODUCTION ^ 11.1 Motivation and Objectives 11.2 Background ^ 21.2.1 Multiaccess protocols ^ 21.2.2 Reservation multiaccess protocols ^ 31.2.3 Markov decision process theory 51.2.4 State estimation ^ 61.3 Structure of the Thesis 6Chapter 2 NETWORK MODEL AND MARKOV DECISION PROCESSFORMULATION ^ 82.1 The Basic Reservation Model ^ 82.2 Delay Modelling ^ 102.3 Markov Decision Process Formulation ^ 122.3.1 System state ^ 132.3.2 Actions 132.3.3 Rewards ^ 142.3.4 Dynamic equations ^ 142.3.5 State transition probabilities ^ 15iii2.4 Solving for the System Throughput and Delay ^ 162.4.1 Deriving the throughput of the Markov process 162.4.2 Expected packet delay ^ 172.4.3 Summary of steps to find throughput and delay ^ 18Chapter 3 SUBFRAME ALLOCATION SCHEMES ^ 193.1 Fixed Allocation ^ 193.2 Optimal Allocation 203.2.1 Optimality equations ^ 203.2.2 Solving for the optimal policy with dynamic programming ^ 233.2.2.1 The policy iteration algorithm ^ 233.2.2.2 The modified policy iteration algorithm ^ 243.2.3 Optimal performance ^ 263.3 "Data Priority" (DP) Allocation 303.4 "Maximum Flow" (MF) Allocation ^ 343.5 Capacity Analysis^ 38Chapter 4 SYSTEM STATE ESTIMATION ^ 444.1 True Bayesian Estimation ^ 464.2 Pseudo-Bayesian Estimation for the Reservation Subchannel ^ 49Chapter 5 MULTIPACKET MESSAGE MODELS ^ 555.1 The M1 Model ^ 575.1.1 Model and Markov decision process formulation ^ 575.1.2 Dynamic allocation policies ^ 595.1.2.1 Fixed allocation ^ 595.1.2.2 Optimal policy 60iv5.1.2.3 DP policy^ 605.1.2.4 MF policy 605.1.3 Numerical results ^ 615.1.4 Capacity approximation 665.1.5 Problems with the M1 model ^ 685.2 The M2 Model ^ 695.2.1 Model and state structure ^ 695.2.2 Simple allocation in the M2 model^ 71Chapter 6 CONCLUSION ^ 756.1 Summary ^ 756.2 Proposals for Further Research ^ 76Bibliography ^ 79Appendix A Derivation of Conditions for a Policy to Produce an Ergodic MarkovProcess ^ 82Appendix B Joint Probability Distribution for Observations in the ReservationSubframe when n' Requests are Transmitted ^ 84Appendix C Glossary of Symbols, Acronyms and Abbreviations ^ 86List of TablesTable 1 Optimal policies for a range of arrival probabilities for the parameters(M=20, F=5, V=3) ^ 28Table 2 DP policy for the parameters (M=20, F=5, V=3) ^ 30Table 3 MF policy for the parameters (M=20, F=5, V=3) 36Table 4 Optimal policies for a range of arrival probabilities in the M1 model withparameters (M=20, F=5, V=3, R=3) ^ 62viList of FiguresFigure 1 The frame structure of the reservation protocol ^ 9Figure 2 Queuing control representation of the reservation model ^ 15Figure 3 Throughput vs. arrival probability for the parameter set (M=20, F=5, V=3): fixed allocation policies and optimal policy ^ 21Figure 4 Average delay vs. throughput for the parameter set (M=20, F=5, V=3) :fixed allocation policies and optimal policy ^ 21Figure 5 Throughput vs. arrival probability for the parameter set (M=20, F=5, V=3): fixed allocation policy, optimal policy, MF policy and DP policy ^ 32Figure 6 Average delay vs. throughput for the parameter set (M=20, F=5, V=3) :fixed allocation policy, optimal policy, MF policy and DP policy ^ 32Figure 7 Throughput vs. arrival probability for the parameter set (M=150, F=12,V=12) : fixed allocation policy, MF policy and DP policy ^ 33Figure 8 Average delay vs. throughput for the parameter set (M=150, F=12,V=12) : fixed allocation policy, MF policy and DP policy ^ 33Figure 9 Approximate capacity plotted against the number of minislots per slot, V,for the ideal capacity and the best fixed allocation policies' capacity forthe case F=5 ^ 40Figure 10 Approximate capacity plotted against the number of minislots per slot,V, for the ideal capacity and the best fixed allocation policies' capacityfor the case F=5 ^ 40Figure 11 Average delay vs. throughput for the parameter set (M=20, F=5, V=3) :MF policy with and without estimation ^ 53Figure 12 Average delay vs. throughput for the parameter set (M=150, F=12,V=12) : MF policy with and without estimation ^ 53viiFigure 13 Queueing control representation of the M1 multipacket reservationmodel^ 58Figure 14 Throughput vs. arrival probability in the M1 model for the parameter set(M=20, F=5, V=3, R=3) : fixed allocation policy, optimal policy, MFpolicy and DP policy ^ 63Figure 15 Average delay vs. throughput in the M1 model for the parameter set(M=20, F=5, V=3, R=3) : fixed allocation policy, optimal policy, MFpolicy and DP policy ^ 63Figure 16 Throughput vs. arrival probability in the M1 model for the parameter set(M=150, F=15, V=8, R=5) : fixed allocation policy, MF policy and DPpolicy ^ 65Figure 17 Average delay vs. throughput in the M1 model for the parameter set(M=150, F=15, V=8, R=5) : fixed allocation policy, MF policy and DPpolicy 65Figure 18 Approximate capacity plotted against the average message length, R,for the ideal capacity, the best fixed allocation policies' capacity and theMF policy capacity for the case F=15 ^ 67Figure 19 Average delay vs. throughput for a fixed policy and the MF policy withestimation in the M1 and M2 models with parameters (M=15, F=5, V=3,R=3) ^ 73viiiAcknowledgmentI would like to thank my supervisor, Dr. Victor Leung, for his valuable guidance andsuggestions in both my research and in the preparation of this document.I also wish to thank the National Science and Engineering Research Council of Canadafor the funding that made this work possible.ixChapter 1 INTRODUCTION1.1 Motivation and ObjectivesThe research presented in this thesis considers the idea of dynamically allocating channelresources to the reservation and data subchannels based on system state feedback in areservation multiple access protocol. While a great deal of research has been done onreservation protocols, the work has primarily focussed on systems where the subchannelallocation is fixed or where dynamic allocation is based on heuristic algorithms. No onehas investigated the idea of fully using the available system state information to optimallypartition the channel dynamically.We consider a simple reservation model originally proposed by Roberts [1], altering it toallow a frame-by-frame allocation partition control. Without such control, useful data slotspotentially lie idle, causing decreased bandwidth efficiency. By incorporating this type ofcontrol, the system may adaptively allocate channel resources according to the state of thesystem to increase the system throughput. It is obviously desirable to optimize with respectto system throughput the dynamic allocation strategy. This desire motivates us to developa fairly simple reservation model that allows exact mathematical analysis and optimization.While the theoretical appeal of such an optimal result is indeed significant, it is also importantto bear in mind the implementation complexity of any proposed system. For this reason weconsider not only optimal allocation control, but also heuristic suboptimal strategies whichachieve near optimal performance while holding advantages for real system implementation.In this thesis, it is necessary to make simplifications in the reservation model to allowexact numerical analysis of the system performance. One of these assumptions is that thesystem state is known by all users. Since this is not necessarily the case in a real network,it is necessary to develop a channel state estimation technique to allow implementation of1proposed state-dependant control schemes. Finally, it is of interest to extend the simplesingle-packet message model to one where messages are of variable length. This extension ofthe model is appealing in that it incorporates a more general family of multiaccess networks.Specifically our objectives are as follows:1. to determine, for a fairly simple reservation model, the optimal subframe allocation policybased on system state feedback which maximizes the system throughput,2. to compare the performance of the optimal policy performance to previously consideredfixed allocation schemes,3. to investigate and propose heuristic policies that, while suboptimal, show performanceclose to optimal and have implementation advantages over the optimal,4. to develop a simple and effective state estimator that is necessary for state dependantallocation schemes,5. to investigate the extension of these results to multipacket message models.1.2 Background1.2.1 Multiaccess protocolsMultiaccess protocols for packet communication systems enable many users at diverselocations to efficiently share a common broadcast channel for communicating with a centralstation or with each other by packet transmissions. The designer of such protocols strivesto achieve a channel access discipline that achieves communication that is both reliable andbandwidth efficient. While the fixed assignment techniques of time division multiplexingand frequency division multiplexing are well suited protocols for systems where user trafficis continuous, these schemes become extremely inefficient when user traffic is bursty andlow rate as is often the case for data traffic. In this case, we desire more flexible protocolswhere idle users do not use bandwidth needlessly and users with data to send may transmitwithout undue delay. Many such protocols have been implemented and many more proposed2in the communications literature. The wide variety in protocols reflects the differences in thephysical characteristics of the communications channel being shared, the nature of the trafficand the characteristics of the user population.One common characteristic in all multiaccess protocols is that in each there is an amountof channel bandwidth used to coordinate station access in the flexible assignment scheme.Thus there is a basic split in the channel, between control overhead and useful data transfer. Insimple contention based random access systems the control overhead is represented by the idleand collision times between useful data packets. For token rings and busses, the token passingtime is the control overhead. For reservation multiple access protocols, the control overheadis the reservation subchannel used for reservation request transfer. In all of these cases, theefficiency of the protocol is dictated by how small the control overhead can be kept relative tothe amount of useful data exchange. For the most part, the great body of work in the literatureon multiaccess protocols is composed of the invention and analysis of heuristic protocols thatcontrol channel access with a minimum of overhead. Finding the optimal policy has beenelusive and research into optimal protocols has lagged behind the more successful heuristicproposals. In this thesis, we consider the optimization of performance for a specific protocolwithin a class of multiaccess protocols referred to as reservation protocols, where perhaps thechannel split between control overhead and data transmission is made most explicitly.1.2.2 Reservation multiaccess protocolsWithin a reservation multiaccess scheme, a terminal with data to send transmits areservation request on the shared channel to request future channel time to send its data.This request is received (by the scheduling station in a centrally controlled system or byall users in a distributed-control system) and used to allocate channel time to the requestinguser based on the protocol's assignment discipline. Thus, the broadcast channel is essentiallysplit into two subchannels — a reservation subchannel for reservation request traffic and a3data subchannel for transmitting data messages on reserved channel time. The channel splitmay be achieved by either time-division [1], [3], [4], [5], [6] or frequency division [7], [8].Requests for reservations may be sent explicitly as small request packets [1], [6], [9] or beimplied by the successful transmission of an initial data packet [3], [4], [8]. Finally, in thereservation subchannel, protocols may use fixed assignment [9] or random access [1], [4], [8]as a discipline for request packet transmission.In this thesis, a protocol is considered that is very similar to that first proposed byRoberts [1]. The channel is time-divided into the reservation subchannel and data subchannel.Reservation requests are sent explicitly in a S-ALOHA fashion via minislots in the reservationsubchannel. Successful reservation requests join a global queue and those stations in the queuehave channel time allocated to them in a first come first served basis in the data subchannel.It is well known that such a protocol is very efficient for systems where the number of usersis large, the round trip signal propagation time is long compared to the packet length andthe users' message lengths are long in comparison with the reservation request packets. Assuch, reservation protocols of this type are most appropriate for VSAT (very small apertureterminal) satellite networks, mobile satellite networks or packet radio networks.Previous analyses of protocols using minislots for reservation have considered systemswhere the partition between the reservation and data subframes is fixed [1], [5], [10]. Thisbrings about an obvious shortcoming in the protocol. By having a fixed length data subframe,slots may be wasted when there are no stations waiting to send. This inefficient partitioningof the channel prompts us to investigate how better to allocate frame slots, to allow thesubframe resource allocation to be more sensitive to the current state of the system. Thus, inthis thesis, we consider the system where the subframe allocation is dynamically set accordingto feedback information on a frame-by-frame basis. It is shown in this thesis that the useof dynamic allocation in the protocol can produce significantly higher throughputs and lower4average delays than the previously considered fixed schemes for packet transmissions inmedium and high traffic situations.While use of such dynamic allocation in reservation models is not entirely new, Ours isthe first attempt to optimize this allocation for the model under consideration on the basisof full state feedback. Rubin [6] proposed a minislot reservation protocol that automaticallyadjusts allocation based on a portion of the state information and some traffic estimates, but heabandons the fixed frame size and makes no attempt to optimize the allocation. Protocols thatuse first packet transmission for reserving future slots proposed in [3], [4], [8] implicitly usepartial state information to dynamically allocate channel time but their approach is heuristic innature, again with no attempt at optimization. The success of the heuristic dynamic allocationschemes in these similar protocols further motivates us to not only extend the concept ofdynamic allocation to the minislot reservation model, but to optimize that allocation usingMarkov decision process techniques.1.2.3 Markov decision process theorySzpankowski [10] uses a detailed Markovian analysis of Roberts [1] model to determinesystem performance with a fixed subframe allocation. By reformulating the model from asimple Markov process with fixed allocation to a Markov decision process with dynamicsubframe allocation, the Markovian formulation becomes a tool for design rather thanfor simple analysis. Markov decision process theory includes a well developed group oftechniques for deriving optimal decision policies in Markovian systems. Recent advancesin the field address infinite horizon models with undiscounted rewards, a class that includesmost applications in the field of communications.By formulating the reservation model as a Markov decision process, we are able to analyzedifferent dynamic policies explicitly, find the optimal allocation strategy using dynamicprogramming techniques and, by Markovian analysis, determine the optimal performanceunder such a strategy. While novel to this application, this technique has been appliedsuccessfully to other communication problems in areas of random access control [11],[12],queueing control [13], [14], and flow control [15], [16].1.2.4 State estimationWhile we assume that full state information is available when deriving the optimalallocation policy, it must be recognized that the system state is only partially observableby the stations. Specifically there is a need to estimate the number of stations contending onthe reservation subframe based on the history of observations on that frame. This problemis encountered for controlling the S-ALOHA random access broadcast channel. Segall [17]considers the general problem of recursive state estimation for a partially observed Markovprocess. His approach, though optimal, proves to be unwieldy and fairly impractical. Otherresearchers [2], [18], [19] propose simpler techniques to get a good estimate, their performancesummarized and compared in [20]. Rivest's [2] pseudo-Bayesian scheme emerges as anespecially simple and effective scheme for estimation and lends itself quite handily to beadapted to our model. Thus in this thesis, a useful state estimation technique is developedthat bears close resemblance to that proposed by Rivest [2].1.3 Structure of the ThesisThe thesis is organized as follows. Chapter 2 provides a detailed description of thereservation model and formulates this model as a Markov decision process. The model bearsclose resemblance to that proposed by Roberts [1] but with the added capability of allowingdynamic allocation. The assumptions made in the model are discussed in detail. Fromthis model, the states, actions, rewards and transition probabilities that uniquely characterizethe Markov decision process are defined. In the Markov formulation, the specific formulasnecessary for determining the throughput and average delay in the system when a controlpolicy is specified are derived.6Chapter 3 presents the optimality equations and the dynamic programming algorithmto determine the optimal allocation scheme for a given set of system parameters based onthe Markov decision process formulation of chapter 2. Furthermore, two simpler heuristicschemes are proposed which are much more attractive for implementation while givingperformance close to optimal. Exact numerical results for all the policies are calculatedusing the formulas presented in chapter 2 and are presented for comparison among thedynamic schemes and with previously considered fixed schemes. Results indicate that the useof dynamic allocation policies improves performance dramatically over the fixed allocationpolicies that have been previously considered. System capacity under different allocationschemes is examined and numerical results indicate consistently higher capacities for thedynamic policies than fixed allocation. The advantages and disadvantages of the differentpolicies in terms of performance and implementation issues are discussed.Chapter 4 addresses the problem of state estimation, deriving the expressions for theoptimal true Bayesian estimator and an approximate pseudo-Bayesian estimation algorithmappropriate for our model. The latter is much more attractive for real-time implementationand simulation results presented in the chapter indicate that the use of the pseudo-Bayesianestimator does not significantly undermine the performance of the dynamic schemes.In chapter 5, we consider variations of the reservation model that handle multipacketmessages. The extension of the results for the single packet message model to the multipacketmodel are discussed and some simulation results are presented that indicate that dynamicallocation schemes improve performance in this case also.Finally, chapter 6 concludes the thesis, highlighting the major points and proposing furtheravenues of research.7Chapter 2 NETWORK MODEL AND MARKOVDECISION PROCESS FORMULATIONIn this chapter the network model that is used for analysis is described in depth and themodel is formulated as a Markov decision process. The model chosen is fairly archetypical ofreservation models with minislot reservations. The chief difference is that we allow dynamicsubchannel allocation. Since most reservation protocols split the channel into a subchannelfor securing the reservation and a subchannel for servicing reservation, the idea of dynamicallocation of channel resources would appear to be extendable to other reservation modelsand it is likely that improvements similar to those that are revealed in the reservation modelthat we consider would be possible in other reservation protocols when dynamic allocation isimplemented. We restrict our attention in this work to one simple model. For exact analysisand to enable true optimization we are forced to make some simplifying assumptions aboutthe model to allow the Markovian formulation. The assumptions are discussed in detail inthis chapter.2.1 The Basic Reservation ModelWe consider a reservation model with a finite number M users, each of these usershaving a message source and a single message buffer. These stations have no means forintercommunication outside of the single broadcast channel on which all stations transmit andreceive. Channel time is divided into frames of length F slots, each slot being equivalentto the length of one packet. We shall assume that all stations are synchronized such thatpacket transmissions start at the beginning of a slot and occupy exactly one slot. Framesare subdivided into a reservation subframe and a data subframe. The reservation subframeis composed of L slots, each of which is further subdivided into V minislots, each minislot8being the length of a reservation request packet. Thus, there are F — L data slots in the datasubframe. (See Figure 1.)H ^ F slots/frame ^V minislots/slot1111^11 11^1111^1111H-- ^Subframe ^L slotsLV minislots Data subframeF-L slotsFigure 1. The frame structure of the reservation protocolThe LV reservation slots in the reservation subframe are accessed in slotted ALOHA mode.Once a station successfully sends a reservation request packet in a reservation minislot, themessage joins a global queue and gets slots reserved in upcoming data subframes. (Thisglobal queue has been referred to in the literature as the "queue in the sky" [10] since it doesnot exist in a single location, but rather is maintained by distributed control of all stations.All stations monitor the number of packets in the queue based on observations on the channel,and track their own place in the queue.) We initially focus our attention on a situation whereall messages are composed of a single packet. (In chapter 5 we consider the multipacketmessage model.) Thus each station is in one of three modes. When it has no message, thestation is in idle mode. When a new message arrives the station goes into contention mode.When the station has successfully sent a reservation request on the reservation subchannel,it enters reservation mode. Finally, when the station reaches the head of the global queueand the packet is sent in the reserved data slots, the station goes back into idle mode to waituntil a new message arrives.In Roberts' protocol [1], it is assumed that the channel time allocated to the reservationsubframe, L, is fixed for all frames. If we allow L to become a control variable, set according9to the state of the system, it is proposed that gains can be made in the performance of theprotocol.The following are the key assumptions made in the model:1. There are a M users with identical access procedures and traffic characteristics.2. Each slot may be divided into V minislots, each of which can hold 1 reservation requestpacket.3. Each idle user generates a message in a frame with probability p, referred to as the arrivalprobability. Until the message is completely sent, the station is considered to be blocked.(ie. All newly arriving packets to that station are lost.)4. When a new message arrives, the station does not send a reservation request until thefollowing frame. Similarly, when a reservation request is successfully made, the messagemay join the global queue only at the beginning of the next frame. Finally when a stationcompletes a message transmission, it remains blocked until the beginning of the nextframe. (ie. All station states are updated at the start of each frame.)5. When a station is in contention mode, it will send one reservation request in one of theLV minislots (chosen at random) of the current frame with transmission probability ps .The station repeats this until the message is sent successfully at which point the stationenters the global queue.6. The channel is error free except for collisions, and all stations are perfectly synchronized.2.2 Delay ModellingAn important characteristic of any broadcast network model is how the round trip delayis modelled. In local area networks (LAN's) the delay is typically very short with respect tothe message size. For packet radio networks the delay is much greater. For satellite networks,this round trip propagation delay is in the order of a quarter second which typically spansthe length of a large number of data packets. This range in delay characteristics makes it1 0difficult to create a general network model for consideration. Reservation protocols demanda fairly high level of complexity for individual stations to coordinate access and maintainsynchronization. For this reason, the use of such protocols seem to be best suited for packetradio and satellite networks. We discuss in this section the issues involved in delay modellingfor these applications.Archival literature on the subject of reservation protocols [1], [10], [21] have largelyavoided true delay modelling for analysis of the network model. True delay modellinginvolves the addition of "wait" states to the straightforward "contention", "reservation" and"idle" states introduced earlier. This addition of states quickly increases the dimensionalityof the problem and thus computational barriers become insurmountable very quickly. Inlater chapters it becomes apparent that such an increase in dimensionality will render exactnumerical analysis inachievable. Some researchers have attempted the high road, doinganalysis with these added "wait" states, but have been forced to rely on approximate methodsto avoid numerical overload [5]. [5] produces a Markovian model with more than 2G/F-1- 5distinct state variables where G is the round trip delay measured in slots. Most analyseshave, on the other hand, either assumed that the propagation delay is shorter than the datasubframe, thus does not interfere with the state structure [10] or else discounted the presenceof delay completely [1], [21]. This simplification keeps the number of state variables low,allowing exact Markovian analysis.In the model as we have defined it with assumption 4, we assume that the delay issmall enough that all stations have the information from the reservation subframe before thebeginning of the next frame. There is no real need for stations to monitor the data subframesince by assumption 6 we may rely on successful transmissions and correct distributedmaintenance of the global queue. Thus, for a system where control is distributed amongthe stations or orchestrated by satellite on-board processing, assumption 4 demands that the11longest round trip delay in the system G, must be less than the length of the data subframe(ie. G < F — L). (For a network controlled centrally by a hub earth station, the feedbackdelay would be twice the round trip propagation delay.) Considering that the allocation Lmay change dynamically from 0 to F, this effectively means that for our model to be strictlyaccurate, G=O. While this may indicate that our modelling of delay is not completely true,past work indicates that the simplification is not a serious one. Kleinrock and Lam [21] showthat adjusting the retransmission probability of a S-ALOHA system such that the averagebackoff includes the true delay, the results are close to those produced by the computationallyexpensive true modelling approach. Furthermore, Lim and Meerov's [22] investigations showthat in a contention channel, the introduction of delay into the system improves stability andthroughput of the system. Thus, our assumption of small or no delay may act as a worstcase scenario. These results give credence to the notion that valid performance results maybe determined without strictly accurate delay modelling.2.3 Markov Decision Process FormulationIn the literature, analysis of reservation multiaccess protocols has taken a number ofdifferent forms. In many of the cases, the reservation and data subchannels are treatedindependently with the only stipulation being that their average flows are balanced. Whilethis approach may be valid for a fixed allocation, in the more general model where thesubframe allocation may change from frame to frame, it is difficult to avoid handling the twosubsystems together. We thus handle the system as a whole, formulating and analyzing it asa two state Markov decision process. From this formulation, we can determine the stationarystate probabilities and thus the system performance measures, throughput and average messagedelay.A Markov decision process [23] differs from a Markov process in two respects. Firstly,while in a Markov process the probability of being in the current state is a function only12of the previous state, in a Markov decision process the current state of the system is afunction of not only the previous state, but also of the "decision" or "action" at that previousstate. Secondly, in a Markov decision process there is associated with each state a "reward".The Markov decision process is a useful formulation in that using this formulation wemay find the decision policy which maximizes the expected reward of the process. Forour reservation model, this translates to finding the allocation policy which maximizes theexpected throughput for the network. Furthermore, since the decision policy is a function ofthe system state, specifying the policy transforms the Markov decision process to a simpleMarkov process, whose throughput is the expected reward. We now proceed with such aformulation, defining the states, actions, rewards and transition probabilities that uniquelydefine the Markov decision process.2.3.1 System stateWe let the system state be defined by the two dimensional vector„Ck = (Nlc ,Nn E S,where N1 k and N2k define the number of stations in contention mode and in the global queuerespectively at the beginning of frame k. The state space S, is defined byNi E {0^} ,E {0,^—The magnitude of the state space is found to beISI = (M + 2)(M + 1)22.3.2 ActionsAt the start of frame k, the stations decide on the size of the allocation for the reservationsubchannel Lk, based on the current state information. Thus the allocation policy will bedefined byLk = Lk(Sk)^(3)(1)(2)13Since we are interested only in stationary policies, we may drop the subscript to give a timeinvariant policyL(Sk) E A -LE^.^ (4)Thus the policy L, is simply a mapping from the state space onto the set of allowable actionsas defined by A.2.3.3 RewardsWe would like to find the allocation policy to maximize system throughput. Thus wechose the reward r to be the expected number of successful packets sent in the frame giventhe state and action variables. Since this expectation is the lesser of the length of the globalqueue and the data slots in the frame,L(S k) Min (I , F — L(Sk)) •^ (5)Thus -7 .L is a vector indicating the reward for each state for the action policy, L.2.3.4 Dynamic equationsBefore deriving the transition probabilities, let us define the dynamic equations of thesystem that dictate the system state transitions. For ease of notation, letni^Nik ,^7/2^Nj;(6)_4+ 1 , 71/2 = 4+1 , a L(Sk) .Further, we let Xk be the number of new messages generated in frame k, Yk be the number ofsuccessful reservation requests sent in frame k, and Wk be the number of message transmissionsthat are successfully completed in frame k. The role of these variables is seen in Figure 2which illustrates how stations move through the queueing system.We may now write the set of dynamic equations as follows:^n i^—(7)7n2 = 71 2 + Yk W k14Wkcontending^reservation^global queue^datastations subframe subframeFigure 2. Queuing control representation of the reservation model2.3.5 State transition probabilitiesThe derivation of the transition probabilities is somewhat more complex but is simplifiedby assumptions 3-5. In a Markov decision process, the current state is a function of theprevious state and action. Thus a state transition matrix, P L has elements,Pi = Pr{Sk+1 =^= i, L(i)} •^ (8)The probability distribution of Yk given that n requests are sent derived in f101 isPr{Y k = i n, a}= f (i, n, a)^ (9)minfaV,n1(-1) i (aV)in!^(-1) (aV —(aV) n i! (j — i)!(aV — j)!(n — j)!Since the probability that n requests are sent is binomially distributed, we may write theprobability of exactly Yk success given that ni stations are in contention mode to ben j Pr{ Y k =^nini, n2, a} =^p s (1 — p 3 ) 1" -71 f (i , n, a)n=0By assumption 3, Xk is also distributed binomially,(10)pP r^= x^n2, a} =^ni^712) x^1))114 n1 n2^(11)15Finally,Wk = min (712, F — a) .^ (12)Using these expressions, we may now state the state transition probabilities asP(mi , m2 I ni n21 a)Pr { Yk 7n2 — n2 + min (n2, F — a) I n i , n2, a} x^(13)P {X k^— nl 77/2 — n2 + min (n2, F — a) I ni, 712, a}.Since the probability of being in the next state (ml, 7n2) is a function only of the current state(ni, 712) and action, a, we can say immediately that the transition probabilities do in fact satisfythe Markovian property, and thus the process is a Markov decision process. Furthermore,the transition probabilities are independent of the frame index k and thus are stationary. Wenote here again that when a decision policy is specified, the action is a simple function of thecurrent state and these transition probabilities define a unique Markov process.2.4 Solving for the System Throughput and Delay2.4.1 Deriving the throughput of the Markov processAverage channel throughput is commonly used as a measure of the effectiveness of amultiaccess protocol. For our purposes, we will consider the throughput as the long runaverage number of successful packet transmissions per frame, given the initial state S,71 1' (S)^l L(Sk) I So^S (14)T^1 kAll non-degenerate policies result in Markov processes with a single ergodic class (seeAppendix A), in which case the throughput is independent of the initial state,(15)To find this throughput we must first determine the limiting distribution of the stationaryMarkov chain.16The most common method to find the limiting distribution is to find the equivalentstationary distribution of the system. A distribution=^7r2,^} (16)is stationary if^7ri = 1^ (17)and 7r 4 P{Sk+1 =^Sk :i} •-If the Markov process contains a single recurrent class [24], the stationary distribution isunique and is equivalent to the limiting distribution,71-7 = lira P{Sk = i} .^ (18)Determining the stationary distribution is simply a matter of solving the linear system givenby eqns. (17).We may now find the throughput by finding the expected throughput for each state (thestate dependant reward) and summing over the limiting distribution, or more simply thevector product,L^*=^7.L^.2.4.2 Expected packet delayThe expected packet delay for the system is the sum of the reservation request packetdelay D1, and the data packet delay D2, both in units of frame durations. If we determine(19)the stationary state expectations as follows,ME[N1 ] =MM - n1711 > I 71- (i11 ,71 2)n2=0M—n2(20)and E[N2 ]n2=07 (721,112 )17we may use Little's formula to determine the delays to beF E[Ni ]D 1 =^L ^and77,^F E[N2]1-/2 -=^r^•77 1'(21)Thus we may now write the overall expected message delay to beD = F ( E[Nli 4- E{N2] )71(22)According to model assumption 4 there is no need to calculate residual delays. Byloosening the basic simplifying assumptions of our model such residual delays would beintroduced.2.4.3 Summary of steps to find throughput and delayThe following summarizes the steps taken to determine the throughput and delay for agiven stationary control policy L(Ni, N2 ).1. Calculate the transition matrix with eqn. (13).2. Use the transition matrix to determine the limiting distribution by solving the system ofeqns. (17).3. Find the long run average throughput by incorporating the limiting distribution and statedependant throughputs using eqn. (19).4. Solve for the expected packet delay using eqns. (20) and (22).18Chapter 3 SUBFRAME ALLOCATION SCHEMESIn the previous chapter we developed the specific tools to determine the throughput anddelay performance for the network operating under a decision policy. In this chapter weexamine specific policies and compare the performance of the model under these policies.Previous research has focussed chiefly upon fixed allocation schemes. We determine theoptimal dynamic policy and two heuristic dynamic schemes which improve on the performanceof the fixed schemes. Throughout this section, we continue to assume the ideal situationwhere the state information is completely known at each decision point. We will loosen thisassumption in the following chapter and incorporate state estimation into the system.3.1 Fixed AllocationPrevious research into the reservation model ([1], [10], [5]) considers only the casewhere the size of the reservation subframe is fixed with no regard for the system state (ie.L(Ni, N2) = L f). Szpankowski [10] proposes for his very similar model a simple method forchoosing the best fixed allocation. Here we use the same method, applying it to our model.If we assume that throughput on the contention channel is 1/e reservation requests per slot,we may write the flow balance equation to beV LfeF— L f .^ (23)This gives us an optimal fixed allocation ofL f^+F vIe .We use the lower and upper integer values of this result as the two best fixed allocations. Bychoosing the lower of these values, we create a situation where there is a bottleneck in thereservation subchannel. The higher of these values creates a bottleneck in the data subchannel.(24)19For the case where F=5, V=3 and M=20, Lf is 2.377, and the two best fixed allocations schemesare at Lf=2 and Lf=3. For these allocations, we have set the transmission probability ps to1.0 and 0.6 respectively. In Figures 3 and 4, we plot for these two fixed allocation schemesthe throughput against the arrival probability and the average delay against the throughput ascalculated using the procedure summarized in section 2.4.3. In the case of Lf=2 we can seethat the bottleneck in the reservation subframe has caused decreased throughput when trafficis high, a phenomenon that is well explored in [10]. The transmission probability p, for thatcase was reduced from 1.0 to 0.6 to allow good performance on a fairly large range of trafficlevels while keeping the average delay reasonably low. We have chosen a model with smallframe and population size here to ease the numerical computations. In following sections weresort to simulation methods to investigate larger population models and compare dynamicand fixed subframe allocation schemes.3.2 Optimal Allocation3.2.1 Optimality equationsWe wish to optimize the allocation policy with respect to the system throughput. In themodel that we have defined, this is equivalent to minimizing the average packet delay. Inthis section we define the optimality equations used to determine the optimal policy. Findingthe optimal allocation policy corresponds to finding a state-dependant policy L* such that?7 L* (S) = max 77 L ( S) = q* (S) ,LEW(25)where W is the set of all stationary, non-randomized policies that are dependant only onthe current state. For a finite state and action Markov decision process with average rewardcriteria and infinite horizon, it has been show [25] that there exists a policy that is optimalwithin this set. A stationary policy L is said to be unichain if its corresponding Markovchain contains only one recurrent class and possibly a number of transient states. In this case20210.00 0.10 0.40 0.500.20^0.30arrival probability, pFigure 3. Throughput vs. arrival probability for the parameterset (M=20, F=5, V=3) : fixed allocation policies and optimal policy^ fixed allocation: L=2- - - - fixed allocation: L=3^ optimal policy 0.50 0.600.10^0.20^0.30^0.40throughput (packets/slots)Figure 4. Average delay vs. throughput for the parameter set(M=20, F=5, V=3) : fixed allocation policies and optimal policy0.603 0.5a).9-- 0.400.300.20.17.06.00.) 5.0cva)E 4.0coa)co 3.0cvi5 2.01.00.0....................-------^fixed allocation: L=2-- fixed allocation: L=3^ optimal policyV- ................................I^-I77L(S) is independent of the state and is thus a constant [23]. When all stationary policies areunichain, 7/ * satisfies the set of optimality equations,_ max fv L^- 77 * PL f)} . (26)Recall that PL is the policy dependant transition probability matrix whose elements aredefined by eqn. (13). These optimality equations are derived and fully explained in [23] andare the equations that uniquely characterize the optimal policy, L*, and its correspondingoptimal expected average reward, 71 *. The decision rule L which achieves the maximumaverage reward when a f) that solves the equation is inserted is the optimal policy [23]. v isreferred to in the literature as the bias or the relative value function and is defined by eqn.(26) up to an additive constant. It is called the relative value function because its elements,v(S) ; S E S, indicate the relative value in terms of total reward of being in one state versusanother. That is, v(i) — v(j), is the difference in the total reward when a system starts instate i versus a system that starts in state j. Since there are IS I + 1 unknowns in eqn. (26)(y* and v(i), i = 1, ,'SD for ISM equations, f) is solvable only up to an additive constant,thus the values of the relative value function only have relevance with respect to one another.Looking now at the optimality equations we see that the expression on the right is thesum of the immediate reward relative to the average reward for that frame, 7 L — ) *, and theexpected relative value for the next frame,(27)when the policy L is used. This gives us an intuitive sense of what the optimality equationsrepresent. The details of the rigorous derivation of (26) are quite complex and the reader isreferred to such texts as [23], [25] and [26] for complete descriptions.For this optimality equation to hold for our model it must be shown that all policies areunichain. This is demonstrated in Appendix A.223.2.2 Solving for the optimal policy with dynamic programmingUsing these optimality equations we would like to find both the optimal allocation policyL* , and the system throughput, q * . There exists a number of methods to solve these equations.Both the policy iteration algorithm [27] and the modified policy iteration algorithm [23] foraverage reward criterion are presented here. The former is the simpler of the two but the latteris preferable for its reduced computational load and its attractive convergence characteristics.We provide here the specific algorithms and discuss the issues of applying these algorithmsto solving the network control problem.3.2.2.1 The policy iteration algorithmPolicy iteration is an efficient procedure for finding average reward optimal policies inMarkov decision processes (MDP's). It is guaranteed to converge for MDP's with a finitenumber of states and actions when all policies are unichain. These conditions hold for ournetwork control application. The algorithm was first proposed by [27] and consists of fourbasic steps. First an initial policy is chosen arbitrarily as the starting point. Second, thepolicy is evaluated by determining the throughput for that policy. Thirdly an improved policyis found. Finally, a stopping criteria is used to determine whether to repeat the second andthird step to continue to improve the policy. It is guaranteed that at each improvement step,an equal or better policy is found [23] so the last step stops the iterations if the policy hasnot changed. The algorithm is:1. (Initialization) Set n=0 and choose an initial decision rule L„ E 'P.2. (Policy evaluation) Obtain the average reward of the policy, 77„ and relative value vector,V ?, by solving0 = L n — qn + (PLn — ./)^. (28)Note that 7) 7,, is uniquely solvable up to an additive constant.233. (Policy improvement) Choose Ln+1 to satisfyn fL„+1 PL " -" V ii = LETmax^L vit} (29)setting L„+i(S) = L„(S) for each S if possible. Here we have used the relative valuefunction, updated in step 2, to find an improved policy.4. (Loop) If Ln+1^L 71 , stop and set L*^L. Otherwise increment n by 1 and returnto step 2.This algorithm produces a series of decision rules and their respective throughputs andis guaranteed to converge to solution in a finite number of steps [23]. The algorithm hasbeen shown to be equivalent to Newton's method for finding the minimum of a function[24]. An initial policy is arbitrarily chosen and by policy improvement the policy convergestoward the optimal.The relative value function '5„ is determined by solving eqn. (28) and is unique up toan additive constant. For convenience, we have set v 11 (S0) = 0, where So, is equivalent tothe empty system state ((N1, N2) = (0, 0)). This setting is commonly made and does noteffect the result of the algorithm.Note that the solution of eqn. (28) involves solving a system of 1S1 linear equations where1S1 is defined by eqn. (2). Since IS I is of the order M 2 , the complexity of solving the systemof equations grows with M6 . Since in finding the optimal decision rule many iterations arepotentially needed, this evaluation step becomes very costly. The modified policy iterationwas developed to reduce this computational load by avoiding the explicit solution of theIS x IS1 system while still guaranteeing convergence when all policies are unichain.3.2.2.2 The modified policy iteration algorithmThe modified policy iteration algorithm breaks down the policy evaluation step of regularpolicy iteration to reduce the computational expense of solving the IS I x 1S1 system. Similarly24to the policy iteration algorithm, there is a policy improvement step and an evaluation step,but this time, the evaluation is done iteratively over m steps rather than exactly. We avoidsolving explicitly the IS I xIS I system. Note that when m is infinite, the algorithm is equivalentto policy iteration since the exact policy evaluation is achieved. The algorithm is:1. (Initialization) Select f) () , specify (>0 and set n=0.2. (Policy Improvement) Choose L„ +1 to satisfyr Ln+1^=^+ PLii„setting L„+1(S) = L„(S) for each S if possible.3. (Partial Policy Evaluation)a. Set k=0 and0^—^ DLn+1T .Ln+ ,^vnb. If sp(ft,°,— V„) < c go to 4. Otherwise go to c.c. If k=m, go to e. Otherwise compute ii,k;+ 1 = qn+ ,^PLn -f'irt,k;d. Increment k by 1 and return to c.e. Set v„+ 1^11,171„ increment n, and go to step 2.4. (Final Policy Improvement) Choose L, to satisfyLFL, +^= max L, P v„}LETThis algorithm is guaranteed to converge within a finite number of iterations to an (-optimal decision rule (ie. nL, > q* — c), where E may be chosen to be arbitrarily small[23]. In this algorithm, step 3 uses an iterative method to approximate the true relative value(30)(31)(32)25function, i,„ +1 with the iterative result, ft,",' produced by m iterations. The span semi-normused here is defined bysp(:-E) = max (xi) — min (x2) . (33)and is used for the stopping criterion in step 3b. The algorithm stops when the norm ofapproximate relative value for policy, L 11+ 1 differs from that of policy L„ by less than e.In our application, 771 = 6 appeared to work well for all the systems considered. For fulldetails of the modified policy iteration algorithm, and the rationale behind its development,the reader is referred to [23].Applying the modified policy iteration algorithm reduced the computational effort ofsolving for the optimal policy by up to 60-70% compared to policy iteration when all butthe most trivial systems were analyzed. Despite this, the complexity still grows dramaticallywith M since the solution of the system remains to be found, though only partially at eachstep. Since there do not exist methods computationally more efficient than the modified policyiteration algorithm for solving Markov decision processes with large state spaces [23], suchcomplexity must be endured or the search for the optimal abandoned.3.2.3 Optimal performanceHere we present some performance results for the optimal policies in a small networkmodel and compare them with performance results for fixed allocation policies. First, somemention should be made of how the retransmission probability ps is chosen for our models.The literature on slotted ALOHA channels has given close consideration to the choice of thisparameter and it has been shown in that case that when ps is fixed, congestion among collidingpackets renders the system unstable when the user population in infinite and bistable whenthe population is finite [21]. For a finite population model, the bistable character is revealedin the way the delay-throughput performance becomes two-valued at high throughput levelsand throughput degrades as the traffic intensity continues to increase. [10] shows that the26bistable effect is also found in the reservation model when the subframe size is fixed (whichis expected since the reservation subsystem alone is effectively a slotted ALOHA channel).The approximate methods used to expose this bistability, while seemingly effective when thesubframe allocation is static, are not easily extendable to the dynamic allocation model thatwe are considering. Thus we leave detailed stability analysis beyond the scope of our work.We choose network models where the network parameters are such that we may set p s =1.0for the dynamic policies without serious performance degradation due to stability problems.While it may be more appropriate to choose Ps adaptively as a decreasing function of N1as a means to stabilize the reservation subchannel we will maintain a fixed retransmissionprobability for our analysis and leave consideration of an adaptive transmission probabilityto further research. For the fixed allocation schemes, we are at times forced to reduce psto avoid too serious a degradation in throughput due to congestion. The reader is referredto [10] for a full treatment of the bistable character of the reservation model when a fixedallocation policy is used.The modified policy iteration was implemented with E chosen small enough that furtherreduction did not affect the optimal policy. Table 1 shows a sampling of optimal policiesfor the parameter set (M=20, V=3, F=5, ps=1.0). We may first note that the optimal policieschange with the traffic level (as indicated by the arrival probability, p). Furthermore, theoptimal policies are a function of both state variables, thus both state variables containinformation useful for allocation control. For the case when complete state information isavailable, the average throughput and delays were calculated for the optimal policies. Figures3 and 4 presented previously in section 3.1 plot the throughput against the message arrivalprobability and the delay versus throughput respectively. Since these curves are based oncomplete state information, the optimal results represent an upper bound on performance forthe model.27NIO 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20^ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 2001234567M a910111213141516171819200123456"2 78910111213141516171819205 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 54 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 53 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 5 52 2 2 2 2 2 2 2 2 2 2 3 3 4 4 4 4 41 1 1 1 2 2 2 2 2 2 2 3 3 4 4 4 4O 1 1 1 1 2 2 2 0 0 0 0 0 0 0 0O 1 1 1 1 0 0 0 0 0 0 0 0 0 0O 1 1 1 0 0 0 0 0 0 0 0 0 0O 0 0 0 0 0 0 0 0 0 0 0 0O 0 0 0 0 0 0 0 0 0 0 0O 0 0 0 0 0 0 0 0 0 0O 0 0 0 0 0 0 0 0 0O 0 0 0 0 0 0 0 0^p=0.04O 0 0 0 0 0 0 0O 0 0 0 0 0 0O 0 0 0 0 0O 0 0 0 0O 0 0 0O 0 0O 005 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 54 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 53 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 5 5 52 2 2 2 2 2 2 2 2 3 3 3 4 4 4 4 4 51 1 1 1 2 2 2 2 2 2 3 3 4 4 4 4 4O 1 1 1 1 2 2 2 2 0 0 0 0 0 0 0O 1 1 1 1 1 0 0 0 0 0 0 0 0 0O 1 1 1 0 0 0 0 0 0 0 0 0 0O 0 0 0 0 0 0 0 0 0 0 0 0O 0 0 0 0 0 0 0 0 0 0 0O 0 0 0 0 0 0 0 0 0 0O 0 0 0 0 0 0 0 0 0O 0 0 0 0 0 0 0 0^p=0.12O 0 0 0 0 0 0 0O 0 0 0 0 0 0O 0 0 0 0 0O 0 0 0 0O 0 0 0O 0 0O 000123456N 7"2 8910111213141516171819200123456N 7"2 891011121314151617181920O 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 205 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 54 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 53 3 3 3 3 3 3 3 3 3 3 3 4 4 5 5 5 5 52 2 2 2 2 2 2 2 3 3 3 4 4 4 4 5 5 51 1 1 2 2 2 2 2 3 3 3 4 4 4 4 5 5O 1 1 1 2 2 2 2 3 3 3 0 0 0 0 0O 1 1 1 1 2 2 2 0 0 0 0 0 0 0O 1 1 1 1 0 0 0 0 0 0 0 0 0O 0 1 1 0 0 0 0 0 0 0 0 0O 0 1 1 0 0 0 0 0 0 0 0O 0 0 1 0 0 0 0 0 0 0O 0 0 0 0 0 0 0 0 0O 0 0 0 0 0 0 0 0^p=0.20O 0 0 0 0 0 0 0O 0 0 0 0 0 0O 0 0 0 0 0O 0 0 0 0O 0 0 0O 0 0O 00O 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 205 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 54 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 53 3 3 3 3 3 3 3 3 3 3 4 4 4 5 5 5 5 52 2 2 2 2 2 2 3 3 3 3 4 4 4 5 5 5 51 1 1 2 2 2 2 2 3 3 3 4 4 4 5 5 5O 1 1 1 2 2 2 2 3 3 3 4 4 4 5 5O 1 1 1 2 2 2 2 3 3 3 4 4 4 5O 1 1 1 1 2 2 2 3 3 3 0 4 4O 1 1 1 1 2 2 2 3 3 3 0 4O 0 1 1 1 2 2 2 3 3 3 0O 0 1 1 1 2 2 2 0 3 0O 0 1 1 1 2 2 2 0 0O 0 1 1 1 2 2 2 0^p=0.28O 0 1 1 1 2 2 2O 0 1 1 1 0 2O 0 1 1 1 0O 0 0 1 1O 0 0 1O 0 0O 00ry^ NI0123456NN^7"t 8910111213141516171819200123456N 7"2 891011121314151617181920O 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 205 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 54 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 53 3 3 3 3 3 3 3 3 3 3 4 4 4 5 5 5 5 52 2 2 2 2 2 2 3 3 3 3 4 4 4 5 5 5 51 1 1 2 2 2 2 2 3 3 3 4 4 4 5 5 5O 1 1 1 2 2 2 2 3 3 3 4 4 4 5 5O 1 1 1 2 2 2 2 3 3 3 4 4 4 5O 1 1 1 2 2 2 2 3 3 3 4 4 4O 0 1 1 2 2 2 2 3 3 3 4 4O 0 1 1 1 2 2 2 3 3 3 4O 0 1 1 1 2 2 2 3 3 3O 0 1 1 1 2 2 2 3 3O 0 1 1 1 2 2 2 3^p=0.36O 0 1 1 1 2 2 2O 0 1 1 1 2 2O 0 1 1 1 2O 0 1 1 1O 0 1 1O 0 1O 00O 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 205 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 54 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 53 3 3 3 3 3 3 3 3 3 3 4 4 4 5 5 5 5 52 2 2 2 2 2 2 3 3 3 3 4 4 4 5 5 5 51 1 1 1 2 2 2 3 3 3 3 4 4 4 5 5 5O 1 1 1 2 2 2 3 3 3 3 4 4 4 5 5O 1 1 1 2 2 2 3 3 3 3 4 4 4 5O 1 1 1 2 2 2 3 3 3 3 4 4 4O 0 1 1 2 2 2 2 3 3 3 4 4O 0 1 1 2 2 2 2 3 3 3 4O 0 1 1 2 2 2 2 3 3 3O 0 1 1 2 2 2 2 3 3O 0 1 1 2 2 2 2 3^p=0.44O 0 1 1 1 2 2 2O 0 1 1 1 2 2O 0 1 1 1 2O 0 1 1 1O 0 1 1O 0 1O 00Table 1. Optimal policies for a range of arrival probabilities for the parameters (M=20, F=5, V=3)28We can see that at medium and high traffic levels, the improvement of the optimal overthe fixed schemes is indeed dramatic, showing higher capacity in general, lower delays for thesame throughput levels and higher throughputs for the same traffic levels. The best fixed policycapacity occurs for the allocation L=2 at capacity of near 0.45 packets per slot. By comparison,optimal allocation gives a capacity of approximately 0.54 packets per slot, an improvementof 20% over the fixed allocation. From Figure 4 we see that while for the fixed policies theaverage delay increases rapidly when throughput approaches capacity of 0.35-0.45, the delayof the optimal allocation remains fairly flat in that region, near the minimum of 2 framesper message, as system throughput extends to levels over 0.5 where the delay then increasesquickly as throughput reaches capacity. By examining the specific policies, we see that theoptimal policy achieves this result by adjusting the reservation subframe allocation such thatthe attempt rate is near the optimal of 1 request per minislot, ensuring sufficient reservationminislots to avoid excessive collisions in the reservation subframe.While these optimal results look promising, there remain some key problems withachieving these results. First, the optimal policy for the network is a function of the trafficlevel as indicated by the arrival probability, p. In all real systems, we can expect the trafficlevels to fluctuate over time. Adapting the optimal policies to this fluctuation would effectivelyinvolve changing policies. It would be preferable for the model to maintain good performancewith a single policy for all traffic levels. Secondly, the optimal allocation at each frame is afunction of both state variables. While the state of the global queue is known with certainty,the number of contending stations must be estimated by implementing an appropriate stateestimator, increasing complexity in the transmitting/receiving stations. Finally, the complexityof the modified policy iteration algorithm prevents us from finding the optimal policies fornetworks with large population models. Many applications of reservation protocols involvesuch large populations and with existing computational means we may not hope to use these29analytical Markov techniques to solve for the optimal policy and performance when thedimension of the state space grows into the thousands and tens of thousands. The followingsections propose two heuristic dynamic schemes that address these problems while maintainingperformance close to the optimal. The state estimation problem is considered in chapter 4.3.3 "Data Priority" (DP) AllocationRather than finding the policy that maximizes the long run average throughput as we havedone in the previous section, let us consider a simple policy that sets the allocation to thevalue that maximizes the successful data packet transmissions in that frame. We shall referto this policy as the "data priority" (DP) policy since as many slots as are needed to servicethe global queue (up to the obvious upper limit of F) are allocated to the data subframe, theremaining slots (if any) allocated to the reservation subframe. Thus,LDp(Ni, N2) = max (0, F — N2) • (34)This policy has an advantage over the optimal in that it is only dependant on N2, and thusno state estimation techniques are required for N1 for implementing this allocation policy.0 1 2 3 4 5 6 7 8 9 10^11^12^13^14^15^16 17 18 19 200 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 51 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 42 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 33 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 24 1 1 1 1 1 1 1 1 1 1 1^1^1^1^1^1^15 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 06 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0" 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0"2 8 0 0 0 0 0 0 0 0 0 0 0 0 09 0 0 0 0 0 0 0 0 0 0 0 010 0 0 0 0 0 0 0 0 0 0 011120000000000000000000 L^(N 1 , N2)DP13 0 0 0 0 0 0 0 014 0 0 0 0 0 0 015 0 0 0 0 0 016 0 0 0 0 017 0 0 0 018 0 0 019 0 020 0Table 2. DP policy for the parameters (M=20, F=5, V=3)30Table 2 shows the DP policy values for the small model with parameter set (M=20, F=5,V=3). The performance results for the DP policy implemented in the small population modelare plotted in Figures 5 and 6. As expected, we see that the DP policy shows a significantimprovement over the fixed schemes while the throughput levels fall short of the optimalfor medium and high traffic levels. The capacity of the DP policy is approximately 0.515,a 15% improvement over highest capacity for a fixed allocation scheme. This is less thanthe 20% improvement achieved by the optimal policy. However, the DP policy performancedoes show a slight degradation at high traffic levels. The throughput begins to decrease withthe traffic level, a sign indicative of an increasing number of collisions in the reservationsubchannel. Since the DP policy does not explicitly use information regarding the number ofcontending stations, this congestion problem is not unexpected.Since the derivation of the DP policy is simple with respect to the dynamic programmingtechniques involved in finding the optimal policy, we are no longer restricted to small statespace models for finding the policy. For a very large model though, deriving the throughputand average delay are costly if the analytical approach of last chapter is used. Solving theglobal balance equations is an operation of the order, 4181 3 ) = o(M 6 ) and thus fora network of only 100 users, the number of floating point operations already runs into thehundreds of billions. To test the DP policy and compare it to the fixed policies for largemodels, we resort to simulation methods. Calculation of confidence levels for this simulationmodel cannot be done using the usual statistical methods since there is a correlation betweensuccessive delay and throughput values. We simply choose a high simulation duration. Therelative smoothness of our performance curves indicates that the simulation lengths chosenare indeed sufficient for comparison among the different policies. Using the parameter set(M=150, F=12, V=12) we used discrete time simulation over a 20,000 frame duration todetermine the throughput and delay for the model when the DP policy and the two best fixedallocation were used. Data were gathered over a range of traffic levels and the results appear310.6—'5-(7)----,) 0.5a)(..)as-9- 0.40_0)= 0.300.20.17.06.0c 5.0i)pal(,)coE 4.0u)a)EEi 3.0—>,oc15 2.01.00.01^ I^ I.....^ ......^ fixed allocation: L=2- - - - fixed allocation: L=3— - — DP policy• MF policy^ optimal policy0.20^0.30arrival probability, pFigure 5. Throughput vs. arrival probability for the parameter set (M=20,F=5, V=3) : fixed allocation policy, optimal policy, MF policy and DP policy0.00^0.10 0.40^0.50^ fixed allocation: L=2- - - - fixed allocation: L=3--- DP policy• MF policy^ optimal policy/7^......................... - ---- ..... -^.......0.10^0.20^0.30^0.40^0.50^0.60throughput (packets/slots)Figure 6. Average delay vs. throughput for the parameter set (M=20, F=5,V=3) : fixed allocation policy, optimal policy, MF policy and DP policy32^ fixed: L=2, ps=0.6--- - fixed: L=3, ps=1.0— - — DP policy^ MF policy.....................ti 1 110.00.05.0i^ I ....^. IiII^ fixed: L=2, ps=0.6 II- - - - fixed: L=3, ps=1.0 I Ii— - — DP policy II 1IMF policyI I,1I /I ,......................1.00.8.---cp--..coY 0.6__0cva450- 0.4_c0)=0_c0.20.00.00^0.05^0.10arrival probability, pFigure 7. Throughput vs. arrival probability for the parameter set (M=150,F=12, V=12) : fixed allocation policy, MF policy and DP policy0.00^0.20^0.40^0.60^0.80throughput (packets/slot)Figure 8. Average delay vs. throughput for the parameter set (M=150,F=12, V=12) : fixed allocation policy, MF policy and DP policy33in Figures 7 and 8 where throughput is plotted against the arrival probability and the delayis plotted versus the throughput respectively. It is apparent that in this larger model, the DPpolicy, while showing a capacity increase of 6.7% over the fixed policies, actually showshigher delays in a region of medium throughput levels. This indicates that while the DPpolicy avoids wasted data slots, the fact that the fixed policies balance the allocation betweenreservations and data does create better performance at some traffic levels. Note in Figure 8that for the fixed allocation, L=2, there is a serious degradation in throughput as the trafficlevel increases. This is a result of high congestion in the reservation subsystem. This effectmay be reduced by a further reduction of the retransmission probability from p=0.6 at theexpense of higher delays. The dynamic polices do not appear to suffer from the destabilizingeffect of congestion in this model.3.4 "Maximum Flow" (MF) AllocationWe now consider another policy for subframe allocation. We see in the previous sectionthat while the DP policy shows higher capacity than the fixed allocation schemes, it is stilloutdone by the optimal policy and shows congestion effects at high traffic levels. The optimalpolicy achieves this advantage by ensuring a balance between reservations and data packetswhile the DP policy simply maximizes the data packet transmissions, Wk .In light of these observations we now develop a method to improve on the DP policyby better balancing the flow of traffic through the system while drastically simplifying thederivation from the modified policy iteration algorithm used for the optimal policy to a simpleone step maximization. We propose a simple maximization of a new criterion, which we willrefer to as the "flow". We define the "flow" in frame k asF k = Yk + Wk ,^ (35)34where a is a weighting coefficient. Since Y k is a random variable, we must use the expectation,E [Fk] as our criterion. The policy which maximizes this expectation will be referred tohenceforth as the "maximum flow" (MF) policy.From [5] we know that if n' stations each sends a request packet in one of LV minislots(chosen at random), the expected number of minislots containing a single transmission, Y, is1 )" '-1E[Y 71 1 ] = (1 — Lv (36)Thus,E[Y Nd )P i.: 1 ( 1 — ps)N1-71'nt (1 LLi' -1n'=-0 = ps,1\110 —^)N1-1LV )(37)Thus, from eqns. (35) and (37), we may write the expected flow as a function of the state,(N1, N2) and the reservation subframe allocation L,E[Fk],^\N1-1LV )^"" (N2, F — L)It is left now only to determine a good choice for a. By considering Figure 2, we seethat a dictates the value of a successful reservation request transmission with respect to adata packet transmission. If we choose a=0, we give full priority to the data successes andthus the MF policy would become identical to the DP policy. Alternatively, if we choose avery high a, we give high priority to the reservation requests, allocating a high number ofreservation slots for even a small number of contending stations. A desirable choice of awould lie between these extremes in an attempt to balance the flow equitably. It has beenwell established in the literature that the highest throughput on a S-ALOHA channel with aninfinite user population is lie. Thus, in a large population model, we may hope for an averageVie successful reservation requests per slot. This gives us an idea of the relative worth of a(38)350N2 891011121314151617181920234567successful request transmission and we seta — .V(39)This proved to be a good choice for the weighting coefficient. Although a detailed sensitivityanalysis was not done on this choice the close resemblance of the resulting MF policy to theoptimal and their similar performance leads us to believe that for this policy derivation thisweighting would appear to be the best or near the best.We can now find the MF policy, LMF, by maximizing the expected flow,E[F k LmF(n i ,71, 2 )] = max {E [Fk L]} .^(40)LEAThe calculations involved then in finding the MF policy are very manageable compared tomodified policy iteration; there are no expensive iterative steps nor risk of non-convergence.N1O 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 205 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 54 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 53 3 3 3 3 3 3 3 3 3 3 4 4 4 5 5 5 5 52 2 2 2 2 2 2 2 3 3 3 4 4 4 5 5 5 51 1 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5O 0 1 1 1 2 2 2 3 3 3 4 4 4 5 5O 0 1 1 1 2 2 2 3 3 3 4 4 4 5O 0 1 1 1 2 2 2 3 3 3 4 4 4O 0 1 1 1 2 2 2 3 3 3 4 4O 0 1 1 1 2 2 2 3 3 3 4O 0 1 1 1 2 2 2 3 3 3O 0 1 1 1 2 2 2 3 3O 0 1 1 1 2 2 2 3^L MF (N 1 , NilO 0 1 1 1 2 2 2O 0 1 1 1 2 2O 0 1 1 1 2O 0 1 1 1O 0 1 1O 0 1O 00Table 3. MF policy for the parameters (M=20, F=5, V=3)For the small model (parameters : M=20, F=5, V=3) the MF policy was calculated andis shown on table 3. The average throughput and message delay were calculated for this36policy for varying arrival probabilities and the delay-throughput curve and throughput-arrivalprobability curve are plotted in Figures 5 and 6 shown in the previous section with the otherpolicies under consideration. Let us first note that the MF policy is independent of the trafficlevel as represented by the arrival probability, p. The policy appears to follow a structureand bears a very close resemblance to the optimal policy for medium and high traffic levels(see table 1). From Figures 5 and 6 we see that the performance of the MF policy virtuallymatches the optimal. Although it is not apparent from the plots, the MF policy throughput isin the order of 0.01% less than the optimal. Although the specific values of MF policies arequite different than the optimal policy values at low traffic levels, the throughput and delay arevirtually the same since the system is generally in low occupancy states (ie. N1 + N2 < M)where the policies produce equivalent allocations. In these low occupancy states, the optimaland MF policies allocate the subframe very similarly to the DP policy. These numericalresults for the small model indicate that the MF policy is a very good approximation to theoptimal and gives better performance than the DP policy.Again, a larger model (parameters : M=150, F=12, V=12) was analyzed under the MFpolicy and the simulation results are shown in Figures 7 and 8 (see previous section). TheMF policy shows a higher throughput capacity than the DP policy and consistently lowerdelays for the same throughput. This is especially apparent in medium and high throughputlevels. While the performance gain over the fixed allocation schemes is less dramatic forthe large model than the small model, it is still substantial. For medium and low throughputlevels, the MF policy has delays only negligibly different than the DP policy or the bestfixed allocation policy. Again, it must be noted that these results are based on the unrealisticassumption that complete state information is available to all stations. The following chapterinvestigates state estimation implementation and shows that estimation has little effect on theperformance results.373.5 Capacity AnalysisWe have demonstrated with both analytical and simulation results in previous sections thatdynamic allocation policies in the reservation model have advantages over the fixed allocationschemes in terms of delay-throughput performance. In particular, the dynamic policies showdramatically higher capacity, especially in the small model. The performance results shownhave been the result of rather cumbersome analytical or simulation algorithms that are timeconsuming and fairly complex. In this section, we develop a simple approximate method forpredicting the capacity gain of using dynamic allocation versus fixed allocation. Results showthat the capacity of the fixed schemes are highly sensitive to traffic and channel parametersand exhibits a granular effect that prevents them from achieving ideal capacity. Dynamicschemes, we will show, do not suffer from such an effect and their capacity is robust tochanging system parameters.For our approximate analysis, we consider an infinite user population model. Althoughthis is a divergence from the model that we have used to this point, it allows us to get somegeneral results and is a valid approximation when the population size is large with respect tothe number of minislots in the frame. For an infinite population model, it is well known thatunless some sort of input control is used, a S-ALOHA channel is unstable [28]. Thus for thiscapacity analysis, we will assume such control is incorporated. It is well known then that fora stabilized S-ALOHA channel, the throughput approaches lie as the number of contendingstations grows without bound [28]. Using this assumption [28] points out the ideal capacityof a reservation multiaccess channel with S-ALOHA used for transmission of reservationrequests. This ideal approximation assumes that at capacity, both the reservation subchanneland the data subchannel are operating at their individual capacities. These individual capacitiesare 1/e and 1 for reservation minislots and data slots respectively. For the transmission ofa packet in a system operating at capacity then, the expected number of minislots used for38successful transmission of the reservation request is e. Summing the channel time used forrequest transmission with the single data slot for the packet transmission, the average numberof slots used by the station for its packet transmission is1+ V .^ (41)Thus the ideal capacity (in packets/slot) is the reciprocal,C =+ re,- •This ideal capacity approximation assumes that the subframe allocation is such that the flowis balanced between the reservation and data subframes.Indeed, the fixed allocation protocol achieves this capacity when the subframe partition,L f is set to the ideal allocation which balances the flow through the system,— F^^(43)1 +But, we are constrained in the system to set Lf to an integer value. As L f deviates fromthe ideal, the system exhibits a bottleneck in the reservation subchannel if Lf < L', and abottleneck in the data subchannel if L f > L'. The capacity of the channel is then(VLf F — LC (L f) = minFe^F ) (44)The maximum capacity for the fixed allocation schemes and the ideal capacity from eqn. (42)are plotted against the number of minislots per slot, V in Figures 9 and 10 for F=12 and F=5respectively. Notice that for the fixed allocation, the capacity has a granular effect. When Lmay be chosen to satisfy the flow balance of eqn. (43) the fixed allocation capacity matchesthe ideal but between those points, the capacity drops blow the ideal in a step like function,each step representing a different fixed allocation. The steps result from the necessity in themodel of choosing an integer allocation. The deviation of the fixed allocation capacity from1(42)39optimal fixed allocationideal30.010.00-65 0.8c na)0Ci0a)a. 0.6cvU0.40.0optimal fixed allocationideal20.0V1.00 0.8a)0cLai 0.600_0 0.40.20.0^10.0^20.0VFigure 9. Approximate capacity plotted against the number of minislots per slot, V,for the ideal capacity and the best fixed allocation policies' capacity for the case F=5Figure 10. Approximate capacity plotted against the number of minislots per slot, V,for the ideal capacity and the best fixed allocation policies' capacity for the case F=540the ideal is quite dramatic for certain values of V. For example, we see in Figure 9 that forV=8, the best fixed allocation is L1=2, producing a capacity of 0.6. This is 19% lower thanthe ideal capacity of 0.74. Examining the figures, we see that the lower capacities resultingfrom this granular effect are most pronounced in the smaller frame size model and at lowervalues of V where the ideal capacity curve is the steepest.By using a dynamic scheme, we may avoid this granular effect. Let us assume a dynamicallocation scheme, L(ni , n2), where no data slots are left idle (ie. L(ni, n2) > F — n2). Forthe MF, DP and optimal policies this criterion holds. At capacity, the number of messagesin the system, ni + n2, grows without bound. In the case of the DP policy, n2 will decreasefrom one frame to the next if 7/2 > F. It will increase when n2 < F by at most the number ofminislots in the reservation subframe, (F — n2)V. Therefore 7/2 is strictly finite and ni ---* oo.Thus for the DP policy, the expected number of messages moving into the global queue bysuccessful transmission of the reservation request on the reservation subchannel isVL(ni,n2) E[Y n i , 71 2 ] —^ (45)The expected number of complete messages sent over the data subchannel when no data slotsare idle isE[141 I n1,722] = F — L(ni,n2) .The steady state flow balance equation at capacity,E P(ni, n2 )E[Y I n i , n 2 ] =^P(ni, n2 )E[W I ni 7t2]9L 1 ,1L 2^111,1/2reduces toE[1]^_F v .I^cThus the capacity of the DP allocation policy isC 1^ +(46)(47)(48)(49)41C > 1— 1 + -.pi- • (51)the same as the ideal.It is not as easy to find a general expression of the capacity of the optimal and MF policiessince in these cases n 1 does not necessarily grow without bound at capacity. In general thoughwe may say that for the optimal policy, the capacity will at worst match the DP capacity.Since according to eqns. (40) and (35), the MF policy will only favour allocating slots to thereservation subframe when the expected reservation request throughput is greater than Weper slot, we can expect that for the MF policy,E[Y 1 ni,n2] > VL(n i , n2)^(50)eat capacity. Thus for the optimal and MF policies, the capacity matches or exceed the ideal,Examining the numerical results in the previous section, we would expect that these capacitieswould be very close to this bound. Certainly for high traffic levels, we would expect n 1 togrow large, the reservation request throughput to approach 1/e, and thus the capacity of theMF or optimal policies would approach the ideal value. To make a stronger claim wouldinvolve a closer examination of the Markov process underlying the system. This questioninvolves the Markovian analysis of a countable state process, an avenue of investigation thatmay encounter technical barriers.This capacity analysis uses a model different than our throughput and delay analysis in thatwe have assumed an infinite user population and implemented input control. These changesare necessary to provide a simple approximation. In the finite population model with no inputcontrol, the approximations are close provided some care is taken with the choice of ps forall policies to prevent congestion from eroding the throughput on the channel. Generally, theretransmission probability, Ps , should be chosen as high as possible to minimize the requestpacket transmission delay, while being low enough to prevent too many collisions in the42reservation subframe from decreasing the throughput. In the model examined in this work, ithas been found that ps can be safely set to 1.0 except for the fixed allocation policies. Thisissue of stability for this model is explored for the fixed allocation policy in [5] and [10]. Weleave detailed analysis of this issue for the dynamic policies to further research.43Chapter 4 SYSTEM STATE ESTIMATIONIt has been proposed in the opening chapters that by dynamic partitioning of the reservationsubframe based on the system state, we may improve the performance of the reservationmultiaccess protocol. We do not of course have full state information. While the numberof stations in the global queue is readily obtained by monitoring successes in the reservationsubframe and message completions in the data subframe, the number of stations in contentionmode is not so easily known. This state variable increases as idle stations become active whennew messages arrive and decreases with the successful transmissions of reservation requests.Since the stations are geographically diverse and share only the single broadcast channel, thearrival of new messages cannot be monitored with complete accuracy. Rather, if we assumethat stations can decipher collisions, idleness and successes in the minislots of the reservationsubframe, only partial information is available and a state estimate must be derived based onthe history of the reservation subframe observations. We will refer to idle minislots as holes inthis discussion. In this chapter, this estimation problem is considered and a specific estimationtechnique for the system is proposed that is both effective and practical for implementation.The following are some of the assumptions of our model relevant to the estimationproblem.• All stations monitor the previous reservation subframe, k and can reliably decipher thenumber of collisions Ck, the number of successes Tk, and the number of idle minislots orholes, Hk. These observations together form the observation vector, Ok (Hk,Tk,('k).Noting that Ck LkV - Tk - Hk, we may drop the unnecessary entry, Ok (Hk, Tk).This assumption of ternary feedback is made widely in the literature for estimation ofsystem backlog for S-ALOHA stabilization [20].44• All stations monitor the data subframe and can reliably determine the number of messagecompletions, Wk. Thus the number of stations in the global queue is simply determinedbyN2+1 = + Tk W k • (52)• All stations maintain an estimate of the common arrival probability, p, for the idle stationsin the network. This allows us to define the estimated number of newly arriving packetsin frame k to bek (M - 4 - p , (53)where ilk is the estimate in frame k for the number of contending stations, NIc . In thischapter, for ease of notation, we let 71 k =Based on these assumptions we wish to develop a recursive estimator of the form,istk+i f(Ak,ok) • (54)In the literature on S-ALOHA stabilization techniques, there appears a number of papers thataddress this estimation problem for the single channel model with ternary feedback. (Thiswould correspond to the LkV = 1 version of the model under consideration here.) WhileSegall [17] and Rivest [2] both present the formulas for optimal recursive estimation in thiscase, the formulas are impractical and costly to implement. Recognizing this, [2], [19], [20],[29], and [30] all investigate heuristic schemes to produce reliable estimation with muchsimpler implementation. Cunningham summarizes these efforts among others in [20] anddemonstrates with simulations that Rivest's pseudo-Bayesian recursive estimation techniqueshows very good performance in comparison with the other estimators. Furthermore, thepseudo-Bayesian techniques is both a simple and elegant approximation to the true optimal45Plnk I^=^P(Ok)P{Ok I nk} P {nk} (55)Bayesian estimator, and lends itself fairly readily to the extension to the estimation problemat hand.For these reasons we choose to adapt the pseudo-Bayesian estimator to our model forestimation of nk. Initially we derive the true Bayesian estimation formulas and reveal theircrippling complexity. We then follow an approach that parallels Rivest's [2], developinga simple linear pseudo-Bayesian estimator to approximate the true Bayesian estimator thatperforms well in simulation under a variety of system parameters.4.1 True Bayesian EstimationThe true Bayesian estimation procedure is as follows:1. Assume an a priori distribution, P {nk }, of the number of contending stations, nk.2. From our observations Ok in frame k, determine the conditional probability, P{Ok 17/k3. Determine the a posteriori distribution by Bayes rule as follows,4. By considering the generation of new packets and the successful transmission of packets,find the distribution,13 { nk+1 I Ok^ (56)and use its mean as the estimate ii,k+1. We may use eqn. (56) as the apriori distributionfor the next frame.Following this procedure strictly, it is necessary to track and update the distribution, P {nk}and eqn. (54) would becomeP{nk+l} = f(P{nk},Ok) •^ (57)46LV-h, LVn!) Hoh+t (LV)(LV - h)i.t 1 =_Note that in the finite population model the state space is limited to, nk E [0, 1,^, M], thusthe apriori distribution is a vector, fik with elementsPk,i =-- P Ink =^= 0 , 1 ,^M.^ (58)For all other values of nk, the probability is necessarily zero. We may assume the systemis initially empty, thus^po = (1, 0, 0, 0, . . .) .^ (59)The conditional probability of the observation vector in frame k may be written asP(Ok Ink) P(Hk = h, Tk t I nk n)E p(h, t int) p(ni I n) .Vn'Here, n' is the number of requests actually transmitted in the frame. Thus\^P(I / I n)nTipi: (1— ps)" -711 0 < 71 1^nFrom appendix B,(60)(61)H i) i+1 (7. —hh)( LV —h —t, (z — t )^LV — h)n'-(LV-h)(62)(63)If we denote the aposteriori distribution by the vector ij k` with elementsn=-_ rink =^Ok} ,it may be updated according to eqn. (55) withP(Ok nk = n)pk,„ = E P(Ok Pik = i) Pk,iIL = 0,1, ...^(64)47Finally, as in step 4 we must find the distribution of TI, k +1 by considering the generation ofnew messages and successful transmission of reservation requests. This distribution will actas the apriori distribution for the next frame, pk+i . Letting Xk be the number of arrivingmessages in frame k, the new state is updated byn,k+i = nk — Tk X k . (65)Since we know that distribution of X k is binomial conditional on nk and we know thedistribution of the state, we may now calculate the elements of apriori distribution for thenext frame withPk+1,n =^Pk,i P {Xk n — i Tkmin (M,n+Tk)M — —^p _i+i^M —Ar k --n—TkPki^"^(1 — P)^2It -^Tk(66)The entire process is repeated for the next frame.This Bayesian estimation procedure recursively updates a distribution rather than simplya single estimate of the state. This distribution makes optimal use of the available informationand by recursion, incorporates the history of observation on the channel. This is made possibleonly by the Markovian nature of the system. The details of the underlying Markov processare discussed in detail in the next section. To derive a single estimate of the state in the nextframe is a simple matter of finding the expectation of the apriori distribution,k+1 = Pk+1,i (67)While the true Bayesian estimation is the most accurate estimator for our model, it isunlikely that the procedure as presented is of much use for practical implementation in a real-time communication system. The calculations involved in finding the aposteriori distributionaccording to eqns. (60), (62) and (64) are the largest stumbling block, and as M grows large,48it is evident that the complexity of the algorithm is infeasible for real-time implementation.For this reason we investigate an approximation to the true Bayesian estimate that producesa simple linear estimator. Again, its derivation and resulting form is very similar to thatproposed by Rivest [2].4.2 Pseudo-Bayesian Estimation for the Reservation SubchannelTo derive the approximation to the Bayesian estimator we follow essentially the sameprocedure, making a couple of key simplifying assumptions that alter the resulting estimateupdate to be a much simplified expression. Although the simplifying assumptions willnecessarily undermine the accuracy of our result, the need for a practical estimator forcesus to endure these losses. Furthermore, numerical and simulation results in the followingsection reveals that the pseudo-Bayesian estimate appears to perform near to the optimal ata small fraction of the computational expense.Our first simplifying assumption is made of the apriori distribution of n k . We followRivest in assuming that the distribution may be reasonably approximated with a Poissondistribution of mean ilk. Thus, lete —ya k^kk {ri k}k •nk I(68)This approximation is made not only to save the amount of information that must be carriedfrom frame to frame but we will see that with some further simplification, this greatly reducesthe effort in calculating the aposteriori distribution. The Poisson approximation made in thesingle channel, infinite user model considered by Rivest is demonstrated to be a reasonablygood one. The approximation seems less valid as applied to our model where we consideronly finite user populations. Obviously, when Ilk approaches the upper limit of M, the Poissondistribution breaks down, but if we consider that in typical operation M will be large withrespect to the backlog, the Poisson approximation may be a reasonable one. We will proceed49despite these doubts emphasizing that such approximations will be necessary to produce asimpler estimator.The second key assumption is in our handling of the conditional probability distribution.Rather than confront the unwieldy joint distribution defined in eqns. (60) and (62) we willconsider each minislot individually. By doing this we assume that the observations on eachminislot are independent of one another. These observations are obviously not independentand for the sake of simplicity we forfeit the information in the correlation of these observations.Let Hi, Ti and Ci denote the occurrence in minislot i in frame k of a hole, success andcollision respectively. The conditional probabilities for minislot i are thusP {Hi nk } EE H(n k ) = (1 —^nk,P{Ti In k} - T(nk )^Ps nk (1 — Ps )n1,-1LV^LV^ (69)P{Ci I nk} -C(nk) = 1 — H(n k ) — l(nk) •We wish now to find the aposteriori distributions for each of the three observations. UsingBayes rule,P {nk I^= P {Hi 1 7'k} x Pijk (nk )P {Hi}Using eqns. (69) and (68) and with some manipulation on the numerator, we getPok—x)(nk) /Ink I Hi} =P {Hi}wherexLVSimilarly, for the observation of a success or a collision, the aposteriori distributions areP { nk^=P {n k^=x^P(uk„)(nk — 1)P{TZ}k (n k) (1 — H(n k ) — ( 710) P {Ci}(73)(70)(71)(72)50We see then that in the case of the hole, the mean of this new distribution isrtk — xand in the case of a success the mean isx + 1(74)(75)For a collision and for a collision the mean is derived to bex 2' (76)+ e x ____ x ___ 1It is worthwhile to note here that in the case of a hole and a success, the Poissondistribution of nk is maintained while for a collision it is not. For simplicity, we proceeddespite this fact, acknowledging at this point that the assumption of the a priori distributionis not exact. It is for this reason that Rivest suffixes the estimator with "pseudo-".Incorporating the a posteriori distribution means together for observations on all minislots,we now see that our pseudo-Bayesian estimate for the number of contending stations in framek is= istk — Hkx — Tkx Tk Ck x2( ex x 1) (77)At this point, it becomes apparent that the two simplifying approximations (that the aprioridistribution is Poisson and the minislot observations are independent) allows us to break downthe highly complex procedure in section 4.1 to a simple expression. Adjusting this estimateby our prediction of the number of arriving packets, and the successful transmission ofpackets observed in frame k, the estimate for frame k+1 becomes14+1^k (M^iik)P^ (78)51This provides us with a simple formula to estimate the number of contending stations for thenext frame based on the current estimate and observations of the current frame.Let us now consider how our estimate may be incorporated into the subframe allocationprocedure. First it is important to note that the state estimate i k is a real number as definedby eqn. (78), but to choose an allocation in the desired policy, the state must be an integer.We overcome this inconsistency by simply rounding the estimate to the nearest integer. Withthe common integer estimate, all stations determine the subframe allocation using a simplelookup table, matching the state estimate with its respective allocation, Lk = L(A7i,N2).As long as the feedback to all stations is identical and all stations use the same algorithm todetermine the allocation, the distributed control will be consistent among the stations.For testing purposes, the pseudo-Bayesian estimation algorithm was implemented into asimulation of the reservation model. The technique was applied to the MF policy on a 20,000frame simulation and throughput and average delay were calculated. Again, although wecannot determine the confidence intervals of the data points using the usual statistical methods,the comparative smoothness of the resulting curves indicates that this simulation length issufficient for a simple comparison among control schemes. Both the small and large modelsconsidered in the last chapter were examined and Figures 11 and 12 show the delay plottedagainst the throughput for those two models. We can see that with estimation implemented,the performance of the system is virtually unchanged. There is a slight decrease in the capacityfor the small model shown in Figure 11 but the difference is less than 2%, and the MF policywith estimation still shows much better performance than the fixed allocations schemes andthe DP policy, where state estimation is unnecessary. These results are encouraging andindicate that while we have been forced to make some simplifying approximations in thederivation of a simple estimator, the resulting pseudo-Bayesian estimate provides for goodperformance for a dynamic subframe allocation policy.5253-,-6-g 5.0ai°E 4.0lha)EEI 3.0co 2.0T)-01.00.00.10^020^0.30^0.40throughput (packets/slots)Figure 11. Average delay vs. throughput for the parameter set(M=20, F=5, V=3) : MF policy with and without estimation0.00^0.20^0.40^0.60throughput (packets/slot)Figure 12. Average delay vs. throughput for the parameter set(M=150, F=12, V=12) : MF policy with and without estimation0.00.80i5.0^ Complete state informationx Pseudo-Bayesian estimation0.600.507.06.010.0I■I■^ Complete state informationx Pseudo-Bayesian estimationThe issues of sensitivity of this state estimator to system parameters has not beeninvestigated in this work. Analysis of this type is left to further research.54Chapter 5 MULTIPACKET MESSAGE MODELSAnalysis to this point has focussed exclusively on the single packet message model. Inaddition to this model, it is of interest to investigate multipacket message models. Indeed,as the total message length with respect to the reservation packet length increases, we mayexpect the efficiency of the reservation protocol to improve accordingly since proportionallythere is less overhead for reservations per useful data slot. In this chapter we continueto investigate the use of dynamic subframe allocation but now with models that handlemultipacket message lengths.For our multipacket message model, users generate messages of variable length ratherthan the single packet length messages considered earlier. Such a message model is moreappropriate for traffic composed of e-mail messages, and general data transfers while singlepacket messages are more appropriate for such short data transfers as credit card checks andother types of short standardized data signalling. In slotted systems, users would break themessage into slot sized packets. While the length of the messages may have one of manypossible distributions depending on the nature of the traffic, the exponential distribution isgenerally used. When the message is broken into packets of equal length, the exponentiallydistributed message lengths lead to geometrically distributed numbers of packets per message.This distribution covers the full range of possible lengths, from single packet messages to verylong messages and has convenient statistical properties. When a station sends its reservationpacket, it provides reservation for the entire set of data packets in the multipacket message.For the multipacket models that we will consider, the frame structure remains as inFigure 1, with stations transmitting reservation requests in slotted ALOHA fashion over thereservation subframe. Thus, the recursive estimator discussed in chapter 4 remains valid forthe multipacket message models.55Before beginning detailed analysis of multipacket message models, we must first deter-mine the specific service discipline for the model by which the messages in the global queueare allocated slots in the data subframe. In this chapter we consider two distinct service disci-plines. The two multipacket message models corresponding to these disciplines we will referto simply as the Ml model and the M2 model. In both cases, stations send a single reservationrequest packet for the entire multipacket message. The Ml model is the simpler of the two,wherein stations are restricted to sending only one packet per frame and the message's lengthis not indicated in the reservation request packet but rather, the message's end is indicatedby an end-of-message (EOM) flag. This type of service discipline is seen in such reservationprotocols as R-ALOHA [3] and that proposed in [31] where a station is limited to securingonly one slot per frame. In a multichannel scenario where one may consider the allocationof a bank of channels rather than slots in a frame as in [8] and [32], this service disciplineis appropriate since stations can send their packets on only one channel at a time when theyhave a single transmitter. Within the M2 model, message length is indicated in the requestpacket and stations are allowed to transmit multiple packets within a single frame. This is afar more general reservation service discipline for multipacket messages, and is assumed inreservation protocols considered by [4] and [5].We shall see that the simplicity of the Ml model allows us to carry out a Markov decisionprocess formulation and policy optimization very similar to that of the single packet model.The M2 model exhibits a much more complex state structure and we are forced to abandonattempts at finding an allocation policy based on full state information to optimize throughputperformance. Rather we look at what insight we have gained about heuristic policies fromthe single packet message model and the Ml model to propose a version of the MF policythat shows good improvements over fixed allocation schemes in the M2 model. For both theMl and M2 models we provide performance curves which back up our claims that dynamic56subframe allocation provides better system performance than fixed subframe allocation.5.1 The M1 Model5.1.1 Model and Markov decision process formulationStations in the M1 model send a single reservation request packet for their entire message.This reservation request does not indicate the length of the multipacket message and stationsare restricted to sending no more than one packet per frame. Similarly to the single packetmessage model, when the station has successfully sent its reservation request packet, thatstation joins the global queue, sending its leading untransmitted data packet on the datasubframe when it reaches the head of the global queue. When a packet is sent, if it is thelast packet in the message, the station goes into idle mode and waits for a new message toarrive. Alternatively, if it is not the last packet, the station rejoins the global queue at theend of the frame to send its next packet in a subsequent frame. Thus stations are served in around-robin fashion. For all stations to be informed at the beginning of the next frame of thestate of the global queue, an EOM flag should indicate that the last packet is being sent. Ifthe global queue is short so that stations send packets in subsequent frames, this EOM flagshould be indicated in the next to last packet of the message (or in the reservation requestpacket for a single packet message) to avoid the station being allocated a wasted data slotafter its message is complete.Let us assume that for the ith message, the number of packets in the message, r1 isgeometrically distributed with mean message length, R packets. Thus1^1 r-1P(r i = r) = -17 1 — )^,^= 1,2, . . .( Using this very simple multiple message model, the Markov decision process formulationremains very similar to that of the single packet message model. Figure 13 shows how the57(79)contending^reservationstations subframestations move through the queuing system. Note that the state of the system at frame k isstill completely defined by the two state variables, the number of stations in contention mode,Art, and the number of stations currently in the global queue, N. The decision variableremains the size of the reservation subframe, L(Sk). Similarly, since the stations are confinedto sending only one packet per frame, the reward may be written the same as in the singlepacket message model,rL(Sk) = min^, F — L(Sk)) •^ (80)Figure 13. Queueing control representation of the M1 multipacket reservation modelThe only difference from the single packet message model appears in the state transitionprobabilities. Note from Figure 13 that Wk now represents the number of message com-pletions in frame k. The advantage of the geometric distribution becomes evident now aswe take advantage of its memoryless property to update the system state. Using the samesimplified notation as chapter 2,^n a-- Art ,^n2 = N2 ,7,„^4+1 , 77,2^, a L(Sk)the distribution of W k conditional on the state and action isPr{ Wk = w ni, n 2 , a} =min (n2,F—a) (82)(min (71,2,z F — a)) Gil ) 1 (1 71:71 ) min n2 ,F—a)—i(81)58The distributions of Yk and Xk remain the same as in section 2.3.5 eqns. (10) and (11).Combining these distributions we write the transition probabilities of the Markov decisionprocess asp(mi, n1 2 n i , n 2 , a)min (ii,,F—a) [Pr {Wk = w l n1 ,n2,a}xPr { Y k = 7n 2 — n2 w I n i , n 2 , a} x^ (83)Pr {X k = nti — ni 7i2 — n2 + w I nl, n2, a}]•Again, as in the single packet message model, we see that due to the memoryless propertyof the message length distribution the state transition probabilities satisfy the Markovianproperty. Note that the single packet message model is identical to the M1 model with R=1.5.1.2 Dynamic allocation policiesFrom the Markov decision process formulation we see that the M1 model has a verysimilar state structure to that of the single packet model formulated in chapter 2. Thus ourapproach to finding good policies is also similar. As in the single packet message model,we consider fixed allocation schemes, optimal allocation and two heuristic policies. Thesepolicies correspond directly to those examined in chapter 3 for the single packet messagemodel. Some comments are given as to how each of these policies is adapted for the M1model. Figures 14-17 show performance results of the different policies in the Ml model.These results are discussed in section 5.1.3. It should be noted immediately that the reservationsubsystem is unaffected by the extension to the multipacket model and the state estimationtechnique discussed in chapter 4 remains unchanged for this model.5.1.2.1 Fixed allocationAlthough Szpankowski [10] proposed his method for finding the best fixed allocation forspecifically the single packet message model, it is simple to extend it to the M1 model. Ifthe throughput on the reservation subframes assumed to be 1/e requests per minislot, then59the average reservation overhead for the message will be e/V slots per message. Setting theproportion of reservation slots to data slots equal to the proportion of the average reservationoverhead to average message length we getL^e/VF — L Rand solve to getFL 1+ RV/ eWe use the integer values of eqn. (85) for the best fixed allocations.5.1.2.2 Optimal policyAs in section 3.2, we apply the modified policy iteration algorithm with our newtransition probability matrix to find the optimal policy for the Ml model which maximizesthe system throughput. All nondegenerate policies form unichain Markov processes and thusthe optimality equations (eqn. (26)) apply and the algorithm is guaranteed to converge to theoptimal policy, L*(Ni, N2).5.1.2.3 DP policyFor the Ml model, the DP policy again maximizes the one-step throughput for the currentframe. Due to the constraint that each station may send at most one packet per frame, at most/q packets may be sent in the frame and we specify the DP policy to beLDp(Ni , N2 ) = max (0,F —^. (86)5.1.2.4 MF policyIn section 3.4 a new criteria, referred to as the "flow", was proposed to allow a onestep maximization and thereby approximate the optimal policy without the computational60(84)(85)complexity of the dynamic programming techniques. For the M1 policy we again use sucha criteria. Since Wk now refers to the number of message completions in the frame, wenow introduce the variable Zk which indicates the number of packet transmissions in framek. (In the situation where R=1, Zk Wk , and are thus interchangeable in the single packetmessage model.) We define the "flow" in frame k asFk = ayk Zk (87)and set a = e/V. Noting that E [Yk] is defined by eqn. (37) and that Z k =min (M,F — L) , the expectation of Fk may be maximized with respect to L at eachstate to give the MF policy, LMF(N1, N2). It was pointed out in section 3.4 that the MFpolicy was attractive for the single packet message model in that it was independent of thearrival probability, p. This property is maintained for the M1 version of the MF policy.Furthermore, since E[Fil is independent of the average message length, R, so too will bethe MF policy, and the MF policy derived for the single packet message model will applyunchanged to the M1 model with the same F and V.5.1.3 Numerical resultsThese policies were analyzed for a small network model with parameters {M=20, F=5,V=3, R=3) and average delay and throughput were calculated using the methods describedin section 2.4. The retransmission probability, ps is set to unity unless otherwise indicated.The optimal policies for different arrival policies were calculated using the modified policyiteration algorithm and a set of sample policies appear on Table 4. The MF and DP policiesare identical to those appearing on tables 2 and 3. Notice that as in the single packetmessage model, the optimal policies for high traffic levels are very similar to the MF policy,foreshadowing the almost identical performance of the optimal and MF policies.Figures 14 and 15 plot the throughput versus the arrival probability and the average delayagainst the throughput respectively for the fixed and dynamic policies. In these curves, we see610^1 2 3 4 5 6 7 8 9 10^11^12 13^14 15 16 17 18 19 20 0 1 2 3 4 5 6 7 8 9 10^11^12 13 14 15 16 17 1 8 19 200 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 0 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 51 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 1 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 52 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 5 5 5 5 2 3 3 3 3 3 3 3 3 3 3 3 4 4 4 5 5 5 5 53 2 2 2 2 2 2 2 2 2 3 3 3 4 4 4 4 5 5 3 2 2 2 2 2 2 2 2 3 3 3 4 4 4 5 5 5 54 1^1 1^1 1^2 2 2 2 3 3 3 3 4 4 4 5 4 1^1 1^1 2 2 2 2 3 3 3 4 4 4 5 5 50 0 1^1 0 0 0 0 0 0 0 0 0 0 0 0 5 0 1 1^1 1^2 2 2 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 0 0 1^1 1^2 2 0 0 0 0 0 0 0 07 0 0 0 0 0 0 0 0 0 0 0 0 0 0 NN 7 0 0 1^1 1^2 2 0 0 0 0 0 0 09 0 0 0 0 0 0 0 0 0 0 0 0 0 8 0 0 1^1 1^2 2 0 0 0 0 0 09 0 0 0 0 0 0 0 0 0 0 0 0 9 0 0 1^1 1^2 2 0 0 0 0 010 0 0 0 0 0 0 0 0 0 0 0 10 0 0 1^1 1^2 2 0 0 0 011 0 0 0 0 0 0 0 0 0 0 11 0 0 1^1 1^2 2 0 0 012 0 0 0 0 0 0 0 0 0 p=0.04 12 0 0 1^1 1^2 2 0 0 p=0.1013 0 0 0 0 0 0 0 0 13 0 0 1^1 1^0 2 014 0 0 0 0 0 0 0 14 0 0 0^1 1^0 015 0 0 0 0 0 0 15 0 0 0^1 0 016 0 0 0 0 0 16 0 0 0 1 017 0 0 0 0 17 0 0 0 018 0 0 0 18 0 0 019 0 0 19 0 020 0 20 0Nit N10^1 2 3 4 5 6 7 8 9 10^11 12 13 14 15 16 17 18 19 20 0 1 2 3 4 5 6 7 8 9 10^11 12 13 1 4 15 16 17 18 19 200 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 0 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 51 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 1 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 52 3 3 3 3 3 3 3 3 3 3 3 4 4 4 5 5 5 5 5 2 3 3 3 3 3 3 3 3 3 3 3 4 4 4 5 5 5 5 53 2 2 2 2 2 2 2 2 3 3 3 4 4 4 5 5 5 5 3 2 2 2 2 2 2 2 3 3 3 3 4 4 4 5 5 5 54 1^1 1^1 2 2 2 2 3 3 3 4 4 4 5 5 5 4 1^1 1^1 2 2 2 2 3 3 3 4 4 4 5 5 55 0^1 1^1 2 2 2 2 3 3 3 4 4 4 5 5 5 0 1 1^1 2 2 2 2 3 3 3 4 4 4 5 50 0 1^1 1^2 2 2 3 3 3 4 4 4 5 6 0 0 1^1 2 2 2 2 3 3 3 4 4 4 50 0 1^1 1^2 2 2 3 3 3 4 4 4 N 7 0 0 1^1 2 2 2 2 3 3 3 4 4 4N2 870 0 1^1 1^2 2 2 3 3 3 4 4 '7 8 0 0 1^1 1^2 2 2 3 3 3 4 49 0 0 1^1 1^2 2 2 3 3 3 4 9 0 0 1^1 1^2 2 2 3 3 3 410 0 0 1^1 1^2 2 2 3 3 3 10 0 0 1^1 1^2 2 2 3 3 311 0 0 1^1 1^2 2 2 3 3 11 0 0 1^1 1^2 2 2 3 312 0 0 1^1 1^2 2 2 3 p=0.16 12 0 0 1^1 1^2 2 2 3 p=0.2213 0 0 1^1 1^2 2 2 13 0 0 1^1 1^2 2 214 0 0 1^1 1^2 2 14 0 0 1^1 1^2 215 0 0 1^1 1^2 15 0 0 1^1 1^216 0 0 1^1 1 16 0 0 1^1 117 0 0 1^1 17 0 0 1^118 0 0 1 18 0 0 119 0 0 19 0020 0 20 00^1 2 3 4 5 6 7 8 9 10^1 1^12 13 14^15 16 17 18^19 20 0 1 2 3 4 5 6 7 8 9 10^11 12 13 14 15 16 17 18 19 200 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 0 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 54 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 52 3 3 3 3 3 3 3 3 3 3 3 4 4 4 5 5 5 5 5 2 3 3 3 3 3 3 3 3 3 3 3 4 4 4 5 5 5 5 53 2 2 2 2 2 2 2 3 3 3 3 4 4 4 5 5 5 5 3 2 2 2 2 2 2 2 3 3 3 3 4 4 4 5 5 5 54 1^1 1^1 2 2 2 2 3 3 3 4 4 4 5 5 5 4 1^1 1^1 2 2 2 3 3 3 3 4 4 4 5 5 55 0^1 1^1 2 2 2 2 3 3 3 4 4 4 5 5 5 0 0 1^1 2 2 2 3 3 3 3 4 4 4 5 56 0 0 1^1 2 2 2 2 3 3 3 4 4 4 5 6 0 0 1^1 2 2 2 3 3 3 3 4 4 4 57 0 0 1^1 2 2 2 2 3 3 3 4 4 4 7 0 0 1^1 2 2 2 3 3 3 3 4 4 4N2 8 0 0 1^1 2 2 2 2 3 3 3 4 4 N2 8 0 0 1^1 2 2 2 2 3 3 3 4 49 0 0 1^1 2 2 2 2 3 3 3 4 9 0 0 1^1 2 2 2 2 3 3 3 410 0 0 1^1 1^2 2 2 3 3 3 10 0 0 1^1 2 2 2 2 3 3 311 0 0 1^1 1^2 2 2 3 3 11 0 0 1^1 2 2 2 2 3 312 0 0 1^1 1^2 2 2 3 p=0.28 12 0 0 1^1 1^2 2 2 3 p=0.3413 0 0 1^1 1^2 2 2 13 0 0 1^1 1^2 2 214 0 0 1^1 1^2 2 14 0 0 1^1 1^2 215 0 0 1^1 1^2 15 0 0 1^1 1^216 0 0 1^1 1 16 0 0 1^1 117 0 0 1^1 17 0 0 1^118 0 0 1 18 0 0 119 0 0 19 0 020 0 20 0Table 4. Optimal policies for a range of arrival probabilitiesin the M1 model with parameters (M=20, F=5, V=3, R=3)620.8^ L=1, ps=0.4- - - - L=2, ps=1.0--- DP• MF^ optimalx MF with state est.0.30.20.0^0.1^0.2arrival probability, p0.3^ L-1, ps=0.4-- -- L=2, ps=1.0--- DP• MF^ optimalx MF with state est..................Figure 14. Throughput vs. arrival probability in the M1 model for the parameter set (M=20,F=5, V=3, R=3) : fixed allocation policy, optimal policy, MF policy and DP policy20.0-a; 15.0acaCoCoa)ECoa) 10.0EEl(IITll-0^5.00.00.00^0.20^0.40^0.60^0.80throughput (packets/slot)Figure 15. Average delay vs. throughput in the M1 model for the parameter set (M=20,F=5, V=3, R=3) : fixed allocation policy, optimal policy, MF policy and DP policy63that the dynamic policies all show lower average delays for equivalent throughputs and higherthroughputs for equivalent arrival probabilities than the fixed policies. For example, for athroughput of 0.58 packets per slot, the fixed allocation schemes operate close to their capacitywith a large delay of over 8 frames. The dynamic schemes all show delays of near 4.5 frameswhen operating at that throughput, a reduction of over 40%. Furthermore, the capacities ofthe dynamic policies are considerably higher that those of the fixed policies. For this modelthe highest throughput achieved by a fixed allocation schemes is 0.637 packets per slot whenL=1 and p=0.13. In comparison, the highest throughputs achieved by the dynamic policiesare 0.751 for the DP policy and 0.792 for the optimal and MF policies, an improvement of18% and 24% respectively. As in the single packet message model, the MF and the optimalpolicies under complete state information are indistinguishable in terms of performance andboth show slightly lower delays for equivalent throughputs than the DP policy. Includedin Figures 14 and 15 are the simulation results of the MF policy implemented with stateestimation incorporated. The introduction of state estimation has eroded the capacity of theprotocol from the complete state information case, but achieves a capacity about 3% higherthan the DP policy and 21% higher than the fixed allocation capacity.We also examined a larger scale system model with parameters (M=150, F=15, V=8,R=5). In this case, both the modified policy iteration algorithm and the solving of the globalbalance equations of the Markov processes are extremely complex undertakings and we optfor simulation analysis and abandon the optimal policy. Simulations were conducted forthe policies of interest over a range of traffic levels and delay and throughput statisticswere gathered. The same simulation length of 20,000 frames was used as in the singlepacket message model and provides a sufficient accuracy for simple comparison amongpolicies. Figures 16 and 17 show the simulation results plotting throughput against thearrival probability and the delay versus the throughput respectively. State estimation is64^ L=2-- - DP policy^ MF policy....... .................................................1.00.020^0.030^0.040^0.050arrival probability, pFigure 16. Throughput vs. arrival probability in the M1 model for the parameter set(M=150, F=15, V=8, R=5) : fixed allocation policy, MF policy and DP policy20.0--c-c) 15.0acucncoa)ECl)a) 10.0EEiasa)-0^5.00.90^1.000.2 ^0.0100.00.50^0.60^0.70^0.80throughput (packets/slot)Figure 17. Average delay vs. throughput in the M1 model for the parameter set(M=150, F=15, V=8, R=5) : fixed allocation policy, MF policy and DP policy651implemented for the MF model and the retransmission probabilities are set at unity. Wesee from these results that as in the small model, the dynamic policies show lower delays inthe high throughput regions of the graph and the capacity of the MF policy is the highest.At throughputs up to near 0.8 packets/slot, the delays of all policies are very close. Beyondthis, the fixed allocation policy shows a sudden increase in delay as it reaches its capacityat 0.87 packets/slot. In comparison, the DP and MF policies produce capacities of 0.91 and0.92 respectively, an improvement of approximately 5% over the fixed allocation capacity.This relative improvement achieved by introducing dynamic allocation, while less than thatfor models with smaller frame size, is still encouraging.5.1.4 Capacity approximationThe capacity approximation technique described in section 3.5 for the single packetmessage model may be extended to the M1 model fairly simply. We again look at the casewhere M is very large and the reservation subsystem throughput is stable with throughputapproaching l/e reservation packets per minislot when the number of contending stationsgrows large. In the Ml case, the average overhead for reservation request packet transmissionis e/V and the ideal capacity(88)as discussed in [28]. As in section 3.5, this ideal is a capacity approximation that is basedon the assumption that perfect flow balance exists between the two subchannels. For a fixedallocation scheme of reservation subframe size Lf , this perfect flow balance is not in generalachieved and the capacity is better approximated by(L f)( RV Lf F—LFe^F )(89)due to the bottleneck in the reservation subsystem or the data subsystem caused by choosinga whole number of reservation slots. By the same line of argument as in section 3.5 we can66^ ideal capacity^ best fixed allocation capacity• MF policy capacity 0.601.0^3.0^5.0^7.0^9.0Avg. message length, RFigure 18. Approximate capacity plotted against the average message length, R, for the idealcapacity, the best fixed allocation policies' capacity and the MF policy capacity for the case F=15say that the ideal throughput of eqn. (88) is a good approximation for the capacity of the DPpolicy, MF policy and optimal policies when M is large. We verify this by finding capacitiesby simulation for the MF policy applied to a model with parameters (M=150, F=15, V=8)and state estimation incorporated for different values of R.In Figure 18 we plot system capacity against the average message length for the MFsimulations, the ideal capacity (eqn. (88)) and the best fixed capacities (eqn. (89)). Thegranular effect that we observed when capacity was plotted against V is again evident in thisfigure but now with respect to the average message length, and the difference between the fixedallocation capacity and the ideal capacity is highly sensitive to R. The MF simulations indicatethat the ideal capacity is a good approximation for the MF capacity for large M. Although thedifference is not evident from the figure, the MF simulation results are within 1% of the ideal.It is worthy to repeat here that the MF policy is independent of changing average messagelength. Thus, the MF policy exhibits the quality of adapting not only to the traffic level, as it1.000.90Qcti 0 8000.7067is independent of the arrival probability, p, but its capacity is robust in the face of changingaverage message length. It may be expected that in real systems, the traffic characteristicwould change in this fashion, showing time variance in both intensity and average messagelength. It should be noted though, that while the MF policy is independent of these trafficparameters, the accuracy of state estimation upon which the MF policy depends, almostcertainly depends at least somewhat on these parameters. The extent of this dependance hasnot been investigated in this work.5.1.5 Problems with the M1 modelThe M1 model for multipacket messages is simple, allowing policy optimization and exactMarkovian analysis of the policies that we have considered. This simplicity is only possiblebecause of the following two key constraints put on the operation of the data subsystem:• Reservation request packets do not contain information regarding the length of themessage.• Stations may only send one packet per frame.For a real system, these constraints limit the performance of the protocol substantially. Thefirst constraint limits the possible information that is available to the system for control ofthe allocation. Information on the length of the individual messages is possibly valuable forsubframe allocation control. The second constraint is even more limiting. By allowing thestations to send only one packet per frame, we are introducing a large minimum delay formessage transmission. Consider a situation where a system has a very low traffic level. If astation generates a message of length 4 packets, at best, the total transmission delay will be5 frames, one for the reservation, and 4 for the message. This is indeed an inefficient use ofthe channel resource. When the system is very lightly loaded, the delay could be 2 frames,a 1 frame delay for reservation and another frame for data as all the packets could be sent inthe next frame. At higher traffic levels, forcing the stations to share the data subframe in a68round-robin fashion seems more legitimate. To address these problems, we look at the moregeneral multipacket model that does not impose these constraints on the protocol.5.2 The M2 Model5.2.1 Model and state structureThe M2 model is a more general one than the Ml model for multipacket messagetransmission. The system remains divided into a reservation subsystem and data subsystembut now the two constraints are loosened. Reservation request packets indicate the numberof packets in the current message of the transmitting station. Furthermore, the stations in theglobal queue may send multiple packets in the data subframe. Note that for the Ml model,all stations in the global queue were identical from the perspective of queue service controland it was natural to serve them in a round robin fashion. Now, entries into the global queueare distinguished by their number of remaining packets. As such, to be entirely general,the system controller should choose not only the number of data slots available for packettransmission, but also which stations in the global queue should be served. As such, the stateand decision space becomes very large in the completely general case.To formulate this more general model as a Markov decision process is no longer asimple derivation and while it may be described as Markovian, the dimensionality of the statespace grows large and the number of possible state grows without bound. The reservationsubsystem is described as always by one state variable, the number of contending stations,N1. The variables describing the state of the global queue may no longer be limited to asingle number since not only the number of stations in the queue must be represented, butalso the number of remaining packets for each. We may define a set of new state variables,A, ; i = 1,2,...^, (90)that represent the number of unsent packets at the ith station in the global queue. There areM such variables since there may be up to M stations in the queue. We no longer need to69track the number of stations in the queue since that is simply the number of A, which arenon-zero. Thus the state vector becomesS^Ai, A2, ... Am) .^ (91)Although at normal unsaturated operation the vast majority of the AZ would be zero, all M+1elements must be retained in the state vector for the process to be accurately described asMarkovian. When message lengths are assumed to be geometrically distributed, the lengthsmay take on any positive integer value. Thus we are faced with a Markov process with astate space of dimension M+1 and an infinite number of states. While we could consider atruncated message length distribution to achieve a finite state Markov process, the state spacedimensionality alone is imposing in its size.The range of available actions in the very general M2 model is no less imposing. Sincethe stations in the global queue are distinguished by their number of unsent packets, there maybe advantage for the controller to allocate a specific number of slots in each data subframeto specific stations. For instance, there may be gains to be made by servicing stations withshort messages to free them for new arrivals and wait to serve stations with long messagesuntil the queue is less busy. We may define a set of new action variables,Bi ;^.. F ,^ (92)specifying the stations that are serviced in the current data subframe. These variables form theaction vector B. The length of the reservation subframe would be implicit in this action vector.We provide here a very simple example to indicate how a controlling policy may work.Consider a model where the frame size is F=5, there are currently 4 contending stations andthere are two stations in positions 1 and 2 of the global queue with 5 and 1 unsent packetsrespectively. The state would then beS =^Ai, A2, ... Am)(93):7= (4, 5, 1, 0, 0, ...) .70A sample policy may allocate one slot for the reservation subframe, 1 slot to completeservice of the station in position 2 and the remaining 3 slots to service the station in position1. Thus the action vector would beB (Bi , B2, ... BF)(94)= (0,1,1,1, 2) .A zero indicates that the slot is part of the reservation subframe.To find an optimal state dependant policy for frame allocation it would be necessary toformulate this model as a Markov decision process, deriving the action dependant transitionprobabilities and state rewards. We stop short of attempting this due to the very largedimensionality and size of such a formulation. Indeed, we can expect that such an optimalpolicy for even quite small network model would be unattainable using the cumbersometechniques that we applied to the simple two dimensional state space and single action variablemodels discussed earlier in this thesis. This author has not seen any applications of Markovdecision process theory of state dimension greater than 2 in the literature. Furthermore, whenthe Markov process has a countable number of possible states, there are no general techniquesfor deriving an optimal policy in the Markov decision process literature. In this most generalmodel description, we leave finding the optimal policy as an open research problem andproceed in the next section to use insights gained in the single packet message model and theM1 model to find simple dynamic allocation policies that work quite well for the M2 model.5.2.2 Simple allocation in the M2 modelHaving abandoned the high road of Markov decision process formulation and optimizationfor the M2 model, we now make some simplifications to the problem that allows us to adaptthe heuristic policies of the M1 model to the more general M2 model. First, let us reduce theactions to simply the allocation of the size of the reservation subframe, L, and assume thatstations in the global queue are serviced to completion on a first come first served basis. Thus71we maintain generality in that stations may, and inevitably will transmit multiple packets perframe. (Essentially, this simplified service discipline is identical to that considered by [5] whodoes an approximate analysis of the performance in this case for a fixed allocation policy.)This simpler service discipline uses a simple subset of the state information specifying N 1 tobe the number of contending stations and N2 to be the total number of packets waiting forservice in the global queue. The latter incorporates all global queue information by summingthe true state variablesN2 A 1 + A 2 . . . Am . (95)We will refer to these variables as the state although it must be noted that they containonly a portion (albeit a significant portion) of the state information in the Markov process.With these state variables, our system very closely resembles that of the M1 model. Sinceunsent packets in the global queue are effectively no different than single packet messagesfrom the standpoint of the data subsystem, it seems logical to apply the MF and DP policies(which are independent of the average message lengths) with our two state variables N 1 andthe redefined N2. Since the process is no longer Markovian on these state variables, wecannot apply the Markovian techniques to determine throughput and delay so we incorporatethese policies in simulation for the same model as was examined for the Ml model (F=5,1i=3, M=20, R=3). A simulation over 60,000 frames was conducted, collecting delay andthroughput statistics for different arrival probabilities. The large simulation length is chosenhere simply because in this smaller model, the time taken for a simulation of this manyframes is not as great as in the larger networks considered in earlier simulations. In any case,this simulation length provides us with results that enable comparison between the differentmodels and the allocation policies. Figure 19 plots the average delay against the throughputfor a fixed allocation policy and the MF policy with state estimation for the M2 model andM1 model simulations. We can see that having allowed stations to transmit multiple packets7220.0cf) 15.0a)• 10.0715a)cts^5.0eh->0.0 - - - - M1 service, fixed allocation, L=2— - — M2 service, fixed allocation, L=2^ M1 service, MF policy with estimation— — - M2 service, MF policy with estimation^ ^1^) III,Il I^h ^A_//,.....--^------ -----/.......^.,^, ^.......^,_ ...- - -0.0^0.2^0.4^0.6^0.8^1 .0throughput (packets/slot)Figure 19. Average delay vs. throughput for a fixed policy and the MF policy withestimation in the M1 and M2 models with parameters (M=15, F=5, V=3, R=3)per frame has produced the expected reduction in average message delay at low throughputlevels. In this case the delay is reduced from approximately 4 frames/message for the Mlmodel (1 frame for reservation and 3 for data packets) to near 2 frames for the M2 model(1 frame for reservation and 1 frame for data packets). This reduction is significant and wemay expect the savings to increase with increasing message length. For low traffic levels, wemay expect the Ml model delay to be 1+R frames while the M2 model will produce delaysnear 1+R/F. As throughput reaches capacity, the difference in average delay between identicalpolicies in the Ml and M2 models decreases and seems to disappear at capacity. Capacityfor the same policies for the two models appears the same. This is not unexpected since weexpect the capacity approximation of the MI model to be applicable also to the M2 model.In summary, loosening the constraints on the MI model has rendered optimization andMarkovian formulation of the general model inachievable. We do find though, that wecan still improve results with dynamic subframe allocation using a portion of the state73information and a simple service discipline that does not take advantage of its full rangeof options. Capacity remains the same, and the average delay is significantly reduced forlow and medium throughput levels. Finding optimal or improved policies that incorporatecomplete state information and the full range of possible global queue service disciplinesremains an open research problem.74Chapter 6 CONCLUSION6.1 SummaryIn this thesis we have investigated how the reservation multiple access protocol firstproposed by Roberts[1] may be improved by dynamically allocating the size of the reservationsubframe based on state feedback information. It was shown that an optimal allocationpolicy exists when complete state information is available and using infinite horizon dynamicprogramming techniques, optimal policies were found for specific networks. Numericalanalysis and simulation results demonstrated that the capacity of the broadcast channel isdramatically improved over previously considered fixed allocation schemes when the optimalpolicy is implemented and average message delays are decreased for the same systemthroughput.While the optimal policy is valuable as it produces an upper bound on the systemthroughput, the complexity of its derivation and implementation prompts us to investigatetwo heuristic allocation policies that show performance results close to optimal while holdingsome implementation advantages over the optimal. The MF policy shows the best performanceof the two, almost matching the optimal performance. The DP policy is attractive in that itdepends only on the fully observed state variable for its implementation. Both these policiesare simple to derive, and continue to perform well when the average message length changesor the system traffic intensity changes. Capacity approximations show that the heuristicpolicies can produce capacities close to the ideal capacity for the reservation protocol whilefixed allocation schemes often fall far short of this ideal.Initial analysis was done on the model with the assumption that complete state informationis available for control. This is an unrealistic assumption, and, in loosening the assumptionit is necessary to find for policies dependant on full state information a suitable technique75to estimate the number of contending stations using ternary feedback from the reservationsubframe. We have proposed such an estimator, using a variation on Rivest's [2] pseudo-Bayesian estimator to provide a recursive estimate of the unknown state. Simulation resultshave shown that the implementation of this estimator does not significantly depreciate theperformance of the dynamic policies where estimation is necessary.Finally, we considered the extension of the model to allow multipacket messages. Twoservice disciplines were considered. The first depended on severe constraints to allow us tomaintain our Markovian formulation, derive optimal policies and generate exact performanceresults by analysis of the Markov process. The second multipacket model, while beingmore applicable to real systems, prevents valuable Markovian analysis with its complex statestructure. Thus we were forced to use insights gained from our results for the simpler modelsto propose a heuristic dynamic allocation scheme for this more general model and test itwith simulation. For both the multipacket message models, numerical and simulation resultsshow that the performance of the dynamic allocation policies is improved significantly oversimple fixed allocation.6.2 Proposals for Further ResearchIn this research we have demonstrated that altering Roberts model to allow dynamicsubframe allocation can significantly improve the performance of the broadcast channel. Theresults are indeed encouraging and it is likely that follow-up research could broaden theadvances made in this work. In this section we propose areas of further work that may provebeneficial.In our analysis, the user population in the network had to be finite to allow us to dofinite-state Markovian analysis. Indeed, due to the multiple dimensions of the problem, theuser population had to be quite small to produce analytical results. Since network populationsin a real world application may be quite large and indeed may fluctuate as users go on-line76and off-line, it would be more realistic to use an infinite population model with a true Poissontraffic assumption to achieve a more general result. While derivation of the true optimal policymay be inachievable with this model, use of approximate methods to propose extensions toand analyze the DP and MF policies for the infinite population model would be beneficial.The optimal policies in this thesis were derived based on the full state information. Astate estimator was then proposed independently to provide an estimate of the system state toimplement the optimal policy. While we have treated these as separate problems, derivationof the true optimal policy for the partially observed Markov process would integrate the twoproblems, basing the allocation on the aposteriori distribution of the state rather than the stateestimate (the mean of the aposteriori). While in many systems for controlling a partiallyobserved Markov process these problems may be shown to be separable, it is unknown if thisis the case for the reservation protocol that we have considered. Techniques for analyzingsuch control problems are given in [26]. An investigation on these lines would be complexbut would possibly produce a theoretically appealing result.The improvement in channel performance using dynamic subframe allocation reveals thatstate information is indeed valuable for channel access control. While this fact has beenrecognized by previous researchers for the stabilization of a simple slotted ALOHA network[2], [11], [12], [18], this work is the first to use this information explicitly to optimize subframeallocation. The dynamic programming techniques that we have used rely on the fact that themodel may be reasonable formulated as a Markovian decision process. In the model that wehave considered this has been the case and it is very likely that such Markovian decisionprocess formulation can be performed for other reservation-type protocols to allow subframeallocation optimization. These are some similar protocols that may benefit from such analysis.• R-ALOHA [3] is a reservation protocol that allows users to send multipacket messagesby reserving future slots on the frame by successfully sending their first data packet by77slotted ALOHA on an unreserved slot. The allocation of slots as reserved or unreservedis done by a simple heuristic in the R-ALOHA protocol and the use of state feedback foroptimal allocation has not been considered in the literature.• Leung [8] proposes multichannel reservation protocols for multipacket messages whichuse successful first packet transmission by slotted ALOHA to gain reserved data slotsfor the rest of the packets. Thus the multiple channels are at each slot allocated to bereserved for data packets or free for contention. This partition of a fixed number ofchannels for access by stations in two distinct modes is not very dissimilar to the singlechannel reservation model that we have considered.Wong and Yum [33] propose a combined random access/reservation protocol for singlepacket messages that allows both packet transmission by slotted ALOHA and by minislotreservation request transmission followed by reserved access. Essentially, time on thechannel is divided between minislots, slotted ALOHA slots and reserved data slots.Optimizing this three way allocation based on state feedback may improve the systemperformance.78Bibliography[1] L. Roberts, "Dynamic allocation of satellite capacity through packet reservation," AFIPSConf. Proc., vol. 42, pp. 711 – 716, 1973.[2] R. Rivest, "Network control by Bayesian broadcast," IEEE Trans. Information Theory,vol. IT-33, pp. 323 – 328, May 1987.[3] S. S. Lam, "Packet broadcast networks - a performance analysis of the R-ALOHAprotocol," IEEE Trans. Comput., vol. COM-29, pp. 596 – 603, July 1980.[4] R. Binder, "A dynamic packet-switching system for satellite broadcast channels," in Proc.of the IEEE International Conf. on Communications, (San Francisco, CA), pp. 41-1 to41-5, June 1975.[5] S. Tasaka and Y. Ishibashi, "A reservation protocol for satellite packet communication —a performance analysis and stability considerations," IEEE Trans. Commun., vol. COM-32, pp. 920 – 927, Aug. 1984.[6] I. Rubin, "Access control disciplines for multiaccess communication channels: Reservationand TDMA schemes," IEEE Trans. Inf. Th., vol. IT-25, pp. 516 – 536, Sept. 1979.[7] F. A. Tobagi and L. Kleinrock, "Packet switching in radio channels: Part III - pollingand (dynamic) split channel reservation multiple access," IEEE Trans. Communications,vol. 24, pp. 832 – 845, Aug. 1976.[8] V. C. M. Leung, "A multichannel reservation protocol for satellite multiple accessnetworks," in Proc. of the IEEE Pacific Rim Conf. on Commun., Computers and Sig.Proc., (Victoria, BC), pp. 437 – 440, May 1991.[9] S. F. W. Ng and J. W. Mark, "Multiaccess model for packet switching with a satellitehaving processing capability : Delay analysis," IEEE Trans. Communications, vol. 26,pp. 283 – 290, Feb. 1978.[10]W. Szpankowski, "Analysis and stability considerations in a reservation multiaccesssystem," IEEE Trans. Commun., vol. COM-31, pp. 684 – 692, May 1983.WA Mosely and P. Humblet, "A class of efficient contention resolution algorithms formultiple access channels," IEEE Trans. Commun., vol. COM-33, pp. 145 – 151, Feb.1985.[12]E. Fainberg, Y. Kogan, and A. Smirnov, "Optimal control by the restransmissionprobability in slotted aloha systems," Performance Evaluation, vol. 5, pp. 85 – 96, 1985.79[131J.-B. Suk and C. G. Cassandras, "Optimal scheduling of two competing queues withblocking," IEEE Trans. Automatic Control, vol. 36, pp. 1086 — 1091, Sept. 1991.[141K. R. Krishnan, "Joining the right queue : a state dependant decision rule," IEEE Trans.Automatic Control, vol. 35, pp. 104 — 108, Jan. 1990.[15]Z. Rosberg and I. S. Gopal, "Optimal hop-by-hop flow control in computer networks,"IEEE Trans. Automatic Control, vol. 31, pp. 813 — 822, Sept. 1986.[16]Z. Rosberg and A. M. Makowski, "Optimal routing to parallel heterogeneous servers -small arrival rates," IEEE Trans. Automatic Control, vol. 35, pp. 789 — 796, July 1990.[17]A. Segall, "Recursive estimation from discrete-time point processes," IEEE Trans.Information Theory, vol. 22, pp. 422 — 431, July 1976.[18]L. Kleinrock and S. S. Lam, "Packet-switching in a multi-access broadcast channel:dynamic control procedures," IEEE Trans. Commun., vol. COM-23, pp. 891 — 904, Sept.1975.[19]S. Thomopolous, "A simple and versatile decentralized control for slotted ALOHA,reservation ALOHA, and local area networks," IEEE Trans. Communications, vol. 36,pp. 662 — 674, March 1990.[201G. A. Cunningham, "Delay versus throughput comparisons for stabilized slotted ALOHA,"IEEE Trans. on Comm., vol. COM-38, pp. 1932 — 1934, Nov. 1990.[2 i],. Kleinrock and S. S. Lam, "Packet-switching in a multi-access broadcast channel:performance evaluation," IEEE Trans. Commun., vol. COM-23, pp. 410 — 423, Apr. 1975.[22]J. T. Lim and S. M. Meerov, "Performance of markovian access protocols in satellitechannels," IEEE Trans. Communications, vol. 38, pp. 273 — 276, March 1990.[23]M. L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming.New York, NY: J. Wiley and Sons, 1993.[24]M. Puterman, "Dynamic programming," Encyclopedia of Physical Science and Technol-ogy, vol. 4, pp. 438 — 463, 1987.[25]C. Derman, Finite State Markovian Decision Processes. New York, NY: Academic Press,1970.[26]D. P. Bertsekas, Dynamic Programming and Stochastic Control. New York, NY: AcademicPress, 1976.[27]R. Howard, Dynammic Programming and Markov Processes. Cambridge, Ma: MIT Press,1960.80[28]R. Gallager and D. Bertzekas, Data Networks. Englewood Cliffs, NJ: Prentice-Hall, Inc.,1987.[29]B. Hajek and T. VanLoon, "Decentralized dynamic control of a multiaccess broadcastchannel," IEEE Trans. Automatic Control, vol. AC-27, pp. 559 — 569, June 1982.[30]L. Merakos and D. Kazakos, "On retransmission control policies in multiple-accesscommunications networks," IEEE Trans. Automatic Control, vol. AC-30, pp. 109 — 117,Feb. 1985.[311T. S. H. Muyahara and T. Hasegawa, "Performance evaluation of an integrated accessschemes in a satellite communications channel," IEEE Journal on Selected Areas inCommunications, vol. SAC-1, pp. 153 — 164, Jan. 1983.[32]S. S. Rappaport, "Demand assigned multiple access schemes using collision type requestchannels - traffic capacity comparisons," IEEE Trans. Commun., vol. COM-27, pp. 1325— 1331, Sept. 1979.[33]E. W. M. Wong and T.-S. Yum, "A controlled multiaccess protocol for packet satellitecommunication," IEEE Trans. Commun., vol. COM-39, pp. 1133 — 1140, July 1991.[34]S. S. Lam, "Satellite packet communication - multiple access protocols and performance,"IEEE Trans. Commun., vol. COM-27, pp. 1456 — 1466, Oct. 1979.[35]F. A. Tobagi, "Multi-access protocols in packet communications systems," IEEE Trans.Commun., vol. COM-28, pp. 468 — 488, April 1980.[36]S. Tasaka, "Stability and performance of the R-ALOHA packet broadcast system," IEEETrans. Comput., vol. C-32, pp. 717 — 726, Aug. 1983.[371S. Tasaka, "Multiple-access protocols for satellite packet communication networks - acomparison," IEEE Proceedings, vol. 72, pp. 1573 — 1582, Nov. 1984.[38]W. Feller, An Introduction to Probability Theory and Its Applications. New York, NY:J. Wiley and Sons, 1970.[39]X. Wang, J. Kaniyil, Y. Onozato, J. Lui, S. Shimamoto, and S. Noguchi, "Performanceanalysis of a combined random-reservation access scheme," IEEE Trans. Commun.,vol. 39, pp. 478 — 481, Apr. 1991.81Appendix A Derivation of Conditions for a Policyto Produce an Ergodic Markov ProcessIn chapter 3 it was demonstrated that the network model under consideration may beformulated as a Markov decision process and thus a Markov process when a specific decisionpolicy is implemented. We wish to show here that for the policies of interest, the Markovprocess is unichain. This condition must be met for the regular optimality equations to holdand for the modified policy iteration algorithm to be guaranteed to converge to an optimalsolution.A unichain Markov process is one that is composed of a single recurrent class and anynumber of transient states. A fairly simple way to show that a process is unichain is to showthat there exists a state that can be reached from all other states within a finite number ofsteps. We show in this appendix that the empty state Sk = (0, 0) is such a state for a set ofpolicies that abide by a couple of conditions.Let us first set the condition that p < 1. (Although this condition is not actually necessaryfor the process to be unichain, the system with p = 1 is really not relevant as obviously areservation protocol would not be appropriate in that circumstance.) Let us say that thereoccurs a finite consecutive series of frames wherein no new packets arrive. Since theprobability of no arrivals in a frame is non-zero, such a series will occur within a finiteamount of time. At the beginning of such a series, let the system state be S = (Ni , N2 ).We need simply now to show that the system will empty out in a finite amount of stepswhen there are no arrivals. Obviously this is the case if at any state,E[Yk] E [wk'] > 0 .^ (96)From eqn. (37) repeated here,E [Yk] = ps Ni (1 — LV ) N1-1^(97)82we can see that E[Yk] > 0 if N1 > 0 and L > 0. Also, sinceE[W k] = min (N2 , F — L) ,^ (98)we may say that E[Wk] > 0 if N2 > 0 and L < F. The sum then is greater than zero forany non-empty state is the following two conditions hold for all non-empty states :1. If N1 =0then L<F.2. If N2 = 0 then L > O.These conditions do not seriously restrict our choice of policies, and indeed, only excludedegenerate policies which would lock the system into undesirable absorbing states. The vastmajority of possible policies and certainly all policies of interest to us fit these criteria andthus produce unichain processes.Although we have looked here at only the single packet message model, it is not difficult toextend the same argument to the multipacket message model for message length distributionsthat have finite mean values.83Appendix B Joint Probability Distribution forObservations in the Reservation Subframewhen n' Requests are TransmittedIn chapter 4, the problem is encountered of determining the probability that h holes andt successes are observed when Ti t request packets are transmitted over LV minislots. Thisproblem bears close resemblance to the classic occupancy problem [38] where Tt i objects arerandomly placed in LV urns. Feller provides the probability of exactly h empty urns in [38] asLV^ )71/LV^( LV — h\ (po Lv,n 1 ) = ( -1 ) h ( h)Y: (-1 q i — h^LV (99)We are left to find joint probability of observing exactly h empty urns and t urns containingexactly 1 object. Applying Bayes rule as follows,P(t,h1LV,71, 1 ) = P(t I h,LV,n') x^,^(100)and using eqn. (99), we are left to find the conditional probability of observing t singlyoccupied urns given that exactly h urns are empty.If we consider that when there are exactly h empty urns, we know with certainty thatthe remaining LV — h urns all have at least one object in them. Discounting the guaranteedsingle object in each remaining urn from the total number of objects, the problem of findingthe conditional probability becomes one of finding the probability of observing t holes when71 1 - (LV — h) objects are distributed over LV — h urns. This probability may be found fromeqn. (99) directly to beP (t I h, LV, nr) =LV-h^ •^\ -(LV-h)ot LV —^/ LV — h — t^ (101)(_t ) )(1 LV i_h)i=t84Now using eqns. (99), (100) and (101), and with some manipulations, we find the jointdistributionHoh+t ( LV) ( LV —P(t,h L17,7-7')h )^tLV-h LV^(_oi+ LV h (LV — h — t)^j n'—(LV -h)(— h )^— t )^LV )^LV —^ h )\ ] (102)85Appendix C Glossary of Symbols,Acronyms and AbbreviationsA : set of all possible reservation allocationsA, : the number of unsent packets at the ith station of the global queue (M2 multipacketmessage model)a : subframe allocation in frame kB : action vector for the M2 multipacket message modelCk : number of minislots containing reservation packet collisions in frame kC : ideal system capacityD : overall expected message delayD1 : reservation request packet delayD2 : data packet delayDP policy : "Data priority" policyEOM flag : "End-of-message" flagF : number of slots per frameFk : "flow" in frame kC : round trip propagation delayHk : number of idle minislots in frame kL : allocation policy mapping the system state, Sk to the allocation LkLDP : DP policyLMF : MF policyLf : fixed reservation subframe allocationLk : reservation subframe allocation in frame k: optimal reservation subframe allocation policy: ideal fixed subframe allocation86LAN : local area networkM : number of user stations in the networkm l : number of stations in reservation mode at the start of frame k+1m2 : number of stations in the global queue at the start of frame k+1MDP : Markov decision processMF policy : "Maximum flow" policy• : number of stations in reservation mode at the start of frame k• : number of stations in the global queue at the start of frame kn i : number of stations in reservation mode at the start of frame kn 2 : number of stations in the global queue at the start of frame krak : estimated number of contending stations in frame kOk : reservation subframe observation vector for frame kPL : state transition probability matrix when allocation policy L is implementedp : probability that a new message arrives at an idle user in a frameps : reservation request packet transmission probability1.5k : a priori distribution of the number of contending stations in frame k: a posteriori distribution of the number of contending stations in frame kR : average message lengthFL : vector indicating the reward for each state when the action policy L is implementedSk : system state at the beginning of frame kS : state spaceTk : number of minislots containing successful reservation packet transmissions in frame kV : number of minislots per slot• : relative value function vectorv(i) : element i of the relative value function vector87VSAT : Very small aperture terminalWk : number of message transmissions that are successfully completed in frame kXk : number of new messages generated in frame kyk number of successful reservation requests sent in frame kZk : number of successful packet transmissions in frame kcx : weighting coefficient for derivation of the MF policyII* : limiting distribution of the Markov process7r5 : probability for state S in the limiting distribution of the Markov process;\k : estimated number of newly arrived packets in frame k: set of all stationary, non-randomized policies: system throughput when allocation policy L is implemented: optimal system throughput88
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Dynamic subframe allocation in a reservation multiple...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Dynamic subframe allocation in a reservation multiple access channel Roorda, Peter D. 1993
pdf
Page Metadata
Item Metadata
Title | Dynamic subframe allocation in a reservation multiple access channel |
Creator |
Roorda, Peter D. |
Date Issued | 1993 |
Description | Multiaccess protocols allow multiple geographically diverse users to communicate with a central station or each other over a single broadcast channel. Reservation protocols are a class of multiaccess protocols appropriate for satellite and radio networks. This thesis investigates dynamic time slot access control schemes for reservation multiaccess protocols similar to those investigated by Roberts [1] and others. Previous work has considered this model only for the case of a static frame subdivision, the majority of the work relying on approximate methods for analysis. In this thesis, schemes are considered where the subframe allocation is controlled on a frame by frame basis based on the current state of the finite user population. The controlled system is modelled as a Markov decision process with the subframe allocation as the decision variable. Optimality equations are provided and using infinite horizon dynamic programming techniques, the control policy that maximizes the average throughput on the channel based on complete state information is found. In addition to the optimal policy, we propose two heuristic dynamic allocation schemes that while being suboptimal, have implementation advantages over the optimal. Performance results indicate that the implementation of dynamic subframe allocation into the model significantly improves the performance of the network over previously considered fixed allocation schemes, both in terms of lowering the average message transmission delay and increasing the channel capacity. Proposed are methods of estimating the unknown state using a variation of Rivest's [2] pseudo-Bayesian scheme to allow approximation of the dynamic control schemes which depend on full state information. Simulation results indicate that the implementation of state estimation does not significantly erode the performance of the dynamic policies. Extending the model to incorporate multipacket messages is discussed and performance results indicate that in that case also, protocol performance may be improved using dynamic allocation. |
Extent | 4665231 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
FileFormat | application/pdf |
Language | eng |
Date Available | 2008-08-20 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
IsShownAt | 10.14288/1.0065065 |
URI | http://hdl.handle.net/2429/1435 |
Degree |
Master of Applied Science - MASc |
Program |
Electrical and Computer Engineering |
Affiliation |
Applied Science, Faculty of Electrical and Computer Engineering, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 1993-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |
Download
- Media
- 831-ubc_1993_fall_roorda_peter.pdf [ 4.45MB ]
- Metadata
- JSON: 831-1.0065065.json
- JSON-LD: 831-1.0065065-ld.json
- RDF/XML (Pretty): 831-1.0065065-rdf.xml
- RDF/JSON: 831-1.0065065-rdf.json
- Turtle: 831-1.0065065-turtle.txt
- N-Triples: 831-1.0065065-rdf-ntriples.txt
- Original Record: 831-1.0065065-source.json
- Full Text
- 831-1.0065065-fulltext.txt
- Citation
- 831-1.0065065.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0065065/manifest