Recursive Bayesian Traffic Prediction for P e r f o r m a n c e I m p r o v e m e n t i n O B S Networks by Rong L i B . E n g . , Nanjing University of Science and Technology, 1999 A THESIS S U B M I T T E D IN PARTIAL F U L F I L L M E N T O F THE REQUIREMENTS FOR T H E DEGREE OF M a s t e r of A p p l i e d Science in T H E F A C U L T Y OF G R A D U A T E STUDIES (Electrical and Computer Engineering) The University of B r i t i s h Columbia A p r i l 2005 © R o n g L i , 2005 11 A b s t r a c t This thesis deals with traffic prediction in Optical Burst Switched (OBS) networks with self-similar traffic, i.e., traffic with long-range dependence (LRD) properties. Aggregated traffic in high speed optical networks exhibits LRD properties. OBS is a recent promising optical network technology to facilitate IP-Over-WDM (Internet Protocol Over Wavelength Division Multiplexing). To improve the quality of service (QoS) in OBS transmission networks, traffic prediction is required for dynamic resource reservation. We present and discuss a model of IP traffic based on MMPP (Markov Modulated Poisson Process), which approximates LRD traffic by mimicking the hierarchical generation of data by Internet users. The MMPP model is capable of effectively capturing the key aspects of the traffic measured on an OBS edge router, hence representing an aggregation of the traffic generated by a number of sources. The main contribution of this thesis is to derive an optimal Bayesian predictor for the burst size at the ingress router of an OBS network for MMPP approximations of LRD traffic. Bayesian prodiction yields the MMSE (Minimum Mean Square Error) estimate of the burst size. As shown in a simulated OBS testbed, such Bayesian predictor can yield substantial improvement in latency reduction and service differentiation of OBS network compared to linear predictors, without substantial increasing in computational complexity. iii C o n t e n t s Abstract ii Contents iii List of Tables vi List of Figures Acknowledgements 1 2 vii ix Introduction 1 1.1 B a c k g r o u n d and M o t i v a t i o n of the Thesis 1 1.2 Contributions 5 1.3 Thesis Overview 8 Literature Review and Related W o r k 2.1 9 O p t i c a l B u r s t Switched Networks 10 2.1.1 Overview 10 2.1.2 O B S traffic shaping 14 2.1.3 O B S burst reservation protocols 16 2.1.4 Contention resolution 18 2.2 Related W o r k on Traffic Models i n O B S Network 20 2.3 Related W o r k on Traffic P r e d i c t i o n 22 Contents 3 Self-Similar Traffic and M M P P Approximation 24 3.1' L R D Properties and Self-Similar Traffic 25 3.2 Influence of self-similar traffic i n O B S 29 3.2.1 Influence on burst size distribution 29 3.2.2 Methodology 29 3.3 M M P P A p p r o x i m a t i o n and Generation 30 3.3.1 Definitions and properties of M M P P 30 3.3.2 M M P P traffic generation model 32 4 Traffic prediction in O B S network 5 iv 36 4.1 Pre-Reservation and Linear Predictor 37 4.2 M M P P Bayesian Predictor 41 Performance Analysis and Simulation Results 48 5.1 M o d e l Setup 48 5.1.1 O p t i c a l network architecture 48 5.1.2 Traffic generator 49 5.1.3 Edge router 50 5.1.4 Core R o u t e r ( O X C ) 54 5.1.5 Buffer at routers 57 5.2 Simulation Experiments and Results 58 5.2.1 Assumptions 58 5.2.2 P r e d i c t i o n performance improvement 59 5.2.3 B H P pre-transmission success probability 64 5.2.4 Latency reduction improvement 65 5.2.5 Effect on B u r s t blocking probability 69 Contents 6 v Conclusion and Future W o r k 71 6.1 Conclusion 71 6.2 Extensions 72 Glossary 74 Bibliography 76 vi L i s t o f T a b l e s 3.1 Parameter setting for superposition of M M P P s 33 4.1 P r o g r a m logic to compute $ and T for simple case, eg. N = 25 46 5.1 I P Header format 50 5.2 B u r s t Header Packet ( B H P ) format 52 5.3 Edge router switching table 53 5.4 Edge router forwarding table 54 5.5 Information Base ( L I B ) at Core Node 57 vii L i s t o f F i g u r e s 1.1 Design issue i n optical networks 2 1.2 Three steps to predict burst size i n an O B S network 7 2.1 O p t i c a l network evolution 10 2.2 Functions at O B S edge node 12 2.3 Functions at the optical cross-connect supporting O B S and M P L S 14 2.4 T h e timer-based burst assembly mechanisms: (a) the periodic mechanism; (b) the non-periodic mechanism 15 3.1 Variance-time plot 28 4.1 Service differentiation at O B S ingress node w i t h traffic prediction 38 4.2 Timer-based burst assembly w i t h prediction 39 5.1 O B S network structure 49 5.2 Traffic generator node 50 5.3 M M P P traffic generator - the a i m of this figure is to illustrate the transactions between this 16-state M M P P 51 5.4 O B S burst structure 52 5.5 Ingress node model (pr_0 and p r _ l are inports; t x O J B H P and t x l _ B H P are outports for B H P ; pt_0 and p t _ l are outports for data burst, each w i t h 4 5.6 wavelengths) 53 O B S core node structure 55 List of Figures viii 5.7 P D F of prediction error (in bits) 59 5.8 C o m p a r i s o n of P under prediction error e = 10%, 20% 61 5.9 SNR 62 £ versus burst assembly time r -1 5.10 S N R a - 1 versus traffic i n core network 63 5.11 C o m p a r i s o n of P versus burst assembly time r 65 5.12 C o m p a r i s o n of O B S network w i t h / w i t h o u t P r e d i c t i o n 66 5.13 C o m p a r i s o n of E T E delay w i t h / w i t h o u t P r e d i c t i o n 68 5.14 Comparison of burst blocking probability w i t h different predictor 69 s a ix A c k n o w l e d g e m e n t s F i r s t and foremost, I would like to dedicate this thesis to my parents, for instilling i n me confidence and a drive for pursuing my master's degree. T h e y have waited so long for this moment to come true; I a m glad that their never-ending support has finally been rewarded. I would like to express my sincerest gratitude to my supervisor, D r . V i k r a m K r i s h namurthy, his encouragement, guidance and support, added considerably to my graduate experience. It is his in-depth theoretical knowledge and constructive suggestion that inspired me and were essential to the completion of this work. I also would like to thank a l l of my friends and fellow students, who have made my past years of study and work at U B C productive, cheerful and enjoyable. T h e i r knowledge, care, encouragement, assistance and humor have meant a lot to me. i C h a p t e r 1 I n t r o d u c t i o n 1.1 B a c k g r o u n d a n d M o t i v a t i o n of the Thesis T h e development of the Internet has been made possible through the huge increase i n b a n d w i d t h of worldwide optical networks. W h i l e capacity of transmission is growing rapidly, service providers now require a new generation of optical networking solutions to remain cost effective. O p t i c a l B u r s t Switching ( O B S ) is a promising h y b r i d approach between coarse grain optical circuit switching ( O C S ) and fine grain optical packet switching ( O P S ) , see Figure 1.1. It allows switching of data channels entirely i n the optical domain by doing resource allocation i n the electronic domain. Therefore, it is considered a viable solution for terabit I P backbone. W i t h the emergence of multitype applications such as data, voice, and videoconferencing, the next-generation network must also be designed to provide a variety of quality-of-service (QoS) functionalities. T h e basic ideas underlying an O B S system are twofold: the burstification of I P packets, and the decoupling of the transmission and switching of a control header and its data payload. In an O B S network, various types of client data are aggregated at the ingress (an edge node) and transmitted as variable-size d a t a bursts that are later disassembled at the egress node. E a c h data burst is preceded by a control header, also called burst header packet(BBP)[l], which is transmitted A time units earlier i n a separate control channel, and undergos optical-electrical-optical ( O E O ) conversions, to set up a switching Chapter 1. Introduction Switching 2 Required *~ granularity Optical packet switching Optical burst switching Optical circuit switching Figure 1.1: Design issue i n optical networks path and reserve b a n d w i d t h at the core network. T h e time interval A that depicts how many time units the B H P preceeds the data burst is called offset time. T h e essence of an O B S system is the decoupling of the B H P and its data payload, which enables d a t a bursts to be transmitted transparently (without O E O ) throughout the core network. A t present, O B S is attracting a lot of attention as a potential method by which future optical networks may use the available optical resources more effectively, i.e., achieving higher utilization. One of the m a i n advantages of an O B S approach lies i n its switching granularity, i.e., a data burst, shown i n Figure 1.1. A n O B S network can switch variablesize d a t a bursts instead of individual I P packets. It is a solution to compensate for the time constraint of directly switching individual I P packets at optical routers due to the mismatch between the transmission capability of W D M fibers and the processing capability of the electronic control plane, thus alleviating the heavy burden of electronic devices for lightpath configuration. T h i s advantage results from the particular procedure of burstification, whereby multiple I P packets are aggregated into a single data burst at the network ingress. A side effect imposed by such a burst-buildup process, however, is an Chapter 1. Introduction 3 artificial delay. T h e t y p i c a l end-to-end ( E T E ) delay of a data burst thus m a i n l y consists of three components: burst assembly delay at edge routers, p a t h setup delay caused by control headers, and the propagation delay i n the core network. In real-time bursts, it is imperative to discuss their E T E delay. T h e time for assembling a burst, which usually consists of hundreds of I P packets and is typically on the order of microsecond to millisecond, is designed to be comparable w i t h the switching path setup time. T h e propagation delay, usually i n the range of millisecond, is significant compared to the burst length. Therefore, one-way signalling protocol is widely investigated i n today's research to avoid the two-way propagation delay. Meanwhile, we notice that the burst assembly delay can be further saved if we could transmit the B H P before the burst is completely assembled. T h e assembly delay at network edges is substantial and has a significant impact on the E T E burst delay. T h i s influence is especially detrimental to the real-time traffic, which has stringent delay constraints. Since the propagation time of a data burst cannot be reduced, reducing burst delay at network ingresses w i l l be greatly beneficial to latency reduction and QoS provisioning. Like an O P S network, an O B S network can dynamically control system resources, assigning wavelengths of optical fiber to individual d a t a bursts only when that user needs to transmit data. A s one-way reservation scheme is adopted to avoid the two-way propagation delay, the d a t a burst is sent out after a certain offset time, and does not wait to receive an acknowledgement from its B H P . Therefore, if the B H P reserved an insufficient b a n d w i d t h , its following data burst w i l l be dropped at the core network, which w i l l result in high burst loss and lead to burst retransmission. Since the B H P is transmitted before the burst assembly finished, the actual burst length is not known ahead of time and the burst length prediction is required. T h e incoming I P traffic characteristics have great influence on the prediction of burst length. Chapter 1. Introduction 4 Internet traffic presents long-range dependence ( L R D ) , non-stationarity and m u l t i fractal scaling i n short timescales. T h e ability to predict traffic patterns w i t h i n an O B S network is one of the fundamental requirements of network design and management facilitate Q o S provisioning. A t r a d i t i o n a l linear predictor works well w i t h to short-range dependent ( S R D ) traffic, such as Poisson traffic, which is approaching uniform distribution after aggregation number over 1000. Therefore, linear predictors lose accuracy when feeding i n L R D traffic, such as Fractional B r o w n i a n M o t i o n (fBm). There exists direct fractional predictor [2] that could achieve a near-optimal prediction for f B m , but it is too complicated to implement. Recent studies have found that M a r k o v i a n models, such as M M P P (Markov M o d u l a t e d Poisson Process) and M A P ( M a r k o v i a n A r r i v a l Process) are solvable mathematical traffic models that can approximate multifractal behaviour in finite time-scales [3]. There is significant motivation to construct an appropriate traffic model that leads to effective prediction algorithms for O B S networks, which yields a more precise prediction of burst length for the dynamic resource reservation. T h e choice of a prediction method is a tradeoff between the prediction interval, prediction error and computational cost. In this thesis, we present a traffic model based Bayesian (optimal) filter/predictor for O B S burst length prediction under M M P P traffic. We also compare the performance of the recursive Bayesian predictor w i t h linear adaptive predictors, such as (least mean square) L M S - b a s e d filter i n the following widely used criteria: (1), how far into the future can be predicted w i t h confidence; (2), how much network resource has to be reserved to absorb the prediction uncertainty. T h e objectives of this thesis can be summarized as follows: Chapter 1. Introduction 5 • Study the Internet traffic characteristics at O B S edge nodes, i.e., the input traffic at O B S ingress routers. Meanwhile, w i t h the influence of long-range dependence of Internet traffic properties, we analyze the assembled traffic statistics as the output traffic at O B S ingress router. • E x a m i n e the use of superimposed M M P P s (Markov M o d u l a t e d Poisson Process) as an approximation to L R D Internet traffic, which mimics the hierarchical generation of data by Internet users. It is shown i n [4] that the superposition of M M P P s has sufficient long-memory characteristics and is therefore more appropriate for modeling the aggregated sources i n optical edge networks. • Summarize and compare least mean square ( L M S ) based linear predictor and m i n i m u m mean square error ( M M S E ) based predictor i n traffic prediction. A c c o r d i n g to our proposed traffic model M M P P , we derive a new prediction algorithm to optimize the tradeoff between efficiency and cost. • W e further need to implement our traffic models and prediction algorithms and investigate the performance by means of a simulation program i n the O P N E T environment. 1.2 Contributions In order to make effective advanced b a n d w i d t h reservations for data bursts, a key requirement is that the B H P needs to have an accurate estimate of the size of the d a t a burst. For simplicity of expression, we use a discrete-time stochastic state space model below to illustrate our methodology. It is important to keep i n m i n d that the O B S burst length prediction problem considered i n this thesis is a continuous-time stochastic state Chapter 1. Introduction 6 space model, however the definitions below apply (with minor modification). G i v e n the partially observed stochastic dynamical system [ x = f (x ) +w k+1 6 k (1.1) k Vk+i = h {x ) + v , e k where k — 0 , 1 , 2 , . . . denotes discrete time, f , (1.2) k h v are inde- pendent white processes w i t h known densities, our aim is to estimate the state x i given k the observation history of Y k = (y , y , x k y )- 2 are known functions, w , k k+ k+ T h e Bayesian state estimation is such a k model-based o p t i m a l filtering, which assume the traffic model fe(.) estimate the state x \ k is known, and then by x i | f c = E{x \Y } f c + k+1 (1.3) k using b o t h Eqs. (1.1) and (1.2). T h e Bayesian estimate x k x = E{x \Y } k k k (1.4) is the minimum-variance error estimate by m i n i m i z i n g E{(x -g(Y )f}, k where g(Y ) k is the estimate x . k k (1.5) O n the other hand, adaptive filtering uses the static observation E q . (1.2), by assuming x k is vary slowly w i t h unknown dynamics 9. Since the computation of the o p t i m a l filter involves m a t r i x manipulation, as shown i n Section 4.2, the computational cost is on the order of N , while the adaptive filter only requires vector 2 multiplication at the cost of N, where ./V is the dimensions of the state space. In this thesis, we consider recursive Bayesian (optimal) filtering/prediction of M M P P s , which are partially observed continuous-time dynamical systems. Similar results to that described Chapter 1. Introduction Traffic Model Approximation M L E Parameter Estimation 7 Bayestian State Estimation Figure 1.2: Three steps to predict burst size i n an O B S network above hold for the o p t i m a l M M P P Bayesian prediction/estimation. A s shown i n Figure 1.2, the m a i n idea of our work contains 3 steps: 1), construct an appropriate traffic model for incoming I P traffic i n the O B S edge networks; 2), estimate the model parameters by using a m a x i m u m likelihood estimate ( M L E ) , for example using the expectation-maximization ( E M ) algorithm; 3), use Bayesian signal processing (predictor) to get the best state estimate. In this thesis, we focus on the first step, i.e., traffic model approximation and the t h i r d step, i.e., state estimation. W e skip the second step by assuming that the model parameter is known by appropriate parameter estimation. T h e m a i n contributions of this thesis are as follows: • It is well known that the aggregated Internet traffic at the O B S ingress is selfsimilar. Based on the investigation of burst assembly mechanisms used at the O B S ingress router, we present to use a superposition of four independent 2-state M M P P s for approximating such self-similar traffic w i t h L R D properties over sufficient time scales. T h e superposition results i n a new 16-state M M P P model. Such M M P P approximations to L R D traffic are widely used i n the literature [4] [5]. • Using the above M M P P approximation, we derive an o p t i m a l recursive Bayesian traffic prediction algorithm, which dynamically predicts the burst duration of the O B S system. In the O B S network w i t h advanced reservation ( A R ) applied, more accurate reservation gives a greater chance of success, and hence reduces burst loss Chapter 1. Introduction probability. 8 There are some studies i n the literature focusing on linear predic- tion ( L P ) methods for traffic prediction, such as [6] [7]. T h e choice of a prediction method is a tradeoff between the prediction interval, prediction error and computational cost. N u m e r i c a l results of our experimental comparison show that our M M P P Bayesian predictor can substantially improve accuracy compared to adaptive L P without increasing computational complexity too much, i.e., from 0 ( 1 6 ) to 0(16 ). 2 • T h e m a i n advantage is that by using recursive Bayesian predictor for the M M P P traffic, accurate estimates of the burst size can be obtained. T h a t is, the performance of O B S network on-line traffic prediction can be optimized i n terms of end-to-end ( E T E ) delay and burst loss probability, which are the two fundamental issues of providing QoS differentiation when applying O B S to the next generation. 1.3 Thesis Overview T h e rest of this thesis is organized as follows: Chapter 2 presents related work, including an overview of new techniques i n today's optical burst switching networks. T h e n we introduce the M M P P traffic modelling for self-similar traffic and generating method i n Chapter 3. In Chapter 4, we give a detailed theoretic analysis of our proposed Bayesian M M P P predictor, w i t h a comparison to L M S based linear predictor. In Chapter 5 we describe the O B S simulation testbed, present the simulation results for the proposed M M P P Bayesian prediction schemes and compare the analytical results. W e further show that the simulation results also validate that our proposed M M P P predictor performs better than linear prediction. Finally, Chapter 6 concludes the thesis w i t h a summary of the presented work and suggests future work. 9 C h a p t e r L i t e r a t u r e 2 R e v i e w a n d R e l a t e d W o r k In this chapter, we first give an introduction to optical burst switched ( O B S ) networks. A comparison between this new switching paradigm w i t h other existing optical switching paradigms has been made, and it has been shown that O B S is not only cost-effective but also a viable solution for the next generation optical Internet. W e provide a brief historical review of the early work on burst switching as well as the state of the art, including the prevailing protocol for O B S networks, called J u s t - E n o u g h - T i m e ( J E T ) , and describe its major features and benefits. O p t i c a l Networks, w i t h their high transmission speed, are the ideal communication infrastructure to meet the ever increasing demand on bandwidth. W i t h recent advances in wavelength division multiplexing ( W D M ) technology, the potential of fiber was fully realized. It is a technology to increase transmission capacity by transmitting d a t a si- multaneously at multiple carrier wavelengths. B e l l Labs scientists have determined that w i t h wavelengths and values typically used i n optical networks today, it is theoretically possible to transmit data at 100 terabits per second without any excessive signal noise or interference [8]. Current commercial optical systems allow for the multiplexing of more than 160 wavelengths at l O G b p s each to get a total throughput of 1.6 terabits per second per fiber. Chapter 2. Literature Review and Related Work (0 ^ c o u c Optical Packet Switched ^ Optical Burst Switched Mesh ^ AONs ^ Optical Burst Switched Ring J <» z Wavelength-Routed Mesh Q. O 10 Q c Wavelength-Routed Ring Point-to-Point W D M Links ^ ^ O E O required time Figure 2.1: O p t i c a l network evolution 2.1 2.1.1 Optical Burst Switched Networks Overview Current W D M networks operate over point-to-point links, where optical-to-electrical-tooptical ( O E O ) conversion is required at each step. However, a l l future research is focused on all-optical networks ( A O N s ) where user data travels entirely i n the optical domain. T h e elimination of O E O conversion i n A O N s allows for unprecedented transmission rates. A O N s can further be categorized as wavelength-routed optical networks ( W R O N s ) , optical burst switched networks ( O B S N s ) , or optical packet switched networks ( O P S N s ) . Switching techniques primarily differ based on whether data w i l l use switch cutthrough or store and forward. T h e A O N evolution (Figure 2.1) begins w i t h the W R O N s , which is also referred to as an optical circuit switched ( O C S ) network. In circuit switching, it is necessary to establish a dedicated p a t h between the two stations. A call is first set up, the d a t a is transferred and the call is disconnected. Resource reservation is done for the duration of the call. In optical circuit switched architecture, an end-to-end all-optical Chapter 2. Literature Review and Related Work 11 circuit, called lightpath, is established before the transmission of data signals, similar to a telephone line i n public switched telephone networks. C i r c u i t switching is advantageous when we have a constant data rate (fixed delays) on the network like voice traffic; however, it is not suitable under bursty traffic conditions, or when circuits are idle. T h e m a i n constraint of W R O N s is the limited number of wavelength per fiber. A l l - o p t i c a l circuits tend to be inefficient for traffic that has not been groomed or statistically multiplexed, and are wasteful when the sustained traffic volume does not consume a full wavelength. In packet switching, the d a t a is broken into small packets and transmitted. resources can be shared by the different sources. The O p t i c a l packet switching ( O P S ) is a technology to transmit user data by means of optical packets, i n which a wavelength is only allocated to a packet when it is transmitted, and it can be reused by others after the transmission. Packet switching works well w i t h variable rate traffic like data traffic, and can achieve higher utilization. Therefore, O P S is more efficient and it can be used to support I P traffic and A T M traffic. However, O P S requires practical, cost-effective, and scalable implementations of optical buffering and optical header processing, which are server years away. X u [9] reviews the O P S technology. C i r c u i t switching uses two-way reservation schemes that have a large round-trip time requirement. Packet switching, on the other hand, has a large buffer requirement and complicated control and strict synchronization issues. O p t i c a l burst switching ( O B S ) is a technology positioned between wavelength routing (i.e., circuit switching) and optical packet switching. O B S has the advantages of b o t h circuit switching and packet switching, and is considered a promising protocol for all-optical networking. T h e benefit of O B S over circuit switching is that there is no need to dedicate a wavelength for each end-toend connection. In addition, optical burst switching is more viable than optical packet switching as the burst data does not need to be buffered or processed at the cross connect. Chapter 2. Literature Review and Related Work 12 !P/ATM/GbE(data) Upper layer —I—t data SONET (voice) Ingress Edge Node (a) burst assembly Control packet From OBS Core IP/ATM/GbE(data) i 1 l~l ._C7L_. To local destination SONET (voice) Burst Egress Edge Node (b) burst disassembly Figure 2.2: Functions at O B S edge node T h i s allows the strengths of optical switching technologies to be leveraged effectively and the problem of buffering i n the optical domain to be circumvented. O p t i c a l burst switching is based on the separation of the control plane and the d a t a plane. Thus, costly O E O conversions are required only on a few control channels instead of a large number of d a t a channels. In O B S networks, d a t a packets are aggregated into much larger bursts before transmission through the network. T h i s allows amortization of the switching overhead across multiple packets. There are two kinds of nodes i n the O B S network: edge node and core node. T h e O B S edge nodes consist of an electronic router and a burst assembler while the O B S core nodes require an optical switching m a t r i x , a switch control unit, and routing and signaling processors. T h e m a i n function of the O B S edge nodes is to collect traffic from various upperlayer users such as A T M switches, I P routers, etc. T h i s collected d a t a is sorted based on a destination address and is assembled into large variable-size units, called bursts, which are typically several hundreds of kilobytes i n size. In addition, the O B S edge nodes create a control packet for each data burst, which contains burst length, offset time, and Chapter 2. Literature Review and Related Work 13 all routing information, and is sent toward the destination at an offset time before the burst itself (Figure 2.2 (a)). These control packets are electronically processed at each intermediate node along the path, from the source to the destination. T h e i r goal is to set up the O B S core switches along the way, so that their corresponding bursts can travel transparently i n the optical domain through an bufferless or buffer limited network. T h e data burst w i l l later be disassembled at the O B S egress edge node (Figure 2.2 (b)). D u r i n g the assembly/disassembly, the client data is buffered at the edge, where electronic R A M is cheap, and abundant. There literature indicates that there are two ways to transmit these signaling packets. One is to set up a completely separate electronic control network, such as I P or A T M network, whereas mainly adopted way is to designate a specific wavelength i n an all-optical network, which can be considered as an out-of-band control channel. In either case, the separation of control and data, both i n time and physical space, is one of the m a i n advantages of O B S . It facilitates efficient electronic control while allowing for great flexibility i n the user data format and rate [10] because the bursts are transmitted entirely over an optical signal and remain transparent throughout the network. In general, a l l O B S designs include an offset time between the transmission of a control packet and its corresponding burst. T h i s offset allows the control packet to reserve the needed resources along the p a t h toward the destination before the burst arrival. Furthermore, the O B S core nodes need this offset time to pre-set their switching fabrics so that the user d a t a can "cut-through" without any buffers (Figure 2.3). There are variations i n the O B S literature, explained i n the following sections, on how exactly to determine the pre-transmission offset time and when to reserve the needed resources at the core nodes. Despite their differences, however, a l l of the proposed O B S networks have a dynamic operation, which has the potential for high resource utilization and adaptability. Reference [11] describes a burst switch architecture design and operation, as well as Chapter 2. Literature Review and Related Work 14 Optical cross-connects packel n burst / x • < v < 'nffcral' Reservation request on selected wavelength Select output interface Figure 2.3: Functions at the optical cross-connect supporting O B S and M P L S the concept of control channel, data burst, and head cell, etc. References [12] [1] propose a network architecture such that the O B S network is the backbone of the Internet (IP network), and describe the architecture of an optical switch (router) w i t h the functions that include burst control packet detecting, processing, a n d rewriting; forwarding table lookup; management of shared buffer (fiber delay line, or F D L ) ; wavelength conversion; and data channel scheduling. Reference [13] studied access schemes i n O B S for a metro ring architecture. A s O B S becomes more mature, reference [14] discussed technical issues 1 and general requirements for a transport layer architecture (i.e., services a n d protocols) for O B S networks. 2.1.2 O B S traffic shaping One of the m a i n functions of an O B S network is to collect upper-layer traffic, classify it a n d aggregate it into variable-size bursts. T h e classification a n d the proper assembly algorithm of small I P packets to larger optical bursts at edge nodes are essential for the performance of burst reservation, transmission a n d electronic control i n core nodes, because it allows the network designers to control the characteristics a n d therefore shape the burst arrival traffic [15]. Chapter 2. Literature Review and Related Work Bl < , 8 2 N« , H« 8 3 , 3 H< 15 4 * (a) Bl B2 B3 W M t (b) Figure 2.4: T h e timer-based burst assembly mechanisms: (a) the periodic mechanism; (b) the non-periodic mechanism. A t the O B S edge node, incoming I P packets are classified based on the egress node and QoS class and stored i n assembly queues accordingly. For burst assembly, two assembly algorithms are used as basic building blocks [16]: threshold-based and time-based schemes. In the first scheme, a burst is sent out when enough I P packets have been collected i n the assembly queue such that the length of the resulting burst exceeds a threshold of L ax m bytes. In the second scheme, a time-out interval T max is set upon the arrival of the first I P packet to an empty queue. A burst consisting of a l l packets i n the assembly queue is sent out as soon as a time-out occurs. T h e timer-based mechanism also includes the periodic and the non-periodic alternatives, whose intrinsic features are illustrated i n Figure 2.4. In the periodic mechanism, the burst is assembled back-to-back. T h a t is, the new burstification process begins as soon as the previous data burst is assembled and lasts u n t i l the pre-determined interval of T max elapses. A new data burst, however, is not actually generated if no packet arrives during the whole burstification interval. In the non-periodic assembly mechanism, the next burstification process starts only when a new I P packet arrives. A n o t h e r variation of the time-based scheme ensures a m i n i m u m burst length L min by introducing padding bytes if necessary. T h e characteristics of the output burst traffic can differ between the above algorithms. In the case of the pure threshold-based algorithm, burst traffic has the same self-similarity Chapter 2. Literature Review and Related Work 16 as the input packet traffic. In the case of the time-based algorithm, burst traffic may have a smaller degree of self-similarity than packet traffic and therefore reduce the contention in the core network and make burst traffic smoother at short time scales. [17] studies the traffic characteristics while combining b o t h criteria together, and concludes that the burst departures are not exponentially distributed but are correlated w i t h the burst size. 2.1.3 O B S burst reservation protocols There are three m a i n components to set up a connection i n O B S framework. T h e first one is the signaling procedure, which consists of creating the O B S control packets ( B H P ) and sending it an offset time ahead of its corresponding data burst. T h e other two are routing and wavelength allocation. T h e various O B S signaling protocols that have appeared i n the literature can be broadly classified by three dimensions: (1) the manner i n which control is exercised i n the network (i.e., distributed or centralized), (2) the reservation scheme used to hold wavelength resources for bursts, and (3) the method i n which the offset value is calculated (and the purpose it serves). B y contrast w i t h centralized signalling w i t h end-to-end reservation i n packet switching network, most of the proposed O B S architectures use a distributed one-way signaling procedure to set up a burst transmission path through the network. T h a t is, the data burst is transmitted after a delay (known as the offset) of its control header ( B H P ) , without waiting for a positive acknowledgment ( A C K ) that the entire p a t h has been successfully established. Since O B S w i l l most likely be implemented i n long-haul networks, the m a i n benefit!of one-way reservation is that it halves the propagation delay, and therefore it w i l l significantly decrease the time needed to establish a connection. Chapter 2. Literature Review and Related Work 17 T h e manner i n which output wavelengths are reserved for bursts is one of the principal differentiating factors among O B S variants. W e distinguish between two types of reservations schemes: immediate and delayed. Immediate reservation, exemplified by the Just-In-Time ( J I T ) family of O B S protocols [18], works as follows: an output wavelength is reserved for a burst immediately after the arrival of the corresponding setup message; if a wavelength cannot be reserved at that time, then the setup message is rejected and the corresponding burst is dropped. T h e H o r i z o n [11] and J u s t - E n o u g h - T i m e ( J E T ) [19] protocols employ a delayed reservation scheme, which operates as follows: an output wavelength is reserved for a burst just before the arrival of the first bit of the burst; if, u p o n the arrival of the setup message, it is determined that no wavelength can be reserved at the appropriate time, then the setup message is rejected and the corresponding burst is dropped. M o s t O B S time offsetting techniques are based on the assumption that the control packet is sent after the entire burst is assembled. A variation on these techniques is to send the control packet before collecting the entire burst from the upper layers, such as F R R (Forward Resource Reservation) [7]. T h e F R R scheme involves 3 steps: predict the d a t a burst length before or during the burst assembly; pre-transmit the B H P for resource reservation before the data burst assembly is completed; when the burst assembly is completed, check whether the pre-reserved resource is sufficient to support the actual burst. T h e m a i n advantage of this variation is the reduction i n the burst pre-transmission delay. However, since the exact length of the burst is not included i n the corresponding control packet, imperfect prediction regarding burst length may lead to overhead due to waisted b a n d w i d t h . In the literature, the immediate reservation is also referred to as "explicit setup" and the delayed reservation is called "estimated setup." Similarly, an "explicit release" is used when the source sends an explicit trailing control packet to signify the end of a Chapter 2. Literature Review and Related Work 18 burst transmission and to release the resources at a l l the intermediate nodes. A n o t h e r possibility for wavelength release, termed as "estimated release," requires that the i n i t i a l control packet contain the oncoming burst length and thus the O B S core node would know exactly the end of the burst transmission and would automatically release the occupied resources. Based on this classification, the four possibilities i n an O B S network are: explicit setup/explicit release, explicit setup/estimated release, estimated setup/explicit release and estimated setup/estimated release. Overall, there no clear winner between these schemes, and each has advantages and disadvantages. For example, the estimated schemes occupy resources for shorter times, and therefore they can achieve a higher network throughput. T h e difficulty w i t h these schemes, however, lies i n the fact that they are inherently quite complicated and their performance greatly depends on whether the estimates are correct. B y contrast, the explicit setup/explicit release scheme is much easier to implement but it occupies resources for longer periods t h a n the actual burst transmission, and therefore it may result i n higher burst loss probability. 2.1.4 Contention resolution In O B S , when one-way reservation schemes (e.g., J I T , J E T , and Horizon) are adopted, it is possible that two bursts compete for the same output port; thus contention may arise. Such a contention situation can be resolved through one or a combination of the following three domains: • Wavelength domain: B y means of a wavelength converter, a burst can be sent on a different wavelength channel to the designated output fiber. Thus, a l l the wavelength channels of an output fiber can be considered a single shared bundle of Chapter 2. Literature Review and Related Work 19 channels. • T i m e domain: In a fiber delay line ( F D L ) buffer, a burst can be delayed until the contention situation is resolved and the wavelength becomes available. In contrast to buffers i n the electronic domain, F D L s provide only a fixed delay, and data bursts leave an F D L i n the same order i n which they entered. T h a t is, they do not have random access functionality. • Space domain: In deflection routing, a burst is sent to a different output fiber of the node and consequently on a different route towards its destination node. Thus, deflection uses the entire network as a shared resource for contention resolution. A l m o s t all work on O B S assumes contention resolution by full wavelength conversion. T h a t is, a dedicated wavelength converter is provided for each input or output wavelength. For a low to medium load, this provides a low burst loss probability because all wavelength channels of an output fiber can be shared among all bursts directed towards this output fiber. For a high load, the number of wavelength channels has to be very large to reach burst loss probabilities of 1 0 - 6 or less, e.g., 350 wavelength channels are needed to carry a load of 0.8 E r l a n g per wavelength channel at this loss rate. Wavelength conversion has also been complemented by providing a number of F D L s in an F D L buffer. It can be seen that the application of an F D L buffer has a significantly reduced burst loss probability compared to the bufferless case. W h e n using F D L buffers i n O B S nodes, the physical length of the F D L has to be considered. E v e n if the F D L s are dispersion compensated if needed, the m a x i m u m length of an F D L is limited by the power budget. Thus, combining performance and technology arguments, mean burst lengths in the order of M b y t e s cannot be realistically stored i n F D L buffers. Chapter 2. Literature Review and Related Work 20 Deflection routing has also been analyzed i n the context of O B S for irregular mesh networks [20] [21]. In general, the p a t h a deflected burst takes through the network should be as short as possible to minimize resource consumption. In O B S schemes which apply offset .times, the problem of insufficient offset times has to be avoided. T h a t is, it has to be ensured that there always be a large enough offset between control packet and d a t a burst even if extra nodes are traversed. Thus, [20] proposes to use F D L buffers to increase offset times i n intermediate nodes prior to deflection. B y an intelligent combination of different contention resolution strategies, the cost and performance of O B S nodes can be optimized. B u r s t segmentation or composite burst switching is an approach for contention resolution that is based solely on burst scheduling: It tries to minimize the data volume discarded by not dropping an entire burst but dropping only that part of a burst that actually conflicts w i t h another burst [22] [23], 2.2 R e l a t e d W o r k on Traffic M o d e l s i n OBS Network Traffic models play very important roles in the planning, design and analysis of communication networks. Different models have different results, and the discrepancy between the results using different models can be huge. A p p l i c a t i o n s of traffic models include resource allocation, call admission control and QoS provisioning. In the design of O B S networks, a major a i m is to solve the problem of the mismatch between extremely high optical transmission rates and the relatively slower electronic processing. B u r s t assembly is required at the O B S edge nodes, where short I P packets Chapter 2. Literature Review and Related Work 21 are demultiplexed according to their destination and gathered into larger size packets. These super packets are also called bursts. T h e outcoming burst traffic from the edge node w i l l depend highly on the incoming I P traffic and the burst assembly algorithm. Internet I P traffic has been demonstrated to affect network performance by two kinds of bursty property: short-range dependent ( S R D ) burstiness and long-range dependent ( L R D ) burstiness. Memoryless Poisson models were assumed i n the O B S network i n the past few years; however, recently, there has been a significant change in the understanding of network traffic. Poisson models for network traffic become essentially uniform when aggregated by a factor of 1,000; while actual network traffic shows no such decrease in variability over the same range of aggregation. Numerous studies have found that data traffic i n high-speed networks exhibits self-similarity that cannot be captured by classical models. T r a d i t i o n a l M a r k o v models and Regression models can only capture short range dependencies i n traffic. In a classical M a r k o v chain, the next state of the system depends only on the current state. Regression models define explicitly the next random variable i n the sequence by previous ones w i t h i n a specified time window and a m o v i n g average of white noise. T h e degree of self-similarity, defined by the Hurst parameter, typically depends on the utilization level of the network and can be used to measure the "burstiness " of the traffic. T h e Hurst parameter is basically a measure of the speed of decay of the t a i l of the autocorrelation function; as H increases the degree of self similarity increases. If 0.5 < H < 1, then the process is long-range dependence ( L R D ) , and if 0 < H < 0.5, then it is short-range dependence ( S R D ) . Hence, H is widely used to capture the intensity of the long-range dependence of a traffic process. T h e closer H is to 1 the more long-range dependent the traffic is, and vice versa. In this section we review some long-memory models, which are widely used i n theory Chapter 2. Literature Review and Related Work 22 and practice to model self-similar traffic. T h e observation of L R D property i n the Internet traffic has initiated studies of new models such as chaotic maps [24], fractional B r o w n i a n motion (fBm) models [25], fractional Gaussian noise (fGn), and fractional autoregressive integrated moving average ( F A R I M A ) [26]. F B m is a non-stationary stochastic process developed as a generalization of the standard B r o w n i a n motion model. F G n is a stationary process. F G n is related to f B m , since f G n is produced by t a k i n g the differences i n f B m realizations. T h e y can describe self-similar behavior i n a relatively simple manner. Wavelet analysis was shown to be one of the most powerful methods for the description of stochastic properties of traffic, which takes explicitly into account a multi-fractal model as i n [27]. T h e m a i n advantage of these models, especially when multifractality is taken into account, is their rich scale-invariance property. These models are computationally efficient, but they suffer for the lack of a simple mapping between the traffic parameters and the model coefficients. T h e queuing theoretical techniques developed i n the past are hardly applicable to these models. O n the other hand, a number of models based on traditional traffic models have been proposed. One approach is to emulate self-similarity over a certain range of time scales w i t h finite state M a r k o v i a n models. [28] [4] have proposed a model based on the M a r k o v M o d u l a t e d Poisson Process ( M M P P ) as a superposition of two-state M o r k o v processes. M M P P is considered as the best M a r k o v process to emulate L R D and scale invariance [3]. In this work we w i l l also apply such an approach. 2.3 R e l a t e d W o r k on Traffic P r e d i c t i o n In an O B S network, the burst reservation message provides a report of the incoming burst length i n a tell-and-go fashion along the path from source (ingress edge node) to Chapter 2. Literature Review and Related Work 23 destination (egress edge node). Note that there is a packetization delay, due to the burst assembly time, which is inherent to burst switching. W h e n to send the reservation message to the core network belongs i n the scope of QoS provisioning. B u r s t length prediction is required if the reservation signaling message is sent before burst assembly finishes. Linear predictor is the simplest one, which does not need prior knowledge of incoming traffic patterns, and suitable for any traffic model. In the literature, some studies propose to use the least mean square ( L M S ) based linear prediction method to predict the burst length in the O B S network [6] [7]. Another more sophisticated non-linear method is M M S E ( m i n i m u m mean square error) predictor, which aims to minimize the expected value of the squared prediction error, and also works for on-line prediction that does not require the underlying structure of the incoming traffic. O n the other hand, the quality of the prediction depends on the amount of uncertainty, which i n t u r n depends on a number of factors, including the amount of traffic history used to make the prediction, the prediction horizon, and the nature of traffic itself. Predictors based on traffic structure have the significant benefit of computing the weight factor on history samples. Considering the abundant evidence that high-speed network traffic is self-similar and long-range dependent ( L R D ) i n nature and the fact that the history of L R D processes has a significant impact on the present value of the process, it is natural to assume that predicting L R D traffic would be rather rewarding. There are particular predictors [2] based on fractional traffic models, such as f B m and f G n processes. F r o m the engineering perspective, these fractional predictors are too complicated to implement and time-consuming when doing dynamic b a n d w i d t h reservations. 24 C h a p t e r 3 S e l f - S i m i l a r T r a f f i c a n d M M P P A p p r o x i m a t i o n It is well known that the characteristics of aggregated Internet traffic at the O B S ingress nodes is fractal and can be described in terms of self-similar stochastic processes. However, i n the last few years, considerable effort has been devoted to studying the O B S burst traffic after assembly, as the burst size distribution determines the optical buffering requirements. Therefore, the self-similarity of O B S burst traffic has a significant impact on the overall network performance, e.g., burst blocking probability. Self-similarity indicates long-range dependence ( L R D ) i n the correlation structure of network traffic. Generally speaking, although the incoming packet traffic is aggregated i n O B S assembly nodes, the number of assembled burst input flows i n the O B S core node is i n fact not large, and not enough to be aggregated to Poisson burst traffic. Furthermore, the claim that burst assembly could reduce traffic self-similarity [29] was disproved by analysis and the simulation of b o t h synthetic traffic and traffic traces [30] [31]. In this chapter, we show first special interest on how the self-similarity features of incoming traffic affect the burst size distribution. T h e n we propose to use M M P P ap- proximation to represent the aggregated Internet traffic at the O B S edge. The traffic generation of this special M M P P model is also provided. Chapter 3. Self-Similar Traffic and MMPP Approximation 3.1 LRD Properties and Self-Similar 25 Traffic A s the t e r m implies, long-range dependence refers to a correlation structure that decays at a rate much slower than the exponential decrease that occurs i n the correlations of a short-range dependent ( S R D ) process. A n interesting characteristic of the correlation structure of a long-range dependent process is that its autocorrelation obeys the wellknown power-law distribution. T h a t is, for a stationary L R D process w i t h mean p, variance a 2 and autocorrelation function rLRD{k), rLRD(k) ~ k 0 {X },t = 0 , 1 , 2 , . . . t we have[32] , as k —> oo, where (0 < (5 < 1) (3-1) T h u s the autocorrelation function of such a process decays hyperbolically. H y p e r b o l i c a l decay is much slower than exponential decay, and since (3 < 1, the sum of the autocorrelation values of such a series approaches infinity. T h i s description is closely associated to self-similarity and a Hurst parameter defined by the equation: H = 1-/3/2 A s we stated i n section 2.2, the (3-2) Hurst parameter H is the index of self-similarity. T h a t is, for general self-similar processes, it measures the degree of "self-similarity." W h e n the parameter lies i n the interval (0.5,1), the resulting self-similar process exhibits long-range dependence as defined i n Eq.(3.2). In contrast, a short-range dependent process requires that its autocorrelation function of {X } decays exponentially to zero. T h a t is, {X } rsRD(k) decreased according to t t r R (k) S D -> would have a correlation function C\ as k —> oo, where — 1 < C < 1. (3-3) Chapter 3. Self-Similar Traffic and MMPP Approximation 26 C o m m o n traffic models w i t h L R D are based on self-similar processes. Intuitively, a process is self-similar i f its statistical behavior is independent of the time-scale. In other words, the characteristics of a self-similar process look the same at any time scale. T h i s feature means that averaging over equal periods of time does not change the statistical characteristics of the process. Consider the above stationary time series {X },t — 0 , 1 , 2 , . . . , w i t h zero mean and t autocorrelation function r(k), representing the number of packets during time intervals observed on a given link, where r(k) is given as We define the m-aggregated series X[ ^ b y summing the original series X over non- m overlapping blocks of size m, replacing each block b y its sample mean. T h a t is, for each m = 1, 2 , 3 , X [ ^ m is given b y •^(m) _ —y^X( _i) t T h e processes {X^} m + j, t = 1,2,... (3.5) are also wide sense stationary w i t h mean /J, a n d autocorrelation r^ '(A;). There are different classes of self-similarity: m • E x a c t Self-Similar: the process {X } t is said to be exactly self-similar if for a l l m = 1 , 2 , i t satisfies ( \k) = r(k). m r (3.6) (3.6) also implies an equivalent definition: Var{X ) {m) = m- Var(X), 0 for all m, and 0 < B < 1. (3.7) Chapter 3. Self-Similar Traffic and MMPP Approximation 27 In other words, the autocorrelation structure is preserved across different time scales. • A s y m p t o t i c Self-Similar: the process {X } t is said to be asymptotically self-similar if for a l l k large enough, r {k) - » r(jfe), when m -> co. {m) (3.8) which could also be represented by: Vcir{X ) {m) ~ m-PVariX), as m - • oo, where 0 < /? < 1. (3.9) • iStochastic Self-Similar: T h i s is a continuous time definition. T h e process {X } t is statistically self similar w i t h the parameter H , if for any positive stretching factor a, the re-scaled process w i t h time scale original process {X }, t at, a~ X H at is equal i n distribution to the and they satisfy the relation X at ~ a-"X , for a l l (a > 0), (3.10) t where ~ denotes equality i n distribution. T h i s is a very strict form of self-similarity called self-similarity w i t h stationary increments. F B m is an example of such a process. T a k i n g the logarithm of b o t h sides of E q . (3.7) gives the following equation: \og (Var(X™)) l0 = \og Var(X) 10 -/?log 1 0 (ro) (3.11) T h e variance-time plot (Fig.3.1), often used to test traffic for self-similar properties, is based on this equation. Self-similar traffic results i n a straight line w i t h slope G [—1,0]. Chapter 3. Self-Similar Traffic and MMPP Approximation 28 Self-similar: slope - P E CO > o \ U) O 0 1 Poisson: slope -1 2 3 4 5 Log m 10 Figure 3.1: Variance-time plot For successive values of m that are equidistant on a log scale, the logarithm of sample variance of is plotted versus m on a log-log plot. Therefore, i n practice, the most common way to validate if a generated process is L R D is by fitting a least-square line to the points of the plot and then calculating its slope. A n estimate of the Hurst parameter is obtained as H = 1 — T h e relation between self-similarity and L R D is that if a process {X } t w i t h Hurst parameter H , then its increment process H. , Y = X —X t t+S s is self-similar is L R D w i t h parameter Chapter 3. Self-Similar Traffic and MMPP Approximation 3.2 3.2.1 Influence of self-similar traffic i n 29 OBS Influence on burst size distribution T h e distribution of burst length can be influenced by offered traffic and burst assembly parameters. In general, according to C e n t r a l L i m i t Theorem ( C L T ) , by aggregating I P packets into bursts, the burst size distribution w i l l approach Gaussian distribution as the number of packet arrivals goes up or the assembled burst size goes up, as verified by our simulation results. O n the other hand, the influence of self-similarity ( H parameter) is significant, especially as the timeout value grows. T h i s is due to the fact that, as the H parameter grows, the variance of the traffic sample mean (aggregated traffic) decays more slowly than i n the independent case (H=0.5). Thus, the larger the H parameter the larger the burst size variance. A s discussed i n Chapter 2.1.2, the traffic shaping (burst assembly) at the O B S edge node can smooth traffic on short time-scales, i.e., reducing its variability. If the correlation extends only over timescales smaller t h a n a burst assembly period, the correlation i n the packet traffic process is shaped away by burstification; i.e., it w i l l not be observed i n the burst traffic process. If, on the other hand, the correlation extends over timescales much larger than a burst assembly period, burstification w i l l not shape it away, so that a similar correlation structure w i l l exist i n both packet and burst traffic streams. Longer assembly times lead to improved smoothing. 3.2.2 Methodology M o t i v a t e d by the above observation, we now introduce a simple M a r k o v i a n approach to approximate the self-similar traffic over sufficient time scales for burstification at O B S edge nodes. In traffic modelling, a superposition of several M M P P s is practically sufficient Chapter 3. Self-Similar Traffic and MMPP Approximation 30 to represent self-similarity over several time-scales to model the real traffic. We construct an M M P P w i t h apparently self-similar behaviour over several time-scales by superposing several M M P P s . F i r s t , consider two-state M M P P s w i t h different time-scales. T h a t is, the mean sojourn time of each process is i n accordance w i t h the different time-scale. L e t us superimpose t h e m to make a new M M P P . W h e n we see this process on a large time-scale, it looks like an ordinary two-state M M P P . If we look on a smaller time-scale, we find that each state is composed of a smaller M M P P . If we look on a still smaller time-scale, we find that each state of a smaller M M P P is again composed of a still smaller M M P P . T h i s can be repeated only a finite number of times. It is impossible to measure given traffic during an infinite amount of time. Thus, it is i n practice sufficient to use the process that has self-similarity over only several time-scales to model the real traffic. Refer to [4] [5] for details. We use a continuous-time discrete-state M M P P for modeling the self-similar traffic. T h a t is, for our O B S system w i t h self-similar traffic input, we assume first that the traffic carried by each forward equivalence class ( F E C ) is a M M P P and that M M P P s are independent of one another. T h e O B S network supports several F E C s from different edge O B S shapers, which are assumed to have the same mean, variance and Hurst parameter, without loss of generality. 3.3 3.3.1 M M P P Approximation and Generation Definitions and properties of M M P P In this section, we summarize some of the m a i n characteristics of M M P P defined on a probability space. M M P P is a doubly stochastic Poisson process where the intensity of Chapter 3. Self-Similar Traffic and MMPP Approximation a Poisson process is defined by the state of a M a r k o v chain. T h e M a r k o v chain can therefore be said to modulate the Poisson process, hence the name. a continuous-time 31 N o w , let {X } t be S-state M a r k o v chain, without loss of generality, defined on the state space e = { d , e , e s } , where e, 6 R is the unit vector w i t h 1 i n the i-th position. T h i s s 2 M a r k o v chain modulates an integrable Poisson traffic rate process; hence i n this S-state M M P P , the packet arrival rate is determined by the state of the continuous-time M a r k o v chain w i t h infinitesimal generator Q and Poisson arrival rate A,, i € [1,5]. T h a t is, the traffic rate process behaves as a Poisson process w i t h arrival rate \ when the M a r k o v chain is i n state i. B o t h the M a r k o v chain and Poisson process can have simultaneous jumps. Transitions between states are dependent and governed by the above continuoustime M a r k o v chain. Q is also known as the transition rate matrix of the m o d u l a t i n g M a r k o v chain. ' Q #11 #12 ••• 0is 021 022 ••• 025 . ' (3.12) j=l 0si 055 0,S2 which equals £ ^ = 0,Vie{l,2,... S}. ) (3.13) j=i Define the arrival rate matrix A = diag[X\, A , A s ] , whose diagonal elements con2 tains the arrival intensities that corresponds to the different states of the M a r k o v chain. A l s o define P{X t = i} — p\, i € { 1 , 2 , . . . , 5 } . T h e n the transition probability distri- bution p = {ptPt-'-PtY should be [e^' ], which satisfies the forward equation ^ 4 t = Q'p , where ' denotes the transpose operation. Let N denote the above M a r k o v chain {X } t t modulated Poisson process, which repre- t Chapter 3. Self-Similar Traffic and MMPP Approximation sents the number of events that occur during the interval 32 [0,t]. For the time-stationary M M P P , the mean of N is given by [33] t E{N } = vrAei, t where 7r = [ni, 7T , 2 ITS] is the steady state vector of the M a r k o v chain such that nQ 3.3.2 (3.14) = 0, 7re = 1 (3.15) M M P P traffic generation model We use a continuous-time M M P P for modeling the self-similar traffic. W e construct an M M P P w i t h apparently self-similar behaviour over several time-scales by superimposing several M M P P s . F i r s t , consider two-state M M P P s w i t h different time-scales. T h a t is, the mean sojourn time of each process is i n accordance w i t h the different time-scales. L e t us superimpose them to make a new M M P P . W h e n we see this process i n a large time-scale, it looks like an ordinary two-state M M P P . If we look at a smaller time-scale, we find that each state is composed of a smaller M M P P . If we look at a still smaller time-scale, we find that each state of a smaller M M P P is again composed of a still smaller M M P P . T h i s can be repeated only a finite number of times. Therefore, the M M P P is not selfsimilar according to the definitions in the previous section because it looks constant when time-scale is larger than the time constant i n itself. However, it can emulate self-similarity over several time-scales. It is impossible to measure given traffic during an infinite amount of time. It has also been observed that high-speed traffic loses self-similarity over days [34]. Thus, it is practically sufficient to use the process which has self-similarity over only several time-scales to model the real traffic. N o w we assume that the number of the states Chapter 3. Self-Similar Traffic and MMPP Approximation source IPP IPP IPP IPP X 2 3 4 A* 6.579 2.562 0.914 0.448 Pois. 33 0~2i 5.715* 1 0 2.285 * 1 0 2 1.231 * 10~' 4.923 * 1 0 - 1 2.653 * 10~ 1.061 * 1 0 ~ 5.715 * 10"^ 2.285 * 1 0 ~ A = 0 4 1 4 3 e P Table 3.1: Parameter setting for superposition of M M P P s of every underlying M M P P is two. W e construct a model that consists of a superposition of four independent two-state M M P P s and one Poisson process. W e consider O N / O F F model as a special case for each two-state M M P P , which is also known as interrupted Poisson processes ( I P P ) . We also assume that the two m o d u l a t i n g parameters of each I P P are equal. T h e superposition of M M P P s is itself an M M P P , and is sufficient to model self-similar behaviour reasonably accurate over four to five time scales [4]. W e can describe the i t h I P P as follows: —o~u Qi 0~2i where A,- = = -On \ 0 0 0 (1 < i < 4) (3.16) Qi and A* are the transition rate matrix and the arrival rate matrix of the i t h I P P , respectively. Aj is the mean arrival rate of each i t h I P P when i n the O N state. W e give one example of the set of source parameters defining the traffic model described above i n Table 3.1, w i t h overall mean rate, described i n E q . (3.19), A = 3 * 10 bps and H = 0.75. T h e 6 interested reader may find the detailed procedure of fitting to an asymptotic second-order self-similarity of the counting process i n [4]. Note that the four 2-state M M P P traffic sources are independent. A t O B S edge node, the incoming I P traffic itself is self-similar aggregated traffic, and is demultiplexed at the Chapter 3. Self-Similar Traffic and MMPP Approximation 34 edge shaper according to each FEC i n separate queues, based on the destination and QoS requirement. W e assume each FEC traffic is also self-similar and can be represented as the superposition of the above four I P P s . T h e superposition can be described as follows: Q = Qi@Q2®Qz@Qi (3.17) A= A ©A eA ©A4©A 1 2 3 (3.18) p where © denotes the Kronecker's sum and A is the arrival rate of the Poisson process to p be superposed. T h e mean arrival rate A of each FEC traffic process can be calculated as: We denote the M M P P obtained by the superposition of I P P s as the stochastic process {N ,t t > 0}. T h i s M M P P N has an underlying 16-state continuous-time M a r k o v chain t which we denoted as {X ,t t > 0} i n the previous section, w i t h transition rate m a t r i x Q (3.17) and arrival rate m a t r i x A (3.18), representing aggregated FEC traffic sources as an input at the O B S edge node, w i t h L R D properties. T h e M a r k o v chain X t MMPP N t generates the as N = f g'X ds + m , t Jo s t (3.20) where the superscript ' denotes vector transpose, N denotes the I P packet arrivals that t occur during the interval process N and m t t [0,t], and g = [A 1 ,A 2 ,Ai 6 ]' is the vector of intensities of the is a Poisson martingale stochastic process, which satisfies E { m | M } = mt, for r > t, T (3.21) where J\f is the observation history of the M M P P N , and E { . } denotes the expectation t operator. t Chapter 3. Self-Similar Traffic and MMPP Approximation 35 A g a i n , since the 16-state worked well for modeling self-similar behaviour over four to five time scales, we choose 16-state for our M a r k o v chain. One example of the transition rate m a t r i x Q generated from Tabel 3.1 i n our simulation is shown below: -0.58 0.00 0.00 0 0.00 0 0 0 0.23 0 0 0 0 0 0 0 0.00 -0.58 0 0.00 0 0.00 0 0 0 0.23 0 0 0 0 0 0 0.00 0 -0.58 0.00 0 0 0.00 0 0 0 0.23 0 0 0 0 0 0 0.00 0.00 -0.58 0 0 0 0.00 0 0 0 0.23 0 0 0 0 0.01 0 0 0 -0.58 0.00 0.00 0 0 0 0 0 0.23 0 0 0 0 0.01 0 0 0.00 -0.58 0 0.00 0 0 0 0 0 0.23 0 0 0 0 0.01 0 0.00 0 -0.58 0.00 0 0 0 0 0 0 0.23 0 0 0 0 0.01 0 0.00 0.00 -0.58 0 0 0 0 0 0 0 0.23 0.57 0 0 0 0 o• 0 0 -0.24 0.00 0.00 0 0.00 0 0 0 0 0.57 0 0 0 0 0 0 0.00 -0.24 0 0.00 0 0.00 0 0 0 0 0.57 0 0 0 0 0 0.00 0 -0.24 0.00 0 0 0.00 0 0 0 0 0.57 0 0 0 0 0 0.00 0.00 -0.24 0 0 0 0.00 0 0 0 0 0.57 0 0 0 0.01 0 0 0 -0.23 0.00 0.00 0 0 0 0 0 0 0.57 0 0 0 0.01 0 0 0.00 -0.23 0 0.00 0 0 0 0 0 0 0 0 0 0 0 0 0.57 0 0 0.57 0 0 0 0 0.01 0 0 0.01 0.00 0 0 0 -0.23 0 0.00 -0.23 36 C h a p t e r T r a f f i c 4 p r e d i c t i o n i n O B S n e t w o r k In this chapter we look at the problem of O B S traffic prediction i n the presence of selfsimilarity. One of the key issues i n QoS provisioning i n the O B S network is to predict the optical burst length i n the next burst assembly time interval based on the online measurements of traffic characteristics. T h e goal is to forecast future traffic variations as precisely as possible, based on the measured traffic history. Traffic prediction requires accurate traffic models that can capture the statistical characteristics of actual traffic. Inaccurate models may overestimate or underestimate network traffic. W h i l e a certain amount of error is allowable (and expected), the overall predictor is required to be accurate enough so as not to reduce utilization i n the worst cases. We start from investigating the B H P pre-transmission mechanisms i n O B S networks for QoS provisioning, where the L M S - b a s e d linear predictor is assumed. W e then derive an o p t i m a l prediction algorithm i n the M M P P predictor sense and discuss further the implementation aspects of our proposed algorithm. Chapter 4. Traffic prediction in OBS network 4.1 37 Pre-Reservation and Linear Predictor A s we have introduced that, different offset-time based approaches have been proposed to provide service differentiation i n O B S networks. T h e common purpose is t r y i n g to manage QoS on a class-by-class basis using different extra offset times for different classes of bursts. T h e basic idea is that by giving a larger extra offset time to a higher priority class, reservation for higher priority bursts can be made far in advance of lower priority bursts and thus have a better chance to succeed. A l t h o u g h offset-time based differentiation is easy to implement and provides efficient isolation between service classes when a sufficiently large extra offset time is assigned to higher priority bursts, the extra offset time introduces an additional delay at the edge. A variation on these approaches is to send the control packet ( B H P ) prior to collecting the entire burst from the upper layers. T h e m a i n advantage of this variation is the reduction i n the burst pre-transmission delay. However, the control packet is sent before the burst is completely assembled; therefore the estimation of burst length is required i n the O B S edge node i n order to utilize the estimated setup/estimated release scheme. For simplicity, we divide the incoming I P traffic into two classes, i.e., low-priority (delay tolerant) and high-priority (delay sensitive), and apply burst-size prediction for highpriority traffic. O n l y basic offset time, T is considered. T h a t is, the B H P is transmitted r 0 Q time units before the data burst so that the B H P can set up a switching path and reserve bandwidth. B o t h classes of traffic use the estimated setup/estimated release scheme. However, the low-priority traffic uses the Just E n o u g h T i m e ( J E T ) signaling protocol [19], where the occupation of the resources is exactly from the burst arrival until the transmission of its last bit. T h e B H P is sent after the entire burst is completely assembled and no estimation is made. T h e bursts still wait i n the queue and w i l l be transmitted to the O B S core after a certain offset time r . We assume that the offset time is short 0 Chapter 4. Traffic prediction in OBS network Predictor BHP BHP Generator BCuV" 38 ...J. 1—1> Control Channel IP Traffic s. Data Channel Classifier assign label Labeled LRD Traffic Burst Assembly Data Burst Ingress Node Figure 4.1: Service differentiation at O B S ingress node w i t h traffic prediction and constant (equals to burst assembly time i n most cases). T h u s the influence of the offset time i n the data burst arrival process is only a time shift. T h i s scheme is termed as delayed reservation. so-called advanced For high-priority traffic, we use F R R signaling protocol [7], a reservation scheme, where B H P is sent before the burst is completely assembled, and the estimation of burst length is made by a predictor between the traffic classifier and the burstification control unit ( B C U ) . Figure 4.1 shows this scenario under analysis. A key requirement for successful implementation of F R R is that the B H P needs to have an accurate estimate of the size the data burst. W e now briefly introduce how traffic prediction works for b a n d w i d t h reservation i n O B S networks. The O B S ingress node performs the grouping of a number of small I P packets i n variable-size bursts before transmitting them to the optical domain. A s M P L S traffic engineering is deployed, the I P packets given same destination address and QoS requirement are classified into one forward equivalent class ( F E C ) and are assigned w i t h the same label. T h e O B S ingress node maintains a separate queue per F E C . T h e classified traffic of per F E C is assumed self-similar w i t h long-range dependence ( L R D ) in property, which is approximated as an M M P P process i n the previous chapter. In our system, we use a timer-based non-periodic burst assembly mechanism, described i n section 2.1.2. Figure 4.2 depicts how the predictor works while burst assembling. Chapter 4. Traffic prediction in OBS network kdi burst assembly time i r <k) ?l 1 st (k) ?2 packet of kth burst . 39 (k+J)di burst assembly time >* r « > 1^ -(*) ,.(*) "'•+1 t ""W 1 ^ pa cket of burst ( A time unit Prediction of the (k+l)th burst send B H P 5 Send burst Figure 4.2: Timer-based burst assembly w i t h prediction Assume the A;th burst has been transmitted down the optical network, and let r f c + 1 1 ^ denotes the arrival time of the first new I P packet, which w i l l be assigned to the (k + l ) t h d a t a burst. U p o n a timeout expiration, the burst is framed and relayed for transmis- sion. A s a result, the burst assembly time r a is kept w i t h i n the timeout value, which is independent of network load. O n the other hand, prior to burst transmission, a control header ( B H P ) , w i t h burst label, predicted burst size, etc., has to be released to reserve the necessary b a n d w i d t h at each intermediate node i n the O B S core network. A s shown in Figure 4.2, the B H P is sent at the time r ( f c + 1 ) x + r a - A , where 0 < A < r . a The number of packets arriving during the time interval A is not known at this time, so the B H P needs to predict the number of packets arriving from r f + 1 ) + r - A to r[ k+l) a + r a i n order to specify the reservation duration. A is termed as prediction interval i n traffic prediction, and i n our case A is physically equal to offset time r . V a r y i n g A is a way of 0 service differentiation for different QoS requirements. Since we only consider two classes of traffic i n our system, we can simply choose A = r . a T h a t is, we predict the (k + l ) t h burst size at the time its first packet arriving. It is usually difficult to make a prediction algorithm practical, while accurate and effective, especially for the self-similar traffic. A m o n g these widely used estimations derived from a mean square error based method, linear predictive filter ( L P F ) adopted in [7] is Chapter 4. Traffic prediction in OBS network 40 the simplest one. Before deriving the Bayesian predictor we first briefly outline the linear prediction method for predicting burst size. W i t h the knowledge of the burst length of the last N bursts, (Lfc_/v, ...,Lfc), the length of the next incoming burst L^+x could be expressed as a linear combination of the lengths of the previous N bursts by: N L +i = ^2w{i) • Lk-i+i k (4.1) i=\ where w(i),i € {1, ...N} are the coefficients of the linear predictor. These are typically updated by an adaptive filtering type algorithm such as the normalized L M S (Least M e a n Square) algorithm [35]. Another benefit of using L M S is that there is no need to know the underlying structure of traffic. A control header makes an advanced resource reservation according to the predicted value. If the prediction is optimal, the predicted value should be equal to the actual incoming burst length. A n a l y z i n g the performance of prediction techniques and the predictability of network is an important study i n the O B S network. T h e quality of a prediction depends on the amount of uncertainty that accompanies the prediction, and mathematically this is measured by the variance of the prediction error. W e modelled the L R D traffic as M a r k o v M o d u l a t e d Poisson Process ( M M P P ) , which matches correlation lags over four to five orders of magnitude. T h e linear predictor compute only a moving average of the N past burst lengths. A major deficiency is that it does not use knowledge about the M M P P structure to predict the burst length. W e now introduce an o p t i m a l (conditional mean) Bayesian prediction algorithm for predicting the O B S burst size when the traffic is for L R D traffic approximated by an M M P P model. Bayesian estimation is o p t i m a l i n the sense of minimum-variance. Chapter 4. Traffic prediction in OBS network 4.2 41 M M P P Bayesian Predictor In the previous chapter, we introduced a 16-state M M P P model as a superposition of four independent 2-state M M P P s for modelling O B S incoming traffic i n finite time scales. W e now derive a recursive Bayesian predictor based on this M M P P model, which calculate a new estimate for each time-step, based on the previous estimate and new measurements. Recursive Bayesian estimation works by simulating the process, while at the same time adjusting it to account for new measurements, taken from the real process. For simplicity, we assume IP packet sizes are fixed i n our system, i.e., equal, and the burst length is measured i n the number of packets i n the burst. A s we introduced i n section 3.3, i n a state space model, for each F E C , let the classified I P traffic be represented as M M P P N t process, i.e., N t denotes the number of I P packet arrivals that occur during the interval [0,t\. A n d let J\ft denote the M M P P observation history, i.e., the times of arrivals of all the packets up to time t. (Note that the observation history can be defined more rigorously i n measure theoretic notation i n terms of sigma algebras.) G i v e n the observation history M our a i m is to predict the (k + l ) t h burst size Lk+i, i.k+i), i.e., the number of I P packet arrivals i n the time interval [r[ \ r[ ^ + r j . T h e best k+1 prediction of L +\ k k+1 can be derived by the Bayesian estimation algorithm, i.e., computing E { L f c i | A / " ( f c + i ) } , which is the m i n i m u m mean square error ( M M S E ) estimate. In order to + r derive the Bayesian predictor for the burst length L +i, it is first necessary to derive the k o p t i m a l state estimate X t of the underlying state of the 16-state continuous time M a r k o v chain {Xt}, w i t h transition rate matrix Q and arrival rate matrix A = diag[\i, A , A 2 1 at the time t given the observation history J\f . Namely, t X = E{X |M}, t t (4-2) 6 ], Chapter 4. Traffic prediction in OBS network where E{.} denotes the expectation operator. 42 T h i s Bayesian estimation is essentially a continuous-time H i d d e n M a r k o v M o d e l ( H M M ) , which can be represented i n Martingale formulation as: X = X + / Q'X dr + M , Jo t 0 T (4.3) t where' M is a martingale w i t h respect to the filtration generated by X . For a brief introt duction of the concept of martingale theory see [36, A p p e n d i x B ] . T h e n dX = Q'X dt + dM . t (4.4) G i v e n the observation history J\f , the estimation X (4.2) is obtained v i a the so called t t t t M M P P filter as follows: = ^ie* (4-5) M where the un-normalized filtered density q (which is a 16 dimensional vector w i t h nont negative elements) is computed by the following Zakai equation [36]: dq = Q'q dt + (A - I)q dn t t t (4.6) t where n = N — t. t t qt = qo+ [ Q'Qrdr + (A - I)q dn Jo T Note that the above integral involving dn T dN t (4.7) T is Stieltjes integral. Recall dn = dN — dt and t t = 1 if an I P packet arrivals at time t, and is zero otherwise. T h e derivation of the above M M P P filter equation (4.7) appears i n [37] [38]. A s shown i n Figure 4.2, let r- \i k = 1,2,n(k) denote the packet arrival times of Chapter 4. Traffic prediction in OBS network 43 M M P P N . T h e n (4.7) can be written as t Qt = 9 o + / (Q' ~ A + I)q dT + (A - / ) V r •/o r ( f c , 4 (4.8) 9 (*,_, T frr <t * where T-^ — denotes the time just before the I P packet arrival at time r- \ k the following exact implementation: For r- ^ <t< T h i s leads to r- \, k k (Q'-A + /)( -r, q = exp r t (4.9) T h a t is, q evolves deterministically between two successive I P packet arrivals. A t the t packet arrival time t = r^ \, q <*) is updated as k i+1 T q ( ) = Aq k i+i T ik) _ , i+i (4.10) T Therefore, q <k) can be updated just at each packet arrival: q r ik) i+l = A exp ( # - A + /)(T#-7i<*>) (4.11) Note ( r ^ { — r ^ ) is the packet interarrival time of the fcth burst. F r o m this equation, it also follows that the o p t i m a l predicted state estimate of the 16-state M a r k o v chain X t given the observation of M M P P N up to time r , r < t can be computed as: t X t = E{X \N } t t ?t|T E"I9*|T(*)' A exp ( Q ' - A + / ) ( i - r ) (4.12) (4.13) G i v e n the above M M P P filter (4.7) a n d the predictor (4.13), we can now derive the Bayesian predictor for the (k + l ) t h burst. Recall that our a i m is to predict the (k + l)th Chapter 4. Traffic prediction in OBS network burst size at the time its first packet arrival r[ \ 44 since we set the prediction interval A k+1 as the burst assembly time r . W e shall use the following representation for E q . (3.20): Q N t = f g ' X d s + m. Jo s (4.14) t T h e n the M M S E (conditional mean) predicted number of packets a r r i v i n g i n the interval / (fc+i) (k+i) _j_ Lfc+i = = gj E ( / ^ + 1 g'X dt + d m ) t E{ / /•^i = observation history J\f (jt+i) is: tfiQ y e n / r, t \ M T ^ } (4.15) 5'X dt|A/" ( i)} + E { m ( t (fc+1> ( f c + 1 ) (fc+i) l + T fc+ r f c + i) + A - m^+ulA/^+u} (4.16) A +A ' E { X | A ^ ( 5 'g'X^ t + 1 ) f c + 1 )}di ( - ) 4 dt. 1 7 (4.18) where the equality (4.17) follows the property of martingales stated i n equation (3.21). F i n a l l y , substitute X^k+i) w i t h E q . (4.12) and (4.13) i n E q . (4.18), we can get the following Bayesian predictor for the (k + l ) t h burst size: In the O B S system, since this prediction interval A is referred to as burst assembly time, which is usually preconfigured as a system parameter at the network design stage, the integral of m a t r i x exponential i n E q . (4.19) can be precomputed, while the update of q involves a m a t r i x vector m u l t i p l i c a t i o n of complexity 0 ( 1 6 ) . T h i s predictor can be 2 T implemented numerically by suitable discretization. Chapter 4. Traffic prediction in OBS network 45 However, i n the case of a threshold-based mechanism is adopted for burst assembly, the burst is framed when the m a x i m u m burst size is reached. Therefore, the burst length is fixed while burst assembly time is varying. W e can use the same prediction algorithm to predict the burst assembly time, and determine when to begin a reservation. In order to facilitate the solution of E q . (4.19), let (fc+i) n = t-Tl T h e n we have (4.20) If we define (4.21) the m a t r i x exponential approximated by <& series expansion $ = e^i = i + (Q'v) , (Q'V? 2 Q> v + 2! 3! + ... (4.22) can also be written as: $ = / + Q'nV, where (4.23) Chapter 4. Traffic prediction in OBS network 1 M a t r i x 7 <— Identity 2 Matrix $ «- 7 3 4 A; <— 25; Comment: we are using AT = 25 i n (4.25). If k = 1, go to step 5 6 7 Matrix * = 7 + fc = /c - 1 G o to step 4 8 9 M a t r i x r *- ^n M a t r i x $ = 7 + Q'r?* 46 Table 4.1: P r o g r a m logic to compute $ and Y for simple case, eg. N = 25 T h e T integral i n (4.21) can be evaluated term by term to give r = E Q' 4k + 1) k (fc + l ) ! fc=0 oo =E = (Q'nf •n ^n (4.24) We evaluate \f by a series i n the form 7 + f 7 ( + ^(- 7V-1 1 + TV J j j ' (4.25) which has better numerical properties than the direct series of powers. W e then find Y from (4.24) and $ from (4.23). B y comparing to the direct computation of the m a t r i x exponential, we choose N = 25 here as an approximation. T h e n the program logic for computation of <J> and Y is given in Table 4.1 Due to the imperfection of a predictor, an estimated length may t u r n out to be smaller or larger than the actual burst duration. A smaller burst length prediction w i l l result i n insufficient reservation on the path holding time for the data burst. T h i s w i l l cause burst drop at the intermediate node on the path i n the core network. So, we compare Chapter 4. Traffic prediction in OBS network 47 the predicted value and the actual value at the edge node when the burst assembly is finished. W h e n the smaller prediction happens, it requires the B H P to be re-transmitted after the burst assembly finishes, thus degrading the latency reduction performance. 48 C h a p t e r 5 P e r f o r m a n c e S i m u l a t i o n A n a l y s i s a n d R e s u l t s In this chapter we present several simulations to illustrate the Bayesian M M P P predictor and its effect on the latency reduction and burst blocking probability of the O B S network. T h e input traffic is assumed to be self-similar w i t h L R D properties. T h e simulation platform is O P N E T . 5.1 Model Setup T h i s section describes how an O B S network model has been implemented i n O P N E T Modeler 9.1. In general, section 5.1.1 briefly describes our network methodology; in sections 5.1.2, we give a detailed introduction on how to develop our M M P P traffic models in O P N E T ; i n section 5.1.3 and 5.1.4 we focus on the functionality and system parameters set up on the O B S edge node and core node, separately. 5.1.1 Optical network architecture We use the optical network architecture i n Figure 5.1, which is similar to [39]. Chapter 5. Performance Analysis and Simulation Results 49 Figure 5.1: O B S network structure 1. T h e network consists of a core part ( O T N ) a n d an external part (optical edge routers), where control channel and data channel are separated. 2. Core network ( O T N ) consists of a fully connected mesh of four optical cross connects ( O X C s ) , w i t h full wavelength conversion. 3. Network edge routers are capable of aggregation and de-aggregation of I P packets. 4. T h e network supports two classes of traffic. 5.1.2 The Traffic generator generator node must generate packets, process them, and send them on to the point- to-point transmitter. T h i s can be modelled using a traffic source processor to generate packets, another processor to perform any necessary operations, and a point-to-point transmitter to transmit the packets on the point-to-point link. T h i s node structure is shown i n F i g u r e 5.2, where the "src" module generates M M P P traffic and the "proc" Chapter 5. Performance Analysis and Simulation Results 50 5^ m xmt proc src Figure 5.2: Traffic generator node field name dest .address QoS type integer integer size 32 bits 8 bits comments egress address Qos class Table 5.1: I P Header format module specifies other information, such as destination address, Q o S requirements etc. "xmt" is simply the point-to-point transmitter. For simplicity, the I P packet header contains only information needed i n our experiment, as shown i n Table 5.1. Recall that i n our simulation, the traffic model for each F E C is a 16-state M M P P process, which is mathematically computed by superposing four 2-state M M P P s i n section 3.3.2. T h e structure of this M M P P traffic generator is shown i n Figure 5.3. S i m u l a t i o n results validate that our 16-state M M P P model is good at modelling self-similar traffic i n four to five time scales. Therefore the traffic generated from the above source processor has the property of long-range dependence ( L R D ) i n four to five time scales. 5.1.3 Edge router The m a i n role of edge routers i n the O B S network is to provide packet transmissions between the optical domain a n d the electrical domain. There are two types of edge routers, i.e., ingress a n d egress routers. T h e one transmitting packets from the electrical domain to the optical domain ( O T N ) is referred to as ingress router, a n d vice versa. Edge routers also implement O B S M A C layer functions between I P and optical layer. Chapter 5. Performance Analysis and Simulation Results 51 Figure 5.3: M M P P traffic generator - the a i m of this figure is to illustrate the transactions between this 16-state M M P P • K e y functions of O B S M A C layer at the ingress router are: 1. Classify the incoming I P packets based on their destinations and Q o S class into different F E C classes. Later, optical burst w i l l be generated from each F E C queue. T h e optical burst structure is shown i n Figure 5.4. 2. For different classes traffic, service differentiation is provided. For high-priority traffic, a predictor is inserted to predict the burst length before it has been fully assembled. In this case, the offset time r between B H P and the data payload is 0 equal to the burst assembly time r . T h e n B H P contains the above information 0 and routing information (label) is launched, and will-be sent out directly after it has been generated. For low-priority traffic, the B H P is generated after the burst assembly is finished, so the B H P contains the exact knowledge of Chapter data 5. Performance HJ Analysis and Simulation 52 Results IP packet ^"X Payload data H Burst size Offset time Cos data H — • Wavelength ID H B u r s t Label Control Header Packet Figure 5.4: O B S burst structure field name label waveJd type integer integer size 32 bits 8 bits comments M P L S routing information for outport wavelength reservation class integer 4 bits offset_time num_pk integer integer integer 32 bits 8 bits setting F E C class for Qos requirement offset time between B H P and burst burst .size number of packet of a burst 32 bits actual burst size Table 5.2: B u r s t Header Packet ( B H P ) format the burst length. However, the assembled data burst has to be queued i n the ingress for a preconfigured offset time r , which is set to the t o t a l p a t h set up 0 time i n the core network. A B H P format is shown i n Table 5.2. 3. Frame the burst after the offset time has elapsed and send the burst into the optical layer. A burst is set to be at least several kilobytes long i n order to be comparable w i t h the real O E O time needed for its header to set up a switching path, which is presumed to be i n the range of microseconds. Usually a burst consists of several tens to hundreds of I P packets and the assembling time is also i n the range of hundreds of microseconds (ps) to several milliseconds ( m s ) . 4. Support two interfaces for input and output on the O T N side as well as the terminal workstation side. 5. T h e ingress router processor has a switching table, for example, Table 5.3, Chapter 5. Performance Analysis and Simulation Results 53 which is used to forward the optical bursts on a particular output stream on an output port. Dest. address 0 1 Dest.egress 1 2 Dest.port 0 1 Dest.wavelength 0 0 Table 5.3: Edge router switching table T h e node model of an ingress router is depicted i n Figure5.5, where the "aggregate" module performs the above functions, and the control channel is separated from those d a t a channels at the electrical to optical interface. It also shows that the ingress node supports four data channels (wavelengths) at each out port. pr_l txl_BHP Figure 5.5: Ingress node model (pr_0 and p r _ l are inports; t x O J 3 H P and t x l J 3 H P are outports for B H P ; pt_0 and p t _ l are outports for d a t a burst, each w i t h 4 wavelengths) K e y functions of O B S M A C layer at the egress router are: Chapter 5. Performance Analysis and Simulation Results 54 1. Deframe the bursts and extract I P packets from them. 2. L o o k up the final destination for a l l the workstations connected to it. 3. T h e egress router processor also has a global forwarding table, which is used to look up the final destinations directly connected to the egress router. Egress address 1 2 Dest. address 0 1 Dest.port 2 2 Table 5.4: Edge router forwarding table 5.1.4 Core R o u t e r ( O X C ) A simplified architecture of a core node w i t h 4 i n p u t / o u t p u t links is represented i n Figure 5.6. There are 8 inputs and 8 outputs, each represented by a point-to-point receiver and transmitter, respectively. Four of the receivers and transmitters have four channels, witch represent 4 d a t a W D M channels. T h e other receivers and transmitters are single channel and represent the B H P channels on each link. T h u s , each combination of B H P transmitter and receiver, and 4-channel data receiver and transmitter represents the connection to a fibre link to an edge node or other intermediate core node. A l l of the receiver/transmitter components are connected to a central process model S C U (switch control unit) v i a packet streams. T h e S C U is responsible for: 1. reading switching information from a precomputed M P L S - t y p e label information table ( L I B ) 2. swapping input and output labels Chapter 5. Performance Analysis and Simulation Results rxO BHP txO BHP r x l BHP t x l BHP BHP tx2 BHP rx3 BHP tx3 BHP ot2 55 Figure 5.6: O B S core node structure 3. calculating new offset times 4. performing wavelength conversions 5. m a k i n g resource reservations for corresponding data bursts 6. reassembling B H P w i t h new necessary information 7. relaying optical bursts to output ports according to the switching table 8. implementing optical output buffers needed for each outgoing burst In the process model, decisions are made depending on whether a B H P or a data burst is received. W h e n a B H P is received, a l l of its information is extracted to the S C U . Here Chapter 5. Performance Analysis and Simulation Results 56 the control packet is transformed to the electrical domain. In the electrical domain, the following actions take place: • T h e label i n the B H P is used to point to the d a t a burst forwarding information i n the L I B , such as the output interface and Qos information. • Information i n the B H P about burst length and the offset time of the data burst is used i n addition to the forwarding information derived from the L I B . In particular, the latter is used to determine the mapping from the incoming fiber and wavelength to the outgoing fiber and wavelength. To be able to forward successive d a t a bursts of the same connection ( L S P ) on different wavelengths i n a given fiber, we propose that the label specify only incoming-fiber-to-outgoing-fiber mapping, while the information about the wavelength be appended to the outgoing label at every hop. • T h e S C U determines burst outgoing fiber by looking up its precomputed L I B , and makes wavelength reservation on designated output port, which supports full wavelength conversion. T h e n the cross-connect is set up to switch the d a t a burst corresponding to that control packet i n the all-optical domain. • B y recalculating a new offset time, the B H P is reassembled w i t h new necessary information and undergoes label swapping (and wavelength information appending) and is forwarded on the dedicated control channel of the outgoing fiber as indicated by the L I B . T h u s the O E O transformation is finished. If, on the other hand, a d a t a burst is received, a check is made w i t h the S C U to extract the next hop of the burst. T h e data burst is transmitted transparently i n the pure optical domain. Table 5.5 is an example of a label information table for label switching. In this simulation, we use one-way reservation scheme. T h a t is, the sender of a burst Chapter 5. Performance Analysis and Simulation Results Port 0 0 0 0 1 Input Wavelength Label Port Wavelength Label 0 1 2 16 3 20 21 22 23 25 0 2 0 0 27 24 0 2 0 0 0 1 30 21 28 57 Output Table 5.5: Information Base ( L I B ) at Core Node does not wait for a positive acknowledgment ( A C K ) of its reservation request. T h e advantage of one-way reservation is higher efficiency, as there is no overhead caused by the propagation delay. T h i s scheme is appropriate because O B S w i l l most likely be i m plemented i n long-haul networks and therefore the one-way reservation w i l l significantly decrease the time needed to establish a connection. A n example may illustrate this. T h e transmission time of a 1 0 0 K B burst on a l O G b p s link is 80/xs while the propagation delay over a distance of 200km (which is not long i n a backbone network) is typically about 1 ms. 5.1.5 Buffer at routers • Electronic buffers at each ingress router for packets aggregation purpose are represented by a single server F I F O queue per F E C . • O p t i c a l buffers ( F D L s ) are fixed for each label and should be allocated among a l l classes. A group of F D L s w i t h variable length can act as the F I F O queue. Chapter 5. Performance Analysis and Simulation Results 5.2 58 Simulation Experiments and Results In this section, we first compare the L M S - b a s e d linear predictor w i t h our proposed M M P P Bayesian predictor based on the prediction quality. W e then show the performance i m provement i n terms of E T E delay and burst blocking probability when applying our M M P P predictor. 5.2.1 Assumptions • A l l sources of d a t a from upper clients are assumed to be self-similar traffic. A superposition of four 2-state M M P P s are used i n this simulation to represent the aggregated self-similar traffic of each F E C at the O B S ingress nodes. • I P Packet size is fixed i n each scenario. • Packets w i t h the same destination address a n d Q o S requirement belong to one Forward Equivalent Class ( F E C ) . • Assume M P L S is used to set up a lightpath a n d to reserve b a n d w i d t h for each source-destination pair ( S D ) . E a c h L a b e l Switching P a t h ( L S P ) is identified w i t h a label and can be transmitted v i a more t h a n one channel. • F u l l wavelength conversion is assumed at each core node. • O p t i c a l buffers are allocated on a per-class a n d per-label basis, according t o the incoming traffic intensity and priority. • Traffic from different sources may share the same label, provided they are destined for the same egress. Chapter 5. Performance Analysis and Simulation Results 5.2.2 59 P r e d i c t i o n performance i m p r o v e m e n t We first study the L R D traffic prediction performance by comparing two prediction methods: simple L M S - b a s e d predictor [7] and our proposed M M P P Bayesian predictor. Figure 5.7 shows the probability density of the prediction error for b o t h our proposed M M P P Bayesian predictor and the traditional L M S - b a s e d linear predictive filter ( L P F ) . It is obvious that the L M S - b a s e d L P F has larger prediction error variance i n comparison w i t h our M M P P Bayesian predictor. Recall that the Bayesian estimation algorithm is optimal i n the sense of minimum variance. Note that the error i n this figure is un-normalized. • • mmpp 4 0 u s m mmpp~200us mftvpp_lrfts *• lms_2ns lms~40us lruapp_400us * • lms_200us lms_400us mnipp_2ms 0.00008 0.0000? 0.00004 0.00002 . 00001 .00000 0.5 prediction Figure 5.7: P D F of prediction error(in bits) error (bits) 1. (xlOOOOO) Chapter 5. Performance Analysis and Simulation Results Let 60 Lk+i be the predicted burst length and L +i be the actual burst length. T h e k performance is further evaluated by the following two criteria. F i r s t is the prediction interval, which refers to how far into the future the traffic can be predicted w i t h confidence. Define the normalized A-step prediction error as: E(A) = l ^ + i - ^ + i I. Lk+i Assume that confident prediction requires that E(A) (e.g. 20%) w i t h a probability (5.1) should not exceed a percentage e P (e.g. 0.01), where {P ,e) is the specified prediction e £ confidence interval. Therefore, a large prediction confidence interval implies good traffic predictability. Now, if we define P = P{E e > e}, we can compare the performance of two predictors on how confidently and how far they can predict the future. Figure 5.8 compares the simulation results of P vs. prediction interval A . £ T h e simulation results show that P of M M P P predictor grows much slower then L P F e when the prediction interval A increases. Furthermore, the M M P P predictor can b o u n d the confidence pair (P ,e) as (0.012,10%) and (0.008,20%) when A approaches £ 3ms, which is usually sufficient for assembling optical bursts. Signal-to-Noise R a t i o ( S N R ) is another criterion for evaluating the prediction performance, which refers to the accuracy of a prediction algorithm. N o w denote S N R as: SNR(A) = E [ L 2 + 1 ]/4, where A is the prediction interval, and a\ ai = E[(L fc+1 -L ) ], 2 fc+1 (5.2) is the A - s t e p prediction variance. T h e larger S N R implies the better prediction accuracy. In our thesis, we use the inverse of S N R , i.e., Chapter 5. Performance Analysis and Simulation Results 61 0.05 0.045 LMS predictor E= 10% 0.04 0.035 >> !5 ns o 0.03 0.025 LMS predictor E = 20% 0.02 0.015 MMPP predictor e = 10% -A 0.01 0.005 . MMPP. predictor, E .=. 20% J i - - 500 1000 1500 2000 2500 3000 Prediction Interval A (us) Figure 5.8: Comparison of P under prediction error e = 10%, 20% e SNR 1 for comparing the above two prediction methods. Namely, SNR SNR - 1 - 1 = E [ ( L fc+i Lk+i) 2 (5.3) is usually influenced by the parameters such as burst assembly duration T , i.e., q prediction interval, and traffic load p, since increasing traffic load implies increasing of hurst parameter H (the traffic bursty degree). W e conduct a set of simulations by tracing the dependence of S N R - 1 on the above two parameters, respectively. Figure 5.9 shows the effect of burst assembly duration. In this scenario, we set the other parameters as p = 25% and H = 0.75, since the b a n d w i d t h utilization of today's optical network is typically under 25%. Figure 5.10 compares SNR 1 w i t h the variation of traffic load i n the core network. Chapter 5. Performance Analysis and Simulation Results SNR 18 1 versus burst assembly time x ••A-- MMPP predictor -©• • LMS predictor load = 25% 16 H = 0.75 ,« / 14 CD a) > - / p O / 12 rr 62 / 10 '/ 8 / / / / 6 / / / 6 1.. o - . " ~" • ~~ £ 0 1.5 ^ J. . ^ . A - - A•— ^ 2.5 ^ 3.5 iog T 10 Figure 5.9: SNR A A (us) versus burst assembly time r In this scenario, we consider small assembly time ( r = a a 200ps) and large assembly time ( r = 2ms) cases, respectively. a A g a i n , smaller S N R - 1 implies better prediction accuracy. For our M M P P predictor, Figure 5.9 shows that ( S N R - 1 < 1%) is achieved when the burst assembly time r between lOO^xs to 1ms. Meanwhile, as r O n the other hand, the S N R r a - 1 a grows to 3ms, the S N R - 1 a is is kept w i t h i n 2%. of the L M S - b a s e d predictor increases dramatically when grows from tens of microsecond to the time scale of millisecond. T h i s is because the traffic behaves long-range dependence over several time scales while the burst is in assembling. It is inherent i n the fact that L M S algorithm converges and adapts to sudden changes is slower. In this scenario, it also yields the average accuracy improvement of M M P P predictor over L M S predictor as 87%. F r o m F i g u r e 5.10 we observe that S N R - 1 Chapter 5. Performance Analysis and Simulation Results SNR -A-o -0_^ 63 versus traffic load in core network — M M P P predictorT =2ms M M P P predictor T =200ns L M S predictor T =2ms L M S predictor-r =200ns a g i5 g _ a •€ (>- — - ~ — f: .... *- - ' ~~ ~ ~ 5 <! <> 0 I 20 - •I 30 <> <> 40 50 I I < > I 60 Core network traffic load% Figure 5.10: S N R 1 _ 1 70 _ J 80 versus traffic i n core network of M M P P predictor almost remains constant while increasing traffic load. T h i s is because the M M P P predictor has the knowledge of incoming traffic structure, which is independent of traffic load. However, the L M S - b a s e d predictor behaves different when feeding traffic becomes heavier. In the case of small assembly time applied, S N R - 1 is decreasing while traffic load increasing. T h i s result indicates that when many source traffic aggregated, the aggregated traffic turns out to be smoother i n very short time scales. N o w we look at larger time scales, i.e., the case of large assembly time applied. S N R - 1 is increasing w i t h the traffic load since the traffic bursty degree also grows, which can not be shaped away at this burstification range. Chapter 5. Performance Analysis and Simulation Results 5.2.3 64 B H P pre-transmission success probability In our system, a successful B H P pre-transmission means sufficient b a n d w i d t h is reserved before burst d a t a arrives at each intermediate core node. T h a t is, a smaller prediction of burst length: e(k + l) = {L -L )<0 k+1 (5.4) k+l w i l l result i n an insufficient reservation of the path-holding time for the data burst. T h i s requires the control header to be retransmitted after the burst assembly finishes, thus degrading the latency reduction performance. A common method for correcting this problem is to add a small correction margin for the predicted value. Instead of making the reservation length the predicted value L , k+X we define the reservation length as: L {k + 1) = L r k+1 where 8 is a small margin of correction. Let P s +S (5.5) denote the probability of success B H P pre-transmission: P = P{e{k + 1) < a}, s a = max{a,S}, . (5.6) where a is the variance of prediction error e(k + 1). Here the constraint a indicates that the prediction error overhead i s ' l i m i t e d i n our system. Figure 5.11 shows the B H P pretransmission success probability versus burst assembly time r . Since P depends largely 0 s on the correction margin 5, we also compare the results of P w i t h 6 variations. s Based on the central limited theorem, the prediction error e(k + 1) is assumed to be normal, w i t h zero mean and variance a . For a l l normal distributions, the density function 2 is symmetric about its mean value. A b o u t 68% of the area under the curve is w i t h i n one Chapter 5. Performance Analysis and Simulation Results 65 standard deviation of the mean, 95.5% w i t h i n two standard deviations, and 99.7% w i t h i n three standard deviations. Therefore, we only need to consider the correction margin: 5 = na, ne[0,3]. (5.7) F r o m Figure 5.11, we notice that the B H P pre-transmission success probability could be highly improved w i t h a small amount of correction value. B H P prelransmission success probability with L R D traffic 100 l 1 M M P P predictor 8 = 2o 90 Q_ >. !n to 80|- M M P P predictor 5 = a -Q o Q. 70 m in CD O O 3 60 h Cfl c o S 'in if) in c (0 Q. \ 50 . L M S predictor 8 = 2rj v.. A * : J 40 ; L M S predictor 8 = a 30 20 ; L M S predictor 8 = 0 500 1000 1500 2000 aggregation time (us) 2500 3000 Figure 5.11: Comparison of P versus burst assembly time r s 5.2.4 a Latency reduction improvement Now we can study how prediction accuracy and efficiency influence the system end-to-end (ETE) delay at edge routers. We consider first low-priority traffic without burst length Chapter 5. Performance Analysis and Simulation Results edge node Bandwidth allocator edge node 66 Bandwidth allocator Packet arrival Packet arrival No prediction With prediction Figure 5.12: Comparison of O B S network w i t h / w i t h o u t P r e d i c t i o n prediction i n the O B S edge routers. In this scenario, we use a one-way reservation protocol such as J E T [19]. A t the edge node, packets w i t h the same F E C class are aggregated into a container. W h e n assembly time is over, i.e., when r expires, the B H P is made and sent a out to the core network immediately, while the container is framed to a burst and stored i n the edge node, and w i l l be sent when the determined offset time expires. T h e E T E delay D = n + +u T T o (5.8) where r is the total transmission time of data burst throughout the designated L S P . T h e t offset time r is set to be the total O E O time + switching setup time at each intermediate 0 node through the designated L S P i n the core network. For high-priority traffic w i t h burst length prediction, we can dynamically reserve sufficient b a n d w i d t h at the O B S core network. In this scenario, we also use a one-way Chapter 5. Performance Analysis and Simulation Results 67 reservation protocol, such as F R R , proposed i n [7], which is a variation of J E T that focuses on delay deduction. T h e edge nodes determine all the routing and reservation information before the entire burst is assembled, then make and send the control header ( B H P ) immediately, while the burst is still being assembled. T h e burst assembly w i l l be finished when the preconfigured assembly timer r expires. However, the knowledge of the a burst length has been predicted at the arrival of the first packet of the burst. Simulation results show that the latency deduction value achieves the m a x i m u m when r is set to the 0 same as r [7]. It is quite straightforward that the assembly time is saved when r = r , a 0 a as shown i n Figure 5.12. Therefore, w i t h prediction, the E T E delay: D = p ~T l a •P + s (^T a + T ) • (1 0 P.) + T . (5.9) t For r = T , E q . 5.9 can be rewritten as: 0 a D = (^-P )-T p a a + T. (5.10) t In the O B S network, we observe that the bandwidth i n the core network (OC192 or more) is much higher'than that in the edge network ( O C 3 to O C 4 8 ) . T h e time for assembling a burst, which is usually i n the hundreds of microsecond, and is comparable w i t h the O E O time. T h e transmission time i n the core network is usually i n the time scale of several microseconds, which could be negligible to the assembling time. W h e n pre-transmission probability P achieves 1, the system end-to-end delay is D s T h e latency improvement (77) could be expressed by p = |r . 0 Chapter 5. Performance Analysis and Simulation Results 68 If the burst length could be predicted precisely such that the pre-transmission of the B H P succeeds w i t h a high probability (P —> 100%), the edge could reduce the latency by 66% s when T = r , and transmission time is negligible. 0 a ETE delay with/without prediction 4500 - 4000 s 3500 s / CO 2500 - / to 03 °U LU H LU s ,<* / 3000 - >, r , / / „ ' S ' 2000 - 1500 - 1000 - „9" < 500 - 0 "0 -+-0200 400 600 800 1000 1200 1400 LMS predictor No predictor 1600 1800 2000 aggregation time (us) Figure 5.13: Comparison of E T E delay w i t h / w i t h o u t P r e d i c t i o n In our simulation, we calculate the actual E T E delay by also considering the transmission time through the light path, since the transmission time should not be ignored i n a complicated network. A n example may illustrate this. T h e transmission time of a 1 0 0 K B burst on a 10 G b p s link is 80yus. A s shown i n our fully meshed network structure, optical burst usually undergos several hops i n the core network. So the accumulated transmission time might be comparable to burst assembly time. F i g u r e 5.13 compared the overall E T E delay w i t h no prediction, linear prediction, and our M M P P prediction. T h e simulation result shows that our proposed M M P P predictor can improve the performance of E T E delay about 30% ~ 35% compared to a traditional linear predictor, w i t h a relatively larger Chapter 5. Performance Analysis and Simulation Results 69 prediction interval, and around 50% compared to the situation without predictor. 5.2.5 Effect on Burst blocking probability Comparison of burst loss probability I 1 I ! : _ -() I ^- " JS O ID in o .a _ . i CD > O •A- • MMPP predictor -o- LMS predictor wavelength = 4 H = 0.75 x = 1ms i 25 30 35 40 45 50 55 60 a 65 70 75 Core network traffic load% Figure 5.14: Comparison of burst blocking probability w i t h different predictor In the O B S system w i t h its one-way reservation scheme, if the requested b a n d w i d t h is not available, the burst is said to be blocked and dropped. A s a burst may contain several user packets, the loss of a burst is critical i n O B S networks, which may incur severe problems at upper layers, e.g., T C P packets out of sequence. T h e performance of O B S networks i n terms of burst blocking probability has been studied extensively [40] [41] using either simulation or simple analytical models. B u r s t blocking probability is influenced mainly by traffic load, traffic chrematistics, and number of wavelengths, as discussed i n [42]. Chapter 5. Performance Analysis and Simulation Results 70 Figure 5.14 compares two predictors for the following criteria: burst loss ratio, also called burst blocking probability. In our simulation, we m a i n l y consider the L R D traffic w i t h H = 0.75 and our system support full wavelength conversion for only 4 wavelengths. So the blocking probability is higher i n comparison to more wavelengths cases. The simulation results validate that, w i t h a more precise prediction of the burst length, the burst loss ratio can be highly decreased. 71 C h a p t e r 6 C o n c l u s i o n 6.1 a n d F u t u r e W o r k Conclusion T h i s thesis studies optical burst switched ( O B S ) networks, Internet I P traffic modeling at the O B S edge networks, and on-line burst size prediction for dynamic b a n d w i d t h reservations. • F i r s t , this thesis presents a detailed analytical model for the burst shaping (assembly), scheduling and wavelength reservation schemes for O B S networks. Though there has already been research on this subject, the models proposed i n this thesis are more viable and valid for general burst length and offset length distributions. QoS differentiation is facilitated at network edges. • In the second part of the thesis, special attention has been p a i d on traffic modeling for self-similar Internet traffic w i t h long-range dependence ( L R D ) properties. T h e incoming I P traffic at O B S ingress nodes is approximated by using the superposition of several M M P P s . W e find that, under reasonable assumptions regarding the system parameters of burst assembly, i.e., burst assembly time and O E O time-scales in optical switch and electronic hardware, the performance of our traffic model using the superposition of four 2-state M M P P s seems to get close to that of more Chapter 6. Conclusion and Future Work 72 complicated F B m and F G n . T h a t is, it is practical sufficient to model L R D traffic properties i n several time scales. • Last b u t not least, a recursive Bayesian (optimal) filtering/prediction method is proposed to predict optical burst length at the O B S ingress node. For I P traffic approximated by a N-state M M P P , the computational complexity is 0(N ). 2 B y this burst length prediction, burst control header ( B H P ) can be transmitted down to the core network before burst assembly is completed, thus greatly reducing burst assembly delay at the edge node and further facilitating the service differentiation based on Q o S requirements among different classes. Simulation studies are presented to i l lustrate the performance of our proposed predictor i n comparison to earlier proposed L M S - b a s e d linear predictive filter. Theoretical analysis and simulations exhibit encouraging results. T h e performance study indicates that, the recursive Bayesian predictor delivers excellent forecasting performance for the self-similar traffic which best models the Internet traffic, and is proved to be practical i n reducing the d a t a burst delay at network ingress routers of an O B S system. 6.2 Extensions Several issues remain open and are worthy of further investigation to generalize our prediction scheme a n d to unleash the potential of QoS provisioning for the O B S system. For example, the proposed Bayesian predictor is an on-line o p t i m a l (minimum-variance) predictive filter, which provides an optimization of the burst-length prediction. However, there are two disadvantages of this estimation scheme, i.e., the "curse of computational cost" a n d the "curse of traffic model". T h e algorithm involves m a t r i x multiplication, which results i n the computational effort being of the order 0(N ), which can be large for 2 Chapter 6. Conclusion and Future Work 73 large N. Recently, some reduced-complexity filtering algorithms have been proposed [43] to solve this problem. In this thesis, the Bayesian prediction algorithm is derived based on a M M P P traffic model, assuming full knowledge of the M M P P traffic structure. A s we stated i n Chapter 1, we skipped the step 2, i.e., parameter estimation. T h e expectationm a x i m i z a t i o n ( E M ) algorithm is a popular off-line locally convergent scheme for obtaining m a x i m u m likelihood estimates of the H M M parameters. Future research may focus on online Bayesian prediction of M M P P models w i t h unknown transition matrix. 74 G l o s s a r y Operators diag[-] diagonal m a t r i x w i t h diagonal entries of vector argument E{-} expectation P{-} probability e Kronecker's sum [ •J transpose (•, •) || • | | scalar product 2 the Frobenius n o r m Other Functions exp(-) exponential function ln(-) logarithm to base e log (-) logarithm to base 10 10 Acronyms AONs A l l - o p t i c a l Networks AR A d v a n c e d Reservation ARIMA A u t o Regressive Integrated M o v i n g Average BCU Burstification C o n t r o l U n i t BHP B u r s t Header Packet CLT Central L i m i t Theorem DR Delayed Reservation ETE End-to-End FARIMA Fractal A u t o Regressive Integrated M o v i n g Averag FEC Forward Equivalence Class FDL F i b e r Delay Line fBm Fractional B r o w n i a n M o t i o n fGn Fractional Gaussian Noise IPP Interrupted Poisson Process JET Just-Enough-Time JIT Just-In-Time HMM Hidden Markov Model MMPP M a r k o v M o d u l a t e d Poisson Process MMSE M i n i m u m M e a n Square E r r o r MPLS M u l t i - P r o t o c o l L a b e l Switching LMS Least M e a n Square LPF Linear Prediction F i l t e r LRD L o n g Range Dependence LSP L a b e l Switched P a t h OBS O p t i c a l Burst Switching OCS O p t i c a l C i r c u i t Switching OEO optical-electrical-optical OPS O p t i c a l Packet Switching OTN O p t i c a l Transport Network OXCs O p t i c a l Cross Connects QoS Q u a l i t y of Service SCU Switch C o n t r o l U n i t SNR Signal-to-Noise R a t i o SRD Short Range Dependence WDM Wavelength D i v i s i o n M u l t i p l e x i n g WRONs Wavelength-Routed O p t i c a l Networks 76 B i b l i o g r a p h y [1] Y . X i o n g , M.Vandenhoute, and H . C . C a n k a y a , " C o n t r o l architecture i n optical burst switched W D M networks," IEEE Journal on Selected Areas in Communications, vol. 18, no. 10, pp. 1838-1851, October 2000. [2] G . Gripenberg and I. Norros, " O n the prediction of fractional B r o w n i a n motion," Journal of Applied Probability, vol. 33, pp. 400-410, 1996. [3] A . H o r v a t h and M . Telek., " A markovian point process exhibiting multifractal behaviour and its application to traffic modeling," i n Proc. 4th International Conference on Matrix-Analytic Methods in Stochastic models, Adelaide, A u s t r a l i a , J u l y 2002, pp. 14-18. [4] A . T . Andersen and B . F . Nielsen, " A markovian approach for modeling packet traffic w i t h long-range dependence," IEEE Journal on Selected Areas in Communications, vol. 16, no. 5, pp. 719-732, June 1998. [5] T . Yoshihara, S. Kasahara, and Y . Takahashi, "Practical time-scale fitting of selfsimilar traffic w i t h Markov-modulated Poisson process," Telecommunication Systems, vol. 17, no. 1-2, pp. 185-211, 2001. [6] D . M o r a t o , J . A r a c i l , L . A . Diez, M . Izal, and E . M a g a n a , " O n linear prediction of internet traffic for packet and burst switching networks," i n October 2001, pp. 138-143. Proceedings of ICCCN, Bibliography 77 [7] J . L i u and N . A n s a r i , "Forward resource reservation for QoS provisioning i n O B S systems," GLOBECOM '02 IEEE, vol. 3, pp. 2777-2781, 2002. [8] M . H I L L and N . J . , "Scientists at lucent technologies' bell labs calculate theoretical limits of fiber optic communications," Lucent Technologies," Press Releases, June 28 2001. [9] L . X u , H . Perros, and G.Rouskas, "Techniques for optical packet switching and opt i c a l burst switching," IEEE Communications Magazine, vol. 39, no. 1, pp. 136-142, January 2001. [10] C . Qiao, "Labeled optical burst switching fore I P - o v e r - W D M integration," i n IEEE Communications Magazine, ser. 9, vol. 38, September 2000, pp. 104-114. [11] J . S. Turner, "Terabit burst switching," Journal of High Speed Networks, vol. 8, no. 1, pp. 3-16, January 1999. [12] F . Callegati, H . C . Cankaya, Y . X i o n g , and M . Vandenhoute, "Design issues of optical ip routers for internet backbone applications," i n IEEE Communications, December 1999, pp. 124-128. [13] L . X u , H . G . Perros, and G . N . Rouskas, " A simulation study of optical burst switching access protocols for W D M ring networks," i n Computer Networks, ser. 2, vol. 41, January 2003, pp. 143-160. [14] A . B r a g g , I. Baldine, and D . Stevenson, " A transport layer architectural framework for optical burst switched ( O B S ) networks," in In Proceedings of the First Interna- tional Workshop on Optical Burst Switching, October 2003. [15] A . Kaheel, T . K h a t t a b , A . M o h a m e d , and H . Alnuweiri, "Quality-of-Service mechanisms i n I P - o v e r - W D M networks," 2002. IEEE Communications Magazine, December Bibliography 78 [16] K . Dolzer, "Assured horizon - a new combined framework for burst assembly and reservation i n optical burst switched networks," i n NOC, Darmstadt, 2002. [17] K . Laevens, "Traffic characteristics inside optical burst switched networks," i n Optical Networking and Commun. Boston: O p t i C o m m , 2002. [18] J . Y . W e i and R . I. M c F a r l a n d , "Just-in-time signaling for W D M optical burst switching networks," Journal of Lightwave Technology, v o l . 18, no. 12, pp. 2019-2037, De- cember 2000. [19] C . Qiao and M . Y o o , " O p t i c a l burst switching ( O B S ) - a new paradigm for an optical internet," Journal of High Speed Networks, v o l . 8, no. 1, pp. 69-84, January 1999. [20] C . H s u , T . L i u , and N . Huang, "Performance analysis of deflection routing i n optical burst-switched networks," in In Proceedings of INFOCOMM, v o l . 1, 2002, pp. 66-73. [21] A . Zalesky, H . L . V u , Z . Rosberg, E . W . M . W o n g , and M . Zukerman, "Reduced load erlang fixed point analysis of optical burst switched networks w i t h deflection routing and wavelength reservation," i n In Proceedings of the First International Workshop on Optical Burst Switching, October 2003. [22] V . Vokkarane, J.P.Jue, and S. Sitaraman, "Burst segmentation: reducing packet loss i n optical burst switched networks," i n A n approach for In Proceedings of IEEE ICC, v o l . 5, 2002, pp. 2673-2677. [23] L . Sofman, K . Laevens, and T . E l - B a w a b , "Segmentation overhead i n optical burst switching," i n In Proceeding of Opticomm, 2002, pp. 101-108. [24] A . E r r a m i l l i , R . Singh, and P. P r u t h i , "Chaotic maps as models of packet traffic," in The Fundamental Role of Teletraffic in the Evolution of Telecommunications Networks, vol. 10, no. 6. P r o c . of the 14th I T C , June 1994, pp. 329-338. Bibliography 79 [25] I. Norros, " O n the use of fractional brownian motion i n the theory of connectionless networks," Selected Areas in Communications, vol. 13, no. 6, pp. 953 - 962, August 1995. [26] R . Narasimha, S. Lee, and R . R a o , "Discrete-time scale invariant systems: relation to long-range dependence and F A R I M A models," i n Processing, ser. 13-17, vol. 4. Acoustics, Speech, and Signal I C A S S P '02, M a y 2002, pp. l V - 4 1 7 0 . [27] R . R i e d i , M . Crouse, V . Ribeiro, and R . Baraniuk, " A multifractal wavelet model w i t h application to network traffic," IEEE Transactions on Information' Theory, vol. 45, no. 3, p p . 992 - 1018, A p r i l 1999. [28] S. Robert a n d J . - Y . L . Boudec, " O n a M a r k o v modulated chain exhibiting selfsimilarities over finite timescale," Perform. Eval, v o l . 27-28, pp. 159-173, 1996. [29] A . G . F . Callegati and L . T a m i l , " O n optical burst switching and selfsimilar traffic," IEEE Communications Letters, vol. 4, pp. 98-100, M a r c h 2000. [30] G . H u , K . Dolzer, a n d C . Gauger, similarity," i n "Does burst assembly really reduce the self- Optical Fiber Communication Conference (OFC 2003), v o l . 86. O S A , 2003, p p . 124-126. [31] M . Izal and J . A r a c i l , " O n the influence of self similarity o n optical burst switching traffic," i n IEEE Global Telecommunications Conference-Globecom'02, New York, 2002, p p . 2320-2324. [32] J . Beran, Statistics for Long-memory Process. N e w Y o r k : C h a p m a n and H a l l , 1994. [33] H . Heffes a n d D . Lucantoni, " A M a r k o v modulated characterization of packetized voice and d a t a traffic and related statistical multiplexer performance," IEEE J. Se- lect. Areas Commun., vol. S A C - 4 , no. 6, pp. 856-868, September 1986. Bibliography [34] A . Errarnilli, 0 . Narayan, and W . W i l l i n g e r , lalow, E d . 80 Fractal Queueing Models, J . H . Dsha- C R C Press, Inc., 1997. [35] J . R . Treichler, C . J . Jr., and M.G.Larimore, Theory and Design of Adaptive Filters. New Y o r k : Wiley, 1987. [36] R . J . E l l i o t t , L . Aggoun, and J . B . M o o r e , Hidden Markov models : estimation and control. New York: Springer-Verlag, 1995. [37] V . K r i s h n a m u r t h y and R . E l l i o t t , "Filters for estimating M a r k o v modulated Poisson processes a n d image-based tracking," Automatica, v o l . 33, no. 5, p p . 821-833, M a y 1997. [38] P . B r e m a u d , Point Process and Queues. New Y o r k : Spring-Verlag, 1981. [39] A . M o h a m e d , A . Kaheel, T . K h a t t a b , and H . Alnuweiri, "Evaluation of optical packet switch as edge device using O P N E T modeler," Opnetwork, 2002. [40] K . Dolzer, C . Gauger, J . Spath, and S. Bodamer, "Evaluation of reservation mechanisms for optical burst switching," AEU International Journal of Electronics and Communications, vol. 55, no. 1, January 2001. [41] M . Y o o , C . Qiao, and S. D i x i t , "QoS performance of optical burst switching i n I P o v e r - W D M networks," Journal on Selected Areas in Communications, v o l . 18, no. 10, pp. 2062-2071, October 2000. [42] H . L . V u and M . Zukerman, " B l o c k i n g probability for priority classes i n optical burst switching networks," [43] V . K r i s h n a m u r t h y IEEE COMMUNICATIONS LETTERS, v o l . 6, no. 5, 2002. and S. Dey, "Reduced spatio-temporal and image-based tracking filters for maneuvering targets," complexity MMPP IEEE Transactions on Aerospace and Electronic Systems, vol. 39, pp. 1277-1291, October 2003.
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Recursive Bayesian traffic prediction for performance...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Recursive Bayesian traffic prediction for performance improvement in OBS networks Li, Rong 2005
pdf
Page Metadata
Item Metadata
Title | Recursive Bayesian traffic prediction for performance improvement in OBS networks |
Creator |
Li, Rong |
Date Issued | 2005 |
Description | This thesis deals with traffic prediction in Optical Burst Switched (OBS) networks with self-similar traffic, i.e., traffic with long-range dependence (LRD) properties. Aggregated traffic in high speed optical networks exhibits LRD properties. OBS is a recent promising optical network technology to facilitate IP-Over-WDM (Internet Protocol Over Wavelength Division Multiplexing). To improve the quality of service (QoS) in OBS transmission networks, traffic prediction is required for dynamic resource reservation. We present and discuss a model of IP traffic based on MMPP (Markov Modulated Poisson Process), which approximates LRD traffic by mimicking the hierarchical generation of data by Internet users. The MMPP model is capable of effectively capturing the key aspects of the traffic measured on an OBS edge router, hence representing an aggregation of the traffic generated by a number of sources. The main contribution of this thesis is to derive an optimal Bayesian predictor for the burst size at the ingress router of an OBS network for MMPP approximations of LRD traffic. Bayesian prediction yields the MMSE (Minimum Mean Square Error) estimate of the burst size. As shown in a simulated OBS testbed, such Bayesian predictor can yield substantial improvement in latency reduction and service differentiation of OBS network compared to linear predictors, without substantial increasing in computational complexity. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2009-12-10 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0065868 |
URI | http://hdl.handle.net/2429/16331 |
Degree |
Master of Applied Science - MASc |
Program |
Electrical and Computer Engineering |
Affiliation |
Applied Science, Faculty of Electrical and Computer Engineering, Department of |
Degree Grantor | University of British Columbia |
Graduation Date | 2005-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |
Download
- Media
- 831-ubc_2005-0250.pdf [ 5.51MB ]
- Metadata
- JSON: 831-1.0065868.json
- JSON-LD: 831-1.0065868-ld.json
- RDF/XML (Pretty): 831-1.0065868-rdf.xml
- RDF/JSON: 831-1.0065868-rdf.json
- Turtle: 831-1.0065868-turtle.txt
- N-Triples: 831-1.0065868-rdf-ntriples.txt
- Original Record: 831-1.0065868-source.json
- Full Text
- 831-1.0065868-fulltext.txt
- Citation
- 831-1.0065868.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0065868/manifest