QoS-aware Resource Allocation in Wireless Communication Systems

by

Chi En Huang

B.A.Sc., The University of British Columbia, 1997
M.Eng., Cornell University, 2001

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

Doctor of Philosophy

in

THE FACULTY OF GRADUATE STUDIES
(Electrical and Computer Engineering)

The University of British Columbia
(Vancouver)

September 2012

© Chi En Huang, 2012

Abstract

With the rapid growth in demand for wireless communications, service providers are expected to provide always-on, seamless and ubiquitous wireless data services to a large number of users with different applications and different Quality of Service (QoS) requirements. The multimedia traffic is envisioned to be a concurrent mix of real-time traffic and non-real-time traffic. However, radio spectrum is a scarce resource in wireless communications. In order to adapt to the changing wireless channel conditions and meet the diverse QoS requirements, efficient and flexible packet scheduling algorithms play an increasingly important role in radio resource management (RRM).

Much of the published work in RRM has focused on exploiting multi-user and multi-channel diversities. In this thesis, we adopt an adaptive cross layer approach to exploit multi-application diversity in single-carrier communication systems and additionally, multi-bit diversity in multi-carrier communication systems. Efficient and practical resource allocation (RA) algorithms with finer scheduling granularity and increased flexibility are developed to meet QoS requirements. Specifically, for single-carrier communication systems, we develop RA algorithms with flow and user multiplexing while jointly considering physical-layer time-varying channel conditions as well as application-layer QoS requirements.
For multi-carrier communication systems, we propose a bitQoS-aware RA framework to adaptively match the QoS requirements of the user application bits to the characteristics of the narrowband channels. The performance gains achievable from the proposed bitQoS-aware RA framework are demonstrated with suboptimal algorithms using water-filling and bit-loading approaches. Efficient algorithms to obtain optimal and near-optimal solutions to the joint subcarrier, power and bit allocation problem with continuous and discrete rate adaptation, respectively, are developed. The increased control signaling that may be incurred, as well as the computational complexity as a result of the finer scheduling granularity, are also taken into consideration to establish the viability of the proposed RA framework and algorithms for deployment in practical networks.

The results show that the proposed framework and algorithms can achieve a higher system throughput with substantial performance gains in the considered QoS metrics compared to RA algorithms that do not take QoS requirements into account or do not consider multi-application diversity and/or multi-bit diversity.

Preface

Each of Chapters 2 to 8 is based on manuscripts that have been accepted, submitted or are to be submitted for publication in international peer-reviewed journals and conferences. The manuscripts are all co-authored by myself as the first author and my supervisor, Dr. Cyril Leung. In all these works, I played the primary role in designing and performing the research, doing data analysis and preparing manuscripts under the supervision of Dr. Cyril Leung.

The publications resulting from this PhD work are:

• C. E. Huang and C. Leung, “Multi-flow merging gain in scheduling for flow-based wireless networks,” in Proc. IEEE PACRIM, Aug. 2007, pp. 553–556.
• C. E. Huang and C. Leung, “Adaptive cross layer scheduling with flow multiplexing,” in Proc. IEEE WCNC, Mar. 2008, pp. 1871–1876.
• C. E. Huang and C. Leung, “Downlink mixed-traffic scheduling with packet division multiplexing,” in Proc. ACM PM2HW2N, Oct. 2008, pp. 165–172.
• C. E. Huang and C. Leung, “QoS-aware bit scheduling in multi-user OFDM systems,” in Proc. IEEE WCNC, Mar. 2011, pp. 215–220.
• C. E. Huang and C. Leung, “BitQoS-aware resource allocation for multi-user mixed-traffic OFDM systems,” IEEE Trans. Veh. Technol., vol. 61, no. 5, pp. 2067–2082, Jun. 2012.
• C. E. Huang and C. Leung, “Scheduling signaling overhead in bitQoS-aware multi-flow OFDM systems,” submitted.
• C. E. Huang and C. Leung, “On the optimality of bitQoS-aware resource allocation in OFDMA systems,” submitted.
• C. E. Huang and C. Leung, “Determination of scheduling block size in bitQoS-aware OFDMA systems,” in preparation.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
List of Symbols
List of Acronyms
Acknowledgments
Dedication
1 Introduction
  1.1 Motivation
  1.2 Related Work
    1.2.1 Resource Allocation in Wireless Communication Systems
    1.2.2 Cross Layer Resource Allocation
    1.2.3 Resource Allocation in OFDM Networks
  1.3 Objectives and Contributions
  1.4 Thesis Overview
2 Flow Multiplexing in Single-carrier CDMA Systems
  2.1 Introduction
  2.2 Background on cdma2000 1xEV-DO
  2.3 System Model
    2.3.1 Network Model
    2.3.2 Traffic Classes
    2.3.3 Data Buffer Parameters
  2.4 Multi-flow Merging
    2.4.1 Multi-Flow Merging Scheduling Policy
  2.5 Adaptive Cross Layer Scheduling with Flow Multiplexing Scheduling Policy
    2.5.1 Packet Urgency Function
    2.5.2 Packet Priority Function
    2.5.3 Flow Merging Policy
    2.5.4 User Selection Policy
  2.6 Simulation Results
    2.6.1 Comparative Scheduling Policies
    2.6.2 Performance Measure
    2.6.3 MFM Results
    2.6.4 ACLS-FM Results
  2.7 Conclusion
3 Packet Division Multiplexing in Single-carrier CDMA Systems
  3.1 Introduction
  3.2 System Model
  3.3 Adaptive Cross Layer Scheduling with Flow and User Multiplexing Scheduling Policy
    3.3.1 Transmission Mode Selection Function
    3.3.2 SUP Transmission Mode
    3.3.3 MUP Transmission Mode
  3.4 Simulation Results
  3.5 Conclusion
4 BitQoS-aware Resource Allocation Framework for Multi-carrier OFDM Systems
  4.1 Introduction
  4.2 System Model
    4.2.1 Network Model
    4.2.2 Traffic Classes
  4.3 BitQoS-aware Resource Allocation Framework
    4.3.1 BitQoS Function
    4.3.2 BitQoS-aware Resource Allocation Framework with No Flow Merging
    4.3.3 BitQoS-aware Resource Allocation Framework with Flow Merging
  4.4 Performance Evaluation
    4.4.1 Performance Measures
    4.4.2 Analytical System Throughput
    4.4.3 Comparative Schemes
5 BitQoS-aware Resource Allocation Scheduling Policies
  5.1 Introduction
  5.2 Multi-user Water-filling with Heuristics
    5.2.1 WFH-FM Scheduling Policy
    5.2.2 WFH-NFM Scheduling Policy
  5.3 Multi-user BitQoS-aware Bit-loading
    5.3.1 BABL-FM Scheduling Policy
    5.3.2 BABL-NFM Scheduling Policy
  5.4 Simulation Results
    5.4.1 WFH Simulation Results
    5.4.2 BABL Simulation Results
  5.5 Conclusion
6 Scheduling Signaling Overhead in BitQoS-aware Resource Allocation Framework
  6.1 Introduction
  6.2 Scheduling Signaling Overhead Model
  6.3 Scheduling Signaling Information
    6.3.1 Scheduling Policies with No Flow Merging
    6.3.2 Scheduling Policies with Flow Merging
    6.3.3 Scheduling Policies with Flow Merging - Grouped Sorted
  6.4 Scheduling Signaling Information Entropy
    6.4.1 Scheduling Policies with No Flow Merging
    6.4.2 Scheduling Policies with Flow Merging
    6.4.3 Scheduling Policies with Flow Merging - Grouped Sorted
  6.5 Compression of Scheduling Signaling Information
    6.5.1 Run-length Encoding
    6.5.2 Lempel-Ziv-Welch
  6.6 Simulation Results
    6.6.1 Entropy of Scheduling Signaling Overhead
    6.6.2 Compressed Scheduling Signaling Overhead
    6.6.3 Effective Throughput
  6.7 Conclusion
7 Continuous and Discrete Rate Adaptation in BitQoS-aware Resource Allocation Framework
  7.1 Introduction
  7.2 Reduced-dimensionality BitQoS-aware Resource Allocation Framework
  7.3 Subcarrier, Power and Bit Allocation with Continuous Rate Adaptation
    7.3.1 Optimal Power Allocation
    7.3.2 Optimal Subcarrier Assignment
    7.3.3 Optimal Bit Assignment
    7.3.4 Optimal Joint Subcarrier, Power and Bit Allocation
  7.4 Subcarrier, Power and Bit Allocation with Discrete Rate Adaptation
  7.5 Simulation Results
    7.5.1 Optimality and Computation Time
    7.5.2 Sensitivity to Iteration Step Size and Termination Tolerance
    7.5.3 Performance Comparison of KKT-CRA and KKT-DRA to the Greedy Multi-user Water-filling Algorithm
  7.6 Conclusion
8 Computational Complexity and Practicality of BitQoS-aware Resource Allocation Framework
  8.1 Introduction
  8.2 Computational Complexity
  8.3 Practicality of BitQoS-aware Scheduling Policies
  8.4 Conclusion
9 Conclusion
  9.1 Contributions
  9.2 Future Work
    9.2.1 Analysis and Determination of Scheduling Block Size
    9.2.2 Efficient and Optimal Solution to Discrete Rate Adaptation Problem
    9.2.3 Alternative Formulations of BitQoS Function
    9.2.4 Distributed Resource Allocation Algorithms
Bibliography
Appendices
A Inductive Proof of MUP Throughput Gain
B Proof of Monotonicity of LHS of (4.21) and Existence of Solution of (4.21)
C Proof of Concavity of LHS of (7.8)
D Computation Complexity Analysis
  D.1 WFH-FM
  D.2 BABL-FM
  D.3 KKT-CRA
  D.4 KKT-DRA
  D.5 WF
  D.6 MDU

List of Tables

Table 2.1 Simulation Parameter Values
Table 2.2 Simulation Results of MFM with $\rho = 0.90$
Table 2.3 Results of ACLS-FM for (a) BE only (b) EF only (c) BE + EF
Table 3.1 Simulation Parameter Values
Table 4.1 Simulation Parameter Values
Table 4.2 Traffic Parameter Values
Table 6.1 Effective Throughput Gains of WFH-NFM, WFH-FM and WFH-FMGS for $I = \{4, 6, 8\}$, $N = 18$, 1 BE and 1 EF Flow for each User
Table 8.1 Number of Operations Performed by Each Scheduling Policy - Part I
Table 8.2 Number of Operations Performed by Each Scheduling Policy - Part II
Table 8.3 Computation Time Calculation Parameter Values for LTE
Table 8.4 LTE Transmission Bandwidth Configurations
Table 8.5 Computation Times of the Considered Scheduling Policies

List of Figures

Figure 1.1 Cross Layer Design Model
Figure 1.2 Structure of the Thesis
Figure 2.1 Forward Link Scheduler Model
Figure 2.2 Illustration of MFM
Figure 2.3 BE and EF Packet Urgency Functions
Figure 2.4 CDF of User Throughput, $\rho = 0.90$, $I = 22$, 10 EF Flows for each User
Figure 2.5 CDF of User Latency, $\rho = 0.90$, $I = 22$, 10 EF Flows for each User
Figure 2.6 CDF of User Packet Drop Probability, $\rho = 0.90$, $I = 22$, 10 EF Flows for each User
Figure 2.7 Performance for a System with $\rho = 0.90$, $I = 14$, 16 EF Flows for each User (a) CDF of User Throughput (b) CDF of User Latency (c) CDF of User Packet Drop Probability
Figure 2.8 Performance for a System with $\rho = 0.90$, $I = 4$, 8 BE Flows and 8 EF Flows for each User (a) CDF of User Throughput (b) CDF of User Latency (c) CDF of User Packet Drop Probability
Figure 3.1 Illustration of MUP with MFM
Figure 3.2 ACLS-FUM Scheduling Policy Flow Chart
Figure 3.3 Performance for a System with $\rho = 0.50$, $I = 110$, 1 EF Flow for each User (a) CDF of User Throughput (b) CDF of User Latency (c) CDF of User Packet Drop Probability (d) CDF of User Jitter
Figure 3.4 Performance for a System with $\rho = 0.90$, $I = 30$, 1 BE and 1 EF Flow for each User (a) CDF of User Throughput (b) CDF of User Latency (c) CDF of User Packet Drop Probability (d) CDF of User Jitter
Figure 4.1 Mapping of Application Bits to OFDM Subcarriers for the BitQoS-aware Resource Allocation Framework
Figure 4.2 BE and EF BitQoS Functions
Figure 5.1 WFH-FM Flow Chart
Figure 5.2 BABL-FM Flow Chart
Figure 5.3 WFH: Performance for a System with $I = 8$, $N = 18$, 1 BE and 1 EF Flow for each User (a) CDF of User Throughput (b) CDF of User Bit Latency (c) CDF of User Bit Jitter (d) CDF of Number of User Bits Dropped per 250 OFDM Symbols
Figure 5.4 WFH: Average System Throughput under Different Loads
Figure 5.5 WFH: Performance for Systems under Different Loads (a) Average User Throughput (b) Average User Latency (c) Average User Jitter (d) Average User Packet Drop Probability
Figure 5.6 BABL: Performance for a System with $I = 8$, $N = 18$, 1 BE and 1 EF Flow for each User (a) CDF of User Throughput (b) CDF of User Bit Latency (c) CDF of User Bit Jitter (d) CDF of Number of User Bits Dropped per 250 OFDM Symbols
Figure 5.7 BABL: Average System Throughput under Different Loads
Figure 5.8 BABL: Performance for Systems under Different Loads (a) Average User Throughput (b) Average User Latency (c) Average User Jitter (d) Average User Packet Drop Probability
Figure 6.1 Mapping of Application Bits to OFDM Subcarriers with Different BitQoS-aware Scheduling Policies
Figure 6.2 Entropy of Scheduling Signaling Overhead
Figure 6.3 Compressed Scheduling Signaling Overhead of the Various Scheduling Policies using RLE and LZW
Figure 7.1 Relationship between $\gamma_i$ and $\psi_i^{j,z}$
Figure 7.2 Differences in Power Allocation and Bit Assignment between KKT-CRA and KKT-DRA
Figure 7.3 Average Objective Value as a Function of $I$ ($N = 6$, $\delta = 0.3$, $\epsilon = 10^{-4}$ and $SNR = 15$ dB)
Figure 7.4 Average Computation Time as a Function of $I$ ($N = 6$, $\delta = 0.3$, $\epsilon = 10^{-4}$ and $SNR = 15$ dB)
Figure 7.5 Sensitivity of KKT-CRA and KKT-DRA to $\delta$ ($I = 3$, $N = 6$, $SNR = 15$ dB and $\epsilon = 1$)
Figure 7.6 Sensitivity of KKT-CRA and KKT-DRA to $\epsilon$ ($I = 3$, $N = 6$, $SNR = 15$ dB and $\delta = 0.3$)
Figure 7.7 Comparison of Average Objective Value and Average Throughput between KKT-CRA/DRA and WF-CRA/DRA for (a) SMB and (b) VMB
Figure 7.8 Comparison of Average Computation Time between KKT-CRA/DRA and WF-CRA/DRA

List of Symbols

$I$    number of users
$i$    user index, $i \in \mathcal{I} = \{1, \ldots, I\}$
$J_i$    number of flows for user $i$
$j$    flow index, $j \in \mathcal{J}_i = \{1, \ldots, J_i\}$
$J_{sys}$    number of flows in the system, defined as $J_{sys} = \sum_{i=1}^{I} J_i$
$N$    number of subcarriers
$n$    subcarrier index, $n \in \mathcal{N} = \{1, \ldots, N\}$
$B_i^j(k)$    data buffer queue length for user $i$, flow $j$ at time $k$
$z$    bit index, $z \in \{1, \ldots, B_i^j(k)\}$
$K$    simulation length
$k$    time index, $k \in \{1, \ldots, K\}$
$\alpha_{i,n}$    channel gain of subcarrier $n$ for user $i$
$P_{total}$    total BS transmit power
$T_s$    OFDM symbol duration
$c_{i,n}$    number of bits that can be carried on subcarrier $n$ for user $i$
$\sigma_0^2$    noise power
$\zeta$    signal-to-noise ratio gap parameter
$\lambda_j$    average traffic arrival rate of flow $j$
$\psi_i^{j,z}$    bitQoS value of bit $z$ for user $i$, flow $j$
$\theta_i^{j,z}$    tuple of QoS parameters associated with bit $z$ for user $i$, flow $j$
$a_{i,n}$    subcarrier assignment optimization variable for user $i$, subcarrier $n$
$p_{i,n}$    transmit power allocation optimization variable for user $i$, subcarrier $n$
$b_{i,n}^{j,z}$    bit assignment optimization variable for bit $z$ of user $i$, flow $j$ on subcarrier $n$
$\pi_j$    application flow priority of flow $j$
$w_i^{j,z}(k)$    waiting time of bit $z$ for user $i$, flow $j$ at time $k$
$T_j$    scheduling delay threshold of flow $j$
$\eta_j$    comfort latency threshold of flow $j$
$\xi_j$    delay sensitivity of flow $j$
$c_j, d_j$    coefficients for the bitQoS function of flow $j$
$U^{NFM}(k)$    subcarrier-to-flow vector at time $k$
$U^{FM}(k)$    subcarrier-to-user vector at time $k$
$V_n^{FM}(k)$    bit-to-flow vector for subcarrier $n$ at time $k$
$M_n(k)$    number of bits carried by subcarrier $n$ at time $k$
$C$    analytical system throughput
$R$    total number of allocated bits
$B$    maximum data buffer queue length, defined as $B = \max_{i \in \mathcal{I},\, j \in \mathcal{J}_i} B_i^j$
$L$    number of iterations performed by a bisection algorithm
$\kappa$    number of iterations performed by MDU
$EPSize_i(k)$    physical layer encoder packet size for user $i$ at time $k$
$\mu$    effective service rate
$d_i^{j,z}(k)$    size of the packet at position $z$ in the data buffer for user $i$, flow $j$ at time $k$
$v_i(k)$    running average throughput of user $i$ over the last $N_{window}$ time slots at time $k$
$CSI_i(k)$    channel state information for user $i$ at time $k$
$u_i^{j,z}(k)$    packet urgency value of user $i$, flow $j$, packet $z$ at time $k$
$\rho$    system loading factor
$G_\chi^Y$    performance gain of scheduling policy $Y$ with QoS measure $\chi$
$a_{i,n}^j$    subcarrier assignment optimization variable for user $i$, flow $j$, subcarrier $n$ in the bitQoS-aware resource allocation framework with no flow merging
$p_{i,n}^j$    transmit power allocation optimization variable for user $i$, flow $j$, subcarrier $n$ in the bitQoS-aware resource allocation framework with no flow merging
$c_{i,n}^j$    number of bits that can be carried on subcarrier $n$ for user $i$, flow $j$ in the bitQoS-aware resource allocation framework with no flow merging
$W$    system bandwidth
$\Delta f$    subcarrier spacing
$\mathcal{J}_{sys}$    set of indices to all $flow(i, j)$, $\forall i \in \mathcal{I}, j \in \mathcal{J}_i$
$J_{sys}$    number of flows in the system, defined as $J_{sys} = \sum_{i \in \mathcal{I}} J_i$
$U^{FMGS}(k)$    subcarrier-to-user vector at time $k$ for scheduling policies with FMGS
$V_n^{FMGS}(k)$    bit-to-flow vector for subcarrier $n$ at time $k$ for scheduling policies with FMGS
$H^{NFM}, H^{FM}, H^{FMGS}$    entropy of scheduling signaling information for NFM, FM and FMGS, respectively
$\Upsilon_w^{RLE}, \Upsilon_w^{LZW}$    number of bits required to represent a data block $w$ using RLE and LZW, respectively

Note: In this thesis, in order to distinguish a random variable from a sample value, the former is denoted by an uppercase letter, whereas the latter is denoted by a lowercase letter.
List of Acronyms

3G    Third Generation
3GPP2    3rd Generation Partnership Project 2
ACLS-FM    Adaptive Cross Layer Scheduling with Flow Multiplexing
ACLS-FUM    Adaptive Cross Layer Scheduling with Flow and User Multiplexing
AMC    Adaptive Modulation and Coding
BE    Best Effort
BER    Bit Error Rate
BS    Base Station
CDF    Cumulative Distribution Function
CDMA    Code Division Multiple Access
CSI    Channel State Information
CSDPS    Channel State Dependent Packet Scheduling
DPA    Default Packet Application
DRC    Data Rate Control
EF    Expedited Forwarding
FDD    Frequency Division Duplex
FER    Frame Error Rate
FIFO    First In, First Out
HARQ    Hybrid Automatic Repeat-reQuest
HOL    Head-of-Line
IP    Internet Protocol
KKT    Karush-Kuhn-Tucker
LHS    Left-Hand-Side
LTE    Long Term Evolution
LZW    Lempel-Ziv-Welch
M-LWDF    Modified Largest Weighted Delay First
MAC    Medium Access Control
MDU    Max-Delay-Utility
MFM    Multi-Flow Merging
MFPA    Multi-Flow Packet Application
MILP    Mixed-Integer Linear Programming
MINLP    Mixed-Integer Non-Linear Programming
MS    Mobile Station
MSO    Markov Service Option
MUP    Multi-User Packet
OFDM    Orthogonal Frequency Division Multiplexing
OFDMA    Orthogonal Frequency Division Multiple Access
OSI    Open Systems Interconnection
PDM    Packet Division Multiplexing
PDU    Protocol Data Unit
PF    Proportional Fair
QoS    Quality of Service
RA    Resource Allocation
RLE    Run-Length Encoding
RRM    Radio Resource Management
SNR    Signal-to-Noise Ratio
TCP    Transmission Control Protocol
TDD    Time Division Duplex
UMTS    Universal Mobile Telecommunications System
VoIP    Voice over Internet Protocol

Acknowledgments

I would like to take this opportunity to express my utmost gratitude and sincere appreciation to my supervisor, Dr. Cyril Leung, whose continued guidance, persistent encouragement and deep insight in the research area have helped me immeasurably throughout the course of my thesis research. This thesis would never have been written without his assistance.
I would also like to thank my supervisory committee members for their time and effort. I am deeply indebted to my family for their constant support and immense encouragement over the years.

This work was supported in part by the Natural Sciences and Engineering Research Council (NSERC) of Canada under Grant OGP0001731 and by the UBC PMC-Sierra Professorship in Networking and Communications.

To my parents and JHYC

Chapter 1

Introduction

1.1 Motivation

Radio spectrum is a scarce and expensive resource in wireless communications. This has led to extensive research in Radio Resource Management (RRM) with the objective of improving the achievable system capacity. While a large number of mature Resource Allocation (RA) algorithms for wireline networks have been studied [1, 2], they are not directly applicable to wireless networks due to distinct characteristics of the wireless channel such as user mobility, time-varying link capacity, high error rates, scarce bandwidth and power constraint of the Mobile Station (MS).

With the tremendous growth in the wireless communications industry, wireless networks are expected to provide always-on, seamless and ubiquitous wireless data services to a large number of users with different applications and different Quality of Service (QoS) requirements. The multimedia traffic is envisioned to be mostly Internet Protocol (IP) based and to be a mix of real-time traffic such as voice, videoconferencing and gaming, and non-real-time traffic such as web browsing, file transfers and messaging [3]. The expected increase in peak rate and throughput requirements will be achieved using a combination of wider channel bandwidths and increased spectral efficiency. QoS requirements will include minimum acceptable throughput, maximum latency and maximum delay jitter, maximum packet loss and packet error rates and a priori determined priority classes of users and applications.
In order to adapt to the time-varying wireless channel conditions and meet the diverse QoS requirements for a large number of users, wireless networks will need efficient and flexible packet scheduling algorithms.

1.2 Related Work

In this section, we review techniques for wireless resource allocation including scheduling algorithms that exploit multi-user and multi-channel diversities, cross layer resource allocation and resource allocation in Orthogonal Frequency Division Multiplexing (OFDM) networks.

1.2.1 Resource Allocation in Wireless Communication Systems

A common objective in RRM is to improve system capacity while meeting the diverse QoS requirements. While it is desirable that a scheduling algorithm should attempt to achieve key objectives that include efficient link utilization, fairness, throughput guarantees, low algorithm complexity, scalability and system stability [4, 5], some of these objectives are conflicting in nature. Hence, appropriate trade-offs need to be made to satisfy specific system service requirements. In [6], wireless scheduling algorithms that support the provision of QoS requirements for various types of broadband multimedia wireless networks are comprehensively surveyed, classified and examined. A Channel State Dependent Packet Scheduling (CSDPS) algorithm is proposed in [7], where the authors show that significant improvement in channel utilization can be achieved by deferring transmission of packets on a wireless link that is experiencing bursty errors, which reduces retransmissions and exploits channel diversity gains. However, the proposed CSDPS algorithm does not guarantee fairness to users and does not provide any bounds on packet delay.
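The deferral idea behind CSDPS can be illustrated with a short sketch. This is a simplification under stated assumptions and not the algorithm of [7]: each link is assumed to be in a binary good/bad state that the scheduler knows, and backlogged queues are scanned in round-robin order. The names `csdps_pick`, `queues` and `link_good` are hypothetical.

```python
from collections import deque

def csdps_pick(queues, link_good, start=0):
    """Pick the next queue to serve, deferring links in a bad state.

    queues    : list of deques holding each user's pending packets
    link_good : list of bools, True if that user's link is currently good
                (binary channel state is an assumption of this sketch)
    start     : round-robin starting position
    Returns the index of the first backlogged queue with a good link,
    or None if every backlogged queue has a bad link.
    """
    n = len(queues)
    for offset in range(n):
        i = (start + offset) % n
        if queues[i] and link_good[i]:
            return i
    return None

# User 0 is backlogged but in a bad state, so its packets are deferred.
queues = [deque(["a1", "a2"]), deque(["b1"]), deque()]
link_good = [False, True, True]
chosen = csdps_pick(queues, link_good)
```

In this sketch a user whose link stays bad is never selected, which mirrors the lack of fairness and delay bounds noted above.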
In [8], a Proportional Fair (PF) algorithm is proposed which exploits multi-user diversity to maximize system throughput on the forward link of a Code Division Multiple Access (CDMA) network by scheduling data transmission based on the relative channel quality of the competing users, while at the same time maintaining fairness across the entire competing user population. A user $i$ which has not transmitted for a long time due to a relatively low carrier-to-interference ratio gets its priority $Q_i(k)$ raised, where

$$Q_i(k) = \frac{CSI_i(k)}{\bar{v}_i(k)}. \tag{1.1}$$

In (1.1), $CSI_i(k)$ is the channel state information of user $i$ at time $k$, and $\bar{v}_i(k)$ is the average throughput of user $i$ over a time window up to time $k$. The forward link throughput performance of a cdma2000 1xEV-DO system employing the PF algorithm is presented in [9]. While the specifications for Third Generation (3G) networks do not specify the details of the scheduler, some form of PF scheduler is typically used.

Recent work in RRM has focused on supporting QoS of multimedia traffic. In [10], the delays are explicitly controlled by inclusion of the queue lengths in the scheduling algorithm. Token based rate control mechanisms are studied in [11] to provide minimum throughput guarantees. In [12], the PF algorithm is modified to take into account delay requirements of real-time data and it is shown that with the simple modifications, the scheduler can provide effective and fair service to both real-time and non-real-time data. QoS requirements are formulated as stochastic constraints in [13], where a general structure for opportunistic scheduling policies that exploit channel and buffer content variations is presented. Since issues of efficient and fair resource allocation have been well studied in economics, utility-based resource allocation and scheduling are studied in [5, 14] by quantifying resource use (bandwidth, power, etc.) or performance criteria (data rate, delay, etc.)
into corresponding price values and optimizing the established utility pricing system. In [15], an access scheme for multiplexing (from one session at each transmission) multimedia traffic over the air that can achieve absolute QoS guarantees in terms of Average Packet queuing Delay (APD), Packet Loss Rate (PLR), Packet Delay Variation (i.e. jitter) (PDV) and Packet Transfer Delay (PTD) for different service classes is proposed. Per-session guaranteed QoS for multimedia traffic is introduced in [10, 16] for scheduling of uplink and downlink flows.

1.2.2 Cross Layer Resource Allocation

Cross layer design is an interdisciplinary research area which involves signal processing, adaptive coding and modulation, channel modeling, traffic modeling, queuing theory, and network protocol design and optimization techniques [17]. As a wide variety of cross layer related designs have been studied in literature, we focus our literature survey on the general application of cross layer optimization in RRM. Fig. 1.1 shows a typical cross layer design model which attempts to optimize functionality across blurred delineation of layers.

[Figure 1.1: Cross Layer Design Model. Diagram labels: OSI layers (Application, Presentation, Session, Transport, Network, MAC, Physical); Quality of Service Requirements (Throughput, Latency, Jitter, Packet Error Rate, Packet Loss); Channel State Information (Signal-to-Noise); Radio Resource Management; Power Control.]

An important aspect of wireless communications is its dynamic behavior. While the conventional layered Open Systems Interconnection (OSI) model [18] has served communication system designers well in the past by exploiting the advantage of modularity in system design, the structure is inflexible, requiring the various layers to communicate in a strictly defined manner.
In most cases, layers are designed to operate in worst-case scenarios rather than adapting to conditions as they change, leading to inefficient use of both spectrum and energy. Evolving wireless networks are seriously challenging this design architecture, mandating the need for the various OSI layers to adapt to the channel variations and QoS requirements [19] and to be considered together [20–22] in order to provide more efficient methods of allocating network resources over the wireless network. In [3], an overview of the cross layer design paradigm shift is provided as wireless communication networks evolve from a circuit-switched to a packet-switched infrastructure. In [22], a general survey of the recent myriad of cross layer design proposals is presented along with a suggested definition and taxonomy for classifying cross layer designs. Open challenges to cross layer optimization are listed in [20, 22] to establish a platform upon which new research can be built.

An overview of cross layer design approaches for resource allocation is provided in [23], which proposes a cross layer design approach that exploits physical and application layer information to transmit real-time video over time-varying CDMA channels. Simulation results are presented to show the effectiveness of the proposed approach. In [24], information obtained from the fast power control algorithm is used to define a low complexity prioritization function to exploit short-term channel variations and to schedule transmissions for a Universal Mobile Telecommunications System (UMTS) downlink channel. Simulation results show improved system performance in terms of capacity and delay. In [25], an adaptive cross layer packet scheduler is introduced which minimizes a prescribed cost function given the current channel qualities and delay states of the packets in the queue.
It is shown that the cross layer scheduling algorithm outperforms both the weighted fair queuing (WFQ) and earliest deadline first (EDF) schedulers with respect to both packet delay and user throughput. While most recent papers tout the advantages of a cross layer approach to resource allocation for next generation wireless networks, a cautionary perspective is raised in [26], which points out a trade-off between performance gains and upkeep difficulty of system architecture violations introduced by cross layer designs.

1.2.3 Resource Allocation in OFDM Networks

OFDM is a promising technique for communication systems due to its high spectral efficiency and flexibility in dynamically allocating resources to multiple users. While spectral efficiency has improved significantly with the deployment of beyond 3G OFDM-based cellular air interfaces [27, 28], unallocated radio spectrum is scarce in most populated regions. The problem of dynamic bit-loading, transmission power allocation and subcarrier assignment for multi-user OFDM systems has attracted a great deal of interest. It is shown in [29] that the system efficiency can be significantly improved by allocating the power and subcarriers based on knowledge of the users’ channel qualities. In [30], it is shown that the downlink system throughput is maximized when each subcarrier is assigned to the user with the best channel gain on that subcarrier and power is then allocated to the subcarriers using the water-filling algorithm. However, fairness among users is not considered in [29, 30] and it is possible that when the path loss differences are large among users, the users experiencing poor channel gains for an extended period of time may be starved. In [31], the optimal subcarrier assignment is formulated as a max-min convex optimization problem to maximize the worst user’s capacity.
However, since the max-min approach deals with the worst-case scenario in which the smallest user capacity is maximized, thereby ensuring that all users achieve similar data rates, it penalizes users with better channels and reduces system efficiency. In [32], a set of proportional rate constraints is introduced into the throughput maximization problem to allow each user to achieve a required data rate.

The above-mentioned works exploit multi-user and multi-channel diversities to maximize system throughput and/or minimize total transmit power. However, they do not consider application QoS requirements which allow users to subscribe to the different levels of service available in contemporary wireless networks [27, 28, 33]. Radio RA algorithms that take QoS information of different traffic classes from the application layer and channel information from the physical layer into consideration to exploit multi-flow (concurrent applications with different QoS requirements) diversity have been studied for mixed-traffic networks [34–41]. The Modified Largest Weighted Delay First (M-LWDF) [35, 42] is a throughput-optimal algorithm that exploits multi-user diversity across time by buffering bursty traffic and improves throughput performance by trading delay for throughput. It provides QoS for data users by ensuring a minimum throughput guarantee and maintaining delays smaller than a predetermined threshold with a given probability.

RA algorithms based on M-LWDF with buffer and channel information have been studied in [36, 37, 39, 40]. In [36, 37], the authors consider a mixed-traffic environment and propose a utility-based cross layer RA framework in which utility functions are used to represent application QoS requirements. Based on this framework, a Max-Delay-Utility scheduling policy, hereafter referred to as MDU, is proposed in [38].
MDU aims to maximize the aggregate utility with respect to the user average waiting time while taking into account channel conditions and data queue information. An urgency and efficiency based packet scheduling algorithm is proposed in [39] to support both real-time and non-real-time traffic. The aim is to maximize the throughput of non-real-time traffic while satisfying the QoS requirements of real-time traffic by serving non-real-time traffic until the real-time packets approach their deadlines. In [40], the different traffic classes are handled separately by considering Head-of-Line (HOL) packet waiting time for real-time traffic and the queue length for non-real-time traffic. In [41], the authors present a joint bit rate, subcarrier and power allocation problem which takes into consideration limits on the subcarrier transmit power in addition to an overall system power constraint.

1.3 Objectives and Contributions

Wireless communications, in particular CDMA and Orthogonal Frequency Division Multiple Access (OFDMA) cellular networks, has emerged as one of the largest sectors of the telecommunications industry and one of the most promising growth areas into the next decade. To meet the challenges of deploying an efficient wireless multimedia network, it is useful to consider network functions (i.e., the various OSI layers) together when designing the network to take into account QoS requirements at the Medium Access Control (MAC) layer where the scheduling and RA algorithms reside.
As the scarce radio spectrum is shared by a large number of users, the research objective in this thesis is to design and analyze efficient and practical adaptive cross layer (physical, MAC and application layers) RA algorithms for single-carrier CDMA communication systems and multi-carrier OFDMA communication systems that jointly consider the physical layer time-varying channel conditions as well as application layer QoS requirements so as to more efficiently utilize the radio spectrum. In addition to exploiting multi-user and multi-channel diversities as in existing studies, we increase the flexibility and granularity of the RA algorithms by exploiting multi-application and multi-bit diversities to take advantage of the mechanisms and optimization features introduced in the air interfaces [27, 43].

In particular, for CDMA communication systems, we develop RA algorithms with flow and user multiplexing to take advantage of the flow-oriented QoS approach and Packet Division Multiplexing (PDM) to provide a unified approach to intra-user (between flows of a user) and inter-user (between users) QoS and to permit the Base Station (BS) to serve multiple users in the same physical layer encoder packet, respectively.

Cellular networks are adopting OFDM as a modulation scheme due to its high spectral efficiency and flexibility in dynamically allocating resources to multiple users. For OFDMA communication systems, since data is loaded onto subcarriers in units of bits, we consider QoS at the bit level rather than at the flow level as in existing studies, and define a bitQoS function which maps the QoS parameters of an application bit into a numerical value. We establish a bitQoS-aware RA framework which adaptively matches the QoS requirements of the user application bits to the characteristics of the OFDM subcarriers in a mixed-traffic environment.
The proposed bitQoS-aware RA framework is formulated as an optimization problem with the objective of finding the joint subcarrier, power and bit assignment to maximize the total bitQoS-weighted throughput, subject to the total power constraint. However, as the formulated optimization problem is a Mixed-Integer Non-Linear Programming (MINLP) problem whose solution is computationally complex given the large number of subcarriers and users in a practical system, we demonstrate the performance gains achievable from the proposed framework with suboptimal algorithms using water-filling and bit-loading approaches. We then formulate the bitQoS RA framework as a convex optimization problem and use the Karush-Kuhn-Tucker (KKT) conditions to develop efficient algorithms to obtain optimal and near-optimal solutions to the joint subcarrier, power and bit allocation problem with continuous and discrete rate adaptation, respectively. To assess the viability of the bitQoS-aware RA framework, we formulate a model to determine and analyze the scheduling signaling overhead, including the scheduling signaling information entropy, and consider different schemes to compress the associated control signaling. The computational complexities of the proposed RA algorithms are also assessed for deployment consideration in practical networks.

1.4 Thesis Overview

The thesis is organized as follows: RA algorithms for single-carrier CDMA communication systems are studied in Chapters 2 and 3 and RA algorithms for multi-carrier OFDMA communication systems are studied in Chapters 4-8. The structure of the thesis is illustrated in Fig. 1.2.

In Chapter 2, we exploit multi-application diversity in flow-based single-carrier CDMA communication systems and quantify the performance gains obtainable with Multi-Flow Merging (MFM) in terms of user throughput, user latency and user packet drop probability.
In addition, we incorporate the concept of MFM and propose an adaptive cross layer (physical, MAC and application layers) scheduling policy which further takes into account the time-varying channel conditions from the physical layer and includes QoS requirements from the application layer.

In Chapter 3, we extend the scheduling policy proposed in Chapter 2 to take into account PDM introduced in cdma2000 1xEV-DO Revision A. PDM permits the BS to service multiple users in the same physical layer encoder packet in a single time slot with the use of Multi-User Packet (MUP) transmission. We consider a mix of real-time voice services and non-real-time data applications and study the improvements in packing efficiency and latency performance. The QoS performance gains with flow and user multiplexing are quantified in terms of user throughput, user latency, user packet drop probability and user jitter in a mixed-traffic environment.

In Chapter 4, we propose a bitQoS-aware RA framework which exploits multi-bit diversity in addition to multi-application diversity to increase the flexibility and granularity of the RA algorithms in multi-carrier OFDMA communication systems. The proposed bitQoS-aware RA framework is formulated as two optimization problems, with no flow merging and with flow merging, with the objective of finding the joint subcarrier, power and bit assignment to maximize the total bitQoS-weighted throughput subject to the total power constraint. The system model, which includes the network model and traffic classes, is described, and the performance evaluation methodology along with the comparative schemes used to assess the performance of the proposed bitQoS-aware RA framework are presented.

[Figure 1.2: Structure of the Thesis. Single-carrier CDMA systems (exploit multi-user and multi-flow diversities): Chapter 2, Multi-flow Merging; Chapter 3, Packet Division Multiplexing. Multi-carrier OFDMA systems (exploit multi-user, multi-channel, multi-flow and multi-bit diversities): Chapter 4, Framework: BitQoS-aware RA Framework; Chapter 5, Performance: Water-filling- and Bit-loading-based Scheduling Policies; Chapter 6, Viability: Scheduling Signaling Overhead; Chapter 7, Efficiency and Optimality: Efficient Optimal and Near-optimal Algorithms; Chapter 8, Practicality: Computational Complexity.]

In Chapter 5, we evaluate the performance of the bitQoS-aware RA framework and propose two iterative subcarrier-power-bit allocation algorithms, one based on the water-filling approach and the other on the bit-loading approach, to quantify the achievable performance gains. In addition, the potential performance gains from allowing bits from different application flows of a user to be merged into a single OFDM subcarrier are examined. The performance gains obtainable are quantified in terms of system throughput, user throughput, user latency, user jitter and user packet drop probability for systems under different loads.

In Chapter 6, we establish the viability of the bitQoS-aware RA framework by taking into account the scheduling signaling overhead associated with the increased scheduling granularity of the proposed bitQoS-aware RA framework.
This is critical since valuable resources that could otherwise be used to transmit application bits need to be reserved for control signaling. We formulate a scheduling signaling overhead model to analyze the scheduling signaling information required and consider different schemes to compress the scheduling signaling information. To assess the tradeoff between the scheduling gain and the increased scheduling signaling overhead of the proposed bitQoS-aware RA framework, the effective throughput gains (with the scheduling signaling overhead taken into account) are quantified. In Chapter 7, with the performance gains and viability of the proposed bitQoS-aware RA framework established in Chapters 5 and 6, we use the KKT conditions to establish necessary and sufficient optimality conditions and develop efficient algorithms to obtain optimal and near-optimal solutions to the joint subcarrier, power and bit allocation problem with continuous and discrete rate adaptation, respectively. The performance of the proposed KKT-based algorithms is evaluated in terms of their closeness to optimality and computation time. In addition, the sensitivities of the objective value and computation time to tuning parameters in the KKT-based algorithms are also discussed. In Chapter 8, we assess the computational complexity of the scheduling policies proposed for the bitQoS-aware RA framework and evaluate their practicality for real-time resource allocation in Long Term Evolution (LTE), an OFDM-based air interface. In Chapter 9, the main contributions of the thesis and suggestions for future research are presented. 
Chapter 2

Flow Multiplexing in Single-carrier CDMA Systems¹

2.1 Introduction

Much of the existing work in RRM has focused on exploiting multi-user (channel) diversity and, more recently, multi-application (flow) diversity. In this chapter, we present the performance gains of MFM in scheduling and propose an adaptive cross layer scheduling policy which takes into account the time-varying Channel State Information (CSI) from the physical layer and includes QoS requirements from the application layer. The policy is realizable in a framework such as that provided in cdma2000 1xEV-DO Revision A [43]. We refer to this as the Adaptive Cross Layer Scheduling with Flow Multiplexing (ACLS-FM) scheduling policy.

This chapter is organized as follows: in Section 2.2, we briefly describe the enhancements to the cdma2000 1xEV-DO Revision A air interface and the included Multi-Flow Packet Application (MFPA). Section 2.3 presents the system model that includes the network model, traffic classes and data buffer parameters. The concept of MFM is illustrated in Section 2.4 along with an example scheduling policy. The ACLS-FM scheduling policy is described in Section 2.5. Simulation results are presented in Section 2.6 and the main findings are summarized in Section 2.7.

¹ The material in this chapter is based on the following:
C. E. Huang and C. Leung, “Multi-flow merging gain in scheduling for flow-based wireless networks,” in Proc. IEEE PACRIM, Aug. 2007, pp. 553–556. © 2007 IEEE. http://dx.doi.org/10.1109/PACRIM.2007.4313296
C. E. Huang and C. Leung, “Adaptive cross layer scheduling with flow multiplexing,” in Proc. IEEE WCNC, Mar. 2008, pp. 1871–1876. © 2008 IEEE. http://dx.doi.org/10.1109/WCNC.2008.333
2.2 Background on cdma2000 1xEV-DO

The cdma2000 1xEV-DO (1x Evolution-Data Optimized) air interface is an evolution of the cdma2000 family of 3G mobile telecommunications air interfaces, standardized by the 3rd Generation Partnership Project 2 (3GPP2), that utilizes CDMA to provide high-speed packet data services to wireless users. However, unlike the other variants of CDMA based systems such as IS-95 [44], where the forward link transmit power is shared among all active mobiles within a sector to maintain simultaneous, continuous voice channels, cdma2000 1xEV-DO systems time-division multiplex (TDM) the forward link data transmission and transmit at full power to produce the highest possible energy per bit to noise ratio ($E_b/N_0$) to each active mobile. This allows the base station to transmit user data at the highest data rate supported by the time-varying wireless channel, which the mobile station (MS) determines from the pilot channel carrier-to-interference ratio [45]. The reverse link remains similar to that of IS-95 and utilizes code division multiplexing.

1xEV-DO Release 0 provides a peak physical layer data rate of 2.4 Mbps in the forward link and 153.6 kbps in the reverse link [46]. Forward link data is transmitted in successive $26\tfrac{2}{3}$ ms frames, which are divided into sixteen $1\tfrac{2}{3}$ ms slots in which packets of data are transmitted. The transmission duration of a single packet may vary from 1 to 16 slots.

The successor to 1xEV-DO Release 0 is 1xEV-DO Revision A [43], which includes enhancements that provide significant gains in spectral efficiency and substantial QoS support for inter-user (between users) and intra-user (between flows of a user) QoS in both the forward and reverse links.
In addition to a rich variety of link adaptation techniques, such as power control, data rate control and Adaptive Modulation and Coding (AMC), 1xEV-DO Revision A makes use of higher order modulation and Hybrid Automatic Repeat-reQuest (HARQ) to achieve higher peak data rates of 3 Mbps in the forward link and 1.8 Mbps in the reverse link [43, 47]. HARQ reduces the effects of power control imperfections due to variations in channel state and multiple-access interference to achieve higher reverse link spectral efficiency via early termination of physical packet transmissions, leading to improved throughput and reduced packet delay. Shorter packets, along with finer rate quantization, multi-user packet (MUP) transmission, and uninterrupted data transfer during forward link cell switching contribute to lower latency. In the reverse traffic channel MAC, key QoS-sensitive support includes efficient support for latency-sensitive and delay-tolerant applications, resource allocation among flows associated with an MS and MAC layer ARQ [43, 48]. A flow is an octet stream that can be used to carry packets between the MS and the BS. While some of these features increase system throughput and spectral efficiency, others improve the operator’s ability to guarantee acceptable latency performance for delay sensitive applications such as interactive voice and video, and still others provide a mechanism for application coexistence.

In particular, we highlight the following features of 1xEV-DO Revision A that are considered in our research. 1xEV-DO Release 0 systems support per flow QoS on the forward link and per MS QoS on the reverse link through the Default Packet Application (DPA). The DPA consists of a link layer protocol that provides octet retransmissions and duplicate detection, a location update protocol that provides mobility between data service networks and a flow control protocol that provides flow control of data traffic [46].
There is no differentiation of packets from different applications with different QoS requirements. In 1xEV-DO Revision A, a flow-oriented QoS approach [43, 49] is adopted and provides a unified approach to inter-user and intra-user QoS. MFPA is included and provides multiple octet streams that can be used to carry octets between the mobile station and base station. MFPA, along with the reverse link multi-flow MAC with per-flow QoS support, provides the framework for the exploitation of MFM gain in both the forward and reverse links. Packets from latency-sensitive flows that arrive later at the base station following a large packet from a delay-tolerant flow can be transmitted first instead of being transmitted in the order of arrival, hence reducing latency and jitter for multimedia traffic.

2.3 System Model

The network model, traffic classes and data buffer parameters used are described in this section.

2.3.1 Network Model

We consider a 1xEV-DO Revision A-like packet cellular network random discrete-event model, as shown in Fig. 2.1, consisting of one BS servicing $I$ MSs. Let $\mathcal{I} = \{1, \ldots, I\}$ be the set of all users (MSs). Each MS $i$ can have up to $J_i$ data queues (flows) and let $\mathcal{J}_i = \{1, \ldots, J_i\}$ be the set of all flows. Forward link scheduling is centralized at the BS which communicates with all MSs. At each time slot $k$, where $k \in \mathbb{Z}^+ = \{1, 2, \ldots, K\}$, we assume that only one user is scheduled and that the scheduling decision time is negligible. Power control is not enabled in the forward link for 1xEV-DO systems and the BS transmits at full power to the MSs in all time slots. We assume that packets are received without errors, i.e. 0% Frame Error Rate (FER), between the BS and MS. This simplifying assumption is made to illustrate the potential gains that ACLS-FM can provide. The BS is assumed to have knowledge of the channel state information, $\mathrm{CSI}_i(k)$, for each MS $i$ at time $k$, queue status and QoS requirements for all the data queues.
The service rate for a user during time slot $k$ is a function of the channel quality which is characterized by its received Signal-to-Noise Ratio (SNR) in time slot $k$. For simplicity, we assume that the physical layer encoder packet size, $\mathrm{EPSize}_i(k)$, is chosen according to a uniform distribution from the set of eight discrete physical layer encoder packet sizes, where $\mathrm{EPSize} \in \mathcal{E} = \{128, 256, 512, 1024, 2048, 3072, 4096, 5120\}$ bits. Let $\mu \triangleq \mathrm{EPSize}/S \in \{4.8, \ldots, 3072\}$ kbps be the effective service rate, where $S \in \mathcal{S} = \{1, 2, 4, 8, 16\}$ time slot(s) of 1.667 ms duration.

[Figure 2.1: Forward Link Scheduler Model. Diagram labels: Users 1 to $I$, each with Flows 1 to $J_I$; Scheduler; Transceiver; Channel State Monitor / Predictor; MS 1 to MS $I$.]

2.3.2 Traffic Classes

Two traffic classes are considered: Best Effort (BE) traffic class representing Internet browsing-like applications and Expedited Forwarding (EF) traffic class representing Voice over Internet Protocol (VoIP)-like applications. We use the web browsing traffic arrival model in [50] to represent incoming BE traffic. The application layer Protocol Data Unit (PDU) is based on a truncated Pareto distribution with a mean of 25 kBytes and minimum and maximum sizes of 4.5 kBytes and 2 MBytes respectively. The application layer PDU interarrival time is geometrically distributed with a mean of 5 sec. For the EF traffic class, we use the VoIP traffic arrival model in [50]. In contrast to the web browsing model, source configuration and source files are used to generate VoIP traffic. The source file is generated based on the Markov Service Option (MSO) model IS-871 with alterations as detailed in [50]. The application layer PDU size and interarrival time have a mean of 152.4 bits and 0.04 sec, respectively. The average traffic arrival rates $\lambda_{BE}$ and $\lambda_{EF}$ are 40.0 kbps and 3.7 kbps per application respectively.
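As a quick check on the service rate set of Section 2.3.1, the following minimal sketch enumerates $\mathrm{EPSize}/S$ over all encoder packet sizes and slot counts and converts the result to kbps. It assumes a slot duration of exactly 5/3 ms (the 1.667 ms quoted above); the function and variable names are illustrative only and are not taken from the thesis simulator.

```python
from fractions import Fraction

# Encoder packet sizes (bits) and slot counts from Section 2.3.1.
EP_SIZES_BITS = [128, 256, 512, 1024, 2048, 3072, 4096, 5120]
SLOT_COUNTS = [1, 2, 4, 8, 16]
# Assumption: the 1.667 ms slot is exactly 5/3 ms, kept as an exact fraction (seconds).
SLOT_SECONDS = Fraction(5, 3000)


def service_rate_kbps(ep_size_bits, slots):
    """Effective service rate EPSize / S, converted to kbps."""
    bits_per_second = Fraction(ep_size_bits) / (slots * SLOT_SECONDS)
    return float(bits_per_second / 1000)


# One rate per (encoder packet size, slot count) pair.
rates = {(e, s): service_rate_kbps(e, s) for e in EP_SIZES_BITS for s in SLOT_COUNTS}
lowest = min(rates.values())    # smallest packet sent over 16 slots
highest = max(rates.values())   # largest packet sent in a single slot
```

Under these assumptions, the smallest and largest entries of `rates` reproduce the 4.8 kbps and 3072 kbps end points quoted for the effective service rate.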
2.3.3 Data Buffer Parameters

The key data buffer parameters are as follows:

Queue length: $B_i^j(k) \in \mathbb{Z} = \{0, 1, 2, \ldots\}$ denotes the queue length, in packets, of the data buffer for user $i$, flow $j$ at time $k$. We assume that the size of the data buffer itself is infinite (i.e. no packet blocking). Packets arriving in the data queues are chronologically ordered and serviced in a First In, First Out (FIFO) fashion. Packets in the data buffer are indexed by $z$, $z \in \{1, \ldots, B_i^j(k)\}$.

Packet size: $d_i^{j,z}(k)$ denotes the size, in bits, of the packet at position $z$ in the data buffer for user $i$, flow $j$ at time $k$.

Waiting time: $w_i^{j,z}(k) \in (0, \infty)$ denotes the amount of time, in seconds, that the packet at position $z$ in user $i$, flow $j$ buffer has waited. Each packet is time stamped upon arrival in the data buffer, and the waiting time is found by simply subtracting the arrival time from the current time $k$. Packets are dropped if $w_i^{j,z}(k)$ exceeds the flow scheduling delay threshold $T_j \in \mathbb{R}^+$.

Flow priority: $\pi_j(k) \in \mathbb{R}^+$ denotes the intra-user QoS requirement of flow $j$ at time $k$. $\pi_j(k)$ is a function of $k$ to allow for time-varying intra-user priority changes. However, $\pi_j(k)$ is not a function of user $i$ as it is assumed that flows of the same applications have the same QoS requirement.

2.4 Multi-flow Merging

An illustration of MFM for a user $i$ having up to $J_i$ data buffers (flows) is shown in Fig. 2.2. Each application layer PDU is segmented into a number of packets. Packets from different flows can be multiplexed into the same physical layer encoder packet of size $\mathrm{EPSize}_i(k)$ for transmission at time $k$.

[Figure 2.2: Illustration of MFM. Diagram labels: Flow 1, Flow 2, ..., Flow $J_i$ at the MAC layer, each with queue length $B_i^j(k)$ (user $i$, flow $j$, packet $z$); Merge; PHY encoder packet of size $\mathrm{EPSize}_i(k)$.]

2.4.1 Multi-Flow Merging Scheduling Policy

To explore the benefits of MFM, we extend the existing PF scheduling policy to allow transmission of packets from multiple data queues using a single physical layer encoder packet and refer to this as the MFM scheduling policy. The MFM scheduling policy consists of two steps: in Step 1, similar to PF, a user is selected based on the ratio of its $\mathrm{CSI}_i(k)$ and corresponding running average throughput $v_i(k)$ over the last $N_{window}$ time slots at time $k$; in Step 2, packets from the multiple data queues of the selected user are merged into the physical layer encoder packet. More specifically,

Step 1: Let $Q_i(k)$ denote the priority of user $i$ at scheduling period $k$:

\[ Q_i(k) = \begin{cases} \dfrac{\mathrm{CSI}_i(k)}{v_i(k)} & \text{if } \sum_{j=1}^{J_i} B_i^j(k) > 0 \\ 0 & \text{otherwise.} \end{cases} \tag{2.1} \]

A user with no data to send is assigned a priority of 0 and is ignored in the selection process. The user to be scheduled at time $k$ is determined as:

\[ i^*(k) = \arg\max_{i \in \mathcal{I}} Q_i(k). \tag{2.2} \]

Step 2: Packets are selected, one at a time in an iterative fashion, from the data queues of user $i^*$ and added to the physical layer encoder packet until either the physical layer encoder packet of size $\mathrm{EPSize}_{i^*}(k)$ is filled or there are no more packets in the data queues. The probability $p_j(\tau)$ that a packet is selected from flow $j$ at iteration $\tau$ is set to:

\[ p_j(\tau) = \frac{B_{i^*}^j(k)}{\sum_{j=1}^{J_i} B_{i^*}^j(k)}, \quad \forall j \in \mathcal{J}_i. \tag{2.3} \]

Thus the probability of merging a packet from flow $j$ is given by the ratio of its queue length to the sum of all data queue lengths for user $i^*$. Other schemes are possible, ranging from simple ones such as a deterministic longest-queue-first (LQF) scheme to more complex merging schemes. This simple probabilistic scheme was chosen as an example to highlight the realizable gains from multi-flow merging whilst taking into account possible data queue starvation due to the different average data arrival rates in a mixed traffic (BE + EF) environment.
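The two steps of the MFM scheduling policy can be sketched as follows. This is only an illustrative sketch, not the simulator used in the thesis: the function names and the list-based queue representation are hypothetical, the average throughput is assumed to be strictly positive, and the merge is assumed to stop as soon as the drawn head-of-line packet does not fit in the remaining encoder packet space (the text does not say how a non-fitting packet is handled).

```python
import random


def select_user(csi, avg_throughput, queue_lengths):
    """Step 1, (2.1)-(2.2): index of the user with the largest priority, or None."""
    best_user, best_priority = None, 0.0
    for i, queues in enumerate(queue_lengths):
        # A user with no backlog is assigned priority 0 and is ignored.
        priority = csi[i] / avg_throughput[i] if sum(queues) > 0 else 0.0
        if priority > best_priority:
            best_user, best_priority = i, priority
    return best_user


def merge_flows(queues, ep_size_bits, rng):
    """Step 2, (2.3): fill one encoder packet from per-flow FIFO queues.

    queues[j] is a list of packet sizes in bits, head of line first.
    Returns the (flow, packet_size) pairs that were merged, in order.
    """
    merged, remaining = [], ep_size_bits
    while any(queues):
        lengths = [len(q) for q in queues]
        # Flow j is drawn with probability B_j / sum of all queue lengths.
        j = rng.choices(range(len(queues)), weights=lengths)[0]
        size = queues[j][0]
        if size > remaining:
            break  # assumption: stop when the drawn packet does not fit
        merged.append((j, queues[j].pop(0)))
        remaining -= size
    return merged
```

For example, `select_user([2.0, 4.0, 8.0], [1.0, 1.0, 1.0], [[1, 0], [2, 1], [0, 0]])` skips the third user, whose queues are empty despite its high CSI, and picks the second. Because the flow is drawn in proportion to queue length, a long BE queue is served more often than a short EF queue, but no non-empty queue has zero selection probability.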
2.5 Adaptive Cross Layer Scheduling with Flow Multiplexing Scheduling Policy

The proposed Adaptive Cross Layer Scheduling with Flow Multiplexing (ACLS-FM) scheduling policy consists of a packet urgency function (to meet latency requirements), a packet priority function (for intra-user QoS adjustments), a flow merging policy (to determine which flows and how many bits from each flow to service) and a user selection policy (to fairly schedule users).

2.5.1 Packet Urgency Function

The packet urgency function allows a packet from a latency-sensitive application flow to have its service priority raised when its waiting time exceeds a predetermined threshold. Let $u_i^{j,z}(k) \in \mathbb{R}^+$ denote the packet urgency (PU) value of user $i$, flow $j$, packet $z$ at time $k$. The PU value is given by the following packet urgency function

\[ u_i^{j,z}(k) = c_j \, \xi_j^{\left(w_i^{j,z}(k) - \eta_j\right)}, \tag{2.4} \]

where $\xi_j \in \mathbb{R}^+$ is the urgency base and $c_j \in \mathbb{R}^+$ is the scaling factor for flow $j$. The parameter $\eta_j \in \mathbb{R}^+$ is the comfort latency threshold and is generally set to a value that is less than the flow scheduling delay threshold $T_j$. An illustration of the PU functions for BE and EF traffic is shown in Fig. 2.3. In the region where $w_i^{j,z}(k) \le \eta_{EF}$, the BE traffic has a higher PU value than the EF traffic; this is meant to reduce BE traffic backlog, if necessary.

[Figure 2.3: BE and EF Packet Urgency Functions. Axes: waiting time $w$ (horizontal) and packet urgency $u$ (vertical), with $\eta_{EF}$, $T_{EF}$, $\eta_{BE}$ and $T_{BE}$ marked and a region labelled "BE priority > EF priority".]

2.5.2 Packet Priority Function

Let $\psi_i^{j,z}(k) \in [0, 1]$ denote the packet priority (PP) value of user $i$, flow $j$, packet $z$ at time $k$. The PP value is calculated using the following packet priority function

\[ \psi_i^{j,z}(k) = \pi_j(k)^{o_\pi} \, d_i^{j,z}(k)^{o_d} \, u_i^{j,z}(k)^{o_u}, \tag{2.5} \]

where $o_\pi$, $o_d$, and $o_u \in \mathbb{R}^+$ are non-negative weighting constants. Each component of $\psi_i^{j,z}(k)$ is normalized to its maximum value: $\pi_j(k)$ is normalized to $\max(\pi_j(k))\ \forall j$, $d_i^{j,z}(k)$ is normalized to $\max(d_i^{j,z}(k))\ \forall i, j, z$ and $u_i^{j,z}(k)$ is normalized to $\max\left(c_j \xi_j^{(T_j - \eta_j)}\right)\ \forall j$.
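The packet urgency function (2.4) and packet priority function (2.5) can be evaluated with a few lines of code. The sketch below is illustrative only: the parameter values ($c_j$, $\xi_j$, $\eta_j$, $T_j$, the weighting exponents and the normalized inputs) are hypothetical choices for a single latency-sensitive flow and are not the values used in the thesis.

```python
def packet_urgency(wait_s, c_j, xi_j, eta_j):
    """Packet urgency (2.4): u = c_j * xi_j ** (w - eta_j)."""
    return c_j * xi_j ** (wait_s - eta_j)


def packet_priority(pi_norm, d_norm, u_norm, o_pi=1.0, o_d=1.0, o_u=1.0):
    """Packet priority (2.5) from components already normalized to [0, 1]."""
    return (pi_norm ** o_pi) * (d_norm ** o_d) * (u_norm ** o_u)


# Hypothetical parameter values for one latency-sensitive flow (seconds).
c_j, xi_j, eta_j, t_j = 1.0, 10.0, 0.040, 0.070

u_at_eta = packet_urgency(eta_j, c_j, xi_j, eta_j)  # equals c_j when w = eta_j
u_max = packet_urgency(t_j, c_j, xi_j, eta_j)       # urgency at the drop threshold T_j
u_norm = packet_urgency(0.055, c_j, xi_j, eta_j) / u_max

# Normalized flow priority 1.0 and normalized packet size 0.5 (both hypothetical).
psi = packet_priority(pi_norm=1.0, d_norm=0.5, u_norm=u_norm)
```

With an urgency base greater than 1, the urgency grows exponentially with waiting time, and dividing by the value at $w = T_j$ keeps the normalized urgency, and hence the packet priority, within $[0, 1]$ for packets that have not yet been dropped.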
Flow Merging Policy

The objective of the flow merging policy is to merge packets from the different flow data buffers of a given user $i$ into a physical layer encoder single user packet at time $k$ such that the sum PP value of the selected packets is maximized subject to the $EPSize_i(k)$ and FIFO packet service constraints. The flow merging policy is formulated as follows:

$$\begin{aligned}
\text{OP2.1:} \quad \max_{a_i^{j,z}(k)} \quad & \sum_{j=1}^{J_i} \sum_{z=1}^{B_i^j(k)} \psi_i^{j,z}(k) \, a_i^{j,z}(k) \\
\text{s.t.} \quad & \sum_{j=1}^{J_i} \sum_{z=1}^{B_i^j(k)} d_i^{j,z}(k) \, a_i^{j,z}(k) \le EPSize_i(k), \\
& a_i^{j,z}(k) \in \{0, 1\}, \quad \forall j \in \mathcal{J}_i, \; \forall z \in \{1, \ldots, B_i^j(k)\}, \\
& a_i^{j,z}(k) \le a_i^{j,z'}(k), \quad \forall z' \le z,
\end{aligned} \tag{2.6}$$

where the binary variable $a_i^{j,z}(k) = 1$ if user $i$, flow $j$, packet $z$ is selected, and $a_i^{j,z}(k) = 0$ otherwise. The constraint $a_i^{j,z}(k) \le a_i^{j,z'}(k), \; \forall z' \le z$ ensures that the packets in any data buffer are serviced in a FIFO fashion: a packet can be selected only if all packets ahead of it in the same buffer are also selected. The optimal solution is denoted by $A^*$, where $A^*$ is a binary matrix consisting of the elements $a_i^{j,z}(k)$ that maximize the PP sum of the objective function in OP2.1.

To obtain the optimal solution $A^*$, we first determine the set of unique feasible solutions, denoted by $\mathcal{Y}$, where each element $y$ is a vector consisting of $J_i$ elements. The $j$th element in $y$ represents the number of data packets selected from flow $j$ that satisfies the constraints of the optimization problem formulated in OP2.1. The optimal solution $A^*$ is then obtained from the $y^* \in \mathcal{Y}$ which maximizes the objective function. In the event of a tie, $y^*$ is selected randomly with equal probabilities. The set $\mathcal{Y}$ can be iteratively determined using $J_i$ nested loops. The loop counter for each nested loop $j$ ranges over $\{0, \ldots, B_i^j(k)\}$ and represents the number of data packets selected from flow $j$. A loop terminates when the total size of the selected data packets exceeds $EPSize_i(k)$.

Let $U_i(k)$ denote the maximal sum packet priority (MSPP) value for user $i$ at time $k$ attained by $A^*$, i.e.

$$U_i(k) = \sum_{j=1}^{J_i} \sum_{z=1}^{B_i^j(k)} \psi_i^{j,z}(k) \, a_i^{j,z}(k), \quad a_i^{j,z}(k) \in A^*.$$
(2.7)

2.5.4 User Selection Policy

Let $Q_i(k)$ denote the priority of user $i$ at scheduling period $k$:

$$Q_i(k) = \kappa_i \, \frac{CSI_i(k)^{\alpha} \, U_i(k)^{\beta}}{\bar{v}_i(k)^{\varepsilon}}, \tag{2.8}$$

where $CSI_i(k)$ is the channel state information, $U_i(k)$ is the MSPP value and $\bar{v}_i(k)$ denotes the running average throughput over the last $N_{window}$ time slots for user $i$ at time $k$. The parameter $\kappa_i \in \mathbb{R}^+$ can be used to establish relative user priorities and $\alpha, \beta, \varepsilon \in \mathbb{R}^+$ are non-negative weighting constants. The user $i^*$ to be scheduled at time $k$ is determined as:

$$i^*(k) = \arg\max_{i \in \mathcal{I}} Q_i(k). \tag{2.9}$$

2.6 Simulation Results

The MFM and ACLS-FM scheduling policies described in Sections 2.4.1 and 2.5 were simulated in Matlab using the system model described in Section 2.3. Simulation results were obtained for BE only traffic, EF only traffic and mixed traffic (BE + EF) scenarios with system loading factor, $\rho \triangleq \lambda/\mu$, values of 0.10, 0.50 and 0.90, where

$$\lambda = \sum_{i=1}^{I} \sum_{j=1}^{J_i} \lambda_i^j \quad \text{and} \quad \mu = \frac{1}{|\mathcal{S}| \, |\mathcal{E}|} \sum_{S \in \mathcal{S}} \sum_{EPSize \in \mathcal{E}} \frac{EPSize}{S},$$

and $|\cdot|$ is the cardinality of a set. For the mixed traffic scenario, an equal number of BE and EF traffic flows were simulated. To achieve the desired system loading factor, $I$ and $J_i$ were varied. The simulation parameter values are listed in Table 2.1. The scheduling delay thresholds were set at 3.0 sec for BE traffic class (to avoid Transmission Control Protocol (TCP) retransmissions) [51] and 0.070 sec for EF traffic class (to achieve a “Users Satisfied” mouth-to-ear delay rating) [52].
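As a rough illustration of how the four ACLS-FM components of Section 2.5 fit together, the sketch below implements (2.4), (2.5), a brute-force version of OP2.1 and (2.8) in Python. It is a sketch under stated assumptions rather than the simulation code: the function names and data layout are hypothetical, the flow merging step enumerates per-flow packet counts with `itertools.product` instead of hand-written nested loops with early termination, ties keep the first candidate instead of being broken at random, and the inputs to the PP function are assumed to be normalized already.

```python
from itertools import product

def packet_urgency(w, c, xi, eta):
    """PU function (2.4): c * xi ** (w - eta), with w the packet waiting time."""
    return c * xi ** (w - eta)

def packet_priority(pi, d, u, o_pi=1.0, o_d=1.0, o_u=1.0):
    """PP function (2.5); pi, d and u are assumed to be normalized already."""
    return (pi ** o_pi) * (d ** o_d) * (u ** o_u)

def merge_flows(psi, size, ep_size):
    """Brute-force solution of OP2.1 for one user.

    psi[j][z] and size[j][z] are the PP value and size (bits) of packet z of
    flow j, listed in FIFO order. A candidate y takes the first y[j] packets
    of each flow, so the FIFO constraint holds by construction.
    Returns (y*, U), with U the MSPP value of (2.7).
    """
    best_y, best_u = tuple(0 for _ in psi), 0.0
    for y in product(*(range(len(flow) + 1) for flow in psi)):
        bits = sum(sum(size[j][:n]) for j, n in enumerate(y))
        if bits > ep_size:
            continue  # violates the encoder packet size constraint
        u_sum = sum(sum(psi[j][:n]) for j, n in enumerate(y))
        if u_sum > best_u:
            best_y, best_u = y, u_sum
    return best_y, best_u

def user_priority(csi, u_mspp, avg_tp, kappa=1.0, alpha=1.0, beta=1.0, eps=1.0):
    """User priority (2.8)."""
    return kappa * csi ** alpha * u_mspp ** beta / avg_tp ** eps
```

With $\xi_{BE} = 1.0$, $\xi_{EF} = 1.5$ and $\eta_{EF} = 0.035$ sec, and assuming $c_j = 1$, `packet_urgency(0.010, 1, 1.5, 0.035)` is below the constant BE urgency of 1, whereas `packet_urgency(0.060, 1, 1.5, 0.035)` is above it, which is the crossover behaviour described in Section 2.5.1. The enumeration grows exponentially with the number of flows, so it is only practical for small $J_i$ and short queues.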
Table 2.1: Simulation Parameter Values

| Parameter | Value | Parameter | Value |
|---|---|---|---|
| $N_{window}$ | 100 slots | $S$ | 1 slot |
| $d_i^{j,z}(k)$ | 128 bits $\forall i \in \mathcal{I}$, $\forall j \in \mathcal{J}_i$, $z = 1, ..., B_i^j(k)$ | $\xi_{BE}$ | 1.0 |
| $\pi_j(k)$ | 1 $\forall j \in \mathcal{J}_i$, $k = 1, ..., K$ | $\xi_{EF}$ | 1.5 |
| $\lambda_j$ | 1 $\forall j \in \mathcal{J}_i$ | $T_{BE}$ | 3.000 sec |
| $o_\pi$, $o_d$, $o_u$ | 1 | $T_{EF}$ | 0.070 sec |
| $\kappa_i$ | 1 $\forall i \in \mathcal{I}$ | $\eta_{BE}$ | 1.500 sec |
| $\alpha$, $\beta$, $\varepsilon$ | 1 | $\eta_{EF}$ | 0.035 sec |

2.6.1 Comparative Scheduling Policies

The performance of the ACLS-FM scheduling policy is compared with that of four other scheduling policies: Modified Greedy (MG), Modified Round Robin (MRR), MFM and Modified Proportional Fair (MPF); the MFM scheduling policy is compared with MPF. The MG, MRR and MPF scheduling policies are described below. The term $EPSize_i(k)$ denotes the physical layer encoder packet size for user $i$ at time $k$.

MG Scheduling Policy

The Classical Greedy (CG) scheduling policy [53], $i^*(k) = \arg\max_{i \in \mathcal{I}} CSI_i(k)$, is strictly opportunistic and simply selects the user $i^*$ with the best channel condition. While CG provides a throughput upper-bound, it does not specify how the flows of the selected user are to be scheduled. In MG, each traffic flow is regarded as a separate user. At each scheduling period $k$, MG services the flow with the best channel condition and longest data queue. Specifically,

Step 1: Let $Q^{MG}{}_i^j(k)$ denote the priority of user $i$, flow $j$ at scheduling period $k$:

$$Q^{MG}{}_i^j(k) = \min\Big\{ EPSize_i(k), \; \sum_{z=1}^{B_i^j(k)} d_i^{j,z}(k) \Big\}, \quad \forall i \in \mathcal{I}, \; j \in \mathcal{J}_i. \tag{2.10}$$

The user and flow to be scheduled at time $k$ are determined as:

$$(i^*(k), j^*(k)) = \arg\max_{i \in \mathcal{I}, \, j \in \mathcal{J}_i} Q^{MG}{}_i^j(k). \tag{2.11}$$

Step 2: Packets are selected, one at a time in an iterative fashion, from the data queue of user $i^*$, flow $j^*$ and added to the physical layer encoder packet until either the physical layer encoder packet of size $EPSize_{i^*}(k)$ is filled or there are no more packets in the data queue $j^*$.

MRR Scheduling Policy

The MRR scheduling policy assigns equal service to each traffic flow, in order, regardless of queue length and channel condition.
The MRR scheduling policy (where each of the $I$ users has $J_i$ flows) is specified as follows:

Step 1: The user and flow to be scheduled at time $k$ are determined as:

$$i^*(k) = \left\lceil \frac{(k - 1) \bmod I J_i + 1}{J_i} \right\rceil, \qquad j^*(k) = (k - 1) \bmod J_i + 1. \tag{2.12}$$

Step 2: Same as Step 2 of the MG scheduling policy.

MPF Scheduling Policy

The PF scheduling policy [9] exploits multi-user diversity to maximize system throughput by scheduling data transmission based on the relative channel quality of the competing users, while at the same time maintaining fairness across users. Note that in the classic PF scheduling policy, there is no provision for choosing which flow to schedule from among the flows of a given user. Thus, for purposes of comparison, each traffic flow is regarded as a separate user and we refer to this as the MPF scheduling policy. The MPF scheduling policy is defined as:

Step 1: Let $Q^{MPF}{}_i^j(k)$ denote the priority of user $i$, flow $j$ at scheduling period $k$:

$$Q^{MPF}{}_i^j(k) = \begin{cases} \dfrac{CSI_i(k)}{\bar{v}_i^j(k)} & \text{if } B_i^j(k) > 0 \\ 0 & \text{otherwise} \end{cases} \quad \forall i \in \mathcal{I}, \; j \in \mathcal{J}_i, \tag{2.13}$$

where $\bar{v}_i^j(k)$ denotes the running average throughput over the last $N_{window}$ time slots for user $i$, flow $j$ at time $k$. A flow with no data to send is assigned a priority of 0 and is ignored in the selection process. The user and flow to be scheduled at time $k$ are determined as:

$$(i^*(k), j^*(k)) = \arg\max_{i \in \mathcal{I}, \, j \in \mathcal{J}_i} Q^{MPF}{}_i^j(k). \tag{2.14}$$

Step 2: Same as Step 2 of the MG scheduling policy.

2.6.2 Performance Measure

To evaluate the system performance, we define the scheduling policy performance gain, $G_\chi^Y$, as

$$G_\chi^Y = C \, \frac{\chi_Y - \chi}{\chi} \times 100\%, \tag{2.15}$$

where $Y$ is either the MFM or ACLS-FM scheduling policy and $\chi$ is the QoS measure of interest: throughput (TP), latency (LT) or packet drop probability (PDP). The term $C$ in (2.15) takes value $+1$ for TP and $-1$ for LT and PDP, so that a positive gain always indicates an improvement over MPF. The terms $\chi_Y$ and $\chi$ are the average QoS values for scheduling policy $Y$ and MPF respectively.
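Two of the definitions above are easy to check numerically. The Python sketch below evaluates the MRR indices of (2.12) and the performance gain of (2.15). The function names are hypothetical, every user is assumed to have the same number of flows, and the fraction in (2.12) is assumed to be rounded up to the next integer so that user indices start at 1.

```python
def mrr_schedule(k, num_users, flows_per_user):
    """MRR user and flow indices of (2.12) for slot k = 1, 2, ... (1-based).

    Assumes the same number of flows for every user and a ceiling on the
    fraction in (2.12).
    """
    slot = (k - 1) % (num_users * flows_per_user) + 1
    user = (slot + flows_per_user - 1) // flows_per_user  # integer ceiling
    flow = (k - 1) % flows_per_user + 1
    return user, flow

def performance_gain(chi_policy, chi_mpf, metric):
    """Gain (2.15) in percent; metric is 'TP', 'LT' or 'PDP'.

    C = +1 for throughput and -1 for latency and packet drop probability,
    so a positive value always means the policy improves on MPF.
    """
    c = 1.0 if metric == "TP" else -1.0
    return c * (chi_policy - chi_mpf) / chi_mpf * 100.0
```

With two users and three flows each, `mrr_schedule` visits (1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3) for $k = 1, ..., 6$ and then wraps around. As a made-up illustration of the sign convention, an average latency of 0.03 sec against 0.04 sec for MPF gives `performance_gain(0.03, 0.04, "LT")` of about 25%.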
The MPF scheduling policy is used as the baseline for evaluating the system performance as it (or its variants) is the most commonly used scheme in wireless networks.

2.6.3 MFM Results

Some simulation results are shown in Table 2.2 for $\rho = 0.90$. The $G_{TP}^{MFM}$, $G_{LT}^{MFM}$ and $G_{PDP}^{MFM}$ columns show the throughput, latency and packet drop probability gains of MFM compared to MPF.

Table 2.2: Simulation Results of MFM with $\rho = 0.90$

| $I$ | $J_i$ | $G_{TP}^{MFM}$ | $G_{LT}^{MFM}$ | $G_{PDP}^{MFM}$ |
|---|---|---|---|---|
| 20 | 2 BE | 0.19 % | 0.00 % | 0.00 % |
| 110 | 2 EF | 52.14 % | 0.00 % | 13.15 % |
| 22 | 10 EF | 246.83 % | 23.53 % | 60.98 % |
| 30 | 1 BE, 1 EF | 22.87 % | 17.60 % | 13.64 % |
| 6 | 5 BE, 5 EF | 36.83 % | 67.97 % | 20.80 % |

Cumulative distribution function (CDF) plots for user throughput, user latency and user packet drop probability obtained from a simulation of 22 users, each with 10 EF traffic flows, are shown in Fig. 2.4, 2.5 and 2.6 respectively. The improvements in $G_{TP}^{MFM}$ and $G_{LT}^{MFM}$ relative to MPF come from the reduction in wastage in the physical layer encoder packet when the flow queue sizes (in bits) are typically quite small compared to the physical layer encoder packet size. The results show that $G_{TP}^{MFM}$ increases as the system loading factor increases for both the EF only traffic and mixed traffic scenarios but is negligible for the BE only traffic scenario due to minimal unfilled space left in the physical layer encoder packet for merging. For the same system loading factor, $G_{TP}^{MFM}$ increases as the number of traffic flows is increased with a corresponding decrease in the number of users due to multi-application diversity. Further decrease in latency is realized due to the possible multiplexing of packets from different data queues in a scheduling period. Application layer PDUs from EF flows that arrive later at the access network following a large application layer PDU from a BE flow can be transmitted first instead of being transmitted in the order of arrival, hence reducing latency. $G_{LT}^{MFM}$ exhibits the highest gain in a mixed traffic scenario.
As expected, the results show that an increase in $G_{LT}^{MFM}$ results in a corresponding increase in $G_{PDP}^{MFM}$.

Figure 2.4: CDF of User Throughput, $\rho = 0.90$, $I = 22$, 10 EF Flows for each User (CDF versus user throughput in bps; curves: MPF, MFM)

Figure 2.5: CDF of User Latency, $\rho = 0.90$, $I = 22$, 10 EF Flows for each User (CDF versus user latency in s; curves: MPF, MFM)

Figure 2.6: CDF of User Packet Drop Probability, $\rho = 0.90$, $I = 22$, 10 EF Flows for each User (CDF versus user packet drop probability in %; curves: MPF, MFM)

2.6.4 ACLS-FM Results

Some simulation results are shown in Table 2.3 for $\rho$ = 0.10, 0.50 and 0.90. The $G_{TP}^{ACLS\text{-}FM}$, $G_{LT}^{ACLS\text{-}FM}$ and $G_{PDP}^{ACLS\text{-}FM}$ columns show the throughput, latency and packet drop probability gains of ACLS-FM compared to MPF. CDF plots for user throughput, user latency and user packet drop probability for a system with $\rho = 0.90$ for EF only and mixed traffic (BE + EF) are shown in Fig. 2.7 and Fig. 2.8 respectively. The simulation results confirm that ACLS-FM generally performs better than the other four scheduling policies defined in Section 2.6.1 in terms of user throughput, user latency and user packet drop probability. We note in Fig. 2.8a that while MG has a higher system aggregate throughput, ACLS-FM can have a higher user throughput compared to MG.

In addition to exploiting the benefits of MFM, ACLS-FM achieves additional performance gains from the PU function defined in Section 2.5.1, which allows a packet from an EF (latency-sensitive) flow to have its urgency increased as its waiting time, $w_i^{EF,z}$, exceeds a predetermined threshold, $\eta_{EF}$, to meet its latency requirements.
For the period $w_i^{BE,z} < \eta_{EF}$, packets from the BE (delay-tolerant) flows are given a higher urgency to reduce the buffer backlog as a mechanism to achieve a higher system throughput. The PP function defined in Section 2.5.2 allows for further intra-user adjustments through the coupling of flow priority, $\pi_j$, packet size, $d_i^{j,z}$, and packet urgency, $u_i^{j,z}$. Fairness among users is taken into account in the user selection policy defined in Section 2.5.4.

For BE only traffic, as shown in Table 2.3a, ACLS-FM provides little performance gain over MPF regardless of the system loading factor. This is due to the fact that BE only traffic is high data rate and bursty in nature, leaving minimal unfilled space in the physical layer encoder packet for exploiting MFM.

However, for EF only traffic, as shown in Fig. 2.7, ACLS-FM provides significant performance gains, especially as $\rho$ increases. As with the MFM scheduling policy, the gain of ACLS-FM is achieved through a reduction of wastage in the physical layer encoder packet.

Table 2.3: Results of ACLS-FM for (a) BE only (b) EF only (c) BE + EF

(a) BE only

| $\rho$ | $I$ | $J_i$ | $G_{TP}^{ACLS\text{-}FM}$ | $G_{LT}^{ACLS\text{-}FM}$ | $G_{PDP}^{ACLS\text{-}FM}$ |
|---|---|---|---|---|---|
| 0.10 | 3 | 2 BE | 0.00 % | 0.52 % | 0.00 % |
| 0.50 | 10 | 2 BE | 0.79 % | 0.00 % | 0.44 % |
| 0.90 | 20 | 2 BE | 0.43 % | 2.59 % | 0.00 % |

(b) EF only

| $\rho$ | $I$ | $J_i$ | $G_{TP}^{ACLS\text{-}FM}$ | $G_{LT}^{ACLS\text{-}FM}$ | $G_{PDP}^{ACLS\text{-}FM}$ |
|---|---|---|---|---|---|
| 0.10 | 20 | 2 EF | 7.53 % | 52.63 % | 6.93 % |
| 0.50 | 60 | 2 EF | 91.37 % | 3.13 % | 36.48 % |
| 0.90 | 110 | 2 EF | 87.06 % | 2.94 % | 21.88 % |
| 0.90 | 22 | 10 EF | 304.02 % | 50.00 % | 74.85 % |
| 0.90 | 14 | 16 EF | 300.78 % | 67.65 % | 74.69 % |

(c) BE + EF

| $\rho$ | $I$ | $J_i$ | $G_{TP}^{ACLS\text{-}FM}$ | $G_{LT}^{ACLS\text{-}FM}$ | $G_{PDP}^{ACLS\text{-}FM}$ |
|---|---|---|---|---|---|
| 0.10 | 5 | 1 BE, 1 EF | 0.00 % | 10.92 % | 0.00 % |
| 0.50 | 20 | 1 BE, 1 EF | 31.54 % | 28.79 % | 16.18 % |
| 0.90 | 4 | 8 BE, 8 EF | 114.15 % | 45.80 % | 35.95 % |
In addition, with the inclusion of the packet urgency function $u_i^{j,z}(k)$ in the ACLS-FM scheduling policy, a gain $G_{PDP}^{ACLS\text{-}FM}$ is achieved as the number of packets dropped due to the violation of the scheduling delay threshold is substantially reduced, which in turn leads to additional $G_{TP}^{ACLS\text{-}FM}$. Further $G_{LT}^{ACLS\text{-}FM}$ is realized due to the consideration of the MSPP $U_i(k)$ of a user in the user selection policy in (2.9), which selects a user with more urgent packets. Comparing the cases of $I = 110$, $J_i = 2$ and $I = 22$, $J_i = 10$ for $\rho = 0.90$ shown in Table 2.3b, we see that as the number of flows per user increases, a corresponding increase in performance gains is obtained due to the exploitation of multi-application diversity. For $\rho = 0.90$ and 16 EF flows per user, shown in Fig. 2.7c, ACLS-FM achieves a near-0% PDP in comparison to an average of 45% PDP for the other four scheduling policies at the 95th percentile.

For the mixed traffic (BE + EF) scenario shown in Fig. 2.8, ACLS-FM has the second highest throughput performance. MG provides the best throughput performance at the expense of starving EF traffic, as it also has the highest EF PDP as shown in Fig. 2.8c. On the other hand, ACLS-FM achieves a near-0% PDP for EF traffic and the second lowest PDP for BE traffic. As shown in Fig. 2.8b, ACLS-FM has the lowest latency for BE traffic, and while MFM has a lower latency for EF traffic than ACLS-FM, that is achieved at the expense of a 50% EF PDP at the 95th percentile (shown in Fig. 2.8c). In a mixed traffic scenario, MFM has a higher PDP for EF traffic than for BE traffic, as shown in Fig. 2.8c, because the flow merging policy for MFM determines the probability of merging a packet from a flow by the ratio of its queue length to the sum of all queue lengths. It is worth noting that while MPF has the second lowest EF PDP, it also has the second highest BE PDP as it trades off BE packets to achieve its intended throughput fairness objective.
On the other hand, MG trades off EF packets (highest EF PDP) for BE packets to achieve a high throughput.

2.7 Conclusion

The performance gains of a scheduling policy which exploits MFM in terms of user throughput, user latency and user packet drop probability were quantified. The substantial gains of MFM result from wastage reduction in the physical layer encoder packet and multiplexing of packets with different latency tolerances in a scheduling period. Only queue length information is needed to implement the MFM scheduling policy. Given the promising gains and simplicity of implementation of MFM, we propose an ACLS-FM scheduling policy that integrates MFM and jointly considers physical-layer time-varying channel conditions as well as application-layer QoS requirements. In addition to exploiting the benefits of MFM, ACLS-FM realizes additional performance gains through the use of a cross layer design, utilizing a packet urgency function, packet priority function, flow merging policy and user selection policy. The simulation results confirm that ACLS-FM achieves substantial performance gains in the considered QoS performance measures (user throughput, user latency and user packet drop probability) when compared to other commonly used scheduling policies.
Figure 2.7: Performance for a System with $\rho = 0.90$, $I = 14$, 16 EF Flows for each User (a) CDF of User Throughput (b) CDF of User Latency (c) CDF of User Packet Drop Probability (curves: MG, MRR, MPF, MFM, ACLS-FM)

Figure 2.8: Performance for a System with $\rho = 0.90$, $I = 4$, 8 BE Flows and 8 EF Flows for each User (a) CDF of User Throughput (b) CDF of User Latency, BE Flows and EF Flows (c) CDF of User Packet Drop Probability, BE Flows and EF Flows (curves: MG, MRR, MPF, MFM, ACLS-FM)

Chapter 3

Packet Division Multiplexing in Single-carrier CDMA Systems²

3.1 Introduction

With the rapid introduction of multimedia services, wireless networks are expected to integrate a mix of real-time traffic and non-real-time traffic with different QoS requirements.
This has driven continued extensive research in RRM with the objective of improving achievable system capacity while at the same time meeting the diverse QoS requirements and adapting to the dynamically changing wireless conditions.

As part of the evolution of the cdma2000 family of 3G mobile telecommunications air interfaces, cdma2000 1xEV-DO Revision A [43] provides significant improvements at various protocol layers over cdma2000 1xEV-DO Release 0 [46]. These include higher peak data rates, HARQ transmission and enhancements that provide considerable gains in spectral efficiency and substantial QoS support to efficiently support both latency-sensitive and delay-tolerant applications. In addition, cdma2000 1xEV-DO Revision A also introduced PDM [43, 47, 48] in the forward link, which permits the BS to service multiple users in the same physical layer encoder packet in a single time slot with the use of multi-user packet (MUP) transmission. PDM improves the resource utilization (packing efficiency) by allowing delay-tolerant applications to fill up the part of the physical layer encoder packet left unused by higher priority, low-rate latency-sensitive applications. It also improves the transmission latency performance by overcoming the shortage of time slots, and enables cdma2000 1xEV-DO Revision A to support a large number of low-rate latency-sensitive applications, leading to increased system throughput and spectral efficiency.

The feasibility of supporting a single traffic type with PDM in cdma2000 1xEV-DO Revision A is explored in [54]. Analytical models and simulations are developed to evaluate the expected capacity and delay performance of implementing VoIP traffic using cdma2000 1xEV-DO Revision A.

² The material in this chapter is based on: C. E. Huang and C. Leung, “Downlink mixed-traffic scheduling with packet division multiplexing,” in Proc. ACM PM2HW2N, Oct. 2008, pp. 165–172. © 2008 ACM. http://dx.doi.org/10.1145/1454630.1454655
The authors demonstrate in the study that MUP transmission plays a critical role in achieving the expected Erlang capacity for VoIP, which is comparable to that of a circuit switched cdma2000 [55] system. MUP efficiency, in terms of the average number of VoIP packets contained in one physical layer encoder packet, is also presented. In [56], the performance and capacity of VoIP traffic by itself and VoIP together with other traffic types are analyzed. It is shown that cdma2000 1xEV-DO Revision A can not only provide VoIP capacity that is comparable to IS-2000, but can also simultaneously support a significant amount of delay-tolerant traffic along with VoIP.

In this chapter, we build on the PDM and mixed traffic findings in [54] and [56] respectively and adopt the ACLS-FM scheduling policy approach introduced in Chapter 2. ACLS-FM integrates MFM (Section 2.4) and takes into account the time-varying channel conditions from the physical layer and QoS requirements from the application layer. Considerable performance gains achievable with ACLS-FM in terms of user throughput, user latency and user packet drop probability were quantified in Section 2.6. We extend ACLS-FM and propose an adaptive cross layer scheduling policy that incorporates PDM of the shared physical layer encoder packet. We refer to this scheme as the Adaptive Cross Layer Scheduling with Flow and User Multiplexing (ACLS-FUM) scheduling policy. We consider a mix of real-time voice services and non-real-time data applications and study the improvements in packing efficiency and latency performance.

Figure 3.1: Illustration of MUP with MFM (at the MAC layer, the flows of each user are first merged by flow multiplexing (MFM); the users are then combined by user multiplexing (MUP) into a physical layer encoder packet of size $EPSize_{MUP}(k)$)
The resulting performance gains that are realizable in a framework such as that provided in cdma2000 1xEV-DO Revision A are quantified. An illustration of MUP with MFM is shown in Fig. 3.1.

This chapter is organized as follows: in Section 3.2, we present the system model that includes the network model, traffic classes and data buffer parameters. The ACLS-FUM scheduling policy is described in Section 3.3. Simulation results are presented in Section 3.4 and the main findings are summarized in Section 3.5.

3.2 System Model

The network model used in this chapter is described in this section. The traffic classes and data buffer parameters used are described in Sections 2.3.2 and 2.3.3, respectively.

We consider a 1xEV-DO Revision A-like packet cellular network random discrete-event model, as shown in Fig. 2.1, consisting of one BS servicing $I$ mobile stations (MSs). Let $\mathcal{I} = \{1, ..., I\}$ be the set of all users (MSs). Each MS $i$ can have up to $J_i$ data queues (flows) and let $\mathcal{J}_i = \{1, ..., J_i\}$ be the set of all flows for MS $i$. Forward link scheduling is centralized at the BS, which communicates with all MSs. At each time slot $k$, where $k \in \mathbb{Z}^+ = \{1, 2, ..., K\}$, either one user is scheduled for single-user packet (SUP) transmission or up to eight users are scheduled for MUP transmission. We assume that the scheduling decision time is negligible.

Power control is not enabled in the forward link for 1xEV-DO systems and the BS transmits at full power to the MSs in all time slots. We assume that packets are received without errors, i.e. 0% FER, between the BS and MS. This simplifying assumption is made to illustrate the potential gains that ACLS-FUM can provide. It is expected that gains will also be achievable in a more realistic setting with non-zero FER values. The BS is assumed to have knowledge of the channel state information, $CSI_i(k)$, for each MS $i$ at time $k$, as well as the queue status and QoS requirements for all the data queues.
The maximum service rate for a user during time slot $k$ is a function of its channel quality, which is characterized by its received SNR in time slot $k$. Multiple application layer protocol data units (PDUs) from the same user can be transmitted in the same physical layer encoder packet in the same time slot using SUP transmission. Furthermore, application layer PDUs destined for different users are either scheduled and transmitted in different time slots using SUP transmission or multiplexed into the same physical layer encoder packet and transmitted in the same time slot using MUP transmission.

For simplicity, we assume that the physical layer encoder packet size, $EPSize_i(k)$, is chosen according to a uniform distribution from the set of eight physical layer encoder packet sizes, where $EPSize \in \mathcal{E}_{SUP} = \{128, 256, 512, 1024, 2048, 3072, 4096, 5120\}$ bits for SUP transmission. For MUP transmission, $EPSize \in \mathcal{E}_{MUP} = \{1024, 2048, 3072, 4096, 5120\}$ bits, which maps to the set of Data Rate Control (DRC) indices compatible with MUP transmission for data rates greater than 153.6 kbps [43]. Let $\mu \triangleq EPSize/S \in \{4.8, ..., 3072\}$ kbps be the effective service rate, where $S \in \mathcal{S} = \{1, 2, 4, 8, 16\}$ time slot(s), each of 1.667 ms duration. Each application layer PDU is segmented into a number of packets. Packets from up to $J_i = 16$ different flows and $I = 8$ different users can be multiplexed into the same physical layer encoder packet of size $EPSize_{MUP}(k)$ for transmission at time $k$.

3.3 Adaptive Cross Layer Scheduling with Flow and User Multiplexing Scheduling Policy

The proposed ACLS-FUM scheduling policy consists of a packet urgency function (to meet latency requirements), a packet priority function (for intra-user QoS adjustments), a transmission mode selection function (to determine SUP/MUP transmission mode), SUP transmission mode, MUP transmission mode and a flow merging policy (to determine which flows and how many bits from each flow to service).
The packet urgency function, packet priority function and flow merging policy are described in Chapter 2. A flow chart illustrating the ACLS-FUM scheduling policy is shown in Fig. 3.2.

Figure 3.2: ACLS-FUM Scheduling Policy Flow Chart. The transmission mode selection function computes the average waiting times of the BE and EF HOL packets and tests condition (3.4). If the condition holds, SUP transmission mode schedules a user using ACLS-FM. Otherwise, MUP transmission mode computes the flow and user priorities, determines the MUP candidates and selects the MUP users, determines $EPSize_{MUP}(k)$ and the number of bits allocated to each selected MUP user, and schedules each selected MUP user using ACLS-FM.

3.3.1 Transmission Mode Selection Function

In order to give users that do not qualify for MUP transmission (cdma2000 1xEV-DO Revision A [43] precludes DRC index < 3 from MUP transmission) an opportunity to clear their backlog, the scheduling policy may transmit packets in SUP transmission mode if the average waiting time of the head-of-line (HOL) packets exceeds predefined thresholds. Specifically, we define the average waiting time of BE and EF HOL packets as follows

$$w_{BE}(k) = \frac{1}{\sum_{i \in \mathcal{I}} \sum_{j \in \mathcal{J}_i} I_{BE}(j)} \sum_{i \in \mathcal{I}} \sum_{j \in \mathcal{J}_i} I_{BE}(j) \, w_i^{j,HOL}(k) \tag{3.1}$$

$$w_{EF}(k) = \frac{1}{\sum_{i \in \mathcal{I}} \sum_{j \in \mathcal{J}_i} I_{EF}(j)} \sum_{i \in \mathcal{I}} \sum_{j \in \mathcal{J}_i} I_{EF}(j) \, w_i^{j,HOL}(k) \tag{3.2}$$

where $w_i^{j,HOL}(k)$ denotes the waiting time of the HOL packet for user $i$, flow $j$ at time $k$. The indicator function $I_{BE}(j)$ is defined as

$$I_{BE}(j) = \begin{cases} 1 & \text{if } j \text{ is a BE flow} \\ 0 & \text{otherwise.} \end{cases} \tag{3.3}$$

$I_{EF}(j)$ is similarly defined. The terms $\sum_{j \in \mathcal{J}_i} I_{BE}(j)$ and $\sum_{j \in \mathcal{J}_i} I_{EF}(j)$ represent the number of BE and EF flows for user $i$ respectively. The ACLS-FUM scheduling policy will use SUP transmission mode at time $k$ if the following condition is true

$$\beta^{BE} \, U\!\left(w_{BE}(k) - T_{SUP}^{BE}\right) + \beta^{EF} \, U\!\left(w_{EF}(k) - T_{SUP}^{EF}\right) > 0, \tag{3.4}$$

where $U(x)$ denotes the unit step function, i.e.
$U(x) = 1$ for $x > 0$ and $U(x) = 0$ otherwise. The parameters $\beta^{BE}$ and $\beta^{EF}$ take on values in $\{0, 1\}$ depending on whether BE and/or EF flows are included in the transmission mode selection function. $T_{SUP}^{BE}$ and $T_{SUP}^{EF}$ are the SUP thresholds for BE and EF flows respectively. They are generally set to values that are less than the scheduling delay thresholds $T_{BE}$ and $T_{EF}$. MUP transmission mode is used if (3.4) is false.

3.3.2 SUP Transmission Mode

In SUP transmission mode, only one user is serviced. Multiple packets from the same user are selected using ACLS-FM (see Section 2.5) and packed into the same physical layer encoder packet in the same time slot for SUP transmission.

3.3.3 MUP Transmission Mode

In MUP transmission mode, up to eight users are serviced. Multiple packets from different users are multiplexed into the same physical layer encoder packet in the same time slot for MUP transmission. The MUP transmission mode is performed as follows:

1) Flow and User Priority: Let $Q_i^j(k) \in [0, 1]$ denote the flow priority of user $i$, flow $j$ at time $k$, and be defined as

$$Q_i^j(k) = \frac{\alpha \, u_i^{j,HOL}(k) + (1 - \alpha) \, CSI_i(k)}{\bar{v}_i^j(k)}, \tag{3.5}$$

where $CSI_i(k)$ is the channel state information of user $i$ at time $k$, $u_i^{j,HOL}(k)$ is the PU value of the HOL packet and $\bar{v}_i^j(k)$ (for fairness consideration) denotes the running average throughput over the last $N_{window}$ time slots for user $i$, flow $j$ at time $k$. Each of these terms is normalized to its maximum value: $u_i^{j,HOL}(k)$ is normalized to $\max(c_j \xi_j^{(T_j - \eta_j)}) \; \forall j$, $CSI_i(k)$, which is mapped to $EPSize_i(k)$, is normalized to $\max_{EPSize \in \mathcal{E}_{MUP}} EPSize$ and $\bar{v}_i^j(k)$ is normalized to $\max_{EPSize \in \mathcal{E}_{MUP}} EPSize/S$. The parameter $\alpha \in [0, 1]$ is a weighting constant that is used to adjust the relative weighting of HOL packet urgency and channel condition. The user priority of user $i$ at time $k$ is

$$Q_i(k) = \sum_{j=1}^{J_i} Q_i^j(k), \quad \forall i \in \mathcal{I}.$$
(3.6)

2) MUP User Selection: We define the set, $MUP_{cands}(k)$, of candidate users that qualify for MUP transmission at time $k$ as

$$MUP_{cands}(k) = \{ i \in \mathcal{I} \mid EPSize_i(k) \ge 1024 \} \tag{3.7}$$

and let $i_{max}(k)$ denote the MUP candidate which has the largest user priority:

$$i_{max}(k) = \arg\max_{i \in MUP_{cands}(k)} Q_i(k). \tag{3.8}$$

The set of MUP users, $MUP_{users}(k)$, to be scheduled at time $k$ is determined as:

$$MUP_{users}(k) = \{ i \in MUP_{cands}(k) \mid CSI_i(k) \ge CSI_{i_{max}(k)}(k) \}. \tag{3.9}$$

In the case where $|MUP_{users}(k)| > 8$ ($|\cdot|$ denotes the cardinality of a set), the 8 users with the largest $Q_i(k)$ are selected for MUP transmission. In the event of any ties, the tied users are selected randomly with equal probabilities.

3) MUP User Bit Allocation: Let $EPSize_{MUP}(k)$ denote the physical layer encoder packet size used for MUP transmission, defined as

$$EPSize_{MUP}(k) = EPSize_{i_{max}(k)}(k). \tag{3.10}$$

The number of bits allocated to user $i$ for MUP transmission at time $k$ is denoted by $MUPSize_i(k)$. It is proportional to its user priority, $Q_i(k)$, which takes into account flow throughput fairness, HOL packet urgencies and user channel condition. $MUPSize_i(k)$ is determined by

$$MUPSize_i(k) = \frac{Q_i(k)}{\sum_{i \in MUP_{users}(k)} Q_i(k)} \, EPSize_{MUP}(k), \quad \forall i \in MUP_{users}(k). \tag{3.11}$$

Based on the number, $MUPSize_i(k)$, of bits allocated, packets for user $i \in MUP_{users}(k)$ are selected using ACLS-FM (see Section 2.5) and multiplexed into the same physical layer encoder packet in the same time slot for MUP transmission.

3.4 Simulation Results

The ACLS-FUM scheduling policy described in Section 3.3 was simulated in Matlab using the system model described in Section 3.2. Simulation results were obtained for BE only traffic, EF only traffic and mixed traffic (BE + EF) scenarios with system loading factor, $\rho \triangleq \lambda/\mu$, values of 0.10, 0.50 and 0.90, where

$$\lambda = \sum_{i=1}^{I} \sum_{j=1}^{J_i} \lambda_i^j \quad \text{and} \quad \mu = \frac{1}{|\mathcal{S}| \, |\mathcal{E}_{SUP}|} \sum_{S \in \mathcal{S}} \sum_{EPSize \in \mathcal{E}_{SUP}} \frac{EPSize}{S}.$$

$\lambda_i^j$ is the average traffic arrival rate for user $i$, flow $j$.
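For a feel of the numbers behind the loading factor, the following Python sketch averages $EPSize/S$ over the sets $\mathcal{E}_{SUP}$ and $\mathcal{S}$ of Section 3.2. It is a worked check under stated assumptions, not part of the simulation code: the function and variable names are hypothetical, $\mu$ is computed in bits per slot and then converted to kbps with the 1.667 ms slot duration, and the arrival rates passed to `loading_factor` are assumed to be in the same unit as $\mu$.

```python
SLOT_S = 1.667e-3  # slot duration in seconds (Section 3.2)
E_SUP = [128, 256, 512, 1024, 2048, 3072, 4096, 5120]  # encoder packet sizes (bits)
SLOTS = [1, 2, 4, 8, 16]  # number of slots S per encoder packet

def mean_service_rate(ep_sizes, slot_counts):
    """Average of EPSize / S over all (S, EPSize) pairs, in bits per slot."""
    total = sum(ep / s for s in slot_counts for ep in ep_sizes)
    return total / (len(slot_counts) * len(ep_sizes))

def loading_factor(arrival_rates, mu):
    """rho = lambda / mu, where lambda sums arrival_rates[i][j] over all flows."""
    lam = sum(sum(flows) for flows in arrival_rates)
    return lam / mu

mu_slot = mean_service_rate(E_SUP, SLOTS)              # bits per slot
mu_kbps = mu_slot / SLOT_S / 1e3                       # same rate in kbps
lowest_kbps = min(E_SUP) / max(SLOTS) / SLOT_S / 1e3   # slowest (S, EPSize) pair
highest_kbps = max(E_SUP) / min(SLOTS) / SLOT_S / 1e3  # fastest (S, EPSize) pair
```

Under these assumptions the average works out to about 787.4 bits per slot, or roughly 472 kbps, and the two extreme pairs reproduce the 4.8 kbps and 3072 kbps end points of the effective service rate range quoted in Section 3.2 to within rounding of the slot duration.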
For the mixed traffic scenario, an equal number of BE and EF traffic flows were simulated. To achieve the desired system loading factor, $I$ and $J_i$ were varied. The simulation parameter values are listed in Table 3.1.

Table 3.1: Simulation Parameter Values

| Param. | Value | Param. | Value |
|---|---|---|---|
| $d_i^{j,z}(k)$ | 128 bits $\forall i \in \mathcal{I}$, $\forall j \in \mathcal{J}_i$, $z = 1, ..., B_i^j(k)$ | $\xi_{BE}$ | 1.0 |
| $N_{window}$ | 100 slots | $\xi_{EF}$ | 1.5 |
| $S$ | 1 slot | $T_{BE}$ | 3.000 sec |
| $c_j$ | 1 | $T_{EF}$ | 0.070 sec |
| $T_{SUP}^{BE}$ | 1.500 sec | $\eta_{BE}$ | 1.500 sec |
| $T_{SUP}^{EF}$ | 0.035 sec | $\eta_{EF}$ | 0.035 sec |
| $\beta^{BE}$ | 0 | $\alpha$ | 0.5 |
| $\beta^{EF}$ | 1 | | |

To evaluate the system performance, we define the ACLS-FUM scheduling policy performance gain, $G_\chi^{ACLS\text{-}FUM}$, as in (2.15). The performance metrics $G_{TP}^{ACLS\text{-}FUM}$, $G_{LT}^{ACLS\text{-}FUM}$, $G_{PDP}^{ACLS\text{-}FUM}$, $G_{JT,BE}^{ACLS\text{-}FUM}$ and $G_{JT,EF}^{ACLS\text{-}FUM}$ quantify the throughput, latency, packet drop probability and jitter (BE and EF) gains of ACLS-FUM compared to MPF. CDF plots for user throughput, user latency, user packet drop probability and user jitter for a system with $\rho = 0.50$ for EF only and $\rho = 0.90$ for mixed traffic (BE + EF) are shown in Figs. 3.3 and 3.4 respectively. The simulation results confirm that ACLS-FUM performs better than the other four scheduling policies defined in Section 2.6.1 in terms of user throughput, user latency, user packet drop probability and user jitter through the exploitation of both MFM and MUP.

The simulation scenario for $\rho = 0.50$ (110 users, each with 1 EF flow) is created to demonstrate the achievable performance gains solely from MUP transmission. From both Fig. 3.3 and the performance metrics, where $G_{TP}^{ACLS\text{-}FUM} = 132.83\%$, $G_{LT}^{ACLS\text{-}FUM} = 77.42\%$, $G_{PDP}^{ACLS\text{-}FUM} = 99.99\%$ and $G_{JT,EF}^{ACLS\text{-}FUM} = 61.15\%$, it is clear that ACLS-FUM performs much better than any of the other four scheduling policies. ACLS-FUM performs better than MG in terms of user throughput as shown in Fig. 3.3a. The high $G_{TP}^{ACLS\text{-}FUM}$ is achieved due to the ability to multiplex packets for different users into the same physical layer encoder packet.
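The multiplexing step behind this gain, i.e. the transmission mode test (3.4) followed by the MUP user selection and bit allocation of (3.7) to (3.11) in Section 3.3, can be sketched as follows. This is a minimal Python sketch with hypothetical names and data layout: ties are resolved deterministically (by list order) rather than at random, the user priorities are assumed to be positive, and the per-user packet selection with ACLS-FM that follows the bit allocation is left out.

```python
def use_sup_mode(w_be, w_ef, t_be_sup, t_ef_sup, beta_be=0, beta_ef=1):
    """Condition (3.4): True selects SUP mode, False selects MUP mode.

    w_be and w_ef are the average HOL waiting times of (3.1) and (3.2).
    The default beta values are those of Table 3.1.
    """
    def step(x):
        return 1 if x > 0 else 0  # unit step function U(x)
    return beta_be * step(w_be - t_be_sup) + beta_ef * step(w_ef - t_ef_sup) > 0

def mup_allocate(q, csi, ep_size, max_users=8, min_ep=1024):
    """MUP user selection and bit allocation, (3.7)-(3.11).

    q[i], csi[i] and ep_size[i] are the user priority (3.6), channel state
    and encoder packet size (bits) of user i.
    Returns (EPSize_MUP, {user index: allocated bits}).
    """
    cands = [i for i in range(len(q)) if ep_size[i] >= min_ep]         # (3.7)
    if not cands:
        return 0, {}
    i_max = max(cands, key=lambda i: q[i])                             # (3.8)
    users = [i for i in cands if csi[i] >= csi[i_max]]                 # (3.9)
    users = sorted(users, key=lambda i: q[i], reverse=True)[:max_users]
    ep_mup = ep_size[i_max]                                            # (3.10)
    q_sum = sum(q[i] for i in users)
    if q_sum <= 0:
        return ep_mup, {}  # assumption: nothing to share out without positive priorities
    return ep_mup, {i: q[i] / q_sum * ep_mup for i in users}           # (3.11)
```

For instance, with priorities `[0.5, 0.3, 0.2, 0.9]`, channel states `[3, 3, 1, 5]` and encoder packet sizes `[2048, 2048, 1024, 512]`, the last user does not qualify for MUP, the first user is $i_{max}$, and the 2048-bit encoder packet is shared between the first two users in the ratio 5:3.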
ACLS-FUM also provides the best performance in terms of user jitter compared to the other four scheduling policies, which have almost identical performance, as shown in Fig. 3.3d. This improvement is achieved due to the ability to PDM the physical layer encoder packet using MUP transmission, which provides an increased number of available time slots to support low-rate latency-sensitive EF traffic. High $G_{LT}^{\mathrm{ACLS\text{-}FUM}}$ and $G_{PDP}^{\mathrm{ACLS\text{-}FUM}}$ (near-0% PDP) are achieved from the reduction in wastage in the physical layer encoder packet and a corresponding queue length reduction.

The simulation scenario for ρ = 0.90 (30 users, each with 1 BE and 1 EF flow) is created to demonstrate the achievable performance gains from MUP with MFM transmission in a mixed traffic (BE + EF) scenario. The results are presented in Fig. 3.4 with $G_{TP}^{\mathrm{ACLS\text{-}FUM}} = 66.58\%$, $G_{LT}^{\mathrm{ACLS\text{-}FUM}} = 42.84\%$, $G_{PDP}^{\mathrm{ACLS\text{-}FUM}} = 82.57\%$, $G_{JT,BE}^{\mathrm{ACLS\text{-}FUM}} = 16.81\%$ and $G_{JT,EF}^{\mathrm{ACLS\text{-}FUM}} = 39.06\%$. In this scenario, based on the transmission mode selection function defined in Section 3.3.1, 99.56% of the transmissions were MUP transmissions and the remaining 0.44% were SUP transmissions. As shown in Fig. 3.4, the simulation results confirm that ACLS-FUM performs better than the other four scheduling policies defined in Section 2.6.1 in terms of user throughput, user latency, user packet drop probability and user jitter, with the exception that MG has a slightly better BE throughput and BE packet drop probability. Fig. 3.4a shows that ACLS-FUM has the second highest throughput performance for BE (behind MG) and the highest throughput performance for EF. However, MG's BE throughput performance comes at a great cost to EF traffic: MG has not only the lowest EF throughput but also the highest EF PDP, as shown in Figs. 3.4a and 3.4c, respectively. In contrast, ACLS-FUM achieves a near-0% PDP for EF traffic and the second lowest PDP for BE traffic.
ACLS-FUM also has the lowest latency for BE (up to the 80th percentile) and the lowest latency for EF, as shown in Fig. 3.4b. While MG has a lower latency for the upper 20th percentile for BE, that is achieved at the expense of a 50% EF PDP at the 80th percentile, as shown in Fig. 3.4c. ACLS-FUM achieves the lowest user jitter for both BE and EF traffic.

It is worth highlighting that ACLS-FUM (MUP) outperforms ACLS-FM (SUP) in all four QoS metrics, primarily due to the increased packing efficiency of MUP transmission. A solution possible under SUP is also feasible under MUP. Therefore, optimizing over the set of possible MUP solutions will generally yield an improved optimal solution in any one of the four QoS metrics. An inductive proof of MUP throughput gain is presented in Appendix A.

From the simulation results, we note that the HOL average waiting time increases as the system loading ρ increases. ACLS-FUM is then more likely to select the SUP transmission mode in an attempt to clear the users' backlog. However, this could degrade the system performance to that of ACLS-FM (see Chapter 2). As such, further considerations should be taken into account when defining the transmission mode selection function so as to 1) achieve a balanced tradeoff between backlog reduction and MUP benefit maximization and 2) attempt to determine (other than from the DRC index) whether a user's low DRC index request is due to bad channel conditions or due to a lack of transmit data in the queue. Scheduling priority should be given to the users that are in better channel conditions (provided that the user fairness constraint is met) so as to not compromise system capacity.

3.5 Conclusion

An ACLS-FUM scheduling policy that integrates both MFM and PDM while jointly considering physical-layer time-varying channel conditions as well as application-layer QoS requirements in a mixed traffic environment has been proposed and evaluated.
In addition to exploiting the benefits of MFM and cross layer information, ACLS-FUM realizes additional performance gains by taking PDM of the shared physical layer encoder packet into account, further reducing wastage in the physical layer encoder packet. Simulation results show that ACLS-FUM can achieve substantial performance gains in user throughput, user latency, user packet drop probability and user jitter when compared to four other well-known scheduling policies.

[Figure 3.3: Performance for a System with ρ = 0.50, I = 110, 1 EF Flow for each User. Panels: (a) CDF of User Throughput (bps), (b) CDF of User Latency (s), (c) CDF of User Packet Drop Probability (%), (d) CDF of User Jitter (s). Curves: MG, MRR, MPF, ACLS-FM, ACLS-FUM.]

[Figure 3.4: Performance for a System with ρ = 0.90, I = 30, 1 BE and 1 EF Flow for each User. Panels, each shown separately for BE flows and EF flows: (a) CDF of User Throughput (bps), (b) CDF of User Latency (s), (c) CDF of User Packet Drop Probability (%), (d) CDF of User Jitter (s). Curves: MG, MRR, MPF, ACLS-FM, ACLS-FUM.]

Chapter 4

BitQoS-aware Resource Allocation Framework for Multi-carrier OFDM Systems³

4.1 Introduction

Orthogonal Frequency Division Multiplexing (OFDM) [57–59] is a promising technique for communication systems due to its high spectral efficiency and is currently employed in many communication systems, e.g., LTE [27], Worldwide Interoperability for Microwave Access (WiMAX IEEE 802.16) [28] and Very High bit rate Digital Subscriber Line (VHDSL) [60]. In OFDM, the available transmission bandwidth is divided into mutually orthogonal narrowband subcarriers and data is transmitted over these subcarriers. A higher spectral efficiency is possible as the orthogonality is achieved through proper selection of waveforms instead of reliance on guard bands as in conventional frequency division multiplexing (FDM). The system performance can be enhanced by adapting the modulation, coding and power to the channel quality of each subcarrier. In a multi-user system, as the channel quality on each subcarrier is likely to be independent among different users, OFDMA allows users to access subcarriers selectively, in time and frequency, to exploit multi-user and multi-channel diversities, providing increased scheduler flexibility and scalability to further improve system performance.

³ The material in this chapter is based on: C. E. Huang and C. Leung, “BitQoS-aware resource allocation for multi-user mixed-traffic OFDM systems,” IEEE Trans. Veh. Technol., vol. 61, no. 5, pp. 2067–2082, Jun. 2012. © 2012 IEEE. http://dx.doi.org/10.1109/TVT.2012.2189030

Dynamic bit-loading, transmission power allocation and subcarrier assignment schemes for multi-user OFDM systems have been devised to take advantage of the mechanisms and optimization features introduced in the air interfaces [27, 28, 33]. Much of the published work in OFDM RRM has focused on exploiting multi-user and multi-channel diversities [29–32] to maximize the system throughput subject to a total system transmit power constraint [30, 61, 62] or to minimize the total transmit power while satisfying a transmission rate for each user [63]. In addition, many of the RRM algorithms have previously focused on homogeneous traffic, where the traffic consists of either real-time or non-real-time traffic only. More recently, multi-application (flow) diversity [34–37, 39–41] has been exploited to address concurrent heterogeneous application QoS requirements in mixed-traffic networks.

In this chapter, we propose to increase the flexibility and granularity of the resource allocation algorithms by considering QoS at the bit-level rather than at the flow-level as in previous works [36–41]. This is achieved by adaptively matching the QoS requirements of the user application bits to the characteristics of the OFDM subcarriers. As shown in Fig. 4.1, the bits from each application flow of a given user are mapped into OFDM subcarriers based on a bitQoS-aware scheduling policy to exploit both multi-application and multi-bit diversities.
BitQoS represents a QoS prioritization mechanism which can take into consideration inter-user priorities, intra-user application QoS requirements and fairness in a multi-user mixed-traffic system. The selected application bits are then transmitted simultaneously on a set of OFDM subcarriers allocated to that user. The mapping between application bits and the OFDM subcarriers is signaled using the control channel accompanying the data channel. The receiver is then able to extract the application bits from the assigned OFDM subcarriers. While the proposed scheme requires additional scheduling signaling overhead and increased computational complexity, it provides the advantage of matching the QoS requirements of the application bits to the channel qualities of the OFDM subcarriers, and the critical ability to more closely meet the QoS requirements of multiple user application flows. This is not possible in flow-level scheduling since only flow-level QoS parameter values are considered. We formulate the proposed bitQoS-aware RA framework as two optimization problems: one with no flow merging and one with flow merging.

[Figure 4.1: Mapping of application bits to OFDM subcarriers for the bitQoS-aware resource allocation framework. There are no restrictions as to whether each subcarrier can carry bits from more than one application flow of a user. Diagram elements: application bits of users 1 to I (flows 1 to J_i), application QoS parameters and subcarrier channel state information as inputs to the bitQoS-aware scheduling policy, and transmitted bits on subcarriers over OFDM symbols k−1, k, k+1, ...]

This chapter is organized as follows: in Section 4.2, we present the system model that includes the network model and traffic classes. The bitQoS-aware RA framework with no flow merging and with flow merging are described in Section 4.3. The performance measures, analytical system throughput and comparative schemes are presented in Section 4.4.
4.2 System Model

The network model of a multi-user OFDM system and the BE and EF traffic classes are described in this section.

4.2.1 Network Model

We consider forward link transmissions in a multi-user OFDM system consisting of one BS servicing $I$ users with $N$ subcarriers in a single cell. Let $\mathcal{I} = \{1, 2, \ldots, I\}$ denote the set of all users and $\mathcal{N} = \{1, 2, \ldots, N\}$ denote the set of all subcarriers. User $i$ has $J_i$ application flows and $\mathcal{J}_i = \{1, 2, \ldots, J_i\}$ denotes the set of all application flows of user $i$. All application data packets to be transmitted to users are queued at the BS. We assume that the data buffer size at the BS is infinite (i.e., no packet blocking) and that the BS has knowledge of the data buffer parameters and QoS requirements for all the application flows. Bits in the data buffer are indexed by $z$, $z \in \{1, 2, \ldots, B_i^j(k)\}$, where $B_i^j(k)$ denotes the queue length, in bits, of the data buffer for user $i$, flow $j$ at time $k$, $k \in \{1, 2, \ldots, K\}$. Packets in the data queues are serviced in a FIFO fashion.

We assume that the BS has perfect knowledge of the channel gain, $\alpha_{i,n}$, of subcarrier $n$ for user $i$, $i \in \mathcal{I}$, $n \in \mathcal{N}$, from the feedback channel. In practice, for Time Division Duplex (TDD) systems, the BS is able to estimate the channel state information based on the received uplink transmission given the symmetry of the channel characteristics for the downlink and uplink. For Frequency Division Duplex (FDD) systems, pilot symbols are inserted in the downlink transmission [64] for the MS to estimate the channel state information. For simplicity, we do not consider the path loss or the effects of shadowing from the BS to MSs, and we assume that the subcarriers undergo independent and identically distributed (i.i.d.) Rayleigh fading to account for multipath fading. The fading rate is slow enough that $\alpha_{i,n}$ remains constant over an OFDM symbol duration, $T_s$, and the mean, $E\{|\alpha_{i,n}|^2\}$, of the channel power gain is assumed to be unity.
Let $p_{i,n}$ denote the transmit power allocated to user $i$ on subcarrier $n$. The corresponding number of bits that can be carried per OFDM symbol [65] is

\[ c_{i,n} = \log_2\!\left( 1 + \frac{p_{i,n}\,|\alpha_{i,n}|^2}{\zeta \sigma_0^2} \right), \tag{4.1} \]

where $\sigma_0^2$ denotes the noise power and $\zeta$ is an SNR gap parameter. For practical signal constellations, $\zeta$ reflects the Bit Error Rate (BER) requirement [65]. The scheduling decision is performed on an OFDM symbol basis and the total BS transmit power is $P_{\mathrm{total}}$. It is also assumed that the scheduling decision time is negligibly small compared to $T_s$ and that transmitted bits are received without errors.

4.2.2 Traffic Classes

Two traffic classes are considered: BE traffic representing Internet browsing-like applications and EF traffic representing VoIP-like applications. We use the web browsing traffic arrival model in [50] for incoming BE traffic. The application layer PDU size is based on a truncated Pareto distribution with a mean of 25 kBytes and minimum and maximum sizes of 4.5 kBytes and 2 MBytes, respectively. The application layer PDU interarrival time is geometrically distributed with a mean of 5 sec and takes on values which are multiples of 1 sec. For the EF traffic class, the VoIP traffic arrival model in [50] is assumed. In contrast to the web browsing model, source configuration and source files are used to generate VoIP traffic. The source file is generated based on the MSO model IS-871 with alterations as detailed in [50]. The application layer PDU size and interarrival time have a mean of 152.4 bits and 0.04 sec, respectively. The average traffic arrival rates $\lambda_{\mathrm{BE}}$ and $\lambda_{\mathrm{EF}}$ are 40.0 kbps and 3.7 kbps per application, respectively.

4.3 BitQoS-aware Resource Allocation Framework

In this section, we describe the bitQoS function and the bitQoS-aware resource allocation problem formulation with no flow merging and with flow merging.
4.3.1 BitQoS Function

Based on the application QoS requirements, data buffer parameters, inter- and intra-user priorities and fairness, the bitQoS function maps these QoS parameters of an application bit into a numerical value. The bitQoS function allows the scheduling priority of a bit to be raised when the QoS satisfaction level is low and vice versa. For example, for delay-sensitive traffic such as VoIP applications, the bitQoS function may be expressed as an exponentially increasing function of the bit waiting time, whereas for the BE traffic, the bitQoS function may be a constant. We define the bitQoS value of user $i$, flow $j$, bit $z$ as

\[ \psi_i^{j,z} = f(\boldsymbol{\theta}_i^{j,z}), \tag{4.2} \]

where $f(\cdot)$ denotes the bitQoS function and $\boldsymbol{\theta}_i^{j,z}$ denotes the tuple of QoS parameters of interest associated with user $i$, flow $j$, bit $z$. For this work, we consider a bitQoS function which includes the following QoS parameters: application flow priority and bit waiting time,

\[ \boldsymbol{\theta}_i^{j,z} = \{\pi_j,\; w_i^{j,z}(k)\}, \tag{4.3} \]

where $\pi_j \in \mathbb{R}^+$ is the application flow priority for flow $j$ and $w_i^{j,z}(k) \in [0, \infty)$ denotes the amount of time, in seconds, that the bit at position $z$ in the user $i$, flow $j$ buffer has waited. The term $\pi_j$ is included in (4.3) to account for the different traffic classes that may be present in a mixed-traffic system, and $w_i^{j,z}(k)$ is included to account for the bit waiting time since latency is a key QoS requirement for delay-sensitive traffic. Each bit is time stamped upon arrival in the data buffer, and the waiting time is found by simply subtracting the arrival time from the current time $k$. Bits are dropped if $w_i^{j,z}(k)$ exceeds the application flow scheduling delay threshold $T_j \in \mathbb{R}^+$ as specified in Table 4.2. If any bit within an application data packet is dropped, then all the bits in that application data packet are dropped. We define the bitQoS function as

\[ f(\boldsymbol{\theta}_i^{j,z}) = c_j\, \pi_j\, \xi_j^{\,d_j \left( w_i^{j,z}(k) - \eta_j \right)}. \tag{4.4} \]

The bitQoS function is expressed as an exponential of the bit waiting time $w_i^{j,z}(k)$, which allows bits from delay-sensitive application flows to have their service priority rapidly raised as the waiting time exceeds the comfort latency threshold $\eta_j \in \mathbb{R}^+$, where $\eta_j$ is set to a value smaller than $T_j$. In the region where $w_i^{j,z}(k) \le \eta_{\mathrm{EF}}$, the BE bits have a higher bitQoS value than the EF bits; this allows for a reduction in BE traffic backlog, if necessary. The base of the exponential function $\xi_j \in \mathbb{R}^+$ is set according to the delay sensitivity of the respective application flow. The coefficients, $c_j \in \mathbb{R}^+$ and $d_j \in \mathbb{R}^+$, are used to scale the exponential function if needed. Examples of the bitQoS functions for BE and EF are illustrated in Fig. 4.2.

[Figure 4.2: BE and EF BitQoS Functions. Axes: bitQoS value versus waiting time, with $\eta_{\mathrm{EF}}$, $\eta_{\mathrm{BE}}$, $T_{\mathrm{EF}}$ and $T_{\mathrm{BE}}$ marked on the waiting-time axis and the region where BE bitQoS > EF bitQoS indicated.]

4.3.2 BitQoS-aware Resource Allocation Framework with No Flow Merging

We formulate the proposed bitQoS-aware RA framework with no flow merging as an optimization problem with the objective of finding the joint subcarrier, power and bit assignment to maximize the total bitQoS-weighted throughput, subject to the total transmit power constraint. Let the optimization variable $a_{i,n}^j$ denote the subcarrier assignment variable, which takes on the value 1 if subcarrier $n$ is allocated to user $i$, flow $j$ and 0 otherwise. Furthermore, let the optimization variable $b_{i,n}^{j,z}$ denote the bit assignment variable, which takes on the value 1 if user $i$, flow $j$, bit $z$ is transmitted on subcarrier $n$ and 0 otherwise. Finally, the optimization variable $p_{i,n}^j \in [0, P_{\mathrm{total}}]$ denotes the transmit power for user $i$, flow $j$ on subcarrier $n$. The optimization problem, OP4.1, is formulated as

\[ \text{OP4.1:} \quad \max_{\substack{a_{i,n}^j \in \{0,1\} \\ p_{i,n}^j \in [0, P_{\mathrm{total}}] \\ b_{i,n}^{j,z} \in \{0,1\}}} \; \sum_{i=1}^{I} \sum_{j=1}^{J_i} \sum_{z=1}^{B_i^j} \sum_{n=1}^{N} f(\boldsymbol{\theta}_i^{j,z})\, b_{i,n}^{j,z} \tag{4.5} \]

subject to

\[ \sum_{i} \sum_{j} \sum_{n} p_{i,n}^j\, a_{i,n}^j \le P_{\mathrm{total}} \tag{4.6} \]
\[ \sum_{z} b_{i,n}^{j,z} \le c_{i,n}^j\, a_{i,n}^j \quad \forall\, i, j, n \tag{4.7} \]
\[ \sum_{i} \sum_{j} a_{i,n}^j \le 1 \quad \forall\, n \tag{4.8} \]
\[ \sum_{n} b_{i,n}^{j,z} \le 1 \quad \forall\, i, j, z. \tag{4.9} \]

Constraint (4.6) ensures that the sum of the transmit powers on all subcarriers does not exceed $P_{\mathrm{total}}$. Constraint (4.7) ensures that the total number of bits that user $i$, flow $j$ can transmit on subcarrier $n$ does not exceed the throughput limit $c_{i,n}^j$ given in (4.1). Constraint (4.8) ensures that each subcarrier can only be assigned to at most a single application flow of a user so as to reduce the signaling overhead required for the mapping of application bits to OFDM subcarriers. Constraint (4.9) ensures that each bit is only transmitted on one subcarrier.

4.3.3 BitQoS-aware Resource Allocation Framework with Flow Merging

In the previous section, each subcarrier assigned to a user is restricted to only carry bits from a single application flow of that user so as not to incur additional signaling overhead that may be required to indicate which application flow of the user each bit is from in the proposed bitQoS-aware RA framework. However, the bitQoS-aware RA framework with no flow merging increases the computational burden of the BS [34], as at each scheduling decision time $k$, the BS has to schedule $\sum_i J_i$ users instead of $I$ users. It may also result in some wastage in the event that there are not enough bits from an application flow to fill up subcarriers that have been assigned to it. Hence, we relax this constraint and allow application bits from different application flows of a user to be merged onto a single OFDM subcarrier to study the potential performance gains that can be further achieved with the bitQoS-aware RA framework with flow merging.

The proposed bitQoS-aware RA framework with flow merging is formulated as an optimization problem with the objective of finding the joint subcarrier, power and bit assignment to maximize the total bitQoS-weighted throughput, subject to the total transmit power constraint.
Let the optimization variable $a_{i,n}$ denote the subcarrier assignment variable, which takes on the value 1 if subcarrier $n$ is allocated to user $i$ and 0 otherwise. Furthermore, let the optimization variable $b_{i,n}^{j,z}$ denote the bit assignment variable, which takes on the value 1 if user $i$, flow $j$, bit $z$ is transmitted on subcarrier $n$ and 0 otherwise. Finally, the optimization variable $p_{i,n} \in [0, P_{\mathrm{total}}]$ denotes the transmit power for user $i$ on subcarrier $n$. The optimization problem, OP4.2, is formulated as

\[ \text{OP4.2:} \quad \max_{\substack{a_{i,n} \in \{0,1\} \\ p_{i,n} \in [0, P_{\mathrm{total}}] \\ b_{i,n}^{j,z} \in \{0,1\}}} \; \sum_{i=1}^{I} \sum_{j=1}^{J_i} \sum_{z=1}^{B_i^j} \sum_{n=1}^{N} f(\boldsymbol{\theta}_i^{j,z})\, b_{i,n}^{j,z} \tag{4.10} \]

subject to

\[ \sum_{i} \sum_{n} p_{i,n}\, a_{i,n} \le P_{\mathrm{total}} \tag{4.11} \]
\[ \sum_{j} \sum_{z} b_{i,n}^{j,z} \le c_{i,n}\, a_{i,n} \quad \forall\, i, n \tag{4.12} \]
\[ \sum_{i} a_{i,n} \le 1 \quad \forall\, n \tag{4.13} \]
\[ \sum_{n} b_{i,n}^{j,z} \le 1 \quad \forall\, i, j, z. \tag{4.14} \]

Constraint (4.11) ensures that the sum of the transmit powers on all subcarriers does not exceed $P_{\mathrm{total}}$. Constraint (4.12) ensures that the total number of bits that user $i$ can transmit on subcarrier $n$ does not exceed the throughput limit $c_{i,n}$ given in (4.1). Constraint (4.13) ensures that each subcarrier can only be assigned to at most one user. Note that this constraint has been relaxed from the problem formulation in OP4.1 to allow the subcarrier to be assigned to more than one flow of that user. Constraint (4.14) ensures that each bit is only transmitted on one subcarrier.

Table 4.1: Simulation Parameter Values

| Parameter | Value |
|---|---|
| System bandwidth (kHz) | $W = 4.5$ |
| Number of subcarriers | $N = 18$ |
| OFDM symbol duration (sec) | $T_s = 0.004$ |
| Subcarrier spacing (Hz) | $\Delta f = 250$ |
| Application data packet size (bits) | 128 |
| Channel model | independent Rayleigh fading |
| Total transmit power (Watt) | $P_{\mathrm{total}} = 1$ |
| SNR gap | $\zeta = 1$ |
| Noise power (Watt) | $\sigma_0^2 = 10^{-13}$ |
| MDU window length (OFDM symbols) | $W_{\mathrm{MDU}} = 200$ |

4.4 Performance Evaluation

The bitQoS scheduling policies are simulated in Matlab using the system model described in Section 4.2. Each user is assumed to have 1 BE flow and 1 EF flow. The parameter values used in our simulation are listed in Tables 4.1 and 4.2.
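To make the bitQoS function (4.4) concrete, the following Python sketch evaluates it with the BE and EF parameter values of Table 4.2. It is a minimal illustration under stated assumptions rather than the thesis' Matlab implementation: the `FlowClass` container and the function names are hypothetical, and the drop rule is applied to a single bit only (the thesis drops the whole application data packet when any of its bits is dropped).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowClass:
    """Hypothetical container for the per-flow parameters used in (4.4)."""
    pi: float   # flow priority pi_j
    c: float    # scaling coefficient c_j
    d: float    # scaling coefficient d_j
    xi: float   # urgency base xi_j
    eta: float  # comfort latency threshold eta_j (sec)
    T: float    # scheduling delay threshold T_j (sec)


# Parameter values taken from Table 4.2.
BE = FlowClass(pi=1.0, c=1.0, d=1000.0, xi=1.00, eta=0.100, T=3.000)
EF = FlowClass(pi=1.0, c=1.0, d=1000.0, xi=1.05, eta=0.025, T=0.100)


def bit_qos(wait, flow):
    """bitQoS value (4.4) of a bit that has waited `wait` seconds."""
    return flow.c * flow.pi * flow.xi ** (flow.d * (wait - flow.eta))


def is_dropped(wait, flow):
    """A bit is dropped once its waiting time exceeds T_j."""
    return wait > flow.T


# BE bits keep a constant bitQoS value because xi_BE = 1, while the EF
# value grows exponentially once the wait exceeds eta_EF = 0.025 sec.
for wait in (0.010, 0.025, 0.050, 0.090):
    print(wait, bit_qos(wait, BE), bit_qos(wait, EF))
```

With these values, an EF bit has a lower bitQoS value than a BE bit while its waiting time is below $\eta_{\mathrm{EF}}$ and a higher one afterwards, which matches the behaviour described for Fig. 4.2.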
4.4.1 Performance Measures

To evaluate the performance of the bitQoS-aware RA framework, we quantify the performance gains in terms of average system throughput, average user throughput, average user latency, average user jitter and average user packet drop probability.

Table 4.2: Traffic Parameter Values

| Parameter | BE Traffic | EF Traffic |
|---|---|---|
| Packet size | Truncated Pareto ($\alpha = 1.2$, $x_{\min} = 4.5$ kBytes and $x_{\max} = 2$ MBytes) | IS-871 with alterations as detailed in [50] |
| Packet interarrival time | Geometric distribution (mean = 5 sec) | IS-871 with alterations as detailed in [50] |
| Average traffic arrival rate (kbps) | $\lambda_{\mathrm{BE}} = 40.0$ | $\lambda_{\mathrm{EF}} = 3.70$ |
| Scheduling delay threshold (sec) | $T_{\mathrm{BE}} = 3.000$ | $T_{\mathrm{EF}} = 0.100$ |
| Comfort latency threshold (sec) | $\eta_{\mathrm{BE}} = 0.100$ | $\eta_{\mathrm{EF}} = 0.025$ |
| Flow priority | $\pi_{\mathrm{BE}} = 1.00$ | $\pi_{\mathrm{EF}} = 1.00$ |
| Flow scaling coefficients | $c_{\mathrm{BE}} = 1.00$, $d_{\mathrm{BE}} = 1000$ | $c_{\mathrm{EF}} = 1.00$, $d_{\mathrm{EF}} = 1000$ |
| Urgency base | $\xi_{\mathrm{BE}} = 1.00$ | $\xi_{\mathrm{EF}} = 1.05$ |

We define the user throughput, user latency, user jitter and user packet drop probability, respectively, as follows:

\[ TP_{\mathrm{user}}(i) = \frac{\sum_{j} \sum_{k} TP_i^j(k)}{K}, \tag{4.15} \]
\[ LT_{\mathrm{user}}(i) = \frac{\sum_{j} \sum_{\mathrm{bit}(i,j,z) \in \Phi_i^j} LT_i^{j,z}}{K \times TP_{\mathrm{user}}(i)}, \tag{4.16} \]
\[ JT_{\mathrm{user}}(i) = \frac{\sum_{j} \sum_{\mathrm{bit}(i,j,z) \in \Phi_i^j} \left( LT_i^{j,z} - LT_{\mathrm{user}}(i) \right)^2}{K \times TP_{\mathrm{user}}(i)}, \tag{4.17} \]
\[ PDP_{\mathrm{user}}(i) = \frac{\sum_{j} \sum_{k} BD_i^j(k)}{K \times TP_{\mathrm{user}}(i) + \sum_{j} \sum_{k} BD_i^j(k)}, \tag{4.18} \]

where $TP_i^j(k)$ denotes the number of bits contained in packets of user $i$, flow $j$ which are successfully received at time $k$, $LT_i^{j,z}$ denotes the amount of time $\mathrm{bit}(i,j,z)$ has waited in the data buffer before being scheduled and $BD_i^j(k)$ denotes the number of bits contained in packets of user $i$, flow $j$ which are dropped at time $k$. The term $\Phi_i^j$ denotes the set of all the scheduled bits of user $i$, flow $j$. Note that if any bit within an application data packet is dropped, then that application data packet is dropped. Furthermore, in the calculation of user latency and user jitter, only bits in the scheduled packets (i.e., packets that are not dropped due to exceeding $T_j$) are included.
The average system throughput, $TP_{\mathrm{system}}$, is defined as $TP_{\mathrm{system}} = \sum_{i} TP_{\mathrm{user}}(i)$, and the average user throughput, average user latency, average user jitter and average user packet drop probability are obtained by averaging $TP_{\mathrm{user}}(i)$, $LT_{\mathrm{user}}(i)$, $JT_{\mathrm{user}}(i)$ and $PDP_{\mathrm{user}}(i)$, respectively, over the $I$ users.

4.4.2 Analytical System Throughput

For comparison, the analytical system throughput curve of a multi-user, multi-channel OFDM system with full buffer using water-filling is derived. The analytical system throughput is obtained by summing the capacities of the i.i.d. Rayleigh fading subcarriers subject to the total power constraint $P_{\mathrm{total}}$ over the set $\mathcal{N}$ and distribution $p_{\Gamma_n}(\gamma_n)$ [59], where $\Gamma_n \triangleq |\alpha_{i^*(n),n}|^2 P_{\mathrm{total}} / (\zeta \sigma_0^2)$ denotes the instantaneous SNR of subcarrier $n$ assuming the total BS transmit power, $P_{\mathrm{total}}$, is allocated to that subcarrier and $i^*(n) = \arg\max_{i \in \mathcal{I}} \{|\alpha_{i,n}|^2\}$:

\[ C = \max_{\substack{p_{i^*(n),n}(\gamma_n):\\ \sum_{n} \int_0^{\infty} p_{i^*(n),n}(\gamma_n)\, p_{\Gamma_n}(\gamma_n)\, d\gamma_n \le P_{\mathrm{total}}}} \; \sum_{n} \int_0^{\infty} \log_2\!\left( 1 + \frac{p_{i^*(n),n}(\gamma_n)\, \gamma_n}{P_{\mathrm{total}}} \right) p_{\Gamma_n}(\gamma_n)\, d\gamma_n. \tag{4.19} \]

We use the results in [66], which derives the capacity of a single Rayleigh fading channel with multi-receiver antennas using selection combining under optimal simultaneous power and rate adaptation, to obtain

\[ p_{\Gamma_n}(\gamma_n) = \frac{I}{E\{\Gamma_n\}} \left( 1 - e^{-\gamma_n / E\{\Gamma_n\}} \right)^{I-1} e^{-\gamma_n / E\{\Gamma_n\}}, \]

where the term $E\{\Gamma_n\}$ denotes the expected value of $\Gamma_n$. The analytical system throughput can be expressed as

\[ C = \log_2(e) \sum_{n=1}^{N} \sum_{z=1}^{I} (-1)^{z+1} \binom{I}{z} E_1\!\left( \frac{z \gamma_0}{E\{\Gamma_n\}} \right), \tag{4.20} \]

where $E_1(x) \triangleq \int_x^{\infty} \frac{e^{-u}}{u}\, du$ denotes the exponential integral of order 1. The optimal power allocation is a two-dimensional water-filling (over $\mathcal{N}$ and $p_{\Gamma_n}(\gamma_n)$) with a common cutoff SNR value, $\gamma_0$, which is obtained by solving

\[ \sum_{n=1}^{N} \sum_{z=1}^{I} (-1)^{z} \binom{I}{z} \left[ \frac{z}{E\{\Gamma_n\}}\, E_1\!\left( \frac{z \gamma_0}{E\{\Gamma_n\}} \right) - \frac{1}{\gamma_0}\, e^{-z \gamma_0 / E\{\Gamma_n\}} \right] = 1. \tag{4.21} \]

Since the Left-Hand-Side (LHS) of (4.21) is a monotonically decreasing function of $\gamma_0$, $\forall\, I \ge 1$, $N \ge 1$, $\gamma_0 > 0$ and $E\{\Gamma_n\} > 0$, the solution can be found numerically using a bisection algorithm (see proof in Appendix B).
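The bisection search for the cutoff SNR $\gamma_0$ in (4.21) can be sketched in Python as follows. This is an illustrative sketch only: the function names, the bracketing strategy and the hand-written routine for $E_1(x)$ (a power series for $x \le 1$ and a continued fraction otherwise) are assumptions introduced here, and the example values of $I$, $N$ and $E\{\Gamma_n\}$ are arbitrary rather than taken from the simulations.

```python
import math

EULER_GAMMA = 0.5772156649015329


def exp1(x):
    """Exponential integral E1(x) for x > 0 (series / continued fraction)."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    if x <= 1.0:
        # E1(x) = -gamma - ln(x) - sum_{k>=1} (-x)^k / (k * k!)
        total = -EULER_GAMMA - math.log(x)
        term = 1.0
        for k in range(1, 60):
            term *= -x / k          # term = (-x)^k / k!
            total -= term / k
        return total
    # Continued fraction evaluated with the modified Lentz method.
    b = x + 1.0
    c = 1.0 / 1e-300
    d = 1.0 / b
    h = d
    for i in range(1, 500):
        a = -float(i * i)
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < 1e-14:
            break
    return h * math.exp(-x)


def cutoff_lhs(gamma0, num_users, mean_snrs):
    """Left-hand side of (4.21); mean_snrs[n] plays the role of E{Gamma_n}."""
    total = 0.0
    for g_bar in mean_snrs:
        for z in range(1, num_users + 1):
            arg = z * gamma0 / g_bar
            bracket = (z / g_bar) * exp1(arg) - math.exp(-arg) / gamma0
            total += (-1) ** z * math.comb(num_users, z) * bracket
    return total


def solve_cutoff(num_users, mean_snrs, iterations=200):
    """Bisection on gamma0, relying on the LHS being decreasing in gamma0."""
    lo, hi = 1e-12, 1.0
    while cutoff_lhs(hi, num_users, mean_snrs) > 1.0:
        hi *= 2.0                   # expand until the LHS drops below 1
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if cutoff_lhs(mid, num_users, mean_snrs) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Arbitrary example values, for illustration only.
gamma0_single = solve_cutoff(num_users=1, mean_snrs=[1.0])
gamma0_multi = solve_cutoff(num_users=4, mean_snrs=[1.0] * 18)
```

Bisection is a natural choice here because monotonicity of the LHS guarantees a single crossing of 1, so no derivative information or initial guess is needed beyond a bracketing interval.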
4.4.3 Comparative Schemes

To provide a comparative performance assessment, we consider the following comparative scheduling policies: 1) Multi-user Water-filling (WF) scheduling policy, 2) Multi-user Water-filling with Full Buffer (WF-FB) scheduling policy, and 3) MDU scheduling policy [38]. Since there are no provisions in WF and WF-FB for choosing which application flow to schedule from among the flows of a given user, each application flow $j$ of user $i$ is regarded as a separate user with the same channel gain $\alpha_{i,n}$ in the simulation. When multiple users experiencing the same channel gain are considered for assignment to a subcarrier, one user is chosen completely at random. All scheduling policies, with the exception of WF-FB, adopt the traffic model described in Section 4.2.2.

Multi-user Water-filling

WF assigns each subcarrier to the user that has the best channel gain for that subcarrier, and the transmit power is distributed over the subcarriers using the water-filling algorithm [30]. The purpose of including this scheduling policy is to illustrate the performance of an algorithm that does not take QoS requirements into account but attempts to maximize the overall throughput of the system.

Multi-user Water-filling with Full Buffer

WF-FB is similar to WF described above, except that in WF-FB, a full buffer model is assumed for the incoming traffic, i.e., all data buffers are always full. While this model may not be realistic, it establishes an upper bound on the throughput achievable for a multi-user, multi-channel OFDM system with full buffer using water-filling.

Max-Delay-Utility

MDU is a channel- and queue-aware, dynamic power-subcarrier assignment scheme which aims to maximize the aggregate utility with respect to the average waiting times [38].
The objective function of the optimization problem is

\[ \max \sum_{i \in \mathcal{I}} \frac{\left| U_i'(\bar{w}_i(k)) \right|}{\bar{r}_i(k)}\, r_i(k), \tag{4.22} \]

where $\bar{r}_i(k)$ is the long-term average throughput for user $i$ up to time $k$, which is obtained by averaging the instantaneous actual throughput of user $i$ over the last $W_{\mathrm{MDU}}$ OFDM symbols, and $r_i(k)$ is the instantaneous achievable throughput for user $i$ at time $k$. The term $\bar{w}_i(k)$ denotes the average waiting time of user $i$ at time $k$, which is approximated by $\bar{w}_i(k) = Q_i(k) / \bar{r}_i(k)$, where $Q_i(k)$ is the queue length, in bits, of user $i$ at time $k$. The marginal utility function $U_i'(\cdot)$ is a non-decreasing function which is chosen based on the QoS requirements of the traffic classes. For our simulation, we adopt the marginal utility functions specified in [38]. The solution to the optimization problem in (4.22) is found by a combination of iterative subcarrier assignment, power allocation and the update of the marginal utility [38]. The purpose of including this scheduling policy is to illustrate the performance gains and tradeoffs of WFH-FM with respect to MDU, which only considers the flow-level QoS requirements.

Chapter 5

BitQoS-aware Resource Allocation Scheduling Policies⁴

5.1 Introduction

In Chapter 4, a novel bitQoS-aware RA framework is proposed to increase the flexibility and granularity of the resource allocation algorithms by considering QoS at the bit-level rather than only at the flow-level as in previous works [34–37, 39–41]. The proposed RA framework is formulated as MINLP optimization problems (OP4.1 and OP4.2), whose solutions are computationally complex given the large number of subcarriers and users in a practical system. To evaluate the performance of the bitQoS-aware RA framework, we propose lower complexity, iterative subcarrier-power-bit allocation algorithms, hereafter referred to as Multi-user Water-filling with Heuristics (WFH) and Multi-user BitQoS-aware Bit-loading (BABL), to quantify the achievable performance gains.
This chapter is organized as follows: in Section 5.2, the water-filling-based WFH schedul4 The material in this chapter is based on the following: C. E. Huang and C. Leung, “QoS-aware bit scheduling in multi-user OFDM systems,” in Proc. IEEE WCNC, Mar. 2011, pp. 215–220. c 2011 IEEE. http://dx.doi.org/10.1109/WCNC.2011.5779163 C. E. Huang and C. Leung, “BitQoS-aware resource allocation for multi-user mixed-traffic OFDM systems,” IEEE Trans. Veh. Technol., vol. 61, no. 5, pp. 2067-2082, Jun. 2012. c 2012 IEEE. http://dx.doi.org/10.1109/TVT.2012.2189030 64 ing policy is described and the bit-loading-based BABL is described in Section 5.3. Simulation results are presented in Section 5.4 including the performance of the bitQoS-aware RA framework with flow merging and with no flow merging. The main findings are summarized in Section 5.5. 5.2 Multi-user Water-filling with Heuristics To evaluate the performance of the bitQoS-aware RA framework with flow merging and with no flow merging, we propose water-filling-based iterative subcarrier-power-bit allocation algorithms, hereafter referred to as Multi-user Water-filling with Heuristics with Flow Merging (WFH-FM) to solve the optimization problem, OP4.2, and Multi-user Water-filling with Heuristics with No Flow Merging (WFH-NFM) to solve the optimization problem, OP4.1. The goal of the WFH scheduling policies is to maximize the total bitQoS-weighted throughput. It uses the following two main steps: 1) multi-user water-filling for throughput maximization and 2) iterative subcarrier reassignment for bitQoS maximization. 5.2.1 WFH-FM Scheduling Policy At each scheduling decision time k, we run the following resource allocation algorithm. To ease the notational burden, we omit the time index k from the equations in this section. The bits from all flows of a user are combined into one queue, i.e., Ji = 1, and sorted in decreasing order based on their bitQoS values. A flow chart for the WFH-FM scheduling policy is shown in Fig. 
5.1.

Step 1: Multi-user water-filling: To simplify the maximization of the total bitQoS-weighted throughput, we assume that the bitQoS values, $\psi_i^{j,z}$, of all the bits are equal, so that the objective function in OP4.2 can be rewritten as

$$\max_{\substack{a_{i,n} \in \{0,1\} \\ p_{i,n} \in [0, P_{total}] \\ b_{i,n}^{j,z} \in \{0,1\}}} \sum_{i=1}^{I} \sum_{j=1}^{J_i} \sum_{z=1}^{B_i^j} \sum_{n=1}^{N} b_{i,n}^{j,z}. \tag{5.1}$$

This new optimization problem can be solved as a throughput maximization problem subject to a total power constraint [30, 67–69]. In [30], it is shown that when a full buffer model is assumed, the maximum throughput of a multi-user OFDM system can be achieved by assigning each subcarrier to the user with the best channel gain for that subcarrier and distributing the power over subcarriers using the water-filling algorithm. Let $i^*(n)$ denote the selected user that has the highest channel gain on subcarrier $n$, i.e., $i^*(n) = \arg\max_{i \in \mathcal{I}} |\alpha_{i,n}|$. In the event that multiple users experience the same channel gain $\max_{i \in \mathcal{I}} |\alpha_{i,n}|$, $i^*(n)$ is chosen randomly from these users with equal probabilities. Thus, the current subcarrier assignment can be written as

$$\hat{a}_{i,n} = \begin{cases} 1, & \text{if } i = i^*(n) \\ 0, & \text{if } i \neq i^*(n) \end{cases} \quad \forall n. \tag{5.2}$$

The term $\hat{a}_{i,n}$ is used to denote the current intermediate subcarrier assignment variable which may be different from the optimal subcarrier assignment variable, $a_{i,n}$, for OP4.2. Once the subcarrier assignment is determined, we can determine the amount of transmit power to be allocated to the subcarriers in order to maximize overall system throughput. This is achieved using the water-filling algorithm. The transmit power for user $i$ on subcarrier $n$ [30] is

$$\hat{p}_{i,n} = \begin{cases} \left[ \dfrac{1}{\lambda_0} - \dfrac{\zeta\sigma_0^2}{|\alpha_{i,n}|^2} \right]^+, & \text{if } i = i^*(n) \\ 0, & \text{if } i \neq i^*(n) \end{cases} \tag{5.3}$$

where $[x]^+ \triangleq \max\{x, 0\}$ and $\lambda_0$ is a threshold determined using the total power constraint (4.11). The bit assignment variable $\hat{b}_{i,n}^{j,z}$ is then obtained by assigning bits of user $i$ in a FIFO manner to the subcarriers in $V_i$, one subcarrier at a time, where $V_i = \{n \in \mathcal{N} \mid \hat{a}_{i,n} = 1\}$.
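The subcarrier assignment in (5.2) and the water-filling power allocation in (5.3) can be sketched in a few lines of Python. This is a minimal illustration rather than the simulation code used in this chapter: the function names, the list-based data layout and the numbers in the demo are assumptions, `noise_to_gain[n]` stands for $\zeta\sigma_0^2/|\alpha_{i^*(n),n}|^2$, the returned `level` plays the role of $1/\lambda_0$, and ties are broken by lowest user index instead of randomly.

```python
def assign_subcarriers(gains):
    """Return i*(n) for every subcarrier n, as in (5.2).

    gains[i][n] is |alpha_{i,n}| for user i on subcarrier n. Ties go to the
    lowest user index here; the text breaks ties uniformly at random.
    """
    num_users = len(gains)
    num_subcarriers = len(gains[0])
    return [max(range(num_users), key=lambda i: gains[i][n])
            for n in range(num_subcarriers)]


def water_fill(noise_to_gain, p_total):
    """Return (powers, level) with powers[n] = max(level - noise_to_gain[n], 0).

    level corresponds to 1 / lambda_0 in (5.3). For p_total > 0 it is chosen
    so that the powers add up to p_total.
    """
    # Consider subcarriers from best (smallest noise-to-gain) to worst.
    order = sorted(range(len(noise_to_gain)), key=lambda n: noise_to_gain[n])
    level = 0.0
    active = len(order)
    while active > 0:
        # Water level if exactly the `active` best subcarriers carry power.
        level = (p_total + sum(noise_to_gain[n] for n in order[:active])) / active
        if level > noise_to_gain[order[active - 1]]:
            break  # the worst active subcarrier still gets positive power
        active -= 1
    powers = [0.0] * len(noise_to_gain)
    for n in order[:active]:
        powers[n] = level - noise_to_gain[n]
    return powers, level


if __name__ == "__main__":
    # Hypothetical gains for 2 users and 3 subcarriers, with zeta*sigma_0^2 = 1.
    gains = [[1.0, 0.5, 0.25], [0.5, 1.0, 0.2]]
    owners = assign_subcarriers(gains)
    ntg = [1.0 / gains[owners[n]][n] ** 2 for n in range(len(owners))]
    powers, level = water_fill(ntg, p_total=2.0)
    print(owners, powers, level)
```

In the demo, the third subcarrier is too weak to rise above the water level and receives no power, which is the behaviour that the $[\cdot]^+$ operator in (5.3) encodes.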
This FIFO bit assignment is performed until either all the bits of user $i$ have been assigned or the throughput limits $c_{i,n}$, $\forall n \in V_i$, have been reached.

We note that performing subcarrier assignments based only on channel gains may lead to situations where users in good channel conditions are assigned more subcarriers than needed, i.e., $\sum_n c_{i,n} > \sum_j B_i^j$. To address this issue, we perform an additional subcarrier reassignment step, called greedy water-filling, which aims to reduce the wastage of resources through reassignments of the users’ excess subcarriers. We define $\mathcal{U} = \{i \in \mathcal{I} \mid \sum_n c_{i,n} > \sum_j B_i^j\}$ and $\mathcal{U}^c = \mathcal{I} - \mathcal{U}$ to denote the set of users that have excess subcarriers and the complement set of $\mathcal{U}$, respectively. Furthermore, we define $\Omega_{\mathcal{U}} = \{n \in \mathcal{N} \mid \hat{a}_{i,n} = 1, i \in \mathcal{U}\}$ to denote the set of subcarriers that are assigned to users in $\mathcal{U}$. The goal of the greedy water-filling is to iteratively reassign one subcarrier in $\Omega_{\mathcal{U}}$ at a time to a user in $\mathcal{U}^c$ such that the overall system throughput after reassignment is maximized. This can be done by computing the attainable throughput gain for every possible reassignment pair in the Cartesian product of $\mathcal{U}^c$ and $\Omega_{\mathcal{U}}$ and performing the subcarrier reassignment based on the pair yielding the highest throughput gain. Power allocation is updated after each reassignment using the water-filling algorithm. This procedure is repeated until the overall system throughput cannot be increased any further through the reassignments of the subcarriers in $\Omega_{\mathcal{U}}$.

Based on the current assignment values $\hat{a}_{i,n}$, $\hat{p}_{i,n}$ and $\hat{b}_{i,n}^{j,z}$, the current intermediate objective value $\hat{\delta}_{obj}$ is given by

$$\hat{\delta}_{obj} = \sum_{i=1}^{I} \sum_{j=1}^{J_i} \sum_{z=1}^{B_i^j} \sum_{n=1}^{N} f(\boldsymbol{\theta}_i^{j,z})\, \hat{b}_{i,n}^{j,z}. \tag{5.4}$$

Step 2: Iterative subcarrier reassignment: While the intermediate solution from Step 1 maximizes the overall system throughput, it may not be an optimal solution to OP4.2.
In particular, if there exists any unassigned bit in the data buffer with a bitQoS value that is greater than those of any already assigned bits, then the intermediate solution may be improved upon by reassigning subcarriers to the users who have unassigned bits with larger bitQoS values. Let $\psi_{un}(i)$ denote the bitQoS value of the first unassigned bit in the data buffer of user $i$ and let $\psi_{as}(i)$ denote the bitQoS value of the last assigned bit in the data buffer of user $i$, that is

$$\psi_{un}(i) = \max_{bit(i,j,z) \in S_{un}(i)} f(\boldsymbol{\theta}_i^{j,z}) \quad \text{and} \quad \psi_{as}(i) = \min_{bit(i,j,z) \in S_{as}(i)} f(\boldsymbol{\theta}_i^{j,z}),$$

where

$$S_{un}(i) = \Big\{ bit(i,j,z) \,\Big|\, \sum_n \hat{b}_{i,n}^{j,z} = 0,\ j \in \mathcal{J}_i,\ z \in \{1, \ldots, B_i^j\} \Big\} \tag{5.5}$$

denotes the set of bits of user $i$ that have not yet been assigned based on the current intermediate assignments, and

$$S_{as}(i) = \Big\{ bit(i,j,z) \,\Big|\, \sum_n \hat{b}_{i,n}^{j,z} = 1,\ j \in \mathcal{J}_i,\ z \in \{1, \ldots, B_i^j\} \Big\} \tag{5.6}$$

denotes the set of bits of user $i$ that have been assigned based on the current intermediate assignments. The term $bit(i,j,z)$ refers to bit $z$ of user $i$, flow $j$.

[Figure 5.1: WFH-FM Flow Chart]
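The quantities $\psi_{un}(i)$ and $\psi_{as}(i)$, and the comparison of $\max_i \psi_{un}(i)$ with $\min_i \psi_{as}(i)$ that terminates Step 2, can be illustrated with the following Python sketch. It relies on two assumptions that are not stated in the text: the assigned bits of each user form a prefix of that user's queue (which holds when the queue is sorted in decreasing bitQoS order and filled in FIFO order), and a user with no unassigned bits contributes 0 while a user with no assigned bits contributes $+\infty$. The function names and sample values are hypothetical.

```python
import math


def psi_values(sorted_bitqos, num_assigned):
    """Return (psi_un, psi_as) for one user.

    sorted_bitqos holds the user's bitQoS values in decreasing order and the
    first num_assigned entries are the bits already assigned.
    """
    # First unassigned bit: largest bitQoS value among the unassigned bits.
    psi_un = sorted_bitqos[num_assigned] if num_assigned < len(sorted_bitqos) else 0.0
    # Last assigned bit: smallest bitQoS value among the assigned bits.
    psi_as = sorted_bitqos[num_assigned - 1] if num_assigned > 0 else math.inf
    return psi_un, psi_as


def step2_should_stop(queues, assigned_counts):
    """True when max_i psi_un(i) <= min_i psi_as(i)."""
    pairs = [psi_values(q, c) for q, c in zip(queues, assigned_counts)]
    return max(un for un, _ in pairs) <= min(as_ for _, as_ in pairs)


def next_user(queues, assigned_counts):
    """Return l* = argmax_i psi_un(i), the user considered for reassignment."""
    return max(range(len(queues)),
               key=lambda i: psi_values(queues[i], assigned_counts[i])[0])


if __name__ == "__main__":
    queues = [[9, 7, 3], [8, 6, 5, 1]]  # hypothetical bitQoS values per user
    print(step2_should_stop(queues, [1, 1]))  # every waiting bit is less urgent
    print(step2_should_stop(queues, [3, 1]), next_user(queues, [3, 1]))
```

In the second call, user 1 still holds an unassigned bit (value 6) that outranks the last bit assigned to user 0 (value 3), so Step 2 would continue and user 1 would be the candidate for a subcarrier reassignment.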
At each iteration, the user with the largest unassigned bitQoS-valued bit, $l^* = \arg\max_i \psi_{un}(i)$, will be assigned a subcarrier in an attempt to increase the current intermediate objective value $\hat{\delta}_{obj}$, even though power may be less efficiently used as this user may be experiencing a lower channel quality on this subcarrier. The subcarrier is chosen by $n^* = \arg\max_{n \in D_{l^*}} \alpha_{l^*,n}$, where $D_{l^*} = \{n \in \mathcal{N} \mid \hat{a}_{l^*,n} = 0\}$ denotes the set of subcarriers that have not yet been assigned to user $l^*$. If the subcarrier $n^*$ was previously assigned to another user, then subcarrier $n^*$ is unassigned from that user and the corresponding assigned bits are put back into the data buffers. Based on this new subcarrier assignment variable, $\hat{a}'_{i,n}$, the transmit power, $\hat{p}'_{i,n}$, is recalculated using the water-filling algorithm. The bit assignment variable, $\hat{b}^{\prime\,j,z}_{i,n}$, and the resulting objective value, $\hat{\delta}'_{obj}$, are also updated accordingly. As this subcarrier reassignment may cause a decrease in the objective value, it is only performed if $\hat{\delta}'_{obj} > \hat{\delta}_{obj}$. Otherwise, $\psi_{un}(l^*)$ is temporarily set to 0, and a new user with the next largest unassigned bitQoS-valued bit is selected. This step repeats until

$$\max_{i \in \mathcal{I}} \psi_{un}(i) \le \min_{i \in \mathcal{I}} \psi_{as}(i). \tag{5.7}$$

It can be shown that the number of iterations required for Step 2 in the worst case is $IN$.

5.2.2 WFH-NFM Scheduling Policy

WFH-NFM is identical to WFH-FM with the exception that application bits from the different flows of a user cannot be assigned to the same subcarrier, i.e., each subcarrier assigned to the user can only carry bits from a single application flow of that user. As such, the bits from all flows of a user are not combined into one queue as in WFH-FM, but rather each application flow $j$ of user $i$ in WFH-NFM is regarded as a separate user with the same channel gain $\alpha_{i,n}$. Specifically, OP4.2 is modified as follows: we replace constraint (4.13) with

$$\sum_{i=1}^{I} \sum_{j=1}^{J_i} a_{i,n}^{j} \le 1 \quad \forall n. \tag{5.8}$$

In addition, the variables $a_{i,n}$, $p_{i,n}$ and $c_{i,n}$ take dependence on $j$ and are replaced with $a_{i,n}^j$, $p_{i,n}^j$ and $c_{i,n}^j$ in WFH-NFM.

5.3 Multi-user BitQoS-aware Bit-loading

To evaluate the performance of the bitQoS-aware RA framework with flow merging and with no flow merging, we propose the following bit-loading-based adaptive, joint subcarrier, power and bit allocation algorithms, hereafter referred to as Multi-user BitQoS-aware Bit-loading with Flow Merging (BABL-FM) to solve the optimization problem, OP4.2, and Multi-user BitQoS-aware Bit-loading with No Flow Merging (BABL-NFM) to solve the optimization problem, OP4.1.

The goal of the BABL scheduling policies is to jointly determine the subcarrier, power and bit assignments using bit-loading in an effort to maximize the total bitQoS-weighted throughput subject to the total transmit power constraint. This is accomplished by iteratively assigning the largest unassigned bitQoS-valued bits, one bit at a time, to the subcarrier requiring the least amount of power, until the total BS transmit power, $P_{total}$, is depleted or all bits in the user data buffers have been assigned.

5.3.1 BABL-FM Scheduling Policy

At each scheduling decision time $k$, we run the following resource allocation algorithm. To ease the notational burden, we omit the time index $k$ from the equations in this section. The bits from all application flows of each user $i$ are merged into one queue, i.e., $J_i = 1$, and sorted in decreasing order based on their bitQoS values. A flow chart for the BABL-FM scheduling policy is shown in Fig. 5.2. For each bit assignment iteration, we determine the largest unassigned bitQoS-valued bit in the data buffer as

$$bit(i^*, j^*, z^*) = \arg\max_{bit(i,j,z):\ \sum_n \hat{b}_{i,n}^{j,z} = 0} \psi_i^{j,z}, \tag{5.9}$$

where $bit(i,j,z)$ refers to bit $z$ of user $i$, flow $j$. The term $\hat{b}_{i,n}^{j,z}$ is used to denote the current intermediate bit assignment variable which may be different from the optimal bit assignment variable, $b_{i,n}^{j,z}$, for OP4.2.
The power required to transmit this bit is computed for each subcarrier $n \in \mathcal{N}$ and is denoted by the temporary variable, ${p'}^{j^*,z^*}_{i^*,n}$. Depending on the current intermediate subcarrier assignment variable, $\hat{a}_{i,n}$, the power, ${p'}^{j^*,z^*}_{i^*,n}$, is determined by one of the following three cases:

Case 1: Subcarrier $n$ was previously not assigned to any user $i$ ($\hat{a}_{i,n} = 0\ \forall i \in \mathcal{I}$): The power, ${p'}^{j^*,z^*}_{i^*,n}$, required to transmit $bit(i^*, j^*, z^*)$ on subcarrier $n$ is

$${p'}^{j^*,z^*}_{i^*,n} = \frac{\zeta\sigma_0^2}{|\alpha_{i^*,n}|^2}. \tag{5.10}$$

Case 2: Subcarrier $n$ was previously assigned to user $i^*$ ($\hat{a}_{i^*,n} = 1$): The power, ${p'}^{j^*,z^*}_{i^*,n}$, required to transmit the additional $bit(i^*, j^*, z^*)$ on subcarrier $n$ is

$$\begin{aligned} {p'}^{j^*,z^*}_{i^*,n} &= \frac{(2^{\hat{c}_{i^*,n}+1} - 1)\zeta\sigma_0^2}{|\alpha_{i^*,n}|^2} - \frac{(2^{\hat{c}_{i^*,n}} - 1)\zeta\sigma_0^2}{|\alpha_{i^*,n}|^2} \\ &= \frac{\zeta\sigma_0^2}{|\alpha_{i^*,n}|^2} \left( 2^{\hat{c}_{i^*,n}+1} - 2^{\hat{c}_{i^*,n}} \right), \end{aligned} \tag{5.11}$$

where $\hat{c}_{i^*,n}$ denotes the number of bits of user $i^*$ that have already been assigned to subcarrier $n$.

Case 3: Subcarrier $n$ was previously assigned to another user $l$ ($\hat{a}_{l,n} = 1$): As each subcarrier can only be assigned to at most one user based on constraint (4.13), allocating $bit(i^*, j^*, z^*)$ to subcarrier $n$ will first require reallocating the bits of user $l$ that were previously assigned to subcarrier $n$ to other subcarriers. We define

$$S_{l,n} = \{ bit(l,j,z) \mid \hat{b}_{l,n}^{j,z} = 1,\ j \in \mathcal{J}_l,\ z \in \{1, \ldots, B_l^j\} \} \tag{5.12}$$

to denote the set of bits of user $l$ currently assigned to subcarrier $n$. To prevent nested bit reallocations, we restrict the reallocation of bits in $S_{l,n}$ only to subcarriers that are either unassigned or previously assigned to user $l$. We define

$$\Omega_l = \{ m \in \mathcal{N} \mid \hat{a}_{i,m} = 0\ \forall i \in \mathcal{I} \text{ or } \hat{a}_{l,m} = 1 \} \tag{5.13}$$

to denote the set of subcarriers that bits in $S_{l,n}$ can be reallocated to. The bit reallocations are done iteratively in a FIFO manner by assigning bits in $S_{l,n}$, one bit at a time, to the subcarriers in $\Omega_l$. For each $bit(l,j,z) \in S_{l,n}$, the power, ${p'}^{j,z}_{l,m}$, required to reallocate the bit is computed using either (5.10) or (5.11) for all $m \in \Omega_l$.
The subcarrier that requires the least power, $m^* = \arg\min_{m \in \Omega_l} {p'}^{j,z}_{l,m}$, is selected. This procedure repeats until all the bits in $S_{l,n}$ have been reallocated. The power, $\hat{p}_{l,n}$, previously assigned to user $l$, subcarrier $n$ is reclaimed and $bit(i^*, j^*, z^*)$ is assigned to subcarrier $n$. The power, ${p'}^{j^*,z^*}_{i^*,n}$, required to transmit $bit(i^*, j^*, z^*)$ on subcarrier $n$ is

$${p'}^{j^*,z^*}_{i^*,n} = \frac{\zeta\sigma_0^2}{|\alpha_{i^*,n}|^2} - \hat{p}_{l,n} + \sum_{bit(l,j,z) \in S_{l,n}} \min_{m \in \Omega_l} {p'}^{j,z}_{l,m}. \tag{5.14}$$

[Figure 5.2: BABL-FM Flow Chart]

Based on the above three possible cases, the subcarrier that requires the least power to transmit $bit(i^*, j^*, z^*)$ is selected as $n^* = \arg\min_{n \in \mathcal{N}} {p'}^{j^*,z^*}_{i^*,n}$ and this bit assignment is performed if $\sum_i \sum_n \hat{p}_{i,n} + {p'}^{j^*,z^*}_{i^*,n^*} \le P_{total}$. The current intermediate optimization variables are then updated as follows:

$$\hat{a}_{i^*,n^*} = 1 \tag{5.15}$$

$$\hat{b}^{j^*,z^*}_{i^*,n^*} = 1 \tag{5.16}$$

$$\hat{p}_{i^*,n^*} = \begin{cases} \dfrac{\zeta\sigma_0^2}{|\alpha_{i^*,n^*}|^2}, & \text{for Case 1;} \\[2ex] \dfrac{(2^{\hat{c}_{i^*,n^*}+1} - 1)\zeta\sigma_0^2}{|\alpha_{i^*,n^*}|^2}, & \text{for Case 2;} \\[2ex] \dfrac{\zeta\sigma_0^2}{|\alpha_{i^*,n^*}|^2}, & \text{for Case 3,} \end{cases} \tag{5.17}$$

along with the reallocation of the bits in $S_{l,n^*}$ for Case 3 as necessary. The next largest bitQoS-valued bit is then selected for the next bit assignment iteration. This iterative algorithm repeats until

$$\sum_i \sum_n \hat{p}_{i,n} + {p'}^{j^*,z^*}_{i^*,n^*} > P_{total} \tag{5.18}$$

or until all the bits in the data buffers of all users have been assigned.
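The incremental power computation in (5.10) and (5.11) and the greedy loop that stops at (5.18) can be sketched as follows. This is a deliberately reduced illustration and not the full BABL-FM policy: it covers only Cases 1 and 2, so a bit is never allowed to displace another user's bits (Case 3), and the sketch simply stops if the most urgent bit has no eligible subcarrier. The function names, data layout and numbers are assumptions, and `noise_to_gain[i][n]` stands for $\zeta\sigma_0^2/|\alpha_{i,n}|^2$.

```python
def incremental_power(bits_loaded, noise_to_gain):
    """Power needed to add one more bit to a subcarrier, as in (5.11).

    With bits_loaded = 0 this reduces to the Case 1 expression (5.10).
    """
    return (2 ** (bits_loaded + 1) - 2 ** bits_loaded) * noise_to_gain


def bit_loading_cases_1_2(bit_values, noise_to_gain, p_total):
    """Greedy bit-loading restricted to Cases 1 and 2 (no reallocation).

    bit_values[i] lists the bitQoS values of user i's queued bits.
    Returns (schedule, power_used) where schedule holds
    (user, subcarrier, bitQoS value) in the order the bits were placed.
    """
    num_subcarriers = len(noise_to_gain[0])
    owner = [None] * num_subcarriers   # user holding each subcarrier
    loaded = [0] * num_subcarriers     # bits already on each subcarrier
    power_used = 0.0
    # Largest unassigned bitQoS-valued bit first, as in (5.9).
    pending = sorted(((value, user)
                      for user, values in enumerate(bit_values)
                      for value in values), key=lambda item: -item[0])
    schedule = []
    for value, user in pending:
        candidates = [n for n in range(num_subcarriers)
                      if owner[n] in (None, user)]
        if not candidates:
            break  # the full policy would evaluate Case 3 here
        best = min(candidates,
                   key=lambda n: incremental_power(loaded[n], noise_to_gain[user][n]))
        extra = incremental_power(loaded[best], noise_to_gain[user][best])
        if power_used + extra > p_total:
            break  # stopping rule (5.18)
        owner[best] = user
        loaded[best] += 1
        power_used += extra
        schedule.append((user, best, value))
    return schedule, power_used


if __name__ == "__main__":
    # Hypothetical 2-user, 2-subcarrier example with a power budget of 4.
    schedule, used = bit_loading_cases_1_2(
        bit_values=[[5, 3], [4, 2]],
        noise_to_gain=[[1.0, 4.0], [2.0, 1.0]],
        p_total=4.0)
    print(schedule, used)
```

In the demo, the second bit placed on subcarrier 0 costs twice as much as the first, reflecting the exponential growth of the required power with the number of loaded bits in (5.11); the last bit is left unassigned because adding it would exceed the budget.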
5.3.2 BABL-NFM Scheduling Policy

BABL-NFM is identical to BABL-FM with the exception that application bits from the different flows of a user cannot be assigned to the same subcarrier, i.e., each subcarrier assigned to the user can only carry bits from a single application flow of that user. As such, the bits from all flows of a user are not combined into one queue as in BABL-FM, but rather each application flow $j$ of user $i$ in BABL-NFM is regarded as a separate user with the same channel gain $\alpha_{i,n}$. Specifically, OP4.2 is modified as follows: we replace constraint (4.13) with

$$\sum_{i=1}^{I} \sum_{j=1}^{J_i} a_{i,n}^{j} \le 1 \quad \forall n. \tag{5.19}$$

In addition, the variables $a_{i,n}$, $p_{i,n}$ and $c_{i,n}$ take dependence on $j$ and are replaced with $a_{i,n}^j$, $p_{i,n}^j$ and $c_{i,n}^j$ in BABL-NFM.

5.4 Simulation Results

The WFH-FM, WFH-NFM, BABL-FM and BABL-NFM scheduling policies described in Sections 5.2.1, 5.2.2, 5.3.1 and 5.3.2, respectively, were simulated in Matlab using the system model described in Section 4.2. In the simulation, it is assumed that each user has 1 BE flow and 1 EF flow. The parameter values used in our simulation are listed in Tables 4.1 and 4.2.

5.4.1 WFH Simulation Results

We next discuss the performance of WFH-FM and WFH-NFM and the comparative schemes described in Section 4.4.3 under two system loading scenarios: A) heavy load and B) different loads. Simulation results were obtained for mixed-traffic scenarios for $I = \{4, 6, 8\}$ with a simulation length of $K = \{18000, 10000, 12000\}$ OFDM symbols, respectively.

WFH-FM performance under heavy load

The CDF plots for user throughput, user bit latency, user bit jitter and number of user bits dropped for a system with $I = 8$, $N = 18$, 1 BE and 1 EF flow for each user are shown in Fig. 5.3. The CDF plots are averaged over $I$ users and obtained from $TP_i^j(k)$, $LT_i^{j,z}$, $JT_i^{j,z}$ and $BD_i^j(k)$, respectively, where the term $JT_i^{j,z} = |LT_i^{j,z} - LT_{user}(i)|$. Figs.
5.3a and 5.3d show that WFH-FM not only has the highest user throughput for BE but also the lowest number of user bits dropped for both BE and EF traffic. From Fig. 5.3a, it can be seen that the MDU user EF throughput is slightly higher than that of WFH-FM; however, the MDU user BE throughput and MDU number of user BE bits dropped are significantly worse compared to WFH-FM as the MDU scheduling policy strictly favors EF traffic in a mixed-traffic environment. WF, on the other hand, achieves the lowest user EF throughput and also the highest number of user EF bits dropped as it does not have QoS provisioning to schedule EF traffic in an attempt to meet the scheduling delay threshold.

[Figure 5.3: WFH: Performance for a System with I = 8, N = 18, 1 BE and 1 EF Flow for each User. (a) CDF of User Throughput (b) CDF of User Bit Latency (c) CDF of User Bit Jitter (d) CDF of Number of User Bits Dropped per 250 OFDM Symbols. Each panel shows BE Flows and EF Flows for WF, MDU, WFH-FM and WFH-NFM.]

In terms of user bit latency, Fig.
5.3b shows that WFH-FM has the lowest user BE bit latency and the second lowest user EF bit latency. Only MDU has a better user EF bit latency than WFH-FM due to its strict bias towards EF traffic in a mixed-traffic environment. WFH-FM does not achieve the lowest user bit latency for EF traffic because the fine-grained control of the bitQoS-aware RA framework trades off a longer user EF scheduling delay (albeit within the packet drop threshold) for gains in both the user BE throughput (highest) and number of user BE and EF bits dropped (lowest). Similar to user bit latency, we see from Fig. 5.3c that WFH-FM has the lowest user BE bit jitter and the second lowest user EF bit jitter. The simulation results confirm the performance gains of the proposed WFH-FM scheduling policy, which adopts the bitQoS-aware RA framework, over the other comparative scheduling policies.

WFH-FM performance under different loads

The average system throughput of the scheduling policies with no flow merging (WF, WF-FB, MDU and WFH-NFM) and the scheduling policy with flow merging (WFH-FM) as a function of $I$ is shown in Fig. 5.4. We see from Fig. 5.4 that the analytical throughput agrees very closely with the simulation results of WF-FB and that WFH-FM achieves the highest overall system throughput when compared to WF and MDU. While the objective function of WFH-FM is to maximize total bitQoS-weighted throughput, WFH-FM provides a good overall system throughput in part due to the adoption of the greedy multi-user water-filling in the first step of WFH-FM. The average system throughput of WFH-FM increases monotonically with $I$, for $I = \{4, 6, 8\}$. The plots in Fig. 5.5 show the average user throughput, average user latency, average user jitter and average user packet drop probability for $I = \{4, 6, 8\}$ users.
It can be observed that the performance of WFH-FM relative to the comparative scheduling policies is, in general, insensitive to the different loads across all the QoS metrics considered.

[Figure 5.4: WFH: Average System Throughput under Different Loads. Average system throughput (bps) versus number of users for Analytical Throughput, WF-FB, WF, MDU, WFH-NFM and WFH-FM.]

In Fig. 5.5a, we see that the user BE throughput decreases monotonically and the user EF throughput is relatively constant for all scheduling policies as $I$ (system loading) increases. Figs. 5.5a and 5.5d show that WFH-NFM/FM not only have the highest user throughput but also the lowest user packet drop probability for both BE and EF traffic. While the MDU user EF throughput is close to that of WFH-NFM/FM, the MDU user BE throughput and MDU user BE packet drop probability are significantly worse than WFH-NFM/FM as the MDU scheduling policy strictly favors EF traffic in a mixed-traffic environment. On the other hand, WF has the lowest user BE throughput and also the highest user EF packet drop probability as it does not have QoS provisioning to meet the scheduling delay thresholds. In terms of user latency, Fig. 5.5b shows that WFH-NFM/FM have the lowest user BE latency. MDU has a lower user EF latency than WFH-NFM/FM due to its strict bias towards EF traffic in a mixed-traffic environment. However, the MDU user EF latency gain comes at the expense of the user BE throughput and user BE packet drop probability. WF, with no QoS provisions, suffers the highest user BE latency. WFH-NFM/FM do not achieve the lowest user latency for EF traffic as the bitQoS-aware scheduling trades off a longer user scheduling delay (albeit within the scheduling delay threshold) for gains in both the user throughput (highest) and user packet drop probability (lowest) for both BE and EF traffic.
This trade-off is possible since in OFDM, data is loaded onto subcarriers in units of bits and the latency QoS is satisfied as long as the bit waiting time does not exceed the scheduling delay threshold. By applying the bitQoS function at the bit-level as proposed, system providers can trade off the bit waiting time for a reduction in the user packet drop probability by prioritizing which bit to transmit based on its closeness to the scheduling delay threshold. This finer resolution of control provides an additional flexibility to push back the scheduling of bits that are not as close to the scheduling delay threshold (i.e., by increasing the bit waiting time) so as to allow the servicing of more “urgent” bits when necessary. As long as this push-back does not cause the bit waiting time to exceed the scheduling delay threshold, bits will be serviced within their scheduling delay thresholds, resulting in a simultaneous increase in user throughput and a reduction in user packet drop probability. The WFH-NFM/FM user EF latency is also influenced by the bitQoS function (4.4) which explicitly gives bits from the BE (delay-tolerant) flows a higher urgency when $w_i^{BE,z}(k) \le \eta_{EF}$ so as to reduce the BE buffer backlog, if necessary. Similar to user latency, we see from Fig. 5.5c that WFH-NFM/FM have the lowest user BE jitter. For delay and jitter sensitive applications (EF flows), we note from Figs. 5.5b and 5.5c that the plots for WFH-NFM/FM are flatter than WF as $I$ varies, i.e., user EF latency and user EF jitter are less sensitive to different system loads. This is particularly beneficial for the sizing of input buffers in mobile devices for delay and jitter sensitive applications. As shown in Fig. 5.5d, WFH-NFM/FM are able to maintain the lowest user packet drop probability for both BE and EF traffic across the different system loads and thus provide the highest user throughput for both BE and EF traffic among the comparative scheduling policies.
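The trade-off between bit waiting time and dropped bits discussed in this section can be illustrated with a toy example. The sketch below does not implement the bitQoS function (4.4); it only contrasts a strict EF-first rule with a rule that serves the bit closest to its scheduling delay threshold. The thresholds, slot length, capacity and function names are all hypothetical values chosen for illustration.

```python
def simulate(bits, capacity, slot_ms, priority):
    """Serve `capacity` bits per slot in `priority` order; return (served, dropped).

    Each bit is a dict with keys "flow", "wait_ms" and "threshold_ms". A bit
    whose waiting time exceeds its threshold is dropped.
    """
    queue = [dict(b) for b in bits]  # copy so the caller's data is untouched
    served = dropped = 0
    while queue:
        alive = [b for b in queue if b["wait_ms"] <= b["threshold_ms"]]
        dropped += len(queue) - len(alive)
        alive.sort(key=priority)
        served += len(alive[:capacity])
        queue = alive[capacity:]
        for b in queue:
            b["wait_ms"] += slot_ms  # unserved bits wait one more slot
    return served, dropped


def ef_first(bit):
    """Strict flow-level priority: EF bits always go before BE bits."""
    return 0 if bit["flow"] == "EF" else 1


def least_slack(bit):
    """Bit-level priority: the bit closest to its delay threshold goes first."""
    return bit["threshold_ms"] - bit["wait_ms"]


if __name__ == "__main__":
    # Hypothetical queue: a fresh EF bit and a BE bit about to expire.
    bits = [
        {"flow": "EF", "wait_ms": 0, "threshold_ms": 100},
        {"flow": "BE", "wait_ms": 1980, "threshold_ms": 2000},
    ]
    print(simulate(bits, capacity=1, slot_ms=50, priority=ef_first))
    print(simulate(bits, capacity=1, slot_ms=50, priority=least_slack))
```

Under the EF-first rule the BE bit expires while waiting, whereas the slack-based rule delays the EF bit by one slot, still within its threshold, and both bits are delivered. This mirrors, in a very reduced form, how a longer EF waiting time can be exchanged for fewer dropped bits.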
[Figure 5.5: WFH: Performance for Systems under Different Loads. (a) Average User Throughput (b) Average User Latency (c) Average User Jitter (d) Average User Packet Drop Probability. Each panel is plotted against the number of users for WF, MDU, WFH-NFM and WFH-FM.]

5.4.2 BABL Simulation Results

We next discuss the performance of BABL-FM and BABL-NFM and the comparative schemes described in Section 4.4.3 under two system loading scenarios: A) heavy load and B) different loads. Simulation results were obtained for mixed-traffic scenarios for $I = \{4, 6, 8\}$ with a simulation length of $K = \{15000, 11000, 4500\}$ OFDM symbols, respectively.

BABL-FM performance under heavy load

The CDF plots of user throughput, user bit latency, user bit jitter and number of user bits dropped for a system with $I = 8$, $N = 18$, 1 BE and 1 EF flow for each user are shown in Fig. 5.6. The CDF plots are averaged over $I$ users and obtained from $TP_i^j(k)$, $LT_i^{j,z}$, $JT_i^{j,z}$ and $BD_i^j(k)$, respectively, where the term $JT_i^{j,z} = |LT_i^{j,z} - LT_{user}(i)|$. Figs.
5.6a and 5.6d show that BABL-FM not only has the highest user throughput but also the lowest number of user bits dropped for both BE and EF traffic. From Fig. 5.6a, it can be seen that the MDU user EF throughput is close to that of BABL-FM; however, the MDU user BE throughput and MDU number of user BE bits dropped are significantly worse compared to BABL-FM as the MDU scheduling policy strictly favors EF traffic in a mixed-traffic environment. WF, on the other hand, achieves the lowest user EF throughput and also the highest number of user EF bits dropped as it does not have QoS provisioning to schedule EF traffic in an attempt to meet the scheduling delay threshold. In terms of user bit latency, Fig. 5.6b shows that BABL-FM has the lowest user BE bit latency. However, MDU has a lower user EF bit latency than BABL-FM due to its strict bias towards EF traffic in a mixed-traffic environment but the MDU user EF bit latency gain comes at the expense of the user BE throughput and number of user BE bits dropped. BABL-FM does not achieve the lowest user bit latency for EF traffic as the bitQoS-aware scheduling trades off a longer user EF scheduling delay (albeit within the scheduling delay threshold) for gains in both the user throughput (highest) and number of user bits dropped (lowest) for both BE and EF traffic. 
The BABL-FM user EF bit latency is also influenced by the bitQoS function (4.4) which explicitly gives bits from the BE (delay-tolerant) flows a higher urgency when $w_i^{BE,z}(k) \le \eta_{EF}$ so as to reduce the buffer backlog, resulting in the EF traffic being allocated with $LT_i^{j,z}$ around $\eta_{EF}$ as shown in Fig. 5.6b. In addition, we note that since BABL-FM has a negligible number of user EF bits dropped, we can further improve the BABL-FM user BE throughput by either increasing $\eta_{EF}$ to a value closer to $T_{EF}$ or increasing the priority of BE traffic by adjusting the parameter values of the BE bitQoS function. In contrast, WF, with no QoS provisions, has the highest user BE bit latency and the second highest user EF bit latency. We see from Figs. 5.6c and 5.6d that BABL-FM has the lowest user bit jitter and lowest number of user bits dropped for both BE and EF traffic.

[Figure 5.6: BABL: Performance for a System with I = 8, N = 18, 1 BE and 1 EF Flow for each User. (a) CDF of User Throughput (b) CDF of User Bit Latency (c) CDF of User Bit Jitter (d) CDF of Number of User Bits Dropped per 250 OFDM Symbols]

BABL-FM performance under different loads

The average system throughput of the various scheduling policies (WF, WF-FB, MDU, BABL-FM and BABL-NFM) shown in Fig. 5.7 was obtained by simulation as a function of $I$ based on the system model described in Section 4.2 with the simulation parameter values listed in Tables 4.1 and 4.2. We see from Fig. 5.7 that the analytical throughput agrees very closely with the simulation results of WF-FB and that BABL-FM achieves the highest average system throughput when compared to WF and MDU.
While the objective function of BABL-FM is to maximize the total bitQoS-weighted throughput, it provides a good average system throughput in part due to BABL-FM iteratively assigning the largest unassigned bitQoS-valued bit to the subcarrier requiring the least amount of power. The average system throughput of BABL-FM increases monotonically with $I$, for $I = \{4, 6, 8\}$.

[Figure 5.7: BABL: Average System Throughput under Different Loads]

The plots in Fig. 5.8 show the average user throughput, average user latency, average user jitter and average user packet drop probability, obtained by averaging $TP_{user}(i)$, $LT_{user}(i)$, $JT_{user}(i)$ and $PDP_{user}(i)$ over $I$ users, respectively, for $I = \{4, 6, 8\}$. We see from Fig. 5.8 that the performance of BABL-FM relative to the comparative scheduling policies is, in general, insensitive to the different loads across all the QoS metrics considered. In Fig. 5.8a, we see that the user BE throughput increases monotonically only for BABL-FM as $I$ (system loading) increases and the user EF throughput is constant for all scheduling policies. In terms of user latency, we see from Fig. 5.8b that BABL-FM achieves the lowest user BE latency and the highest user EF latency across all $I$ in part due to the mixed-traffic bitQoS function (4.4) allowing BE traffic to reduce backlog when $w_i^{BE,z}(k) \le \eta_{EF}$. In terms of user jitter, we see from Fig. 5.8c that BABL-FM achieves the lowest user BE jitter whereas the user EF jitter decreases as $LT_i^{EF,z}$ clusters around $\eta_{EF}$ when $I$ increases. In addition, BABL-FM is also able to maintain the lowest BE and EF user packet drop probabilities across the different system loads as shown in Fig. 5.8d and thus provides the highest user throughput for both BE and EF traffic among the comparative scheduling policies. Note that Figs. 5.6, 5.7 and 5.8 do not show the optimal solution to OP4.2 using the commercial MINLP optimization solver package due to the prohibitive computation time required.
[Figure 5.8: BABL: Performance for Systems under Different Loads. (a) Average User Throughput (b) Average User Latency (c) Average User Jitter (d) Average User Packet Drop Probability]

5.5 Conclusion

A bitQoS-aware RA framework that adaptively matches the QoS requirements of the user application bits to the characteristics of the OFDM subcarriers was proposed for multi-user OFDM systems. The performance gains achievable from the proposed framework are demonstrated using suboptimal water-filling-based WFH and bit-loading-based BABL scheduling policies. The results show that with the fine-grained bit-level control provided by the proposed framework, it is possible to simultaneously achieve both an increase in throughput and a reduction in packet drop probability in a mixed-traffic environment at the cost of a longer (albeit within the packet drop threshold) scheduling delay. This flexibility comes from the realization that in OFDM, data is loaded onto subcarriers in units of bits and the latency QoS is satisfied as long as the bit waiting time does not exceed the scheduling delay threshold. By applying the bitQoS function at the bit-level as proposed, system providers can trade off the bit waiting time for a reduction in the number of dropped packets by prioritizing which bit to transmit based on its closeness to the scheduling delay threshold. This finer resolution of control provides an additional flexibility to push back the scheduling of bits that are not as close to the scheduling delay threshold (i.e., by increasing the bit waiting time) so as to allow the servicing of more “urgent” bits when necessary.
As long as this push-back does not cause the bit waiting time to exceed the scheduling delay threshold, bits will be serviced within their scheduling delay thresholds, resulting in a simultaneous increase in user throughput and a reduction in the number of user bits dropped.

Simulation results, obtained using the proposed WFH and BABL scheduling policies, show that the proposed bitQoS-aware RA framework is able to provide a substantial improvement in user throughput and user packet drop probability compared to scheduling policies that do not take QoS provisions into account, such as WF, and policies that consider only application flow QoS requirements, such as MDU. In particular, WFH and BABL are also able to achieve the highest average system throughput across all considered system loads. In addition, it was found that in a multi-application system, the performance gains obtained by allowing bits from different application flows of a user to be merged into a single subcarrier for transmission are small, and flow merging should only be used if such gains, at the expense of the increased scheduling signaling overhead, are warranted. However, the framework gives service providers the option to choose, based on computational resource availability, whether to let the BS fully take on the scheduling task with less scheduling signaling overhead as in WFH/BABL-NFM or let the MS share the computational burden with the BS at the expense of increased scheduling signaling overhead as in WFH/BABL-FM.

Chapter 6
Scheduling Signaling Overhead in BitQoS-aware Resource Allocation Framework⁵

6.1 Introduction

A novel bitQoS-aware RA framework is proposed in Chapter 4 which allows the exploitation of both multi-application and multi-bit diversities (in addition to multi-user and multichannel diversities) in mixed-traffic OFDMA systems.
It is shown in Chapter 5 that with the fine control of bitQoS-aware scheduling, it is possible to simultaneously achieve both an increase in user throughput and a reduction in user packet drop probability by accepting an increase in user latency that stays within the packet drop threshold. However, as the granularity of RRM scheduling algorithms increases to more closely meet the different QoS requirements of multiple concurrent user application flows, the potential scheduling gain comes at the cost of an increased scheduling signaling overhead. This is because the mapping between the application bits and the OFDM subcarriers needs to be signaled using the control channel accompanying the data channel so that the receiver is able to extract the application bits from the assigned OFDM subcarriers.

Footnote 5: The material in this chapter is based on: C. E. Huang and C. Leung, “Scheduling signaling overhead in bitQoS-aware multi-flow OFDM systems,” submitted.

Only a few of the numerous published papers on RRM scheduling algorithms consider the scheduling signaling overhead. In [70], the compression of signaling information for adaptive multi-carrier systems is studied. It is shown that efficient compression schemes can reduce the amount of signaling information and increase system transmission efficiency. In [71], the authors attempt to reduce the scheduling signaling overhead by exploiting the correlation of the scheduling information in time. This is achieved by changing the subcarrier assignments in successive scheduling intervals only if the gain in system throughput is larger than the signaling overhead incurred with the reassignment. In [72], an algorithm for OFDMA downlink scheduling under a control signaling cost constraint is proposed.
The authors formulate the subcarrier assignment as a combinatorial optimization problem with the objective of finding the subcarrier assignment that maximizes the system throughput while penalizing the cost of transmitting the scheduling information. In [73], a new scheme for encoding the scheduling information which exploits the correlation among different users’ scheduling assignments is proposed to reduce the amount of scheduling information that needs to be transmitted. The scheme assumes that users with a high SNR can decode the scheduling information intended for all other users with a lower SNR and thus the scheduling information can be encoded differentially.

In this chapter, we formulate a scheduling signaling overhead model to analyze the scheduling signaling overhead associated with the proposed bitQoS-aware RA framework and consider different schemes to compress the scheduling signaling information. The effective system throughput gains of the bitQoS-aware RA framework are determined so as to assess the tradeoff between the scheduling gain and the signaling overhead.

This chapter is organized as follows: in Section 6.2, the scheduling signaling overhead model is presented and the required scheduling signaling information is described in Section 6.3. The entropy of the scheduling signaling information is evaluated in Section 6.4 and different schemes to compress the scheduling signaling information bits are described in Section 6.5. The simulation results are presented in Section 6.6 and the main findings are summarized in Section 6.7.

6.2 Scheduling Signaling Overhead Model

We examine the scheduling signaling overhead incurred by the proposed bitQoS-aware RA framework based on the control signaling evaluation model proposed in [74], where the authors compared the effects of different scheduling granularities, scheduling policies and control signaling transmission strategies in OFDMA systems.
Specifically, we look at the case where the assignment of subcarriers is on a per-resource-element basis (i.e., each of the $N$ subcarriers can be assigned to different flows/users and the subcarrier assignment is updated at every OFDM symbol). It is assumed that the scheduling signaling information is compressed (when necessary) and broadcast to all $I$ users and that the scheduling signaling information bits for each OFDM symbol are transmitted with the application bits at each scheduling decision time $k$. In practice, the scheduling signaling information broadcast message needs to be encoded such that the user with the weakest channel condition is able to decode it. For simplicity, we also assume that subcarrier resources are pre-reserved for the transmission of the scheduling signaling information, i.e., there is no need to reallocate the application bits to take into account the transmission of the scheduling signaling information bits.

6.3 Scheduling Signaling Information

We consider the required scheduling signaling information bits at each scheduling decision time $k$ for 1) scheduling policies with no flow merging (NFM), 2) scheduling policies with flow merging (FM) and 3) scheduling policies with flow merging - grouped sorted (FMGS). The three types of scheduling policies are illustrated in Fig. 6.1.

[Figure 6.1: Mapping of application bits to OFDM subcarriers with different bitQoS-aware scheduling policies. NFM: Each subcarrier can only carry bits from a single application flow of a user. FM: Each subcarrier can carry bits from more than one application flow of a user. FMGS: Each subcarrier can carry bits from more than one application flow of a user. In addition, the bits on each subcarrier are grouped in a FIFO fashion by application flows and sorted in an ascending order by the flow index, $j$, prior to transmission.]

6.3.1 Scheduling Policies with No Flow Merging

The scheduling decision at each scheduling decision time $k$ is represented by a $1 \times N$ subcarrier-to-flow vector, $\mathbf{U}^{NFM}(k) \triangleq \{u_n^{NFM}(k), \forall n \in \mathcal{N}\}$. The $n$-th element, $u_n^{NFM}(k)$, of the subcarrier-to-flow vector, $\mathbf{U}^{NFM}(k)$, is an integer from the set $\mathcal{J}_{sys} = \{1, \ldots, J_{sys}\}$ that indicates the flow $j$ of user $i$ to which subcarrier $n$ is assigned at time $k$. The set, $\mathcal{J}_{sys}$, contains the indices to all $flow(i, j)$, $\forall i \in \mathcal{I}, j \in \mathcal{J}_i$ in the system and the term, $J_{sys} \triangleq \sum_{i=1}^{I} J_i$, denotes the total number of flows in the system.

6.3.2 Scheduling Policies with Flow Merging

The scheduling decision at each scheduling decision time $k$ is represented by a $1 \times N$ subcarrier-to-user vector, $\mathbf{U}^{FM}(k) \triangleq \{u_n^{FM}(k), \forall n \in \mathcal{N}\}$, and $N$ $1 \times M_n(k)$ bit-to-flow vectors, $\mathbf{V}_n^{FM}(k) \triangleq \{v_{n,z}^{FM}(k), \forall z = 1, \ldots, M_n(k)\}$, $\forall n \in \mathcal{N}$, where $M_n(k)$ is the total number of bits carried by subcarrier $n$ at time $k$. The $n$-th element, $u_n^{FM}(k)$, of the subcarrier-to-user vector, $\mathbf{U}^{FM}(k)$, is an integer from the set $\mathcal{I}$ that indicates the user $i$ to which subcarrier $n$ is assigned at time $k$. The $z$-th element, $v_{n,z}^{FM}(k)$, of the bit-to-flow vector for each subcarrier $n$, $\mathbf{V}_n^{FM}(k)$, is an integer from the set $\mathcal{J}_{u_n^{FM}(k)}$ that indicates the flow of user $u_n^{FM}(k)$ to which the $z$-th bit on subcarrier $n$ is assigned at time $k$.

6.3.3 Scheduling Policies with Flow Merging - Grouped Sorted

Scheduling policies with flow merging, as described above, make no assumption about the ordering of bits on each subcarrier $n$. While FM allows the BS to merge bits from different application flows without restriction, a significant amount of scheduling signaling overhead is incurred to communicate the mapping between the application bits on each OFDM subcarrier and application flows.
To reduce the scheduling signaling overhead without decreasing the performance gains provided by the bitQoS-aware RA framework with flow merging, we can require the bits scheduled on each subcarrier to be grouped in a FIFO fashion by application flows and sorted in an ascending order by the flow index, $j$, prior to transmission. Hence, instead of having the BS signal the mapping between the application bits on each OFDM subcarrier and application flows for every single scheduled bit to be transmitted to the MS, the BS only needs to signal the number of consecutive bits belonging to each application flow $j$ on each subcarrier $n$.

The scheduling decision at each scheduling decision time $k$ is represented by a $1 \times N$ subcarrier-to-user vector, $\mathbf{U}^{FMGS}(k) \triangleq \{u_n^{FMGS}(k), \forall n \in \mathcal{N}\}$, and $N$ $1 \times J_{u_n^{FMGS}(k)}$ bit-to-flow vectors, $\mathbf{V}_n^{FMGS}(k) \triangleq \{\tau_j^{FMGS}(k), \forall j \in \mathcal{J}_{u_n^{FMGS}(k)}\}$, $\forall n \in \mathcal{N}$. The $n$-th element, $u_n^{FMGS}(k)$, of the subcarrier-to-user vector, $\mathbf{U}^{FMGS}(k)$, is an integer from the set $\mathcal{I}$ that indicates the user $i$ to which subcarrier $n$ is assigned at time $k$. The $j$-th element, $\tau_j^{FMGS}(k)$, of the bit-to-flow vector for each subcarrier $n$, $\mathbf{V}_n^{FMGS}(k)$, is an integer from the set $\{0, \ldots, M_n(k)\}$ that indicates the number of bits belonging to flow $j$ at time $k$, where

\[ \sum_{j=1}^{J_{u_n^{FMGS}(k)}} \tau_j^{FMGS}(k) = M_n(k). \]

It is assumed that each user is assigned its user index $i \in \mathcal{I}$ and its range of application flow indices $\{(i-1)J_{max}+1, \ldots, iJ_{max}\} \in \mathcal{J}_{sys}$ in the system during call admission. The term, $J_{max}$, denotes the maximum number of application flows a user can have as defined in [27, 28, 33] and the range of application flow indices for user $i$ is $\{1, \ldots, J_i\}$.
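The difference between the FM and FMGS signaling information can be illustrated with a small sketch. It is not taken from the thesis: the function names `fm_to_fmgs` and `fmgs_bit_order` and the example decision for three subcarriers are hypothetical, and flows are assumed to be numbered $1, \ldots, J_i$ within a user.

```python
def fm_to_fmgs(bit_flows, num_flows):
    """Convert one FM bit-to-flow vector into FMGS per-flow bit counts.

    bit_flows: flow index (1..num_flows) of each bit on one subcarrier.
    Returns tau, where tau[j - 1] is the number of bits from flow j.
    """
    tau = [0] * num_flows
    for flow in bit_flows:
        tau[flow - 1] += 1
    return tau


def fmgs_bit_order(tau):
    """Flow index of each transmitted bit under FMGS (grouped, ascending j)."""
    order = []
    for j, count in enumerate(tau, start=1):
        order.extend([j] * count)
    return order


# Hypothetical decision (assumption): N = 3 subcarriers, 2 flows per user.
u_fm = [1, 2, 2]                           # subcarrier-to-user vector
v_fm = [[1, 1, 2], [2, 1, 2, 1], [1, 1]]   # one bit-to-flow vector per subcarrier

# FMGS signals only the per-flow counts; each list sums to M_n(k).
v_fmgs = [fm_to_fmgs(bits, num_flows=2) for bits in v_fm]
print(v_fmgs)
print([fmgs_bit_order(tau) for tau in v_fmgs])
```

In this sketch the FM vector for the second subcarrier needs one entry per bit, whereas FMGS needs only one count per flow, which is the source of the overhead reduction described above.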
6.4 Scheduling Signaling Information Entropy

To gain some insight into the amount of scheduling signaling information bits incurred by scheduling policies with no flow merging, scheduling policies with flow merging and scheduling policies with flow merging - grouped sorted, we evaluate the entropies assuming a simplified model for the signaling information bits. Since the statistics of the scheduling decisions are not readily available for the considered scheduling policies, we determine an entropy upper bound (regardless of scheduling policy) by assuming that each subcarrier $n \in \mathcal{N}$ is independently and equally likely to be assigned to any flow $j \in \mathcal{J}_{sys}$ for scheduling policies with no flow merging or any user $i \in \mathcal{I}$ for scheduling policies with flow merging. Furthermore, for scheduling policies with flow merging and for scheduling policies with flow merging - grouped sorted, each bit $z$, $\forall z = 1, \ldots, M_n(k)$, on subcarrier $n$ is independently and equally likely to be mapped to any flow $j \in \mathcal{J}_{u_n^{FM}(k)}$ and $j \in \mathcal{J}_{u_n^{FMGS}(k)}$, respectively. In Section 6.6.2, we show that the entropy results obtained from this simplified model are useful in explaining the compressed scheduling signaling overhead results obtained by simulation.

Depending on whether flow merging is allowed, the entropy of the scheduling signaling information is determined by enumerating all the possible values that the pertinent vectors $\mathbf{U}^{NFM}(k)$, $\mathbf{U}^{FM}(k)$ and $\mathbf{V}_n^{FM}(k)$, and $\mathbf{U}^{FMGS}(k)$ and $\mathbf{V}_n^{FMGS}(k)$ can take on. All the possible assignment combinations are represented in a table which is assumed to be known at both the BS and MSs. At each scheduling decision time $k$, the index of the assignment combination corresponding to the scheduling decision is transmitted with the application bits. The number of bits required to represent the assignment combination index is determined by assuming that every assignment combination is equally likely.
We determine the entropies for 1) scheduling policies with no flow merging, 2) scheduling policies with flow merging and 3) scheduling policies with flow merging - grouped sorted.

6.4.1 Scheduling Policies with No Flow Merging

For scheduling policies with no flow merging, each subcarrier $n$ assigned to user $i$ can only carry bits from a single application flow $j$ of that user, i.e., each subcarrier $n$ is assigned to one of $J_{sys}$ application flows. Hence, there are $(J_{sys})^N$ possible ways to assign all $N$ subcarriers to $J_{sys}$ flows. Assuming all $(J_{sys})^N$ possible assignments of subcarriers to application flows are equally likely, the entropy of the scheduling signaling information for NFM is given by

\[ H^{NFM}(k) = \log_2 (J_{sys})^N = N \log_2 J_{sys}. \tag{6.1} \]

6.4.2 Scheduling Policies with Flow Merging

For scheduling policies with flow merging, each subcarrier $n$ assigned to user $i$ can carry up to $M_n(k)$ bits from any of the $J_i$ flows of user $i$. There are $I^N$ possible ways to assign all $N$ subcarriers to $I$ users. Assuming that all $I^N$ possible assignments of subcarriers to users are equally likely, the corresponding entropy of $\mathbf{U}^{FM}(k)$ is given by $\log_2 I^N$. In addition, we assume that each bit, $z$, $\forall z = 1, \ldots, M_n(k)$, on subcarrier $n$ is equally likely to be mapped to any flow $j \in \mathcal{J}_{u_n^{FM}(k)}$ and that the mappings of different bits are independent of one another. Hence, there are $(J_{u_n^{FM}(k)})^{M_n(k)}$ possible ways to map all $M_n(k)$ bits to $J_{u_n^{FM}(k)}$ flows. Assuming all $(J_{u_n^{FM}(k)})^{M_n(k)}$ possible mappings of bits to application flows are equally likely, the entropy of $\mathbf{V}_n^{FM}(k)$, $\forall n \in \mathcal{N}$ is given by $\sum_{n=1}^{N} \log_2 (J_{u_n^{FM}(k)})^{M_n(k)}$. Hence, the entropy of the scheduling signaling information for FM is given by

\[ H^{FM}(k) = \log_2 I^N + \sum_{n=1}^{N} \log_2 (J_{u_n^{FM}(k)})^{M_n(k)} = N \log_2 I + \sum_{n=1}^{N} M_n(k) \log_2 J_{u_n^{FM}(k)}. \tag{6.2} \]

6.4.3 Scheduling Policies with Flow Merging - Grouped Sorted

The subcarrier-to-user vector for FMGS, $\mathbf{U}^{FMGS}(k)$, is determined in the same way as $\mathbf{U}^{FM}(k)$.
Hence, the corresponding entropy of $\mathbf{U}^{FMGS}(k)$ is also given by $\log_2 I^N$. Determining the possible values of $\mathbf{V}_n^{FMGS}(k)$ is equivalent to finding the possible ways of distributing $M_n(k)$ indistinguishable balls into $J_{u_n^{FMGS}(k)}$ distinguishable urns [75]. This gives a total of $\binom{M_n(k) + J_{u_n^{FMGS}(k)} - 1}{J_{u_n^{FMGS}(k)} - 1}$ possible values. Assuming that all $\binom{M_n(k) + J_{u_n^{FMGS}(k)} - 1}{J_{u_n^{FMGS}(k)} - 1}$ possible values are equally likely, the corresponding entropy of $\mathbf{V}_n^{FMGS}(k)$, $\forall n \in \mathcal{N}$ is given by $\sum_{n=1}^{N} \log_2 \binom{M_n(k) + J_{u_n^{FMGS}(k)} - 1}{J_{u_n^{FMGS}(k)} - 1}$. Hence, the entropy of the scheduling signaling information for FMGS is given by

\[ H^{FMGS}(k) = \log_2 I^N + \sum_{n=1}^{N} \log_2 \binom{M_n(k) + J_{u_n^{FMGS}(k)} - 1}{J_{u_n^{FMGS}(k)} - 1} = N \log_2 I + \sum_{n=1}^{N} \log_2 \binom{M_n(k) + J_{u_n^{FMGS}(k)} - 1}{J_{u_n^{FMGS}(k)} - 1}. \tag{6.3} \]

6.5 Compression of Scheduling Signaling Information

We consider two different schemes to compress the scheduling signaling information bits: 1) Run-Length Encoding (RLE) [76] and 2) Lempel-Ziv-Welch (LZW) [77]. Note that RLE/LZW compression of the scheduling signaling information bits is performed only if the compression reduces the number of scheduling signaling information bits; otherwise, the scheduling signaling information bits are transmitted uncompressed. An additional bit is added and transmitted along with the scheduling signaling information bits to indicate whether or not compression is performed.

6.5.1 Run-length Encoding

RLE is particularly efficient for short data blocks with long consecutively repeating data values and has a low implementation complexity. RLE compresses a data block by representing each run of data (i.e., a data sequence in which the same data value occurs in consecutive elements) by a single data value, called the run value, and the number of consecutively repeating data values, called the run length.
The number of bits, $\Upsilon_{\mathbf{w}}^{RLE}$, required to represent a data block $\mathbf{w}$ of length $L$ with elements from an alphabet of cardinality $R$ using RLE is given by [76]

\[ \Upsilon_{\mathbf{w}}^{RLE} = Q \log_2 R + \sum_{q=1}^{Q} \log_2 \Big( L - \sum_{x=0}^{q-1} l_x \Big), \tag{6.4} \]

where $Q$ denotes the total number of runs in $\mathbf{w}$, $l_x$ denotes the run length of the $x$-th run and $l_0 = 0$. The first term on the right-hand side of (6.4) corresponds to the number of bits required to represent the run values and the second term corresponds to the number of bits required to represent the run lengths.

6.5.2 Lempel-Ziv-Welch

LZW is useful for data blocks with repeated patterns and is more efficient for long data blocks, as the initial part of the compression algorithm builds a dictionary and has low compression efficiency. The dictionary is initialized to contain all the possible single-character strings of the input data block. LZW then scans through the input data block for successively longer sub-strings that are not yet defined in the dictionary. When such a sub-string is found, the index for the sub-string less the last character (i.e., the longest sub-string that is in the dictionary) is sent to the output and the sub-string including the last character is added to the dictionary with the next available code. The last input data character is then used as the new starting point for the next scan. The dictionary building process repeats and successively longer data strings are added to the dictionary and made available for subsequent encoding as single output values. The number of bits, $\Upsilon_{\mathbf{w}}^{LZW}$, required to represent a data block $\mathbf{w}$ using LZW is obtained using simulation.
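The two compression schemes and the compress-only-if-smaller rule of Section 6.5 can be sketched as follows. This is an illustrative sketch, not the thesis implementation: each logarithm in (6.4) is rounded up to a whole number of bits, the LZW output uses a variable code width equal to the number of bits needed for the current dictionary size, and the function names are hypothetical. All three choices are assumptions.

```python
import math
from itertools import groupby


def rle_bits(block, alphabet_size):
    """Bit count with the structure of (6.4).

    Assumption: every logarithm is rounded up to an integer number of bits.
    """
    total_length = len(block)
    run_lengths = [len(list(group)) for _, group in groupby(block)]
    bits = 0
    consumed = 0  # sum of l_x for x < q, with l_0 = 0
    for length in run_lengths:
        bits += math.ceil(math.log2(alphabet_size))            # run value
        bits += math.ceil(math.log2(total_length - consumed))  # run length
        consumed += length
    return bits


def lzw_encode(block, alphabet):
    """Textbook LZW; returns (output codes, bit count).

    Assumptions: every symbol of block is in alphabet, and each code is
    written with ceil(log2(dictionary size)) bits at the time it is emitted.
    """
    table = {(symbol,): index for index, symbol in enumerate(alphabet)}
    codes, bits, current = [], 0, ()
    for symbol in block:
        candidate = current + (symbol,)
        if candidate in table:
            current = candidate
            continue
        codes.append(table[current])
        bits += max(1, math.ceil(math.log2(len(table))))
        table[candidate] = len(table)   # next available code
        current = (symbol,)             # restart from the last character
    if current:
        codes.append(table[current])
        bits += max(1, math.ceil(math.log2(len(table))))
    return codes, bits


def signaling_bits(uncompressed_bits, compressed_bits):
    """Compress only if it helps; one extra bit flags whether it was done."""
    return 1 + min(uncompressed_bits, compressed_bits)


example = [1, 1, 1, 2, 2, 3]            # hypothetical subcarrier-to-flow vector
print(rle_bits(example, alphabet_size=4))
print(lzw_encode("ABABABA", "AB"))
```

For the short example vector, RLE spends two bits per run value and progressively fewer bits per run length, since the remaining block length shrinks after each run.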
6.6 Simulation Results

To evaluate the scheduling signaling overhead of the bitQoS-aware resource allocation framework, we adopt the water-filling-based iterative subcarrier-power-bit allocation algorithm proposed in Section 5.2 for the following scheduling policies: 1) Multi-user Water-filling with Heuristics with No Flow Merging (WFH-NFM), where each subcarrier can only carry bits from a single application flow of a user, 2) Multi-user Water-filling with Heuristics with Flow Merging (WFH-FM), where each subcarrier can carry bits from more than one application flow of a user and 3) Multi-user Water-filling with Heuristics with Flow Merging Grouped Sorted (WFH-FMGS), where each subcarrier can carry bits from more than one application flow of a user and, in addition, the bits on each subcarrier are grouped in a FIFO fashion by application flows and sorted in an ascending order by the flow index, $j$, prior to transmission. To provide a comparative performance assessment of the WFH-NFM/FM/FMGS scheduling policies, we consider the WF and MDU scheduling policies described in Section 4.4.3. The scheduling policies were simulated in Matlab using the system model described in Section 4.2. In the simulation, it is assumed that each user has 1 BE flow and 1 EF flow. The parameter values used in our simulation are listed in Tables 4.1 and 4.2. Simulation results were obtained for mixed-traffic scenarios with $I \in \{4, 6, 8\}$.

6.6.1 Entropy of Scheduling Signaling Overhead

The entropy of the scheduling signaling overhead, based on the entropy model described in Section 6.4, for scheduling policies with NFM, FM and FMGS is shown in Fig. 6.2 as a function of the number, $I$, of users in the system. It can be seen that the entropy increases with $I$.
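As a rough numerical illustration of the entropy expressions (6.1) to (6.3), the sketch below evaluates them for one scheduling decision. The value $N = 18$ and the two flows per user follow the simulation setup; the number of bits per subcarrier, $M_n(k) = 4$, is purely an assumed value for illustration, and the function names are hypothetical.

```python
import math


def entropy_nfm(num_subcarriers, flows_in_system):
    """(6.1): N log2(J_sys)."""
    return num_subcarriers * math.log2(flows_in_system)


def entropy_fm(num_users, bits_per_subcarrier, flows_per_subcarrier):
    """(6.2): N log2(I) + sum_n M_n log2(J_{u_n})."""
    n = len(bits_per_subcarrier)
    bit_term = sum(m * math.log2(j)
                   for m, j in zip(bits_per_subcarrier, flows_per_subcarrier))
    return n * math.log2(num_users) + bit_term


def entropy_fmgs(num_users, bits_per_subcarrier, flows_per_subcarrier):
    """(6.3): N log2(I) + sum_n log2 C(M_n + J_{u_n} - 1, J_{u_n} - 1)."""
    n = len(bits_per_subcarrier)
    count_term = sum(math.log2(math.comb(m + j - 1, j - 1))
                     for m, j in zip(bits_per_subcarrier, flows_per_subcarrier))
    return n * math.log2(num_users) + count_term


N = 18                 # subcarriers, as in the simulation setup
FLOWS_PER_USER = 2     # 1 BE flow and 1 EF flow per user
M = 4                  # bits per subcarrier: illustrative assumption only

for users in (4, 6, 8):
    h_nfm = entropy_nfm(N, FLOWS_PER_USER * users)
    h_fm = entropy_fm(users, [M] * N, [FLOWS_PER_USER] * N)
    h_fmgs = entropy_fmgs(users, [M] * N, [FLOWS_PER_USER] * N)
    print(users, round(h_nfm, 2), round(h_fmgs, 2), round(h_fm, 2))
```

Under these assumptions the three quantities grow with the number of users and satisfy NFM < FMGS < FM; the absolute values depend on the assumed $M_n(k)$ and are not meant to reproduce Fig. 6.2.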
As expected, the entropy for NFM is the lowest since only the subcarrier-to-flow vector, $\mathbf{U}^{NFM}(k)$, is transmitted; in FM/FMGS, both the subcarrier-to-user vector, $\mathbf{U}^{FM}(k)$/$\mathbf{U}^{FMGS}(k)$, and the bit-to-flow vectors, $\mathbf{V}_n^{FM}(k)$/$\mathbf{V}_n^{FMGS}(k)$, $\forall n \in \mathcal{N}$, are transmitted. The entropy for FM is much higher than that for FMGS; this is because the bits on a subcarrier $n$ for FM are not grouped by application flows (i.e., every element, $v_{n,z}^{FM}(k)$, of $\mathbf{V}_n^{FM}(k)$ can take on values in the set $\mathcal{J}_i$ with equal probability), representing the maximum entropy for the bit-to-flow vectors. By grouping and sorting the bits carried on a subcarrier by their application flows and flow index, respectively, as in FMGS, the entropy can be greatly reduced.

[Figure 6.2: Entropy of Scheduling Signaling Overhead. Horizontal axis: Number of Users; vertical axis: Entropy of Scheduling Signaling Information (bits); curves: NFM, FM, FMGS.]

6.6.2 Compressed Scheduling Signaling Overhead

Fig. 6.3 shows the compressed scheduling signaling overhead for the various scheduling policies (WF, MDU, WFH-NFM, WFH-FM and WFH-FMGS) as a function of $I$. These results were obtained from simulation, based on the scheduling signaling overhead compression schemes (RLE and LZW) described in Section 6.5. It can be observed from Fig. 6.3 that the compressed scheduling signaling overhead increases with the number, $I$, of users in the system for all the scheduling policies and compression schemes. These results are consistent with the entropy analysis results shown in Fig. 6.2.

Regardless of the compression scheme used, WFH-FM and WFH-FMGS, which allow flow merging, incur the highest scheduling signaling overhead when compared to WF, MDU and WFH-NFM, which do not allow flow merging. This is because, in general, as more constraints are imposed upon the scheduling problem, the amount of scheduling signaling overhead required decreases.
In this case, the no-flow-merging constraint in WF, MDU and WFH-NFM eliminates the need to transmit the bit-to-flow mapping information and thus results in a lower scheduling signaling overhead.

Among the scheduling policies that do not allow flow merging, WF has the highest scheduling signaling overhead regardless of the compression scheme used. This is because WF has no QoS provisioning and assigns each subcarrier to the user that has the highest channel gain for that subcarrier. Given that the channel gains are i.i.d., each subcarrier is equally likely to be assigned to any of the users $i \in \mathcal{I}$; this results in a higher entropy for $\mathbf{U}^{NFM}(k)$. On the other hand, a lower entropy for $\mathbf{U}^{NFM}(k)$ is expected for MDU and WFH-NFM since they consider QoS at the flow level and bit level, respectively; as such, each subcarrier is no longer equally likely to be assigned to any of the users $i \in \mathcal{I}$. We also see that MDU has a lower scheduling signaling overhead than WFH-NFM regardless of the compression scheme used. This is because MDU strictly favors EF traffic in a mixed-traffic environment, which effectively reduces the total number of flows scheduled by MDU to $J_{sys}/2$ (assuming an equal number of BE and EF flows).

For both WF and WFH-NFM, LZW provides a lower compressed scheduling signaling overhead than RLE. Since the chance of getting a long run length in $\mathbf{U}^{NFM}(k)$ decreases with increasing $I$, the gap between RLE and LZW widens as $I$ increases. For MDU, RLE gives a slightly lower compressed scheduling signaling overhead than LZW. This is because MDU mostly schedules only $J_{sys}/2$ flows, which increases the chance of getting a consecutively repeating sequence in $\mathbf{U}^{NFM}(k)$, allowing RLE to achieve efficient compression.

With either RLE or LZW, there is little difference in the compressed scheduling signaling overhead for WFH-FM and WFH-FMGS.
This is because the number of bits in one application data packet (128 bits) is greater than the number of bits that can be carried by a subcarrier and because all bits in one application data packet have identical bitQoS values. Hence, the bits carried by a subcarrier typically come from the same application data packet of an application flow. As such, the grouping and sorting of bits as described in Section 6.3.3 has little effect on the scheduling signaling overhead incurred by the bit-to-flow vectors for WFH-FM. This observation also implies that the bit-to-flow vectors, $\mathbf{V}_n^{FM}(k)$, $\forall n \in \mathcal{N}$, which constitute most of the scheduling signaling overhead of WFH-FM, correspond mostly to short and consecutively repeating data sequences (i.e., consecutive bits assigned to the same flow). This explains why, in Fig. 6.3, RLE gives a much lower scheduling signaling overhead than LZW, as RLE is more efficient for such data blocks.

[Figure 6.3: Compressed Scheduling Signaling Overhead of the Various Scheduling Policies using RLE and LZW. Horizontal axis: Number of Users; vertical axis: Compressed Scheduling Signaling Overhead (bits); curves: WF, MDU, WFH-NFM, WFH-FM and WFH-FMGS, each with RLE and LZW.]

6.6.3 Effective Throughput

We note that although Fig. 6.3 shows that WFH-FM has the largest compressed scheduling signaling overhead, it also has the highest average system throughput, compared to WF and MDU, as shown in Fig. 5.4. Hence, to determine the viability of the bitQoS-aware RA framework, we define the effective throughput to account for the scheduling signaling overhead for each scheduling policy as

\[ TP_{eff} = \begin{cases} \dfrac{\sum_i \sum_j \sum_k TP_i^j(k) - \sum_k \Upsilon_{\mathbf{U}^{NFM}(k)}^{\Xi}}{K} & \text{for WF, MDU and WFH-NFM,} \\[2ex] \dfrac{\sum_i \sum_j \sum_k TP_i^j(k) - \sum_k \Upsilon_{\mathbf{U}^{FM}(k)}^{\Xi} - \sum_n \sum_k \Upsilon_{\mathbf{V}_n^{FM}(k)}^{\Xi}}{K} & \text{for WFH-FM,} \\[2ex] \dfrac{\sum_i \sum_j \sum_k TP_i^j(k) - \sum_k \Upsilon_{\mathbf{U}^{FMGS}(k)}^{\Xi} - \sum_n \sum_k \Upsilon_{\mathbf{V}_n^{FMGS}(k)}^{\Xi}}{K} & \text{for WFH-FMGS,} \end{cases} \tag{6.5} \]

where $\Xi$ is either RLE or LZW depending on the compression scheme used as defined in Section 6.5.
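The structure of (6.5) can be sketched in a few lines. The sketch assumes that $K$ is the number of scheduling decision times, that throughput and overhead are both counted in bits per decision time, and that the per-subcarrier bit-to-flow overhead has already been summed over $n$; the function name and the numbers in the example are hypothetical.

```python
def effective_throughput(throughput_bits, user_vector_bits, flow_vector_bits=None):
    """Average throughput per decision time after subtracting signaling bits.

    throughput_bits[k]:  bits delivered at time k, summed over users and flows.
    user_vector_bits[k]: compressed bits for the subcarrier assignment vector.
    flow_vector_bits[k]: compressed bits for all bit-to-flow vectors at time k
                         (FM/FMGS only; leave as None for WF, MDU, WFH-NFM).
    Assumption: K is the number of scheduling decision times.
    """
    num_decisions = len(throughput_bits)
    overhead = sum(user_vector_bits)
    if flow_vector_bits is not None:
        overhead += sum(flow_vector_bits)
    return (sum(throughput_bits) - overhead) / num_decisions


# Hypothetical numbers for two decision times.
print(effective_throughput([100, 120], [10, 14]))
print(effective_throughput([100, 120], [10, 14], [30, 26]))
```

With flow merging, the additional bit-to-flow term is subtracted as well, so a policy with higher raw throughput can still end up with a lower effective throughput if its signaling compresses poorly.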
The effective throughput gain of scheduling policy X over scheduling policy Y is defined as

\[ G_{TP_{eff}}^{X,Y} = \frac{TP_{eff}^{X} - TP_{eff}^{Y}}{TP_{eff}^{Y}}. \]

The effective throughput gains of WFH-NFM, WFH-FM and WFH-FMGS over the comparative scheduling policies for $I \in \{4, 6, 8\}$ are listed in Table 6.1. We see that WFH-NFM, WFH-FM and WFH-FMGS have higher effective throughputs than WF and MDU with RLE compression, i.e., the bitQoS-aware RA framework provides an increased average system throughput even when the scheduling signaling overhead is taken into account. However, when LZW is used, scheduling policies with flow merging (WFH-FM and WFH-FMGS) do not yield a higher effective throughput than WF and MDU due to the inefficiency of LZW in compressing the short and consecutively repeating data sequences of the bit-to-flow vectors. It can also be seen that WFH-FM and WFH-FMGS have a lower effective throughput than WFH-NFM regardless of the compression scheme used, i.e., allowing flow merging in the bitQoS-aware RA framework yields no system throughput improvement in this case. This result is due to the fact that, with the per-resource-element scheduling granularity of 1 OFDM symbol × 1 subcarrier considered in this chapter, the number of bits in one application layer PDU is typically much greater than the number of bits that can be carried by a subcarrier. As a result, very little flow merging actually takes place and the performance gain from flow merging is minimal.

6.7 Conclusion

The viability of the proposed bitQoS-aware RA framework, which adaptively matches the QoS requirements of the user application bits to the characteristics of the OFDM subcarriers, with and without flow merging, was analyzed by taking the associated scheduling signaling overhead into account. A model is formulated to analyze the associated scheduling signaling overhead and the performance gains achievable with the bitQoS-aware RA framework are quantified.
The entropy analysis shows that scheduling policies with flow merging incur a significantly higher scheduling signaling overhead compared to scheduling policies that do not allow flow merging. However, the scheduling signaling overhead for scheduling policies with flow merging can be greatly reduced by grouping and sorting the bits carried on the subcarrier by their application flows and flow indices, respectively. Simulation results further show that despite the increase in the scheduling signaling overhead for scheduling policies with flow merging, the proposed bitQoS-aware RA framework is still able to provide a higher effective throughput than scheduling policies that do not take QoS provisions into account, such as WF, and policies that consider only flow-level QoS requirements, such as MDU, when RLE compression of the scheduling signaling information is performed.

Table 6.1: Effective Throughput Gains of WFH-NFM, WFH-FM and WFH-FMGS for I = {4, 6, 8}, N = 18, 1 BE and 1 EF Flow for each User. Entries are $G_{TP_{eff}}^{X,Y} \times 100\%$; columns give Scheduling Policy X under each compression scheme and rows give Scheduling Policy Y.

I = 4
                      RLE (Policy X)                  LZW (Policy X)
Policy Y     WFH-NFM   WFH-FM  WFH-FMGS     WFH-NFM   WFH-FM  WFH-FMGS
WF             75.23    21.91     21.94       75.03  -185.55   -185.56
MDU             1.73   -29.23    -29.21        3.77  -150.72   -150.73
WFH-NFM         0.00   -30.43    -30.41        0.00  -148.87   -148.88
WFH-FM         43.74     0.00      0.02      304.61     0.00      0.02
WFH-FMGS       43.70    -0.02      0.00      304.56    -0.02      0.00

I = 6
                      RLE (Policy X)                  LZW (Policy X)
Policy Y     WFH-NFM   WFH-FM  WFH-FMGS     WFH-NFM   WFH-FM  WFH-FMGS
WF            148.33    96.06     96.09      150.29  -114.70   -114.67
MDU            35.63     7.08      7.10       38.82  -108.15   -108.13
WFH-NFM         0.00   -21.05    -21.03        0.00  -105.87   -105.86
WFH-FM         26.66     0.00      0.02     1803.14     0.00     -0.21
WFH-FMGS       26.64    -0.02      0.00     1806.66     0.21      0.00

I = 8
                      RLE (Policy X)                  LZW (Policy X)
Policy Y     WFH-NFM   WFH-FM  WFH-FMGS     WFH-NFM   WFH-FM  WFH-FMGS
WF            130.69    92.07     92.09      131.39   -54.98    -54.99
MDU            86.07    54.92     54.94       96.02   -61.86    -61.87
WFH-NFM         0.00   -16.74    -16.73        0.00   -80.54    -80.55
WFH-FM         20.11     0.00      0.01      413.94     0.00     -0.03
WFH-FMGS       20.09    -0.01      0.00      414.10     0.03      0.00

Chapter 7

Continuous and Discrete Rate Adaptation in BitQoS-aware Resource Allocation Framework (see footnote 6)

7.1 Introduction

Given the promising performance gains and viability of the bitQoS-aware RA framework, even when scheduling signaling overhead is taken into account, in this chapter we focus on developing more efficient algorithms and use the technique of Lagrange multipliers [78] to find the optimal solution to the bitQoS-aware resource allocation problem which, in addition to considering subcarrier assignments and power allocations, further involves discrete bit assignments for control of bit-level QoS requirements. This differs from the works presented in [63, 79], which only consider subcarrier assignments and power allocations to meet given rate requirements.

The Lagrange multiplier technique provides an approach for finding the maxima/minima of a function subject to constraints and yields necessary conditions for optimality in equality constrained problems. To take inequality constraints into account, the technique of Lagrange multipliers is generalized by the KKT conditions [80, 81], which are necessary conditions for a solution in non-linear programming to be optimal.

Footnote 6: The material in this chapter is based on: C. E. Huang and C. Leung, “On the optimality of bitQoS-aware resource allocation in OFDMA systems,” submitted.

The technique of Lagrange multipliers has been applied to radio resource allocation problems in [36, 37, 41, 63, 79]. In [63], the authors investigate resource allocation in a multiuser OFDM system with homogeneous traffic and formulate the problem with the objective of minimizing the overall transmit power while satisfying a minimum discrete rate requirement for each user.
By relaxing the subcarrier assignments to allow time-sharing, an iterative algorithm based on properties of the Lagrangian formulation is proposed to obtain a suboptimal solution to the original combinatorial resource allocation problem. In [79], the authors extended the time-sharing technique for subcarrier assignment used in [63] to the scheduling of heterogeneous traffic in a multi-user OFDM system. The authors converted the problem into a convex programming problem and proposed an iterative algorithm with polynomial complexity to obtain the optimal subcarrier and power allocation. Since the bitQoS-aware resource allocation optimization problem is non-deterministic polynomial-time hard (NP-hard), we first look at a reduced-complexity form of the problem, obtained by transforming the joint subcarrier, power and bit allocation problem into a convex optimization problem through a variable transformation and the relaxation of the integer constraints for both the subcarrier and bit assignment variables. Using the KKT conditions, we establish necessary and sufficient optimality conditions and develop an iterative algorithm to obtain the optimal solution. We show that the solution to this relaxed problem follows a bitQoS-based multi-level water-filling principle whereby the water levels of the subcarriers assigned to a user are determined by the bitQoS values of the bits in the user data buffer. These water levels may be different from one user to another, in contrast to a constant water level for all users in the classical throughput maximization water-filling solution. Since the solution to this relaxed problem contains non-discrete bit assignments, it can be interpreted as assignments for systems with continuous rate adaptation and provides an upper bound on the objective value of the original unrelaxed problem. 
For systems with discrete rate adaptation, we leverage the results of the continuous rate solution and propose an efficient iterative algorithm to compute the solution to the original resource allocation problem with discrete bit assignments.

This chapter is organized as follows: in Section 7.2, the reduced-dimensionality bitQoS-aware RA framework is described. In Section 7.3, we formulate the resource allocation problem as a convex optimization problem and present necessary and sufficient conditions for the optimal solution to the continuous rate adaptation problem. An iterative algorithm to obtain the optimal solution is also presented. In Section 7.4, an algorithm which leverages the iterative algorithm for the continuous rate adaptation problem is presented for the discrete rate adaptation problem. The simulation framework and results are discussed in Section 7.5 and the main findings are summarized in Section 7.6.

7.2 Reduced-dimensionality BitQoS-aware Resource Allocation Framework

In Chapter 4, we formulate the proposed bitQoS-aware RA framework as an optimization problem, OP4.2, with the objective of finding the joint subcarrier, power and bit assignment to maximize the total bitQoS-weighted throughput, subject to the total transmit power constraint, $P_{total}$. However, we note that since the mapping between the application bits and OFDM subcarriers does not affect the objective value of the optimal solution, we can reduce the dimensionality of the optimization problem OP4.2 by substituting $\sum_{n=1}^{N} b_{i,n}^{j,z}$ with $b_i^{j,z}$. The new optimization variable, $b_i^{j,z}$, takes on the value 1 if bit $z$ of user $i$, flow $j$ is transmitted on any subcarrier(s) assigned to user $i$ and 0 otherwise.
We can thus rewrite OP4.2 as follows:

\[
\text{OP7.1:}\quad \max_{\substack{a_{i,n}\in\{0,1\}\\ p_{i,n}\in[0,P_{total}]\\ b_i^{j,z}\in\{0,1\}}} \; \sum_{i=1}^{I}\sum_{j=1}^{J_i}\sum_{z=1}^{B_i^j} \psi_i^{j,z}\, b_i^{j,z} \tag{7.1}
\]
subject to
\[ \sum_{i}\sum_{n} p_{i,n}\, a_{i,n} = P_{total} \tag{7.2} \]
\[ \sum_{j}\sum_{z} b_i^{j,z} \le \sum_{n} a_{i,n} \log_2\!\left(1+\frac{p_{i,n}|\alpha_{i,n}|^2}{\zeta\sigma_0^2}\right) \quad \forall i \tag{7.3} \]
\[ \sum_{i} a_{i,n} = 1 \quad \forall n \tag{7.4} \]
\[ b_i^{j,z} \le 1 \quad \forall i,j,z, \tag{7.5} \]
where the variables are as defined in Chapter 4.

7.3 Subcarrier, Power and Bit Allocation with Continuous Rate Adaptation

The reduced-dimensionality optimization problem OP7.1 is a MINLP problem whose solution is still computationally complex given the large number of subcarriers and users in a practical system. To make the problem computationally tractable, we adopt the integer constraint relaxation technique used in [63, 79], where it is assumed that the discrete subcarrier assignment variable can take on real values in [0, 1]. For our problem formulation with two discrete optimization variables, we convexify OP7.1 by relaxing the integer constraints for both $a_{i,n}$ and $b_i^{j,z}$, allowing them to take on real values in [0, 1], and by defining a new optimization variable $\pi_{i,n} = p_{i,n} a_{i,n}$. The optimization problem OP7.1 becomes

\[
\text{OP7.2:}\quad \max_{\substack{a_{i,n}\in[0,1]\\ \pi_{i,n}\in[0,P_{total}a_{i,n}]\\ b_i^{j,z}\in[0,1]}} \; \sum_{i=1}^{I}\sum_{j=1}^{J_i}\sum_{z=1}^{B_i^j} \psi_i^{j,z}\, b_i^{j,z} \tag{7.6}
\]
subject to
\[ P_{total} - \sum_{i}\sum_{n} \pi_{i,n} = 0 \tag{7.7} \]
\[ \sum_{n} a_{i,n} \log_2\!\left(1+\frac{|\alpha_{i,n}|^2 \pi_{i,n}}{\zeta\sigma_0^2\, a_{i,n}}\right) - \sum_{j}\sum_{z} b_i^{j,z} \ge 0 \quad \forall i \tag{7.8} \]
\[ 1 - \sum_{i} a_{i,n} = 0 \quad \forall n \tag{7.9} \]
\[ 1 - b_i^{j,z} \ge 0 \quad \forall i,j,z. \tag{7.10} \]

By evaluating the Hessian matrix of the functions on the LHS of (7.8), it can be shown that the Hessian matrix is negative semi-definite at any point in the convex constraint set $\mathcal{X} = \{a_{i,n} \in [0,1],\ \pi_{i,n} \in [0, P_{total}a_{i,n}],\ b_i^{j,z} \in [0,1]\}$, i.e., the functions on the LHS of (7.8) are concave (see proof in Appendix C). In addition, since the functions in (7.6) and on the LHS of (7.10) are also concave and the functions on the LHS of (7.7) and (7.9) are affine, OP7.2 is a concave maximization problem over a convex set, i.e., a convex optimization problem [82, 83] in $\mathcal{X}$.
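Hence, the KKT conditions, which are necessary conditions for a solution to be optimal, are also sufficient for optimality in this case. Using the technique of Lagrange multipliers [78], the Lagrangian for OP7.2 is

\[
\begin{aligned}
L ={}& \sum_{i=1}^{I}\sum_{j=1}^{J_i}\sum_{z=1}^{B_i^j} \psi_i^{j,z}\, b_i^{j,z} + \beta\Big(P_{total} - \sum_{i}\sum_{n}\pi_{i,n}\Big) \\
&+ \sum_{i}\gamma_i\left(\sum_{n} a_{i,n}\log_2\!\left(1+\frac{|\alpha_{i,n}|^2\pi_{i,n}}{\zeta\sigma_0^2\, a_{i,n}}\right) - \sum_{j}\sum_{z} b_i^{j,z}\right) \\
&+ \sum_{n}\mu_n\Big(1-\sum_{i} a_{i,n}\Big) + \sum_{i}\sum_{j}\sum_{z}\lambda_i^{j,z}\big(1-b_i^{j,z}\big),
\end{aligned} \tag{7.11}
\]
where $\beta$, $\gamma_i$, $\mu_n$ and $\lambda_i^{j,z}$ are the Lagrange multipliers for the constraints (7.7), (7.8), (7.9) and (7.10), respectively. The necessary and sufficient conditions for the optimal solution to OP7.2, $\{a_{i,n}^*, \pi_{i,n}^*, b_i^{j,z*}, \beta^*, \gamma_i^*, \mu_n^*, \lambda_i^{j,z*}\}$, if it exists, are

Primal feasibility:
\[ P_{total} - \sum_{i}\sum_{n}\pi_{i,n}^* = 0 \tag{7.12} \]
\[ \sum_{n} a_{i,n}^*\log_2\!\left(1+\frac{|\alpha_{i,n}|^2\pi_{i,n}^*}{\zeta\sigma_0^2\, a_{i,n}^*}\right) - \sum_{j}\sum_{z} b_i^{j,z*} \ge 0 \quad \forall i \tag{7.13} \]
\[ 1 - \sum_{i} a_{i,n}^* = 0 \quad \forall n \tag{7.14} \]
\[ 1 - b_i^{j,z*} \ge 0 \quad \forall i,j,z \tag{7.15} \]

Dual feasibility:
\[ \beta^* \ge 0,\quad \gamma_i^* \ge 0,\quad \mu_n^* \ge 0,\quad \lambda_i^{j,z*} \ge 0 \tag{7.16} \]

Stationarity:
\[ \frac{\partial L}{\partial a_{i,n}^*} \begin{cases} < 0 & \text{if } a_{i,n}^* = 0 \\ = 0 & \text{if } 0 < a_{i,n}^* < 1 \\ > 0 & \text{if } a_{i,n}^* = 1 \end{cases} \quad \forall i,n \tag{7.17} \]
\[ \frac{\partial L}{\partial \pi_{i,n}^*} \begin{cases} < 0 & \text{if } \pi_{i,n}^* = 0 \\ = 0 & \text{if } 0 < \pi_{i,n}^* < P_{total}\,a_{i,n}^* \\ > 0 & \text{if } \pi_{i,n}^* = P_{total}\,a_{i,n}^* \end{cases} \quad \forall i,n \tag{7.18} \]
\[ \frac{\partial L}{\partial b_i^{j,z*}} \begin{cases} < 0 & \text{if } b_i^{j,z*} = 0 \\ = 0 & \text{if } 0 < b_i^{j,z*} < 1 \\ > 0 & \text{if } b_i^{j,z*} = 1 \end{cases} \quad \forall i,j,z \tag{7.19} \]

Complementary slackness:
\[ \gamma_i^*\left(\sum_{n} a_{i,n}^*\log_2\!\left(1+\frac{|\alpha_{i,n}|^2\pi_{i,n}^*}{\zeta\sigma_0^2\, a_{i,n}^*}\right) - \sum_{j}\sum_{z} b_i^{j,z*}\right) = 0 \tag{7.20} \]
\[ \lambda_i^{j,z*}\big(1 - b_i^{j,z*}\big) = 0 \tag{7.21} \]

Note that the inequalities in (7.17), (7.18) and (7.19) are obtained by considering the case where the optimal solution occurs at a boundary point of the constraint set $\mathcal{X}$. Since we are attempting to maximize (7.11), the partial derivatives of $L$ in (7.17), (7.18) and (7.19) will be negative if the optimal solution occurs at the lower limit of $a_{i,n}$, $\pi_{i,n}$ and $b_i^{j,z}$, respectively, and positive otherwise. We next derive the analytical expressions for the optimal power allocation, subcarrier and bit assignments.

7.3.1 Optimal Power Allocation

In this section, we determine the optimal power allocation for a given subcarrier and bit assignment.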
The optimal power allocation, $p_{i,n}^*$, is obtained by differentiating the Lagrangian, $L$, in (7.11) with respect to $\pi_{i,n}$ and substituting the result into the KKT condition (7.18). Specifically, for a given subcarrier assignment, $a_{i,n}$, where $a_{i,n} \neq 0$, and bit assignment, $b_i^{j,z}$, we have

\[
\left.\frac{\partial L}{\partial \pi_{i,n}}\right|_{\pi_{i,n}=\pi_{i,n}^*} = -\beta + \frac{\gamma_i\, a_{i,n}}{\ln 2}\cdot\frac{\alpha_{i,n}^2}{\zeta\sigma_0^2\, a_{i,n} + \alpha_{i,n}^2\,\pi_{i,n}^*}
\begin{cases} < 0 & \text{if } \pi_{i,n}^* = 0 \\ = 0 & \text{if } 0 < \pi_{i,n}^* < P_{total}\,a_{i,n} \\ > 0 & \text{if } \pi_{i,n}^* = P_{total}\,a_{i,n} \end{cases} \quad \forall i,n. \tag{7.22}
\]

Since $p_{i,n}^* = \pi_{i,n}^*/a_{i,n}$, the three cases in (7.22) can be rewritten as

\[
p_{i,n}^* = \frac{\pi_{i,n}^*}{a_{i,n}} =
\begin{cases}
0 & \text{if } \dfrac{\gamma_i}{\beta\ln 2} < \dfrac{\zeta\sigma_0^2}{\alpha_{i,n}^2} \\[2ex]
\dfrac{\gamma_i}{\beta\ln 2} - \dfrac{\zeta\sigma_0^2}{\alpha_{i,n}^2} & \text{if } \dfrac{\zeta\sigma_0^2}{\alpha_{i,n}^2} \le \dfrac{\gamma_i}{\beta\ln 2} \le \dfrac{\zeta\sigma_0^2}{\alpha_{i,n}^2} + P_{total} \\[2ex]
P_{total} & \text{if } \dfrac{\gamma_i}{\beta\ln 2} > \dfrac{\zeta\sigma_0^2}{\alpha_{i,n}^2} + P_{total}
\end{cases} \quad \forall i,n, \tag{7.23}
\]
where the term, $\beta$, is chosen such that $\{\pi_{i,n}^*\}$ satisfies the total power constraint (7.7). Equation (7.23) shows that the optimal power allocation is similar to the classical water-filling algorithm [84], where $\gamma_i/(\beta\ln 2)$ is the equivalent of the water level and $\zeta\sigma_0^2/\alpha_{i,n}^2$ is the noise floor of user $i$, subcarrier $n$. The main difference is that in (7.23), the water level on each subcarrier $n$ is determined by $\gamma_i$ of the user to which subcarrier $n$ is assigned, whereas in the classical water-filling algorithm, the water level is constant for all subcarriers. In addition, we will show in Section 7.3.3 that $\gamma_i$ is in fact related to the bitQoS values of user $i$ in the proposed bitQoS-aware RA framework. Hence, the optimal power allocation in (7.23) can be interpreted as a generalized bitQoS-based multi-level water-filling solution. In particular, for the case where the bitQoS values are identical $\forall i,j,z$, the optimization problem OP7.2 reduces to a throughput maximization problem, and (7.23) becomes the classical water-filling solution where $\gamma_i$ is identical for every user $i$.

7.3.2 Optimal Subcarrier Assignment

In this section, we determine the optimal subcarrier assignment for a given power allocation and bit assignment.
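As an illustration of the multi-level water-filling allocation in (7.23), the following is a minimal Python sketch, not the thesis implementation. It assumes every subcarrier has already been assigned to exactly one user, so that `gamma[n]` holds the $\gamma_i$ of the user owning subcarrier $n$ and `noise_floor[n]` holds $\zeta\sigma_0^2/\alpha_{i,n}^2$ for that user. The function name, argument names and bracketing strategy are hypothetical choices made for this example; $\beta$ is found by bisection so that the allocated powers meet the total power budget.

```python
import math

def multilevel_waterfill(gamma, noise_floor, p_total, tol=1e-9):
    """Sketch of (7.23) for subcarriers that each have one assigned user.

    gamma[n]       -- gamma_i of the user holding subcarrier n (assumed > 0)
    noise_floor[n] -- zeta*sigma_0^2 / alpha_{i,n}^2 of that user (assumed > 0)
    Returns (powers, beta) with sum(powers) at or just below p_total.
    """
    def powers_for(beta):
        # Per-subcarrier water level gamma/(beta ln 2), clipped to [0, p_total].
        return [min(max(g / (beta * math.log(2)) - nf, 0.0), p_total)
                for g, nf in zip(gamma, noise_floor)]

    # The total power is non-increasing in beta, so bracket beta and bisect.
    lo, hi = 1e-12, 1.0
    while sum(powers_for(hi)) > p_total:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if sum(powers_for(mid)) > p_total:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol * hi:
            break
    return powers_for(hi), hi
```

With identical `gamma` values the sketch reduces to classical single-level water-filling; a larger `gamma[n]` raises the water level of that subcarrier only.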
The optimal subcarrier assignment, $a_{i,n}^*$, is obtained by differentiating the Lagrangian, $L$, in (7.11) with respect to $a_{i,n}$ and substituting the result into the KKT condition (7.17). Specifically, for a given power allocation, $p_{i,n}$, and bit assignment, $b_i^{j,z}$, for $a_{i,n}^* \neq 0$, we have

\[
\left.\frac{\partial L}{\partial a_{i,n}}\right|_{a_{i,n}=a_{i,n}^*} = \gamma_i\left[\log_2\!\left(1+\frac{\alpha_{i,n}^2\,\pi_{i,n}}{\zeta\sigma_0^2\, a_{i,n}^*}\right) - \frac{1}{\ln 2}\cdot\frac{\dfrac{\alpha_{i,n}^2\,\pi_{i,n}}{\zeta\sigma_0^2\, a_{i,n}^*}}{1+\dfrac{\alpha_{i,n}^2\,\pi_{i,n}}{\zeta\sigma_0^2\, a_{i,n}^*}}\right] - \mu_n
\begin{cases} = 0 & \text{if } 0 < a_{i,n}^* < 1 \\ > 0 & \text{if } a_{i,n}^* = 1 \end{cases} \quad \forall i,n. \tag{7.24}
\]

By substituting the power allocation, $p_{i,n}$, from (7.23) into (7.24), we obtain

\[
H_{i,n}(\gamma_i) \begin{cases} = \mu_n & \text{if } 0 < a_{i,n}^* < 1 \\ > \mu_n & \text{if } a_{i,n}^* = 1 \end{cases} \quad \forall i,n, \tag{7.25}
\]
where the function $H_{i,n}(\gamma_i)$ is defined as

\[
H_{i,n}(\gamma_i) =
\begin{cases}
0 & \text{if } \dfrac{\gamma_i}{\beta\ln 2} < \dfrac{\zeta\sigma_0^2}{\alpha_{i,n}^2} \\[2ex]
\gamma_i\left[\log_2\!\left(\dfrac{\alpha_{i,n}^2\,\gamma_i}{\zeta\sigma_0^2\,\beta\ln 2}\right) - \dfrac{1}{\ln 2}\left(1-\dfrac{\zeta\sigma_0^2\,\beta\ln 2}{\alpha_{i,n}^2\,\gamma_i}\right)\right] & \text{if } \dfrac{\zeta\sigma_0^2}{\alpha_{i,n}^2} \le \dfrac{\gamma_i}{\beta\ln 2} \le \dfrac{\zeta\sigma_0^2}{\alpha_{i,n}^2} + P_{total} \\[2ex]
\gamma_i\left[\log_2\!\left(1+\dfrac{\alpha_{i,n}^2}{\zeta\sigma_0^2}P_{total}\right) - \dfrac{1}{\ln 2}\cdot\dfrac{\dfrac{\alpha_{i,n}^2}{\zeta\sigma_0^2}P_{total}}{1+\dfrac{\alpha_{i,n}^2}{\zeta\sigma_0^2}P_{total}}\right] & \text{if } \dfrac{\gamma_i}{\beta\ln 2} > \dfrac{\zeta\sigma_0^2}{\alpha_{i,n}^2} + P_{total}.
\end{cases} \tag{7.26}
\]

In the case where $a_{i,n}^* \in (0,1)$ for an arbitrary user $i$ on subcarrier $n$, Constraint (7.9) mandates that there will be time-sharing on subcarrier $n$ (i.e., $a_{i,n}^* \in (0,1)$ for more than one user) since $\sum_i a_{i,n}^* = 1$. From (7.25), we thus have $H_{i,n}(\gamma_i) = \mu_n$ for every user $i \in \{i \in \mathcal{I} \mid a_{i,n}^* \in (0,1)\}$, which implies that all users sharing subcarrier $n$ must have the same $H_{i,n}(\gamma_i)$. However, since $H_{i,n}(\gamma_i)$ is a function of $\alpha_{i,n}$ and $\{\alpha_{i,n}\}$ are outcomes of independent and real-valued random variables modeling Rayleigh fading, it is highly unlikely that the value of $H_{i,n}(\gamma_i)$ for two or more users will be identical. Hence, the case of $a_{i,n}^* \in (0,1)$ in (7.25) is unlikely. Using (7.9) and (7.25), we see that there will only be one user $i^*$ for which $a_{i^*,n}^* = 1$ and that subcarrier $n$ will be assigned to the user $i^*$ with the largest $H_{i,n}(\gamma_i)$. For all other users $i \neq i^*$, $a_{i,n}^* = 0$. In other words,

\[
a_{i,n}^* = \begin{cases} 1 & \text{if } i = i^* \\ 0 & \text{if } i \neq i^* \end{cases} \quad \forall n, \tag{7.27}
\]
where
\[
i^* = \arg\max_{i} H_{i,n}(\gamma_i). \tag{7.28}
\]

In the unlikely event that multiple users have identical $H_{i,n}(\gamma_i)$ values for subcarrier $n$, $i^*$ is chosen from among these users with equal probabilities. Since $H_{i,n}(\gamma_i)$ in (7.26) plays an integral role in determining the optimal subcarrier assignment, we examine its properties with respect to (w.r.t.) $\alpha_{i,n}^2$ and $\gamma_i$:

1) By differentiating $H_{i,n}(\gamma_i)$ w.r.t. $\alpha_{i,n}^2$ for all three cases in (7.26), it can be shown that $\partial H_{i,n}(\gamma_i)/\partial \alpha_{i,n}^2 \ge 0$. As each $H_{i,n}(\gamma_i)$ is a monotonically increasing function of $\alpha_{i,n}^2$, and since subcarrier $n$ is assigned to the user $i^*$ with the largest $H_{i,n}(\gamma_i)$ according to (7.28), the subcarrier will be assigned to the user with the highest channel gain on that subcarrier when the effect of $\gamma_i$ is not considered.

2) Similarly, by differentiating $H_{i,n}(\gamma_i)$ w.r.t. $\gamma_i$ for all three cases in (7.26), it can be shown that $H_{i,n}(\gamma_i)$ is also a monotonically increasing function of $\gamma_i$. As will be shown in Section 7.3.3, $\gamma_i$ is related to the bitQoS values of user $i$. Hence, users with higher bitQoS-valued bits are also more likely to have higher $H_{i,n}(\gamma_i)$ values, resulting in a higher chance of being assigned subcarrier $n$.

From property 1), we note that the optimal subcarrier assignment in (7.27) agrees with the water-filling solution for throughput maximization in a multi-user OFDM system [30]. Property 2) shows that the optimal subcarrier assignment in (7.27) takes into account the bitQoS values from our proposed bitQoS-aware RA framework. Given that $H_{i,n}(\gamma_i)$ is a monotonically increasing function of both $\alpha_{i,n}^2$ and $\gamma_i$, the optimal subcarrier assignment (7.27) will assign a subcarrier to the user with good channel condition and high bitQoS-valued bits.

7.3.3 Optimal Bit Assignment

In this section, we determine the optimal bit assignment for a given power allocation and subcarrier assignment.
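The subcarrier assignment rule in (7.26)-(7.28) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the thesis code: `gain` stands for the normalized channel gain $\alpha_{i,n}^2/(\zeta\sigma_0^2)$, the three cases of (7.26) are evaluated through the clipped power of (7.23), and ties are broken by taking the lowest user index instead of the random tie-breaking described in the text. The function names are hypothetical.

```python
import math

def h_metric(gamma_i, gain, beta, p_total):
    """H_{i,n}(gamma_i) of (7.26); gain = alpha_{i,n}^2 / (zeta * sigma_0^2)."""
    level = gamma_i / (beta * math.log(2))    # water level of user i
    floor = 1.0 / gain                        # noise floor on subcarrier n
    if level < floor:
        return 0.0                            # first case: no power allocated
    # Power from (7.23), clipped at p_total, expressed as a receive SNR.
    x = gain * min(level - floor, p_total)
    return gamma_i * (math.log2(1.0 + x) - x / ((1.0 + x) * math.log(2)))

def assign_subcarriers(gammas, gains, beta, p_total):
    """(7.27)-(7.28): each subcarrier goes to the user with the largest H.

    gains[i][n] is the normalized gain of user i on subcarrier n.
    Ties go to the lowest user index (an assumption of this sketch).
    """
    n_users, n_sub = len(gammas), len(gains[0])
    return [max(range(n_users),
                key=lambda i: h_metric(gammas[i], gains[i][n], beta, p_total))
            for n in range(n_sub)]
```

For example, with equal `gammas` the assignment follows the channel gains alone, which matches property 1); raising one user's `gamma_i` increases that user's metric on every subcarrier, which matches property 2).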
The optimal bit assignment, $b_i^{j,z*}$, is obtained by differentiating the Lagrangian, $L$, in (7.11) with respect to $b_i^{j,z}$ and substituting the result into the KKT condition (7.19). Specifically, for a given power allocation, $p_{i,n}$, and subcarrier assignment, $a_{i,n}$, we have

\[
\left.\frac{\partial L}{\partial b_i^{j,z}}\right|_{b_i^{j,z}=b_i^{j,z*}} = \psi_i^{j,z} - \gamma_i - \lambda_i^{j,z}
\begin{cases} < 0 & \text{if } b_i^{j,z*} = 0 \\ = 0 & \text{if } b_i^{j,z*} \in (0,1) \\ > 0 & \text{if } b_i^{j,z*} = 1 \end{cases} \quad \forall i,j,z. \tag{7.29}
\]

From (7.21), we see that if $b_i^{j,z*} = 0$, $\lambda_i^{j,z} = 0$, and by solving (7.29) for $\gamma_i$, we obtain

\[
\gamma_i \begin{cases} > \psi_i^{j,z} & \text{if } b_i^{j,z*} = 0 \\ = \psi_i^{j,z} - \lambda_i^{j,z} & \text{if } b_i^{j,z*} \in (0,1) \\ < \psi_i^{j,z} - \lambda_i^{j,z} & \text{if } b_i^{j,z*} = 1 \end{cases} \quad \forall i,j,z. \tag{7.30}
\]

The term, $\lambda_i^{j,z}$, is a Lagrange multiplier which takes on a value in $[0, \psi_i^{j,z} - \gamma_i]$. We see from (7.30) that $\gamma_i$ (and the associated water level, $\gamma_i/(\beta\ln 2)$) for each user $i$ is related to the bitQoS values, $\psi_i^{j,z}$, of the assigned and unassigned bits of that user. Specifically, in the case of unassigned bits ($b_i^{j,z*} = 0$), $\gamma_i > \psi_i^{j,z}$, i.e., $\gamma_i$ should take on a value that is greater than the bitQoS values of all the unassigned bits in the data buffer of user $i$. In the case of assigned bits ($0 < b_i^{j,z*} \le 1$), $\gamma_i \le \psi_i^{j,z} - \lambda_i^{j,z}$, i.e., $\gamma_i$ should take on a value that is less than or equal to the bitQoS values of all the assigned bits in the data buffer of user $i$. This relationship in (7.30) appears to be counter-intuitive, as increasing $\gamma_i$ (and the associated allocated power, $p_{i,n}$, in (7.23)) does not increase the number of assigned bits, since bits can only be assigned when $\psi_i^{j,z} \ge \gamma_i$.
However, if we take into account the KKT conditions collectively, in particular the primal condition (7.14) and the stationarity condition (7.19) (from which (7.30) is derived), we see that these two KKT conditions drive the assignment of bits in opposing directions such that the optimal solution (satisfying all the KKT conditions), if it exists, strives to assign the highest bitQoS-valued bits requiring the least amount of power (i.e., bits with large $\psi_i^{j,z}$, subcarriers with large $\alpha_{i,n}$ and subcarriers requiring small $p_{i,n}$).

7.3.4 Optimal Joint Subcarrier, Power and Bit Allocation

In this section, we determine the optimal solution to OP7.2, which must simultaneously satisfy all of the KKT conditions (7.12)-(7.21) for continuous rate adaptation. Using the optimal power, subcarrier and bit allocations in (7.23), (7.27) and (7.30), we propose an iterative KKT-driven algorithm, hereafter referred to as KKT for Continuous Rate Adaptation (KKT-CRA), to numerically obtain the optimal solution. The algorithm is outlined in Algorithm 1.

Algorithm 1 KKT-CRA
  Initialize $\epsilon$ and $\gamma_i$ to some small number for all $i$
  Sort bits in each user data buffer by their bitQoS values in descending order
  repeat
    1) Perform subcarrier assignment
      1.1) Compute $H_{i,n}(\gamma_i)$ for all $i$ and $n$ according to (7.26)
      1.2) Update subcarrier assignment $a_{i,n}$ according to (7.27) with
           $i^*(n) = \arg\max_i H_{i,n}(\gamma_i)$ for all $n$
    2) Perform power allocation $p_{i,n}$ for all $i$ and $n$ according to (7.23),
       where $\beta$ is determined using bisection subject to constraint (7.7)
    3) Perform bit assignment
      3.1) Compute the throughput limits $c_{i,n}$ according to
           $c_{i,n} = \log_2\left(1 + \frac{|\alpha_{i,n}|^2 p_{i,n}}{\zeta\sigma_0^2}\right)$ for all $i$ and $n$
      3.2) Assign bits of each user $i$ in a FIFO manner to the subcarriers in $V_i = \{n \in \mathcal{N} \mid a_{i,n} = 1\}$, one subcarrier at a time, until all the bits of user $i$ are assigned or the throughput limits $c_{i,n}$, $\forall n \in V_i$, are reached
    4) Update $\gamma_i$ with iteration step size $\delta$
      4.1) Determine the bitQoS value, $\psi^{HOL}(i)$, of the first unassigned bit in the data buffer of each user $i$
      4.2) Update $\gamma_i$ according to
           $\gamma_i = (1-\delta)\gamma_i + \delta\,\psi^{HOL}(i)$ for all $i$
  until $\gamma_i + \epsilon > \psi^{HOL}(i)$ for all $i$

The algorithm begins by initializing $\gamma_i$, $\forall i \in \mathcal{I}$, to a value less than $\psi^{min}$, where $\psi^{min} = \min \psi_i^{j,z}$ denotes the smallest bitQoS value of all bits in the data buffers of all users. The bits from all application flows of each user $i$ are merged into one queue, i.e., $J_i = 1$, and sorted in decreasing order based on their bitQoS values. At each iteration, using the current values of $\gamma_i$, $\forall i$, power and subcarriers are allocated according to (7.23) and (7.27), respectively, and the corresponding number of bits, $c_{i,n}$, $\forall i,n$, that user $i$ can transmit on subcarrier $n$ is determined according to (4.1). The bits of each user $i$ are assigned in a FIFO manner to the subcarriers in $V_i$, one subcarrier at a time, where $V_i = \{n \in \mathcal{N} \mid a_{i,n} = 1\}$, until either all the bits of user $i$ have been assigned or the throughput limits $c_{i,n}$, $\forall n \in V_i$, have been reached.

Figure 7.1: Relationship between $\gamma_i$ and $\psi_i^{j,z}$ [panels: Initialization, Iteration, Termination; each shows the data buffer of user $i$, bit(i, j, 1) to bit(i, j, 7), split into assigned bits ($b_i^{j,z} = 1$) and unassigned bits ($b_i^{j,z} = 0$), with $\gamma_i$ and $\psi^{HOL}(i)$ marked]
At the end of each iteration, the value of $\gamma_i$ is updated according to $\gamma_i = (1-\delta)\gamma_i + \delta\,\psi^{HOL}(i)$ with an iteration step size, $\delta \in (0,1)$, where $\psi^{HOL}(i)$ denotes the bitQoS value of the first unassigned bit in the data buffer of user $i$, defined as

\[
\psi^{HOL}(i) = \max_{\text{bit}(i,j,z)\,\in\, S_{un}(i)} \psi_i^{j,z},
\]
and $S_{un}(i) = \{\text{bit}(i,j,z) \mid b_i^{j,z} = 0,\ j \in \mathcal{J}_i,\ z \in \{1,\ldots,B_i^j\}\}$ denotes the set of bits of user $i$ that have not yet been assigned based on the current bit assignment. The term bit(i, j, z) refers to bit $z$ of user $i$, flow $j$. The iteration repeats until $\gamma_i + \epsilon > \psi^{HOL}(i)$, $\forall i$, where $\epsilon \in \mathbb{R}^+$ is the termination tolerance. This relationship between $\gamma_i$ and $\psi_i^{j,z}$ is illustrated in Fig. 7.1. Note that in the process of updating $\gamma_i$ and the associated subcarrier power allocation $p_{i,n}$, $c_{i,n}$ may take on a non-integer value, hence yielding a continuous rate adaptation solution.

Since the KKT-CRA algorithm works by iteratively increasing the values of $\gamma_i$, $\forall i$, monotonically towards $\psi^{HOL}(i)$, it will always converge and the solution will satisfy the KKT conditions with appropriate values of $\delta$ and $\epsilon$. In addition, the choice of values for $\delta$ and $\epsilon$ will also determine the number of iterations for KKT-CRA to converge and the closeness of the obtained solution to the optimal solution. They can be varied to achieve a tradeoff between the closeness to optimality and computation time.

7.4 Subcarrier, Power and Bit Allocation with Discrete Rate Adaptation

In the previous section, we adopted the integer constraint relaxation technique and proposed an optimal joint subcarrier, power and bit allocation algorithm for the bitQoS-aware RA framework with continuous rate adaptation. This relaxation on $a_{i,n}$ and $b_i^{j,z}$ allows time-sharing of a subcarrier and permits subcarriers to transmit a non-integer number of bits. Simply quantizing the solution from the continuous rate case does not necessarily yield the optimal solution to the discrete rate case.
As a result, the optimal solution obtained by KKT-CRA may not provide a feasible solution for OP7.1, but gives an upper bound on the maximum achievable bitQoS-weighted throughput for OP7.1. In this section, we leverage the KKT conditions and KKT-CRA presented in Section 7.3 and propose a bit-loading-based, iterative joint subcarrier, power and bit allocation algorithm, hereafter referred to as KKT with Discrete Rate Adaptation (KKT-DRA), to numerically obtain a solution to OP7.1 with discrete $a_{i,n} \in \{0,1\}$ and $b_i^{j,z} \in \{0,1\}$. This algorithm is outlined in Algorithm 2. As we have shown in Section 7.3.2, time-sharing of a subcarrier, i.e., $a_{i,n} \in (0,1)$, is unlikely; hence by applying the same argument in KKT-DRA, we obtain $a_{i,n}^* \in \{0,1\}$. The key difference in obtaining discrete $b_i^{j,z}$ solutions in KKT-DRA is that the power allocation and bit assignment are done iteratively, one bit at a time, by bit-loading a bit to the subcarrier that maximizes the gain in bitQoS value while requiring the least amount of power, instead of using the water-filling algorithm as in KKT-CRA. Specifically, the algorithm begins by initializing the power allocation to each subcarrier to zero, i.e., $p_{i,n} = 0$, $\forall i,n$. At each bit-loading iteration, the incremental power, $\Delta p(n)$, required to transmit an additional bit from user $i$ (for which $a_{i,n} = 1$) on subcarrier $n$ and the increase in bitQoS value, $\Delta\psi(n)$, for transmitting that bit are calculated for all subcarriers $n$. The subcarrier that achieves the highest bitQoS value increase per unit power is selected, i.e., $n^* = \arg\max_{n \in \mathcal{N}} \Delta\psi(n)/\Delta p(n)$, and the corresponding bit assignment is performed. This iterative bit-loading algorithm repeats until the total transmit power, $P_{total}$, is reached or all the bits in the data buffers of all users have been assigned. The differences in power allocation and bit assignment between KKT-DRA and KKT-CRA are illustrated in Fig. 7.2.
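The bit-loading step described above can be sketched in Python as follows. This is a simplified illustration, assuming a fixed subcarrier assignment (`owner[n]` is the user holding subcarrier $n$), one merged and sorted buffer per user, and the incremental power rule $\Delta p(n) = (2^{r(n)+1} - 2^{r(n)})/SNR$ used in Algorithm 2, where `snr[n]` is taken to be the SNR per unit transmit power. Skipping subcarriers whose user buffer is empty is an assumption of this sketch, as are the function and variable names.

```python
def bit_load(snr, psi, owner, p_total):
    """Greedy bit-loading sketch for the discrete power and bit allocation.

    snr[n]   -- SNR per unit power on subcarrier n for its assigned user
    psi[i]   -- bitQoS values of user i, sorted in descending order
    owner[n] -- index of the user to which subcarrier n is assigned
    Returns (bits per subcarrier, bits taken per user, power used).
    """
    n_sub = len(snr)
    r = [0] * n_sub        # bits loaded on each subcarrier
    c = [0] * len(psi)     # bits already taken from each user buffer
    p_used = 0.0
    while True:
        best, best_ratio, best_dp = None, -1.0, 0.0
        for n in range(n_sub):
            i = owner[n]
            if c[i] >= len(psi[i]):
                continue                                 # buffer of user i is empty
            dp = (2 ** (r[n] + 1) - 2 ** r[n]) / snr[n]  # power for one more bit
            ratio = psi[i][c[i]] / dp                    # bitQoS gain per unit power
            if ratio > best_ratio:
                best, best_ratio, best_dp = n, ratio, dp
        # Stop when nothing is left to load or the best bit does not fit.
        if best is None or p_used + best_dp > p_total:
            break
        r[best] += 1
        c[owner[best]] += 1
        p_used += best_dp
    return r, c, p_used
```

Because each additional bit on a subcarrier costs twice the power of the previous one, the ratio for a subcarrier falls quickly as it fills, which spreads bits across subcarriers unless one user holds much higher bitQoS values.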
However, since the KKT conditions are sufficient for optimality only for convex optimization problems, the discrete rate adaptation solutions obtained using KKT-DRA may not be optimal for OP7.1. Nonetheless, we will later show in Section 7.5.1 that the reduced-complexity KKT-DRA algorithm provides a solution that closely approximates the optimal solution obtained using a commercial MINLP optimization solver package.

Algorithm 2 KKT-DRA
  Initialize $\delta$ and $\gamma_i$ to some small number for all $i$
  Sort bits in each user data buffer by their bitQoS values in descending order
  repeat
    1) Perform subcarrier assignment
      1.1) Compute $H_{i,n}(\gamma_i)$ for all $i$ and $n$ according to (7.26)
      1.2) Update subcarrier assignment $a_{i,n}$ according to (7.27) with
           $i^*(n) = \arg\max_i H_{i,n}(\gamma_i)$ for all $n$
    2) Perform discrete power and bit allocation using bit-loading
      $\hat{p}_{i,n} = 0$ for all $i$ and $n$
      $\hat{b}_{i,n}^{j,z} = 0$ for all $i$, $j$, $z$ and $n$
      $r(n) = 0$ for all $n$; $C(i) = 0$ for all $i$
      $p_{used} = 0$; $p_{inc} = 0$
      while $p_{used} + p_{inc} \le P_{total}$ do
        $p_{used} = p_{used} + p_{inc}$
        for $n = 1 : N$ do
          $i^*(n) = \arg\max_{i \in \mathcal{I}} \hat{a}_{i,n}$
          $\Delta p(n) = \dfrac{2^{r(n)+1} - 2^{r(n)}}{SNR(i^*(n), n)}$
          $\Delta\psi(n) = \psi_{i^*(n)}^{1,\,C(i^*(n))+1}$
        end for
        $n^* = \arg\max_{n \in \mathcal{N}} \dfrac{\Delta\psi(n)}{\Delta p(n)}$
        $p_{inc} = \Delta p(n^*)$
        if $p_{used} + p_{inc} \le P_{total}$ then
          $r(n^*) = r(n^*) + 1$
          $C(i^*(n^*)) = C(i^*(n^*)) + 1$
          $\hat{p}_{i^*(n^*),n^*} = \hat{p}_{i^*(n^*),n^*} + p_{inc}$
          $\hat{b}_{i^*(n^*),n^*}^{j,\,C(i^*(n^*))} = 1$
        end if
      end while
    3) Determine $\beta$ based on
       \[ \sum_{i}\sum_{n} a_{i,n}\left[\frac{\gamma_i}{\beta\ln 2} - \frac{\zeta\sigma_0^2}{\alpha_{i,n}^2}\right]^+ \le P_{total} \tag{7.31} \]
       using bisection
    4) Update $\gamma_i$ with iteration step size $\delta$
      4.1) Determine the bitQoS value, $\psi^{HOL}(i)$, of the first unassigned bit in the data buffer of each user $i$
      4.2) Update $\gamma_i$ according to
           $\gamma_i = (1-\delta)\gamma_i + \delta\,\psi^{HOL}(i)$ for all $i$
  until $\gamma_i + \epsilon > \psi^{HOL}(i)$ for all $i$

Figure 7.2: Differences in Power Allocation and Bit Assignment between KKT-CRA and KKT-DRA [panels: KKT-CRA, KKT-DRA; axes: Power versus Subcarriers; annotations: water levels $\gamma_i^*/(\beta^*\ln 2)$, normalized noise power $\zeta\sigma_0^2/\alpha_{i,n}^2$, power allocated $p_{i,n}^*$, and $\Delta p(n)$, the incremental power to transmit one bit]

7.5 Simulation Results

In this section, we present simulation results to illustrate the performance of the proposed algorithms for the bitQoS-aware RA framework: KKT-CRA, for the continuous rate adaptation problem OP7.2, and KKT-DRA, for the discrete rate adaptation problem OP7.1. The simulation was performed using Matlab and the system model described in Section 4.2. In the simulation, it is assumed that each user has one flow with a full buffer. As the focus of this chapter is to develop efficient and practical algorithms to obtain near-optimal solutions, the generation of realistic bitQoS values as in Chapter 5 is deemed to be unnecessary. Instead, the bitQoS values of the bits in each data buffer are randomly generated from a continuous uniform distribution where the range is varied to represent bitQoS values of different traffic class types [85]. To study the effect of the variability of bitQoS values on the algorithms, we consider two different bitQoS generation schemes: Same Maximum BitQoS (SMB), in which the bitQoS values of all application flows, $\psi_i^{j,z} \sim U(0, 50)$, $\forall i,j,z$, are generated in the range between 0 and 50, and Varying Maximum BitQoS (VMB), in which the bitQoS values of user $i$, flow $j$, $\psi_i^{j,z} \sim U(0, \Psi_i^j)$, $\forall i,j,z$, are generated in the range between 0 and $\Psi_i^j$, where $\Psi_i^j \sim U(0, 50)$ is the maximum bitQoS value for user $i$, flow $j$. The SMB scheme represents a traffic mix with mild variation between applications and the VMB scheme represents a traffic mix with diverse servicing priorities between applications.
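The two bitQoS generation schemes can be sketched in a few lines of Python. The sketch below is illustrative only (the thesis simulations were run in Matlab); it assumes one flow per user, as stated above, and returns each buffer sorted in descending order as required by the first step of both algorithms. The function name and the `rng` argument are hypothetical.

```python
import random

def generate_bitqos(n_users, n_bits, scheme, psi_max=50.0, rng=None):
    """Draw per-user bitQoS buffers under the SMB or VMB scheme.

    SMB: every bit is drawn from U(0, psi_max).
    VMB: user i first draws its own maximum from U(0, psi_max),
         then every bit of that user is drawn from U(0, that maximum).
    """
    if scheme not in ("SMB", "VMB"):
        raise ValueError("scheme must be 'SMB' or 'VMB'")
    rng = rng or random.Random()
    buffers = []
    for _ in range(n_users):
        upper = psi_max if scheme == "SMB" else rng.uniform(0.0, psi_max)
        bits = [rng.uniform(0.0, upper) for _ in range(n_bits)]
        buffers.append(sorted(bits, reverse=True))   # descending bitQoS order
    return buffers
```

Passing a seeded `random.Random` instance makes a trial reproducible, which is convenient when the same realizations must be fed to several algorithms for comparison.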
To provide a comparative performance assessment of the proposed algorithms, we compare the solutions from KKT-CRA and KKT-DRA with the optimal results obtained using a commercial MINLP optimization solver package. We present simulation results to illustrate various aspects of the KKT-CRA and KKT-DRA algorithms in terms of A) optimality and computation time, B) sensitivity of KKT-CRA and KKT-DRA to the iteration step size, $\delta$, and termination tolerance, $\epsilon$, and C) performance comparison of KKT-CRA and KKT-DRA to the classical greedy multi-user water-filling algorithm. The simulation results were obtained by averaging over 10,000 independent trials, each representing a scheduling decision with a different realization of $\psi_i^{j,z}$, $\forall i,j,z$, and $\alpha_{i,n}$, $\forall i,n$. Due to the long computation times required, the optimal results in Sections 7.5.1 and 7.5.2 were obtained over 20 independent trials using the commercial MINLP optimization solver package.

7.5.1 Optimality and Computation Time

We demonstrate the optimality of KKT-CRA and KKT-DRA by comparing them with the optimal continuous rate adaptation (OPT-CRA) and optimal discrete rate adaptation (OPT-DRA) solutions obtained by a commercial MINLP optimization solver package, which uses the branch-and-bound approach [86]. Due to the NP-hard nature of OP7.1, where branch-and-bound has a worst case complexity of $O(2^{RIN})$, we simulate the system with $I = \{2, 3, 4, 5, 6\}$, $N = 6$, $\delta = 0.3$, $\epsilon = 10^{-4}$ and an SNR, $p_{i,n}|\alpha_{i,n}|^2/(\zeta\sigma_0^2)$, of 15 dB for the results in this subsection so that the OPT-CRA and OPT-DRA solutions can be obtained within a reasonable amount of time.

We show in Fig. 7.3 the average objective value obtained by OPT-CRA, KKT-CRA, OPT-DRA and KKT-DRA over an identical set of 20 independent trials for $I = \{2, 3, 4, 5, 6\}$ using the SMB bitQoS generation scheme.
As stated in Section 7.3, since OP7.2 is a convex optimization problem, satisfaction of the KKT conditions (7.12)-(7.21) is sufficient for the continuous rate adaptation solution to be optimal. This is shown in Fig. 7.3, where the average objective value obtained by KKT-CRA is identical to the optimal solution obtained by OPT-CRA and provides an upper bound on the maximum achievable bitQoS-weighted throughput for OP7.1. In the case of discrete rate adaptation, the average objective value obtained by KKT-DRA is slightly lower than OPT-CRA/KKT-CRA (objective value upper bound). While the optimality of KKT-DRA cannot be proved, Fig. 7.3 shows that the KKT-DRA solution is almost identical to the optimal solution obtained by OPT-DRA.

Figure 7.3: Average Objective Value as a Function of $I$ ($N = 6$, $\delta = 0.3$, $\epsilon = 10^{-4}$ and $SNR = 15$ dB) [curves: OPT-CRA, KKT-CRA, OPT-DRA, KKT-DRA; axes: Average Objective Value versus Number of Users]

It is important to note that both the KKT-CRA and KKT-DRA solutions are obtained with reduced complexity (shown in Section 7.3.4 and Section 7.4, respectively) as compared to the optimal solutions, OPT-CRA and OPT-DRA. This is reflected in Fig. 7.4, where the computation times of both KKT-CRA and KKT-DRA are orders of magnitude lower than those of OPT-CRA and OPT-DRA.

Figure 7.4: Average Computation Time as a Function of $I$ ($N = 6$, $\delta = 0.3$, $\epsilon = 10^{-4}$ and $SNR = 15$ dB) [curves: OPT-CRA, KKT-CRA, OPT-DRA, KKT-DRA; axes: Average Computation Time (sec, logarithmic scale) versus Number of Users]

7.5.2 Sensitivity to Iteration Step Size and Termination Tolerance

We next study the sensitivities of the objective values and computation times of KKT-CRA/DRA to the iteration step size, $\delta$, and the termination tolerance, $\epsilon$. We consider a system with $I = 3$, $N = 6$ and $SNR = 15$ dB using both the SMB and VMB bitQoS generation schemes.
The average objective value deviations of the KKT-CRA/DRA solutions from the OPT-CRA/DRA solutions and the average computation times of KKT-CRA/DRA as functions of $\delta \in [0, 1]$ and of $\epsilon \in [0, 50]$ are shown in Figs. 7.5 and 7.6, respectively. The objective value deviation is defined as $(\delta_{obj}^{OPT\text{-}CRA/DRA} - \delta_{obj}^{KKT\text{-}CRA/DRA})/\delta_{obj}^{OPT\text{-}CRA/DRA} \times 100\%$, where $\delta_{obj}^{OPT\text{-}CRA/DRA}$ and $\delta_{obj}^{KKT\text{-}CRA/DRA}$ denote the objective values obtained by OPT-CRA/DRA and KKT-CRA/DRA, respectively.

We see from Fig. 7.5 that the objective value deviations of KKT-CRA/DRA from OPT-CRA/DRA, respectively, are minimal for $\delta \in [0, 0.95]$ for both the SMB and VMB bitQoS generation schemes. KKT-CRA/DRA have average deviations of $1.3 \times 10^{-3}$% and $3.9 \times 10^{-2}$% from OPT-CRA/DRA, respectively, using the SMB bitQoS generation scheme, and average deviations of 0.48% and 1.42%, respectively, using the VMB bitQoS generation scheme. In general, using a high value of $\delta$ results in a faster termination (fewer iterations) of the KKT-CRA/DRA algorithms. Since the power allocation (7.23), subcarrier assignment (7.26) and bit assignment (7.30) are all dependent on $\gamma_i$, and the bit assignment is also dependent on $\psi_i^{j,z}$ of the assigned and unassigned bits in (7.30), increasing $\gamma_i$ too rapidly with a high value of $\delta$ may cause the algorithm to approach the thresholds in (7.30) too rapidly and lead to subcarrier/bit assignments and power allocations which are quite far from optimal. As the bitQoS values of the bits that are selected for transmission between application flows are more diverse for the VMB bitQoS generation scheme, terminating KKT-CRA/DRA prematurely (possibly in the first iteration) will result in a solution with a high objective value deviation. This is especially noticeable for $\delta$ values close to 1, as shown in Fig. 7.5, since $\gamma_i$ is updated to a value that is close to $\psi^{HOL}(i)$ very quickly.
On the other hand, we can see that the effect of a high $\delta$ value is less pronounced with the SMB bitQoS generation scheme since the bitQoS values of the bits selected for transmission among application flows are less diverse. Hence, even if $\gamma_i$ is set to $\psi^{HOL}(i)$ in the first iteration of the algorithms, the associated water level, $\gamma_i/(\beta\ln 2)$, will be similar among users, resulting in a classical water-filling solution.

In terms of computation time, Fig. 7.5 shows that, regardless of the bitQoS generation scheme, the computation time decreases as $\delta$ increases for both KKT-CRA/DRA. This is due to the fact that the number of iterations, $D$, performed in the main loop of KKT-CRA/DRA is inversely related to $\delta$. It can be seen that KKT-DRA has a higher computation time than KKT-CRA for both SMB and VMB, as within each main loop, the power allocation in KKT-DRA is performed iteratively on a bit-by-bit basis whereas in KKT-CRA, the power allocation is obtained by (7.18) using the bisection algorithm. From Fig. 7.5, it is recommended that $\delta$ be set to a value of around 0.3 in this simulation setup, which is a compromise between a desired small $\delta$ value to achieve a low objective value deviation and the computation time required for both the SMB and VMB bitQoS generation schemes.

It can be seen from Fig. 7.6 that the objective value deviations of KKT-CRA/DRA from OPT-CRA/DRA, respectively, are larger as the termination tolerance $\epsilon$ varies between 0 and 50 than as the iteration step size $\delta$ varies between 0 and 1, for both the SMB and VMB bitQoS generation schemes. KKT-CRA/DRA have average deviations of 0.18% and 0.23% from OPT-CRA/DRA, respectively, using the SMB bitQoS generation scheme and average deviations of 7.68% and 6.23%, respectively, using the VMB bitQoS generation scheme. KKT-CRA/DRA is more sensitive to $\epsilon$ since $\epsilon$ determines how closely $\gamma_i$ approaches $\psi^{HOL}(i)$ when the algorithm terminates.
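The roles of $\delta$ and $\epsilon$ in the termination of the main loop can be illustrated with a small Python sketch. It makes a simplifying assumption that is not true of the full algorithms: $\psi^{HOL}(i)$ is held fixed, whereas in KKT-CRA/DRA it changes as bits are reassigned in each iteration. Under that assumption, the gap $\psi^{HOL}(i) - \gamma_i$ shrinks by a factor $(1-\delta)$ per update, and the loop stops once the gap falls below $\epsilon$. The function name is hypothetical.

```python
def iterations_to_terminate(gamma0, psi_hol, delta, eps):
    """Count updates gamma <- (1 - delta)*gamma + delta*psi_hol
    until the stopping rule gamma + eps > psi_hol is met.

    Assumes psi_hol stays fixed, 0 < delta <= 1 and eps > 0.
    Returns (number of updates, final gamma).
    """
    gamma, count = gamma0, 0
    while gamma + eps <= psi_hol:
        gamma = (1.0 - delta) * gamma + delta * psi_hol
        count += 1
    return count, gamma
```

For instance, starting from `gamma0 = 0` with `psi_hol = 40`, a step size of 0.5 and a tolerance of 1 needs six updates, while raising either the step size or the tolerance shortens the loop and leaves `gamma` further from the value it would otherwise reach.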
A high value of $\epsilon$ can cause the algorithm to terminate prematurely with a $\gamma_i$ value that is less than the intended value of $\psi^{HOL}(i)$ in the terminating conditions of KKT-CRA and KKT-DRA ($\gamma_i + \epsilon > \psi^{HOL}(i)$), and lead to allocations and assignments that are quite far from optimal. The objective value deviation becomes pronounced when the value of $\epsilon$ is $\ge \min_i \psi^{HOL}(i)$, as shown in Fig. 7.6 ($\epsilon > 35$ for SMB and $\epsilon > 5$ for VMB). In particular, since the bitQoS values of the bits selected for transmission are more diverse when the VMB bitQoS generation scheme is used, the objective value deviation spans a larger range of $\epsilon \in [5, 50]$ as compared to $\epsilon \in [35, 50]$ when using the SMB bitQoS generation scheme.

Figure 7.5: Sensitivity of KKT-CRA and KKT-DRA to $\delta$ ($I = 3$, $N = 6$, $SNR = 15$ dB and $\epsilon = 1$) [curves: SMB and VMB objective value deviation and computation time for KKT-CRA and KKT-DRA; axes: Average Objective Value Deviation (%) and Average Computation Time (sec) versus Iteration Step Size, $\delta$]

Figure 7.6: Sensitivity of KKT-CRA and KKT-DRA to $\epsilon$ ($I = 3$, $N = 6$, $SNR = 15$ dB and $\delta = 0.3$) [curves: SMB and VMB objective value deviation and computation time for KKT-CRA and KKT-DRA; axes: Average Objective Value Deviation (%) and Average Computation Time (sec) versus Termination Tolerance, $\epsilon$]

In terms of computation time, we see from Fig. 7.6 that, regardless of the bitQoS generation scheme, the computation time of both KKT-CRA and KKT-DRA decreases as $\epsilon$ increases since a large value of $\epsilon$ will terminate the algorithm more quickly.
Hence, as with δ, a small value of ε is preferred while at the same time maintaining computational efficiency. It can also be seen that KKT-DRA has a higher computation time than KKT-CRA for both SMB and VMB, since the power allocation in KKT-DRA is performed iteratively on a bit-by-bit basis whereas in KKT-CRA the power allocation is obtained from (7.18) using the bisection algorithm. It is recommended that ε be set to a value that is smaller than the difference between the bitQoS values of any two consecutive different bitQoS-valued bits. By selecting the values of δ and ε appropriately, the KKT-CRA/DRA algorithms can be tuned to meet the computation time requirement with a pre-determined objective value deviation bound.

7.5.3 Performance Comparison of KKT-CRA and KKT-DRA to the Greedy Multi-user Water-filling Algorithm

In this section, we compare the performance of the bitQoS-aware resource allocation using the proposed KKT-CRA and KKT-DRA algorithms to the classical greedy multi-user water-filling algorithm [30], hereafter referred to as WF-CRA for continuous rate adaptation and WF-DRA for discrete rate adaptation. WF-CRA/DRA assign each subcarrier to the user that has the best channel gain for that subcarrier, and the transmit power is distributed over the subcarriers using the water-filling algorithm [84]. The purpose of the comparison is to study the effect of making the subcarrier assignments and power allocations dependent on both the channel gains and the bitQoS values of the bits in the user data buffers, as in KKT-CRA/DRA, as opposed to just the channel gains, as in the well-studied WF-CRA/DRA, which do not take QoS requirements into account but attempt to maximize the overall throughput of the system. We simulate the system with a number of subcarriers, N ∈ {6, 12, 25, 50, 75, 100}, to represent the number of resource blocks (RBs) of the different practical LTE transmission bandwidth configurations [87, 88].
The corresponding numbers of users in the system are I ∈ {2, 4, 9, 17, 25, 34}, respectively, using both the SMB and VMB bitQoS generation schemes. We see from Fig. 7.7a that the average throughput and average objective values for WF-CRA/DRA and KKT-CRA/DRA are essentially identical for the SMB bitQoS generation scheme. This is because the bitQoS values of the bits selected for transmission do not vary widely. As such, the bitQoS values can essentially be neglected and OP4.2 becomes a throughput maximization problem subject to a total power constraint, where the optimal channel assignment (7.26) is solely determined by α_{i,n}. On the other hand, for the VMB bitQoS generation scheme, we see from Fig. 7.7b that while WF-CRA/DRA have higher average throughputs than KKT-CRA/DRA, KKT-CRA/DRA have higher average objective values as they attempt to maximize the bitQoS-weighted throughput. This is because the bitQoS values of the bits selected for transmission are more diverse and hence the optimal channel assignment (7.26) for KKT-CRA/DRA depends on both α_{i,n} and ψ_i^{j,z}, instead of just α_{i,n} as for WF-CRA/DRA. In terms of computation time, we see from Fig. 7.8 that while KKT-CRA/DRA incur a computation time increase over WF-CRA/DRA for both the SMB and VMB bitQoS generation schemes, the increase is small for KKT-DRA.
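The WF-CRA baseline described in this section can be illustrated with a minimal sketch. It assumes a single noise term `noise` standing in for ζσ0², squared channel gains supplied as a plain list of lists, and a bisection search on the water level; the function name, argument names and tolerance are hypothetical and are not taken from the thesis or from [30].

```python
import math

def greedy_water_filling(gains, p_total, noise=1.0, tol=1e-9):
    """Greedy multi-user water-filling sketch.

    gains[i][n] is the squared channel gain of user i on subcarrier n
    (assumed input layout). Returns (owner, power, rate) per subcarrier.
    """
    num_users = len(gains)
    num_sub = len(gains[0])

    # Step 1: assign each subcarrier to the user with the best gain on it.
    owner = [max(range(num_users), key=lambda i: gains[i][n])
             for n in range(num_sub)]

    # Noise-to-gain level seen on each subcarrier by its owner.
    floor = [noise / gains[owner[n]][n] for n in range(num_sub)]

    # Step 2: bisection on the water level mu so that
    # sum_n max(0, mu - floor[n]) equals the total power budget.
    lo, hi = 0.0, max(floor) + p_total
    while hi - lo > tol:
        mu = 0.5 * (lo + hi)
        used = sum(max(0.0, mu - f) for f in floor)
        if used > p_total:
            hi = mu
        else:
            lo = mu
    mu = 0.5 * (lo + hi)

    power = [max(0.0, mu - f) for f in floor]
    rate = [math.log2(1.0 + p / f) for p, f in zip(power, floor)]
    return owner, power, rate


# Two users, three subcarriers, unit power budget (illustrative numbers only).
owner, power, rate = greedy_water_filling(
    [[1.0, 0.1, 0.5], [0.2, 0.8, 0.4]], p_total=1.0)
```

Because the assignment step looks only at channel gains, the sketch ignores the bitQoS values entirely, which is the property the comparison with KKT-CRA/DRA is meant to expose.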
Figure 7.7: Comparison of Average Objective Value and Average Throughput between KKT-CRA/DRA and WF-CRA/DRA for (a) SMB and (b) VMB. [Plots: average objective value and average throughput (bits) versus number of subcarriers for KKT-CRA, KKT-DRA, WF-CRA and WF-DRA.]

Figure 7.8: Comparison of Average Computation Time between KKT-CRA/DRA and WF-CRA/DRA. [Panels: SMB and VMB; average computation time (sec) versus number of subcarriers for KKT-CRA, KKT-DRA, WF-CRA and WF-DRA.]

7.6 Conclusion

Optimality conditions and efficient algorithms for the proposed bitQoS-aware RA framework were presented in this chapter for deployment consideration in practical OFDMA systems. The MINLP bitQoS-aware RA problem (NP-hard) was transformed into a convex optimization problem for continuous rate adaptation through a variable transformation and the relaxation of integer constraints for both the subcarrier and bit assignment variables. Using the KKT conditions, we established necessary and sufficient optimality conditions for the continuous rate adaptation problem and showed that the optimal subcarrier assignments and power allocations are dependent on both the channel gains and the bitQoS values of the bits in the user data buffers. In addition, the optimal power allocation can be interpreted as a bitQoS-based multi-level water-filling solution. Efficient KKT-based algorithms, KKT-CRA and KKT-DRA, were developed to obtain the optimal and near-optimal solutions to the joint subcarrier, power and bit allocation problem with continuous and discrete rate adaptation, respectively.
The solutions obtained using the lower complexity KKT-CRA and KKT-DRA algorithms were compared with the optimal solutions, OPT-CRA and OPT-DRA, obtained using a commercial MINLP optimization solver package. The simulation results show that KKT-CRA yields solutions identical to those of OPT-CRA. While the optimality of the KKT-DRA solutions cannot be proved, the KKT-DRA solutions are shown to be almost identical to those of OPT-DRA. By appropriately selecting the parameters, δ and ε, the KKT-CRA and KKT-DRA algorithms can be tuned to trade off the computation time against the closeness of the solution to the optimal value.

Chapter 8

Computational Complexity and Practicality of BitQoS-aware Resource Allocation Framework⁷

8.1 Introduction

In this chapter, we assess the computational complexity of the scheduling policies proposed for the bitQoS-aware RA framework and evaluate their practicality for real-time resource allocation in LTE [27], an OFDM-based air interface.

8.2 Computational Complexity

To assess the computational complexity of the considered scheduling policies, we determine the number of operations performed at each scheduling decision time. The number of operations is determined by listing the pseudocode for each scheduling policy and counting the number of addition, assignment, comparison and multiplication operations associated with each line of code. To simplify the analysis, the exponential, ceiling and absolute value functions are each treated as a multiplication operation.

⁷ The material in this chapter is based on: C. E. Huang and C. Leung, “BitQoS-aware resource allocation for multi-user mixed-traffic OFDM systems,” IEEE Trans. Veh. Technol., vol. 61, no. 5, pp. 2067-2082, Jun. 2012. © 2012 IEEE. http://dx.doi.org/10.1109/TVT.2012.2189030
For each of the considered scheduling policies, the pseudocode and the associated number of operations performed per line of pseudocode are given in Appendix D, and the number of operations performed by each scheduling policy is summarized in Tables 8.1 and 8.2. The term, R, is the total number of allocated bits and $B = \max_{i \in I,\, j \in J_i} B_{ij}$. The term, L, is associated with the number of iterations performed by the bisection algorithm, which can be approximated by $L \approx \log_2 (P_{total}/\varphi)$ [89], where ϕ is the tolerance value on how close the bisection algorithm comes to the solution. The term, κ, is associated with the number of iterative subcarrier assignments, power allocations and updates of the marginal utility performed by the MDU scheduling policy. The term, D, denotes the number of iterative updates of the Lagrange multiplier, γ_i, performed in the main loop of the KKT-CRA and KKT-DRA algorithms and it can be approximated by

$$D \approx \log_{\frac{1}{1-\delta}} \left( \frac{\psi^{max}}{\epsilon} \right),$$

where $\psi^{max} = \max_{i,j,z} \psi_i^{j,z}$ denotes the largest bitQoS value of all bits in the data buffers of all users. The term, Q, denotes the number of iterations required by the bisection algorithm in KKT-CRA and KKT-DRA to determine the Lagrange multiplier, β, and it can be approximated by

$$Q \approx \log_2 \left( \frac{y - x}{\varphi} \right),$$

where

$$x = \frac{\psi^{min}}{\ln 2 \left( P_{total} + \dfrac{\zeta \sigma_0^2}{\min_{i,n} |\alpha_{i,n}|^2} \right)} \quad \text{and} \quad y = \frac{\psi^{max} \max_{i,n} |\alpha_{i,n}|^2}{\zeta \sigma_0^2 \ln 2}$$

are the minimum and maximum values of β, respectively, which are obtained by solving for β within the total transmit power constraint

$$0 \le \frac{\gamma_i}{\beta \ln 2} - \frac{\zeta \sigma_0^2}{|\alpha_{i,n}|^2} \le P_{total}.$$

Due to the increased scheduling granularity, we note that the proposed bitQoS-aware scheduling policies, with the exception of KKT-CRA and KKT-DRA, generally have a higher computational complexity compared to both WF and MDU. Nonetheless, all of the proposed bitQoS-aware scheduling policies have polynomial-time complexities, as compared to a worst case complexity of $O(2^{RIN})$ for the optimal solution to the discrete rate adaptation problem.
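As a rough numerical illustration of the iteration-count approximations above, the sketch below evaluates L and D. It assumes a normalized total power P_total = 1 (an assumption that happens to reproduce the value L = 9.97 listed in Table 8.3), reads D as a logarithm to base 1/(1 − δ) of ψ^max/ε, and takes δ = 0.3, ψ^max = 50 and ε = 3 as example inputs; the function names are hypothetical.

```python
import math

def bisection_iterations(interval_width, tolerance):
    """Approximate number of bisection steps: log2(width / tolerance)."""
    return math.log2(interval_width / tolerance)

def main_loop_iterations(psi_max, delta, epsilon):
    """Approximate number of gamma_i updates, assuming
    D ~ log base 1/(1 - delta) of (psi_max / epsilon)."""
    return math.log(psi_max / epsilon) / math.log(1.0 / (1.0 - delta))

# P_total = 1 is an assumed normalization; phi = 0.001 as in Table 8.3.
L = bisection_iterations(interval_width=1.0, tolerance=0.001)

# delta, psi_max and epsilon are example values assumed for illustration.
D = main_loop_iterations(psi_max=50.0, delta=0.3, epsilon=3.0)

print(f"L ~ {L:.2f} bisection iterations")
print(f"D ~ {D:.2f} main-loop iterations")
```

Under these assumptions a larger δ or a larger ε shortens the main loop, which is consistent with the computation time trends reported for Figs. 7.5 and 7.6.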
8.3 Practicality of BitQoS-aware Scheduling Policies To evaluate the practicality of the proposed bitQoS-aware scheduling policies for real-time resource allocation, we consider the transmission bandwidth configurations [87, 88] of the 135 Table 8.1: Number of Operations Performed by Each Scheduling Policy - Part I Scheduling Addition Assignment Comparison Multiplication Big O Policy Operations Operations Operations Operations Notation WF 2N I + 4N 4N I + 4N 3N I + 3N 2N I + 8N O(N I + R + LN ) +L(2N + 2) +I + R + 4 +L(N + 2) +L(N + 3) +3 +L(N + 4) 2N I + 2I + κ(N I 4N I + 2N + 5I + R 2N I + 3I + κ(N I 7N I + N + 5I O(κN I + R +6N + 4I + 2) +κ(N I + 4N + 3I +4N + 2I + 1) +κ(4N I + 7N + 5I) +κLN ) MDU +κL(2N + 2) WFH-FM 3 2 +4) + κL(N + 4) 2 2 2 3 +κL(N + 2) 2 3 3 2 +κL(2N + 3) 3 2 2 136 I N B + 2I N B+ I N B + 5I N + 2I N I N + 3I N + 5I N IN B + 2I 3 N 2 + 4I 2 N 2 +3I 2 N 2 + I 2 N R + I 2 N B +4I 2 N + 6I 2 + 4IN B 2 2 +8I N + 2I + 7IN +2IB + Jsys + 5I + 4N 2 WFH-NFM 3 2 2 2 2 +15I N + 6I + 11IN + IB log B + 2IB + 16I + IR 2 +3N + L(I N 2 +IB log B + IR 2 +2LI 2 N 2 ) +7I + 6IB + 5IN 2 +5I + 7N + L(I N 2 +4I 2 N + IN + 4I + N + 4) +N + 2) +N + 3) 3 3 Jsys N 2 B + 5Jsys N2 3 2 +2Jsys N + 3Jsys N2 2 2 +Jsys N R + Jsys NB 2 2 +15Jsys N + 6Jsys + 3 3 Jsys N 2 + 3Jsys N 2 2 2 +5Jsys N + 4Jsys N 2 +6Jsys + 4Jsys N B 3 3 3Jsys N 2 + Jsys NB 3 O(Jsys N 2B 2 2 +2Jsys N 2 + 8Jsys N +Jsys R + IB 2 +7Jsys 2 2LJsys N 2) +7Jsys N + 4Jsys + 3N +5Jsys N + 5Jsys + 7N 2 2 +L(Jsys N 2 + 3Jsys N +2IN + 2N + 2I + 2) 3 Jsys N 2B 12Jsys N 2 +2I 2 N 2 + 8I 2 N +3I 2 N + IN + 3I +4N + R + 9 + L(I N 2 + 2Jsys N 2B 3 +Jsys N B + 2Jsys N2 2 2 +4Jsys N 2 + 8Jsys N 2 +2Jsys + 7Jsys N + 2IB 2 O(I 3 N 2 B + Jsys +2I N + IN + 2I +3 + L(2I N + 2I N 2 +IB log B + 7IN + 4I 3I 3 N 2 + I 3 N B +5Jsys + 4N + 3 +2IB + 15Jsys + Jsys R + 4N 2 +L(Jsys N2 2 2 +L(2Jsys N 2 + 2Jsys N 2 +R + 9 + L(Jsys N2 +Jsys N + 2Jsys + N +Jsys N + 3Jsys + N +2Jsys N + 2N + 2Jsys 2 +4Jsys N +2) +3) +2) +N + 4) + Jsys N + 
4Jsys + 2 2Jsys N + 5IB + Jsys B Table 8.2: Number of Operations Performed by Each Scheduling Policy - Part II Scheduling Addition Policy Operations BABL-FM BABL-NFM 137 KKT-CRA KKT-DRA 2IB + 2RBN Assignment Operations 2 2IB + 4N I + RBN +RBN + 2RN +2RBN + RN +4R +9R + N + I + 1 2IB + 2RBN Comparison 2 Multiplication Operations 2 2IB + 4N Jsys + RBN IB log B + 3RBN 2 Big O Operations 2 4IB + 3RBN Notation 2 O(IB log B −3RBN + 3RN −3RBN + 5N I +RBN 2 + RI +RI + R +RN +N I) 3RBN 2 4IB + 3RBN 2 O(IB +RBN + 2RN +2RBN + RN −3RBN + 3RN −3RBN + 5N Jsys +RBN 2 + RJsys +4R +9R + N + I + 1 +RJsys + R +RN +N Jsys ) 2IB + Jsys 2IB + N I + 3I + 1 IB log B + D[5N I 5IB + 5N I+ O(IB log B +D[5IN + 6I +D[IB + 5N I + 4I +5I + 2N + Q(N + 2)] D[18N I + 5I + 8N +Jsys + D(N I +4N + Q(N + 2)] +3N + 5 + Q(N + 4)] +Q(4N + 3)] +IB + QN )) 2IB + Jsys 2IB + N I + 3I + 1 IB log B + D[4N I 5IB + 5N I+ O(IB log B +D[2N I + 6I + 4N +D[IBN + 3N I + 4I +5I + 2N + R(2N + 1) D[11N I + 2I + 8N +Jsys + D(IBN +2 + R(3N + 5) +4N + 7 + R(2N + 7) +Q(N + 2)] +R(4N ) + Q(4N + 3)] +RN + QN )) +Q(2N + 2)] +Q(N + 4)] Table 8.3: Computation Time Calculation Parameter Values for LTE Parameter Value Parameter Value Subcarrier bandwidth 15 kHz B 44 bits Number of subcarriers in a SB 12 ϕ 0.001 SB bandwidth 180 kHz L 9.97 SB time duration 1 ms κ R OFDM symbols per SB with extended cyclic prefix 12 δ 0.3 Symbol duration 1/12 ms Modulation 16 QAM Jsys 2I 3.0 ψ max 50.0 LTE air interface. In the forward link of LTE systems, subcarriers are grouped into resource blocks (RBs) of 12 adjacent subcarriers, each with a subcarrier bandwidth of 15 kHz. Each RB has a time slot duration of 0.5 ms, which corresponds to 6 or 7 OFDM symbols depending on whether an extended or normal cyclic prefix is used. The smallest resource unit which a scheduler can assign to a user in LTE is a Scheduling Block (SB) [27, 90], which consists of two consecutive RBs, resulting in a subframe time duration of 1 ms with a frequency block of 180 kHz. 
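The relationship between the SB structure and the per-decision bit budget can be checked with a short sketch. It assumes the Table 8.3 settings (12 subcarriers per SB, 12 OFDM symbols per SB with an extended cyclic prefix, 16 QAM carrying 4 bits per symbol, 1 ms SB duration) and no coding or control overhead; the constant and function names are hypothetical.

```python
SUBCARRIERS_PER_SB = 12      # 12 adjacent 15 kHz subcarriers (180 kHz)
SYMBOLS_PER_SB = 12          # extended cyclic prefix: 2 RBs x 6 symbols
BITS_PER_SYMBOL = 4          # 16 QAM
SB_DURATION_S = 1e-3         # 1 ms subframe

def bits_per_decision(num_sbs):
    """Bits that can be carried in one scheduling decision over num_sbs SBs."""
    return num_sbs * SUBCARRIERS_PER_SB * SYMBOLS_PER_SB * BITS_PER_SYMBOL

def service_rate_mbps(num_sbs):
    """Corresponding system service rate in Mbps."""
    return bits_per_decision(num_sbs) / SB_DURATION_S / 1e6

for n in (6, 12, 25, 50, 75, 100):
    print(n, bits_per_decision(n), round(service_rate_mbps(n), 3))
```

For N = 6 this gives 3456 bits per scheduling decision and 3.456 Mbps, matching the R and µ entries for configuration A in Table 8.4.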
The LTE specifications define transmission bandwidth configurations ranging from 1.4 MHz to 20 MHz and the number of SBs and subcarriers depends on the overall transmission bandwidth of the system. In our calculation of the computation times of the considered scheduling policies for LTE, a system with a loading of ρ = 0.95 is used, where ρ = I(λ_BE + λ_EF)/µ. Each user is assumed to have 1 BE and 1 EF flow and the term, µ, denotes the service rate of the system. The number of instructions performed by each of the considered scheduling policies at each scheduling decision time is determined by summing the number of addition, assignment, comparison and multiplication operations listed in Tables 8.1 and 8.2. We assume that the basic operations used (addition, assignment, comparison and multiplication) are each effectively executed as a single instruction in a modern pipelined microprocessor architecture. The times that the considered scheduling policies require to make a scheduling decision are determined assuming the use of an Intel Core i7 Extreme Edition 990x microprocessor [91], which is rated to perform 159,000 Million Instructions Per Second (MIPS) at 3.46 GHz. The parameter values used for the calculation of the computation times are presented in Table 8.3, where the term, B, is approximated by T_SB(λ_BE + λ_EF), and T_SB is the time duration of a SB. The LTE transmission bandwidth configuration parameter values, the number of instructions executed by each scheduling policy and the computation times required for making each scheduling decision are presented in Tables 8.4 and 8.5. Note that for LTE, the term, N, can be viewed as the number of SBs rather than the number of subcarriers in the system.

Table 8.4: LTE Transmission Bandwidth Configurations

LTE Transmission Bandwidth Configuration           A        B        C        D        E        F
Bandwidth (MHz)                                    1.4      3.0      5.0      10.0     15.0     20.0
Number of subcarriers                              72       144      300      600      900      1200
N (Number of SBs)                                  6        12       25       50       75       100
R (Number of bits per scheduling decision time)    3456     6912     14400    28800    43200    57600
µ (System service rate) (Mbps)                     3.456    6.912    14.4     28.8     43.2     57.6
I (Number of users for ρ = 0.95)                   75       150      313      626      939      1252

Table 8.5: Computation Times of the Considered Scheduling Policies

Number of Instructions Per Scheduling Decision Time
LTE Transmission Bandwidth Configuration   A             B             C             D             E             F
WF                                         9.02 × 10^3   2.78 × 10^4   1.03 × 10^5   3.77 × 10^5   8.24 × 10^5   1.44 × 10^6
MDU                                        1.66 × 10^7   1.09 × 10^8   8.83 × 10^8   6.68 × 10^9   2.21 × 10^10  5.20 × 10^10
WFH-FM                                     1.76 × 10^9   5.21 × 10^10  1.96 × 10^12  6.16 × 10^13  4.65 × 10^14  1.95 × 10^15
WFH-NFM                                    1.36 × 10^10  4.10 × 10^11  1.56 × 10^13  4.91 × 10^14  3.71 × 10^15  1.56 × 10^16
BABL-FM                                    4.67 × 10^7   3.82 × 10^8   3.50 × 10^9   2.82 × 10^10  9.52 × 10^10  2.26 × 10^11
BABL-NFM                                   4.70 × 10^7   3.83 × 10^8   3.50 × 10^9   2.82 × 10^10  9.53 × 10^10  2.26 × 10^11
KKT-CRA                                    2.31 × 10^5   6.97 × 10^5   2.53 × 10^6   9.23 × 10^6   2.01 × 10^7   3.51 × 10^7
KKT-DRA                                    2.47 × 10^6   8.99 × 10^6   3.70 × 10^7   1.44 × 10^8   3.22 × 10^8   5.71 × 10^8

Computation Times (ms) (Intel 990x)
LTE Transmission Bandwidth Configuration   A             B             C             D             E             F
WF                                         5.67 × 10^-5  1.75 × 10^-4  6.46 × 10^-4  2.37 × 10^-3  5.18 × 10^-3  9.08 × 10^-3
MDU                                        1.05 × 10^-1  6.88 × 10^-1  5.55 × 10^0   4.20 × 10^1   1.39 × 10^2   3.27 × 10^2
WFH-FM                                     1.11 × 10^1   3.28 × 10^2   1.23 × 10^4   3.87 × 10^5   2.92 × 10^6   1.23 × 10^7
WFH-NFM                                    8.57 × 10^1   2.58 × 10^3   9.79 × 10^4   3.09 × 10^6   2.33 × 10^7   9.80 × 10^7
BABL-FM                                    2.94 × 10^-1  2.41 × 10^0   2.20 × 10^1   1.77 × 10^2   5.99 × 10^2   1.42 × 10^3
BABL-NFM                                   2.95 × 10^-1  2.41 × 10^0   2.20 × 10^1   1.77 × 10^2   5.99 × 10^2   1.42 × 10^3
KKT-CRA                                    1.45 × 10^-3  4.39 × 10^-3  1.59 × 10^-2  5.80 × 10^-2  1.26 × 10^-1  2.21 × 10^-1
KKT-DRA                                    1.55 × 10^-2  5.56 × 10^-2  2.33 × 10^-1  9.90 × 10^-1  2.03 × 10^0   3.59 × 10^0
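The conversion from instruction counts to computation times can be sketched as follows. The sketch assumes, as stated in the text, that every counted operation costs one instruction and that the processor sustains its rated 159,000 MIPS; the function names and the 1 ms budget parameter are hypothetical.

```python
MIPS_INTEL_990X = 159_000   # rated million instructions per second

def computation_time_ms(instructions, mips=MIPS_INTEL_990X):
    """Time in milliseconds to execute the given number of instructions."""
    seconds = instructions / (mips * 1e6)
    return seconds * 1e3

def fits_in_sb(instructions, sb_duration_ms=1.0, mips=MIPS_INTEL_990X):
    """True if a scheduling decision completes within one SB duration."""
    return computation_time_ms(instructions, mips) <= sb_duration_ms

# Instruction counts taken from Table 8.5.
print(computation_time_ms(9.02e3))   # WF, configuration A
print(fits_in_sb(3.51e7))            # KKT-CRA, configuration F
print(fits_in_sb(5.71e8))            # KKT-DRA, configuration F
```

This is an idealized estimate: it ignores memory latency, cache effects and instruction-level parallelism, so measured times on real hardware may differ.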
As can be seen from the computation time results in Table 8.5, the Intel 990x is already capable of executing: KKT-CRA within a 1 ms SB for all LTE transmission bandwidth configurations; KKT-DRA up to LTE transmission bandwidth configuration D; BABL-FM/NFM for LTE transmission bandwidth configuration A; and WFH-FM/NFM for LTE transmission bandwidth configuration A when the system loading is reduced to ρ = 0.45. KKT-DRA incurs a higher computational complexity than KKT-CRA because, at each iteration of the main loop where γ_i is updated, KKT-DRA has to perform RN bit-loading assignments instead of just N water-filling operations as in KKT-CRA. While the proposed bitQoS-aware scheduling policies are, in general, more computationally complex than the other considered scheduling policies (especially WF), they provide performance gains in user throughput and user packet drop probability (shown in Chapter 5) as well as effective throughput gains (shown in Table 6.1) over scheduling policies, such as WF, that do not take QoS provisions into account and scheduling policies, such as MDU, that only consider flow-level QoS requirements. These gains demonstrate that the increased scheduling granularity and flexibility of the proposed bitQoS RA framework may be attractive in many situations. Given the computation times of KKT-CRA and KKT-DRA shown in Table 8.5, which can at least support up to LTE transmission bandwidth configuration D, we expect that with the additional technological advancements outlined below, the bitQoS-aware RA framework is practical and can be adopted in even higher LTE transmission bandwidth configurations.
1) Faster/dedicated processors: Forward-looking statements indicate that the upcoming Intel Core i7 Extreme Edition 3960x microprocessor utilizing the Sandy Bridge architecture will yield a 47% performance increase [92] over the Intel 990x and microprocessors utilizing the Ivy Bridge architecture (to be released in 2012) [93] will yield another 20% performance increase over the Intel 3960x. In addition, for timing critical components such as resource allocation at the BS, we would expect commercial grade microprocessors/dedicated DSPs to be used in commercial deployments. 2) Algorithm development: We expect more efficient algorithms to be developed to take advantage of the proposed bitQoS-aware RA framework for deployment. The efficiencies can come from multiple areas such as mathematical techniques to reduce algorithm complexity and/or tradeoffs made between performance and complexity. 3) Multiple parallel baseband processing modules: We note that it is not uncommon for BSs to use multiple parallel baseband processing modules for system scalability/flexibility as well as to handle large bandwidth systems, e.g., 4 × 5 MHz for a 20 MHz system. 8.4 Conclusion In this chapter, we assessed the computational complexity of the proposed bitQoS-aware scheduling policies (WFH-FM/NFM, BABL-FM/NFM and KKT-CRA/DRA) by determining the number of operations performed at each scheduling decision time and evaluated the practicality of the proposed scheduling policies for real-time resource allocation by determining the computation time required to make a scheduling decision for the LTE air interface. 
We showed that the Intel 990x microprocessor is already capable of executing KKT-CRA within a 1 ms SB for all LTE transmission bandwidth configurations, KKT-DRA up to LTE transmission bandwidth configuration D, BABL-FM/NFM for LTE transmission bandwidth configuration A, and WFH-FM/NFM for LTE transmission bandwidth configuration A when the system loading is reduced to ρ = 0.45. In addition, we believe that with the rapid improvement in microprocessor performance, algorithm development and parallel processing modules, among other technological advancements, the bitQoS-aware RA framework is practical and can be adopted in even higher LTE transmission bandwidth configurations.

Chapter 9

Conclusion

This chapter summarizes the main contributions of this thesis and provides suggestions for future research work.

9.1 Contributions

In this thesis, we have investigated RA and proposed scheduling policies for single-carrier and multi-carrier communication systems that service multiple users with different applications and different QoS requirements. The main contributions of this thesis are summarized as follows:

In Chapter 2, the performance gains of scheduling policies that exploit MFM in multi-application single-carrier CDMA communication systems were quantified in terms of user throughput, user latency and user packet drop probability. The gains of MFM result from wastage reduction in the physical layer encoder packet and multiplexing of packets with different latency tolerances in a scheduling period. Additional performance gains were achieved by the ACLS-FM scheduling policy through the integration of MFM with a cross layer design (physical, MAC and application layers) and the utilization of a packet urgency function to allow a packet from a delay-sensitive application flow to have its service priority raised when its waiting time exceeds a predetermined threshold.
In Chapter 3, an ACLS-FUM scheduling policy that integrates both MFM and PDM while jointly considering physical-layer time-varying channel conditions as well as application-layer QoS requirements in a mixed traffic environment was proposed and evaluated. Simulation results showed that ACLS-FUM is able to achieve substantial performance gains in user throughput, user latency, user jitter and user packet drop probability when compared to other well known scheduling policies. This improvement is achieved due to the ability to PDM the physical layer encoder packet using MUP transmission, which not only improves the resource utilization (packing efficiency) by allowing delay-tolerant applications to fill up the unused physical layer encoder packet with higher priority, low-rate, latency-sensitive applications, but also provides an increase in the number of available time slots to support low-rate latency-sensitive applications, leading to increased system throughput and spectral efficiency.

In Chapters 4 and 5, a bitQoS-aware RA framework that exploits multi-application and multi-bit diversities by adaptively matching the QoS requirements of user application bits to the characteristics of the OFDM subcarriers was proposed for a multi-user OFDM system in a mixed-traffic environment. The simulation results, obtained using the proposed water-filling-based WFH scheduling policy and bit-loading-based BABL scheduling policy, showed that with the fine bit-level control provided by the proposed bitQoS-aware RA framework, it is possible to achieve both an increase in throughput and a reduction in packet drop probability at the cost of a longer scheduling delay (albeit one within the scheduling delay threshold). This flexibility comes from the realization that in OFDM, data is loaded onto subcarriers in units of bits and the latency QoS is satisfied as long as the bit waiting time does not exceed the scheduling delay threshold.
By applying the bitQoS function at the bit-level as proposed, system providers can trade off the bit waiting time for a reduction in the number of dropped packets by prioritizing which bit to transmit based on its closeness to the scheduling delay threshold. This finer resolution of control provides additional flexibility to push back the scheduling of bits that are not as close to the scheduling delay threshold (i.e., by increasing the bit waiting time) so as to allow the servicing of more “urgent” bits when necessary. As long as this push-back does not cause the bit waiting time to exceed the scheduling delay threshold, bits will be serviced within their scheduling delay thresholds, resulting in a simultaneous increase in user throughput and a reduction in the number of user bits dropped. Both WFH and BABL were also able to achieve the highest average system throughput across all considered system loads when compared to scheduling policies that do not take QoS provisions into account, such as WF, and policies that consider only flow-level QoS, such as MDU. In addition, it was found that in a multi-application system, the performance gains from allowing bits from different application flows of a user to be merged into a single subcarrier for transmission are small.

In Chapter 6, the viability of the proposed bitQoS-aware RA framework, with and without flow merging, was analyzed by taking the associated scheduling signaling overhead into account. A model was formulated to analyze the associated scheduling signaling overhead and the performance gains achievable with the bitQoS-aware RA framework were quantified. The entropy analysis shows that scheduling policies with flow merging incur a significantly higher scheduling signaling overhead compared to scheduling policies that do not allow flow merging.
However, the scheduling signaling overhead for scheduling policies with flow merging can be greatly reduced by grouping and sorting the bits carried on the subcarrier by their application flows and flow indices, respectively. Simulation results further show that despite the increase in the scheduling signaling overhead for scheduling policies with flow merging, the proposed bitQoS-aware RA framework is able to provide a higher effective throughput gain compared to scheduling policies that do not take QoS provisions into account, such as WF, and policies that consider only flow-level QoS requirements, such as MDU, when RLE compression of the scheduling signaling information is performed.

In Chapter 7, optimality conditions and efficient algorithms for the proposed bitQoS-aware RA framework were presented for deployment consideration in practical OFDMA systems. The MINLP bitQoS-aware RA problem (NP-hard) was transformed into a convex optimization problem for continuous rate adaptation through a variable transformation and the relaxation of integer constraints for both the subcarrier and bit assignment variables. Using the KKT conditions, we established necessary and sufficient optimality conditions for the continuous rate adaptation problem and showed that the optimal subcarrier assignments and power allocations are dependent on both the channel gains and the bitQoS values of the bits in the user data buffers. In addition, the optimal power allocation can be interpreted as a bitQoS-based multi-level water-filling solution. Efficient KKT-based algorithms, KKT-CRA and KKT-DRA, were developed to obtain the optimal and near-optimal solutions to the joint subcarrier, power and bit allocation problem with continuous and discrete rate adaptation, respectively. The solutions obtained using the lower complexity KKT-CRA and KKT-DRA algorithms were compared with the optimal solutions, OPT-CRA and OPT-DRA, obtained using a commercial MINLP optimization solver package.
The simulation results show that KKT-CRA yields solutions identical to those of OPT-CRA. While the optimality of the KKT-DRA solutions cannot be proved, the KKT-DRA solutions are shown to be almost identical to those of OPT-DRA. By appropriately selecting the parameters, δ and ε, the KKT-CRA and KKT-DRA algorithms can be tuned to trade off the computation time against the closeness of the solution to the optimal value.

In Chapter 8, we assessed the computational complexity of the proposed bitQoS-aware scheduling policies (WFH-FM/NFM, BABL-FM/NFM and KKT-CRA/DRA) by determining the number of operations performed at each scheduling decision time and evaluated the practicality of the proposed scheduling policies for real-time resource allocation by determining the computation time required to make a scheduling decision for the LTE air interface. We showed that the Intel 990x microprocessor is already capable of executing KKT-CRA within a 1 ms SB for all LTE transmission bandwidth configurations, KKT-DRA up to LTE transmission bandwidth configuration D, BABL-FM/NFM for LTE transmission bandwidth configuration A, and WFH-FM/NFM for LTE transmission bandwidth configuration A when the system loading is reduced to ρ = 0.45. In addition, we believe that with the rapid improvement in microprocessor performance, algorithm development and parallel processing modules, among other technological advancements, the bitQoS-aware RA framework is practical and can be adopted in even higher LTE transmission bandwidth configurations.

9.2 Future Work

In this thesis, we increase the flexibility and granularity of the RA algorithms by adopting an adaptive cross layer approach to exploit multi-application diversity in single-carrier communication systems and additionally, multi-bit diversity in multi-carrier communication systems.
While the results show that the proposed algorithms can achieve a higher system throughput with substantial performance gains in the considered QoS metrics, the following summarizes some possible topics for future study.

9.2.1 Analysis and Determination of Scheduling Block Size

The bitQoS-aware RA framework in Chapter 4 is formulated as optimization problems with no flow merging and with flow merging. However, it is shown in the results in Chapters 5 and 6 that, with or without consideration of the scheduling signaling overhead, in a multi-application system, the performance gains achievable by allowing different application flows of a user to be merged into a single subcarrier for transmission are quite small. This is because the scheduling block size considered in the simulations is on a per-resource-element basis (1 OFDM symbol × 1 subcarrier) and the number of bits in one application PDU is typically much greater than the number of bits that can be carried by a subcarrier. As a result, very little flow merging actually takes place and the performance gain from flow merging is minimal. It is expected that if we increase the scheduling block size to a per-resource-block basis (6/7 OFDM symbols × 12 subcarriers) as in LTE [27, 94], a higher throughput [74] and better QoS performance may be possible due to the further exploitation of the flow merging gain and bit-level scheduling. The higher throughput achieved may thus offset the additional scheduling signaling overhead that is incurred, especially for WFH-FMGS, and result in a higher effective throughput gain. As the potential flow merging gain is dependent on the scheduling block size, determining the appropriate scheduling block size is critical.
Detailed analysis needs to be performed when determining the scheduling block size as factors such as dependencies among subcarrier channel gains (over time and across frequency) and the increased bit waiting times need to be taken into consideration and traded 148 off with the potential flow merging gain. 9.2.2 Efficient and Optimal Solution to Discrete Rate Adaptation Problem It is shown in Chapter 7 that KKT-DRA attains a near-optimal solution. However, optimality of the solution to the discrete rate adaptation problem cannot be claimed. Further studies should be undertaken to develop efficient algorithms (for practical importance) to obtain the optimal solution (for theoretical importance) to the discrete rate adaptation problem. Given that the number of bits to be transmitted on a subcarrier is discrete, the MINLP problem can be transformed into a Mixed-Integer Linear Programming (MILP) problem by replacing the non-linear log function for ci,n in OP4.2 with piece-wise linear representations [95] and replacing constraints (4.11) and (4.12) of OP4.2 accordingly. The problem formulation can thus be represented as follows: I Ji Bij N j,z f (θθ j,z i )bi,n max ai,n ∈{0,1} d ∈[0,1] wi,n (9.1) i=1 j=1 z=1 n=1 bj,z i,n ∈{0,1} d wi,n pdi,n ≤ Ptotal subject to i n d bj,z i,n j (9.2) d dwi,n ≤ z ∀i, n (9.3) d ai,n ≤ 1 ∀n (9.4) bj,z i,n ≤ 1 ∀i, j, z (9.5) ∀i, n, (9.6) i n d wi,n ≤ ai,n d where the index d, d ∈ {0, 1, 2, ..., D} denotes the number of discrete bits a user can transmit on a subcarrier and D is the maximum number of bits that can be transmitted by a subcarrier. The term, pdi,n , denotes the transmit power required to transmit d bits of user i on subcarrier 2 n and can be calculated a priori for every value of d using pdi,n = (2d − 1)ζσ02 /αi,n . The d term, wi,n , is a optimization variable which takes on a value between 0 and 1. 
Techniques for solving MILP problems should be explored and the optimality of the MILP solution to the MINLP problem needs to be established [96].

9.2.3 Alternative Formulations of BitQoS Function

In Chapter 7, it is shown that the solution to the bitQoS-aware RA framework is a multilevel water-filling solution where the optimal subcarrier assignment is dependent on both the channel gain and the bitQoS value of the user, as opposed to just the channel gain in the classical water-filling solution. Since the proposed KKT-CRA and KKT-DRA scheduling policies are able to obtain the optimal and near-optimal solutions to the continuous and discrete rate adaptation problems, respectively, alternative formulations of the bitQoS function should be studied to take advantage of the bitQoS-aware RA framework to potentially address other critical issues in OFDM networks. As an example, given that the incremental power required to transmit additional bits on an OFDM subcarrier increases as bits are loaded onto a subcarrier, the bitQoS function can be formulated such that it takes into account both the bit latency and the transmit power required, where the latency experienced by a bit can be traded off for energy savings in green communication systems. Trade-offs in terms of the system throughput and pertinent QoS metrics should be quantified along with the savings in energy.

9.2.4 Distributed Resource Allocation Algorithms

The RA algorithms proposed in this thesis are centralized scheduling policies. However, as the scheduling granularity increases, so does the computational complexity of the algorithms for systems with a large number of users and subcarriers. In addition to developing efficient and optimal algorithms as outlined in Section 9.2.2, distributed RA algorithms should also be studied to broaden the scope of the centralized scheduling policies considered in this thesis.
In particular, computationally complex functions within the centralized RA algorithms need to be identified and segmented for distributed computing in an effort to reduce the computation burden on the computing server. In addition, since the MS entity in a cellular system is most aware of its channel conditions and application QoS requirements, distributed RA may be performed using a game theoretic approach [97, 98] where multiple players (MSs) seek to maximize a utility function (e.g., bitQoS-weighted throughput) using one of several available strategic RA actions, as opposed to a centralized RA being performed solely by the BS. The performance and trade-offs of such a distributed game theoretic RA approach can be evaluated against the centralized scheduling approach presented in this thesis, taking into account that the information received (e.g., CSI) may be imperfect.

Bibliography

[1] A. K. Parekh and R. G. Gallager, “A generalized processor sharing approach to flow control in integrated services networks: The single-node case,” IEEE/ACM Trans. Netw., vol. 1, no. 3, pp. 344–357, Jun. 1993.
[2] H. Zhang, “Service disciplines for guaranteed performance service in packet-switching networks,” Proc. IEEE, vol. 83, no. 10, pp. 1374–1396, Oct. 1995.
[3] S. Shakkottai, T. S. Rappaport, and P. C. Karlsson, “Cross-layer design for wireless networks,” IEEE Commun. Mag., vol. 41, no. 10, pp. 74–80, Oct. 2003.
[4] H. Fattah and C. Leung, “An overview of scheduling algorithms in wireless multimedia networks,” IEEE Wireless Commun., vol. 9, no. 5, pp. 76–83, Oct. 2002.
[5] G. Song and Y. Li, “Utility-based resource allocation and scheduling in OFDM-based wireless broadband networks,” IEEE Commun. Mag., vol. 43, no. 12, pp. 127–134, Dec. 2005.
[6] Y. Cao and V. O. K. Li, “Scheduling algorithms in broad-band wireless networks,” Proc. IEEE, vol. 89, no. 1, pp. 76–87, Jan. 2001.
[7] P. Bhagwat, P. Bhattacharya, A. Krishna, and S. K.
Tripathi, “Enhancing throughput over wireless LANs using channel state dependent packet scheduling,” in Proc. INFOCOM, Mar. 1996, pp. 1133–1140.
[8] J. M. Holtzman, “CDMA forward link waterfilling power control,” in Proc. VTC, May 2000, pp. 1663–1667.
[9] A. Jalali, R. Padovani, and R. Pankaj, “Data throughput of CDMA-HDR a high efficiency-high data rate personal communication wireless system,” in Proc. VTC, May 2000, pp. 1854–1858.
[10] S. Shakkottai and A. L. Stolyar, “Scheduling for multiple flows sharing a time-varying channel: The exponential rule,” Amer. Mathematical Soc. Translations, vol. 207, no. 1, pp. 185–202, Dec. 2002.
[11] ——, “Scheduling algorithms for a mixture of real-time and non-real-time data in HDR,” in Proc. ITC, Sep. 2001, pp. 793–804.
[12] G. Barriac and J. M. Holtzman, “Introducing delay sensitivity into the proportional fair algorithm for CDMA downlink scheduling,” in Proc. IEEE Int. Symp. Spread-Spectrum Tech. & Appl., Sep. 2002, pp. 652–656.
[13] A. Farrokh, F. Blomer, and V. Krishnamurthy, “A comparison of opportunistic scheduling algorithms for streaming media in high-speed downlink packet access (HSDPA),” Lecture Notes in Computer Science, vol. 3311, no. 1, pp. 130–142, Oct. 2004.
[14] C. Zhou, M. L. Honig, S. Jordan, and R. Berry, “Utility-based resource allocation for wireless networks with mixed voice and data services,” in Proc. Int. Conf. Computer Comm. Networks, Oct. 2002, pp. 485–488.
[15] H. Fattah and C. Leung, “A guaranteed quality of service wireless access scheme for CDMA networks,” in Proc. PACRIM, Aug. 2003, pp. 533–536.
[16] Y. P. Fallah and H. Alnuweiri, “Hybrid polling and contention access scheduling in IEEE 802.11e WLANs,” J. Parallel and Distributed Computing, vol. 67, no. 2, pp. 242–256, Feb. 2007.
[17] S. L. Kota, E. Hossain, R. Fantacci, and A. Karmouch, “Cross-layer protocol engineering for wireless mobile networks: Part 1,” IEEE Commun. Mag., vol. 43, no. 12, pp. 110–111, Dec. 2005.
[18] D.
Bertsekas and R. Gallager, Data Networks, 2nd ed. Upper Saddle River, NJ, USA: Prentice Hall, 1992.
[19] Z. J. Haas, “Design methodologies for adaptive and multimedia networks,” IEEE Commun. Mag., vol. 39, no. 11, pp. 106–107, Nov. 2001.
[20] T. S. Rappaport, A. Annamalai, R. M. Buehrer, and W. H. Tranter, “Wireless communications: Past events and a future perspective,” IEEE Commun. Mag., vol. 40, no. 5, pp. 148–161, May 2002.
[21] C. Verikoukis, L. Alonso, and T. Giamalis, “Cross-layer optimization for wireless systems: A European research key challenge,” IEEE Commun. Mag., vol. 43, no. 7, pp. 1–3, Jul. 2005.
[22] V. Srivastava and M. Motani, “Cross-layer design: A survey and the road ahead,” IEEE Commun. Mag., vol. 43, no. 12, pp. 112–119, Dec. 2005.
[23] H. Jiang, W. Zhuang, and X. Shen, “Cross-layer design for resource allocation in 3G wireless networks and beyond,” IEEE Commun. Mag., vol. 43, no. 12, pp. 120–126, Dec. 2005.
[24] R. Ferrus, L. Alonso, A. Umbert, X. Reves, and J. Perez, “Cross-layer scheduling strategy for UMTS downlink enhancement,” IEEE Commun. Mag., vol. 53, no. 6, pp. 24–28, Jun. 2005.
[25] K. B. Johnsson and D. C. Cox, “An adaptive cross-layer scheduler for improved QoS support of multiclass data services on wireless systems,” IEEE J. Sel. Areas Commun., vol. 23, no. 2, pp. 334–343, Feb. 2005.
[26] V. Kawadia and P. R. Kumar, “A cautionary perspective on cross-layer design,” IEEE Wireless Commun., vol. 12, no. 1, pp. 3–11, Feb. 2005.
[27] 3GPP TS 36.211 v9.1.0, “Physical Channels and Modulation (Release 9),” Mar. 2010.
[28] IEEE 802.16-2009, “Part 16: Air Interface for Broadband Wireless Access Systems,” May 2009.
[29] T. Keller and L. Hanzo, “Adaptive multicarrier modulation: A convenient framework for time-frequency processing in wireless communications,” Proc. IEEE, vol. 88, no. 5, pp. 611–640, May 2000.
[30] J. Jang and K. Lee, “Transmit power adaptation for multiuser OFDM systems,” IEEE J. Sel. Areas Commun., vol. 21, no. 2, pp.
171–178, Feb. 2003.
[31] W. Rhee and J. Cioffi, “Increase in capacity of multiuser OFDM system using dynamic subchannel allocation,” in Proc. IEEE VTC, May 2000, pp. 1085–1089.
[32] Z. Shen, J. Andrews, and B. Evans, “Adaptive resource allocation in multiuser OFDM systems with proportional rate constraints,” IEEE Trans. Wireless Commun., vol. 4, no. 6, pp. 2726–2737, Nov. 2005.
[33] 3GPP2 C.S0024-200-C v1.0, “Physical Layer for cdma2000 High Rate Packet Data Air Interface Specification,” Apr. 2010.
[34] X. Wang, G. Giannakis, and A. Marques, “A unified approach to QoS-guaranteed scheduling for channel-adaptive wireless networks,” Proc. IEEE, vol. 95, no. 12, pp. 2410–2431, Dec. 2007.
[35] M. Andrews, K. Kumaran, K. Ramanan, A. Stolyar, P. Whiting, and R. Vijayakumar, “Providing quality of service over a shared wireless link,” IEEE Commun. Mag., vol. 39, no. 2, pp. 150–154, Feb. 2001.
[36] G. Song and Y. Li, “Cross-layer optimization for OFDM wireless networks-part I: theoretical framework,” IEEE Trans. Wireless Commun., vol. 4, no. 2, pp. 614–624, Mar. 2005.
[37] ——, “Cross-layer optimization for OFDM wireless networks-part II: algorithm development,” IEEE Trans. Wireless Commun., vol. 4, no. 2, pp. 625–634, Mar. 2005.
[38] G. Song, “Cross-layer resource allocation and scheduling in wireless multicarrier networks,” Ph.D. dissertation, Georgia Inst. Technol., 2005.
[39] S. Ryu, B. Ryu, H. Seo, and M. Shin, “Urgency and efficiency based packet scheduling algorithm for OFDMA wireless system,” in Proc. IEEE ICC, May 2005, pp. 2779–2785.
[40] W. Park, S. Cho, and S. Bahk, “Scheduler design for multiple traffic classes in OFDMA networks,” in Proc. IEEE ICC, Jun. 2006, pp. 790–795.
[41] M. Katoozian, K. Navaie, and H. Yanikomeroglu, “Utility-based adaptive radio resource allocation in OFDM wireless networks with traffic prioritization,” IEEE Trans. Wireless Commun., vol. 8, no. 1, pp. 66–71, Jan. 2009.
[42] M. Andrews, K. Kumaran, K. Ramanan, A. Stolyar, R.
Vijayakumar, and P. Whiting, “CDMA data QoS scheduling on the forward link with variable channel conditions,” Bell Labs Tech. Memo., Apr. 2000.
[43] 3GPP2 C.S0024-A v2.0 (TIA-856-A-1), “cdma2000 High Rate Packet Data Air Interface Specification,” Aug. 2005.
[44] T. IS-95B, “Mobile station-base station compatibility standard for dual-mode wideband spread spectrum cellular systems,” Dec. 1998.
[45] A. Bedekar, S. Borst, K. Ramanan, P. Whiting, and E. Yeh, “Downlink scheduling in CDMA data networks,” in Proc. GLOBECOM, Dec. 1999, pp. 2653–2657.
[46] 3GPP2 C.S0024-0 v4.0 (TIA-IS-856-2), “cdma2000 High Rate Packet Data Air Interface Specification,” Oct. 2002.
[47] R. Yallapragada and M. Naidu, “New enhancements in 3G technologies,” in Proc. ICPWC, Jan. 2005, pp. 182–187.
[48] N. Bhushan, C. Lott, P. Black, R. Attar, Y.-C. Jou, M. Fan, D. Ghosh, and J. Au, “CDMA2000 1xEV-DO Revision A: A physical layer and MAC layer overview,” IEEE Commun. Mag., vol. 44, no. 2, pp. 37–49, Feb. 2006.
[49] C. Lott, N. Bhushan, D. Ghosh, R. Attar, J. Au, and M. Fan, “Reverse traffic channel MAC design of CDMA2000 1xEV-DO Revision A system,” in Proc. VTC, May 2005, pp. 1416–1421.
[50] 3GPP2 TSG-C WG 3, “cdma2000 Evaluation Methodology V6,” Dec. 2006.
[51] IETF RFC 2988, “Computing TCP’s Retransmission Timer,” Nov. 2000.
[52] ITU-T G.114, “One-way Transmission Time,” May 2003.
[53] T. H. Cormen, C. E. Leiserson, and R. L. Rivest, Introduction to Algorithms. Cambridge, MA, USA: MIT Press, 1990.
[54] Q. Bi, P.-C. Chen, Y. Yang, and Q. Zhang, “An analysis of VoIP service using 1xEV-DO Revision A system,” IEEE J. Sel. Areas Commun., vol. 24, no. 1, pp. 36–45, Jan. 2006.
[55] T. IS-2000.2-A, “Physical Layer Standard for cdma2000 Spread Spectrum Systems,” Mar. 2001.
[56] M. Yavuz, S. Diaz, R. Kapoor, M. Grob, P. Black, Y. Tokgoz, and C. Lott, “VoIP over cdma2000 1xEV-DO Revision A,” IEEE Commun. Mag., vol. 44, no. 2, pp. 88–95, Feb. 2006.
[57] R.
Chang, “Synthesis of band-limited orthogonal signals for multichannel data transmission,” Bell Sys. Tech. J., vol. 45, no. 10, pp. 1775–1796, Dec. 1966.
[58] A. Bahai, B. Saltzberg, and M. Ergen, Multi-Carrier Digital Communications: Theory and Applications of OFDM, 2nd ed. New York, NY, USA: Springer Verlag, 2004.
[59] A. Goldsmith, Wireless Communications. New York, NY, USA: Cambridge University Press, 2005.
[60] ITU-T G.993.1, “Very High Speed Digital Subscriber Line Transceivers,” Jun. 2004.
[61] P. W. C. Chan and R. S. K. Cheng, “Optimal power allocation in zero-forcing MIMO-OFDM downlink with multiuser diversity,” in Proc. IST Mobile & Wireless Communications Summit, Jun. 2005, pp. 1–5.
[62] Y. M. Tsang and R. Cheng, “Optimal resource allocation in SDMA/multi-input-single-output/OFDM systems under QoS and power constraints,” in Proc. IEEE WCNC, Mar. 2004, pp. 1595–1600.
[63] C. Wong, R. Cheng, K. Letaief, and R. Murch, “Multiuser OFDM with adaptive subcarrier, bit, and power allocation,” IEEE J. Sel. Areas Commun., vol. 17, no. 10, pp. 1747–1758, Oct. 1999.
[64] Y. Li, “Pilot-symbol-aided channel estimation for OFDM in wireless systems,” IEEE Trans. Veh. Technol., vol. 49, no. 4, pp. 1207–1215, Jul. 2000.
[65] X. Qiu and K. Chawla, “On the performance of adaptive modulation in cellular systems,” IEEE Trans. Commun., vol. 47, no. 6, pp. 884–895, Jun. 1999.
[66] M. Alouini and A. Goldsmith, “Capacity of Rayleigh fading channels under different adaptive transmission and diversity-combining techniques,” IEEE Trans. Veh. Technol., vol. 48, no. 4, pp. 1165–1181, Aug. 2002.
[67] I. Wong and B. Evans, “Optimal downlink OFDMA resource allocation with linear complexity to maximize ergodic rates,” IEEE Trans. Wireless Commun., vol. 7, no. 3, pp. 962–971, Mar. 2008.
[68] J. Huang, V. Subramanian, R. Agrawal, and R. Berry, “Downlink scheduling and resource allocation for OFDM systems,” in Proc. Conf. on Info. Science and Systems, Mar. 2006, pp. 1272–1279.
[69] X.
Wang and G. Giannakis, “Resource allocation for wireless multiuser OFDM networks,” IEEE Trans. Inf. Theory, vol. 57, no. 7, pp. 4359–4372, Jul. 2011.
[70] H. Nguyen, J. Brouet, V. Kumar, and T. Lestable, “Compression of associated signaling for adaptive multi-carrier systems,” in Proc. IEEE VTC, May 2004, pp. 1916–1919.
[71] J. Gross, H. Geerdes, H. Karl, and A. Wolisz, “Performance analysis of dynamic OFDMA systems with inband signaling,” IEEE J. Sel. Areas Commun., vol. 24, no. 3, pp. 427–436, Mar. 2006.
[72] E. Larsson, “Optimal OFDMA downlink scheduling under a control signaling cost constraint,” IEEE Trans. Commun., vol. 58, no. 10, pp. 2776–2781, Oct. 2010.
[73] R. Moosavi, J. Eriksson, and E. Larsson, “Differential signaling of scheduling information in wireless multiple access systems,” in Proc. IEEE GLOBECOM, Dec. 2010, pp. 1–6.
[74] R. Moosavi, J. Eriksson, E. Larsson, N. Wiberg, P. Frenger, and F. Gunnarsson, “Comparison of strategies for signaling of scheduling assignments in wireless OFDMA,” IEEE Trans. Veh. Technol., vol. 59, no. 9, pp. 4527–4542, Nov. 2010.
[75] S. S. Epp, Discrete Mathematics with Applications, 3rd ed. Boston, MA, USA: Brooks Cole, 2003.
[76] S. Golomb, “Run-length encodings,” IEEE Trans. Inf. Theory, vol. 12, no. 3, pp. 399–401, Jul. 1966.
[77] T. Welch, “A technique for high-performance data compression,” Computer, vol. 17, no. 6, pp. 8–19, Jun. 1984.
[78] D. Bertsekas, Constrained Optimization and Lagrange Multiplier Methods. Boston, MA, USA: Academic Press, 1982.
[79] M. Tao, Y. Liang, and F. Zhang, “Resource allocation for delay differentiated traffic in multiuser OFDM systems,” IEEE Trans. Wireless Commun., vol. 7, no. 6, pp. 2190–2201, Jun. 2008.
[80] W. Karush, “Minima of functions of several variables with inequalities as side constraints,” Master’s thesis, Univ. of Chicago, 1939.
[81] H. Kuhn and A. Tucker, “Nonlinear programming,” in Proc. Berkeley Symposium on Mathematical Statistics and Probability, Jul. 1951, pp.
481–492.
[82] S. Boyd and L. Vandenberghe, Convex Optimization. New York, NY, USA: Cambridge University Press, 2004.
[83] Z. Luo and W. Yu, “An introduction to convex optimization for communications and signal processing,” IEEE J. Sel. Areas Commun., vol. 24, no. 8, pp. 1426–1438, Aug. 2006.
[84] T. Cover and J. Thomas, Elements of Information Theory. New York, NY, USA: John Wiley & Sons, 1991.
[85] ITU-T Rec. Y.1541, “Network performance objectives for IP-based services,” Feb. 2006.
[86] F. S. Hillier and G. J. Lieberman, Introduction to Operations Research, 2nd ed. New York, NY, USA: McGraw-Hill, 1995.
[87] 3GPP TS 36.104 v9.9.0, “Base Station (BS) radio transmission and reception (Release 9),” Sep. 2011.
[88] J. Zyren and W. McCoy, “Overview of the 3GPP long term evolution physical layer,” White Paper, Freescale Semiconductor, Inc., Jul. 2007.
[89] W. Press, Numerical Recipes in FORTRAN: The Art of Scientific Computing. New York, NY, USA: Cambridge University Press, 1992.
[90] R. Kwan, C. Leung, and J. Zhang, “Multiuser scheduling on the downlink of an LTE cellular system,” Research Letters in Communications, vol. 2008, no. 1, pp. 1–4, Jan. 2008.
[91] Intel Corporation. (2011) Intel Core i7-990X Processor Extreme Edition. [Online]. Available: http://ark.intel.com/products/52585
[92] ——. (2011) Intel Core i7-3960X Processor Extreme Edition. [Online]. Available: http://ark.intel.com/products/63696
[93] ——. (2012) Ivy Bridge Products. [Online]. Available: http://ark.intel.com/products/codename/29902/Ivy-Bridge
[94] A. Ghosh, J. Zhang, R. Muhamed, and J. Andrews, Fundamentals of LTE. Upper Saddle River, NJ, USA: Prentice Hall, 2010.
[95] D. Babayev, “Piece-wise linear approximation of functions of two variables,” Journal of Heuristics, vol. 2, no. 4, pp. 313–320, Apr. 1997.
[96] G. Nemhauser and L. Wolsey, Integer and Combinatorial Optimization. New York, NY, USA: John Wiley & Sons, 1988.
[97] R. Myerson, Game Theory: Analysis of Conflict. Cambridge, MA, USA: Harvard University Press, 1997.
[98] D. Gesbert, S. Kiani, A. Gjendemsjø, and G. Øien, “Adaptation, coordination, and distributed resource allocation in interference-limited wireless networks,” Proc. IEEE, vol. 95, no. 12, pp. 2393–2409, Dec. 2007.
[99] I. Gradshteyn, I. Ryzhik, and A. Jeffrey, Table of integrals, series, and products, 6th ed. San Diego, CA, USA: Academic Press, 2000.

Appendix A

Inductive Proof of MUP Throughput Gain

Consider two scheduling policies: single-user packet (SUP) and multi-user packet (MUP). We assume all packets are of the same size and each user has one flow (i.e., $J_i = 1$). The SUP scheduling policy is formulated as follows:

Step 1: Let $Q_{\mathrm{SUP}_i}(k)$ denote the priority of user $i$ at time $k$. It is determined by

\[ Q_{\mathrm{SUP}_i}(k) = \mathrm{EPSize}_i(k), \quad \forall i \in \mathcal{I}. \tag{A.1} \]

The user to be scheduled at time $k$ is determined as

\[ i^*(k) = \arg\max_{i \in \mathcal{I}} Q_{\mathrm{SUP}_i}(k). \tag{A.2} \]

Step 2: Packets are selected, one at a time in an iterative fashion, from the data queue of user $i^*$ and added to the physical layer encoder packet until either the physical layer encoder packet $\mathrm{EPSize}_{i^*}(k)$ is filled or there are no more packets in the data queue.

The MUP scheduling policy is formulated similarly to the SUP scheduling policy, except that if there is any unfilled space in the physical layer encoder packet, the MUP scheduling policy may use the MUP transmission mode and piggyback packets from other users until the physical layer encoder packet is full. Thus, by construction, the MUP scheduling policy always transmits at least as many bits as the SUP scheduling policy in each time slot.

Let $\mathrm{SUP}(k)$ and $\mathrm{MUP}(k)$ denote the number of bits scheduled by the SUP and MUP scheduling policies, respectively, at time $k$. Hence,

\[ \mathrm{MUP}(k) \ge \mathrm{SUP}(k) \quad \text{for } k \ge 1. \tag{A.3} \]

Let $C_{\mathrm{SUP}}(K)$ and $C_{\mathrm{MUP}}(K)$ denote the total number of bits sent by the SUP and MUP scheduling policies, respectively, from $k = 1$ to $k = K$. They can be written as

\[ C_{\mathrm{SUP}}(K) = \mathrm{SUP}(1) + \mathrm{SUP}(2) + \ldots + \mathrm{SUP}(K) \tag{A.4} \]
\[ C_{\mathrm{MUP}}(K) = \mathrm{MUP}(1) + \mathrm{MUP}(2) + \ldots + \mathrm{MUP}(K). \tag{A.5} \]

We claim that $C_{\mathrm{MUP}}(k) \ge C_{\mathrm{SUP}}(k)$ for $k \ge 1$. This can be proven by induction as follows:

1. Base case: When $K = 1$, $C_{\mathrm{MUP}}(1) = \mathrm{MUP}(1) \ge \mathrm{SUP}(1) = C_{\mathrm{SUP}}(1)$.
2. Induction hypothesis: Assume that $C_{\mathrm{MUP}}(k) \ge C_{\mathrm{SUP}}(k)$.
3. Inductive step:

\begin{align}
C_{\mathrm{MUP}}(k+1) &= C_{\mathrm{MUP}}(k) + \mathrm{MUP}(k+1) \tag{A.6} \\
&\ge C_{\mathrm{SUP}}(k) + \mathrm{MUP}(k+1) \quad \text{(by induction hypothesis)} \tag{A.7} \\
&\ge C_{\mathrm{SUP}}(k) + \mathrm{SUP}(k+1) \quad \text{(by (A.3))} \tag{A.8} \\
&= C_{\mathrm{SUP}}(k+1). \tag{A.9}
\end{align}

Hence, $C_{\mathrm{MUP}}(k) \ge C_{\mathrm{SUP}}(k)$ for $k \ge 1$. Q.E.D.

Appendix B

Proof of Monotonicity of LHS of (4.21) and Existence of Solution of (4.21)

From (4.21), we define

\[ g(\gamma_0) \triangleq \sum_{n=1}^{N} \sum_{z=1}^{I} \binom{I}{z} (-1)^z \left[ \frac{z}{E\{\Gamma_n\}} E_1\!\left( \frac{z\gamma_0}{E\{\Gamma_n\}} \right) - \frac{1}{\gamma_0} e^{-\frac{z\gamma_0}{E\{\Gamma_n\}}} \right] - 1. \tag{B.1} \]

The first order derivative of $g(\gamma_0)$ with respect to $\gamma_0$ is

\[ \frac{dg(\gamma_0)}{d\gamma_0} = \sum_{n=1}^{N} \sum_{z=1}^{I} \binom{I}{z} (-1)^z \left[ \frac{z}{E\{\Gamma_n\}} \frac{d}{d\gamma_0} E_1\!\left( \frac{z\gamma_0}{E\{\Gamma_n\}} \right) - \frac{d}{d\gamma_0} \left( \frac{1}{\gamma_0} e^{-\frac{z\gamma_0}{E\{\Gamma_n\}}} \right) \right]. \tag{B.2} \]

Using the formula [99] $\frac{d}{dx} E_1(x) = -E_0(x)$, where $E_0(x) = \frac{e^{-x}}{x}$, we can rewrite (B.2) as

\[ \frac{dg(\gamma_0)}{d\gamma_0} = \sum_{n=1}^{N} \sum_{z=1}^{I} \binom{I}{z} (-1)^z \frac{1}{\gamma_0^2} e^{-\frac{z\gamma_0}{E\{\Gamma_n\}}} = \frac{1}{\gamma_0^2} \sum_{n=1}^{N} h_n(\gamma_0), \tag{B.3} \]

where $h_n(\gamma_0) \triangleq \sum_{z=1}^{I} \binom{I}{z} (-1)^z \left( e^{-\frac{\gamma_0}{E\{\Gamma_n\}}} \right)^z$. We also have

\[ h_n(\gamma_0) = \begin{cases} \displaystyle \sum_{z=0}^{I} \binom{I}{z} (-1)^{I-z} \left( e^{-\frac{\gamma_0}{E\{\Gamma_n\}}} \right)^z - 1, & \text{if } I \text{ is even} \\ \displaystyle -\sum_{z=0}^{I} \binom{I}{z} (-1)^{I-z} \left( e^{-\frac{\gamma_0}{E\{\Gamma_n\}}} \right)^z - 1, & \text{if } I \text{ is odd.} \end{cases} \tag{B.4} \]

Recall the binomial formula $(x + y)^n = \sum_{z=0}^{n} \binom{n}{z} x^{n-z} y^z$; then (B.4) becomes

\[ h_n(\gamma_0) = (-1)^I \left( -1 + e^{-\frac{\gamma_0}{E\{\Gamma_n\}}} \right)^I - 1. \tag{B.5} \]

Thus, for all $\gamma_0 > 0$ and $E\{\Gamma_n\} > 0$, we have $h_n(\gamma_0) \in [-1, 0)$ and $\frac{dg(\gamma_0)}{d\gamma_0} < 0$. In addition, given that $\lim_{\gamma_0 \to 0^+} g(\gamma_0) = +\infty > 0$ and $\lim_{\gamma_0 \to +\infty} g(\gamma_0) = -1 < 0$, there exists a unique $\gamma_0$ for which $g(\gamma_0) = 0$. Since $g(\gamma_0)$ is a monotonically decreasing function of $\gamma_0$, $\forall I \ge 1$, $N \ge 1$, $\gamma_0 > 0$ and $E\{\Gamma_n\} > 0$, the value of $\gamma_0$ for which $g(\gamma_0) = 0$ can be found numerically using a bisection algorithm.

Appendix C

Proof of Concavity of LHS of (7.8)

From (7.8), we define

\[ g(a_{i,n}, \pi_{i,n}, b_i^{j,z}) = \sum_{n} a_{i,n} \log_2\!\left( 1 + \frac{\pi_{i,n} |\alpha_{i,n}|^2}{\zeta \sigma_0^2 a_{i,n}} \right) - \sum_{j} \sum_{z} b_i^{j,z}. \tag{C.1} \]

The Hessian of the function $g$ at the point $\mathbf{x} = (a_{i,n}, \pi_{i,n}, b_i^{j,z})$ is given by

\[ H(g)(\mathbf{x}) = \begin{bmatrix} \dfrac{\partial^2 g}{\partial a_{i,n}^2} & \dfrac{\partial^2 g}{\partial a_{i,n} \partial \pi_{i,n}} & \dfrac{\partial^2 g}{\partial a_{i,n} \partial b_i^{j,z}} \\ \dfrac{\partial^2 g}{\partial \pi_{i,n} \partial a_{i,n}} & \dfrac{\partial^2 g}{\partial \pi_{i,n}^2} & \dfrac{\partial^2 g}{\partial \pi_{i,n} \partial b_i^{j,z}} \\ \dfrac{\partial^2 g}{\partial b_i^{j,z} \partial a_{i,n}} & \dfrac{\partial^2 g}{\partial b_i^{j,z} \partial \pi_{i,n}} & \dfrac{\partial^2 g}{\partial (b_i^{j,z})^2} \end{bmatrix} \tag{C.2} \]

\[ = \begin{bmatrix} -\dfrac{h_{i,n}^2 \pi_{i,n}^2}{\ln 2 \, a_{i,n}^3 \left( 1 + \frac{h_{i,n} \pi_{i,n}}{a_{i,n}} \right)^2} & \dfrac{h_{i,n}^2 \pi_{i,n}}{\ln 2 \, a_{i,n}^2 \left( 1 + \frac{h_{i,n} \pi_{i,n}}{a_{i,n}} \right)^2} & 0 \\ \dfrac{h_{i,n}^2 \pi_{i,n}}{\ln 2 \, a_{i,n}^2 \left( 1 + \frac{h_{i,n} \pi_{i,n}}{a_{i,n}} \right)^2} & -\dfrac{h_{i,n}^2}{\ln 2 \, a_{i,n} \left( 1 + \frac{h_{i,n} \pi_{i,n}}{a_{i,n}} \right)^2} & 0 \\ 0 & 0 & 0 \end{bmatrix} \tag{C.3} \]

where $h_{i,n}$ denotes $\frac{|\alpha_{i,n}|^2}{\zeta \sigma_0^2}$. The eigenvalues of $H(g)(\mathbf{x})$, $\lambda_1 = 0$, $\lambda_2 = 0$ and $\lambda_3 = -h_{i,n}^2 (a_{i,n}^2 + \pi_{i,n}^2) / (\ln 2 \, a_{i,n} (a_{i,n} + h_{i,n} \pi_{i,n})^2)$, are obtained by solving $\det(H(g)(\mathbf{x}) - \lambda I) = 0$. Given that $h_{i,n} \ge 0$, $a_{i,n} \ge 0$ and $\pi_{i,n} \ge 0$, all three eigenvalues are non-positive, so $H(g)(\mathbf{x})$ is a negative semi-definite matrix and hence, $g(\mathbf{x})$ is a concave function.

Appendix D

Computation Complexity Analysis

Legend:
1. add denotes the addition operation
2. assgn denotes the assignment operation
3. comp denotes the comparison operation
4. mult denotes the multiplication operation.

D.1 WFH-FM

Algorithm 3 WFH-FM (Part I)
\\ 1. Compute bitQoS values
for i = 1 : I do    ▷ for I times
  for j = 1 : $J_i$ do    ▷ for $J_i$ times
    for z = 1 : $B_i^j(k)$ do    ▷ for $B_i^j$ times
      $w_i^{j,z}(k) = (k - b_i^{j,z}.\mathrm{arrivalTime}) * T_s$    ▷ 1 add, 1 assgn, 1 mult
      $\psi_i^{j,z} = c_j * \pi_j * \gamma_j^{d_j * (w_i^{j,z}(k) - \eta_j)}$    ▷ 1 add, 1 assgn, 4 mult
    end for
  end for
end for

Algorithm 3 WFH-FM (Part II)
\\ 2. Merge and sort bits by bitQoS values
for i = 1 : I do    ▷ for I times
  $\psi_i$ = merge the bits from all application flows of user i and sort by $\psi_i^{j,z}$ in a descending order    ▷ B log B assgn, B log B comp
  $B_i = \sum_{j \in J_i} B_i^j(k)$    ▷ $J_i$ add, 1 assgn
end for

\\ 3. Throughput maximization
\\ 3.1.
Assign each subcarrier to the user with the highest channel gain for n = 1 : N do for N times ∗ i (n) = arg max SN Ri,n 1 assgn, I comp i∈I 21: 22: 23: 24: 25: 26: 27: 28: |αi∗ (n),n |2 SN R(n) = ζ ∗ σ02 for i = 1 : I do if i = i∗ (n) then a ˆi,n = 1 else a ˆi,n = 0 end if end for end for 29: 30: \\ 3.2. Determine transmit power and bit assignment using the water-filling algorithm ˆi,n ) (ˆ pi,n , cˆi,n , ˆb1,z i,n ) = W aterf illing(Ptotal , SN R(n), a 31: 32: 33: 34: \\ 3.3. Perform greedy water-filling subcarrier reassignment U = ∅ and C = 0 for i = 1 : I do cˆi,n R(i) = 35: 36: 37: 38: 39: 40: 41: 42: 43: 44: 45: 46: 47: C = C + min(R(i), Bi ) if R(i) > Bi then U = {U, i} end if end for U c = I − U and ΩU = ∅ for i ∈ U do for n = 1 : N do if ai,n = 1 then ΩU = {ΩU , n} end if end for end for 20: 5 mult, 1 assgn for I times 1 comp 1 assgn 1 assgn 2 assgn for I times N add, 1 assgn n 1 add, 1 assgn, 1 comp 1 comp 1 assgn 2 assgn, I comp for up to I times for N times 1 comp 1 assgn 166 Algorithm 3 WFH-FM (Part III) 48: loop 49: for n ∈ ΩU do 50: i∗ (n) = arg max ai,n i 51: for i ∈ U c do ∆Ci,n = min R(i) + log2 (1 + pˆi∗ (n),n ∗ 52: 53: 54: 55: 56: 57: 58: 59: 60: 61: 62: 63: 64: for up to I times for up to N times 1 assgn, I comp for up to I times |αi,n |2 ) , Bi − min (R(i), Bi ) ζσ02 2 add, 1 assgn, 2 comp, 7 mult end for end for (i , n ) = arg max ∆Ci,n 2 assgn, IN comp i,n ΩU = ΩU − n a ˆi,n = 0, ∀i = i a ˆi ,n = 1 |αi ,n |2 SN R(n ) = ζ ∗ σ02 1,z (ˆ pi,n , cˆi,n , ˆb i,n ) = W aterf illing(Ptotal , SN R(n), a ˆi,n ) C =0 for i = 1 : I do cˆi,n R(i) = 1 add, 1 assgn I assgn, I comp 1 assgn, 5 mult 1 assgn for I times N add, 1 assgn n 65: 66: 67: 68: 69: 70: 71: 72: 73: C = C + min(R(i), Bi ) if R(i) ≥ Bi then Uc = Uc − i end if end for if C > C then a ˆi,n = a ˆi,n , ∀i, n pˆi,n = pˆi,n , ∀i, n ˆbj,z = ˆb j,z , ∀i, j, z, n i,n 1 add, 1 assgn, 1 comp 1 comp 1 add, 1 assgn 1 comp IN assgn IN assgn N i,n Bi assgn i 74: 75: 76: 77: 78: C=C else break end if end 
loop 1 assgn \\ 3.4. Compute current intermediate objective value 80: δˆobj = ψij,z ˆbj,z N i,n 79: i i z n Bi add, 1 assgn, i 167 Bi mult i Algorithm 3 WFH-FM (Part IV) 81: 82: 83: 84: \\ 4. Iterative subcarrier reassignment loop for up to IN times for i = 1 : I do for I times j,z j ˆ Sun (i) = {bit(i, j, z)| bi,n = 0, j ∈ Ji , z ∈ {1, . . . , Bi }} n 85: ˆbj,z = 1, j ∈ Ji , z ∈ {1, . . . , B j }} i,n i Sas (i) = {bit(i, j, z)| n 86: 87: 88: 89: 90: 91: 92: 93: 94: 95: 96: 97: 98: 99: 100: 101: 102: 103: 104: 105: 106: 107: ψun (i) = ψas (i) = 109: 110: 111: 2N Bi add, 2Bi comp Bi comp ψij,z Bi comp max bit(i,j,z)∈Sun (i) min bit(i,j,z)∈San (i) end for loop if max ψun (i) ≤ min ψas (i) then i i break end if l∗ = arg max ψun (i) i Dl∗ = ∅ for n = 1 : N do if ai,n = 0 then Dl∗ = {Dl∗ ,n } end if end for n∗ = arg max αl∗ ,n for up to I − 1 times 2I + 1 comp 1 assgn, I comp 1 assgn for N times 1 comp 1 assgn 1 assgn, up to N comp n∈Dl∗ for i = 1 : I do ai,n∗ = 0 end for al∗ ,n∗ = 1 |αl∗ ,n∗ |2 SN R(n∗ ) = ζ ∗ σ02 (ˆ pi,n , ˆb1,z i,n ) = W aterf illing(Ptotal , SN R) δˆ = ψ j,z ˆbj,z obj i i 108: ψij,z i z i,n n 1 assgn 1 assgn, 5 mult N Bi add, 1 assgn, Bi mult i if δˆobj > δˆobj then a ˆi,n = a ˆi,n , ∀i, n pˆi,n = pˆi,n , ∀i, n ˆbj,z = ˆb j,z , ∀i, j, z, n i,n for I times 1 assgn i 1 comp IN assign IN assign N i,n Bi assign i 112: 113: 114: 115: 116: 117: δˆobj > δˆobj else ψun (l∗ ) = 0 end if end loop end loop 1 assign 1 assign 168 Algorithm 3 WFH-FM (Part V) 118: 119: 120: 121: 122: 123: 124: 125: function WATERFILLING(Ptotal , SN R(n), ai,n ) x=0 1 y = Ptotal + max n∈N SN R(n) for n = 1 : N do 1 ) px(n) = max(0, x − SN R(n) 1 py(n) = max(0, y − ) SN R(n) end for fx = px(n) − Ptotal 1 assgn 1 add, 1 assgn, N comp for N times 1 add, 1 assgn, 1 comp, 1 mult 1 add, 1 assgn, 1 comp, 1 mult N + 1 add, 1 assgn n∈N 126: fy = py(n) − Ptotal N + 1 add, 1 assgn 127: while |x − y| > do x+y λ= 2 for n = 1 : N do pn = max(0, λ − for L times n∈N 128: 129: 130: 131: 132: 1 ) SN 
R(n) end for f= pn − Ptotal 1 add, 1 assgn, 1 mult for N times 1 add, 1 assgn, 1 comp, 1 mult N + 1 add, 1 assgn n∈N 133: 134: 135: 136: 137: 138: 139: 140: 141: 142: 143: 144: 145: 146: 147: 148: 149: 150: 151: 152: 153: 154: 155: if f x ∗ f > 0 then x=λ fx = f else if f y ∗ f > 0 then y=λ fy = f else break end if end while 1 comp, 1 mult 1 assgn 1 assgn 1 comp, 1 mult 1 assgn 1 assgn for i = 1 : I do HOL = 0 for n = 1 : N do if ai,n = 1 then ci,n = log2 (1 + SN R(n) ∗ pn ) pi,n = pn b1,z i,n = 1 for z = HOL + 1 : HOL + ci,n HOL = HOL + ci,n end if end for end for return pi,n , ci,n , b1,z i,n end function 169 for I times 1 assgn for N times 1 comp 1 add, 1 assgn, 3 mult 1 assgn ci,n assgn 1 add, 1 assgn D.2 BABL-FM Algorithm 4 BABL-FM (Part I) 1: for i = 1 : I do 2: for j = 1 : Ji do 3: for z = 1 : Bij (k) do 4: wij,z (k) = (k − bj,z i .arrivalT ime) ∗ Ts 1 add, 1 mult, 1 assgn 5: 6: 7: 8: 9: 10: 11: 12: for I times for Ji times for Bij times ψij,z = cj ∗ π j ∗ d ∗(wj,z (k)−ηj ) ξj j i 27: 1 add, 3 mult, 1 assgn end for end for end for for i = 1 : I do for I times j,z ψi = merge the bits from all application flows of user i and sort by ψi in a descending order B log B comp end for for i = 1 : I do for I times for n = 1 : N do for N times |αi,n |2 5 mult, 1 assgn SN Ri,n = ζ ∗ σ02 cˆi,n = 0 1 assgn a ˆi,n = 0 1 assgn pˆi,n = 0 1 assgn end for end for psum = 0 1 assgn chused (n) = 0 for all n ∈ N N assgn HOL(i) = 1 for all i ∈ I I assgn loop for R times HOL(i) ∗ ∗ ∗ bit(i , j , z ) = arg max ψi 28: 29: 30: I comp, 1 assgn for N times 1 comp 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 31: 32: 33: i∈I for n = 1 : N do if chused (n) = 0 then ∗ ∗ 1 pij∗ ,n,z = SN Ri∗ ,n else if a ˆi∗ ,n = 1 then ∗ ∗ 2cˆi∗ ,n +1 − 2cˆi∗ ,n pij∗ ,n,z = SN Ri∗ ,n 1 mult, 1 assgn 1 comp 2 add, 3 mult, 1 assgn 34: 170 Algorithm 4 BABL-FM (Part II) 35: else 36: for every bit bit(l, j, z) ∈ Sl,n do 37: 38: 39: for m ∈ Ωl do if chused (m) = 0 then 1 j,z pl,m = SN Rl,m else if a 
ˆl,m = 1 then 2cˆl,m +1 − 2cˆl,m j,z = pl,m SN Rl,m 40: 41: 42: m∈Ωl cˆl,m∗ = cˆl,m∗ + 1 cˆl,n = cˆl,n − 1 end for ∗ ∗ 1 − pˆl,n + pij∗ ,n,z = SN Ri∗ ,n 51: 60: 61: 62: 63: 64: 65: 66: 67: 1 comp end if end for j,z m∗ = arg max pl,m 47: 48: 49: 50: 56: 57: 58: 59: 1 mult, 1 assgn 2 add, 3 mult, 1 assgn 43: 44: 45: 46: 52: 53: 54: 55: for up to B times for up to N − 1 times 1 comp N − 1 comp, 1 assgn 1 add, 1 assgn 1 add, 1 assgn j,z pl,m ∗ bit(l,j,z)∈Sl,n B + 2 add, 1 mult, 1 assgn end if end for ∗ ∗ n∗ = arg min pij∗ ,n,z if n∈N ∗ ∗ psum + pij∗ ,n,z∗ N comp, 1 assgn > Ptotal then 1 add, 1 comp break else a ˆi∗ ,n∗ = 1 ∗ ∗ pˆi∗ ,n∗ = pij∗ ,n,z∗ ˆbj∗∗ ,z∗∗ = 1 i ,n cˆi∗ ,n∗ = cˆi∗ ,n∗ + 1 ∗ ∗ psum = psum + pij∗ ,n,z∗ chused (n∗ ) = 1 HOL(i) = HOL(i) + 1 end if end loop 1 assgn 1 assgn 1 assgn 1 add, 1 assgn 1 add, 1 assgn 1 assgn 1 add, 1 assgn 171 D.3 KKT-CRA Algorithm 5 KKT-CRA (Part I) 1: 2: 3: 4: 5: \\ Compute bitQoS values for i = 1 : I do for j = 1 : Ji do for z = 1 : Bij (k) do wij,z (k) = (k − bj,z i .arrivalT ime) ∗ Ts 6: 7: 8: 9: ψij,z = cj ∗ πj ∗ γj j end for end for end for for I times for Ji times for Bij times 1 add, 1 mult, 1 assgn d ∗(wij,z (k)−ηj ) 1 add, 4 mult, 1 assgn for i = 1 : I do ψi = merge the bits from all application flows of user i and sort by ψij,z in a descending order Bi = sumj∈Ji Bij (k) for n = 1 : N do |αi,n |2 15: SN R(i, n) = ζ ∗ σ02 16: end for 17: end for 10: 11: 12: 13: 14: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28: β= γi = for all i ∈ I Si = 0 for all i ∈ I loop if Si = 1 for all i ∈ I then break end if \\ Perform subcarrier assignment for i = 1 : I do for n = 1 : N do B log B comp Ji add, 1 assgn for N times 5 mult, 1 assgn 1 assgn I assgn I assgn for D times I comp SN R(i, n)γi β ln 2 1 β ln 2 − max 0, 1 − ln 2 SN R(i, n)γi for I times for N times H(i, n) = γi max 0, log2 29: 30: 31: 32: for I times 2 add, 2 comp, 11 mult, 1 assgn end for end for 172 Algorithm 5 KKT-CRA (Part II) 33: for n = 1 : N do 34: i∗ (n) = arg max 
H(i, n), maximised over i ∈ I   [enclosing loop: for N times; this line: I comp, 1 assgn]
    for i = 1 : I do   [for I times]
        if i = i*(n) then   [1 comp]
            â_{i,n} = 1   [1 assgn]
        else
            â_{i,n} = 0   [1 assgn]
        end if
    end for
end for
\\ Use bisection to find β and p_{i,n}
x = […]   [1 assgn]
y = 10^8   [1 assgn]
for n = 1 : N do   [for N times]
    p_x(n) = max(0, γ_{i*(n)} / (x ln 2) − 1 / SNR(i*(n), n))   [1 comp, 1 add, 4 mult, 1 assgn]
    p_y(n) = max(0, γ_{i*(n)} / (y ln 2) − 1 / SNR(i*(n), n))   [1 comp, 1 add, 4 mult, 1 assgn]
end for
f_x = Σ_{n∈N} p_x(n) − P_total   [N+1 add, 1 assgn]
f_y = Σ_{n∈N} p_y(n) − P_total   [N+1 add, 1 assgn]
while |x − y| > […] do   [for Q times]
    λ = (x + y) / 2   [1 add, 1 mult, 1 assgn]
    for n = 1 : N do   [for N times]
        p_λ(n) = max(0, γ_{i*(n)} / (λ ln 2) − 1 / SNR(i*(n), n))   [1 comp, 1 add, 4 mult, 1 assgn]
    end for
    f_λ = Σ_{n∈N} p_λ(n) − P_total   [N+1 add, 1 assgn]
    if f_x ∗ f_λ > 0 then   [1 comp, 1 mult]
        x = λ   [1 assgn]
        f_x = f_λ   [1 assgn]
    else if f_y ∗ f_λ > 0 then   [1 comp, 1 mult]
        y = λ   [1 assgn]
        f_y = f_λ   [1 assgn]
    else
        break
    end if
end while

Algorithm 5 KKT-CRA (Part III)

β = λ   [1 assgn]
for i = 1 : I do   [for I times]
    C(i) = 0   [1 assgn]
    for n = 1 : N do   [for N times]
        p̂_{i,n} = â_{i,n} max(0, γ_i / (β ln 2) − 1 / SNR(i, n))   [1 comp, 1 add, 5 mult, 1 assgn]
        ĉ_{i,n} = log2(1 + SNR(i, n) p̂_{i,n})   [1 add, 2 mult, 1 assgn]
        C(i) = C(i) + ĉ_{i,n}   [1 add, 1 assgn]
    end for
    b_i^{1,z} = 1 for z = 1 : ⌊C(i)⌋
    b_i^{1,⌊C(i)⌋+1} = C(i) − ⌊C(i)⌋
    b_i^{1,z} = 0 for z = ⌊C(i)⌋ + 2 : B_i   [B assgn for the three assignments above]
end for
for i = 1 : I do   [for I times]
    if S(i) = 0 then   [1 comp]
        S(i) = (γ_i + […]) > ψ_i^{1,⌊C(i)⌋+1}   [1 comp, 2 add, 1 mult, 1 assgn]
    else
        if (γ_i + […]) < ψ_i^{1,⌊C(i)⌋+1} then   [1 comp, 2 add, 1 mult]
            S(i) = 0   [1 assgn]
        end if
    end if
end for
\\ Update γ_i
for i = 1 : I do   [for I times]
    if ⌊C(i)⌋ + 1 > B_i then   [1 comp, 1 add, 1 mult]
        ψ^HOL(i) = 0   [1 assgn]
    else
        ψ^HOL(i) = ψ_i^{1,⌊C(i)⌋+1}   [1 add, 1 mult, 1 assgn]
    end if
    if S(i) = 0 then   [1 comp]
        γ_i = (1 − δ) γ_i + δ ψ^HOL(i)   [2 add, 2 mult, 1 assgn]
    end if
end for
end loop

D.4 KKT-DRA

Algorithm 6 KKT-DRA (Part I)

\\ Compute bitQoS values
for i = 1 : I do   [for I times]
    for j = 1 : J_i do   [for J_i times]
        for z = 1 : B_i^j(k) do   [for B_i^j times]
            w_i^{j,z}(k) = (k − b_i^{j,z}.arrivalTime) ∗ T_s   [1 add, 1 mult, 1 assgn]
            ψ_i^{j,z} = c_j ∗ π_j ∗ γ_j^{d_j ∗ (w_i^{j,z}(k) − η_j)}   [1 add, 4 mult, 1 assgn]
        end for
    end for
end for
for i = 1 : I do   [for I times]
    ψ_i = merge the bits from all application flows of user i and sort by ψ_i^{j,z} in descending order   [B log B comp]
    B_i = Σ_{j∈J_i} B_i^j(k)   [J_i add, 1 assgn]
    for n = 1 : N do   [for N times]
        SNR(i, n) = |α_{i,n}|^2 / (ζ ∗ σ_0^2)   [5 mult, 1 assgn]
    end for
end for
β = […]   [1 assgn]
γ_i = […] for all i ∈ I   [I assgn]
S_i = 0 for all i ∈ I   [I assgn]
loop   [for D times]
    if S_i = 1 for all i ∈ I then   [I comp]
        break
    end if
    \\ Perform subcarrier assignment
    for i = 1 : I do   [for I times]
        for n = 1 : N do   [for N times]
            H(i, n) = γ_i max(0, log2(SNR(i, n) γ_i / (β ln 2))) − (1 / ln 2) max(0, 1 − β ln 2 / (SNR(i, n) γ_i))   [2 add, 2 comp, 11 mult, 1 assgn]
        end for
    end for

Algorithm 6 KKT-DRA (Part II)

    for n = 1 : N do   [for N times]
        i*(n) = arg max_{i∈I} H(i, n)   [I comp, 1 assgn]
        for i = 1 : I do   [for I times]
            if i = i*(n) then   [1 comp]
                â_{i,n} = 1   [1 assgn]
            else
                â_{i,n} = 0   [1 assgn]
            end if
        end for
    end for
    \\ Use bit-loading to find p̂_{i,n} and b̂_{i,n}^{j,z}
    p_used = 0   [1 assgn]
    p_inc = 0   [1 assgn]
    p̂_{i,n} = 0 for all i ∈ I, n ∈ N   [NI assgn]
    r(n) = 0 for all n ∈ N   [N assgn]
    C(i) = 0 for all i ∈ I   [I assgn]
    b̂_{i,n}^{1,z} = 0 for all i ∈ I, n ∈ N, z ∈ {1, . . . , B_i}   [IBN assgn]
    while p_used + p_inc ≤ P_total do   [for R times]
        p_used = p_used + p_inc   [1 add, 1 assgn]
        for n = 1 : N do   [for N times]
            if C(i*(n)) ≥ B_{i*(n)} then   [1 comp]
                p_change(n) = ∞   [1 assgn]
                ψ_change(n) = 0   [1 assgn]
            else
                p_change(n) = (2^{r(n)+1} − 2^{r(n)}) / SNR(i*(n), n)   [2 add, 3 mult, 1 assgn]
                ψ_change(n) = ψ_{i*(n)}^{1,C(i*(n))+1}   [1 add, 1 assgn]
            end if
        end for
        n* = arg max_{n∈N} ψ_change(n) / p_change(n)   [N mult, N comp, 1 assgn]
        p_inc = p_change(n*)   [1 assgn]
        if p_used + p_inc ≤ P_total then   [1 add, 1 comp]
            r(n*) = r(n*) + 1   [1 add, 1 assgn]
            C(i*(n*)) = C(i*(n*)) + 1   [1 add, 1 assgn]
            p̂_{i*(n*),n*} = p̂_{i*(n*),n*} + p_inc   [1 add, 1 assgn]
            b̂_{i*(n*),n*}^{1,C(i*(n*))} = 1   [1 assgn]
        end if
    end while

Algorithm 6 KKT-DRA (Part III)

    \\ Use bisection to find β
    x = […]   [1 assgn]
    y = 10^8   [1 assgn]
    for n = 1 : N do   [for N times]
        p_x(n) = max(0, γ_{i*(n)} / (x ln 2) − 1 / SNR(i*(n), n))   [1 comp, 1 add, 4 mult, 1 assgn]
        p_y(n) = max(0, γ_{i*(n)} / (y ln 2) − 1 / SNR(i*(n), n))   [1 comp, 1 add, 4 mult, 1 assgn]
    end for
    f_x = Σ_{n∈N} p_x(n) − P_total   [N+1 add, 1 assgn]
    f_y = Σ_{n∈N} p_y(n) − P_total   [N+1 add, 1 assgn]
    while |x − y| > […] do   [for Q times]
        λ = (x + y) / 2   [1 add, 1 mult, 1 assgn]
        for n = 1 : N do   [for N times]
            p_λ(n) = max(0, γ_{i*(n)} / (λ ln 2) − 1 / SNR(i*(n), n))   [1 comp, 1 add, 4 mult, 1 assgn]
        end for
        f_λ = Σ_{n∈N} p_λ(n) − P_total   [N+1 add, 1 assgn]
        if f_x ∗ f_λ > 0 then   [1 comp, 1 mult]
            x = λ   [1 assgn]
            f_x = f_λ   [1 assgn]
        else if f_y ∗ f_λ > 0 then   [1 comp, 1 mult]
            y = λ   [1 assgn]
            f_y = f_λ   [1 assgn]
        else
            break
        end if
    end while
    β = λ   [1 assgn]
    for i = 1 : I do   [for I times]
        if S(i) = 0 then   [1 comp]
            S(i) = (γ_i + […]) > ψ_i^{1,C(i)+1}   [1 comp, 2 add, 1 assgn]
        else
            if (γ_i + […]) < ψ_i^{1,C(i)+1} then   [1 comp, 2 add]
                S(i) = 0   [1 assgn]
            end if
        end if
    end for

Algorithm 6 KKT-DRA (Part IV)

    \\ Update γ_i
    for i = 1 : I do   [for I times]
        if C(i) + 1 > B_i then   [1 comp, 1 add]
            ψ^HOL(i) = 0   [1 assgn]
        else
            ψ^HOL(i) = ψ_i^{1,C(i)+1}   [1 add, 1 assgn]
        end if
        if S(i) = 0 then   [1 comp]
            γ_i = (1 − δ) γ_i + δ ψ^HOL(i)   [2 add, 2 mult, 1 assgn]
        end if
    end for
end loop

D.5 WF

Algorithm 7 WF (Part I)

for n = 1 : N do   [for N times]
    i*(n) = arg max_{i∈I} α_{i,n}   [I comp, 1 assgn]
    SNR(n) = |α_{i*(n),n}|^2 / (ζ ∗ σ_0^2)   [5 mult, 1 assgn]
    for i = 1 : I do   [for I times]
        if i = i*(n) then   [1 comp]
            â_{i,n} = 1   [1 assgn]
        else
            â_{i,n} = 0   [1 assgn]
        end if
        p̂_{i,n} = 0   [1 assgn]
    end for
end for
x = 0   [1 assgn]
y = P_total + max_{n∈N} 1 / SNR(n)   [N comp, 1 add, 1 assgn]
for n = 1 : N do   [for N times]
    p_x(n) = max(0, x − 1 / SNR(n))   [1 comp, 1 add, 1 mult, 1 assgn]
    p_y(n) = max(0, y − 1 / SNR(n))   [1 comp, 1 add, 1 mult, 1 assgn]
end for
f_x = Σ_{n∈N} p_x(n) − P_total   [N+1 add, 1 assgn]
f_y = Σ_{n∈N} p_y(n) − P_total   [N+1 add, 1 assgn]
while |x − y| > […] do   [for L times]
    λ = (x + y) / 2   [1 add, 1 mult, 1 assgn]
    for n = 1 : N do   [for N times]
        p̂_{i*(n),n} = max(0, λ − 1 / SNR(n))   [1 comp, 1 add, 1 mult, 1 assgn]
    end for
    f = Σ_{n∈N} p̂_{i*(n),n} − P_total   [N+1 add, 1 assgn]

Algorithm 7 WF (Part II)

    if f_x ∗ f > 0 then   [1 comp, 1 mult]
        x = λ   [1 assgn]
        f_x = f   [1 assgn]
    else if f_y ∗ f > 0 then   [1 comp, 1 mult]
        y = λ   [1 assgn]
        f_y = f   [1 assgn]
    else
        break
    end if
end while
for i = 1 : I do   [for I times]
    HOL = 0   [1 assgn]
    for n = 1 : N do   [for N times]
        if â_{i,n} = 1 then   [1 comp]
            ĉ_{i,n} = log2(1 + SNR(n) ∗ p̂_{i,n})   [1 add, 2 mult, 1 assgn]
            b_{i,n}^{1,z} = 1 for z = HOL + 1 : HOL + ĉ_{i,n}   [c_{i,n} assgn]
            HOL = HOL + ĉ_{i,n}   [1 add, 1 assgn]
        end if
    end for
end for

D.6 MDU

Algorithm 8 MDU (Part I)

for n = 1 : N do   [for N times]
    p(n) = P_total / N   [1 mult, 1 assgn]
    i*(n) = a random number from [1, I]   [1 assgn]
    for i = 1 : I do   [for I times]
        SNR(i, n) = |α_{i,n}|^2 / (ζ ∗ σ_0^2)   [5 mult, 1 assgn]
        if i = i*(n) then   [1 comp]
            â_{i,n} = 1   [1 assgn]
        else
            â_{i,n} = 0   [1 assgn]
        end if
    end for
end for
for i = 1 : I do   [for I times]
    r_i = 0   [1 assgn]
    w(i) = UtilityFunc(Q_i / r_i, flow type of i) / r_i   [2 comp, 2 add, 4 mult, 2 assgn]
    γ_i = w_i ∗ (r_i < Q_i)   [1 comp, 1 mult]
end for
loop   [for κ times]
    for n = 1 : N do   [for N times]
        for i = 1 : I do   [for I times]
            c(i, n) = log2(1 + SNR(i, n) ∗ p(n)) ∗ â_{i,n}   [1 add, 3 mult, 1 assgn]
        end for
        i*(n) = arg max_{i∈I} γ_i ∗ c(i, n)   [I comp, I mult, 1 assgn]
    end for
    x = […]   [1 assgn]
    y = max_{n∈N} γ_{i*(n)} ∗ SNR(i*(n), n)   [N mult, N comp]
    for n = 1 : N do   [for N times]
        p_x(n) = max(0, γ_{i*(n)} / x − 1 / SNR(i*(n), n))   [1 comp, 1 add, 2 mult, 1 assgn]
        p_y(n) = max(0, γ_{i*(n)} / y − 1 / SNR(i*(n), n))   [1 comp, 1 add, 2 mult, 1 assgn]
    end for

Algorithm 8 MDU (Part II)

    f_x = Σ_{n∈N} p_x(n) − P_total   [N+1 add, 1 assgn]
    f_y = Σ_{n∈N} p_y(n) − P_total   [N+1 add, 1 assgn]
    while |x − y| > […] do   [for L times]
        λ = (x + y) / 2   [1 add, 1 mult, 1 assgn]
        for n = 1 : N do   [for N times]
            p(n) = max(0, γ_{i*(n)} / λ − 1 / SNR(i*(n), n))   [1 comp, 1 add, 2 mult, 1 assgn]
        end for
        f = Σ_{n∈N} p̂_{i*(n),n} − P_total   [N+1 add, 1 assgn]
        if f_x ∗ f > 0 then   [1 comp]
            x = λ   [1 assgn]
            f_x = f   [1 assgn]
        else if f_y ∗ f > 0 then   [1 comp]
            y = λ   [1 assgn]
            f_y = f   [1 assgn]
        else
            break
        end if
    end while
    for i = 1 : I do   [for I times]
        r_i = 0   [1 assgn]
        for n = 1 : N do   [for N times]
            if â_{i,n} = 1 then   [1 comp]
                r_i = r_i + log2(1 + SNR(i, n) ∗ p(n))   [2 add, 2 mult, 1 assgn]
            end if
        end for
    end for
    for i = 1 : I do   [for I times]
        γ_i = (1 − µ) ∗ γ_i + µ ∗ w_i ∗ (r_i < Q_i)   [1 comp, 2 add, 3 mult, 1 assgn]
    end for
    if Σ_{i∈I} w_i ∗ (r_i < Q_i) ∗ (r_i^old − r_i) ≤ […] then   [I(1 comp, 3 mult, 2 add), 1 comp]
        break
    end if
end loop

Algorithm 8 MDU (Part III)

for i = 1 : I do   [for I times]
    HOL = 0   [1 assgn]
    for n = 1 : N do   [for N times]
        if â_{i,n} = 1 then   [1 comp]
            ĉ_{i,n} = log2(1 + SNR(i, n) ∗ p̂_{i,n})   [1 add, 2 mult, 1 assgn]
            b_{i,n}^{1,z} = 1 for z = HOL + 1 : HOL + ĉ_{i,n}   [c_{i,n} assgn]
            HOL = HOL + ĉ_{i,n}   [1 add, 1 assgn]
        end if
    end for
end for

function UtilityFunc(x, flow type)
    if flow type is BE then   [1 comp]
        if x < η_BE then   [1 comp]
            f = x^0.5   [1 mult, 1 assgn]
        else
            f = η_BE^0.5   [1 mult, 1 assgn]
        end if
    else if flow type is EF then   [1 comp]
        if x < η_EF then   [1 comp]
            f = x   [1 assgn]
        else
            f = x^1.5 − η_EF^1.5 + η_EF   [2 add, 2 mult, 1 assgn]
        end if
    end if
    return f
end function
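The bisection search on the water level in Algorithm 7 (WF) can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the implementation used in the thesis: the names `water_fill` and `rates`, the tolerance `tol` and the iteration cap `max_iter` are hypothetical, and the sign-product test of the listing is replaced by a simpler monotonicity test on the power excess. The bracket [0, P_total + max_n 1/SNR(n)] is the one used in the listing.

```python
import math


def water_fill(snr, p_total, tol=1e-9, max_iter=200):
    """Return powers p[n] = max(0, level - 1/snr[n]) that sum to p_total.

    snr      -- per-subcarrier SNR of the user already assigned to it
    p_total  -- total transmit power budget
    tol      -- assumed stopping tolerance on the bracket width
    max_iter -- assumed safety cap on the number of bisection steps
    """
    inv = [1.0 / s for s in snr]
    # Bracket for the water level, as in the WF listing:
    # x = 0 and y = P_total + max_n 1/SNR(n).
    lo, hi = 0.0, p_total + max(inv)

    def excess(level):
        # Power used at this water level minus the budget (non-decreasing in level).
        return sum(max(0.0, level - v) for v in inv) - p_total

    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            hi = mid  # too much power used: lower the water level
        else:
            lo = mid  # budget not exhausted: raise the water level
    level = 0.5 * (lo + hi)
    return [max(0.0, level - v) for v in inv]


def rates(snr, power):
    """Per-subcarrier rate log2(1 + SNR * p), as in the rate step of the listing."""
    return [math.log2(1.0 + s * p) for s, p in zip(snr, power)]


if __name__ == "__main__":
    snr = [10.0, 1.0, 0.1]
    power = water_fill(snr, 1.0)
    print(power)
    print(rates(snr, power))
```

With the example SNR values above, the weakest subcarrier receives no power because its inverse SNR lies above the final water level, which is the behaviour the max(0, ·) term in the listing is meant to capture.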
Item Metadata
Title | QoS-aware resource allocation in wireless communication systems |
Creator | Chi En, Huang |
Publisher | University of British Columbia |
Date Issued | 2012 |
Genre | Thesis/Dissertation |
Type | Text |
Language | eng |
Date Available | 2012-09-28 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0073213 |
URI | http://hdl.handle.net/2429/43303 |
Degree | Doctor of Philosophy - PhD |
Program | Electrical and Computer Engineering |
Affiliation | Applied Science, Faculty of; Electrical and Computer Engineering, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 2012-11 |
Campus | UBCV |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
AggregatedSourceRepository | DSpace |