P A R Reduction in O F D M Systems by Mult iple Signal Representation by Trung Thanh Nguyen B.E. , The University of Sydney, Australia, 2004 A THESIS S U B M I T T E D IN P A R T I A L F U L F I L M E N T OF T H E R E Q U I R E M E N T S F O R T H E D E G R E E OF M A S T E R OF A P P L I E D SCIENCE in The Faculty of Graduate Studies (Electrical and Computer Engineering) T H E U N I V E R S I T Y OF BRITISH C O L U M B I A April 2006 © Trung Thanh Nguyen 2006 11 Abstract As we are embarking on the new information age, demand for high speed wireless com-munication has increased substantially. One of the promising technologies that could fulfil such demands is orthogonal frequency division multiplexing (OFDM). Due to its bandwidth efficiency and robustness in wireless environments, O F D M has recently gained an increasing popularity. One major drawback of O F D M is its high peak-to-average power ratio (PAR). If not processed, the high peaks in the O F D M signal cause saturation in the power amplifier, which in turn distorts and decreases the power of the transmit signal. In order to avoid the resulting bit-error-rate performance degradation and out-of-band radiation, either expensive linear power amplifiers need to be employed, or nonlinear amplifiers must be operated with high power backoffs in power-inefficient amplification. Power inefficiency is however unacceptable to battery-powered devices. Among the numerous P A R reduction approaches available in the literature, multi-ple signal representation (MSR) techniques have achieved considerable attention thanks to their great P A R reduction capability and favorable performance-expense tradeoffs. Within the M S R family, partial transmit sequences (PTS) is the most popular while trellis shaping is the newest technique. This thesis makes contributions to both of them. For PTS, we restate its low-PAR signal search as a combinatorial optimization (CO) problem and propose a fair benchmark to evaluate the search algorithms. These allow us to compare the existing PTS algorithms and apply efficient heuristics from the CO liter-ature to PTS. Surprisingly, our results show that random search, the simplest algorithm, actually outperforms all available PTS algorithms. Only our tailored versions of the two Abstract i i i CO metaheuristics, namely simulated annealing and tabu search, attain better tradeoffs in the low-PAR region. Trellis shaping is an interesting MSR technique that does not require explicit side information. Our contributions to trellis shaping include a new metric and the use of the stack sequential decoding algorithm in shaping-sequence search. These proposals improve P A R reduction and/or reduce the complexity of the technique. Beneficial to all algorithms that involve a large number of peak-power searches, we propose using a new optimization objective to reduce their overall complexity. This is applicable to both PTS and trellis shaping. Finally, based on the trellis shaping structure, we propose a side information em-bedding/detecting scheme for PTS. Our low-complexity scheme does not affect the P A R reduction capability of PTS and practically does not degrade the B E R performance of the O F D M systems. iv Contents Abst rac t ii Contents iv Lis t of Tables vii L is t of Figures viii L i s t of Abbrevia t ions xi Preface xiii Acknowledgements xiv 1 In t roduct ion 1 1.1 Contributions 2 1.2 Organization 3 1.3 Notations 4 2 Background 6 2.1 Orthogonal Frequency Division Multiplexing 6 2.2 The P A R Problem in O F D M 8 2.2.1 P A R of O F D M Signal 9 2.2.2 Effect of Distortion 13 2.3 Review of P A R Reduction Techniques 16 Contents v 2.3.1 Peak Cancellation 18 2.3.2 Multiple Signal Representation 21 3 P A R Reduc t ion by Pa r t i a l Transmit Sequences 25 3.1 Partial Transmit Sequences 25 3.2 Review of Descent Heuristics 28 3.2.1 Random Search 28 3.2.2 Bit Flip 28 3.2.3 Local Search 29 3.3 Metaheuristics for PTS 30 3.3.1 Simulated Annealing 32 3.3.2 Tabu Search 33 3.4 Simulation Results 34 3.5 Conclusion . 36 4 P A R Reduc t ion by Trell is Shaping 38 4.1 Trellis Shaping Scheme : 38 4.2 Selection of Code Sequences 42 4.2.1 Metrics 43 4.2.2 Algorithms 45 4.2.3 Variations Based on the Appended Partial P A R Criterion 47 4.3 Implementation and Complexity 48 4.3.1 Partial and Appended Partial P A R Criteria 48 4.3.2 Partial Autocorrelation and Partial Autocorrelation Squares Criteria 49 4.4 Simulation Results 50 4.4.1 Comparison of Metrics 50 4.4.2 V A vs. ST-SDA 54 4.4.3 Variations Based on the Appended Partial P A R Criterion 56 4.5 Conclusion 57 Contents vi 5 Fur ther Improvements for P T S and Trell is Shaping 59 5.1 Application of Combinatorial Optimization Heuristics in Trellis Shaping . 59 5.2 Complexity Reduction by Peak Power Approximation 64 5.3 SI Embedding for PTS 67 5.3.1 Review of Existing SI Embedding and Detecting Schemes 67 5.3.2 Novel SI Embedding Scheme 68 5.3.3 Performance Analysis 72 5.4 Conclusion 74 6 Conclus ion 75 Bib l iography 77 vii List of Tables 4.1 Shaping codes 51 4.2 Performance of trellis shaping with VA 53 4.3 Performance and Complexity of ST-SDA 54 4.4 Adaptive shaping with VA, Metric 2 57 Vll l List of Figures 2.1 Power spectral density of an O F D M signal with 256 subcarriers 7 2.2 Cyclic prefix in O F D M 8 2.3 Simplified block diagram of the complex equivalent baseband O F D M trans-mission scheme 9 2.4 C C D F plots of continuous-time and L-time oversampled discrete-time O F D M signals. N = 256, 16-QAM 11 2.5 C C D F plots of O F D M signals with different constellations 14 2.6 C C D F plots of O F D M signals with different numbers of subcarriers. . . . 14 2.7 Amplitude input-output relationship of Rapp distortion model 15 2.8 B E R increase due to nonlinear amplification, A W G N channel 17 2.9 B E R increase due to nonlinear amplification, fading channel 17 2.10 Out-of-band radiation due to nonlinear distortion. Rapp distortion model with p = 10. N = 256, 16-QAM 18 2.11 Bound on the achievable maximum P A R using multiple signal represen-tation. Simulative bound indicates the maximum P A R of O F D M signals after applying an M S R technique with R redundancy bits. The approxi-mation is from Eqn. (2.23). N = 256 22 3.1 Block diagram of the PTS technique 26 3.2 Illustration of search moves 31 3.3 Performance of optimal PTS and random search. N = 256, V = 16, pseudo-random partition, binary rotation 35 List of Figures ix 3.4 Performance of SA vs. RS with different number of searches 36 3.5 Average number of searches required to achieve Q7(7o) = 10~3 37 4.1 Constellation shaping with a 16-QAM constellation 39 4.2 Block diagram of trellis shaping system: (a) transmitter, (b) receiver. Note: p(D): LSB sequence; s(D): MSB data sequence; z(D): initial MSB sequence; y(D): shaping sequence; and z'(D): shaped MSB sequence. 40 4.3 Special constellation labelling for 16-QAM 41 4.4 Simplified trellis shaping transmitter 42 4.5 Two codes with the same rate and number of states have very similar P A R reduction performance. The code generators in octal are shown. Viterbi algorithm , Metric 1 52 4.6 Performance of trellis shaping with VA. Metric 1-4, Case 1 from Table 4.1. 52 4.7 Performance of trellis shaping with VA. Metric 1-4, Case 3 from Table 4.1. 53 4.8 Performance of trellis shaping with ST-SDA. Metrics 1 and 2, shaping codes Case 1 and 3 from Table 4.1 55 4.9 Performance of truncated shaping with different number N' of shaped carriers (c = N'/N). V A with Metric 2. Case 1 from Table 4.1 56 4.10 Performance of adaptive shaping. V A with Metric 2. Case 1 from Table 4.1. 58 5.1 Trellis shaping with random search. N = 128, 16-QAM, Case 1 from Table 4.1 61 5.2 Trellis shaping V A and Metric 1 vs. random search (RS) with the same number of peak-power searches. N = 128, 16-QAM. Case 1 from Table 4.1. 61 5.3 Trellis shaping ST-SDA and Metric 1 vs. random search (RS) with the same number of peak-power searches. N = 128, 16-QAM. Case 1 from Table 4.1 62 List of Figures x 5.4 Trellis shaping ST-SDA and Metric 2 vs. random search (RS) with the same number of peak-power searches. N = 128, 16-QAM. Case 1 from Table 4.1 62 5.5 Average number of searches required to achieve Q7(7o) = 10 - 3 . Trellis shaping with N — 128, 16-QAM, Metric 1, shaping code Case 1 from • Table 4.1 63 5.6 Signal in complex plane 64 5.7 Approximation of a circle by a square and an octagon (P = 8) 64 5.8 SA with 256 searches, using approximation of the peak power with P = 4, 8,16. For comparison: SA with 64 searches, and optimal PTS 66 5.9 Block diagram of SI embedding scheme for one subblock: (a) transmitter, (b) receiver 69 5.10 A special case of the SI embedding scheme, (a) transmitter, (b) receiver. 71 5.11 B E R vs. 101og10(/5s/A/"0) for O F D M with and without SI embedding. A W G N and Rayleigh fading with D-fold diversity reception 74 List of Abbreviations A W G N Additive White Gaussian Noise B E R Bit-Error-Rate B P S K Binary Phase Shift Keying C C D F Complementary Cumulative Distribution Function CO Combinatorial Optimization D F T Discrete Fourier Transform ICI InterCarrier Interference IDFT Inverse Discrete Fourier Transform ISI InterSymbol Interference LSB Less Significant Bit M S B Most Significant Bit M S R Multiple Signal Representation O F D M Orthogonal Frequency Division Multiplexing P A R Peak-to-Average power Ratio PSD Power Spectral Density PSK Phase Shift Keying PTS Partial Transmit Sequences Q A M Quadrature Amplitude Modulation RS Random Search SA Simulated Annealing SI Side Information List of Abbreviations xii SNR Signal-to-Noise Ratio ST-SDA Stack-Sequential Decoding Algorithm TS Tabu Search V A Viterbi Algorithm i.i.d. independently and identically distributed pdf probability density function Xll l Preface Parts of the work in this thesis have been published in two papers • T. T. Nguyen and L. Lampe, "Trellis shaping for P A R reduction in O F D M sys-tems," Accepted for presentation at IEEE International Conference on Communi-cations (ICC '06), Istanbul, June 2006. • T. T. Nguyen and L. Lampe, "On partial transmit sequences to reduce P A R in O F D M systems," Submitted to IEEE Globecom 2006, San Francisco, 2006. xiv Acknowledgements I would like to express my deep gratitude to Dr. L. Lampe for his supervison and guid-ance over the course of my degree. He is an excellent supervisor and his enthusiam has been inspirational. I would also like to thank all the professors of E C E U B C for their teaching, advice, and help. I thank all my colleagues in the Communication Theory Group for a friendly and warm atmosphere in our labs. My sincere thanks to the people of Green College for an excellent place to live, en-joy life, and excel. My special thanks to all friends who have shared good times and helped me out in all difficult times. My deepest thank to my family for their constant and unconditional love and support. Vancouver, Canada April , 2006 Chapter 1 i Introduction Advancements in wireless communications have brought about a tremendous growth in the telecommunication industry over the last decade. Wireless technologies such as 3G, W i F i , and W i M A X have opened new ways of communication and helped lay the foundations of an information age. At the start of this new era, demands for cheaper, higher data rate, and more mobile wireless technologies are greater than ever before. The characteristics of the wireless channels impose immense challenges to achieve high-rate transmission. Beside attenuation and additive noise, signals in wireless envi-ronments suffer from multipath fading and time variation. Orthogonal frequency division multiplexing (OFDM) is one of the promissing technologies that would offer high data rates over such environments. Its robustness to multipath propagation and bandwidth-efficiency render O F D M a very attractive option for future systems. In fact, O F D M has been applied to many existing wireline and wireless standards. As a multicarrier technology, O F D M however suffers from one major drawback com-mon to this family, which is the large dynamic range of the transmit signal. High peaks in multicarrier systems occur when signals in subcarriers are constructively added. A large peak-to-average power ratio (PAR) poses special challenges for power amplification in the transmitter. Excessively high input causes saturation in amplifiers, which distorts and reduces the power of the output. As a result, either expensive linear power am-plifiers need to be employed, or nonlinear amplifiers much be operated with very high power backoffs to avoid significant bit-error-rate (BER) performance degradation and out-of-band radiation. Operating with high power backoffs, in turn, leads to inefficient power amplification, which may be critical for devices with battery power supply. Chapter 1. Introduction 2 In general, P A R reduction in O F D M systems can be achieved at the expenses of increasing complexity, transmit power,. BER, out-of-band radiation, or decreasing data transmission rate. Numerous approaches have been proposed in the literature to alleviate the P A R problem. Each of the approaches attains the P A R reduction at, to various degrees, some or all of the mentioned expenses. Unfortunately, no specific approaches have been considered the best solution for all O F D M systems. Therefore, research on this issue is currently still very active. Given OFDM's increasing popularity, the aim of this thesis is to investigate and propose new contributions to the research on P A R reduction in O F D M systems. 1.1 Contributions From our literature review, multiple signal representation (MSR) appears to have the most favorable performance-expense tradeoff among the existing P A R reduction ap-proaches. Among MSR techniques, partial transmit sequences (PTS) is the most popular while trellis shaping is the newest. In this thesis, we propose improvements to both of them. In particular, the main contributions of this thesis are: • We restate the PTS search as a combinatorial optimization (CO) problem. Firstly, this helps us to gain further insights into PTS and to propose a fair criterion to eval-uate PTS search algorithms. Secondly, the restatement enables us to apply a rich set of efficient metaheuristics from the CO literature. We decide to tailor simulated annealing (SA) and tabu search (TS), the two most popular CO metaheuristics, for PTS search. Under our evaluation criterion, simulation results show that, (a) the simplest algorithm random search indeed attains the best tradeoff amongst currently available algorithms, and (b), our versions of SA and TS outperform all other algorithms in the low-PAR region. • We introduce appended partial P A R as a new metric and the use of stack sequential decoding algorithm (ST-SDA) in trellis shaping. These modifications improve P A R Chapter 1. Introduction 3 reduction performance and/or reduce complexity of the technique. We also propose two variations which are referred to as truncated and adaptive shaping. These variations enable flexible performance-expense tradeoffs in trellis shaping. • We also restate the code sequence search in trellis shaping as a CO problem and apply the same algorithms as in the PTS case. This brings about a completely new degree of flexibility for trellis shaping. • We propose the use of an approximation to the peak power of complex O F D M signal as the new optimization objective in M S R techniques. Using this new objective, it is possible to dramatically reduce the required number of multiplications in the low-PAR signal search while attaining the same P A R reduction performance of the techniques. • Finally, based on trellis shaping, we propose a side information embedding/detecting scheme for PTS. Our low-complexity scheme does not affect the P A R reduction ca-pacity of PTS and virtually does not degrade the B E R performance of the O F D M system. 1.2 Organization The remainder of this thesis is organized as follows: In Chapter 2 we provide background for the work of this thesis. A brief introduction to the principles, characteristics, and applications of O F D M is given. The effect of high peak power on the performance of O F D M systems is described. Following the formal definition of the PAR, we review the P A R reduction literature and explain the connection between ours and the existing works. Chapter 3 restates the PTS search as a CO problem. The available search algorithms and our tailored SA and TS metaheuristics are then described and compared. Chapter 1. Introduction 4 Trellis shaping is studied and improved in Chapter 4. In particular, we describe a new time-domain metric and the use of the stack sequential decoding algorithm in trellis shaping. We then provide both analytical and numerical comparisons between our proposals and existing trellis shaping variations. In Chapter 5 we consider three improvements to both PTS and trellis shaping. We choose to present these important improvements after Chapters 3 and 4 because an ac-quaintance of both techniques is helpful to understand the materials. The improvements are (a) the application CO heuristics in trellis shaping, (b) a new optimization objective to reduce the overall complexity of search algorithms, and (c) a novel side information embedding/detecting scheme for PTS. Conclusion and suggestions for future work are given in Chapter 6. 1.3 Notations The following mathematical notations are used in this thesis. x, X, ... Row vectors H,... Matrices 0N All-zero vector of length N IN All-one vector of length N In Identity matrix of size n x n ( . )T Transpose E{-} Statistical average o Component-wise multiplication * Complex conjugate II ' 1 l°o Max norm (or norm) Chapter 1. Introduction 5 {•} A set {-,i = 0 . . . N-l} A set of N elements with index i runs from 0 to N — 1 Real part of scalars/vectors Imaginary part of scalars/vectors 6 Chapter 2 Background This chapter provides relevant background on P A R reduction in O F D M systems. Sec-tion 2.1 gives a brief overview of O F D M and its features and applications. In Section 2.2, we first define the P A R of O F D M signals and then illustrate the effects of high-PAR signals on O F D M systems. Finally, Section 2.3 reviews the P A R reduction techniques available in the literature and shows the relation of our contributions to the existing works.' 2.1 Orthogonal Frequency Division Mult iplexing Orthogonal frequency division multiplexing (OFDM) [1-3] is a multicarrier technique suited for high rate transmission over wireless fading channels. Two periodic signals are orthogonal when their inner product equals zero. Orthogonality allows multiple signals to be transmitted over a common physical link and be detected at the receiver without intercarrier interference (ICI). In O F D M , orthogonality is achieved by efficient discrete Fourier transforms which require relatively low hardware and computational complexi-ties [4]. Because all the subcarriers1 in O F D M are orthogonal, they can be spaced closely in the frequency domain. When the number of subcarriers is large, OFDM's power spec-tral density (PSD) appears nearly rectangular, e.g. Fig 2.1. The rectangular spectrum maximally utilizes the available bandwidth [5] and renders O F D M a very bandwidth-efficient technique. Multipath propagation in fading channels [6, 7] causes overlaps between delayed ver-1 Subcarriers are also called carriers or tones. In this thesis, these terms are used exchangeably. Chapter 2. Background 7 10 r 0 " -10 -X •2°" m Frequency, normalized to bandwidth Figure 2.1: Power spectral density of an OFDM signal with 256 subcarriers. sions of adjacent symbols. In single-carrier systems, a higher symbol rate means a shorter symbol duration and larger relative overlaps, which, in turn, lead to greater intersymbol interference (ISI). Usually, complex equalizers are needed to alleviate ISI [8]. In O F D M , as the high-rate channel is divided into many parallel lower-rate subchannels, symbol durations in subchannels increase and the relative overlaps become small. With the in-troduction of the cyclic prefix in every O F D M symbol, ISI can be greatly reduced or even eliminated without the need of equalization. The cyclic prefix is a copy of the last portion appended to the front of the symbol during its guard period, as illustrated in Fig. 2.2. Generally, if every symbol is extended by a guard period that is larger than the multipath channel's delay spread, delayed versions of a symbol cannot interfere with the adjacent ones and ISI is eliminated. If the guard period also consists of signals that preserve the orthogonality of subcarriers, ICI does not occur. The cyclic prefix in O F D M is designed to satisfy both of the conditions. Therefore, it can minimize both ISI and ICI. Chapter 2. Background 8 Copy Cyclic prefix Data symbol Figure 2.2: Cyclic prefix in O F D M . Due to its high data rate, relatively low complexity, robustness to fading, and uti-lization of channel bandwidth, O F D M has been applied in various modern wireless and wireline communication systems. Some of the most popular applications of O F D M are W i - F i and W i M A X for wireless local and metropolitan area networks, D A B and D V B - T for digital terrestrial radio and television broadcasting in Europe, A D S L for communi-cation over telephone lines, and the HomePlug standard for powerline communication. O F D M is also being considered for many upcoming standards, for example, U W B for wireless personal area networks and Mobile W i M A X . 2.2 The P A R Problem in O F D M Common to all multicarrier systems like O F D M is the high peak-to-average power ratio (PAR) problem. High peaks occurs when signals in subcarriers are constructively added. The problem arises when the high-power amplifier (HPA) at the transmitter saturates as the input signal is exceedingly large. Saturation, in turn, distorts and reduces the power of the transmit signal. To avoid distortion, either expensive linear HPAs need to be employed, or the nonlinear HPAs have to operate at the linear region, which may be well below the saturation point. Operating with very high power back-offs, however, leads to power-inefficient amplification and reduction of transmit power. In this section, we first give a formal definition of the P A R and then illustrate the effects of distortion on the O F D M system. Chapter 2. Background 9 (a) Transmitter X IDFT X Cyclic Prefix xe Insert (b) Receiver X D F T X Cyclic Prefix XQ ^ Remove Figure 2.3: Simplified block diagram of the complex equivalent baseband OFDM transmission scheme. 2.2.1 P A R of O F D M Signal A simplified block diagram of the complex equivalent baseband O F D M transmission schemes is presented in Fig. 2.3. The input signals {Xk} in the form of complex PSK or Q A M symbols are assigned to O F D M subcarriers with frequencies fk = kAf, 0 < k < N — 1. TV is the total number of subcarriers and Af is the spacing between adjacent subcarriers in the frequency domain. The duration of the O F D M symbol, excluding the cyclic extension, is NT. The relationship between N, T, and Af is 1 NT (2.1) The subcarriers are mutually orthogonal over NT because their inner products equal zeros, i.e., / ej27Tfmt(ej27rfnt)*dt = 0, (2.2) Jo for any m, n 6 {0 ...' N — 1} and TO ^ n. The complex envelope of the O F D M signal (excluding the cyclic extension) is given by i - v 1 • X(T) = - 7 T ? J l X ^ f k t - 0 < £ < i V T . (2.3) In O F D M , inverse discrete Fourier transform (IDFT) is used to map signals from the Chapter 2. Background 10 frequency domain into the time domain. Let X = [X0... XJV-I] (2.4) be the vector of N input symbols corresponding to an O F D M block and x = [x0 ... xNL-i] (2.5) be the vector of NL discrete-time samples of the O F D M signal, where L is the oversam-pling factor. The IDFT generates the samples as N-l xn = x(nT/L) = ^Y] X k ^ k n / N L , 0 < n < NL . (2.6) v ^ fc=0 We define the notation x = I D F T ( X ) (2.7) which returns the vector x as defined in (2.5) and (2.6) from the vector X. The P A R of the continuous-time signal x[t) is defined as max {|x(i)|2} 7 { x { t ) ) = • ( 2 - 8 ) The cyclic prefix is not considered because it does not affect the mathematical P A R definition. In this thesis, we consider the approximation P A R based on the sampled, discrete-time signal x whose P A R is _ 0 <*S L{M 2} (2.9) MIL E { > n | 2 } • Due to possible missing of the true peak of the continuous-time signal through sam-pling, it is clear that ^(x) < j°(x(t))- The approximation approaches the true value as the oversampling factor L increases. Chapter 2. Background 11 10° i Continuous-1 ime IO"1 L = 1 L — 2'^ L = 4 10 - 2 io- 3 9 10 101og1 0(7 o) [dB]—> 11 Figure 2.4: CCDF plots of continuous-time and L-time oversampled discrete-time OFDM sig-nals. N = 256, 16-QAM. The complementary cumulative distribution function (CCDF) is often used to show the P A R statistics and to evaluate the performance of P A R reduction techniques. It is the probability that the P A R exceeds some threshold 70 <27(7o) = P r{ 7 (x ) > 70} . (2.10) Fig. 2.4 illustrates the P A R C C D F plots of the continuous-time and discrete-time signals with different oversampling factors. The O F D M system with N = 256 and 16-Q A M was chosen for simulation. The plots show that 7(01) well approximates Y(x(t)) with L > 4. We therefore adopt L = 4 for all subsequent simulation results (cf. [9]). When N is sufficiently large, by the central limit theorem [10], x(t) can be modelled as a zero-mean complex Gaussian random variable with variance al = E{|a;(f)|2}. The Chapter 2. Background 12 probability density function (pdf) of x(t) is / i ( 1 ) W = _ L e x p ( _ t T ) . ( 2 . u ) Hence, its amplitude is Rayleigh distributed with the pdf 2u ( u2 \ f\x(t)\(u) = — exp ( - j , u > 0 . (2.12) ® X V ®x J Any sample xn also has the same distribution as x(t). The probability that the magnitude of a certain sample remains below UQ is Pv{\xn\ < U0} = / f\x(t)\(u)du Jo = l - e x P ( - ^ ) . (2.13) For Nyquist-rate sampling (i.e. L = 1), the samples {xn} are statistically independent (except for B P S K constellation, because discrete Fourier transform of x has to return a real vector X in the frequency domain), therefore the probability that the maximum magnitude of the samples exceeds UQ is Pr{ max \xn\ > u0} = 1 - Pr{ max \xn\ < UQ) 0<n<N l0<n<N = 1 - Pr{|:ro| < u0} P r{ |x i | <u0} ... Pr{\xN-i\ < u0} _ ( - § ) " • . (2.14) The C C D F of the discrete-time signal is approximated as Q-ho) = Pr{7(a;)>7o} - ' = Pr{ max | x n | 2 > crl^o] = Pr{ max \xn\ > A / ^ I T O } 0<n<N = . ^ ( l - e - 7 0 ) ^ . (2.15) We note that the statistical independence assumption is not valid when oversampling is applied (L > 1). The approximation (2.15) is interesting in that it.shows the C C D F Chapter 2. Background 13 function is independent of the P S K / Q A M constellation (again, except BPSK) . This is confirmed by our simulation results in Fig. 2.5. The plot of the B P S K case is also shown for reference. In subsequent simulations we will assume the 16-QAM constellation, unless specified otherwise. In Fig. 2.6 we shows the C C D F plots of O F D M systems with different number of subcarriers 7Y, from both the approximation (2.15) and simulation. It can be seen that the approximations, derived under the unrealistic assumption that the signal samples are i.i.d. Gaussian distributed, are smaller than the simulation results. However, they agree on the general trend of dependence of the C C D F plots on N. 2.2.2 Effect of Distortion We assume memoryless amplification, i.e. the current output depends only on the current input. In order to study the effects of distortion on the O F D M system, we assume the Rapp distortion model [11] for solid-state power amplifiers. Let x = pe7* be the input and x' be the output signal (continuous or discrete-time) of the distortion model, their relationship is x' = Tpr , (2-16) (1 + (p/A)*P) 1/2p where A is the saturation amplitude and p is a parameter which determines the linearity of the amplifier. When p —> oo, the distortion approaches that of the ideal soft limiter whose input-output relationship is if p < A (2.17) Ae 7 * otherwise The input-output relationship of Rapp model is presented in Fig 2.7. In subsequent simulations in this section we use p = 10, which is appropriate to represent systems that employ linearization mechanisms, e.g. predistortion [12]. The input power backoff B is defined as the ratio between the saturation power and Chapter 2. Background 14 IO :* I i 1 i i i i '. * u \ \ \ \ \ i 4 5 6 7 8 9 10 11 12 101og 1 0( 7o) [dB] —> Figure 2.6: CCDF plots of OFDM signals with different numbers of subcarriers. 16-QAM, L = 4. Approximation from Eqn. (2.15). Chapter 2. Background 15 1 a (X o 0.8 0.6 0.4 0.2 soft limiter P = 10 \ \ P = 3 \v. '" \ \ p \ P = = 2 1 p = 0.5 0 0.5 1 1.5 Input amplitude \x\/A —• Figure 2.7: Amplitude input-output relationship of Rapp distortion model, the average input power, i.e., 5 = 101og10(4) t d B ]> (2.18) where o2x = E{ |x | 2 } . A higher power backoff means that the amplifier operates further away from the saturation point. To avoid nonlinear distortion, the HPAs have to operate with high power backoffs in the linear region. However the average output signal level has to decrease accordingly. As a compromise, some nonlinearity has to be introduced into the output O F D M signal. We now examine the performance degradation and out-of-band radiation due to distortion. B E R Increase We use simulations to illustrate the effect of distortion on B E R . We present the B E R as a function of Es/J\fQ where Es denotes the average received energy per P S K / Q A M symbol Chapter 2. Background 16 as if no distortion occurs and A/o denotes baseband noise power spectral density. We note that, with distortion, the actual average received energy is less than EA due to output power cut-off (hence the actual signal-to-noise ratio is less than ES/NQ). However, in this case we still present the B E R as a function of ES/NQ so that the effect of transmit power reduction is taken into account. The B E R increases in A W G N and fading channels are presented in Fig 2.8 and 2.9, respectively. As can be seen in both cases, for a given ES/NQ, distortion can seriously degrade the B E R performance. We recall that distortion can be avoided by operating with high power backoffs, but this leads to power inefficiency and transmit power reduction2 which eventually increases the BER. Out-of-Band Radiation The effect of nonlinear amplification on the PSD of the O F D M signal is illustrated in Fig 2.10. As shown, the out-of-band radiation power increases with the distortion. This PSD spread is an important design consideration due to possible governmental restrictions on out-of-band radiation. It can be limited by reducing the transmit power, however, again, at the cost of system performance degradation. 2.3 Review of P A R Reduction Techniques As presented in the previous section, high P A R imposes significant challenges on trans-mitter implementation. Due to OFDM's potential, the P A R problem in O F D M systems has drawn great enthusiasms from the research community over the last decade. As a result, a rich set of P A R reduction techniques has been" proposed. The two most practical and popular families of techniques are peak cancellation and multiple signal representa-2Note that when nonlinear HPAs are employed, total transmit power reduction is the combined results of two causes. The first is the down-scaling of input or amplification gain to reduce distortion, and the second is the distortion itself. Chapter 2. Background 17 i r\—4 I I L. . L_ I : I i — : i 1 0 • 5 • 10 15 . 20 25' 30 ' 35 40 mogw(E,/Af0) [dB] —» Figure 2.9: B E R increase due to nonlinear amplification. Fading channel, Rapp distortion model with p = 10, N = 256, 16-QAM. Chapter 2. Background 18 10 -10 -20 r CQ -40 -50 -60 B = 5, 10, 15, 20 [dB] 0.5 1 1.5 2 Frequency, normalized to bandwidth —> Figure 2.10: Out-of-band radiation due to nonlinear distortion. Rapp distortion model with p = 10. N = 256, 16-QAM. tion. In this section we give a review of the techniques belong to these families. 2.3.1 Peak Cancellation A common strategy of many techniques is to add a peak-cancellation signal c(t) into the data-carrying signal x(t) in order to eliminate the high peaks before amplification. This strategy is followed in clipping, tone reservation, tone injection, and active constellation extension. C l i p p i n g Hard clipping [13] is the most simple P A R reduction technique where the signal's ampli-tude is clipped to A whenever it exceeds that threshold (similar to the ideal soft limiter, Chapter 2. Background 19 Eqn. (2.17)). In effect, hard clipping adds a signal c(i) into x(t) which can be specified as In the frequency domain, the spectral content of c(t) is both in-band and out-of-band (overlap or outside of the O F D M spectrum). The in-band content distorts the signal in subchannels and increases the overall data bit-error-rate (BER) of the O F D M transmission. The out-of-band content causes out-of-band radiation, which may have to be filtered out to comply with regulations. However, filtering in the frequency domain may lead to peak regrowth in the time domain. Repeating the clipping and filtering process many times can reduce both peak regrowth and out-of-band radiation, but also increase the complexity [14, 15]. Tone Reservation To avoid out-of-band radiation, c(t) can be designed such that its spectral content is only in-band. In tone reservation, a number of tones are reserved for peak-cancellation purpose (hence the term tone reservation). c(t) is wholly carried by the reserved tones. Therefore, it is orthogonal to and does not increase the B E R in all other data-carrying tones [16]. Performance of the tone reservation technique depends on the number of reserved tones as well as the algorithms to design the peak-cancellation signal. Let X and C be the iV-point data and peak-cancellation signals in the frequency domain. X is non-zero only at the data-carrying tones, whereas C is non-zero only at the reserved tones. In the discrete-time domain, the corresponding signals are (2.19) x = I D F T ( X ) (2.20) and c = IDFT(C) (2.21) Chapter 2. Background 20 The design of c aims at reducing the peak power of (c + x). It can be done in the frequency domain as an optimization problem with the number of complex variables equals to the number of the reserved tones. In [17], this problem has been cast as a linear programming problem to reduce the computational complexity. The advantage of tone reservation is that it does not require any additional compu-tation at the receiving side. The receiver can simply ignore the reserved tones. The disadvantage is that the data rate is reduced and the overall transmit power is increased. Furthermore, the number of reserved tone and the complexity to solve the optimization problem at the transmitter can be relatively high. Tone Injection In tone injection technique [16], no tones are reserved solely for peak-cancellation purpose. Instead, the constellations in a number of subcarriers are extended such that more than one points represent the same data. By selecting suitable constellation points, a low-P A R signal can be produced. Let X be the original data vector and (X + C) is the modified data vector after constellation extension. The peak-cancellation signal is c = I D F T ( C ) . Therefore, as in tone reservation, c is also in-band. However, because the peak-cancellation and the data-carrying signals share the same subcarriers, they interfere and this increases the BER. In order to minimize the B E R increase, the constellation needs to be extended such that points which represent different data are as far away from each other as possible. In tone injection, no bandwidth is lost for P A R reduction. However, the constellation extension increases the transmit power while potentially decreasing the performance in terms of B E R . Furthermore, application of tone injection increases the complexity in both the transmitter and the receiver. Chapter 2. Background 21 A c t i v e Constel la t ion Extens ion Active constellation extension (ACE) [18, 19] is very similar to tone injection except in the way that the constellation is extended. In tone injection, one point may be extended to a number of fixed points. In A C E , only some outer points of the constellation may be dynamically and continuously extended outwards. A C E has the same general advantages and disadvantages as tone injection. In com-parison with tone injection, performance degradation in A C E may be lower, however, the P A R reduction capacity also decreases. 2.3.2 Multiple Signal Representation Another common strategy to reduce P A R is multiple signal representation (MSR) [20]. In MSR, redundancy is introduced into the data so that the same data can be represented by multiple O F D M signals and one low-PAR signal is chosen for transmission. Members of this family include selected mapping, partial transmit sequence, and trellis shaping. If R redundancy bits are inserted into the data stream, 2R different O F D M signals can be generated. A n ideal MSR technique can generate 2R statistically independent O F D M signals and selects the one with lowest P A R for transmission. In other words, it can reject the (1 — 2~R) portion of all O F D M that has the highest PAR. As a result, the M S R technique limits the P A R of the PAR-reduced signals to 70 specified by P r { 7 ( x ) > 7 0 } = 1 - 2~R . (2.22) Eqn. (2.22) gives the lower bound for the achievable maximum P A R 70 when MSR techniques are used. This bound can be approximated by combining (2.22) and (2.15), which yields 70 = -\n(l-2-R/N) . (2.23) Fig. 2.11 shows the simulative lower bound and its approximation based on (2.23) for the case of N = 256. The simulative bound is obtained from (2.22) and P A R values of Chapter 2. Background 22 3 4 5 6 Redundancy R [bit] —> Figure 2.11: Bound on the achievable maximum PAR using multiple signal representation. Sim-ulative bound indicates the maximum PAR of OFDM signals after applying an MSR technique with R redundancy bits. The approximation is from Eqn. (2.23). N = 256. O F D M signals from a simulation with the oversampling factor L = 4. It can be observed that the approximation from (2.23) is lower (hence, less tight) than the simulative bound. This deviation is caused by the unrealistic assumptions about Nyquist-rate sampling and Gaussian distribution when we derived Eqn. (2.15). The simulative bound suggests that M S R can greatly reduce the P A R using only a very small amount of redundancy. For example, applying M S R with only R = 6 redundancy bit in an O F D M system with N = 256 subcarriers, we can limit the P A R of all O F D M symbols to under 7 dB. Furthermore, as all alternative signals can be made as valid O F D M signal, M S R does not increase the overall data BER. These two characteristics (small amount of redundancy and no B E R increase) make M S R a very promising strategy for P A R reduction in O F D M systems. Chapter 2. Background 23 A number of M S R have been proposed in the literature such that selected mapping (SLM) and partial transmit sequences (PTS) [20, 21], and recently, trellis shaping [22, 23]. It is important to realize that the bound (2.22) is the P A R limits that some ideal M S R technique would achieve. Practical MSR techniques may not be able to achieve the ideal performance. However very close performances have been observed e.g. cf. [20, Fig. 10]. Selected Mapping In the S L M technique [20, 21], each data symbol Xk can be rotated by a phase before transformation. Let us define U distinct phase vectors 0<"> = [¥i u ) • • • ¥ # - ! ] . l<u<U (2.24) and the corresponding rotation vectors p(u) = v^<u) _ _ _ ejviyij] ? 1 < u < u . (2.25) The distinct U alternative O F D M signals are generated as x(u) = i D F T (x o p M ) . (2.26) Total enumeration of U search steps can be carried out to find the phase vector that results in the lowest PAR. One disadvantage of S L M is its high complexity. Due to the arbitrariness of the phase vectors, each P A R search step involves the rotation and IDFT operations to produce the discrete-time signal x^ (see (2.26)), and then searching for the sample with the largest power. The computational complexity per search step is therefore rather expensive. Another disadvantage of S L M is, in order to detect the data, the receiver needs to know which phase vector have been used for PAR-reduction. This information has to be transmitted as side information (SI), which contains at least R — \og2(U) bits. In [24], an S L M technique without explicit side information was proposed. However, in this S L M variation, the complexity per search step remains high. Chapter 2. Background > 24 Partial Transmit Sequences Partial transmit sequence (PTS) [20, 21] is a variation of S L M which avoids IDFT oper-ation in every P A R search. Detailed description of PTS is provided in Chapter 3. The overall complexity of PTS, especially when R is large, is significantly less than that of S L M . Nevertheless, for the same number of search steps, PTS attains an almost equally good performance as S L M cf. [20]. PTS still requires side information to be transmitted to the receiver. PTS has gained much attention as one of the most promissing P A R reduction tech-niques. In this thesis we propose some important improvements for PTS. We address both the complexity and SI transmission issues of PTS. The complexity reduction is pro-posed in Chapters 3 and 5. As our SI transmission and detection scheme is based on the trellis shaping structure presented in Chapter 4, we defer its description until Chapter 5. Trellis Shaping Trellis shaping is an interesting recent variation of M S R where no explicit SI is required. The concept of trellis shaping was proposed in [25] to reduced the average transmit power in single-carrier systems. The application of trellis shaping to reduce P A R in O F D M systems was first introduced in [22] and developed in [23]. It has been shown in [23] that PTS offers a good performance with moderate complexity. In Chapter 4 we propose improvements in trellis shaping to reduce the computational complexity and/or increase performance. Further possible complexity reduction proposal and a comparison with PTS are provided in Chapter 5. 25 Chapter 3 PAR Reduction by Partial Transmit Sequences In this chapter we propose improvements to reduce the computational complexity of the partial transmit sequences (PTS) technique. In particular, we first reformulate the tech-nique as a combinatorial optimization (CO) problem. This allows us to apply efficient search algorithms from the CO literature to PTS. Simulation results show that they in-deed offer better computation-complexity tradeoffs than PTS search algorithms currently available in the literature. 3.1 Partial Transmit Sequences In PTS [20, 21], the data symbols in X are partitioned into V disjoint subblocks X^v\ 0 < v < V such that ^ = [ ^ . . . ^ 1 , (3.1) and ,,A Xk, if k-th. subcarrier belongs to v-th. subblock Xiv)={ (3.2) I 0, otherwise for 0 < k < N. Therefore, v - i X = ^ X M . (3.3) These subblocks are independently transformed to produce partial transmit sequences >) = [x0v) ... *a_i] . '0 < v < V . (3.4) Chapter 3. PAR Reduction by Partial Transmit Sequences 26 X Parition X (0 ) 1 I D F T I D F T IDFT + x Optimize for phases Figure 3.1: Block diagram of the PTS technique. These sequences are rotated by some phase <pv and then combined to produce the transmit sequence v - i x = xiv)e'Vv . (3.5) By changing the phases <pv, we can generate different O F D M signals from which one low-PAR signal is chosen. The block diagram of PTS is shown in Fig. 3.1. If ipv can be chosen among W values {2ixw/W, w = 0,... ,W — 1}, and the first block needs not rotate (without any performance loss), the number of alternative representa-tions for an O F D M symbol is Wv~l. Therefore, the side information (SI) that carries the information about the phases contains R = (V — 1) log 2(W) bits. With R fixed, PTS attains the best performance when W = 2 (see. [21, p. 78 and Fig. 5.18]). Hence, in this thesis we choose to consider the case W = 2, i.e. the number of redundancy bits will be R = V — 1, the phases will be 0 or n. We can state the PTS problem as |2 minimize {<Pv} R i;=l subject to: ipv G {0, ir} , 1 < v < R (3.6) Alternatively, let bv = 0 i i > = 0 1 if (/? = 7T (3.7) Chapter 3. PAR Reduction by Partial Transmit Sequences 27 and let us define the phase vector 6 =[&!... by-i] • (3-8) Then Eqn. (3.6) can be reformulated as a zero-one combinatorial optimization (CO) [26] problem minimize f(b) = \\XWIQ b subject to: b G {0,1} R . (3.9) where the LiV-point complex transmit sequence is R R x = ^2x{v) - 2 j 2 x ( v ) b v (3-10) v=Q v—1 and the objective function f(b) is the peak power of x. • In optimal PTS, we choose the optimal solution of (3.9) to produce the transmit sequence. Finding the optimal solution of the CO problem (3.9) requires total enumer-ation over all 2R possible phase vectors, which is impractical for large R. It is therefore desirable to apply heuristics (suboptimal algorithms) to reduce the complexity of PTS. In order to solve (3.9), we would have to evaluate f(b) very often. The evaluation can be divided into two steps: calculating the transmit sequence and finding its peak power. By storing complex numbers as pairs of real and imaginary parts and taking advantage of binary coefficients bv, calculating the transmit sequence requires no multiplications. On the other hand, finding the peak power of an LiV-point complex sequence then requires 2LN real multiplications. Assuming that computational complexity is largely credited to multiplications, the evaluation of f(b) is relatively computationally expensive. Hence, we will compare heuristics based on the average number of peak-power evaluations (also referred to as the number of peak-power searches or simply the number of searches) that they require to achieve the same P A R reduction. Chapter 3. PAR Reduction by Partial Transmit Sequences 28 A l g o r i t h m 3.1 Random Search Select initial solution b for k = 1 . . . K - 1 do Randomly generate a new solution b i f /(&) < f(b) then b = b end if end for Return b 3.2 Review of Descent Heuristics A number of PTS heuristics have been proposed in literature: random search and bit flip [27], and local search [28]. A l l of these heuristics fall into the category of descent heuristics, in which the solution is always updated in the direction of decreasing the objective function /(b) (downhill moves). In this section we give a brief description of these heuristics. Their pseudo-codes are also provided. 3.2.1 Random Search The most simple PTS heuristic is random search. Its pseudo-code is given in Algo-rithm 3.1. In each iteration we randomly generate a new solution and update the current solution to the new solution if improvement is observed. The application of this subop-timal algorithm to PTS has been mentioned in [27]. 3.2.2 Bit Flip The bit flip algorithm was proposed in [27]. The algorithm is presented in Algorithm 3.2 with some extension. In each iteration, a new trial solution is derived from the current solution by flipping one of its bit (from 0 to 1 and vice versa). If the trial solution is Chapter 3. PAR Reduction by Partial Transmit Sequences 29 A l g o r i t h m 3.2 Bit Flip Select initial solution b, flip position 0 < p < R — 1, time to last update r — 0 for k = 1 . . . K - 1 do i f r == R then break end i f Obtain b by flipping the pth bit in b i f f(b) < f(b) then b = 6, r = 0 end i f V = (P + (remainder), r = r + 1 end for Return b better, i.e. results in lower value when applied to the objective function, it replaces the current solution. In the proposal in [27], the flipping process starts at the first bit and sequentially continues to the last bit of the phase vector with a total of R +1 peak-power searches. We extend the algorithm in that the flipping can continue in a cyclic fashion, which means that the first bit is flipped again after the last bit was considered and so on, allowing for an arbitrary number of searches. We note that, if no improvements are observed after R consecutive flips, the current solution is a local optimum and continuing the process does not help. Hence, in order to save complexity, we incorporate a detection scheme into the algorithm so that it can be terminated early if this is the case. 3.2.3 Local Search A local search heuristic find the best solution in a neighborhood Af(6) of the current solution b. From the objective function /(&), it is not clear what a mathematically proper definition of the neighborhood would be. A pragmatic choice is to select the Chapter 3. PAR Reduction by Partial Transmit Sequences 30 A l g o r i t h m 3 . 3 Local Search Select initial solution 6, searched list C = 0 for i = 1 ... I do Neighborhood J\f(b), search set A = X{b) \ C i f 4^ == 0 then break end i f b = argminb€^{/(6)} Searched list £ = C U A end for Return 6 neighborhood as the set of vectors with Hamming distances no larger than r from the current solution, as in [28], or the set of /^-interchange vectors [29]. If the neighborhood is the set of vectors with Hamming distances of r or less, the number of vectors in the neighborhood is can be observed, however, that the neighborhoods in consecutive iterations may overlap. Therefore, in order to reduce the number of searches, we maintain a list of all vectors that have been searched in previous iterations and exclude them from M{b). The pseudo-code of local search is presented in Algorithm 3.3. (3.11) In each iteration, we have to perform as many as 1^ (6)1 peak-power searches. It 3.3 Metaheuristics for P T S Descent heuristics usually do not perform well in dealing with hard CO problems. The reason is that the solution is always updated in the direction of improvements (downhill moves) hence it soon converges and is trapped to a local optimum [30]. Chapter 3. PAR Reduction by Partial Transmit Sequences 31 / Q . . . . s L (a local optimum) 0 (global optimum) b —• Figure 3.2: Illustration of search moves. Having stated PTS as a CO problem, many efficient metaheuristics [31] are avail-able from the field of CO research such as simulated annealing (SA) [30, 32, 33], tabu search (TS) [30, 34, 35], genetic algorithms, artificial neural networks, particle swarms, ant colony optimization, and greedy randomized adaptive search procedure (GRASP). Unlike descent heuristics, these metaheuristics have mechanisms to accept uphill moves. Such moves are advantageous to avoid being trapped to local optima. This can be illus-trated by a simple example in Fig. 3.2. We consider neighborhoods that contain only two adjacent points to the current solution. If only downhill moves are allowed, the quality of the final solution depends entirely on the initial solution. For example, if the initial solution is P or Q, the final solution is L. Likewise, initial solution S results in final solution 0. When uphill moves are allowed, it is possible to reach the better solution 0 even if the algorithm starts with the solution Q. In this section, we consider the application of SA and TS, which are amongst the most popular metaheuristics, to PTS. These general-purpose metaheuristics have been reported to efficiently solve numerous difficult CO problems. Depending on the particular Chapter 3. PAR Reduction by Partial Transmit Sequences 32 Algorithm 3.4 Simulated Annealing Select initial solution b, initial temperature T, cooling coefficient a Best solution 6 = 6 for k = 1 . . . K - 1 do Generate neighborhood solution b Calculate 5 = /(b) - /(b) Generate a uniform random number r in (0,1) if 5 < 0 or r < e _ < 5 / T then 6 = b end if if /(b) < /(b) then b = b end if T = aT end for Return b problem, the metaheuristics have to be modified to a suitable form. The pseudo-codes presented here are our tailored versions for PTS. 3.3.1 Simulated Annealing SA [30, 32, 33] is one of the first algorithms that have explicit strategies to escape from local minima. It is deduced from the physical annealing process of solid materials. There has been significant amount of research on the convergence and statistical behavior of the SA algorithm, and we refer interested readers to [33] and the works cited therein. In SA, downhill moves are always accepted, and uphill moves are accepted with some probability p that depends on the height of the hill 5 and a control parameter T (the temperature, as from its original physical model). We select the probability following the Chapter 3. PAR Reduction by Partial Transmit Sequences 33 Boltzmann distribution, i.e. p = e~6'T. (3.12) For the solution to converge, the acceptance probability of uphill moves has to reduce during the search. This is realized by reducing the temperature in the cooling process. We choose the simple geometric cooling schedule Tk = a T n , (3.13) where k indicates the search step and a < 1 is the cooling rate. The pseudo-code of the SA metaheuristic that we use is described in Algorithm 3.4. In this algorithm, we choose to derive the neighborhood solution b from the current solution b by cyclicly flipping one bit (as in bit flip, Algorithm 3.2). . The cooling process is controlled by the combination of the initial temperature and the cooling rate. It is an important consideration in SA. Firstly, if the final solution is to be independent of the initial solution, the initial temperature must be hot enough such that almost any exchange of solutions are allowed. Secondly, the temperature must gradually decrease to zero, but not so fast that uphill moves are freezed early in the search. It normally requires experiments to fine-tune the control parameters. 3.3.2 Tabu Search TS [30, 34, 35] is possibly the most used metaheuristics for CO problems. The algorithm uses information from the search history to escape from local optima and drive the search to regions that might have better solutions. Algorithm 3.5 shows the particular form of the TS metaheuristic that we apply to PTS. It uses a short term memory to avoid both trapping to local optima and moving back and forth between searched solutions. This is implemented in the form of a tabu list C(B) that keeps track of B most recently visited solutions and forbids moving back to them. In each iteration we only search within an acceptance set A of neighborhood solutions that are not on the tabu list. We define the neighborhood M(b) as the set of vectors that have Hamming distance 1 from b, i.e. each Chapter 3. PAR Reduction by Partial Transmit Sequences 34 A l g o r i t h m 3.5 Tabu Search Select initial solution 6, initial tabu list C(B) = 0 Best solution 6 = 6 for i — 1 ... I do Acceptance set A = J\f(b) \ C(B) New solution 6 = argmin b 6^{/(6)} i f f(b) < f(b) then 6 = 6 end i f Update tabu list C(B) end for Return 6 vector b of the neighborhood is obtained by flipping one bit of 6. After each iteration, the new solution differs from the last solution at exactly one bit position. That position is recorded by the tabu list C and it will not be flipped in the next B iterations. The size of the neighborhood is 1^ (6)1 = R. The size of the acceptance test varies from R (at the beginning when the tabu list is empty) to R — B. 3.4 Simulation Results We present the simulation results for an O F D M system with 16-QAM constellation, N = 256 subcarriers which are divided into V — 16 (hence R = V — 1 = 15) subblocks by pseudo-random partition [36]. For metaheuristics, different algorithmic parameters often lead to different performance and require experiments to fine-tune the parameters to achieve good results. We use the following parameters for the SA metaheuristic: initial temperature T = E{|a; n | 2 }/2, cooling rate a = l—A/K (K is the total number of searches, see Algorithm 3.4). For the TS metaheuristic, we choose the maximum tabu duration to be B = 9 (see Algorithm 3.5). Chapter 3. PAR Reduction by Partial Transmit Sequences 35 Figure 3.3: Performance of optimal PTS and random search. N = 256 , V = 16, pseudo-random partition, binary rotation. Fig. 3.3 shows the C C D F of P A R applying random search (RS) with different K (2, 4, 16, 256, 1024, and 4096 searches). Also included are the C C D F for O F D M without P A R reduction and for optimal PTS with K = 2R = 32768 searches. It can be seen that considerable improvements in P A R are achieved with relatively few searches K < 64. Closing the remaining gap of about 1 dB (at Q 7 ( 7 o ) = 10~3) compared to optimum PTS, however,s appears quite difficult to achieve. A fair comparison of different algorithms has to take this fact into account. In Fig. 3.4, it can be observed that the application of the simulated annealing meta-heuristic provides some good improvements in the low-PAR range. In particular, SA needs to perform only about one-fourth of number of searches for RS to achieve the same P A R reduction close to that of optimum PTS. In Fig. 3.5, we present the average number of searches vs. 70 that different algorithms Chapter 3. PAR Reduction by Partial Transmit Sequences 36 10Tog10(7o) [dB] — Figure 3.4: Performance of SA vs. RS with different number of searches. require to achieve Q7(7o) = 10 - 3 . The number of searches is averaged because for some algorithms (e.g. bit flip and iterative local search) the number of searches varies with different instance of the CO problem (different O F D M data). Interestingly, it can be seen that descent heuristics proposed in [27, 28] perform worse than random search in all cases. Only the application of SA and TS metaheuristics to solve the CO problem of PTS yields an enhanced tradeoff in the low-PAR range (e.g. at 7 dB), where about four times less searches are required for the same P A R reduction. 3.5 Conclusion In this chapter we reformulate PTS as a CO problem. This helps us to propose a fair benchmark to evaluate PTS search algorithms. The formulation also bring a rich set of efficient metaheuristics from the CO literature to apply in PTS. A n important con-Chapter 3. PAR Reduction by Partial Transmit Sequences 37 random search 10° 1 1 ' 1 1 1 6.5 7 7.5 8 8.5 9 101og10(7o)[dB] — Figure 3.5: Average number of searches required to achieve <37(7o) = 10 - 3 . tribution of this chapter is the finding that the most simple algorithm RS outperforms all descent heuristics that have been proposed for PTS [27, 28]. Only our version of the SA and TS metaheuristics can attain a better tradeoff in the low-PAR region. The numerical results presented in this chapter can also serve as motivation to apply other metaheuristics to the PTS CO problem. 38 Chapter 4 PAR Reduction by Trellis Shaping The concept of trellis shaping was proposed in [25] to reduce the average transmit power in single-carrier systems. The application of trellis shaping to reduce P A R in O F D M systems is developed in [22, 23]. In this chapter we propose further improvements of the techniques. Our proposals include a new metric and the use of sequential decoding in trellis shaping. We also provide extensive performance comparison between the proposed modifications and previous works. In Section 4.1 we show and explain the trellis shaping structure. Metrics and decoding algorithms are described in Section 4.2 where we present the existing and our proposals together to facilitate our comprehensive comparison. The discussion of some efficient implementations and their complexities follows in Section 4.3. Finally, Section 4.4 shows simulation results and Section 4.5 concludes the chapter. 4.1 Trellis Shaping Scheme In this section we introduce the trellis shaping scheme [25] and its application in P A R reduction in O F D M systems [22, 23]. Trellis shaping is a type of constellation shaping, which is best acquainted via an example. Consider a 16-QAM constellation with Gray labelling as in Fig. 4.1. Each symbol in the constellation is labelled by four bits abed. If only two of them carry the data, say cd = 11, while the other two bits are redundancy, we can choose any of the filled symbols in Fig. 4.1 for modulation. The particular choice depends on our objective and possibly other constraints. For example, to reduce the transmit power, we would choose the point that is closest to the origin. We denote ab the most significant bits (MSBs) and cd the less significant bits (LSBs). Starting with some Chapter 4. PAR Reduction by Trellis Shaping 39 0 0 1 0 O 0 0 1 1 ' • ' 0 0 0 1 O 0 0 0 0 o 0 1 1 0 O 0 1 1 1 • 0 1 0 1 O 0 1 0 0 O O 1 1 1 0 • mi O 1 1 0 1 O 1 1 0 0 O 1 0 1 0 • 1 0 1 1 O 1 0 0 1 ol Figure 4.1: Constellation shaping with a 16-QAM constellation initial MSBs, we can drive our selection by raodulo-2 adding binary bits (the shaping bits) to the MSBs. In this example, there are two redundancy bits per Q A M symbol, hence any initial MSBs and shaping bits are valid. Trellis shaping is a type of constellation shaping where there is only a fraction of a redundancy bit per Q A M symbol. Furthermore, only one bit of the label is the MSB while all other bits are LSBs. The shaping bits come from a code sequence of a rate- l /n s convolutional code called the shaping code. The block diagram of the trellis shaping scheme for P A R reduction in O F D M systems is shown in Fig. 4.2. We consider the shaping code characterized by an (ns — 1) x ns parity check matrix H(D), whose elements are polynomials in D [37]. As illustrated in Fig 4.2, the binary data is divided into two sequences s(D) and p{D), which are respectively the data MSB and the LSB sequence. Given s(D), the initial (or unshaped) M S B sequence z(D) is obtained as z(D) = s{D)H- T(D) (4.1) where H~ T(D) is the left inverse of H T(D), i.e, H- T(D)H T(D) = In3(D). (4.2) The shaping sequence y(D) is a valid code sequence of the shaping code. The shaped Chapter 4. PAR Reduction by Trellis Shaping 40 (a) A S(D) Select Code Sequence H~T(D) \z{D) V(D) data Mapping x (b) X De-mapping \z'(D\ * PiD) Figure 4.2: Block diagram of trellis shaping system: (a) transmitter, (b) receiver. Note: p(D): L S B sequence; s(D): M S B data sequence; z(D): initial M S B sequence; y(D): shaping sequence; and z'(D): shaped M S B sequence. M S B sequence is z'(D) = z(D)®y(D). (4.3) The LSB sequence p(D) together with the shaped sequence z'(D) are mapped onto N M-ary Q A M symbol {Xi, i — 0 ... N-l}. Since any valid code sequence y(D) of the shaping code satisfies y(D)HT(D) = On,-1(D), (4.4) the data sequence s(D) is recovered at the receiver from z'(D) by z'{D)HT(D) = 8(D) . (4.5) In the following, we define the binary vectors s and p of length N(n3 — l)/na and N(\og2(M) - 1), respectively, as the data in an O F D M symbol. Correspondingly, we denote by y a code sequence of N bits used to shape the O F D M symbol and z' the re-sulting shaped MSBs. z' is the MSBs in the labels of N Q A M symbols X = [X0 ... XN-i] Chapter 4. PAR Reduction by Trellis Shaping 41 010 on 001 000 O O O O 110 in 101 100 O O O O O o O O 100 101 in 110 O O o O 000 001 011 010 (a) (b) Figure 4.3: Special constellation labelling for 16-QAM, Type-I from [23], (a) M S B , (b) L S B . (multidimensional shaping [22, 23]), whereas p is the LSBs. P A R reduction with trellis shaping is achieved by selecting an appropriate shaping sequence y from the set of code sequences C. The optimum solution is expressed as y o pt = argmin{7(x)} . (4.6) It is clear that with the same binary data, multiple O F D M signals can be generated depending on the shaping sequence and one low-PAR signals is chosen for transmission. Furthermore, as the receiver does not need to know which shaping sequence has been used, trellis shaping is indeed an M S R technique with implicit side information. The total redundancy inserted into one O F D M symbol by trellis shaping is N/ns bits. Since the total number of transmitted bits is N\og2(M), the relative redundancy is N/na _ 1 r = (4.7) iVlog 2 (M) log 2[M)n s ' The P A R reduction performance of trellis shaping also depends on the constellation labelling. Simulation results in [23] suggest that a mapping such that two signal points with the same LSBs are symmetric about the origin of the complex plane is advantageous. This mapping is referred to as constellation mapping Type-I in [23] (see Fig. 4.3) and we Chapter 4. PAR Reduction by Trellis Shaping 42 A H'T(D) ] Mapping Select Code b(D) Sequence and Shape U 1 data Figure 4.4: Simplified trellis shaping transmitter limit the discussion from now on to this type of mapping. With this mapping, a shaping bit 0 does not affect whereas a shaping bit 1 simply inverts the corresponding Q A M symbol. Hence, the transmitter in Fig. 4.2 (a) can be simplified as in Fig. 4.4. In this simplified scheme, the transmitter needs to perform the mapping only once. Afterward, the code sequence selection algorithm can work directly with the data symbols. It should be noted that with this mapping the average power of the O F D M signal is independent of the shaping sequence y. Since the code book C is of size 2N/n% a brute force search for yopt is infeasible for typical values of N and ns. Instead, computationally efficient search algorithms are needed. 4.2 Selection of Code Sequences In this section, we discuss methods to select the code sequence y for shaping. First, in Section 4.2.1, we present four different metrics numbered 1-4, which can be applied in search (or decoding) algorithms. Metrics 1, 3, and 4 have been used in the literature and Metric 2 is our proposal. We then describe the application of the Viterbi algorithm (VA) and stack (or ZJ) sequential decoding algorithm (ST-SDA) for efficient searches in Section 4.2.2. While the V A is the only decoding algorithm in previous works, the application of the ST-SDA is one of our contributions. Finally, in Section 4.2.3 we propose two variations of trellis shaping that are only possible when our Metric 2 is used. Chapter 4. PAR Reduction by Trellis Shaping 43 As the decoding algorithms in Section 4.2.2 select the code sequences y in a sequential manner, it is convenient to define a partial code sequence of length kns as y [k) = [y0... ykna-i] (4.8) for 1 < k < N/nsl. The partial shaped data sequence corresponding to y^ is denoted by X<*> = [X0 ... X^-i) (4.9) 4 .2 .1 Metrics M e t r i c 1: Pa r t i a l P A R The first metric was suggested in [22] for trellis shaping using the V A (see Section 4.2.2). It applies the P A R criterion (4.6) directly to partial code sequences y^ k\ More specifically, the metric for t / f c ; is given by A 1 ( y W ) = ||aJ?)||20> (4.10) where2 x^ = IDFT(X^ fe ;) is the partial transmit sequence corresponding to the zero-padded shaped data sequence X?> = 0N_kn3] . (4.11) M e t r i c 2: Appended par t ia l P A R (Newly Proposed) We propose a modification of Metric 1 where we always consider all subcarriers at each decoding step. This metric facilitates truncated and adaptive shaping discussed in Sec-tion 4.2.3. Let y{d] = [V (k) Vkns • • • Vkns+t-\ OjV-fcns-i] , (4-12) 1In the following, we always assume that 1 < k < N/ns and we omit repeated statement of this for the sake of brevity. 2 I D F T ( X ) gives the LTV-dimensional inverse discrete Fourier transform (IDFT) of the appropriately zero-padded iV-dimensional vector X. Chapter 4. PAR Reduction by Trellis Shaping 44 i.e. a full-length code sequence obtained from by pushing O's into the convolutional encoder, t is the number of possible non-zero output bits, i.e., { \og2{Ns)ns kns< N - \og2(Ns)ns t = N — kns otherwise and Ns is the number of encoder states. Let be the resulting data sequence and a j j 0 = IDFT(X^ f c ) ) . The metric for y{k) is obtained as A 2 ( y ( f c ) ) = |l4fe)ll~- (4-14) Metric 3: Partial Autocorrelation Another criterion related to the P A R of the time domain signal x(t) considers the aperi-odic autocorrelation function N-m-l pm= Y, Xi+rnX*, 0<m<N-l, (4.15) i=0 of the frequency domain data sequence. In particular, it was shown in [38] that the peak power of x(i) is bounded by max \x 0<t<NT (t)\2<^Po + 2^\pm^j . (4.16) It was therefore suggested to minimize the term 3 Ylm=\ \Pm\ to reduce the P A R of the O F D M signal x(t) [39]. In order to adopt this criterion for trellis shaping, we define the aperiodic autocorre-lation function kns— m—1 p<£> = £ X i + m X * , 0<m<kn8-l, (4.17) »=o of the partial shaped data sequence X^h\ and we choose as metric for the corresponding code sequence y^ A 3 ( l / ( f c ) ) = £ Ipffl- (4.18) m=l 3The value of po does not change if inversions are applied to Xi, which is the case with symmetrical labelling. Chapter 4. PAR Reduction by Trellis Shaping 45 Metric 4: Partial Autocorrelation Squares Ochiai used a similar reasoning as provided for Metric 3 above and considered minimizing the sum of the squares of the autocorrelation function ]Cm=i \Pm\2 for P A R reduction [23, Eqn. (20)]. The metric for code sequence is given by [23, Eq. (22)]. Wfc))=£Vs?|2. (4.19) This Metric 4 has the advantage over Metric 3 that it is additive (see Section 4.2.2). 4.2.2 Algorithms The optimum decoding algorithm selects the code sequence y^71^ of length N from the code book C that minimizes the adopted metric: y o p t J - argmin { W / n s ) ) } , I = {1,2,3,4} . (4.20) In the following, we consider the use of the V A and the stack algorithm to find a good code sequence with feasible computational complexity. We also discuss whether this sequence coincides with J/opt,;-Viterbi Algorithm (VA) Henkel and Wagner [22] considered using the V A with Metric 1 for suboptimal code sequence search. If the two partial code sequences y^ and y^ enter the same trellis state, sequence y[h^ is selected if Ai(y^) < A 1(t/ 2 / c^) and vice versa. Since, as noted in [22], Metric 1 is not additive, the VA does not necessarily find the optimum solution 2 / o p t i i from (4.20). Similarly, the code sequence selected by the V A with Metric 2 is not necessarily yopt )2. Still, a considerable P A R reduction was observed in [22] and we expect similar results if Metric 2 is applied (cf. Section 4.4). Ochiai [23] suggested the use of the V A and Metric 4. Metric 4 is additive, as the metric update equation A4(y(fc)) = W*" 1 5) + A?V f c ) ) (4-21) Chapter 4. PAR Reduction by Trellis Shaping 46 A l g o r i t h m 4.1 ST-SDA Algorithm Note: Sorted stack, top node has least metric Load stack with origin node. while Top of stack is not at end of tree do Pop the top node Expand top node (two branches 0 and 1) Compute metrics of these two new nodes Push the two new nodes into the stack end while Output top path with \[k)(yw) > 0 can be formulated [23, Eqn. (28)]. The difficulty inherent to (4.21) is the unlimited memory of the metric increment A 4 ^(? / f c ; ) . In particular, \[h\y^) does not only depend on the code symbols [y(k-i)ns • • • Dkns-i} along the transition path between the two states, but it also depends on the entire previous code sequence (see also [23, Eqn. (28)]). Hence, a tree search algorithm rather than a trellis search algorithm must be used to find the optimum solution i / o p t ) 4 for Metric 4. It is worth noting that the suboptimality of the V A was not mentioned in [23], Metric 3 directly follows from the bound (4.16) of the peak power. But different from Metric 4, it is not additive and was therefore not considered in [23]. As discussed above for Metrics 1 and 2, the V A could be used to find a suboptimum solution. Stack Sequential Decoding A l g o r i t h m ( S T - S D A ) The application of sequential decoding algorithms (SDAs) to trellis shaping for P A R reduction was not considered in [22, 23]. Here, we consider the ST-SDA as one of the most commonly used SDAs. The pseudo-code for the ST-SDA is given in Algorithm 4.1. The ST-SDA searches through the code tree rather than the code trellis with the advantage that the complexity is almost independent of the code memory [37]. Since Metrics 1-3 are not additive, the ST-SDA does not find the code sequence yopt); from (4.20) for Chapter 4. PAR Reduction by Trellis Shaping 47 I G {1,2,3}. However, different from the VA, the ST-SDA finds the optimum solution 2/ 0 Pt,4 for Metric 4. We especially note that decoding using the ST-SDA with Metric 2 can be imple-mented more efficiently given the following observation. The ST-SDA maintains a stack, where the top entry corresponds to the path with the smallest metric. At each sequen-tial decoding step, this path branches into two paths corresponding to inputs 0 and 1, respectively. The two expanded paths are then pushed back to the stack (see Algorithm 4.1). It is immediately clear from the definition of Metric 2 that the original path and the path expanded with bit 0 have the same accumulated metric because they share the same appended code sequence. Hence, no additional metric update is needed for this expanded path. It also follows immediately that one of the expanded paths will be at the top of the stack. As a result, the stack algorithm always expands the longest path. This fact leads to two consequences: (i) the algorithm stops after N/ns steps and in total only N/ns + 1 metric calculations are needed; and (ii) a stack of size two is sufficient. 4.2.3 Variations Based on the Appended Partial P A R Criterion The "appended partial PAR" Metric 2 in (4.14) is different from the "partial PAR" Metric 1 in (4.10) in that the unshaped portion of the data sequence is included when forming the transmit signal. Using this property we propose two variations of trellis shaping based on Metric 2. These modifications are applicable to code sequence selection with the V A and with the ST-SDA. Truncated Shaping Conventional shaping requires N/ns redundancy bits and the shaping code sequence is A/'-bit long. For truncated shaping, only a subset of N' < N Q A M symbols is shaped in order to reduce both the computational complexity and the number of redundancy bits Chapter 4. PAR Reduction by Trellis Shaping 48 (N'/ns). Generally, this truncated shaping can operate on any subset of N' subcarriers and we found that the achieved P A R reduction is insensitive to which subset is chosen. Hence, we assume that the N' symbols {Xi, 0 < i < N' — 1} are shaped. Unshaped data symbols do not carry redundant bits. To correctly recover the data at the receiver, only the MSBs of the N' shaped symbols are applied to the syndrome former. Adap t ive Shaping With Metric 2, we always consider a valid code sequence at every step. Therefore, we can stop as soon as the achieved P A R is below the desired value. This is the idea of adaptive P A R reduction [40] which can considerably reduce the complexity of shaping. 4.3 Implementation and Complexity In this section, we discuss the implementation and complexity of trellis shaping with the different algorithms and metrics introduced in the previous section. 4.3.1 Partial and Appended Partial P A R Criteria (fc) (k) The computation of Metrics 1 and 2 requires the partial time-domain vectors x\ and xKd , respectively, and a subsequent peak-amplitude search (see Eqns. (4.10) and (4.14)). In a computationally efficient implementation, each state at stage (k — 1) stores the partial time-domain sequence corresponding to the partial code sequence ending in this state. The initial vectors at k = 0 are x i 0 ) = 0LN and ccj 0 = I D F T ( A ) , respectively. The vector update can be formulated as (yi G {0,1} is the zth bit in the shaping sequence) x + v (fc) l e {z,d} (4.22) with kns — 1 kn3+t— 1 (4.23) i = ( f c - l ) n s i= ( fc—l)n s Chapter 4. PAR Reduction by Trellis Shaping 49 and u ^ I D F T a O i X i f V - i - ! ] ) (4.24) respectively. It should be noted that this update relies on the symmetric Q A M labelling. Assuming multiplication dominates the complexity, at the beginning of each decoding step, needed vectors Vj are precalculated and temporarily saved. For each of the transition paths, and are updated by adding or subtracting these vectors according to (4.23) . Without taking into account possible complexity savings due to e.g. multiplication with ± 1 and ±j or repeated multiplication of Xi with the same complex exponential in (4.24) , nsLN complex multiplications are required per transition step. Hence, for trellis shaping with the V A and Metrics 1 and 2, 2LN2 complex multiplications are required to generate time-domain vectors x\k\ I € {z,d}, (the computations for an initial IDFT for Metric 2 are neglected). Calculating the peak value accounts for 2LN real multiplications per considered time-domain vector. Since in total 2Ns(N/ns — \og2(Ns)) states are visited, the complexity (in terms of multiplications) for the V A is therefore proportional to LN2 and, depending on the ratio Ns/ns, it is dominated by the effort for finding the peak value. In case of the ST-SDA and Metric 1, the complexity is a random variable and depends on the number of states visited. For Metric 2, as discussed in Section 4.2.2, only N/ns +1 time domain vectors are considered, which leads to a reduced complexity compared to the VA. 4.3.2 Partial Autocorrelation and Partial Autocorrelation Squares Criteria A computationally efficient implementation of trellis-shaping with Metric 4 is described for the V A in [23]. It requires about NSN2 / (2ns) complex multiplications. It should be noted, however, that each state at time k has to store all partial autocorrelations p$ for 1 < m < kns — 1 associated with the chosen path. It is further assumed that all possible Chapter 4. PAR Reduction by Trellis Shaping 50 entries for Xi+mX* are tabulated to save computations. Hence, we conclude that the V A with Metric 4 requires by about a factor 2L less multiplications than with Metrics 1 and 2, while memory requirements are considerably higher. In turn, if we consider similar memory use, then the computation of partial autocorrelations pm\ 1 < m < kns — 1, for Metric 4 leads to a considerably increased number of operations compared to Metrics 1 and 2. Similar conclusions apply if the ST-SDA is used and a comparable number of states is visited for the different metrics. The complexity for Metric 3 is somewhat difficult to specify quantitatively since the absolute value operation has to be frequently carried out (see Eq. (4.18)). We expect that due to this operation Metric 3 compares unfavorably with the other metrics in terms of complexity for most processors. 4.4 Simulation Results This section presents the simulation results for trellis shaping. We assume O F D M trans-mission with N — 128 subcarriers, 16-QAM constellation, and the symmetrical labelling which was described in. Section 4.1. We select five maximum free distance codes as the shaping codes. The codes are presented in Table 4.1 (\/ns is the code rate, Ns is the number of encoder state, and r is the relative redundancy, see Eqn. (4.7)). We found that in each case, ns and .Ns are important parameters, however the actual numeric values of the code generator has only a small impact on P A R reduction. A n example is given in Fig 4.5 where the two shaping codes have the same rate and number of states but different code generators. Their performances are very similar. 4.4.1 Comparison of Metrics Figs. 4.6 and 4.7 depict the C C D F Q7(7o) as a function of 70 for trellis shaping with the V A and Metrics 1-4. Two different convolutional codes are chosen for simulation (Cases 1 and 3 from Table 4.1). As a reference, the C C D F for O F D M without shaping Chapter 4. PAR Reduction by Trellis Shaping 51 Table 4.1: Shaping codes Case ns Ns r Code generator (octal) 1 2 4 1/8 [5 7] 2 4 4 1/16 [3 7 7 7] 3 4 16 1/6 [25 27 33 37] 4 8 8 1/64 [17 17 13 13 13 15 15 17] 5 8 64 1/64 [153 111 165 173 135 135 147 137] is also plotted. We observe that Metrics 1 and 2 show a very similar P A R reduction capability. For both cases, the 0.1% P A R (101og10(70) at C}7(7o) = 10~3) reduced by more than 4.5 dB compared to O F D M without shaping. This was expected as both metrics represent the same optimization criterion. Furthermore, Metrics 3 and 4 compare unfavorably with Metrics 1 and 2 in terms of P A R reduction. For example, a gap of about 1 dB is observed for the 0.1% P A R is observed in Fig 4.6. Interestingly, Metric 4 is slightly superior to Metric 3, although Metric 3 is more closely related to the theoretical P A R bound in (4.16). Table 4.2 shows the results in terms of 101og10(70) at which Q7(7o) = 1 0 - 2 (1% PAR) for different convolutional codes. We note that the 1% P A R is 10.1 dB for O F D M without shaping (see Fig. 4.6). It can be seen that Metrics 1 and 2 consistently offer an improved P A R reduction over Metrics 3 and 4. We also observe that, with the same code rate l/nS) increasing the number of states Ns is beneficial for P A R reduction (with the cost of increased complexity). In summary, we conclude that trellis shaping accomplishes significant P A R reduc-tion and that Metrics 1 and 2 are clearly superior over Metric 3 and 4. We therefore concentrate on Metrics 1 and 2 in what follows. Chapter 4. PAR Reduction by Trellis Shaping 52 4 5 6 7 8 101og10(7o) [dB] —-Figure 4.5: Two codes with the same rate and number of states have very similar P A R reduction performance. The code generators in octal are shown. Viterbi algorithm , Metric 1. 10° 10" Of 10 -2 10" :• • • :• :• ^ \ :• :• • • • OFDM \ \ { M e t r i c . 2 V - - ^ \ Y i t h o u t .shaoiuE ... Y v / Metric ; 5 \ • • -: i • AV/ • • Metric 4 •:•:•:•:::: : \ : ::::::: Metric 1 \ . \\ •: 1 : \\ t • \ \\ i i 1 i IA ,—i i 1 4 5 6 7 8 9 101og10(7o) [dB] —* 10 11 Figure 4.6: Performance of trellis shaping with V A . Metric 1-4, Case 1 from Table 4.1. Chapter 4. PAR Reduction by Trellis Shaping 53 101og 1 0( 7o) [dB] —> Figure 4.7: Performance of trellis shaping with VA. Metric 1-4, Case 3 from Table 4.1. Table 4.2: Performance of trellis shaping with VA. 1% P A R [dB] Case (101og 1 0 ( 7 o )atQ 7 (7o) = 10- 2) Metric 1 Metric 2 Metric 3 Metric 4 1 6.3 6.3 7.0 6.9 2 6.7 6.6 7.5 7.4 3 6.3 6.4 7.2 7.2 4 6.9 6.8 7.8 7.8 5 6.3 6.4 7.5 7.4 Chapter 4. PAR Reduction by Trellis Shaping 54 Table 4.3: Performance and Complexity of ST-SDA, Case 1% P A R [dB] avg. # of metric calculations V A Metric 1 Metric 2 Metric 1 Metric 2 1 6.6 6.8 174 65 496 2 7.0 7.1 83 33 240 3 7.0 7.2 83 33 896 4 7.5 7.5 40 17 208 5 7.5 7.4 40 17 1280 4.4.2 V A vs. ST-SDA For the ST-SDA with Metric 1, due to zero padding, longer paths accumulate larger metric values. Therefore, we adjust the metric when comparing paths of different lengths by subtracting the expected metric value for the respective lengths. In other words, the actual decision metric is A i ( y ( f c ) ) - S { A ! ( y W ) } (4.25) where the expectation E{Ai(y^)} can be determined offline. For example, for each stage k, we can randomly generate 105 partial shaped data sequences X^k\ and average their peak powers to obtain E{Ai{y^)} (the larger the number of samples the better, subject to simulation run time constraint). This needs to be done only once and can be considered as part of the algorithm design. The application of such a bias term is known from sequential decoding of convolutional codes. Also, in order to avoid possible long decoding delay, we limit the maximum stack size to 100. For the ST-SDA with Metric 2, we apply the observation in Section 4.2.2 to the implementation. Table 4.3 shows the 1% P A R for trellis shaping with the ST-SDA and Metrics 1 and 2. A graphical illustration for Cases 1 and 3 is given in Fig. 4.8. In the table, we present the average number of metric calculations, which is the appropriate complexity measure when comparing the ST-SDA with the VA. The respective complexity figures Chapter 4. PAR Reduction by Trellis Shaping 5 5 Figure 4.8: Performance of trellis shaping with ST-SDA. Metrics 1 and 2, shaping codes Case 1 and 3 from Table 4.1. for the V A are also included (which is 2Ns(N/ns — log2(Na)), as we need to calculate metrics only for parts of the trellis from stage \og2(Ns) to stage N/ns). A comparison of the figures in Tables 4.2 and 4.3 show that (a) the ST-SDA performs considerably fewer searches than the VA, which results in a corresponding gap in P A R reduction capability, (b) Metric 2 achieves a slightly better performance-complexity tradeoff than Metric 1, and (c) the ST-SDA with Metrics 1 and 2 attains a better or comparable 1% P A R as the V A with Metric 4 (proposed in [23]), while computational and/or memory complexity savings are achieved. Hence, the ST-SDA is an interesting alternative for low-complexity trellis shaping. Chapter 4. PAR Reduction by Trellis Shaping 56 Figure 4.9: Performance of truncated shaping with different number N' of shaped carriers (c = N'/N). V A with Metric 2. Case 1 from Table 4.1. 4.4.3 Variations Based on the Appended Partial P A R Criterion We now consider trellis shaping using the VA (similar results are obtained with the ST-SDA) with Metric 2 and the modifications proposed in Section 4.2.3. Fig. 4.9 shows the C C D F of P A R for truncated shaping with redundancies r' = N'/(N log2(M)ns) and different values c = N'/N of shaped subcarriers (see Section 4.2.3). The code parameters are according to Case 1 in Table 4.1. The complexity compared to conventional shaping is reduced by the factor c. From Fig. 4.9 we see that the per-formance degradation due to truncation is very moderate for c > 3/4. In particular, for truncated shaping with 25% less multiplications and additions than for full shaping the loss in P A R reduction is about 0.3 dB at 0.1% PAR. It is also interesting to note that Chapter 4. PAR Reduction by Trellis Shaping 57 Table 4.4: Adaptive shaping with VA, Metric 2. Case Threshold [dB] Avg. # of metric calculations 1 6.3 139 2 6.6 65 3 6.4 192 4 6.8 52 5 6.4 274 similar relative performance-complexity tradeoffs were observed in [23] when applying "trellis window truncation" to the V A with Metric 4. Table 4.4 shows the average number of metric calculations needed with adaptive shaping. The threshold for each case is chosen as the 1% P A R for the conventional V A (cf. Table 4.2). The actual performance of adaptive shaping for Case 1 is shown in Fig. 4.10. From Table 4.4 a consistent reduction of 75% in terms of number of metric calculations can be observed. The overall performance of adaptive shaping, as illustrated via the C C D F plots in Fig. 4.10, however is not the same as conventional shaping. 4.5 Conclusion In this chapter, we have studied the application of trellis shaping for P A R reduction in O F D M systems. To this end, we have considered different decoding algorithms and metrics. While trellis shaping using the V A with Metrics 1 (partial PAR) and 4 (partial autocorrelation squares) was suggested in the literature, we propose the application of Metric 2 (appended partial PAR) and the ST-SDA. These modifications help increase the P A R performance and/or reduce the computational complexity of the trellis shaping. Our proposal of truncated and adaptive shaping based on Metric 2 also offers more flexible performance-complexity tradeoffs. The presented comparison of the respective performances and complexities provides valuable guidelines for practical use. Chapter 4. PAR Reduction by Trellis Shaping 58 . . . . 59 Chapter 5 Further Improvements for PTS and Trellis Shaping In this chapter we propose three improvements for PTS and trellis shaping. These propos-als are presented in the same chapter because they are either the results of our comparison between the two techniques or applicable to both of them. In Section 5.1 we restate the code sequence search in trellis shaping as a CO problem and apply the heuristics1 in Chapter 3 to solve this problem. In Section 5.2, we use an approximation of the peak power to reduce the overall complexity of search algorithms. Finally, we propose a novel side information embedding scheme for PTS in Section 5.3. This low-complexity scheme is based on trellis shaping and has minimal effect on the overall performance of the O F D M system. Conclusion of this chapter is presented in Section 5.4. 5.1 Application of Combinatorial Optimization Heuristics in Trellis Shaping We first restate the code sequence search in trellis shaping as a CO problem. In trellis shaping, let b be the respective code message of the shaping sequence y, i.e. encoding 6 by the shaping code gives y. The code message b contains N/ns bits. We note that, while the bits in y are not independent, the bits in b can be arbitrary. The relationship between b and y is one-to-one. Therefore, finding a shaping sequence is equivalent to finding its respective code message. As a result, the code sequence search (4.6) can be 1In this chapter, heuristic may indicate any CO [meta]heuristic and decoding algorithm. Chapter 5. Further Improvements for PTS. and Trellis Shaping 60 restated as a CO problem minimize /(b) = H^ H2*, b subject to: b e {0,1} R . (5.1) where x is the transmit sequence. From the CO literature's viewpoint, the two decoding algorithms V A and ST-SDA in Chapter 4 can be regarded as greedy heuristics (see e.g. [41, 42]) where the solution b is constructed in a sequential manner. We now consider the application of the heuristics in Chapter 3 to the trellis shaping CO problem (5.1). Fig. 5.1 shows the performance of random search (RS) applied to this problem. RS turns out to perform very well and its performance-complexity tradeoff remains very similar to the PTS case. That is, large P A R reduction is obtained after the first few searches. Then, it becomes increasingly difficult to further reduce the peak power. From this result, we expect that all other CO heuristics would also show similar tradeoffs as in the PTS case (cf. Section 3.4, especially Fig. 3.5). It is worth noting again that the parameters of the SA and TS metaheuristics need to be fine-tuned per different trellis shaping configuration. The good performance of RS, the simplest heuristic, in trellis shaping prompts us to compare it with the two decoding algorithms (VA and ST-SDA) in Chapter 4. We want to see if these more sophisticated algorithms are really worthy (recall Section 3.4 where simulation results show that RS outperforms local search and bit flip). We note that the average computational complexity per search step in trellis shaping is approximately the same for either the decoding algorithms with time-domain metrics or other heuristics (see Sections 3.1, 4.3 and also 5.2 below). Therefore, we will compare algorithms based on their performance vs. number of peak-power searches. Fig. 5.2 shows the performance of trellis shaping with the V A and Metric 1 vs. RS using the same number of searches (which is 496, see Table 4.3). It can be seen that the V A actually performs better than RS. Similar results can be seen in Figs. 5.3 and 5.4 for comparison of the ST-SDA with Metrics 1 and 2 vs. RS. Chapter 5. Further Improvements for PTS and Trellis Shaping 61 10° 10- 1 i o - 2 10- 3 i.. 1 1 1 \ . \ . A . . V . . . V . V . . . . : u r J J M . w i t n o u t . •2v4y 16;:64,- 256^ i024l• \ • • •I • sea rches ' " :1" r vv ' \ ' i i 1 I i I 1 \ i \ i 101og 1 0( 7o) [dB] —> Figure 5.1: Trellis shaping with random search. N = 128, 16-QAM, Case 1 from Table 4.1. 101og10(7o) [dB] —> Figure 5.2: Trellis shaping V A and Metric 1 vs. random search (RS) with the same number of peak-power searches. N — 128, 16-QAM. Case 1 from Table 4.1. Chapter 5. Further Improvements for PTS and Trellis Shaping 62 101og10(7o) [dB] — Figure 5.3: Trellis shaping ST-SDA and Metric 1 vs. random search (RS) with the same number of peak-power searches. N = 128, 16-QAM. Case 1 from Table 4.1. 101og1 0(7 o) [dB] — Figure 5.4: Trellis shaping ST-SDA and Metric 2 vs. random search (RS) with the same number of peak-power searches. N = 128, 16-QAM. Case 1 from Table 4.1. Chapter 5. Further Improvements for PTS and Trellis Shaping 63 IO4 <-Or CO B o c3 CO CP -6 CO CO O bb > < 6.5 7 7.5 101og10(7o)[dB] 8.5 Figure 5.5: Average number of searches required to achieve Q7(7o) = 10 3 . Trellis shaping with N = 128, 16-QAM, Metric 1, shaping code Case 1 from Table 4.1. Fig. 5.5 shows the average number of searches vs. 70 that the VA, ST-SDA, and RS require to achieve Q7(7o) = 10~3 in trellis shaping (similar to Fig 3.5 for PTS). The trellis shaping configuration is N = 128, 16-QAM, Metric 1, and shaping code Case 1 from Table 4.1. In this particular case, it can be observed that the V A and ST-SDA indeed require less than two times as much the number of searches to attain the same performance as RS (cf. Fig. 3.5 illustrates a corresponding figure of up to four when metaheuristics are applied to PTS). From this result, we conclude that trellis shaping with V A and ST-SDA has comparable performance and tradeoff as PTS with metaheuristics. Chapter 5. Further Improvements for PTS and Trellis Shaping 64 Figure 5.6: Signal in complex plane. Figure 5.7: Approximation of a circle by a square and an octagon (F = 8). 5.2 Complexity Reduction by Peak Power Approximation Repeated evaluation of the objective function /(b) in both trellis shaping (see Eqn. (5.1)) and PTS is computationally expensive. It is therefore desirable to provide an approxi-mation of f(b) that is cheaper to compute while retaining the P A R reduction capability of the techniques. The peak power is the square of the radius of the circle that encloses the transmit se-quence x in the complex plane, as illustrated in Fig 5.6. Following an approach from [43], the circle can be approximated by a symmetrical polygon of P sides, see e.g. Fig. 5.7. Chapter 5. Further Improvements for PTS and Trellis Shaping 65 The circle's radius is equal to the polygon's side radius which can be estimated by m a x{5R ( x „ e J ' 2 p 7 r / p ) , n = 0 . . . LN - l , p = 0 . . . P- 1} . (5.2) Let us define the function h = G(x, P) where h = [m{x} Q{x} ... R j x ^ ^ - ^ O S{xJ2*(p'4-»'p}] (5.3) i.e., the function returns an iVXP/2-pomt real-valued sequence h from an iVL-point complex sequences x such that the peak power of the two sequence are approximately equal [Ml * \\x\\l . (5.4) While finding the peak power of x requires 2NL multiplications, finding the peak power of h requires none. Assuming that multiplications are the main source of the computational complexity, it is advantageous to use h instead of x in our CO problems. In PTS, if partial transmit sequences x^ are replaced by = G(x^v\ P) in relevant equations, we end up with a sequence h = G(x, P). Similarly, we obtain the same result in trellis shaping with time-domain metrics if we substitute NL-point complex sequences Vi from (4.24) by hi = G(vi, P) in relevant equations. Finally, the CO problem for both trellis shaping (Eqn. (5.1)) and PTS (Eqn. (3.9)) can be approximated as minimize g(b) = WhW2^ (5.5) subject to: b G {0,1} R . where R = N/ns for trellis shaping and R — V — 1 for PTS. With this formulation all algorithms introduced in the previous chapters still apply. The solution of (5.5) approaches that of the original problems as P increases. On the other hand, complexity also increases with P . For example, in PTS, additional 4NLV(P/4: — 1) multiplications are required to generate vectors but subsequence evaluations of g(b) require no multiplications. In comparison, each evaluation of f(b) requires 2A^L multiplications. Therefore, the new objective function g(b) is preferred when more than 2V(P/4 — 1) searches are performed. Chapter 5. Further Improvements for PTS and Trellis Shaping 66 Figure 5.8: SA with 256 searches, using approximation of the peak power with P = 4, 8,16. For comparison: SA with 64 searches, and optimal P T S . In Fig. 5.8 we present P A R reduction results when applying the objective function g(b). The PTS system with parameters as is Section 3.4 is chosen. The SA metaheuristic with K = 256 searches and g(b) with P = 4,8,16 are considered. Solving the original CO problem requires 2NLK multiplications. With P = 4, no multiplications are required to setup the approximate problem and only 0.6 dB in P A R reduction at ( ? 7 ( 7 o ) = 1 0 - 3 are lost. With P = 8, 64iVX multiplications are required but P A R reduction is still improved compared to optimizing f(b) in 64 searches, which involves 128NL, i.e. dou-ble the number of, multiplications. With P = 16, the approximation yields essentially the same performance as the original objective function, while complexity in terms of multiplications is greatly reduced. Chapter 5. Further Improvements for PTS and Trellis Shaping 67 5.3 SI Embedding for PTS One drawback of PTS is its need to transfer side information (SI) to the receiving end. Inspired by the trellis shaping structure, in this section we propose a new SI embed-ding/detecting scheme for PTS. The scheme requires only a minimal additional band-width and complexity and practically does not degrade the overall performance of the O F D M system. 5.3.1 Review of Existing SI Embedding and Detecting Schemes Various solutions to the transmission and detection of the SI b in PTS have been pro-posed in the literature. They solve the problem at some or all of the expenses of peak regrowth, extra bandwidth (or equivalently, redundancy), complexity, and O F D M sys-tem's performance. In explicit SI schemes [44, 45], a number of subcarriers are reserved for SI transmission. The receiver will use the SI to invert the rotations before detecting the data. Since erroneous SI may cause a complete loss of a whole subblock, the SI usually need to be protected by error control codes, which increase the required redundancy/bandwidth. If the peak reduction optimization is done only on the data-carrying signal, peak regrowth will occur after the Si-carrying signal is added. If the optimization is done on the sum of both the signals, the computational complexity will increase, e.g. [44]. In [45], SI is transmitted the next O F D M symbol to avoid peak regrowth and additional complexity for the optimization. This solution however introduces an extra detection delay of one O F D M symbol. Furthermore, a dummy O F D M symbol is needed to carry the last SI. In implicit SI schemes [21, 46-48], data detection does not require explicit SI. Since all carriers in a subblock is rotated by the same phase, [21] suggests employing differentially encoded modulation to eliminate the need of SI at the receiver. This scheme requires one reference subcarrier per subblock and inherits all possible disadvantages of differential Chapter 5. Further Improvements for PTS and Trellis Shaping 68 modulation. In [46], a marking algorithm is used to indicate different rotations. This scheme applies to PSK only and suffers from peak regrowth as well as increased complex-ity in the receiver. Some modifications have been proposed in [47] to make the scheme applicable to Q A M constellations, but the peak regrowth and high complexity issues remain. In a special form of PTS [48], subblocks are rotated by phase vectors from a special code book. A maximum likelihood detector at the receiving side will estimate the rotation factors, hence no SI transmissions are needed. This scheme however inherently introduces enormous additional complexity to the receiver. 5.3.2 Novel SI Embedding Scheme A n ideal SI embedding scheme should have the following characteristics: • Does not complicate the P A R reduction algorithms; • Does not cause peak regrowth; • Practically does not degrade system performance; • Requires minimal additional redundancy/bandwidth and complexity; and • Is flexible enough to apply to any configurations, e.g. any constellation sizes and subblock partitions. We propose a scheme based on trellis shaping that has all of the above characteristics. It is the structure of trellis shaping applied to individual PTS subblocks and repetition block codes are used instead of convolutional codes. The diagram of the scheme for one block is presented in Fig. 5.9 (cf. Fig 4.2). Similarly to trellis shaping, the SI scheme differentiates one MSB and (m — 1) LSB for every m bits corresponding to a constellation symbol. The symmetric constellation labelling described in Section 4.1 is required. A 180° rotation of a subcarrier is equivalent to inverting its corresponding MSB (from 1 to 0 or vice versa). This is also equivalent to Chapter 5. Further Improvements for PTS and Trellis Shaping 69 (a) Transmitter f r o m P T S a l g o r i t h m H ,(v) M S B data (b) Receiver L S B De-mapping 1 . / ( „ • ) ^ Hr j M S B L S B Figure 5.9: Block diagram of SI embedding scheme for one subblock: (a) transmitter, (b) receiver. (modulo-2) adding bit 1 to the MSB. A 180° rotation of the whole PTS subblock of nv subcarriers is therefore equivalent to adding = l n „ to the MSBs of the subblock. Side information is implicitly embedded into the data-carrying signal means that the receiver is able to detect data without having to first determine the rotations performed at the receiver. In Fig. 5.9, let and z'^ be the vectors of, respectively, the nv unshaped and shaped MSBs of the non-zero components in 2 X^v\ The PTS P A R reduction can be formulated as (v) (5.6) where determines the subblock rotations, i.e. (see PTS description, Section 3.1) 0Uv if 6V = 0 c(v) = if bv = 1 (5.7) 2Recall that is an iV-point vector which is non-zero (and equal to respective components of X) at subcarriers that belong to the v-th subblock. Chapter 5. Further Improvements for PTS and Trellis Shaping 70 We note that l n „ and 0 n „ are the two codewords of an (nv, l)-repetition block code. Hence, any (nv — 1) x nv parity-check matrix H of the repetition code satisfies (5.8) In Fig. 5.9, H~T is the left-inverse of HT. The P A R reduction operation (5.6) can be made transparent for data transmission (i.e. the SI is embedded into the O F D M signal) by obtaining the nv unshaped MSBs from z(v) = s(v)H-T at the transmitter and applying the "inverse" operation at the receiver. We note the following: (5.9) (5.10) • The SI embedding is done per subblock, independently of all other subblocks; • The minimum of one redundancy bit for a subblock is used, introduced by (5.9); • The computational complexity for SI embedding and detection is extremely low as all the multiplications in (5.9) and (5.10) are in Galois field GF(2). There are usually many parity check matrices for a given repetition code. In fact, any (nv — 1) x nv matrix H with nv — 1 independent rows that satisfies InM1 = 0 nv — 1 (5.11) is a valid parity check matrix of the code. For any such matrix, there exists at least one left inverse H~r. For a special selection of the matrices and H Inv-1 ^•nv — 1 0 n t ) _ i (5.12) (5.13) Chapter 5. Further Improvements for PTS and Trellis Shaping 71 (a) t data Insert marking bit' [a™ 0] X(v) Mapping LSB (b) De-mapping If marking bit 1 invert all bits LSB Figure 5.10: A special case of the SI embedding scheme, (a) transmitter, (b) receiver. the scheme can be simplified as in Fig. 5.10. In this case, = [s^ 0] where the last bit can be considered as the explicit marking bit. At the receiving end, if the marking bit is 1, the receiver assumes that the PTS algorithm has inverted the subblock and flips the MSBs accordingly. This special case appears to be exactly the same as other explicit SI schemes mentioned in Section 5.3.1. However, there are two fundamental differences: firstly, our scheme does not cause peak regrowth, and secondly, due to the special labelling and differentiation of M S B and LSB, the SI scheme only affect MSBs, which accounts for just an 1/m portion of the data. The relative ratio of affected data becomes smaller for larger constellations. In the next section, we will analyze the effect of the scheme on the performance of the O F D M system. Chapter 5. Further Improvements for PTS and Trellis Shaping 72 5.3.3 Performance Analysis We note that the distribution of errors in in case of erroneous reception depends on H. For example, consider the two parity-check matrices H\ and H2 such that Inv-l HTX = l n „ — i and Inv-1 e On„-l 0 n „ - i Inv — 1 (5.14) (5.15) Whereas burst errors occur if Hi is applied, double-adjacent errors result for H2. In case of coded transmission, we may therefore consider separate error control coding for LSBs and MSBs, and the code protecting should take the controlled error patterns into account. Let BER f e be the bit-error rate (BER) after demapping for sufficiently large signal-to-noise ratio (SNR) 7^ in subcarrier k. In the case of uncoded transmission and 2m-ary Q A M transmission and, for the (m- 1) LSBs per Q A M symbol, we find [6, Section 5.2.9] 4 ^ - 4 B E R f c ) L s B M(m- 1) Q M-l Ik where Q(x) = 1/27T '•^dt (5.16) (5.17) is the complementary error function of Gaussian statistics. The error rate for the data bits representing MSBs is well approximated by 2 2nv - 2 BER f c ] M SB Q -ik (5.18) 'M nv ' \ V M - 1 for both parity-check matrices Hx and H2- Hence, assuming nv » 1, an approximation for the overall error rate is given by BER f c = - ((m - l ) B E R f e , L S B + ^ B E R f c , M S B j — Q m (5.19) M-l Ik Chapter 5. Further Improvements for PTS and Trellis Shaping 73 We note that this approximation is independent of nv and identical to that for uncoded Gray labelled M Q A M transmission without shaping for large 7fc, i.e., asymptotically error rate is not affected by the proposed SI embedding scheme. Finally, considering Rayleigh fading with D-fold diversity either through multiple receive branches or spreading across subcarriers for time-dispersive channels, from (5.19) and [6, Section 14.4.1] the average B E R is approximated by BER~^°f(D-1+i)(l-pY, (5.20) i=0 where P = U l - ^ . ) C (5.21) (5.22) 2 V V a+7 2 ( M - 1) a = 3 and „ = E ^ } = ( 5 2 3 ) with Es and J\fo denoting the average received energy per symbol and baseband noise power spectral density, respectively. Fig. 5.11 shows the B E R performances for O F D M with (a) conventional Gray labelled 16-QAM (i.e., either perfect explicit SI is available or P A R reduction is not applied) and (b) 16-QAM with sign-bit mapping and the novel SI embedding scheme. Unfaded additive white Gaussian noise (AWGN) and fading A W G N channels with D-fold diversity are considered, and simulation and numerical results using Eqns. (5.19) and (5.20) are plotted. As can be seen, (a) numerical and simulation results match very well and (b) the loss in B E R performance compared to O F D M without SI embedding is very small (about a factor 1.2 in BER) . This, together with the other features outlined in Section 5.3.2, renders the new SI embedding scheme an interesting solution for O F D M systems employing PTS-based P A R reduction. Chapter 5. Further Improvements for PTS and Trellis Shaping 74 Figure 5.11: B E R vs. 101og 1 0 (£ s /A/b) for O F D M with and without SI embedding. A W G N and Rayleigh fading with D-fold diversity reception. 5.4 Conclusion In this chapter, we propose three improvements for PTS and/or trellis shaping. The simulation results in Section 5.1 show that (a) CO heuristics in trellis shaping attain similar tradeoffs as in PTS, and (b) VA and ST-SDA in trellis shaping perform almost as good as CO metaheuristics. The comparable performances imply that trellis shaping and PTS have their own merits and both of them are good P A R reduction techniques. Our proposal of the new optimization objective is beneficial to all M S R search algo-rithms what involve repeated peak-power searches. While some overhead is required, the total number of multiplications, hence the overall complexity, can be reduced at no cost in P A R reduction performance. ' . The trellis shaping structure has inspired us to propose a new SI embedding scheme for PTS. Ours has all desirable characteristics of an ideal SI scheme. Chapter 6 75 Conclusion The high peak power problem is one of the major drawbacks of O F D M , which sometimes, in the extreme, outweighs all of its advantages. The problem has attracted a great deal of research, however, so far no all-satisfactory solutions exist. From our literature review, M S R appears to be the most promising among the existing P A R reduction approaches. In particular, PTS is the most popular while trellis shaping is the newest MSR technique. In this thesis, we have improved both of them. Many suboptimal search algorithms for PTS are available in the literature with fa-vorable claims about their performance-complexity tradeoffs. It is important to evaluate their usefulness and propose better algorithms. In Chapter 3, we reformulated PTS as a CO problem and showed that the complexity of any PTS search algorithm is strongly proportional to the number of peak-power searches. Hence, a fair benchmark for tradeoff evaluation would be performance-vs.-number-of-searches. The CO reformulation brings about a rich set of CO heuristics to apply to PTS. We considered the application of SA and TS, which are the two most popular CO metaheuristics, to PTS. Under our benchmark, surprisingly, the simplest algorithm RS actually outperforms all available PTS search algorithms. Only with our versions of the SA and TS metaheuristics, better tradeoffs are attained in the low-PAR region. Trellis shaping is the most recent M S R technique and is interesting because no explicit SI is required. In Chapter 4, we proposed the application of a new metric (Metric 2) and the use of sequential decoding algorithm (ST-SDA) in trellis shaping. We also provided an extensive comparison between our proposals and existing works. We showed that our proposals increase P A R reduction performance and/or reduce the complexity of trellis Chapter 6. Conclusion 76 shaping. In Chapter 5 we presented three further improvements for PTS and trellis shaping. Firstly, code sequence search in trellis shaping was reformulated as a CO problem. This allowed us to apply CO heuristics in trellis shaping and provided further insightful eval-uation of the decoding algorithms in Chapter 4. Simulation results showed that they indeed have comparable tradeoff as the SA and TS metaheuristics. This comparison confirms the merit of trellis shaping as a competitive M S R technique. Secondly, we pro-posed the application of a new objective function for CO problems. When the number of searches is large, this modification dramatically reduces the number of multiplications required while keeping the P A R reduction performance unchanged. Finally, based on the structure of trellis shaping, we proposed an SI embedding scheme for PTS, which requires only minimal additional bandwidth and complexity. Our scheme is better than existing solutions because it does not affect the performance or complexity of PTS as well as practically does not increase the BER. To this end, we conclude that both PTS and trellis shaping are competitive P A R reduction techniques. We suggest the following for future work: • Trying other algorithms from the CO literature to both PTS and trellis shaping CO problems. Especially, dynamic metaheuristics (e.g. [49, 50]) where the algorithm's parameters are adaptively changed as per instance of the CO problem (per O F D M data) may yield better tradeoffs. • Examining the errors patterns in MSBs when using PTS with our SI scheme. In coded O F D M , structured error patterns may be exploited to improve the B E R performance of the system. 77 Bibliography [1] R. W. Chang, "Orthogonal Frequency Division Multiplexing," U.S. Patent 3 488 445, Jan. 6, 1970. [2] J. A . C. Bingham, "Multicarrier modulation for data transmission: an idea whose time has come," IEEE Commun. Mag., vol. 28, no. 5, pp. 5-14, May 1990. [3] S. van Nee and R. Prasad, OFDM for Wireless Multimedia Communications. Boston: Artech House, 2000. [4] S. Weinstein and P. Ebert, "Data transmission by frequency-division multiplexing using the discrete fourier transform," IEEE Trans. Commun., vol. 19, no. 5, pp. 628-634, Oct. 1971. [5] V . Aue and G. P. Fettweis, "High-level multi-carrier modulation and its implemen-tation," in Proc. of Veh. Technol. Conf. (VTC), Apr. 1996, pp. 914-917. [6] J. G. Proakis, Digital Communications, 4th ed. New York: McGraw-Hill , 2001. [7] E. Biglieri, J. Proakis, and S. Shamai, "Fading channels: information-theoretic and communications aspects," IEEE Trans. Inform. Theory, vol. 44, no. 6, pp. 2619-2692, Oct. 1998. [8] T. S. Rappaport, Wireless Communications. New Jersey: Prentice Hall, 1996. [9] C. Tellambura, "Computation of the continuous-time P A R of an O F D M signal with B P S K subcarriers," IEEE Commun. Lett, vol. 5, no. 5, pp. 185-187, May 2001. Bibliography 78 [10] A . Papoulis, Probability, Random Variables, and Stochastic Processes, 3rd ed. New York: McGraw-Hil l , 1991. [11] C. Rapp, "Effects of HPA-nonlinearity on a 4 - D P S K / O F D M signal for a digital sound broadcasting system," in Proc. 2nd European Conf. on Satellite Comm., Oct. 1991, pp. 179-184. [12] A . N . D'Andrea, V . Lottici, and R. Reggiannini, "RF power amplifier linearization through amplitude and phase predistortion," IEEE Trans. Commun., vol. 44, no. 11, pp. 1477-1484, Nov. 1996. [13] R. O'Neil and L. B. Lopes, "Envelope variations and spectral splatter in clipped multicarrier signals," in IEEE PIMRC'95, Sept. 1995, pp. 71-75. [14] X . L i and L. J. Cimini, Jr, "Effect of clipping and filtering on the performance of ofdm," IEEE Commun. Lett, vol. 2, no. 5, pp. 131-133, May 1998. [15] J. Armstrong, "Peak-to-average power reduction for O F D M by repeated clipping and frequency domain filtering," Electron. Lett, vol. 38, no. 8, pp. 246-247, Feb. 2002. [16] J . Tellado, Multicarrier Modulation with Low PAR: Applications to xDSL and Broad-band Wireless. Springer, 2000. [17] B. S. Krongold and D. L. Jones, "A new tone reservation method for complex-baseband P A R reduction in O F D M systems," in IEEE ICASSP'02, vol. 3, May 2002, pp. 2321-2324. [18] , "PAR reduction in O F D M via active constellation extension," IEEE Trans. Broadcast, vol. 49, no. 3, pp. 258-268, Sept. 2003. [19] , "An active-set approach for O F D M P A R reduction via tone reservation," IEEE Trans. Signal Processing, vol. 52, no. 2, pp. 495-509, Feb. 2004. Bibliography 79 [20] S. H. Miiller, R. W. Bauml, R. F. H. Fischer, and J. B. Ruber, " O F D M with reduced peak-to-average power ratio by multiple signal representation," Annals of Telecommunications, vol. 52, no. 1-2, pp. 58-67, Feb. 1997. [21] S. H. Miiller-Weinfurtner, " O F D M for Wireless Communications: Nyquist Win-dowing, Peak-Power Reduction, and Synchronization," Ph.D. dissertation, Univ. Erlangen-Niirnberg, Germany,. 2000, ISBN 3-8265-7658-6. [22] W. Henkel and B. Wagner, "Another application for trellis shaping: P A R reduction for D M T (OFDM)," IEEE Trans. Commun., vol. 48, no. 9, pp. 1471-1476, Sept. 2000. [23] H. Ochiai, "A novel trellis-shaping design with both peak and average power reduc-tion for O F D M systems," IEEE Trans. Commun., vol. 52, no. 11, pp. 1916-1926, Nov. 2004. [24] H. Breiling, S. H. Miiller-Weinfurtner, and J. B. Huber, " S L M peak-power reduction without explicit side information," IEEE Commun. Lett., vol. 5, no. 6, pp. 239-241, June 2001. [25] J. G. D. Forney, "Trellis shaping," IEEE Trans. Inform. Theory, vol. 38, no. 2, pp. 281-300, Mar. 1992. [26] C. H. Papadimitriou and K. Steiglitz, Combinatorial Optimization: Algorithms and Complexity. New York: Dover Publications, 1998. [27] L. J. Cimini, Jr and N . R. Sollenberger, "Peak-to-average power ratio reduction of an O F D M signal using partial transmit sequences," IEEE Commun. Lett., vol. 4, no. 3, pp. 86-88, Mar. 2000. [28] S. .H. Han and J. H. Lee, "PAPR reduction of O F D M signals using a reduced complexity PTS technique," IEEE Sig. Processing Lett, vol. 11, no. 11, pp. 887-890, Nov. 2004. Bibliography 80 [29] R. G. Parker and R. L. Rardin, Discrete Optimization. Boston: Academic Press, 1998. [30] C. R. Reeves, Ed., Modern Heuristic Techniques for Combinatorial Problems. Lon-don: Blackwell Scientific Publications, 1993. [31] C. Blum and A. Roli, "Metaheuristics in combinatorial optimization: Overview and conceptual comparison," ACM Comput. Surv., vol. 35, no. 3, pp. 268-308, 2003. [32] S. Kirkpatrick, C. D. Gelatt, and M . P. Vecchi, "Optimization by simulated anneal-ing," Science, vol. 220 (4598), pp. 671-680, 1983. [33] P. J. M . Laarhoven and E. H. L. Aarts, Eds., Simulated annealing: theory and applications. Norwell, M A , USA: Kluwer Academic Publishers, 1987. [34] F. Glover and F. Laguna, Tabu Search: Norwell, M A : Kluwer Academic Publishers, 1997. [35] S. Hanafi, "On the convergence of tabu search," Journal of Heuristics, vol. 7, no. 1, pp. 47-58, 2001. [36] S.H. Muller and J.B. Huber, "A novel peak power reduction scheme for O F D M , " in Proc. of Intern. Symp. on Personal, Indoor and Mobile Radio Communications (PIMRC), 1997, pp. 1090-1094. [37] S. Lin and J. D.J . Costello, Error Control Coding, 2nd ed. Upper Saddle River, N.J : Pearson-Prentice Hall, 2004. [38] C. Tellambura, "Upper bound on peak factor of N-multiple carriers," Electron. Lett, vol. 33, pp. 1608-1609, Sept. 1997. [39] , " A coding technique for reducing peak-to-average power ratio in O F D M , " IEEE GLOBECOM 98, vol. 5, pp. 2783-2787, Nov. 1998. Bibliography 81 [40] A . Jayalath and C. Tellambura, "Adaptive PTS approach for reduction of peak-to-average power ratio of O F D M signal," Electron. Lett, vol. 36, no. 14, pp. 1226-1228, July 2000. [41] S. A . Curtis, "The classification of greedy algorithms," Sci. Comput. Program., vol. 49, no. 1-3, pp. 125-157, 2003. [42] A . Vince, "A framework for the greedy algorithm," Discrete Appl. Math., vol. 121, no. 1-3, pp. 247-260, 2002. ' [43] X . Chen and T. Parks, "Design of FIR filters in the complex domain," IEEE Trans. Acoust, Speech, Signal Processing, vol. 35, no. 2, pp. 144-153, Feb. 1987. [44] C.-C. Feng, C. -Y. Wang, C. -Y. Lin, and Y . - H . Hung, "Protection and transmission of side information for peak-to-average power ratio reduction of an O F D M signal using partial transmit sequences," in 58th IEEE VTC 2003-Fall, vol. 4, 2003, pp. 2461-2465. [45] A . D. S. Jayalath and C. Tellambura, "Side information in P A R reduced P T S - O F D M signals," in 14th IEEE PIMRC 2003, vol. 1, 2003, pp. 226-230. [46] L. J. Cimini, Jr and N . R. Sollenberger, "Peak-to-average power ratio reduction of an O F D M signal using partial transmit sequences with embedded side information," in IEEE GLOBECOM'00, vol. 2, 2000, pp. 746-750. [47] C.-C. Feng, Y . - T . Wu, and C.-Y. Chi, "Embedding and detection of side information for peak-to-average power ratio reduction of an O F D M signal using partial transmit sequences," in 58th IEEE VTC 2003-Fall, vol. 2, 2003, pp. 1354-1358. [48] A . D. S. Jayalath and C. Tellambura, " S L M and PTS peak-power reduction of O F D M signals without side information," IEEE Trans. Wireless Commun., vol. 4, pp. 2006-2013, Sept. 2005. Bibliography 82 [49] L. Ingber, "Adaptive simulated annealing (ASA): Lessons learned," Control and Cybernetics, vol. 25, no. 1, pp. 33-54, 1996. [50] K . Varrentrapp, "A practical framework for adaptive metaheuristics," Ph.D. disser-tation, Technische Universitat Darmstadt, Darmstadt, Germany, 2005.
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- PAR reduction in OFDM systems by multiple signal representation
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
PAR reduction in OFDM systems by multiple signal representation Nguyễn, Trung Thành 2006
pdf
Notice for Google Chrome users:
If you are having trouble viewing or searching the PDF with Google Chrome, please download it here instead.
If you are having trouble viewing or searching the PDF with Google Chrome, please download it here instead.
Page Metadata
Item Metadata
Title | PAR reduction in OFDM systems by multiple signal representation |
Creator |
Nguyễn, Trung Thành |
Publisher | University of British Columbia |
Date Issued | 2006 |
Description | As we are embarking on the new information age, demand for high speed wireless communication has increased substantially. One of the promising technologies that could fulfil such demands is orthogonal frequency division multiplexing (OFDM). Due to its bandwidth efficiency and robustness in wireless environments, OFDM has recently gained an increasing popularity. One major drawback of OFDM is its high peak-to-average power ratio (PAR). If not processed, the high peaks in the OFDM signal cause saturation in the power amplifier, which in turn distorts and decreases the power of the transmit signal. In order to avoid the resulting bit-error-rate performance degradation and out-of-band radiation, either expensive linear power amplifiers need to be employed, or nonlinear amplifiers must be operated with high power backoffs in power-inefficient amplification. Power inefficiency is however unacceptable to battery-powered devices. Among the numerous PAR reduction approaches available in the literature, multiple signal representation (MSR) techniques have achieved considerable attention thanks to their great PAR reduction capability and favorable performance-expense tradeoffs. Within the MSR family, partial transmit sequences (PTS) is the most popular while trellis shaping is the newest technique. This thesis makes contributions to both of them. For PTS, we restate its low-PAR signal search as a combinatorial optimization (CO) problem and propose a fair benchmark to evaluate the search algorithms. These allow us to compare the existing PTS algorithms and apply efficient heuristics from the CO literature to PTS. Surprisingly, our results show that random search, the simplest algorithm, actually outperforms all available PTS algorithms. Only our tailored versions of the two CO metaheuristics, namely simulated annealing and tabu search, attain better tradeoffs in the low-PAR region. Trellis shaping is an interesting MSR technique that does not require explicit side information. Our contributions to trellis shaping include a new metric and the use of the stack sequential decoding algorithm in shaping-sequence search. These proposals improve PAR reduction and/or reduce the complexity of the technique. Beneficial to all algorithms that involve a large number of peak-power searches, we propose using a new optimization objective to reduce their overall complexity. This is applicable to both PTS and trellis shaping. Finally, based on the trellis shaping structure, we propose a side information embedding/ detecting scheme for PTS. Our low-complexity scheme does not affect the PAR reduction capability of PTS and practically does not degrade the BER performance of the OFDM systems. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2010-01-08 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0065563 |
URI | http://hdl.handle.net/2429/17797 |
Degree |
Master of Applied Science - MASc |
Program |
Electrical and Computer Engineering |
Affiliation |
Applied Science, Faculty of Electrical and Computer Engineering, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 2006-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |
Download
- Media
- 831-ubc_2006-0278.pdf [ 3.76MB ]
- Metadata
- JSON: 831-1.0065563.json
- JSON-LD: 831-1.0065563-ld.json
- RDF/XML (Pretty): 831-1.0065563-rdf.xml
- RDF/JSON: 831-1.0065563-rdf.json
- Turtle: 831-1.0065563-turtle.txt
- N-Triples: 831-1.0065563-rdf-ntriples.txt
- Original Record: 831-1.0065563-source.json
- Full Text
- 831-1.0065563-fulltext.txt
- Citation
- 831-1.0065563.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
data-media="{[{embed.selectedMedia}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
https://iiif.library.ubc.ca/presentation/dsp.831.1-0065563/manifest