PAR Reduction in OFDM Systems by Multiple Signal Representation by Trung Thanh Nguyen B.E., The University of Sydney, Australia, 2004 A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE in The Faculty of Graduate Studies (Electrical and Computer Engineering) THE UNIVERSITY OF BRITISH COLUMBIA April 2006 © Trung Thanh Nguyen 2006 11 Abstract As we are embarking on the new information age, demand for high speed wireless com munication has increased substantially. One of the promising technologies that could fulfil such demands is orthogonal frequency division multiplexing (OFDM). Due to its bandwidth efficiency and robustness in wireless environments, OFDM has recently gained an increasing popularity. One major drawback of OFDM is its high peak-to-average power ratio (PAR). If not processed, the high peaks in the OFDM signal cause saturation in the power amplifier, which in turn distorts and decreases the power of the transmit signal. In order to avoid the resulting bit-error-rate performance degradation and out-of-band radiation, either expensive linear power amplifiers need to be employed, or nonlinear amplifiers must be operated with high power backoffs in power-inefficient amplification. Power inefficiency is however unacceptable to battery-powered devices. Among the numerous PAR reduction approaches available in the literature, multi ple signal representation (MSR) techniques have achieved considerable attention thanks to their great PAR reduction capability and favorable performance-expense tradeoffs. Within the MSR family, partial transmit sequences (PTS) is the most popular while trellis shaping is the newest technique. This thesis makes contributions to both of them. For PTS, we restate its low-PAR signal search as a combinatorial optimization (CO) problem and propose a fair benchmark to evaluate the search algorithms. These allow us to compare the existing PTS algorithms and apply efficient heuristics from the CO liter ature to PTS. Surprisingly, our results show that random search, the simplest algorithm, actually outperforms all available PTS algorithms. Only our tailored versions of the two Abstract iii CO metaheuristics, namely simulated annealing and tabu search, attain better tradeoffs in the low-PAR region. Trellis shaping is an interesting MSR technique that does not require explicit side information. Our contributions to trellis shaping include a new metric and the use of the stack sequential decoding algorithm in shaping-sequence search. These proposals improve PAR reduction and/or reduce the complexity of the technique. Beneficial to all algorithms that involve a large number of peak-power searches, we propose using a new optimization objective to reduce their overall complexity. This is applicable to both PTS and trellis shaping. Finally, based on the trellis shaping structure, we propose a side information em bedding/detecting scheme for PTS. Our low-complexity scheme does not affect the PAR reduction capability of PTS and practically does not degrade the BER performance of the OFDM systems. iv Contents Abstract ii Contents v List of Tables vii List of Figures viiList of Abbreviations xi Preface xiiAcknowledgements xiv 1 Introduction 1 1.1 Contributions 2 1.2 Organization 3 1.3 Notations 4 2 Background 6 2.1 Orthogonal Frequency Division Multiplexing 6 2.2 The PAR Problem in OFDM 8 2.2.1 PAR of OFDM Signal 9 2.2.2 Effect of Distortion 13 2.3 Review of PAR Reduction Techniques 16 Contents v 2.3.1 Peak Cancellation 18 2.3.2 Multiple Signal Representation 21 3 PAR Reduction by Partial Transmit Sequences 25 3.1 Partial Transmit Sequences 23.2 Review of Descent Heuristics 8 3.2.1 Random Search3.2.2 Bit Flip 23.2.3 Local Search 9 3.3 Metaheuristics for PTS 30 3.3.1 Simulated Annealing 2 3.3.2 Tabu Search 3 3.4 Simulation Results 34 3.5 Conclusion . 6 4 PAR Reduction by Trellis Shaping 38 4.1 Trellis Shaping Scheme :4.2 Selection of Code Sequences 42 4.2.1 Metrics 43 4.2.2 Algorithms 5 4.2.3 Variations Based on the Appended Partial PAR Criterion 47 4.3 Implementation and Complexity 48 4.3.1 Partial and Appended Partial PAR Criteria 44.3.2 Partial Autocorrelation and Partial Autocorrelation Squares Criteria 49 4.4 Simulation Results 50 4.4.1 Comparison of Metrics 54.4.2 VA vs. ST-SDA 4 4.4.3 Variations Based on the Appended Partial PAR Criterion 56 4.5 Conclusion 57 Contents vi 5 Further Improvements for PTS and Trellis Shaping 59 5.1 Application of Combinatorial Optimization Heuristics in Trellis Shaping . 59 5.2 Complexity Reduction by Peak Power Approximation 64 5.3 SI Embedding for PTS 67 5.3.1 Review of Existing SI Embedding and Detecting Schemes 67 5.3.2 Novel SI Embedding Scheme 68 5.3.3 Performance Analysis 72 5.4 Conclusion 74 6 Conclusion 5 Bibliography 77 vii List of Tables 4.1 Shaping codes 51 4.2 Performance of trellis shaping with VA 53 4.3 Performance and Complexity of ST-SDA 4 4.4 Adaptive shaping with VA, Metric 2 57 Vlll List of Figures 2.1 Power spectral density of an OFDM signal with 256 subcarriers 7 2.2 Cyclic prefix in OFDM 8 2.3 Simplified block diagram of the complex equivalent baseband OFDM trans mission scheme 9 2.4 CCDF plots of continuous-time and L-time oversampled discrete-time OFDM signals. N = 256, 16-QAM 11 2.5 CCDF plots of OFDM signals with different constellations 14 2.6 CCDF plots of OFDM signals with different numbers of subcarriers. ... 14 2.7 Amplitude input-output relationship of Rapp distortion model 15 2.8 BER increase due to nonlinear amplification, AWGN channel 17 2.9 BER increase due to nonlinear amplification, fading channel 17 2.10 Out-of-band radiation due to nonlinear distortion. Rapp distortion model with p = 10. N = 256, 16-QAM 18 2.11 Bound on the achievable maximum PAR using multiple signal represen tation. Simulative bound indicates the maximum PAR of OFDM signals after applying an MSR technique with R redundancy bits. The approxi mation is from Eqn. (2.23). N = 256 22 3.1 Block diagram of the PTS technique 6 3.2 Illustration of search moves 31 3.3 Performance of optimal PTS and random search. N = 256, V = 16, pseudo-random partition, binary rotation 35 List of Figures ix 3.4 Performance of SA vs. RS with different number of searches 36 3.5 Average number of searches required to achieve Q7(7o) = 10~3 37 4.1 Constellation shaping with a 16-QAM constellation 39 4.2 Block diagram of trellis shaping system: (a) transmitter, (b) receiver. Note: p(D): LSB sequence; s(D): MSB data sequence; z(D): initial MSB sequence; y(D): shaping sequence; and z'(D): shaped MSB sequence. 40 4.3 Special constellation labelling for 16-QAM 41 4.4 Simplified trellis shaping transmitter 2 4.5 Two codes with the same rate and number of states have very similar PAR reduction performance. The code generators in octal are shown. Viterbi algorithm , Metric 1 52 4.6 Performance of trellis shaping with VA. Metric 1-4, Case 1 from Table 4.1. 52 4.7 Performance of trellis shaping with VA. Metric 1-4, Case 3 from Table 4.1. 53 4.8 Performance of trellis shaping with ST-SDA. Metrics 1 and 2, shaping codes Case 1 and 3 from Table 4.1 55 4.9 Performance of truncated shaping with different number N' of shaped carriers (c = N'/N). VA with Metric 2. Case 1 from Table 4.1 56 4.10 Performance of adaptive shaping. VA with Metric 2. Case 1 from Table 4.1. 58 5.1 Trellis shaping with random search. N = 128, 16-QAM, Case 1 from Table 4.1 61 5.2 Trellis shaping VA and Metric 1 vs. random search (RS) with the same number of peak-power searches. N = 128, 16-QAM. Case 1 from Table 4.1. 61 5.3 Trellis shaping ST-SDA and Metric 1 vs. random search (RS) with the same number of peak-power searches. N = 128, 16-QAM. Case 1 from Table 4.1 62 List of Figures x 5.4 Trellis shaping ST-SDA and Metric 2 vs. random search (RS) with the same number of peak-power searches. N = 128, 16-QAM. Case 1 from Table 4.1 62 5.5 Average number of searches required to achieve Q7(7o) = 10-3. Trellis shaping with N — 128, 16-QAM, Metric 1, shaping code Case 1 from • Table 4.1 63 5.6 Signal in complex plane 64 5.7 Approximation of a circle by a square and an octagon (P = 8) 64 5.8 SA with 256 searches, using approximation of the peak power with P = 4, 8,16. For comparison: SA with 64 searches, and optimal PTS 66 5.9 Block diagram of SI embedding scheme for one subblock: (a) transmitter, (b) receiver 69 5.10 A special case of the SI embedding scheme, (a) transmitter, (b) receiver. 71 5.11 BER vs. 101og10(/5s/A/"0) for OFDM with and without SI embedding. AWGN and Rayleigh fading with D-fold diversity reception 74 List of Abbreviations AWGN Additive White Gaussian Noise BER Bit-Error-Rate BPSK Binary Phase Shift Keying CCDF Complementary Cumulative Distribution Function CO Combinatorial Optimization DFT Discrete Fourier Transform ICI InterCarrier Interference IDFT Inverse Discrete Fourier Transform ISI InterSymbol Interference LSB Less Significant Bit MSB Most Significant Bit MSR Multiple Signal Representation OFDM Orthogonal Frequency Division Multiplexing PAR Peak-to-Average power Ratio PSD Power Spectral Density PSK Phase Shift Keying PTS Partial Transmit Sequences QAM Quadrature Amplitude Modulation RS Random Search SA Simulated Annealing SI Side Information List of Abbreviations xii SNR Signal-to-Noise Ratio ST-SDA Stack-Sequential Decoding Algorithm TS Tabu Search VA Viterbi Algorithm i.i.d. independently and identically distributed pdf probability density function Xlll Preface Parts of the work in this thesis have been published in two papers • T. T. Nguyen and L. Lampe, "Trellis shaping for PAR reduction in OFDM sys tems," Accepted for presentation at IEEE International Conference on Communi cations (ICC '06), Istanbul, June 2006. • T. T. Nguyen and L. Lampe, "On partial transmit sequences to reduce PAR in OFDM systems," Submitted to IEEE Globecom 2006, San Francisco, 2006. xiv Acknowledgements I would like to express my deep gratitude to Dr. L. Lampe for his supervison and guid ance over the course of my degree. He is an excellent supervisor and his enthusiam has been inspirational. I would also like to thank all the professors of ECE UBC for their teaching, advice, and help. I thank all my colleagues in the Communication Theory Group for a friendly and warm atmosphere in our labs. My sincere thanks to the people of Green College for an excellent place to live, en joy life, and excel. My special thanks to all friends who have shared good times and helped me out in all difficult times. My deepest thank to my family for their constant and unconditional love and support. Vancouver, Canada April, 2006 Chapter 1 i Introduction Advancements in wireless communications have brought about a tremendous growth in the telecommunication industry over the last decade. Wireless technologies such as 3G, WiFi, and WiMAX have opened new ways of communication and helped lay the foundations of an information age. At the start of this new era, demands for cheaper, higher data rate, and more mobile wireless technologies are greater than ever before. The characteristics of the wireless channels impose immense challenges to achieve high-rate transmission. Beside attenuation and additive noise, signals in wireless envi ronments suffer from multipath fading and time variation. Orthogonal frequency division multiplexing (OFDM) is one of the promissing technologies that would offer high data rates over such environments. Its robustness to multipath propagation and bandwidth-efficiency render OFDM a very attractive option for future systems. In fact, OFDM has been applied to many existing wireline and wireless standards. As a multicarrier technology, OFDM however suffers from one major drawback com mon to this family, which is the large dynamic range of the transmit signal. High peaks in multicarrier systems occur when signals in subcarriers are constructively added. A large peak-to-average power ratio (PAR) poses special challenges for power amplification in the transmitter. Excessively high input causes saturation in amplifiers, which distorts and reduces the power of the output. As a result, either expensive linear power am plifiers need to be employed, or nonlinear amplifiers much be operated with very high power backoffs to avoid significant bit-error-rate (BER) performance degradation and out-of-band radiation. Operating with high power backoffs, in turn, leads to inefficient power amplification, which may be critical for devices with battery power supply. Chapter 1. Introduction 2 In general, PAR reduction in OFDM systems can be achieved at the expenses of increasing complexity, transmit power,. BER, out-of-band radiation, or decreasing data transmission rate. Numerous approaches have been proposed in the literature to alleviate the PAR problem. Each of the approaches attains the PAR reduction at, to various degrees, some or all of the mentioned expenses. Unfortunately, no specific approaches have been considered the best solution for all OFDM systems. Therefore, research on this issue is currently still very active. Given OFDM's increasing popularity, the aim of this thesis is to investigate and propose new contributions to the research on PAR reduction in OFDM systems. 1.1 Contributions From our literature review, multiple signal representation (MSR) appears to have the most favorable performance-expense tradeoff among the existing PAR reduction ap proaches. Among MSR techniques, partial transmit sequences (PTS) is the most popular while trellis shaping is the newest. In this thesis, we propose improvements to both of them. In particular, the main contributions of this thesis are: • We restate the PTS search as a combinatorial optimization (CO) problem. Firstly, this helps us to gain further insights into PTS and to propose a fair criterion to eval uate PTS search algorithms. Secondly, the restatement enables us to apply a rich set of efficient metaheuristics from the CO literature. We decide to tailor simulated annealing (SA) and tabu search (TS), the two most popular CO metaheuristics, for PTS search. Under our evaluation criterion, simulation results show that, (a) the simplest algorithm random search indeed attains the best tradeoff amongst currently available algorithms, and (b), our versions of SA and TS outperform all other algorithms in the low-PAR region. • We introduce appended partial PAR as a new metric and the use of stack sequential decoding algorithm (ST-SDA) in trellis shaping. These modifications improve PAR Chapter 1. Introduction 3 reduction performance and/or reduce complexity of the technique. We also propose two variations which are referred to as truncated and adaptive shaping. These variations enable flexible performance-expense tradeoffs in trellis shaping. • We also restate the code sequence search in trellis shaping as a CO problem and apply the same algorithms as in the PTS case. This brings about a completely new degree of flexibility for trellis shaping. • We propose the use of an approximation to the peak power of complex OFDM signal as the new optimization objective in MSR techniques. Using this new objective, it is possible to dramatically reduce the required number of multiplications in the low-PAR signal search while attaining the same PAR reduction performance of the techniques. • Finally, based on trellis shaping, we propose a side information embedding/detecting scheme for PTS. Our low-complexity scheme does not affect the PAR reduction ca pacity of PTS and virtually does not degrade the BER performance of the OFDM system. 1.2 Organization The remainder of this thesis is organized as follows: In Chapter 2 we provide background for the work of this thesis. A brief introduction to the principles, characteristics, and applications of OFDM is given. The effect of high peak power on the performance of OFDM systems is described. Following the formal definition of the PAR, we review the PAR reduction literature and explain the connection between ours and the existing works. Chapter 3 restates the PTS search as a CO problem. The available search algorithms and our tailored SA and TS metaheuristics are then described and compared. Chapter 1. Introduction 4 Trellis shaping is studied and improved in Chapter 4. In particular, we describe a new time-domain metric and the use of the stack sequential decoding algorithm in trellis shaping. We then provide both analytical and numerical comparisons between our proposals and existing trellis shaping variations. In Chapter 5 we consider three improvements to both PTS and trellis shaping. We choose to present these important improvements after Chapters 3 and 4 because an ac quaintance of both techniques is helpful to understand the materials. The improvements are (a) the application CO heuristics in trellis shaping, (b) a new optimization objective to reduce the overall complexity of search algorithms, and (c) a novel side information embedding/detecting scheme for PTS. Conclusion and suggestions for future work are given in Chapter 6. 1.3 Notations The following mathematical notations are used in this thesis. x, X, ... Row vectors H,... Matrices 0N All-zero vector of length N IN All-one vector of length N In Identity matrix of size n x n (.)T Transpose E{-} Statistical average o Component-wise multiplication * Complex conjugate II ' 1 l°o Max norm (or norm) Chapter 1. Introduction 5 {•} A set {-,i = 0 ... N-l} A set of N elements with index i runs from 0 to N — 1 Real part of scalars/vectors Imaginary part of scalars/vectors 6 Chapter 2 Background This chapter provides relevant background on PAR reduction in OFDM systems. Sec tion 2.1 gives a brief overview of OFDM and its features and applications. In Section 2.2, we first define the PAR of OFDM signals and then illustrate the effects of high-PAR signals on OFDM systems. Finally, Section 2.3 reviews the PAR reduction techniques available in the literature and shows the relation of our contributions to the existing works.' 2.1 Orthogonal Frequency Division Multiplexing Orthogonal frequency division multiplexing (OFDM) [1-3] is a multicarrier technique suited for high rate transmission over wireless fading channels. Two periodic signals are orthogonal when their inner product equals zero. Orthogonality allows multiple signals to be transmitted over a common physical link and be detected at the receiver without intercarrier interference (ICI). In OFDM, orthogonality is achieved by efficient discrete Fourier transforms which require relatively low hardware and computational complexi ties [4]. Because all the subcarriers1 in OFDM are orthogonal, they can be spaced closely in the frequency domain. When the number of subcarriers is large, OFDM's power spec tral density (PSD) appears nearly rectangular, e.g. Fig 2.1. The rectangular spectrum maximally utilizes the available bandwidth [5] and renders OFDM a very bandwidth-efficient technique. Multipath propagation in fading channels [6, 7] causes overlaps between delayed ver-1 Subcarriers are also called carriers or tones. In this thesis, these terms are used exchangeably. Chapter 2. Background 7 10 r 0 " -10 -X •2°" m Frequency, normalized to bandwidth Figure 2.1: Power spectral density of an OFDM signal with 256 subcarriers. sions of adjacent symbols. In single-carrier systems, a higher symbol rate means a shorter symbol duration and larger relative overlaps, which, in turn, lead to greater intersymbol interference (ISI). Usually, complex equalizers are needed to alleviate ISI [8]. In OFDM, as the high-rate channel is divided into many parallel lower-rate subchannels, symbol durations in subchannels increase and the relative overlaps become small. With the in troduction of the cyclic prefix in every OFDM symbol, ISI can be greatly reduced or even eliminated without the need of equalization. The cyclic prefix is a copy of the last portion appended to the front of the symbol during its guard period, as illustrated in Fig. 2.2. Generally, if every symbol is extended by a guard period that is larger than the multipath channel's delay spread, delayed versions of a symbol cannot interfere with the adjacent ones and ISI is eliminated. If the guard period also consists of signals that preserve the orthogonality of subcarriers, ICI does not occur. The cyclic prefix in OFDM is designed to satisfy both of the conditions. Therefore, it can minimize both ISI and ICI. Chapter 2. Background 8 Copy Cyclic prefix Data symbol Figure 2.2: Cyclic prefix in OFDM. Due to its high data rate, relatively low complexity, robustness to fading, and uti lization of channel bandwidth, OFDM has been applied in various modern wireless and wireline communication systems. Some of the most popular applications of OFDM are Wi-Fi and WiMAX for wireless local and metropolitan area networks, DAB and DVB-T for digital terrestrial radio and television broadcasting in Europe, ADSL for communi cation over telephone lines, and the HomePlug standard for powerline communication. OFDM is also being considered for many upcoming standards, for example, UWB for wireless personal area networks and Mobile WiMAX. 2.2 The PAR Problem in OFDM Common to all multicarrier systems like OFDM is the high peak-to-average power ratio (PAR) problem. High peaks occurs when signals in subcarriers are constructively added. The problem arises when the high-power amplifier (HPA) at the transmitter saturates as the input signal is exceedingly large. Saturation, in turn, distorts and reduces the power of the transmit signal. To avoid distortion, either expensive linear HPAs need to be employed, or the nonlinear HPAs have to operate at the linear region, which may be well below the saturation point. Operating with very high power back-offs, however, leads to power-inefficient amplification and reduction of transmit power. In this section, we first give a formal definition of the PAR and then illustrate the effects of distortion on the OFDM system. Chapter 2. Background 9 (a) Transmitter X IDFT X Cyclic Prefix xe Insert (b) Receiver X DFT X Cyclic Prefix XQ ^ Remove Figure 2.3: Simplified block diagram of the complex equivalent baseband OFDM transmission scheme. 2.2.1 PAR of OFDM Signal A simplified block diagram of the complex equivalent baseband OFDM transmission schemes is presented in Fig. 2.3. The input signals {Xk} in the form of complex PSK or QAM symbols are assigned to OFDM subcarriers with frequencies fk = kAf, 0 < k < N — 1. TV is the total number of subcarriers and Af is the spacing between adjacent subcarriers in the frequency domain. The duration of the OFDM symbol, excluding the cyclic extension, is NT. The relationship between N, T, and Af is 1 NT (2.1) The subcarriers are mutually orthogonal over NT because their inner products equal zeros, i.e., / ej27Tfmt(ej27rfnt)*dt = 0, (2.2) Jo for any m, n 6 {0 ...' N — 1} and TO ^ n. The complex envelope of the OFDM signal (excluding the cyclic extension) is given by i -v 1 • X(T) = -7T?JlX^fkt - 0<£<iVT. (2.3) In OFDM, inverse discrete Fourier transform (IDFT) is used to map signals from the Chapter 2. Background 10 frequency domain into the time domain. Let X = [X0... XJV-I] (2.4) be the vector of N input symbols corresponding to an OFDM block and x = [x0 ... xNL-i] (2.5) be the vector of NL discrete-time samples of the OFDM signal, where L is the oversam-pling factor. The IDFT generates the samples as N-l xn = x(nT/L) = ^Y] Xk^kn/NL , 0 < n < NL . (2.6) v ^ fc=0 We define the notation x = IDFT(X) (2.7) which returns the vector x as defined in (2.5) and (2.6) from the vector X. The PAR of the continuous-time signal x[t) is defined as max {|x(i)|2} 7 {x{t)) = • (2-8) The cyclic prefix is not considered because it does not affect the mathematical PAR definition. In this thesis, we consider the approximation PAR based on the sampled, discrete-time signal x whose PAR is _ 0<*SL{M2} (2.9) MIL E{>n|2} • Due to possible missing of the true peak of the continuous-time signal through sam pling, it is clear that ^(x) < j°(x(t))- The approximation approaches the true value as the oversampling factor L increases. Chapter 2. Background 11 10° i Continuous-1 ime IO"1 L = 1 L — 2'^ L = 4 10 -2 io- 3 9 10 101og10(7o) [dB]—> 11 Figure 2.4: CCDF plots of continuous-time and L-time oversampled discrete-time OFDM sig nals. N = 256, 16-QAM. The complementary cumulative distribution function (CCDF) is often used to show the PAR statistics and to evaluate the performance of PAR reduction techniques. It is the probability that the PAR exceeds some threshold 70 <27(7o) = Pr{7(x) > 70} . (2.10) Fig. 2.4 illustrates the PAR CCDF plots of the continuous-time and discrete-time signals with different oversampling factors. The OFDM system with N = 256 and 16-QAM was chosen for simulation. The plots show that 7(01) well approximates Y(x(t)) with L > 4. We therefore adopt L = 4 for all subsequent simulation results (cf. [9]). When N is sufficiently large, by the central limit theorem [10], x(t) can be modelled as a zero-mean complex Gaussian random variable with variance al = E{|a;(f)|2}. The Chapter 2. Background 12 probability density function (pdf) of x(t) is /i(1)W = _Lexp(_tT) . (2.u) Hence, its amplitude is Rayleigh distributed with the pdf 2u ( u2 \ f\x(t)\(u) = — exp ( -j , u>0. (2.12) ® X V ®x J Any sample xn also has the same distribution as x(t). The probability that the magnitude of a certain sample remains below UQ is Pv{\xn\ < U0} = / f\x(t)\(u)du Jo = l-exP(-^) . (2.13) For Nyquist-rate sampling (i.e. L = 1), the samples {xn} are statistically independent (except for BPSK constellation, because discrete Fourier transform of x has to return a real vector X in the frequency domain), therefore the probability that the maximum magnitude of the samples exceeds UQ is Pr{ max \xn\ > u0} = 1 - Pr{ max \xn\ < UQ) 0<n<N l0<n<N = 1 - Pr{|:ro| < u0} Pr{|xi| <u0} ... Pr{\xN-i\ < u0} _ (-§)"• . (2.14) The CCDF of the discrete-time signal is approximated as Q-ho) = Pr{7(a;)>7o} - ' = Pr{ max |xn|2 > crl^o] = Pr{ max \xn\ > A/^ITO} 0<n<N = . ^(l-e-70)^ . (2.15) We note that the statistical independence assumption is not valid when oversampling is applied (L > 1). The approximation (2.15) is interesting in that it.shows the CCDF Chapter 2. Background 13 function is independent of the PSK/QAM constellation (again, except BPSK). This is confirmed by our simulation results in Fig. 2.5. The plot of the BPSK case is also shown for reference. In subsequent simulations we will assume the 16-QAM constellation, unless specified otherwise. In Fig. 2.6 we shows the CCDF plots of OFDM systems with different number of subcarriers 7Y, from both the approximation (2.15) and simulation. It can be seen that the approximations, derived under the unrealistic assumption that the signal samples are i.i.d. Gaussian distributed, are smaller than the simulation results. However, they agree on the general trend of dependence of the CCDF plots on N. 2.2.2 Effect of Distortion We assume memoryless amplification, i.e. the current output depends only on the current input. In order to study the effects of distortion on the OFDM system, we assume the Rapp distortion model [11] for solid-state power amplifiers. Let x = pe7* be the input and x' be the output signal (continuous or discrete-time) of the distortion model, their relationship is x' = Tpr , (2-16) (1 + (p/A)*P)1/2p where A is the saturation amplitude and p is a parameter which determines the linearity of the amplifier. When p —> oo, the distortion approaches that of the ideal soft limiter whose input-output relationship is if p < A (2.17) Ae7* otherwise The input-output relationship of Rapp model is presented in Fig 2.7. In subsequent simulations in this section we use p = 10, which is appropriate to represent systems that employ linearization mechanisms, e.g. predistortion [12]. The input power backoff B is defined as the ratio between the saturation power and Chapter 2. Background 14 IO :* I i 1 i i i i '. * u \ \\\ \ i 4 5 6 7 8 9 10 11 12 101og10(7o) [dB] —> Figure 2.6: CCDF plots of OFDM signals with different numbers of subcarriers. 16-QAM, L = 4. Approximation from Eqn. (2.15). Chapter 2. Background 15 1 a (X o 0.8 0.6 0.4 0.2 soft limiter P = 10 \ \ P = 3 \v. '" \ \ p \ P = = 2 1 p = 0.5 0 0.5 1 1.5 Input amplitude \x\/A —• Figure 2.7: Amplitude input-output relationship of Rapp distortion model, the average input power, i.e., 5 = 101og10(4) tdB]> (2.18) where o2x = E{|x|2}. A higher power backoff means that the amplifier operates further away from the saturation point. To avoid nonlinear distortion, the HPAs have to operate with high power backoffs in the linear region. However the average output signal level has to decrease accordingly. As a compromise, some nonlinearity has to be introduced into the output OFDM signal. We now examine the performance degradation and out-of-band radiation due to distortion. BER Increase We use simulations to illustrate the effect of distortion on BER. We present the BER as a function of Es/J\fQ where Es denotes the average received energy per PSK/QAM symbol Chapter 2. Background 16 as if no distortion occurs and A/o denotes baseband noise power spectral density. We note that, with distortion, the actual average received energy is less than EA due to output power cut-off (hence the actual signal-to-noise ratio is less than ES/NQ). However, in this case we still present the BER as a function of ES/NQ so that the effect of transmit power reduction is taken into account. The BER increases in AWGN and fading channels are presented in Fig 2.8 and 2.9, respectively. As can be seen in both cases, for a given ES/NQ, distortion can seriously degrade the BER performance. We recall that distortion can be avoided by operating with high power backoffs, but this leads to power inefficiency and transmit power reduction2 which eventually increases the BER. Out-of-Band Radiation The effect of nonlinear amplification on the PSD of the OFDM signal is illustrated in Fig 2.10. As shown, the out-of-band radiation power increases with the distortion. This PSD spread is an important design consideration due to possible governmental restrictions on out-of-band radiation. It can be limited by reducing the transmit power, however, again, at the cost of system performance degradation. 2.3 Review of PAR Reduction Techniques As presented in the previous section, high PAR imposes significant challenges on trans mitter implementation. Due to OFDM's potential, the PAR problem in OFDM systems has drawn great enthusiasms from the research community over the last decade. As a result, a rich set of PAR reduction techniques has been" proposed. The two most practical and popular families of techniques are peak cancellation and multiple signal representa-2Note that when nonlinear HPAs are employed, total transmit power reduction is the combined results of two causes. The first is the down-scaling of input or amplification gain to reduce distortion, and the second is the distortion itself. Chapter 2. Background 17 i r\—4 I I L. . L_ I : I i—: i 1 0 • 5 • 10 15 . 20 25' 30 ' 35 40 mogw(E,/Af0) [dB] —» Figure 2.9: BER increase due to nonlinear amplification. Fading channel, Rapp distortion model with p = 10, N = 256, 16-QAM. Chapter 2. Background 18 10 -10 -20 r CQ -40 -50 -60 B = 5, 10, 15, 20 [dB] 0.5 1 1.5 2 Frequency, normalized to bandwidth —> Figure 2.10: Out-of-band radiation due to nonlinear distortion. Rapp distortion model with p = 10. N = 256, 16-QAM. tion. In this section we give a review of the techniques belong to these families. 2.3.1 Peak Cancellation A common strategy of many techniques is to add a peak-cancellation signal c(t) into the data-carrying signal x(t) in order to eliminate the high peaks before amplification. This strategy is followed in clipping, tone reservation, tone injection, and active constellation extension. Clipping Hard clipping [13] is the most simple PAR reduction technique where the signal's ampli tude is clipped to A whenever it exceeds that threshold (similar to the ideal soft limiter, Chapter 2. Background 19 Eqn. (2.17)). In effect, hard clipping adds a signal c(i) into x(t) which can be specified as In the frequency domain, the spectral content of c(t) is both in-band and out-of-band (overlap or outside of the OFDM spectrum). The in-band content distorts the signal in subchannels and increases the overall data bit-error-rate (BER) of the OFDM transmission. The out-of-band content causes out-of-band radiation, which may have to be filtered out to comply with regulations. However, filtering in the frequency domain may lead to peak regrowth in the time domain. Repeating the clipping and filtering process many times can reduce both peak regrowth and out-of-band radiation, but also increase the complexity [14, 15]. Tone Reservation To avoid out-of-band radiation, c(t) can be designed such that its spectral content is only in-band. In tone reservation, a number of tones are reserved for peak-cancellation purpose (hence the term tone reservation). c(t) is wholly carried by the reserved tones. Therefore, it is orthogonal to and does not increase the BER in all other data-carrying tones [16]. Performance of the tone reservation technique depends on the number of reserved tones as well as the algorithms to design the peak-cancellation signal. Let X and C be the iV-point data and peak-cancellation signals in the frequency domain. X is non-zero only at the data-carrying tones, whereas C is non-zero only at the reserved tones. In the discrete-time domain, the corresponding signals are (2.19) x = IDFT(X) (2.20) and c = IDFT(C) (2.21) Chapter 2. Background 20 The design of c aims at reducing the peak power of (c + x). It can be done in the frequency domain as an optimization problem with the number of complex variables equals to the number of the reserved tones. In [17], this problem has been cast as a linear programming problem to reduce the computational complexity. The advantage of tone reservation is that it does not require any additional compu tation at the receiving side. The receiver can simply ignore the reserved tones. The disadvantage is that the data rate is reduced and the overall transmit power is increased. Furthermore, the number of reserved tone and the complexity to solve the optimization problem at the transmitter can be relatively high. Tone Injection In tone injection technique [16], no tones are reserved solely for peak-cancellation purpose. Instead, the constellations in a number of subcarriers are extended such that more than one points represent the same data. By selecting suitable constellation points, a low-PAR signal can be produced. Let X be the original data vector and (X + C) is the modified data vector after constellation extension. The peak-cancellation signal is c = IDFT(C). Therefore, as in tone reservation, c is also in-band. However, because the peak-cancellation and the data-carrying signals share the same subcarriers, they interfere and this increases the BER. In order to minimize the BER increase, the constellation needs to be extended such that points which represent different data are as far away from each other as possible. In tone injection, no bandwidth is lost for PAR reduction. However, the constellation extension increases the transmit power while potentially decreasing the performance in terms of BER. Furthermore, application of tone injection increases the complexity in both the transmitter and the receiver. Chapter 2. Background 21 Active Constellation Extension Active constellation extension (ACE) [18, 19] is very similar to tone injection except in the way that the constellation is extended. In tone injection, one point may be extended to a number of fixed points. In ACE, only some outer points of the constellation may be dynamically and continuously extended outwards. ACE has the same general advantages and disadvantages as tone injection. In com parison with tone injection, performance degradation in ACE may be lower, however, the PAR reduction capacity also decreases. 2.3.2 Multiple Signal Representation Another common strategy to reduce PAR is multiple signal representation (MSR) [20]. In MSR, redundancy is introduced into the data so that the same data can be represented by multiple OFDM signals and one low-PAR signal is chosen for transmission. Members of this family include selected mapping, partial transmit sequence, and trellis shaping. If R redundancy bits are inserted into the data stream, 2R different OFDM signals can be generated. An ideal MSR technique can generate 2R statistically independent OFDM signals and selects the one with lowest PAR for transmission. In other words, it can reject the (1 — 2~R) portion of all OFDM that has the highest PAR. As a result, the MSR technique limits the PAR of the PAR-reduced signals to 70 specified by Pr{7(x) > 70} = 1 - 2~R . (2.22) Eqn. (2.22) gives the lower bound for the achievable maximum PAR 70 when MSR techniques are used. This bound can be approximated by combining (2.22) and (2.15), which yields 70 = -\n(l-2-R/N) . (2.23) Fig. 2.11 shows the simulative lower bound and its approximation based on (2.23) for the case of N = 256. The simulative bound is obtained from (2.22) and PAR values of Chapter 2. Background 22 3 4 5 6 Redundancy R [bit] —> Figure 2.11: Bound on the achievable maximum PAR using multiple signal representation. Sim ulative bound indicates the maximum PAR of OFDM signals after applying an MSR technique with R redundancy bits. The approximation is from Eqn. (2.23). N = 256. OFDM signals from a simulation with the oversampling factor L = 4. It can be observed that the approximation from (2.23) is lower (hence, less tight) than the simulative bound. This deviation is caused by the unrealistic assumptions about Nyquist-rate sampling and Gaussian distribution when we derived Eqn. (2.15). The simulative bound suggests that MSR can greatly reduce the PAR using only a very small amount of redundancy. For example, applying MSR with only R = 6 redundancy bit in an OFDM system with N = 256 subcarriers, we can limit the PAR of all OFDM symbols to under 7 dB. Furthermore, as all alternative signals can be made as valid OFDM signal, MSR does not increase the overall data BER. These two characteristics (small amount of redundancy and no BER increase) make MSR a very promising strategy for PAR reduction in OFDM systems. Chapter 2. Background 23 A number of MSR have been proposed in the literature such that selected mapping (SLM) and partial transmit sequences (PTS) [20, 21], and recently, trellis shaping [22, 23]. It is important to realize that the bound (2.22) is the PAR limits that some ideal MSR technique would achieve. Practical MSR techniques may not be able to achieve the ideal performance. However very close performances have been observed e.g. cf. [20, Fig. 10]. Selected Mapping In the SLM technique [20, 21], each data symbol Xk can be rotated by a phase before transformation. Let us define U distinct phase vectors 0<"> = [¥iu) •••¥#-!]. l<u<U (2.24) and the corresponding rotation vectors p(u) = ^v<u) _ _ _ ejviyij] ? 1 < u < u . (2.25) The distinct U alternative OFDM signals are generated as x(u) = iDFT(x o pM) . (2.26) Total enumeration of U search steps can be carried out to find the phase vector that results in the lowest PAR. One disadvantage of SLM is its high complexity. Due to the arbitrariness of the phase vectors, each PAR search step involves the rotation and IDFT operations to produce the discrete-time signal x^ (see (2.26)), and then searching for the sample with the largest power. The computational complexity per search step is therefore rather expensive. Another disadvantage of SLM is, in order to detect the data, the receiver needs to know which phase vector have been used for PAR-reduction. This information has to be transmitted as side information (SI), which contains at least R — \og2(U) bits. In [24], an SLM technique without explicit side information was proposed. However, in this SLM variation, the complexity per search step remains high. Chapter 2. Background > 24 Partial Transmit Sequences Partial transmit sequence (PTS) [20, 21] is a variation of SLM which avoids IDFT oper ation in every PAR search. Detailed description of PTS is provided in Chapter 3. The overall complexity of PTS, especially when R is large, is significantly less than that of SLM. Nevertheless, for the same number of search steps, PTS attains an almost equally good performance as SLM cf. [20]. PTS still requires side information to be transmitted to the receiver. PTS has gained much attention as one of the most promissing PAR reduction tech niques. In this thesis we propose some important improvements for PTS. We address both the complexity and SI transmission issues of PTS. The complexity reduction is pro posed in Chapters 3 and 5. As our SI transmission and detection scheme is based on the trellis shaping structure presented in Chapter 4, we defer its description until Chapter 5. Trellis Shaping Trellis shaping is an interesting recent variation of MSR where no explicit SI is required. The concept of trellis shaping was proposed in [25] to reduced the average transmit power in single-carrier systems. The application of trellis shaping to reduce PAR in OFDM systems was first introduced in [22] and developed in [23]. It has been shown in [23] that PTS offers a good performance with moderate complexity. In Chapter 4 we propose improvements in trellis shaping to reduce the computational complexity and/or increase performance. Further possible complexity reduction proposal and a comparison with PTS are provided in Chapter 5. 25 Chapter 3 PAR Reduction by Partial Transmit Sequences In this chapter we propose improvements to reduce the computational complexity of the partial transmit sequences (PTS) technique. In particular, we first reformulate the tech nique as a combinatorial optimization (CO) problem. This allows us to apply efficient search algorithms from the CO literature to PTS. Simulation results show that they in deed offer better computation-complexity tradeoffs than PTS search algorithms currently available in the literature. 3.1 Partial Transmit Sequences In PTS [20, 21], the data symbols in X are partitioned into V disjoint subblocks X^v\ 0 < v < V such that ^ = [^...^1, (3.1) and ,,A Xk, if k-th. subcarrier belongs to v-th. subblock Xiv)={ (3.2I 0, otherwise for 0 < k < N. Therefore, v-i X = ^XM. (3.3) These subblocks are independently transformed to produce partial transmit sequences >) = [x0v) ... *a_i] . '0 < v < V . (3.4) Chapter 3. PAR Reduction by Partial Transmit Sequences 26 X Parition X(0) 1 IDFT IDFT IDFT + x Optimize for phases Figure 3.1: Block diagram of the PTS technique. These sequences are rotated by some phase <pv and then combined to produce the transmit sequence v-i x = xiv)e'Vv . (3.5) By changing the phases <pv, we can generate different OFDM signals from which one low-PAR signal is chosen. The block diagram of PTS is shown in Fig. 3.1. If ipv can be chosen among W values {2ixw/W, w = 0,... ,W — 1}, and the first block needs not rotate (without any performance loss), the number of alternative representa tions for an OFDM symbol is Wv~l. Therefore, the side information (SI) that carries the information about the phases contains R = (V — 1) log2(W) bits. With R fixed, PTS attains the best performance when W = 2 (see. [21, p. 78 and Fig. 5.18]). Hence, in this thesis we choose to consider the case W = 2, i.e. the number of redundancy bits will be R = V — 1, the phases will be 0 or n. We can state the PTS problem as |2 minimize {<Pv} R i;=l subject to: ipv G {0, ir} , 1 < v < R (3.6) Alternatively, let bv = 0 ii> = 0 1 if (/? = 7T (3.7) Chapter 3. PAR Reduction by Partial Transmit Sequences 27 and let us define the phase vector 6 =[&!... by-i] • (3-8) Then Eqn. (3.6) can be reformulated as a zero-one combinatorial optimization (CO) [26] problem minimize f(b) = \\XWIQ b subject to: b G {0,1}R . (3.9) where the LiV-point complex transmit sequence is R R x = ^2x{v) -2j2x(v)bv (3-10) v=Q v—1 and the objective function f(b) is the peak power of x. • In optimal PTS, we choose the optimal solution of (3.9) to produce the transmit sequence. Finding the optimal solution of the CO problem (3.9) requires total enumer ation over all 2R possible phase vectors, which is impractical for large R. It is therefore desirable to apply heuristics (suboptimal algorithms) to reduce the complexity of PTS. In order to solve (3.9), we would have to evaluate f(b) very often. The evaluation can be divided into two steps: calculating the transmit sequence and finding its peak power. By storing complex numbers as pairs of real and imaginary parts and taking advantage of binary coefficients bv, calculating the transmit sequence requires no multiplications. On the other hand, finding the peak power of an LiV-point complex sequence then requires 2LN real multiplications. Assuming that computational complexity is largely credited to multiplications, the evaluation of f(b) is relatively computationally expensive. Hence, we will compare heuristics based on the average number of peak-power evaluations (also referred to as the number of peak-power searches or simply the number of searches) that they require to achieve the same PAR reduction. Chapter 3. PAR Reduction by Partial Transmit Sequences 28 Algorithm 3.1 Random Search Select initial solution b for k = 1 ... K - 1 do Randomly generate a new solution b if /(&) < f(b) then b = b end if end for Return b 3.2 Review of Descent Heuristics A number of PTS heuristics have been proposed in literature: random search and bit flip [27], and local search [28]. All of these heuristics fall into the category of descent heuristics, in which the solution is always updated in the direction of decreasing the objective function /(b) (downhill moves). In this section we give a brief description of these heuristics. Their pseudo-codes are also provided. 3.2.1 Random Search The most simple PTS heuristic is random search. Its pseudo-code is given in Algo rithm 3.1. In each iteration we randomly generate a new solution and update the current solution to the new solution if improvement is observed. The application of this subop-timal algorithm to PTS has been mentioned in [27]. 3.2.2 Bit Flip The bit flip algorithm was proposed in [27]. The algorithm is presented in Algorithm 3.2 with some extension. In each iteration, a new trial solution is derived from the current solution by flipping one of its bit (from 0 to 1 and vice versa). If the trial solution is Chapter 3. PAR Reduction by Partial Transmit Sequences 29 Algorithm 3.2 Bit Flip Select initial solution b, flip position 0 < p < R — 1, time to last update r — 0 for k = 1 ... K - 1 do if r == R then break end if Obtain b by flipping the pth bit in b if f(b) < f(b) then b = 6, r = 0 end if V = (P + (remainder), r = r + 1 end for Return b better, i.e. results in lower value when applied to the objective function, it replaces the current solution. In the proposal in [27], the flipping process starts at the first bit and sequentially continues to the last bit of the phase vector with a total of R +1 peak-power searches. We extend the algorithm in that the flipping can continue in a cyclic fashion, which means that the first bit is flipped again after the last bit was considered and so on, allowing for an arbitrary number of searches. We note that, if no improvements are observed after R consecutive flips, the current solution is a local optimum and continuing the process does not help. Hence, in order to save complexity, we incorporate a detection scheme into the algorithm so that it can be terminated early if this is the case. 3.2.3 Local Search A local search heuristic find the best solution in a neighborhood Af(6) of the current solution b. From the objective function /(&), it is not clear what a mathematically proper definition of the neighborhood would be. A pragmatic choice is to select the Chapter 3. PAR Reduction by Partial Transmit Sequences 30 Algorithm 3.3 Local Search Select initial solution 6, searched list C = 0 for i = 1 ... I do Neighborhood J\f(b), search set A = X{b) \ C if ^4 == 0 then break end if b = argminb€^{/(6)} Searched list £ = C U A end for Return 6 neighborhood as the set of vectors with Hamming distances no larger than r from the current solution, as in [28], or the set of /^-interchange vectors [29]. If the neighborhood is the set of vectors with Hamming distances of r or less, the number of vectors in the neighborhood is can be observed, however, that the neighborhoods in consecutive iterations may overlap. Therefore, in order to reduce the number of searches, we maintain a list of all vectors that have been searched in previous iterations and exclude them from M{b). The pseudo-code of local search is presented in Algorithm 3.3. (3.11) In each iteration, we have to perform as many as 1^(6)1 peak-power searches. It 3.3 Metaheuristics for PTS Descent heuristics usually do not perform well in dealing with hard CO problems. The reason is that the solution is always updated in the direction of improvements (downhill moves) hence it soon converges and is trapped to a local optimum [30]. Chapter 3. PAR Reduction by Partial Transmit Sequences 31 /Q. . . . s L (a local optimum) 0 (global optimum) b —• Figure 3.2: Illustration of search moves. Having stated PTS as a CO problem, many efficient metaheuristics [31] are avail able from the field of CO research such as simulated annealing (SA) [30, 32, 33], tabu search (TS) [30, 34, 35], genetic algorithms, artificial neural networks, particle swarms, ant colony optimization, and greedy randomized adaptive search procedure (GRASP). Unlike descent heuristics, these metaheuristics have mechanisms to accept uphill moves. Such moves are advantageous to avoid being trapped to local optima. This can be illus trated by a simple example in Fig. 3.2. We consider neighborhoods that contain only two adjacent points to the current solution. If only downhill moves are allowed, the quality of the final solution depends entirely on the initial solution. For example, if the initial solution is P or Q, the final solution is L. Likewise, initial solution S results in final solution 0. When uphill moves are allowed, it is possible to reach the better solution 0 even if the algorithm starts with the solution Q. In this section, we consider the application of SA and TS, which are amongst the most popular metaheuristics, to PTS. These general-purpose metaheuristics have been reported to efficiently solve numerous difficult CO problems. Depending on the particular Chapter 3. PAR Reduction by Partial Transmit Sequences 32 Algorithm 3.4 Simulated Annealing Select initial solution b, initial temperature T, cooling coefficient a Best solution 6 = 6 for k = 1 ... K - 1 do Generate neighborhood solution b Calculate 5 = /(b) - /(b) Generate a uniform random number r in (0,1) if 5 < 0 or r < e_<5/T then 6 = b end if if /(b) < /(b) then b = b end if T = aT end for Return b problem, the metaheuristics have to be modified to a suitable form. The pseudo-codes presented here are our tailored versions for PTS. 3.3.1 Simulated Annealing SA [30, 32, 33] is one of the first algorithms that have explicit strategies to escape from local minima. It is deduced from the physical annealing process of solid materials. There has been significant amount of research on the convergence and statistical behavior of the SA algorithm, and we refer interested readers to [33] and the works cited therein. In SA, downhill moves are always accepted, and uphill moves are accepted with some probability p that depends on the height of the hill 5 and a control parameter T (the temperature, as from its original physical model). We select the probability following the Chapter 3. PAR Reduction by Partial Transmit Sequences 33 Boltzmann distribution, i.e. p = e~6'T. (3.12) For the solution to converge, the acceptance probability of uphill moves has to reduce during the search. This is realized by reducing the temperature in the cooling process. We choose the simple geometric cooling schedule Tk = aTn , (3.13) where k indicates the search step and a < 1 is the cooling rate. The pseudo-code of the SA metaheuristic that we use is described in Algorithm 3.4. In this algorithm, we choose to derive the neighborhood solution b from the current solution b by cyclicly flipping one bit (as in bit flip, Algorithm 3.2). . The cooling process is controlled by the combination of the initial temperature and the cooling rate. It is an important consideration in SA. Firstly, if the final solution is to be independent of the initial solution, the initial temperature must be hot enough such that almost any exchange of solutions are allowed. Secondly, the temperature must gradually decrease to zero, but not so fast that uphill moves are freezed early in the search. It normally requires experiments to fine-tune the control parameters. 3.3.2 Tabu Search TS [30, 34, 35] is possibly the most used metaheuristics for CO problems. The algorithm uses information from the search history to escape from local optima and drive the search to regions that might have better solutions. Algorithm 3.5 shows the particular form of the TS metaheuristic that we apply to PTS. It uses a short term memory to avoid both trapping to local optima and moving back and forth between searched solutions. This is implemented in the form of a tabu list C(B) that keeps track of B most recently visited solutions and forbids moving back to them. In each iteration we only search within an acceptance set A of neighborhood solutions that are not on the tabu list. We define the neighborhood M(b) as the set of vectors that have Hamming distance 1 from b, i.e. each Chapter 3. PAR Reduction by Partial Transmit Sequences 34 Algorithm 3.5 Tabu Search Select initial solution 6, initial tabu list C(B) = 0 Best solution 6 = 6 for i — 1 ... I do Acceptance set A = J\f(b) \ C(B) New solution 6 = argminb6^{/(6)} if f(b) < f(b) then 6 = 6 end if Update tabu list C(B) end for Return 6 vector b of the neighborhood is obtained by flipping one bit of 6. After each iteration, the new solution differs from the last solution at exactly one bit position. That position is recorded by the tabu list C and it will not be flipped in the next B iterations. The size of the neighborhood is 1^(6)1 = R. The size of the acceptance test varies from R (at the beginning when the tabu list is empty) to R — B. 3.4 Simulation Results We present the simulation results for an OFDM system with 16-QAM constellation, N = 256 subcarriers which are divided into V — 16 (hence R = V — 1 = 15) subblocks by pseudo-random partition [36]. For metaheuristics, different algorithmic parameters often lead to different performance and require experiments to fine-tune the parameters to achieve good results. We use the following parameters for the SA metaheuristic: initial temperature T = E{|a;n|2}/2, cooling rate a = l—A/K (K is the total number of searches, see Algorithm 3.4). For the TS metaheuristic, we choose the maximum tabu duration to be B = 9 (see Algorithm 3.5). Chapter 3. PAR Reduction by Partial Transmit Sequences 35 Figure 3.3: Performance of optimal PTS and random search. N = 256, V = 16, pseudo-random partition, binary rotation. Fig. 3.3 shows the CCDF of PAR applying random search (RS) with different K (2, 4, 16, 256, 1024, and 4096 searches). Also included are the CCDF for OFDM without PAR reduction and for optimal PTS with K = 2R = 32768 searches. It can be seen that considerable improvements in PAR are achieved with relatively few searches K < 64. Closing the remaining gap of about 1 dB (at Q7(7o) = 10~3) compared to optimum PTS, however,s appears quite difficult to achieve. A fair comparison of different algorithms has to take this fact into account. In Fig. 3.4, it can be observed that the application of the simulated annealing meta heuristic provides some good improvements in the low-PAR range. In particular, SA needs to perform only about one-fourth of number of searches for RS to achieve the same PAR reduction close to that of optimum PTS. In Fig. 3.5, we present the average number of searches vs. 70 that different algorithms Chapter 3. PAR Reduction by Partial Transmit Sequences 36 10Tog10(7o) [dB] — Figure 3.4: Performance of SA vs. RS with different number of searches. require to achieve Q7(7o) = 10-3. The number of searches is averaged because for some algorithms (e.g. bit flip and iterative local search) the number of searches varies with different instance of the CO problem (different OFDM data). Interestingly, it can be seen that descent heuristics proposed in [27, 28] perform worse than random search in all cases. Only the application of SA and TS metaheuristics to solve the CO problem of PTS yields an enhanced tradeoff in the low-PAR range (e.g. at 7 dB), where about four times less searches are required for the same PAR reduction. 3.5 Conclusion In this chapter we reformulate PTS as a CO problem. This helps us to propose a fair benchmark to evaluate PTS search algorithms. The formulation also bring a rich set of efficient metaheuristics from the CO literature to apply in PTS. An important con-Chapter 3. PAR Reduction by Partial Transmit Sequences 37 random search 10° 1 ' 1 1 1 6.5 7 7.5 8 8.5 9 101og10(7o)[dB] — Figure 3.5: Average number of searches required to achieve <37(7o) = 10-3. tribution of this chapter is the finding that the most simple algorithm RS outperforms all descent heuristics that have been proposed for PTS [27, 28]. Only our version of the SA and TS metaheuristics can attain a better tradeoff in the low-PAR region. The numerical results presented in this chapter can also serve as motivation to apply other metaheuristics to the PTS CO problem. 38 Chapter 4 PAR Reduction by Trellis Shaping The concept of trellis shaping was proposed in [25] to reduce the average transmit power in single-carrier systems. The application of trellis shaping to reduce PAR in OFDM systems is developed in [22, 23]. In this chapter we propose further improvements of the techniques. Our proposals include a new metric and the use of sequential decoding in trellis shaping. We also provide extensive performance comparison between the proposed modifications and previous works. In Section 4.1 we show and explain the trellis shaping structure. Metrics and decoding algorithms are described in Section 4.2 where we present the existing and our proposals together to facilitate our comprehensive comparison. The discussion of some efficient implementations and their complexities follows in Section 4.3. Finally, Section 4.4 shows simulation results and Section 4.5 concludes the chapter. 4.1 Trellis Shaping Scheme In this section we introduce the trellis shaping scheme [25] and its application in PAR reduction in OFDM systems [22, 23]. Trellis shaping is a type of constellation shaping, which is best acquainted via an example. Consider a 16-QAM constellation with Gray labelling as in Fig. 4.1. Each symbol in the constellation is labelled by four bits abed. If only two of them carry the data, say cd = 11, while the other two bits are redundancy, we can choose any of the filled symbols in Fig. 4.1 for modulation. The particular choice depends on our objective and possibly other constraints. For example, to reduce the transmit power, we would choose the point that is closest to the origin. We denote ab the most significant bits (MSBs) and cd the less significant bits (LSBs). Starting with some Chapter 4. PAR Reduction by Trellis Shaping 39 0010 O 0011' • ' 0001 O 0000 o 0110 O 0111 • 0101 O 0100 O O 1110 • mi O 1101 O 1100 O 1010 • 1011 O 1001 ol Figure 4.1: Constellation shaping with a 16-QAM constellation initial MSBs, we can drive our selection by raodulo-2 adding binary bits (the shaping bits) to the MSBs. In this example, there are two redundancy bits per QAM symbol, hence any initial MSBs and shaping bits are valid. Trellis shaping is a type of constellation shaping where there is only a fraction of a redundancy bit per QAM symbol. Furthermore, only one bit of the label is the MSB while all other bits are LSBs. The shaping bits come from a code sequence of a rate-l/ns convolutional code called the shaping code. The block diagram of the trellis shaping scheme for PAR reduction in OFDM systems is shown in Fig. 4.2. We consider the shaping code characterized by an (ns — 1) x ns parity check matrix H(D), whose elements are polynomials in D [37]. As illustrated in Fig 4.2, the binary data is divided into two sequences s(D) and p{D), which are respectively the data MSB and the LSB sequence. Given s(D), the initial (or unshaped) MSB sequence z(D) is obtained as z(D) = s{D)H-T(D) (4.1) where H~T(D) is the left inverse of H T(D), i.e, H-T(D)H T(D) = In3(D). (4.2) The shaping sequence y(D) is a valid code sequence of the shaping code. The shaped Chapter 4. PAR Reduction by Trellis Shaping 40 (a) A S(D) Select Code Sequence H~T(D) \z{D) V(D) data Mapping x (b) X De-mapping \z'(D\ * PiD) Figure 4.2: Block diagram of trellis shaping system: (a) transmitter, (b) receiver. Note: p(D): LSB sequence; s(D): MSB data sequence; z(D): initial MSB sequence; y(D): shaping sequence; and z'(D): shaped MSB sequence. MSB sequence is z'(D) = z(D)®y(D). (4.3) The LSB sequence p(D) together with the shaped sequence z'(D) are mapped onto N M-ary QAM symbol {Xi, i — 0 ... N-l}. Since any valid code sequence y(D) of the shaping code satisfies y(D)HT(D) = On,-1(D), (4.4) the data sequence s(D) is recovered at the receiver from z'(D) by z'{D)HT(D) = 8(D) . (4.5) In the following, we define the binary vectors s and p of length N(n3 — l)/na and N(\og2(M) - 1), respectively, as the data in an OFDM symbol. Correspondingly, we denote by y a code sequence of N bits used to shape the OFDM symbol and z' the re sulting shaped MSBs. z' is the MSBs in the labels of N QAM symbols X = [X0 ... XN-i] Chapter 4. PAR Reduction by Trellis Shaping 41 010 on 001 000 O O O O 110 in 101 100 O O O O O o O O 100 101 in 110 O O o O 000 001 011 010 (a) (b) Figure 4.3: Special constellation labelling for 16-QAM, Type-I from [23], (a) MSB, (b) LSB. (multidimensional shaping [22, 23]), whereas p is the LSBs. PAR reduction with trellis shaping is achieved by selecting an appropriate shaping sequence y from the set of code sequences C. The optimum solution is expressed as yopt = argmin{7(x)} . (4.6) It is clear that with the same binary data, multiple OFDM signals can be generated depending on the shaping sequence and one low-PAR signals is chosen for transmission. Furthermore, as the receiver does not need to know which shaping sequence has been used, trellis shaping is indeed an MSR technique with implicit side information. The total redundancy inserted into one OFDM symbol by trellis shaping is N/ns bits. Since the total number of transmitted bits is N\og2(M), the relative redundancy is N/na _ 1 r = (4.7) iVlog2(M) log2[M)ns ' The PAR reduction performance of trellis shaping also depends on the constellation labelling. Simulation results in [23] suggest that a mapping such that two signal points with the same LSBs are symmetric about the origin of the complex plane is advantageous. This mapping is referred to as constellation mapping Type-I in [23] (see Fig. 4.3) and we Chapter 4. PAR Reduction by Trellis Shaping 42 A H'T(D) ] Mapping Select Code b(D) Sequence and Shape U 1 data Figure 4.4: Simplified trellis shaping transmitter limit the discussion from now on to this type of mapping. With this mapping, a shaping bit 0 does not affect whereas a shaping bit 1 simply inverts the corresponding QAM symbol. Hence, the transmitter in Fig. 4.2 (a) can be simplified as in Fig. 4.4. In this simplified scheme, the transmitter needs to perform the mapping only once. Afterward, the code sequence selection algorithm can work directly with the data symbols. It should be noted that with this mapping the average power of the OFDM signal is independent of the shaping sequence y. Since the code book C is of size 2N/n% a brute force search for yopt is infeasible for typical values of N and ns. Instead, computationally efficient search algorithms are needed. 4.2 Selection of Code Sequences In this section, we discuss methods to select the code sequence y for shaping. First, in Section 4.2.1, we present four different metrics numbered 1-4, which can be applied in search (or decoding) algorithms. Metrics 1, 3, and 4 have been used in the literature and Metric 2 is our proposal. We then describe the application of the Viterbi algorithm (VA) and stack (or ZJ) sequential decoding algorithm (ST-SDA) for efficient searches in Section 4.2.2. While the VA is the only decoding algorithm in previous works, the application of the ST-SDA is one of our contributions. Finally, in Section 4.2.3 we propose two variations of trellis shaping that are only possible when our Metric 2 is used. Chapter 4. PAR Reduction by Trellis Shaping 43 As the decoding algorithms in Section 4.2.2 select the code sequences y in a sequential manner, it is convenient to define a partial code sequence of length kns as y[k) = [y0... ykna-i] (4.8) for 1 < k < N/nsl. The partial shaped data sequence corresponding to y^ is denoted by X<*> = [X0 ... X^-i) (4.9) 4.2.1 Metrics Metric 1: Partial PAR The first metric was suggested in [22] for trellis shaping using the VA (see Section 4.2.2). It applies the PAR criterion (4.6) directly to partial code sequences y^k\ More specifically, the metric for t/fc; is given by A1(yW) = ||aJ?)||20> (4.10) where2 x^ = IDFT(X^fe;) is the partial transmit sequence corresponding to the zero-padded shaped data sequence X?> = 0N_kn3] . (4.11) Metric 2: Appended partial PAR (Newly Proposed) We propose a modification of Metric 1 where we always consider all subcarriers at each decoding step. This metric facilitates truncated and adaptive shaping discussed in Sec tion 4.2.3. Let y{d] = [V(k) Vkns • • • Vkns+t-\ OjV-fcns-i] , (4-12) 1In the following, we always assume that 1 < k < N/ns and we omit repeated statement of this for the sake of brevity. 2IDFT(X) gives the LTV-dimensional inverse discrete Fourier transform (IDFT) of the appropriately zero-padded iV-dimensional vector X. Chapter 4. PAR Reduction by Trellis Shaping 44 i.e. a full-length code sequence obtained from by pushing O's into the convolutional encoder, t is the number of possible non-zero output bits, i.e., {\og2{Ns)ns kns< N - \og2(Ns)ns t = N — kns otherwise and Ns is the number of encoder states. Let be the resulting data sequence and ajj0 = IDFT(X^fc)). The metric for y{k) is obtained as A2(y(fc)) = |l4fe)ll~- (4-14) Metric 3: Partial Autocorrelation Another criterion related to the PAR of the time domain signal x(t) considers the aperi odic autocorrelation function N-m-l pm= Y, Xi+rnX*, 0<m<N-l, (4.15) i=0 of the frequency domain data sequence. In particular, it was shown in [38] that the peak power of x(i) is bounded by max \x 0<t<NT (t)\2<^Po + 2^\pm^j . (4.16) It was therefore suggested to minimize the term3 Ylm=\ \Pm\ to reduce the PAR of the OFDM signal x(t) [39]. In order to adopt this criterion for trellis shaping, we define the aperiodic autocorre lation function kns— m—1 p<£> = £ Xi+mX* , 0<m<kn8-l, (4.17) »=o of the partial shaped data sequence X^h\ and we choose as metric for the corresponding code sequence y^ A3(l/(fc))= £ Ipffl- (4.18) m=l 3The value of po does not change if inversions are applied to Xi, which is the case with symmetrical labelling. Chapter 4. PAR Reduction by Trellis Shaping 45 Metric 4: Partial Autocorrelation Squares Ochiai used a similar reasoning as provided for Metric 3 above and considered minimizing the sum of the squares of the autocorrelation function ]Cm=i \Pm\2 for PAR reduction [23, Eqn. (20)]. The metric for code sequence is given by [23, Eq. (22)]. Wfc))=£Vs?|2. (4.19) This Metric 4 has the advantage over Metric 3 that it is additive (see Section 4.2.2). 4.2.2 Algorithms The optimum decoding algorithm selects the code sequence y^71^ of length N from the code book C that minimizes the adopted metric: yoptJ - argmin {W/ns))} , I = {1,2,3,4} . (4.20) In the following, we consider the use of the VA and the stack algorithm to find a good code sequence with feasible computational complexity. We also discuss whether this sequence coincides with J/opt,;-Viterbi Algorithm (VA) Henkel and Wagner [22] considered using the VA with Metric 1 for suboptimal code sequence search. If the two partial code sequences y^ and y^ enter the same trellis state, sequence y[h^ is selected if Ai(y^) < A1(t/2/c^) and vice versa. Since, as noted in [22], Metric 1 is not additive, the VA does not necessarily find the optimum solution 2/optii from (4.20). Similarly, the code sequence selected by the VA with Metric 2 is not necessarily yopt)2. Still, a considerable PAR reduction was observed in [22] and we expect similar results if Metric 2 is applied (cf. Section 4.4). Ochiai [23] suggested the use of the VA and Metric 4. Metric 4 is additive, as the metric update equation A4(y(fc)) = W*"15) + A?Vfc)) (4-21) Chapter 4. PAR Reduction by Trellis Shaping 46 Algorithm 4.1 ST-SDA Algorithm Note: Sorted stack, top node has least metric Load stack with origin node. while Top of stack is not at end of tree do Pop the top node Expand top node (two branches 0 and 1) Compute metrics of these two new nodes Push the two new nodes into the stack end while Output top path with \[k)(yw) > 0 can be formulated [23, Eqn. (28)]. The difficulty inherent to (4.21) is the unlimited memory of the metric increment A4^(?/fc;). In particular, \[h\y^) does not only depend on the code symbols [y(k-i)ns • • • Dkns-i} along the transition path between the two states, but it also depends on the entire previous code sequence (see also [23, Eqn. (28)]). Hence, a tree search algorithm rather than a trellis search algorithm must be used to find the optimum solution i/opt)4 for Metric 4. It is worth noting that the suboptimality of the VA was not mentioned in [23], Metric 3 directly follows from the bound (4.16) of the peak power. But different from Metric 4, it is not additive and was therefore not considered in [23]. As discussed above for Metrics 1 and 2, the VA could be used to find a suboptimum solution. Stack Sequential Decoding Algorithm (ST-SDA) The application of sequential decoding algorithms (SDAs) to trellis shaping for PAR reduction was not considered in [22, 23]. Here, we consider the ST-SDA as one of the most commonly used SDAs. The pseudo-code for the ST-SDA is given in Algorithm 4.1. The ST-SDA searches through the code tree rather than the code trellis with the advantage that the complexity is almost independent of the code memory [37]. Since Metrics 1-3 are not additive, the ST-SDA does not find the code sequence yopt); from (4.20) for Chapter 4. PAR Reduction by Trellis Shaping 47 I G {1,2,3}. However, different from the VA, the ST-SDA finds the optimum solution 2/0Pt,4 for Metric 4. We especially note that decoding using the ST-SDA with Metric 2 can be imple mented more efficiently given the following observation. The ST-SDA maintains a stack, where the top entry corresponds to the path with the smallest metric. At each sequen tial decoding step, this path branches into two paths corresponding to inputs 0 and 1, respectively. The two expanded paths are then pushed back to the stack (see Algorithm 4.1). It is immediately clear from the definition of Metric 2 that the original path and the path expanded with bit 0 have the same accumulated metric because they share the same appended code sequence. Hence, no additional metric update is needed for this expanded path. It also follows immediately that one of the expanded paths will be at the top of the stack. As a result, the stack algorithm always expands the longest path. This fact leads to two consequences: (i) the algorithm stops after N/ns steps and in total only N/ns + 1 metric calculations are needed; and (ii) a stack of size two is sufficient. 4.2.3 Variations Based on the Appended Partial PAR Criterion The "appended partial PAR" Metric 2 in (4.14) is different from the "partial PAR" Metric 1 in (4.10) in that the unshaped portion of the data sequence is included when forming the transmit signal. Using this property we propose two variations of trellis shaping based on Metric 2. These modifications are applicable to code sequence selection with the VA and with the ST-SDA. Truncated Shaping Conventional shaping requires N/ns redundancy bits and the shaping code sequence is A/'-bit long. For truncated shaping, only a subset of N' < N QAM symbols is shaped in order to reduce both the computational complexity and the number of redundancy bits Chapter 4. PAR Reduction by Trellis Shaping 48 (N'/ns). Generally, this truncated shaping can operate on any subset of N' subcarriers and we found that the achieved PAR reduction is insensitive to which subset is chosen. Hence, we assume that the N' symbols {Xi, 0 < i < N' — 1} are shaped. Unshaped data symbols do not carry redundant bits. To correctly recover the data at the receiver, only the MSBs of the N' shaped symbols are applied to the syndrome former. Adaptive Shaping With Metric 2, we always consider a valid code sequence at every step. Therefore, we can stop as soon as the achieved PAR is below the desired value. This is the idea of adaptive PAR reduction [40] which can considerably reduce the complexity of shaping. 4.3 Implementation and Complexity In this section, we discuss the implementation and complexity of trellis shaping with the different algorithms and metrics introduced in the previous section. 4.3.1 Partial and Appended Partial PAR Criteria (fc) (k) The computation of Metrics 1 and 2 requires the partial time-domain vectors x\ and xKd , respectively, and a subsequent peak-amplitude search (see Eqns. (4.10) and (4.14)). In a computationally efficient implementation, each state at stage (k — 1) stores the partial time-domain sequence corresponding to the partial code sequence ending in this state. The initial vectors at k = 0 are x i0) = 0LN and ccj0 = IDFT(A), respectively. The vector update can be formulated as (yi G {0,1} is the zth bit in the shaping sequence) x + v (fc) l e {z,d} (4.22) with kns — 1 kn3+t— 1 (4.23) i=(fc-l)ns i=(fc—l)ns Chapter 4. PAR Reduction by Trellis Shaping 49 and u^IDFTaOiXifV-i-!]) (4.24) respectively. It should be noted that this update relies on the symmetric QAM labelling. Assuming multiplication dominates the complexity, at the beginning of each decoding step, needed vectors Vj are precalculated and temporarily saved. For each of the transition paths, and are updated by adding or subtracting these vectors according to (4.23) . Without taking into account possible complexity savings due to e.g. multiplication with ±1 and ±j or repeated multiplication of Xi with the same complex exponential in (4.24) , nsLN complex multiplications are required per transition step. Hence, for trellis shaping with the VA and Metrics 1 and 2, 2LN2 complex multiplications are required to generate time-domain vectors x\k\ I € {z,d}, (the computations for an initial IDFT for Metric 2 are neglected). Calculating the peak value accounts for 2LN real multiplications per considered time-domain vector. Since in total 2Ns(N/ns — \og2(Ns)) states are visited, the complexity (in terms of multiplications) for the VA is therefore proportional to LN2 and, depending on the ratio Ns/ns, it is dominated by the effort for finding the peak value. In case of the ST-SDA and Metric 1, the complexity is a random variable and depends on the number of states visited. For Metric 2, as discussed in Section 4.2.2, only N/ns +1 time domain vectors are considered, which leads to a reduced complexity compared to the VA. 4.3.2 Partial Autocorrelation and Partial Autocorrelation Squares Criteria A computationally efficient implementation of trellis-shaping with Metric 4 is described for the VA in [23]. It requires about NSN2 / (2ns) complex multiplications. It should be noted, however, that each state at time k has to store all partial autocorrelations p$ for 1 < m < kns — 1 associated with the chosen path. It is further assumed that all possible Chapter 4. PAR Reduction by Trellis Shaping 50 entries for Xi+mX* are tabulated to save computations. Hence, we conclude that the VA with Metric 4 requires by about a factor 2L less multiplications than with Metrics 1 and 2, while memory requirements are considerably higher. In turn, if we consider similar memory use, then the computation of partial autocorrelations pm\ 1 < m < kns — 1, for Metric 4 leads to a considerably increased number of operations compared to Metrics 1 and 2. Similar conclusions apply if the ST-SDA is used and a comparable number of states is visited for the different metrics. The complexity for Metric 3 is somewhat difficult to specify quantitatively since the absolute value operation has to be frequently carried out (see Eq. (4.18)). We expect that due to this operation Metric 3 compares unfavorably with the other metrics in terms of complexity for most processors. 4.4 Simulation Results This section presents the simulation results for trellis shaping. We assume OFDM trans mission with N — 128 subcarriers, 16-QAM constellation, and the symmetrical labelling which was described in. Section 4.1. We select five maximum free distance codes as the shaping codes. The codes are presented in Table 4.1 (\/ns is the code rate, Ns is the number of encoder state, and r is the relative redundancy, see Eqn. (4.7)). We found that in each case, ns and .Ns are important parameters, however the actual numeric values of the code generator has only a small impact on PAR reduction. An example is given in Fig 4.5 where the two shaping codes have the same rate and number of states but different code generators. Their performances are very similar. 4.4.1 Comparison of Metrics Figs. 4.6 and 4.7 depict the CCDF Q7(7o) as a function of 70 for trellis shaping with the VA and Metrics 1-4. Two different convolutional codes are chosen for simulation (Cases 1 and 3 from Table 4.1). As a reference, the CCDF for OFDM without shaping Chapter 4. PAR Reduction by Trellis Shaping 51 Table 4.1: Shaping codes Case ns Ns r Code generator (octal) 1 2 4 1/8 [5 7] 2 4 4 1/16 [3 7 7 7] 3 4 16 1/6 [25 27 33 37] 4 8 8 1/64 [17 17 13 13 13 15 15 17] 5 8 64 1/64 [153 111 165 173 135 135 147 137] is also plotted. We observe that Metrics 1 and 2 show a very similar PAR reduction capability. For both cases, the 0.1% PAR (101og10(70) at C}7(7o) = 10~3) reduced by more than 4.5 dB compared to OFDM without shaping. This was expected as both metrics represent the same optimization criterion. Furthermore, Metrics 3 and 4 compare unfavorably with Metrics 1 and 2 in terms of PAR reduction. For example, a gap of about 1 dB is observed for the 0.1% PAR is observed in Fig 4.6. Interestingly, Metric 4 is slightly superior to Metric 3, although Metric 3 is more closely related to the theoretical PAR bound in (4.16). Table 4.2 shows the results in terms of 101og10(70) at which Q7(7o) = 10-2 (1% PAR) for different convolutional codes. We note that the 1% PAR is 10.1 dB for OFDM without shaping (see Fig. 4.6). It can be seen that Metrics 1 and 2 consistently offer an improved PAR reduction over Metrics 3 and 4. We also observe that, with the same code rate l/nS) increasing the number of states Ns is beneficial for PAR reduction (with the cost of increased complexity). In summary, we conclude that trellis shaping accomplishes significant PAR reduc tion and that Metrics 1 and 2 are clearly superior over Metric 3 and 4. We therefore concentrate on Metrics 1 and 2 in what follows. Chapter 4. PAR Reduction by Trellis Shaping 52 4 5 6 7 8 101og10(7o) [dB] —-Figure 4.5: Two codes with the same rate and number of states have very similar PAR reduction performance. The code generators in octal are shown. Viterbi algorithm , Metric 1. 10° 10" Of 10 -2 10" :• • • :• :• ^\ :• :• • • • OFDM \\{ Metric.2 V--^\Yithout .shaoiuE ... Yv/ Metric ;5 \ • • -: i • AV/• • Metric 4 •:•:•:•:::: :\: ::::::: Metric 1 \ . \\ •: 1 : \\ t • \ \\ i i 1 i IA ,—i i 1 4 5 6 7 8 9 101og10(7o) [dB] —* 10 11 Figure 4.6: Performance of trellis shaping with VA. Metric 1-4, Case 1 from Table 4.1. Chapter 4. PAR Reduction by Trellis Shaping 53 101og10(7o) [dB] —> Figure 4.7: Performance of trellis shaping with VA. Metric 1-4, Case 3 from Table 4.1. Table 4.2: Performance of trellis shaping with VA. 1% PAR [dB] Case (101og10(7o)atQ7(7o) = 10-2) Metric 1 Metric 2 Metric 3 Metric 4 1 6.3 6.3 7.0 6.9 2 6.7 6.6 7.5 7.4 3 6.3 6.4 7.2 7.2 4 6.9 6.8 7.8 7.8 5 6.3 6.4 7.5 7.4 Chapter 4. PAR Reduction by Trellis Shaping 54 Table 4.3: Performance and Complexity of ST-SDA, Case 1% PAR [dB] avg. # of metric calculations VA Metric 1 Metric 2 Metric 1 Metric 2 1 6.6 6.8 174 65 496 2 7.0 7.1 83 33 240 3 7.0 7.2 83 33 896 4 7.5 7.5 40 17 208 5 7.5 7.4 40 17 1280 4.4.2 VA vs. ST-SDA For the ST-SDA with Metric 1, due to zero padding, longer paths accumulate larger metric values. Therefore, we adjust the metric when comparing paths of different lengths by subtracting the expected metric value for the respective lengths. In other words, the actual decision metric is Ai(y(fc))-S{A!(yW)} (4.25) where the expectation E{Ai(y^)} can be determined offline. For example, for each stage k, we can randomly generate 105 partial shaped data sequences X^k\ and average their peak powers to obtain E{Ai{y^)} (the larger the number of samples the better, subject to simulation run time constraint). This needs to be done only once and can be considered as part of the algorithm design. The application of such a bias term is known from sequential decoding of convolutional codes. Also, in order to avoid possible long decoding delay, we limit the maximum stack size to 100. For the ST-SDA with Metric 2, we apply the observation in Section 4.2.2 to the implementation. Table 4.3 shows the 1% PAR for trellis shaping with the ST-SDA and Metrics 1 and 2. A graphical illustration for Cases 1 and 3 is given in Fig. 4.8. In the table, we present the average number of metric calculations, which is the appropriate complexity measure when comparing the ST-SDA with the VA. The respective complexity figures Chapter 4. PAR Reduction by Trellis Shaping 55 Figure 4.8: Performance of trellis shaping with ST-SDA. Metrics 1 and 2, shaping codes Case 1 and 3 from Table 4.1. for the VA are also included (which is 2Ns(N/ns — log2(Na)), as we need to calculate metrics only for parts of the trellis from stage \og2(Ns) to stage N/ns). A comparison of the figures in Tables 4.2 and 4.3 show that (a) the ST-SDA performs considerably fewer searches than the VA, which results in a corresponding gap in PAR reduction capability, (b) Metric 2 achieves a slightly better performance-complexity tradeoff than Metric 1, and (c) the ST-SDA with Metrics 1 and 2 attains a better or comparable 1% PAR as the VA with Metric 4 (proposed in [23]), while computational and/or memory complexity savings are achieved. Hence, the ST-SDA is an interesting alternative for low-complexity trellis shaping. Chapter 4. PAR Reduction by Trellis Shaping 56 Figure 4.9: Performance of truncated shaping with different number N' of shaped carriers (c = N'/N). VA with Metric 2. Case 1 from Table 4.1. 4.4.3 Variations Based on the Appended Partial PAR Criterion We now consider trellis shaping using the VA (similar results are obtained with the ST-SDA) with Metric 2 and the modifications proposed in Section 4.2.3. Fig. 4.9 shows the CCDF of PAR for truncated shaping with redundancies r' = N'/(N log2(M)ns) and different values c = N'/N of shaped subcarriers (see Section 4.2.3). The code parameters are according to Case 1 in Table 4.1. The complexity compared to conventional shaping is reduced by the factor c. From Fig. 4.9 we see that the per formance degradation due to truncation is very moderate for c > 3/4. In particular, for truncated shaping with 25% less multiplications and additions than for full shaping the loss in PAR reduction is about 0.3 dB at 0.1% PAR. It is also interesting to note that Chapter 4. PAR Reduction by Trellis Shaping 57 Table 4.4: Adaptive shaping with VA, Metric 2. Case Threshold [dB] Avg. # of metric calculations 1 6.3 139 2 6.6 65 3 6.4 192 4 6.8 52 5 6.4 274 similar relative performance-complexity tradeoffs were observed in [23] when applying "trellis window truncation" to the VA with Metric 4. Table 4.4 shows the average number of metric calculations needed with adaptive shaping. The threshold for each case is chosen as the 1% PAR for the conventional VA (cf. Table 4.2). The actual performance of adaptive shaping for Case 1 is shown in Fig. 4.10. From Table 4.4 a consistent reduction of 75% in terms of number of metric calculations can be observed. The overall performance of adaptive shaping, as illustrated via the CCDF plots in Fig. 4.10, however is not the same as conventional shaping. 4.5 Conclusion In this chapter, we have studied the application of trellis shaping for PAR reduction in OFDM systems. To this end, we have considered different decoding algorithms and metrics. While trellis shaping using the VA with Metrics 1 (partial PAR) and 4 (partial autocorrelation squares) was suggested in the literature, we propose the application of Metric 2 (appended partial PAR) and the ST-SDA. These modifications help increase the PAR performance and/or reduce the computational complexity of the trellis shaping. Our proposal of truncated and adaptive shaping based on Metric 2 also offers more flexible performance-complexity tradeoffs. The presented comparison of the respective performances and complexities provides valuable guidelines for practical use. Chapter 4. PAR Reduction by Trellis Shaping 58 . . . . 59 Chapter 5 Further Improvements for PTS and Trellis Shaping In this chapter we propose three improvements for PTS and trellis shaping. These propos als are presented in the same chapter because they are either the results of our comparison between the two techniques or applicable to both of them. In Section 5.1 we restate the code sequence search in trellis shaping as a CO problem and apply the heuristics1 in Chapter 3 to solve this problem. In Section 5.2, we use an approximation of the peak power to reduce the overall complexity of search algorithms. Finally, we propose a novel side information embedding scheme for PTS in Section 5.3. This low-complexity scheme is based on trellis shaping and has minimal effect on the overall performance of the OFDM system. Conclusion of this chapter is presented in Section 5.4. 5.1 Application of Combinatorial Optimization Heuristics in Trellis Shaping We first restate the code sequence search in trellis shaping as a CO problem. In trellis shaping, let b be the respective code message of the shaping sequence y, i.e. encoding 6 by the shaping code gives y. The code message b contains N/ns bits. We note that, while the bits in y are not independent, the bits in b can be arbitrary. The relationship between b and y is one-to-one. Therefore, finding a shaping sequence is equivalent to finding its respective code message. As a result, the code sequence search (4.6) can be 1In this chapter, heuristic may indicate any CO [meta]heuristic and decoding algorithm. Chapter 5. Further Improvements for PTS. and Trellis Shaping 60 restated as a CO problem minimize /(b) = H^H2*, b subject to: b e {0,1}R . (5.1) where x is the transmit sequence. From the CO literature's viewpoint, the two decoding algorithms VA and ST-SDA in Chapter 4 can be regarded as greedy heuristics (see e.g. [41, 42]) where the solution b is constructed in a sequential manner. We now consider the application of the heuristics in Chapter 3 to the trellis shaping CO problem (5.1). Fig. 5.1 shows the performance of random search (RS) applied to this problem. RS turns out to perform very well and its performance-complexity tradeoff remains very similar to the PTS case. That is, large PAR reduction is obtained after the first few searches. Then, it becomes increasingly difficult to further reduce the peak power. From this result, we expect that all other CO heuristics would also show similar tradeoffs as in the PTS case (cf. Section 3.4, especially Fig. 3.5). It is worth noting again that the parameters of the SA and TS metaheuristics need to be fine-tuned per different trellis shaping configuration. The good performance of RS, the simplest heuristic, in trellis shaping prompts us to compare it with the two decoding algorithms (VA and ST-SDA) in Chapter 4. We want to see if these more sophisticated algorithms are really worthy (recall Section 3.4 where simulation results show that RS outperforms local search and bit flip). We note that the average computational complexity per search step in trellis shaping is approximately the same for either the decoding algorithms with time-domain metrics or other heuristics (see Sections 3.1, 4.3 and also 5.2 below). Therefore, we will compare algorithms based on their performance vs. number of peak-power searches. Fig. 5.2 shows the performance of trellis shaping with the VA and Metric 1 vs. RS using the same number of searches (which is 496, see Table 4.3). It can be seen that the VA actually performs better than RS. Similar results can be seen in Figs. 5.3 and 5.4 for comparison of the ST-SDA with Metrics 1 and 2 vs. RS. Chapter 5. Further Improvements for PTS and Trellis Shaping 61 10° 10-1 io-2 10-3 i.. 1 1 1 \ . \. A . . V . . .V .V . . . . : urJJM.witnout. •2v4y 16;:64,- 256^ i024l• \ • • •I • searches'":1" r vv' \ ' i i 1 I i I 1 \ i \ i 101og10(7o) [dB] —> Figure 5.1: Trellis shaping with random search. N = 128, 16-QAM, Case 1 from Table 4.1. 101og10(7o) [dB] —> Figure 5.2: Trellis shaping VA and Metric 1 vs. random search (RS) with the same number of peak-power searches. N — 128, 16-QAM. Case 1 from Table 4.1. Chapter 5. Further Improvements for PTS and Trellis Shaping 62 101og10(7o) [dB] — Figure 5.3: Trellis shaping ST-SDA and Metric 1 vs. random search (RS) with the same number of peak-power searches. N = 128, 16-QAM. Case 1 from Table 4.1. 101og10(7o) [dB] — Figure 5.4: Trellis shaping ST-SDA and Metric 2 vs. random search (RS) with the same number of peak-power searches. N = 128, 16-QAM. Case 1 from Table 4.1. Chapter 5. Further Improvements for PTS and Trellis Shaping 63 IO4 <-Or CO B o c3 CO CP -6 CO CO O bb > < 6.5 7 7.5 101og10(7o)[dB] 8.5 Figure 5.5: Average number of searches required to achieve Q7(7o) = 10 3. Trellis shaping with N = 128, 16-QAM, Metric 1, shaping code Case 1 from Table 4.1. Fig. 5.5 shows the average number of searches vs. 70 that the VA, ST-SDA, and RS require to achieve Q7(7o) = 10~3 in trellis shaping (similar to Fig 3.5 for PTS). The trellis shaping configuration is N = 128, 16-QAM, Metric 1, and shaping code Case 1 from Table 4.1. In this particular case, it can be observed that the VA and ST-SDA indeed require less than two times as much the number of searches to attain the same performance as RS (cf. Fig. 3.5 illustrates a corresponding figure of up to four when metaheuristics are applied to PTS). From this result, we conclude that trellis shaping with VA and ST-SDA has comparable performance and tradeoff as PTS with metaheuristics. Chapter 5. Further Improvements for PTS and Trellis Shaping 64 Figure 5.6: Signal in complex plane. Figure 5.7: Approximation of a circle by a square and an octagon (F = 8). 5.2 Complexity Reduction by Peak Power Approximation Repeated evaluation of the objective function /(b) in both trellis shaping (see Eqn. (5.1)) and PTS is computationally expensive. It is therefore desirable to provide an approxi mation of f(b) that is cheaper to compute while retaining the PAR reduction capability of the techniques. The peak power is the square of the radius of the circle that encloses the transmit se quence x in the complex plane, as illustrated in Fig 5.6. Following an approach from [43], the circle can be approximated by a symmetrical polygon of P sides, see e.g. Fig. 5.7. Chapter 5. Further Improvements for PTS and Trellis Shaping 65 The circle's radius is equal to the polygon's side radius which can be estimated by max{5R(x„eJ'2p7r/p),n = 0 ... LN - l,p = 0 ... P- 1} . (5.2) Let us define the function h = G(x, P) where h = [m{x} Q{x} ... Rjx^^-^O S{xJ2*(p'4-»'p}] (5.3) i.e., the function returns an iVXP/2-pomt real-valued sequence h from an iVL-point complex sequences x such that the peak power of the two sequence are approximately equal [Ml * \\x\\l . (5.4) While finding the peak power of x requires 2NL multiplications, finding the peak power of h requires none. Assuming that multiplications are the main source of the computational complexity, it is advantageous to use h instead of x in our CO problems. In PTS, if partial transmit sequences x^ are replaced by = G(x^v\ P) in relevant equations, we end up with a sequence h = G(x, P). Similarly, we obtain the same result in trellis shaping with time-domain metrics if we substitute NL-point complex sequences Vi from (4.24) by hi = G(vi, P) in relevant equations. Finally, the CO problem for both trellis shaping (Eqn. (5.1)) and PTS (Eqn. (3.9)) can be approximated as minimize g(b) = WhW2^ (5.5) subject to: b G {0,1}R . where R = N/ns for trellis shaping and R — V — 1 for PTS. With this formulation all algorithms introduced in the previous chapters still apply. The solution of (5.5) approaches that of the original problems as P increases. On the other hand, complexity also increases with P. For example, in PTS, additional 4NLV(P/4: — 1) multiplications are required to generate vectors but subsequence evaluations of g(b) require no multiplications. In comparison, each evaluation of f(b) requires 2A^L multiplications. Therefore, the new objective function g(b) is preferred when more than 2V(P/4 — 1) searches are performed. Chapter 5. Further Improvements for PTS and Trellis Shaping 66 Figure 5.8: SA with 256 searches, using approximation of the peak power with P = 4, 8,16. For comparison: SA with 64 searches, and optimal PTS. In Fig. 5.8 we present PAR reduction results when applying the objective function g(b). The PTS system with parameters as is Section 3.4 is chosen. The SA metaheuristic with K = 256 searches and g(b) with P = 4,8,16 are considered. Solving the original CO problem requires 2NLK multiplications. With P = 4, no multiplications are required to setup the approximate problem and only 0.6 dB in PAR reduction at (?7(7o) = 10-3 are lost. With P = 8, 64iVX multiplications are required but PAR reduction is still improved compared to optimizing f(b) in 64 searches, which involves 128NL, i.e. dou ble the number of, multiplications. With P = 16, the approximation yields essentially the same performance as the original objective function, while complexity in terms of multiplications is greatly reduced. Chapter 5. Further Improvements for PTS and Trellis Shaping 67 5.3 SI Embedding for PTS One drawback of PTS is its need to transfer side information (SI) to the receiving end. Inspired by the trellis shaping structure, in this section we propose a new SI embed ding/detecting scheme for PTS. The scheme requires only a minimal additional band width and complexity and practically does not degrade the overall performance of the OFDM system. 5.3.1 Review of Existing SI Embedding and Detecting Schemes Various solutions to the transmission and detection of the SI b in PTS have been pro posed in the literature. They solve the problem at some or all of the expenses of peak regrowth, extra bandwidth (or equivalently, redundancy), complexity, and OFDM sys tem's performance. In explicit SI schemes [44, 45], a number of subcarriers are reserved for SI transmission. The receiver will use the SI to invert the rotations before detecting the data. Since erroneous SI may cause a complete loss of a whole subblock, the SI usually need to be protected by error control codes, which increase the required redundancy/bandwidth. If the peak reduction optimization is done only on the data-carrying signal, peak regrowth will occur after the Si-carrying signal is added. If the optimization is done on the sum of both the signals, the computational complexity will increase, e.g. [44]. In [45], SI is transmitted the next OFDM symbol to avoid peak regrowth and additional complexity for the optimization. This solution however introduces an extra detection delay of one OFDM symbol. Furthermore, a dummy OFDM symbol is needed to carry the last SI. In implicit SI schemes [21, 46-48], data detection does not require explicit SI. Since all carriers in a subblock is rotated by the same phase, [21] suggests employing differentially encoded modulation to eliminate the need of SI at the receiver. This scheme requires one reference subcarrier per subblock and inherits all possible disadvantages of differential Chapter 5. Further Improvements for PTS and Trellis Shaping 68 modulation. In [46], a marking algorithm is used to indicate different rotations. This scheme applies to PSK only and suffers from peak regrowth as well as increased complex ity in the receiver. Some modifications have been proposed in [47] to make the scheme applicable to QAM constellations, but the peak regrowth and high complexity issues remain. In a special form of PTS [48], subblocks are rotated by phase vectors from a special code book. A maximum likelihood detector at the receiving side will estimate the rotation factors, hence no SI transmissions are needed. This scheme however inherently introduces enormous additional complexity to the receiver. 5.3.2 Novel SI Embedding Scheme An ideal SI embedding scheme should have the following characteristics: • Does not complicate the PAR reduction algorithms; • Does not cause peak regrowth; • Practically does not degrade system performance; • Requires minimal additional redundancy/bandwidth and complexity; and • Is flexible enough to apply to any configurations, e.g. any constellation sizes and subblock partitions. We propose a scheme based on trellis shaping that has all of the above characteristics. It is the structure of trellis shaping applied to individual PTS subblocks and repetition block codes are used instead of convolutional codes. The diagram of the scheme for one block is presented in Fig. 5.9 (cf. Fig 4.2). Similarly to trellis shaping, the SI scheme differentiates one MSB and (m — 1) LSB for every m bits corresponding to a constellation symbol. The symmetric constellation labelling described in Section 4.1 is required. A 180° rotation of a subcarrier is equivalent to inverting its corresponding MSB (from 1 to 0 or vice versa). This is also equivalent to Chapter 5. Further Improvements for PTS and Trellis Shaping 69 (a) Transmitter from PTS algorithm H ,(v) MSB data (b) Receiver LSB De-mapping 1 ./(„•) ^ Hr j MSB LSB Figure 5.9: Block diagram of SI embedding scheme for one subblock: (a) transmitter, (b) receiver. (modulo-2) adding bit 1 to the MSB. A 180° rotation of the whole PTS subblock of nv subcarriers is therefore equivalent to adding = ln„ to the MSBs of the subblock. Side information is implicitly embedded into the data-carrying signal means that the receiver is able to detect data without having to first determine the rotations performed at the receiver. In Fig. 5.9, let and z'^ be the vectors of, respectively, the nv unshaped and shaped MSBs of the non-zero components in2 X^v\ The PTS PAR reduction can be formulated as (v) (5.6) where determines the subblock rotations, i.e. (see PTS description, Section 3.1) 0Uv if 6V = 0 c(v) = if bv = 1 (5.7) 2Recall that is an iV-point vector which is non-zero (and equal to respective components of X) at subcarriers that belong to the v-th subblock. Chapter 5. Further Improvements for PTS and Trellis Shaping 70 We note that ln„ and 0n„ are the two codewords of an (nv, l)-repetition block code. Hence, any (nv — 1) x nv parity-check matrix H of the repetition code satisfies (5.8) In Fig. 5.9, H~T is the left-inverse of HT. The PAR reduction operation (5.6) can be made transparent for data transmission (i.e. the SI is embedded into the OFDM signal) by obtaining the nv unshaped MSBs from z(v) = s(v)H-T at the transmitter and applying the "inverse" operation at the receiver. We note the following: (5.9) (5.10) • The SI embedding is done per subblock, independently of all other subblocks; • The minimum of one redundancy bit for a subblock is used, introduced by (5.9); • The computational complexity for SI embedding and detection is extremely low as all the multiplications in (5.9) and (5.10) are in Galois field GF(2). There are usually many parity check matrices for a given repetition code. In fact, any (nv — 1) x nv matrix H with nv — 1 independent rows that satisfies InM1 = 0 nv — 1 (5.11) is a valid parity check matrix of the code. For any such matrix, there exists at least one left inverse H~r. For a special selection of the matrices and H Inv-1 ^•nv — 1 0nt) _ i (5.12) (5.13) Chapter 5. Further Improvements for PTS and Trellis Shaping 71 (a) t data Insert marking bit' [a™ 0] X(v) Mapping LSB (b) De-mapping If marking bit 1 invert all bits LSB Figure 5.10: A special case of the SI embedding scheme, (a) transmitter, (b) receiver. the scheme can be simplified as in Fig. 5.10. In this case, = [s^ 0] where the last bit can be considered as the explicit marking bit. At the receiving end, if the marking bit is 1, the receiver assumes that the PTS algorithm has inverted the subblock and flips the MSBs accordingly. This special case appears to be exactly the same as other explicit SI schemes mentioned in Section 5.3.1. However, there are two fundamental differences: firstly, our scheme does not cause peak regrowth, and secondly, due to the special labelling and differentiation of MSB and LSB, the SI scheme only affect MSBs, which accounts for just an 1/m portion of the data. The relative ratio of affected data becomes smaller for larger constellations. In the next section, we will analyze the effect of the scheme on the performance of the OFDM system. Chapter 5. Further Improvements for PTS and Trellis Shaping 72 5.3.3 Performance Analysis We note that the distribution of errors in in case of erroneous reception depends on H. For example, consider the two parity-check matrices H\ and H2 such that Inv-l HTX = ln„ — i and Inv-1 e On„-l 0n„-i Inv — 1 (5.14) (5.15) Whereas burst errors occur if Hi is applied, double-adjacent errors result for H2. In case of coded transmission, we may therefore consider separate error control coding for LSBs and MSBs, and the code protecting should take the controlled error patterns into account. Let BERfe be the bit-error rate (BER) after demapping for sufficiently large signal-to-noise ratio (SNR) 7^ in subcarrier k. In the case of uncoded transmission and 2m-ary QAM transmission and, for the (m- 1) LSBs per QAM symbol, we find [6, Section 5.2.9] 4^-4 BERfc)LsB M(m- 1) Q M-l Ik where Q(x) = 1/27T '•^dt (5.16) (5.17) is the complementary error function of Gaussian statistics. The error rate for the data bits representing MSBs is well approximated by 2 2nv - 2 BERfc]MSB Q -ik (5.18) 'M nv ' \ V M - 1 for both parity-check matrices Hx and H2- Hence, assuming nv » 1, an approximation for the overall error rate is given by BERfc = - ((m - l)BERfe,LSB + ^BERfc,MSB j — Q m (5.19) M-l Ik Chapter 5. Further Improvements for PTS and Trellis Shaping 73 We note that this approximation is independent of nv and identical to that for uncoded Gray labelled MQAM transmission without shaping for large 7fc, i.e., asymptotically error rate is not affected by the proposed SI embedding scheme. Finally, considering Rayleigh fading with D-fold diversity either through multiple receive branches or spreading across subcarriers for time-dispersive channels, from (5.19) and [6, Section 14.4.1] the average BER is approximated by BER~^°f(D-1+i)(l-pY, (5.20) i=0 where P=Ul-^.) C (5.21) (5.22) 2 V V a+7 2(M - 1) a = 3 and „ = E^} = (5 23) with Es and J\fo denoting the average received energy per symbol and baseband noise power spectral density, respectively. Fig. 5.11 shows the BER performances for OFDM with (a) conventional Gray labelled 16-QAM (i.e., either perfect explicit SI is available or PAR reduction is not applied) and (b) 16-QAM with sign-bit mapping and the novel SI embedding scheme. Unfaded additive white Gaussian noise (AWGN) and fading AWGN channels with D-fold diversity are considered, and simulation and numerical results using Eqns. (5.19) and (5.20) are plotted. As can be seen, (a) numerical and simulation results match very well and (b) the loss in BER performance compared to OFDM without SI embedding is very small (about a factor 1.2 in BER). This, together with the other features outlined in Section 5.3.2, renders the new SI embedding scheme an interesting solution for OFDM systems employing PTS-based PAR reduction. Chapter 5. Further Improvements for PTS and Trellis Shaping 74 Figure 5.11: BER vs. 101og10(£s/A/b) for OFDM with and without SI embedding. AWGN and Rayleigh fading with D-fold diversity reception. 5.4 Conclusion In this chapter, we propose three improvements for PTS and/or trellis shaping. The simulation results in Section 5.1 show that (a) CO heuristics in trellis shaping attain similar tradeoffs as in PTS, and (b) VA and ST-SDA in trellis shaping perform almost as good as CO metaheuristics. The comparable performances imply that trellis shaping and PTS have their own merits and both of them are good PAR reduction techniques. Our proposal of the new optimization objective is beneficial to all MSR search algo rithms what involve repeated peak-power searches. While some overhead is required, the total number of multiplications, hence the overall complexity, can be reduced at no cost in PAR reduction performance. ' . The trellis shaping structure has inspired us to propose a new SI embedding scheme for PTS. Ours has all desirable characteristics of an ideal SI scheme. Chapter 6 75 Conclusion The high peak power problem is one of the major drawbacks of OFDM, which sometimes, in the extreme, outweighs all of its advantages. The problem has attracted a great deal of research, however, so far no all-satisfactory solutions exist. From our literature review, MSR appears to be the most promising among the existing PAR reduction approaches. In particular, PTS is the most popular while trellis shaping is the newest MSR technique. In this thesis, we have improved both of them. Many suboptimal search algorithms for PTS are available in the literature with fa vorable claims about their performance-complexity tradeoffs. It is important to evaluate their usefulness and propose better algorithms. In Chapter 3, we reformulated PTS as a CO problem and showed that the complexity of any PTS search algorithm is strongly proportional to the number of peak-power searches. Hence, a fair benchmark for tradeoff evaluation would be performance-vs.-number-of-searches. The CO reformulation brings about a rich set of CO heuristics to apply to PTS. We considered the application of SA and TS, which are the two most popular CO metaheuristics, to PTS. Under our benchmark, surprisingly, the simplest algorithm RS actually outperforms all available PTS search algorithms. Only with our versions of the SA and TS metaheuristics, better tradeoffs are attained in the low-PAR region. Trellis shaping is the most recent MSR technique and is interesting because no explicit SI is required. In Chapter 4, we proposed the application of a new metric (Metric 2) and the use of sequential decoding algorithm (ST-SDA) in trellis shaping. We also provided an extensive comparison between our proposals and existing works. We showed that our proposals increase PAR reduction performance and/or reduce the complexity of trellis Chapter 6. Conclusion 76 shaping. In Chapter 5 we presented three further improvements for PTS and trellis shaping. Firstly, code sequence search in trellis shaping was reformulated as a CO problem. This allowed us to apply CO heuristics in trellis shaping and provided further insightful eval uation of the decoding algorithms in Chapter 4. Simulation results showed that they indeed have comparable tradeoff as the SA and TS metaheuristics. This comparison confirms the merit of trellis shaping as a competitive MSR technique. Secondly, we pro posed the application of a new objective function for CO problems. When the number of searches is large, this modification dramatically reduces the number of multiplications required while keeping the PAR reduction performance unchanged. Finally, based on the structure of trellis shaping, we proposed an SI embedding scheme for PTS, which requires only minimal additional bandwidth and complexity. Our scheme is better than existing solutions because it does not affect the performance or complexity of PTS as well as practically does not increase the BER. To this end, we conclude that both PTS and trellis shaping are competitive PAR reduction techniques. We suggest the following for future work: • Trying other algorithms from the CO literature to both PTS and trellis shaping CO problems. Especially, dynamic metaheuristics (e.g. [49, 50]) where the algorithm's parameters are adaptively changed as per instance of the CO problem (per OFDM data) may yield better tradeoffs. • Examining the errors patterns in MSBs when using PTS with our SI scheme. In coded OFDM, structured error patterns may be exploited to improve the BER performance of the system. 77 Bibliography [1] R. W. Chang, "Orthogonal Frequency Division Multiplexing," U.S. Patent 3 488 445, Jan. 6, 1970. [2] J. A. C. Bingham, "Multicarrier modulation for data transmission: an idea whose time has come," IEEE Commun. Mag., vol. 28, no. 5, pp. 5-14, May 1990. [3] S. van Nee and R. Prasad, OFDM for Wireless Multimedia Communications. Boston: Artech House, 2000. [4] S. Weinstein and P. Ebert, "Data transmission by frequency-division multiplexing using the discrete fourier transform," IEEE Trans. Commun., vol. 19, no. 5, pp. 628-634, Oct. 1971. [5] V. Aue and G. P. Fettweis, "High-level multi-carrier modulation and its implemen tation," in Proc. of Veh. Technol. Conf. (VTC), Apr. 1996, pp. 914-917. [6] J. G. Proakis, Digital Communications, 4th ed. New York: McGraw-Hill, 2001. [7] E. Biglieri, J. Proakis, and S. Shamai, "Fading channels: information-theoretic and communications aspects," IEEE Trans. Inform. Theory, vol. 44, no. 6, pp. 2619-2692, Oct. 1998. [8] T. S. Rappaport, Wireless Communications. New Jersey: Prentice Hall, 1996. [9] C. Tellambura, "Computation of the continuous-time PAR of an OFDM signal with BPSK subcarriers," IEEE Commun. Lett, vol. 5, no. 5, pp. 185-187, May 2001. Bibliography 78 [10] A. Papoulis, Probability, Random Variables, and Stochastic Processes, 3rd ed. New York: McGraw-Hill, 1991. [11] C. Rapp, "Effects of HPA-nonlinearity on a 4-DPSK/OFDM signal for a digital sound broadcasting system," in Proc. 2nd European Conf. on Satellite Comm., Oct. 1991, pp. 179-184. [12] A. N. D'Andrea, V. Lottici, and R. Reggiannini, "RF power amplifier linearization through amplitude and phase predistortion," IEEE Trans. Commun., vol. 44, no. 11, pp. 1477-1484, Nov. 1996. [13] R. O'Neil and L. B. Lopes, "Envelope variations and spectral splatter in clipped multicarrier signals," in IEEE PIMRC'95, Sept. 1995, pp. 71-75. [14] X. Li and L. J. Cimini, Jr, "Effect of clipping and filtering on the performance of ofdm," IEEE Commun. Lett, vol. 2, no. 5, pp. 131-133, May 1998. [15] J. Armstrong, "Peak-to-average power reduction for OFDM by repeated clipping and frequency domain filtering," Electron. Lett, vol. 38, no. 8, pp. 246-247, Feb. 2002. [16] J. Tellado, Multicarrier Modulation with Low PAR: Applications to xDSL and Broad band Wireless. Springer, 2000. [17] B. S. Krongold and D. L. Jones, "A new tone reservation method for complex-baseband PAR reduction in OFDM systems," in IEEE ICASSP'02, vol. 3, May 2002, pp. 2321-2324. [18] , "PAR reduction in OFDM via active constellation extension," IEEE Trans. Broadcast, vol. 49, no. 3, pp. 258-268, Sept. 2003. [19] , "An active-set approach for OFDM PAR reduction via tone reservation," IEEE Trans. Signal Processing, vol. 52, no. 2, pp. 495-509, Feb. 2004. Bibliography 79 [20] S. H. Miiller, R. W. Bauml, R. F. H. Fischer, and J. B. Ruber, "OFDM with reduced peak-to-average power ratio by multiple signal representation," Annals of Telecommunications, vol. 52, no. 1-2, pp. 58-67, Feb. 1997. [21] S. H. Miiller-Weinfurtner, "OFDM for Wireless Communications: Nyquist Win dowing, Peak-Power Reduction, and Synchronization," Ph.D. dissertation, Univ. Erlangen-Niirnberg, Germany,. 2000, ISBN 3-8265-7658-6. [22] W. Henkel and B. Wagner, "Another application for trellis shaping: PAR reduction for DMT (OFDM)," IEEE Trans. Commun., vol. 48, no. 9, pp. 1471-1476, Sept. 2000. [23] H. Ochiai, "A novel trellis-shaping design with both peak and average power reduc tion for OFDM systems," IEEE Trans. Commun., vol. 52, no. 11, pp. 1916-1926, Nov. 2004. [24] H. Breiling, S. H. Miiller-Weinfurtner, and J. B. Huber, "SLM peak-power reduction without explicit side information," IEEE Commun. Lett., vol. 5, no. 6, pp. 239-241, June 2001. [25] J. G. D. Forney, "Trellis shaping," IEEE Trans. Inform. Theory, vol. 38, no. 2, pp. 281-300, Mar. 1992. [26] C. H. Papadimitriou and K. Steiglitz, Combinatorial Optimization: Algorithms and Complexity. New York: Dover Publications, 1998. [27] L. J. Cimini, Jr and N. R. Sollenberger, "Peak-to-average power ratio reduction of an OFDM signal using partial transmit sequences," IEEE Commun. Lett., vol. 4, no. 3, pp. 86-88, Mar. 2000. [28] S. .H. Han and J. H. Lee, "PAPR reduction of OFDM signals using a reduced complexity PTS technique," IEEE Sig. Processing Lett, vol. 11, no. 11, pp. 887-890, Nov. 2004. Bibliography 80 [29] R. G. Parker and R. L. Rardin, Discrete Optimization. Boston: Academic Press, 1998. [30] C. R. Reeves, Ed., Modern Heuristic Techniques for Combinatorial Problems. Lon don: Blackwell Scientific Publications, 1993. [31] C. Blum and A. Roli, "Metaheuristics in combinatorial optimization: Overview and conceptual comparison," ACM Comput. Surv., vol. 35, no. 3, pp. 268-308, 2003. [32] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, "Optimization by simulated anneal ing," Science, vol. 220 (4598), pp. 671-680, 1983. [33] P. J. M. Laarhoven and E. H. L. Aarts, Eds., Simulated annealing: theory and applications. Norwell, MA, USA: Kluwer Academic Publishers, 1987. [34] F. Glover and F. Laguna, Tabu Search: Norwell, MA: Kluwer Academic Publishers, 1997. [35] S. Hanafi, "On the convergence of tabu search," Journal of Heuristics, vol. 7, no. 1, pp. 47-58, 2001. [36] S.H. Muller and J.B. Huber, "A novel peak power reduction scheme for OFDM," in Proc. of Intern. Symp. on Personal, Indoor and Mobile Radio Communications (PIMRC), 1997, pp. 1090-1094. [37] S. Lin and J. D.J. Costello, Error Control Coding, 2nd ed. Upper Saddle River, N.J: Pearson-Prentice Hall, 2004. [38] C. Tellambura, "Upper bound on peak factor of N-multiple carriers," Electron. Lett, vol. 33, pp. 1608-1609, Sept. 1997. [39] , "A coding technique for reducing peak-to-average power ratio in OFDM," IEEE GLOBECOM 98, vol. 5, pp. 2783-2787, Nov. 1998. Bibliography 81 [40] A. Jayalath and C. Tellambura, "Adaptive PTS approach for reduction of peak-to-average power ratio of OFDM signal," Electron. Lett, vol. 36, no. 14, pp. 1226-1228, July 2000. [41] S. A. Curtis, "The classification of greedy algorithms," Sci. Comput. Program., vol. 49, no. 1-3, pp. 125-157, 2003. [42] A. Vince, "A framework for the greedy algorithm," Discrete Appl. Math., vol. 121, no. 1-3, pp. 247-260, 2002. ' [43] X. Chen and T. Parks, "Design of FIR filters in the complex domain," IEEE Trans. Acoust, Speech, Signal Processing, vol. 35, no. 2, pp. 144-153, Feb. 1987. [44] C.-C. Feng, C.-Y. Wang, C.-Y. Lin, and Y.-H. Hung, "Protection and transmission of side information for peak-to-average power ratio reduction of an OFDM signal using partial transmit sequences," in 58th IEEE VTC 2003-Fall, vol. 4, 2003, pp. 2461-2465. [45] A. D. S. Jayalath and C. Tellambura, "Side information in PAR reduced PTS-OFDM signals," in 14th IEEE PIMRC 2003, vol. 1, 2003, pp. 226-230. [46] L. J. Cimini, Jr and N. R. Sollenberger, "Peak-to-average power ratio reduction of an OFDM signal using partial transmit sequences with embedded side information," in IEEE GLOBECOM'00, vol. 2, 2000, pp. 746-750. [47] C.-C. Feng, Y.-T. Wu, and C.-Y. Chi, "Embedding and detection of side information for peak-to-average power ratio reduction of an OFDM signal using partial transmit sequences," in 58th IEEE VTC 2003-Fall, vol. 2, 2003, pp. 1354-1358. [48] A. D. S. Jayalath and C. Tellambura, "SLM and PTS peak-power reduction of OFDM signals without side information," IEEE Trans. Wireless Commun., vol. 4, pp. 2006-2013, Sept. 2005. Bibliography 82 [49] L. Ingber, "Adaptive simulated annealing (ASA): Lessons learned," Control and Cybernetics, vol. 25, no. 1, pp. 33-54, 1996. [50] K. Varrentrapp, "A practical framework for adaptive metaheuristics," Ph.D. disser tation, Technische Universitat Darmstadt, Darmstadt, Germany, 2005.
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- PAR reduction in OFDM systems by multiple signal representation
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
PAR reduction in OFDM systems by multiple signal representation Nguyễn, Trung Thành 2006-01-08
pdf
Page Metadata
Item Metadata
Title | PAR reduction in OFDM systems by multiple signal representation |
Creator |
Nguyễn, Trung Thành |
Date Issued | 2006 |
Description | As we are embarking on the new information age, demand for high speed wireless communication has increased substantially. One of the promising technologies that could fulfil such demands is orthogonal frequency division multiplexing (OFDM). Due to its bandwidth efficiency and robustness in wireless environments, OFDM has recently gained an increasing popularity. One major drawback of OFDM is its high peak-to-average power ratio (PAR). If not processed, the high peaks in the OFDM signal cause saturation in the power amplifier, which in turn distorts and decreases the power of the transmit signal. In order to avoid the resulting bit-error-rate performance degradation and out-of-band radiation, either expensive linear power amplifiers need to be employed, or nonlinear amplifiers must be operated with high power backoffs in power-inefficient amplification. Power inefficiency is however unacceptable to battery-powered devices. Among the numerous PAR reduction approaches available in the literature, multiple signal representation (MSR) techniques have achieved considerable attention thanks to their great PAR reduction capability and favorable performance-expense tradeoffs. Within the MSR family, partial transmit sequences (PTS) is the most popular while trellis shaping is the newest technique. This thesis makes contributions to both of them. For PTS, we restate its low-PAR signal search as a combinatorial optimization (CO) problem and propose a fair benchmark to evaluate the search algorithms. These allow us to compare the existing PTS algorithms and apply efficient heuristics from the CO literature to PTS. Surprisingly, our results show that random search, the simplest algorithm, actually outperforms all available PTS algorithms. Only our tailored versions of the two CO metaheuristics, namely simulated annealing and tabu search, attain better tradeoffs in the low-PAR region. Trellis shaping is an interesting MSR technique that does not require explicit side information. Our contributions to trellis shaping include a new metric and the use of the stack sequential decoding algorithm in shaping-sequence search. These proposals improve PAR reduction and/or reduce the complexity of the technique. Beneficial to all algorithms that involve a large number of peak-power searches, we propose using a new optimization objective to reduce their overall complexity. This is applicable to both PTS and trellis shaping. Finally, based on the trellis shaping structure, we propose a side information embedding/ detecting scheme for PTS. Our low-complexity scheme does not affect the PAR reduction capability of PTS and practically does not degrade the BER performance of the OFDM systems. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2010-01-08 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0065563 |
URI | http://hdl.handle.net/2429/17797 |
Degree |
Master of Applied Science - MASc |
Program |
Electrical and Computer Engineering |
Affiliation |
Applied Science, Faculty of Electrical and Computer Engineering, Department of |
Degree Grantor | University of British Columbia |
Graduation Date | 2006-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |
Download
- Media
- 831-ubc_2006-0278.pdf [ 3.76MB ]
- Metadata
- JSON: 831-1.0065563.json
- JSON-LD: 831-1.0065563-ld.json
- RDF/XML (Pretty): 831-1.0065563-rdf.xml
- RDF/JSON: 831-1.0065563-rdf.json
- Turtle: 831-1.0065563-turtle.txt
- N-Triples: 831-1.0065563-rdf-ntriples.txt
- Original Record: 831-1.0065563-source.json
- Full Text
- 831-1.0065563-fulltext.txt
- Citation
- 831-1.0065563.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0065563/manifest