UBC Theses and Dissertations

Synchronization, phase detection, lock detection, and SNR estimation in coherent M-PSK receivers Linn, Yair 2007

SYNCHRONIZATION, PHASE DETECTION, LOCK DETECTION, AND SNR ESTIMATION IN COHERENT M-PSK RECEIVERS

by

YAIR LINN

B.Sc., Technion Israel Institute of Technology, Haifa, Israel, 1996

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE STUDIES (Electrical and Computer Engineering)

THE UNIVERSITY OF BRITISH COLUMBIA

July 2007

© Yair Linn 2007

ABSTRACT

This thesis presents and investigates new structures for use within coherent M-PSK (M-ary Phase Shift Keying) receivers. The thesis is divided into three main parts.

The first part of the thesis presents and investigates a new family of carrier lock detectors. These detectors are self-normalizing, i.e., they are independent of the AGC (Automatic Gain Control) circuit parameters.

In the second part of the thesis, two new families of carrier phase detectors are presented and analyzed. The first family of phase detectors is self-normalizing (i.e., the phase detectors are independent of the AGC). The second family of phase detectors is based on an adaptive structure that achieves independence not only from the AGC but also from the SNR (Signal-to-Noise Ratio), hence potentially allowing the carrier synchronization circuit to operate optimally at all SNRs.

In the third part of the thesis, two new families of SNR estimators are presented. The first family of SNR estimators requires that the carrier synchronization PLL (Phase Locked Loop) be locked in order to function. The second family of SNR estimators is more complicated but has the advantage of dispensing with the carrier synchronization requirement; thus, this second family of SNR estimators is also suitable for SNR estimation in D-MPSK (Differential M-ary Phase Shift Keying) receivers.

The aforementioned lock detection, phase detection, and SNR estimation structures are compared to previously available structures and it is shown that they have significant implementational and performance advantages. Three important unifying aspects of the proposed structures are: (1) they are Non-Data Aided (NDA); (2) they all have compact fixed-point implementations suitable for use within an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array); and (3) they are all to a great extent independent of the AGC circuit. These key advantages make the proposed structures particularly attractive for use in FPGA-based or ASIC-based receivers.

TABLE OF CONTENTS

Abstract
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
Dedication

Chapter 1 Introduction and System Model Definitions
1.1 Overview of This Thesis
1.2 Intuitive Understanding of M-PSK Synchronization
1.3 Coherent M-PSK Receiver Topologies
1.3.1 Analog architectures
1.3.2 Digital architectures
1.3.3 Hybrid architectures
1.4 Signal and Receiver Models that are Used in this Thesis
1.5 The AGC's Operation
1.5.1 The parameter K and its relationship with the AGC
1.5.2 Demonstration of the AGC's operation
1.5.3 Discussion - ideal AGC vs. atrophied AGC

Chapter 2 A Family of Self-Normalizing Carrier Lock Detectors and $E_s/N_0$ Estimators for M-PSK
2.1 Introduction
2.2 Signal and Receiver Models
2.3 Detector Characteristics
2.3.1 Basic Definitions and Equations
2.3.2 Lock Detector Expectation
2.3.3 Hardware Realization
2.3.4 Implementational Advantages vs. Other Lock Detectors
2.3.5 Principle of SNR Estimation from the Lock Detector Value
2.3.6 Intuitive Understanding of the Detectors' Behaviour
2.4 Performance Analysis for Imperfect Locking
2.5 Lock Detector's Variance and Distribution
2.6 Lock Probabilities and Circuit Parameter Determination
2.7 Operation in Fading Conditions
2.7.1 Lock detector distribution for Case (a): $2N \cdot T \ll T_{COH}$
2.7.2 Lock detector distribution for Case (b): $2N \cdot T \gg T_{COH}$
2.7.3 Example: Operation in Nakagami-m fading for $2N \cdot T \gg T_{COH}$
2.8 Conclusions

Chapter 3 Robust M-PSK Phase Detector Structures: Theory, Simulations, and System Identification Analysis
3.1 Introduction
3.2 Signal and Receiver Models
3.3 Performance Analysis of PLLs - A Brief Overview
3.3.1 Performance metrics, nonlinear model, and linearized model
3.3.2 The linearized PLL model
3.4 A Family of Self-Normalizing Phase Detectors
3.4.1 Definition
3.4.2 S-Curve of $d_M(n)$
3.4.3 Amplitude suppression factors of $d_M(n)$
3.4.4 Loop noise, self noise, and squaring loss of $d_M(n)$
3.4.5 Examples of S-Curves of $d_M(n)$
3.4.6 Squaring loss comparison vs. other phase detectors
3.4.7 Nonlinear model simulation investigation of $d_M(n)$
3.4.8 Hardware realization
3.4.9 Lookup Table Implementation Issues
3.4.10 Intuitive Understanding of $d_M(n)$
3.5 The Case for a Constant-Gain Phase Detector
3.6 A Family of Robust Adaptive Phase Detectors
3.6.1 Review of lock detector
3.6.2 Definition and structure
3.6.3 Comments regarding the hardware implementation of $V_{M,N}$
3.6.4 Linear modeling of $V_{M,N}$
3.6.5 Nonlinear model simulations for $V_{M,N}$
3.6.6 Unlocked-state operation of $V_{M,N}$
3.6.7 System identification analysis
3.6.8 Bounds on $N$ to ensure satisfactory tolerances in PLL parameters in PLLs using $V_{M,N}(n)$
3.7 Conclusions

Chapter 4 New Methods for Real-Time Generation of SNR Estimates for Digital Phase Modulation Signals
Part A A New SNR Estimation Structure for M-PSK
4.1 Introduction
4.1.1 The general principle behind SNR estimation
4.2 Review of Receiver Structure and Lock Detector
4.2.1 Signal and receiver models
4.2.2 Lock detector definition, expectation and distribution
4.2.3 Low jitter approximations
4.3 Principle of SNR Estimation from $l_{M,N}$
4.3.1 Theoretical basis
4.3.2 Hardware implementation
4.4 Discussion: Quantitative Measurement of Estimator Performance
4.5 Comparison of Estimation via $l_{M,N}$ to Estimation via the SER
4.5.1 Number of symbols needed for estimation via $l_{M,N}$
4.5.2 Number of symbols needed for SNR estimation via the SER
4.5.3 Display and Analysis of Results
4.5.4 Asymptotic Formulas for Number of Necessary Symbol Intervals
4.5.5 The Causes of the Proposed Method's Advantages
4.6 Comparison of Estimation via $l_{M,N}$ to additional SNR estimation methods
4.6.1 Qualitative comparison
4.6.2 Quantitative comparison of NMSE vs. other blind 1-sample/symbol SNR estimators
4.7 SNR Estimation via $l_{M,N}$ for M-PSK in the Presence of Fading: Comparison with Estimation via the SER
4.7.1 Case (a): $2NT \ll T_{COH}$
4.7.2 Case (b): $2NT \gg T_{COH}$
4.7.3 Discussion
4.8 SNR Estimation via $l_{M,N}$ for M-PSK in the Presence of Fading: NMSE evaluation
Part B A New SNR Estimation Structure for D-MPSK and for M-PSK in the Absence of Carrier Synchronization
4.9 Introduction to Part B of This Chapter
4.10 D-MPSK System Model
4.11 Motivation and Estimator Structure
4.11.1 Motivation
4.11.2 Estimator structure and operation principle
4.11.3 Hardware implementation
4.12 Conditional Distribution of $l^{D}_{M,N}$
4.12.1 Conditional expectation of $l^{D}_{M,N}$ given $\chi$ for $2NT \ll T_{COH}$
4.12.2 Conditional expectation of $l^{D}_{M,N}$ given $\chi$ for $2NT \gg T_{COH}$
4.12.3 Variance of $l^{D}_{M,N}$
4.12.4 Summary: Conditional distribution of $l^{D}_{M,N}$
4.13 Principle of SNR Estimation from $l^{D}_{M,N}$
4.14 Comparison of Estimation via $l^{D}_{M,N}$ to Estimation via the SER (with and without fading)
4.14.1 Number of symbols needed for estimation via $l^{D}_{M,N}$
4.14.2 Number of symbols needed for SNR estimation via the SER
4.14.3 Graphical Exhibition of Results
4.14.4 Discussion of Results
4.15 Comparison to other estimators using the NMSE metric
4.15.1 Comparison for $2NT \ll T_{COH}$
4.15.2 Comparison for $2NT \gg T_{COH}$
4.16 Application to SNR Estimation for M-PSK in the Absence of Carrier Synchronization
4.17 Conclusions

Chapter 5 Conclusions and Future Work
5.1 Summary of Contributions
5.2 Future Work
5.2.1 Future Research: Analysis during Lack of Symbol Synchronization
5.2.2 Future Research and Continuing Research: A Constant-Gain Detector during Both Tracking and Acquisition
5.2.3 Future and Continuing Research: Symbol Synchronization PLL Structures
5.3 Final Remarks

References

Appendices
Appendix A Closed-Form Expressions for $f_M(\chi)$
Appendix B Closed-Form Expressions for $f_M(\chi)$ in the Presence of Nakagami-m Fading
Appendix C Closed-Form Expressions for $f^{D}_M(\chi)$
Appendix D Closed-Form Expressions for $f^{D}_M(\chi)$ in the Presence of Nakagami-m Fading
Appendix E Asymptotic Limits and Simulation Computations of the Cross-Correlation Coefficients $\rho_{n,k}$

LIST OF TABLES

Table 1. Closed-form expressions for lock detector expected value
Table 2. Fading distributions for common channel types
Table 3. Comparison of important linearized-model phase detector characteristics

LIST OF FIGURES

Fig. 1. Post-matched-filter signals. The received signal is noiseless BPSK. No carrier error, no timing error.
Fig. 2. Post-matched-filter signals. The received signal is noiseless BPSK. No carrier error. Timing error of $T/4$.
Fig. 3. Post-matched-filter signals. The received signal is BPSK with $E_s/N_0 = 7$ dB. No carrier error. Timing error of $T/4$.
Fig. 4. Post-matched-filter signals. The received signal is noiseless BPSK. Carrier frequency error of $1/(50 \cdot T)$. No timing error.
Fig. 5. Post-matched-filter signals. The received signal is noiseless BPSK. Carrier frequency error of $1/(50 \cdot T)$. Timing error of $T/4$.
Fig. 6. I-Q graphs for $E_s/N_0 = 7$ dB. Left: no timing or carrier error. Right: no carrier error, timing error of $T/4$.
Fig. 7. I-Q graphs. Left: $E_s/N_0 = 15$ dB with carrier frequency error, no timing error. Right: $E_s/N_0 = 7$ dB with constant carrier phase error of $0.4\pi$, no timing error.
Fig. 8. I-Q graphs for BPSK with $E_s/N_0 = 30$ dB. Left: no carrier or timing error. Right: Gaussian carrier phase jitter with $\sqrt{\mathrm{var}(\theta_e)} = 20°$, no timing error.
Fig. 9. Post-matched-filter signals. The received signal is BPSK with $E_s/N_0 = 30$ dB. No carrier error, no timing error.
Fig. 10. Post-matched-filter signals. The received signal is BPSK with $E_s/N_0 = 30$ dB. Gaussian carrier phase jitter with $\sqrt{\mathrm{var}(\theta_e)} = 20°$, no timing error.
Fig. 11. I-Q graphs for QPSK with $E_s/N_0 = 30$ dB. Left: no carrier or timing error. Right: Gaussian carrier phase jitter with $\sqrt{\mathrm{var}(\theta_e)} = 20°$, no timing error.
Fig. 12. General structure of a hybrid receiver for digital communications.
Fig. 13. M-PSK receiver model that is used in this thesis.
Fig. 14. Square-Root Raised Cosine pulse shapes.
Fig. 15. Pre-AGC and post-AGC sampled waveforms for $E_s/N_0 = 20$ dB.
Fig. 16. Pre-AGC and post-AGC sampled waveforms for $E_s/N_0 = 8$ dB.
Fig. 17. Pre-AGC and post-AGC sampled waveforms for $E_s/N_0 = 2$ dB.
Fig. 18. Pre-AGC and post-AGC sampled waveforms for $E_s/N_0 = -3$ dB.
Fig. 19. AGC signal suppression factor for the example AGC.
Fig. 20. Lock detection principle.
Fig. 21. Lock metric when locked vs. $E_s/N_0$.
Fig. 22. Efficient hardware generation of $l_{M,N}$.
Fig. 23. Moving average.
Fig. 24. Integrate and Dump module.
Fig. 25. 2nd-order nonlinearity lock metric generation for BPSK.
Fig. 26. Received QPSK signal, with $E_s/N_0 = 20$ dB and with receiver in lock, $K = 0.5$, superimposed on contour map of $x_{4,n}$.
Fig. 27. Received QPSK signal, with $E_s/N_0 = 20$ dB, unlocked receiver, $K = 0.5$, superimposed on contour map of $x_{4,n}$.
Fig. 28. Received QPSK signal, with $E_s/N_0 = 20$ dB, $K = 0.8$, receiver in lock, superimposed on contour map of $\mathrm{Re}[(I(n) + j \cdot Q(n))^4]$.
Fig. 29. Received QPSK signal, with $E_s/N_0 = 20$ dB, $K = 0.4$, receiver in lock, superimposed on contour map of $\mathrm{Re}[(I(n) + j \cdot Q(n))^4]$.
Fig. 30. Equivalent baseband model of the synchronization loop that was used for closed loop simulations.
Fig. 31. Theoretical and simulated lock detector values, using equivalent baseband model of Fig. 30, for $2B_L \cdot T = 0.01$.
Fig. 32. Theoretical and simulated lock detector values, using equivalent baseband model of Fig. 30, for $2B_L \cdot T = 0.1$.
Fig. 33. Theoretical and simulated lock detector values, using equivalent baseband model of Fig. 30, for $2B_L \cdot T = 0.25$.
Fig. 34. Required threshold $\Gamma$, needed to achieve a necessary $P_D$ and $P_{FA}$, for various values of $M$.
Fig. 35. Required No. of symbols, $2 \cdot N$, if using $l_{4,N}$, needed to achieve a necessary $P_D$ and $P_{FA}$, for $\chi = 1$ dB, for $M = 4$ (QPSK).
Fig. 36. Required No. of symbols, $2 \cdot N$, if using LNDAA N, needed to achieve a necessary $P_D$ and $P_{FA}$, for $\chi = 1$ dB, for $M = 4$ (QPSK).
Fig. 37. Theoretical expectation of lock detector vs. simulations.
Fig. 38. Simplified flow diagram of feedforward demodulation of an M-PSK burst.
Fig. 39. Comparison of exact expression for $f_M(\chi)$ vs. the approximate expression.
Fig. 40. S-Curve of $d_2(n)$ for various SNRs.
Fig. 41. S-Curve of $d_2(n)$ (for BPSK) for various SNRs in the interval $-\pi/2 < \theta_e < \pi/2$.
Fig. 42. S-Curve of $d_4(n)$ (for QPSK) for various SNRs in the interval $-\pi/4 < \theta_e < \pi/4$.
Fig. 43. S-Curve of $d_8(n)$ (for 8-PSK) for various SNRs in the interval $-\pi/8 < \theta_e < \pi/8$.
Fig. 44. S-Curve of $d_{16}(n)$ (for 16-PSK) for various SNRs in the interval $-\pi/16 < \theta_e < \pi/16$.
Fig. 45. Squaring loss as a function of $E_s/N_0$.
Fig. 46. Calculated phase-error variance $\mathrm{var}(\theta_e)$, using linearized baseband model.
Fig. 47. Simulated phase-error variance $\mathrm{var}(\theta_e)$, using equivalent baseband nonlinear model.
Fig. 48. Efficient hardware generation of $d_M(n)$.
Fig. 49. Intensity graph for the lookup table computing $d_2(n) = 2I(n)Q(n)/(I^2(n) + Q^2(n))$, no quantization.
Fig. 50. Intensity graph for the lookup table computing $d_2(n) = 2I(n)Q(n)/(I^2(n) + Q^2(n))$, with quantization of inputs to 4 bits (each) and the output to 8 bits.
Fig. 51. Received QPSK constellation, $E_s/N_0 = 20$ dB, $K = 0.3$, superimposed on a contour map of $d_4(n) = (4I^3(n)Q(n) - 4I(n)Q^3(n))/(I^2(n) + Q^2(n))^2$.
Fig. 52. Received QPSK constellation, $E_s/N_0 = 20$ dB, $K = 0.8$, superimposed on a contour map of $d_4(n) = (4I^3(n)Q(n) - 4I(n)Q^3(n))/(I^2(n) + Q^2(n))^2$.
Fig. 53. Received QPSK constellation, $E_s/N_0 = 20$ dB, $K = 0.3$, superimposed on a contour map of $c_4(n) = \mathrm{Im}[(I(n) + j \cdot Q(n))^4] = 4I^3(n)Q(n) - 4I(n)Q^3(n)$.
Fig. 54. Received QPSK constellation, $E_s/N_0 = 20$ dB, $K = 0.8$, superimposed on a contour map of $c_4(n) = \mathrm{Im}[(I(n) + j \cdot Q(n))^4] = 4I^3(n)Q(n) - 4I(n)Q^3(n)$.
Fig. 55. Calculated $\mathrm{var}(\theta_e)$ using linearized model when AGC effects are ignored and when AGC effects are modeled.
Fig. 56. Hardware implementation of $V_{M,N}(n)$.
Fig. 57. a s m for the various phase detectors discussed in this thesis.
Fig. 58. psm for the various detectors discussed in this thesis.
Fig. 59. IC and IC ^ as a function of the SNR for the AGC of Section 1.5.
Fig. 60. $\mathrm{var}(\theta_e)$, obtained via nonlinear-model simulations including AGC effects.
Fig. 61. $\mathrm{var}(\theta_e)$ for QPSK and 16-PSK obtained via nonlinear-model simulations including AGC effects.
Fig. 62. Simulated responses of carrier PLLs to a phase step.
Fig. 63. Predicted and measured performance of $DD_M$, $c_M$, and $V_{M,N}$.
Fig. 64. Lower bound on $N$ needed to achieve a>opl and £ o p l to a desired tolerance, at a given confidence.
Fig. 65. SNR estimation principle.
Fig. 66. Simulated and predicted values for eq. (113) and (114).
Fig. 67. Efficient hardware generation of / d B.
Fig. 68. Probability of symbol error $P_e = g_M(E_s/N_0)$ as a function of the $E_s/N_0$ ratio.
Fig. 69. Number of symbols needed to estimate the $E_s/N_0$ to within $\pm 0.5$ dB, for $M = 2$ (BPSK).
Fig. 70. Number of symbols needed to estimate the $E_s/N_0$ to within $\pm 0.5$ dB, for $M = 4$ (QPSK).
Fig. 71. Number of symbols needed to estimate the $E_s/N_0$ to within $\pm 0.5$ dB, for $M = 8$ (8-PSK).
Fig. 72. Number of symbols needed to estimate the $E_s/N_0$ to within $\pm 0.5$ dB, with a confidence of $C = 99\%$, for $M = 2$ (BPSK).
Fig. 73. Number of symbols needed to estimate the $E_s/N_0$ to within $\pm 0.5$ dB, with a confidence of $C = 99\%$, for $M = 4$ (QPSK).
Fig. 74. Number of symbols needed to estimate the $E_s/N_0$ to within $\pm 0.5$ dB, with a confidence of $C = 99\%$, for $M = 8$ (8-PSK).
Fig. 75. NMSE comparison of estimation via $l_{M,N}$, the M2M4 estimator, and the SVR estimator, with $2N = 1024$ symbols used to compute each estimator. Modulation is BPSK ($M = 2$).
Fig. 76. NMSE comparison of estimation via $l_{M,N}$, the M2M4 estimator, and the SVR estimator, with $2N = 1024$ symbols used to compute each estimator. Modulation is QPSK ($M = 4$).
Fig. 77. NMSE comparison of estimation via $l_{M,N}$, the M2M4 estimator, and the SVR estimator, with $2N = 1024$ symbols used to compute each estimator. Modulation is 8-PSK ($M = 8$).
Fig. 78. Normalized bias comparison of estimation via $l_{M,N}$, the M2M4 estimator, and the SVR estimator, with $2N = 1024$ symbols used to compute each estimator. Modulation is BPSK ($M = 2$).
Fig. 79. Normalized bias comparison of estimation via $l_{M,N}$, the M2M4 estimator, and the SVR estimator, with $2N = 1024$ symbols used to compute each estimator. Modulation is QPSK ($M = 4$).
Fig. 80. Normalized bias comparison of estimation via $l_{M,N}$, the M2M4 estimator, and the SVR estimator, with $2N = 1024$ symbols used to compute each estimator. Modulation is 8-PSK ($M = 8$).
Fig. 81. Probability of symbol error $g_M(\chi)$ as a function of $\chi$ for Nakagami-m fading with $m = 1$.
Fig. 82. Probability of symbol error $g_M(\chi)$ as a function of $\chi$ for Nakagami-m fading with $m = 2$.
Fig. 83. Probability of symbol error $g_M(\chi)$ as a function of $\chi$ for Nakagami-m fading with $m = 5$.
Fig. 84. Probability of symbol error $g_M(\chi)$ as a function of $\chi$ for Nakagami-m fading with $m = 10$.
Fig. 85. Estimation via $l_{M,N}$ vs. estimation via the SER, for case (b) ($2NT \gg T_{COH}$) and case (ii) ($L \gg T_{COH}$) for Nakagami-m fading with $m = 1$.
Fig. 86. Estimation via $l_{M,N}$ vs. estimation via the SER, for case (b) ($2NT \gg T_{COH}$) and case (ii) ($L \gg T_{COH}$) for Nakagami-m fading with $m = 2$.
Fig. 87. Estimation via $l_{M,N}$ vs. estimation via the SER, for case (b) ($2NT \gg T_{COH}$) and case (ii) ($L \gg T_{COH}$) for Nakagami-m fading with $m = 5$.
Fig. 88. Estimation via $l_{M,N}$ vs. estimation via the SER, for case (b) ($2NT \gg T_{COH}$) and case (ii) ($L \gg T_{COH}$) for Nakagami-m fading with $m = 10$.
Fig. 89. NMSE comparison of SNR estimation via $l_{M,N}$ for BPSK and Nakagami-m fading.
Fig. 90. NMSE comparison of SNR estimation via $l_{M,N}$ for QPSK and Nakagami-m fading.
Fig. 91. NMSE comparison of SNR estimation via $l_{M,N}$ for 8-PSK and Nakagami-m fading.
Fig. 92. Normalized bias comparisons for SNR estimation via $l_{M,N}$ for BPSK and Nakagami-m fading.
Fig. 93. Normalized bias comparisons for SNR estimation via $l_{M,N}$ for QPSK and Nakagami-m fading.
Fig. 94. Normalized bias comparisons for SNR estimation via $l_{M,N}$ for 8-PSK and Nakagami-m fading.
Fig. 95. Front end of D-MPSK receiver (simplified diagram).
Fig. 96. Fixed-point hardware generation of ydB.
Fig. 97. Expected and simulated values of $l^{D}_{M,N}$ for case (a).
Fig. 98. $f^{D}_M(\chi) = E[\,l^{D}_{M,N} \mid E[E_s/N_0] = \chi\,]$ as a function of $\chi$ for Nakagami-m fading.
Fig. 99. Demonstration of the effects of quantization on the measured value of $f^{D}_M(\chi) = E[\,l^{D}_{M,N} \mid E[E_s/N_0] = \chi\,]$ for Nakagami-m fading with $m = 2$.
Fig. 100. yDdB (for case (a)) or ydDB (for case (b)) vs.
Fig. 101. Probability of symbol error $P_e = g_{DM}(E_s/N_0)$ as a function of the $E_s/N_0$ ratio.
Fig. 102. Probability of symbol error $P_e = g_{DM}(\chi)$ as a function of $\chi = E[E_s/N_0]$ for Nakagami-m fading with $m = 1$.
Fig. 103. Probability of symbol error $P_e = g_{DM}(\chi)$ as a function of $\chi = E[E_s/N_0]$ for Nakagami-m fading with $m = 2$.
Fig. 104. Probability of symbol error $P_e = g_{DM}(\chi)$ as a function of $\chi = E[E_s/N_0]$ for Nakagami-m fading with $m = 5$.
Fig. 105. Probability of symbol error $P_e = g_{DM}(\chi)$ as a function of $\chi = E[E_s/N_0]$ for Nakagami-m fading with $m = 10$.
Fig. 106. Estimation via $l^{D}_{M,N}$ vs. estimation via the SER for $2NT \ll T_{COH}$ and $L \cdot T \ll T_{COH}$.
Fig. 107. Estimation via $l^{D}_{M,N}$ vs. estimation via the SER, for $2NT \gg T_{COH}$ and $L \cdot T \gg T_{COH}$ for Nakagami-m fading with $m = 1$.
Fig. 108. Estimation via $l^{D}_{M,N}$ vs. estimation via the SER, for $2NT \gg T_{COH}$ and $L \cdot T \gg T_{COH}$ for Nakagami-m fading with $m = 2$.
Fig. 109. Estimation via $l^{D}_{M,N}$ vs. estimation via the SER, for $2NT \gg T_{COH}$ and $L \cdot T \gg T_{COH}$ for Nakagami-m fading with $m = 5$.
Fig. 110. Estimation via $l^{D}_{M,N}$ vs. estimation via the SER, for $2NT \gg T_{COH}$ and $L \cdot T \gg T_{COH}$ for Nakagami-m fading with $m = 10$.
Fig. 111. NMSE comparison of estimation via $l^{D}_{M,N}$ vs. the M2M4 estimator and the SVR estimator, $M = 2$, with $2N = 1024$ symbols used to compute each estimator.
Fig. 112. NMSE comparison of estimation via $l^{D}_{M,N}$ vs. the M2M4 estimator and the SVR estimator, $M = 4$, with $2N = 1024$ symbols used to compute each estimator.
Fig. 113. NMSE comparison of estimation via $l^{D}_{M,N}$ vs. the M2M4 estimator and the SVR estimator, $M = 8$, with $2N = 1024$ symbols used to compute each estimator.
Fig. 114. Normalized bias comparison of estimation via $l^{D}_{M,N}$ vs. the M2M4 estimator and the SVR estimator, $M = 2$, with $2N = 1024$ symbols used to compute each estimator.
Fig. 115. Normalized bias comparison of estimation via $l^{D}_{M,N}$ vs. the M2M4 estimator and the SVR estimator, $M = 4$, with $2N = 1024$ symbols used to compute each estimator.
Fig. 116. Normalized bias comparison of estimation via $l^{D}_{M,N}$ vs. the M2M4 estimator and the SVR estimator, $M = 8$, with $2N = 1024$ symbols used to compute each estimator.
Fig. 117. NMSE comparison of SNR estimation via $l^{D}_{M,N}$ for $M = 2$ and Nakagami-m fading.
Fig. 118. NMSE comparison of SNR estimation via $l^{D}_{M,N}$ for $M = 4$ and Nakagami-m fading.
Fig. 119. NMSE comparison of SNR estimation via $l^{D}_{M,N}$ for $M = 8$ and Nakagami-m fading.
Fig. 120. Normalized bias comparisons for SNR estimation via $l^{D}_{M,N}$ for $M = 2$ and Nakagami-m fading.
Fig. 121. Normalized bias comparisons for SNR estimation via $l^{D}_{M,N}$ for $M = 4$ and Nakagami-m fading.
Fig. 122. Normalized bias comparisons for SNR estimation via $l^{D}_{M,N}$ for $M = 8$ and Nakagami-m fading.
Fig. 123. General structure of a coherent M-PSK receiver showing both the carrier and symbol PLLs.
Fig. 124. Efficient fixed-point hardware implementation of the symbol synchronization PLL lock detector.
Fig. 125. $\rho_{n,k}$ for $|n - k| = 1$.

LIST OF ABBREVIATIONS

AGC - Automatic Gain Control
ASIC - Application Specific Integrated Circuit
BPSK - Binary Phase Shift Keying
CRB - Cramer-Rao Bound
DA - Data Aided
DD - Decision Directed
DDS - Direct Digital Synthesizer
D-MPSK - Differential M-ary Phase Shift Keying
DVB - Digital Video Broadcasting
ECD - Error Correction Decoder
FFT - Fast Fourier Transform
FPGA - Field Programmable Gate Array
IAD - Integrate and Dump
IF - Intermediate Frequency
LSB - Least Significant Bit
LUT - Lookup Table
M-PSK - M-ary Phase Shift Keying
MSB - Most Significant Bit
MSE - Mean Squared Error
NCO - Numerically Controlled Oscillator
NDA - Non-Data Aided
NMSE - Normalized Mean Squared Error
PD - Phase Detector
PLL - Phase Locked Loop
PSK - Phase Shift Keying
QPSK - Quaternary Phase Shift Keying
RF - Radio Frequency
RMS - Root Mean Square
SER - Symbol Error Rate
SNR - Signal-to-Noise Ratio
SRRC - Square Root Raised Cosine
SVR - Signal-to-Variation Ratio
TED - Timing Error Detector
VCO - Voltage Controlled Oscillator
WPAN - Wireless Personal Area Network

ACKNOWLEDGEMENTS

Where do I begin? First and foremost I would like to thank my supervisor, Prof. Matthew J. Yedlin, for his kindness, gentle guiding, wisdom and support throughout my thesis writing process and indeed throughout my studies at UBC. No words would be adequate to express my gratitude, and so I might as well stop here.

A mentor of mine since my early days as an undergraduate is Prof. Shmuel Zaks, of the Technion Israel Institute of Technology. I have known him for more than 15 years, and throughout all this time he has provided me with guidance, wisdom, and assistance that I found invaluable and indispensable.

Prof.
Robert Schober has also been a friend as well as a purveyor of professional advice throughout my studies, for which I thank him profusely. I would also like to thank him for serving as my supervisor as Prof. Yedlin's proxy while the latter was on sabbatical, and for handling the bureaucratic tasks associated with both my departmental and final defense. His selfless contribution to this process is much appreciated.

I would also like to thank my third supervisory committee member, Prof. Steve Wilton, for his guidance and for his excellent course on FPGAs.

Last but by no means least, I would like to thank my family for their help during this process. My parents, Shai and Ruth, for their financial support and encouragement. My sister, Gilat, for her support in the most crucial moments, and my brother Erez for various errands he performed on my behalf. Lastly, I would like to thank my grandparents Ruth and Amnon, who have always served as an inspiration to me throughout my life and my studies.

To my sister, Gilat, being there when I needed her most

Chapter 1 Introduction and System Model Definitions

Digital wireless communications systems involve three general elements: the transmitter, the channel, and the receiver. The transmitter's purpose is to modulate the data stream and send the RF (Radio Frequency) signal over the channel, where that signal is corrupted by noise and possibly also by interference and distortion. The receiver's purpose is to recover the original data stream while overcoming, to the best of its ability, the detrimental effects of noise, interference, and distortion. This process is known as demodulation.

In this thesis we shall investigate structures for coherent demodulation of single-carrier M-PSK signals. M-PSK modulation is used in a wide variety of contemporary broadcasting and networking standards.
For example, M-PSK is used in various configurations of the DVB-S [2], DVB-S2 [3], DVB-T [4], DVB-H [5], WiFi (802.11) [6], WPAN (802.15) [7], and WiMAX (802.16) [8] standards, to name a few. M-PSK also has many applications in military communications, especially those using satellites [9], [10], [11].

Design of coherent receivers in digital communications involves generating a local carrier that is in phase with the received carrier, and then using this local carrier to demodulate the received signal. Generation of the local carrier can be done in one of the following manners: (a) by using pilot symbols or pilot signals, or (b) by extracting carrier phase information from the received signal itself. The use of pilot signals or pilot symbols has the distinct and inevitable disadvantage of necessitating the expenditure of transmitter power that could otherwise have been used to increase the transmission power of the information-bearing signal (or, alternatively, to increase the data rate). Hence, if possible, one would like to avoid the use of pilot signals or symbols and endeavour to regenerate a local carrier using only information obtainable from the information-bearing received signal. This indeed shall be the focus of this thesis.

In coherent suppressed-carrier receivers, regeneration of the local carrier is generally done via a Phase Locked Loop (PLL) that operates on the output of a Phase Detector (PD). The phase detector provides an indication of the residual phase error between the local and received carriers, and the PLL acts in feedback in order to cancel that phase error.

Recovery of the symbol clock is also done using a PLL, which in this instance is called the symbol timing synchronization PLL (or symbol PLL for short). The purpose of the symbol PLL is to generate a local symbol clock that is time-coherent with the received signal's symbol clock. This allows the receiver to sample the received symbol waveforms at the optimal times.
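
The carrier-loop feedback described above (a phase detector output driving a loop that cancels the residual phase error) can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the receiver studied in this thesis: it is noiseless, floating-point, first-order, and uses the BPSK detector $d_2(n) = 2I(n)Q(n)/(I^2(n) + Q^2(n))$ that appears in the caption of Fig. 49. The function names, the loop gain, and the test phase offset are hypothetical choices made for the example.

```python
import cmath


def d2(i, q):
    """BPSK phase detector 2*I*Q / (I^2 + Q^2); equals sin(2 * phase error)."""
    power = i * i + q * q
    return 0.0 if power == 0.0 else 2.0 * i * q / power


def track_carrier_phase(symbols, phase_offset, loop_gain=0.1, amplitude=1.0):
    """First-order loop (assumed for illustration): returns the final local phase.

    symbols      : iterable of +1/-1 BPSK symbols (one sample per symbol)
    phase_offset : unknown carrier phase of the received signal, in radians
    loop_gain    : hypothetical loop gain; small values give slower, smoother tracking
    amplitude    : received signal level, standing in for the unknown AGC gain
    """
    local_phase = 0.0
    for a in symbols:
        received = amplitude * a * cmath.exp(1j * phase_offset)
        # Derotate by the local carrier phase, then measure the residual error.
        derotated = received * cmath.exp(-1j * local_phase)
        error = d2(derotated.real, derotated.imag)
        # Feedback: nudge the local phase so as to cancel the measured error.
        local_phase += loop_gain * error
    return local_phase


if __name__ == "__main__":
    data = [1, -1, -1, 1] * 50
    for level in (0.1, 1.0, 10.0):
        print(level, track_carrier_phase(data, phase_offset=0.6, amplitude=level))
```

Because the detector divides by the instantaneous power, the loop behaves the same for every signal level in this sketch, which is the self-normalizing property mentioned in the abstract. Note that the detector output is unchanged when the symbol sign flips, so a loop of this kind can only resolve the phase modulo $\pi$ for BPSK.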
The symbol PLL operates upon an error signal that is supplied by a Timing Error Detector (TED).

Another essential element in any PLL circuit is a lock detector. In a carrier PLL the purpose of the lock detector is to indicate when the local carrier is phase-coherent with the received carrier, in which case the carrier PLL is deemed "locked". Timely lock detection is necessary for many receiver operations. For example, when the carrier PLL becomes locked the receiver needs to stop scanning the input frequency uncertainty region in order to avoid driving the carrier PLL out of lock.

A crucial element in any modern communications receiver is an SNR (Signal-to-Noise Ratio) estimation module. In many modern communications schemes an accurate $E_s/N_0$ estimate is not merely a monitoring aid; it also plays an important role in the receiver's operation. For example, some error correction decoders can make use of an $E_s/N_0$ estimate to increase their coding gain (e.g. turbo codes [12]). Another example is systems that employ diversity reception, in which SNR estimates are used to assign relative weights to the data obtained from the various receivers [13, Sec. 14.4]. As a further example we mention adaptive schemes (e.g. [14], [15]) where the data and/or coding rates are altered according to the $E_s/N_0$. See also [16, Sec. 1.2] for a more extensive overview of the uses of SNR estimators in contemporary communications systems, along with many useful references. In this thesis, we shall present new SNR estimators for M-PSK and D-MPSK receivers.

1.1 Overview of This Thesis

In the remainder of this introductory chapter (Chapter 1) we shall provide an overview of the signal and receiver models pertaining to this thesis. The object of this thesis is to present new structures for M-PSK lock detection (Chapter 2), phase detectors for coherent M-PSK carrier synchronization (Chapter 3), and SNR estimation in M-PSK and D-MPSK receivers (Chapter 4).
The final chapter (Chapter 5) is devoted to conclusions, and is followed by several appendices which contain important mathematical derivations which were relegated there in order to maintain the thesis's flow. As we shall see, the structures presented in this thesis are interrelated, and often one can obtain added benefit by exploiting these interrelationships. Although the digital portion of the receiver could be implemented either in hardware or in software, this thesis focuses on fixed-point hardware implementations. The reason is that, first, fixed-point hardware implementations will always have distinct performance advantages due simply to the fact that fixed-point hardware implementations can always be made to operate faster than any software and/or floating point implementation ([17 Chap. 9], [18], [1]). Secondly, more intriguing challenges are present when trying to design a hardware system: while a software system could be implemented in a high-level language, in contrast when implementing a fixed-point hardware system the designer must explicitly address such issues as scaling and dynamic range, logic resource usage, implementation of mathematical operations, etcetera. Finally, given the preponderance of FPGA-based and ASIC-based receivers, such a focus on fixed-point hardware implementations is appropriate in the context of contemporary trends in receiver design. 1.2 Intuitive Understanding of M - P S K Synchronization Before delving into mathematics, it may be worthwhile to attempt to attain an intuitive understanding of the meaning of carrier and symbol synchronization in an M-PSK receiver. To this end, let us assume a BPSK communications system where the baseband data pulse is rectangular, and let us look at the waveforms at the outputs of the I and Q matched filters (note that the post-matched-filter pulse shape is triangular [13 Sec. 5.1]). In the following figures, the system's symbol rate is denoted as \IT . 1 . 
Fig. 1. Post-matched-filter signals. The received signal is noiseless BPSK. No carrier error, no timing error. (Horizontal axis: time in symbol intervals.)

Fig. 2. Post-matched-filter signals. The received signal is noiseless BPSK. No carrier error. Timing error of T/4.

Fig. 3. Post-matched-filter signals. The received signal is BPSK with Es/N0 = 7 dB. No carrier error. Timing error of T/4.

Fig. 4. Post-matched-filter signals. The received signal is noiseless BPSK. Carrier frequency error of 1/(50·T). No timing error.

Fig. 5. Post-matched-filter signals. The received signal is noiseless BPSK. Carrier frequency error of 1/(50·T). Timing error of T/4.

In Fig. 1 we see the situation that occurs when there is perfect carrier and symbol synchronization (a noiseless BPSK signal is assumed). The small circles denote the samples of the I and Q channels (at rate 1/T, i.e. 1 sample/symbol, which is the rate needed by the carrier PLL). As we can see, the receiver's carrier is in complete synchronization with the input carrier (as seen by the fact that the Q channel is always 0) and the symbol timing synchronization loop is also working perfectly (as seen by the fact that the samples of the I channel are always taken at the symbol's peak). In Fig. 2 we can observe the effects of a symbol timing synchronization mismatch between the local and received symbol clocks. The lack of symbol timing synchronization is evident in that the samples (the small circles) are now not taken at the peak of the symbols but rather at an offset of T/4 seconds.
Though these sampling instants are not ideal, in the case of a noiseless BPSK signal they will have no effect upon the error rate since the data decision algorithm, which in this case is simply sign(I), is unaffected. However, there will be a very appreciable impact upon the error rate once noise effects are taken into account. To see this, observe Fig. 3, which shows the effects of a timing error of T/4 upon the reception of a BPSK signal with Es/N0 = 7 dB. In this case the error rate of the decision algorithm sign(I) is significantly degraded due to the timing error.

Now let us take a look at the effects of carrier frequency errors. In Fig. 4 we see the case where there is a carrier frequency error (but no timing error). The carrier frequency error causes the signal to meander between the I and Q channels. In Fig. 5 we see what happens when the carrier loop is unlocked and there is also a timing mismatch between the receiver and transmitter symbol clocks. The carrier frequency error manifests itself in the signal meandering between the I and Q channels, while the timing error manifests itself in that the samples (the small circles) are offset from the peaks of the symbols. Clearly, even for the noiseless BPSK signal shown in Fig. 4 or Fig. 5, there is an extraordinary degradation in the error rate as a result of the carrier frequency error, and when noise effects are taken into account there will be an even more pronounced degradation that will also be adversely affected by the lack of symbol synchronization.

Fig. 6. I-Q graphs for Es/N0 = 7 dB. Left: no timing or carrier error. Right: no carrier error, timing error of T/4.

It is also instructive to look at various scenarios using I-Q graphs. In Fig. 6 we see samples of a BPSK signal where the carrier PLL is locked. On the left-hand side of Fig. 6 we see a signal with Es/N0 = 7 dB where the symbol timing recovery is perfect. On the right, we see the effects of a timing error of T/4.
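The degradation caused by a T/4 timing error can be checked numerically. The following is a minimal sketch, not taken from the thesis, under these assumptions: rectangular data pulses (so the matched-filter output is a triangle of base 2T), an ideally locked carrier, a unit signal peak, and an I-arm noise variance of 1/(2·Es/N0) relative to that peak. The function names are hypothetical.

```python
import math
import random


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def bpsk_ber_simulated(esn0_db, tau_over_t, num_symbols=100_000, seed=1):
    """Monte Carlo error rate of the sign(I) decision with a timing offset.

    tau_over_t is the sampling offset as a fraction of T (0 <= tau_over_t < 0.5).
    Sampling a triangular pulse late by tau scales the wanted symbol by
    (1 - tau/T) and leaks tau/T of the following symbol into the sample.
    """
    rng = random.Random(seed)
    chi = 10.0 ** (esn0_db / 10.0)
    sigma = math.sqrt(1.0 / (2.0 * chi))
    symbols = [rng.choice((-1.0, 1.0)) for _ in range(num_symbols + 1)]
    errors = 0
    for n in range(num_symbols):
        sample = (1.0 - tau_over_t) * symbols[n] + tau_over_t * symbols[n + 1]
        sample += rng.gauss(0.0, sigma)
        if (sample >= 0.0) != (symbols[n] > 0.0):
            errors += 1
    return errors / num_symbols


def bpsk_ber_theory(esn0_db, tau_over_t):
    """Closed form for the same model: half of the symbol pairs are equal
    (full amplitude) and half differ (amplitude 1 - 2*tau/T)."""
    a = math.sqrt(2.0 * 10.0 ** (esn0_db / 10.0))
    return 0.5 * q_function(a) + 0.5 * q_function((1.0 - 2.0 * tau_over_t) * a)


if __name__ == "__main__":
    for tau in (0.0, 0.25):
        print(tau, bpsk_ber_simulated(7.0, tau), bpsk_ber_theory(7.0, tau))
```

Under these assumptions the error rate at Es/N0 = 7 dB rises by more than an order of magnitude when the sampling instant moves from the pulse peak to T/4, because every symbol transition halves the sampled amplitude.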
Clearly, the timing error causes some data points to cross the decision region boundary (which is the vertical line I=0), hence worsening the error rate as compared to the perfect-timing recovery case. On the left of Fig. 7 we see the effect of carrier frequency error on the I-Q graph; obviously, in this case no data recovery is possible. The case of a constant carrier phase error is shown on the right-hand side. Though data recovery is possible, the error rate will suffer tremendously because many more of the recovered symbols have crossed the decision region boundary I=0 (as compared to Fig. 6 left).

Fig. 7. I-Q graphs. Left: Es/N0 = 15 dB with carrier frequency error, no timing error. Right: Es/N0 = 7 dB with constant carrier phase error of 0.4π, no timing error.

Fig. 8. I-Q graphs for BPSK with Es/N0 = 30 dB. Left: no carrier or timing error. Right: Gaussian carrier phase jitter with $\sqrt{\mathrm{var}(\theta_e)} = 20°$, no timing error.

Fig. 9. Post-matched-filter signals. The received signal is BPSK with Es/N0 = 30 dB. No carrier error, no timing error.

Fig. 10. Post-matched-filter signals. The received signal is BPSK with Es/N0 = 30 dB. Gaussian carrier phase jitter with $\sqrt{\mathrm{var}(\theta_e)} = 20°$, no timing error. (Horizontal axis: time in symbol intervals.)

In Fig. 8 we see the effects of carrier phase jitter. On the left, we see how an I-Q graph would look for Es/N0 = 30 dB for an ideally synchronized receiver. On the right, we see the effects of carrier phase jitter, that is, a phase error $\theta_e$ which has a zero-mean Gaussian distribution with $\sqrt{\mathrm{var}(\theta_e)} = 20°$. The corresponding I and Q channel graphs are shown in Fig. 9 and Fig. 10. Although for BPSK and Es/N0 = 30 dB the effects of the carrier phase jitter shown in Fig. 8 (right) and Fig. 10 upon the error rate will be mild, the effect upon the error rate would be much more grave for a lower Es/N0 and/or a higher-order modulation (e.g. QPSK, 8-PSK, etc.).
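The sensitivity of higher-order constellations to phase jitter can be illustrated with a short simulation. This is a sketch under stated assumptions rather than a result from the thesis: the phase error is zero-mean Gaussian and independent from symbol to symbol, the constellation has unit radius, each arm carries noise of variance 1/(2·Es/N0), and symbol timing is ideal. The function names are hypothetical.

```python
import cmath
import math
import random


def jitter_crossing_probability(m, jitter_std_deg):
    """P(|theta_e| > pi/M) for zero-mean Gaussian theta_e, additive noise neglected."""
    half_sector_deg = 180.0 / m
    return math.erfc(half_sector_deg / (jitter_std_deg * math.sqrt(2.0)))


def mpsk_symbol_error_rate(m, esn0_db, jitter_std_deg, num_symbols=100_000, seed=7):
    """Monte Carlo symbol error rate of nearest-phase decisions on M-PSK."""
    rng = random.Random(seed)
    chi = 10.0 ** (esn0_db / 10.0)
    sigma_n = math.sqrt(1.0 / (2.0 * chi))
    sigma_theta = math.radians(jitter_std_deg)
    errors = 0
    for _ in range(num_symbols):
        k = rng.randrange(m)
        theta_e = rng.gauss(0.0, sigma_theta)
        z = cmath.exp(1j * (2.0 * math.pi * k / m + theta_e))
        z += complex(rng.gauss(0.0, sigma_n), rng.gauss(0.0, sigma_n))
        # Decide on the constellation point whose phase is closest to arg(z).
        k_hat = round(cmath.phase(z) * m / (2.0 * math.pi)) % m
        if k_hat != k:
            errors += 1
    return errors / num_symbols


if __name__ == "__main__":
    for m in (2, 4):
        print(m, mpsk_symbol_error_rate(m, 30.0, 20.0),
              jitter_crossing_probability(m, 20.0))
```

With a 20° jitter standard deviation, a BPSK symbol is lost only when the phase error exceeds 90° (4.5 standard deviations), whereas a QPSK symbol is lost beyond 45° (2.25 standard deviations), which is why the same jitter that is mild for BPSK is severe for QPSK.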
Fig. 11 shows the I-Q graph of such a situation for QPSK. As seen there, the phase jitter in this case compromises the separation between the received constellation signal points to such an extent that it causes the error rate to be substantially increased.

Fig. 11. I-Q graphs for QPSK with Es/N0 = 30 dB. Left: no carrier or timing error. Right: Gaussian carrier phase jitter with $\sqrt{\mathrm{var}(\theta_e)} = 20°$, no timing error.

Clearly, as we have seen in this section, achieving carrier and symbol timing synchronization is crucial for proper data recovery. In this section we proceeded with the aim of attaining an intuitive (and, hence, qualitative) understanding of the meaning of synchronization. Quantitative effects of carrier phase jitter and symbol timing error upon the error rate of M-PSK can be found, for example, in [19 Sec. 4.3], [20], [21 Sec. 2.2.5], [22 Chap. 7], [23], and [24]. In the receiver, the synchronized carrier and symbol clock are generated by using PLLs which may have various topologies, an issue that is discussed in the next section.

1.3 Coherent M-PSK Receiver Topologies

There are three types of receiver topologies: (a) analog; (b) digital; and (c) hybrid. In the following three subsections, these architectures are discussed.

1.3.1 Analog architectures

In this case the PLL is implemented using analog components. Examples can be found in many texts, e.g. [25 Chap. 11], [13 Chap. 6], [26], [27 Chap. 10], [28 Chap. 11], [29 Chap. 9], [11]. Analog implementations suffer from some inherent problems, including: (a) variation of system parameters due to component value fluctuations; (b) parasitic and secondary effects which cause degradations; and (c) crosstalk-induced performance degradation. These problems are eliminated or at least significantly mitigated when using a digital implementation or a hybrid implementation.
Until the late 1980s or early 1990s, analog implementations were the workhorses of demodulators for high-speed communications. With the advent of powerful and cheap microelectronic circuits over the last two decades (including, notably, high-speed samplers and fast and dense FPGAs), purely analog implementations are quickly being abandoned in favour of digital and hybrid systems.

1.3.2 Digital architectures

Digital receiver architectures have been the subject of much investigation over the past 20 years. Two general subclasses of this architecture are possible.

a) IF sampling

In the IF-sampling topology, the IF (Intermediate Frequency) signal is sampled, and the downconversion and demodulation are performed digitally on the sampled IF signal. Since the IF frequency is usually much higher than the symbol rate, the sampling of the IF signal must be done at a rate much higher than the theoretical minimum of 2 samples/symbol, which renders this approach impractical for demodulating high-datarate signals. Sometimes the IF signal can be sampled using bandpass sampling techniques (see for example [30 Sec. 2.3.2], [31 Sec. 10.4.3], [17 Chaps. 7, 10]). However, even with this approach the sampling rate is usually much higher than 2 samples/symbol, particularly when one takes into account the necessity to increase the sampling rate in order to be able to handle carrier frequency uncertainties. Thus, due to the high sampling rate required as compared to the symbol rate, the IF-sampling topology is usually adopted only for demodulation of low-datarate communications.

b) Near-baseband sampling

In the near-baseband topology, coarse frequency downconversion is done in the analog domain and the resulting near-baseband signal is sampled [22 Chap. 4]. Fine downconversion and demodulation are then performed entirely in the digital domain upon these near-baseband samples.
Perhaps the most comprehensive treatment of this subject can be found in [22] (see an overview of this architecture in [22 Chap. 4]). An excellent tutorial and many important results can be found in [32] and [33]. The near-baseband digital architecture offers the important advantage of considerably simplifying the analog section of the receiver (essentially reducing it to an AGC (Automatic Gain Control) circuit followed by a coarse downconversion circuit). However, the digital implementation has two big drawbacks. First, the need to perform downconversion and interpolation ([32], [33], [22]) in the digital domain implies a rather complicated digital section. Secondly, the sampling rate necessary for good performance is at least 2 samples/symbol, and more likely at least 2.5 to 3 samples/symbol (see [33 Tables IV, V]). This compares with a minimum of 1 sample/symbol that is necessary for a hybrid implementation (see Sec. 1.3.3). As sampler and FPGA/ASIC speeds become higher and digital logic densities increase, there is no doubt that digital implementations will gradually replace hybrid and analog implementations for many communications systems. However, due to the high sampling rate requirement and the high digital logic complexity, there shall always be a sizeable portion of receivers which are implemented using hybrid architectures, especially for high-datarate communications.

1.3.3 Hybrid architectures

In a hybrid architecture, some of the PLL components are implemented in the analog domain while others are implemented using digital logic. Since many components can be implemented in either analog or digital form, this gives rise to many architectural possibilities. For example, in one hybrid implementation of a carrier PLL only the phase detector is implemented digitally, while in another implementation the digital portion would also include the loop filter and the matched filters.
In general, the choice as to whether to implement a certain component in the analog or digital domain will come down to economics: i.e. what is the cheapest and/or easiest and/or most performance-effective way to implement that particular component, or what is the best implementational tradeoff given these constraints. An example hybrid implementation of an M-PSK receiver is shown in Fig. 12, although the reader is advised to remember that other variations on this architecture are possible.

Fig. 12. General structure of a hybrid receiver for digital communications. The parts within the dashed line are implemented digitally, while the rest are analog components (the samplers and DDS (Direct Digital Synthesizer) are mixed-signal components). (Block labels: IF input, matched filters, sample rate 1/T, carrier phase detector, carrier PLL loop filter, Direct Digital Synthesizer, carrier lock detector, symbol timing error detector, symbol PLL loop filter, symbol PLL lock detector.)

Hybrid implementations offer the important advantage of being able to operate with excellent performance at sample rates as low as 1 sample/symbol (if a 1 sample/symbol TED is used, such as the various detectors discussed in [34], [35], [36], [21 Chaps. 7, 8] and [37], [38], [39], [40]). This is at least 2 to 3 times lower than the necessary sampling rate for a digital topology [33]. Moreover, the digital logic can be made to be extremely simple (see [41], [18], [1]). At a rate of 2 samples/symbol, virtually ideal performance can be achieved and carrier-independent timing error detectors that require 2 samples/symbol can be used ([42], [43], [37], [44]). Even the rate of 2 samples/symbol is lower than the 2.5 to 3 samples/symbol (i.e. 25% to 50% more) usually needed for using a simple linear interpolator within the digital topology [33 Table V].
The hybrid topology is thus particularly suited for high-datarate communications (for example, with current technology, symbol rates above 50 MHz) where sampling and real-time processing of the IF or near-baseband signal is often either impossible or uneconomical.

Using the hybrid architecture allows the designer to enjoy architectural benefits which are unavailable when using a completely analog or completely digital receiver. As alluded to earlier, sampling the baseband signal rather than the IF or near-baseband signal generally allows the use of less expensive samplers which operate at a lower clock rate. Furthermore, the fact that downconversion of the incoming signal is done in the analog domain considerably simplifies the digital logic required, since the latter is relieved from the need to perform downconversion of the complex (i.e. I-Q) sampled signal. On the other hand, having the I-Q demodulator in the analog domain driven by an analog local carrier allows an arbitrarily high IF frequency $f_{IF}$ to be used with the aid of an external mixer and a fixed oscillator. Thus, with a relatively inexpensive low-frequency DDS (Direct Digital Synthesizer) and an additional (relatively inexpensive) fixed RF oscillator, the use of an expensive and difficult-to-use high-frequency VCO (Voltage Controlled Oscillator) is averted. Moreover, the long-term stability and phase noise characteristics achievable with DDS chips are difficult to attain using a VCO. See [45 App. A] and [46] for an explanation of the DDS's operation.

The advantages of implementing the phase detector and the loop filter digitally are numerous. First and perhaps foremost, the repeatability of filter and transient response specifications that can be achieved via a digital implementation is exceedingly difficult to duplicate in an analog implementation.
Second, arbitrarily complicated phase detector and filter structures may be implemented, whereby that complexity is only limited by the amount of logic and computing power available for the digital section's implementation. Finally in this very non-exhaustive list, the implementation of certain synchronization loop elements by digital means allows testability and probing with accuracy and availability that is hard to attain in a completely analog system; for example, if the loop filter is implemented digitally then its (digital) input may be monitored by a computer console or analyzed in real time using the FFT.

Due to the low sampling rate requirements and the relatively low complexity in both the analog and digital domains, the hybrid implementation is the architecture of choice for high-datarate communications. Thus, it is of most interest to us. Hence, as already stated, we shall assume for the remainder of the thesis that the receiver has a hybrid architecture, whose precise characteristics are discussed in the next section.

1.4 Signal and Receiver Models that are Used in this Thesis

In this section we shall present the M-PSK receiver model that is discussed in the remainder of the thesis. In this thesis we are not concerned with implementation of the symbol synchronization PLL, and, unless otherwise stated, that PLL is assumed to be operating ideally. We thus limit ourselves to investigation of the carrier PLL, which is assumed to be a hybrid PLL as discussed in Sec. 1.3.3. Though a hybrid structure is assumed, we note that the structures presented in this thesis will work and will have exactly the same behaviour if the carrier PLL has a digital topology. The simplified receiver model is as shown in Fig. 13, and this model is used throughout the thesis.
Fig. 13. M-PSK receiver model that is used in this thesis. (Block labels: IF input $\mathrm{Re}[m(t)\exp(j(\omega_i t+\theta_i))]+n(t)$; local carrier generation (NCO/VCO) supplying $2\cos(\omega_i t+\Delta\omega t+\theta_o)$ to the I arm and, through a 90° phase shift, the quadrature carrier to the Q arm; gain K; matched filters; samplers at rate 1/T producing I(n) and Q(n); phase detector; loop filter; lock detector; symbol decision.)

We denote the M-PSK data signal as $m(t) = \sum_{n=-\infty}^{\infty} a_n p(t-nT)$, where:

$$ a_n = \exp(j\phi_n), \quad \phi_n = 2\pi m_n/M, \quad m_n \in \{0,1,\dots,M-1\} \tag{1} $$

is the actual data and $p(t)$ is the baseband data pulse. The modulated signal is $s_m(t) = \mathrm{Re}[m(t)e^{j\omega_i t+j\theta_i}]$ and that signal is corrupted by an AWGN channel. Fig. 13 shows a simplified diagram of the M-PSK receiver under discussion. In Fig. 13:

1. 1/T is both the symbol rate and the sample rate.
2. We assume a narrowband bandpass signal, i.e. $\omega_i \gg 1/T$.
3. $n(t) \sim N(0, N_0 W)$, where W is the width of the bandpass IF filter (not shown).
4. K represents the physical gain associated with the circuit. It is assumed that K has the same value in both the I and Q arms (i.e. the arms are "balanced"). In general, K is a slow function of time controlled by the AGC to achieve a desired signal level at the sampler inputs. A more detailed discussion of the AGC and the parameter K is presented in Section 1.5.
5. When the carrier loop is locked around a stable equilibrium point, we have $\Delta\omega = 0$ and (since M-PSK carrier synchronization has an inherent M-fold phase ambiguity ([13 Chap. 6], [22 Chap. 5, 6], [21 Sec. 5.7])) $\theta_o \in \{\theta_i + 2\pi k/M - \theta_e \mid k = 0,1,\dots,M-1\}$, where $|\theta_e| < \pi/M$ is the residual phase error.
6. The matched filter $h(t)$ is assumed ideal and the sampling at the outputs of the matched filters is considered to be at the ideal time (i.e. the symbol synchronization loop is assumed locked).
7. From [13 Sec. 4.1.1] we have $E_s = \int_{-\infty}^{\infty} [p(t)\cos(\omega_i t+\theta_i)]^2\,dt \approx \tfrac{1}{2}\int_{-\infty}^{\infty} p^2(t)\,dt = \tfrac{1}{2}E_p$.
Without loss of generality, we assume for convenience $E_p = 1$ (implying $E_s = \tfrac{1}{2}$).
8. Throughout this thesis, the terms SNR and Es/N0 ratio are used interchangeably, and we use the notation $\chi$ to refer to the Es/N0 ratio (=SNR).
9. We assume that the Nyquist criterion for zero-ISI [13 Sec. 9.2.1] is obeyed regarding the output of the matched filters. Two important pulse shapes that fulfill this condition are:

• The rectangular pulse:

$$ p(t) = \begin{cases} 1/\sqrt{T}, & -T/2 < t < T/2 \\ 0, & \text{otherwise} \end{cases} \tag{2} $$

• The Square-Root Raised Cosine (SRRC) pulse ([47 eq. 68.15]) which is shown in Fig. 14 (where $0 \le \alpha \le 1$ is the rolloff factor):

$$ p(t) = \frac{\sin\left((1-\alpha)\pi t/T\right) + (4\alpha t/T)\cos\left((1+\alpha)\pi t/T\right)}{(\pi t/\sqrt{T})\left(1-16\alpha^2 t^2/T^2\right)} \tag{3} $$

10. Throughout this thesis we make the standard assumption made in synchronization texts (e.g. [22], [21]) that the carrier PLL is a high-loop-gain second-order system. Hence, the linearized-model Laplace transfer function of the PLL is $H(s) = \Theta_o(s)/\Theta_i(s) = \dfrac{2\zeta\omega_n s + \omega_n^2}{s^2 + 2\zeta\omega_n s + \omega_n^2}$, where $\zeta$ is the damping ratio and $\omega_n$ is the natural frequency in radians/sec (see for example [25 Sec. 2.2]).
11. We assume, unless otherwise stated, that no signal fading is occurring. Hence, the Es/N0 ratio is considered to be a constant. Fading effects shall be treated in a specific manner where appropriate.

Fig. 14. Square-Root Raised Cosine pulse shapes. (Curves for α = 0, α = 0.5 and α = 1; horizontal axis: time in symbol intervals.)

Note that in Fig. 13 we have:

$$ n_I(nT) = \left[\left(2n(t)\cos(\omega_i t+\Delta\omega t+\theta_o)\right)\otimes h(t)\right]_{t=nT} \sim N(0, 2N_0E_s), \qquad n_Q(nT) = \left[\left(-2n(t)\sin(\omega_i t+\Delta\omega t+\theta_o)\right)\otimes h(t)\right]_{t=nT} \sim N(0, 2N_0E_s) \tag{4} $$

and:

$$ I(n) = K\left(2E_s\cos(-\Delta\omega\cdot nT+\theta_i-\theta_o+\phi_n) + n_I(nT)\right), \qquad Q(n) = K\left(2E_s\sin(-\Delta\omega\cdot nT+\theta_i-\theta_o+\phi_n) + n_Q(nT)\right) \tag{5} $$

We shall now derive (5). Referring to the I channel, we have from Fig.
13:

$$ I(t) = K\left[\left(\sum_{r=-\infty}^{\infty} p(t-rT)\cos(-\Delta\omega t+\theta_i-\theta_o+\phi_r)\right)\otimes h(t) + n_I(t)\right] = K\sum_{r=-\infty}^{\infty}\int_{-\infty}^{\infty}\cos(-\Delta\omega\tau+\theta_i-\theta_o+\phi_r)\,p(\tau-rT)\,h(t-\tau)\,d\tau + K n_I(t) \tag{6} $$

We shall limit ourselves to dealing with baseband pulses $p(t)$ that are real. Furthermore, we require that $p(t)$ have a finite effective length defined as $L = mT$, where m is some (small) positive integer, chosen so that $p(t)$ is nonzero only in the interval $[-L/2, L/2]$. This condition on $p(t)$ does not limit the applicability of the ensuing analysis, since the vast majority of practical systems employ baseband pulses which comply with this requirement or are closely approximated for the purposes of analysis by such finite-duration pulses (this is true, for example, for both rectangular and SRRC pulses, given in (2) and (3), respectively). Under those conditions the matched filter $h(t)$ abides by $h(t) = p(-t)$. Furthermore, for convenience we define $s(t) = p(t)\otimes h(t)$, and we note that we have assumed that $s(t)$ conforms to the Nyquist criterion for zero-ISI [13 Sec. 9.2.1]. From [13 Chap. 9] we thus have for any integer r:

$$ s(rT) = \begin{cases} s(0) = E_p = 2E_s, & r = 0 \\ 0, & r \neq 0 \end{cases} \tag{7} $$

For the ensuing analysis to be valid, we must assume that the beat note $\Delta\omega$ has a much smaller frequency than the matched filter bandwidth. This is not a real limitation since if this condition is not obeyed, then the I and Q signals cannot even be considered baseband signals and a coarse (open-loop) correction of the local carrier is needed if there is to be any hope of the loop acquiring lock. Formulation of this condition takes the following form:

$$ \frac{|\Delta\omega|}{2\pi} \ll \frac{1}{L} = \frac{1}{mT} \tag{8} $$

Re-examining (6), we note that since $h(t)$ is nonzero only in the interval $[-L/2, L/2]$, the integration bounds can be reduced to comprise only the interval $[t-L/2, t+L/2]$. Furthermore, because of (8), the signal $\cos(-\Delta\omega\tau+\theta_i-\theta_o+\phi_r)$ can be assumed constant in the interval $[t-L/2, t+L/2]$ with the approximate value of $\cos(-\Delta\omega t+\theta_i-\theta_o+\phi_r)$.
Using these observations, we can simplify (6) to:

$$ I(t) = K\sum_{r=-\infty}^{\infty}\cos(-\Delta\omega t+\theta_i-\theta_o+\phi_r)\left(\int_{t-L/2}^{t+L/2} p(\tau-rT)\,h(t-\tau)\,d\tau\right) + K n_I(t) = K\sum_{r=-\infty}^{\infty}\cos(-\Delta\omega t+\theta_i-\theta_o+\phi_r)\,s(t-rT) + K n_I(t) \tag{9} $$

The n-th symbol is sampled at time $t = nT$, so that:

$$ I(n) = I(t)\big|_{t=nT} = K\sum_{r=-\infty}^{\infty}\cos(-\Delta\omega\cdot nT+\theta_i-\theta_o+\phi_r)\,s(nT-rT) + K n_I(nT) \tag{10} $$

Because of (7), only one term in the summation does not vanish (the one for which r=n), and we are left with:

$$ I(n) = K\left(\cos(-\Delta\omega\cdot nT+\theta_i-\theta_o+\phi_n)\,s(0) + n_I(nT)\right) = K\left(2E_s\cos(-\Delta\omega\cdot nT+\theta_i-\theta_o+\phi_n) + n_I(nT)\right) \tag{11} $$

which agrees with the I component in (5). A similar analysis can be carried out to prove the equation for the Q component in (5).

1.5 The AGC's Operation

General analysis of AGC-induced effects is hindered by the fact that the constraints and parameters of AGC circuits are strongly dependent upon the specific communications system. It is thus, perhaps, less of a surprise that most contemporary synchronization texts ignore these effects (by assuming a constant K=1). This is the case, for example, throughout [22] and [21], which are some of the most comprehensive modern works on synchronization in wireless communications. Nonetheless, some treatment of AGC effects does exist; see for example [48 Chap. 9] and [49 Chap. 7], though the discussions there pertain to unmodulated carrier-wave synchronization.

For this thesis we wish to attain an understanding of the AGC's operation and effects upon the carrier PLL and associated structures. To that end, it is instructive to take a look at the waveforms before and after the AGC. We assume throughout this thesis an example AGC that attempts to control the waveform amplitudes at the input of the samplers so that the RMS value of the waveforms is 80% of the dynamic range of the samplers. We also assume, for simplicity, that the samplers' full-scale input range is ±1 volt, which means that our AGC tries to ensure that the pre-sampler waveforms have an RMS of 0.8 volt.
The characteristics of this example AGC shall be discussed further in the following subsections.

1.5.1 The parameter K and its relationship with the AGC

Obtaining a physical insight into the meaning of the parameter K (see Fig. 13) is straightforward and should perhaps even be intuitively apparent to persons who have designed and built an M-PSK wireless receiver. To explicitly spell out this meaning, we recall that we assumed in Sec. 1.4 that for convenience and without loss of generality the baseband pulse energy (=matched filter energy) is unity, i.e. that $E_p = 2E_s = \int_{-\infty}^{\infty} p^2(t)\,dt = \int_{-\infty}^{\infty} h^2(t)\,dt = 1$. Let us further assume that we are ideally locked (i.e. $\Delta\omega = 0$ and $\theta_o \in \{\theta_i + 2\pi k/M \mid k = 0,1,\dots,M-1\}$). We then have from (5) that:

$$ I(n) = K\left(\cos\phi_n + n_I(nT)\right) \quad\text{and}\quad Q(n) = K\left(\sin\phi_n + n_Q(nT)\right). \tag{12} $$

Now, define the time-average operator $\langle\cdot\rangle$ as $\langle x(n)\rangle = \lim_{N\to\infty}\frac{1}{2N}\sum_{n=-N+1}^{N} x(n)$. It is then easy to see that:

$$ \sqrt{\langle I^2(n) + Q^2(n)\rangle} \xrightarrow{E_s/N_0\to\infty} K \tag{13} $$

i.e. at high SNR we have that K is roughly the RMS (root-mean-square) of the M-PSK signal.

Now, to inject a few more real-world issues into the model, we know that samplers have a finite number of bits and the AGC's job is, as already noted, to ensure that the samplers are not overdriven or underdriven. Consider, for example, a system which samples the input I and Q channels with 8-bit samplers, which give a range (see footnote 1) for I(n) and Q(n) of ±127. Let us also assume that the AGC controls the input signal so that, to avoid sporadic overdriving of the samplers due to noise, the input signal's RMS is controlled to about 80% of the dynamic range. Note that 8-bit samplers and 80% driving of the samplers are certainly real-world parameters. In terms of the signal model used in the thesis (see Fig. 13) where unity-gain samplers are assumed, it can be seen that at high SNR the model of Fig. 13 applies with about K = 100 (we rounded to K = 100; the exact number is K = 80% × 127 = 101.6). Let us look at the situation at low SNR.
Since we assumed for our model a constant $E_s$, it follows that $\mathrm{var}(n_I(nT)) \to \infty$ and $\mathrm{var}(n_Q(nT)) \to \infty$ as $E_s/N_0 \to 0$. The AGC still needs to control K so that the samplers are not overdriven. Hence, at low SNR, to ensure finite and non-overdriving sampler inputs, we must have $K \to 0$ as $E_s/N_0 \to 0$. In summary then, if b is the number of bits in the samplers (including sign bit) and the AGC attempts to control the RMS of the input signal to 100·r percent of the samplers' range, we have $K \to 0$ as $E_s/N_0 \to 0$ and $K \to r\cdot 2^{b-1}$ as $E_s/N_0 \to \infty$. For example, for the 8-bit samplers with 80% driving we discussed above, we have b=8 and r=0.8, and the dynamic range of K is about 0<K<100.

We could have just as easily adopted the convention that the binary point at the output of the samplers immediately follows the sign bit. For example, if the output of the sampler is 00000011b, this could signify the value 3 or, alternatively, 3/128 (if we think of the binary point as being at the right of the sign bit, i.e. 0.0000011b). This decision is purely arbitrary and has no actual bearing upon the resultant dynamic range analysis stemming from K's behaviour and the influence of the latter upon the rest of the receiver. Under the convenient assumption of the binary point being after the sign bit of the sampler, we have that 0<K<r<1 regardless of the number of bits in the sampler. Unless otherwise stated, this is the convenient implicit notational choice made in this thesis.

Footnote 1: More accurately, in a two's complement 8-bit system, we would have $I(n), Q(n) \in \{-128, \dots, 127\}$. For simplicity and fluency of the discussion (and without incurring any appreciable loss of accuracy), we ignore the -128 value.

1.5.2 Demonstration of the AGC's operation

For the purposes of the following demonstration we make the convenient assumption that we are ideally locked (i.e. $\Delta\omega = 0$ and $\theta_o \in \{\theta_i + 2\pi k/M \mid k = 0,1,\dots,M-1\}$).
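The limiting behaviour of K can be made concrete with a small numerical sketch. It is not taken from the thesis and rests on these assumptions: the binary point follows the sign bit (so that 0<K<r<1), the receiver is ideally locked, $2E_s = 1$ so that each arm carries noise of variance $1/(2\chi)$, and the ideal AGC holds the combined RMS $\sqrt{\langle I^2(n)+Q^2(n)\rangle}$ at r. Under those assumptions the gain works out to $K = r/\sqrt{1+1/\chi}$; both this closed form and the function names are part of the sketch.

```python
import math
import random


def ideal_agc_gain(esn0_db, r=0.8):
    """Gain K of an ideal AGC that holds sqrt(<I^2 + Q^2>) at r.

    With unit signal power and total noise power 1/chi before the gain,
    K^2 * (1 + 1/chi) = r^2.
    """
    chi = 10.0 ** (esn0_db / 10.0)
    return r / math.sqrt(1.0 + 1.0 / chi)


def simulated_agc_gain(esn0_db, r=0.8, m=4, num_symbols=100_000, seed=3):
    """Estimate the same gain from simulated pre-AGC M-PSK samples."""
    rng = random.Random(seed)
    chi = 10.0 ** (esn0_db / 10.0)
    sigma = math.sqrt(1.0 / (2.0 * chi))
    power = 0.0
    for _ in range(num_symbols):
        phi = 2.0 * math.pi * rng.randrange(m) / m
        i_val = math.cos(phi) + rng.gauss(0.0, sigma)
        q_val = math.sin(phi) + rng.gauss(0.0, sigma)
        power += i_val * i_val + q_val * q_val
    rms_before_agc = math.sqrt(power / num_symbols)
    return r / rms_before_agc


if __name__ == "__main__":
    for snr_db in (-20.0, -3.0, 2.0, 20.0):
        print(snr_db, ideal_agc_gain(snr_db), simulated_agc_gain(snr_db))
```

The printed values approach r = 0.8 at high SNR and shrink toward 0 as the noise dominates, in line with the two limits stated above.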
Now, let us look at the pre-AGC (=post-matched-filter) and the post-AGC (=pre-sampler) waveforms at various SNRs. We shall look at the I channel of a BPSK signal that has rectangular baseband pulses (remember that the signal has passed through a matched filter, so the signal waveform is now composed of the triangular pulses $p(t)\otimes h(t)$). The sampling instants in the following figures are denoted by small circles.

Fig. 15. Pre-AGC and post-AGC sampled waveforms for Es/N0 = 20 dB.

Fig. 16. Pre-AGC and post-AGC sampled waveforms (Es/N0 value not legible in the source).

Fig. 17. Pre-AGC and post-AGC sampled waveforms for Es/N0 = 2 dB.

Fig. 18. Pre-AGC and post-AGC sampled waveforms for Es/N0 = -3 dB.

(Panels in each of Fig. 15 to Fig. 18: Signal, Noise, Signal+Noise before the AGC; S after AGC, N after AGC, S+N after AGC.)

In the above graphs, we see the effects of the AGC upon the waveforms. In the bottom subplots, the dashed horizontal lines represent the samplers' full-scale voltage. Of course, the AGC "sees" only the pre-AGC signal+noise waveform, and the samplers "see" only the post-AGC signal+noise waveform; the separation into separate signal and noise waveforms that is shown in the above figures is only presented, courtesy of the computer simulation, for the benefit of the reader. As can be clearly seen in the graphs, the AGC ensures that the signal+noise waveform at the input of the samplers is such that the samplers are saturated infrequently. At Es/N0 = -3 dB, which is just about the carrier PLL lock threshold for BPSK, we see from inspection of Fig. 18 that the samplers will be saturated in only 10 to 12 samples out of 100, i.e. at most about 12% of the time, which may have an acceptably small effect on the receiver (results for higher SNRs are, of course, even better, as Fig. 15 to Fig. 17 show).
Yet, the 80% RMS driving level ensures that most of the dynamic range of the samplers is used at all SNRs (hence minimizing the quantization noise). Clearly, had K not been reduced to accommodate the noise power, the samplers would be saturated almost all the time, especially at low SNRs, as can be seen in the signal+noise waveforms before the AGC for Es/N0 = -3 dB and Es/N0 = 2 dB (in Fig. 18 and Fig. 17, respectively).

Now, notice that the assumption made here is of a constant $E_s$ and a changing noise power. Put another way, we assume that the Es/N0 changes because $N_0$ changes. This is contrary to what happens in practice. In practice, the noise power is generally constant (we have NoisePowerPerHz [dBm/Hz] = ThermalNoisePowerPerHz [dBm/Hz] + ReceiverNoiseFigure [dB]) and the Es/N0 changes because $E_s$ changes. However, the adoption of the convention of a constant $E_s$ does not impact the analysis, and in fact simplifies it. The analysis is simplified because we can assume that a true matched filter (with energy $E_p = 2E_s$) is present at the receiver. The receiver model adopted here is also the one used in most communications and synchronization texts (for example, see [22] and [21]). More importantly, the post-AGC (=pre-sampler) waveforms presented in Fig. 15 to Fig. 18 will be those that will indeed be encountered in practice.

Our AGC tries to control the RMS, namely it tries to control the value of $\sqrt{\langle I^2(n) + Q^2(n)\rangle}$, where $\langle\cdot\rangle$ is the time-average operator defined as $\langle x(n)\rangle = \lim_{N\to\infty}\frac{1}{2N}\sum_{n=-N+1}^{N} x(n)$. Therefore, this is a squaring detection AGC. Note that, despite appearances, this is not an Envelope Detection AGC. An Envelope Detection AGC would try to control the value of $\left\langle\sqrt{I^2(n) + Q^2(n)}\right\rangle$. In contrast, our AGC tries to control the value of $\sqrt{\langle I^2(n) + Q^2(n)\rangle}$ so that it equals a certain voltage v. This is equivalent to trying to control the value of $\langle I^2(n) + Q^2(n)\rangle$ so that it equals $v^2$.
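The difference between the two detection rules can be seen by computing both statistics on the same samples. The sketch below is illustrative only and assumes an ideally locked BPSK signal of unit amplitude with noise of variance $1/(2\chi)$ in each arm before the AGC gain; the function name is hypothetical.

```python
import math
import random


def agc_detector_statistics(esn0_db, num_symbols=100_000, seed=11):
    """Return (squaring statistic, envelope statistic) for pre-AGC BPSK samples.

    squaring statistic: sqrt(<I^2 + Q^2>)   (RMS, as controlled by a squaring AGC)
    envelope statistic: <sqrt(I^2 + Q^2)>   (mean envelope, as in envelope detection)
    """
    rng = random.Random(seed)
    chi = 10.0 ** (esn0_db / 10.0)
    sigma = math.sqrt(1.0 / (2.0 * chi))
    power_sum = 0.0
    envelope_sum = 0.0
    for _ in range(num_symbols):
        i_val = rng.choice((-1.0, 1.0)) + rng.gauss(0.0, sigma)
        q_val = rng.gauss(0.0, sigma)
        power = i_val * i_val + q_val * q_val
        power_sum += power
        envelope_sum += math.sqrt(power)
    return math.sqrt(power_sum / num_symbols), envelope_sum / num_symbols


if __name__ == "__main__":
    for snr_db in (-3.0, 2.0, 20.0):
        rms, envelope = agc_detector_statistics(snr_db)
        print(snr_db, rms, envelope, envelope / rms)
```

The mean envelope never exceeds the RMS, and for this model the ratio of the two stays between roughly 0.89 (noise-dominated) and 1 (high SNR), so the two detection rules lead to gains that differ by only about ten percent at most.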
That having been said, the graphs for envelope detection and square-law detection are very similar, as [49 Fig. 7.2-5] shows, so the distinction is almost immaterial. As further verification of the validity of the AGC behaviour presented here, we can define the AGC signal suppression factor at $E_s/N_0 = \chi$ as

$$\alpha_{AGC}(\chi) \triangleq \frac{K(E_s/N_0 = \chi)}{K(E_s/N_0 = \infty)}.$$

The graph of $\alpha_{AGC}(\chi)$ for our example AGC is shown below.

[Fig. 19 plot: $\alpha_{AGC}$ versus $E_s/N_0$ (dB), from -20 dB to 20 dB.]
Fig. 19. AGC signal suppression factor for the example AGC used in this thesis.

Comparing Fig. 19 to [49 Fig. 7.2-5] we see that the AGC curve presented here is indeed in agreement with the data presented in [49] for the Squaring Detection AGC. The congruence between the curves is perfect and serves to highlight the validity of the AGC parameters used in this thesis.

1.5.3 Discussion - ideal AGC vs. atrophied AGC

It is of paramount importance to realize that we are still assuming an ideal AGC, i.e. the example AGC circuit discussed above is assumed to be devoid of lag time and is assumed to control the RMS of the pre-sampler waveforms to precisely 80% of the samplers' dynamic range. The assumption of a constant $K = 1$, though undertaken in the vast majority of synchronization texts (e.g. [22], [21]), does not really describe an ideal AGC, but rather an atrophied AGC which operates within a system whose samplers have an infinite dynamic range and an infinite number of quantization bits. Clearly, the AGC discussed here (though still "ideal") is a much closer approximation of reality than the assumption of a constant $K = 1$.

Chapter 2 A Family of Self-Normalizing Carrier Lock Detectors and $E_s/N_0$ Estimators for M-PSK

2.1 Introduction

When building coherent M-PSK receivers, there is an invariable need to generate a reliable carrier lock detection mechanism.
This is necessary, for example, in order for the receiver to know when to stop searching the carrier frequency uncertainty region, so as to avoid driving the receiver out of lock. As another example, lock detection is necessary in order to allow the receiver to change (adaptively) the loop-filter parameters to obtain different responses for acquisition and for tracking. Yet another use of lock detection is as a start trigger for downstream decoding and data processing elements of the receiver. Generally, lock detection entails generating a lock metric, which is compared to a threshold. If that threshold is exceeded, then lock is assumed; otherwise the receiver is considered to be unlocked. This process is illustrated in Fig. 20.

1 This chapter was published in part in Linn, Y. and Peleg, N., "A Family of Self-Normalizing Carrier Lock Detectors and $E_s/N_0$ Estimators for M-PSK and Other Phase Modulation Schemes", IEEE Transactions on Wireless Communications, vol. 3, no. 5, pp. 1659-1668, Sept. 2004.

[Fig. 20 block diagram: I(n) and Q(n) feed a computation algorithm based on many symbols, which outputs the lock metric.]
Fig. 20. Lock detection principle

Lock detectors such as those suggested in [50], [51], [48 Chap. 11], [22 Sec. 6.5.2], [25 Sec. 5.4], [52] and [53] operate according to the principle of Fig. 20. The prevalent methods for carrier lock detection for M-PSK rely either on Non Data Aided (NDA) detection based on Mth-order nonlinearities ([50], [51]) or on Decision Directed (DD) schemes [22 Sec. 6.5.2]. However, a drawback of the detectors in the aforementioned references is that the lock detector output is dependent upon the input signal level (see footnote 2), and thus the threshold must be so dependent. This regularly overlooked problem often consumes a disproportionate amount of engineering time and energy during receiver design, in order to accommodate the dynamic range of the lock detector and to avoid false locking or false unlocking due to non-ideal signal levels or non-ideal AGC performance.
Even with an ideal AGC circuit, any change in the AGC's nominal level must evoke a corresponding change in the lock detection threshold. In contrast, the lock detectors suggested in this chapter are self-normalizing, that is, the lock threshold's value may be set independently of signal level or AGC performance.

2 The term signal level as it is used in this thesis must not be confused with the $E_s/N_0$ ratio. The former refers to the total signal+noise power that is present at the inputs of the samplers in Fig. 13, while the latter refers to the signal-to-noise ratio of that signal.

As a useful by-product, the value of the proposed lock metrics, when in lock, will be shown to be a reliable indicator of the $E_s/N_0$ ratio, which is another important metric in receiver operation. Frequently, a downstream decoder can make use of an $E_s/N_0$ estimate for modifying its own internal metrics in order to increase its coding gain (e.g. turbo codes [12], diversity reception [13 Sec. 14.4]). Alternatively, the use of adaptive coding schemes (e.g., [14], [15]), where the coding and/or data rate is altered according to the channel $E_s/N_0$, presupposes the availability of a reliable and timely $E_s/N_0$ estimate. What is appealing about the proposed metrics is that they provide just such an estimate based only on the sampled baseband input signal (sampled at a rate of one sample per symbol, where that sample corresponds to the symbol strobe), require a relatively small number of samples in order to arrive at an accurate estimate, and are independent of the data sequence. This obviates the need to estimate the channel $E_s/N_0$ from the pre- or post-decoder symbol or bit error rate of the received sequence, as is often done.
Finally, the lock detectors presented here will also be shown to have an extremely simple hardware implementation that requires only a single, compact lookup table and uses summation as the only mathematical operation, thus greatly facilitating implementation as part of an FPGA-based or ASIC-based receiver.

The layout of this chapter is as follows. Section 2.2 presents an overview of the general structure of the receiver and signal to which the discussion applies. Section 2.3 engages in rigorous statistical characterization of the lock metric in an AWGN channel, assuming perfect (i.e. jitter-free) carrier and symbol synchronization. That section further outlines the lock detector's hardware implementation and discusses $E_s/N_0$ estimation from the lock metric value, and it culminates in a subsection that aims to provide the reader with an intuitive insight into the lock metric's behaviour. Section 2.4 analyzes the lock detector's performance when imperfect locking is present, providing a detailed quantitative treatment for the case of carrier synchronization phase jitter. In Section 2.5 the variance and distribution of the lock metric are investigated. Section 2.6 provides design formulas for determining the lock detector parameters that will result in desired lock detection probabilities and false alarm rates. Section 2.7 discusses the lock detector's operation in the presence of fading. The final section in this chapter, Section 2.8, is devoted to conclusions.

2.2 Signal and Receiver Models

The signal and receiver models, as well as the applicable notations, have been defined in Section 1.4. When the term "signal-level dependence" is used in this chapter, it refers to dependence on K.
Since K multiplies both the signal and noise, it is clear that any dependence of the lock detector characteristics or threshold on K is a mathematical encumbrance, as well as a practical one, because it introduces K (and its dynamic range) as a quantity to be reckoned with during the lock detection process. Furthermore, since K is a function of time controlled by the AGC (see Sec. 1.5), any dependence on K implies dependence of the lock detection process on the AGC loop's temporal behaviour, often through a decidedly nonlinear interaction. And yet, with previously available lock detectors such as those presented in [50], [51], [48 Chap. 11], [22 Sec. 6.5.2], [25 Sec. 5.4], and [53], precisely that kind of dependence exists. In contrast, the lock detectors suggested here are (as shall be shown shortly) free of any dependence on K, and hence they and their thresholds are nearly impervious to the AGC's performance or dynamic range.

2.3 Detector Characteristics

2.3.1 Basic Definitions and Equations

The family of lock detectors is defined as:

$$l_{M,N} \triangleq \frac{1}{2N} \sum_{n=-N+1}^{N} \frac{\mathrm{Re}\left[(I(n) + j \cdot Q(n))^M\right]}{\left(I^2(n) + Q^2(n)\right)^{M/2}} \tag{14}$$

which is of course a finite approximation of:

$$l_{M,\infty} \triangleq \left\langle \frac{\mathrm{Re}\left[(I(n) + j \cdot Q(n))^M\right]}{\left(I^2(n) + Q^2(n)\right)^{M/2}} \right\rangle \tag{15}$$

where $\langle \cdot \rangle$ represents the time average operator defined as:

$$\langle x(n) \rangle = \lim_{N \to \infty} \frac{1}{2N} \sum_{n=-N+1}^{N} x(n)$$

For example, for M = 4 (QPSK) we have

$$l_{4,N} = \frac{1}{2N} \sum_{n=-N+1}^{N} \frac{I^4(n) - 6 I^2(n) Q^2(n) + Q^4(n)}{\left(I^2(n) + Q^2(n)\right)^2}$$

We shall also define for future convenience:

$$x_{M,n} \triangleq \frac{\mathrm{Re}\left[(I(n) + j \cdot Q(n))^M\right]}{\left(I^2(n) + Q^2(n)\right)^{M/2}}$$

whereupon we have:

$$l_{M,\infty} = \langle x_{M,n} \rangle \tag{17}$$

$$l_{M,N} = \frac{1}{2N} \sum_{n=-N+1}^{N} x_{M,n} \tag{18}$$

The proposed lock detector can be thought of as a modification of the Mth-order nonlinearity detector ([50], [51], [22 Sec. 6.5.2]). To see this, consider that if the denominator term in (14) is eliminated, the latter equation reduces to $\frac{1}{2N} \sum_{n=-N+1}^{N} \mathrm{Re}\left[(I(n) + j \cdot Q(n))^M\right]$, which is the Mth-order nonlinearity detector.
Generally, one can say that the denominator term in (14) performs adaptive normalization on the numerator; this action has a profound influence on the lock detector's statistics and implementation and makes it, despite the notational similarity, quite different from the Mth-order nonlinearity detector. These performance and implementational advantages shall be investigated in the remainder of this chapter.

2.3.2 Lock Detector Expectation

Elementary rectangular-to-polar manipulations and the use of De Moivre's theorem [54 eq. 6.9] (see footnote 3) yield:

$$\mathrm{Re}\left[(I(n) + j \cdot Q(n))^M\right] = \left(\sqrt{I^2(n) + Q^2(n)}\right)^M \cdot \mathrm{Re}\left[(\cos\varphi_n + j \cdot \sin\varphi_n)^M\right] = \left(I^2(n) + Q^2(n)\right)^{M/2} \cos(M\varphi_n) \tag{19}$$

where:

$$\varphi_n = \tan^{-1}\left(\frac{Q(n)}{I(n)}\right) \tag{20}$$

and using (5) we can write:

$$\varphi_n = \tan^{-1}\left(\frac{Q(n)}{I(n)}\right) = \tan^{-1}\left(\frac{\sin(-\Delta\omega \cdot nT + \theta_i - \theta_o + \phi_n) + n_Q(nT)/(2E_s)}{\cos(-\Delta\omega \cdot nT + \theta_i - \theta_o + \phi_n) + n_I(nT)/(2E_s)}\right) \tag{21}$$

3 De Moivre's theorem states that for any real $x$ and $y$, $(x + j \cdot y)^M = (x^2 + y^2)^{M/2} \exp(j \cdot M\theta)$ where $\theta = \tan^{-1}(y/x)$.

If the carrier loop is locked and assuming perfect coherent demodulation, we have $\Delta\omega = 0$ and $\theta_o \in \{\theta_i + 2\pi k/M \mid k = 0, 1, \ldots, M-1\}$, and hence from (5):

$$I(n) = K\left(2E_s \cos\psi_n + n_I(nT)\right), \qquad Q(n) = K\left(2E_s \sin\psi_n + n_Q(nT)\right) \tag{22}$$

where:

$$\psi_n \triangleq \phi_n + \theta_i - \theta_o = \phi_n - 2\pi k/M = 2\pi(m_n - k)/M \tag{23}$$

Thus, when locked we have from (21)-(23):

$$\varphi_n = \tan^{-1}\left(\frac{\sin(\psi_n) + n_Q(nT)/(2E_s)}{\cos(\psi_n) + n_I(nT)/(2E_s)}\right) \tag{24}$$

(that is, when locked $\varphi_n$ is a noise-perturbed estimate of $\psi_n$). The rationale behind (14)-(15) now immediately becomes apparent:

$$l_{M,\infty} = \left\langle \frac{\mathrm{Re}\left[(I(n) + j \cdot Q(n))^M\right]}{\left(I^2(n) + Q^2(n)\right)^{M/2}} \right\rangle = \langle x_{M,n} \rangle = \left\langle \frac{\left(I^2(n) + Q^2(n)\right)^{M/2} \cos(M\varphi_n)}{\left(I^2(n) + Q^2(n)\right)^{M/2}} \right\rangle = \langle \cos(M\varphi_n) \rangle \tag{25}$$

If the carrier loop is unlocked, then the variables $\varphi_n$ are samples of phases of a rotating sinusoid (that sinusoid is noise-corrupted and phase-modulated, but it is nonetheless rotating). Thus:

$$l_{M,\infty}\big|_{\text{unlocked}} = \langle \cos(M\varphi_n) \rangle\big|_{\text{unlocked}} = 0 \tag{26}$$

Further justification of (26) will be given in Section 2.3.6.
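The behaviour described by (14), (25) and (26) can be illustrated with a short simulation. The Python sketch below is a minimal illustration under stated assumptions, not the simulation used in the thesis: the helper `mpsk_samples`, its unit symbol amplitude, the noise scaling and the fixed phase rotation per symbol used to mimic an unlocked loop are all hypothetical choices made for this example.

```python
import cmath
import math
import random

def lock_metric(samples, M):
    """Finite-sample lock metric of (14): mean of Re[(I+jQ)^M] / (I^2+Q^2)^(M/2)."""
    total = 0.0
    for i, q in samples:
        power = i * i + q * q
        if power == 0.0:
            continue  # phase undefined; treated as contributing 0 (assumption of this sketch)
        total += (complex(i, q) ** M).real / power ** (M / 2)
    return total / len(samples)

def mpsk_samples(M, K, es_n0_db, count, freq_offset=0.0, seed=0):
    """Hypothetical M-PSK generator: gain K, unit symbol amplitude, Gaussian noise
    on each rail, and an optional phase rotation of freq_offset rad per symbol."""
    rng = random.Random(seed)
    chi = 10.0 ** (es_n0_db / 10.0)
    sigma = 1.0 / math.sqrt(2.0 * chi)  # amplitude^2 / (2 sigma^2) equals Es/N0
    out = []
    for n in range(count):
        phase = 2.0 * math.pi * rng.randrange(M) / M + freq_offset * n
        z = cmath.exp(1j * phase) + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        out.append((K * z.real, K * z.imag))
    return out

locked_a = lock_metric(mpsk_samples(4, 0.5, 20.0, 4096), 4)
locked_b = lock_metric(mpsk_samples(4, 2.0, 20.0, 4096), 4)
unlocked = lock_metric(mpsk_samples(4, 0.5, 20.0, 4096, freq_offset=0.05), 4)
print(locked_a, locked_b, unlocked)
```

Because both locked runs share the same seed, they differ only in K; the two metrics agree to within rounding error, which is the self-normalizing property. The run with a rotating constellation averages to a value near zero, in line with (26).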
If the carrier loop is locked, then noting that:

$$\cos(M(\varphi_n - \psi_n)) = \cos(M\varphi_n)\cos(2\pi(m_n - k)) + \sin(M\varphi_n)\sin(2\pi(m_n - k)) = \cos(M\varphi_n) \tag{27}$$

and denoting $\Delta\phi_n \triangleq \varphi_n - \psi_n$, we have:

$$l_{M,\infty}\big|_{\text{loop is locked}} = \langle \cos(M\Delta\phi_n) \rangle \xrightarrow{E_s/N_0 \to \infty} \langle \cos(M \cdot 0) \rangle = 1 \tag{28}$$

When locked, the departure of $l_{M,\infty}$ from the value of 1 is dependent on the channel $E_s/N_0$, and for the finite approximation $l_{M,N}$ this also depends on how large N is. Thus it is clear that $l_{M,N}$, when the carrier loop is locked, provides an estimate of the channel $E_s/N_0$. In order to develop quantitative results regarding the dependence of $l_{M,N}$ on the $E_s/N_0$ ratio, we note that when in lock at $E_s/N_0 = \chi$ the process $\Delta\phi_n$ is distributed as the phase of a Rice random variable, with a probability density function (pdf) for $|\Delta\phi| \le \pi$ of ([55 Sec. 4.5], [56], [13 Sec. 5.2.7], [57 Sec. 3.4]):

$$p_R(\Delta\phi \mid \chi) \triangleq p(\Delta\phi_n = \Delta\phi \mid E_s/N_0 = \chi) = \frac{\exp(-\chi)}{2\pi}\left[1 + \sqrt{4\pi\chi}\,\cos(\Delta\phi)\exp\left(\chi\cos^2(\Delta\phi)\right) \cdot \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\sqrt{2\chi}\cos(\Delta\phi)} e^{-y^2/2}\,dy\right] \tag{29}$$

Noting that the distributions of the variables $\Delta\phi_n$ are identical (there is no dependence of the pdf on the actual symbol transmitted), and that these variables are mutually independent, we have, assuming ergodicity and using (27):

$$f_M(\chi) \triangleq E\left[l_{M,N} \mid E_s/N_0 = \chi\right] = E\left[l_{M,\infty} \mid E_s/N_0 = \chi\right] = \langle \cos(M\varphi_n) \rangle\big|_{E_s/N_0 = \chi} = \langle \cos(M\Delta\phi_n) \rangle\big|_{E_s/N_0 = \chi} = E\left[\cos(M\Delta\phi) \mid E_s/N_0 = \chi\right] = \int_{-\pi}^{\pi} \cos(M\Delta\phi) \cdot p_R(\Delta\phi \mid \chi) \cdot d\Delta\phi \tag{30}$$

It is easy to compute (30) numerically, but due to the complicated nature of $p_R(\Delta\phi \mid \chi)$, as expressed in (29), computation of (30) may seem to elude closed-form representations. However, it is in fact possible to arrive at closed-form formulas for $f_M(\chi)$. This is done in Appendix A. We arrive there at the following closed-form representation:

$$f_M(\chi) = \frac{\sqrt{\pi\chi}}{2}\exp\left(-\frac{\chi}{2}\right)\left[I_{\frac{M-1}{2}}\left(\frac{\chi}{2}\right) + I_{\frac{M+1}{2}}\left(\frac{\chi}{2}\right)\right] \tag{31}$$

where $I_k(\cdot)$ is the k-th order modified Bessel function of the first kind [54 Chap. 24]. Moreover, (31) can be simplified even more, as shown in Appendix A.
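As a numerical cross-check of the integral (30) against the Bessel-function form (31), the following Python sketch may be useful. It is an illustration written for this text rather than code from the thesis; the function names, the midpoint integration rule and the power-series evaluation of the Bessel function are all choices made for the example.

```python
import math

def rice_phase_pdf(dphi, chi):
    """Eq. (29); exp(-chi)*exp(chi*cos^2) is merged into exp(-chi*sin^2) to avoid overflow."""
    c = math.cos(dphi)
    gauss_cdf = 0.5 * (1.0 + math.erf(math.sqrt(chi) * c))  # Phi(sqrt(2*chi)*cos(dphi))
    return (math.exp(-chi) / (2.0 * math.pi)
            + math.sqrt(chi / math.pi) * c * math.exp(-chi * math.sin(dphi) ** 2) * gauss_cdf)

def f_numeric(M, chi, steps=4000):
    """Midpoint-rule evaluation of the integral in (30) over [-pi, pi]."""
    h = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        dphi = -math.pi + (k + 0.5) * h
        total += math.cos(M * dphi) * rice_phase_pdf(dphi, chi)
    return total * h

def bessel_i(nu, z, terms=60):
    """Modified Bessel function of the first kind, evaluated by its power series."""
    return sum((z / 2.0) ** (2 * k + nu) / (math.factorial(k) * math.gamma(k + nu + 1.0))
               for k in range(terms))

def f_bessel(M, chi):
    """Closed form (31)."""
    z = chi / 2.0
    return (math.sqrt(math.pi * chi) / 2.0 * math.exp(-z)
            * (bessel_i((M - 1) / 2.0, z) + bessel_i((M + 1) / 2.0, z)))

for M in (2, 4, 8):
    for chi in (0.5, 2.0, 10.0):
        print(M, chi, round(f_numeric(M, chi), 6), round(f_bessel(M, chi), 6))
```

The truncated power series is adequate only for the moderate arguments used here; a production implementation would more likely rely on a tested special-function library.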
We arrive at the following formula, which is a finite sum composed of polynomials and exponentials:

$$f_M(\chi) = 1 + \frac{M}{2}\sum_{n=1}^{M/2} \frac{(-1)^n\,(M/2 + n - 1)!}{n!\,(M/2 - n)!\,\chi^n} - (-1)^{M/2}\exp(-\chi)\sum_{n=1}^{M/2} \frac{(M/2 + n - 1)!}{(n-1)!\,(M/2 - n)!\,\chi^n} \tag{32}$$

As examples, the results of (32) are shown for M = 2, 4 and 8 in Table 1.

Table 1. Closed-form expressions for lock detector expected value
M | $f_M(\chi) = E\left[l_{M,N} \mid E_s/N_0 = \chi\right]$
2 | $f_2(\chi) = \left(1 - \frac{1}{\chi}\right) + \frac{1}{\chi}\exp(-\chi)$
4 | $f_4(\chi) = \left(1 - \frac{4}{\chi} + \frac{6}{\chi^2}\right) - \left(\frac{2}{\chi} + \frac{6}{\chi^2}\right)\exp(-\chi)$
8 | $f_8(\chi) = \left(1 - \frac{16}{\chi} + \frac{120}{\chi^2} - \frac{480}{\chi^3} + \frac{840}{\chi^4}\right) - \left(\frac{4}{\chi} + \frac{60}{\chi^2} + \frac{360}{\chi^3} + \frac{840}{\chi^4}\right)\exp(-\chi)$

While (31), (32), and Table 1 are extremely useful in computer simulations and in the generation of lookup tables for estimation of the $E_s/N_0$ from $l_{M,N}$, they are somewhat tedious for use by the designer during the initial design process, e.g. with a handheld calculator in order to get a "rough draft" idea of the detector's behaviour. Thus a closed-form simplification is sought for this purpose. To that end, if we look at high $E_s/N_0$ ratios, we have that $\Delta\phi_n$ is generally very small (i.e. $p_R(\Delta\phi \mid \chi)$ is non-negligible only for small $\Delta\phi$), and thus we can write (using (29) and [54 eq. 15.72]):

$$p_R(\Delta\phi \mid \chi) \approx \frac{1}{2\pi}\exp(-\chi) \times \sqrt{4\pi\chi}\,\exp\left(\chi\cos^2(\Delta\phi)\right) \cdot \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-y^2/2}\,dy = \sqrt{\frac{\chi}{\pi}}\exp\left(-\chi\sin^2(\Delta\phi)\right) \approx \sqrt{\frac{\chi}{\pi}}\exp\left(-\chi(\Delta\phi)^2\right) \tag{33}$$

which means that for high $E_s/N_0$ ratios:

$$\Delta\phi_n \sim N\left(0, \frac{1}{2\chi}\right) \tag{34}$$

(see also [56] for a similar derivation), from which it follows that (using $\int_{-\infty}^{\infty} e^{-ax^2}\cos(bx)\,dx = \sqrt{\pi/a}\;e^{-b^2/(4a)}$ [54 eq. 15.73]):

$$f_M(\chi) = \int_{-\pi}^{\pi} \cos(M\Delta\phi) \cdot p_R(\Delta\phi \mid \chi) \cdot d\Delta\phi \approx \sqrt{\frac{\chi}{\pi}}\int_{-\infty}^{\infty} \cos(M\Delta\phi) \cdot \exp\left(-\chi(\Delta\phi)^2\right) d\Delta\phi = \exp\left(-\frac{M^2}{4\chi}\right) \tag{35}$$

Fig. 21 shows the value of $l_{M,8192}$ (obtained through simulation) and Eqs. (30) and (35) as a function of the $E_s/N_0$. As seen in that figure, (35) is a good approximation even for low $E_s/N_0$ ratios, which makes it quite a useful tool for the designer. Note also that the linear range of the curves corresponds precisely to the most "interesting" range of $E_s/N_0$ ratios, i.e. from the minimal ratio where lock can be maintained [58 Sec.
III] to a moderately high $E_s/N_0$ for the modulation in question.

[Fig. 21 plot: lock metric value versus $E_s/N_0$ (dB). Legend: Exact (Rician phase pdf); Gaussian Approximation; Simulation, N=8192.]
Fig. 21. Lock metric when locked vs $E_s/N_0$.

2.3.3 Hardware Realization

$l_{M,N}$ lends itself to efficient hardware implementation. This can be easily seen by looking at (14), (17), and (18): the terms $x_{M,n}$ can be generated by a single, fixed-point lookup table which has I(n) and Q(n) as its address, and a single digital Integrate-and-Dump module can perform the summation (the division by 2N is avoided if 2N is chosen to be a power of 2, in which case the division can be implemented by discarding the lower $\log_2(2N)$ bits of the output of the summation). Thus no mathematical operations except summation are required. This is depicted in Fig. 22.

[Fig. 22 block diagram: I(n) and Q(n) address a lookup table that outputs $x_{M,n} = \left[\sum_{k=0}^{M/2} \binom{M}{2k}(-1)^k I^{M-2k}(n)\,Q^{2k}(n)\right] / \left(I^2(n) + Q^2(n)\right)^{M/2}$; an Integrate and Dump Averager sums 2N samples and disregards the lower $\log_2(2N)$ bits to produce $l_{M,N}$.]
Fig. 22. Efficient hardware generation of $l_{M,N}$.

Since, for all M, K and n, $|x_{M,n}| = |\cos(M\varphi_n)| \le 1$ (see footnote 4), the lookup table in Fig. 22 needs only to facilitate representation of fractional numbers, which makes its implementation in hardware quite practical. Additional comments that apply equally to the lookup table in Fig. 22 are given in Sec. 3.4.9 and are omitted here in order to avoid repetition. Those comments outline the practical way to implement the lookup table in fixed-point hardware, and the reader is referred to Sec. 3.4.9 for more information. Now let us treat the other component of Fig. 22, namely the Integrate and Dump module.
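Before turning to that module in detail, the data path of Fig. 22 can be sketched in software. The Python below is a behavioural illustration only, not the hardware description used in the thesis: the function names, the 6-bit input and 8-bit output word lengths, and the convention of storing 0 at the address (0, 0) are assumptions chosen for the example.

```python
import math

def build_lut(M, in_bits, out_bits):
    """Map signed integer codes (I, Q) to x_{M,n} = cos(M * atan2(Q, I)),
    rounded to a signed fixed-point fraction of out_bits bits."""
    half = 1 << (in_bits - 1)
    scale = (1 << (out_bits - 1)) - 1
    lut = {}
    for i in range(-half, half):
        for q in range(-half, half):
            # The phase is undefined at the origin; 0 is stored there (assumption).
            x = 0.0 if i == 0 and q == 0 else math.cos(M * math.atan2(q, i))
            lut[(i, q)] = int(round(scale * x))
    return lut

def integrate_and_dump(values, log2_len):
    """Accumulate 2**log2_len inputs, output the sum with its lower log2_len bits
    discarded, clear the accumulator, and repeat. A trailing partial block is dropped."""
    acc = 0
    count = 0
    block = 1 << log2_len
    for v in values:
        acc += v
        count += 1
        if count == block:
            yield acc >> log2_len  # arithmetic shift, as in two's-complement hardware
            acc = 0
            count = 0

lut = build_lut(M=4, in_bits=6, out_bits=8)
codes = [(20, 1), (-1, 19), (-21, 0), (2, -20)] * 4  # 16 hypothetical in-lock QPSK samples
metric = list(integrate_and_dump((lut[c] for c in codes), log2_len=4))
print(metric)
```

The only arithmetic performed at run time is the accumulation; the division by the block length is a bit shift, which mirrors the "discard the lower bits" step described above.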
4 Ignoring the infinitesimally probable case of $|x_{M,n}| = 1$, which can always be approximated to any desired tolerance by suitably close fractions.

The Integrate-and-Dump (IAD) averager is one of the simplest modules available for hardware signal processing, and among the most widely used and well known. It must be emphasized that the Integrate-and-Dump averager is not the same as a "moving average". This distinction is crucial, and is now explained. First, let us look at a moving average. It is shown below:

[Fig. 23 block diagram: x(n) (B bits) enters a chain of delay registers; the samples are summed (B+log2(2N) bits) and the lower log2(2N) bits are discarded to give y(k) (B bits).]
Fig. 23. Moving average: There are 2N-1 taps (registers). The current sample and 2N-1 delayed samples are summed at each clock; then (in a fixed-point hardware implementation) the $\log_2(2N)$ lower bits are discarded in order to produce the average y(k).

Now, let us look at an Integrate-and-Dump module:

Fig. 24. Integrate and Dump module. There is one feedback register (Reg1) (the accumulator) which holds the accumulated value. Every 2N samples the value of Reg1 is passed on to the output y(k) after the $\log_2(2N)$ lower bits are discarded. Then the value of Reg1 is cleared to make way for the next 2N samples.

Note that the moving average produces an output every T seconds, where 1/T is the sample rate of x(n). In other words, for the moving average the output clock rate of y(k) is the same as that of x(n). In contrast, the Integrate-and-Dump module produces an output sample every $2 \cdot N \cdot T$ seconds, i.e. the sampling rate of y(k) is lower than that of x(n). This is the "price" we pay for the reduced complexity of the Integrate-and-Dump module (as opposed to the moving average).

The use of an Integrate-and-Dump averager was deliberate. This is because a moving average would require inordinate amounts of logic resources. For example, consider the resources needed for the storage elements in the moving average for 8-bit input data (i.e. x(n) is 8 bits wide) for N=1024.
Using the moving average, we would need $8 \cdot (2N - 1) = 16376$ flip-flops and $2N - 1 = 2047$ adders, a truly remarkable amount of logic. However, using the Integrate-and-Dump module, we would need only two registers, Reg1 and Reg2, and only a single adder. Reg1 would have $8 + \log_2(2N) = 8 + 11 = 19$ flip-flops, and Reg2 would have 8 flip-flops. There would also be a need for some "control logic" (see Fig. 24). This logic is essentially little more than a self-resetting counter which controls the "clock" signal of Reg2 and the "clear" signal of Reg1. This control logic is, thus, trivial to implement. In summary, then, the Integrate-and-Dump averager implementation is very compact and offers very significant logic savings as compared to the moving-average implementation.

2.3.4 Implementational Advantages vs. Other Lock Detectors

To see the advantage of the proposed detector over previously available detectors, we note that the M-PSK lock detectors which are prevalently used today are the following:

• The NDA (Non-Data Aided) Mth power detector ([50], [51], [22 Sec. 6.5.2]):

$$L_{NDA,M,N} \triangleq \frac{1}{2N}\sum_{n=-N+1}^{N} \mathrm{Re}\left[(I(n) + j \cdot Q(n))^M\right] \tag{36}$$

• The DD (Decision Directed) detector [22 Sec. 6.5.2]:

$$L_{DD,M,N} \triangleq \frac{1}{2N}\sum_{n=-N+1}^{N} \mathrm{Re}\left(\left[\hat{I}(n) - j \cdot \hat{Q}(n)\right]\left[I(n) + j \cdot Q(n)\right]\right) = \frac{1}{2N}\sum_{n=-N+1}^{N} \left(\hat{I}(n)I(n) + \hat{Q}(n)Q(n)\right) \tag{37}$$

where $\hat{I}(n) + j \cdot \hat{Q}(n)$ is the symbol decision.

The fact that $l_{M,N}$ can be computed using a small fixed-point lookup table is in sharp contrast to the lock detector schemes of (36) and (37), for reasons which we shall now outline. Dealing first with (36), we see that the value of the lock detector $L_{NDA,M,N}$ includes a dominant term that is proportional to $K^M$, and accordingly the lock threshold must be so dependent; additionally, any attempt to compute those lock detectors must accommodate the dynamic range of $K^M$, which quite often precludes their implementation through the use of fixed-point lookup tables in hardware. An example of the necessary implementation is shown in Fig.
25 for $L_{NDA,2,N}$. Since K is a function of the AGC, the independence of $l_{M,N}$ from K also provides significant insulation from false locking and loss of lock due to non-ideal AGC behaviour, particularly when dealing with rapidly fading signals. Specifically, instead of the AGC having to confine K to within the range $K_{nominal} - tolerance < K < K_{nominal} + tolerance$, the AGC now has only to abide by the relatively loose requirements that K is such that (a) no signal-chain or sampler saturation occurs, and (b) the samplers are not underdriven to the extent that quantization noise becomes significant.

[Fig. 25 block diagram: I(n) and Q(n) are combined and averaged over 2N symbols to produce the lock metric.]
Fig. 25. 2nd-order nonlinearity lock metric generation for BPSK.

Dealing now with $L_{DD,M,N}$ (given in (37)), we see that the latter is proportional to K. This may, depending on the receiver's implementation, present dynamic range problems which are similar to (though not as acute as) those of $L_{NDA,M,N}$. Moreover, we observe that the value and threshold of $L_{DD,M,N}$ are dependent upon the AGC's nominal level and performance. In contrast, again, no such dependence exists for $l_{M,N}$.

2.3.5 Principle of SNR Estimation from the Lock Detector Value

We can derive an approximation of the $E_s/N_0$ ratio from $l_{M,N}$. Using units of dB we have from (35):

$$\widehat{\left(\frac{E_s}{N_0}\right)}_{dB} = 10 \cdot \log_{10}(\hat{\chi}) \approx 10 \cdot \log_{10}\left(\frac{-M^2}{4 \cdot \ln(l_{M,N})}\right) \tag{38}$$

If we indeed use units of dB, we have a greatly reduced dynamic range. Combined with the small dynamic range required to describe $l_{M,N}$, this allows (38) to be implemented as a relatively small fixed-point lookup table, hence facilitating rapid, reliable estimation of $E_s/N_0$ within an FPGA or an ASIC. Note that if increased accuracy is required, the right side of (38) can be replaced with the (numerically obtained) values derived from the lock metric expected value of (31)-(32). This can likewise be incorporated into a lookup table, so no logic complexity penalty is incurred.
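The two estimation routes just described, inverting the Gaussian approximation via (38) and inverting the exact expectation (32), can be compared with a small Python sketch. This is an illustration under stated assumptions: the function names and the bisection search range are hypothetical, and the bisection relies on $f_M(\chi)$ increasing monotonically with $E_s/N_0$, as the curves of Fig. 21 suggest.

```python
import math

def f_closed_form(M, chi):
    """Eq. (32) for even M. Intended for moderate M (M <= 8) and chi >= 0.1,
    because the terms cancel heavily at very low chi."""
    h = M // 2
    poly = 1.0
    expo = 0.0
    for n in range(1, h + 1):
        c = math.factorial(h + n - 1) / math.factorial(h - n)
        poly += h * (-1) ** n * c / (math.factorial(n) * chi ** n)
        expo += c / (math.factorial(n - 1) * chi ** n)
    return poly - (-1) ** h * math.exp(-chi) * expo

def snr_db_gaussian(metric, M):
    """Eq. (38): invert the Gaussian approximation (35). Requires 0 < metric < 1."""
    return 10.0 * math.log10(-M * M / (4.0 * math.log(metric)))

def snr_db_exact(metric, M, lo_db=-10.0, hi_db=40.0):
    """Invert (32) by bisection on Es/N0 in dB (assumes f_M is monotonic)."""
    for _ in range(60):
        mid = 0.5 * (lo_db + hi_db)
        if f_closed_form(M, 10.0 ** (mid / 10.0)) < metric:
            lo_db = mid
        else:
            hi_db = mid
    return 0.5 * (lo_db + hi_db)

metric_10db = f_closed_form(4, 10.0)  # expected QPSK lock metric at Es/N0 = 10 dB
print(snr_db_gaussian(metric_10db, 4), snr_db_exact(metric_10db, 4))
```

In a hardware receiver either mapping would be precomputed into a lookup table indexed by the lock metric, as the text notes; the bisection here only stands in for that precomputation.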
Thus, estimation of the $E_s/N_0$ can be achieved using the method described here, using an almost trivial hardware structure, with one sample per symbol (which corresponds to the symbol strobe), and without the need for any symbol decisions to be made. This appears to be quite an improvement over previously available methods, such as those analyzed in [59], [60], [61], [62], [63], [64], [65], [66], [67], [68], [69], [70], [71], [72], [73], [16] and [74]. The proposed SNR estimation method is investigated in depth in Chapter 4.

2.3.6 Intuitive Understanding of the Detectors' Behaviour

The behaviour of $l_{M,N}$ can be intuitively understood by looking at a contour graph of $x_{M,n}$. This is shown for the case of QPSK, i.e. $x_{4,n} = \frac{I^4(n) - 6I^2(n)Q^2(n) + Q^4(n)}{\left(I^2(n) + Q^2(n)\right)^2}$, in Fig. 26 and Fig. 27, where sample demodulated signals are superimposed on contour maps of $x_{4,n}$. Conforming to the title of this subsection, we shall now proceed with an intentionally non-mathematical discussion, with the hope of achieving an intuitive appreciation of the lock detector computation process.

As exemplified in Fig. 26, for a given demodulated symbol $(I(n), Q(n))$ the corresponding value of $x_{M,n}$ is an indication of how close the phase of that symbol is to a valid M-PSK constellation point's phase. If the carrier is locked (as is the case depicted in Fig. 26), the symbols $(I(n), Q(n))$ will be concentrated in clouds around the constellation signal points, which, for QPSK, are on the I and Q axes (on the "1" contour). The location of the center of each cloud is a function of the signal level (i.e. is dependent upon K); however, this does not impact the value of $x_{4,n}$ due to the radial symmetry of the contours. It is also intuitively clear that the size of the demodulated signal point clouds will be inversely related to the $E_s/N_0$.
The ideal constellation points are on the "1" contour, and thus when the values are averaged over many symbols the value of the lock metric will be 1 for infinite $E_s/N_0$, and will decrease as the $E_s/N_0$ decreases, as more symbols depart more significantly from the constellation point's phase and a bigger proportion of contours of lower values are encountered.

If the carrier is unlocked, the phase of the demodulated constellation will rotate due to the incommensurate local and received carriers. Consequently, the demodulated symbols at the input of the lookup table that generates $x_{M,n}$ will be concentrated on a circular ring around the origin, with the ring's radius (i.e. mean distance from the origin) proportional to K, and the ring's width inversely related to the $E_s/N_0$ ratio. This scenario is depicted in Fig. 27. The values accrued over many symbols will thus (from symmetry considerations) average to 0 in the unlocked state; this reasoning provides graphical validation of (26). It is again emphasized that, due to the radial symmetry that is evident from the contour graphs of Fig. 26 and Fig. 27, there is no dependence of $l_{M,N}$ (in either the locked or unlocked case) on K, but only on the angle of departure of the demodulated constellation from the ideal M-PSK constellation and on the $E_s/N_0$.

To visually illustrate the difference between $l_{M,N}$ and previously available lock detectors, refer to Fig. 28 and Fig. 29. In those figures demodulated QPSK signals, with K = 0.8 and K = 0.4 respectively, are superimposed upon a contour map of $\mathrm{Re}\left[(I(n) + j \cdot Q(n))^4\right]$, which is the NDA 4th power nonlinearity lock detector term (see [50], (36)). Whereas the radial symmetry evident in Fig. 26 ensured that variations in K did not have an impact on the value of $l_{M,N}$, this is clearly not the case in Fig. 28 and Fig. 29.

Fig. 26. Received QPSK signal, with $E_s/N_0 = 20$ dB and with receiver in lock, K = 0.5, superimposed on contour map of $x_{4,n}$.
Fig. 27.
Received QPSK signal, with $E_s/N_0 = 20$ dB, unlocked receiver, K = 0.5, superimposed on contour map of $x_{4,n}$.
Fig. 28. Received QPSK signal, with $E_s/N_0 = 20$ dB, K = 0.8, receiver in lock, superimposed on contour map of $\mathrm{Re}\left[(I(n) + j \cdot Q(n))^4\right]$.
Fig. 29. Received QPSK signal, with $E_s/N_0 = 20$ dB, K = 0.4, receiver in lock, superimposed on contour map of $\mathrm{Re}\left[(I(n) + j \cdot Q(n))^4\right]$.

2.4 Performance Analysis for Imperfect Locking

If the local carrier exhibits phase jitter (i.e. imperfect carrier PLL lock), or the sampling of the outputs of the matched filters is not at the ideal instant (i.e. imperfect symbol PLL lock), this also degrades the value of $l_{M,N}$, though this could also be modeled as an effective decrease of the $E_s/N_0$ ratio [19 Sec. 4.3.2]. The case of imperfect locking due to carrier phase jitter can also be easily modeled by modifying the definition of $\theta_o$ during lock to account for the residual phase error $\theta_e$ (where $|\theta_e| < \pi/M$). The modification is as follows:

$$\theta_o \in \{\theta_i - \theta_e + 2\pi k/M \mid k = 0, 1, \ldots, M-1\} \tag{39}$$

With this definition we have when locked (using (21)):

$$\varphi_n = \tan^{-1}\left(\frac{\sin(\theta_e + 2\pi(m_n - k)/M) + n_Q(nT)/(2E_s)}{\cos(\theta_e + 2\pi(m_n - k)/M) + n_I(nT)/(2E_s)}\right) \tag{40}$$

Furthermore, we shall arbitrarily set k = 0. This is tantamount to ignoring the carrier synchronization ambiguity, or equivalently rotating the demodulated constellation by $2\pi k_0/M$, where $k_0$ is an arbitrary integer. This is permissible due to the fact that the value of $x_{M,n}$ for a given demodulated symbol is invariant under such a rotation (see Section 2.3.6 and Fig. 26 for a graphical illustration of this property).
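The rotation invariance invoked here is easy to confirm numerically. The short Python check below is illustrative only; the helper names and the sample point are arbitrary choices made for this example.

```python
import cmath
import math

def x_metric(i, q, M):
    """Per-symbol lock detector term x_{M,n} = cos(M * phase of (I + jQ))."""
    return math.cos(M * math.atan2(q, i))

def rotate(i, q, k, M):
    """Rotate the point (I, Q) by 2*pi*k/M radians."""
    z = complex(i, q) * cmath.exp(2j * math.pi * k / M)
    return z.real, z.imag

M = 8
point = (0.83, -0.41)  # arbitrary demodulated symbol, chosen for illustration
values = [x_metric(*rotate(*point, k, M), M) for k in range(M)]
print(values)
```

All M rotated copies of the point yield the same value of the detector term to within rounding error, which is why the synchronization ambiguity index k can be set to zero without loss of generality.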
Finally, as a generalization of (27), we note that for any M-PSK constellation point $\phi_x = 2\pi \cdot m_x/M$, where $m_x$ is an integer, we have:

$$\cos(M(\varphi_n - \phi_x)) = \cos(M\varphi_n)\cos(2\pi \cdot m_x) + \sin(M\varphi_n)\sin(2\pi \cdot m_x) = \cos(M\varphi_n) \;(= x_{M,n}) \tag{41}$$

that is, the lock detector metric term for a given symbol is not dependent upon that symbol but rather only on the phase difference between the received symbol's phase $\varphi_n$ and the phase of any valid M-PSK constellation point $\phi_x$. Thus we can assume for simplicity $\phi_n = 0$ (which implies $m_n = \psi_n = 0$) for all n. Under the above simplifications we can rewrite (40) as:

$$\varphi_n = \tan^{-1}\left(\frac{\sin(\theta_e) + n_Q(nT)/(2E_s)}{\cos(\theta_e) + n_I(nT)/(2E_s)}\right) \tag{42}$$

A quick glance shows that (42) is of the same form as (24) with $\psi_n$ replaced by $\theta_e$. Therefore if we re-define:

$$\Delta\phi_n \triangleq \varphi_n - \theta_e \tag{43}$$

then $\Delta\phi_n$ is still distributed according to (29). Thus, if we define $\tilde{f}_M(\chi)$ to be the expected lock detector value at an SNR of $E_s/N_0 = \chi$ including carrier jitter effects, then:

$$\tilde{f}_M(\chi) \triangleq E\left[l_{M,N} \mid E_s/N_0 = \chi\right] = E\left[l_{M,\infty} \mid E_s/N_0 = \chi\right] = \langle \cos(M\varphi_n) \rangle\big|_{E_s/N_0 = \chi} = \Big[E[\cos(M\Delta\phi_n)] \cdot E[\cos(M\theta_e)] - \underbrace{E[\sin(M\Delta\phi_n)]}_{=0} \cdot E[\sin(M\theta_e)]\Big]_{E_s/N_0 = \chi} = f_M(\chi)\,E\left[\cos(M\theta_e) \mid E_s/N_0 = \chi\right] \tag{44}$$

where, through comparison with (30), we see that the presence of carrier phase jitter results in the degradation of the lock metric expected value by the multiplicative, less-than-unity factor of:

$$\upsilon(\chi) \triangleq E\left[\cos(M\theta_e) \mid E_s/N_0 = \chi\right] \tag{45}$$

The factor in (45) can be further simplified by using the Taylor expansion method [54 Chap. 20]. Using the Taylor expansion $\cos x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!}$ [54 eq. 20.22] and adopting a first order approximation which retains only the first two terms of the Taylor series, we get:

$$\upsilon(\chi) = E\left[\cos(M\theta_e) \mid E_s/N_0 = \chi\right] = E\left[\sum_{n=0}^{\infty} (-1)^n \frac{(M\theta_e)^{2n}}{(2n)!} \,\Big|\, E_s/N_0 = \chi\right] \approx 1 - \frac{M^2 E\left[\theta_e^2 \mid E_s/N_0 = \chi\right]}{2} \tag{46}$$

and we can write:

$$\tilde{f}_M(\chi) = f_M(\chi)\,\upsilon(\chi) \approx f_M(\chi)\left(1 - \frac{M^2 E\left[\theta_e^2 \mid E_s/N_0 = \chi\right]}{2}\right) \tag{47}$$

and the phase error variance $E\left[\theta_e^2 \mid E_s/N_0 = \chi\right]$ can be evaluated using a plethora of known methods, such as those presented in [26], [49], [19], [51], [9], and [58].
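The quality of the first-order approximation in (46) can be gauged with a quick numerical experiment. The Python sketch below is illustrative: it draws the residual phase error from a zero-mean Gaussian distribution with 0.02 rad standard deviation, which is an assumption made purely for this example (the thesis measures $\theta_e$ in its closed-loop simulation instead), and the function name is hypothetical.

```python
import math
import random

def jitter_factors(theta_e, M):
    """Return (time average of cos(M*theta_e), first-order value 1 - M^2 <theta_e^2> / 2)."""
    n = len(theta_e)
    exact = sum(math.cos(M * t) for t in theta_e) / n
    approx = 1.0 - 0.5 * M * M * sum(t * t for t in theta_e) / n
    return exact, approx

rng = random.Random(7)
theta = [rng.gauss(0.0, 0.02) for _ in range(10000)]  # hypothetical residual phase error (rad)
exact, approx = jitter_factors(theta, 4)
print(exact, approx)
```

Since $\cos x \ge 1 - x^2/2$ for every real x, the first-order value can never exceed the sample average of the cosine, and for small jitter the two are nearly indistinguishable.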
To validate the results of (44)-(47), computer simulations of an equivalent (w.r.t. Fig. 13) baseband system were conducted, with the synchronization loop configured to behave as a second-order PLL. This model is shown in Fig. 30. In those simulations, the values of the lock detector were recorded, as were the values of the residual phase error $\theta_e$. The recorded lock detector values were then compared to those computed using estimates of (44) and (47). The simulation results are shown in Fig. 31, Fig. 32, and Fig. 33, for $2B_L \cdot T$ = 0.01, 0.1, and 0.25, respectively. $B_L$ is the noise bandwidth of the carrier synchronization PLL (see [25 Sec. 3.1 and App. A] for a discussion of this parameter). For the second-order loop under discussion, $B_L = 0.5\,\omega_n\left(\zeta + 1/(4\zeta)\right)$, where $\omega_n$ is the natural radian frequency of the PLL and $\zeta$ is its damping factor [25 Sec. 3.1 and App. A].

As can be seen in Fig. 31-Fig. 33, there is excellent agreement between (44) and the measured results, and this is also true for (47), albeit to a lesser degree of congruence for the higher $2B_L \cdot T$ factors. It should be noted, however, that the values of $2B_L \cdot T = 0.1$ and $2B_L \cdot T = 0.25$ are extremely high compared to those usually found in carrier synchronization PLLs; typically $2B_L \cdot T$ is in the order of magnitude of 0.01 at most ([25], [58], [22 Chap. 5, 6], [21 Chap. 5]), for which (as Fig. 31 illustrates) the values obtained through (30) and (44) (and even (47)) are virtually identical for all practical $E_s/N_0$ ratios. The rather large $2B_L \cdot T$ values of 0.1 and 0.25 in Fig. 32 and Fig. 33, respectively, were used with the intent of making the plots distinguishable, so that the reader could gain a qualitative appreciation of the influence of carrier phase jitter on the lock metric value. Consequently, it can be said that for most practical carrier synchronization PLLs the effects of carrier phase jitter on the lock detector value can be neglected.
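As a quick worked example of the loop-bandwidth relation quoted above, suppose the damping factor is $\zeta = 1/\sqrt{2}$ (an assumption made here for illustration; this passage of the thesis does not state the value of $\zeta$ that was used). Rearranging $B_L = 0.5\,\omega_n\left(\zeta + 1/(4\zeta)\right)$ for the normalized natural frequency at $2B_L \cdot T = 0.01$ gives:

$$\omega_n T = \frac{2B_L \cdot T}{\zeta + \frac{1}{4\zeta}} = \frac{0.01}{0.7071 + 0.3536} \approx 0.0094 \text{ rad}$$

Under that assumed damping factor the loop's natural frequency is a very small fraction of the symbol rate, which is consistent with the observation that the jitter-induced degradation is negligible at this loop bandwidth.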
[Block diagram: noise inputs $n_I(nT)$ and $n_Q(nT)$, phase detector, loop filter, VCO/NCO, and the averaging stage that forms the lock metric from $I(n)$ and $Q(n)$.]
Fig. 30. Equivalent baseband model of the synchronization loop that was used for closed loop simulations.

[Plot: lock detector value vs. $E_s/N_0$ (dB).]
Fig. 31. Theoretical and simulated lock detector values, using the equivalent baseband model of Fig. 30, for $2B_L T = 0.01$. The solid line is the theoretical, jitter-free case (Eq. (30)). Blank polygons connected by dashed lines are the averages of measured lock detector values obtained in the simulation. Gray-filled polygons are approximations for Eq. (44), i.e. the results of multiplying Eq. (30) by an estimate (which we will denote $\hat{\upsilon}(\chi)$) of $\upsilon(\chi) = E[\cos(M\theta_e) \,|\, E_s/N_0 = \chi]$, obtained by the time average $\hat{\upsilon}(\chi) = \overline{\cos(M\theta_e)}$, where $\theta_e$ was measured in the simulation. Similarly, black-filled polygons connected by a dashed line are approximations to Eq. (47), where $\hat{\upsilon}(\chi) = 1 - \tfrac{1}{2}M^2\,\overline{\theta_e^2}$.

[Plot: lock detector value vs. $E_s/N_0$ (dB).]
Fig. 32. Theoretical and simulated lock detector values, using the equivalent baseband model of Fig. 30, for $2B_L T = 0.1$. The solid line is the theoretical, jitter-free case (Eq. (30)). Blank polygons connected by dashed lines are the averages of measured lock detector values obtained in the simulation. Gray-filled polygons are approximations for Eq. (44), i.e. the results of multiplying Eq. (30) by an estimate (which we will denote $\hat{\upsilon}(\chi)$) of $\upsilon(\chi) = E[\cos(M\theta_e) \,|\, E_s/N_0 = \chi]$, obtained by the time average $\hat{\upsilon}(\chi) = \overline{\cos(M\theta_e)}$, where $\theta_e$ was measured in the simulation. Similarly, black-filled polygons connected by a dashed line are approximations to Eq. (47), where $\hat{\upsilon}(\chi) = 1 - \tfrac{1}{2}M^2\,\overline{\theta_e^2}$.

[Plot: lock detector value vs. $E_s/N_0$ (dB).]
Fig. 33. Theoretical and simulated lock detector values, using the equivalent baseband model of Fig. 30, for $2B_L T = 0.25$. The solid line is the theoretical, jitter-free case (Eq. (30)).
Blank polygons connected by dashed lines are the averages of measured lock detector values obtained in the simulation. Gray-filled polygons are approximations for Eq. (44), i.e. the results of multiplying Eq. (30) by an estimate (which we will denote $\hat{\upsilon}(\chi)$) of $\upsilon(\chi) = E[\cos(M\theta_e) \,|\, E_s/N_0 = \chi]$, obtained by the time average $\hat{\upsilon}(\chi) = \overline{\cos(M\theta_e)}$, where $\theta_e$ was measured in the simulation. Similarly, black-filled polygons connected by a dashed line are approximations to Eq. (47), where $\hat{\upsilon}(\chi) = 1 - \tfrac{1}{2}M^2\,\overline{\theta_e^2}$.

2.5 Lock Detector's Variance and Distribution

In order to allow for detection probabilities to be evaluated, the variance and distribution of the lock metrics must be ascertained. Recalling (17) and (18), since it has been established in Sec. 2.3.3 that $\forall M, \forall n: |x_{M,n}| \le 1$, we consequently arrive at the bound:

$$\mathrm{var}(x_{M,n}) = E\big[x_{M,n}^2\big] - E^2\big[x_{M,n}\big] \le \big(\sup|x_{M,n}|\big)^2 = 1 \qquad (48)$$

If I and Q sampling is done at the ideal instants (i.e. the symbol synchronization loop is ideally locked) then the symbol components of those samples are mutually independent, which is also true regarding the noise components (due to the matched filters). Thus the variables $x_{M,n}$ may be viewed as mutually independent, and hence:

$$\mathrm{var}(l_{M,N}) = \left(\frac{1}{2N}\right)^2 \sum_{n} \mathrm{var}(x_{M,n}) \le \left(\frac{1}{2N}\right)^2 \cdot 2N = \frac{1}{2N} \qquad (49)$$

Note that since no limiting assumptions were made, (49) is valid for any input signal at any $E_s/N_0$, including a noise-only signal, and for any carrier synchronization loop. In particular, (49) is valid for any carrier phase jitter conditions; the value of this observation will be apparent in Section 2.6. Due to the central limit theorem [13 Sec. 2.1.6] the distribution of $l_{M,N}$ can be considered Gaussian when in lock, since it is a sum of independent, identically distributed variables $x_{M,n}$, and this is also true for the unlocked case provided there is no significant frequency error between the local and received carriers.
Thus to a good (if conservative in terms of variance) approximation:

$$l_{M,N} \,\big|\, \text{locked} \sim \mathcal{N}\!\left(\mu_L(\chi), \frac{1}{2N}\right), \qquad l_{M,N} \,\big|\, \text{unlocked or noise-only input} \sim \mathcal{N}\!\left(0, \frac{1}{2N}\right) \qquad (50)$$

where for jitter-free locking $\mu_L(\chi) = f_M(\chi)$ (given in (30)-(32)) and, if modeling of the effects of the carrier synchronization PLL's phase jitter is deemed necessary, $\mu_L(\chi) = \tilde{f}_M(\chi)$ (given by (44)), as was discussed in Section 2.4.

2.6 Lock Probabilities and Circuit Parameter Determination

Eqs. (50) and (30)-(32), (44) can be used to set the threshold for achieving lock, from which it is evident that (with $M$ assumed an unalterable system-level constant) the only quantity that needs to be decided upon by the designer in order to determine the lock detector circuit's parameters is the minimum $E_s/N_0$ for which reliable lock is desired. To illustrate this, for a given threshold $\Gamma > 0$, at a given input $E_s/N_0$ ratio that we will denote $\chi$, the lock detection probability $P_D$ and false alarm probability $P_{FA}$ are, from (50) (assuming⁵ $\Gamma < \mu_L(\chi)$, which is reasonable since otherwise we wouldn't be interested in detecting lock at an $E_s/N_0$ of $\chi$):

$$P_D = P\big(l_{M,N} > \Gamma \,\big|\, E_s/N_0 = \chi\big) \ge \int_{\Gamma}^{\infty} \frac{1}{\sqrt{\pi/N}}\, e^{-(\tau - \mu_L(\chi))^2 N}\, d\tau = \frac{1}{2}\,\mathrm{erfc}\Big(\sqrt{N}\,\big(\Gamma - \mu_L(\chi)\big)\Big) \qquad (51)$$

$$P_{FA} = P\big(l_{M,N} > \Gamma \,\big|\, \text{noise-only input}\big) \le \int_{\Gamma}^{\infty} \frac{1}{\sqrt{\pi/N}}\, e^{-\tau^2 N}\, d\tau = \frac{1}{2}\,\mathrm{erfc}\big(\sqrt{N}\,\Gamma\big) \qquad (52)$$

⁵ In fact we are implicitly making the assumption here that we are disinterested in detecting lock for any $E_s/N_0 = \lambda$ for which $\Gamma > \mu_L(\lambda)$; this is because for all $E_s/N_0 = \lambda$ for which $\Gamma > \mu_L(\lambda)$ the Gaussian model (50) gives $P_D < 1/2$.

Solving (51) and (52) through a series of straightforward manipulations yields that suitable⁶ values of $N$ and $\Gamma$ are:

$$N = \left(\frac{\mathrm{erfc}^{-1}(2P_{FA}) - \mathrm{erfc}^{-1}(2P_D)}{\mu_L(\chi)}\right)^2 \qquad (53)$$

and:

$$\Gamma = \frac{\mathrm{erfc}^{-1}(2P_{FA}) \cdot \mu_L(\chi)}{\mathrm{erfc}^{-1}(2P_{FA}) - \mathrm{erfc}^{-1}(2P_D)} \qquad (54)$$

Unsurprisingly, (51)-(54) are completely independent of $K$. As has been noted in Section 2.5, since $1/(2N)$ is an absolute upper bound on the variance of $l_{M,N}$ which is valid for any phase jitter conditions, eqs.
(51)-(54) are equally applicable if significant phase jitter is present, so long as $\mu_L(\chi) = \tilde{f}_M(\chi)$ (given in (44) or approximated by (47)). Curves of (54) are shown in Fig. 34 for various values of $M$. Note that the value of $\chi$ chosen for each $M$ corresponds to a reasonable minimal SNR for which reliable lock can be expected for that modulation [58 Sec. III].

⁶ Note that (51) and (52) are inequalities due to the fact that the variance of $l_{M,N}$ is bounded by the limit $1/(2N)$ (see (49)), and not necessarily equal to it. Thus computation of (53) will produce somewhat conservative (i.e. larger than needed) values of $N$.

Graphs of (53) are given in Fig. 35 for QPSK and $\chi = 1$ dB, for jitter-free conditions and for a loop SNR of $\rho = 16$ dB, where $\rho$ is defined as ([25 Sec. 3.1], [51]) $\rho = 1/E[\theta_e^2]$, and (47) is used to compute the graphs pertaining to $\rho = 16$ dB. The use of $\chi = 1$ dB and $\rho = 16$ dB allows direct comparison of Fig. 35 to [51 Fig. 6], which, if undertaken, shows that the number of symbols (which is $2 \cdot N$) needed for $l_{4,N}$ is somewhat larger (but not excessively so) than that required for the 4th-order nonlinearity lock detector $L_{NDA4,N}$ that is discussed in [51]. However, it must be remembered that (in contrast to this thesis) the analysis in [51] makes the assumption generally made in previous analyses of lock detectors, namely that the AGC operates perfectly. As already noted, if the AGC in [51] is not ideal (and none ever is) this will likely have adverse effects on $P_D$ and $P_{FA}$, which are unaccounted for in [51 Fig. 6]. In particular, if an abrupt fading of the input signal is experienced, $P_D$ and $P_{FA}$ will be severely affected until the AGC has settled; because the AGC generally has a time constant that is several orders of magnitude larger than the symbol interval [49 Chap. 7], it will thus take many symbol intervals for the circuit in [51] to operate anew at the required $P_D$ and $P_{FA}$.
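The design equations (53) and (54) can be sketched numerically as follows. This is a minimal illustration that treats (51) and (52) as equalities (the conservative case noted in the text); the helper names are hypothetical, and the value $\mu_L = 0.5$ in the example is an arbitrary placeholder rather than a value of $f_M(\chi)$ taken from the thesis. The inverse complementary error function is obtained from the standard normal quantile via $\mathrm{erfc}^{-1}(y) = -\Phi^{-1}(y/2)/\sqrt{2}$.

```python
import math
from statistics import NormalDist


def erfc_inv(y):
    """Inverse of erfc for 0 < y < 2, using erfc(x) = 2 * Phi(-sqrt(2) * x)."""
    return -NormalDist().inv_cdf(y / 2.0) / math.sqrt(2.0)


def design_lock_detector(p_d, p_fa, mu_l):
    """Return (N, Gamma) from (53) and (54) for a target P_D and P_FA."""
    a = erfc_inv(2.0 * p_fa)
    b = erfc_inv(2.0 * p_d)
    n = ((a - b) / mu_l) ** 2       # (53), real-valued; round up in practice
    gamma = a * mu_l / (a - b)      # (54)
    return n, gamma


def achieved_probabilities(n, gamma, mu_l):
    """Evaluate the right-hand sides of (51) and (52) for a given design."""
    p_d = 0.5 * math.erfc(math.sqrt(n) * (gamma - mu_l))
    p_fa = 0.5 * math.erfc(math.sqrt(n) * gamma)
    return p_d, p_fa


if __name__ == "__main__":
    mu_l = 0.5  # placeholder for mu_L(chi); not a value from the thesis
    n, gamma = design_lock_detector(0.99, 1e-4, mu_l)
    print(n, math.ceil(n), gamma)
    print(achieved_probabilities(n, gamma, mu_l))
```

Since $N$ must be an integer in hardware, rounding the result of (53) upward keeps both probability targets satisfied under the bound (49); the number of symbols consumed per lock metric is then $2N$.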
This AGC-settling penalty is entirely absent⁷ for $l_{M,N}$, and this must be kept in mind when comparing [51 Fig. 6] to Fig. 35. Furthermore, it is worthwhile noting that the author of this thesis has re-simulated [51 Fig. 6] using the data provided in [51], and the result is shown in Fig. 36, which shows some discrepancy with [51 Fig. 6]. If the results of Fig. 35 are compared to Fig. 36, we see that $l_{4,N}$ has almost identical (only very slightly worse) performance, in terms of required symbols, to that of $L_{NDA4,N}$.

⁷ So long as no signal-chain or sampler saturation occurs, and the samplers are not underdriven to the extent that quantization noise becomes significant. See Section 2.3.4.

[Plot: required threshold vs. $P_D$, with $P_D$ from 0.5 to 0.999. Curves: $M = 2$ with $P_{FA} = 10^{-1}, 10^{-4}, 10^{-8}$ at $\chi = -3$ dB; $M = 4$ with $P_{FA} = 10^{-1}, 10^{-4}, 10^{-8}$ at $\chi = 2$ dB.]
Fig. 34. Required threshold $\Gamma$, needed to achieve a necessary $P_D$ and $P_{FA}$, for various values of $M$.

[Plot: $P_D$ vs. required $2 \cdot N$.]
Fig. 35. Required number of symbols, $2 \cdot N$, if using $l_{4,N}$, needed to achieve a necessary $P_D$ and $P_{FA}$, for $\chi = 1$ dB, for $M = 4$ (QPSK), for jitter-free conditions and for a loop SNR of $\rho = 16$ dB.

[Plot: $P_D$ (0.5 to 0.999) vs. required $2 \cdot N$.]
Fig. 36. Required number of symbols, $2 \cdot N$, if using $L_{NDA4,N}$, needed to achieve a necessary $P_D$ and $P_{FA}$, for $\chi = 1$ dB, for $M = 4$ (QPSK), for jitter-free conditions and for a loop SNR of $\rho = 16$ dB.

2.7 Operation in Fading Conditions

Up until this section we have assumed, rather conveniently, that the $E_s/N_0$ ratio is constant. In many cases, however, the signal experiences fading ([75], [76], [13 Chap. 14], [77]), which must be taken into account when modeling the lock detector's behaviour. Before we engage in any quantitative discussions, we must first qualify the fading process under consideration.
In the analysis of our lock detector, as we shall shortly see, we must differentiate between two cases according to the channel coherence time $T_{COH}$. The channel coherence time is a measure of how fast the channel's characteristics are changing (see [13 Chap. 14]). In general, we can assume that the channel $E_s/N_0$ is the same during time intervals which are significantly shorter than $T_{COH}$. Note that $T_{COH}$ is the inverse of the channel's Doppler spread, that is $f_D = 1/T_{COH}$ (see [13 Sec. 14.1.1]). This is an important observation since in many papers the performance is given as a function of $f_D$, whereas here we have found it more convenient to work with $T_{COH}$; since $f_D = 1/T_{COH}$, the two measures are interchangeable.

In this thesis we assume that the fading is frequency-flat (i.e. $1/T \ll W_{COH}$, where $W_{COH}$ is the coherence bandwidth of the channel [13 Chap. 14]) and slow (i.e. $T_{COH} \gg T$). The logic behind this choice is that we are dealing with a single-carrier coherent M-PSK system. If frequency-selective fading were present, one would perhaps have chosen to implement a multicarrier communications system ([78], [79]), e.g. OFDM (Orthogonal Frequency Division Multiplexing). If fast fading were present (i.e. $T_{COH} \ll T$), coherent demodulation without a pilot signal or pilot symbols would be very difficult, and hence one would likely have opted for a non-coherent or a differential modulation (see [13 p. 818], [78], [79]). Hence, the fact that the designer has chosen to implement a coherent, suppressed-carrier, pilot-free M-PSK system almost assures, in and of itself, that the fading is frequency-flat and slow.

In this section we ignore the effects of phase jitter. This is permissible because of the analysis in Sec. 2.4, which showed that ignoring jitter-induced effects is a good assumption even at low SNRs.
The conditional probability distribution of the SNR due to fading is characterized by the average SNR $\bar{\chi}$ (defined as $\bar{\chi} = E[E_s/N_0]$), and the associated pdf (probability density function) is denoted by:

$$p_F(\chi \,|\, \bar{\chi}) = p\big(E_s/N_0 = \chi \,\big|\, E[E_s/N_0] = \bar{\chi}\big) \qquad (55)$$

We now present some common fading statistics, as taken from [77 Table 2]:

Table 2. Fading distributions for common channel types

Channel type: fading distribution $p_F(\chi \,|\, \bar{\chi}) = p(E_s/N_0 = \chi \,|\, E[E_s/N_0] = \bar{\chi})$

Rayleigh: $\dfrac{1}{\bar{\chi}} \exp\!\left(-\dfrac{\chi}{\bar{\chi}}\right)$

Rice (Nakagami-n): $\dfrac{(1+n^2)\,e^{-n^2}}{\bar{\chi}} \exp\!\left(-\dfrac{(1+n^2)\,\chi}{\bar{\chi}}\right) I_0\!\left(2n\sqrt{\dfrac{(1+n^2)\,\chi}{\bar{\chi}}}\right)$

Hoyt (Nakagami-q): $\dfrac{1+q^2}{2q\,\bar{\chi}} \exp\!\left(-\dfrac{(1+q^2)^2\,\chi}{4q^2\,\bar{\chi}}\right) I_0\!\left(\dfrac{(1-q^4)\,\chi}{4q^2\,\bar{\chi}}\right)$

Nakagami-m: $\dfrac{m^m \chi^{m-1}}{\bar{\chi}^m\,\Gamma(m)} \exp\!\left(-\dfrac{m\chi}{\bar{\chi}}\right)$

To evaluate the lock detector's behaviour during fading, we must differentiate between two cases⁸:

(a) $2N \cdot T \ll T_{COH}$: The lock detector calculation interval is much shorter than the coherence time. Thus, the $E_s/N_0$ can be considered constant during the lock detection computation process, and the analysis undertaken so far is applicable, regardless of the fading distribution.

(b) $2N \cdot T \gg T_{COH}$: The lock detector calculation interval is much longer than the coherence time. Thus, the effects of fading must be taken into account in computation of the lock metric.

⁸ A third possibility, namely that $2N \cdot T \approx T_{COH}$ (where "$\approx$" means the same order of magnitude), is undesirable and can be avoided, as discussed in Sec. 2.7.1.

We now proceed to analyze each of these two cases.

2.7.1 Lock detector distribution for Case (a): $2N \cdot T \ll T_{COH}$

In case (a), the lock detector value will be based upon the instantaneous $E_s/N_0$, which we shall denote $\chi$. In view of the fact that $2N \cdot T \ll T_{COH}$, this SNR can be considered constant during the lock detector computation process.
Hence, the lock detector expected value will be as given in (30), i.e.:

$$f_M(\chi) = E\big[l_{M,N} \,\big|\, E_s/N_0 = \chi\big] = \int_{-\pi}^{\pi} \cos(M\Delta\varphi)\, p_R(\Delta\varphi \,|\, \chi)\, d(\Delta\varphi) \qquad (56)$$

where from (29):

$$p_R(\Delta\varphi \,|\, \chi) = p\big(\Delta\varphi_n = \Delta\varphi \,\big|\, E_s/N_0 = \chi\big) = \frac{\exp(-\chi)}{2\pi}\left\{1 + \sqrt{2\chi}\,\cos(\Delta\varphi)\,\exp\!\big(\chi\cos^2(\Delta\varphi)\big) \int_{-\infty}^{\sqrt{2\chi}\cos(\Delta\varphi)} e^{-y^2/2}\, dy\right\} \qquad (57)$$

Similarly, it is trivial to note that the analysis given in Sections 2.3 to 2.6 is applicable without modification.

2.7.2 Lock detector distribution for Case (b): $2N \cdot T \gg T_{COH}$

For case (b), since $2N \cdot T \gg T_{COH}$, we can make two observations. First, since the lock detector estimation period is longer than the channel coherence time, in the course of computing the lock metric we can expect to encounter changes in the SNR, and hence these must be taken into account when predicting the lock detector's performance. Secondly, since the estimation time is in fact much larger than the channel coherence time, we can assume that the $E_s/N_0$ values encountered during the lock detector computation time are distributed according to the channel fading statistics around the average SNR $\bar{\chi}$. Hence, it is expected (and, indeed, is verified shortly) that the lock detector value will be dependent upon two things: (i) the fading statistics, and (ii) the average SNR $\bar{\chi}$. Consequently, SNR estimation and threshold determination for this case will also be dependent upon $\bar{\chi}$ and the fading statistics.

Before we continue to quantitatively evaluate the performance of the lock detector for the case of $2N \cdot T \gg T_{COH}$, we comment that the case $2N \cdot T \approx T_{COH}$ (where "$\approx$" means the same order of magnitude) is undesirable, since in that case the SNR distribution during the lock detector computation period cannot be accurately predicted. This is because, when $2N \cdot T \approx T_{COH}$, we are not statistically guaranteed that during the lock detector computation period of $2N \cdot T$ seconds the distribution of the SNR values encountered will be a sufficiently accurate approximation of $p_F(\chi \,|\, \bar{\chi})$, nor are we guaranteed that the SNR remains constant.
Thus, for the case of $2N \cdot T \approx T_{COH}$, the lock detector's value cannot be predicted and, ipso facto, the lock detection algorithm is rendered useless. Fortunately, $2N \cdot T \approx T_{COH}$ can always be avoided by choosing a large enough $N$, which ensures that $2N \cdot T \gg T_{COH}$ (case (b)). Ideally, though, we would prefer to always have $2N \cdot T \ll T_{COH}$ (case (a)), so that we could rapidly generate lock detector values and also maintain the ability to generate estimates of the instantaneous SNR (if we need the average SNR for some purpose, this can always be obtained by averaging the instantaneous SNR estimates). However, if the fading is such that $T_{COH}$ is small, it might not be possible to find an $N$ which satisfies the estimator accuracy requirements and yet is small enough that $2N \cdot T \ll T_{COH}$ is still obeyed. In that case, we must choose $N$ such that $2N \cdot T \gg T_{COH}$ and content ourselves with a longer lock detector calculation period (which would be dependent upon, and produce an estimate of, the average SNR $\bar{\chi}$).

Since $2N \cdot T \gg T_{COH}$, the lock detector expectation is weighted by the SNR distribution given by the fading probability function, i.e.:

$$\bar{f}_M(\bar{\chi}) = E\big[l_{M,N} \,\big|\, E[E_s/N_0] = \bar{\chi}\big] = \int_0^{\infty} f_M(\chi)\, p_F(\chi \,|\, \bar{\chi})\, d\chi \qquad (58)$$

Using (31) we may also write:

$$\bar{f}_M(\bar{\chi}) = \int_0^{\infty} \frac{\sqrt{\pi\chi}}{2}\, \exp\!\left(-\frac{\chi}{2}\right)\left[I_{\frac{M-1}{2}}\!\left(\frac{\chi}{2}\right) + I_{\frac{M+1}{2}}\!\left(\frac{\chi}{2}\right)\right] p_F(\chi \,|\, \bar{\chi})\, d\chi \qquad (59)$$

Regarding the lock detector's variance and distribution, we note, first, that the assumptions that led to the bound $\mathrm{var}(l_{M,N}) \le 1/(2N)$ in Section 2.5 are still valid. Such is also the case regarding the conditions outlined in Section 2.5 which led to the conclusion that $l_{M,N}$ is Gaussian.
Hence, we conclude that (50) is still valid for the case of fading discussed here, provided that we use $\mu_L(\bar{\chi}) = \bar{f}_M(\bar{\chi})$ as given by (58) instead of $\mu_L(\chi)$. Secondly, as an immediate consequence of the previous paragraph, we conclude that the analysis given in Section 2.6 is still valid, provided that we use $\mu_L(\bar{\chi}) = \bar{f}_M(\bar{\chi})$ as given by (58) instead of $\mu_L(\chi)$.

As a consequence of the above analysis, we conclude that in order to apply the equations in Sections 2.3-2.6 all we need to do is to compute $\bar{f}_M(\bar{\chi})$ via (58) and then use this in the appropriate equations. Obviously, there will be as many lock detector distributions as there are fading probability distributions; that is, such analysis gives rise to an infinite number of lock detector expectation curves. Rather than attempt the impossible task of presenting results for all possible fading distributions, in the sequel we shall present some specific results for Nakagami-m fading, with the understanding that computations for other fading statistics can be done in an analogous manner.

2.7.3 Example: Operation in Nakagami-m fading for $2N \cdot T \gg T_{COH}$

In this section we shall evaluate, as an example, the lock detector's operation in the presence of Nakagami-m fading with $2N \cdot T \gg T_{COH}$. As outlined in the previous subsection, the distribution of the lock detector is still Gaussian, and the bound $\mathrm{var}(l_{M,N}) \le 1/(2N)$ is still valid. Hence, all that remains to characterize is the lock detector's expected value, namely $\bar{f}_M(\bar{\chi}) = E\big[l_{M,N} \,\big|\, E[E_s/N_0] = \bar{\chi}\big]$. This has been done in Appendix B, where it was found that:

IM(X) = YQ.O.W \ X ) + 2 Z _ | ~\ / 2 — « ) ! ^ ' " ' ° ' m v ^ ' + M12 ( n ^ / ^ Y 1 (M/2 + n-\)\~ £,(/!-!)! (M/2~n)\ (60)

where: ^k,(.,m {x) — m-ke {0,-1,-2,-3,...} ?>,<.*(%) m-ke {0,-1,-2,-3,...} wherek,£>0 and m>0.5, andm = m + m/1000 (61)

and: T (T-\± X~kmm -T{m-k) l k , e , m \ X ) - . N_ (62) where k,£ > 0 and m > 0.5

Curves of (60) are shown in Fig. 37.
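As an independent cross-check of the fading average (58), the sketch below evaluates it numerically for the simplest case, $M = 2$ in Rayleigh fading (Nakagami-m with $m = 1$). It rests on two derivations that are made here and are not taken from the thesis: first, for $M = 2$ the Bessel form quoted in (59) reduces, via the elementary expressions for $I_{1/2}$ and $I_{3/2}$, to $f_2(\chi) = 1 - (1 - e^{-\chi})/\chi$; second, averaging this over the Rayleigh pdf gives $1 - \ln(1 + \bar{\chi})/\bar{\chi}$. The function names and the integration settings are illustrative choices, and the result is not a substitute for the general expression (60).

```python
import math


def f2(chi):
    """Jitter-free expected lock metric for M = 2: 1 - (1 - exp(-chi)) / chi."""
    if chi == 0.0:
        return 0.0
    return 1.0 + math.expm1(-chi) / chi


def rayleigh_snr_pdf(chi, chi_bar):
    """SNR pdf in Rayleigh fading: exp(-chi / chi_bar) / chi_bar."""
    return math.exp(-chi / chi_bar) / chi_bar


def faded_mean_numeric(chi_bar, steps=200_000, span=50.0):
    """Midpoint-rule evaluation of (58) on [0, span * chi_bar]."""
    h = span * chi_bar / steps
    total = 0.0
    for k in range(steps):
        chi = (k + 0.5) * h
        total += f2(chi) * rayleigh_snr_pdf(chi, chi_bar)
    return total * h


def faded_mean_rayleigh(chi_bar):
    """Closed form derived here for M = 2, Rayleigh: 1 - ln(1 + chi_bar) / chi_bar."""
    return 1.0 - math.log1p(chi_bar) / chi_bar


if __name__ == "__main__":
    for chi_bar_db in (-3.0, 5.0, 13.0):
        chi_bar = 10.0 ** (chi_bar_db / 10.0)
        print(chi_bar_db, faded_mean_numeric(chi_bar), faded_mean_rayleigh(chi_bar))
```

At a given average SNR the faded expectation lies below the constant-SNR value $f_2(\bar{\chi})$, so a threshold designed from (54) for a non-fading channel would be too high if applied unchanged under case (b).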
As seen in Fig. 37, the lock detector curves in the presence of fading have shapes similar to those obtained in a fading-free environment. Note, furthermore, that results for $m = 1$ correspond to Rayleigh fading (see [13 Sec. 14.3]).

[Plot: expected lock detector value vs. SNR (dB), theory and simulation.]
Fig. 37. Theoretical expectation of lock detector vs. simulations, for cases of (i) no fading (or, equivalently, fading but with $2N \cdot T \ll T_{COH}$), and (ii) frequency-flat slow Nakagami-m fading with $2N \cdot T \gg T_{COH}$, for various values of $m$. For case (i) the graph shows $f_M(\chi)$ vs. $\chi$, while for case (ii) the graph is of $\bar{f}_M(\bar{\chi})$ vs. $\bar{\chi}$. The simulations used $N = 512$, and 100 lock metrics were averaged to compute each simulation data point. For the computation of $\bar{f}_M(\bar{\chi})$ the simulations used $T_{COH} = 10 \cdot T$, while for computation of $f_M(\chi)$ the SNR was assumed to be constant (i.e. no fading, or equivalently $T_{COH} = \infty$).

2.8 Conclusions

A family of carrier lock detectors for M-PSK receivers was presented, its theoretical properties analyzed, and simulations used to validate the results. It was found that the proposed lock metrics could be of substantial practical significance, as they lend themselves to simple hardware implementation and have easily bounded variance behaviour that, along with their self-normalizing qualities, facilitates straightforward lock threshold determination and detection probability computation. It was found that the proposed lock detectors have a significant advantage over previously available lock detectors in terms of their ability to lend themselves to compact implementation in fixed-point hardware. Moreover, the AGC-independence afforded by the proposed detectors is also an advantage over previously available structures. In the final part of the chapter, the operation of the lock detector in the presence of fading was discussed, and specific results were given for Nakagami-m fading. It was further found that the channel $E_s/N_0$ could be easily estimated from the lock metric value.
This shall be investigated in depth in Chapter 4 Part A. We shall also see that the proposed lock detector has other advantages. In Chapter 3 we shall see that it can be used as a building block of an adaptive phase detector structure. In Chapter 4 Part B we shall use a modified lock detector structure in order to achieve SNR estimation for D-MPSK and for M-PSK in the absence of carrier synchronization. Thus, the lock detectors presented in this chapter are not only useful for lock detection, but have a range of other applications in SNR estimation and phase detection, issues that are explored further in the remainder of this thesis.

Chapter 3
Robust M-PSK Phase Detector Structures: Theory, Simulations, and System Identification Analysis

3.1 Introduction

Carrier phase error removal in M-PSK demodulators is generally achieved via one of two techniques. The first method uses a feedforward phase estimator ([22 Chap. 5, 6], [80], [81], [82], [83], [84], [85], [86], [87]) such as the Viterbi & Viterbi (V&V) detector [80] to estimate the phase error, and that estimate is then used to demodulate the received signal. The second method is the use of feedback systems ([22 Chap. 5, 6], [88], [89 Chap. 9], [90 Chap. 16], [91], [92], [93], [94], [95], [96]), which remove the carrier phase error using a Phase Locked Loop (PLL) that ideally cancels the phase error between the local and received carriers. While the phase-error variance performances of appropriately chosen feedforward and feedback systems converge to the same bounds at high SNR (see [22 Chap. 6]), at moderate and low SNRs feedback systems exhibit nonlinear behavioural artefacts that cannot be modeled by an equivalent feedforward system. A simplified illustration of feedforward demodulation of an M-PSK burst is shown in Fig. 38.
The topology of feedback receivers has already been discussed in Sec. 1.3, and diagrams of the place of the phase detector within such a system can be seen in Fig. 12 and Fig. 13.

(Chapter footnote) This chapter was presented in part in Linn, Y., "A Robust Phase Detection Structure for M-PSK: Theoretical Derivations, Simulation Results, and System Identification Analysis", 18th Canadian Conference on Electrical and Computer Engineering (CCECE'05), May 1-4, 2005, Saskatoon, SK, Canada, pp. 869-883.

[Flow diagram: M-PSK near-baseband signal -> Store burst in memory -> Estimate carrier phase at center of burst (e.g. using V&V algorithm) -> Demodulate stored burst using the carrier phase estimate -> Recovered symbols.]
Fig. 38. Simplified flow diagram of feedforward demodulation of an M-PSK burst. Note that storage and carrier phase estimation can proceed concurrently.

It is instructive to engage in a point-by-point comparison of the qualities of feedback phase detectors with those of feedforward carrier phase estimators.

Phase detectors:
1. Are used in closed-loop as part of a PLL.
2. The receiver achieves coherent carrier synchronization (i.e. $\Delta\omega = 0$ and $\theta_e \approx 0$).
3. A phase error estimate is produced for each symbol.
4. The PLL that produces the phase estimate is an IIR (Infinite Impulse Response) system.
5. The phase estimate is causal.
6. There is an acquisition time for the PLL to acquire lock.

Feedforward carrier phase estimators:
1. Are used in an open-loop fashion and cannot be used in a PLL.
2. The receiver does not achieve coherent carrier synchronization (i.e. $\Delta\omega \neq 0$ and $\theta_e \neq 0$).
3. One phase error estimate is produced for each block of symbols.
4. The estimator produces a phase estimate using (in most cases) an FIR (Finite Impulse Response) system.
5. The phase estimate is non-causal.
6. There is no acquisition time.

Phase detectors and feedforward carrier phase estimators have different advantages and disadvantages which make them suitable for different tasks.
Phase detectors are used when the communications signal is continuous, which allows for acquisition and tracking of the input carrier by a PLL. In contrast, feedforward carrier phase estimators, such as the Viterbi & Viterbi estimator [80], are best suited for burst transmissions that are too short to allow a carrier PLL to be employed because the acquisition latency cannot be tolerated. The tradeoff is that the amount of processing required for feedforward phase estimation is much larger and the rate of phase estimates is much reduced: one needs to store the entire burst in memory, and remove the phase error later, after a single phase error estimate is made upon the burst; this is also a manifestation of the non-causality of the estimate. Increasing the rate of phase estimates garnered from a feedforward phase estimator requires even more significant processing, e.g. by using multiple overlapping estimation windows ([22 Sec. 6.5.4], [80]).

The tracking error variance of a feedback system will tend at high SNR to the same Cramer-Rao Bound as that of a feedforward system if we choose $N = 1/(2B_L T)$, where $N$ is the number of symbols used to estimate the phase in the feedforward system, $B_L$ is the noise bandwidth of the feedback system, and $1/T$ is the symbol rate (see [22 Chap. 6]). However, at moderate and low SNR, loop nonlinearities and/or decision errors will cause an increase in the phase error variance of the feedback system. This phenomenon will not be observed for the corresponding feedforward system. Put another way, at low SNRs the feedback phase detector operates more and more in a nonlinear capacity, the linearization assumptions of the PLL break down, the nonlinear behaviour of the PLL dominates, and this results in an additional degradation in the phase error variance. Even at moderate $E_s/N_0$, the phase detector will operate a large enough portion of the time in its nonlinear region, hence affecting the PLL dynamics and the phase-error variance.
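The high-SNR equivalence $N = 1/(2B_L T)$ can be illustrated with a short sketch. It uses the standard textbook high-SNR expressions, quoted here from the general synchronization literature rather than from the passage above: a phase-error variance of $1/(2N \cdot E_s/N_0)$ for an $N$-symbol feedforward estimate (the modified Cramer-Rao bound) and $2B_L T/(2 \cdot E_s/N_0)$ for a linearized tracking loop. The function names are hypothetical.

```python
def equivalent_block_length(two_bl_t):
    """Feedforward block length N matched to a loop with the given 2*B_L*T."""
    return 1.0 / two_bl_t


def feedforward_var_high_snr(n_symbols, es_n0):
    """High-SNR phase-error variance of an N-symbol estimate: 1 / (2 N Es/N0)."""
    return 1.0 / (2.0 * n_symbols * es_n0)


def feedback_var_high_snr(two_bl_t, es_n0):
    """High-SNR linearized-loop tracking variance: 2 B_L T / (2 Es/N0)."""
    return two_bl_t / (2.0 * es_n0)


if __name__ == "__main__":
    es_n0 = 10.0 ** (20.0 / 10.0)  # 20 dB, an arbitrary high-SNR example
    for two_bl_t in (0.01, 0.1, 0.25):
        n = equivalent_block_length(two_bl_t)
        print(two_bl_t, n,
              feedforward_var_high_snr(n, es_n0),
              feedback_var_high_snr(two_bl_t, es_n0))
```

The two variances coincide by construction at the matched block length; the nonlinear effects described above are exactly what this linear, high-SNR sketch leaves out.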
The feedforward and feedback systems cannot remain equivalent at low and moderate SNR, because the PLL's system dynamics change with decreasing SNR (behaving more and more as a nonlinear system), while the feedforward system's dynamics remain unchanged. In this thesis, as already noted, we concentrate solely on coherent M-PSK receivers that utilize feedback in the form of a carrier synchronization PLL to remove the carrier phase error.

Carrier synchronization PLLs in coherent M-PSK receivers are tasked with cancelling the carrier phase error, an estimate of which is provided by a carrier Phase Detector (PD). There are two general categories of PDs: Non Data Aided (NDA) and Decision Directed (DD). The Mth-order nonlinearity detector ([13 Chap. 6], [22 Chap. 5, 6], [21 Chap. 5], [92]), the multiphase NDA Costas loop ([26], [21 Chap. 5], [92]), and the multiphase Costas loop or Mth-order nonlinearity followed by a limiter [92] are examples of NDA phase detectors. Examples of DD detectors can be found in [13 Chap. 6], [22 Chap. 5, 6], [58], [9], [97], [98], [93] and [91]. An inherent problem of DD detectors is that at low SNRs (and, as well, during acquisition) they suffer from considerable self-noise due to erroneous decisions, something which also has an effect on their S-Curves (see [22 Fig. 6-2], [58], [9], [97]). NDA detectors, while not susceptible to such a phenomenon, are nonetheless seldom used for higher-order modulations. This is because at higher values of M, NDA detectors experience high self-noise (due to the high-order nonlinearities which they include) and their implementations are significantly more complicated than their Decision Directed counterparts (see for example [26 p. 74], [21 Fig. 5.54]). An additional problem which afflicts the DD and NDA detectors just cited is that their gain is strongly linked to the AGC circuit's operating point and performance.
As we shall see in this chapter, the fact that the gain of the phase detector is not constant implies that (unless this change of gain is compensated for in some manner) the carrier PLL's characteristics will change accordingly. AGC-dependence is a particularly bothersome problem when fading signals are encountered, since in such cases the AGC often operates in a distinctly non-ideal manner, which means that AGC-dependence of the phase detector implies a similar lack of optimality in the carrier PLL.

This chapter's objective is the investigation of new families of M-PSK phase detectors. The first of those families will be a modification of the Mth-order nonlinearity detectors. The behaviour of the phase detectors is explored using stochastic theory, and it is found that the phase detectors' gain curves are independent of AGC performance and signal levels. Furthermore, these self-normalizing properties allow the performance requirements of the AGC circuit to be relaxed considerably. Analysis of the squaring loss of the phase detectors is presented, and their closed-loop phase error variance is calculated as well as simulated and found to be better than that of Mth-order nonlinearity detectors and comparable to that of DD detectors. The phase detectors will also be shown to possess a simple hardware realization, which allows for their straightforward and efficient implementation within an FPGA or ASIC.

The second family of new phase detectors will be a family of robust NDA adaptive phase detectors. These detectors will be shown to produce a constant-gain detector during tracking, which allows the carrier PLL to maintain optimality at virtually any SNR at which it can lock. The self-noise and phase-error variance performance of the proposed structure will be shown to be superior to that of other NDA and DD detectors.
Moreover, unlike other NDA phase detectors, the proposed structure has a compact implementation for all M which is quite suitable for use within an FPGA or ASIC.

The organization of the chapter is as follows. In Section 3.2 we review the signal and receiver model around which the discussion applies, while in Section 3.3 we review the accepted metrics used in evaluation of phase detector performance. In Section 3.4 we present a new family of self-normalizing NDA carrier phase detectors, where the characteristics and performance of this family of phase detectors are analyzed through theoretical derivations and simulations. In Sec. 3.5 we establish why a constant-gain phase detector is desirable, and in Section 3.6 we present a new family of adaptive M-PSK phase detectors which has such constant gain. The performance of these adaptive detectors is then investigated through theoretical derivations, simulations, and system-identification results. Finally, Section 3.7 is devoted to conclusions.

3.2 Signal and Receiver Models

The signal and receiver models, as well as the applicable notations, have been defined in Section 1.4.

3.3 Performance Analysis of PLLs - A Brief Overview

3.3.1 Performance metrics, nonlinear model, and linearized model

In this section we outline the PLL modeling techniques that we shall use to evaluate the proposed detectors. One of the most widely used PLL performance metrics [58 Sec. I] is the phase-error variance $\mathrm{var}(\theta_e)$, or equivalently, the loop-SNR defined⁹ as [49 eq. (3.3-7)] $\rho = 1/\mathrm{var}(\theta_e)$. This is because the phase-error variance has a crucial role in determining the cycle-slip rate of the PLL and the SER (Symbol Error Rate) degradation due to imperfect synchronization [21 p. 20-21, 210-211]. To understand this intuitively, we note that in order for the error rate to be small and to minimize the rate of cycle slips, we must have ([100], [99]) a small $\theta_e$, i.e. we desire $\theta_e \approx 0$.
Since $\theta_e$ is a zero-mean random process (when the PLL is locked), this means that for $\theta_e \approx 0$ to hold statistically we must have a small $\mathrm{var}(\theta_e)$ or, equivalently, a high loop-SNR $\rho\,(= 1/\mathrm{var}(\theta_e))$. Now, S-Curves of M-PSK phase detectors are periodic with period $2\pi/M$ (see for example Fig. 40, [22 Fig. 6-2]), so if the phase error strays outside the range $\theta_e \in [-\pi/M, \pi/M]$ we will observe a loss of lock which is at least momentary (i.e. a cycle slip [22 Chap. 6], [49 Chap. 6]). Thus, the meaning of $\theta_e \approx 0$ for M-PSK receivers is more precisely expressed as $|\theta_e| \ll \pi/M$. Therefore, recalling again that when in lock $\theta_e$ is a zero-mean random process [26 Chap. 2], for good performance in an M-PSK receiver we must have:

$$\mathrm{var}(\theta_e) \ll \pi^2/M^2 \tag{63}$$

Footnote 9: Sometimes the loop-SNR is defined as $\rho = 1/\mathrm{var}(\theta_o)$ (e.g. [99]). However, if we want to evaluate the performance of the phase detector only, then we should assume that there is no input-carrier phase noise, in which case we have $\mathrm{var}(\theta_o) = \mathrm{var}(\theta_e)$ and the two definitions of loop-SNR coincide. This is the assumption made in this thesis (as well as in other texts [49], [22], [21], [9], [97], [58]) and we adopt the definition $\rho = 1/\mathrm{var}(\theta_e)$.

It is clear from (63) that as M increases the requirements upon $\mathrm{var}(\theta_e)$ will be more stringent, i.e. a higher loop-SNR will be required in order to achieve good performance. Hence the useful $E_s/N_0$ operating range of the PLL will start at a higher $E_s/N_0$ as M increases.

Determination of $\mathrm{var}(\theta_e)$ via simulations is easily done ([101 Chap. 5], [102 Sec. 7.6]) using nonlinear models (shown in Fig. 47, Fig. 60), but in general the nonlinearity of the phase detector function presents great obstacles when theoretical analysis is attempted. To arrive at theoretical predictions, a standard approach adopted by synchronization texts (e.g. [22], [21], [9], [97], [58]) has been to assume that the PLL is locked and then analyze the linearized PLL model.
In the following subsections, we briefly review this approach.

3.3.2 The linearized PLL model

To develop the linearized model, we define the following quantities for any phase detector $P(n)$:

1) $B_L = \int_0^\infty |H_{PLL}(j2\pi f)|^2\,df = \tfrac{1}{2}\omega_n(\zeta + 1/(4\zeta))$ is the loop's noise bandwidth [25 p. 30-32] (see footnote 10).

2) The phase detector's S-Curve [21 p. 206] is $S_P(\theta_e) = E[P(n)\,|\,\theta_e]$, i.e. it is the average output of the phase detector given the phase error. Note that in general $S_P(\theta_e)$ will also have a dependence upon $M$, $K$, and $\chi$, but for simplicity this is not indicated in the notation of the function $S_P(\theta_e)$.

3) The linearized gain of $P(n)$ (or simply the "gain" of $P(n)$) is:

$$g_P(M,K,\chi) = \left.\frac{dS_P(\theta_e)}{d\theta_e}\right|_{\theta_e=0} \tag{64}$$

Note in (64) that the gain generally depends on $M$, $K$, and $\chi$. It is common practice to normalize the gain so that it is unity at SNR $\to\infty$. Most synchronization texts also assume a constant $K=1$, whereupon the normalized gain is:

$$\alpha_{\chi,P} = g_P(M,1,\chi)/g_P(M,1,\infty) \tag{65}$$

(note: since $\chi$ signifies SNR, we use the notations $\alpha_{SNR,P}$ and $\alpha_{\chi,P}$ interchangeably). $\alpha_{SNR,P}$ is called the amplitude suppression factor. This factor is the multiplicative factor by which the expected linearized gain of the detector is reduced due to the presence of additive noise at its inputs, as compared to the phase detector's expected linearized gain for noiseless inputs. However, as we showed in Sec. 1.5, despite its widespread use the assumption $K=1$ is usually not realistic. Hence, in this chapter we assume that $K$ is a function of the SNR, i.e. $K=\Upsilon_{AGC}(\chi)$ (see Sec. 1.5), and we define the effective amplitude suppression factor, denoted $\beta_{SNR,P}$, which (as we shall see later) is useful for incorporating AGC effects into the PLL model. The effective amplitude suppression factor for a given phase detector $P(n)$ is the multiplicative factor by which the expected linearized gain of the phase detector is reduced due to the presence of additive noise and variations of $K$ at its inputs, as compared to the phase detector's expected linearized gain for noiseless inputs and $K=1$. Formally:

$$\beta_{\chi,P} = g_P(M,\Upsilon_{AGC}(\chi),\chi)/g_P(M,1,\infty) \tag{66}$$

(note: since $\chi$ signifies SNR, we use the notations $\beta_{SNR,P}$ and $\beta_{\chi,P}$ interchangeably).

4) The normalized equivalent loop noise at $\theta_e \approx 0$ is defined as (using [22 eq. (6-73), p. 342] and normalizing)

$$N_{e,P}(n) \triangleq \left(P(n) - S_P(\theta_e)\right)/g_P(M,1,\infty). \tag{67}$$

5) The phase detector's self noise is defined as (see [58 eq. (6)])

$$\xi_P = 2\chi\cdot\mathrm{var}(N_{e,P}(n)). \tag{68}$$

Note that when AGC effects are ignored then (67) and (68) are computed with $K=1$ assumed. When AGC effects are modeled and $\beta_{SNR,P}$ is used as the gain in the linear model, then (67) and (68) must be computed using $K=\Upsilon_{AGC}(\chi)$.

6) An important tool in evaluating a phase detector is its squaring loss [13 eq. (6.2-59)]. In terms of previous definitions, if we assume $K=1$ the squaring loss is $\Omega_P \triangleq \xi_P/\alpha_{SNR,P}^2$, or $\Omega_P \triangleq \xi_P/\beta_{SNR,P}^2$ if modeling of the AGC's effects is done via $K=\Upsilon_{AGC}(\chi)$ (the squaring loss is identical in both cases, since it is an inherent property of the phase detector; see footnote 11).

The linear model is shown in the lower left of Fig. 55. The phase-error variance is [9 eq. (21)] (see footnote 12):

$$\mathrm{var}(\theta_e) = B_L\cdot T\cdot\Omega_P/\chi = \tfrac{1}{2}\omega_n(\zeta + 1/(4\zeta))\cdot T\cdot\Omega_P/\chi \tag{69}$$

Footnote 10: Note that this definition contains a factor of 1/2 w.r.t. the definition in [58]. This factor has been inserted in order to make the definition of $B_L$ compatible with the (arguably) more widely used definition, as used throughout [21], [49], [22], and [25]. However, note that (69) incorporates a factor of 2 (w.r.t. the corresponding equation in [58]), so the aforementioned factor of 1/2 does not influence the phase-error variance results.

Footnote 11: It is important to note that when AGC effects are modeled via $\Omega_P = \xi_P/\beta_{SNR,P}^2$, then both denominator and numerator must be computed with $K=\Upsilon_{AGC}(\chi)$. When AGC effects are ignored, then $\Omega_P = \xi_P/\alpha_{SNR,P}^2$ where both denominator and numerator are computed with $K=1$. See Table 3.

Footnote 12: Note that the definition of $B_L$ here contains a factor of 1/2 w.r.t. its definition in [9]. However, (69) compensates with a factor of 2 w.r.t. [9 eq. (21)].

3.4 A Family of Self-Normalizing Phase Detectors

In this section we present and investigate a new family of self-normalizing carrier phase detectors for M-PSK receivers. We begin by defining the detector and then computing its S-Curve theoretically and via simulations. The amplitude suppression factors, self-noise, squaring loss, and phase-error variance performance are derived theoretically and then verified through nonlinear-model simulations. Finally, a compact hardware implementation for this family of detectors is presented, and the novelty of the proposed detectors is illustrated intuitively in a graphical fashion.

3.4.1 Definition

Here we present a new family of M-PSK phase detectors. These can be thought of as a modification of the Mth-order nonlinearity detectors ([13], [22 Chap. 5, 6], [21 Chap. 5]) $c_M(n) = \mathrm{Im}[(I(n) + j\cdot Q(n))^M]$. The modified detector is defined as follows:

$$d_M(n) \triangleq \frac{\mathrm{Im}[(I(n)+jQ(n))^M]}{(I^2(n)+Q^2(n))^{M/2}} = \frac{\sum_{k=0}^{(M/2)-1}(-1)^k\binom{M}{2k+1}I^{M-2k-1}(n)\,Q^{2k+1}(n)}{(I^2(n)+Q^2(n))^{M/2}} \tag{70}$$

The denominator term in (70) performs adaptive normalization on the numerator, and, as we shall see, this normalization makes $d_M(n)$ behave quite differently from $c_M(n)$ (despite the notational resemblance). Another common phase detector that we shall use for comparisons is the decision directed detector [58] defined as $DD_M(n) = \hat{I}(n)\cdot Q(n) - \hat{Q}(n)\cdot I(n)$ (where $\hat{I}(n)$ and $\hat{Q}(n)$ are the I and Q decisions).

3.4.2 S-Curve of $d_M(n)$

The purpose of this section is to determine the S-Curve of the detector $d_M(n)$. We start by defining the phase of the received complex sample as $\varphi_n = \tan^{-1}(Q(n)/I(n))$.
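As a quick numerical illustration of definition (70), the following minimal Python sketch evaluates $c_M(n)$ and $d_M(n)$ for one sample at two different amplitudes. The function names, the chosen phase, and the two amplitude values are illustrative assumptions and do not come from the thesis. The sketch only shows that scaling $I(n)$ and $Q(n)$ by a common factor $K$ changes $c_M(n)$ (by $K^M$) but leaves $d_M(n)$ untouched, anticipating the polar-form result derived next.

```python
import math

def c_M(i, q, M):
    """Mth-order nonlinearity detector c_M(n) = Im[(I + jQ)^M]."""
    return (complex(i, q) ** M).imag

def d_M(i, q, M):
    """Self-normalizing detector of (70): c_M(n) / (I^2 + Q^2)^(M/2)."""
    return c_M(i, q, M) / (i * i + q * q) ** (M / 2)

M = 4        # QPSK
phi = 0.1    # illustrative received-sample phase, in radians
for K in (0.3, 0.8):   # illustrative amplitudes, standing in for AGC variation
    i, q = K * math.cos(phi), K * math.sin(phi)
    print(f"K={K}: c_M={c_M(i, q, M):+.6f}  d_M={d_M(i, q, M):+.6f}  "
          f"sin(M*phi)={math.sin(M * phi):+.6f}")
```

In this noiseless toy case the second and third printed columns agree for both amplitudes, while the first column differs between the two rows.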
Applying elementary rectangular-to-polar manipulations to (70) and using De Moivre's theorem [54 eq. 6.9] (see footnote 13) yield:

$$d_M(n) = \frac{\mathrm{Im}[(I(n)+jQ(n))^M]}{(I^2(n)+Q^2(n))^{M/2}} = \frac{(I^2(n)+Q^2(n))^{M/2}\sin(M\varphi_n)}{(I^2(n)+Q^2(n))^{M/2}} = \sin(M\varphi_n). \tag{71}$$

Footnote 13: De Moivre's theorem [54 eq. 6.9] states that for any real $x$ and $y$, $(x + j\cdot y)^M = (x^2+y^2)^{M/2}\exp(j\cdot M\theta)$ where $\theta = \tan^{-1}(y/x)$.

When locked, $\Delta\omega = 0$ and $\theta_o \in \{\theta_i + 2\pi k/M - \theta_e \,|\, k = 0,1,\ldots,M-1\}$, and hence from (21) $\varphi_n$ reduces to:

$$\varphi_n = \tan^{-1}\left(\frac{\sin(\theta_e + 2\pi(m_n-k)/M) + n_Q(nT)/(2E_s)}{\cos(\theta_e + 2\pi(m_n-k)/M) + n_I(nT)/(2E_s)}\right) \tag{72}$$

Coherent M-PSK systems overcome the carrier loop's ambiguity by differential precoding of the transmitted symbols [21 Sec. 5.7.6] or other Ambiguity Resolution (AR) circuits. We thus assume for convenience and without any loss of generality operation around the equilibrium point of $k=0$. Furthermore, note that for any constellation point $\phi_x = 2\pi\cdot m_x/M$ (where $m_x$ is an integer)

$$\sin(M(\varphi_n - \phi_x)) = \sin(M\varphi_n)\underbrace{\cos(2\pi\cdot m_x)}_{=1} - \cos(M\varphi_n)\underbrace{\sin(2\pi\cdot m_x)}_{=0} = \sin(M\varphi_n). \tag{73}$$

Eqs. (71) and (73) thus show that the output of $d_M(n)$ is not dependent upon the transmitted symbol but rather only on the phase difference between the received instantaneous symbol phase sample and the phase of any valid M-PSK constellation point. Thus we can assume for simplicity $\forall n,\ \phi_n = 0$, implying $\forall n,\ m_n = 0$. Under these simplifications we have when in lock, from (71) and (72):

$$d_M(n) = \sin\left(M\cdot\tan^{-1}\left(\frac{\sin(\theta_e) + n_Q(nT)/(2E_s)}{\cos(\theta_e) + n_I(nT)/(2E_s)}\right)\right) \tag{74}$$

Define the process $\Delta\varphi_n = \varphi_n - \theta_e$. It can be shown (see [55 Sec. 4.5], as well as Sec. 2.4 of this thesis) that $\Delta\varphi_n$ is distributed as the phase of a Rice random variable with a probability density function (pdf) for $|\Delta\varphi| < \pi$ of:

$$p_R(\Delta\varphi|\chi) = p(\Delta\varphi_n = \Delta\varphi\,|\,E_s/N_0 = \chi) = \frac{e^{-\chi}}{2\pi}\left[1 + 2\sqrt{\pi\chi}\cos(\Delta\varphi)\,e^{\chi\cos^2(\Delta\varphi)}\cdot\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\sqrt{2\chi}\cos(\Delta\varphi)} e^{-y^2/2}\,dy\right] \tag{75}$$

To arrive at the S-Curve, recall that $\Delta\varphi_n = \varphi_n - \theta_e$ so that $\varphi_n = \Delta\varphi_n + \theta_e$ and thus from (71):

$$d_M(n) = \sin(M\varphi_n) = \sin(M(\Delta\varphi_n + \theta_e)) = \cos(M\Delta\varphi_n)\sin(M\theta_e) + \sin(M\Delta\varphi_n)\cos(M\theta_e). \tag{76}$$

The S-Curve is then:

$$S_d(\theta_e) \triangleq E[d_M(n)\,|\,\theta_e] = E[\cos(M\Delta\varphi_n)]\sin(M\theta_e) + \underbrace{E[\sin(M\Delta\varphi_n)]}_{=0}\cos(M\theta_e) = f_M(\chi)\sin(M\theta_e), \tag{77}$$

where $f_M(\chi) \triangleq \int_{-\pi}^{\pi}\cos(M\Delta\varphi)\cdot p_R(\Delta\varphi|\chi)\cdot d\Delta\varphi$ was defined in (30) and investigated in Chapter 2. From (31) and (35) it was shown that:

$$f_M(\chi) = \frac{\sqrt{\pi\chi}}{2}\exp\left(-\frac{\chi}{2}\right)\left[I_{(M-1)/2}\left(\frac{\chi}{2}\right) + I_{(M+1)/2}\left(\frac{\chi}{2}\right)\right] \underset{\text{high SNR}}{\approx} \exp(-M^2/(4\chi)) \tag{78}$$

Substituting (78) into (77) yields the exact expression:

$$S_d(\theta_e) = \frac{\sqrt{\pi\chi}}{2}\exp\left(-\frac{\chi}{2}\right)\left[I_{(M-1)/2}\left(\frac{\chi}{2}\right) + I_{(M+1)/2}\left(\frac{\chi}{2}\right)\right]\sin(M\theta_e) \tag{79}$$

and the high-SNR approximation:

$$S_d(\theta_e) \underset{\text{high SNR}}{\approx} \exp(-M^2/(4\chi))\sin(M\theta_e) \tag{80}$$

We see from (79) that the S-Curve is sinusoidal with M stable equilibrium points, which is exactly what we would expect from an M-PSK phase detector. S-Curve plots are shown in Fig. 40, where we see that the approximation $\exp(-M^2/(4\chi))\sin(M\theta_e)$ is quite accurate (even at low SNR), hence making it a useful tool for the designer for manually estimating the S-Curve behaviour. The gain of $d_M$ is from (64) and (79)-(80):

$$g_d(M,K,\chi) = f_M(\chi)\cdot\left[\frac{\partial\sin(M\theta_e)}{\partial\theta_e}\right]_{\theta_e=0} = M f_M(\chi) = M\frac{\sqrt{\pi\chi}}{2}\exp\left(-\frac{\chi}{2}\right)\left[I_{(M-1)/2}\left(\frac{\chi}{2}\right) + I_{(M+1)/2}\left(\frac{\chi}{2}\right)\right] \underset{\text{high SNR}}{\approx} M\cdot\exp(-M^2/(4\chi)) \tag{81}$$

3.4.3 Amplitude suppression factors of $d_M(n)$

The amplitude suppression factors are found from (65), (66), and (79)-(80):

$$\alpha_{\chi,d} = \beta_{\chi,d} = f_M(\chi) = \frac{\sqrt{\pi\chi}}{2}\exp\left(-\frac{\chi}{2}\right)\left[I_{(M-1)/2}\left(\frac{\chi}{2}\right) + I_{(M+1)/2}\left(\frac{\chi}{2}\right)\right] \underset{\text{high SNR}}{\approx} \exp(-M^2/(4\chi)) \tag{82}$$

Exact closed-form expressions exist for $\alpha_{SNR,d}$ and $\beta_{SNR,d}$. These are easily obtained from (82) and (32) (for even M):

$$\alpha_{\chi,d} = \beta_{\chi,d} = 1 + \frac{M}{2}\sum_{n=1}^{M/2}\frac{(-1)^n\,(M/2+n-1)!}{n!\,(M/2-n)!\,\chi^n} + (-1)^{M/2+1}\exp(-\chi)\sum_{n=1}^{M/2}\frac{(M/2+n-1)!}{(n-1)!\,(M/2-n)!\,\chi^n} \tag{83}$$

We note that (83) is a finite sum composed of polynomials and exponentials, and hence is easily computed using numerical computation packages such as Matlab. Graphs of (82) are given in Fig. 39. As can be seen in that figure, the approximation $\exp(-M^2/(4\chi))$ is quite accurate even at low SNR, particularly for $M>2$. Therefore, this is a useful approximation that can be used by the engineer for quick manual computations when designing the carrier PLL.
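To make the comparison between the exact expression (78) and the high-SNR approximation concrete, the following is a minimal Python sketch that tabulates both. It is only an illustration: the helper `bessel_i`, which evaluates the modified Bessel function through its ascending power series, and the particular SNR values are our own assumptions, and the series is adequate only for the moderate arguments used here (very high SNRs would overflow the intermediate terms).

```python
import math

def bessel_i(nu, x, tol=1e-16):
    """Modified Bessel function I_nu(x), x > 0, via its ascending power series."""
    term = (x / 2.0) ** nu / math.gamma(nu + 1.0)   # k = 0 term
    total, k = term, 0
    while term > tol * total:
        k += 1
        term *= (x / 2.0) ** 2 / (k * (k + nu))     # ratio of consecutive terms
        total += term
    return total

def f_exact(M, chi):
    """Exact f_M(chi) of (78); chi is Es/N0 as a linear ratio (not dB)."""
    x = chi / 2.0
    return (math.sqrt(math.pi * chi) / 2.0) * math.exp(-x) * (
        bessel_i((M - 1) / 2.0, x) + bessel_i((M + 1) / 2.0, x))

def f_approx(M, chi):
    """High-SNR approximation exp(-M^2 / (4 chi))."""
    return math.exp(-M * M / (4.0 * chi))

for M in (2, 4, 8, 16):
    for snr_db in (5, 10, 20):                      # illustrative SNR values
        chi = 10.0 ** (snr_db / 10.0)
        print(f"M={M:2d}  Es/N0={snr_db:2d} dB  "
              f"exact={f_exact(M, chi):.5f}  approx={f_approx(M, chi):.5f}")
```

Since $\alpha_{\chi,d} = \beta_{\chi,d} = g_d/M = f_M(\chi)$, the same table can be read as the amplitude suppression factor or as the normalized gain of $d_M$.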
Fig. 39. Comparison of exact expression for $f_M(\chi)$ vs. the approximate expression $\exp(-M^2/(4\chi))$, plotted against $E_s/N_0$ (dB) ($=\chi$ (dB)) for M=2, 4, 8, and 16. Note that since $\beta_{\chi,d} = \alpha_{\chi,d} = g_d/M = f_M(\chi)$ (see (81), (82)) then this figure is also useful for predicting $\beta_{\chi,d}$, $\alpha_{\chi,d}$, and $g_d$.

3.4.4 Loop noise, self noise, and squaring loss of $d_M(n)$

The loop noise is easily found from (76) and (67) to be:

$$N_{e,d}(n) = \frac{1}{M}\sin(M\Delta\varphi_n) \tag{84}$$

Observe that $N_{e,d}$ is not Gaussian, though it may be approximated by Gaussian noise at high SNR, since (using (33)) we have $N_{e,d}(n) \xrightarrow{\chi\to\infty} \Delta\varphi_n \underset{\text{high } \chi}{\sim} N(0, 1/(2\chi))$. The self noise of $d_M$ is from (68), (84), (31) and (35):

$$\xi_d = 2\chi\cdot\mathrm{var}(N_{e,d}(n)) = 2\chi\int_{-\pi}^{\pi}\left(\tfrac{1}{M}\sin(M\tau)\right)^2 p_R(\tau|\chi)\,d\tau = \frac{\chi}{M^2}\int_{-\pi}^{\pi}(1-\cos(2M\tau))\,p_R(\tau|\chi)\,d\tau = \frac{\chi}{M^2}\left(1-f_{2M}(\chi)\right)$$

$$= \frac{\chi}{M^2}\left(1 - \frac{\sqrt{\pi\chi}}{2}\exp\left(-\frac{\chi}{2}\right)\left[I_{(2M-1)/2}\left(\frac{\chi}{2}\right) + I_{(2M+1)/2}\left(\frac{\chi}{2}\right)\right]\right) = \frac{\chi}{M^2}\left(-M\sum_{n=1}^{M}\frac{(-1)^n\,(M+n-1)!}{n!\,(M-n)!\,\chi^n} + (-1)^{M}\exp(-\chi)\sum_{n=1}^{M}\frac{(M+n-1)!}{(n-1)!\,(M-n)!\,\chi^n}\right) \tag{85}$$

and the squaring loss $\Omega_d = \xi_d/\alpha_{SNR,d}^2\ (= \xi_d/\beta_{SNR,d}^2)$ is readily found using (85) and (82).

3.4.5 Examples of S-Curves of $d_M(n)$

To give some graphical insight into the S-Curve of $d_M(n)$, we start by showing the complete S-Curve of $d_2(n)$, as shown in Fig. 40. In that figure, we have used (79) in order to plot the exact predicted values $S_d(\theta_e) = \frac{\sqrt{\pi\chi}}{2}\exp(-\frac{\chi}{2})[I_{(M-1)/2}(\frac{\chi}{2}) + I_{(M+1)/2}(\frac{\chi}{2})]\sin(M\theta_e)$, the S-Curve predicted by the approximation of (80), which is $S_d(\theta_e) \approx \exp(-M^2/(4\chi))\sin(M\theta_e)$, and simulation results (i.e. computation of $E[d_M(n)\,|\,\theta_e]$ through stochastic simulations). As can be seen from that figure, the simulation results agree perfectly with the exact predicted results, and, furthermore, the approximation (80) is found to be an excellent approximation, even at low SNRs.

Fig. 40. S-Curve of $d_2(n)$ for various SNR ratios ('Predicted, Exact' is $S_d(\theta_e) = f_2(\chi)\sin(2\theta_e)$, and 'Predicted, Approx.' is $S_d(\theta_e) \approx \exp(-1/\chi)\sin(2\theta_e)$. See (77), (79), (80)).

To further give quantitative results, we first note that from (79) the S-Curves are periodic with period $2\pi/M$ over the entire interval $-\pi < \theta_e < \pi$. Hence, it suffices to plot the S-Curve over the interval $-\pi/M \le \theta_e \le \pi/M$. In Fig. 41 to Fig. 44 we see S-Curve examples over the intervals $-\pi/M < \theta_e < \pi/M$ for M=2, 4, 8 and 16, for various SNR ratios. Plotted are the exact predicted S-Curves (eq. (79)), the approximated predicted S-Curves (eq. (80)), as well as simulation results (i.e., evaluation of $E[d_M(n)\,|\,\theta_e]$ through stochastic simulations of eq. (70)). We see from those figures that for $M>2$ the approximate expression of (80) is almost a perfect match with the exact expression of (79). Hence, (80) is a useful tool which enables the designer to predict the S-Curve with accuracy. Note that the SNR values which are used to plot these figures are those which are appropriate for the given modulation, i.e. from low SNRs (= close to the lock threshold of the PLL for the respective modulation index M) through moderate SNRs and then to a high SNR value. For more information about lock thresholds the reader is referred to [58 Sec. III].

Fig. 41. S-Curve of $d_2(n)$ (for BPSK) for various SNRs in the interval $-\pi/2 < \theta_e < \pi/2$. The phase detector curve is periodic with period $2\pi/M = \pi$ over the entire interval $-\pi < \theta_e < \pi$.

[Plot for Fig. 42: horizontal axis $\theta_e/\pi$; curves labelled Simulation, Pred.-Exact, and Pred.-Approx. at $E_s/N_0$ = 30, 12, 6, and 0 dB.]

Fig. 42. S-Curve of $d_4(n)$ (for QPSK) for various SNRs in the interval $-\pi/4 < \theta_e < \pi/4$. The phase detector curve is periodic with period $2\pi/M = \pi/2$ over the entire interval $-\pi < \theta_e < \pi$.

[Plot for Fig. 43: horizontal axis $\theta_e/\pi$; curves labelled Simulation, Pred.-Exact, and Pred.-Approx. at $E_s/N_0$ = 35, 20, 12, and 7 dB.]

Fig. 43. S-Curve of $d_8(n)$ (for 8-PSK) for various SNRs in the interval $-\pi/8 < \theta_e < \pi/8$. The phase detector curve is periodic with period $2\pi/M = \pi/4$ over the entire interval $-\pi < \theta_e < \pi$.

[Plot for Fig. 44: horizontal axis $\theta_e/\pi$; curves labelled Simulation, Pred.-Exact, and Pred.-Approx. at $E_s/N_0$ = 40, 30, 20, and 15 dB.]

Fig. 44. S-Curve of $d_{16}(n)$ (for 16-PSK) for various SNRs in the interval $-\pi/16 < \theta_e < \pi/16$. The phase detector curve is periodic with period $2\pi/M = \pi/8$ over the entire interval $-\pi < \theta_e < \pi$.

3.4.6 Squaring loss comparison vs. other phase detectors

Quantitative judgments regarding the squaring loss of $d_M$ can be reached by comparing that loss to that which is incurred when $c_M$ or $DD_M$ is used. Such a comparison is shown in Fig. 45.
Linear-model predictions of closed-loop phase-error variance of $d_M$ can be made via (69) and are shown in Fig. 46, with $\zeta=0.95$ and $\omega_n = 8.24\cdot10^{-3}/T$ fixed at all SNRs so that $2B_LT = 0.01$ at all SNRs. The plots show that $d_M$ provides excellent performance, particularly compared to $c_M$. Hence, $d_M$ is a very viable phase detector. Also plotted in Fig. 46 is the Cramer-Rao bound ([22 eq. (6-108)], [21]), defined as:

$$CRB = B_L\cdot T/\chi = \tfrac{1}{2}\omega_n(\zeta + 1/(4\zeta))\cdot T/\chi \tag{86}$$

The data required to plot the results for $c_M$ and $DD_M$ was taken from Table 3.

Fig. 45. Squaring loss as a function of $E_s/N_0$ (curves for $d_M$, $c_M$, and $DD_M$ with M=2, 4, 8, and 16).

Fig. 46. Calculated phase-error variance $\mathrm{var}(\theta_e)$, using linearized baseband model. The loop's noise bandwidth is held fixed at $2B_LT=0.01$. AGC effects are ignored (i.e. $K=1$ identically). $\zeta=0.95$.

3.4.7 Nonlinear model simulation investigation of $d_M(n)$

To verify the linear-model predictions of Fig. 46, nonlinear simulations were conducted, with the simulation model and the results presented in Fig. 47. As we can see by comparing Fig. 46 to Fig. 47, the theoretical predictions of the linear model are in excellent agreement with the nonlinear simulation results anywhere above the PLL's lock threshold.

Fig. 47. Simulated phase-error variance $\mathrm{var}(\theta_e)$, using equivalent baseband nonlinear model. Loop bandwidth is held fixed at $2B_LT=0.01$. AGC effects are ignored (i.e. $K=1$ identically). $\zeta=0.95$. The SNRs below which $\mathrm{var}(\theta_e)$ increases dramatically for M=8 and M=16 are the PLL lock thresholds for those modulations.
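The linear-model prediction for $d_M$ can be reproduced numerically by chaining (82), (85), (69), and (86). The following Python sketch is offered only as an illustration of that chain under the assumption $K=1$ (AGC effects ignored) and $2B_LT = 0.01$. The function names and the choice of SNR points are ours, and the Bessel function is again evaluated with a simple power series that is suitable only for moderate SNRs.

```python
import math

def bessel_i(nu, x, tol=1e-16):
    """Modified Bessel function I_nu(x), x > 0, via its ascending power series."""
    term = (x / 2.0) ** nu / math.gamma(nu + 1.0)
    total, k = term, 0
    while term > tol * total:
        k += 1
        term *= (x / 2.0) ** 2 / (k * (k + nu))
        total += term
    return total

def f_M(M, chi):
    """f_M(chi) of (78), which equals the amplitude suppression factor (82)."""
    x = chi / 2.0
    return (math.sqrt(math.pi * chi) / 2.0) * math.exp(-x) * (
        bessel_i((M - 1) / 2.0, x) + bessel_i((M + 1) / 2.0, x))

def linear_model(M, chi, two_BL_T=0.01):
    """Return (squaring loss, var(theta_e), CRB) for d_M with K = 1."""
    xi = chi / M ** 2 * (1.0 - f_M(2 * M, chi))   # self noise, first line of (85)
    omega = xi / f_M(M, chi) ** 2                 # squaring loss xi_d / alpha^2
    crb = 0.5 * two_BL_T / chi                    # (86): B_L * T / chi
    return omega, omega * crb, crb                # (69): var = B_L * T * Omega / chi

for M in (2, 4, 8):
    for snr_db in (10, 15, 20):                   # illustrative SNR values
        chi = 10.0 ** (snr_db / 10.0)
        omega, var, crb = linear_model(M, chi)
        print(f"M={M}  Es/N0={snr_db} dB  Omega={omega:.4f}  "
              f"var={var:.3e}  CRB={crb:.3e}")
```

As the SNR grows the squaring loss tends to 1, so the predicted variance approaches the bound of (86).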
3.4.8 Hardware realization

Since from (71) $d_M(n) = \sin(M\varphi_n)$, we see from inspection that $d_M(n)$'s value is independent of $K$, and hence independent of the AGC. Moreover, $d_M(n)$ has an efficient fixed-point hardware implementation in the form of a lookup table; this is due to exactly the same reasons that enabled such an implementation for $x_M(n)$ in Section 2.3.3, namely the small dynamic range that is needed to express $d_M(n)$ (since $|d_M(n)| = |\sin(M\varphi_n)| \le 1$). The proposed implementation is shown in Fig. 48.

[Block diagram for Fig. 48: the inputs $I(n)$ and $Q(n)$ address a lookup table that stores the right-hand side of (70).]

Fig. 48. Efficient hardware generation of $d_M(n)$.

To see why the existence of such an implementation is significant, we make note of the fact that other phase detectors suffer from a large dynamic range that often renders a similar implementation unfeasible. To highlight this point, consider the Mth-order nonlinearity $c_M(n)$. It is easily seen that $c_M(n) \propto K^M$, which means that a phase detector lookup table and the ensuing datapath (in particular, the loop filter) must all be able to handle the dynamic range of $K^M$. This is often prohibitive to implement in fixed-point hardware. Moreover, the dependence on $K^M$ implies a nonlinear dependence upon the AGC, a dependence that $d_M(n)$ does not exhibit. A similar conclusion can be reached with regards to the Decision Directed detector ([58], [9], [97]) $DD_M(n) = \hat{I}(n)\cdot Q(n) - \hat{Q}(n)\cdot I(n)$ (where $\hat{I}(n)$ and $\hat{Q}(n)$ are decisions on the I and Q channels). Simple substitution shows that $DD_M(n) \propto K$, so use of $DD_M(n)$ means that a dependence upon the dynamic range of $K$ and the AGC would still exist.

In contrast, for $d_M(n)$, the output of the lookup table is always in the interval $[-1,1]$, regardless of $K$ or $M$. Thus, with $d_M(n)$, a fixed-point lookup table with just a 10-bit or even just an 8-bit output, i.e. quantization of the phase error estimate to 10 bits (quantization to 1024 levels of the interval $[-1,1]$) or 8 bits (quantization to 256 levels of the interval $[-1,1]$), will be more than enough for $d_M(n)$, for any $K$ and any $M$.

Indeed, we see that $d_M$ has many merits. In fact, it is an excellent M-PSK phase detector that can be used instead of $c_M$ or $DD_M$. However, even better performance can be achieved by using $d_M$ within an adaptive phase detection structure that is introduced in Section 3.6.

3.4.9 Lookup Table Implementation Issues

Though the lookup table's values of $d_M(n)$ are well behaved, a valid and necessary question is: what happens when both $I(n)$ and $Q(n)$ are 0? In that case the denominator of (70) vanishes, and the output of the corresponding lookup table is undefined. While in the unquantized theoretical analysis the event $I(n) = Q(n) = 0$ has an infinitesimal probability and is thus inconsequential, for the practical quantized case this eventuality has a finite probability and must be addressed. Fortunately, there is an exceedingly simple way to solve this problem, which, quite fortuitously, also turns out to be the mathematically correct approach. Let us consider for example BPSK. Then we have from (70):

$$d_2(n) = \frac{2I(n)Q(n)}{I^2(n)+Q^2(n)} \tag{87}$$

Let us assume for the example that (87) is implemented in a lookup table with input quantization of 4 bits for each input, and an output of 8 bits, hence resulting in a $(2^4\cdot2^4)\cdot8 = 2048$ bit lookup table. What happens when the I and Q inputs are both 0? Eq. (87) is useless in this case. However, the key here is to recognize that the quantization of the inputs to 4 bits means that we lack information regarding the bits that were not expressed in the quantized result. Assume that we treat all the binary numbers as fractions, i.e. that the binary point immediately follows the sign bit (this is a completely arbitrary convention).
Then in 4-bit quantized binary form a zero value for the I and Q channels is $I = 0.000_2$ and $Q = 0.000_2$. Taking for example the I value, it is important to note that the value $I = 0.000_2$ will result from quantization of all true values of I between $I = (0.00000\ldots)_2 = 0$ and $I = (0.000111\ldots)_2 = 0.125$. Thus, the correct way to interpret $I = 0.000_2$ is by the average of all possible values that it could represent, i.e. $(0.125 + 0)/2 = 0.0625 = 0.0001_2$. This reasoning applies not only to the 0-value case, but to all values. In other words, the mathematically correct way to interpret a quantized value is to think of the true value as one which contains an additional "invisible" LSB, whose value is 1. Computing the values of the lookup table in this way has the added advantage that the lookup table value for all-zero inputs is well-defined. For example, for the case of (87) with 4-bit inputs and an 8-bit output, we have for the 0-input case:

$$d_2(n)\Big|_{I=0.000_2,\ Q=0.000_2} = \mathrm{quant}_8\left(\frac{2\cdot0.0625\cdot0.0625}{0.0625^2 + 0.0625^2}\right) = \mathrm{quant}_8(1) = 0.1111111_2\ (= 0.9921875) \tag{88}$$

Contour maps of the corresponding lookup tables with and without quantization are shown in Fig. 50.

Fig. 50. Intensity graph for the lookup table computing $d_2(n) = 2I(n)Q(n)/(I^2(n)+Q^2(n))$, with quantization of inputs to 4 bits (each) and the output to 8 bits (note that the computation of the lookup table values is done as outlined in Sec. 3.4.9). The fact that the quantized I and Q ranges are asymmetric around 0 (they are $[-1, 0.875]$) is due to the 2's complement representation.

3.4.10 Intuitive Understanding of $d_M(n)$

We shall now take an interlude from the rigorous mathematical procedure undertaken thus far, in order to instil an intuitive appreciation of the phase detectors' performance and novelty. This is facilitated by comparing $d_M(n)$ to the standard Mth-order nonlinearity $c_M(n) = \mathrm{Im}[(I(n) + j\cdot Q(n))^M]$.
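Returning briefly to the lookup table of Sec. 3.4.9, the construction with the "invisible" LSB can be sketched in a few lines of Python. This is only an illustrative sketch for $d_2(n)$ with 4-bit inputs and an 8-bit output. The function name, the dictionary representation, and in particular the output quantizer (truncation followed by saturation to the largest representable code) are our assumptions, chosen so that $\mathrm{quant}_8(1)$ reproduces the value in (88).

```python
import math

def build_lut_d2(in_bits=4, out_bits=8):
    """Lookup table for d_2 = 2IQ/(I^2+Q^2), indexed by 2's complement input codes."""
    in_scale = 2 ** (in_bits - 1)      # 8: a 4-bit code c represents c/8 in [-1, 0.875]
    out_scale = 2 ** (out_bits - 1)    # 128: an 8-bit code c represents c/128
    lut = {}
    for code_i in range(-in_scale, in_scale):
        for code_q in range(-in_scale, in_scale):
            # Interpret each quantized input with an extra "invisible" LSB equal
            # to 1, i.e. as the midpoint of its quantization interval.
            i = (code_i + 0.5) / in_scale
            q = (code_q + 0.5) / in_scale
            d2 = 2.0 * i * q / (i * i + q * q)   # denominator can never be zero
            # Assumed output quantizer: truncate, then saturate to the code range.
            code = min(max(math.floor(d2 * out_scale), -out_scale), out_scale - 1)
            lut[(code_i, code_q)] = code
    return lut

lut = build_lut_d2()
print("table size in bits:", len(lut) * 8)
print("entry for I = Q = 0.000b:", lut[(0, 0)], "->", lut[(0, 0)] / 128)
```

Because every input is interpreted at the midpoint of its quantization interval, no table entry requires a division by zero, including the all-zero input.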
As the chosen example we look at M=4 (QPSK). In Fig. 51 and Fig. 52 we see a contour map of $d_4(n) = \frac{4I^3(n)Q(n) - 4I(n)Q^3(n)}{(I^2(n)+Q^2(n))^2}$, upon which demodulated QPSK signals are superimposed, with $K=0.3$ and $K=0.8$, respectively. As can be clearly seen in the contour graph, the radial symmetry of the contours means that the phase detector output does not depend on the value of $K$; rather, it depends solely on the phase departure of the current received symbol from the ideal phase of a constellation point. This, however, is clearly not the case for the QPSK phase detector $c_4(n) = \mathrm{Im}[(I(n) + j\cdot Q(n))^4]$, whose contour graph is featured in Fig. 53 and Fig. 54, upon which the same demodulated constellations are superimposed. As is evident from those figures, for the 4th-order nonlinearity detector, there is a strong dependence of the phase detector value upon the value of $K$, and hence upon the AGC's performance and dynamic range.

The AGC-independence of $d_M(n)$ is not complete, in the sense that the AGC must still ensure that the samplers and preceding signal chains are not overdriven, and, conversely, that the samplers are not underdriven to the extent that quantization noise becomes significant. However, once the AGC meets these two basic requirements, the values of $K$ (and the fluctuations thereof) are irrelevant.

Fig. 51. Received QPSK constellation, $E_s/N_0 = 20$ dB, $K=0.3$, superimposed on a contour map of $d_4(n) = \frac{4I^3(n)Q(n) - 4I(n)Q^3(n)}{(I^2(n)+Q^2(n))^2}$.

Fig. 52. Received QPSK constellation, $E_s/N_0 = 20$ dB, $K=0.8$, superimposed on a contour map of $d_4(n) = \frac{4I^3(n)Q(n) - 4I(n)Q^3(n)}{(I^2(n)+Q^2(n))^2}$.

Fig. 53. Received QPSK constellation, $E_s/N_0 = 20$ dB, $K=0.3$, superimposed on a contour map of $c_4(n) = \mathrm{Im}[(I(n) + j\cdot Q(n))^4] = 4I^3(n)Q(n) - 4I(n)Q^3(n)$.

Fig. 54. Received QPSK constellation, $E_s/N_0 = 20$ dB, $K=0.8$, superimposed on a contour map of $c_4(n) = \mathrm{Im}[(I(n) + j\cdot Q(n))^4] = 4I^3(n)Q(n) - 4I(n)Q^3(n)$.

Yet another way to interpret the functioning of $d_M(n)$ is by noticing that its amplitude suppression factors $\beta_{\chi,d} = \alpha_{\chi,d} = \frac{\sqrt{\pi\chi}}{2}\exp(-\frac{\chi}{2})[I_{(M-1)/2}(\frac{\chi}{2}) + I_{(M+1)/2}(\frac{\chi}{2})]$ are the same as those for a limiter which follows (or precedes) an Mth-order nonlinearity (or, equivalently, a limiter which follows (or precedes) a multiphase Costas phase detector), as is the case analyzed in [92]. Since the amplitude suppression factors are the same, we can thus think of the self-normalizing action of $d_M(n) = \frac{\mathrm{Im}[(I(n)+jQ(n))^M]}{(I^2(n)+Q^2(n))^{M/2}}$ as that of the digital equivalent of a limiter (although, of course, the implementation of $d_M(n)$ is much more compact and suited for digital communications than the system considered in [92]).

3.5 The Case for a Constant-Gain Phase Detector

In Fig. 46 and Fig. 47 we have $2B_LT = \omega_n(\zeta + 1/(4\zeta))\cdot T = 0.01$ at all SNRs. This is achieved by using a different loop-filter gain at each SNR so that $\zeta=0.95$ and $\omega_n = 8.24\cdot10^{-3}/T$ (the loop-filter gain is the coefficient $K_a$ in the loop-filter function; see the linear and nonlinear models in the lower left of Fig. 46, Fig. 47, and Fig. 60). Maintaining $\zeta$ and $\omega_n$ constant at all SNRs is the practice adopted in most synchronization texts since it facilitates a meaningful comparison of the $\mathrm{var}(\theta_e)$ achievable by the compared phase detectors when they are employed in PLLs that have identical parameters. However, it is important to note that constant $\zeta$ and $\omega_n$ cannot be maintained by a PLL with a fixed (= non-adaptive) loop-filter gain and which uses either $DD_M$, $c_M$, or $d_M$.

Consider, for example, in Figs. 46 and 47 the PLL for QPSK (M=4) which employs $DD_M$. Let us use the notation $K_{a,DD}(\chi)$ to refer to the loop-filter gain at $E_s/N_0 = \chi$.
Now, from [58], [9], and [97] we know that if the loop-filter gain needed to achieve $\zeta=0.95$ and $\omega_n = 8.24\cdot10^{-3}/T$ in noiseless conditions is $K_{a,DD}(\infty)$, then in order to maintain the same $\zeta$ and $\omega_n$ at $E_s/N_0 = \chi$ the loop-filter gain $K_{a,DD}(\chi)$ must be $K_{a,DD}(\chi) = K_{a,DD}(\infty)/\alpha_{\chi,DD}$ if AGC effects are ignored by assuming $K=1$ (as done in [58], [9], [97]). Moreover, Sec. 1.5 shows that to assume $K=1$ is unrealistic and AGC effects must be modeled, and it is easily shown that in practice the loop-filter gain at $E_s/N_0 = \chi$ actually must be $K_{a,DD}(\infty)\,\beta_{\infty,DD}/\beta_{\chi,DD}$. Thus, when inspecting Figs. 46-47 it is important to realize that since the loop-filter's gain is different at each SNR, the results for a given phase detector cannot be obtained using a single PLL with a fixed loop-filter gain. The correct way to interpret Figs. 46-47 (and similar graphs in texts such as [22] and [21]) is hence by considering the results at each SNR as if they were obtained by measurements on PLLs unique to that SNR that were optimized for operation at that SNR to yield $\zeta=0.95$ and $\omega_n = 8.24\cdot10^{-3}/T$.

Indeed, as noted earlier, a single PLL with a fixed loop-filter gain and which uses $c_M$, $DD_M$ or $d_M$ cannot maintain constant $\zeta$ and $\omega_n$ over the entire SNR range; rather, the desired values will be attained only at a single SNR which we call the optimization SNR. When we are not operating at the optimization SNR the PLL will observe changes in $\zeta$ and $\omega_n$ that will cause changes in all of the PLL's parameters, e.g. the noise bandwidth $B_L = \tfrac{1}{2}\omega_n(\zeta + 1/(4\zeta))$, the settling time $T_{set} \approx 2\pi/\omega_n$, the lock range $\Delta\omega_L \approx 2\zeta\omega_n$, the pull-out range $\Delta\omega_{PO} \approx 1.8\,\omega_n(\zeta+1)$, and the cycle-slip statistics ([103 Chap. 2], [22 Sec. 6.4], [49 Chap. 6], [58], [9], [97]).
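The sensitivity of the loop parameters listed above to $\zeta$ and $\omega_n$ is easy to tabulate. The Python sketch below evaluates the four expressions just quoted at the design point $\zeta=0.95$, $\omega_n = 8.24\cdot10^{-3}/T$ (with $T$ normalized to 1) and at a second, purely hypothetical operating point in which both $\zeta$ and $\omega_n$ have dropped by the same factor. The function name and the factor 0.7 are illustrative assumptions only.

```python
import math

def pll_parameters(zeta, omega_n):
    """Loop parameters of a second-order PLL from the approximations quoted above."""
    return {
        "B_L": 0.5 * omega_n * (zeta + 1.0 / (4.0 * zeta)),   # noise bandwidth
        "T_set": 2.0 * math.pi / omega_n,                      # settling time
        "lock_range": 2.0 * zeta * omega_n,                    # lock range
        "pull_out": 1.8 * omega_n * (zeta + 1.0),              # pull-out range
    }

T = 1.0                                   # symbol period, normalized
design = pll_parameters(0.95, 8.24e-3 / T)
print("2*B_L*T at the design point:", 2.0 * design["B_L"] * T)

scale = 0.7                               # hypothetical common reduction factor
detuned = pll_parameters(0.95 * scale, 8.24e-3 / T * scale)
for name in design:
    print(f"{name:10s} design={design[name]:.5g}  detuned={detuned[name]:.5g}")
```

The first printed value confirms that the quoted design point gives $2B_LT \approx 0.01$; the remaining lines show how every derived parameter moves once $\zeta$ and $\omega_n$ depart from their design values.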
To quantify this effect, suppose we use a phase detector $P(n)$ in a PLL that has a fixed-gain loop-filter, and that the PLL is designed for operation at $E_s/N_0 = \lambda$ (the "optimization SNR") with the optimal parameters $\omega_{opt}$ and $\zeta_{opt}$ (e.g., $\lambda$ might be the lowest SNR at which the error correction decoder provides an acceptable coding gain). It can be shown [48 Sec. 9.1] that, accounting for AGC effects, at $E_s/N_0 = \chi$ the natural frequency $\omega_{n,\chi}$ and damping ratio $\zeta_\chi$ will be:

$$\omega_{n,\chi} = \omega_{opt}\sqrt{\beta_{\chi,P}/\beta_{\lambda,P}} \quad\text{and}\quad \zeta_\chi = \zeta_{opt}\sqrt{\beta_{\chi,P}/\beta_{\lambda,P}}.$$ (89)

Thus (assuming $\beta_{\chi,P}$ increases monotonically vs. $\chi$) we conclude from (89) that for $\chi > \lambda$ we have higher-than-optimal $\zeta$ and $\omega_n$, and for $\chi < \lambda$ we have lower-than-optimal $\zeta$ and $\omega_n$. Only at $\chi = \lambda$ does the PLL perform as desired. To illustrate this phenomenon and its effect upon the PLL we present in Fig. 55 phase-error variance results and CRBs, computed via (69) and (86). Clearly, as Fig. 55 shows, the variation of $\zeta$ and $\omega_n$ due to the AGC profoundly affects the PLL. Note that at low SNR the AGC appears to cause a reduction in $\mathrm{var}(\theta_e)$, but it would be a fallacy to say that this is a positive effect, since this reduction is due to the reduction of $\zeta$ and $\omega_n$, which has a detrimental effect on the PLL's stability, lock range, pull-out range, etc., as outlined in the previous paragraph.

[Figure 55: phase-error variance (dB) versus $E_s/N_0$ (dB) for $d_4$, $c_4$, and $DD_4$, with and without AGC effects, together with the corresponding CRBs. Annotations in the plot: the CRBs are affected by the changes in $B_L$ that result from the AGC; at the optimization SNR ($E_s/N_0 = 5$ dB) the results with and without AGC coincide; results with AGC converge to the CRB with AGC at high SNR, and results without AGC converge to the CRB without AGC at high SNR.]

Fig. 55. Calculated $\mathrm{var}(\theta_e)$ using the linearized model when AGC effects are ignored and when AGC effects are modeled. Modulation is QPSK ($M = 4$). Optimization SNR is $\lambda = 5$ dB. Note that in the linearized model (lower left) the loop noise $N_e$ is computed differently for the "w. AGC" and "no AGC" cases (see Sec. 3.3.2). Also, when AGC effects are ignored, the loop-filter's gain $K_a$ is selected at each SNR such that $\zeta = \zeta_{opt} = 0.95$ and $\omega_n = \omega_{opt} = 8.24\cdot10^{-3}/T$ at each SNR. When AGC effects are modeled, $K_a$ is constant and is selected so that at $E_s/N_0 = 5$ dB the PLL has $\zeta = \zeta_{opt} = 0.95$ and $\omega_n = \omega_{opt} = 8.24\cdot10^{-3}/T$ (since $K_a$ is constant, this will not hold at other SNRs; rather, changes will occur as per (89)).

From (89) it follows that to achieve $\omega_{opt}$ and $\zeta_{opt}$ at all SNRs, the phase detector must have $\beta_{SNR,P} = 1$ (implying a constant gain vs. the SNR, see footnote 14, since from (66) $g_P(M,K,\chi) = g_P(M,1,\infty)\cdot\beta_{\chi,P}$). We now present such a detector.

3.6 A Family of Robust Adaptive Phase Detectors

In this section we present a family of adaptive phase detectors. The adaptive phase detector is composed of two complementary and related structures: (a) the M-PSK phase detector presented in Sec. 3.4, and (b) the M-PSK lock detector presented in Chapter 2. In the following subsection we briefly review those derivations of Chapter 2 which are necessary for the current undertaking.

3.6.1 Review of lock detector

In Chapter 2 an auxiliary random process was introduced via

$$x_M(n) \triangleq \frac{\mathrm{Re}\left[(I(n) + j\,Q(n))^M\right]}{\left(I^2(n) + Q^2(n)\right)^{M/2}} = \frac{\sum_{k=0}^{M/2}\binom{M}{2k}(-1)^k\, I^{M-2k}(n)\,Q^{2k}(n)}{\left(I^2(n) + Q^2(n)\right)^{M/2}}.$$

Elementary rectangular-to-polar manipulations and the use of De Moivre's theorem [54 eq. 6.9] yielded:

$$x_M(n) = \frac{\mathrm{Re}\left[(I(n) + j\,Q(n))^M\right]}{\left(I^2(n) + Q^2(n)\right)^{M/2}} = \cos(M\varphi_n).$$ (90)

As clearly seen in (90) and as elaborated upon in Chapter 2, the value of $x_M(n)$ is independent of $K$ and, hence, of the AGC.
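The self-normalizing property stated in (90) can be checked numerically. The sketch below is an illustration only: the helper name `x_m` and the sample values are assumptions. It evaluates the binomial-sum form for an even $M$, compares it with $\cos(M\varphi_n)$, and confirms that scaling both $I(n)$ and $Q(n)$ by an arbitrary gain $K$ leaves the result unchanged.

```python
import math

def x_m(i, q, m):
    """x_M(n) via the binomial-sum form; m is assumed even (M = 2, 4, 8, ...)."""
    numerator = sum(
        math.comb(m, 2 * k) * (-1) ** k * i ** (m - 2 * k) * q ** (2 * k)
        for k in range(m // 2 + 1)
    )
    return numerator / (i * i + q * q) ** (m / 2)

# Arbitrary (hypothetical) baseband sample and AGC gain.
i_n, q_n, gain = 0.3, -1.1, 5.0

direct = x_m(i_n, q_n, 4)
polar = math.cos(4 * math.atan2(q_n, i_n))   # cos(M * phi_n), as in (90)
scaled = x_m(gain * i_n, gain * q_n, 4)      # same sample after a gain change

print(direct, polar, scaled)
```

All three printed values agree to within floating-point rounding, since the gain cancels between the numerator and the denominator.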
A lock detector was defined (see Sec. 2.3.1) as

$$l_{M,N} \triangleq \frac{1}{2N}\sum_{n=-N+1}^{N} x_M(n).$$

It was found that when the carrier loop is unlocked $E[l_{M,N}] = 0$. Conversely, when locked, at $E_s/N_0 = \chi$ we have (Sec. 2.4):

$$\tilde{f}_M(\chi) \triangleq E[\,l_{M,N} \mid E_s/N_0 = \chi\,] = E[\cos(M\Delta\phi_n)]\cdot E[\cos(M\theta_e)] = \int_{-\pi}^{\pi}\cos(M\Delta\phi)\,p_R(\Delta\phi \mid \chi)\,d\Delta\phi \cdot E[\cos(M\theta_e)].$$ (91)

Due to $x_M(n)$'s independence vis-a-vis the AGC, the same is true for $l_{M,N}$. Regarding the distribution of $l_{M,N}$, if we ignore the term $E[\cos(M\theta_e)]$ (which is a very good approximation, according to the data presented in Sec. 2.4) a good approximation is (see Sec. 2.5):

$$l_{M,N}\mid\text{locked} \sim N\left(f_M(\chi),\, 1/(2N)\right), \qquad l_{M,N}\mid\text{unlocked} \sim N\left(0,\, 1/(2N)\right)$$ (92)

where $f_M(\chi)$ is given in (30)-(32). Further analysis of $l_{M,N}$ can be found in Chapter 2. Using $l_{M,N}$ and $d_M(n)$ we now define the adaptive phase detection structure whose analysis is the primary goal of this section.

Footnote 14: Sometimes, the optimal value of $\omega_n$ will be SNR dependent [48 Chaps. 7, 8]. However, even in that case the desired variation of $\omega_n$ will not usually correspond to the variation induced by the changing phase detector gain, so we would rather have a constant-gain PD and modify $\omega_n$ by changing the loop filter coefficients adaptively as a function of the SNR (as estimated by an SNR estimator, see Chapter 4).

3.6.2 Definition and structure

The second new phase detector structure presented in this thesis is an adaptive phase detector that will be shown to have unity gain during tracking, hence allowing for optimal loop parameters to be maintained at all SNRs where the PLL can lock (see discussion in Sec. 3.5). The idea behind this adaptive detector is simple: if we somehow estimate $g_d$ in real time, and divide $d_M(n)$ by this estimate, we arrive at a constant-gain detector. Fortunately, this is easy to do.
To see this, we invoke (92) and the fact that every random variable is an unbiased estimate of its expectation to write:

$$l_{M,N} \approx f_M(\chi)$$ (93)

which reveals that, when in lock, $l_{M,N} \approx \alpha_{\chi,d} = \beta_{\chi,d} = (1/M)\,g_d$. Thus, if we define $v_{M,N}(n) = d_M(n)/(M\cdot l_{M,N})$, such a phase detector should have unity gain. When out of lock, $l_{M,N} \approx 0$ (see Chapter 2), so an appropriate "worst case" or otherwise defined value $\mu$ needs to be substituted for $l_{M,N}$ in the expression for $v_{M,N}(n)$ in order to achieve acceptable performance during acquisition ($\mu$ is discussed further in Sec. 3.6.6). We hence define the adaptive phase detector as follows:

$$v_{M,N}(n) \triangleq \frac{d_M(n)}{M\cdot\delta} \quad\text{with}\quad \delta = \begin{cases} l_{M,N} & \text{carrier is locked} \\ \mu & \text{carrier is unlocked} \end{cases}$$ (94)

A diagram of a fixed-point hardware implementation structure for $v_{M,N}(n)$ is presented in Fig. 56. Observe how the division by $2N$ is avoided, where it is assumed that $2N$ is a power of 2 (see also Sec. 2.3.3).

[Figure 56: block diagram. The inputs $I(n)$ and $Q(n)$ feed Lookup Table #1, which outputs $x_M(n)$, and Lookup Table #2, which outputs $d_M(n)$. An integrate-and-dump averager sums $2N$ samples of $x_M(n)$ and disregards the lower $\log_2(2N)$ bits to produce $l_{M,N}$. $l_{M,N}$ is compared to the lock threshold $\gamma$ to give the lock indication, which drives the select input of a MUX that passes either $l_{M,N}$ or $\mu$ as $\delta$. Lookup Table #3 computes $d_M(n)/(M\cdot\delta)$, whose output goes to the loop filter.]

Fig. 56. Hardware implementation of $v_{M,N}(n)$.

3.6.3 Comments regarding the hardware implementation of $v_{M,N}$

Regarding Fig. 56, the implementation of LUT (Lookup Table) #1 was discussed in Sec. 2.3.3, and the implementation of LUT #2 was discussed in Sec. 3.4.8. There it was shown that LUTs #1 and #2 can be efficiently realized in fixed-point hardware. As for LUT #3, observe that the lowest value of $l_{M,N}$ that need be handled is the lock threshold value, since below this value $\mu$ is used.
Hence the largest absolute value that LUT #3 needs to accommodate is $(\sup|d_M(n)|)/(M\cdot\min\{\mu,\gamma\}) = 1/(M\cdot\min\{\mu,\gamma\})$, with $\gamma$ being the lock threshold. Typically, $\mu > \gamma$ (see Sec. 3.6.6) and neither parameter would be less than about 0.04, since below that value the SNR is so low that there is scarcely hope of the PLL locking (see Fig. 31). This means that $1/(M\cdot\min\{\mu,\gamma\}) < 25/M$, and thus the dynamic range of the data in LUT #3 can be sufficiently limited to allow for its compact fixed-point hardware implementation. It should be noted that although, from an implementation-efficiency standpoint, $v_{M,N}(n)$ can be implemented with the same $N$ as $l_{M,N}$, as shown in Fig. 56, this does not have to be the case. This is especially true if the values of $N$ needed to compute $l_{M,N}$ in the presence of fading are different from those which are ideal for $v_{M,N}(n)$ (see footnote 15 on page 118, Secs. 2.7 and 3.6.8, and also [104]).

3.6.4 Linear modeling of $v_{M,N}$

The linear model analysis of $v_{M,N}$ shall now be done by relying on the analysis in Sec. 3.4 regarding $d_M$.

a) S-Curve, gain, and amplitude suppression factors

Assuming that the loop is locked, from (76) and (94):

$$v_{M,N}(n) = \frac{d_M(n)}{M\cdot l_{M,N}} = \frac{\sin(M\varphi_n)}{M\cdot l_{M,N}} = \frac{\cos(M\Delta\phi_n)}{M\cdot l_{M,N}}\sin(M\theta_e) + \frac{\sin(M\Delta\phi_n)}{M\cdot l_{M,N}}\cos(M\theta_e).$$ (95)

To find the linear-model parameters of $v_{M,N}$, we choose to treat $l_{M,N}$ as a constant. This is justified by noting that $l_{M,N}$ changes significantly slower than $d_M(n)$ (slower by a factor of $2N$, which typically would be at least on the order of 100; see Sec. 2.6 and Sec. 3.6.8). With that assumption and since $E[\sin(M\Delta\phi_n)] = 0$, we have from (95) that the S-Curve is

$$S_v(\theta_e) \triangleq E[\,v_{M,N}(n) \mid \theta_e\,] = \frac{E[\cos(M\Delta\phi_n)]\,\sin(M\theta_e)}{M\cdot l_{M,N}} = \frac{f_M(\chi)}{M\cdot l_{M,N}}\cdot\sin(M\theta_e)$$ (96)

and it is easy to show from (96) and (64)-(66) that:

$$g_v(M,K,\chi) = \alpha_{\chi,v} = \beta_{\chi,v} = f_M(\chi)/l_{M,N}.$$ (97)

The central idea here is, again, that $l_{M,N} \approx f_M(\chi)$ so that:

$$g_v(M,K,\chi) = \alpha_{\chi,v} = \beta_{\chi,v} \approx 1.$$ (98)

It is stressed that for all $M$ we have that $g_v$, $\alpha_{SNR,v}$, and $\beta_{SNR,v}$ approximately equal 1 independent of (a) the value of $K$ and the AGC's performance, and (b) the SNR.

b) Loop noise, self noise, and squaring loss of $v_{M,N}(n)$

Once again treating $l_{M,N}$ as a constant approximately equal to $f_M(\chi)$, we see that the only random process in (94) is $d_M(n)$. Thus, the squaring loss of $v_{M,N}$ should be identical to that of $d_M$ (shown in Fig. 45). Formally, it is easily shown from (67), (68), (93) and (95)-(96) that $N_{e,v} \approx N_{e,d}/f_M(\chi)$, $\xi_v \approx \xi_d/f_M^2(\chi)$, and (since $\alpha_{\chi,v} = \beta_{\chi,v} \approx 1$) that indeed the squaring loss of $v_{M,N}$ is approximately $\xi_v/1 \approx \xi_d/f_M^2(\chi)$, which equals the squaring loss of $d_M$. Hence, from (69) we conclude that $d_M$ and $v_{M,N}$ have the same phase-error variance performance, but with the important difference that the results for $v_{M,N}$ can be achieved using a fixed-gain loop-filter (see Sec. 3.5); nonlinear-model simulations presented in Sec. 3.6.5 verify this.

Table 3 summarizes the main linearized-model parameters for the phase detectors discussed in this thesis. Fig. 57 shows plots of $\alpha_{SNR}$ for the phase detectors discussed in this thesis. Plots of $\beta_{SNR}$, assuming the example AGC parameters of Sec. 1.5, are given in Fig. 58. Comparing Fig. 57 to Fig. 58, we can see that the AGC's effect upon $DD_M$ and $c_M$ is quite pronounced. Moreover, since only $v_{M,N}$ has a constant $\beta_{SNR}$, only $v_{M,N}$ will be able to maintain optimal loop parameters at all SNRs (see Sec. 3.5). The reasons for the AGC's effect on $\beta_{SNR,DD}$ and $\beta_{SNR,c}$ (and its particularly striking effect on $\beta_{SNR,c}$) are easily understood by looking at a graph of $K$ and $K^M$, given in Fig. 59. Since $DD_M(n) \propto K$ and $c_M(n) \propto K^M$ (see Sec. 3.4.8), the AGC has a profound and predictable effect on $\beta_{SNR,DD}$ and $\beta_{SNR,c}$.

As a caveat, we note that the delay $2N\cdot T$ incurred during the computation of $l_{M,N}$ must not substantially impact the validity of the approximation $l_{M,N} \approx f_M(\chi)$ when that value is used to compute $v_{M,N}(n)$. In Sec.
3.6.8, it is shown that a relatively small $N$ is required (see footnote 15) to achieve good performance over practical SNRs, so the delay is inconsequential for most systems (particularly where the symbol rate is high compared to the channel fading rate, which is usually a very good assumption if suppressed-carrier coherent M-PSK is the chosen modulation (see [78], [79], [21 p. 250]), and this is a particularly good assumption for geosynchronous microwave satellite links (see [10 Chap. 4])). Nonetheless, this constraint must be taken into account when deciding whether usage of $v_{M,N}$ is appropriate.

Footnote 15: A further constraint exists for $N$, namely that the desired lock and false-alarm probabilities are attained (see Sec. 2.6). If this constraint conflicts with those of Sec. 3.6.8, the Integrate-and-Dump module in Fig. 56 can be duplicated, with a different $N$ being used in each module. One module would be used to generate $l_{M,N}$ for lock detection (and to drive the "sel" input to the MUX), while the other would be used to drive the "1" input to the MUX.

3.6.5 Nonlinear model simulations for $v_{M,N}$

We have already noted in the previous section that $v_{M,N}$ should have the same phase-error variance characteristics as $d_M$, as displayed in Fig. 46 and Fig. 47. But, as a crucial difference of $v_{M,N}$ vis-a-vis $d_M$ (and, indeed, vis-a-vis $c_M$ and $DD_M$), $v_{M,N}$ should be able to achieve the results predicted in those figures, at every SNR, using a single PLL with a fixed loop filter. To validate the above predictions regarding the performance of $v_{M,N}$, simulations were conducted using the nonlinear equivalent baseband model, assuming the AGC of Sec. 1.5. The results are shown in Fig. 60 and Fig. 61, and indeed the results for $v_{M,N}$ agree with those in Fig. 46 and Fig. 47 for $d_M$ (though with the crucial difference that $v_{M,N}$ achieves these results with a fixed-gain loop-filter; see Sec. 3.5). As is evident by comparing Fig. 47 to Fig. 60 and Fig.
61, the AGC has a profound effect upon $c_M$ and $DD_M$, due to the changes in $\omega_n$ and $\zeta$ described in Sec. 3.5. For the same reasons, the CRBs for $c_M$ and $DD_M$ become curved (see Sec. 3.5 and Fig. 55; for clarity those CRBs are omitted from Fig. 60 and Fig. 61).

3.6.6 Unlocked-state operation of $v_{M,N}$

From (94), $v_{M,N}$ will exhibit behaviour identical to that of $d_M$ during acquisition, as it is simply $d_M(n)$ multiplied by the constant $1/(M\cdot\mu)$. The gain of $v_{M,N}$ is then $g_v(M,K,\chi) = g_d(M,K,\chi)/(M\cdot\mu) = f_M(\chi)/\mu$. To maintain the optimal loop parameters during acquisition, we strive to have $g_v = 1$, implying that we should aspire for $\mu = f_M(\chi)$. An algorithm for deciding upon an appropriate $\mu$ would try, for example, to determine the latter by: (a) using a worst-case value (i.e. the value of $f_M(\chi)$ for the lowest SNR for which operation is desired), (b) using the last measured value of $l_{M,N}$ when the receiver was locked (because $E[l_{M,N}] = f_M(\chi)$), or (c) using some SNR estimation technique (such as measurements on an auxiliary pilot signal) to indirectly estimate $f_M(\chi)$. It is important to note that performance during acquisition is only partially addressed by using a constant $\mu$: since $f_M(\chi)$ varies with the SNR yet $\mu$ is constant, $g_v = f_M(\chi)/\mu$ will vary vs. the SNR.

Fig. 57. $\alpha_{SNR}$ for the various phase detectors discussed in this thesis.

Fig. 58. $\beta_{SNR}$ for the various detectors discussed in this thesis, assuming the AGC that is described in Section 1.5. Also plotted is the AGC curve (i.e. $K = \Upsilon_{AGC}(E_s/N_0)$).

Fig. 59. $K$ and $K^M$ as a function of the SNR for the AGC of Section 1.5.

Table 3. Comparison of important linearized-model phase detector characteristics.

[Table 3 body: the cell contents are not recoverable from the extracted text. Columns: PD; self noise $\xi$; linearized gain $g$; $\alpha_{SNR}$ and $\beta_{SNR}$. Rows: $c_M$ (self noise from eq. (17) in [92]; source: [22 Eq. 6-116]); $DD_M$ (self noise from eq. (3) in [58]; source: [9]; $\alpha_{\chi,DD}$ from eqs. (10), (11), (32) in [9]); $d_M$ and $v_{M,N}$ (expressions involving Bessel functions).]

Notes: (a) To ignore AGC effects, substitute $K = 1$; (b) Results for $v_{M,N}$ assume $l_{M,N} \approx f_M(\chi)$ (see Sec. 3.6.8); (c) While $\xi_v > \xi_d$, this does not result in an increase in $\mathrm{var}(\theta_e)$, as substitution of the appropriate variables into (69) shows; (d) The expressions involving Bessel functions can be simplified further into finite sums of terms which include only exponents and polynomials (see Appendix A).

Fig. 60. $\mathrm{var}(\theta_e)$, obtained via nonlinear-model simulations including AGC effects. The loops for $DD_M$ and $c_M$ are optimized for input SNRs of 1, 4, 15, and 17 dB, for $M$ = 2, 4, 8, and 16, respectively, where at those SNRs we desire to have $\zeta = 0.8$ and $\omega_n = 0.011/T$. For $v_{M,N}$, $\zeta = 0.8$ and $\omega_n = 0.011/T$ throughout. The arrows aid in finding the results for $v_{M,N}$ at the optimization SNRs (those data points are also circled). For $v_{M,N}$, $N = 256$. Note that $DD_M$ appears to outperform $v_{M,N}$ at low SNR; but this is a fallacy, since this apparent advantage is due to the reduction of $\omega_n$, $\zeta$, and $B_L$ in the PLLs that use $DD_M$. See Section 3.6.7.

Fig. 61. $\mathrm{var}(\theta_e)$ for QPSK and 16-PSK obtained via nonlinear-model simulations including AGC effects. $K$ behaves according to the AGC of Sec. 1.5. For $v_{M,N}$, $N = 256$ was chosen. $K_a$ is constant and is selected so that at the optimization SNRs (which are 5 dB and 17.5 dB for QPSK and 16-PSK, respectively) the PLLs have $\zeta = \zeta_{opt} = 0.95$ and $\omega_n = \omega_{opt} = 8.24\cdot10^{-3}/T$. Since $K_a$ is constant while $\beta_{SNR,DD}$ and $\beta_{SNR,c}$ are variable (see Fig. 58), $\zeta = \zeta_{opt}$ and $\omega_n = \omega_{opt}$ does not hold at other SNRs for the loops employing $DD_M$ and $c_M$; rather, changes will occur as per (89).
However, since $\beta_{SNR,v} \approx 1$ at all SNRs, for the PLLs employing $v_{M,N}$ we do have $\zeta = \zeta_{opt}$ and $\omega_n = \omega_{opt}$ at all SNRs. Note that $DD_M$ appears to outperform $v_{M,N}$ at low SNR; but this is a fallacy, since this apparent advantage is due to the reduction of $\omega_n$ and $\zeta$ in the PLLs that use $DD_M$. See Sec. 3.6.7 and Fig. 63.

3.6.7 System identification analysis

To get a qualitative feel for the operation of $v_{M,N}$, Fig. 62 compares the step response of carrier PLLs which use $DD_M$, $c_M$, and $v_{M,N}$. The upper subplot of that figure shows the output phase trajectory for a single input data set for each SNR. Because of the input noise, it may be difficult to adequately distinguish the system response from a single data set, especially (see footnote 16) for the lower SNRs. This difficulty is overcome by using several data sets to drive the systems at each SNR, and for each SNR the measured responses are then averaged. It then becomes easy to discern the systems' responses. This is shown in the bottom subplot of Fig. 62, where it is seen that the system responses using $v_{M,N}$ are virtually identical at all SNRs, while the responses obtained by using $c_M$ and $DD_M$ are strongly dependent upon the SNR.

Footnote 16: In the top subplot of Fig. 62 it appears that the response for $v_{M,N}(n)$ at $E_s/N_0 = 0$ dB is much noisier than that of the other detectors; but this is because the loop bandwidth for $c_M$ and $DD_M$ decreases at low SNR (see Fig. 63). This reduction in $\omega_n$ may have a seemingly positive effect on the phase-error variance, but it has negative effects such as a reduced lock range, a higher settling time, and a smaller pull-out range (see Sec. 3.5).

[Figure 62: normalized output phase $\theta_o(n)/\Delta\theta_i$ versus time (symbol intervals, $\times 10^4$) for $c_M(n)$, $DD_M(n)$, and $v_{M,N}(n)$ at $E_s/N_0$ = 0.0, 10.0, and 40.0 dB.]

Fig. 62.
Simulated responses of carrier PLLs to a phase step of $\theta_i(n) = \Delta\theta_i\cdot u(nT - t_0)$, where $u(t)$ is the unit step function and $t_0 = 4100T$. Modulation is QPSK ($M = 4$). Loops are optimized for $E_s/N_0 = 10$ dB, where at that SNR we desire $\zeta = 0.8$ and $\omega_n = 9\cdot10^{-4}/T$. Upper subplot is the response obtained from a single data set. Bottom subplot is the average of the responses obtained from 100 data sets. For $v_{M,N}$, $N = 2048$. We assume $K$ behaves according to the AGC described in Section 1.5.

We can also arrive at quantitative results by using the Steiglitz-McBride [105] system identification algorithm. To do this, at each SNR we average the PLLs' responses over a sufficient number of input data sets, that is, until the averaged response curves are sufficiently noise-free (as we did in order to arrive at the bottom plot of Fig. 62). Then we can apply the Steiglitz-McBride algorithm to the smoothed response in order to estimate $\omega_n$ and $\zeta$. The results attained by following such a procedure are shown in Fig. 63. As was predicted in Sec. 3.6.2, $v_{M,N}$ provides the desired $\omega_n$ and $\zeta$ over the entire input SNR range; $DD_M$ and $c_M$ do not, and the parameters of their PLLs change according to (89). The strong variations in $\omega_n$ and $\zeta$ for $DD_M$ and $c_M$ cause a corresponding variation in the PLLs' other parameters (see Sec. 3.5), which means that those PLLs behave very non-optimally when not operating at the optimization SNR. In contrast, we deduce from Fig. 63 that PLLs employing $v_{M,N}$ will maintain optimality at all SNRs.

Fig. 63. Predicted and measured performance of $DD_M$, $c_M$, and $v_{M,N}$ versus $E_s/N_0$ (dB). Modulation is QPSK ($M = 4$). For $v_{M,N}$, $N = 2048$ was used. PLLs were designed to give $\zeta_{opt} = 0.8$ and $\omega_{opt} = 9\cdot10^{-4}/T$ at $\lambda = 10$ dB. We assume that $K$ behaves according to the example AGC described in Section 1.5.

3.6.8 Bounds on $N$ to ensure satisfactory tolerances in PLL parameters in PLLs using $v_{M,N}(n)$

In this section we investigate the parameter $N$ of $v_{M,N}$. In Sec.
3.5 we determined that to maintain optimal PLL parameters we desire $\beta_{\chi,v} = 1$ identically, which, since (from (97)) $\beta_{\chi,v} = f_M(\chi)/l_{M,N}$, means that we strive to maintain $l_{M,N} = f_M(\chi)$. Since (see (92)) $E[l_{M,N}] = f_M(\chi)$, the way to achieve acceptable accuracy of the approximation $l_{M,N} \approx f_M(\chi)$ is by ensuring that $l_{M,N}$'s variance is low enough, which (given (92)) means choosing a high enough $N$. To obtain a quantitative measure of the value of $N$ that is needed in order to achieve acceptable performance, let us denote the natural frequency and damping factor we are trying to achieve as $\omega_{opt}$ and $\zeta_{opt}$. We want to achieve them at all $E_s/N_0 = \chi$ in the range $\chi \in [\Gamma, \infty]$, where $\Gamma$ is some reasonable lower bound (e.g., the PLL's lock threshold). The question is: what is the lower bound on $N$ necessary to ensure, at each $E_s/N_0 = \chi \in [\Gamma, \infty]$, that:

$$P\left(\left|\omega_{n,\chi}/\omega_{opt} - 1\right| < tol\right) > C \quad\text{and}\quad P\left(\left|\zeta_\chi/\zeta_{opt} - 1\right| < tol\right) > C$$ (99)

where $tol$ is the acceptable tolerance for $\omega_n$ and $\zeta$, and $C$ is the confidence. Since we want $\omega_{opt}$ and $\zeta_{opt}$ to be achieved for all $E_s/N_0 \in [\Gamma, \infty]$, we can define the optimization SNR arbitrarily (and conveniently) as $\lambda = \infty$, and since $\beta_{\infty,v} = 1$ we have from (89) and (97) that $\omega_{n,\chi} = \omega_{opt}\sqrt{f_M(\chi)/l_{M,N}}$ and $\zeta_\chi = \zeta_{opt}\sqrt{f_M(\chi)/l_{M,N}}$. Straightforward manipulations then show that an equivalent constraint to (99) is:

$$P\left(f_M(\chi)\left((1+tol)^{-2} - 1\right) < l_{M,N} - f_M(\chi) < f_M(\chi)\left((1-tol)^{-2} - 1\right)\right) > C.$$ (100)

Since (see (30)) $E[\,l_{M,N} \mid E_s/N_0 = \chi\,] = f_M(\chi)$, to guarantee (100) it suffices that:

$$P\left(\left|\,l_{M,N} - E[\,l_{M,N} \mid E_s/N_0 = \chi\,]\,\right| < f_M(\chi)\cdot\nu\right) > C$$ (101)

where $\nu = \min\left\{\left|(1-tol)^{-2} - 1\right|,\ \left|(1+tol)^{-2} - 1\right|\right\}$. Since $l_{M,N}$ is Gaussian (see (92)), (101) is equivalent to

$$\mathrm{erf}\left(\frac{f_M(\chi)\cdot\nu}{\sqrt{2\,\mathrm{var}(l_{M,N})}}\right) > C$$ (102)

with $\mathrm{erf}(x) \triangleq \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt$. Now, since (see (92)) we have $\mathrm{var}(l_{M,N}) < 1/(2N)$, we can solve (102) for $N$, whereupon we find that for all $E_s/N_0 = \chi > \Gamma$ a suitable lower bound on $N$ would be:

$$N > \frac{1}{f_M^2(\Gamma)}\left(\frac{\mathrm{erf}^{-1}(C)}{\nu}\right)^2$$ (103)

Graphs for $N$, computed in this manner, are shown in Fig. 64.
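The bound in (103) is straightforward to evaluate once the value of $f_M$ at the lower SNR limit is known. The sketch below is only an illustration under stated assumptions: the function names `erfinv` and `min_n` are hypothetical, the inverse error function is obtained from the standard normal quantile function because Python's `math` module does not provide one, and the value 0.2 used for $f_M$ is an arbitrary example rather than a value computed from (30)-(32).

```python
import math
from statistics import NormalDist

def erfinv(c):
    """Inverse error function via the standard normal quantile:
    erf(x) = 2*Phi(x*sqrt(2)) - 1, hence x = Phi^{-1}((1 + c)/2) / sqrt(2)."""
    return NormalDist().inv_cdf((1.0 + c) / 2.0) / math.sqrt(2.0)

def min_n(f_m_value, tol, conf):
    """Lower bound on N from (103), given f_M at the lowest SNR of interest."""
    nu = min(abs((1.0 - tol) ** -2 - 1.0), abs((1.0 + tol) ** -2 - 1.0))
    return (erfinv(conf) / (f_m_value * nu)) ** 2

# Assumed example: f_M = 0.2 at the lower SNR limit, tol = 20%, C = 85%.
bound = min_n(0.2, 0.20, 0.85)
print(f"N must exceed about {bound:.0f}")
```

Because the bound scales as $1/f_M^2$, halving $f_M$ (i.e. moving the lower SNR limit down) quadruples the required $N$, which is the trend visible across the curves for different $M$.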
From that figure we see that, for example, $N = 256$ is sufficient to ensure $tol = 20\%$ and $C = 85\%$ for SNRs above -2, 4, 10, and 16 dB for $M$ = 2, 4, 8, and 16. $N = 1024$ is sufficient to ensure $tol = 20\%$ and $C = 85\%$ for SNRs above -5, 2, 8, and 14 dB for $M$ = 2, 4, 8, and 16. Thus, only a relatively small $N$ is needed to guarantee optimal PLL parameters above reasonable lock thresholds [58 Sec. III] for the respective modulations.

[Figure 64: lower bound on $N$ versus $\Gamma$ (as $E_s/N_0$ in dB), with one curve for each combination of $M$ = 2, 4, 8, 16, $C$ = 85% or 95%, and $tol$ = 10% or 20%.]

Fig. 64. Lower bound on $N$ needed to achieve a desired tolerance, at a given confidence.

3.7 Conclusions

In this chapter we presented and investigated two new families of M-PSK NDA carrier phase detectors for operation in carrier synchronization PLLs in feedback-topology M-PSK coherent receivers.

First, a new family of self-normalizing phase detectors was proposed, and its properties analyzed. It was found that the suggested detectors could be of substantial practical significance, as they have good self-noise performance, have phase-variance performance which is better than that of $M$th-order nonlinearity detectors and comparable to that of DD detectors (according to simulations), and lend themselves to simple hardware implementation as a compact, fixed-point lookup table. They also possess self-normalizing qualities that simplify the receiver design by significantly decoupling the AGC circuit from the carrier synchronization PLL.

Next, we investigated an adaptive phase detection structure for M-PSK.
This detector structure was characterized via theoretical derivations, simulation results, and system identification analysis. The major novelty of this family of adaptive phase detectors is that it has a constant gain during tracking, which, as was shown, allows a PLL that uses the proposed structure to maintain optimal PLL parameters at any SNR at which the PLL can attain lock. It is emphasized that these optimal parameters are maintained even though the PLL has a fixed (i.e. non-adaptive) loop filter. As an additional advantage, the detector was found to be inherently independent of the AGC's operating point and performance. Moreover, theoretical derivations using the linear model as well as simulations using the nonlinear model have shown that the detector has superior phase-error variance performance, as compared to DD detectors and $M$th-order nonlinearity detectors.

Both families of detectors were shown to have a compact fixed-point hardware implementation that is suitable for use within an FPGA or ASIC. Thus, they have immediate applications in contemporary coherent M-PSK receivers.

Chapter 4 New Methods for Real-Time Generation of SNR Estimates for Digital Phase Modulation Signals

Part A A New SNR Estimation Structure for M-PSK

4.1 Introduction

In any M-PSK receiver, one of the most important metrics that can be generated is an estimate of the channel $E_s/N_0$ ratio. In many modern communications schemes an accurate $E_s/N_0$ estimate is needed not only as a monitoring aid; it also plays an important role in the receiver's operation. For example, some error correction decoders can make use of an $E_s/N_0$ estimate to increase their coding gain (e.g. turbo codes [12]). Another example is systems that employ diversity reception [13 Sec. 14.4], for which SNR estimates are used to assign relative weights to the data obtained from the various receivers.
Yet another example is adaptive schemes where the data and/or coding rates are altered according to the $E_s/N_0$ (e.g. [14], [15]). The reader is referred to [16 Sec. 1.2] and the references therein for an extensive overview of these and other applications of SNR estimates in communications systems.

In this chapter we shall present a quantitative analysis of the SNR estimation method suggested in Chapter 2, as outlined in Section 2.3.5. A focus of this analysis will be the observation that the need for $E_s/N_0$ estimates is not fulfilled merely by facilitating their availability; the estimates must also be timely. In this respect, the method analyzed here is shown to produce accurate estimates using only a small number of symbols, thus facilitating the generation of a rapidly updating estimate. We shall furthermore show that other advantages of the proposed method are that it is Non Data Aided, operates at a rate of one sample per symbol (which corresponds to the symbol strobe), and has a simple implementation that is easily realizable in an FPGA or ASIC.

Since the SNR estimation method presented in Chapter 2 will only work if the carrier PLL is locked, this is assumed to hold for this part of the current chapter (Part A). However, in Part B of this chapter we dispense with that assumption and present an SNR estimator that also works in the absence of carrier synchronization.

Footnote: This chapter was published in part in Y. Linn, "Quantitative Analysis of a New Method for Real-Time Generation of SNR Estimates for Digital Phase Modulation Signals", IEEE Transactions on Wireless Communications, vol. 3, no. 6, pp. 1984-1988, Nov. 2004, and in Y. Linn, "A Real-Time SNR Estimator for D-MPSK over Frequency-Flat Slow Fading AWGN Channels," in Proc. 2006 IEEE Sarnoff Symposium, Princeton, NJ, Mar. 27-28, 2006.

There have been many SNR estimation algorithms proposed by various researchers in the past.
For example, the reader is referred to [59], [60], [61], [62], [63], [64], [65], [66], [67], [74]. We shall not address all of those estimators individually, since this would take an inordinate amount of space and, moreover, as we shall show, is unnecessary. In lieu of that, we shall conduct our quantitative comparison versus some of the most widely used SNR estimation methods, namely, (a) SNR estimation from the SER (Symbol Error Rate) [62]; (b) the $M_2M_4$ estimator [60]; and (c) the SVR estimator [60]. We shall supplement this quantitative comparison with qualitative comparisons versus other SNR estimators, which will show that the estimator proposed here possesses several important advantages over these previously proposed estimators. Of those advantages, the most endearing one is the fact that the proposed estimator has an exceptionally simple hardware structure which is almost trivial to implement with FPGAs or ASICs. Moreover, the fact that the estimator is NDA (Non Data Aided) and requires only one sample per symbol sets it apart from most of the other estimators cited above.

4.1.1 The general principle behind SNR estimation

In general, there are two SNR estimator types: Data-Aided (DA) and Blind (or Non Data Aided (NDA)). Data Aided methods use known symbols that are embedded in the data stream in order to estimate the SNR. For example, we could estimate the SNR by measuring the error rate on a preamble or pilot sequence. Non Data Aided methods apply a nonlinearity to the received signal to generate an SNR-dependent metric, which is then used to estimate the SNR. For both types of estimators, the SNR estimation principle is as follows. We want to estimate the $E_s/N_0$ ratio, which we denote in this thesis using the symbol $\chi$. We denote the SNR estimate as $\gamma$. In general terms, we first compute an observation variable $\varepsilon$ based upon many symbols of the input signal.
The idea is to choose a computation process that will yield an $\varepsilon$ for which $E[\varepsilon \mid E_s/N_0 = \chi]$ is a strictly monotonic function of $\chi$ that we shall call $f(\chi)$ (i.e. $f(\chi) = E[\varepsilon \mid E_s/N_0 = \chi]$). Then, we can compute an estimate of $\chi$ via $\gamma = f^{-1}(\varepsilon)$. This process is shown in Fig. 65.

[Figure 65: block diagram. $I(n)$ and $Q(n)$ enter a computation algorithm based on many symbols, which outputs $\varepsilon$. The main idea is that $f(\chi) = E[\varepsilon \mid E_s/N_0 = \chi]$ is strictly monotonic, so that an estimate of the SNR is $\gamma = f^{-1}(\varepsilon)$.]

Fig. 65. SNR estimation principle.

Specifically, for our SNR estimator discussed in Part A of this chapter, we choose $\varepsilon = l_{M,N}$ (defined in Sec. 2.3.1) and $f(\chi) = f_M(\chi)$ $(= E[\,l_{M,N} \mid E_s/N_0 = \chi\,])$ (defined in Sec. 2.3.2). We shall now proceed to formally define this process and evaluate it quantitatively and qualitatively.

4.2 Review of Receiver Structure and Lock Detector

4.2.1 Signal and receiver models

The signal and receiver models, as well as the applicable notations, have been defined in Section 1.4. We shall at first assume that the SNR is constant, i.e. we ignore the effects of possible fading; these shall be treated in Sec. 4.7.

4.2.2 Lock detector definition, expectation and distribution

Here, we review the necessary derivations made in Chapter 2. Since this is only a short review of some results, the reader is urged to re-examine that chapter for more detailed information. First, we define a process:

$$x_{M,n} \triangleq \frac{\mathrm{Re}\left[(I(n) + j\,Q(n))^M\right]}{\left(I^2(n) + Q^2(n)\right)^{M/2}} = \frac{\sum_{k=0}^{M/2}\binom{M}{2k}(-1)^k\, I^{M-2k}(n)\,Q^{2k}(n)}{\left(I^2(n) + Q^2(n)\right)^{M/2}}$$ (104)

A new type of lock detector was defined in Chapter 2 through:

$$l_{M,N} \triangleq \frac{1}{2N}\sum_{n=-N+1}^{N} x_{M,n}$$ (105)

When the carrier loop is unlocked, it was shown in Section 2.3.2 that $E[l_{M,N}] = 0$. Conversely, when locked, we have (from eq. (44)):

$$\tilde{f}_M(\chi) = \Big[E[\cos(M\Delta\phi_n)]\cdot E[\cos(M\theta_e)]\Big]_{E_s/N_0 = \chi} = f_M(\chi)\,E[\cos(M\theta_e) \mid E_s/N_0 = \chi]$$ (106)

where $\Delta\phi_n \in [-\pi, \pi]$ has the Rician phase distribution (see (29)):

$$p_R(\Delta\phi \mid \chi) \triangleq p(\Delta\phi_n = \Delta\phi \mid E_s/N_0 = \chi) = \frac{\exp(-\chi)}{2\pi}\left[1 + \sqrt{2\chi}\,\cos(\Delta\phi)\,\exp\left(\chi\cos^2(\Delta\phi)\right)\int_{-\infty}^{\sqrt{2\chi}\cos(\Delta\phi)} e^{-y^2/2}\,dy\right]$$ (107)

From the central limit theorem applied to eq. (105), $l_{M,N}$ has a Gaussian distribution, and, furthermore:

$$\mathrm{var}(l_{M,N}) = \frac{1}{2N}\cdot\left(E[x_{M,n}^2] - \left(E[x_{M,n}]\right)^2\right)$$ (108)

An important bound on the variance of $l_{M,N}$ is also recalled from (49):

$$\mathrm{var}(l_{M,N}) \le \frac{1}{2N}$$ (109)

Note that eq. (109) is valid for both locked and unlocked states, and for any phase error jitter conditions.

4.2.3 Low jitter approximations

Neglecting the effects of carrier phase jitter, (106) becomes:

$$\tilde{f}_M(\chi) = f_M(\chi) = \int_{-\pi}^{\pi}\cos(M\Delta\phi)\cdot p_R(\Delta\phi \mid \chi)\cdot d\Delta\phi$$ (110)

where closed-form expressions for (110) are given in (31)-(32) and Table 1. In the case where carrier phase jitter can be neglected we can also deduce from (108) that:

$$\mathrm{var}(l_{M,N}) = \frac{1}{2N}\cdot\left(\int_{-\pi}^{\pi}\cos^2(M\Delta\phi)\cdot p_R(\Delta\phi \mid \chi)\cdot d\Delta\phi - \left(f_M(\chi)\right)^2\right) = \frac{1}{2N}\cdot\left(\int_{-\pi}^{\pi}\left(\tfrac{1}{2}\cos(2M\Delta\phi) + \tfrac{1}{2}\right)p_R(\Delta\phi \mid \chi)\cdot d\Delta\phi - \left(f_M(\chi)\right)^2\right)$$ (111)

and further closed-form simplifications of (111) are possible using (31)-(32).

4.3 Principle of SNR Estimation from $l_{M,N}$

4.3.1 Theoretical basis

$f_M(\chi)$ is a monotonically increasing function and is thus invertible. It is this inverse relation:

$$\gamma = f_M^{-1}(l_{M,N})$$ (112)

namely the estimation of the $E_s/N_0$ ratio from the value of $l_{M,N}$, which interests us here. The $E_s/N_0$ estimate is usually desired in units of dB, as follows:

$$\gamma_{dB} = 10\log_{10}\left(f_M^{-1}(l_{M,N})\right)$$ (113)

The same reasoning applies to the case when jitter is modeled, i.e. when we use $\tilde{f}_M(\chi)$ to estimate the lock metric's value. We can then write:

$$\gamma_{dB} = 10\cdot\log_{10}\left(\tilde{f}_M^{-1}(l_{M,N})\right)$$ (114)

Fig. 66 shows simulated and predicted curves of (113) and (114). This figure was generated by first plotting $f_M(\chi) = E[\,l_{M,N} \mid E_s/N_0 = \chi\,]$ vs.
(ES/N0)DB, using the theoretical prediction of (110), as well as closed-loop simulations of 2N = 20000 symbol intervals in which (105) was computed and (106) was approximated. Then, the graph was reflected through its y - x diagonal to produce the inverse relations, as given in (113) and (114). Results are presented for 2nd order PLL synchronizers with 2BL T = 0.01 and 2^-7 = 0.1, where BL is the PLL's noise bandwidth (BL = 0.5^(^ + 1/(4 )^) where con is the natural radian frequency of the PLL and is its damping factor. More specifically, in Fig. 66 the solid lines are plots of the theoretical jitter-free case of fM~\') (i.e. fM{*) given by eq. (110)). The blank polygons in that figure were obtained in the closed-loop simulations using eq. (105), where a normalized PLL noise bandwidth of 2 ^ 7 = 0.01 was employed. The gray polygons are values of fM~\*) predicted by (106) with 2^-7 = 0.01 and using the time average cos(Mde) to approximate is[cos(M<9e)], where 0e was measured in the aforementioned simulations. Completing Fig. 66 are curves obtained with a normalized PLL noise bandwidth of 2 5 i J = 0.1, where the dashed lines were obtained in closed loop simulations using eq. (105), and the black polygons are values of fM'\m) predicted using eq. (106), with cos(MBe) approximating E[cos(M0e)]. In carrier synchronization PLLs 2BL-T is rarely ([58], [22 Chaps. 5,6], [88]) larger than the order of magnitude of 2BL -7 = 0.01, and is virtually never as high as 2BL 7 = 0.1. Thus, it can be safely said (in lieu of Fig. 66) that it is permissible to always use (110) - 138 -and not bother with trying to predict (106), and that hence SNR estimation can be done using (113) and there is no need to try and predict fM,"'(«) and use (114); this will henceforth be assumed. The case is less clear for using (111) when non-negligible carrier phase error jitter is present; however, since (109) holds for any phase jitter conditions, it is easy to arrive at "worst case" bounds (i.e. 
assuming var(7M N j = - — ) , which will also be given. Fig. 66. Simulated and predicted values for eq. (113) and (114) . - 139-I(n) Q(n) CO CO CD S _ - t — > Q _ C Lookup Table: £=0 2£ ( - i )* / M - 2 *(77)g z >) CO 03 2*/ x Q MM J 3 Q _ 13 o Integrate and Dump Averager - sum 2N samples and disregard lower log2(2N) bits "D 3 -4—' C L Lookup Table: / M,N CD » ro Q 101og10(/, -1 C L -I—« o YdB • Fig. 67. Efficient hardware generation of / dB 4.3.2 Hardware implementation Fig. 67 presents a structure for the generation of (113) in hardware. Since (see Sec. lMN < 1 , and since a small dynamic range is needed for (113) (see Sees. 2.3.3 and 2.3.4), the lookup tables can be realized as small fixed-point lookup tables, thus permitting efficient implementation of within an FPGA or ASIC. Note how outright division by 27V is avoided in Fig. 67, where for this to be accurate N should be chosen to be a power of 2 (see Sec. 2.3.3). Note also that if lM N is already generated for lock detection then Es IN0 estimation requires merely the addition of the small lookup table implementing (113) , a trivial addendum. As a final point for this subsection, we note that a very useful approximation to (113) , for "manual" use during the design process (e.g., for designing a rough draft of the receiver), is given by ydB « 1 0 log 1 0 ( - M 2 / ^ 4 • ln {lM N ))) (see (38)). However it - 140 -2.3.3) xKn <1 , must be noted that for accurate SNR estimation in the actual receiver this approximation should not be used, and, rather, the contents of the lookup table computing y dB should be computed via numerically evaluating fM~x by numerically inverting17 (31)-(32). 4.4 Discussion: Quantitative Measurement of Estimator Performance We are now interested in obtaining a quantitative evaluation of the efficacy of the proposed estimator. There are various ways by which this can be done. 
Some researchers have used the Cramer-Rao Bound (CRB) of the Normalized Mean-Square-Error (NMSE) as a limit to which the NMSE of the SNR estimate is compared (see for example [60], [65]). While this does provide a quantitative measure of the estimate's performance, this benchmark is not as easy to translate into actual estimator design parameters as another performance metric (presented in the next paragraphs). Nonetheless, the NMSE metric is useful because other authors have published results which use this metric, hence facilitating comparison with other estimators (in particular, vs. the data presented in [60]). Hence, we shall present NMSE results in Sec. 4.6.2.

A more useful metric, in the author's opinion, is the metric proposed in [62], which is the following: How many symbol intervals are necessary in order to generate an SNR estimate to within a desired tolerance, with a desired confidence?

Footnote 17: Numerically inverting a function $f(x)$ for which the inverse does not have a closed-form representation is a topic that has been studied extensively in the literature. The essential process is (a) evaluate $f(x)$ over a fine enough grid $x_i$, $i \in \{1, 2, \ldots, N\}$ in the domain $[x_A, x_B]$ to yield the values $f(x_i) = y_i \in [y_A, y_B]$, $i \in \{1, 2, \ldots, N\}$, and then (b) determine $x = f^{-1}(y)$ for any $y \in [y_A, y_B]$ by computing the function $f^{-1}(\cdot)$ at $y$ through interpolation from the known coordinate pairs $(x_i, y_i)$, for example using the Lagrange polynomial method or spline curves. For more information see for example [106 Chap. 9-12]. Note that the inverse function (quantized over a fine enough grid) can be stored in a lookup table, so that this process can be done beforehand and not in real-time.

Let us qualify this question mathematically. Suppose that $\rho_\ell$ is an SNR estimate which is based upon the signal information in the preceding $\ell$ symbol intervals. The question is as follows: What is the minimal value of $\ell$ needed so that the following holds?
$$P\left( |\rho_\ell - \chi| < tol \right) \ge C \tag{115}$$

where in (115) $tol$ is the tolerance and $C$ is the confidence (for example, reasonable values would be $tol = 1$ dB and $C = 99\%$). The answer to such a question can be easily translated into practical conclusions which are pertinent to the design process. To see this, suppose that the value of $\ell$ that satisfies (115) is $\ell_0$. Then, in order to achieve an estimate that is accurate to within $tol$ with a confidence of $C$, the designer must ensure that the estimator operates over $\ell_0$ symbol intervals, and the delay incurred for estimation would be $\ell_0 \cdot T$. Conversely, if the maximum allowable estimation delay is $\Delta_e$, then we can calculate the number of symbol intervals allowable for estimation via $\lfloor \Delta_e / T \rfloor$ (where $\lfloor \cdot \rfloor$ means "round down to the nearest integer"). This, in turn, can be used to ascertain whether the desired tolerance and confidence requirements can be met by the design.

We shall use the evaluation method proposed in [62] as the performance metric by which we compare the performance of our estimator to that of estimation via the SER, and this shall be the topic of Sec. 4.5. As noted earlier, for completeness in Sec. 4.6 we shall present comparisons vs. other SNR estimators using the NMSE criterion.

4.5 Comparison of Estimation via $l_{M,N}$ to Estimation via the SER

In this section, following the evaluation method proposed in [62], we shall arrive at quantitative results that describe the minimum number of symbols necessary to arrive at an SNR estimate from $l_{M,N}$ to within a desired tolerance, with a desired confidence. These results will be compared to the number of symbols necessary to estimate the SNR from the SER, hence arriving at a comparison of the efficacy of the proposed SNR estimation method.
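The estimation principle of Sec. 4.3, the interpolation-based inversion described in footnote 17, and the tolerance/confidence question of (115) can be explored empirically with a short simulation. The following is only a minimal sketch under stated assumptions, not the implementation used in this thesis: it assumes ideal carrier lock (no phase jitter), unit symbol energy, constellation phases at $2\pi m/M$, a 0.25 dB table grid, and linear interpolation; all function names are hypothetical.

```python
import bisect
import math
import random

def rician_phase_pdf(dphi, chi):
    """p_R(dphi | chi) of eq. (107), rearranged to avoid overflow at high SNR."""
    c = math.cos(dphi)
    s2 = math.sin(dphi) ** 2
    # integral_{-inf}^{sqrt(2*chi)*c} exp(-y^2/2) dy = sqrt(pi/2) * erfc(-sqrt(chi)*c)
    tail = math.sqrt(math.pi / 2.0) * math.erfc(-math.sqrt(chi) * c)
    return (math.exp(-chi)
            + math.sqrt(2.0 * chi) * c * math.exp(-chi * s2) * tail) / (2.0 * math.pi)

def f_M(M, chi, steps=2000):
    """f_M(chi) of eq. (110), evaluated with a midpoint rule over [-pi, pi]."""
    h = 2.0 * math.pi / steps
    total = 0.0
    for i in range(steps):
        dphi = -math.pi + (i + 0.5) * h
        total += math.cos(M * dphi) * rician_phase_pdf(dphi, chi)
    return total * h

def build_inverse_table(M, lo_db=-5.0, hi_db=30.0, step_db=0.25):
    """Step (a) of footnote 17: tabulate f_M on a grid of Es/N0 values (in dB)."""
    n = int(round((hi_db - lo_db) / step_db)) + 1
    snr_db = [lo_db + i * step_db for i in range(n)]
    metric = [f_M(M, 10.0 ** (d / 10.0)) for d in snr_db]
    return metric, snr_db

def invert(metric, snr_db, l_value):
    """Step (b) of footnote 17: gamma_dB of eq. (113) by linear interpolation."""
    if l_value <= metric[0]:
        return snr_db[0]
    if l_value >= metric[-1]:
        return snr_db[-1]
    j = bisect.bisect_right(metric, l_value)
    t = (l_value - metric[j - 1]) / (metric[j] - metric[j - 1])
    return snr_db[j - 1] + t * (snr_db[j] - snr_db[j - 1])

def lock_metric(M, chi, num_symbols, rng):
    """l_{M,N} of eq. (105) for simulated M-PSK in AWGN (ideal lock assumed)."""
    sigma = math.sqrt(1.0 / (2.0 * chi))  # per-component noise std for Es = 1
    acc = 0.0
    for _ in range(num_symbols):
        theta = 2.0 * math.pi * rng.randrange(M) / M
        i_n = math.cos(theta) + rng.gauss(0.0, sigma)
        q_n = math.sin(theta) + rng.gauss(0.0, sigma)
        acc += math.cos(M * math.atan2(q_n, i_n))  # same value as x_{M,n} in (104)
    return acc / num_symbols

def empirical_confidence(M, chi_db, num_symbols, tol_db, trials, seed=1):
    """Fraction of trials whose estimate lies within tol_db of the true value."""
    rng = random.Random(seed)
    metric, snr_db = build_inverse_table(M)
    chi = 10.0 ** (chi_db / 10.0)
    hits = 0
    for _ in range(trials):
        est = invert(metric, snr_db, lock_metric(M, chi, num_symbols, rng))
        hits += abs(est - chi_db) < tol_db
    return hits / trials
```

Calling, for instance, `empirical_confidence(4, 10.0, 1024, 1.0, 50)` gives the fraction of 50 simulated QPSK bursts of 1024 symbols whose estimate falls within 1 dB of the true 10 dB. Linear interpolation is used here only for brevity; the Lagrange or spline interpolation mentioned in footnote 17 would be a drop-in refinement.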
4.5.1 Number of symbols needed for estimation via $l_{M,N}$

From (92) or (111) it is clear that in order to achieve a more accurate value of $l_{M,N}$, and by extension of $\gamma = f_M^{-1}(l_{M,N})$, $N$ should be increased until $\mathrm{var}(l_{M,N})$ falls below an acceptable value. Indeed, in accordance with the discussion in Sec. 4.4 and (115), the purpose of this section may be stated as follows: we would like to compute the minimal value of $2N$ needed to achieve a desired tolerance in the estimation of the $E_s/N_0$, with a desired confidence. Mathematically, the question is: What is the minimal value of $2N$ needed so that the following holds?

$$P\left( \left| f_M^{-1}(l_{M,N}) - \chi \right| < tol \right) \ge C \tag{116}$$

where in (116) $tol$ is the tolerance and $C$ is the confidence. Note that we are interested in $2N$ (not simply in $N$) because $2N$ is the number of symbols used in computation of the lock metric from which the $E_s/N_0$ is estimated (see (105)). Assume $tol$ is in units of dB. We define the constants:

$$r_1 = \left( 10^{tol/10} - 1 \right) \tag{117}$$

and

$$r_2 = \left( 1 - 10^{-tol/10} \right) \tag{118}$$

These constants describe the allowed deviations from the $E_s/N_0$. Since $f_M$ is a monotonically increasing function, (116) implies:

$$P\left( -r_2 \chi < f_M^{-1}[l_{M,N}] - \chi < r_1 \chi \right) \ge C \;\Leftrightarrow\; P\left( f_M((1 - r_2)\chi) < l_{M,N} < f_M((1 + r_1)\chi) \right) \ge C \tag{119}$$

Now, defining:

$$y \triangleq \min\left\{ \left| f_M((1 - r_2)\chi) - f_M(\chi) \right|,\; \left| f_M((1 + r_1)\chi) - f_M(\chi) \right| \right\} \tag{120}$$

and using $E[l_{M,N} \mid E_s/N_0 = \chi] = f_M(\chi)$:

$$P\left( f_M((1 - r_2)\chi) < l_{M,N} < f_M((1 + r_1)\chi) \right) \ge P\left( \frac{\left| l_{M,N} - E[l_{M,N} \mid E_s/N_0 = \chi] \right|}{\sqrt{\mathrm{var}(l_{M,N})}} < \frac{y}{\sqrt{\mathrm{var}(l_{M,N})}} \right) \tag{121}$$

Because $l_{M,N}$ is Gaussian, it follows from (121) that in order for (116) to occur it is sufficient to require (see [13 Chap. 2]):

$$P\left( \frac{\left| l_{M,N} - E[l_{M,N} \mid E_s/N_0 = \chi] \right|}{\sqrt{\mathrm{var}(l_{M,N})}} < \frac{y}{\sqrt{\mathrm{var}(l_{M,N})}} \right) = \mathrm{erf}\!\left( \frac{y}{\sqrt{2\,\mathrm{var}(l_{M,N})}} \right) \ge C \tag{122}$$

with $\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\, dt$. If carrier phase jitter is negligible, then using (111) in (122) and solving for $N$:

$$2N \ge 2 \left( \frac{\mathrm{erf}^{-1}(C)}{y} \right)^2 \left( \int_{-\pi}^{\pi} \cos^2(M\Delta\phi)\, p_R(\Delta\phi \mid \chi)\, d\Delta\phi - \left( f_M(\chi) \right)^2 \right) \tag{123}$$

and further closed-form simplifications of (123) are possible using (31)-(32).
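To give a feel for the magnitudes involved in (117)-(123), the steps can be sketched numerically. The sketch below is illustrative only and rests on a labeled simplification: instead of evaluating the integrals of (110) and (111), it substitutes the high-SNR approximation $f_M(\chi) \approx \exp(-M^2/(4\chi))$ of (35), so its output is only indicative at moderate-to-high SNR. The function names are hypothetical, and $\mathrm{erf}^{-1}$ is obtained from the standard normal quantile function.

```python
import math
from statistics import NormalDist

def erf_inv(c):
    """Inverse error function, via the standard normal quantile."""
    return NormalDist().inv_cdf((1.0 + c) / 2.0) / math.sqrt(2.0)

def f_M_approx(M, chi):
    """ASSUMPTION: high-SNR approximation of f_M, used in place of (110)."""
    return math.exp(-M * M / (4.0 * chi))

def half_width_y(M, chi, tol_db):
    """y of eq. (120), with r1 and r2 from (117) and (118)."""
    r1 = 10.0 ** (tol_db / 10.0) - 1.0
    r2 = 1.0 - 10.0 ** (-tol_db / 10.0)
    f0 = f_M_approx(M, chi)
    return min(abs(f_M_approx(M, (1.0 - r2) * chi) - f0),
               abs(f_M_approx(M, (1.0 + r1) * chi) - f0))

def symbols_needed(M, chi_db, tol_db, conf):
    """Approximate right-hand side of (123), rounded up to an integer."""
    chi = 10.0 ** (chi_db / 10.0)
    y = half_width_y(M, chi, tol_db)
    # E[cos^2(M dphi)] = 1/2 + 1/2 * E[cos(2M dphi)], and E[cos(2M dphi)] = f_{2M}(chi)
    var_x = 0.5 * (1.0 + f_M_approx(2 * M, chi)) - f_M_approx(M, chi) ** 2
    return math.ceil(2.0 * (erf_inv(conf) / y) ** 2 * var_x)
```

For example, `symbols_needed(4, 15.0, 0.5, 0.99)` returns the approximate number of QPSK symbols that (123) calls for at 15 dB with a 0.5 dB tolerance and 99% confidence. Because the per-symbol variance of $x_{M,n}$ never exceeds 1, the result cannot exceed $2(\mathrm{erf}^{-1}(C)/y)^2$ by more than the rounding.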
Conversely, a worst-case result accounting for phase error jitter may be obtained from (122) and (109):

$$2N \ge 2 \left( \frac{\mathrm{erf}^{-1}(C)}{y} \right)^2 \tag{124}$$

(Note that for all the equations in this subsection $f_M(\chi)$ is given in (110). See comments made in Sec. 4.3.1.) To recap, we have shown that choosing $N$ which complies with (123) or (124) guarantees (116). As previously stated, eqs. (123)-(124) are expressed in terms of $2N$ (not simply of $N$) because $2N$ is the number of symbols used in computation of the lock metric from which the $E_s/N_0$ is estimated (see (105)).

4.5.2 Number of symbols needed for SNR estimation via the SER

A great proportion of modern communications systems produce SNR estimates by measuring the pre- or post-decoder error rate. For example, this is what is done in countless systems that estimate the SNR from the number of errors detected in preambles or training sequences that are embedded in the data stream. Thus, perhaps the most meaningful and universally applicable yardstick by which to measure the efficacy of SNR estimation via (113) is attained through comparison of (123) or (124) to the number of symbols needed for $E_s/N_0$ estimation via measurement of the pre-decoder Symbol Error Rate (SER). This is because for coded signals, the post-decoder error rate is always smaller (often by orders of magnitude) than the pre-decoder SER, i.e. the number of symbols needed for SNR estimation via the pre-decoder SER can also be viewed as a lower bound for that which is needed for estimation via post-decoder error rates, regardless of the coding scheme used. From [13 eq. 5.2-56], the uncoded M-PSK symbol error probability is:

$$P_e(M, \chi) = 1 - \int_{-\pi/M}^{\pi/M} p_R(\Delta\phi \mid \chi)\, d(\Delta\phi) \triangleq g_M(\chi) \tag{125}$$

where eq. (107) defines $p_R(\Delta\phi \mid \chi)$. (Note that (125) ignores the effects of phase jitter on the SER. Eq. (125) can still model such effects by first incorporating them [19 Sec. 4.3] into a reduced effective $E_s/N_0$, but Section 4.5.3 shows that the advantage of the method in Section 4.3 is so great that any effects of such a minute correction are irrelevant.) It shall be commented that another equivalent formula has been suggested by Craig [107]:

$$P_e(M, \chi) = \frac{1}{\pi} \int_0^{\pi - \pi/M} \exp\!\left( -\chi\, \frac{\sin^2(\pi/M)}{\sin^2(\phi)} \right) d\phi \tag{126}$$

The formula of (126) is quite useful and much easier to work with than (125). This is due to the fact that, unlike (125), in (126) the limits in the integral are finite and the integrand is composed entirely of elementary functions. This allows numerical calculations to occur significantly faster.

To avoid cumbersome notation, we denote the left-hand side of (125) or (126) simply as $P_e$. We define the binary auxiliary variables $U_i$, where $U_i = 1$ if an error was detected in symbol $i$ and $U_i = 0$ otherwise. Assuming the errors in the received symbols occur randomly and independently, we have $P(U_i = 1) = P_e$ and $P(U_i = 0) = 1 - P_e$, and therefore $E[U_i] = P_e$ and $\mathrm{var}(U_i) = P_e(1 - P_e)$. We can define the measured SER as $S(L) = \frac{1}{L} \sum_{i=1}^{L} U_i$, and from the central limit theorem we then have:

$$S(L) \sim N\!\left( P_e,\; \frac{P_e(1 - P_e)}{L} \right) \tag{127}$$

Since $E[S(L)] = P_e$ we can estimate the $E_s/N_0$ from $S(L)$ via:

$$\eta = g_M^{-1}(S(L)). \tag{128}$$

For comparison to (123)-(124), we are interested in finding a value of $L$ that ensures that:

$$P\left( -r_2 \chi < \eta - \chi < r_1 \chi \right) \ge C \tag{129}$$

where $r_1 = (10^{tol/10} - 1)$ and $r_2 = (1 - 10^{-tol/10})$. Developing equation (129) further, we have that it is equivalent to:

$$P\left( (1 - r_2)\chi < g_M^{-1}(S(L)) < (1 + r_1)\chi \right) \ge C. \tag{130}$$

Note that $g_M(\chi)$ is a monotonically decreasing function [13 Sec. 5.2.7] (in words: the probability of symbol error decreases as the $E_s/N_0$ increases). This is shown in Fig. 68.

Fig. 68. Probability of symbol error $P_e = g_M(E_s/N_0)$ as a function of the $E_s/N_0$ ratio (horizontal axis: $E_s/N_0$ in dB).
Using the monotonically decreasing nature of $g_M(E_s/N_0)$, we have that an equivalent requirement to (130) is:

$$P\left( g_M((1 - r_2)\chi) > S(L) > g_M((1 + r_1)\chi) \right) \ge C. \tag{131}$$

Defining:

$$z \triangleq \min\left\{ \left| g_M((1 - r_2)\chi) - P_e \right|,\; \left| g_M((1 + r_1)\chi) - P_e \right| \right\} \tag{132}$$

we have that for (131) to be fulfilled it suffices that:

$$P\left( \frac{\left| S(L) - P_e \right|}{\sqrt{\mathrm{var}(S(L))}} < \frac{z}{\sqrt{\mathrm{var}(S(L))}} \right) \ge C \tag{133}$$

Eq. (127) applied to (133) means that for (129) to be guaranteed it is sufficient to require:

$$L \ge 2 P_e (1 - P_e) \left( \frac{\mathrm{erf}^{-1}(C)}{z} \right)^2 \tag{134}$$

4.5.3 Display and Analysis of Results

A meaningful appraisal of the utility of estimating the $E_s/N_0$ via (113) may be obtained by comparing the number of symbols necessary for such an estimate, as per (123) or (124), to that which is required to attain a similarly accurate estimate from the SER, as per (134). We now embark upon making such comparisons.

In Fig. 69, Fig. 70, and Fig. 71 we see a comparison of estimation via $l_{M,N}$ vs. estimation via the SER, for BPSK, QPSK, and 8-PSK, for $C = 99\%$ and $C = 99.9\%$, with $tol = 0.5$ dB. As those figures clearly illustrate, estimation via $l_{M,N}$ is particularly advantageous for higher $E_s/N_0$ ratios, where it is seen that, while estimation via the SER experiences exponential growth in the number of symbols necessary, the growth rate for estimation from $l_{M,N}$ is much milder. It is worthwhile noting that the required number of symbols for both estimation methods increases at high SNRs because in both cases the number of required symbols is asymptotically inversely dependent upon the squared magnitude of the derivative of $f_M(\chi)$ and $g_M(\chi)$ (respectively). The growth rate for estimation from $l_{M,N}$ is much milder because, as the SNR increases, $\frac{d(f_M)}{d\chi}$ tends to 0 at a much milder rate than $\frac{d(g_M)}{d\chi}$. This issue is explored in depth in Secs. 4.5.4 and 4.5.5.

Fig. 69. Number of symbols needed to estimate the $E_s/N_0$ to within ±0.5 dB, for $M = 2$ (BPSK). "WC" denotes worst-case results for $l_{M,N}$, i.e. using (124). Other curves obtained using (123) and (134).

Fig. 70. Number of symbols needed to estimate the $E_s/N_0$ to within ±0.5 dB, for $M = 4$ (QPSK). "WC" denotes worst-case results for $l_{M,N}$, i.e. using (124). Other curves obtained using (123) and (134).

Fig. 71. Number of symbols needed to estimate the $E_s/N_0$ to within ±0.5 dB, for $M = 8$ (8-PSK). "WC" denotes worst-case results for $l_{M,N}$, i.e. using (124). Other curves obtained using (123) and (134).

Some comments are in order regarding Fig. 69 to Fig. 71 when the data stream sent is not a known sequence but is rather an arbitrary data stream. In those figures, it is seen that for low $E_s/N_0$ ratios, estimation via the SER apparently requires fewer symbols. However, this is a fallacy since Fig. 69 to Fig. 71 assume that the SER is measured precisely. The only way to achieve a data-independent SER measurement is by first using an error correction decoder and then comparing the corrected data to the input data stream [62], thus arriving at the pre-decoder SER. This tacitly assumes that the post-decoder data stream is error-free. Yet, at low $E_s/N_0$ the post-decoder data stream cannot be approximated as error-free, thus inherently skewing the SER measurement. Indeed, the decoder may not even be in lock for low $E_s/N_0$ ratios, making the SER measurement impossible in the first place. "Training sequences", or known sequences of symbols, can be transmitted in order to arrive at an accurate SER estimate; however, this means that at least some of the channel throughput is taken up by such sequences.
In contrast, estimation via $l_{M,N}$ suffers no such impediments, as it is independent of the data stream, the decoding scheme, and the post-decoder error rate. Thus, Fig. 69 to Fig. 71 can be viewed as optimistic w.r.t. estimation via the SER, and consequently estimation via $l_{M,N}$ can be considered superior for all $E_s/N_0$ ratios. Furthermore, estimation via the SER requires error detection and accrual mechanisms, which often necessitate non-trivial hardware and/or software resource appropriations. This is quite different from the simple and compact hardware implementation of the proposed SNR estimation method, as given in Sec. 4.3.2.

4.5.4 Asymptotic Formulas for Number of Necessary Symbol Intervals

In this section we shall endeavour to find asymptotic closed-form formulas for the number of symbol intervals necessary for estimation of the SNR (i.e., asymptotic formulas for (124) and (134)). As we shall see, these approximations will enable us to attain a better intuitive understanding of the estimators' performance as it was displayed in Section 4.5.3.

Starting off with (124), we note from (120) that $y = \min\{ |f_M((1 - r_2)\chi) - f_M(\chi)|,\; |f_M((1 + r_1)\chi) - f_M(\chi)| \}$ and from (117) and (118) that $r_1 = (10^{tol/10} - 1)$ and $r_2 = (1 - 10^{-tol/10})$. Let us investigate how $y$ behaves for very small values of $tol$. The Taylor expansion of the expression $10^x$ is [54 eq. 20.16]:

$$10^x = 1 + x \ln(10) + \frac{(x \ln(10))^2}{2!} + \cdots \tag{135}$$

Now, if $tol$ is very small, we can approximate $10^{tol/10}$ by retaining only the first two terms of the Taylor series, upon which we have that:

$$10^{tol/10} \approx 1 + \frac{tol}{10} \ln(10) \tag{136}$$

and similarly:

$$10^{-tol/10} \approx 1 - \frac{tol}{10} \ln(10) \tag{137}$$

This means that from (117) and (118) we have:

$$r_1 = \left( 10^{tol/10} - 1 \right) \approx 1 + \frac{tol}{10} \ln(10) - 1 = \frac{tol}{10} \ln(10) \tag{138}$$

and

$$r_2 = \left( 1 - 10^{-tol/10} \right) \approx 1 - \left( 1 - \frac{tol}{10} \ln(10) \right) = \frac{tol}{10} \ln(10) \tag{139}$$

Thus, we find that for small values of $tol$ we have $r_1 \approx r_2$. To give an example of the validity of this approximation, take $tol = 0.5$ dB, which was the case analyzed in Section 4.5.3. For $tol = 0.5$ dB we have $r_1 = (10^{tol/10} - 1) = 0.122$ and $r_2 = (1 - 10^{-tol/10}) = 0.108$, and we have that $r_2 / r_1 \times 100 = 89.1\%$. Hence, $r_1 \approx r_2$ is quite a good approximation. For simplicity of notation, we define the variable $r \triangleq \frac{tol}{10} \ln(10)$; then for small $tol$ we have $r_1 \approx r_2 \approx r$. With this notation we see from (120) that:

$$y = \min\left\{ |f_M((1 - r_2)\chi) - f_M(\chi)|,\; |f_M((1 + r_1)\chi) - f_M(\chi)| \right\} \approx \min\left\{ |f_M((1 - r)\chi) - f_M(\chi)|,\; |f_M((1 + r)\chi) - f_M(\chi)| \right\} = r\chi \cdot \min\left\{ \left| \frac{f_M((1 - r)\chi) - f_M(\chi)}{r\chi} \right|,\; \left| \frac{f_M((1 + r)\chi) - f_M(\chi)}{r\chi} \right| \right\} \approx r\chi \cdot \frac{d(f_M)}{d\chi} \tag{140}$$

where we have used the fact that $r > 0$, that $r$ is very small, and that $f_M$ is continuous and increasing so that:

$$\lim_{r \to 0} \left| \frac{f_M((1 - r)\chi) - f_M(\chi)}{r\chi} \right| = \lim_{r \to 0} \left| \frac{f_M((1 + r)\chi) - f_M(\chi)}{r\chi} \right| = \frac{d(f_M)}{d\chi} \tag{141}$$

Continuing, we find from (140) and (124) that:

$$2N \ge 2 \left( \frac{\mathrm{erf}^{-1}(C)}{y} \right)^2 \approx 2 \left( \frac{\mathrm{erf}^{-1}(C)}{r\chi \cdot \frac{d(f_M)}{d\chi}} \right)^2 \tag{142}$$

Now, using $f_M(\chi) \approx \exp\!\left( \frac{-M^2}{4\chi} \right)$ (see (35)) we find that

$$\frac{d(f_M)}{d\chi} \approx \frac{d}{d\chi} \exp\!\left( \frac{-M^2}{4\chi} \right) = \frac{M^2}{4\chi^2} \exp\!\left( \frac{-M^2}{4\chi} \right) \tag{143}$$

Plugging (143) into (142) we find that:

$$2N \ge 2 \left( \frac{\mathrm{erf}^{-1}(C)}{y} \right)^2 \approx 2 \left( \frac{\mathrm{erf}^{-1}(C)}{r\chi \cdot \frac{M^2}{4\chi^2} \exp\!\left( \frac{-M^2}{4\chi} \right)} \right)^2 \tag{144}$$

At high SNR, i.e. for $\chi \to \infty$, the exponential term tends to 1 and we have from (144) that:

$$2N \ge 2 \left( \frac{\mathrm{erf}^{-1}(C)}{y} \right)^2 \approx 2 \left( \frac{4\chi\, \mathrm{erf}^{-1}(C)}{r M^2} \right)^2 = \frac{32\, \chi^2 \left( \mathrm{erf}^{-1}(C) \right)^2}{r^2 M^4} \tag{145}$$

Now, let us perform the same procedure for estimation via the SER and $g_M(\chi)$. We find from (132) that:

$$z = \min\left\{ |g_M((1 - r_2)\chi) - P_e|,\; |g_M((1 + r_1)\chi) - P_e| \right\} \approx r\chi \left| \frac{d(g_M)}{d\chi} \right| \tag{146}$$

(where we used the fact that $g_M(\chi) = P_e$ (see (125))) and thus from (134) we find that:

$$L \ge 2 P_e (1 - P_e) \left( \frac{\mathrm{erf}^{-1}(C)}{z} \right)^2 \approx 2 \left( g_M(\chi) - g_M^2(\chi) \right) \left( \frac{\mathrm{erf}^{-1}(C)}{r\chi \left| \frac{d(g_M)}{d\chi} \right|} \right)^2 = 2 \left( \frac{\mathrm{erf}^{-1}(C)}{r\chi} \right)^2 \frac{g_M(\chi) - g_M^2(\chi)}{\left( \frac{d(g_M)}{d\chi} \right)^2} \tag{147}$$

From [13 eq. 5.2-61] we have that $g_M(\chi) \approx 2 Q\!\left( \sqrt{2\chi} \sin\frac{\pi}{M} \right)$ where [13 eqs. 2.1-97, 2.1-98] $Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^\infty e^{-t^2/2}\, dt = \frac{1}{2} \mathrm{erfc}\!\left( \frac{x}{\sqrt{2}} \right)$. Furthermore, from [19 App. 3B] we have at high SNR

$$Q(x) \approx \frac{1}{x\sqrt{2\pi}} \exp\!\left( \frac{-x^2}{2} \right) \tag{148}$$

so that:

$$g_M(\chi) \approx 2 Q\!\left( \sqrt{2\chi} \sin\frac{\pi}{M} \right) \approx \frac{2}{\sqrt{2\chi}\, \sin\frac{\pi}{M}\, \sqrt{2\pi}} \exp\!\left( -\chi \sin^2\frac{\pi}{M} \right) = \frac{1}{\sqrt{\pi\chi}\, \sin\frac{\pi}{M}} \exp\!\left( -\chi \sin^2\frac{\pi}{M} \right) \tag{149}$$

and, moreover:

$$\frac{dQ(x)}{dx} \approx \frac{d}{dx} \left[ \frac{1}{x\sqrt{2\pi}} \exp\!\left( \frac{-x^2}{2} \right) \right] = -\frac{1}{x^2\sqrt{2\pi}} \exp\!\left( \frac{-x^2}{2} \right) - \frac{1}{\sqrt{2\pi}} \exp\!\left( \frac{-x^2}{2} \right) \;\xrightarrow{x \to \infty}\; -\frac{1}{\sqrt{2\pi}} \exp\!\left( \frac{-x^2}{2} \right) \tag{150}$$

and therefore:

$$\frac{d(g_M)}{d\chi} \approx 2 \cdot \frac{dQ(x)}{dx} \bigg|_{x = \sqrt{2\chi} \sin\frac{\pi}{M}} \cdot \frac{d}{d\chi} \left( \sqrt{2\chi} \sin\frac{\pi}{M} \right) = -\frac{2}{\sqrt{2\pi}} \exp\!\left( -\chi \sin^2\frac{\pi}{M} \right) \cdot \frac{\sin\frac{\pi}{M}}{\sqrt{2\chi}} = -\frac{\sin\frac{\pi}{M}}{\sqrt{\pi\chi}} \exp\!\left( -\chi \sin^2\frac{\pi}{M} \right) \tag{151}$$

Thus, using (149) and (151), and noting that $g_M^2(\chi) \ll g_M(\chi)$ at high SNR:

$$\frac{g_M(\chi) - g_M^2(\chi)}{\left( \frac{d(g_M)}{d\chi} \right)^2} \approx \frac{g_M(\chi)}{\left( \frac{d(g_M)}{d\chi} \right)^2} \approx \frac{\frac{1}{\sqrt{\pi\chi}\, \sin\frac{\pi}{M}} \exp\!\left( -\chi \sin^2\frac{\pi}{M} \right)}{\frac{\sin^2\frac{\pi}{M}}{\pi\chi} \exp\!\left( -2\chi \sin^2\frac{\pi}{M} \right)} = \frac{\sqrt{\pi\chi}}{\sin^3\frac{\pi}{M}} \exp\!\left( \chi \sin^2\frac{\pi}{M} \right) \tag{152}$$

and therefore from (147):

$$L \ge 2 P_e (1 - P_e) \left( \frac{\mathrm{erf}^{-1}(C)}{z} \right)^2 \approx 2 \left( \frac{\mathrm{erf}^{-1}(C)}{r\chi} \right)^2 \frac{\sqrt{\pi\chi}}{\sin^3\frac{\pi}{M}} \exp\!\left( \chi \sin^2\frac{\pi}{M} \right) = 2 \left( \frac{\mathrm{erf}^{-1}(C)}{r} \right)^2 \frac{\sqrt{\pi}}{\chi^{3/2} \sin^3\frac{\pi}{M}} \exp\!\left( \chi \sin^2\frac{\pi}{M} \right) \tag{153}$$

Let us take a look at (145) and (153). We see that the dependence of $2N$ upon the SNR $\chi$ is polynomial, while the dominant dependence of $L$ upon $\chi$ is exponential. Since an exponential dependence will always grow much more rapidly than any finite-order polynomial dependence, we have thus obtained a theoretical justification for the results presented in Fig. 69 to Fig. 71.

To verify the derivations of this section, we shall compute and graph (145) and (153) vs. the results obtained from (124) and (134). This is done in Fig. 72 to Fig. 74, for $tol = 0.5$ dB and $C = 99\%$. As can be seen from those figures, the asymptotic expressions are quite accurate and hence provide a useful tool that the designer can use to roughly calculate the estimation interval requirements at high SNR. Fig. 72 to Fig. 74 also serve to validate the theoretical calculations made in this section, the results presented in Section 4.5.3, and the computer algorithms used to generate those results.

Fig. 72. Number of symbols needed to estimate the $E_s/N_0$ to within ±0.5 dB, with a confidence of $C = 99\%$, for $M = 2$ (BPSK). "WC" denotes worst-case results for $l_{M,N}$, i.e. using (124). Results for estimation vs. the SER obtained via (134). Asymptotic results obtained from (145) and (153).

Fig. 73. Number of symbols needed to estimate the $E_s/N_0$ to within ±0.5 dB, with a confidence of $C = 99\%$, for $M = 4$ (QPSK). "WC" denotes worst-case results for $l_{M,N}$, i.e. using (124). Results for estimation vs. the SER obtained via (134). Asymptotic results obtained from (145) and (153).

Fig. 74. Number of symbols needed to estimate the $E_s/N_0$ to within ±0.5 dB, with a confidence of $C = 99\%$, for $M = 8$ (8-PSK). "WC" denotes worst-case results for $l_{M,N}$, i.e. using (124). Results for estimation vs. the SER obtained via (134). Asymptotic results obtained from (145) and (153). Legend: WC estimation via $l_{M,N}$; asymptote for WC estimation via $l_{M,N}$; estimation via SER; asymptote for estimation via SER.

4.5.5 The Causes of the Proposed Method's Advantages

Let us now recapitulate and investigate further the implementation-related and performance-related differences between estimation via the SER and estimation via $l_{M,N}$, with the aim of better understanding the roots of the advantage of the proposed method.

In practical receivers, neither $f_M^{-1}$ nor $g_M^{-1}$ would be computed directly; rather, they would be computed beforehand and stored in a lookup table. This is true even for software implementations. This is due to the fact that there is no closed-form formula for the inverses of either function, and so this must be done numerically. Doing such numerical calculations for each estimate would simply take too long and would preclude any real-time operation.
Thus, the results would be computed beforehand, and stored either in a lookup table (in a hardware implementation) or in an array stored in the computer's memory (in a software implementation). Regarding the fundamental need to compute both inverse functions in this manner, there is no essential difference between estimation via the SER and estimation via $l_{M,N}$. The real differences between the methods are that:

1. The lookup table (or memory array) for computation of $f_M^{-1}$ would be much smaller than that for $g_M^{-1}$, at least in a fixed-point implementation (which basically covers all hardware implementations in an ASIC or FPGA).

2. Estimation via $l_{M,N}$ does not include the need to perform error detection, which in contrast is a prerequisite for estimation via $g_M^{-1}$.

3. As the analysis in the chapter shows, estimation via $l_{M,N}$ requires significantly fewer symbols to arrive at an equally accurate estimate.

Let us analyze the above points individually.

1. The lookup table for $f_M^{-1}$ is much smaller because its input, which is $l_{M,N}$, always lies in the interval [0,1] and is "well-behaved" as a function of the $\chi_{dB}$ (remember, estimation is only possible when the carrier loop is locked, whereupon the output of the lock detector is in the interval [0,1]). The meaning of "well-behaved" is that the derivative of $f_M(\chi) = E[l_{M,N} \mid E_s/N_0 = \chi]$ as a function of $\chi_{dB}$ (i.e. $\frac{d(f_M)}{d\chi_{dB}}$) has a small absolute value, i.e. $\left| \frac{d(f_M)}{d\chi_{dB}} \right|$ is small for "reasonable" $\chi_{dB}$ (i.e. SNRs above the threshold of the respective modulation, but not excessively high). A manifestation of this phenomenon is that $f_M^{-1}$ is also "well-behaved", and this can be easily seen from inspection of Fig. 66. Since $l_{M,N}$ behaves "nicely" as a function of $\chi_{dB}$, the input of the lookup table computing $f_M^{-1}$ in a fixed-point implementation does not need to be quantized to any great precision in order to achieve accurate estimation of all practical $\chi_{dB}$. This is in sharp contrast to a lookup table used for computation of $g_M^{-1}$.
The input of that lookup table is the SER, which, while also in the interval [0,1], is by contrast not as "well-behaved" because it tends towards zero very fast for moderate and high $\chi_{dB}$ (see Fig. 68). This means that if any accuracy is desired for estimation of those $\chi_{dB}$ ratios via $g_M^{-1}$, then the input of the lookup table implementing $g_M^{-1}$ needs to be able to accurately represent very small numbers (for moderate and high $\chi$) which differ by orders of magnitude from each other as the $\chi_{dB}$ increases only very slightly. Thus, in a fixed-point implementation this means that many more bits are needed for the input of the lookup table that implements $g_M^{-1}$. Now, denote the number of bits at the input of the lookup table as $k$; then the number of entries in the lookup table is $2^k$. So, the size of the lookup table implementing $g_M^{-1}$ in a fixed-point architecture will be much larger than that needed for implementation of $f_M^{-1}$.

2. This point is rather obvious and was discussed in Sec. 4.5.3. Of course, the implementation of error correction hardware or software is no trivial matter, and the lack of it for the proposed method is an outright saving of resources. Also, as noted in Sec. 4.5.3, at low SNRs the error correction decoder may operate with a non-negligible output error rate (if it is locked at all), thus hampering estimation via $g_M^{-1}$.

3. The fact that the proposed estimation method requires far fewer symbols than estimation via $g_M^{-1}$ is clearly seen from the results developed and presented in Sec. 4.5.3. This means that estimation via $f_M^{-1}$ is much better suited for real-time estimate generation.

4.6 Comparison of Estimation via $l_{M,N}$ to additional SNR estimation methods

4.6.1 Qualitative comparison

Several additional SNR estimation methods are presented in [59], [60], [61], [62], [63], [64], [65], [66], [67], [68], [69], [70], [71], [72], [73], [16] and [74].
While a quantitative and exhaustive comparison versus all of those methods could be undertaken, this is unwarranted since the qualitative characteristics of the proposed SNR estimator are attractive enough to render such a comparison unnecessary. To this end, it shall be commented that of the methods in [59], [60], [61], [62], [63], [64], [65], [66], [67], [68], [69], [70], [71], [72], [73], [16] and [74]:

• Some are unique to a specific receiver structure
• Most require some form of symbol decisions to be made
• Most require more than one sample per symbol
• (Most importantly:) None of those methods appear to have a hardware implementation nearly as compact as the one suggested here.

Moreover, because $l_{M,N}$ is resilient to effects stemming from imperfect AGC control of $K$ in Fig. 13 (see Chapter 2), this resilience is also present in the resulting $E_s/N_0$ estimates, thus rendering them reliable even when rapidly fading signal conditions are encountered (see also discussion in Sec. 4.7).

4.6.2 Quantitative comparison of NMSE vs. other blind 1-sample/symbol SNR estimators

Although, as stated above, it would be impractical to engage in quantitative comparisons with all of the SNR estimators referenced in Sec. 4.6.1, we shall, for completeness, engage in quantitative comparisons vs. the M2M4 estimator and the Signal-to-Variation Ratio (SVR) estimator which are analyzed in [60]. This is because the performance of the aforementioned estimators (especially the M2M4 estimator) has been found in [60] to be very good (in fact, the M2M4 estimator is judged one of the "best" SNR estimators [60 Sec. V]). Moreover, the M2M4 and SVR estimators are blind methods that need a sampling rate of 1 sample/symbol, which are exactly the characteristics of the proposed SNR estimator. Hence, a comparison is appropriate. First, following [60], let us review the M2M4 and SVR estimators.

a) The M2M4 estimator

Define $r_n = I(n) + j \cdot Q(n)$.
The M 2 M 4 estimator utilizes the 2nd-moment and 4th-moment of the signal, defined as: |2' M2=E and: It can be shown [60 eq. (35)-(36)] that we have: M2=S + N and: (154) (155) (156) M4 = kaS2 + 4SN + kwN' (157) where S is the signal power, N is the noise variance, ka — E ~\ I4 ~ "l |2 \a \E \a 1 " 1 / V 1 n \ is the signal kurtosis and kw — E 1 I4 1 I2 \n \E \n 1 " 1 / I 1 " 1 is the noise kurtosis (where - 164-we define nn = n, (n) + j • nQ (n) ). We can solve (156)-(157) for S and N to yield [60 eq. (37)-(38)]: * = M2 (kw - 2) + ^ ( 4 - KkJMJ + M 4 (*g + £ w - 4 ) and: (158) N = M2-S (159) For the M-PSK signals discussed in this thesis, it can be shown [60] that ka=\ and kw = 2 , and using (158)-(159), we find the M 2 M 4 SNR estimate as [60 eq. (39)]: TM2M4 =S/N = yJ2M22 - M, 1 , (160) M2 - yJ2M22 - M4 Obviously, in an actual implementation the moments M2 and M4 are estimated from a finite number of symbols, as follows (from [60 eq. (43)-(44)], with adaptation to the notations used in this thesis): \r n=-N + \ and: 1 N 1 N M 2 = — Y I2 2 N I n (161) 2 7 V I»l (162) H = - W + l Eq. (160), with the moments M2 and M4 computed as in (161)-(162), is the M 2 M 4 estimator. b) The SVR estimator To define the SVR estimator, we define the auxiliary variable [60 eq. (45)] 'n-1 -E 'n-1 (163) The SVR estimator is defined as [60 eq. (48)]: - 165 -YsVR (p -1)+VG* - !)2 -1 1 - A*, -1)] [i - p{K -1)] (164) l - £ ( * f l - l ) For the M-PSK signals discusses in this thesis, it can be shown [60] that ka = \ and kw = 2 , so that (164) simplifies to: ^ = ^ - 1 + ^ 05 -1 ) (165) In practice, P is estimated from a finite number of symbols, as follows (from [60 eq. (53)], with adaptation to the notations used in this thesis): P 1 N T \r\ ' n - l 1 N 1 N 1 ^—i I |4 1 V - 1 I I / \r \ > \r r _ i Z—i I " I 9 A/- _ 1 • z - 1 I "I (166) 27Y 'n - l Eq. (165), with /5 computed as in (166), is the SVR estimator. c) Comparison metrics vs. 
the M2M4 and SVR estimators

In order to facilitate the comparison between the proposed estimator and the M2M4 and SVR estimators, we shall use the same metrics used in [60]. The first of these metrics is the Normalized Mean Squared Error (NMSE) [60 eq. (65)], i.e. we shall compare $E[(\hat{\gamma} - \chi)^2] / \chi^2$, which is the NMSE for the proposed method (with $\hat{\gamma} = f_M^{-1}(l_{M,N})$, see (112)), with the NMSEs for the M2M4 and SVR estimators, which are $E[(\hat{\gamma}_{M2M4} - \chi)^2] / \chi^2$ and $E[(\hat{\gamma}_{SVR} - \chi)^2] / \chi^2$ respectively. The second metric is the normalized bias [60 eq. (68)], i.e. we shall compare $E[\hat{\gamma} - \chi] / \chi$ to $E[\hat{\gamma}_{M2M4} - \chi] / \chi$ and $E[\hat{\gamma}_{SVR} - \chi] / \chi$. In order to help us evaluate the results, we shall also look at the Cramer-Rao bound (CRB) for the NMSE, which is [60 eq. (65)]:
$$\mathrm{CRB} = \frac{2}{\chi \cdot 2N} + \frac{1}{2N} \tag{167}$$
The CRB is the lowest theoretically attainable NMSE. As for the best theoretically attainable bias, obviously that limit is 0.

In order to understand the significance of the NMSE metric, it is helpful to note that the MSE is $\mathrm{MSE} = \chi^2 \cdot \mathrm{NMSE}$. Thus, if we have, for example, an NMSE of $10^{-2}$, this would mean that the RMS error of the estimation would be $\chi \cdot 10^{-1} = 0.1\chi$, which in dB is a positive RMS deviation of $10 \cdot \log_{10}(1.1\chi / \chi) = 0.41$ dB and a negative RMS deviation of $10 \cdot \log_{10}(0.9\chi / \chi) = -0.46$ dB, which is a respectably small error. Even an NMSE of $5 \cdot 10^{-2}$ corresponds to an RMS error of only $0.22\chi$, which is a positive RMS deviation of $10 \cdot \log_{10}(1.22\chi / \chi) = 0.86$ dB and a negative RMS deviation of $10 \cdot \log_{10}(0.78\chi / \chi) = -1.08$ dB, which is still satisfactory for many receivers.

d) Quantitative results and discussion

In the following figures we shall present NMSE and normalized bias results calculated through simulations. The figures are followed by a discussion of the results.

[Plot: NMSE versus $E_s/N_0$ (dB); curves for M=2 via the proposed method, M=2 via M2M4, M=2 via SVR, and the CRB.]
Fig. 75.
NMSE comparison of estimation via $l_{M,N}$, the M2M4 estimator, and the SVR estimator, with $2N = 1024$ symbols used to compute each estimator. Modulation is BPSK ($M = 2$).

Fig. 76. NMSE comparison of estimation via $l_{M,N}$, the M2M4 estimator, and the SVR estimator, with $2N = 1024$ symbols used to compute each estimator. Modulation is QPSK ($M = 4$).

Fig. 77. NMSE comparison of estimation via $l_{M,N}$, the M2M4 estimator, and the SVR estimator, with $2N = 1024$ symbols used to compute each estimator. Modulation is 8-PSK ($M = 8$).

Fig. 78. Normalized bias comparison of estimation via $l_{M,N}$, the M2M4 estimator, and the SVR estimator, with $2N = 1024$ symbols used to compute each estimator. Modulation is BPSK ($M = 2$).

Fig. 79. Normalized bias comparison of estimation via $l_{M,N}$, the M2M4 estimator, and the SVR estimator, with $2N = 1024$ symbols used to compute each estimator. Modulation is QPSK ($M = 4$).

Fig. 80. Normalized bias comparison of estimation via $l_{M,N}$, the M2M4 estimator, and the SVR estimator, with $2N = 1024$ symbols used to compute each estimator. Modulation is 8-PSK ($M = 8$).

As comparison of the NMSE results in Fig. 75-Fig. 77 shows, the proposed estimator performs better than or on par with the M2M4 and SVR estimators.
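For readers who wish to reproduce the two reference estimators, the following is a minimal sketch of the M2M4 estimate (160) with sample moments (161)-(162), the SVR estimate (165) with a sample-average $\beta$ in the spirit of (166), and the NMSE and normalized-bias metrics of item c). It assumes unit-amplitude M-PSK symbols in complex AWGN and $k_a = 1$, $k_w = 2$. The function names, the random seed, and the clamping of negative radicands are illustrative choices and are not taken from the thesis or from [60].

```python
import cmath
import math
import random


def mpsk_samples(M, snr, count, rng):
    """Unit-amplitude M-PSK symbols in complex AWGN at Es/N0 = snr (linear)."""
    sigma = math.sqrt(1.0 / (2.0 * snr))  # noise std per I/Q component
    samples = []
    for _ in range(count):
        phase = 2.0 * math.pi * rng.randrange(M) / M
        noise = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        samples.append(cmath.exp(1j * phase) + noise)
    return samples


def m2m4_estimate(r):
    """Eq. (160) using the sample moments of (161)-(162)."""
    m2 = sum(abs(x) ** 2 for x in r) / len(r)
    m4 = sum(abs(x) ** 4 for x in r) / len(r)
    # Clamp at zero: with finite data the radicand can dip slightly below 0.
    s = math.sqrt(max(2.0 * m2 * m2 - m4, 0.0))
    return s / (m2 - s)


def svr_estimate(r):
    """Eq. (165) with beta formed from sample averages."""
    m4 = sum(abs(x) ** 4 for x in r) / len(r)
    lag = sum(abs(r[n]) ** 2 * abs(r[n - 1]) ** 2
              for n in range(1, len(r))) / (len(r) - 1)
    beta = lag / (m4 - lag)
    return beta - 1.0 + math.sqrt(max(beta * (beta - 1.0), 0.0))


def nmse_and_bias(estimator, M, snr, trials, count, rng):
    """Monte Carlo NMSE and normalized bias of an SNR estimator."""
    est = [estimator(mpsk_samples(M, snr, count, rng)) for _ in range(trials)]
    nmse = sum((e - snr) ** 2 for e in est) / trials / snr ** 2
    bias = sum(e - snr for e in est) / trials / snr
    return nmse, bias


if __name__ == "__main__":
    rng = random.Random(1)
    for name, fn in (("M2M4", m2m4_estimate), ("SVR", svr_estimate)):
        nmse, bias = nmse_and_bias(fn, 2, 10.0, 40, 1024, rng)
        print(name, "NMSE:", nmse, "normalized bias:", bias)
```

The sketch covers only the two reference estimators; the proposed estimator additionally needs the inverse mapping $f_M^{-1}$ of (112), which is not reproduced here.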
The best NMSE results for the proposed method are attained for BPSK, where the proposed method is seen to be better than the M2M4 and SVR estimators at all SNRs. For QPSK and 8-PSK the NMSE of the proposed method is best at medium and high SNR, though at the lowest SNRs there is some mild increase in NMSE. As for bias results, as Fig. 78-Fig. 80 show, all three methods (the proposed method and the M2M4 and SVR estimators) have a small bias that is very near the optimal value of 0. To conclude the quantitative comparisons vs. the M2M4 and SVR estimators, we have seen that the proposed SNR estimation method is competitive with, and sometimes superior to, those methods.

Some concluding qualitative remarks are in order. First, the proposed method has a very compact hardware implementation that is much more compact than that of the M2M4 and SVR estimators (see [60], [68]), and yet (as we have shown) has comparable and sometimes better performance. Secondly, the M2M4 and SVR estimators will not work well in the presence of fading [70], and we shall discuss this in Sec. 4.7. In contrast, the proposed estimator is easily adaptable to work in fading conditions without any appreciable hardware complexity or performance penalties. Finally, it should be noted that the M2M4 and SVR estimators, being moment-based estimators, will be able to provide an estimate if the carrier PLL is unlocked, and this will not be true for the proposed method (which requires prior carrier PLL synchronization). However, we shall see in Part B of this chapter that the proposed method can be modified to work in the absence of carrier synchronization and yet retain similar performance.

4.7 SNR Estimation via $l_{M,N}$ for M-PSK in the Presence of Fading: Comparison with Estimation via the SER

Until this point in the chapter we have ignored the possible presence of signal fading in the channel.
To incorporate such effects into the analysis, we shall assume that our signal is subject to frequency-flat (= frequency-nonselective) slow fading (for discussion of fading processes, see for example [13 Chap. 14]). We shall comment that, in general, a suppressed-carrier coherent M-PSK system would perform poorly in frequency-selective or fast fading (i.e. where $T_{COH} \ll T$ or $1/T \gg W_{COH}$). In such a case a different, probably noncoherent, multicarrier, or spread-spectrum type modulation would be chosen (see [13 p. 818], [78], [79]). Thus, the assumption of frequency-flat slow fading will be appropriate for the overwhelming majority of suppressed-carrier M-PSK systems.

We use the notation $\chi$ to refer to the instantaneous SNR, and the notation $\bar{\chi}$ to denote the average SNR (i.e., $\bar{\chi} = E[E_s/N_0]$). The conditional pdf (probability density function) of the SNR due to fading will be denoted as $p_F(\chi|\bar{\chi}) = p(E_s/N_0 = \chi \mid E[E_s/N_0] = \bar{\chi})$. For example, from [77 Table 2] we have for Nakagami-m fading
$$p_F(\chi|\bar{\chi}) = \frac{m^m \chi^{m-1}}{\bar{\chi}^m \Gamma(m)} \exp\left(-\frac{m\chi}{\bar{\chi}}\right)$$
We differentiate between two cases: (a) $2NT \ll T_{COH}$ and (b) $2NT \gg T_{COH}$. For case (a), during the averaging over $2N$ symbol intervals (a duration of $2NT$) that is done in eq. (105), the channel SNR will not have changed much; hence SNR estimation from $l_{M,N}$ will yield an estimate of the instantaneous SNR $\chi$. For case (b), since $2NT$ is much larger than the channel coherence time, SNR estimation from $l_{M,N}$ will yield an estimate of the average SNR $\bar{\chi}$. In general, we would ideally like to produce SNR estimates which are instantly available and can be fed in real time to the decoder, equalizer, or other receiver components which could make good use of them. Hence, ideally, we would be served by perfect knowledge of the instantaneous SNR $\chi$ (which, if desired, could be averaged over time in order to produce an estimate of $\bar{\chi}$).
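The relation between the instantaneous and the average SNR can be illustrated numerically. The sketch below assumes that, under Nakagami-m fading, the instantaneous SNR follows the gamma density quoted above (shape $m$, mean $\bar{\chi}$), and that successive draws are independent, which idealizes case (b). The function names and the chosen parameter values are hypothetical and serve only as an illustration.

```python
import math
import random


def nakagami_snr_pdf(chi, chi_avg, m):
    """p_F(chi | chi_avg) for Nakagami-m fading (a gamma density in the SNR)."""
    if chi <= 0.0:
        return 0.0
    return (m ** m) * chi ** (m - 1) / (chi_avg ** m * math.gamma(m)) \
        * math.exp(-m * chi / chi_avg)


def draw_instantaneous_snr(chi_avg, m, rng):
    """One draw of the instantaneous SNR: gamma with shape m and mean chi_avg."""
    return rng.gammavariate(m, chi_avg / m)


if __name__ == "__main__":
    rng = random.Random(7)
    chi_avg, m = 10.0, 2.0
    draws = [draw_instantaneous_snr(chi_avg, m, rng) for _ in range(20000)]
    # Averaging many (idealized independent) instantaneous SNRs recovers chi_avg.
    print("sample mean of instantaneous SNR:", sum(draws) / len(draws))
```

In case (a) a single draw would stay fixed over the whole estimation interval, whereas in case (b) the estimator effectively sees the whole distribution.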
However, estimation of $\chi$ is not always possible, due to the fact that $T_{COH}$ may be too short as compared to the estimation period $2NT$ which is necessary in order to achieve an acceptable accuracy in the SNR estimation (see discussion in Sec. 2.7 and the following subsections). Nonetheless, if $2NT \gg T_{COH}$, timely knowledge of the average SNR $\bar{\chi}$ is often sufficient in order to facilitate substantial performance gains (e.g. [12], [15]).

4.7.1 Case (a): $2NT \ll T_{COH}$

For the case of $2NT \ll T_{COH}$ the effects of fading upon the SNR are not felt during the estimation period of $2NT$. Thus, we can treat this case as if no fading were present. Hence, the discussion up to now in this part of the chapter has addressed this case, and the results have been given in Secs. 4.4 to 4.6. The SNR estimate generated in this case would be that of the instantaneous SNR $\chi$.

4.7.2 Case (b): $2NT \gg T_{COH}$

For the case of $2NT \gg T_{COH}$, we must take into account the fading distribution. To estimate the average SNR $\bar{\chi}$ we first compute the expected value of $l_{M,N}$ in the presence of fading, as per (58) which is repeated here:
$$\bar{f}_M(\bar{\chi}) = E\left[l_{M,N} \mid E[E_s/N_0] = \bar{\chi}\right] = \int_0^{\infty} \left( \int_{-\pi}^{\pi} \cos(M\Delta\phi) \, p_R(\Delta\phi|\chi) \, d(\Delta\phi) \right) p_F(\chi|\bar{\chi}) \, d\chi \tag{168}$$
We can then estimate $\bar{\chi}$ in a way analogous to (113), i.e. via:
$$\hat{\gamma}_{dB} = 10 \cdot \log_{10}\left(\bar{f}_M^{-1}(l_{M,N})\right) \tag{169}$$
In a way analogous to Sec. 4.5 and (124), we can conclude that the number of symbols needed to estimate $\bar{\chi}$ from $l_{M,N}$ is given by (assuming a worst-case scenario jitter-wise):
$$2N \geq \frac{2 \left(\mathrm{erf}^{-1}(C)\right)^2}{\left(\min\left\{ \left|\bar{f}_M((1 - r_2)\bar{\chi}) - \bar{f}_M(\bar{\chi})\right|, \left|\bar{f}_M((1 + r_1)\bar{\chi}) - \bar{f}_M(\bar{\chi})\right| \right\}\right)^2} \tag{170}$$
As a comparison yardstick by which to measure the efficacy of estimation via (124), we can compare (170) to estimation via the SER. To do this, we first note that the SER for fading channels is (using (126)):
$$\bar{g}_M(\bar{\chi}) = \int_0^{\infty} \left[ \frac{1}{\pi} \int_0^{\pi - \pi/M} \exp\left(-\frac{\chi \sin^2(\pi/M)}{\sin^2\theta}\right) d\theta \right] p_F(\chi|\bar{\chi}) \, d\chi \tag{171}$$
Presenting results for all possible fading distributions would be impossible.
In the sequel we thus present results for Nakagami-m fading, with the understanding that results for other fading statistics can be obtained in an analogous manner. For Nakagami-m fading we have [108 eqs. (3), (9)] that:
$$\bar{g}_M(\bar{\chi}) = \frac{1}{\pi} \int_0^{\pi - \pi/M} \left(1 + \frac{\bar{\chi} \sin^2(\pi/M)}{m \sin^2\varphi}\right)^{-m} d\varphi \tag{172}$$
Graphs of $\bar{g}_M(\bar{\chi})$ for Nakagami-m fading are given in Fig. 81 to Fig. 84.

Fig. 81. Probability of symbol error $\bar{g}_M(\bar{\chi})$ as a function of $\bar{\chi}$ (horizontal axis: $E[E_s/N_0]$ in dB) for Nakagami-m fading with $m = 1$.

Fig. 82. Probability of symbol error $\bar{g}_M(\bar{\chi})$ as a function of $\bar{\chi}$ for Nakagami-m fading with $m = 2$.

Fig. 83. Probability of symbol error $\bar{g}_M(\bar{\chi})$ as a function of $\bar{\chi}$ for Nakagami-m fading with $m = 5$.

Fig. 84. Probability of symbol error $\bar{g}_M(\bar{\chi})$ as a function of $\bar{\chi}$ for Nakagami-m fading with $m = 10$.

The estimate of the average SNR from the SER, when fading is present, is done through:
$$\hat{\gamma}_{dB} = 10 \cdot \log_{10}\left(\bar{g}_M^{-1}(\widehat{\mathrm{SER}}_L)\right) \tag{173}$$
where $\widehat{\mathrm{SER}}_L$ denotes the SER measured over $L$ symbols; this will give an estimate of $\bar{\chi}$. Following along the lines of the derivations in Sec. 4.5.2, we find that the number of symbols necessary to estimate $\bar{\chi}$ from the SER (using (173)) is:
$$L \geq \frac{2 \bar{g}_M(\bar{\chi}) \left(1 - \bar{g}_M(\bar{\chi})\right) \left(\mathrm{erf}^{-1}(C)\right)^2}{\left(\min\left\{ \left|\bar{g}_M((1 - r_2)\bar{\chi}) - \bar{g}_M(\bar{\chi})\right|, \left|\bar{g}_M((1 + r_1)\bar{\chi}) - \bar{g}_M(\bar{\chi})\right| \right\}\right)^2} \tag{174}$$
We note that, just as was the case for estimation from $l_{M,N}$, we must differentiate between two cases, case (i): $L \ll T_{COH}$ and case (ii): $L \gg T_{COH}$. For case (i) we should use (134) and the estimate will be of $\chi$. For case (ii) we must use (174) and the estimate will be of $\bar{\chi}$. The desired comparison during fading conditions between the proposed method and estimation via the SER is achieved by comparing (170) to (174). This is done in Fig. 85 to Fig. 88. The lowest SNRs for which results are given in Fig. 85 to Fig. 88 are rough thresholds $T_M$ defined as the average SNRs at which $\bar{g}_M(T_M) = 5 \cdot 10^{-2}$.
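A numerical sketch of the SER-based quantities may help here. It evaluates the average SER for Nakagami-m fading in the standard finite-integral form, and the symbol-count bound in the form of the product of $2g(1-g)(\mathrm{erf}^{-1}(C))^2$ divided by the squared minimum gap. The interpretation of $C$ as a confidence level and of $r_1$, $r_2$ as relative tolerances follows Sec. 4.5 and is assumed here; the midpoint-rule integration, the step count, and all function names are illustrative choices, not the thesis' implementation.

```python
import math
from statistics import NormalDist


def erfinv(p):
    """erf^{-1}(p), expressed through the standard normal quantile function."""
    return NormalDist().inv_cdf((1.0 + p) / 2.0) / math.sqrt(2.0)


def ser_nakagami(chi_avg, M, m, steps=2000):
    """Average M-PSK symbol error rate under Nakagami-m fading (midpoint rule)."""
    upper = math.pi - math.pi / M
    h = upper / steps
    s2 = math.sin(math.pi / M) ** 2
    total = 0.0
    for k in range(steps):
        phi = (k + 0.5) * h
        total += (1.0 + chi_avg * s2 / (m * math.sin(phi) ** 2)) ** (-m)
    return total * h / math.pi


def symbols_needed_ser(chi_avg, M, m, C, r1, r2):
    """Lower bound on L for SER-based estimation of the average SNR."""
    g = ser_nakagami(chi_avg, M, m)
    gap = min(abs(ser_nakagami((1.0 - r2) * chi_avg, M, m) - g),
              abs(ser_nakagami((1.0 + r1) * chi_avg, M, m) - g))
    return 2.0 * g * (1.0 - g) * erfinv(C) ** 2 / gap ** 2


if __name__ == "__main__":
    # BPSK in Rayleigh fading (m = 1) has the well-known closed form
    # 0.5 * (1 - sqrt(chi_avg / (1 + chi_avg))), used here as a sanity check.
    print(ser_nakagami(10.0, 2, 1.0), 0.5 * (1.0 - math.sqrt(10.0 / 11.0)))
    print(symbols_needed_ser(10.0, 2, 1.0, 0.95, 0.1, 0.1))
```

Tightening the tolerances $r_1$, $r_2$ shrinks the gap in the denominator and therefore raises the required number of symbols.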
It should be noted that, because the SER often requires orders of magnitude more symbols, when fading is present and estimation via the SER is attempted, it is likely that we will have $L \gg T_{COH}$ whereas we would still have $2NT \ll T_{COH}$ for SNR estimation via $l_{M,N}$. In such a case, the correct comparison would be curves for case (a) ($2NT \ll T_{COH}$) vs. curves for case (ii) ($L \gg T_{COH}$). Such a comparison can easily be done by looking at the appropriate curves from Fig. 85 to Fig. 88 and Fig. 69-Fig. 71.

Fig. 85. Estimation via $l_{M,N}$ vs. estimation via the SER, for case (b) ($2NT \gg T_{COH}$) and case (ii) ($L \gg T_{COH}$) for Nakagami-m fading with $m = 1$. (Curves: M = 2, 4, 8 via $l_{M,N}$ and via the SER; horizontal axis: $E[E_s/N_0]$ in dB.)

Fig. 86. Estimation via $l_{M,N}$ vs. estimation via the SER, for case (b) ($2NT \gg T_{COH}$) and case (ii) ($L \gg T_{COH}$) for Nakagami-m fading with $m = 2$.

Fig. 87. Estimation via $l_{M,N}$ vs. estimation via the SER, for case (b) ($2NT \gg T_{COH}$) and case (ii) ($L \gg T_{COH}$) for Nakagami-m fading with $m = 5$.

Fig. 88. Estimation via $l_{M,N}$ vs. estimation via the SER, for case (b) ($2NT \gg T_{COH}$) and case (ii) ($L \gg T_{COH}$) for Nakagami-m fading with $m = 10$.

4.7.3 Discussion

As can be seen from inspection of Fig. 85 to Fig. 88, estimation of the average SNR via $l_{M,N}$ often requires orders of magnitude fewer symbols than estimation via the SER. Looking first at high SNR, we note that estimation via $l_{M,N}$ generally requires far fewer symbol intervals than estimation via the SER.
Thus, by estimating from $l_{M,N}$ we can generate estimates much more rapidly than by estimating the SNR from the SER. We also see that as the fading index $m$ increases, so does the advantage of the proposed estimator. For moderate and high $m$, and, as well, for case (a) ($2NT \ll T_{COH}$), we find that estimation via $l_{M,N}$ is much better than estimation via the SER, often by many orders of magnitude.

At low SNRs, we see that estimation via $l_{M,N}$ requires about the same number of symbols as estimation via the SER. However, consider the case of unknown data being transmitted. The sole way to obtain an SER estimate from unknown data is via a code-decode process [62]. Specifically, the transmitted data is coded and then decoded at the receiver (e.g., using block codes or convolutional codes [13 Chap. 8]), and an error rate estimate is obtained at the receiver by comparing the decoded data stream to the input data stream. However, this implicitly assumes that the error correction decoder's output is completely error free, which is a bad assumption at low SNRs. Thus, the SER estimate would be inherently more unreliable as the SNR decreases, and, moreover, the error correction decoder (ECD) might not even be locked at low SNRs. As one method of countering this effect, training sequences, pilot symbols, or preambles can be sent over the channel and the SER estimation can be done upon those known symbols. However, this incurs a reduction in the channel's information-bearing capacity, since channel throughput that is taken up by those known symbols cannot be used to transmit data. Secondly, if we call the percentage of known symbols in the data stream $P$ (e.g., $P = 10\%$), then the number of symbol intervals that we actually have to wait in order to arrive at the SER estimate is increased by a factor of $1/P$ over the quantities predicted by Fig. 85 to Fig. 88, i.e.
for $P = 10\%$ we need to multiply those quantities by 10, which clearly degrades the performance of estimation via the SER as compared to estimation via $l_{M,N}$. In conclusion, for the aforementioned reasons, Fig. 85 to Fig. 88 present optimistic results with regard to SNR estimation via the SER. Therefore, the proposed method is superior (in terms of estimation latency) at low SNRs as well.

In terms of hardware complexity, estimation via the SER requires the implementation of error detection and accrual mechanisms. This often means the allocation of sizeable hardware and/or software resources for this purpose. Moreover, a lookup table that would translate the SER measurement into an SNR measurement is still required, and such a lookup table is much larger than the one presented in Sec. 4.3.2 (see discussion in Sec. 4.5.5).

4.8 SNR Estimation via $l_{M,N}$ for M-PSK in the Presence of Fading: NMSE Evaluation

When fading is present, the SNR estimation approach taken in recent years has been to estimate the SNR using Viterbi algorithm-based [15] or EM (Expectation Maximization) algorithm-based estimators [70]. Obviously, such estimators have many orders of magnitude more complexity than the proposed estimator. As for previously studied blind 1 sample/symbol moment-based estimation, it has been documented that the M2M4 estimator performs very poorly in fading conditions [70].

In order to evaluate the effects of fading upon the proposed estimator's NMSE, simulation results are obtained for the case of Nakagami-m fading for various values of $m$, and the NMSE is compared to the NMSE without fading.

[Plot: NMSE versus $E[E_s/N_0]$ (dB); curves for M=2 with m = 1, 2, 5, 10, M=2 with no fading, and the CRB with no fading.]
Fig. 89. NMSE comparison of SNR estimation via $l_{M,N}$ for BPSK and Nakagami-m fading. For each metric $2N = 1024$ symbols were used.
[Plot: NMSE versus $E[E_s/N_0]$ (dB); curves for M=4 with m = 1, 2, 5, 10, M=4 with no fading, and the CRB with no fading.]
Fig. 90. NMSE comparison of SNR estimation via $l_{M,N}$ for QPSK and Nakagami-m fading. For each metric $2N = 1024$ symbols were used.

Fig. 91. NMSE comparison of SNR estimation via $l_{M,N}$ for 8-PSK and Nakagami-m fading. For each metric $2N = 1024$ symbols were used.

Looking at Fig. 89-Fig. 91, we see that over all the SNR range of interest, the SNR estimator performs quite well, that is, its NMSE is below $5 \cdot 10^{-2}$ and usually below $10^{-2}$, which is quite a useful range (see Sec. 4.6.2). An increase in the NMSE is observed at high SNR, although this is not particularly problematic since SNR estimates need not be very accurate at high SNRs (since palpable performance gains by employing SNR estimates will be achieved only at low and moderate SNRs). Furthermore, it should be noted that the NMSE can be reduced simply by increasing the number of symbols $2N$ that is used in order to compute the estimator. This will have a negligible effect on the estimator's complexity: the only increase in complexity is the augmentation of the accumulator register and adder in the IAD structure by a few bits, see Fig. 24 (for example, if we increase $2N$ to $2N = 1024 \cdot 16 = 16384$, we would need to augment these structures by $\log_2 16 = 4$ bits). Hence, the degradation due to fading can be easily overcome by increasing the estimation period, with a negligible increase in the estimator complexity.

Now, let us look at the effect of fading on the estimator's bias. This is shown in the following figures.

Fig. 92. Normalized bias comparisons for SNR estimation via $l_{M,N}$ for BPSK and Nakagami-m fading. For each metric $2N = 1024$ symbols were used.

Fig. 93. Normalized bias comparisons for SNR estimation via $l_{M,N}$ for QPSK and Nakagami-m fading. For each metric $2N = 1024$ symbols were used.

Fig. 94.
Normalized bias comparisons for SNR estimation via $l_{M,N}$ for 8-PSK and Nakagami-m fading. For each metric $2N = 1024$ symbols were used.

As can be seen in Fig. 92-Fig. 94, there is no appreciable effect of fading on the estimator's bias and it remains very near 0 for all SNRs of interest.

To conclude this section, we note that the effects of fading on the proposed estimator are mild and manageable. This is in sharp contrast to the performance of the M2M4 estimator [70]. Moreover, the complexity of the proposed estimator (which is NDA and operates at 1 sample/symbol) is significantly lower than that of other estimators which have been proposed for SNR estimation in fading conditions [15], [70].

Part B
A New SNR Estimation Structure for D-MPSK and for M-PSK in the Absence of Carrier Synchronization

4.9 Introduction to Part B of This Chapter

D-MPSK (Differential M-ary Phase Shift Keying) systems differ from M-PSK systems in that in a D-MPSK receiver there is no carrier synchronization loop, but, rather, the demodulation is done by differential detection (see, for example, [13 Sec. 5.2.8], [109 Chap. 10]). In D-MPSK, as in M-PSK, one of the most important signal metrics in any receiver's operation is an estimate of the received signal's SNR. Some applications of such a metric can be found in Sec. 4.1 and will not be repeated here.

In this, Part B of Chapter 4, we present a robust real-time SNR estimator for D-MPSK. This estimator is a modification of the estimator for coherent M-PSK presented in Part A of this chapter, and, as such, it retains the advantages which were observed for that estimator. Specifically, the estimator is shown to have the following advantages:
(i) The estimator has a compact fixed-point hardware implementation which is quite suitable for implementation within FPGAs or ASICs.
(ii) The estimator requires only 1 sample/symbol.
(iii) Accurate estimates can be generated in real-time.
(iv) The estimator is resistant to imperfections in the AGC (Automatic Gain Control) circuit.

We investigate the proposed estimator theoretically and through simulations. General formulas are developed for SNR estimation in the presence of frequency-flat slow fading (i.e. where $T_{COH} \gg T$ and $1/T \ll W_{COH}$), and specific results are presented for Nakagami-m fading. The proposed estimator is then compared to other SNR estimators, and it is shown that the proposed method requires fewer hardware resources while at the same time providing superior performance. Finally, we consider application of the proposed D-MPSK SNR estimator to SNR estimation in coherent M-PSK receivers when carrier synchronization has not been achieved. The estimator is shown to have excellent performance and an exceptionally compact hardware implementation.

Organization of this part of the chapter is as follows. In Section 4.10 we briefly outline the system model upon which the discussion is undertaken. In Section 4.11 we present the motivation for the new SNR estimator as well as its hardware implementation. Stochastic analysis of the SNR estimation process is pursued in Sections 4.12 and 4.13. In Sections 4.14 and 4.15 we compare the efficacy of the proposed estimator to other estimators. In Section 4.16 we discuss the application of the SNR estimator to M-PSK systems where the carrier PLL is unlocked. Finally, Section 4.17 is devoted to conclusions.

4.10 D-MPSK System Model

Signal and receiver characteristics are assumed identical to those of Part A, except that here (since we assume D-MPSK demodulation) we do not have a carrier recovery PLL. (Alternatively, as we discuss in Section 4.16, the system may be considered identical to that of Part A, except that the carrier PLL is assumed to be unlocked (with certain conditions upon $\Delta\omega$, to be established below)).
The reader is strongly urged to take a thorough look at Part A of this chapter and Section 1.4, since notations and results given there will be used here extensively. The baseband PSK signal is $m(t) = \sum_{n=-\infty}^{\infty} a_n p(t - nT)$, with $p(t)$ being the pulse shape and $a_n = \exp(j\phi_n)$, $\phi_n = 2\pi \cdot m_n / M$, with $m_n \in \{0, 1, \ldots, M-1\}$. The modulated signal is $s_m(t) = \mathrm{Re}[m(t) \exp(j\omega_i t + j\theta_i)]$. A simplified diagram of the front-end of the receiver under discussion is shown in Fig. 95. At the receiver, from Section 1.4 we have $I(n) = K(2E_s \cdot \cos(-\Delta\omega \cdot nT + \theta_e + \phi_n) + n_I(nT))$ and $Q(n) = K(2E_s \cdot \sin(-\Delta\omega \cdot nT + \theta_e + \phi_n) + n_Q(nT))$, with $\theta_e = \theta_i - \theta_o$ and $n_I(nT), n_Q(nT) \sim N(0, 2N_0 E_s)$. $K$ is the equivalent (AGC-controlled) I-Q arm gain (see Sec. 1.5 for a thorough discussion of the AGC and the parameter $K$). The complex symbol is:
$$r_n = I(n) + j \cdot Q(n) \tag{175}$$
and its phase is:
$$\varphi_n \triangleq \tan^{-1}\left(Q(n) / I(n)\right) \tag{176}$$
We then have:
$$r_n = |r_n| \exp(j\varphi_n) \tag{177}$$
Here (unlike in Part A), we do not assume $\Delta\omega = 0$, but rather $|\Delta\omega| \ll 2\pi / (M \cdot T)$. It should be noted that the assumption $|\Delta\omega| \ll 2\pi / (M \cdot T)$ is the standard assumption that is made in D-MPSK receivers (see for example [27 Sec. 10.19]).

We assume that our signal is subject to a frequency-flat (= frequency-nonselective) channel with slow fading (i.e. where $T_{COH} \gg T$ and $1/T \ll W_{COH}$; for the definition of such a process, see for example [13 Sec. 14.3]). We shall comment that, in general, D-MPSK would perform poorly in frequency-selective or fast fading (in which case a different, probably noncoherent, multicarrier, or spread-spectrum type modulation would be chosen; see [13 p. 818], [78], [79]). Thus, the assumption of frequency-flat slow fading will be appropriate for most D-MPSK systems.

[Block diagram: the IF input $\mathrm{Re}[m(t) \cdot \exp(j\omega_i t + j\theta_i)] + n(t)$ is mixed with $2\cos(\omega_i t + \Delta\omega \cdot t + \theta_o)$ and, via a 90° shift, with $-2\sin(\omega_i t + \Delta\omega \cdot t + \theta_o)$; each arm passes through a matched filter giving $I(t)$ and $Q(t)$, which are sampled at rate $1/T$ to give $I(n)$ and $Q(n)$.]
Fig. 95.
Front end of D-MPSK receiver (simplified diagram)

4.11 Motivation and Estimator Structure

4.11.1 Motivation

Detection of D-MPSK signals is often facilitated by first generating a pseudo-coherently demodulated M-PSK signal
$$u_n = r_n r_{n-1}^* \tag{178}$$
and then applying M-PSK decision regions upon $u_n$. The motivation here is similar, but we add a twist: the idea is to use the estimator of Part A upon a normalized pseudo-coherently demodulated M-PSK signal
$$v_n = \frac{r_n r_{n-1}^*}{|r_n| \, |r_{n-1}|} \tag{179}$$
As we shall see, using $v_n$ instead of $u_n$ yields a simpler hardware implementation.

4.11.2 Estimator structure and operation principle

We define:
$$x_{M,n}^D = \mathrm{Re}\left[(v_n)^M\right] / |v_n|^M \tag{180}$$
(Note: to avoid confusion with Part A, throughout this part of the chapter we use superscript "D" in variables pertaining to D-MPSK structures). The estimator of Part A applied to $v_n$ is defined as:
$$l_{M,N}^D = \frac{1}{2N} \sum_{n=-N+1}^{N} \mathrm{Re}\left[(v_n)^M\right] / |v_n|^M \tag{181}$$
Here we do not use $l_{M,N}^D$ as a lock detector (since there is no carrier PLL) but rather only as an SNR estimator. Note that $|v_n| = 1$ for all $n$, so theoretically we could have defined $l_{M,N}^D = \frac{1}{2N} \sum_{n=-N+1}^{N} \mathrm{Re}[(v_n)^M]$. However, when quantization effects are taken into account we see that $|v_n| = 1$ does not always hold, and then $l_{M,N}^D = \frac{1}{2N} \sum_{n=-N+1}^{N} \mathrm{Re}[(v_n)^M] / |v_n|^M$ has distinct implementation and performance advantages (which are outlined in Section 4.11.3).

In this part of the chapter we shall present a general method for D-MPSK SNR estimation in the presence of fading. As noted earlier, we assume that the fading is slow (i.e. $T_{COH} \gg T$, where $T_{COH}$ is the channel coherence time) and that the channel is frequency-nonselective. Once again we use the notation $\chi$ to refer to the instantaneous SNR, and the notation $\bar{\chi}$ to denote the average SNR (i.e., $\bar{\chi} = E[E_s/N_0]$).
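The construction of the D-MPSK statistic can be sketched in a few lines of floating-point code. The sketch below forms the differential product, normalizes it to unit magnitude, raises it to the M-th power, and averages the real part; it ignores quantization, uses consecutive pairs from a single list rather than the index range $-N+1 \ldots N$, and skips the (practically impossible) zero-magnitude samples. The function name and the test signal are hypothetical and only illustrative.

```python
import cmath
import math
import random


def dmpsk_metric(r, M):
    """Average of Re[v_n^M] / |v_n|^M over consecutive pairs of complex samples."""
    acc = 0.0
    count = 0
    for n in range(1, len(r)):
        u = r[n] * r[n - 1].conjugate()   # pseudo-coherent product u_n
        mag = abs(u)
        if mag == 0.0:
            continue                      # degenerate sample, no phase information
        v = u / mag                       # normalized product v_n
        acc += (v ** M).real / abs(v) ** M
        count += 1
    return acc / count


if __name__ == "__main__":
    rng = random.Random(5)
    M = 4
    # Noiseless QPSK with an arbitrary arm gain and carrier phase offset.
    for gain in (0.2, 3.7):
        r = [gain * cmath.exp(1j * (2.0 * math.pi * rng.randrange(M) / M + 0.3))
             for _ in range(200)]
        print("gain", gain, "metric", dmpsk_metric(r, M))
```

Because $v_n$ is normalized, the metric does not depend on the gain applied to the samples, which is the self-normalizing property referred to in the text; in the noiseless case it is equal to 1 up to rounding.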
The conditional pdf (probability density function) of the SNR due to fading will be denoted as $p_F(\chi|\bar{\chi}) = p(E_s/N_0 = \chi \mid E[E_s/N_0] = \bar{\chi})$. For example, from [77 Table 2] we have for Rayleigh fading
$$p_F(\chi|\bar{\chi}) = \frac{1}{\bar{\chi}} \exp\left(-\frac{\chi}{\bar{\chi}}\right)$$
and for Nakagami-m fading
$$p_F(\chi|\bar{\chi}) = \frac{m^m \chi^{m-1}}{\bar{\chi}^m \Gamma(m)} \exp\left(-\frac{m\chi}{\bar{\chi}}\right)$$
Other common fading distributions were given in Table 2. We differentiate between two cases:
(a) $2NT \ll T_{COH}$
(b) $2NT \gg T_{COH}$
For case (a), during the averaging over $2N$ symbol intervals (a duration of $2NT$) that is done in eq. (181), the channel SNR will not have changed much; hence SNR estimation from $l_{M,N}^D$ will yield an estimate of the instantaneous SNR $\chi$. For case (b), since $2NT$ is much larger than the channel coherence time, the distribution of SNR values encountered during the estimator computation will follow $p_F(\chi|\bar{\chi})$, and SNR estimation from $l_{M,N}^D$ will yield an estimate of the average SNR $\bar{\chi}$. As already noted in Part A of this chapter, in general we would ideally like to produce SNR estimates that are instantly available and can be fed in real time to the decoder, equalizer, or other receiver components which could make good use of them. Hence, ideally, we would be served by perfect knowledge of the instantaneous SNR $\chi$ (which, if desired, could be averaged over time in order to produce an estimate of $\bar{\chi}$). However, estimation of $\chi$ is not always possible, due to the fact that $T_{COH}$ may be too short as compared to the estimation period $2NT$ which is necessary in order to achieve an acceptable accuracy in the SNR estimation (see Secs. 4.13-4.14). Nonetheless, if $2NT \gg T_{COH}$, timely knowledge of the average SNR $\bar{\chi}$ is often sufficient in order to facilitate substantial performance gains [15].

Estimation is achieved following a procedure analogous to (113), i.e.
we estimate the SNR via the following:

Case (a) ($2NT \ll T_{COH}$): the instantaneous SNR is estimated through:
$$\hat{\gamma}_{dB}^D = 10 \cdot \log_{10}\left((f_M^D)^{-1}(l_{M,N}^D)\right) \tag{182}$$
where $f_M^D(\chi) = E\left[l_{M,N}^D \mid E_s/N_0 = \chi\right]$.

Case (b) ($2NT \gg T_{COH}$): the average SNR is estimated through:
$$\hat{\bar{\gamma}}_{dB}^D = 10 \cdot \log_{10}\left((\bar{f}_M^D)^{-1}(l_{M,N}^D)\right) \tag{183}$$
where $\bar{f}_M^D(\bar{\chi}) = E\left[l_{M,N}^D \mid E[E_s/N_0] = \bar{\chi}\right]$.

4.11.3 Hardware implementation

A fixed-point (2's complement) hardware implementation of the estimator is shown in Fig. 96. The LUTs (Lookup Tables) require
$$n_{bits} = \underbrace{2 b_2 2^{2b_1}}_{\text{LUT \#1}} + \underbrace{2 b_2 2^{2b_1}}_{\text{LUT \#2}} + \underbrace{b_4 2^{2b_3}}_{\text{LUT \#3}} + \underbrace{b_5 2^{b_4}}_{\text{LUT \#4}} \tag{184}$$
bits. See Secs. 2.3.3 and 4.3.2 for discussions applicable to LUT #3 and LUT #4, as well as discussion of why $2N$ should be a power of 2. The use of $v_n$ rather than $u_n$ significantly reduces the hardware resources needed to compute $l_{M,N}^D$; this is because the normalized constellation has less dynamic range (it is $[-1, 1]$), which reduces the $b_2$ and $b_3$ required to achieve an acceptable degradation due to quantization, as we shall see in Sec. 4.12.

[Block diagram: LUT #1 maps $I(n)$, $Q(n)$ to $I(n)/\sqrt{I^2(n) + Q^2(n)}$ and $Q(n)/\sqrt{I^2(n) + Q^2(n)}$, i.e. to $\mathrm{Re}(r_n)/|r_n|$ and $\mathrm{Im}(r_n)/|r_n|$; LUT #2 does the same for $I(n-1)$, $Q(n-1)$; a complex multiplication yields $\mathrm{Re}(v_n)$ and $\mathrm{Im}(v_n)$, of which only the MSBs are retained; LUT #3 outputs $x_{M,n}^D$; an integrate-and-dump averager sums $2N$ samples and disregards the lower $\log_2(2N)$ bits to give $l_{M,N}^D$; LUT #4 outputs $10 \cdot \log_{10}((f_M^D)^{-1}(l_{M,N}^D))$ or $10 \cdot \log_{10}((\bar{f}_M^D)^{-1}(l_{M,N}^D))$.]
Fig. 96. Fixed-point hardware generation of $\hat{\gamma}_{dB}^D$.

4.12 Conditional Distribution of $l_{M,N}^D$

In this section we shall derive the conditional probability distribution of $l_{M,N}^D$. These stochastic properties will then be used to develop the SNR estimation method in Section 4.13. For simplicity and without loss of generality (see Sec.
2.4) we assume $\forall n, \phi_n = 0$, whereupon from (176):
$$\varphi_n = \tan^{-1}\left[\frac{\sin(-\Delta\omega \cdot nT + \theta_e) + n_Q(nT) / (2E_s)}{\cos(-\Delta\omega \cdot nT + \theta_e) + n_I(nT) / (2E_s)}\right] \tag{185}$$
Let us define (similar to (43)):
$$\Delta\phi_n \triangleq \varphi_n - (-\Delta\omega nT + \theta_e) \tag{186}$$
Since we assumed $\forall n, \phi_n = 0$, the physical meaning of $\Delta\phi_n$ is clear: it is the phase error in the received symbol that can be attributed to $n_I(nT)$ and $n_Q(nT)$ (to see this, substitute $n_I(nT) = n_Q(nT) = 0$ in the expressions for $I(n)$ and $Q(n)$, and then $\varphi_n = \tan^{-1}(Q(n)/I(n)) = -\Delta\omega nT + \theta_e \Rightarrow \Delta\phi_n = \varphi_n - (-\Delta\omega nT + \theta_e) = 0$). Since (185) has the same form as (42) with $\theta_e$ replaced by $-\Delta\omega nT + \theta_e$, we have that $\Delta\phi_n$ as defined in (186) is distributed the same as $\Delta\phi_n$ as defined in (43), namely at $E_s/N_0 = \chi$ it has a Rician phase pdf (probability density function) given by (29):
$$p_R(\Delta\phi|\chi) = p(\Delta\phi_n = \Delta\phi \mid E_s/N_0 = \chi) = \frac{\exp(-\chi)}{2\pi} \left[1 + \sqrt{2\chi} \cos(\Delta\phi) \exp\left(\chi \cos^2(\Delta\phi)\right) \int_{-\infty}^{\sqrt{2\chi}\cos(\Delta\phi)} e^{-y^2/2} \, dy\right] \tag{187}$$
where $-\pi < \Delta\phi < \pi$. Now, let us investigate $v_n$. Trivial substitutions of (177) and (178) into (179) show that
$$v_n = \frac{|r_n| \exp(j\varphi_n) \, |r_{n-1}| \exp(-j\varphi_{n-1})}{|r_n| \, |r_{n-1}|} = \exp(j\varphi_n^D) \tag{188}$$
where
$$\varphi_n^D \triangleq \varphi_n - \varphi_{n-1} \tag{189}$$
Observe that $\varphi_n^D \in [-2\pi, 2\pi]$, though the true phase is $\varphi_n^D \bmod 2\pi \in [-\pi, \pi]$. We could have indeed performed the modulo operation and confined the range of $\varphi_n^D$ to $[-\pi, \pi]$; this is, in fact, the approach undertaken in [110], [111], and [112]. In contrast, we choose to follow the approach of [113] and to maintain the pretence $\varphi_n^D \in [-2\pi, 2\pi]$ because, as we shall see, it simplifies the analysis. However, note that since $v_n = \exp(j\varphi_n^D) = \exp(j(\varphi_n^D \bmod 2\pi))$ this choice has no bearing upon the results. From (186) we then have:
$$\varphi_n^D = \varphi_n - \varphi_{n-1} = \Delta\phi_n + (-\Delta\omega \cdot nT + \theta_e) - \left(\Delta\phi_{n-1} + (-\Delta\omega \cdot (n-1)T + \theta_e)\right) = \Delta\phi_n - \Delta\phi_{n-1} - \Delta\omega T \tag{190}$$
Let us define $\Delta\phi_n^D = \Delta\phi_n - \Delta\phi_{n-1}$; note that since $\Delta\phi_n, \Delta\phi_{n-1} \in [-\pi, \pi]$ then $\Delta\phi_n^D \in [-2\pi, 2\pi]$.
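The pdf of $\Delta\phi_n^D$ is easily found since it is a convolution of the distributions of $\Delta\phi_n$ and $(-\Delta\phi_{n-1})$, namely (for $-2\pi < \Delta\phi^D < 2\pi$)

$$p_D(\Delta\phi^D \mid \chi) \triangleq p(\Delta\phi_n^D = \Delta\phi^D \mid E_s/N_0 = \chi) = \int_{-\pi}^{\pi} p_R(\tau \mid \chi)\, p_R(\tau - \Delta\phi^D \mid \chi)\, d\tau \qquad (191)$$

which is straightforward to evaluate numerically. We note that though closed-form expressions for (191) seem unattainable, a Fourier series representation for the pdf is given in [113 eq. (4)] (substituting $\theta = 0$ there). Moreover, a simple expression for the distribution at high SNR is easily obtained: from (34) we have, for all $n$, $\Delta\phi_n \sim N(0, 1/(2\chi))$; now, since $\Delta\phi_n^D = \Delta\phi_n - \Delta\phi_{n-1}$ and since $\Delta\phi_n$ and $\Delta\phi_{n-1}$ are independent (see Sec. 2.3), we have $\Delta\phi_n^D \sim N(0, 1/\chi)$, i.e.

$$p_D(\Delta\phi^D \mid \chi) \approx \frac{1}{\sqrt{2\pi}} \sqrt{\chi} \cdot \exp\left( -0.5 \cdot \chi \cdot (\Delta\phi^D)^2 \right) \qquad (192)$$

We now have the tools necessary in order to investigate the distribution of $l^D_{M,N}$.

4.12.1 Conditional expectation of $l^D_{M,N}$ given $\chi$ for $2NT \ll T_{COH}$

For $2NT \ll T_{COH}$, we can assume that the $E_s/N_0$ ratio is constant over the estimation interval and is equal to the instantaneous SNR $\chi$. Hence:

$$
\begin{aligned}
f_M^D(\chi) &= E\left[ l^D_{M,N} \,\middle|\, \tfrac{E_s}{N_0} = \chi \right]
= E\left[ \frac{\mathrm{Re}\left( (\cos\varphi_n^D + j \sin\varphi_n^D)^M \right)}{(\cos^2\varphi_n^D + \sin^2\varphi_n^D)^{M/2}} \,\middle|\, \tfrac{E_s}{N_0} = \chi \right] \\
&= E\left[ \cos(M\varphi_n^D) \,\middle|\, \tfrac{E_s}{N_0} = \chi \right]
= E\left[ \cos(M\Delta\phi_n^D - M\Delta\omega T) \,\middle|\, \tfrac{E_s}{N_0} = \chi \right] \\
&= E\left[ \cos(M\Delta\phi_n^D) \,\middle|\, \tfrac{E_s}{N_0} = \chi \right] \cos(M\Delta\omega T)
+ \underbrace{E\left[ \sin(M\Delta\phi_n^D) \,\middle|\, \tfrac{E_s}{N_0} = \chi \right]}_{=0} \sin(M\Delta\omega T) \\
&= E\left[ \cos(M\Delta\phi_n^D) \,\middle|\, \tfrac{E_s}{N_0} = \chi \right] \cos(M\Delta\omega T) \\
&= \left[ \int_{-2\pi}^{2\pi} \cos(M\Delta\phi^D)\, p_D(\Delta\phi^D \mid \chi)\, d(\Delta\phi^D) \right] \cos(M\Delta\omega T)
\end{aligned}
\qquad (193)
$$

As was the case with $f_M(\chi)$, it is possible to find closed-form expressions for $f_M^D(\chi)$. This is done in Appendix C. We find using those derivations that:

-|2 M-\ +1 fx^ cos(MAcoT) (194)

which (for even M) can be simplified to (see Appendix C):

/M(X) = f M2 ^ 2 ^ 2 ( - l ) w + " (M/2+n-l)\(M/2+k-l)\ ~7tZti n\k\ ' (MI2-n)\(MI2-k)\xn n+k + ( _ i ) " ' M M . e - ' £ Z Mil Mil (-1)" (M/2+n-l)\(M/2+k-l)l Mil Mil + exp( -2^ )£ ]>] n=\ k=\ tt£(n\(k-l)\ (M/2-n)\(M/2-k)\Z (M/2+n-\)\(M/2+k-\)\ (n-\)\(k-l)\(M/2-n)\(M/2-k)\x"+k xcos (MAcoT) (195)

It is emphasized that the sums in (195) have a finite number of terms and can thus be easily and accurately computed.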
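As a rough numerical illustration of the statement that (191) and (193) are straightforward to evaluate numerically, the following sketch computes $E[\cos(M\Delta\phi_n^D) \mid \chi]\cos(M\Delta\omega T)$ from the Rician phase pdf (187). It is an assumption-laden sketch, not the thesis's own procedure: the function names, the grid size `K` and the midpoint rule are illustrative choices. Instead of forming the convolution (191) explicitly, it uses the independence of $\Delta\phi_n$ and $\Delta\phi_{n-1}$, which gives $E[\cos(M(\Delta\phi_n - \Delta\phi_{n-1}))] = E[\cos M\Delta\phi_n]^2 + E[\sin M\Delta\phi_n]^2$; this is the same integral as (193) with (191) substituted.

```python
import math

def rician_phase_pdf(phi, chi):
    """Rician phase pdf (187) at Es/N0 = chi (linear, not dB).

    The Gaussian integral in (187) is written via erfc, and exp(-chi) is
    merged with exp(chi*cos^2) into exp(-chi*sin^2) to avoid overflow.
    """
    c = math.cos(phi)
    return (math.exp(-chi) / (2.0 * math.pi)
            + math.sqrt(chi / (4.0 * math.pi)) * c
            * math.exp(-chi * math.sin(phi) ** 2)
            * math.erfc(-math.sqrt(chi) * c))

def f_d(M, chi, dw_T=0.0, K=2000):
    """Numerical value of (193); dw_T is the product (delta omega) * T.

    K is a hypothetical grid size for the midpoint rule on [-pi, pi].
    """
    h = 2.0 * math.pi / K
    ec = 0.0  # E[cos(M * dphi)] for a single symbol
    es = 0.0  # E[sin(M * dphi)] for a single symbol (zero by symmetry)
    for i in range(K):
        phi = -math.pi + (i + 0.5) * h
        w = rician_phase_pdf(phi, chi) * h
        ec += w * math.cos(M * phi)
        es += w * math.sin(M * phi)
    # Independence of the two phase errors turns the expectation over the
    # difference into a sum of products of single-symbol expectations.
    return (ec * ec + es * es) * math.cos(M * dw_T)

if __name__ == "__main__":
    for snr_db in (0, 5, 10, 15, 20):
        chi = 10.0 ** (snr_db / 10.0)
        print(snr_db, "dB:", round(f_d(4, chi), 4))
```

At high SNR the result can be compared with the mean of $\cos(MX)$ for $X \sim N(0, 1/\chi)$, which is $\exp(-M^2/(2\chi))$ and follows from the Gaussian approximation (192).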
At high SNR we use (192) to obtain a useful approximation (using $\int_0^{\infty} e^{-ax^2}\cos(bx)\,dx = \frac{1}{2}\sqrt{\pi/a}\; e^{-b^2/(4a)}$ [54 eq. 15.73]):

$$f_M^D(\chi) \approx \exp\left( -\frac{M^2}{2\chi} \right) \cos(M\Delta\omega T) \qquad (196)$$

We assumed (see Sec. 4.10), as is appropriate for the operating point of D-MPSK systems, that $|\Delta\omega| \ll 2\pi/(M \cdot T)$ holds, implying $\cos(M\Delta\omega T) \approx 1$. Therefore, the degradation in (193)-(196) due to carrier frequency error is negligible. We thus henceforth assume for simplicity $\Delta\omega = 0$, though we note that (193)-(196) provide an easy way to incorporate modeling of small frequency errors.

Plots of (193), (196), and simulated results for $\Delta\omega = 0$ are given in Fig. 97; we see that (196) is an excellent approximation. The simulations in Fig. 97 which include quantization effects are quite realistic since they model the following AGC effects: (a) sampler input signal-level backoff (samplers are assumed to be driven at an RMS (Root-Mean-Square) of 80% of the samplers' full-scale voltage range) and (b) clamping by the samplers when they are saturated. The AGC is assumed to behave as the example AGC described in Section 1.5. Hence, the simulations presented should be a good prediction of achievable results. If we assume for example $b_5 = 8$ (which would imply an 8-bit SNR measurement in dB, usually more than sufficient; see Footnote 18), then from (184) we have for the simulated quantized systems in Fig. 97 that $n_{bits}$ = 14336, 30720, 124928, respectively, all of which are very reasonable considering the amount of dedicated memory available in contemporary FPGAs (e.g. the various Xilinx Virtex families [114] or Altera Stratix families [115]) or which can be implemented in ASICs. Fig. 97 shows that for low $M$ only coarse quantization is needed, while (as expected) higher values of $M$ require finer quantization to achieve good agreement with the predicted value of $l^D_{M,N}$.

Footnote 18: Note that such a measurement could include digits after the binary point. For example, if we put the binary point to the left of the Least Significant Bit (LSB), then we have for an 8-bit output the following: 1 sign bit followed by 6 whole-number bits and 1 fractional bit, which would allow, in 2's complement notation, the representation of the interval -64 dB to +63.5 dB in 0.5 dB steps, which is usually quite sufficient range and quantization.

[Fig. 97 plot: $l^D_{M,N}$ versus $E_s/N_0$ (dB). Legend: Expected, Exact; Expected, Gaussian Approx; Simul., No Quantization; Simul., [b1 b2 b3 b4]=[4 4 5 8]; Simul., [b1 b2 b3 b4]=[5 5 5 8]; Simul., [b1 b2 b3 b4]=[6 7 5 8].]

Fig. 97. Expected and simulated values of $l^D_{M,N}$ for case (a).

4.12.2 Conditional expectation of $l^D_{M,N}$ given $\bar{\chi}$ for $2NT \gg T_{COH}$

For the case of $2NT \gg T_{COH}$, since $2NT$ is much larger than the channel coherence time, the distribution of SNR values encountered during the estimator computation will follow $p_F(\chi \mid \bar{\chi})$, and SNR estimation from $l^D_{M,N}$ will yield an estimate of the average SNR $\bar{\chi}$. The expectation of $l^D_{M,N}$ conditioned upon $\bar{\chi}$ is:

$$
\begin{aligned}
\bar{f}_M^D(\bar{\chi}) &= E\left[ l^D_{M,N} \,\middle|\, E\left[\tfrac{E_s}{N_0}\right] = \bar{\chi} \right]
= E\left[ \frac{\mathrm{Re}\left( (\cos\varphi_n^D + j \sin\varphi_n^D)^M \right)}{(\cos^2\varphi_n^D + \sin^2\varphi_n^D)^{M/2}} \,\middle|\, E\left[\tfrac{E_s}{N_0}\right] = \bar{\chi} \right] \\
&= E\left[ \cos(M\varphi_n^D) \,\middle|\, E\left[\tfrac{E_s}{N_0}\right] = \bar{\chi} \right]
= E\left[ \cos(M\Delta\phi_n^D - M\Delta\omega T) \,\middle|\, E\left[\tfrac{E_s}{N_0}\right] = \bar{\chi} \right] \\
&= E\left[ \cos(M\Delta\phi_n^D) \,\middle|\, E\left[\tfrac{E_s}{N_0}\right] = \bar{\chi} \right] \cos(M\Delta\omega T)
+ \underbrace{E\left[ \sin(M\Delta\phi_n^D) \,\middle|\, E\left[\tfrac{E_s}{N_0}\right] = \bar{\chi} \right]}_{=0} \sin(M\Delta\omega T) \\
&= E\left[ \cos(M\Delta\phi_n^D) \,\middle|\, E\left[\tfrac{E_s}{N_0}\right] = \bar{\chi} \right] \cos(M\Delta\omega T)
\end{aligned}
\qquad (197)
$$

Another expression of (197) is simply:

$$\bar{f}_M^D(\bar{\chi}) = \int_0^{\infty} f_M^D(\chi)\, p_F(\chi \mid \bar{\chi})\, d\chi \qquad (198)$$

Eq. (198) will be of particular importance in Appendix D, where closed-form expressions for $\bar{f}_M^D(\bar{\chi})$ are developed. At high $\bar{\chi}$ we can also assume that the instantaneous SNR $\chi$ is also high, and we use (192) to obtain a useful approximation (using [54 eq.
15.73]):

$$
\bar{f}_M^D(\bar{\chi}) \approx \left[ \int_0^{\infty} \int_{-\infty}^{\infty} \cos(M\Delta\phi^D) \sqrt{\frac{\chi}{2\pi}} \exp\left( -\frac{\chi}{2} (\Delta\phi^D)^2 \right) d(\Delta\phi^D)\; p_F(\chi \mid \bar{\chi})\, d\chi \right] \cos(M\Delta\omega T)
= \left[ \int_0^{\infty} \exp\left( -\frac{M^2}{2\chi} \right) p_F(\chi \mid \bar{\chi})\, d\chi \right] \cos(M\Delta\omega T) \qquad (199)
$$

Due to the infinite number of possible fading distributions, we obviously cannot present results for (197) for all fading types. Rather, for more insight into the behaviour of $l^D_{M,N}$ when $2NT \gg T_{COH}$ we shall investigate its behaviour under Nakagami-m fading, which is a fading statistic commonly found in systems which use D-MPSK ([116], [108]). We again assume $\Delta\omega \approx 0$ (more specifically, that $|\Delta\omega| \ll 2\pi/(M \cdot T)$, see Sec. 4.10) and plot theoretical and simulated results for (197) in Fig. 98 for various types of Nakagami-m statistics. The theoretical results were derived using Appendix D. Comparing Fig. 98 to Fig. 97 we see that the effect of the Nakagami-m fading upon the curve of $l^D_{M,N}$ is rather mild.

To evaluate the effects of fading upon the quantization requirements, we plot (197) for the various quantizations used in Fig. 97. This is done in Fig. 99. As we see by comparing Fig. 99 to Fig. 97, the quantization which was sufficient for case (a) (as shown in Fig. 97) is also sufficient for case (b). Hence, there is no appreciable impact of fading upon the hardware resources required for implementation of the proposed structure.

Fig. 98. $\bar{f}_M^D(\bar{\chi}) = E[\, l^D_{M,N} \mid E[E_s/N_0] = \bar{\chi} \,]$ as a function of $\bar{\chi}$ for Nakagami-m fading, obtained theoretically (via (254)) and via simulations, for various values of $m$. Quantization effects are ignored.

[Fig. 99 plot: $l^D_{M,N}$ versus $E[E_s/N_0]$ (dB). Legend: Simul., No Quantization; Simul., [b1 b2 b3 b4]=[4 4 5 8]; Simul., [b1 b2 b3 b4]=[5 5 5 8]; Simul., [b1 b2 b3 b4]=[6 7 5 8].]

Fig. 99. Demonstration of the effects of quantization on the measured value of $l^D_{M,N}$ for Nakagami-m fading with $m = 2$.

4.12.3 Variance of $l^D_{M,N}$

It can be shown (see Appendix E) that for $|\Delta\omega| \ll 1/(M \cdot T)$ and for slow fading the cross-correlation coefficients of $x^D_{M,n}$, defined as
$$\rho_{n,k} \triangleq \frac{E[x^D_{M,n} x^D_{M,k}] - E[x^D_{M,n}]\,E[x^D_{M,k}]}{\sqrt{\mathrm{var}(x^D_{M,n})\,\mathrm{var}(x^D_{M,k})}} \qquad (200)$$

satisfy

$$\rho_{n,k} = \begin{cases} 1 & n = k \\ \rho_1(\chi) & |n-k| = 1 \\ 0 & |n-k| > 1 \end{cases} \qquad (201)$$

where $|\rho_1(\chi)| < 0.3$. Moreover, we can still use the derivations of Sec. 2.5 to surmise that, for all $n$, $\sigma_x^2 \triangleq \mathrm{var}(x^D_{M,n}) \le 1$. Now,

$$\mathrm{var}(l^D_{M,N}) = \mathrm{var}\left( \frac{1}{2N} \sum_{n} x^D_{M,n} \right) = \frac{1}{(2N)^2} \left( 2N\sigma_x^2 + 2 \cdot (2N-1)\,\rho_1(\chi)\,\sigma_x^2 \right) \qquad (202)$$

and using $\sigma_x^2 \le 1$ and $|\rho_1(\chi)| < 0.3$ we have

$$\mathrm{var}(l^D_{M,N}) \le \frac{1}{(2N)^2} \left( 2N + 2 \cdot (2N-1) \cdot 0.3 \right) \approx \frac{1}{2N} + \frac{0.6}{2N} \qquad (203)$$

Finally, from the central limit theorem for m-dependent variables [117 Chap. 7] we have that $l^D_{M,N}$ is Gaussian.

4.12.4 Summary: Conditional distribution of $l^D_{M,N}$

Let us unite what we have learned in the previous subsections. We have:

• For case (a) (that is, $2NT \ll T_{COH}$): $l^D_{M,N} \sim N(f_M^D(\chi), \sigma^2)$ where $f_M^D(\chi)$ is given in (193)-(196) and $\sigma^2 \le 1.6/2N$.

• For case (b) (that is, $2NT \gg T_{COH}$): $l^D_{M,N} \sim N(\bar{f}_M^D(\bar{\chi}), \sigma^2)$ where $\bar{f}_M^D(\bar{\chi})$ is given in (197)-(199) and $\sigma^2 \le 1.6/2N$.

4.13 Principle of SNR Estimation from $l^D_{M,N}$

As noted in Section 4.11.2, for case (a) the instantaneous SNR is estimated through $\hat{\gamma}^D_{dB} = 10 \cdot \log_{10}((f_M^D)^{-1}(l^D_{M,N}))$, while for case (b) the average SNR is estimated through $\hat{\bar{\gamma}}^D_{dB} = 10 \cdot \log_{10}((\bar{f}_M^D)^{-1}(l^D_{M,N}))$. Graphs of $\hat{\gamma}^D_{dB}$ and $\hat{\bar{\gamma}}^D_{dB}$ as functions of $l^D_{M,N}$ are shown in Fig. 100. These curves are the contents of LUT #4 in Fig. 96, and the curve to use would be chosen according to the fading characteristics of the channel. There is an additional small point that needs to be addressed: theoretically we can encounter negative values of $l^D_{M,N}$, in which case $\hat{\gamma}^D_{dB}$ and $\hat{\bar{\gamma}}^D_{dB}$ would be undefined. This is solved by setting the output of LUT #4 to $-2^{b_5-1}$ for $l^D_{M,N} < 0$ (see Footnote 19; not shown in Fig. 100). This correctly reflects the SNR estimate for $l^D_{M,N} < 0$ (which should be $-\infty$ dB, i.e. no signal) within the limits of the available quantization.

Fig. 100. $\hat{\gamma}^D_{dB}$ (for case (a)) or $\hat{\bar{\gamma}}^D_{dB}$ (for case (b)) vs. $l^D_{M,N}$ (the "No Fading" curves are those for case (a), while the others are for case (b)).
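To make the role of LUT #4 more concrete, here is a minimal sketch of how its contents might be generated for case (a). It rests on assumptions that are not in the thesis: the table is filled from the high-SNR approximation $f_M^D(\chi) \approx \exp(-M^2/(2\chi))$ (with $\Delta\omega = 0$) rather than from the exact $f_M^D$, the input address is taken to be the $b_4$-bit two's-complement code of $l^D_{M,N}$ over $[-1, 1)$, and the output carries one fractional bit as in the example of Footnote 18. The function name and parameters are hypothetical.

```python
import math

def build_lut4(M, b4=8, b5=8, frac_bits=1):
    """Return 2**b4 signed output codes, indexed by the two's-complement
    code of l (hypothetical layout; one LSB of l equals 2**-(b4-1))."""
    lowest = -(1 << (b5 - 1))       # most negative output code, -2**(b5-1)
    highest = (1 << (b5 - 1)) - 1   # most positive output code
    scale = 1 << frac_bits          # output codes per dB
    lut = []
    for code in range(1 << b4):
        # Interpret the address as a signed two's-complement number.
        signed = code - (1 << b4) if code >= (1 << (b4 - 1)) else code
        l = signed / float(1 << (b4 - 1))
        if l <= 0.0:
            # Non-positive l: emit the lowest SNR the output can express.
            lut.append(lowest)
            continue
        # Invert l = exp(-M^2 / (2*chi)) (assumed approximation).
        chi = -M * M / (2.0 * math.log(l))
        q = round(10.0 * math.log10(chi) * scale)
        lut.append(max(lowest, min(highest, q)))
    return lut

if __name__ == "__main__":
    lut = build_lut4(M=2)
    print(len(lut), "entries,", 8 * len(lut), "bits")
    print("l = 127/128 ->", lut[127] / 2.0, "dB")
```

With $b_4 = b_5 = 8$ the table has $2^{b_4} = 256$ entries of $b_5 = 8$ bits, i.e. 2048 bits, which is the LUT #4 term $b_5 2^{b_4}$ of (184).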
Footnote 19: This is the value of the LUT output if the representation is of whole numbers (not necessarily the case, see Footnote 18). Generally, the idea is to use the lowest SNR expressible via the LUT's quantization.

There is a very strong relationship between Fig. 100 and Fig. 98. To see this, recall that to graph the inverse of any monotonic function, all one has to do is reflect the graph over the line $y = x$. Thus, if we reflect the curves of Fig. 98 over the line $y = x$ (and express the SNR axis in dB) then we arrive at Fig. 100.

4.14 Comparison of Estimation via $l^D_{M,N}$ to Estimation via the SER (with and without fading)

4.14.1 Number of symbols needed for estimation via $l^D_{M,N}$

To quantitatively measure the efficacy of the SNR estimator, following Sec. 4.4 we ask: What is the minimal number of estimation symbol intervals (which is $2N$) needed to ensure $P(|\hat{\gamma}^D_{dB} - \chi_{dB}| < tol) \ge C$ (for case (a)) or $P(|\hat{\bar{\gamma}}^D_{dB} - \bar{\chi}_{dB}| < tol) \ge C$ (for case (b)), where $tol$ is the tolerance and $C$ is the confidence? For example, some appropriate values of $tol$ and $C$ would be $tol$ = 1.5 dB and $C$ = 95%. Straightforward following of the development of (124) shows that the answer for case (a) is:

$$2N \ge \frac{2 \cdot 1.6 \cdot \left( \mathrm{erf}^{-1}(C) \right)^2}{\left( \min\left\{ \left| f_M^D((1-w_2)\cdot\chi) - f_M^D(\chi) \right|,\ \left| f_M^D((1+w_1)\cdot\chi) - f_M^D(\chi) \right| \right\} \right)^2} \qquad (204)$$

where $w_1 \triangleq 10^{tol/10} - 1$ and $w_2 \triangleq 1 - 10^{-tol/10}$. For case (b), the answer is:

$$2N \ge \frac{2 \cdot 1.6 \cdot \left( \mathrm{erf}^{-1}(C) \right)^2}{\left( \min\left\{ \left| \bar{f}_M^D((1-w_2)\cdot\bar{\chi}) - \bar{f}_M^D(\bar{\chi}) \right|,\ \left| \bar{f}_M^D((1+w_1)\cdot\bar{\chi}) - \bar{f}_M^D(\bar{\chi}) \right| \right\} \right)^2} \qquad (205)$$

We shall now find a comparison yardstick to which the results computed through (204) and (205) can be compared.

4.14.2 Number of symbols needed for SNR estimation via the SER

Following the discussion in Sec. 4.5.2 we note that many demodulators generate SNR estimates by measuring the pre- or post-decoder error rate. For example, this is done in systems that estimate the SNR from the number of errors detected in preambles, pilot symbols, or "training sequences" that are embedded in the data stream.
Therefore, as noted in Sec. 4.5.2, perhaps the most meaningful and universally applicable yardstick by which to measure the efficacy of SNR estimation via $l^D_{M,N}$ is comparison of (204) and (205) to the number of symbols needed for SNR estimation through measurement of the pre-decoder Symbol Error Rate (SER).

SNR estimation from the SER is based upon the principle that the SER is always a strictly monotonically decreasing function of the SNR (in words: the higher the SNR, the lower the SER). Suppose we denote the SER function as $h(\chi)$ or $\bar{h}(\bar{\chi})$; then if we measure the SER upon $L$ received symbols, and denote this measured SER as $S(L)$, an estimate of the SNR may be obtained via $h^{-1}(S(L))$. See Sec. 4.5.2 for a more thorough discussion of this point. To address the effects of fading we must differentiate between two cases:

Case (i): $L \cdot T \ll T_{COH}$: during the counting of errors encountered during the preceding $L$ received symbols, i.e. during the computation of $S(L)$, the channel SNR will not have changed much; hence SNR estimation from the SER will yield an estimate of the instantaneous SNR $\chi$.

Case (ii): $L \cdot T \gg T_{COH}$: since $L \cdot T$ is much larger than the channel coherence time, the distribution of SNR values encountered during the computation of $S(L)$ will follow $p_F(\chi \mid \bar{\chi})$, and SNR estimation from $S(L)$ will yield an estimate of the average SNR $\bar{\chi}$.

The symbol error rate for D-MPSK for case (i) is [118 eq. (3)]:

$$g_M^D(\chi) = \frac{1}{\pi} \int_0^{\pi - \pi/M} \exp\left( \frac{-\chi \sin^2(\pi/M)}{1 + \cos(\pi/M)\cos\xi} \right) d\xi \qquad (206)$$

The SER for D-MPSK when $L \cdot T \gg T_{COH}$ is accordingly

$$\bar{g}_M^D(\bar{\chi}) = \frac{1}{\pi} \int_0^{\infty} \int_0^{\pi - \pi/M} \exp\left( \frac{-\chi \sin^2(\pi/M)}{1 + \cos(\pi/M)\cos\xi} \right) d\xi\; p_F(\chi \mid \bar{\chi})\, d\chi \qquad (207)$$

For Nakagami-m fading an even simpler formula exists for $\bar{g}_M^D(\bar{\chi})$, in the form of an integral with finite limits (see [108 eqs.
(3), (13)], and also [116]):

$$\bar{g}_M^D(\bar{\chi}) = \frac{1}{\pi} \int_0^{\pi - \pi/M} \left( 1 + \frac{\sin^2(\pi/M)\,\bar{\chi}}{\left( 1 + \cos(\pi/M)\cos\xi \right) m} \right)^{-m} d\xi \qquad (208)$$

where the above equation was derived using the Moment Generating Function (MGF) method of SER computation (see [116], [108]). It is noted that the MGF-oriented method is quite useful for computer calculations, as it involves computing only a single integral over finite limits whose integrand is composed entirely of elementary functions. Graphs of $g_M^D(\chi)$ (for all fading types) and $\bar{g}_M^D(\bar{\chi})$ (for Nakagami-m fading) are given in Fig. 101 to Fig. 105.

[Fig. 101 to Fig. 105 plots: probability of symbol error (logarithmic axis) versus $E_s/N_0$ (dB) or $E[E_s/N_0]$ (dB), with curves for M = 2, 4, 8, 16.]

Fig. 101. Probability of symbol error $P_e = g_M^D(E_s/N_0)$ as a function of the $E_s/N_0$ ratio.

Fig. 102. Probability of symbol error $P_e = \bar{g}_M^D(\bar{\chi})$ as a function of $\bar{\chi} = E[E_s/N_0]$ for Nakagami-m fading with $m = 1$.

Fig. 103. Probability of symbol error $P_e = \bar{g}_M^D(\bar{\chi})$ as a function of $\bar{\chi} = E[E_s/N_0]$ for Nakagami-m fading with $m = 2$.

Fig. 104. Probability of symbol error $P_e = \bar{g}_M^D(\bar{\chi})$ as a function of $\bar{\chi} = E[E_s/N_0]$ for Nakagami-m fading with $m = 5$.

Fig. 105. Probability of symbol error $P_e = \bar{g}_M^D(\bar{\chi})$ as a function of $\bar{\chi} = E[E_s/N_0]$ for Nakagami-m fading with $m = 10$.
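Since both SER expressions are single integrals over finite limits with elementary integrands, they are easy to evaluate numerically. The sketch below is an illustration only: it assumes the forms $\frac{1}{\pi}\int_0^{\pi-\pi/M} \exp\left(-\chi \sin^2(\pi/M)/(1+\cos(\pi/M)\cos\xi)\right) d\xi$ for the unfaded SER and the same integral with the exponential replaced by the Nakagami-m MGF factor $(1 + \bar{\chi}s/m)^{-m}$ for the faded SER; the function name, the midpoint rule and the number of steps are hypothetical choices.

```python
import math

def ser_dmpsk(M, chi, m=None, steps=2000):
    """D-MPSK symbol error rate.

    m is None : unfaded SER at instantaneous Es/N0 = chi (linear).
    m given   : Nakagami-m average SER, chi being the average Es/N0.
    """
    g = math.sin(math.pi / M) ** 2
    c = math.cos(math.pi / M)
    upper = math.pi - math.pi / M
    h = upper / steps
    total = 0.0
    for i in range(steps):
        xi = (i + 0.5) * h                      # midpoint rule
        s = g / (1.0 + c * math.cos(xi))
        if m is None:
            total += math.exp(-chi * s)
        else:
            total += (1.0 + chi * s / m) ** (-m)
    return total * h / math.pi

if __name__ == "__main__":
    for M in (2, 4, 8, 16):
        print("M =", M,
              "no fading:", ser_dmpsk(M, 10.0),
              "Nakagami m=2:", ser_dmpsk(M, 10.0, m=2))
```

Two sanity checks are available for $M = 2$, where $\cos(\pi/M) = 0$: the unfaded result collapses to $\frac{1}{2}e^{-\chi}$, and the $m = 1$ (Rayleigh) result collapses to $1/(2(1+\bar{\chi}))$, both of which are the well-known DBPSK expressions.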
SNR estimation from the SER is done analogously to (182) and (183); namely, for case (i) we can estimate the instantaneous SNR via:

$$\hat{\eta}_{dB} = 10 \cdot \log_{10}\left( (g_M^D)^{-1}(S(L)) \right) \qquad (209)$$

while for case (ii) we estimate the average SNR through:

$$\hat{\bar{\eta}}_{dB} = 10 \cdot \log_{10}\left( (\bar{g}_M^D)^{-1}(S(L)) \right) \qquad (210)$$

For judging the efficacy of estimation via the SER, we ask: What is the minimal value of $L$ needed to ensure $P(|\hat{\eta}_{dB} - \chi_{dB}| < tol) \ge C$ (for case (i)) or $P(|\hat{\bar{\eta}}_{dB} - \bar{\chi}_{dB}| < tol) \ge C$ (for case (ii))? From a derivation similar to that which led to (134), we find that for case (i) (i.e., $L \cdot T \ll T_{COH}$) the answer to this question is:

$$L \ge \frac{2\, g_M^D(\chi) \left( 1 - g_M^D(\chi) \right) \left( \mathrm{erf}^{-1}(C) \right)^2}{\left( \min\left\{ \left| g_M^D((1-w_2)\cdot\chi) - g_M^D(\chi) \right|,\ \left| g_M^D((1+w_1)\cdot\chi) - g_M^D(\chi) \right| \right\} \right)^2} \qquad (211)$$

For case (ii) (i.e., $L \cdot T \gg T_{COH}$), the answer is:

$$L \ge \frac{2\, \bar{g}_M^D(\bar{\chi}) \left( 1 - \bar{g}_M^D(\bar{\chi}) \right) \left( \mathrm{erf}^{-1}(C) \right)^2}{\left( \min\left\{ \left| \bar{g}_M^D((1-w_2)\cdot\bar{\chi}) - \bar{g}_M^D(\bar{\chi}) \right|,\ \left| \bar{g}_M^D((1+w_1)\cdot\bar{\chi}) - \bar{g}_M^D(\bar{\chi}) \right| \right\} \right)^2} \qquad (212)$$

where $w_1 \triangleq 10^{tol/10} - 1$ and $w_2 \triangleq 1 - 10^{-tol/10}$.

4.14.3 Graphical Exhibition of Results

Graphs of (204) vs. (211) are given in Fig. 106, where we see that the proposed estimator often requires considerably fewer symbol intervals in order to arrive at an equally accurate estimate. The lowest SNRs for which results are given in Fig. 106 are rough thresholds $\Gamma_M$ defined as the SNRs at which $g_M^D(\Gamma_M) = 5 \cdot 10^{-2}$.

Fig. 106. Estimation via $l^D_{M,N}$ vs. estimation via the SER for $2NT \ll T_{COH}$ and $L \cdot T \ll T_{COH}$.

In Fig. 107 to Fig. 110 we present results for cases (b) and (ii), for Nakagami-m fading for various values of $m$. As can be seen in those figures, the advantage of the proposed technique is more pronounced for higher values of $m$, and it is easily seen that as $m$ increases the performance approaches that of case (a) vs. case (i), as shown in Fig. 106. This is explained by the fact that as $m \to \infty$ the Nakagami-m fading behaviour approximates a no-fading situation [13 Sec. 14.3]. Again, the lowest SNRs for which results are given in Fig. 107 to Fig.
110 are rough thresholds $\bar{\Gamma}_M$ defined as the average SNRs at which $\bar{g}_M^D(\bar{\Gamma}_M) = 5 \cdot 10^{-2}$.

It is important to note that we presented here graphs of case (a) and case (i) (in Fig. 106) and of case (b) and case (ii) (in Fig. 107 to Fig. 110). Theoretically, there could be situations where $L$ and $N$ are such that one would have to compare case (a) to case (ii) or case (b) to case (i). Such a comparison can be made by looking at the appropriate curves taken from Fig. 106 to Fig. 110.

[Fig. 107 to Fig. 110 plots: number of symbols required (logarithmic axis) versus $E[E_s/N_0]$ (dB), with curves for M = 2, 4, 8 via $l^D_{M,N}$ and via the SER; tol = 1.5 dB, C = 95%.]

Fig. 107. Estimation via $l^D_{M,N}$ vs. estimation via the SER, for $2NT \gg T_{COH}$ and $L \cdot T \gg T_{COH}$, for Nakagami-m fading with $m = 1$.

Fig. 108. Estimation via $l^D_{M,N}$ vs. estimation via the SER, for $2NT \gg T_{COH}$ and $L \cdot T \gg T_{COH}$, for Nakagami-m fading with $m = 2$.

Fig. 109. Estimation via $l^D_{M,N}$ vs. estimation via the SER, for $2NT \gg T_{COH}$ and $L \cdot T \gg T_{COH}$, for Nakagami-m fading with $m = 5$.

Fig. 110. Estimation via $l^D_{M,N}$ vs. estimation via the SER, for $2NT \gg T_{COH}$ and $L \cdot T \gg T_{COH}$, for Nakagami-m fading with $m = 10$.

4.14.4 Discussion of Results

The analysis of the results presented in Fig. 106 to Fig. 110 will proceed in a manner similar to that of Sec. 4.5.3. As can be seen by inspecting Fig. 106 to Fig. 110, the number of symbols needed for the estimation of the SNR from $l^D_{M,N}$ does not change significantly as a function of the fading characteristics and coherence time. In contrast, estimation via the SER is strongly affected by the fading characteristics and the coherence time.
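The comparison behind Fig. 106 can be illustrated, very roughly, for the unfaded case (case (a) vs. case (i)). The sketch below is not the computation used for the figures. It assumes the bound $2N \ge 2 \cdot 1.6 \cdot (\mathrm{erf}^{-1}(C))^2 / \delta_f^2$ for the proposed estimator and $L \ge 2g(1-g)(\mathrm{erf}^{-1}(C))^2 / \delta_g^2$ for the SER, where $\delta$ is the smaller change of the respective function over the tolerance interval; it replaces the exact $f_M^D$ by the high-SNR approximation $\exp(-M^2/(2\chi))$; and all function names are hypothetical.

```python
import math
from statistics import NormalDist

def erfinv(c):
    """Inverse error function via the standard normal quantile."""
    return NormalDist().inv_cdf((1.0 + c) / 2.0) / math.sqrt(2.0)

def f_approx(M, chi):
    """Assumed high-SNR stand-in for the exact expectation of l."""
    return math.exp(-M * M / (2.0 * chi))

def ser_dmpsk(M, chi, steps=4000):
    """Unfaded D-MPSK SER, midpoint rule on [0, pi - pi/M]."""
    g = math.sin(math.pi / M) ** 2
    c = math.cos(math.pi / M)
    h = (math.pi - math.pi / M) / steps
    total = sum(math.exp(-chi * g / (1.0 + c * math.cos((i + 0.5) * h)))
                for i in range(steps))
    return total * h / math.pi

def required_symbols(snr_db, M, tol_db=1.5, conf=0.95):
    """Return (symbols needed via l, symbols needed via the SER)."""
    chi = 10.0 ** (snr_db / 10.0)
    w1 = 10.0 ** (tol_db / 10.0) - 1.0
    w2 = 1.0 - 10.0 ** (-tol_db / 10.0)
    k = erfinv(conf) ** 2

    def gap(fn):
        # Smaller of the two changes of fn over the tolerance interval.
        centre = fn(M, chi)
        return min(abs(fn(M, (1.0 - w2) * chi) - centre),
                   abs(fn(M, (1.0 + w1) * chi) - centre))

    n_l = 2.0 * 1.6 * k / gap(f_approx) ** 2
    g = ser_dmpsk(M, chi)
    n_ser = 2.0 * g * (1.0 - g) * k / gap(ser_dmpsk) ** 2
    return n_l, n_ser

if __name__ == "__main__":
    for snr_db in (10, 15, 20):
        n_l, n_ser = required_symbols(snr_db, M=4)
        print(snr_db, "dB: via l ~ %.3g, via SER ~ %.3g" % (n_l, n_ser))
```

Because the SER falls off exponentially with the SNR while the expectation of $l^D_{M,N}$ saturates towards 1 only slowly, the SER-based count grows very quickly at high SNR under these assumptions, which is in line with the trend described for Fig. 106.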
a) Estimation latency analysis

Let us first treat operation at high SNR. At high SNR we observe that estimation via $l^D_{M,N}$ in general requires far fewer symbol intervals than estimation via the SER. This means that the proposed estimator can generate estimates much more rapidly than estimation via the SER. An exception to this rule can be seen in Fig. 107 (Nakagami-m fading with $m = 1$, i.e. Rayleigh fading), where we see that estimation via the SER in general requires fewer symbols at high SNR. Inspection of Fig. 107 to Fig. 110 shows that as the fading index $m$ increases, so does the advantage of the proposed estimator. For moderate and high $m$, as well as for case (a) ($2NT \ll T_{COH}$), we find that the proposed estimation method is much better than estimation via the SER, often by many orders of magnitude.

Now let us discuss low SNR operation. At low SNRs, we see that the proposed method requires about the same number of symbols as SNR estimation via the SER. Since it is often the case that the receiver spends most of its lifetime operating in the low-SNR region, one could argue that the advantage of the proposed method is minimal, since it requires about the same number of symbol intervals as estimation via the SER. This, however, ignores several key issues. First, consider the case of unknown data being transmitted. The only way by which an SER estimate can be obtained from unknown data is by obtaining an error rate estimate from a code-decode process [62]. This means that one must first code the transmitted data at the transmitter and then decode it at the receiver (e.g., using block codes or convolutional codes), and that in order to obtain an error rate estimate the receiver would compare the decoded data stream to the input data stream. This, however, implicitly assumes that the error correction decoder's output is completely error free, which is a fallacy at low SNRs.
Hence, at lower SNRs the SER estimate would be inherently unreliable, with this problem being more severe as the SNR decreases. Moreover, the error correction decoder (ECD) may not even be locked at low SNRs, hence precluding SER estimation in the first place. To combat this problem, known symbols can be sent over the channel (in the form of training sequences, pilot symbols, or preambles) and the error rate estimation can be done upon those symbols. This, however, introduces two problems. First, obviously, the channel throughput that is taken up by those symbols cannot be used to transmit data, i.e. a reduction in the channel's information throughput is incurred. Secondly, unless we are prepared to significantly reduce the information-bearing content of the channel, the known symbols must only be allowed to take up a small percentage of the data stream. If we call this percentage $P$ (e.g., $P$ = 10%), then the number of symbol intervals that we actually have to wait in order to arrive at the SER estimate is increased by a factor of $1/P$ over the quantities outlined in Fig. 106 to Fig. 110. For example, for $P$ = 10% we would need to multiply those quantities by a factor of 10, which clearly degrades the performance of estimation via the SER as compared to estimation via $l^D_{M,N}$. Therefore, we can say that the results presented in Fig. 106 to Fig. 110 are optimistic with regard to estimation via the SER, and that, consequently, the proposed method is superior at low SNRs as well.

b) Hardware complexity analysis

In terms of complexity, we note that estimation via the SER requires the implementation of error detection and accrual mechanisms, which often necessitate a non-trivial amount of hardware and/or software resources. This is in addition to an algorithm or lookup table that would translate the SER measurement into an SNR measurement.
In contrast, the proposed method is impervious to the content of the data stream, the coding method, and the error rate.

Regarding fixed-point implementation of the proposed estimator, we note that the value of $tol$ must, obviously, be larger than the minimum resolution achievable given the quantization. In Sec. 4.14.3 we assumed that enough quantization bits were used and we ignored quantization effects (a good assumption, considering Sec. 4.12). Moreover, it is easily shown that accurate fixed-point hardware estimation of the SNR from the SER would require an unfeasibly large LUT (due to the large dynamic range of the SER). Thus, including quantization effects would have favoured estimation via $l^D_{M,N}$ even more heavily.

Let us now delve even further into hardware complexity analysis. As noted in Section 4.11.1, in D-MPSK systems detection of the received symbols is often achieved via generating a pseudo-coherently demodulated M-PSK signal $u_n = r_n r^*_{n-1}$ (see [119 Sec. 6.5.3]). However, an equally valid detector would be via generation of the normalized pseudo-coherently demodulated M-PSK signal $v_n = \frac{r_n r^*_{n-1}}{|r_n|\,|r_{n-1}|}$. In fact, it is trivial to see that the latter has advantages in terms of the stability of the dynamic range of the pseudo-coherent constellation vis-a-vis the AGC's operation. Thus, implementation of $v_n = \frac{r_n r^*_{n-1}}{|r_n|\,|r_{n-1}|}$ in Fig. 96 obviates the need to generate the constellation $u_n = r_n r^*_{n-1}$, and, hence, it can be argued that the only real hardware penalty incurred by implementing the proposed estimator is the sequence LUT#3-IAD-LUT#4 (see Fig. 96), which is of the same order of complexity as the estimator of Part A, i.e., trivial.

4.15 Comparison to other estimators using the NMSE metric

In Sec. 4.13-4.14 we focused on comparisons of the proposed method versus estimation via the SER. As mentioned in Section 4.14, this is due to the fact that estimation via the SER is a prevalent and universally applicable SNR estimation method.
Other SNR estimators have been suggested in [59], [60], [61], [62], [63], [64], [65], [66], [67], [68], [69], [70], [71], [72], [73], [16] and [74]. In theory, these too could be applied to the pseudo-coherent constellation $v_n$ in order to produce an SNR estimate, although such a procedure is not immediate given that the noise statistics of $v_n$ differ from those of $r_n$. While in-depth comparison vs. all those estimators is impossible within the span of the current thesis, it is claimed that the proposed estimator possesses several qualitative advantages which would indicate a favourable outcome to such a comparison. Those qualitative advantages were highlighted in Sec. 4.6.1 and are equally applicable in the current case.

4.15.1 Comparison for $2NT \ll T_{COH}$

For completeness, we now present NMSE and normalized bias results vs. the M2M4 and SVR estimators. The comparison vs. the M2M4 estimator is particularly important because, as outlined in Sec. 4.6, the M2M4 estimator is a blind estimator operating on 1 sample/symbol, and is perhaps the best previously available estimator of this kind [60]. As noted in Sec. 4.6, the M2M4 and SVR estimators are impervious to the carrier phase, and therefore their performance results for D-MPSK are identical to those for M-PSK. In contrast, estimation from $l^D_{M,N}$ will have different performance than that of $l_{M,N}$, as we have already seen in Sec. 4.13-4.14.

Fig. 111. NMSE comparison of estimation via $l^D_{M,N}$ vs. the M2M4 estimator and the SVR estimator, $M = 2$, with $2N = 1024$ symbols used to compute each estimator.

Fig. 112. NMSE comparison of estimation via $l^D_{M,N}$ vs. the M2M4 estimator and the SVR estimator, $M = 4$, with $2N = 1024$ symbols used to compute each estimator.

Fig. 113. NMSE comparison of estimation via $l^D_{M,N}$ vs.
the M2M4 estimator and the SVR estimator, $M = 8$, with $2N = 1024$ symbols used to compute each estimator.

[Fig. 114 to Fig. 116 plots: normalized bias versus $E_s/N_0$ (dB), with curves for the proposed method, the M2M4 estimator and the SVR estimator.]

Fig. 114. Normalized bias comparison of estimation via $l^D_{M,N}$ vs. the M2M4 estimator and the SVR estimator, $M = 2$, with $2N = 1024$ symbols used to compute each estimator.

Fig. 115. Normalized bias comparison of estimation via $l^D_{M,N}$ vs. the M2M4 estimator and the SVR estimator, $M = 4$, with $2N = 1024$ symbols used to compute each estimator.

Fig. 116. Normalized bias comparison of estimation via $l^D_{M,N}$ vs. the M2M4 estimator and the SVR estimator, $M = 8$, with $2N = 1024$ symbols used to compute each estimator.

First, let us discuss the NMSE results. As can be seen in Fig. 111-Fig. 113, estimation via $l^D_{M,N}$ performs very respectably at medium and high SNR. At such SNRs, it is better than the SVR estimator and only slightly worse than the M2M4 estimator. We can see that at high SNR estimation via $l^D_{M,N}$ tends to a high-SNR bound that is 50% higher than that of estimation via the M2M4 estimator. This is not surprising, since $l^D_{M,N}$ operates on the pseudo-demodulated constellation, whose phase perturbation variance is higher than that of the original constellation upon which the M2M4 estimator operates (compare (192) to (33)). However, this disadvantage may be overcome if the $N$ used to compute $l^D_{M,N}$ is simply increased by 50%. Regarding operation at low SNRs, the NMSE may be decreased by increasing $N$, with the tradeoff being a longer estimation period. Such a tradeoff may be acceptable, since (as noted in Sec. 4.8 in comments that are easily applied here) an increase in $N$ causes a negligible increase in the complexity of computing $l^D_{M,N}$.
Thus, if the designer of the system for which SNR estimates are produced is more concerned about hardware efficiency than about estimation latency, estimation via $l^D_{M,N}$ for D-MPSK (or M-PSK in the case of lack of carrier synchronization) may be an attractive choice over the M2M4 estimator (for coherent M-PSK when the carrier PLL is locked, estimation via $l_{M,N}$ is a much better alternative to both, as seen in Sec. 4.6).

4.15.2 Comparison for $2NT \gg T_{COH}$

When $2NT \gg T_{COH}$, estimation is of the average SNR and that estimate is obtained through $\hat{\bar{\gamma}}^D_{dB} = 10 \cdot \log_{10}((\bar{f}_M^D)^{-1}(l^D_{M,N}))$ (see Sec. 4.13). As noted in Sec. 4.8, the M2M4 estimator performs very poorly under fading conditions [70], and in general more complicated methods based on the Viterbi algorithm [15] or the EM algorithm [70] have been suggested for SNR estimation in fading conditions. In order to nonetheless evaluate the effects of fading upon the proposed estimator's NMSE, simulation results are obtained for the case of Nakagami-m fading for various values of $m$, and the NMSE is compared to the NMSE without fading.

[Fig. 117 to Fig. 119 plots: NMSE (logarithmic axis) versus SNR (dB), with curves for m = 1, 2, 5, 10 and for no fading.]

Fig. 117. NMSE comparison of SNR estimation via $l^D_{M,N}$ for $M = 2$ and Nakagami-m fading. For each metric $2N = 1024$ symbols were used.

Fig. 118. NMSE comparison of SNR estimation via $l^D_{M,N}$ for $M = 4$ and Nakagami-m fading. For each metric $2N = 1024$ symbols were used.

Fig. 119. NMSE comparison of SNR estimation via $l^D_{M,N}$ for $M = 8$ and Nakagami-m fading. For each metric $2N = 1024$ symbols were used.

We see from Fig. 117-Fig.
119 that over much of the SNR range of interest an increase in the NMSE is observed (particularly at high SNR, although this is not particularly problematic: SNR estimates usually need not be very accurate at high SNRs, since only minor performance gains of coders/equalizers are achievable at high SNRs from precise knowledge of the SNR). As noted in Sec. 4.8, the NMSE can be reduced simply by increasing the number of symbols $2N$ that is used in order to compute the estimate, which will have a negligible effect on the estimator's complexity. We now compare the biases of estimation via $l^D_{M,N}$ under fading conditions to that which is obtained when fading is not present or when $2NT \ll T_{COH}$.

[Fig. 120 to Fig. 122 plots: normalized bias versus SNR (dB), with curves for m = 1, 2, 5, 10 and for no fading.]

Fig. 120. Normalized bias comparisons for SNR estimation via $l^D_{M,N}$ for $M = 2$ and Nakagami-m fading. For each metric $2N = 1024$ symbols were used.

Fig. 121. Normalized bias comparisons for SNR estimation via $l^D_{M,N}$ for $M = 4$ and Nakagami-m fading. For each metric $2N = 1024$ symbols were used.

Fig. 122. Normalized bias comparisons for SNR estimation via $l^D_{M,N}$ for $M = 8$ and Nakagami-m fading. For each metric $2N = 1024$ symbols were used.

As can be seen in Fig. 120-Fig. 122, the bias of the proposed estimator is very near the optimal value of 0 for all fading conditions and SNRs considered.
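For readers who want to experiment with the statistic itself, a small Monte Carlo sketch follows. It is only an illustration under stated assumptions and does not reproduce the simulations behind the figures: the channel is unfaded AWGN with unit-amplitude symbols, the per-sample statistic is taken as $\cos(M \arg(r_n r^*_{n-1}))$ (i.e. the real part of the $M$th power of the normalized product), the inversion uses the high-SNR approximation $\exp(-M^2/(2\chi))$ rather than the exact expectation (so its bias is not representative of Fig. 120 to Fig. 122), and every name and default value is hypothetical.

```python
import cmath
import math
import random

def simulate_l(M, snr_db, n_sym=1024, dw_T=0.0, theta=0.7, seed=1):
    """Average of cos(M * angle(r_n * conj(r_{n-1}))) over n_sym samples."""
    rng = random.Random(seed)
    chi = 10.0 ** (snr_db / 10.0)
    sigma = math.sqrt(1.0 / (2.0 * chi))   # noise std per I/Q component
    data_phase = 0.0
    prev = None
    acc = 0.0
    for n in range(n_sym + 1):
        # Differential encoding: each symbol advances the phase by a
        # random multiple of 2*pi/M.
        data_phase += 2.0 * math.pi * rng.randrange(M) / M
        noise = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        # theta is an arbitrary unknown carrier phase, dw_T a frequency
        # offset in radians per symbol.
        r = cmath.exp(1j * (data_phase + dw_T * n + theta)) + noise
        if prev is not None:
            acc += math.cos(M * cmath.phase(r * prev.conjugate()))
        prev = r
    return acc / n_sym

def estimate_snr_db(M, l):
    """Invert l = exp(-M^2/(2*chi)); returns None when l is not in (0, 1)."""
    if not 0.0 < l < 1.0:
        return None
    return 10.0 * math.log10(-M * M / (2.0 * math.log(l)))

if __name__ == "__main__":
    for true_db in (8.0, 10.0, 12.0, 15.0):
        l = simulate_l(2, true_db)
        print("true", true_db, "dB  l =", round(l, 4),
              " estimate =", estimate_snr_db(2, l))
```

Note that neither the carrier phase `theta` nor the transmitted data is used by the estimator, which is what makes it non-data-aided and independent of carrier synchronization; a small `dw_T` can be supplied to observe the $\cos(M\Delta\omega T)$ degradation discussed earlier.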
4.16 Application to SNR Estimation for M-PSK in the Absence of Carrier Synchronization

A very important observation is that the proposed SNR estimator can be used to provide an SNR estimate for coherent M-PSK without the need for carrier synchronization (which is a prerequisite for the estimator of Part A). We note, however, that in general SNR estimation for coherent M-PSK via $l^D_{M,N}$ requires more symbol intervals than estimation via $l_{M,N}$ (see Fig. 69 to Fig. 71, Fig. 85 to Fig. 88, Fig. 106 to Fig. 110). Thus, the estimator of Part A should be used when carrier synchronization has been achieved. It is emphasized that for SNR estimation via $l^D_{M,N}$ to be done on an M-PSK signal in the absence of carrier synchronization, there is no need to differentially code or decode the datastream (since the estimator is NDA). Finally, we note that for estimation via $l^D_{M,N}$ to work, the same limitation on the carrier frequency error exists as was the case for D-MPSK, namely that $|\Delta\omega| \ll 2\pi/(M \cdot T)$. This is not a very problematic limitation: for the carrier PLL to lock, the frequency error must be within the lock range of the PLL, which is $\pm\Delta\omega_L = \pm 2\zeta\omega_n$ (see [103 Chap. 2]), and since in typical PLLs $\omega_n \ll 1/T$ and $0.7 < \zeta < 1.3$ (see Sec. 2.4 and [25 Chap. 7]) we then typically have that $|\Delta\omega| < 2\zeta\omega_n \ll 2\pi/(M \cdot T)$ when the carrier PLL is near or within its lock range. However, when the carrier PLL is in search mode, i.e. the VCO or NCO is made to scan the potentially much wider frequency uncertainty, the condition $|\Delta\omega| \ll 2\pi/(M \cdot T)$ might not hold, and care must be exercised if SNR estimation from $l^D_{M,N}$ is attempted.

4.17 Conclusions

In this chapter, we started off in Part A by presenting an analysis of a new method of $E_s/N_0$ estimation for M-PSK receivers, and quantitative formulas describing its performance were developed. It was found that the proposed method has several quantitative and qualitative advantages with respect to previously available methods.
These include: independence from the received data, no reliance on symbol decisions or error detection, resilience to AGC imperfections, a simple and compact fixed-point hardware implementation, and, perhaps most importantly, the need for only a relatively small number of symbols to arrive at an accurate estimate (as compared to SNR estimation via the SER). This method is therefore particularly well-suited for real-time estimate generation in M-PSK receivers. We presented quantitative NMSE comparisons vs. the M2M4 and SVR estimators and found that the proposed estimator has respectable and often better performance in comparison. We also provided quantitative results for the performance of the proposed estimator in Nakagami-m fading.

In the second part of this chapter we presented a new Non-Data-Aided SNR estimator for D-MPSK operating at 1 sample/symbol. It was found that the estimator has a simple fixed-point hardware implementation that can be easily implemented within contemporary FPGAs or ASICs. Quantitative comparisons were conducted vs. the M2M4 and SVR estimators as well as estimation from the SER. These results included investigation of the estimator's performance in Nakagami-m fading. In terms of the speed and accuracy of the estimator, it was shown that it generally performs much better than estimation via the SER and also possesses several qualitative and quantitative implementational advantages. This estimator was also compared in the NMSE sense to the M2M4 and SVR estimators, and was found to be a competitive estimator (though less so than the estimator discussed in Part A). For all of the preceding reasons, the SNR estimation method proposed in Part B of this chapter has immediate applications in contemporary D-MPSK communications systems as well as in SNR estimation for M-PSK systems when carrier synchronization has not yet been achieved.
In a coherent M-PSK receiver, the SNR estimator of Part A of this chapter is an attractive choice once carrier synchronization has been achieved. As shown, the estimator of Part A uses fewer symbol intervals than the estimator of Part B and thus should be preferred over the latter when the carrier PLL is locked. However, if the carrier PLL is unlocked, it was shown that the signal's SNR can be estimated using the estimator of Part B. Hence, to obtain an SNR estimate in coherent M-PSK receivers during both the tracking and acquisition operation modes of the carrier PLL, the M-PSK receiver would use the estimator of Part A when the carrier PLL is locked, and the estimator of Part B when the carrier PLL is unlocked.

Chapter 5 Conclusions and Future Work

5.1 Summary of Contributions

In this thesis, we presented new structures for lock detection, phase detection, and SNR estimation in M-PSK receivers. In Chapter 2 we presented a new type of self-normalizing lock detector for carrier synchronization PLLs in M-PSK receivers. This was followed in Chapter 3 by the presentation of two new families of phase detectors, and in Chapter 4 two new SNR estimation structures were defined and analyzed.

A recurring theme in all of the structures is that they are self-normalizing. Sometimes, when the normalization term is omitted, the proposed structure reduces to a previously known structure. For example, the normalized Mth-order nonlinearity phase detector presented, $d_M(n) = \mathrm{Im}[(I(n) + jQ(n))^M] / (I^2(n) + Q^2(n))^{M/2}$, reduces to the non-normalized Mth-order nonlinearity phase detector $c_M(n) = \mathrm{Im}[(I(n) + jQ(n))^M]$ when the denominator term is omitted. Similar observations can be made with regard to the lock detector presented in Chapter 2, which reduces to the Mth-order nonlinearity lock detector when the normalization term is omitted. Despite their relative simplicity, it was shown that the normalization factors have a profound effect upon the proposed structures' behaviour and implementation.
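The effect of the normalization term can be sketched numerically. The Python fragment below is a minimal illustration, assuming the normalized detector has the form Im[(I + jQ)^M] / (I^2 + Q^2)^(M/2) and, for simplicity, a noiseless constellation placed at angles 2*pi*m/M (no pi/M offset); the function names, the test values, and the zero-magnitude guard are hypothetical choices and not part of the thesis.

```python
import numpy as np


def normalized_detector(i, q, m):
    """Im[(I + jQ)^M] / (I^2 + Q^2)^(M/2), computed on the unit-magnitude sample."""
    z = np.asarray(i, dtype=float) + 1j * np.asarray(q, dtype=float)
    mag = np.abs(z)
    safe = np.where(mag > 0.0, mag, 1.0)  # avoid dividing by zero
    # Assumption: a zero-magnitude sample yields 0 (this case is not specified in the text).
    return np.where(mag > 0.0, np.imag((z / safe) ** m), 0.0)


def non_normalized_detector(i, q, m):
    """Im[(I + jQ)^M]: the same numerator with the denominator omitted."""
    z = np.asarray(i, dtype=float) + 1j * np.asarray(q, dtype=float)
    return np.imag(z ** m)


M = 4
phase_error = 0.05  # radians, illustrative residual carrier phase error
symbols = np.exp(1j * (2 * np.pi * np.arange(M) / M + phase_error))

for gain in (1.0, 3.0):  # two hypothetical AGC operating points
    i, q = gain * symbols.real, gain * symbols.imag
    print(gain, normalized_detector(i, q, M), non_normalized_detector(i, q, M))
```

In this sketch the normalized output depends only on the phase error, whereas the non-normalized output scales with the Mth power of the channel gain, which is one way to see why the dynamic range, and hence the fixed-point word length, grows without normalization.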
It was shown that (in contrast to non-normalized structures) the proposed structures are very resilient vis-a-vis the AGC's operating point and performance. Furthermore, the normalization reduces the dynamic range of the proposed structures, which allows them to be efficiently implemented in fixed-point hardware, an issue that was given specific attention in this thesis. These efficient implementations contrast with the considerably more complicated implementations of previously available structures, a subject that was also discussed in this thesis.

As an additional advantage, it was found that the proposed structures are interrelated, and often these interrelationships can aid in the theoretical analysis and can be exploited in order to achieve better performance. For example, the lock detector in Chapter 2 was shown to be a good estimator of the gain of the phase detector of Sec. 3.4, and this was exploited in Sec. 3.6 in order to construct a constant-gain phase detector. As another example, the SNR estimator of Part A in Chapter 4 is based upon the lock detector of Chapter 2, and in turn the SNR estimator of Part B of Chapter 4 is based upon an enhancement of the SNR estimator of Part A of Chapter 4. These interrelationships were exploited throughout Chapter 4 in order to arrive at qualitative and quantitative results.

It can be said that there are two unifying threads in this thesis. The first is that all the structures presented have a compact and practical implementation. Indeed, a thesis in an engineering discipline has only limited value if it cannot be applied in the real world. Hence, special emphasis was placed on finding and analyzing practical implementations for the proposed structures. The second unifying thread of this thesis has been the effort to achieve results that will be useful to the practicing engineer on an intuitive level.
This has been the motivation, for example, in finding and analyzing the closed-form approximate value of the expectation of the lock detector of Chapter 2, namely $f_M(\chi) \approx \exp(-M^2/(4\chi))$ (eq. (35)), which can be computed by hand by the engineer and nonetheless serves as a very accurate predictor of the lock-detector value. A similar desire for practical usefulness motivated the derivation and analysis of the approximate results given in Chapter 3 and Chapter 4, and the inclusion of sections that treat the subject matter on an intuitive level, such as Sec. 1.5, 2.3.6, and 3.4.10. Moreover, the choice of quantitative comparison metrics in Chapter 4 (see Sec. 4.5, Sec. 4.14) was specifically motivated by the usefulness of such comparisons in a practical engineering situation, as explained in Sec. 4.4. Although this thesis includes exact results for all of the structures analyzed, the approximate but simple results are arguably more important in an engineering environment, since good but conceptually complicated structures are often rejected at the design stage because engineers often prefer to use inferior but tractable structures. Hence, it is hoped that the inclusion of approximate but simple expressions in this thesis will facilitate the proposed structures' adoption by the grassroots engineering community, so that they will not remain solely in the realm of academia.

5.2 Future Work

5.2.1 Future Research: Analysis during Lack of Symbol Synchronization

There is significant and potentially fruitful terrain for future research based upon this thesis. One obvious avenue of such research is the investigation of the proposed structures when the symbol synchronization PLL is unlocked. It can be shown through simulations that the proposed structures will indeed work when the symbol synchronization PLL is unlocked, but that the lack of symbol synchronization will cause significant changes in the structures' statistical properties.
A complicating factor is that it is easily seen, even intuitively, that when the receiver lacks symbol synchronization the value of the proposed structures is highly dependent upon the symbols' baseband pulse shape. This contrasts with the situation discussed in this thesis, in which it was shown that the performance of the proposed structures when the symbol PLL is locked is independent of the baseband pulse shape so long as the post-matched-filter pulse shape conforms to the Nyquist criterion for zero ISI [13 Sec. 9.2.1]. Nonetheless, the path to analyzing the performance of the proposed structures in the absence of symbol synchronization is relatively straightforward, at least conceptually.

5.2.2 Future Research and Continuing Research: A Constant-Gain Detector during Both Tracking and Acquisition

An almost obvious extension of the work in this thesis is the construction of a phase detector that has constant gain during both tracking and acquisition. The principle underlying this detector is simple. In Sec. 3.6 we estimated the gain of $d_M(n)$ using $M \cdot l_{M,N}$ and achieved a constant-gain detector during tracking via $d_M(n)/(M \cdot l_{M,N})$. To achieve a constant-gain detector during acquisition, i.e. when the carrier is unlocked, we can employ a similar procedure by estimating (see footnote 20) the gain of $d_M(n)$ using $M \cdot f_M\big((f^{\circ}_M)^{-1}(l^{\circ}_{M,N})\big)$ and then arriving at a constant-gain detector during acquisition via $d_M(n)/\big[M \cdot f_M\big((f^{\circ}_M)^{-1}(l^{\circ}_{M,N})\big)\big]$. This has the potential of allowing the carrier PLL to perform optimally not only during tracking, but also during acquisition. This is investigated in depth in [104].

5.2.3 Future and Continuing Research: Symbol Synchronization PLL Structures

Another proven avenue of research based upon the current thesis is the investigation of similar self-normalizing structures for the symbol timing synchronization PLL.
a) Revised system model accounting for lack of symbol timing synchronization

In order to give an outline of this field, we first modify the system model given in Sec. 1.4 in order to account for the possible lack of symbol synchronization. The revised model is as follows. The baseband M-PSK signal before modulation is defined as

$$m(t) = \sum_{r=-\infty}^{\infty} \exp(j\theta_r)\, p(t - rT)$$

where $1/T$ is the symbol rate, $\theta_r = 2\pi m_r/M + \eta_M \pi/M$, $m_r \in \{0, 1, \ldots, M-1\}$, and $\eta_M = \{1 \text{ if } M \neq 2,\ 0 \text{ if } M = 2\}$. We use the notation $\tau_i$ to signify the signal's propagation delay. At the input of the I-Q demodulator the IF signal is $s_m(t) = \mathrm{Re}[m(t - \tau_i)\exp(j\omega_i t + j\theta_i)]$, and that signal is corrupted by AWGN. The revised M-PSK model, which includes the possible effects of the lack of symbol synchronization, is shown in Fig. 123, where:

1. $1/T_s = 2/T$ is the sample rate.
2. $n(t) \sim N(0, N_0 W)$, where $W$ is the width of the bandpass IF filter before the I-Q demodulator (not shown).
3. We assume a narrowband bandpass signal (i.e. $\omega_i \gg 1/T$) and that the Nyquist criterion for zero ISI [13 Sec. 9.2.1] is obeyed regarding the output of the matched filters.
4. $K_I$ and $K_Q$ are the equivalent gains associated with the circuit, and are a slow function of time controlled by the AGC circuit (the AGC's purpose is to ensure that the dynamic range of the samplers is utilized yet the samplers are not saturated). For simplicity in this thesis we assumed that these gains are equal, i.e. $K_I = K_Q = K$ (which is usually the case), but this is not a necessary requirement for the symbol-PLL structures that we shall shortly outline.
5. The matched filter $h(t) = p(-t)$ is assumed ideal.
6. When the carrier loop is locked we have $\Delta\omega = 0$ and (since M-PSK carrier synchronization has an inherent M-fold phase ambiguity ([22 Chap. 5, 6], [21 Sec. 5-7])) $\theta_o \in \{\theta_i + 2\pi k/M - \theta_e \mid k = 0, 1, \ldots, M-1\}$, where $|\theta_e| < \pi/M$ is the residual carrier phase error.

Footnote 20: In the general discussion in this section we assume that no fading is present. However, in [104] fading effects are treated.

[Fig. 123 block diagram: the IF input $\mathrm{Re}[m(t - \tau_i)\exp(j(\omega_i t + \theta_i))] + n(t)$ is mixed with the local carriers $2\cos(\omega_i t + \Delta\omega t + \theta_o)$ and $-2\sin(\omega_i t + \Delta\omega t + \theta_o)$, passed through the matched filters $h(t)$, and sampled at the rate $1/T_s = 2/T$ to produce the I and Q channels. The carrier synchronization PLL (carrier phase detector, loop filter, NCO/VCO) generates the local carrier, and the symbol synchronization PLL (symbol timing error detector, loop filter, NCO/VCO) generates the sampling clock.]
Fig. 123. General structure of a coherent M-PSK receiver showing both the carrier and symbol PLLs.

The notation $\hat{\tau}_i$ is employed to refer to the receiver's estimate of $\tau_i$. The symbol synchronization timing error is defined as $\tau = [\tau_i - \hat{\tau}_i] \bmod T$, with $\tau \in [-T/2, T/2]$. The even samples of the channels are then:

$$I_e(n) = I(t)\big|_{t = 2nT_s + \hat{\tau}_i} \quad \text{and} \quad Q_e(n) = Q(t)\big|_{t = 2nT_s + \hat{\tau}_i} \qquad (213)$$

and the odd samples are:

$$I_o(n) = I(t)\big|_{t = (2n+1)T_s + \hat{\tau}_i} \quad \text{and} \quad Q_o(n) = Q(t)\big|_{t = (2n+1)T_s + \hat{\tau}_i} \qquad (214)$$

It is worth noting that under perfect symbol synchronization conditions (that is, $\hat{\tau}_i = \tau_i$), the even samples correspond to the peaks of the symbols, and the odd samples correspond to the transitions between symbols.

b) Symbol PLL lock detection and SNR estimation

We can apply the principle of self-normalization used in this thesis in order to arrive at a new lock detector for the symbol PLL that is based upon the non-normalized detector of Karam et al. [120], which is (detector B in [120])

$$\frac{1}{N}\sum_{n=1}^{N}\left[\left(I_e^2(n) - I_o^2(n)\right) + \eta_M\left(Q_e^2(n) - Q_o^2(n)\right)\right].$$

We define the normalized symbol PLL lock detector as:

$$s_N = \frac{1}{N}\sum_{n=1}^{N}\left[\frac{I_e^2(n) - I_o^2(n)}{I_e^2(n) + I_o^2(n)} + \eta_M\,\frac{Q_e^2(n) - Q_o^2(n)}{Q_e^2(n) + Q_o^2(n)}\right] \qquad (215)$$

Since the lock detector presented in (215) is a one-to-one function of the SNR, the SNR can be estimated from the lock detector value, just as an SNR estimate was generated from the carrier PLL lock metric value in Chapter 4 of this thesis. This is explored in [121].
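To make the structure of the normalized symbol-PLL lock metric concrete, here is a minimal Python sketch. It assumes the metric is the average over n of (Ie^2 - Io^2)/(Ie^2 + Io^2), plus the corresponding Q-channel term when M > 2; the function name, the zero-denominator guard, and the noiseless alternating-BPSK test signal are hypothetical illustration choices rather than anything specified in the thesis.

```python
import numpy as np


def symbol_lock_metric(i_e, i_o, q_e, q_o, m):
    """Average of the normalized even/odd power differences (Q branch used for M > 2 only)."""
    eta = 0.0 if m == 2 else 1.0

    def branch(even, odd):
        e2 = np.asarray(even, dtype=float) ** 2
        o2 = np.asarray(odd, dtype=float) ** 2
        den = e2 + o2
        safe = np.where(den > 0.0, den, 1.0)  # avoid dividing by zero
        # Assumption: a sample pair with zero total power contributes 0.
        return np.where(den > 0.0, (e2 - o2) / safe, 0.0)

    return float(np.mean(branch(i_e, i_o) + eta * branch(q_e, q_o)))


# Noiseless alternating BPSK: peaks are +/-1, transitions between opposite symbols are 0.
M = 2
peaks = np.array([1.0, -1.0] * 8)
transitions = np.zeros_like(peaks)
q_unused = np.zeros_like(peaks)

locked = symbol_lock_metric(peaks, transitions, q_unused, q_unused, M)
half_symbol_off = symbol_lock_metric(transitions, peaks, q_unused, q_unused, M)
scaled = symbol_lock_metric(5.0 * peaks, 5.0 * transitions, q_unused, q_unused, M)
print(locked, half_symbol_off, scaled)
```

In this idealized case the metric is at its maximum when the even samples fall on the symbol peaks, flips sign when the timing is off by half a symbol, and is unchanged by a common gain factor, which mirrors the self-normalization property emphasized throughout the thesis. With noise and a realistic pulse shape the values would lie strictly inside these extremes.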
Unlike the estimators of Chapter 4, which were independent of the post-matched-filter pulse shape so long as the latter conformed to the Nyquist criterion for zero ISI, estimators based upon the symbol PLL lock detector are dependent upon the pulse shape. Define

$$f_{M,p}(\chi) = E\left[s_N \mid E_s/N_0 = \chi,\ \text{Tx pulse shape is } p(t)\right] \qquad (216)$$

then the SNR can be estimated via:

$$\hat{\chi}_{dB} = 10\log_{10}\left(f_{M,p}^{-1}(s_N)\right) \qquad (217)$$

The definition and investigation of this symbol-PLL lock detector and SNR estimator structure was done in [122], [123], and [121] (in [122] and [123] only BPSK and QPSK are considered, though extension to operation for M > 4 is straightforward and is discussed in [121]). An efficient fixed-point hardware implementation of $s_N$ is shown in Fig. 124. Obviously, for actual symbol PLL lock detection to occur, $s_N$ must be compared ([122], [123]) to a lock threshold (not shown in Fig. 124).

[Fig. 124 block diagram: lookup tables compute $(I_e^2(n) - I_o^2(n))/(I_e^2(n) + I_o^2(n))$ (branch used for all M) and $(Q_e^2(n) - Q_o^2(n))/(Q_e^2(n) + Q_o^2(n))$ (branch used for M > 2 only); an integrate-and-dump averager sums 2N samples and disregards the lower $\log_2(2N)$ bits; a final lookup table computes $10\log_{10}(f_{M,p}^{-1}(\cdot))$.]
Fig. 124. Efficient fixed-point hardware implementation of the symbol synchronization PLL lock detector $s_N$ ([122], [123], [121]) and associated SNR estimation method [121].

c) Timing error detectors for the symbol PLL

Using the principle of self-normalization, we can derive new Timing Error Detectors (TEDs) for the symbol PLL (for an illustration of where the TED operates in the M-PSK receiver, see Fig. 123). For example, we can start off from the Gardner TED [42], defined for QPSK as:

$$g_{QPSK}(n) = \left(I_e(n) - I_e(n-1)\right) I_o(n-1) + \left(Q_e(n) - Q_e(n-1)\right) Q_o(n-1) \qquad (218)$$

Using normalization, we arrive at a new detector:

$$\tilde{g}_{QPSK}(n) = \frac{I_e(n)\, I_o(n-1)}{I_e^2(n) + I_o^2(n-1)} - \frac{I_e(n-1)\, I_o(n-1)}{I_e^2(n-1) + I_o^2(n-1)} + \frac{Q_e(n)\, Q_o(n-1)}{Q_e^2(n) + Q_o^2(n-1)} - \frac{Q_e(n-1)\, Q_o(n-1)}{Q_e^2(n-1) + Q_o^2(n-1)} \qquad (219)$$

This detector is investigated in [43], where it is shown that it possesses significant implementational and performance advantages vis-a-vis (218). Another detector can be derived from the Mueller & Muller decision-directed (DD) detector (M&M) [34] (1 sample/symbol), for example for BPSK:

$$MM(n) = \mathrm{sign}(I_e(n-1))\, I_e(n) - \mathrm{sign}(I_e(n))\, I_e(n-1) \qquad (220)$$

and the corresponding normalized detector is:

$$\widetilde{MM}(n) = \frac{\mathrm{sign}(I_e(n-1))\, I_e(n) - \mathrm{sign}(I_e(n))\, I_e(n-1)}{\sqrt{I_e^2(n-1) + I_e^2(n)}} \qquad (221)$$

This detector, as well as a normalized version of the Decision Directed Gardner detector, was analyzed in [37], which also includes the definition, analysis, and results for the new normalized detectors for M > 2. Once again, as is shown in [37], the normalized detectors have many performance and implementation advantages.

5.3 Final Remarks

As was shown in this concluding chapter, the structures proposed in this thesis have immediate practical applications in current M-PSK receivers. Moreover, the principles used to derive these new structures can be applied to the symbol timing synchronization PLLs, where similar performance and implementation advantages are observed. A promising avenue of future work is the investigation of the structures proposed in this thesis when the symbol synchronization PLL is unlocked. In short, the structures and methods proposed in this thesis have immediate applications as well as significant promise of future discovery.

References

[1] Y. Linn, "A Methodical Approach to Hybrid PLL Design for High-Speed Wireless Communications," in Proc. 8th IEEE Wireless and Microwave Technology Conf. (WAMCON2006), Clearwater, FL, Dec. 4-5, 2006.
[2] K. Young-Wan, C. Jong-Suk, and P. Dong-Chul, "Circuit design and performance analysis of carrier recovery loop for digital DBS system in the presence of phase noise," IEEE Trans. Broadcasting, vol. 45, no. 3, pp. 294-302, Sep. 1999.
[3] ETSI (European Telecommunications Standards Institute), "DVB-S2 Technical Report ETSI TR 102 376 V1.1.1," Feb. 2005.
[4] ETSI (European Telecommunications Standards Institute), "Implementation guidelines for DVB terrestrial services: Technical Report ETSI TR 101 190 V1.2.1," Nov. 2004.
[5] ETSI (European Telecommunications Standards Institute), "DVB-H Implementation Guidelines: Technical Report ETSI TR 102 377 V1.2.1," Nov. 2005.
[6] B. O'Hara and A. Petrick, The IEEE 802.11 handbook: a designer's companion, 2nd ed. New York: IEEE, 2004.
[7] J. P. K. Gilb, Wireless multimedia: a guide to the IEEE 802.15.3 standard. New York, NY: Standards Information Network, IEEE Press, 2004.
[8] IEEE 802.16 Working Group, "IEEE Standard for Local and Metropolitan Area Networks Part 16: Air Interface for Fixed Broadband Wireless Access Systems (IEEE Std 802.16-2004)," Oct. 2004.
[9] W. P. Osborne and B. T. Kopp, "An analysis of carrier phase jitter in an M-PSK receiver utilizing MAP estimation," in Proc. MILCOM '93, Boston, MA, USA, 1993, pp. 465-470.
[10] M. M. J. L. van de Kamp, "Climatic radiowave propagation models for the design of satellite communication systems," Ph.D. Thesis, Technische Universiteit Eindhoven, The Netherlands, 1999.
[11] R. M. Gagliardi, Satellite communications. Second edition: Van Nostrand Reinhold, 1991.
[12] T. A. Summers and S. G. Wilson, "SNR mismatch and online estimation in turbo decoding," IEEE Trans. Commun., vol. 46, no. 4, pp. 421-423, Apr. 1998.
[13] J. G. Proakis, Digital communications, 4th ed. Boston: McGraw-Hill, 2001.
[14] ETSI (European Telecommunications Standards Institute), "DVB-S2 Technical Report ETSI TR 102 376 V1.1.1," 2005.
[15] K. Balachandran, S. R. Kadaba, and S. Nanda, "Channel quality estimation and rate adaptation for cellular mobile radio," IEEE Journal on Selected Areas in Communications, vol. 17, no. 7, pp. 1244-1256, Jul. 1999.
[16] D. R. Pauluzzi, "Signal-to-noise ratio and signal-to-impairment ratio estimation in AWGN and wireless channels," Ph.D. Thesis, Queen's University, Kingston, Ontario, Canada, Sept. 1997.
[17] R. G. Lyons, Understanding digital signal processing, 2nd ed.
NJ: Prentice Hall, 2004.
[18] Y. Linn, "A Tutorial on Hybrid PLL Design for Synchronization in Wireless Receivers," in Proc. International Seminar: 15 Years of Electronic Engineering, Universidad Pontificia Bolivariana, Bucaramanga, Colombia, Aug. 15-19, 2006 (invited paper).
[19] M. K. Simon, S. M. Hinedi, and W. C. Lindsey, Digital communication techniques. NJ: Prentice Hall, 1995.
[20] M. K. Simon and D. Divsalar, "Some new twists to problems involving the Gaussian probability integral," IEEE Trans. Commun., vol. 46, no. 2, pp. 200-210, Feb. 1998.
[21] U. Mengali and A. N. D'Andrea, Synchronization techniques for digital receivers. NY: Plenum Press, 1997.
[22] H. Meyr, M. Moeneclaey, and S. Fechtel, Digital communication receivers: synchronization, channel estimation, and signal processing. NY: Wiley, 1998.
[23] P. K. Vitthaladevuni and M. S. Alouini, "Effect of imperfect phase and timing synchronization on the bit-error rate performance of PSK modulations," IEEE Trans. Commun., vol. 53, no. 7, pp. 1096-1099, Jul. 2005.
[24] W. Lindsey and M. Simon, "The Effect of Loop Stress on the Performance of Phase-Coherent Communication Systems," IEEE Trans. Commun., vol. 18, no. 5, pp. 569-588, Oct. 1970.
[25] F. M. Gardner, Phaselock techniques, 2nd ed. NY: Wiley, 1979.
[26] W. C. Lindsey and M. K. Simon, Telecommunication systems engineering. NJ: Prentice-Hall, 1973.
[27] S. Haykin, Communication systems, 2nd ed. NY: Wiley, 1983.
[28] D. R. Stephens, Phase-locked loops for wireless communications: digital, analog, and optical implementations, 2nd ed. Boston: Kluwer Academic, 2002.
[29] D. H. Wolaver, Phase-locked loop circuit design. NJ: Prentice Hall, 1991.
[30] E. C. Ifeachor and B. W. Jervis, Digital signal processing: a practical approach, 2nd ed. NY: Prentice Hall, 2002.
[31] A. V. Oppenheim and R. W. Schafer, Discrete-time signal processing. NJ: Prentice Hall, 1989.
[32] F. M. Gardner, "Interpolation in digital modems. I. Fundamentals," IEEE Trans. Commun., vol. 41, no. 3, pp. 501-507, Mar. 1993.
[33] L. Erup, F. M. Gardner, and R. A.
Harris, "Interpolation in digital modems. II. Implementation and performance," IEEE Trans. Commun., vol. 41, no. 6, pp. 998-1008, Jun. 1993.
[34] K. H. Mueller and M. Muller, "Timing recovery in digital synchronous data receivers," IEEE Trans. Commun., vol. 24, no. 5, pp. 516-530, May 1976.
[35] W. G. Cowley and L. P. Sabel, "The performance of two symbol timing recovery algorithms for PSK demodulators," IEEE Trans. Commun., vol. 42, no. 6, pp. 2345-2355, Jun. 1994.
[36] M. Moeneclaey and T. Batsele, "Carrier-independent NDA symbol synchronization for M-PSK, operating at only one sample per symbol," in Proc. GLOBECOM '90, San Diego, CA, USA, 1990, pp. 594-598.
[37] Y. Linn, "Two new decision directed M-PSK timing error detectors," in Proc. 18th Canadian Conference on Electrical and Computer Engineering (CCECE'05), Saskatoon, SK, Canada, May 1-4, 2005, pp. 1759-1766.
[38] D. Verdin and T. C. Tozer, "Symbol-timing recovery for M-PSK modulation schemes using the signum function," in Proc. IEE Colloquium on New Synchronisation Techniques for Radio Systems (Digest No.1995/220), London, UK, 1995, pp. 2/1-2/7.
[39] D. Verdin, "Synchronization in sampled receivers for narrowband digital modulation schemes," Ph.D. Thesis, in Dept. of Electrical Engineering, University of York, United Kingdom, 1996.
[40] L. P. Sabel, "A Maximum Likelihood Approach to Symbol Timing Recovery in Digital Communications," Ph.D. Thesis, in School of Electronic Engineering, University of South Australia, Adelaide, Australia, 1993.
[41] Y. Linn, "Synchronization and Receiver Structures in Digital Wireless Communications (workshop notes)," in International Seminar: 15 Years of Electronic Engineering. Universidad Pontificia Bolivariana, Bucaramanga, Colombia, Aug. 15-19, 2006.
[42] F. M. Gardner, "A BPSK/QPSK timing-error detector for sampled receivers," IEEE Trans. Commun., vol. 34, no. 5, pp. 423-429, May 1986.
[43] Y.
Linn, "A new NDA timing error detector for BPSK and QPSK with an efficient hardware implementation for ASIC-based and FPGA-based wireless receivers," in Proc. 2004 IEEE Intl. Symp. on Circuits and Systems (ISCAS'04), Vancouver, BC, Canada, May 23-26, 2004, pp. IV:465-468.
[44] D. Verdin and T. C. Tozer, "Symbol timing recovery scheme tolerant to carrier phase error," Electronics Letters, vol. 30, no. 2, pp. 116-117, Jan. 1994.
[45] R. L. Peterson, R. E. Ziemer, and D. E. Borth, Introduction to spread-spectrum communications. NJ: Prentice Hall, 1995.
[46] Analog Devices, "AD9851 Datasheet, Rev. C," retrieved from www.analog.com.
[47] J. D. Gibson (editor), "The communications handbook," 2nd ed. Boca Raton, FL: CRC Press, 2002.
[48] A. Blanchard, Phase-locked loops. Application to coherent receiver design. NY: Wiley, 1976.
[49] H. Meyr and G. Ascheid, Synchronization in digital communications. NY: Wiley, 1990.
[50] A. Mileant and S. Hinedi, "Lock detection in Costas loops," IEEE Trans. Commun., vol. 40, no. 3, pp. 480-483, Mar. 1992.
[51] A. Mileant and S. Hinedi, "On the effects of phase jitter on QPSK lock detection," IEEE Trans. Commun., vol. 41, no. 7, pp. 1043-1046, Jul. 1993.
[52] K. Yi, et al., "A new lock detection algorithm for QPSK digital demodulator," in Proc. Seventh IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC'96), Taipei, Taiwan, Oct. 15-18, 1996, pp. 848-852.
[53] L. Kyung Ha, J. Seung Chul, and C. Hyung Jin, "A novel digital lock detector for QPSK receiver," IEEE Trans. Commun., vol. 46, no. 6, pp. 750-753, Jun. 1998.
[54] M. R. Spiegel, Mathematical handbook of formulas and tables. NY: McGraw-Hill, 1968.
[55] R. N. McDonough and A. D. Whalen, Detection of signals in noise, 2nd ed. CA: Academic Press, 1995.
[56] H. Gudbjartsson and S. Patz, "The Rician distribution of noisy MRI data," Magnetic Resonance in Medicine, vol. 34, no. 6, pp. 910-914, Dec. 1995.
[57] J. Sijbers, "Signal and Noise Estimation from Magnetic Resonance Images," Ph.D.
Thesis, University of Antwerp, Antwerp, Belgium, 1998.
[58] B. T. Kopp and W. P. Osborne, "Phase jitter in MPSK carrier tracking loops: analytical, simulation and laboratory results," IEEE Trans. Commun., vol. 45, no. 11, pp. 1385-1388, Nov. 1997.
[59] N. C. Beaulieu, A. S. Toms, and D. R. Pauluzzi, "Comparison of four SNR estimators for QPSK modulations," IEEE Commun. Letters, vol. 4, no. 2, pp. 43-45, Feb. 2000.
[60] D. R. Pauluzzi and N. C. Beaulieu, "A comparison of SNR estimation techniques for the AWGN channel," IEEE Trans. Commun., vol. 48, no. 10, pp. 1681-1691, Oct. 2000.
[61] D. R. Pauluzzi and N. C. Beaulieu, "A comparison of SNR estimation techniques in the AWGN channel," in Proc. IEEE Pacific Rim Conference on Communications, Computers, and Signal Processing, May 17-19, 1995, pp. 36-39.
[62] N. Celandroni, E. Ferro, and F. Potorti, "Quality estimation of PSK modulated signals," IEEE Commun. Mag., vol. 35, no. 7, pp. 50-55, Jul. 1997.
[63] N. Celandroni and S. T. Rizzo, "Detection of errors recovered by decoders for signal quality estimation on rain-faded AWGN satellite channels," IEEE Trans. Commun., vol. 46, no. 4, pp. 446-449, Apr. 1998.
[64] N. Celandroni and S. T. Rizzo, "Corrections to 'Detection of Errors Recovered by Decoders for Signal Quality Estimation on Rain Faded AWGN Satellite Channels'," IEEE Trans. Commun., vol. 47, no. 5, pp. 784-784, May 1999.
[65] G. Ping and C. Tepedelenlioglu, "SNR estimation for nonconstant modulus constellations," IEEE Trans. Signal Proc., vol. 53, no. 3, pp. 865-870, Mar. 2005.
[66] M. K. Simon and S. Dolinar, "Improving SNR estimation for autonomous receivers," IEEE Trans. Commun., vol. 53, no. 6, pp. 1063-1073, Jun. 2005.
[67] A. Ramesh, A. Chockalingam, and L. B. Milstein, "SNR estimation in Nakagami-m fading with diversity combining and its application to turbo decoding," IEEE Trans. Commun., vol. 50, no. 11, pp. 1719-1724, Nov. 2002.
[68] R. Matzner, F. Engleberger, and R. Siewert, "Analysis and Design of a Blind Statistical SNR Estimator," in Proc.
Audio Engineering Society (AES) 102nd Convention, Munich, Germany, Mar. 22-25, 1997.
[69] A. Wiesel, J. Goldberg, and H. Messer, "Non-data-aided signal-to-noise-ratio estimation," in Proc. IEEE International Conference on Communications (ICC 2002), Apr. 28 - May 2, 2002, pp. 1:197-201.
[70] A. Wiesel, J. Goldberg, and H. Messer-Yaron, "SNR estimation in time-varying fading channels," IEEE Trans. Commun., vol. 54, no. 5, pp. 841-848, May 2006.
[71] X. Hua and Z. Hui, "The simple SNR estimation algorithms for MPSK signals," in Proc. 7th International Conference on Signal Processing (ICSP '04), Aug. 31 - Sept. 4, 2004, pp. 11:1781-1785.
[72] R. Guangliang, C. Yilin, and Z. Hui, "A new SNR's estimator for QPSK Modulations in an AWGN channel," IEEE Trans. on Circuits and Systems II: Express Briefs, vol. 52, no. 6, pp. 336-338, Jun. 2005.
[73] C. F. Mecklenbrauker and S. Paul, "On estimating the signal to noise ratio from BPSK signals," in Proc. IEEE Intl. Conf. on Acoustics, Speech, and Signal Processing (ICASSP '05), Mar. 18-23, 2005, pp. IV:65-68.
[74] L. Bin, R. DiFazio, and A. Zeira, "A low bias algorithm to estimate negative SNRs in an AWGN channel," IEEE Commun. Letters, vol. 6, no. 11, pp. 469-471, Nov. 2002.
[75] E. Biglieri, J. Proakis, and S. Shamai, "Fading channels: information-theoretic and communications aspects," IEEE Trans. Info. Theory, vol. 44, no. 6, pp. 2619-2692, Jun. 1998.
[76] M. K. Simon and M.-S. Alouini, Digital communication over fading channels: a unified approach to performance analysis. NY: Wiley, 2000.
[77] M. K. Simon and M. Alouini, "A unified approach to the performance analysis of digital communication over generalized fading channels," Proc. IEEE, vol. 86, no. 9, pp. 1860-1877, Sep. 1998.
[78] B. Sklar, "Rayleigh fading channels in mobile digital communication systems. I. Characterization," IEEE Comm. Mag., vol. 35, no. 9, pp. 136-146, Sep. 1997.
[79] B. Sklar, "Rayleigh Fading Channels in Mobile Digital Communication Systems Part II: Mitigation," IEEE Comm. Mag., vol. 35, no. 9, pp.
148-155, Sep. 1997.
[80] A. J. Viterbi and A. M. Viterbi, "Nonlinear estimation of PSK-modulated carrier phase with application to burst digital transmission," IEEE Trans. Info. Theory, vol. 29, no. 4, pp. 543-551, Jul. 1983.
[81] R. Hamila, J. Vesma, and M. Renfors, "Polynomial-based maximum-likelihood technique for synchronization in digital receivers," IEEE Trans. Circuits and Systems II, vol. 49, no. 8, pp. 567-576, Aug. 2002.
[82] D. Taich and I. Bar-David, "Maximum-likelihood estimation of phase and frequency of MPSK signals," IEEE Trans. Info. Theory, vol. 45, no. 7, pp. 2652-2655, Jul. 1999.
[83] R. Hamila, "Synchronization and Multipath Delay Estimation Algorithms for Digital Receivers," Ph.D. Thesis, Tampere University of Technology, Tampere, Finland, 2002.
[84] N. Noels, et al., "Carrier phase and frequency estimation for pilot-symbol assisted transmission: bounds and algorithms," IEEE Trans. Signal Proc., vol. 53, no. 12, pp. 4578-4587, Dec. 2005.
[85] M. L. Boucheret, et al., "A new algorithm for nonlinear estimation of PSK-modulated carrier phase," in Proc. 3rd European Conference on Satellite Communications, Manchester, UK, 1993, pp. 155-159.
[86] W. G. Cowley, "Phase and frequency estimation for PSK packets: bounds and algorithms," IEEE Trans. Commun., vol. 44, no. 1, pp. 26-28, Jan. 1996.
[87] M. Moeneclaey and G. de Jonghe, "ML-oriented NDA carrier synchronization for general rotationally symmetric signal constellations," IEEE Trans. Commun., vol. 42, no. 8, pp. 2531-2533, Aug. 1994.
[88] N. A. D'Andrea, U. Mengali, and R. Reggiannini, "Comparison of carrier recovery methods for narrow-band polyphase shift keyed signals," in Proc. GLOBECOM '88, Hollywood, FL, USA, 1988, pp. 1474-1478.
[89] J. B. Anderson, T. Aulin, and C.-E. Sundberg, Digital phase modulation. NY: Plenum Press, 1986.
[90] E. A. Lee and D. G. Messerschmitt, Digital communication, 2nd ed. Boston: Kluwer Academic Publishers, 1994.
[91] H. C. Osborne, "A generalized 'polarity-type' Costas loop for tracking MPSK signals," IEEE Trans.
Commun., vol. 30, no. 10, pp. 2289-2296, Oct. 1982.
[92] S. A. Butman and J. R. Lesh, "The effects of bandpass limiters on n-phase tracking systems," IEEE Trans. Commun., vol. 25, no. 6, pp. 569-576, Jun. 1977.
[93] B. T. Kopp, "An analysis of carrier phase jitter in an MPSK receiver Utilizing MAP estimation," Ph.D. Thesis, New Mexico State University, Las Cruces, NM, 1994.
[94] C. Dick, F. Harris, and M. Rice, "Synchronization in software radios. Carrier and timing recovery using FPGAs," in Proc. 2000 IEEE Symposium on Field-Programmable Custom Computing Machines, Napa Valley, CA, USA, Apr. 17-19, 2000, pp. 195-204.
[95] L. Franks, "Carrier and Bit Synchronization in Data Communication - A Tutorial Review," IEEE Trans. Commun., vol. 28, no. 8, pp. 1107-1121, Aug. 1980.
[96] R. Hayashi, F. Ishizu, and K. Murakami, "A delta-sigma baseband phase detector realizing AGC-free PSK and FSK receivers," in Proc. 13th IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), Sept. 15-18, 2002, pp. 2382-2388.
[97] W. P. Osborne and B. T. Kopp, "Synchronization in M-PSK modems," in Proc. ICC '92, Chicago, IL, USA, 1992, pp. 1436-1440.
[98] R. De Gaudenzi, T. Garde, and V. Vanghi, "Performance analysis of decision-directed maximum-likelihood phase estimators for M-PSK modulated signals," IEEE Trans. Commun., vol. 43, no. 12, pp. 3090-3100, Dec. 1995.
[99] G. Ascheid and H. Meyr, "Cycle Slips in Phase-Locked Loops: A Tutorial Survey," IEEE Trans. Commun., vol. 30, no. 10, pp. 2228-2241, Oct. 1982.
[100] W. Lindsey and C. Chak, "Performance Measures for Phase-Locked Loops - A Tutorial," IEEE Trans. Commun., vol. 30, no. 10, pp. 2224-2227, Oct. 1982.
[101] M. C. Jeruchim, P. Balaban, and K. S. Shanmugan, Simulation of Communication Systems. New York: Plenum Publishing Corporation, 1992.
[102] J. G. Proakis and M. Salehi, Contemporary Communication Systems Using Matlab. Pacific Grove, CA: Brooks/Cole, 2000.
[103] R. E.
Best, Phase-locked loops: theory, design, and applications, 2nd ed. NY: McGraw-Hill, 1993.
[104] Y. Linn, "An Optimal Adaptive M-PSK Carrier Phase Detector Suitable for Fixed-Point Hardware Implementation within FPGAs and ASICs," in Proc. IEEE 2006 Workshop on Signal Processing Systems (SiPS'06), Banff, AB, Canada, Oct. 2-4, 2006, pp. 238-243.
[105] K. Steiglitz and L. McBride, "A technique for the identification of linear systems," IEEE Trans. Automatic Control, vol. 10, no. 4, pp. 461-464, Oct. 1965.
[106] G. Engeln-Müllges and F. Uhlig, Numerical Algorithms with C. Berlin: Springer, 1996.
[107] J. W. Craig, "A new, simple and exact result for calculating the probability of error for two-dimensional signal constellations," in Proc. MILCOM'91, 4-7 Nov., 1991, pp. 571-575.
[108] S. Hyundong and L. Jae Hong, "On the error probability of binary and M-ary signals in Nakagami-m fading channels," IEEE Trans. Commun., vol. 52, no. 4, pp. 536-539, Apr. 2004.
[109] J. H. Roberts, Angle modulation: the theory of system assessment. Stevenage, UK: Peter Peregrinus Ltd., 1977.
[110] R. F. Pawula, S. O. Rice, and J. H. Roberts, "Distribution of the phase angle between two vectors perturbed by Gaussian noise," IEEE Trans. Commun., vol. 30, no. 8, pp. 1828-1841, Aug. 1982.
[111] R. F. Pawula, "Distribution of the phase angle between two vectors perturbed by Gaussian noise II," IEEE Trans. Veh. Technol., vol. 50, no. 2, pp. 576-583, Mar. 2001.
[112] R. Pawula, "On M-ary DPSK Transmission Over Terrestrial and Satellite Channels," IEEE Trans. Commun., vol. 32, no. 7, pp. 752-761, Jul. 1984.
[113] N. Blachman, "The Effect of Phase Error on DPSK Error Probability," IEEE Trans. Commun., vol. 29, no. 3, pp. 364-365, Mar. 1981.
[114] Xilinx Inc., "Virtex Series FPGAs," at http://www.xilinx.com/products/silicon_solutions/fpgas/virtex/index.htm, accessed Nov. 2006.
[115] Altera Inc., "Altera Product Catalog Jan. 2006," at http://www.altera.com/literature/lit-index.html, accessed Nov.
2006.
[116] A. Annamalai and C. Tellambura, "Error rates for Nakagami-m fading multichannel reception of binary and M-ary signals," IEEE Trans. Commun., vol. 49, no. 1, pp. 58-68, Jan. 2001.
[117] K. L. Chung, A course in probability theory. NY: Harcourt, 1968.
[118] R. F. Pawula, "Generic error probabilities," IEEE Trans. Commun., vol. 47, no. 5, pp. 697-702, May 1999.
[119] J. R. Barry, E. A. Lee, and D. G. Messerschmitt, Digital communication, 3rd ed. Boston: Kluwer, 2004.
[120] G. Karam, V. Paxal, and M. Moeneclaey, "Lock detectors for timing recovery," in Proc. ICC '96, Dallas, TX, USA, 1996, pp. 1281-1285.
[121] Y. Linn, "A Hardware Method for Real-Time SNR Estimation for M-PSK using a Symbol Synchronization Lock Metric," in Proc. 9th Canadian Workshop on Information Theory, Montreal, QC, Canada, Jun. 5-8, 2005, pp. 247-251.
[122] Y. Linn, "A symbol synchronization lock detector and SNR estimator for QPSK, with application to BPSK," in Proc. 3rd IASTED International Conference on Wireless and Optical Communications (IASTED WOC'03), Banff, AB, Canada, Jul. 14-16, 2003, pp. 506-514.
[123] Y. Linn, "A self-normalizing symbol synchronization lock detector for QPSK and BPSK," IEEE Trans. Wireless Commun., vol. 5, no. 2, pp. 347-353, Feb. 2006.
[124] K.-P. Ho, Phase-modulated optical communication systems. NY: Springer, 2005.

APPENDICES

Appendix A Closed-Form Expressions for $f_M(\chi)$

In (30) we presented a formula for the computation of the lock detector expectation, repeated here for convenience:

$$f_M(\chi) = \int_{-\pi}^{\pi} \cos(M\Delta\phi)\, p_R(\Delta\phi\,|\,\chi)\, d\Delta\phi \tag{222}$$

which is based upon the probability distribution of (29), also repeated for convenience:

$$p_R(\Delta\phi\,|\,\chi) = p(\Delta\phi_n = \Delta\phi \,|\, E_s/N_0 = \chi) = \frac{\exp(-\chi)}{2\pi}\left[1 + \sqrt{2\chi}\,\cos(\Delta\phi)\exp\!\left(\chi\cos^2(\Delta\phi)\right)\int_{-\infty}^{\sqrt{2\chi}\cos(\Delta\phi)} e^{-y^2/2}\,dy\right] \tag{223}$$

At first glance, the complicated form of (223) may make closed-form formulas for (222) seem unattainable. In fact, using Fourier analysis we can reach formulas which are given entirely as finite sums of elementary functions.
(Footnote to the appendix title: This appendix was presented in part in Y. Linn, "Simple and Exact Closed-Form Expressions for the Expectation of the Linn-Peleg M-PSK Lock Detector," in Proc. 2007 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM'07), Victoria, BC, Canada, Aug. 22-24, 2007, pp. 102-104.)

We begin by noting that the domain of $p_R(\Delta\phi\,|\,\chi)$ is $[-\pi,\pi]$ and that therefore its periodic continuation can be represented as a Fourier series. Such an analysis has been conducted in [124 App. 4A]. The Fourier series coefficients for $p_R(\Delta\phi\,|\,\chi)$ are given by [124 eq. 4.A.9]:

$$c_m = \int_{-\pi}^{\pi} p_R(\Delta\phi\,|\,\chi)\, e^{j m \Delta\phi}\, d\Delta\phi = \int_{-\pi}^{\pi} p_R(\Delta\phi\,|\,\chi)\cos(m\Delta\phi)\, d\Delta\phi + j\int_{-\pi}^{\pi} p_R(\Delta\phi\,|\,\chi)\sin(m\Delta\phi)\, d\Delta\phi \tag{224}$$

Now we make use of the fact that $p_R(\Delta\phi\,|\,\chi)$ is an even function of $\Delta\phi$ to conclude that the imaginary part of (224) vanishes, so that:

$$c_m = \int_{-\pi}^{\pi} p_R(\Delta\phi\,|\,\chi)\cos(m\Delta\phi)\, d\Delta\phi \tag{225}$$

Comparing (222) to (225), we find that

$$f_M(\chi) = c_M \tag{226}$$

Hence, if we can find a closed-form formula for $c_m$ then we can find a closed-form expression for $f_M(\chi)$. Fortunately, the coefficients $c_m$ have been investigated in the literature, and a formula for them is given by ([124 eq. 4.A.11], with conversion to the appropriate notations used here):

$$c_m = \frac{\sqrt{\pi\chi}}{2}\exp\!\left(-\frac{\chi}{2}\right)\left[I_{\frac{m-1}{2}}\!\left(\frac{\chi}{2}\right) + I_{\frac{m+1}{2}}\!\left(\frac{\chi}{2}\right)\right] \tag{227}$$

where $I_k(\cdot)$ is the $k$-th order modified Bessel function of the first kind (see [54 Chap. 24]). Therefore from (226):

$$f_M(\chi) = \frac{\sqrt{\pi\chi}}{2}\exp\!\left(-\frac{\chi}{2}\right)\left[I_{\frac{M-1}{2}}\!\left(\frac{\chi}{2}\right) + I_{\frac{M+1}{2}}\!\left(\frac{\chi}{2}\right)\right] \tag{228}$$

Moreover, given that $M$ is always an even number, we have that $\frac{M-1}{2}$ and $\frac{M+1}{2}$ are always half of an odd integer. Given this fact, we can express these modified Bessel functions in closed form, given the recursive relation [54 eq. (24.46)]:

$$I_{\nu+1}(x) = I_{\nu-1}(x) - \frac{2\nu}{x} I_{\nu}(x) \tag{229}$$

along with [54 eq. (24.58)]:

$$I_{1/2}(x) = \sqrt{\frac{2}{\pi x}}\,\sinh(x) \tag{230}$$

and [54 eq.
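(24.60)]:

$$I_{3/2}(x) = \sqrt{\frac{2}{\pi x}}\left[\cosh(x) - \frac{\sinh(x)}{x}\right] \tag{231}$$

For example, using (230) and (231) in (228) we find that for $M = 2$:

$$\begin{aligned}
f_2(\chi) &= \frac{\sqrt{\pi\chi}}{2}\exp\!\left(-\frac{\chi}{2}\right)\left[I_{1/2}\!\left(\frac{\chi}{2}\right) + I_{3/2}\!\left(\frac{\chi}{2}\right)\right] \\
&= \exp\!\left(-\frac{\chi}{2}\right)\left[\sinh\!\left(\frac{\chi}{2}\right) + \cosh\!\left(\frac{\chi}{2}\right) - \frac{2}{\chi}\sinh\!\left(\frac{\chi}{2}\right)\right] \\
&= \exp\!\left(-\frac{\chi}{2}\right)\left[\exp\!\left(\frac{\chi}{2}\right) - \frac{1}{\chi}\left(\exp\!\left(\frac{\chi}{2}\right) - \exp\!\left(-\frac{\chi}{2}\right)\right)\right] \\
&= 1 - \frac{1 - \exp(-\chi)}{\chi} = 1 - \frac{1}{\chi} + \frac{\exp(-\chi)}{\chi}
\end{aligned} \tag{232}$$

Similarly, for $M = 4$ we have from [54 eq. (24.62)] (or, equivalently, using (229)-(231)) that:

$$I_{5/2}(x) = \sqrt{\frac{2}{\pi x}}\left[\left(\frac{3}{x^2} + 1\right)\sinh(x) - \frac{3}{x}\cosh(x)\right] \tag{233}$$

and that therefore

$$\begin{aligned}
f_4(\chi) &= \frac{\sqrt{\pi\chi}}{2}\exp\!\left(-\frac{\chi}{2}\right)\left[I_{3/2}\!\left(\frac{\chi}{2}\right) + I_{5/2}\!\left(\frac{\chi}{2}\right)\right] \\
&= \exp\!\left(-\frac{\chi}{2}\right)\left[\left(1 - \frac{6}{\chi}\right)\cosh\!\left(\frac{\chi}{2}\right) + \left(\frac{12}{\chi^2} - \frac{2}{\chi} + 1\right)\sinh\!\left(\frac{\chi}{2}\right)\right] \\
&= 1 - \frac{4}{\chi} + \frac{6}{\chi^2} - \left(\frac{2}{\chi} + \frac{6}{\chi^2}\right)\exp(-\chi)
\end{aligned} \tag{234}$$

Using a similar procedure, we find that:

$$\begin{aligned}
f_8(\chi) &= \frac{\sqrt{\pi\chi}}{2}\exp\!\left(-\frac{\chi}{2}\right)\left[I_{7/2}\!\left(\frac{\chi}{2}\right) + I_{9/2}\!\left(\frac{\chi}{2}\right)\right] \\
&= 1 - \frac{16}{\chi} + \frac{120}{\chi^2} - \frac{480}{\chi^3} + \frac{840}{\chi^4} - \left(\frac{4}{\chi} + \frac{60}{\chi^2} + \frac{360}{\chi^3} + \frac{840}{\chi^4}\right)\exp(-\chi)
\end{aligned} \tag{235}$$

In general for even $M$ we find that

$$f_M(\chi) = \frac{M}{2}\sum_{n=0}^{M/2}\frac{(-1)^n}{n!}\,\frac{(M/2+n-1)!}{(M/2-n)!\,\chi^n} + (-1)^{M/2+1}\exp(-\chi)\sum_{n=1}^{M/2}\frac{1}{(n-1)!}\,\frac{(M/2+n-1)!}{(M/2-n)!\,\chi^n} \tag{236}$$

(see also [92] for a different problem in which the factors of (227) arose. There, a different simplification approach was used but the same result as (236) was reached, hence providing additional validation for these derivations).

It is emphasized that (236) is a summation of a finite number of terms which are composed entirely of exponentials and polynomials (as exemplified in (232) for $M = 2$). Hence, eq. (236) is very conducive to easy computation via computer algorithms (and even manually for small $M$).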
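As an illustration of how (236) might be evaluated by computer, the following Python sketch computes the finite sum and, as a cross-check, evaluates the defining integral (222) numerically with the density (223). The function names, the step count, and the rewriting of the Gaussian integral in (223) through the error function are choices made for this sketch and are not part of the original derivation.

```python
import math


def f_M_closed(M, chi):
    """Finite-sum form (236) of the lock detector expectation (even M, chi > 0)."""
    h = M // 2
    first = sum(
        (-1) ** n * math.factorial(h + n - 1)
        / (math.factorial(n) * math.factorial(h - n) * chi ** n)
        for n in range(0, h + 1)
    )
    second = sum(
        math.factorial(h + n - 1)
        / (math.factorial(n - 1) * math.factorial(h - n) * chi ** n)
        for n in range(1, h + 1)
    )
    return h * first + (-1) ** (h + 1) * math.exp(-chi) * second


def p_R(dphi, chi):
    """Phase-error density (223).

    The Gaussian integral is expressed with erf, and exp(-chi) is merged with
    exp(chi*cos^2) into exp(-chi*sin^2) so that large chi does not overflow.
    """
    c = math.cos(dphi)
    gauss_part = (
        math.sqrt(math.pi * chi) * c
        * math.exp(-chi * math.sin(dphi) ** 2)
        * (1.0 + math.erf(math.sqrt(chi) * c))
    )
    return (math.exp(-chi) + gauss_part) / (2.0 * math.pi)


def f_M_numeric(M, chi, steps=4096):
    """Direct evaluation of (222) using the periodic trapezoidal rule."""
    h = 2.0 * math.pi / steps
    return h * sum(
        math.cos(M * (-math.pi + i * h)) * p_R(-math.pi + i * h, chi)
        for i in range(steps)
    )


if __name__ == "__main__":
    for M in (2, 4, 8):
        print(M, f_M_closed(M, 3.0), f_M_numeric(M, 3.0))
```

Note that for small values of the SNR the terms of (236) are large and nearly cancel, so a floating-point evaluation of the finite sum loses precision there; the numerical integral can serve as a fallback in that region.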
Appendix B Closed-Form Expressions for $\bar{f}_M(\bar{\chi})$ in the Presence of Nakagami-m Fading

(Footnote to the appendix title: This appendix was presented in part in Y. Linn, "Simple and Exact Closed-Form Expressions for the Expectation of the Linn-Peleg M-PSK Lock Detector," in Proc. 2007 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM'07), Victoria, BC, Canada, Aug. 22-24, 2007, pp. 102-104.)

Depending upon the fading probability distribution, we can sometimes use the results of Appendix A in order to obtain closed-form expressions for $\bar{f}_M(\bar{\chi})$ in the presence of fading. A case in point is the Nakagami-m distribution. From Table 2 we have in this case that:

$$p_{Nak\text{-}m}(\chi\,|\,\bar{\chi}) = \frac{m^m \chi^{m-1}}{\bar{\chi}^m\,\Gamma(m)}\exp\!\left(-\frac{m\chi}{\bar{\chi}}\right) \tag{237}$$

In order to facilitate the ensuing computations it is advantageous to compute some preliminary definite integrals. We do this using the integral [54 eq. (15.76)] which is:

$$\int_0^{\infty} x^n \exp(-ax)\,dx = \frac{\Gamma(n+1)}{a^{n+1}} \tag{238}$$

and we use this result to define the following function:

$$\begin{aligned}
\Upsilon_{k,\ell,m}(\bar{\chi}) &= \int_0^{\infty} y^{-k}\exp(-\ell y)\, p_{Nak\text{-}m}(y\,|\,\bar{\chi})\,dy \\
&= \int_0^{\infty} y^{-k}\exp(-\ell y)\,\frac{m^m y^{m-1}}{\bar{\chi}^m\,\Gamma(m)}\exp\!\left(-\frac{m}{\bar{\chi}}\,y\right)dy \\
&= \frac{m^m}{\bar{\chi}^m\,\Gamma(m)}\int_0^{\infty} y^{m-k-1}\exp\!\left(-\left(\ell + \frac{m}{\bar{\chi}}\right)y\right)dy \\
&= \frac{m^m}{\bar{\chi}^m\,\Gamma(m)}\cdot\frac{\Gamma(m-k)}{\left(\ell + \dfrac{m}{\bar{\chi}}\right)^{m-k}} \\
&= \frac{\bar{\chi}^{-k}\, m^m\, \Gamma(m-k)}{(\ell\bar{\chi} + m)^{m-k}\,\Gamma(m)}
\end{aligned} \tag{239}$$

where $k,\ell \ge 0$ and $m \ge 0.5$. A minor difficulty may arise when using (239): when $m-k$ is a non-positive integer, then $\Gamma(m-k)$ tends to positive or negative infinity (see [54 Fig. 16.1]). This problem is elegantly solved by substituting $\tilde{m} = m + \varepsilon$ instead of $m$ in (239), where $\varepsilon$ is a very small number, i.e. $0 < \varepsilon \ll m$, such as $\varepsilon = 0.001\cdot m$. Doing so will have a negligible effect upon the results while avoiding the singularity of $\Gamma(m-k)$. Hence we define the following function:

$$\tilde{\Upsilon}_{k,\ell,m}(\bar{\chi}) = \begin{cases} \Upsilon_{k,\ell,m}(\bar{\chi}) & m-k \notin \{0,-1,-2,-3,\ldots\} \\ \Upsilon_{k,\ell,\tilde{m}}(\bar{\chi}) & m-k \in \{0,-1,-2,-3,\ldots\} \end{cases} \tag{240}$$

where $k,\ell \ge 0$ and $m \ge 0.5$, and $\tilde{m} = m + m/1000$.

From (58) we have that $\bar{f}_M(\bar{\chi}) = \int_0^{\infty} f_M(\chi)\, p_{Nak\text{-}m}(\chi\,|\,\bar{\chi})\,d\chi$.
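The helper function of (239) and its regularized variant (240) translate directly into a few lines of code. The Python sketch below is one possible rendering; the function names, the `rel_eps` parameter, and the midpoint-rule check are assumptions made for illustration. The numerical check only applies when $m - k > 0$, where the defining integral in (239) converges in the ordinary sense.

```python
import math


def upsilon(k, ell, m, chi_bar):
    """Closed form (239); fails if m - k is a non-positive integer (gamma pole)."""
    return (
        chi_bar ** (-k) * m ** m * math.gamma(m - k)
        / ((ell * chi_bar + m) ** (m - k) * math.gamma(m))
    )


def upsilon_tilde(k, ell, m, chi_bar, rel_eps=1e-3):
    """Regularized version (240): replace m by m + m/1000 at the gamma poles."""
    d = m - k
    if d <= 0 and d == int(d):
        m = m + rel_eps * m
    return upsilon(k, ell, m, chi_bar)


def nakagami_snr_pdf(chi, m, chi_bar):
    """SNR density (237) under Nakagami-m fading with mean SNR chi_bar."""
    return (
        m ** m * chi ** (m - 1) / (chi_bar ** m * math.gamma(m))
        * math.exp(-m * chi / chi_bar)
    )


def upsilon_numeric(k, ell, m, chi_bar, upper=60.0, steps=100000):
    """Midpoint-rule evaluation of the defining integral in (239), for m - k > 0.

    The integration range is truncated at `upper`, an arbitrary choice that is
    adequate when the exponential tail is negligible beyond it.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        total += y ** (-k) * math.exp(-ell * y) * nakagami_snr_pdf(y, m, chi_bar)
    return total * h


if __name__ == "__main__":
    print(upsilon(1, 1, 3, 2.0), upsilon_numeric(1, 1, 3, 2.0))
    print(upsilon_tilde(1, 0, 1, 2.0))  # m - k = 0, so the nudged m is used
```

Since every term of (236) has the form of a constant times $\chi^{-k}\exp(-\ell\chi)$ with $\ell \in \{0, 1\}$, averaging over the fading distribution reduces to replacing each such term with the corresponding value of this helper.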
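We can now use (239)-(240) in conjunction with the results of Appendix A in order to arrive at closed-form expressions for $\bar{f}_M(\bar{\chi})$ for the case of Nakagami-m fading. For example, for $M = 2$ we have from (232), (239) and (240) that:

$$\begin{aligned}
\bar{f}_2(\bar{\chi}) &= \tilde{\Upsilon}_{0,0,m}(\bar{\chi}) - \tilde{\Upsilon}_{1,0,m}(\bar{\chi}) + \tilde{\Upsilon}_{1,1,m}(\bar{\chi}) \\
&= 1 - \frac{\bar{\chi}^{-1} m^m\,\Gamma(m-1)}{m^{m-1}\,\Gamma(m)} + \frac{\bar{\chi}^{-1} m^m\,\Gamma(m-1)}{(\bar{\chi}+m)^{m-1}\,\Gamma(m)}
\end{aligned} \tag{241}$$

(in (241) we assume that $m \neq 1$. If $m = 1$ we use $\tilde{m} = 1.001$ in (241), as per (240)). Similarly, for $M = 4$ we have from (234), (239) and (240) that:

$$\bar{f}_4(\bar{\chi}) = \tilde{\Upsilon}_{0,0,m}(\bar{\chi}) - 4\tilde{\Upsilon}_{1,0,m}(\bar{\chi}) + 6\tilde{\Upsilon}_{2,0,m}(\bar{\chi}) - 2\tilde{\Upsilon}_{1,1,m}(\bar{\chi}) - 6\tilde{\Upsilon}_{2,1,m}(\bar{\chi}) \tag{242}$$

(in (242) we assume that $m \neq 1$ and $m \neq 2$. If $m = 1$ or $m = 2$ we use $\tilde{m} = 1.001$ or $\tilde{m} = 2.002$, respectively, in (242), as per (240)). In the general case, for even $M$ we have from (236), (239) and (240) that:

$$\bar{f}_M(\bar{\chi}) = \frac{M}{2}\sum_{n=0}^{M/2}\frac{(-1)^n}{n!}\,\frac{(M/2+n-1)!}{(M/2-n)!}\,\tilde{\Upsilon}_{n,0,m}(\bar{\chi}) + (-1)^{M/2+1}\sum_{n=1}^{M/2}\frac{1}{(n-1)!}\,\frac{(M/2+n-1)!}{(M/2-n)!}\,\tilde{\Upsilon}_{n,1,m}(\bar{\chi}) \tag{243}$$

It is emphasized that (243) is a finite sum of terms which are composed of ratios of finite polynomials and gamma functions, and is very easy to compute using various numerical computation packages such as Matlab. Furthermore, we note that the Rayleigh and one-sided Gaussian fading distributions are particular cases of the Nakagami-m distribution (with $m = 1$ and $m = 0.5$ respectively (see [77 Table 2])), so closed-form expressions for these fading distributions can be obtained from (243) as well.

Appendix C Closed-Form Expressions for $f_M^D(\chi)$

Through use of Fourier analysis, closed-form expressions may be attained for $f_M^D(\chi)$. To do so, we first re-define for convenience the density function of the differential phase error over the interval $[-\pi,\pi]$ (this is possible because the true phase is in the interval $[-\pi,\pi]$, as noted in Sec. 4.12). We name this distribution $p_D(\Delta\phi_D\,|\,\chi)$ (where $\Delta\phi_D \in [-\pi,\pi]$), and it is given by [19 eq. (7.3), p. 441]:

$$p_D(\Delta\phi_D\,|\,\chi) = \frac{1}{2\pi}\int_0^{\pi/2} (\sin\tau)\left(1 + \chi\left(1 + \cos\Delta\phi_D \sin\tau\right)\right)\exp\!\left(-\chi\left(1 - \cos\Delta\phi_D \sin\tau\right)\right)d\tau \tag{244}$$

With this definition of $p_D$ we find from (193) that (assuming for simplicity $\Delta\omega = 0$):

$$f_M^D(\chi) = \int_{-\pi}^{\pi} \cos(M\cdot\Delta\phi_D)\, p_D(\Delta\phi_D\,|\,\chi)\, d(\Delta\phi_D) \tag{245}$$

(Note that the limits of the integral in (245) are $[-\pi,\pi]$).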
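Now, because the domain of $p_D$ is finite, the periodic extension of $p_D$ can be expressed as a Fourier series, i.e. $p_D(\Delta\phi_D\,|\,\chi) = \frac{1}{2\pi}\sum_{m=-\infty}^{\infty} c_m^D \exp(-jm\Delta\phi_D)$ where, using the fact that $p_D$ is even:

$$c_m^D = \int_{-\pi}^{\pi} \cos(m\cdot\Delta\phi_D)\, p_D(\Delta\phi_D\,|\,\chi)\, d(\Delta\phi_D) \tag{246}$$

(see [124 App. 4A]). From comparing (246) to (245), we see that $f_M^D(\chi) = c_m^D\big|_{m=M}$. The coefficients $c_m^D$ were computed in [124 App. 4A], from which it follows that (see [124 eq. (4.A.18)]):

$$f_M^D(\chi) = c_m^D\big|_{m=M} = \frac{\pi\chi}{4}\exp(-\chi)\left[I_{\frac{M-1}{2}}\!\left(\frac{\chi}{2}\right) + I_{\frac{M+1}{2}}\!\left(\frac{\chi}{2}\right)\right]^2 \tag{247}$$

where $I_k(\cdot)$ is the $k$-th order modified Bessel function of the first kind (see [54 Chap. 24]).

Quick inspection of (247) and (227) shows that we have $c_m^D = (c_m)^2$, which in turn means that $f_M^D(\chi) = \left(f_M(\chi)\right)^2$. Hence, from (236) we have:

$$f_M^D(\chi) = \left[\frac{M}{2}\sum_{n=0}^{M/2}\frac{(-1)^n}{n!}\,\frac{(M/2+n-1)!}{(M/2-n)!\,\chi^n} + (-1)^{M/2+1}\exp(-\chi)\sum_{n=1}^{M/2}\frac{1}{(n-1)!}\,\frac{(M/2+n-1)!}{(M/2-n)!\,\chi^n}\right]^2 \tag{248}$$

Expanding, we have:

$$\begin{aligned}
f_M^D(\chi) ={}& \frac{M^2}{4}\left[\sum_{n=0}^{M/2}\frac{(-1)^n}{n!}\,\frac{(M/2+n-1)!}{(M/2-n)!\,\chi^n}\right]\left[\sum_{k=0}^{M/2}\frac{(-1)^k}{k!}\,\frac{(M/2+k-1)!}{(M/2-k)!\,\chi^k}\right] \\
&+ (-1)^{M/2+1} M\exp(-\chi)\left[\sum_{n=0}^{M/2}\frac{(-1)^n}{n!}\,\frac{(M/2+n-1)!}{(M/2-n)!\,\chi^n}\right]\left[\sum_{k=1}^{M/2}\frac{1}{(k-1)!}\,\frac{(M/2+k-1)!}{(M/2-k)!\,\chi^k}\right] \\
&+ \exp(-2\chi)\left[\sum_{n=1}^{M/2}\frac{1}{(n-1)!}\,\frac{(M/2+n-1)!}{(M/2-n)!\,\chi^n}\right]\left[\sum_{k=1}^{M/2}\frac{1}{(k-1)!}\,\frac{(M/2+k-1)!}{(M/2-k)!\,\chi^k}\right]
\end{aligned} \tag{249}$$

Simplifying:

$$\begin{aligned}
f_M^D(\chi) ={}& \frac{M^2}{4}\sum_{n=0}^{M/2}\sum_{k=0}^{M/2}\frac{(-1)^{n+k}}{n!\,k!}\,\frac{(M/2+n-1)!\,(M/2+k-1)!}{(M/2-n)!\,(M/2-k)!\,\chi^{n+k}} \\
&+ (-1)^{M/2+1} M\exp(-\chi)\sum_{n=0}^{M/2}\sum_{k=1}^{M/2}\frac{(-1)^n\,(M/2+n-1)!\,(M/2+k-1)!}{n!\,(k-1)!\,(M/2-n)!\,(M/2-k)!\,\chi^{n+k}} \\
&+ \exp(-2\chi)\sum_{n=1}^{M/2}\sum_{k=1}^{M/2}\frac{(M/2+n-1)!\,(M/2+k-1)!}{(n-1)!\,(k-1)!\,(M/2-n)!\,(M/2-k)!\,\chi^{n+k}}
\end{aligned} \tag{250}$$

Following this procedure, we arrive at the following expressions:

$$f_2^D(\chi) = 1 - 2\chi^{-1} + \chi^{-2} + 2\chi^{-1}e^{-\chi} - 2\chi^{-2}e^{-\chi} + \chi^{-2}e^{-2\chi} \tag{251}$$

$$\begin{aligned}
f_4^D(\chi) ={}& 1 - 8\chi^{-1} + 28\chi^{-2} - 48\chi^{-3} + 36\chi^{-4} - 4\chi^{-1}e^{-\chi} + 4\chi^{-2}e^{-\chi} + 24\chi^{-3}e^{-\chi} - 72\chi^{-4}e^{-\chi} \\
&+ 4\chi^{-2}e^{-2\chi} + 24\chi^{-3}e^{-2\chi} + 36\chi^{-4}e^{-2\chi}
\end{aligned} \tag{252}$$

Expressions for $M > 4$ can also be found.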
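However, since those expressions are very tedious and since they can be arrived at following the same procedure outlined above, they are omitted. Moreover, we note that the approximation (196) is extremely accurate for $M > 2$ (see Sec. 4.12 and Fig. 97), so as a practical issue use of those expressions is often unnecessary.

Appendix D Closed-Form Expressions for $\bar{f}_M^D(\bar{\chi})$ in the Presence of Nakagami-m Fading

Using (248)-(252) we can compute $\bar{f}_M^D(\bar{\chi})$ via (198). For certain fading distributions, this can lead to closed-form expressions for $\bar{f}_M^D(\bar{\chi})$. A case in point is again the very important Nakagami-m distribution. From (198), exact closed-form expressions for $\bar{f}_M^D(\bar{\chi})$ can be obtained for Nakagami-m fading for all $M$ and all $m$ through computation of the definite integral

$$\bar{f}_M^D(\bar{\chi}) = \int_0^{\infty} f_M^D(\chi)\,\frac{m^m\chi^{m-1}}{\bar{\chi}^m\,\Gamma(m)}\exp\!\left(-\frac{m\chi}{\bar{\chi}}\right)d\chi \tag{253}$$

by using (248)-(252) along with the formula for the definite integral (see (238)) and the definition of $\tilde{\Upsilon}_{k,\ell,m}(\bar{\chi})$ (see (239)-(240)). We now set upon doing this. Using (250) in (253) we have

$$\begin{aligned}
\bar{f}_M^D(\bar{\chi}) ={}& \frac{M^2}{4}\sum_{n=0}^{M/2}\sum_{k=0}^{M/2}\frac{(-1)^{n+k}}{n!\,k!}\,\frac{(M/2+n-1)!\,(M/2+k-1)!}{(M/2-n)!\,(M/2-k)!}\,\tilde{\Upsilon}_{n+k,0,m}(\bar{\chi}) \\
&+ (-1)^{M/2+1} M\sum_{n=0}^{M/2}\sum_{k=1}^{M/2}\frac{(-1)^n\,(M/2+n-1)!\,(M/2+k-1)!}{n!\,(k-1)!\,(M/2-n)!\,(M/2-k)!}\,\tilde{\Upsilon}_{n+k,1,m}(\bar{\chi}) \\
&+ \sum_{n=1}^{M/2}\sum_{k=1}^{M/2}\frac{(M/2+n-1)!\,(M/2+k-1)!}{(n-1)!\,(k-1)!\,(M/2-n)!\,(M/2-k)!}\,\tilde{\Upsilon}_{n+k,2,m}(\bar{\chi})
\end{aligned} \tag{254}$$

For example, for $M = 2$ we have:

$$\begin{aligned}
\bar{f}_2^D(\bar{\chi}) ={}& \tilde{\Upsilon}_{0,0,m}(\bar{\chi}) - 2\tilde{\Upsilon}_{1,0,m}(\bar{\chi}) + \tilde{\Upsilon}_{2,0,m}(\bar{\chi}) + 2\tilde{\Upsilon}_{1,1,m}(\bar{\chi}) - 2\tilde{\Upsilon}_{2,1,m}(\bar{\chi}) + \tilde{\Upsilon}_{2,2,m}(\bar{\chi}) \\
={}& 1 - \frac{2\bar{\chi}^{-1} m^m\,\Gamma(m-1)}{m^{m-1}\,\Gamma(m)} + \frac{\bar{\chi}^{-2} m^m\,\Gamma(m-2)}{m^{m-2}\,\Gamma(m)} + \frac{2\bar{\chi}^{-1} m^m\,\Gamma(m-1)}{(\bar{\chi}+m)^{m-1}\,\Gamma(m)} \\
&- \frac{2\bar{\chi}^{-2} m^m\,\Gamma(m-2)}{(\bar{\chi}+m)^{m-2}\,\Gamma(m)} + \frac{\bar{\chi}^{-2} m^m\,\Gamma(m-2)}{(2\bar{\chi}+m)^{m-2}\,\Gamma(m)}
\end{aligned} \tag{255}$$

(in (255) we assume that $m \neq 1$ and $m \neq 2$. If $m = 1$ or $m = 2$ we use $\tilde{m} = 1.001$ or $\tilde{m} = 2.002$, respectively, in (255), as per (240)). Expressions for $M > 2$ can be obtained in a similar straightforward manner, though the resulting expressions are extremely long.

It is important to note that through the method presented here we can compute $\bar{f}_M^D(\bar{\chi})$ for Nakagami-m fading using only elementary functions and gamma functions, something that can be easily done using numerical computational packages such as Matlab. These exact expressions for $\bar{f}_M^D(\bar{\chi})$ were used in the computation of Fig.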
98, where we see that the theoretical results agree completely with those obtained through simulations.

Appendix E Asymptotic Limits and Simulation Computations of the Cross-Correlation Coefficients $\rho_{n,k}$

In this appendix we discuss the cross-correlation coefficients used in Sec. 4.12.3, and we prove that

$$\rho_{n,k} = \begin{cases} 1 & n = k \\ \rho_1(\chi) & |n-k| = 1 \\ 0 & |n-k| > 1 \end{cases}$$

where $|\rho_1(\chi)| < 0.3$ and $\rho_{n,k}$ are the cross-correlation coefficients of $\{x^D_{M,n}\}$, defined as:

$$\rho_{n,k} = \frac{E\!\left[x^D_{M,n}\,x^D_{M,k}\right] - E\!\left[x^D_{M,n}\right]E\!\left[x^D_{M,k}\right]}{\sqrt{\operatorname{var}\!\left(x^D_{M,n}\right)\operatorname{var}\!\left(x^D_{M,k}\right)}} \tag{256}$$

Let us first assume that no fading is involved. We begin by noting that $x^D_{M,n} = \cos(M\Delta\phi_{D,n})\cos(M\Delta\omega T) = \cos(M(\Delta\phi_n - \Delta\phi_{n-1}))\cos(M\Delta\omega T)$. Since we assumed $\Delta\omega \ll 2\pi/(M\cdot T)$ then this simplifies to $x^D_{M,n} = \cos(M(\Delta\phi_n - \Delta\phi_{n-1}))$. We note that the variables $\{\Delta\phi_n\}_{n=-\infty}^{\infty}$ are mutually independent. From this it immediately follows that $x^D_{M,n}$ and $x^D_{M,k}$ will be independent for all $|n-k| > 1$, which means that $\rho_{n,k} = 0$ for $|n-k| > 1$. Moreover, the fact that $\rho_{n,n} = 1$ is a fundamental result from probability theory which can also be verified by inspection of the definition of $\rho_{n,k}$. Thus, the only remaining issue is the characterization of $\rho_{n,k}$ for $|n-k| = 1$, namely the characterization of the function $\rho_1(\chi)$.

As for operation with fading, we note that since we assumed slow fading, we can assume that the SNR remains constant over two symbol intervals. Thus, ipso facto, we will have the same correlation coefficient $\rho_{n,k}$ for $|n-k| = 1$ as for the non-fading case. Regarding the cases $n = k$ and $|n-k| > 1$, the arguments that led to the conclusion that $\rho_{n,n} = 1$ and $\rho_{n,k} = 0$ for $|n-k| > 1$ remain valid in the presence of slow fading. Hence, we conclude that the coefficients $\rho_{n,k}$ are unaffected by slow fading.

Pursuant to the preceding analysis, we now engage upon characterizing $\rho_{n,k}$ for $|n-k| = 1$, which is the only thing which remains in order to fully qualify these variables. This can be done through stochastic simulations, i.e.
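through computation of $\rho_1(\chi)$ (the notation we use for $\rho_{n,k}$ for $|n-k| = 1$) using simulated sequences of $I(n)$ and $Q(n)$. This is shown in Fig. 125. As seen there, we indeed have $|\rho_1(\chi)| < 0.3$ which was the assumption used in Sec. 4.12.3.

Fig. 125. $\rho_{n,k}$ for $|n-k| = 1$ (also denoted as $\rho_1(\chi)$) as a function of the SNR ($=\chi$).

As seen in Fig. 125, we have $\rho_1(\chi) \to 0$ as $\chi \to 0$ and $\rho_1(\chi) \to 0.25$ as $\chi \to \infty$. We can actually justify these asymptotic values theoretically, an endeavour that we shall presently undertake.

First, let us take a look at the case of $\chi \to 0$. In that limit, we have that there is no signal component in the values of $I(n)$ and $Q(n)$. It is then easy to show (and is also clear intuitively) that $\rho_{n,k}$ for $|n-k| = 1$ tends to 0 (that is, lack of correlation).

Let us now take a look at $\chi \to \infty$ and prove that $\rho_1(\chi) \to 0.25$ in that limit. Assuming that $\chi \to \infty$, we have from the Taylor series expansion that $\cos(M\Delta\phi_{D,n}) \approx 1 - \frac{M^2(\Delta\phi_{D,n})^2}{2}$. Thus:

$$\begin{aligned}
\operatorname{var}\!\left(x^D_{M,n}\right) &\approx \operatorname{var}\!\left(1 - \frac{M^2(\Delta\phi_{D,n})^2}{2}\right) \\
&= E\!\left[\left(1 - \frac{M^2(\Delta\phi_{D,n})^2}{2}\right)^2\right] - \left(E\!\left[1 - \frac{M^2(\Delta\phi_{D,n})^2}{2}\right]\right)^2 \\
&= 1 - M^2E\!\left[(\Delta\phi_{D,n})^2\right] + \frac{M^4}{4}E\!\left[(\Delta\phi_{D,n})^4\right] - \left(1 - M^2E\!\left[(\Delta\phi_{D,n})^2\right] + \frac{M^4}{4}\left(E\!\left[(\Delta\phi_{D,n})^2\right]\right)^2\right) \\
&= \frac{M^4}{4}\left(E\!\left[(\Delta\phi_{D,n})^4\right] - \left(E\!\left[(\Delta\phi_{D,n})^2\right]\right)^2\right)
\end{aligned} \tag{257}$$

Now, at high SNR we have $\Delta\phi_{D,n} \sim N(0, 1/\chi)$ so that $E\!\left[(\Delta\phi_{D,n})^2\right] = \frac{1}{\chi}$ and $E\!\left[(\Delta\phi_{D,n})^4\right] = \frac{3}{\chi^2}$. Thus from (257):

$$\operatorname{var}\!\left(x^D_{M,n}\right) \approx \frac{M^4}{4}\left(\frac{3}{\chi^2} - \frac{1}{\chi^2}\right) = \frac{M^4}{2\chi^2} \tag{258}$$

We now turn our attention to the numerator of (256). We assume $k = n-1$ (a similar derivation will give the same result for $k = n+1$). Since $E\!\left[x^D_{M,n}\right] = E\!\left[\cos(M\Delta\phi_{D,n})\right]$ and since at high SNR $\Delta\phi_{D,n} \sim N(0, 1/\chi)$ we have (using $\int_0^{\infty} e^{-ax^2}\cos(bx)\,dx = \frac{1}{2}\sqrt{\frac{\pi}{a}}\,e^{-b^2/(4a)}$ [54 eq. 15.73]):

$$E\!\left[x^D_{M,n}\right] = E\!\left[x^D_{M,n-1}\right] \approx \int_{-\infty}^{\infty}\cos(M\tau)\sqrt{\frac{\chi}{2\pi}}\exp\!\left(-\frac{\chi\tau^2}{2}\right)d\tau = \exp\!\left(-\frac{M^2}{2\chi}\right) \tag{259}$$

Furthermore:

$$\begin{aligned}
E\!\left[x^D_{M,n}\,x^D_{M,n-1}\right] &= E\!\left[\cos(M\Delta\phi_{D,n})\cos(M\Delta\phi_{D,n-1})\right] \\
&= E\!\left[\tfrac{1}{2}\cos\!\left(M(\Delta\phi_{D,n} + \Delta\phi_{D,n-1})\right) + \tfrac{1}{2}\cos\!\left(M(\Delta\phi_{D,n} - \Delta\phi_{D,n-1})\right)\right] \\
&= \tfrac{1}{2}E\!\left[\cos\!\left(M(\Delta\phi_n - \Delta\phi_{n-1} + \Delta\phi_{n-1} - \Delta\phi_{n-2})\right) + \cos\!\left(M(\Delta\phi_n - \Delta\phi_{n-1} - \Delta\phi_{n-1} + \Delta\phi_{n-2})\right)\right] \\
&= \tfrac{1}{2}\left(E\!\left[\cos\!\left(M(\Delta\phi_n - \Delta\phi_{n-2})\right)\right] + E\!\left[\cos\!\left(M(\Delta\phi_n - 2\Delta\phi_{n-1} + \Delta\phi_{n-2})\right)\right]\right)
\end{aligned} \tag{260}$$

Let us define $\psi = \Delta\phi_n - \Delta\phi_{n-2}$ and $\kappa = \Delta\phi_n - 2\Delta\phi_{n-1} + \Delta\phi_{n-2}$. At high SNR we have that $\Delta\phi_n \sim N(0, 1/(2\chi))$ for all $n$.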
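Hence, since $\{\Delta\phi_n\}_{n=-\infty}^{\infty}$ are mutually independent it is easy to show that we have:

$$\psi \sim N(0, 1/\chi) \tag{261}$$

and:

$$\kappa \sim N(0, 3/\chi) \tag{262}$$

Thus (using $\int_0^{\infty} e^{-ax^2}\cos(bx)\,dx = \frac{1}{2}\sqrt{\frac{\pi}{a}}\,e^{-b^2/(4a)}$ [54 eq. 15.73]):

$$E\!\left[\cos\!\left(M(\Delta\phi_n - \Delta\phi_{n-2})\right)\right] = \int_{-\infty}^{\infty}\cos(M\psi)\sqrt{\frac{\chi}{2\pi}}\exp\!\left(-\frac{\chi\psi^2}{2}\right)d\psi = \exp\!\left(-\frac{M^2}{2\chi}\right) \tag{263}$$

and:

$$E\!\left[\cos\!\left(M(\Delta\phi_n - 2\Delta\phi_{n-1} + \Delta\phi_{n-2})\right)\right] = \int_{-\infty}^{\infty}\cos(M\kappa)\sqrt{\frac{\chi}{6\pi}}\exp\!\left(-\frac{\chi\kappa^2}{6}\right)d\kappa = \exp\!\left(-\frac{3M^2}{2\chi}\right) \tag{264}$$

Substituting (263) and (264) into (260) we get:

$$E\!\left[x^D_{M,n}\,x^D_{M,n-1}\right] = \frac{1}{2}\left(\exp\!\left(-\frac{M^2}{2\chi}\right) + \exp\!\left(-\frac{3M^2}{2\chi}\right)\right) \tag{265}$$

Substituting (265), (259), and (258) into (256) we get:

$$\rho_{n,n-1} = \frac{\dfrac{1}{2}\left(\exp\!\left(-\dfrac{M^2}{2\chi}\right) + \exp\!\left(-\dfrac{3M^2}{2\chi}\right)\right) - \left(\exp\!\left(-\dfrac{M^2}{2\chi}\right)\right)^2}{\dfrac{M^4}{2\chi^2}} \tag{266}$$

It is convenient to define $\xi = \frac{1}{\chi}$ and express (266) using this new variable. (Footnote 21: Note that this quantity has nothing to do with the phase detector self-noise that was used in Chapter 3 and which was written using a similar notation.) We have:

$$\rho_{n,n-1} = \frac{\dfrac{1}{2}\left(\exp\!\left(-\dfrac{M^2\xi}{2}\right) + \exp\!\left(-\dfrac{3M^2\xi}{2}\right)\right) - \exp\!\left(-M^2\xi\right)}{\dfrac{M^4\xi^2}{2}} \tag{267}$$

The limit $\chi \to \infty$ is equivalent to $\xi \to 0^+$. Taking this limit upon (267) and using L'Hôpital's rule we have:

$$\lim_{\xi\to 0^+}\rho_{n,n-1} = \lim_{\xi\to 0^+}\frac{\dfrac{d}{d\xi}\left[\dfrac{1}{2}\left(\exp\!\left(-\dfrac{M^2\xi}{2}\right) + \exp\!\left(-\dfrac{3M^2\xi}{2}\right)\right) - \exp\!\left(-M^2\xi\right)\right]}{\dfrac{d}{d\xi}\left[\dfrac{M^4\xi^2}{2}\right]} = \lim_{\xi\to 0^+}\frac{\dfrac{1}{2}\left(-\dfrac{M^2}{2}\exp\!\left(-\dfrac{M^2\xi}{2}\right) - \dfrac{3M^2}{2}\exp\!\left(-\dfrac{3M^2\xi}{2}\right)\right) + M^2\exp\!\left(-M^2\xi\right)}{M^4\xi} \tag{268}$$

Both the denominator and numerator still tend to 0 so we use L'Hôpital's rule again to yield:

$$\lim_{\xi\to 0^+}\rho_{n,n-1} = \lim_{\xi\to 0^+}\frac{\dfrac{1}{2}\left(\dfrac{M^4}{4}\exp\!\left(-\dfrac{M^2\xi}{2}\right) + \dfrac{9M^4}{4}\exp\!\left(-\dfrac{3M^2\xi}{2}\right)\right) - M^4\exp\!\left(-M^2\xi\right)}{M^4} = \frac{1}{2}\left(\frac{1}{4} + \frac{9}{4}\right) - 1 = \frac{1}{4} \tag{269}$$

which is what we set out to prove.

To summarize this appendix, we have used stochastic simulations to show that

$$\rho_{n,k} = \begin{cases} 1 & n = k \\ \rho_1(\chi) & |n-k| = 1 \\ 0 & |n-k| > 1 \end{cases}$$

where $|\rho_1(\chi)| < 0.3$. Furthermore, we used heuristic and mathematical derivations to justify the asymptotic values of $\rho_1(\chi)$, namely that $\rho_1(\chi) \to 0$ as $\chi \to 0$ and $\rho_1(\chi) \to 0.25$ as $\chi \to \infty$.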
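The two ingredients of this appendix, the high-SNR expression (266) and a stochastic simulation of the lag-one coefficient, can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the simulation assumes that the modulation has been removed, that the carrier is otherwise perfectly synchronized, and that the noise is additive white Gaussian with $N_0 = 1$, so that each symbol is modelled as $I(n) = \sqrt{\chi} + n_I$, $Q(n) = n_Q$ with noise variance 1/2 per component. The function names, sample size, and seed are hypothetical and do not come from the thesis simulations.

```python
import math
import random


def rho1_high_snr(M, chi):
    """High-SNR approximation (266) of the lag-one correlation coefficient."""
    a = M * M / chi
    numerator = 0.5 * (math.exp(-a / 2) + math.exp(-3 * a / 2)) - math.exp(-a)
    denominator = M ** 4 / (2 * chi ** 2)
    return numerator / denominator


def rho1_monte_carlo(M, chi, n_symbols=50000, seed=1):
    """Sample estimate of rho_{n,n-1} from simulated I(n), Q(n) (assumed model)."""
    rng = random.Random(seed)
    amp = math.sqrt(chi)        # signal amplitude for Es/N0 = chi with N0 = 1
    sigma = math.sqrt(0.5)      # noise standard deviation per component
    # Phase error of each symbol, independent from symbol to symbol.
    phases = [
        math.atan2(rng.gauss(0.0, sigma), amp + rng.gauss(0.0, sigma))
        for _ in range(n_symbols)
    ]
    # x_n = cos(M * (dphi_n - dphi_{n-1})), i.e. the simplified form used above.
    x = [math.cos(M * (phases[i] - phases[i - 1])) for i in range(1, n_symbols)]
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    cov = sum((x[i] - mean) * (x[i - 1] - mean) for i in range(1, len(x))) / (len(x) - 1)
    return cov / var


if __name__ == "__main__":
    for chi in (1e2, 1e3, 1e4):
        print(chi, rho1_high_snr(4, chi))
    print(rho1_monte_carlo(4, 100.0))
```

The first function approaches 0.25 from below as the SNR grows, in line with (269); the second gives a noisy sample estimate whose spread shrinks roughly as the inverse square root of the number of simulated symbols.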
