BIDIRECTIONAL SEQUENTIAL DECODINGbyKaiping LiM. A, Sc., Northern Jiao-Tong University, Beijing, P. R. of China, 1986B. A. Sc., Northern Jiao-Tong University, Beijing, P. R. of China, 1983A THESIS SUBMITTED IN PARTIAL FULFILLMENT OFTHE REQUIREMENTS FOR THE DEGREE OFDOCTOR OF PHILOSOPHYinTHE FACULTY OF GRADUATE STUDIESDEPARTMENT OF ELECTRICAL ENGINEERINGWe accept this thesis as conformingredstThe UNIVERSITY OF BRITISH COLUMBIAJune 1994© KAIPING LI, 1994In presenting this thesis in partial fulfilment of the requirements for an advanceddegree at the University of British Columbia, I agree that the Library shall make itfreely available for reference and study. I further agree that permission for extensivecopying of this thesis for scholarly purposes may be granted by the head of mydepartment or by his or her representatives. It is understood that copying orpublication of this thesis for financial gain shall not be allowed without my writtenpermission.(Signature)Department of CiThe University of British ColumbiaVancouver, CanadaDate____________DE-6 (2188)AbstractThe main drawback of sequential decoding is the variability of its decoding effortwhich could cause decoding erasures. We propose and analyze in this dissertation efficientbidirectional sequential decoding (B SD) techniques to alleviate this drawback, In theproposed BSD, two decoders are used; one is called forward decoder (PD), and is usedto search the tree from forward direction; while the other is called backward decoder(BD), and is used for the backward search of the tree. Forward decoding and backwarddecoding are performed simultaneously, and stop somewhere in the tree. In one class ofBSD, to which we refer as BSD-merge, decoding stops whenever FD and BD merge ata common encoder state some where in the tree. In the other class of BSD, that is BSDno-merge, no common encoder state is required, and decoding stops when FD meetsBD. Different BSD algorithms based on the stack algorithm are constructed; namelyAlgorithm TAmeet which belongs to the class of B SD-no-merge, Algorithm TAmergeand Algorithm Timerge which belong to BSD-merge, and finally Algorithm HTTmergewhich is a hybrid version of BSD-merge and BSD-no-merge.The relationships between backward coding and forward coding are examined indetail. Good convolutional codes, with memory m ranging from 2 to 25, suitable forbidirectional decoding, found through extensive computer search, are provided. Thesecodes possess the same distance properties from both forward and backward directions.It is found by analysis and computer simulations that the distribution of the totalnumber of computations per decoded block of the proposed BSD is still Pareto, as thatof unidirectional sequential decoding (USD). However, the advantage of the proposed11BSD appears as an increase in the Pareto exponent, and hence as a decrease in thecomputational variability and erasure probability. More specifically, we prove by usingthe random coding approach that the Pareto exponent of BSD using Algorithm TAmeetis asymptotically twice that of USD, and conjecture that this also applies to AlgorithmTAmerge. On the other hand, it is found that the computational cutoff rate remainsunchanged, but the use of our BSD reduces the average number of computations perdecoded bit.Using the random coding approach, we show that the error performance of BSDmerge is asymptotically the same as that of USD. Moreover, we show that the bit errorprobability of Algorithm TAmeet satisfies the random coding bound for block codes.Computer simulations are provided to confirm the analytical findings.The use of BSD-merge substantially reduces the computational variability of conventional sequential decoding without compromising the error performance, However,BSD does not completely eliminate the erasure problem. We therefore combine our BSDidea in conjunction with the multiple stack algorithm (MSA), which is an erasure-freedecoding algorithm. It is shown through analysis and computer simulations that thenew bidirectional multiple stack algorithm (BMSA) offers substantial advantages overthe MSA in terms of computational effort, memory requirements and error performance.The BMSA appears as an attractive alternative to the Viterbi algorithm (VA) where lowerror probabilities and high decoding speeds are required.111ContentsAbstractList of TablesList of FiguresList of SymbolsAcknowledgmentChapter 11.11.2Chapter 22.12.22.32.4Chapter 33.13.23.3viiixxiixvii189111116202733344150IntroductionOutline of the DissertationClaim of Contributions of this Dissertation .Convolutional Coding and Sequential DecodingConvolutional CodingSequential DecodingBackward Coding/DecodingSearch for Good Symmetric CodesBidirectional Sequential Decoding AlgorithmsAlgorithm TAmeet, a BSD-no-mergeAlgorithm TAmerge, a BSD-mergeCoarsening on Algorithm TAmergeivChapter 44.14.24.34.44.54.6Chapter 55.15.25.35,4Chapter 66.16.26.36.46.4.16.4.26.564666871919193109110115116122125135136138138Computational Performance of BSD 55A Simple Approach to Computational Distribution 55Upper Bounds to Computational Distribution of AlgorithmTAmeet 58Extension to Other BSD AlgorithmsLower Bound to Computational DistributionAverage Number of ComputationsNumerical and Simulation ResultsError Performance of BSDError Performance of USDError Performance of BSD-mergeError Performance of Algorithm TAmeet .Numerical and Simulation ResultsBidirectional Multiple Stack AlgorithmThe Multiple Stack Algorithm (MSA)Bidirectional Multiple Stack Algorithm (BMSA)Computational Properties of the BMSA . . .Error Performance of the BMSAEffect of Parameter Z1Effects of Parameters T and ZComparison with Viterbi DecodingVChapter 7 Conclusions and Suggestions for Further Research . . . 141References 145Appendix B Proof of Theorem 4.4 . 150Appendix C List of Acronyms 154viList of TablesTable 2.1 Symmetric bidirectional optimum distance profile (SBODP)codes ,.,,.,.,.,..,.,,...,,.,...,,,., 28Table 2.2 Distance spectra of SBODP codes ................. 29Table 2.3 Symmetric almost bidirectional optimum distance profile(SABODP) codes 31Table 2,4 Distance spectra of SABODP codes 32Table 4.1 Comparison of average computations for L = 400, Clim = 8000and p = 0.0409 75Table 4.2 Comparison of average computations for L = 200, Clim = 4000and p = 0.0594 81Table 4.3 Comparison of average computations for L = 400, Clim 4000and p = 0.0289. 83Table 4.4 Comparison of computational and error performances ofdifferent m with the new definition of one computation 88Table 5.1 Error performance comparison of different algorithms for thecase Pr = 1.1, i.e., R < Rcomp 112Table 5.2 Error performance comparison of different algorithms for thecase Pr = 0.68, i.e., R > Rcomp ................. 113Table 5.3 Error performance comparison of Algorithm TAmeet (z = 1)for different block length L 114viiTable 5.4 Error performance comparison of Algorithm TAmerge ( = 1)for different block length L 114Table 6.1 Average number of computations as a function of Z1 134Table 6.2 Average number of computations as a function of T . . . . 135Table 6.3 Average number of computations as a function of Cijm . . . 135Table 6,4 Error performance as a function of parameter T 138Table 6.5 Error performance as a function of parameter Z 138vuList of FiguresFigure 1.1 General digital communication system 1Figure 1.2 Binary symmetric channel 3Figure 2.1 Convolutional encoder for the (3, 1, 2) code 12Figure 2.2 Forward code trellis diagram for the encoder of Figure 2.1. . 13Figure 2.3 Forward code tree diagram of the encoder of Figure 2.1. . . 14Figure 2.4 Backward code trellis diagram for the encoder of Figure 2.1 . 21Figure 2.5 Backward code tree diagram of the encoder of Figure 2.1,. . 22Figure 3.1 Illustration of the stopping rules of Algorithm TAmeet. . . . 35Figure 3.2 Illustration of the overlapping in BSD-merge........... 42Figure 3.3 Example of opposing paths passing each other 43Figure 3,4 Typical impossible event in Algorithm TAmerge......... 44Figure 3.5 Illustration of a typical impossible event in Figure 3.4...... 45Figure 3.6 Illustration of the possible event ..........,..... 45Figure 3.7 Illustration of another impossible event 46Figure 3.8 Distinct forward and backward paths in the FS and BS 47Figure 3.9 Merging test in Algorithm TTmerge 52Figure 4.1 (a) Correct path metric with forward decoding. (b) Correct pathmetric with backward decoding. (c) Correct path metric withbidirectionaldecoding 57Figure 4.2 Distribution of the total number of computations per decodedblock for the case L = 400 and p = 0.0409, i.e., Pr = 1.1. . . 73ixFigure 4,3 Distribution of the total number of computations per decodedblock for different codes using the same parameters as inFigure 4.2 77Figure 4.4 Distribution of the total number of computations per decodedblock for a systematic ODP code (Pr = 1.1) 78Figure 4.5 Disthbution of the total number of computations per decodedblock for the case L = 200 and p = 0.0594, i.e., Pr = 0.68. . 80Figure 4.6 Distribution of the total number of computations per decodedblock for the case L = 400 and p = 0.0289, i.e., Pr = 1.5. . . 82Figure 4.7 Probability mass of merging and meeting when code rate <Rcomp 84Figure 4.8 Probability mass of merging and meeting when code rate >Rcomp 86Figure 4.9 Distribution of the overlapping length 87Figure 4.10 Distribution of the total number of computations for differentm with the new definition of one computation 89Figure 5.1 Example of error events with a correct merging encoder state. 94Figure 5.2 Example of error events with an incorrect merging encoderstate 94Figure 5.3 Illustration of the merging error event 97Figure 6.1 Illustration of the multiple stack algorithm 117Figure 6.2 Empirical computational distribution for the final decision ofthe MSA 120xFigure 6.3 The first stack saving as a function of P1 128Figure 6,4 Empirical computational distribution for the final decisionusing Z1 as a parameter 130Figure 6.5 Empirical computational distribution for the final decisionusing T as a parameter 131Figure 6.6 Empirical computational distributions for the final decisionusing Z as a parameter 132Figure 6.7 Bit error probability of the BMSA and MSA as a function ofZ1 137Figure 6.8 Comparison on error performance for different algorithms. . 140xiList of SymbolsDefinitionLrl Smallest integer not less than r[TJ Largest integer not greater than rDefined by R = E0(o)/oSubstack spacing7 Fano metricOne branch metric dropPr Defined by R =Indicator functionad Total number of paths of weight dfree +free +C 2 max [mm (c[, c_)]CM8A Total number of computations needed to decode one branch using theBMSACSD Total number of computations needed to decode one branch using BSDCISA Total number of computations needed to decode one branch using theMSAC’5-’ Total number of computations needed to decode one branch using USDCcrir Critical number of computationsChannel capacityxiiCdjree+j Total number of nonzero information bits on all weight dfree + i pathsCf5D Total number of computations required to decode a block of L branchesby any BSD algorithmCf, Cf Total number of computations required by forward and backwarddecoders to decode a whole block of L branches, respectivelyC, Cf_i Total number of computations needed by forward and backward decoders to decode x or (L — x) branches without searching beyond theboundary at level x, respectivelyAF ‘iB‘L—z Total number of computations required by forward and backwarddecoders to correctly decode x or (L— x) branches, respectivelyCi The i-th branch computationsCL Total number of computations needed to decode a block of L branchesby using Algorithm TAmeetCL Total number of computations required to decode a block of L branchesby using Algorithm TAmerge with the assumption of correct decodingTAmergeCL Total number of computations required to decode a block of L branchesby using Algorithm TAmergeCfM Total number of computations required to decode a block of L branchesby using the BMSATotal number of computations required to decode a block of L branchesby using the MSAXIIICiim Computational limitTotal number of computations required to correctly decode the firstA branches using USDD Unit delayd Distance profiledfree Free distanced[], d[.] Column distance function of forward and backward codes, respectivelyIncorrect path subset generated from level iE[1 Expectation functionE0() Gallager functionSmallest concave function E()G(D), Gr(D) Transfer function matrix of forward and backward codes, respectivelyH() Entropy functionI Identity matrixK Code constraint lengthk Number of bits per unit time at the input of an encoderL Block length (including a known tail)I Meeting point‘F lB The farthest level reached by forward and backward decoders, respectivelyxiv1FTOP’ 1BTOP The length of top path in FS and BS, respectively1 Merging pointM[] Path metric functionWI Total encode memorym Encode memory ordermh Number of matching information symbols in the merging testNb Total number of bit errors given Smg incorrectn Number of bits per unit time at the output of an encodernb(r) Expected number of bit errors caused by an incorrect path divergingat level r1 + l — L at the end of decoding.Pj The first (primary) stack overflow probabilityPb Bit error probabilityBlock error probabilityPer Erasure probabilityp Crossover probability of BSCR Code rate (flats per channel symbol)r Code rate (bits per channel symbol)Rcomp Computational cutoff rateSF, S5 Current number of forward and backward stacks, respectivelyS1 Encoder state ixvSmg Common encoder state at 10The first (primary) stack size saving ratiox(i) Correct path from root node to level ix(t) Incorrect path which diverges from the correct path at level i andremerges at level (i + t)Incorrect path which diverges from the correct path at level i andstretches out t branches from the correct node iThe set of all possible incorrect paths which diverge from the correctpath at level iX(i;j) The set of all incorrect paths which diverge and remerge with thecorrect path at level i and (L- j)Total number of elements in .(i;j)Z = 2zpMsAZ1 ZM = 2ZM,i = 2,3,••ZpM The i-th forward or backward stack size in the BMSAZj The i-th stack size in the MSAxviAcknowledgmentI would like to thank my wife, Libing, for her constant encouragement, support, andunderstanding during these years. I am also greatly indebted to my research supervisor,Dr. Samir Kallel, for his continual encouragement and highly constructive comments. Iwish to express my sincere thanks for his guidance throughout this research.I would like to thank Mr. Dimitrios P. Bouras, Mr. Siavash Massoumi, Mr.Longxiang Dai, and Mr. Weimin Sun for their helpful discussions.I would also like to thank Dr. Samir Kallel, Mr. Siavash Massoumi, and Mr.Dimitrios P. Bouras for their help with transcript changes after I joined OKI Telecom.Finally, I would like to thank Mrs. Vicki Waidro-Snider and Mr. Peter Howard for theirinvaluable help in perfecting the use of the English language throughout this text.xviiChapter 1IntroductionDigital communication deals primarily with the process of transmitting digital information from one point to another through a channel which is subject to noise disturbances.This implies that the information received at the destination (prior to processing) is not always exactly the same as transmitted. Shannon [11 showed in his famous coding theoremthat digital information can be transmitted with arbitrarily low error probability providedthat the data rate is smaller than the channel capacity. Since then, numerous coding anddecoding techniques have been proposed in the attempt to approach Shannon’s promiseof reliable communication near the channel capacity. The increasing demand for efficientand reliable digital communications over noisy channels has led to the use of sophisticatedcoding and decoding techniques for the purpose of error correction. These techniques,called forward error correction (FEC) are useful, especially with power-limited systems.r Source______Channel______Source Encoder Encoder MoDestination H-H_::r DemodulatorFigure 1.1 General digital communication system.1Chapter 1 IntroductionA general block diagram of a digital communication system is given in Figure 1.1.An arbitrary source generates a signal which is then transformed by a source encoder intoa redundancy-free digital information sequence. If the source is analog, analog-to-digital(A/D) conversion is performed at the source encoder. Generally, source and sourceencoder are treated as a digital source which emits an equal likely binary sequence. Theinformation sequence then enters the channel encoder which adds redundancy to combatthe channel noise. The outputs from the channel encoder, referred to as codewords, arethen mapped into a waveform signal by the modulator for transmission over the waveformchannel, The waveform channel may represent a telephone line, a wireless radio link, aspace communication link, etc. The demodulator processes the received waveform signaland produces an output that may be discrete (quantized) or continuous (unquantized). Theoutput of the demodulator is called the received sequence. The channel decoder, basedon the received sequence and the known structure of the channel encoder has to selectthe best possible transmitted sequence. The purpose of the channel encoder and decoderis to allow the digital information sequence to be efficiently and reliably reproduced atthe output of the channel decoder. The decoded sequence then enters the source decoder,which reproduces a signal that is delivered to the user. If the source signal is continuous,this also involves digital-to-analog (D/A) conversion.To focus attention on the channel encoder and channel decoder, the modulator/demodulator (modem) and the waveform channel are combined into a coding channel.In this dissertation, a discrete memoryless channel (DMC) is assumed, which constitutesthe simplest class of coding channel models and is defined as follows. The input is asequence of elements from a finite alphabet, and the output is a sequence of elements2Chapter 1 Introductionfrom the same or different alphabet. Moreover, each element in the output sequence isstatistically dependent only on the element in the corresponding position of the inputsequence and is determined by a fixed conditional probability assignment. For example,the binary symmetric channel (BSC), which is shown in Figure 1.2, is a DMC with binaryinput and output alphabets, where each digit at the channel input is reproduced correctlyat the channel output with some fixed probability (1-p) and is altered by noise into theopposite digit with probability p.Input alphabet Output alphabet0 01 1Figure 1.2 Binary symmetric channel.A multitude of methods on coding and decoding techniques which permit low errorprobability in digital communications have been proposed and investigated. Since excellent introductory literature into coding/decoding theory is available [2—8], the followingsurvey will therefore be limited to matters relevant to this thesis, that is convolutionalencoding with probabilistic decoding. Supporting literature is found in the references.A convolutional code is described by three integers, n, k, and m. At each unit time,the encoder accepts k-bit of the information sequence and produces a codeword (encodedsequence) of n-bits. The integer m is a parameter known as the code memory length. Animportant characteristic of an (n, k, m) convolutional code is that the encoder has memory‘-p3Chapter 1 Introduction— the n-bits emitted by the convolutional encoding procedure is not only a function ofthe k-bit input, but is also a function of its previous m inputs.Viterbi decoding [9] and sequential decoding’ [11] are the two main probabilisticdecoding methods for convolutional codes. The Viterbi algorithm (VA) is an optimummaximum likelihood decoding scheme, but its optimality is obtained at a large decodingeffort. The number of computations performed per information digit is constant, butincreases exponentially with the code memory length. In practice, the VA can thus beemployed only with short memory codes.Historically, sequential decoding was proposed in 1957 by Wozencraft and Reiffen[11] as the first practical decoding procedure for convolutional codes. In sequentialdecoding, the tree-like structure of the convolutional code is used in a step-by-step searchof the most likely transmitted sequence. As long as the data rate does not exceed aquantity called computational cutoff rate R01, which is less than the channel capacity,the average number of computations required to decode one information digit is smalland independent of the code memory length.The error probability of sequential decoding decreases exponentially with the codememory length [12], and it is lower bounded by that of the VA. As m —* co, however,both decoding techniques yield the same error performance. In addition, the decodingcomplexity of sequential decoding is practically insensitive to the code memory. Consequently, convolutional encoding with sequential decoding is at present one of the bestknown methods to achieve an arbitrarily low error probability over a DMC [7, 8, 13, 14].In this dissertation, sequential decoding is used in a narrow sense. That is, it only includes the metric-first and the depth-firstalgorithms although the breadth-first algorithm can be categorized as sequential decoding in a wide sense [10].4Chapter 1 IntroductionConvolutional coding with sequential decoding has been used in various systems such assatellite communications, deep space communications, etc.There are two principle sequential decoding algorithms: the Fano algorithm [13] andthe stack algorithm proposed independently by Zigangirov [15] and Jelinek [16]. Regardless of the algorithm, a major problem with sequential decoding is the computationalvariability. As a consequence of this variability, the decoding effort for a given receivedsequence may occasionally exceed the physical limitations of the decoder, leading inevitably to buffer overflows and infonnation erasures. Thus, while the undetected errorprobability of sequential decoding can be made arbitrarily small, the erasure probabilityor buffer overflow remains substantial [7, 8, 17]. Although the disadvantage of overflows or erasures of sequential decoding may be useful in systems with retransmissioncapabilities, it is desirable to make the erasure probability as small as possible withoutcompromising the error performance in order to achieve higher throughput and reliability. In addition, there are many other cases where erasures are undesirable. For example,in real-time systems with long path delays, erasures can cause complete data loss, Inthis case, the main contribution of errors comes from the buffer overflow. Thus, it isvery important to reduce the erasure probability. The prime objective of this researchis tailored at alleviating the computational variability, and thus minimizing erasures insequential decoding.In the past, several modifications of sequential decoding to reduce the erasureprobability have been reported and implemented. Excessively long tree searches canbe terminated early by placing a backsearch limit on the Fano and the stack algorithms,5Chapter 1 Introductionthereby lowering the erasure probability [14]. A similar effort is achieved with the stackalgorithm by deleting the path with the lowest metric from the stack whenever the stackfills up [15, 16]. Although these methods accelerate decoding and reduce the erasureprobability, the resulting increase in the error probability is found to be substantial [18].Haccoun and Ferguson [19] proposed a class of generalized (or multiple path) stackalgorithms which extends several paths simultaneously from the top of the stack. It wasshown that the variability of the computational effort is reduced, but at a cost of a largeaverage number of computations compared to the ordinary stack algorithm.In order to eliminate erasures entirely, Chevillat and Costello [20, 21] introduced aninteresting multiple stack algorithm (MSA). It is a modification of the stack algorithmand uses several stacks to accommodate the problem of stack overflow. However, it isfound that the resulting increase in stack and buffer storage is substantial.Fomey [22] in 1967 suggested that sequential decoding can be started from the endof a block which is terminated by a known tail. He proposed a bidirectional decodingscheme which uses a forward decoder and a backward decoder to search a terminatedtree from initial and final encoder states, respectively. Decoding with Forney’s schemestops whenever either the forward or the backward decoder reaches the end of its tree.Computer simulations revealed that this scheme does not offer much improvement interms of computational performance.The idea of bidirectional search has also been used for computing the free distanceof convolutional codes [23, 24] and distance spectrum of trellis codes [25]. Recently,a bidirectional decoding search has been applied to the decoding of block codes [26].6Chapter 1 IntroductionMost recently, Haccoun and Beizile [27, 28] investigated bidirectional breadth-first Malgorithm for the decoding of convolutional codes. Although erasures are avoided by thenature of the breadth-first M algorithm used, the loss of the correct path problem, whichis inherent to the M algorithm, limits the performance of this scheme.We propose and analyze in this dissertation efficient bidirectional sequential decoding(BSD) techniques [29]. It is shown that the proposed BSD techniques substantiallyalleviate the drawbacks of conventional or unidirectional sequential decoding (USD).In the proposed BSD, two decoders are used, like in Forney’s scheme; one is calledforward decoder (FD), and is used to search the tree from forward direction; the otheris called backward decoder (BD), and is used for the backward search of the tree.Forward decoding and backward decoding are performed simultaneously. However,unlike Forney’s scheme, decoding in the proposed BSD techniques stops somewherein the tree, In one class of BSD, which we refer to as BSD-merge, decoding stopswhenever FD and BD merge at a common encoder state somewhere in the tree. Inthe other class of BSD, that is BSD-no-merge, decoding stops when FD meets BD.Different BSD algorithms based on the stack algorithm are constructed; namely AlgorithmTAmeet which belongs to the class of B SD-no-merge, Algorithm TAmerge and AlgorithmTTmerge which belong to B SD-merge, and Algorithm HTTmerge which is a hybridversion of BSD-merge and BSD-no-merge. Finally, to eliminate erasures entirely, wecombine Algorithm TTmerge in conjunction with the MSA of Chevillat and Costello[20, 18, 21], to obtain a more efficient and reliable erasure-free algorithm, which we callbidirectional multiple stack algorithm (BMSA).7Chapter 1 Introduction1.1 Outline of the DissertationChapter 2 is a brief summary of the basic elements of convolutional coding andsequential decoding. The idea and basic properties of backward coding/decoding are alsopresented in this chapter. Moreover, good convolutional codes suitable for bidirectionaldecoding, which are found through computer search, are listed in this chapter.In Chapter 3, the proposed BSD algorithms are described and their fundamentalproperties are investigated. Furthermore, our proposed BSD algorithms are comparedwith the existing BSD algorithms which can be classified under the category of BSDno-merge in a wide sense.A theoretical analysis of the computational performance of the proposed BSD algorithms is given in Chapter 4, together with extensive computer simulation results. It isfound, both by theoretical analysis as well as computer simulations, that the distributionof the total number of computations per decoded block by any of the proposed BSDalgorithm is still Pareto, as that of USD. However, the advantage of the proposed BSDappears as an increase in the Pareto exponent, and hence as a decrease in the computational variability. More specifically, we prove by using the random coding approachthat the Pareto exponent of BSD using Algorithm TAmeet is asymptotically twice that ofUSD, and conjecture that this also applies to Algorithm TAmerge. On the other hand, it isfound that the computational cutoff rate remains unchanged, but the use of the proposedBSD reduces the average number of computations per decoded bit. This reduction incomputational variability and average number of computations translates into a substantial decrease in the erasure probability (or buffer overflow problem), which is the main8Chapter 1 Introductiondrawback of sequential decoding.The error performance of the proposed BSD techniques is explored in Chapter 5by both theoretical analysis and computer simulations. It is found that BSD-merge hasasymptotically the same error performance as USD, while the bit error probability ofAlgorithm TAmeet follows the random coding bound for block codes.In Chapter 6, an erasure-free BSD technique based on the MSA [18, 211, whichwe call bidirectional multiple stack algorithm (BMSA), is proposed and analyzed. It isfound that the BMSA performs better than the conventional MSA, in terms of memoryrequirements, computational effort and error performance.Chapter 7 contains final conclusions and suggestions for future research,Appendix A contains a rigorous proof of Theorem 4.4 which suggests that underthe assumption of correct decoding the computational cutoff rate of BSD is the sameas that of USD.Appendix B lists acronyms used in this dissertation.1.2 Claim of Contributions of this DissertationThe major contributions of this research are summarized below.The properties of backward coding/decoding are examined in great detail. Morespecifically, the relationships between a backward code and the corresponding forwardcode are given through Theorems 2.1 and 2.2, Corollaries 2.1 and 2.2 and Properties2.1 and 2.2. Good convolutional codes suitable for bidirectional decoding found throughcomputer search are provided.9Chapter 1 IntroductionEfficient BSD techniques that alleviate the drawbacks of conventional sequentialdecoding are proposed and examined by analysis and extensive computer simulations.The main theoretical contributions in the analysis of the proposed BSD techniques areas follow:1, A computational upper bound for Algorithm TAmeet is found for a specific time-invariant convolutional code (Theorem 4.1).2. A computational upper bound for Algorithm TAmeet is discovered for an ensembleof trellis codes (Theorem 4.2). It is conjectured that this bound also applies toAlgorithm TAmerge.3. A computational lower bound is obtained for any code and any type of BSD (Theorem4.3).4. The computational cutoff rate of BSD is found to be the same as that of USD(Theorem 4.4).5. The error performance of BSD-merge is proved to be asymptotically the same as thatof USD (Theorems 5.1 to 5.3, Corollaries 5.1 and 5.2).6. The bit error probability of Algorithm TAmeet is found to satisfy the random codingbound for block codes.An erasure-free algorithm, the BMSA, is proposed and analyzed. Through computersimulations and analysis, we demonstrate that the new BMSA performs better that theconventional MSA.10Chapter 2Convolutional Coding andSequential Decoding2.1 Convolutional CodingAn (n, k, m) convolutional encoder can be implemented with k shift registers, nmodulo-2 adders and a commutator that scans the output of the adders. The length ofeach of the k shift registers is less than or equal to the encoder memory order m [81,2The connections of the n adders to the k shift registers define the code. This is generallyspecified with a so-called k x n transfer function matrix G(D) [8], given byg(D) g2(D) ... g(D)g1(D) g2(D) ... g(D)G(D) = 2 2 2 (2.1)g3’(D)g’(D) g2(D) ... g(D)where g (D) is a polynomial of degree m that indicates the connection of the j-thadder to the i-th shift register. For example, Figure 2.1 represents a (3, 1, 2) encoderwithG(D) = [1+D,l+D2,l +].Encoding is performed k-bit at a time. Each k-bit to be encoded is fed to the encoderfrom the left side, and the content of the k shift registers is shifted one position to theright. The modulo-2 adders are then sampled in sequence by the commutator, producingn output coded bits. For convenience, we refer to this as forward encoding.Defining fl max [deg g3)(D)] as the total encoder memory [8] and the statei=1 1<j<nof the encoder as its k shift registers’ contents, there are a total of 2 different possible2 The memory order m is related with the constraint length K [7], which is defined as the maximum number of present and pastencoder input symbols that influence any output symbol, i.e., K = m + 1.11Chapter 2 Convolutional Coding and Sequential DecodingFigure 2.1 Convolutional encoder for the (3, 1, 2) code.states. Then, the operations of the encoder can be described using a state diagram, wherethe n output bits are a function of the state and the k input bits [8]. Operations of aconvolutional encoder can also be described by a trellis or a tree diagram [8]. Figures2.2 and 2.3 represent six-level trellis and tree diagrams, respectively, for the (3, 1, 2)encoder of Figure 2.1, corresponding to information sequence of length 4 bits with azero tail of length 2. The upper branch leaving each state in both the trellis and treediagrams in Figures 2.2 and 2.3 represents input one, while the lower branch representsinput zero. Starting from the initial state So, each path in the trellis or tree diagramcorresponds to a possible encoded sequence as generated by the encoder. For example,the highlighted path on the forward code tree of Figure 2.3 corresponds to informationsequence (110100) and code sequence (111, 010, 110, 100, 101, 011).UviV12Chapter 2 Convolutional Coding and Sequential DecodingTime unitsFigure 2.2 Forward code trellis diagram for the encoder of Figure 2.1.0 1 2 3 4 5 613Chapter 2 Convolutional Coding and Sequential Decoding001 110 011001010[110 011 000100 101 011110111011 000 000— 010 110 011100r1ioi 011 000101—1 111 101 011001 110 011010V 110 011 0000 111100 101 011101011 000 000000010 110 011111r— 101 011 0000001_000101 011000 000 000Figure 2.3 Forward code tree diagram of the encoder of Figure 2.1.14Chapter 2 Convolutional Coding and Sequential DecodingThe code rate of an (n, k, m) convolutional code is r = k/n information bits per codedbit or channel symbol. The code rate can also be expressed asR in (u)/n, nats per channel symbol (2.2)where u = 2k, We have thus R = r in 2.The error correcting capabilities of a convolutional code are intimately related withthe memory m and code rate r, The larger is m and/or the smaller is r, the more powerfulis the code. Lowering the rate of the code involves a higher bandwidth expansion fortransmission, which might be problematic in certain applications. It is therefore desirableto increase m as much as possible.The two main probabilistic decoding techniques for convolutional codes are Viterbidecoding [9] and sequential decoding [11]. Given a received sequence of channel symbolsand a code structure, a Viterbi decoder explores the whole code trellis in order to decidewhich is the most probable path or code sequence to have been transmitted. Viterbialgorithm (VA) is thus an optimum decoding technique in the maximum likelihoodsense. However, due to the amount of computations needed for decoding, which growsexponentially with the code memory m, the VA is practically limited to short memorycodes (typically m 7) [7, 8]. A sequential decoder on the other hand attempts to makethe best decision by only exploring a small portion of the code tree (or trellis). Sequentialdecoding is thus suboptimum, but can be used with long memory convolutional codes.Unfortunately, sequential decoding has serious drawbacks, and our efforts in this thesisare oriented towards the goal of alleviating some of these drawbacks.15Chapter 2 Convolutional Coding and Sequential Decoding2.2 Sequential DecodingIn a system using convolutional coding and sequential decoding, information isusually transmitted in blocks (L— m branches), and each block is terminated by aknown tail (usually m zero branches). Sequential decoding is a sub-optimum tree searchprocedure that only explores a fraction of the encoded tree in order to determine themost likely transmitted path for a given received sequence. The basic idea of sequentialdecoding is as follow, Assuming a DMC, starting from the root node, a sequentialdecoder explores the code tree, one branch at a time and uses the log-likelihood functionor Fano metric [13] given by1P(Yilxi)R (23)P(y)where x is the i-th channel input symbol (1 i nL), y, is the corresponding receivedsymbol, and P(y) = P(yjIx)q(x) is the a priori probability for the received symbol)y with q(xj) being the distribution of channel input symbol x1. The total metric for apath of length I branches is thenft(2.4)i=land a sequential decoder, based on the stack algorithm, will always attempt to searchand extend the path having the largest cumulative metric. At the end of the tree, thepath with the highest metric is accepted as the decoded path. Thus, while this decoderis moving forward into the tree, it occasionally realizes that it is not following the bestcurrent path, and backs up to follow a more promising one. This backtracking ability[30], while essential for achieving good error performance, is also the main cause of the16Chapter 2 Convolutional Coding and Sequential Decodingserious problems in sequential decoding. Since here, decoding is performed along onlythe forward direction, it shall be referred to as unidirectional sequential decoding (USD).Since the stack algorithm is much simpler to describe and analyze than the Fanoalgorithm, we will focus on the stack algorithm in this dissertation. In this algorithm, astack is used to store the already searched paths in decreasing order of their metric values.The top of the stack is the path with the largest accumulated metric among the paths inthe stack, and will be extended one level further into the tree. The stack is reorderedafter each extension, so that the top always contains the current best or most likely path.Decoding stops whenever the top path reaches the end of the tree. For ease of reference,we refer to this algorithm as unidirectional stack algorithm and list it in the following.Unidirectional Stack Algorithm:Step 0: Place the initial root node of the code tree in the stack.Step J. Compute the metrics of all successors of the top node in the stack, and enterthe new nodes in the stack, ordered with respect to their cumulative metrics.Step 2: Remove from the stack the node whose successors were just inserted.Step 3: If the top node in the stack is the final node of the tree, stop. Otherwise,return to step 1.The major problem of the above algorithm is the reordering of the stack after eachextension (Step 1) of the top path. To overcome this difficulty, Jelinek [16] proposed aquantized version of the algorithm in which the paths are placed at random into substacksaccording to their metric values. All paths whose metric values are within a certain range17Chapter 2 Convolutional Coding and Sequential Decodingare stored in a substack. That is a path of metric F is inserted into substack H ifHzS.F<(H+1)A, (2.5)where \ is the substack spacing in metric value. In this quantized stack, the search forthe top path reduces to the search for the highest non-empty substack. The path that isto be extended is taken at random or usually in a first-in last-out mode. In our computersimulation, the first-in last-out mode was used which is suggested by Johannesson [31]to be near-optimal.The decoding effort of USD is usually measured by the distribution of the numberof computations per either decoded bit or decoded block. In this thesis, the number ofcomputations per block is used. A computation is defined as one hypothesis of a node tobe on the correct path.3 The lower bound given by Jacobs and Berlekamp [30] togetherwith the upper bounds given by Savage [321, Falconer [33] and Jelinek [34] have shownthat the computational distribution of USD is essentially Pareto. Specifically, defineas the total number of computations required to decode the first A branches of thecode tree using sequential decoding. With the assumption of correct decoding, Jacobsand Berlekamp [30] showed that for all codes, any DMC and any USD algorithm, CKsatisfiesP(CKsD > N) > N exp {_o(v’iT) }, (2.6)where 0 (/17) is an asymptotically unimportant term,4 and the Paretian exponent ctis defined by the following parametric equation___________________R =E0()/c. (2.7)A computation in the stack algorithm is an extension of the top node, while in the Fano algorithm, it is usually defined to bea forward move [5].0(b) represents that the value can be upper-bounded by Abs, where A is a finite constant for all indexes i.18Chapter 2 Convolutional Coding and Sequential DecodingThe function Eo(a) in (2.7) is the smallest concave function greater than or equal to theGallager function E0(cx) [5]. Using the random coding technique over an ensemble oftrellis codes [35], Savage, Falconer and Jelinek showed that for a DMC, the first-branchcomputations C1 of sequential decoding is upper-bounded asP(C1 N) AN, (2,8)where p < Pr, which is defined as R = Eo(pr)/pr,5and A is a finite constant independentof N [36]. Moreover, it was noticed that asymptotically, the lower bound gives an accuratedescription of the computational distribution for good codes and practical sequentialdecoding algorithms [311.The computational performance of USD with a specific convolutional code dependson the decoding algorithm employed and the distance properties of the forward code.In particular, optimum forward sequential decoding performance requires rapid initialcolumn distance growth of the forward code for fast decoding and minimum erasureprobability, and a large free distance (dfree) for minimum error probability [37, 38].Specifically, for a code with rate R and a BSC with a crossover probability p satisfyingR<1n2+pln —H(p), (2.9)‘—pwhere H(p) = —plnp — (1—p) ln (1—p) is the entropy function, the distributionP(C1 > N) for a specific time-invariant convolutional code is upper-bounded by [38]P(C1 N) <• exp{—id[log N] + log N}, (2.10)where o-, 5 and are parameters which are functions of p and R, but independentof N, and are as defined in [38].pr = a if the channel is a BSC, but Pr a in genera’.19Chapter 2 Convolutional Coding and Sequential DecodingThe distance profile d of a forward code is defined as the column distance function(CDF) over only the first constraint length (i.e., d [d0, d1,“, dm1) [8]. Since thedistance profile determines the initial column distance growth of the forward code, andis easier to compute than the entire CDF, it is usually used instead of the entire CDF asa criterion for selecting codes for use with sequential decoding. A code is said to havea d superior to a d’ of another forward code of the same memory order m, when thereis some i m such thatdj{j: (2.11)A forward code is said to have an optimum distance profile (ODP) [8] if its d is superiorto that of any other forward code of the same memory order.23 Backward Coding/DecodingBecause data is transmitted with a known (e.g. zero) tail in a block mode, all pathsdepart from and terminate at the same state S. A path in the trellis or tree diagram canthus be viewed in reverse as starting from the final state S0 and terminating at the initialstate So. Every such path also corresponds to an encoded sequence. When viewed asstarting from the final state, the distinct paths in the tree or trellis define a code whichwe refer to as the backward code. From the tree (trellis) of the original forward code,one can generate the backward tree (trellis) for the backward code. Figures 2.4 and 2.5show, respectively, the backward trellis and tree for the backward code obtained fromthe original forward code (3, 1, 2) of Figure 2.1. Note that a backward code sequenceis obtained by simply reversing the original forward code sequence. In addition, the20Time unitsFigure 2,4 Backward code trellis diagram for the encoder of Figure 2.1.corresponding information sequences are reversals of each other, as well. For example,the highlighted path on theChapter 2 Convolutional Coding and Sequential Decoding0 1 2 3 4 5 621Chapter 2 Convolutional Coding and Sequential Decoding100 010 111100 1010 111 000011001 101 111010111 000 000110011 010 111001111000 000 000100 010 111011V 010 111 0000 110001 101 111101111 000 000000011 010 111110101 111 000000110 101 111000000 000 000Figure 2.5 Backward code tree diagram of the encoder of Figure 2.1.22Chapter 2 Convolutional Coding and Sequential Decodingforward code tree of Figure 2,3 corresponds to the information sequence (1101) and thecode sequence (111, 010, 110, 100, 101, 011). Their equivalent reversed informationand code sequences are (1011) and (110, 101, 001, 011, 010, 111), respectively (see thehighlighted path on the backward tree of Figure 2.5).It can be readily shown that the transfer function matrix Gr(D) of an (n, k, m)backward convolutional code can be expressed in terms of the transfer function matrixof the forward code [39] byg(D’) g’)(D_l) ,, g(D’)Gr(D) = Dm g(D’) g(D_’)::‘1_1), (2.12)g (D’) g’(D’) .0 (D’)As an example, consider the encoder of Figure 2.1. The transfer function matrix of itscorresponding backward encoder is Gr(D) = [1 + D + D2, 1 + D2, D + D2].It appears from the above discussion that, given a convolutional encoder, one canperform backward encoding by feeding the information bits to be encoded from the rightside of the registers, and set the registers to shift left.A forward (n, k, m) convolutional code is said to be invertible if there exists an nx k matrix G(D) such thatG(D)G’(D) = ID6 (2.13)for some 6 0, where I is the k x k identity matrix [8]. For an (n, 1, m) convolutionalcode, this means that there exists a feedforward inverse G(D) of delay S (S 0) ifand only if [40]GCD [g(1)(D),g2(D),’ ‘ ‘ , g)(D)] = D6, (2.14)23Chapter 2 Convolutional Coding and Sequential Decodingwhere GCD denotes the greatest common divisor. For an (n, k, m) convolutional codewith k > 1, let A ( D), i = 1, 2,..., (), be the determinants of the () distinct k x ksubmatrixes of the transfer function matrix G(D). Then a feedforward inverse G’ (D) ofdelay 6 exists if and only ifGCD[z(D) : i = l,2,•••, ()] = (2.15)for some 6 0 [401. Also, (2.14) or (2.15) is a necessary and sufficient condition fora code to be noncatastrophic. We now show that a backward convolutional code isinvertible provided that the forward code is invertible.Lemma 2.1: For an invertible (n, 1, m) convolutional code, the delay S in (2.14) isalways no greater than m, since deg [g(i)(D)] mfor all i.Lemma 2.2: For an invertible (n, k, m) convolutional code with k > 1, the delay S in(2.15) is always no greater than km, since deg[A(D)] km for all i.Theorem 2,1: A backward code corresponding to an invertible (n, 1, m) convolutionalcode with delay S is invertible and its delay is equal to (m — 5).Proof: From (2.14), we can writeGCD [g() (D’), g(fl_1) (D’), . . . , g(’) (D’)] = D, (2.16)andGCD [Drng(n) (D’),. . . , Dmg(l) (D—’)] = (2.17)From (2.12), we can write (2.17) asGCD [g.’(D),g2(D),’’’ ,g1c(D)] = (2.18)24Chapter 2 Convolutional Coding and Sequential DecodingFrom Lemma 2.1, we know that m -6 0. Q.E.D.Theorem 2.2: A backward code corresponding to an invertible (n, k, m) convolutionalcode (Ic> 1) with delay 6 is also invertible and its delay is equal to (km — 6).Proof According to (2.12), we can writeGCD[Lri(D) i = 1,2,..., ()] = GCD[Dkm/(D_1) : i 1,2,•., ()].(2.19)From (2.15), we can writeGCD[A1(D_1) : i = 1,2,••, ()] = D6, (2.20)Therefore, from (2.19) and (2.20) we can writeGCD[Lri(D) : i = ()] = Dkm, (2,21)and from Lemma 2.2, we know km — 6 0. Q.E.D.According to the above discussion we can write the following.Corollary 2.1: A backward convolutional code is noncatastrophic if and only if itsforward code is noncatastrophic.Corollary 2.2: Decoding of a forward invertible convolutional code can be performedusing the corresponding backward code, This will be called backward decoding asopposed to the usual forward decoding.Property 2.1: The distance spectrum [8] ofa backward convolutional code is identicalto that of the corresponding forward code. This is because the Hamming weight ofany path on the forward code tree or trellis is the same as the corresponding path onthe backward code tree or trellis. As a consequence of this property, when maximum25Chapter 2 Convolutional Coding and Sequential Decodinglikelihood decoding is used, the error performance of a backward code is the same asthat of the corresponding forward code.Property 2.2: The distance profile of a backward code is usually differentfrom that ofthe correspondingforward code. This is rather obvious and can be seen from the forwardand backward code trees of Figures 2.3 and 2.5. For this example, the first coefficientof the distance profile of the forward code is equal to 3; whereas for its correspondingbackward code, it is equal to 2. As a consequence of this property, backward sequentialdecoding is in general not as efficient as its corresponding forward sequential decodingwhen a forward ODP code is used.With our bidirectional sequential algorithms to be discussed later, decoding is performed in both forward and backward directions. It is thus desirable to use codes thatpossess the same good distance profile in both directions (see Theorem 4.1 in Chapter 4).A sufficient condition for a forward code to have the same distance profile as its corresponding backward code is symmetry. A code is said to be symmetric or reversible [41,42j if the code generators of the forward code are the same as those of the correspondingbackward code. For example, the code with generators g’)(D) = 1+D+D2 andg2(D)= 1+D2 is symmetric. Moreover, since this code has an optimum distance profile [43jwe call it a symmetric bidirectional optimum distance profile code. A code is said tohave a symmetric bidirectional optimum distance profile (SBODP) if it is symmetric andits distance profile is optimum. Unfortunately, most known ODP codes do not possessthe property of symmetry. However, one can search for new symmetric codes that havea good distance profile.26Chapter 2 Convolutional Coding and Sequential Decoding24 Search for Good Symmetric CodesUsing a computer search procedure [44, 45], we have found good symmethc (2, 1,m) non-systematic codes with m ranging from 2 to 25. In our computer search, we firstattempt to find a symmetric code with the same distance profile as the correspondingknown ODP code. If more than one code is found, we select the one with the highestdfree and the minimal number of paths of weight dfree. Of course, we considered onlynoncatastrophic codes. Results are given in Table 2.1, Also included in the table are thefree distances of known ODP codes. Distance spectra of the found SBODP codes arelisted in Table 2.2. In Table 2.2, adiree+j denotes the number of paths of weight dfree + ,while Cdf +i denotes the total number of nonzero information bits on all these paths.From Table 2.1, one can immediately notice that SBODP codes do not exist for somevalues of m, which are marked by dash signs. Moreover, as it can be seen from Table2.1, to find a SBODP code, usually dfree had to be sacrificed. In the following, some newcodes which slightly compromise the distance profile for a better dfree are presented.27Chapter 2 Convolutional Coding and Sequential DecodingTable 2.1 Symmetric bidirectional optimum distance profile (SBODP) codes6m G2 dfree dfree of ODP2 7 5 5 53 ——— 64 23 31 6 75— —— 86 127 165 8 107 —— — 108 —— — 129 1157 1731 12 1210 2251 3003 9 1411 5247 7125 12 1412——— 1513 27537 37275 12 1614 56777 77735 12 1715 134277 176435 16 1816 222511 304443 14 1917 475417 741571 16 2018 1346477 1762635 18 2119——— 2220 5734323 6261675 18 2221 12022373 15744405 20 2422 31065423 23340731 21 2423 44407043 61070111 18 2624 176153077 134442235 24 2625 221446737 373544611 24 276 The m = 2 SBODP code is the known ODP code found by Johannesson [43] and is listed here for completeness.28Chapter 2 Convolutional Coding and Sequential DecodingTable 2.2 Distance spectra of SBODP codesm (adjr+ji = 0, 1,2,[cdfr+j = 0, 1,2, 14 23 31 6 (1,0,4,0,22,0,124,0,682,0)[1,0,10,0,96,0,778,0,5616,0]6 127 165 8 (1,0,5,0,35,0,187,0,1074,0)[2,0,15,0,188,0,1275,0,9350,0]9 1157 1731 12 (8,0,16,0,159,0,741,0,5027,0)[32,0,78,0,1150,0,6390,0,52470,0]10 2251 3003 9 (1,0,1,1,,1,18,21,27,110,256)[1,0,5,4,9,88,123,186,796,2060]11 5247 7125 12 (3,0,10,0,33,0,216,0,1216,0)[10,0,42,0,228,0,1702,0,11248,0]13 27537 37275 12 (1,0,2,0,11,0,54,0376,0)[2,0,6,0,68,0,352,0,3278,0]14 56777 77735 12 (2,0,0,0,10,0,16,0,208,0)[4,0,0,0,46,0118,01420,0]15 134277 176435 16 (13,0,0,0,166,0,96,0,3651,0)[60,0,0,0, 13 30,0,734,0,37790,0]16 222511 304443 14 (2,0,3,0,7,0,59,0,237,0)[5,0,13,0,36,0,455,0,1966,0]17 475417 741571 16 (2,0,1,0,33,0,97,0,751,0)[6,0,7,0,220,0,753,0,7386,0]18 1346477 1762635 18 (1,0,12,0,51,0,302,0,1801,0)[5,0,70,0,365,0,2720,0,18953,0]20 5734323 6261675 18 (1,0,1,0,12,0,89,0,474,0)[5,0,6,0,92,0,782,0,5034,0]21 12022373 15744405 20 (3,0,8,0,36,0,196,0,1246,0)[24,0,56,0,288,0,1782,0,14312,0]22 31065423 23340731 21 (1,5,7,18,30,74,175,341,863,2097)[5,36,37,168,298,670, 1721,3686,9883,25366123 44407043 61070111 18 (1,0,2,0,4,0,19,0,54,0,)[1,0,10,0,20,0,116,0,406,0]24 176153077 134442235 24 (2,3,26,42,93,173)[16,13,234,350,884,1769]25 221446737 373544611 24 (4,0,17,0,78,0,508,0,2780,0)[26,0,135,0,756,0,5540,0,34644,0]29Chapter 2 Convolutional Coding and Sequential DecodingIn the event that no SBODP code can be found, we then slightly decrease the distanceprofile to find a good code (large dfree and good distance profile). The same method canbe applied when the dfree of a SBODP code is inferior to that of the known ODP code. Weshall refer to this symmetric non-ODP code as an almost BODP (SABODP) code, Thecomputer search results are listed in Table 2.3. The dash sign in Table 2.3 indicates thatno code with a better dfree is found after slightly decreasing the distance profile. Distancespectra of the found SABODP codes are listed in Table 2.4. The search for SABODPcodes is conducted as follows. We first examine all codes obtained through possiblemodifications of the last four coefficients of the distance profile of the correspondingODP code. Then, among these codes, we select the one with the highest dfree and thebest distance profile. The last four coefficients of the distance profile of the obtainedcodes are given in the last column of Table 2.3. When the distance profile is slightlydecreased, a code with a better dfree may be found. For example, dfree of the SBODPcode of memory m = 23 is equal to 18. When only the last coefficient of the distanceprofile is decreased by one, a SABODP code with dfree = 24 is found.Notice that for any (2, 1, m) systematic code to be symmetric, the second adder ofthe encoder should be connected to only the last stage of the shift register. This wouldresult in a very poor code in terms of both the free distance and the distance profile.30Chapter 2 Convolutional Coding and Sequential DecodingTable 2.3 Symmetric almost bidirectional optimum distance profile (SABODP) codesm G’ , dm dfree dfree of ODP2 — — —— 53 13 15 2,3,3,3 6 64 — — —— 75 57 75 3,4,4,4 8 86 — — —— 107 247 345 4,5,5,5 10 108 507 705 5,5,5,6 10 129 — — —— 1210 2617 3615 6,6,6,6 12 1411 — — —— 1412 16767 12345 6,7,7,7 14 1513 24433 33045 6,7,7,7 14 1614 47263 63271 7,7,8,8 16 1715 — — —— 1816 232357 367131 8,8,8,8 16 1917 723705 507627 8,8,8,8 18 2018 1167671 1712517 8,8,9,9 19 2119 2714477 3744635 9,9,9,9 20 2220 5267447 7117325 9,9,9,9 20 2221 17242231 11444257 9,9,9,10 22 2422 32427561 21675053 9,10,10,10 22 2423 55231643 61346255 10,10,10,10 24 2624 - - - - 2625 - - - - 2731Chapter 2 Convolutional Coding and Sequential DecodingTable 2.4 Distance spectra of SABODP codesm G’ djree (ad1,,+,i = 0, 1,2,[Cdf,,+ji = 0, 1,2,3 13 15 6 (2,0,10,0,49,0,241,0,1185,0)[4,0,38,0,277,0,1806,0,11063,0]5 57 75 8 (3,0,12,0,70,0,397,0,2223,0)[8,0,46,0,400,0,2925,0,20446,0]7 247 345 10 (4,0,19,0,95,0,539,0,3111,0)[10,0,86,0,649,0,4400,0,30287,0]8 507 705 10 (2,0,10,0,51,0,287,0,1560,0)[4,0,36,0,289,0,2118,0,13984,0]10 2617 3615 12 (2,0,15,0,71,0,383,0,2333,0)[6,0,65,0,494,0,3053,0,22486,0]12 16767 12345 14 (2,0,20,0,93,0,612,0,3283,0)[6,0,111,0,701,0,5456,0,35447,0]13 24433 33045 14 (3,0,10,0,47,0,280,0,1687,0)[11,0,52,0,323,0,2478,0,17675,0]14 47263 63271 16 (6,0,27,0,140,0,808,0,4745,0)[34,0,151,0,1120,0,7520,0,52014,0]16 232357 367131 16 (1,0,2,0,38,0,239,0,1131,0)[10,0,6,0,246,0,1917,0,11114,0]17 723705 507627 18 (1,0,29,0,119,0,578,0,3433,0)[3,0,182,0,953,0,5816,0,39833,0]18 1167671 1712517 19 (4,7,1,25,74,138,353,953,2090,5351)[20,58,3,182,612,1242,3369,10278,23840,66680]19 2714477 3744635 20 (5,0,45,0,160,0,851,0,5184,0)[24,0,319,0,1440,8775,0,61394,0]20 5267447 7117325 20 (1,0,19,0,101,0,423,0,2492,0)[4,0,143,0,884,0,4295,0,29122,0]21 17242231 11444257 22 (9,0,58,0,275,0,1290,0,7381,0)[49,0,488,0,2695,14502,0,94513,0]22 32427561 21675053 22 (1,0,44,0,128,0,793,0,4094,0)[11,0,368,0,1078,0,8648,0,50150,0]23 55231643 61346255 24 (14,0,60,0,348,0,1959,0,10600,0)[108,0,592,0,3602,0,24395,0,145826,0]32Chapter 3Bidirectional SequentialDecoding AlgorithmsIn this chapter, BSD algorithms in which the code tree is explored from bothforward and backward directions are presented. Operations of all the proposed algorithmsare essentially the same, except for their stopping rules. In our BSD algorithms,two separate stacks are used. One is used for the forward search of the tree and iscalled the forward stack (FS), while the other is used for the backward search andis called the backward stack (BS). Starting from the root nodes of the forward andbackward trees (trellises), respectively, forward and backward search operations areperformed alternatively according to the usual stack algorithm. Define the total numberof computations needed to decode one block by any BSD algorithm as the sum of thecomputations performed by forward and backward decoders.The proposed BSD algorithms can be classified into two categories: BSD-no-mergeand BSD-merge. In BSD-merge, decoding stops whenever a path in one stack (PS orBS) merges with a path in the other stack. A path in one stack is said to merge (oragree) with a path in the other stack if the two paths share the same encoder state at acommon tree level. In the class of BSD-no-merge, forward and backward portions of thedecoded path do not have to agree with each other.In the following, Algorithm TAmeet which belongs to the class of BSD-no-merge isfirst presented and analyzed. Next, two algorithms, Algorithm TAmerge and AlgorithmTimerge, which belong to the class of BSD-merge, are presented and analyzed. Finally,33Chapter 3 Bidirectional Sequential Decoding AlgorithmsAlgorithm HTTmerge, which is a hybrid scheme of B SD-merge and BSD-no-merge, ispresented. For the convenience of describing the following algorithms, let 1FTQP denotethe level of the top node in the FS, and IF the farthest level7 reached by the forwarddecoder, Similarly, let iBTOP denote the level of the top node in the BS, and ‘B thefarthest level reached by the backward decoder.3.1 Algorithm TAmeet, a BSD-no-mergeAlgorithm TAmeet stops decoding whenever a path in one of the two stacks meetsa path in the opposite direction, and these two meeting paths do not have to agree witheach other. The level of the forward tree at the meeting point is denoted by 1. Asa consequence, the portion of the tree explored by forward decoder does not overlapwith the portion explored by backward decoder, as illustrated by Figure 3.1. AlgorithmTAmeet is as follow.Algorithm TAmeet:Step 0: Place the root nodes of forward and backward trees in the FS and BS,respectively.Step 1. Compute the metrics of all successors of the top node in the FS, and enterthe new nodes in the FS, ordered with respect to their cumulative metrics.Step 2.’ Remove from the FS the node whose successors were just inserted.Step 3: Check if 1F +18T0p = L. If yes, stop decoding and go to step 7. Otherwise,go to the next step.The farthest level is not necessarily equal to the level of the top node because the decoder allows backtracking.34Chapter 3 Bidirectional Sequential Decoding AlgorithmsStop decoding if one of the Meeting point 1nodes is at the top of FS or BS+® Extended node0 Unextended nodeFigure 3.1 Illustration of the stopping rules of Algorithm TAmeet.Step 4: Compute the metrics of all successors of the top node in the BS, and enterthe new nodes in the BS, ordered with respect to their cumulative metrics.Step 5: Remove from the BS the node whose successors were just inserted.Step 6: Check if1FT0P+ 1B L. If yes, stop decoding and go to step 10. Otherwise,return to step 1.Step 7: The top path8 in the BS is accepted as the backward portion of the decodedpath. The forward portion of the decoded path is selected from the FS, whose end nodeis at level IF, (select the one with the highest metric value if there is more than onepath at level IF). If mm ([m/2] , iBo) = 1B’rop’ retreat m — 1BTOP branches fromthe end node of the forward portion of the decoded path, then take the information8 Defined as the path with the top node as its end node,Forward treeroot node1Backward treeroot nodeL branches35Chapter 3 Bidirectional Sequential Decoding Algorithmssymbols corresponding to the rest of the path as the decoded symbols, and go to step13. Otherwise go to the next step.Step 8: If mm ([m/21, iF) = 1F retreat m — 1F branches from the end node of thebackward portion of the decoded path, then take the information symbols correspondingto the rest of the path as the decoded symbols, and go to step 13. Otherwise go to thenext step.Step 9: Retreat the last rrn/21 and [m/2J branches from the forward and backwardportions of the decoded path, respectively. Take the information symbols correspondingto the rest of the two paths as the decoded symbols, and go to step 13.Step JO: The top path in the FS is accepted as the forward portion of the decodedpath. The backward portion of the decoded path is selected from the BS, whose endnode is at level lB (select the one with the highest metric value if there is more thanone path at level IB). If mm ([rn/2j , = 1FTOP’ retreat m — 1FTQP branches fromthe end node of the backward portion of the decoded path, then take the informationsymbols corresponding to the rest of the path as the decoded symbols, and go to step13. Otherwise go to the next step.Step 11: If mm (Em/21,1B) = 1B retreat m — 1B branches from the end node of theforward portion of the decoded path, then take the information symbols correspondingto the rest of the path as the decoded symbols, and go to step 13. Otherwise go to thenext step.Step 12: Retreat the last [rn/2J and Em/21 branches from the forward and backwardportions of the decoded path, respectively. Take the information symbols corresponding36Chapter 3 BidirectionaL Sequential Decoding Algorithmsto the rest of the two paths as the decoded symbols.Step 13: Output the decoded information symbols.When the channel is not too noisy, the forward and backward portions of the decodedpath may actually merge (agree) with each other since a sequential decoder always followsthe most likely path. Should this not be the case, a valid path (not necessarily the correctone) can be assembled by retreating rn branches from forward and backward portions ofthe decoded path, as described above.Note that in the absence of a burst in a block, the meeting point I would most ofthe time fall near the middle of the tree. However, should there be a single burst errorwithin a block, it is expected that the meeting point would occur near the middle of theburst. Thus by retreating about m/2 branches from forward and backward meeting paths,the burst error may be totally eliminated if its length is less than m.The meeting point 1 at the end of decoding in Algorithm TAmeet is equal to 1FTQPdecoding is stopped at step 6, or (L— 1BTOP) if decoding is stopped at step 3. Note thatforward and backward decoders always meet somewhere in the tree, i.e. 1 < 1 < L — 1.Let x be an arbitrary level in the forward code tree, and imagine cutting the tree intotwo separate subtrees at this level. Let C denote the number of computations neededby the forward decoder to decode x branches without searching beyond level x of theforward tree, i.e., when the top node in the PS reaches depth x for the first time. This isequivalent to decoding an x branch-tree without a terminating tail. Similarly, let C_denote the number of computations needed by the backward decoder to decode an (L— x)branch-tree without searching beyond level (L— x) of the backward tree. Clearly, for any37Chapter 3 Bidirectional Sequential Decoding AlgorithmsDMC and any fixed arbitrary level x, random variables C and Cf are independentbecause the forward and backward search areas are disjoint.Let us assume that the amount of time to hypothesize (extend) one node is thesame for both the forward and the backward decoders. Let CL denote the total numberof computations needed to decode one block by Algorithm TAmeet, By the nature ofAlgorithm TAmeet, we can write the following property.Property 3.1: Let I be the meeting point, then— J 2C( and C_1 Cf, if decoding is stopped at step 6 (3 J)L — 2CE_1 + 1 and C1F Cf_1 + 1, if decoding is stopped at step 3ProofS According to Algorithm TAmeet, if decoding is stopped at step 6, the forwarddecoder must have extended the same number of nodes as the backward decoder. ifdecoding is stopped at step 3, however, the forward decoder must have extended onemore node than the backward decoder. Let us first assume that decoding is stopped atstep 6. By the definition of the meeting point I (see Figure 3.1), we know that1FTQP = 1,which means CL = 2C(. Moreover, since 1B = L — 1 1BTOP’ then C_1 > Cf.Now, suppose that decoding is stopped at step 3. In this case, CL = 2C_1 + 1 since1BTOP = L—l, where the additional computation is due to the last extension by the forwarddecoder. Moreover, since 1F = 1 1FTOP’ then Cf > Cf_1 + 1. Q.E.D.Corollary 3.]: Let I be the meeting point, thenCL — 1 <2 mm (cf, cf_1), (3.2)since CL — 1 < 2Cf and CL — 1 2C_1 according to Property 3.].38Chapter 3 Bidirectional Sequential Decoding AlgorithmsObviously, CL = 2C( = 2Cf_1 if and only if both top nodes in the FS and BSare at forward tree level I at the end of decoding, i.e., both top nodes are at the meetingpoint I. It is worth noticing that the minus one in (3.2) can actually be ignored and thusC’L is essentially equal to 2 mm (Of’, cf_1).Now, we introduce an important property which is the foundation in our analysis ofthe computational performance of the proposed BSD in Chapter 4.Property 3.2: Let integer i E [1, L — 1], then— 1 <2 max {min(Cf. cP_j]. (3.3)Proof First of all, note that there must exist a meeting point in Algorithm TAmeet,i.e., I < 1 < L — 1. By Corollary 3.1, one can write— 1 <2mm (c(,o_). (3.4)Moreover, we can writemax [min(CfCf_j)] mmn(Cf,Cf_j), (3,5)since 1 < 1 < L — 1. Q.E.D.In fact, it is easy to see that CL is virtually equal to 2 max [mm (of, C_)] .This approximation, however, is not necessary for the proof of the upper bounds on thecomputational distribution of Algorithm TAmeet in Chapter 4.It is interesting to point out that Forney’s scheme [22] belongs to BSD-no-mergesince forward and backward decoders do not need to agree with each other, and itsNote that CL mm (cr, ce,) for all possible i.39Chapter 3 Bidirectional Sequential Decoding Algorithmstotal number of computations is equal to 2mm (Ci, of), where Cf and of denotethe number of computations needed by forward and backward decoders to decode thewhole block, respectively.Let us now compare Algorithm TAmeet with the bidirectional M algorithm [28], Itis obvious that they both belong to the same category of BSD-no-merge since mergingbetween the forward and backward portions of the decoded path is not required in bothalgorithms. The main difference, however, is in the nature of the search algorithmemployed in the forward and backward decoders. In the bidirectional M algorithm thenumber of computations per decoded branch is constant, whereas in Algorithm TAmeetthat number is variable since backtracking is allowed, As a consequence, erasure freedecoding is possible in the bidirectional M algorithm, while Algorithm TAmeet can notguarantee it, On the other hand, the bidirectional M algorithm suffers the correct pathloss problem. Another difference is how they handle a burst. In Algorithm TAmeet,forward and backward decoders search into forward and backward trees alternatively nomatter how many or where the bursts are located. At the end of decoding, both decodersin Algorithm TAmeet retreat about m I 2 branches. In the bidirectional M algorithm,however, in order for the first decoder encountering a burst to wait for the other decoder’shelp from the opposite direction, a burst detection method is required. Moreover, onlythe bad decoder retreats m branches at the end of decoding. The BER of the bidirectionalM algorithm degrades with increasing block length L [281, while in Algorithm TAmeet,the BER improves as L increases (see Chapter 5 for analysis and simulation results).40Chapter 3 Bidirectional Sequential Decoding Algorithms32 Algorithm TAmerge, a BSD-mergeIn BSD-merge, decoding does not stop whenever two opposing paths meet eachother as in the algorithm described above. Instead, a merging test is applied to opposingpaths and decoding stops only when the test is successful. We present in the followingAlgorithm TAmerge which belongs to the class of BSD-merge,Algorithm TAmerge works in the same way as Algorithm TAmeet, except that at themeeting point I decoding is not stopped, unless a pair of forward and backward pathsmerge with each other. Instead, decoding is continued until a pair of merged forward andbackward paths is found. Let I denote the level of the forward tree when this happens.Obviously, here the portion of the tree searched by forward decoder does overlap with thatportion searched by backward decoder, as illustrated by Figure 3.2. More specifically,Algorithm TAmerge is as follow.Algorithm TA merge:Steps 0 to 2 are exactly the same as in Algorithm TAmeet.Step 3: If 1 BTOP +iF > L, check the top path in the BS with all paths with depths equalto L— 1BTQP in the FS. If one (or more) merging path is found, stop decoding. Amongall merging paths, select the one with the highest cumulative metric as the decoded path.Otherwise, go to the next step.Steps 4 and 5 are exactly the same as in Algorithm TAmeet.Step 6: If1FTQP +lB> L, check the top path in the FS with all paths with depths equalto L— 1Frop in the BS. If one (or more) merging path is found, stop decoding. Among41Chapter 3 Bidirectional Sequential Decoding AlgorithmsMerging point 101\\—\\ Q- o\• Correct path— -o-- — Incorrect pathL branchesFigure 3.2 Illustration of the overlapping in BSD-merge.all merging paths, select the one with the highest cumulative metric as the decoded path.Otherwise, go to step 1.It is worth pointing out that all nodes in one stack (FS or BS) can be easily linkedwith respect to their depths. Hence, in the merging test (steps 3 and 6), it is easy tofind all those nodes which should be tested with the top node in the opposite direction.During the merging test at steps 3 and 6, the top path in either the FS or BS is onlytested with end nodes of paths in the opposite direction, which are at the same level withthe tested top path. A natural question that may arise is whether two opposing pathsmay pass each other or not since intermediate nodes are not being examined during themerging test. For example, Figure 3.3 illustrates such a situation where two opposingForward treeroot node 0- —\KBackward treeroot node10\ —\0 —42Chapter 3 Bidirectional Sequential Decoding Algorithmspaths may have common encoder states at some of their intermediate nodes, We nowshow that the events shown in this figure can not take place in Algorithm TAmerge.(d)Figure 3.3 Example of opposing paths passing each other.First of all, events like (c) and (d) in Figure 3.3 can not happen unless event (a) or(b) occurs in the first place. This is because both forward and backward decoders mustreach their predecessors before any further extensions. We are going to show that events(a) and (b) in Figure 3.3 can not occur in algorithm TAmerge. Clearly, events (a) and(b) are the same type of event. Hence, only event (a), which is shown in Figure 3.4 formore detail, will be considered in the following.• common encoder state(C)43Chapter 3 Bidirectional Sequential Decoding AlgorithmsL) UB]Figure 3.4 Typical impossible event in Algorithm TAmerge.Figure 3.4 implies that the backward path with end node BO merges with the forwardpath with its intermediate node FO. Moreover, EQ and BO are the only common nodesbetween the forward and backward paths. As Figure 3.4 indicates, (Ii, 12, , Im)denotes the common encoder state. Moreover, the encoder state of node B] is either(12,J3, , J,O) or (12,13,’’ , ‘m, 1). Let denote the last bit of the encoder stateB]. Similarly, let denote the last bit of the encoder state F]. Since FO is theintermediate state of the forward path, all successors of node FO must have been reached,and FO is consequently removed from the FS (steps 1 and 2 in Algorithm TAmerge). LetF] and F]’ be the successors of FO as illustrated in Figure 3.4. Since FO and BO arethe only common nodes between forward and backward paths, Fl and B] are commonnodes, i.e. = Otherwise, nodes Fl’ and B] will share a common encoderstate, as indicated in Figure 3.5.Now, let us assume that F] is still in the FS, which means F] has not been extendedlevel x levelFOE]’‘mi44Chapter 3 Bidirectional Sequential Decoding Algorithms• common encoder stateFigure 3.5 Illustration of a typical impossible event in Figure 3.4.levelx÷1= 10• common encoder stateFigure 3.6 Illustration of the possible event.yet before the backward path reached node BO. According to step 3 in AlgorithmTAmerge, however, decoding must have been stopped at the time when B] is at thetop of the BS. Thus, BO does not have any chance to be reached because B] must beat the top of the BS before BO can be reached. This is illustrated in Figure 3.6, whichclearly shows that end node F] of the forward path merges with end node B] of thebackward path.Next, let us assume that F] is not in the FS, which means F] has already beenextended. As shown in Figure 3.7, let P2 and F2’ denote the successors of F], andassume that F2 and B2 have the same encoder state. Clearly, Figure 3.7 shows the sameF]level xStop decoding whenB] becomes the topnode in the BS.45Chapter 3 Bidirectional Sequential Decoding Algorithmstype of event as that in Figure 3.3 (d) since there is more than one common encoderstate. Based on our discussion above, this event can not occur before an event like thatin Figure 3.3 (a) or (b) happens. Following this same line of reasoning, we get thefollowing property.level x. common encoder stateFigure 3.7 Illustration of another impossible event.Property 3.3: in Algorithm TAmerge, forward and backward paths in the FS and BScan only merge at their end nodes. That is merging paths can not overlap each other.Since the decoded path is a merged path, forward and backward portions of thedecoded path will never overlap in Algorithm TAmerge. Furthermore, we can concludethat overlapped forward and backward paths in Algorithm TAmerge do not have thesame encoder state at any of their common tree levels because otherwise Property 3.3would be breached.Two forward and backward paths are said to be totally distinct as long as they donot overlap or do not share the same encoder state at any common tree level in theiroverlapping portion. Figure 3,8 illustrates all possible types of paths in the FS and inthe BS. In Figure 3.8 (a), forward and backward paths do not have a chance to meet.46Chapter 3 Bidirectional Sequential Decoding AlgorithmsIn Figure 3.8 (b), merged forward and backward paths do not overlap, while in Figure3.8 (c) forward and backward paths do not share any common encoder state in theiroverlapping area. Thus, we can write the following property.Forward treelevel !FXpBackward treelevel 1B(c)Figure 3.8 Distinct forward and backward paths in the FS and BS.Property 3.4: in Algorithm TAmerge, all paths in the FS are totally distinct from allpaths in the BS,Like in Algorithm TAmeet, forward and backward decoders cannot reach theirterminal nodes in Algorithm TAmerge because all successors of the root nodes of forward(a)(b)47Chapter 3 Bidirectional Sequential Decoding Algorithmsand backward trees were reached at the beginning. Thus, forward and backward decodersmust merge somewhere in the tree, i.e., 1 < 10 < L — 1. In the following analysisof Algorithm TAmerge, correct decoding is assumed, which is a good approximation,especially for long memory codes. For an arbitrary forward tree level x, let us defineC as the number of computations required to correctly decode the first x branchesby forward decoder. Similarly, define as the number of computations required tocorrectly decode the last (L-x) branches by backward decoder. Unlike Algorithm TAmeet,forward decoder may extend incorrect paths beyond level x before it finally decodes all xbranches. Backward decoder may also extend incorrect paths beyond level (L-x) beforeit finally decodes all (L-x) branches.Now, let CL denote the total number of computations needed by Algorithm TAmergewith the assumption of correct decoding. By analogy to Algorithm TAmeet, we canclaim the following properties.Property 3.5: Let 10 be the merging point of Algorithm TAmerge, thenI 20( and Cf. Gf’, if decoding is stopped at step 6 (3.6)2C_10+l and C(CL1 + 1, if decoding is stopped at step 3Proof Similar to Algorithm TAmeet, if decoding is stopped at step 6, forwarddecoder must have extended the same number of nodes as backward decoder. If decodingis stopped at step 3, however, forward decoder must have extended one more node thanbackward decoder. Without loss of generality, let us assume that decoding is stoppedby step 6. Thus, the top node at the FS is at level 1 when decoding is stopped,and the forward decoded path up to this top node is correct. Hence, the forward48Chapter 3 Bidirectional Sequential Decoding Algorithmsdecoder must have performed exactly C( computations. Furthermore, the algorithmdoes not require that the backward decoder totally decodes the last (L— l) branches, i.e.,Of Of’. Q.E.D.Coro1lay 3.2: Let 1 be the merging point in Algorithm TAmerge, then—1 <2mm (C(,Cf_1) (3.7)according to Property 3.5. Similar to the discussion of Corollary 3.1, CL is almost equalto 2 (or,Property 3,6: Let integer i e [1, L — 1], thenCL — 1 <2 max [rnin(O[, Of_)]. (3.8)Proof’ Same as the proof of Property 3.2. Q.E.D.Similar to Algorithm TAmeet, one can easily conclude that CL is practically equalto 2 max [mm (Of’, c)]. Strictly speaking, the number of computations without theassumption of correct decoding is less than or equal to that with the assumption.We now examine the merging test (step 3 and step 6) of Algorithm TAmerge in moredetail. In Algorithm TAmerge, decoding stops whenever a top path either in the FS or inthe BS merges with any path from the opposite direction. Clearly, the number of pathsbeing tested in the merging test is a random variable. This variable is small under goodchannel conditions and increases as the channel gets noisier. Suppose that the top pathin the BS merges with a forward path FP which is not at the top of the FS (step 3). Theend node of FP is at the merging point 1 according to Property 3.3. As a consequence of49Chapter 3 Bidirectional Sequential Decoding Algorithmsthis merging, forward decoder does not need to extend all forward paths whose metricsare higher than M[FP] to decode the forward portion of the decoded path FP. In otherwords, the merging test at step 3 effectively reduces the number of computations neededby forward decoder to decode PP. Similarly, the merging test at step 6 may reduce thenumber of computations needed by backward decoder if the top path in the FS mergeswith an opposite path that is not at the top of the BS. Hence, the multiple examinationsof the top path in one stack with all paths in the opposite stack results in a reductionin computational variability since the overall number of computations per decoded blockis reduced. The parallel to this is the multiple path extensions in the generalized stackalgorithms of Haccoun and Ferguson [19], which allows a reduction in computationalvariability at a cost of an increase in average number of computations per decoded block.Note that in Algorithm TAmerge, the average number of computations is not increased,but is even reduced, as it will be shown in Chapter 4,33 Coarsening on Algorithm TAmergeIn order to efficiently perform the merging test in Algorithm TAmerge, nodes haveto be linked together with respect to their depths. However, this introduces the drawbackof extra processing time and memory needed for storage in forward and backwarddecoders. In the following, we provide two modified algorithms, quite similar toAlgorithm TAmerge but without the aforementioned drawback. These two algorithmswork best with quantized stacks.The modification is done in the merging test of Algorithm TAmerge (steps 3 and6). Here, the merging test is only performed among all nodes in the highest non-empty50Chapter 3 Bidirectional Sequential Decoding Algorithmssubstacks in the FS and BS instead of testing the top node in each stack with all possiblemerging nodes in the opposite direction. As a consequence, however, two opposing pathsmay pass each other because forward and backward portions of the merged path may notcoexist in the highest non-empty substacks at the same time. in other words, forward andbackward portions of the merged path may overlap, as illustrated in Figure 3.3. In orderto reduce the possibility of two opposite paths passing each other, not only the endpointsof the two possible merging paths should be tested but also other possible merging pointsin the overlapping area as will be described later.Coarsening I (Algorithm TTmerge):Steps 0 to 2 are the same as in Algorithm TAmeet, except the top node is now anynode in the non-empty highest substack,Step 3: If 1F + 1B > L, check all paths in the non-empty highest substack in theBS with all paths in the non-empty highest substack in the FS. If there is an overlapbetween forward and backward paths that are being tested, check all encoder states inthe overlapping area. If one or more merging paths is found, stop decoding. Among allmerging paths, select the one with the highest cumulative metric as the decoded path.Otherwise, go to the next step.Steps 4 and 5 are the same as in Algorithm TAmeet, except the top node is now anynode in the non-empty highest substack.Step 6: If 1F + 1B > L, check all paths in the non-empty highest substack in theFS with all paths in the non-empty highest substack in the BS. If there is an overlapbetween the forward and the backward paths being tested, check all encoder states in51Chapter 3 Bidirectional Sequential Decoding Algorithms(b)Figure 3.9 Merging test in Algorithm TTmerge,the overlapping area. If one or more merging paths is found, stop decoding. Among allmerging paths, select the one with the highest cumulative metric as the decoded path.Otherwise, go to the next step.Step 7: Check if the node to be extended in the highest non-empty substack in theFS is a terminal node in the forward tree. Check if the node to be extended in the highestnon-empty substack in the BS is a terminal node in the backward tree. If either one isa terminal node, stop decoding and that top path is accepted as the decoded path. Inthe event that both top paths in the PS and BS are terminal nodes in the forward andbackward trees, respectively, the path with the highest cumulative metric is selected asthe decoded path. Otherwise, go to step 1.possible merging nodes(a)path inFS BStoL52Chapter 3 Bidirectional Sequential Decoding AlgorithmsIf two opposing paths pass each other and one of them reaches the end of itscorresponding tree, clearly the decoding should be stopped. Therefore, step 7 is neededin Algorithm TTmerge. However, step 7 can be eliminated if we treat the forward andbackward root nodes as connected with zero paths. For example, if the top node in theFS is at level L, it can be viewed as a merging node with the backward root node.The merging test in Algorithm TT’merge is illustrated in Figure 3.8. In Figure 3.8 (a),the forward and backward paths do not overlap and thus only the end points are tested.However in Figure 3.8 (b), forward and backward paths do overlap over to branches. Inthis case all intermediate nodes are examined. This involves t0+1 comparisons whichcan be done by simply performing one exclusive-OR operation between the last m+toinformation symbols of the tested paths, and checking if there exits m consecutive zeros,In case the depth of one of the tested paths is smaller than (m + to), the shorter one canbe viewed as one rooted by a zero path of any length.It is interesting to notice that the number of nodes in the highest forward and backwardsubstacks varies according to the channel conditions. When the channel is quiet, onlythe correct node is likely to float in the highest substack. As the channel conditiondegrades, the number of nodes in the highest substacks tends to increase. This meansthat the merging test involves a variable number of examinations, which depends on thechannel conditions.Coarsening 2 (Algorithm HTTmerge):We now describe Algorithm HTTmerge which is a hybrid version of AlgorithmTTmerge and Algorithm TAmeet. Algorithm HTTmerge is exactly the same as Algorithm53Chapter 3 Bidirectional Sequential Decoding AlgorithmsTlmerge except that the number of information symbols required to be the same in themerging test of Algorithm TTmerge (steps 3 and 6), denoted m, is less than m, The ideais that if two opposing paths agree on their last mj < m branches, it is anticipated thatthey would eventually agree on all m last branches if decoding is continued. Of coursethis may not happen all the time, especially if mh is relatively small, and this prematurestopping may introduce additional decoding errors. If mh=m, Algorithm HTTmergebecomes equivalent to Algorithm TTmerge. In the other extreme, if mt=O, it becomesequivalent to Algorithm TAmeet. Clearly, Algorithm HTTmerge reduces the amountof computations by compromising the error performance since the number of matchinginformation symbols mh is less than m. Hence, Algorithm HTTmerge can provide atrade-off between computational effort and error performance by controlling the numberof matching information symbols in the merging test.54Chapter 4Computational Performance of BSDIn this chapter, we analyze the computational performance of the proposed BSDalgorithms. It is assumed that the channel is a DMC. Also, BSD using the stack algorithmis assumed. However, all results in this chapter can be applied to arbitrary BSD sequentialdecoders. Let denote the total number of computations performed by any BSDalgorithm for decoding one block of L branches (including a known tail), where onecomputation is defined as in Chapter 3, i.e., the extension of one node from eitherforward or backward direction. We are mainly concerned with the distribution of CfSD,i.e., P (CfSD > N). In the following, we first quickly demonstrate the property of thecomputational distribution by a simple approach. Then, Algorithm TAmeet is analyzedin detail. More specifically, upper bounds on the computational distribution are givenby Theorem 4.1 (for a specific time-invariant convolutional code) and by Theorem 4.2(for an ensemble of trellis codes). Moreover, we conjecture that Theorem 4.2 can beapplied to Algorithm TAmerge. A lower bound on the computational distribution for anycode and any type of BSD is also derived. The average number of computations andthe computational cutoff rate of BSD are then discussed. Finally, extensive computersimulation results are provided, confirming our theoretical analysis.4.1 A Simple Approach to Computational DistributionBefore giving the general derivation, we first quickly show interesting results onthe computational distribution of BSD when the code rate is below the computational55Chapter 4 Computational Performance of BSDcutoff rate Rcomp. Let us define a dip [36, 46] on the correct path as the largest metricdifference between two consecutive non-adjacent breakout nodes (see Figure 4.1 (a)).Notice that the dip is almost the same whether decoding is performed from forward orbackward direction because the metric of the backward correct path is the mirror imageof its corresponding metric of the forward correct path (see Figure 4.1 (a) and (b)).Alternatively, the dip can be viewed as a “noise burst” [30], It is admitted that the Paretodistribution P (CysD > N) N—Pr is caused by the following two opposing effects:(i) the probability of the dip size (burst length) y decreases exponentially with ; (ii)the number of computations necessary to decode the correct path increases exponentiallywith ‘y [5, 15, 30, 32, 36, 46, 47]. All these suggest the inherent inability of USD todeal effectively with a large dip (long burst). These difficulties arise from the essentialnature of USD algorithms, i.e., searching only from one direction, and not from anyfundamental limitations on convolutional codes or from any basic difficulties arisingfrom the dip (burst) [30]. In the following, we demonstrate that the proposed BSDrelieves the inefficiency of USD by attacking the dip (burst) form both forward andbackward directions.Suppose that the code rate is below Rcomp, experimental evidence [48] indicates thatwith a very high probability only one important dip exists in a block. Moreover, it iseasy to see that C’ is an increasing function of x, while G_ is a decreasing functionof x. Hence, according to Property 3.2, C/’ cf_1 in Algorithm TAmeet, where Iis the meeting point. Similarly, in Algorithm TAmerge, Cr C_ where 1 is themerging point. Thus, Algorithm TAmeet and Algorithm TAmerge optimally break everyreceived block into two portions that require roughly the same number of computations56Chapter 4 Computational Performance of BSD1(c)Figure 4.1 (a) Correct path metric with forward decoding. (b) Correct path metric withbackward decoding. (c) Correct path metric with bidirectional decoding.for decoding. Also, portions of the decoded forward and backward paths in the FS andBS will never overlap in Algorithm TAmeet and Algorithm TAmerge.In summary, the proposed BSD algorithms attack the dip (burst) from both sides,and halt somewhere near the middle of the dip. Hence, with the assumption of correctForward treeroot nodeo Nonbreakout node• Breakout node(a)Backward treeroot node DIP(b)Forward treeroot nodeOptimum meeting (merging) pointBackward treeroot node57Chapter 4 Computational Performance of BSDdecoding, the meeting (or merging) event can be viewed as the breaking of the dip inan almost optimal way, i.e., into two almost equal parts (see Figure 4.1 (c)). It seemsplausible that the required amount of computations by the proposed BSD is approximatelythe amount required by a USD to move through two dips (bursts) of half the original size.This is why our BSD needs substantially less computations. Therefore, if USD needsN cx exp-y computations for any given correct path dip ‘-y, our BSD asymptotically needsabout twice of exp cx /T computations for the same dip. If this characterization isvalid, we can conclude thatp(CBSD > N) P(Cy8D > (N)2)2 N2Pr. (4.1)Note the factor two in the exponent of the distribution (4.1).4.2 Upper Bounds to Computational Distributionof Algorithm TAmeetThe computational effort to decode a given block depends on the strategy of thealgorithm, the transmitted and the received sequences. For the convenience of analysis,define a random variable C as C 2 max [mm (Of, Ce_)j. We first derive upperbounds to the distribution function of the random variable C, then we will apply themto CL of Algorithm TAmeet.Lemma 4.1:P(C> N) < P(cf > N/2)F(Cf_ N/2), (4.2)where the summation is over all possible values ofx, i.e., 1 < x < L— 1.58Chapter 4 Computational Performance of BSDProofS By the definition of the random variable C, one can immediately writeP(C> N) P(max{rnin[C,C_]} N/2)p (min[C, C_] N/2), (4.3)where the summation is over all possible values of x, i.e., 1 <x < L — 1. The inequalityin (4.3) is obvious since the summation includes the maximization point. Because randomvariables C’ and Cf_i are independent for any fixed arbitrary x, one can write (4.2)[49]. Q.E.D.Before we continue the derivation of the upper bound, we need the following lemma.Lemma 4.2: The distribution P(CK’’ > N) of USD to decode A branches in afiniteA branch code tree (or infinite code tree) is upper bounded byP(GKSD N) AP(C1— 1), (4,4)where C1 denotes the number of computations peiformed by USD in order to decode thefirst branch in an infinite code tree.Proof’ For an infinite code tree, according to the union bound, we have [20]P(CK >N) <P(OctN_A)N_A)=AP(C1_l) (4.5)where CK’ denotes the number of computations required to decode the first A information branches in an infinite tree, and Cj is the number of computations by USD in59Chapter 4 Computational Performance of BSDorder to decode the t-th branch. For an infinite code tree, C = 01 since the numberof computations performed by USD to decode one branch is the same for all branches.For a finite code tree, we haveP(CK8D N) P(CK18D > N)AP C>——1 (4,6)because CK Q.E.DLemma 4.2 confirms that there exists no essential difference between P (cS) N)and P(C1 N) as Jacobs and Berlekamp suggested in [30]. From now on, we let ofand C1B denote the number of computations required to decode the first branch in theinfinite forward and backward code trees, respectively.Theorem 4.1: For a specific time-invariant convolutional (i.e., linear trellis) code t!with rate R and a BSC with a crossover probability p satisfying (2.9), the computationaldistribution of Algorithm TAmeet is upper bounded byP(CL > N) <A1 . exp{—t(d[log N — log (uL)]+ d[log N — 1og (uL)]) + 21og N} (4.7)where A1 is a finite constant independent ofN. Here , q, d[x], c1[x], ndF and arethe same as defined by Chevillat and Costello [38]. The superscripts F and B denote thecorresponding forward and backward codes of Q, respectively.Proof’ Using Chevillat and Costello’s upper bound, which is given by (2.10) for aspecific time-invariant convolutional code [381, we can express of asP(cf — <a n .exp{_itd[logu ( — q1og (Y — i)} (4.8)60Chapter 4 Computational Performance of BSDand Cj8 as1B N B 1N2(L— x)— 1) < exp{—d 1og 2(L — x) — 1) +log(2LN_X)_1)}. (4.9)Thus, according to Lemmas 4.1 and 4.2, we can writeP(C N) •(L—x)•ndF .exP{_(d[1ogu (_ )]+d [iogu (2(L x) 1)]) + log [( x) _i)] }. (4.10)Notice that the CDF is a non-decreasing function and log (a — 1) > log (a) — 1 whena 2 and u 2. We can thus writeP(C N) <2 x (L— x).ndF exP{_(d{logu ()_ ]+ d [iogu (2(L - X)) 1]) + 1og [ () ( -2 —-- 1---- F•2 mu . [x. (L — x)] mu. “z .exp{—(d [1ogN— log (2uL)} + d[1og N — logy (2uL)]) + 2log N}= A1 exp{—t(d[1og N — 1og (2uL)]+ d[1og N — 1og (2uL)]) + 2q1og N}. (4.11)According to Property 3.2, we know that CL — 1 < C, Hence, P(CL > N)P(C> N). Q.E.D.Theorem 4.1 shows that rapid column-distance growth of both forward and backwardcodes are essential to make BSD efficient. Using a computer search procedure, we haveprovided in Chapter 2 SBODP and SABODP codes which have the same rapid columndistance growth rate from both forward and backward directions.61Chapter 4 Computational Performance of BSDHowever, if the convolutional code used is optimum from one direction but is verybad from the other direction, one can expect that the distribution of computations usingBSD falls close to that using USD from the optimum direction, as the above boundindicates. A systematic ODP code is one typical example. Therefore, systematic codesare in general not suitable for BSD algorithms.Corollaiy 4.]: For a symmetric convolutional code (e.g., a SBODP or SABODPcode), Theorem 4.1 can be written asP(CL > N) <A1 (ndj2 . exp {2(—d[1og N — 1og (2uL)] + 1og N)}, (4.12)since d[x] = d[x] d[x] and d’ = = Notice the factor two in theexponent of the distribution (4.12).Furthermore, the above bound (Theorem 4.1) can also be shown to be related torandom-coding results, For an ensemble of trellis codes, d[x] = d[x] = n x/2,< N/2 and ndB < N/2 [38]. Therefore, for an ensemble of trellis codes, Theorem4.1 can be expressed asP(CL> N) <2 2i [x . (L — x)]. (N)2 . exp{—2(n/2— ) log N+ n1og (2uL)}= A1 (2uL).()2. exp {—2(n/2— )log N}= A2N” (4.13)where J, = (au. n — 2q)/(2lnu) — 1, which is the same as that of USD [38]. Moreover,A2 and J, are functions of the channel and code rate but independent of N.The above arguments are more clearly demonstrated by the following theorem. Noticethat Lemmas 4.1 and 4.2 can be applied to an ensemble of codes.62Chapter 4 Computational Performance of BSDTheorem 4.2: Let Pr be defined by R = E0(Pr) / Pr On any DMC and for largeN, the computational distribution of Algorithm TAmeet for an ensemble of trellis codes,P(CL > N), is upper bounded byP(CL > N) A3N2, (4.14)where P <Pr is the Pareto exponent of USD and A3 is a finite constant independent ofN.Proof According to Lemma 4.1, 4.2 and (2.8), one obtainsN) <A2x(L_x)(_‘Y[2(Lx)—i]=2AN[x(L — )]1+P(i — 2L— x)_P (4.15)in (4.15) for N sufficiently large such that N> 2L, the terms in the last two brackets arevery small and can be bounded by a constant. Therefore,P(C> N) <A3N2, (4.16)whereA3 = 2A [x(L — x)J’ (i — 2x) — 2(L— x)]_P[x(L — x)]1+P(l +) [i + 2px)]<22PA(1 + p + p2/4) [x(L — x)]1. (4.17)From Property 3.2, we know that CL — 1 < C. Hence, P(CL > N) <P(C N). Q.E.D.From our computer simulation evidence (see section 4.6), we suggest that Theorem4.2 can be applied to good specific time-invariant convolutional codes. The reason63Chapter 4 Computational Performance of BSDis as follows. Experimental results [48) have demonstrated that inequality (2.7) canactually be used as an approximate expression for the computational distribution withgood specific time-invariant convolutional codes. Moreover, Anderson [50, 51] showedthat good specific time-invariant convolutional codes follow the random code propertiesquite closely. In other words, good specific time-invariant convolutional codes havepseudorandom characters although they are not random.4.3 Extension to Other BSD AgorithmsIn this section, we conjecture that the same asymptotic random coding upper bound(Theorem 4.2) also applies to Algorithm TAmerge. Algorithm TAmerge is similar toAlgorithm TAmeet except for the stopping rules. More specifically, Property 3.2 ofAlgorithm TAmeet and Property 3.6 of Algorithm TAmerge are basically the same. Byanalogy to the analysis of Algorithm TAmeet, we can writeP(GmeTe> N) P(GL > N) P (min[G, cf] > N/2), (4.18)where0Amerge is the true number of computations of Algorithm TAmerge, i.e. withoutthe assumption of correct decoding. Hence, the only question is that for any fixed arbitraryx whether C and are independent or not.As defined in Chapter 3, for an arbitrary forward tree level x, random variablesO and Of are the number of computations required to correctly decode x and (L-x)branches by forward and backward decoders, respectively. Furthermore, unlike AlgorithmTAmeet, both forward and backward decoders may extend incorrect paths beyond levelx before the end of decoding. Thus, some incorrect nodes searched by forward decodermay also be visited by backward decoder.64Chapter 4 Computational Performance of BSDLet Df’ be the forward incorrect path subset generated from forward tree level i,Similarly, let D denote the backward incorrect path subset generated from backwardtree level j. According to the operation of sequential decoding, C is a function of themetrics of the correct path up to level x and metrics of incorrect paths in the subsets Dr,1 =1, 2, ..., x, [31, 46, 52J. Thus, one can write= f(M[xF(h)], M[F]), where h e [1, x] and F E U Dr. (4.19)Here, XF(h) is the forward correct path up to forward tree level h, and *F is a forwardincorrect path. Similarly, one can write= f(M[xB(h)],M[B]), where he [1,L — x] and B C UDP. (4.20)Here, xB(h) is the backward correct path up to backward tree level h, and $cB is abackward incorrect path.Strictly speaking, on a random code tree, C and C_ are not truiy independent,because an incorrect forward path *F may share some common nodes with an incorrectbackward path xB. Fortunately, the probability of this event gets smaller as the codememory m increases.10 Hence, by neglecting the above event, we can consider C’and cf essentially independent, and thus conjecture that Theorem 4.2 also applies toAlgorithm TAmerge.In Algorithm TTmerge, forward and backward merged paths may overlap each other,and hence the analysis of the computational distribution is even more difficult (if notimpossible). However, as our computer simulation results (see Figure 4.9 in section 4.6)10 The probability of this event is in the same order of the probability of error, which is shown in Chapter 5 to decreaseexponentially with m.65Chapter 4 Computational Performance of BSDsuggest, the overlapping area is insignificant, especially when code rate R Thus,it seems reasonable to suggest that the computational performance of Algorithm TTmergecan be approximated by that of Algorithm TAmeet if the code rate RFinally, it is easy to see that the computational distribution of Algorithm HTTmergeis upper bounded by that of Algorithm TTmerge and lower bounded by that of AlgorithmTAmeet. Clearly, Algorithm HTTmerge is closer to Algorithm TTmerge when the numberof required matching information symbols mh is close to the code memory m. Otherwise,it is more like Algorithm TAmeet. This phenomenon is clearly observed in the computersimulations (see Figure 4.2 in section 4.6).4.4 Lower Bound to Computational DistributionFollowing Jacobs and Berlekamp [30], we show that with the assumption of correctdecoding no algorithm of BSD type can have a computational distribution better thanN2°.Theorem 4.3: For all convolutional (or trellis) codes, any DMC, and with the assumption of correct decoding, the distribution P (Cf > N) of any BSD algorithm islower bounded byp(CBSD> N) 22aN_2exp{_O(/i)}, (4.21)where a is defined in parametric equation (2.7).Proof Let us separate every received block into two almost equal parts, i.e. forwardtree search operations will not go beyond forward tree level [(L — m)/2j and backwardtree search will not go beyond backward tree level F(L— m)/21 by the help of a genie as66Chapter 4 Computational Performance of BSDdefined by Jacobs and Berlekamp [30]. Notice that there are m branches in the codewordtree separating the portions considered by the genie-aided forward and backward decoders,allowing them to choose codeword paths independently. Let Cm)/2Janddenote the total number of computations required before first correctly decoding the[(L — rn)/2J branches by the idealized forward decoder and the [(L — m)/2] branchesby the idealized backward decoder, respectively. Obviously under the assumption ofcorrect decoding, > 2mm_m)/2J0RL_m)/21] for any BSD algorithm.Therefore, we can write> N) P (2mm [C_m)/j,l]> N)= (C_mIj > N/2, C_m)/l > N/2). (4.22)By the assistance of a genie, forward and backward search areas do not overlap, i.e.events (C_m)I2j> N/2) and (0_m)/21 > N/2) are independent for any DMC.Therefore, we finally getP(0BsD > N) (Cm)/2j> N/2) . p (Cm)/2l> N/2){ (*) a exp [-Oi (vci)]=2aN_exp [_o(/iT)]. Q.E.D. (4.23)Combining the lower bound (Theorem 4.3) and the upper bound (Theorem 4.2), wedemonstrated that the computational distribution of the proposed BSD is Pareto, and itsPareto exponent is between 2c and 2p, where p < Pr is the Pareto exponent of USD.Since the Pareto exponent of USD is approximately c or Pr, we conclude that the Paretoexponent of BSD is approximately twice that of USD,67Chapter 4 Computational Performance of BSD4.5 Average Number of ComputationsIn this section, we examine the average number of computations of BSD and showthat it is smaller than that of USD. The computational cutoff rate of BSD is then discussed,We define the average number of computations per decoded branch, denoted by Gb,as the total number of computations used to correctly decode one block divided by theblock length L. Using Jacobs and Berlekamp’s approximation [301 for the distributionof CKSD, which is given byP(CKSD > N) AN, (4.24)we can approximate the distribution of BSD as> N) (r_1L)2 N, (4.25)where we assume L >> m, and ignore the difference between L(L — m)/2j and— m)/21 since both tenns are almost equal to L I 2 when L is large.The total number of computations per decoded block of BSD can be any integer inthe interval [L, M] where M is the maximum number of computations in BSD. Thus,BSD — 1 BSDECb 7L= NP(Cf = N)N [ (cfSD> N) — p (cf > N)]M-l M= (N + l)P(C > N) — NP(Cf > N)N=L-1 N=L68Chapter 4 Computational Performance of BSDM-l=P(cfsD>N)+P(cfsD>L_l)—SD> M). (4.26)In (4.26), P(C > L — i) = 1 and p(cfS) > M) = 0. Therefore, using (4.25),(4.26) becomesM-lE[C] =l+yzP(Cf8D>N)M—l1 + 22a_2 L > N2’. (4.27)Moreover, similar to [30], we can writeM-l M— L2j, a (4.28)lnM—lnL,Finally, using (4.28), we obtain1L jl—2a L2(1_a)l .i 1E [csD] . + L — a - (4.29)11+222L(1nM—lnL),Similarly, one can get the average number of computations per decoded branch forUSD asE C-° f 1 + — L’), a 1 (4.30)H1+1nMi—1nL, a=1’where M1 is the maximum number of computations in USD.From (4.29) and (4.30), it can be seen that E [c°] is smaller than E forany a. However, the difference becomes less significant when a is much larger than69Chapter 4 Computational Performance of BSDone, This is because the average computation for both USD and the proposed BSD isessentially one computation per branch when a >1. However, it is expected that higherorder moments of the computation will be significantly improved by the proposed BSDeven for high a.Let us now discuss the computational cutoff rate. The computational cutoff rateof sequential decoding is defined as the supremum of rates for which the averagecomputation per branch is bounded [301. It is well known that with the assumption ofcorrect decoding the computational cutoff rate Rcomp of USD is equal to E0(]) [30, 53].Rcomp is a truly critical channel parameter, and with the assumption of correct decodingthe average number of computations per decoded branch of USD is unbounded if thecode rate R > R0 [30, 53]. In the following theorem, it is shown that the computationalcutoff rate for BSD with the assumption of correct decoding is the same as that of USD.Theorem 4.4: Consider an infinite sequence of convolutional (or trellis) codes {}and any DMC. For code j, let R denote its rate and m its memory. Suppose each codehas n symbols per branch in its codeword tree, and R1 R for all i, for some R > R01,Code is used with block length L1 (including a known tail to terminate the encoder ina known state) for BSD. Equiprobable codewords and arbitrary sequential decoder areassumed. Under the assumption of correct decoding,” (fL —* oc and L (i +for any i and some 0 < eo 1, then the expected number of computations per branchsatisfiesE [CsD] oo (4.3])u This assumption is equivalent to defining the number of computations needed to be infinite if BSD makes decoding error.70Chapter 4 Computational Performance of BSDProof’ See Appendix A. Q.E.D.As a simple approach, from (4.29) one can actually show that E [GSD] —* co whenL — cc and a < 1. However, it is important to notice that the above computationalcutoff rate is based on the assumption of correct decoding. As suggested by Anderson[51], this assumption is unrealistic and in fact there is no cutoff rate phenomenon forsequential decoding as long as Fe > 0. Nevertheless, the cutoff rate is a bound such thatabove this rate the average computation will increase exponentially. From (4.29) and(4.30), one can easily see that when a < 1, i.e. above the cutoff rate, E [CD] is muchsmaller than E [C1] although it is also much larger than one. Hence, the useful rangeof code rates with the proposed BSD is increased compared to that with USD accordingto Anderson’s definition on a “useful decoder” [51].46 Numerical and Simulation ResultsComputer simulations have been performed in order to verify the above theoreticalresults. Non-systematic rate r = 1 / 2 (bits per channel symbol) convolutional codes wereused. All algorithms were run over a BSC with a transition probability p under strictlyidentical conditions (i.e., using the same code and noise sequences), in order to makethem comparable. In our computer simulations, metrics were scaled into integers. InAlgorithm TAmeet, Algorithm TAmerge and Algorithm HTTmerge, the substack spacingwas chosen to be equal to one, i.e., using unquantized stacks. Different substackspacing values, namely L = I and zi = 7 were chosen in Algorithm Tlmerge. Noticehere that there may exist more than one path in the highest non-empty substack evenwhen z I.71Chapter 4 Computational Performance of BSDFigure 4.2 shows the distributions of the total number of computations per decodedblock of both USD and BSD algorithms for the case p = 0.0409, corresponding to aPareto exponent Pr = 1.1 (i.e., R0 = 0.52 bits per channel symbol). A SBODP codeof memory m = 23 was used, and 200,000 blocks each of length L = 400 bits (377information bits) were simulated for each algorithm. The Pareto approximations forUSD (stack algorithm z = 1) and BSD (Algorithm TAmerge) are also indicated on thefigure. As expected, the use of BSD results in a significant reduction of the computationalvariability of sequential decoding. More specifically, the slope of the straight line thatapproximates the tail of the distribution of Algorithm TAmerge is equal to 2.16, which ismore than twice that of USD. This is in agreement with our derived upper bound on thecomputational distribution (Theorem 4.2), Moreover, it also agrees with the lower bound(Theorem 4.3), i.e., less than 2cr. It can be noticed from Figure 4.2 that the slopes of thecomputational distributions of all BSD-merge algorithms are similar to that of AlgorithmTAmeet, which suggests that the additional computations associated with the merging testare asymptotically unimportant as compared to the computations of Algorithm TAmeetwhen the code rate R <As Forney [36] suggested, Figure 4.2 shows that an unidirectional quantized stackalgorithm (z2i=7) increases the computational effort compared to an unidirectional unquantized stack algorithm (z=1). However, Figure 4.2 shows that the tail of the computationaldistribution of Algorithm TTmerge improves when the quantized stack algorithm (=7)is used. This is due to the fact that if the substack spacing L is increased, more pathswill be tested in the merging test so that the chance of merging gets enlarged.72ProbabilityP(CL>N)C Q.CD t-)00CD.Cf2 1CDC C—CCDIIC C CC) CD 000 C C C0, (JC C00000Chapter 4 Computational Performance of BSDFor comparison purposes, the computational distribution of Fomey’s scheme [22]built on the stack algorithm is also depicted in Figure 4.2. As we pointed out in Chapter3, the total number of computations of Forney’s scheme is equal to 2 mm (Ci, CL9.Hence, the minimum number of computations with this scheme is equal to twice thatof USD. Clearly, as indicated in the figure, Fomey’s scheme does not provide anysignificant advantage over conventional USD. This is mainly because the dip size in thecorrect path metric is essentially the same whether decoding is performed from forwardor backward directions (see Figure 4,1). Thus, the number of computations, which ismainly determined by the dip size of correct path metric, is basically the same for bothforward and backward decoders.Table 4.1 shows the average number of computations per decoded branch of differentalgorithms. Clearly, the average number of computations is reduced by using theproposed BSD. Table 4.1 also shows the number of erasure blocks (undecoded blocks)with a maximum allowed computation Cijm = 8000. These results suggest that ourBSD can significantly alleviate the erasure or overflow problem, which is the mainobstacle in the application of sequential decoding. Furthermore, Table 4.1 shows thatwith an increase in computations of about 3% with Algorithm TAmerge compared toAlgorithm TAmeet, decoding errors (18.3% decoded blocks were in error after decodingby Algorithm TAmeet) were practically eliminated.12 Thus, with the use of the mergingtest in BSD, decoding errors can be significantly reduced. For Algorithm HTTmerge,Figure 4.2 and Table 4.1 suggest that there exists a substantial computational gain by justletting m = (m — 1) in the merging test. Thus, Algorithm HTTmerge can easily provide2 For a more detailed analysis and computer simulation results on the error perfomiance of the proposed BSD, refer to Chapter6.74Chapter 4 Computational Performance of BSDSBODP Code m = 23 Ave. comp.Erasure blocks Error blocksPr’’, L=400, 2x 10 5runs per branchUSD (unquantized stack 1.621 1324 0algorithm_z = 1)USD (quantized stack algorithm 1.776 1743 0L = 7)Algorithm 1.264 7 36614TAmeet (z2i = 1)Algorithm 1.303 21 1TAmerge_(L = 1)Algorithm1.390 84 1rrmerge_( = 1)Algorithm 1.393 59 0TTmerge_(z = 7)Algorithm HTTmerge 1.333 39 1(mh = m — 1, L = 1)Algorithm HTT’merge1.319 31 3(mh = m — 2, = 1)a trade-off between computational effort and error performance by adjusting the numberof matching information symbols m.Figure 4.3 compares the computational distributions of Algorithm TAmerge for ODP,SBODP and SABODP codes all of memory m = 23, using the same parameters asin Figure 4.2, It can be seen from this figure that the distribution of the number ofcomputations per decoded block using a SBODP code is better than that with an ODPcode. However, the distribution of the number of computations per decoded block usinga SABODP code is almost identical to that with a SBODP code. With a SABODP code,no errors were found after decoding because its free distance (equals to 24) is muchlarger than the free distance of the SBODP code (equals to 18). These simulation resultsTable 4.1 Comparison of average computations for L = 400, Ciim = 8000 and p = 0.0409.75Chapter 4 Computational Performance of BSDare clearly in agreement with our theoretical observations.Figure 4,4 shows the distributions of the total number of computations per decodedblock of different algorithms using a systematic ODP code (m = 23). As expected, thisfigure shows that the computational distribution of BSD is almost the same as that ofUSD. This is due to the fact that the backward code of an ODP forward code is toopoor and hence the backward decoder can hardly ever help in decoding. Figure 4.4 alsodemonstrates that the number of computations needed in Forney’s scheme is essentiallyequal to twice that of USD.Figure 4.5 shows the computational distributions of the total number of computationsper decoded block of both USD and our BSD algorithms for the case p = 0.0594,corresponding to a Pareto exponent Pr = 0.68 (i.e., R0 = 0.44 bits per channelsymbol). The same code as in Figure 4.2 was used again, and 10,000 blocks eachof length L = 200 bits were simulated for each algorithm. The Pareto approximationsof these distributions are also indicated on the figure. In Figure 4.5, the computationaldistribution of Algorithm TAmeet (a BSD-no-merge) clearly violates the lower boundgiven by Theorem 4.3 because there were too many decoding errors (44.1% decodedblocks were in error as illustrated in Table 4.2),13 However, no errors were noticedafter decoding by Algorithm TTmerge, while only one block was in error after decodingby Algorithm TAmerge. The slope of the straight line that approximates the tail ofthe computational distribution of Algorithm TAmerge is equal to 0.89, which is morethan twice that of USD. Again, it agrees with our upper and lower bounds. Moreover,Notice thai the lower bound in Theorem 4.3 is based on the assumption of correct decoding.76Chapter 4 Computational Performance of BSD0Ac-)• 10Figure 4.3 Distribution of the total number of computations per decodedblock for different codes using the same parameters as in Figure 4.2.400 800 1200 1600 2000 4000 8000N77Chapter 4 Computational Performance of BSDAc-)LFigure 4.4 Distribution of the total number of computationsper decoded block for a systematic ODP code (Pr 1.1).8000010N78Chapter 4 Computational Performance of BSDthe slopes that approximate the computational distributions of Algorithm TTmerge andAlgorithm TAmerge are almost the same.Table 4.2 compares the average number of computations for different algorithms forthe case p = 0.0594. The number of erasure blocks and the number of error blocks arealso shown in the table. It can be seen that the number of erasure blocks is substantiallyreduced by using the proposed BSD. Also, by using the merging test, almost all errorsassociated with Algorithm TAmeet were avoided, Furthermore, by comparing Table4.1 and Table 4.2, one can notice that the improvements in the average number ofcomputations of our BSD compared to that of USD are more significant when the Paretoexponent Pr is less than one. As anticipated, the useful range of code rates [51] withour BSD is increased as compared to the case with USD. These observations agree withour suggestions made in section 4.5.Figure 4.6 shows the distributions of the total number of computations per decodedblock of both USD and our BSD algorithms for the case p = 0.0289, corresponding toa Pareto exponent Pr = 1.5 (i.e., Rcomp = 0.58 bits per channel symbol), The same codeas in Figure 4.2 was used, and 5,000,000 blocks each of length L = 400 bits were runfor each algorithm. Again, the slopes that approximate the computational distributionsof the different BSD algorithms are quite similar. Once again, these simulation resultsare in agreement with our analytical results.Table 4.3 compares the average number of computations for different algorithmsfor the case p = 0.0289. The average number of computations, the number of erasureblocks and the number of error blocks are also shown in Table 4.3. It can be noticedthat BSD-merge eliminates all decoding errors associated with Algorithm TAmeet (4.7%790Ac)1010Chapter 4 Computational Performance of BSDFigure 4.5 Distribution of the total number of computations perdecoded block for the case L = 200 and p = 0.0594, i.e., Pr 0.68.400010iv80Chapter 4 Computational Performance of BSDTable 4.2 Comparison of average computations for L = 200, Cijm = 4000 and p = 0.0594.SBODP Code m = 23 Ave. Coffi Erasure blocks Error blocksPr=°68,L=200, i04 runs per branchUSD (unquantized stack 4561 1175 0algorithm_ = 1)USD (quantized stack 5.106 1382 0algorithm_ = 7)Algorithm 1,541 0 4406TAmeet (z = 1)Algorithm 2.284 166 1TAmerge_(L = 1)Algorithm2.799 293 0TTmerge_(z = 1)Algorithm 2.894 306 0TTmerge_( = 7)blocks). Moreover, the additional number of computations due to the merging test isonly 0.5% with Algorithm TAmerge compared to Algorithm TAmeet. All results showthat the additional computations due to the merging test are worth the reward and thosecomputations can be ignored when the code rate R <Reomp. As indicated in the table,the average number of computations of USD and our BSD are quite comparable here.However, due to the reduction in computational variability, as shown in Figure 4.6, it isexpected that higher order moments will be smaller with the proposed BSD than withUSD.Figure 4.7 shows the empirical probability masses of merging and meeting points inBSD for the two cases Pr = 1.1 and Pr = 1.5. It can be seen that statistically, there isessentially no difference between the meeting and merging points. This implies that theoverlapping portion between forward and backward search areas when the merging test810 C Q.CD0 CD. 0IICD00CDII 000 CE]. flC C IProbabilityP(CL>N)— oo0 C 00 0 C——oC0C C C C 0Chapter 4 Computational Performance of BSDTable 4.3 Comparison of average computations for L = 400, Clim = 4000 and p = 0.0289.SBODP Code m = 23 Ave. comp. Erasure blocks Error blocksPr=L5,L=400, 5x 106 runs per branchUSD (unquantized stack 1.139 575 0algorithm_L = 1)USD (quantized stack algorithm 1.167 4336 0__7)Algorithm1,107 5 233321TAmeet (z = 1)Algorithm1.113 11 0TAmerge_(L = 1)Algorithm1.122 42 0TTmerge_( = 1)Algorithm1.137 15 0TTmerge_(z = 7)is used is negligible when the code rate R <R01,, Moreover, Figure 4.7 also showsthat the probability mass of the merging (meeting) point is concentrated towards themiddle of the block. That is to suggest that when the channel is relatively quiet (i.e.,Pr > 1), the merging (or meeting) point is more likely to be near L / 2, which meansthat the computational lower bound on the computational distribution (Theorem 4.3) isquite accurate in this case.Figure 4.8 shows the empirical probability masses of the merging points for AlgorithmTAmerge and Algorithm TTmerge and that of the meeting point of Algorithm TAmeet,for the case Pr = 0.68. As can be seen, the probability mass of the merging point withAlgorithm TAmerge is quite similar to that with Algorithm TTmerge. However, unlikethe above case (i.e., R < Rcomp), here the probability mass of the meeting point startsto differ from that of the merging point. This suggests that the additional number of83Chapter 4 Computational Performance of BSD0.070,060.05—. 0.04-.S0.030.020.0100 400Figure 4.7 Probability mass of merging and meeting when code rate100 200 300x84Chapter 4 Computational Performance of BSDcomputations needed for merging from the first meeting point increases as the code rateincreases beyond Rcomp.Figure 4.9 shows the empirical distributions of overlapping lengths in AlgorithmTAmerge and Algorithm TTmerge, which is defined as= 1F + 1B — L at the end ofdecoding. O, represents the portion of the tree over which forward search and backwardsearch overlap each other. In Figure 4.9, the block length L=200 for the case Pr = 0.68,and L=400 for Pr = 1.1 and Pr = 1,5. As expected, the distribution of the overlappinglength decays faster as the channel becomes less noisy. Figure 4.9 suggests that thedistribution of the overlapping length decays in an exponential fashion. This is because O,is essentially the sum of the deepest depths reached by forward and backward decoders,and it is known that the distribution of the depth x of an incorrect path explored by asequential decoder decreases exponentially with x [4, 38].Finally, since in BSD two separate tree search processors can work in parallel, itseems more reasonable to define a computation in BSD as hypotheses (extension) of twonodes, one on the forward direction and the other one on the backward direction. Clearly,all analyses in this chapter still hold except that “N” should be replaced by “2N”. Asa result of this definition, our computer simulation results should be shifted horizontallyto the left by a factor of two.Figure 4.10 compares the distributions of Algorithm TTmerge (L = 1) for differentSABODP codes (m = 12, 16, 23) using the same parameters as in Figure 4.2, but withthe new definition of one computation with BSD. The number of blocks used for eachsimulation was 200,000 each of length L = 400. The maximum number of computationsallowed to decode one block was set to 4000. The computational distributions of USD85Chapter 4 ComputationaL Performance of BSD0.0350.030.0250.020.0150.010.0050200Figure 4.8 Probability mass of merging and meeting when code rate > R01.0 50 100 150x8600TjCD 0 C CD C CD CDProbabilityP(O>X)0n 0 C) C Cl)C CChapter 4 Computational Performance of BSDTable 4.4 Comparison of computational and error performancesof different m with the new definition of one computation.SABODP Codes Ave. comp. Erasure blocks Error blocksPr1,L=400, 2x i05 runs per branchUSD (unquantized stack 1.571 2761 100algorithm_A = 1, m = 12)USD (quantized stack algorithm 1.564 2793 11A = 1, m = 16)USD (quantized stack algorithm 1.530 2595 0A = 1, m = 23)Algorithm Tlmerge 0.710 42 383(A = 1, m = 12)Algorithm TTmerge 0.707 73 27(A = 1, m = 16)Algorithm Tlmerge 0.700 77 0(A = 1, m = 23)are also shown in Figure 4.10. Moreover, the number of erasure blocks, the numberof error blocks, and the average number of computations, (using the new definition ofone computation with BSD) are shown in Table 4,4, Similar to USD, the number ofcomputations of BSD is essentially not related with m. However, the error performanceis improved with increasing m.In summary we have demonstrated that the use of the proposed BSD can reduceboth the computational variability and average number of computations of sequentialdecoding. It should be pointed out at this stage that in our comparison of BSD-mergealgorithms with USD, we did not take into account the additional processing time neededto perform the merging test in BSD-merge. However, one can use extra processors toperform this task that would not slow down the node extension operations of sequentialdecoding. Therefore, it seems that the additional complexity due to the merging test88Chapter 4 Computational Performance of BSDFigure 4.10 Distribution of the total number of computationsfor different m with the new definition of one computation.A010—110-210-31010200\Unidirectional“ Stack Algorithm‘\ ‘% V (A=l)\\____\\‘c‘Algorithm TTmerge — — — —(A=l)%...—c•\%.,\___SABODP Codesm=23m=16m = 12 — —— — ——%.400 600 800 1000N2000 400089Chapter 4 Computational Performance of BSDis unimportant compared to the significant improvements in computational performanceobtained by using the proposed BSD.90Chapter 5Error Performance of BSDIn this chapter, the error performance of the proposed BSD algorithms is analyzedthrough the random coding argument. We assume that the channel is a DMC and BSDis based on the stack algorithm. However, all results obtained in this chapter can alsobe applied to arbitrary BSD sequential decoders, It is obvious that the error performanceof any BSD is lower bounded by that of Viterbi decoding since sequential decoding issuboptimum. Thus, we only need to develop upper bounds on the error performance of theproposed BSD. First, the basic elements of the error performance of USD are summarized.Then, we deal with the probability of decoding error of B SD-merge (Algorithm TAmergeand Algorithm ‘TTmerge). Finally, the error performance of BSD-no-merge (AlgorithmTAmeet) is discussed. It is found that for an ensemble of linear trellis codes the errorperformance of BSD-merge is asymptotically the same as that of USD, while the bit errorprobability of Algorithm TAmeet satisfies the random coding bound for block codes.5.1 Error Performance of USDThe basic results of the error performance of USD are summarized in the following.Let x(j) denote the correct path from root node to level j. Let x (t) denote an incorrectpath which diverges from the correct path x at level i and remerges at level (i + t).A proto-error event as defined by Forney [36] is an event for which some incorrectpath x(t) has a metric M[x(t)j at the point of convergence with the correct path x91Chapter 5 Error Performance of BSDsuch thatM[x(t)] rninM[x(j)j. (5.1).2 tBy definition, no error can occur without a proto-error event whereas a proto-error eventmay not necessarily lead to a decoding error for a unidirectional unquantized stackalgorithm [36]. Furthermore, for an incorrect path to remerge with the correct path, tmust be no less than the code constraint length K. For an ensemble of linear trellis codes,the block error probability p8D, and the bit error rate with a unidirectionalunquantized stack algorithm can be bounded by [6, 12, 36]pUSD < [1_Uo(1)/R]2G 0 R (1 — E)Rcomp (52)I [l__E(5)IR]2e , (1 — e)Rcomp R (1 —andpU9D < I [l_U , 0 R (1 — )Rcompb—en’o(6), (1— e)Rcomp R (1 —where is a positive number, R is the code rate as defined in (2.2), E0(S) is the Gallagerfunction [5], Ci,, is the channel capacity, and for Rcomp < R < C,,, S e [0, 1] is thesolution ofR=(l—e)E08 (5.4)and for 0 R Rcomp, S = 1.Now, define a proto-error event with a bias B as the event for which some incorrectpath x(t) has a metric M[x(t)] at the point of convergence with the correct path xsuch thatM[x(t)j minM[x(j)]— B, (5.5)92Chapter 5 Effor Performance of BSDwhere B is a positive number. Thus, for the unidirectional quantized stack algorithmwith a substack spacing , no error can occur without a proto-error event with a bias B= z. Hence, it can be shown that the error performance of the unidirectional quantizedstack algorithm is asymptotically the same as that of an unquantized one but with aninsignificant increase factor of A6 = e’(’6)in (5.2) and (5.3) [6, 36].5.2 Error Performance of BSD-mergeIn this section, the error performance of Algorithm TAmerge using unquantized stacksis analyzed in detail. The results are then extended to other BSD algorithms. Althoughthe following analysis assumes unquantized stacks, it can be easily generalized to thecase of quantized stacks, in a similar way as for USD [6, 36].Let Smg denote the common encoder state of forward and backward portions of themerged path at the merging point 1. Obviously, Smg can be either correct or incorrect.Figures 5.1 and 5,2 show some typical situations in BSD where the common encoderstate Smg is correct and incorrect, respectively. As indicated in Figure 5.1, there isalways a possibility that more than one forward (backward) path may merge with morethan one path in the opposite direction at encoder state Smg. Should this occurs, AlgorithmTAmerge always choose the one (perhaps not the correct one) with the highest cumulativemetric. In any case, the encoder state Smg at the merging point 10 in Figure 5.1 is assumedto be the correct one. In Figure 5.2, however, Smg is an incorrect encoder state. In thefollowing, we divide the derivation of the block and bit error probabilities into three stepsto prove that the error performance of Algorithm TAmerge is asymptotically the same asthat of USD. In the first step, we show that if the encoder state Smg is correct, decoding93Chapter 5 Enor Performance of BSDS mgFigure 5.1 Example of error events with a correct merging encoder state.Merging point 10Backward tree level iFigure 5.2 Example of error events with an incorrect merging encoder state.errors can be upper-bounded by the proto-error event with a bias ), where ) is the largestpossible one branch metric drop. In step 2, we investigate the probability that the encoderstate Smg is incorrect. Finally, the overall error performance is derived in step 3.Before investigating the above situations in detail, we need the following lemmas.Consider an incorrect path (t) which diverges from the correct path x at level i andForward correct path Backward correct pathMergingpoint 10Forward incorrect pathsBackward incorrect paths• Correct encoder stateForward correct pathForward tree level iBackward correct pathForward incorrect pathsS mgBackward incorrect paths• Correct encoder stateIncorrect encoder state94Chapter 5 Error Performance of BSDstretches out t branches from the correct node i. First of all, an incorrect path will beextended by a sequential decoder using an unquantized stack algorithm only if [6, 36]M[$c(t)] rninM[x(j)]. (5.6)Thus, one can immediately write the following lemma.Lemma 5,1: An incorrect path 5c(t) (either in forward decoding or in backwarddecoding), which diverged from the correct path at level i and remained unmerged up tolevel (i + t), may be on the top of the stack only fM[*(t)]> mm M[x(j)] (5.7,)where (i + h) denotes the farthest level of the correct path reached by the unquantizedstack algorithm.Furthermore, notice that for any path present at any position in the stack, the parentnode of that path must have been at the top of the stack at an earlier time. Thus, wecan write the following lemma.Lemma 5.2: An incorrect path j(t) (either in forward decoding or in backwarddecoding), which diverged from the correct path at level i and remained unmerged up tolevel (i + t), could be in the stack only if> mm M[x(j)] — ). (‘5.8)iji+hwhere (i + h) denotes the farthest level of the correct path reached by the unquantizedstack algorithm.We now proceed with the derivation of the error probability of Algorithm TAmerge.95Chapter 5 Error Performance of BSDStep 1: Analyze the error performance of Algorithm TAmerge given that the encoderstate Smg is correct. According to the operations of Algorithm TAmerge, merging canoccur at either step 3 or step 6. Without loss of generality, we suppose that decoding isstopped at step 3. This means that the end node of the backward portion of the decodedpath at level l is at the top of the BS, whereas the end node of the forward portion ofthe decoded path could be at any position in the FS. As indicated in Figure 5.1, decodingerror in this case is clearly upper-bounded by proto-error events,More specifically, let us suppose that there exists a forward incorrect path that mergeswith a backward incorrect path which is at the top of the BS. Clearly, decoding errorsin the backward portion of the decoded path belong to proto-error events without a bias.Furthermore, according to Lemma 5.2, the metric of the forward incorrect path in the FScan not be lower than the minimum cumulative metric of the forward correct path minus\. Hence, the decoding error event near the merging point 1 made by the forward decoderis bounded by the proto-error event with a bias B = \. Clearly, any other decoding errors(if any) in the forward decoder can be upper-bounded by the proto-error event withouta bias. In summary, all error events in Algorithm TAmerge when the encoder state Smgis correct can be upper-bounded by the proto-error events with a bias B = A. Thus, wecan write the following lemma.Lemma 5.3: if in the merging test the encoder state Smg is correct, decoding errorevents in Algorithm TAmerge can be upper-bounded by the proto-error events with a bias\, which is asymptotically the same as that of USD.Step 2.’ Investigate the probability that the encoder state Smg is incorrect. Since the96Chapter 5 Error Performance of BSDencoder state Smg is incorrect, a decoding error must exist around the merging point 1(see Figure 5.2). Let us define this decoding error event between the last diverging pointsof the correct forward and backward paths as the merging error event (see Figure 5.3).Clearly, only one such event can occur within each decoded block. Let pMerge denotethe probability of the merging error event. Obviously, outside the merging error event,everything is the same as in USD, as illustrated in Figure 5,2.Correct pathForward tree level i+h1 Incorrect path+ Backward tree level iForward tree level iBackward tree level DMerging point 1= i+t1i.e. backward tree level j +t2Figure 5,3 Illustration of the merging error event.In Figure 5.3, we assume that forward and backward incorrect paths diverge from thecorrect path at forward tree level i and backward tree level j, respectively. As indicatedin the figure, tj is the length of the forward portion of the decoded path from level i,and t2 is the length of the backward portion of the decoded path from level j. Let theensemble of all possible forward incorrect paths which diverge from the correct path atS mg97Chapter 5 Error Performance of BSDforward tree level i be denoted by X(i). Similarly, let X(j) denote the ensemble ofall possible backward incorrect paths which diverge from the correct path at backwardtree level]. Let *F,(t1) denote a forward incorrect path in X(i) whose end node is atforward tree level (i + t1). Analogously, let B,(t2) denote a backward incorrect pathin (j) whose end node is at backward tree level (j + t2). According to Property 3.3,t + t2 = L—j — i and the merging point 10 = i + t1 = L— (j + t2) since the forwardand backward portions of the decoded path wifi never overlap. Let (i + hi) denote thefarthest depth of the forward correct path reached by forward decoder, Similarly, let (j+ h2) denote the farthest depth of the backward correct path reached by the backwarddecoder. Obviously, h1 + h2 t1 + t2 according to Property 3.3.Clearly, not all paths in X.(i) and X(j) can pairwise merge. Let .(i; j) containall paths in i) and j) which are pairwise merging paths. It is obvious that1(i; j) can be viewed as the set of all incorrect paths which diverge and remerge withthe correct path at forward tree level i and L—j. Define 1(i; j) as the total number ofpairwise merging incorrect paths contained in the set .(i; j). Hence, t + t2 > K, andI (i;j) (u — l)uL_i_i_K = (u — l)utl+t2_K [6]. Now, let pMere(j1h1,t2h2)be the probability of the merging error event as shown in Figure 5.3. Note that ifparameters i, t1, t2 are given, parameterj is also determined since j = L — (i + tl + t2).We will demonstrate that the upper bound on P(i, t1, h1,t2,h2) is only related with(t1 + t2) and (h1 + h2) after averaging over the ensemble of linear trellis codes.Lemmas 5.1 and 5.2 are all that we need to determine the upper bound on theprobability of the merging error event, and the method used in our derivation follows98Chapter 5 Error Performance of BSDclosely that of Viterbi and Omura [6]. Also, it is important to notice that conditions inLemmas 5A and 5.2 are only necessary but not sufficient.Lemma 5,4: Given i, t1, h1, t2, h2, the probability of the merging error event inAlgorithm TAmerge is upper-bounded byi+hi j+h2pMere(1h1,t2h2) < > P(yx) e_M [xp*r)1 e_M[(T)1x eM[,j(t1)]eM[*B,j(t2)] (5.9)I (F,(il);B,2L))eX(i;j) Iwherej = L— (i+ti +t2).Proof’ Without loss of generality, we assume that decoding is stopped at step 3 inAlgorithm TAmerge. That is merging occurs between a top path xB(t2) in the BS anda path xF(t1) which could be at any position in the FS. According to Lemmas 5.1 and5.2, we can writeM[F,(t1)] — mm M[xF(Ti)]+) 9F(i,tl,hl) O (5J0):<r1z+hiandM[*B,(t2)] — mm M[xB(r2)] 9B(j,t2,h2) 0. (5.11)Moreover, 5cF,j(tl) and XB,,(t2) must share the same encoder state Smg at level 10, i.e.,(*F,(t) XB,f(12)) e X(i;j).Similar to [6, 36], we can writepMerge( t1, h1, 12, h2)<P(yx)q(y), (5.12)99Chapter 5 Error Performance of BSDwhere the received code vector14 runs over all symbols between forward tree level i andbackward tree level j, and the indicator function q5(y) is defined as(1, if gF(i,tl,hl) 0 and gB(j,tl,h2) 0q(y) for some (*F,(t);kB,(t2)) E X(i;j) (5J3)1 0, otherwise.As in [6, 36], if for a given y, (y) = 1, then for u 0exp {.gF(i, t1, hi)}.exp {agB(i, t2, h)} l,for some (kF,(t1); B,(t2)) C X(i;j).(5.14)Hence for this yexp{u .gF(i,tl,hl)} .exp{ .gB(j,t2,h2)}]6 1= (y), a, 0.(ti ) ;xB,,(12))EX(s;j)(5.15)while if (y) = 0, (5.15) holds trivially. Also, we can write [36, 6]i+hiexp —uS{ mm M[xp(Tl)]} exp—uSMxF(T)], (5.16)r=sandj+h2exp—crS{ mm M[XB(T2)]} < exp—aSMxB(T)]. (5.17),r2j+hT=)Substituting (5.16) and (5.17) into (5.15), and combining (5.12) and (5.15) yield(5.9). Q.E.D.As in [54, 35, 6], define the exponent functions15fi() = In P(y) q(x) [] , (5.18)14 Notation is simplified if we do not specify the dimensions of vectors; they are either implicit or specifically designated aftereach equation.Note fi(o8) = —Ec(, 8)—.78R, f2(J, 8) = —Ecj(r, 6) and f(o, 8) = —E1(o, 8)+8R, where Ec(, 8), Eci(o, 6)and Ez(cr, 8) are defined by Viterbi and Omura [6 pp. 358].100Chapter 5 Effor Performance of BSDf2(a, 6) = in P(){ q(x) [)] q(x’) [P(Y)] , (5.19)and(5.20)By applying Holder inequality [4] to the above exponent functions, we can writeefl(A) exp —(1 — )E0[/(1— )] = (5.21)ef2(M <exp —{(1 — 6)Eo{6/(1— 6)] + u6E0[(1 — cr)/cr]} = 6cSie, (5.22)and<exp —6E0[(1— )/j = 6je()äR, (5.23)where 8c and S are as defined in [6 pp.359].Now, we are ready to show that the upper bound on P(i,t1,h1,t2h2) is onlyafunction oft1 +t2 = L—j —i K and h1 +h2 < t1 +t2.Lemma 5.5: Given parameters i, tj, h1, t2, and h2, the ensemble average probabilityof the merging error event in Algorithm TAmerge is upper-bounded bypMergc(ti,h1,t2h2) (enR —l)3S(h1+h2)S(t1+t2)e_6nKR, (5.24)where u>0,6 [0,1],t1 +12 > K,h1 1,h2 1, and h1 + h2 ti +t.Proof’ According to Lemma 5.4 and letting S e [0, 1] as in [4, 6, 36, 54], we canwrite________________________i+hi i+h2pMerge(t1, h1, 12, h2) P(yx) eT5M[xF(T)1 e_0SM[xB(T)1101Chapter 5 Error Performance of BSD( Sx eM[F,j(t1)1 x eM[Bj(t2)1( (*Q);*n3(i2)X( ;.j)i+h1 i+h2< e’X X(i;j) P(ylx) >x {eM[*Fi(u1)1 X eM[(t2)]}< — l)Su(t1+t2_ki+h1 j+h2x P(yx) e_L13M[xF(T)ix {eaM{*Fi(ul)1} X {e’M[3(t2)i} . (5.25)Obviously, (5.25) is independent of indexes i and j due to the ensemble average. Forthe sake of simplicity, let i= j = 0 in above summations, and omit subscripts i and jin (5,25). Also, note that there is no difference in the ensemble average of the forwardand backward codes. ThuspMcrgc (i, t1 h1,t2, h2) e’(u — l)Su(tt2_I)6h1 h2x P(yjx) e_M[xF(1’)1x {eM[F(l)i}5{eJM[B(t2)]=e(u— 1 )Su(t12_K)sT(hi,t1)T(h2t2), (5.26)whereT(h, t)Y x(h:)( Sx q[(tj)jeUMt1 , (5.27)102Chapter 5 Error Performance of BSDand where x(h1) and *(t) are codeword segments of h1 and t1 branches, respectively.Notice the similarity between equation (5.27) and equation (6.2.9) in Viterbi and Omura[6]. Hence, we can write T(hL, t) asT h — fexpn{(h —t)[f1(u6) +6R]+tf2(u,6)}, t 528( — expn{(t — h)[f3(,6)— 6R] +hf(u,6)}, t hSubstitute (5.21) through (5.23) into (5.28), we obtainT(h,t) < flhSfltze_ThtiöR (5.29)Finally, substitute (5.29) into (5.26), and note u = enR by the definition of the code ratein (2.2), to obtain (5.24). Q.E.D.In the following, we show that for an ensemble of linear trellis codes pMerge andpMerge have the same exponential exponent as pUSD and Pb’D but with an insignificantincrease factor of e/(1) at most.Theorem 5.]: For an ensemble of linear trellis codes, the probability of the mergingerror event is upper-bounded by(A —nKRpMerge< ) rl7e V — — “comp (530)e—e0(S), Rcomp R C1’whereA7 = (enR— i) (5.31)is a finite constant, and for Reomp R C, 6 [0, 1] is the solution ofRE0(6) (5.32)while for 0 R Rcomp, S = 1.103Chapter 5 Error Performance of BSDProof The probability of the merging error event without any condition is given bypMerge = (5.33)Vi,21,hi,t2hAccording to Lemma 5.5, given parameters i, t1, h, t2, and h2, the ensemble averageprobability of the merging error event is upper-bounded bypMerge( t1,h1t,h2) < e6A (e7lR —i)6fl(h1+h2)fl(t1+L2)_nKR (5.34)where u 0,6 C [0,lj,t1 +t2 K,h1 > l,h > 1, and h1 + h2 <t1 +t2. Moreover,one can write [6]l,6j <1, if = 1 and R<E0(6) (5.35)Thus letting R = E0(6)/6 where 8 C [0, 1], h1 = = 1 and t = K, we can bound(5.34) bypMerge(t1,h1,t2,h2) eA68K(emR —l)6e_SnKR< eTf (e — i) (5,36)As in [6], choose 6 = 1 if B < E0(1) = Rcomp. Substituting (5.36) into (5.33), provesthe above theorem. Q.E.D.Note that in (5.30) A7 = A6 (e — 1) if z = \. Thus, A7 has the similar effect tothat of quantizing the stack in USD with = \. Therefore, by comparing (5.30) with(5.2) of USD, one can conclude that the additional error if any due to the merging errorevent has the same exponent as USD.Step 3: Now, we are ready to show that the overall error performance of AlgorithmTAmerge is asymptotically the same as that of USD.104Chapter 5 Error Performance of BSDTheorem 5.2: For an ensemble of linear trellis codes, the probability of error withAlgorithm TAmerge is upper-bounded bypTAmerge <A6PsD + pT9e, (537)Proof’ Let A, B and B’ be some events where B’ is the complement of event B.Thus, we can writeP(A) = P(A, B) + P(A, B’)= P(AB)P(B) + P(AIB’)P(B’). (5.38)Now let A be the decoding error event and B the event that the encoder state Smg iscorrect, Then P(AIB’) 1 and we can writeP(A) = P(AIB)P(B) + P(B’)< P(AIB) + P(B’). (5.39)Combining (5.39) and Lemma 5.3, we complete the proof. Q.E.D.Corollary 5,1: For an ensemble of linear trellis codes, the probability of error withAlgorithm TAmerge is upper-bounded bypTAmerge <2A6PJSD. (5.40)Corollary 5.1 suggests that the error probability of Algorithm TAmerge is at mosttwice that of a quantized USD with =105Chapter 5 Error Perfomiance of BSDTheorem 5.3: Given that the encoder state Smg is incorrect, the bit error probabilityofAlgorithm TAmerge follows the same upper-bound as that of a unidirectional quantizedstack algorithm with =Proof According to the definition in [6], the bit error probability is the expectednumber of bit errors in a given sequence of received bits normalized by the total number ofbits in the sequence. Thus for a trellis code with block length L and log2 u informationbits per branch,TAmerge . E[Nbl13b (Smg is incorrect) = (5.41)(L — m)log2uwhere Nb is the total number of bit errors in the L-branch code sequence given that theencoder state Smg is incorrect.Now, let nb(T) denote the expected number of bit errors caused by an incorrect pathdiverging at forward tree level T. As indicated in Figure 5.2, Nb can be divided intothree parts i.e.,i—rn L—mE[Nbj E[n(r)] + E[nb(i, t1,h1,t2,h2)] + E[nb(r)], (5.42)where the first and third terms correspond to the total number of bit errors outside themerging error event, and the second term represents the number of bit errors due to themerging error event with parameters i, tj, hj, t2, h. The inequality in (5.42) followsfrom the fact that bit error sequences outside the merging error event may overlap [6].As in [6], we can writeE[nb(i, t1, h1, 12, h2)] (t + t2 — K + 1) log2 (u)P(i, t, h1, 12, h2), (5.43)106Chapter 5 Error Performance of BSDsince (t1 + t2 — K + 1) log2 (u) is the number of bit errors due to the merging errorevent. According to Lemma 5.5, E[nb(i, t1,h1,t2,h2)] is only related with parameterst t + t2 and h h1 + h2, where t K and 0 < h t. Moreover, since pMere( h)follows the same upper bound as that of the proto-error event with a bias \, we can writeL—mE[Nb] e° E[nb(T)J (5.44)where o 0 arid 6 e [0, 1].Substitute (5.44) into (5.41), we can write_______________________L—mpTAmer9e(g i incorrect) (L -_rn)log2u > E[nb(T)j. (5.45)Notice that aside from the factor e, the right side of (5,45) is the upper bound ofthe bit error probability of a unidirectional unquantized stack algorithm [6 pp. 312j.As discussed earlier, e°’ has the same effect as using a unidirectional quantized stackalgorithm with = \. Q.E.D.Similar to the proof of Theorem 5.2, one can writeP(A) = P(AIB)P(B) + P(AIB’)P(B’)= P(AB) + P(B’) [P(AIB’) — P(AIB)], (5.46)where B still denotes the event that the encoder state Smg is correct. Now, let P(A) bethe bit error probability with Algorithm TAmerge. Then, P (A B) is the bit error rateof Algorithm TAmerge given that the encoder state Smg is correct, and P(AIB’) the biterror rate given that Smg is incorrect.In (5.46), P(B’) is the probability that Smg is incorrect, which is upper-bounded byTheorem 5.1. Furthermore, According to Lemma 5.3, P(AB) follows the same upper107Chapter 5 Error Performance of BSDbound as for a unidirectional quantized stack algorithm with = ).. Similarly, accordingto Theorem 5.3, P(AB’) also follows the same upper bound as for a unidirectionalquantized stack algorithm with = ). Although P(AIB) P(AIB’) in general, we canstill conclude that the second term in (5.46) compared to the first term is asymptoticallyunimportant. Thus, the following corollary follows.Corollary 5.2: For an ensemble of linear trellis codes, the bit error probability ofAlgorithm TAmerge is asymptotically upper-bounded bypTAmerge <A6PD. (5.47)According to Corollaries 5.1 and 5.2, one can conclude that the error performanceof Algorithm TAmerge is asymptotically the same as that of USD. More accurately, theerror performance of Algorithm TAmerge using unquantized stacks is similar to that ofa unidirectional quantized stack algorithm with =The error performance of Algorithm TTmerge using unquantized stacks is better thanthat of Algorithm TAmerge since the metric of its merged path is higher than or equalto the merged path metric in Algorithm TAmerge. This is because the merging test inAlgorithm TTmerge is only performed among paths in the highest forward and backwardsubstacks. The error performance of Algorithm HTTmerge is close to that of AlgorithmTTmerge when the number of required matching information symbols mh is close tothe code memory length m. On the other hand, when that number is zero (or close tozero), Algorithm HTTmerge behaves like Algorithm TAmeet. Thus, one can say thatthe error performance of Algorithm HTTmerge, is upper-bounded by that of Algorithm108Chapter 5 Error Performance of BSDTAmeet and lower bounded by that of Algorithm TTmerge. Furthermore, the numberof computations needed by Algorithm HTTmerge decreases as mh decreases. Hence,Algorithm HTTmerge can easily provide a trade-off between the error performance andcomputational effort by controlling the number of matching information symbols mduring the merging test.53 Error Performance of Agorthm TAmeetAccording to Property 2.1 in Chapter 2, the distance spectra of the forward andbackward codes of a specific time-invariant convolutional code are exactly the same.Suppose we let forward and backward decoders decode to the ends of their treesindependently. Then, the error performance of the backward decoder is expected to bealmost the same as that of the forward decoder. In Algorithm TAmeet, however, decodingstops whenever the search areas of forward and backward decoders meet. In other words,the searching areas of the forward and backward decoders will never overlap. The benefitsare that the computational effort is reduced to its minimum among our proposed BSDalgorithms and the extra merging test which might be time consuming is not required.However, the disadvantage is that the decoded forward and backward paths may notmerge (agree) with each other. Thus, Algorithm TAmeet may make additional decodingerrors near its meeting point. Hence, the error performance of Algorithm TAmeet isobviously worse than that of any other BSD algorithms with a merging test.In Algorithm TAmeet, both decoded forward and backward paths retreat about m /2branches from their end nodes.’6 Thus, the length of the retreated branches, m / 2, can16 Here, we ignore the Diophantine constraint, i.e., treat m 12 as a integer.109Chapter 5 Error Performance of BSDbe viewed as the length of the backsearch limit for the last decoded information symbolsby forward and backward decoders. For the rest of the decoded information symbols, thebacksearch limit is clearly greater than m / 2. Hence, the decoded information symbolsnear the beginning and end of each block are more reliable than those near the meetingpoint. Thus, we can upper-bound the bit error probability of Algorithm TAmeet by that ofUSD with a backsearch limit equal to [m/2j. Clearly, the above upper bound indicatesthat the BER of Algorithm TAmeet is essentially not related with the block length L.Furthermore, the above argument suggests that a long block length is preferred in orderto reduce the actual BER with Algorithm TAmeet.For an ensemble of linear trellis codes, Zigangirov [55] provided an upper bound onthe bit error probability of USD with backsearch limit r that is greater than the constraintlength K. When the backsearch limit 7 is equal to or less than the constraint length K, itis known [12, 56, 57] that the probability of error satisfies the random coding bound forblock codes. Therefore, we can conclude that the random block coding exponent appliesto the bit error probability of Algorithm TAmeet, and thus we can writepTAmeet < A8 [m/2J n[Eo(pr)—prR] (5.48)From the above discussion, one can see that the only reason the error performance ofAlgorithm TAmeet is worse than that of USD is because it does not require the decodedforward and backward paths to merge with each other.5.4 Numerical and Simulation Resultsin order to verify the analysis results, lengthy computer simulations were conducted.110Chapter 5 Error Performance of BSDSince decoding errors rarely occur with a long memory code, a short memory (m = 10)SABODP code was used. Similar to Chapter 4, the code rate was chosen to be equal to0.5, a BSC channel was used, and metrics were scaled into integers in the simulations.Moreover, all algorithms were run under strictly identical conditions (i.e., using the samecode and noise sequences) in order to make the results comparable.Results are shown in Tables 5.1 and 5.2 for Pr = 1.1 and Pr = 0.68, respectively.In these simulations, the one branch metric drop ) = 16 and 14 for Pr = 1.1 and Pr =0.68, respectively. 50,000 blocks of length L = 400 were simulated for each algorithmin Table 5,1, while 5,000 blocks of length L = 200 were simulated for each algorithmin Table 5.2. In addition, the computational limit for each algorithm was selected largeenough to make the erasure probability negligible.As expected, Tables 5.1 and 5.2 show that the error perfonnance (both bit and blockerror rate) of Algorithm TAmerge is similar to that of USD. More precisely, the BERof Algorithm TAmerge with unquantized stacks is very close to that of a unidirectionalquantized stack algorithm with a substack spacing ).. Moreover, as Corollary 5.1suggested, the block error probability of Algorithm TAmerge with unquantized stacksis upper-bounded by twice that of unidirectional quantized stack algorithm with z =Tables 5.1 and 5.2 suggest that the error performance of Algorithm Timerge with =1 is very close to that of a unidirectional unquantized stack algorithm. Table 5.1 alsoshows that the error performance of Algorithm TTmerge with a small A, i.e., A < A, isslightly better than that of Algorithm TAmerge using unquantized stacks. One can alsonotice from Tables 5.1 and 5.2 that the error performance of Algorithm TT’merge with A= A is close to that of Algorithm TAmerge with A = 1. Moreover, when m = (m — 1)111Chapter 5 Error Performance of BSDTable 5.1 Error performance comparison of differentalgorithms for the case Pr = 1,1, i.e., R < RcompSABODP Code m=10 Error Erasure Max. Ave.BERPr=” ,L=400,5 x 1 runs blocks blocks comp. comp./br.USD (unquantized stack 1.86e-4 261 0 163045 1.824algorithm_ = 1)USD (quantized stack 5.73e-4 649 4 2e5 3.383algorithm_ = 16)Algorithm 8,09e-3 14552 0 9278 1.274TAmeet ( = 1)Algorithm 6,08e-4 962 0 9364 1.310TAmerge (z = 1)Algorithm 1.39e-4 340 0 17567 1.397TTmerge_( = 1)Algorithm 2.79e-4 496 0 16877 1.403TTmerge_(zS. = 7)Algorithm 8.07e-4 1188 0 23383 1.876TTmerge_( = 16)Algorithm HTlnierge (nih 3.52e-4 672 0 9562 1.336= rn—i,_z = 1)Algorithm HTfmerge (rnh 6.57e-4 1204 0 9562 1,322= m —2, = 1)and = 1, the error performance of Algorithm HTTmerge is similar to that of AlgorithmTTmerge. The trade-off between the error performance and computational effort that canbe provided by Algorithm HTTmerge is clearly apparent in Tables 5.1 and 5.2.The average and maximum number of computations, and the number of blocksremained undecoded (erasure blocks) with different algorithms are listed in Tables 5.1and 5.2. As it can be seen, the maximum number of computations and the averagenumber of computations necessary to decode a block using the proposed BSD are muchless than those needed by USD. Moreover, the simulation results suggest that a large112Chapter 5 Error Performance of BSDTable 5.2 Error performance comparison of differentalgorithms for the case Pr = 0.68, i.e., R > R0SABODP Code m=10 Error Erasure Max. Avg.BERpr=0.68,L200,5X runs blocks blocks comp, comp./br.USD (unquantized stack l,99e-2 413 4 2e5 18.434algorithm_L=1)USD (quantized stack 3.88e-2 770 6 2e5 32.5 14algorithm_with z=l4)Algorithm 4.88e-2 3047 0 2545 1.559TAmeet ( = 1)Algorithm l.58e-2 591 0 3095 1.898TAmerge_(A = 1)Algorithm 9.27e-3 300 3 2e4 3.026Tlmerge_(z=1)Algorithm 2.91e-2 871 0 19107 4.144Tlmerge_(LS=14)Algorithm HTlmerge 1.39e-2 515 0 6758 2.108(mh = m - 1,_L=1)Algorithm HTTmerge l.70e-2 716 0 4258 1.938(mh = m 2,_ti=l)in Algorithm TTmerge does not benefit both the error performance and averagecomputations. Thus, a small value of L seems to be a better choice for the overallperformance of Algorithm TTmerge as well as Algorithm HTTmerge.Finally, we examine the effect of block length L on the error performance of theproposed BSD algorithms. Simulation results of the error performance of AlgorithmTAmeet and Algorithm TAmerge are shown in Table 5.3 and Table 5.4, respectively. Itcan be noticed from Table 5.3 that the BER of Algorithm TAmeet improves slightly asL increases, which confirms the findings in section 5.3. The BER of all algorithms witha merging test is not sensitive to L. However, as expected, the block error probability113Chapter 5 Effor Performance of BSDTable 5.3 Error performance comparison of AlgorithmTAmeet (L\ = I) for different block length LSABODP Code m=10 L=l00 L=200 L=t300 L=400 L=500pr=l.1,5X10 I’UflSBit error rate (BER) l.03e-2 9,84e-3 8.90e-3 8.09e-3 7.56e-3#of error blocks 8869 11944 13482 14552 15348# of erasure blocks 0 0 0 0 0Max. #ofcomp. 583 2805 5021 9278 11200Ave. comp. / branch 1.142 1.202 1.244 1.274 1,302Table 5.4 Error performance comparison of AlgorithmTAmerge (z = 1) for different block length LSABODP Code m=10 L=100 L=200 L=300 L=400 L=500pr1.l,5X10 fllflSBit error rate (BER) 3.07e-4 4.62e-4 5.50e-4 6.08e-4 6.52e-4# of error blocks 151 409 664 962 1256# of erasure blocks 0 0 0 0 0Max. #ofcomp. 1133 3074 6465 9364 11362Ave. comp. / branch 1.189 1,247 1.283 1.310 1.335increases with increasing L, for all the algorithms under study.Tables 5.3 and 5.4 indicate that Algorithm TAmerge and Algorithm TAmeet havebasically the same maximum and average number of computations when the code rate islower than Rcomp. This argument is especially true when L is not too small. Thus, theadditional computations introduced by the merging test are asymptotically insignificant,and are worth the improvements in error performance. Moreover, the maximum andaverage computations is much smaller than those of USD for all values of L. In otherwords, the number of erasure blocks with the proposed BSD is much smaller than thoseof USD under the same maximum number of computations.114Chapter 6Bidirectional Multiple Stack AlgorithmThe main disadvantage of sequential decoding algorithms is their inherent overflowproblem. The number of computations in sequential decoding is a random variable witha Pareto distribution. Hence, there is always a small fraction of blocks which require aninfeasible number of computations and thus cannot be completely decoded. This ratio ofundecodable blocks is often called the erasure probability Per. Several known techniqueshave been proposed to alleviate the overflow problem of sequential decoding [14—16,18, 19, 58J,In Chapter 3, we proposed BSD algorithms which were shown to significantlyreduce the erasure probability. The proposed BSD algorithms improve the computationalperformance by approximately doubling the Pareto exponent of conventional sequentialdecoding techniques. However, the decoding effort is still a random variable with aPareto distribution, and the overflow or erasure problem is still not eliminated.Chevillat and Costello [18] proposed an interesting decoding algorithm, called multiple stack algorithm (MSA), that allows erasure-free decoding. The basic idea is to makeuse of higher order stacks for blocks that require an excessive number of computationsto be decoded. However, to achieve a good error performance with the MSA, a largeamount of memory is needed.In this chapter we propose to modify the MSA to accommodate bidirectional treesearch [29], as described in Chapter 3. We refer to this decoding algorithm as bidirectionalmultiple stack algorithm (BMSA). It is found by analysis as well as computer simulations115Chapter 6 Bidirectional Multiple Stack Algorithmthat the use of bidirectional decoding with the MSA improves the performance in termsof computational effort, memory requirements and decoding errors.First, the basic elements of the conventional MSA are reviewed briefly. The BMSAis then discussed in detail and its performance is evaluated by analysis and computersimulations. Finally, different decoding algorithms, including the MSA, the new BMSAand the Viterbi algorithm, are compared on the basis of their memory requirements, biterror performances and computational efforts.61 The Multiple Stack Algorithm (MSA)The multiple stack algorithm (MSA) eliminates the overflow problem entirely atthe expense of a substantial increase in stack and buffer storage requirements. It is amodification of the stack algorithm which employs several stacks to alleviate the problemof stack overflow. The basic functions of the MSA are illustrated in Figure 6.1. Initially,the MSA decoder operates as a conventional stack algorithm. If a terminal node inthe tree is reached before the first (primary) stack with size Zfvls is filled, decoding iscomplete, and the decoded sequence is identical to that of a conventional stack algorithm.However, if the first stack is filled and an extensive search is needed, the top T nodes ofthe first stack are transferred to the second stack and decoding resumes using only the Ttransferred nodes. If the end of the tree is reached before this stack is full, the terminalnode is accepted as a tentative decision and is stored in a tentative register. The decoderclears the second stack, returns to the first stack, and attempts to find another tentativedecision, The new tentative decision is compared to the preceding one, and the better oneis retained and stored in the register. The process of transferring T nodes to another stack116Chapter 6 Bidirectional Multiple Stack Algorithmis not limited to the second stack. Additional stacks are formed as needed. If a stackfills up before the end of decoding, T nodes are transferred to a higher order (secondary)stack until a tentative decision is made. For simplicity, we assume that all secondarystacks have the same stack size, i.e., ZM = ZMSA, i = 2, 3,.... The MSA terminatesdecoding when it reaches the end of the tree in the first stack, or if a predeterminedcomputational limit Cijm is reached while the search is in progress. In both cases, thebest tentative decision is accepted as the final decision.Transfer ifI 1full Transfer if Transfer ifZ2 if full Z3 if full1 TopTnodes ISecond Stack Z2 Third Stack Z3First Stack ZiFigure 6.1 Illustration of the multiple stack algorithm.The major difference between the single stack algorithm and the multi-stack algorithmis well described by their names. The single stack algorithm stops the decoding processwhen the stack is full and erases the undecoded block. The MSA continues decodingin secondary stacks to reach a decision. When a tentative decision is made in one ofthe secondary stacks, and the computational limit Gum has not been reached, the MSATop T nodes117Chapter 6 Bidirectional Multiple Stack Algorithmreturns to previous stacks to refine the tentative decision. Decoding stops either aftersatisfactorily achieving first-stack decoding or by reaching the computational limit.In [18], the block error probability of the MSA, is upper-bounded bypMSA PeUSD + p1USD (6.1)where PY is the block error probability of the single stack algorithm with an infinitestack size, and pf-’ is the probability of the first stack overflow. It is shown in [18] that/ \PrP1 A9 (Z1 — 1) , (6,2)where A9 is a constant and Pr is the Pareto exponent of USD as defined in Chapter 2.Similarly, the bit error probability of the MSA can be upper-bounded bypjfrISA <pUSD + (6.3)As pointed out by Viterbi and Omura [6], the block error probability is not areasonable performance measure, and ultimately the most useful measure is the bit errorprobability. Moreover, as suggested in Chapter 5, the bit error probability of sequentialdecoding, including our BSD, is independent of the block length L. Hence, in this chapter,the bit error probability is the criteria used to measure the error performance of variousalgorithms.According to (6.3), in order to achieve a low bit error probability with the MSA, bothpJSD and P5 must be very small. It is trivial to make pS) very small by employinga long memory convolutional code, The objective is then to make P-”-’ as small as118Chapter 6 Bidirectional Multiple Stack Algorithmthe desirable error performance. However, to make pfSD very small, say comparableto PbU9D with a long memory code, the size of the first stack must be chosen very large.As indicated in (6,2), pSD only decreases in a Pareto function with an increase in thefirst stack size ZtsA, This means that a substantial first stack size ZsA is needed tomake pf9D relatively small. As it will shown later, the use of the proposed BSD is analternative to reduce the first stack overflow probability for a given first stack size.Chevillat and Costello [181 investigated the distribution of block computations forthe first tentative decision of the MSA. They showed that it is Pareto before the firststack is filled, decreasing exponentially thereafter, They also analyzed the computationaldistribution for the final decision and gave the following upper bound:(A9N-, N < — 1P(Cp > N) < pfJSD Z- N Cm (6.4)0, N > Cim,where cj’’ is the total number of computations required for the final decision and p <Pr’ However, no experimental results on the final computational distribution are reportedin the literature. Since the number of computations for the final decision represents theoverall computational performance of the decoder, in the rest of this dissertation, thecomputational distribution for the final decision will be the focal point.In all computer simulations reported in this chapter, antipodal signaling over anadditive white Gaussian noise (AWGN) channel with hard decision at the output of acoherent demodulator is assumed. Figure 6.2 shows empirical results of the computationaldistribution for the final decision of the MSA with parameters L = 128, T = 3, ZM = 11A7, Ciim = 2000, Eb / N0 = 4.58 dB, corresponding to Pr = 1, and ZfL’5’A = 4L, 5L, 6L.The code used is an m = 7 SABODP code of rate r = 1 / 2 (see Table 2.3).119010A-, 10Chapter 6 Bidirectional Multiple Stack AlgorithmN1536 2000Figure 6.2 Empirical computational distribution for the final decision of the MSA.As indicated by (6,4), it can be seen from the figure that the computational distributionfor the final decision can be divided into three different segments. Before the first stackbecomes full (i.e., N Zf’), the computational effort follows a Pareto distributionas in the case of a conventional infinite stack size algorithm. After the first stack fillsup (i.e., Zj’lsA < N < Gum), the distribution quickly reaches a computational floor,10128 256 384 512 640 768120Chapter 6 Bidirectional Multiple Stack AlgorithmFinally, after the computational limit (i.e. N> Ciim) is reached, it becomes zero.The behavior of the first and last segments of the final computational distributionare obvious. The behavior of the second segment (i.e., ZSA < N Gum) can beexplained as follows. Whenever the first stack fills up, the algorithm transfers T nodes tothe second stack. The transferred nodes may either include the correct one, or they canall be incorrect nodes, In the first case, the correct node is amongst the transferred nodesin the secondary stacks. As described before, the decoding process is stopped only if atentative decision is reached in the first stack, or if the computational limit is reached.Since the current decoding path is only accepted as a tentative decision, the MSA willnot stop decoding even after the correct final node is located in a secondary stack. Let’sassume that the correct final node is in the i-th (i>]) stack. The MSA empties the i-thstack and returns to the (i — ])-th stack. Since the correct node is removed from the i-thstack, there is no correct node in any of the stacks.17 It is therefore unlikely to returnto the first stack and reach a tentative decision, In other words, with a high probability,decoding will not stop until the computational limit is reached.Next, assume that all the transferred T nodes are incorrect. Since there is no correctnode in the second stack, it will take many computations to find one tentative decisionand it is therefore very likely to require higher order stacks, Assume that the tentativedecision is made in the i-th (1>]) stack. The MSA empties the i-th stack and returnsto the (i — ])-th stack and tries to find another decision. After many computations, ifthe computational limit is not yet reached, the MSA will finally return to the first stack.However, the MSA can only extend T nodes before the first stack fills up again. This17 Here, we ignore the fact that an incorrect path may merge with the correct path since this only occurs with a small probability.121Chapter 6 Bidirectional Multiple Stack Algorithmimplies that there is a very small chance that a tentative decision can be reached withinthe T extensions in the first stack. If the first stack fills up again, another T nodes willbe transferred to the second stack. Hence, if the correct node is not transferred into thesecond stack, during the first time, it may be transferred into the second stack during thesecond, third, etc. times. In summary, whenever the first stack fills up, the MSA willnot, most of the time, stop the decoding process until the computational limit is reached.6.2 Bidirectional Multiple Stack Algorithm (BMSA)A bidirectional multiple stack algorithm (BMSA) is constructed in this section, whichcombines the ideas of BSD and the use of multiple stacks. Any of the BSD algorithmsdescribed in Chapter 3 can be used. However, Algorithm TTmerge is chosen due to itsease of implementation and good error and computational performances.The BMSA based on Algorithm TTmerge consists of a number of forward andbackward finite-size memory stacks. We assume that the pairs of forward and backwardstacks are of the same size, denoted by ZPM. Initially, decoding with the BMSAis exactly the same as with Algorithm Timerge. If decoding is terminated before thetwo stacks fill up, the decoded sequence is the same as that with Algorithm TTmerge,However, should one of the stacks (FS or BS) fill up, like in the conventional MSA,the T top nodes in that stack are transferred to a secondary stack, and decoding resumesusing only the T transferred nodes. Once a merged path is found,18 that path is takenas a tentative decision, and is stored in a register. The secondary stack is then cleared,and decoding continues on the primary stacks. If, however, one of the stacks fills upFor simplicity and ease of implementation, the merging test is only conducted between nodes of top non-empty substacks ofthe current pair of forward and backward stacks.122Chapter 6 Bidirectional Multiple Stack Algorithmbefore a decision is reached, the T top nodes in that stack are transferred to yet anothersecondary stack, and decoding resumes once again using only the T transferred nodes.This process is continued, that is secondary stacks are used as needed. Once a tentativedecision is found in a given pair of forward and backward stacks, these two stacks arecleared, and decoding resumes on the predecessor stacks in the attempt to refine thattentative decision. Decoding is stopped whenever a decision is reached in the pair ofprimary stacks or when a predetermined computation limit Cijm is reached. In any event,the best tentative decision is accepted as the decoded path. A detailed description of theBMSA is given below, Let SF denote the current number of the forward stack, and SBthe current number of the backward stack. The steps are as follow:Step 0: Place the root nodes of forward and backward trees in the 1st FS and the1st BS, respectively. Set SF = 5B = I.Step 1: Compute the metrics of all successors of any node from the highest non-empty substack in the SFth FS, remove this node from the SFth FS, and enter the newnodes in their proper substack in the Spth FS.Step 2: If 1F + 1B > L, check all paths in the highest non-empty substack in theSBth BS with all paths in the highest non-empty substack in the SFth FS. if one or moremerging paths is found, select the one, including the one already in the tentative registerif any, with the highest cumulative metric as a tentative decision and go to step 13.Otherwise, go to the next step.Step 3: If the SFth FS is full, transfer T nodes from the highest non-empty substackto the (SF + 1)th FS,’9 and set SF = SF + 1. Go to the next step.19 If the highest non-empty substack contains less than T nodes, pick the rest from the next inghest non-empty substack,123Chapter 6 Bidirectional Multiple Stack AlgorithmStep 4: Check if the node to be extended in the highest non-empty substack in theSBth BS is a terminal node of the backward tree. If yes, compare that path with the onein the tentative register, if any, select the better one, and go to the next step, Otherwise,go to step 6.Step 5: If 5B = 1, stop decoding and take the path in the tentative register as thedecoded path. Otherwise, go to step 15.Step 6: If the number of computations is equal to Cjjm, stop decoding and take thepath in the tentative register as the decoded path. Otherwise, go the next step.Step 7: Compute the metrics of all successors of any node from the highest non-empty substack in the SBth BS, remove this node from the stack and enter the new nodesin their proper substack in the SBth BS.Step 8: If 1F + 1B L, check all paths in the highest non-empty substack in theSFth FS with all paths in the highest non-empty substack in the SBth BS. If one or moremerging paths is found, select the one, including the one afready in the tentative register,if any, with the highest cumulative metric as a tentative decision and go to step 16.Otherwise, go to the next step.Step 9: If the SBth BS is full, transfer T nodes from the highest non-empty substackto the (SB + J)th BS, and set SB = 5B + 1. Go to the next step.Step 10: Check if the node to be extended in the highest non-empty substack in theSFth FS is a terminal node of the forward tree. If yes, compare that path with the onein the tentative register, if any, select the better one, and go to the next step. Otherwise,go to step 12.124Chapter 6 Bidirectional Multiple Stack AlgorithmStep 11: If 5F = 1, stop decoding and take the path in the tentative register as thedecoded path. Otherwise, go to step 18.Step 12: If the number of computations is equal to Ciim, stop decoding and take thepath in the tentative register as the decoded path. Otherwise, go to step 1,Step 13: If SF = 1 and SB = 1, stop decoding and take the path in the tentativeregister as the decoded path. Otherwise, go to the next step.Step 14: If SF > 1, empty the SFth FS, set SF SF — 1, and go to the next step.Otherwise, remove the merged node from the 1st FS, and go to the next step.Step 15: If SB > 1, empty the SBth BS, set SB = SB — 1, and go to step 6. Otherwise,remove the merged node from the 1st BS, and go to step 6.Step 16: If SF = 1 and 5B = 1, stop decoding and take the path in the tentativeregister as the decoded path. Otherwise, go to the next step.Step 17: If 5B > 1, empty the SBth BS, set SB = 5B — 1, and go to the next step.Otherwise, remove the merged node from the 1st BS, and go to the next step.Step 18: If SF> 1, empty the SFth PS, set 5F = 5F — 1, and go to step 12. Otherwise,remove the merged node from the 1st PS, and go to step 12.6.3 Computational Properties of the BMSADefine the total number of computations CfMSA needed to decode one block by theBMSA as the sum of the computations performed by forward and backward decoders.Let Z1 = 2ZpM Zf’, and Z = 2ZpM = Z/’ for i=2,3, . As in theMSA, the size of the 1st stack ZpMs4 should be made large enough so that only thevery noisy blocks, which are potential erasures in conventional BSD, use higher order125Chapter 6 Bidirectional Multiple Stack Algorithmforward and backward stacks. On the other hand, the size of the secondary stacks ZM9Ashould be made substantially small for the BMSA to quickly advance through forwardand backward trees, and find a reasonably good tentative decision.For T1, at most L pairs of stacks are formed before reaching the first tentativedecision. Once a tentative decision is made, the decoder returns to previous stacks torefine the tentative decision. The BMSA is therefore erasure-free if one or more tentativedecisions are made before reaching the computational limit Cijm. In the worst casescenario, where forward and backward decoders do not merge before the final forward orbackward node is reached, the number of computations needed by the BMSA to guaranteeerasure-free decoding would be twice the critical number of computations for the MSA,Ccrii, which is given by [18]L—rn—1Gcrii = ( i) + 2m. (6.5)Therefore, to ensure erasure-free decoding using the BMSA with T=], we must haveGum > 2 X Gcrit. (6.6)Although the above criteria for selecting Cijm is extremely conservative, it offers an easyand safe design rule.Clearly, the computational properties of the first tentative decision of the BMSA aresimilar to those of the MSA. Thus, following [18] and using the approximation on thecomputational distribution of Algorithm TTmerge (see Chapter 4), one can writeProperty 6.]: The first stack overflow probability for the BMSA, Pj”, is given bythe erasure probability of Algorithm TTmerge with primary stack size Z1. That is,pSD = P(cf8D > z1— i) A1o(Z — i)_2Pr, (6,7)126Chapter 6 Bidirectional Multiple Stack Algorithmwhere A10 is a constant independent of Zi, and Pr is the Pareto exponent.From (6.7) and (6.2), it can be noticed that p1B9D is much smaller than p9D for agiven Z1. In other words, to achieve a given overflow probability, the required Zj withthe BMSA is much smaller than that with the MSA, To quantify this reduction in stacksize, let ZsA(P1,Eb/No), and ZjBM(P1,Eb/N0)denote respectively, the stack sizesneeded with the MSA and the BMSA to achieve a certain first stack overflow probabilityPj. at a given signal to noise ratio Eb I N0. The first (primary) stack size saving ratio2°can be expressed by‘MSAD 12 IT’.T\s - (68)zPMSA(P,Eb/No)’which can be simplified to(A9/P1)(Ao/Pi)2— ( A rp_1/2Pr1=A11P’2, (6,9)where A11 is a constant that depends on the channel condition and the decoder used, whichcan be determined by simulation. For comparison purpose, from now on, we let A11 = 1.Figure 6.3 shows a plot of S3 as a function of P1 for different values of Paretoexponent Pr. As shown in this figure, the smaller the required first stack overflowprobability Pj is, the greater is the saving in Zj. Moreover, the saving in the primarystack size increases as the Pareto exponent (or EbIN0)decreases. For instance, when Pr20 The first stack size saving ratio represents approximately the saving in the total memory size since the first stack is much largerthan the secondary stacks.127Chapter 6 Bidirectional Multiple Stack Algorithm150100500102 iü-Figure 6.3 The first stack saving as a function of Pj.= 1 (E,IN=4.58 dB), and P1 = i03, the primary stack size saving is about 32 times.For the same Pj. S = 18 for Pr = 1.2 (Eb/N05.O dB).As for the total number of computations for the final decision of the BMSA, following[18], we can write the following property.350300250200io-4P1128Chapter 6 Bidirectional Multiple Stack AlgorithmProperty 6.2: The final computational distribution of the BMSA is upper-bounded by(A1oN2, N <Z1 — 1P(CfMSA > N) < N 0lim (6j1 0, elsewhere,where the Pareto exponent p < Pr’ and Z1 =2ZBMsA.The computational distribution of the final decision for the BMSA reaches a floorbeyond a certain threshold like that of the conventional MSA. However, since pD <<pSD for a given primary stack size Z1, when compared to the MSA, the computationalfloor is substantially reduced with the BMSA.In order to verify the above properties, computer simulations were performed using50,000 blocks each of length L = 128. The signal to noise ratio Eb / N0 was fixed to4.58 dB, corresponding to a Pareto exponent of Pr 1. The code used is a SABODP,non-systematic rate r = I / 2 memory m = 7 code (see Table 2.3). The stacks in boththe MSA and the BMSA were quantized with = 7.Figure 6.4 shows the results of the final computational distributions of the MSA andthe BMSA for different values of the first stack size Z1 with parameters T = 3, Z = 11,and Ciim = 2000. The Pareto approximations are plotted as solid straight lines on thefigure. As it can be seen from this figure, the simulation results are in total agreementwith the analytical properties discussed above.Figure 6.5 shows the results of the final computational distributions of the MSA andthe BMSA for different values of T. The different parameters used are: Z1 = 6L, Z = 1]andC11m2000. This figure suggests that the first stack overflow probability and the finalnumber of computations are essentially independent of parameter T, as it was predictedby analysis (see (6.7) and (6.10)).129Chapter 6 Bidirectional Multiple Stack AlgorithmFigure 6.4 Empirical computational distributionfor the final decision using Z1 as a parameter.Figure 6.6 shows the results of the final computational distributions of the MSA andthe BMSA for various values of Z. The different parameters used are: Cijm = 2000, 3300and 4500 for Z = 11, 21 and 31, respectively, T = 3, and Z1= 6L. Clim is increased withZ in order to guarantee erasure-free decoding. This figure shows that parameter Z doesnot affect either the first stack overflow probability or the final number of computations,AI010—110-210-310128L__MSA:: Z1=512Zi = 640 -0Z1=768- //I \—— .BMSAParetoApproximation256 384 512 640 768N1536 200013010_iAio2According to the above analysis and computer simulation results, the primary stacksize Zj is the dominating factor in the final computational effort for both the MSA andthe BMSA, while Ciim is mainly determined by Ccrit to guarantee erasure-free decoding.The average number of computations of the proposed BMSA is now examined,010Chapter 6 Bidirectional Multiple Stack Algorithm256 384 512 640 768-310 128NFigure 6.5 Empirical computational distributionfor the final decision using T as a parameter.as it was predicted by analysis (see (6.7) and (6.10)).1536 2000131AI0Chapter 6 Bidirectional Multiple Stack AlgorithmNFigure 6.6 Empirical computational distributionsfor the final decision using Z as a parameter.4500Following the derivation of the average number of computations of the proposed BSDin Chapter 4, we get the following.Property 6.3: The average number of computations per branch of the BMSA is upper-128 256 384 768 1536 2304132Chapter 6 Bidirectional Multiple Stack Algorithmbounded by { A,0 LL— 1)12P— (Z1— 1 )1_2P] p, (Oil)E [CBMSA] <A12 + (2p-1)L [— [1n(Zi—1)—1n(J—1)j, p=where the Pareto exponent p < Pr and A12 is a constant given bypBSDIC Zi)A12 1 + ‘S hm — (6.12)LProof Similar to the analysis for the proposed BSD in Chapter 4, the average numberof computations per branch can be expressed byC:m —1E[CM9A]= 1 + P(CfM.9A > N), (6,13)N=Lwhere P(CfM’A > N) is upper-bounded by (6.10). Substituting (6.10) into (6.13),yieldsZ1 pBSDf—z1)1 ‘S1imE[GM5]<j A1oN2+ L +1N=Lz1—1A10=N + A12, (6,14)N=Lwhere A12 = 1 + PiBSD(Ciim — Z1)/L. Moreover, sincez, 1N2’°< f x2°dxNL L-1= { — 1)’ — (Z1 — i)1_2P] P 2 (6.15)ln(Zi—1)—ln(L—1),—and substituting (6.15) into (6.14) the above property is proved. Q.E.D.133Chapter 6 Bidirectional Multiple Stack AlgorithmSimilarly, for the MSA, the average number of computations per branch can bebounded byE [cM] <A13 + (p-)L [(L—— (Z1 — i)1_P], # (6.16){ln(Z—l)—ln(L—1)], p=lwhere Al? is a constant and is defined as belowpUsDi ZA13 1 + urn — 1) (6.17)By comparing (6.11) and (6.16), one can notice that under the same parameters,the average number of computations with the BMSA is smaller than that with the MSA,Under the same Cijm, in order to reduce the average number of computations, the primarystack size Z1 should be increased. This indicates a clear trade-off between the requiredmemory and the computational performance for both the MSA and the BMSA. However,the BMSA offers a substantial improvement in terms of computational performance andmemory requirements compared to the MSA, as it was shown above. Moreover, one cannotice from (6.11) and (6.16) that the average number of computations is less sensitiveto Ciim for the BMSA as compared to the MSA.Table 6.1 shows some empirical results of the average number of computations fordifferent values of Z1. As predicted, in all cases, the use of the BMSA results in asignificant reduction of the average number of computations.Table 6.1 Average number of computations as a function of Z1SABODP (m = 7), L =128, 5x104 runs Z1 = 512 Z1 = 640 Zj = 768Eb/No = 4.58 dB, T = 3, Z = 11, Ciim = 2000MSA 2.512 2.347 2.256BMSA 1.576 1,504 1.478134Chapter 6 Bidirectional Multiple Stack AlgorithmTable 6.2 shows some empirical results of the average number of computations fordifferent values of T, These results indicate that parameter T is not related with theaverage number of computations. This can also be seen from the analytical expressionsgiven by (6.11) and (6.16).Table 6.2 Average number of computations as a function of TSABODP (m = 7), L = 128, 5x i0 runsE/NO=4.58dB,Zjt768,Z=ll,Cjj= T1 T=2 T=32000MSA 2.256 2.256 2.256BMSA 1.475 1.477 1.478Table 6.3 shows some empirical results on the average number of computations asa function of Cjjm. The results confirm that the average number of computations is lesssensitive to Cijm in the case of the BMSA compared to the MSA.Table 6.3 Average number of computations as a function of CijmSABODP (m = 7), L = 128, 5x runs Cjjm = 2000 Ciim 3300 Ciim = 4500Eb/No=4.S8dB,Z]=768,T=3 (Z=11) (Z=21) (Z=31)MSA 2,256 2.803 3.310BMSA 1.478 1.522 1.5556.4 Error Performance of the BMSAThe error performance of the BMSA can be derived in the same way as for theconventional MSA [18]. One can hence write the following property.Property O.4: The bit error probability of the BMSA is upper-bounded bypBMSA <pBSD+ (6.18)135Chapter 6 Bidirectional Multiple Stack Algorithmwhere pSD is the bit error probability of BSD using a single pair of infinite size FSand BS.The effect of parameters Z1, T and Z on the bit error probability of the BMSA isnow provided.6.4.1 Effect of Parameter ZjBased on the results in Chapter 5, pSD of BSD-merge is asymptotically the sameas that of USD. As it was found in the previous section, for the same primary stacksize Z1, the first stack overflow probability pBSD of the BMSA is much smaller thanthat of the conventional MSA. For example, as shown in Figure 6.4, for Z1 = 768,p1BSD is around 5 x i0, whereas Pfr’ is only about 5 x 10—2. Hence, by employingthe BMSA, the error performance can be substantially improved without increasing theprimary stack size Zi.Figure 6.7 shows the bit error probability curves of the MSA and the BMSAas a function of Z1 for the two values Lb/No = 4.58 dB (Pr 1) and Lb/No =5.55 dB (Pr = 1.5), which are drawn in dotted and dashed lines, respectively. Theparameters used in the decoding are the same as those used for Figure 6.4. As it can beseen from the figure, the error performance of both the MSA and the BMSA improvesas Zj increases, up to a certain floor which is determined by the error performance limitof the employed code, that is pBSD of B SD-merge or pJ.9) of USD. For reference, weobtained p8D through computer simulations using Algorithm TTmerge with unlimitedsizes of FS and BS. Results are plotted on Figure 6.7 as straight lines. The error floorat Lb/No = 4.58 dB is around 3 x i0, and at Lb/No = 5.55 dB it is around lO,136Chapter 6 Bidirectional Multiple Stack AlgorithmWith the BMSA, the error floor is practically reached with Z1 = 640 and Z1 = 512 forEb/No = 4.58 dB and Eb/No = 5.55 dB, respecfively. For Eb/No 4.58 dB and Zj =640, we can get an approximation of p1BSD which is about 7 x i0 from Figure 6.5.To achieve this same value with the MSA, using the Pareto approximation on Figure 6,5,the required Z1 is about 5356, which is 8 times more than that of the BMSA.Figure 6.7 Bit error probability of the BMSA and MSA as a function of Zi.‘‘10_I-210-3l:J.:1 10-410‘ %.‘4‘\‘‘4‘4‘4.‘-—---.4%-4444-‘%4%4MSA E BMSAError Floor (Pr_ 1)Error Floor r 1.5) --510121 256 384 512 640 768Zi137Chapter 6 Bidirectional Multiple Stack Algorithm6.4.2 Effects of Parameters T and ZAs suggested by Chevillat and Costello [18], T and Z have no significant effects onthe error performance of the MSA. Similar arguments also hold for the BMSA. Tables6.4 and 6.5 show the error performance of both the BMSA and the MSA as a functionof T and Z with the same parameters used for Figures 6,5 and 6.6, respectively. Asexpected, the error performance is independent of parameters T and Z. Thus, T and Z canbe used to improve memory requirements and computational performance.Table 6.4 Error performance as a function of parameter T.Table 6.5 Error performance as a function of parameter Z.SABODP code MSA BMSAm = 7, L = 1285xlO4runs Z=ll Z=2l Z=31 Z=ll Z=21 Z=31EbIN0=4.58 dBBER 1.84e-2 l.81e-2 l.80e-2 3,88e-3 3.81e-3 3.81e-3# of error blocks 3464 3457 3454 1802 1796 17956.5 Comparison with Viterbi DecodingWe now compare the proposed BMSA with the optimum Viterbi algorithm (VA).Figure 6.8 shows the BER of the various algorithms found through computer simulations.The codes and parameters used are as follow:SABODP code MSA BMSAm = 7, L = 1285xl04runs T=l T=2 T=3 T=l T=2 T=3Eb/No=4.58 dBBER l.88e-2 1.86e-2 l.84e-2 3.96e-3 3.89e-3 3.88e-3#of error blocks 3503 3480 3464 1809 1803 1802138Chapter 6 Bidirectional Multiple Stack Algorithm• A rate 1/2 memory m = 7 optimum free distance code [7] was used with VA.• A rate 1/2 memory m = 7 SABODP code was used with both the MSA and theBMSA. The different parameters used with both the MSA and the BMSA are: L =128, T = 3, Z = 11, Z1 = 640, and Cijm = 2000.• Another rate 1/2 memory m = 10 SABODP code was used with the BMSA. Thedifferent parameters used are: L = 128, T = 3, Z = 11, Z1 = 1024, and Gum = 2300.As expected, the BER performance of the BMSA is better than that of the conventional MSA. For example, for the same code and the range of Eb/No shown in Figure6.8, the BMSA is about 0.5 dB better than the MSA. The VA using the optimum rate1/2 code performs better than the BMSA for the same code memory m. However, as thememory m of the code used increases, the error performance of the BMSA improves.For example, with the m = 10 code, the error performance of the BMSA is about 0.4dB better than that of the VA at a BER = j5, This improvement in BER can be evenmade larger by using a longer memory code. For the BMSA, the average number ofcomputations per decoded bit is typically less than 2 (see Table 6.1). This is much smallerthan that of the VA, which requires 2” computations per bit. However, as pointed out byMa [21], a node extension is executed much faster with the VA than with a sequentialdecoder (typically 10 times faster), Nevertheless, the overall decoding speed with theBMSA remains faster than with the VA, More importantly, the decoding complexity withthe BMSA is virtually insensitive to the code memory m.Moreover, it was found by Ma [21] that as the code memory m increases, the MSAtends to require less memory for decoding than with the VA. Furthermore, we have shownthat the proposed BMSA not only performs better than the MSA, but also requires less139BERticC) )C-CDCDCC) C‘r-,)C-DCz-qC— CCCCD-f-tC CDCCDC.’)tTi-CCDCDCLttCtICCCDCC-f CDCD• C C-CCD—ticC)C--_ftic2CCD$.-tic C.Dd<0CDCDCDCDCDCChapter 7Conclusions and Suggestionsfor Further ResearchIn a system using convolutional coding and sequential decoding, data is transmittedin blocks, and each block is terminated by a known tail. A conventional sequentialdecoder searches the tree from the forward direction, starting from initial encoder state(initial root of the tree), up to the final encoder state (end root of the tree). The maindrawback of sequential decoding is the variability of its decoding effort which could causedecoding erasures. In order to alleviate this drawback, efficient bidirectional sequentialdecoding (BSD) techniques, in which the tree is simultaneously searched from forwardand backward directions, were proposed and analyzed in this dissertation.The relationships between backward coding and forward coding have been examinedin detail. Good convolutional codes, with memory m ranging from 2 to 25, suitable forbidirectional decoding have been found through computer search. These codes possessthe same distance properties from both forward and backward directions.In the proposed BSD, two decoders are used; one is called forward decoder (FD),and is used to search the tree from forward direction; while the other is called backwarddecoder (BD), and is used for the backward search of the tree. Forward decoding andbackward decoding are performed simultaneously, and stop somewhere in the tree. Inone class of BSD, which are referred to as BSD-merge, decoding stops whenever FD andBD reach a common encoder state somewhere in the tree. In the other class of BSD,that is BSD-no-merge, no common encoder state is required, and decoding stops when141Chapter 7 Conclusions and Suggestions for Further ResearchFD meets BD. Different BSD algorithms were constructed based on the stack algorithm;Algorithm TAmeet which belongs to the class of B SD-no-merge, Algorithm TAmergeand Algorithm TTmerge which belong to BSD-merge, and finally Algorithm HTTmergewhich is a hybrid version of BSD-merge and BSD-no-merge.The computational performance of the proposed BSD algorithms has been analyzed indetail. It was found that the distribution of the total number of computations per decodedblock is still Pareto, as that of unidirectional sequential decoding (USD). However, theadvantage of the proposed BSD appears as an increase in the Pareto exponent, and henceas a decrease in the computational variability. This results a decrease in the erasureprobability. More specifically, it is proved by using the random coding approach that thePareto exponent of BSD using Algorithm TAmeet is asymptotically twice that of USD.It is conjectured that this also applies to Algorithm TAmerge. On the other hand, it wasfound that the computational cutoff rate of sequential decoding remains unchanged, butthe use of BSD reduces the average number of computations per decoded block. Extensivecomputer simulations were performed which confirmed our theoretical analysis.Based on the random coding approach, the error performance of BSD-merge isshown to be asymptotically the same as that of USD. Moreover, the bit error probabilityof Algorithm TAmeet is found to satisfy the random coding bound for block codes.Computer simulations were conducted and found to agree with the analytical findings.Among all the proposed BSD algorithms, Algorithm TAmeet is the simplest one interms of implementation. However, its error performance is relatively poor compared tothat of USD. Both Algorithm TAmerge and Algorithm TTmerge offer good computational142Chapter 7 Conclusions and Suggestions for Further Researchand error performances. The merging test is more demanding in Algorithm TAmerge thanin Algorithm TTmerge, and therefore, Algorithm TTmerge may be preferred in practice.Algorithm HTlmerge is very useful as it can provide a tradeoff between computationaleffort and error performance.None of the above four schemes can completely eliminate the erasure problem.Therefore, the idea of BSD was combined with the MSA, and an efficient erasure-freebidirectional MSA (BMSA) was constructed, Through analysis and computer simulations,it is shown that the new BMSA offers substantial advantages over the MSA in terms ofcomputational effort, memory requirements and error performance. The BMSA appearsas an attractive alternative to the VA where low error probabilities and high decodingspeeds are required.Some interesting problems for future research are:1. The implementation aspects of the proposed BSD algorithms need to be investigated.One interesting systolic search was recently proposed for the stack algorithm [59]. Itwould be nice to see how and if this idea can be adapted to the BSD algorithms.2. BSD can be used for the decoding of tree-like block codes. Of course one needsto examine the distance properties of these tree-like block codes from the backwarddirection. Perhaps good block codes suitable for BSD need to be designed.3. Recently sequential decoding was successfully applied to trellis codes [58], and goodlong memory trellis codes were found. Similarly, the proposed BSD algorithms canbe applied to trellis codes. Our analysis is general and applies to both convolutionaland trellis codes. However, one needs to find good symmetric trellis codes that143Chapter 7 Conclusions and Suggestions for Further Researchpossess good distance properties from both forward and backward directions. Notethat the rate 1/2 SBODP or SABODP codes are suitable for Gray mapped QPSKmodulation since in this case the Euclidean and Hamming distances are equivalent.4. It would be of interest to see how well the proposed BSD algorithms performfor resolving intersymbol interference (1ST) channels, compared to conventionalsequential decoding [60],5. Finally, the BSD idea can be combined in conjunction with the buffer lookingalgorithm which is another erasure-free decoding method proposed recently by Wang[58]. A study of this approach is also of interest.144References[1] C. E. Shannon, “A Mathematical Theory of Communication,” Bell Syst. Techn. J.,vol. 27, July and Oct. 1948.[2] J. M. Wozencraft and 1. M. Jacobs, Principles of Communication Engineering. NewYork: Wiley, 1965.[3] W. W. Peterson and E. J. Weldon, Error-Correcting Codes. Cambridge, Mass.: MITPress, 2nd ed., 1972.[4] F, Jelinek, Probabilistic Information Theory. McGraw-Hill Book Company, 1968.[5] R. G. Gallager, Information Theory and Reliable Communication. New York: Wiley,1968.[6] A. J. Viterbi and J. K. Omura, Principles of Digital Communication and Coding.McGraw-Hill Book Company, 1979.[7] V. K. Bhargava, D, Haccoun, R. Matyas, and P. Nuspi, Digital Communication bySatellite. New York: Wiley, 1981,[8] S. Lin and D. J. Costello, Jr., Error Control Coding: Fundamentals and Applications.NJ: Prentice-Hall: Englewood Cliffs, 1983.[9] A. J. Viterbi, “Error Bounds for Convolutional Codes and an Asymptotically OptimumDecoding Algorithm,” IEEE Trans. on Inform. Theory, vol. IT- 13, pp. 260—269, April1967.[10] J. B. Anderson and S. Mohan, “Sequential Coding Algorithms: A Survey and CostAnalysis,” IEEE Trans. on Comm., vol. COM-32, pp. 169—176, Feb. 1984.[11] J. M. Wozencraft and B. Reiffen, Sequential Decoding. Cambridge, Mass.: MITPress, 1961.[12] H. L. Yudkin, “Channel State Testing in Information Decoding,” Sc.D. Diss., Dept.Elec. Eng., MJ.T., Cambridge, 1964.[13] R. M. Fano, “A Heuristic Discussion of Probabilistic Decoding,” IEEE Trans. onInform. Theory, vol. IT-9, pp. 64—73, April 1963.145[14] G. D. Forney, Jr. and E. K. Bower, “A High-Speed Sequential Decoder: PrototypeDesign and Test,” IEEE Trans. Commun. Technol., vol. COM-19, pp. 82 1—835, Oct.1971.[15] K. S. Zigangirov, “Some Sequential Decoding Procedures,” Probi. Peredach.Inform., vol. 2, no. 4, pp. 13—15, 1966.[16] F. Jelinek, “Fast Sequential Decoding Using a Stack,” JBMJ. Res. Develop., vol. 13,pp. 675—685, Nov. 1969.[17] J, W. Layland, and W. A. Lushbaugh, “A Flexible High-Speed Sequential Decoderfor Deep Space Channels,” IEEE Trans. Commun. Technol., vol. COM-19, pp. 813—820, Oct. 1971.[181 P. R. Chevillat and D. J. Costello, Jr., “A Multiple Stack Algorithm for ErasurefreeDecoding of Convolutional Codes,” IEEE Trans. on Comm., vol. COM-25, pp. 1460—1470, Dec. 1977.[19] D. Haccoun and M. J. Ferguson, “Generalized Stack Algorithms for DecodingConvolutional Codes,” iEEE Trans. on Inform. Theory, vol. IT-21, pp. 638—651,Nov. 1975.[20] P. R. Chevillat, “Fast Sequential Decoding and a New Complete DecodingAlgorithm,” Ph.D. Diss., Dept. Elec. Eng., Illinois Institute of Technology, Chicago,IL, May 1976.[21] H. H. Ma, “The Multiple Stack Algorithm Implemented on a Zilog Z-80 Microcomputer,” IEEE Trans. on Comm., vol. COM-28, pp. 1876—1882, Nov. 1980,[22] G. D. Forney, Jr., “Final Report on a Coding System Design for Advanced SolarMissions,” Rep. NAS2 -3637, Contract NASA CR73167, NASA Ames Res. Center,Moffett Field, CA, 1967.[23] L. R. Bahi, C. D. Cullum, W. Frazer, and F. Jelinek, “An Efficient Algorithm forComputing Free Distance,” IEEE Trans. on Inform. Theory, vol. IT-18, pp. 437—439, May 1972.[24] K. J. Larsen, “Comments on ‘An Efficient Algorithm for Computing Free Distance’,”IEEE Trans. on Inform. Theory, vol. IT-19, pp. 577—579, July 1973.146[251 M. Rouanne and D. J. Costello, Jr., “An Algorithm for Computing the DistanceSpectrum of Trellis Codes,” IEEE J. on Selected Areas in Commun., vol. 7, pp. 929—940, Aug. 1989,[26] F. Hemmati, “Bidirectional Trellis Decoding,” 1990 IEEE Book of Abstracts ofInformation Theory Symposium, San Diego, p. 107, Jan. 1990.[27] D. Haccoun and J. Belzile, “Bidirectional Algorithms for the Decoding of Convolutional Codes,” 1990 IEEE Book of Abstracts of Information Theory Symposium,San Diego, p. 177, Jan. 1990.[28] J. Belzile and D. Haccoun, “Bidirectional Breadth-First Algorithms for the Decodingof Convolutional Codes,” IEEE Trans. on Comm., vol. 41, pp. 370—380, Feb. 1993,[29] K. Li and S. Kallel, “Bidirectional Sequential Decoding Algorithms,” IEEESymposium on Information Theory, in San Antonio, TX, USA, Jan. 1993.[30] I. M. Jacobs and E. R. Berlekamp, “A Lower Bound to the Distribution ofComputation for Sequential Decoding,” IEEE Trans. on Inform. Theory, vol. IT-13, pp. 167—174, April 1967.[31] R. Johannesson, “On the Distribution of Computation for Sequential Decoding Usingthe Stack Algorithm,” IEEE Trans. on Inform. Theory, vol. IT-25, pp. 323—33 1,May 1979.[32] J. E. Savage, “Sequential Decoding — the Computation Problem,” Bell Syst. Techn.J., vol. 45, no. 1, pp. 149—175, 1966.[33] D. D. Falconer, “An Upper Bound on the Distribution of Computation for SequentialDecoding with Rate Above Rcom,” M.i.T.-RLE Quart. Progress Rept., pp. 174—179,April 1966.[34] F. Jelinek, “An Upper Bound on Moments of Sequential Decoding Effort,” IEEETrans. Inform. Theory, vol. IT-iS, pp. 140—149, January 1969.[35] 0. D. Forney, Jr., “Convolutional Codes. II. Maximum-Likelihood Decoding,”Inform. Contr., vol. 25, pp. 222—266, 1974.[36] 0. D. Forney, Jr., “Convolutional Codes. III. Sequential Decoding,” inform. Contr.,vol. 25, pp. 267—297, 1974.147[37] P. R. Chevillat and D. J, Costello, Jr., “Distance and Computation in SequentialDecoding,” IEEE Trans. on Commun., vol. COM-24, pp. 440—447, April 1976.[38] P. R. Chevillat and D. J. Costello, Jr., “An Analysis of Sequential Decoding forSpecific Time-Invariant Convolutional Codes,” IEEE Trans. on Inform. Theory,vol. IT-24, pp. 443—451, July 1978.[39] K. Li and S. Kallel, “Bidirectional Sequential Decoding for Convolutional Codes,”1991 IEEE Pacific Rim Conference on Communications, Computers and SignalProcessing in Victoria, B.C., Canada, May 1991.[40] J. L. Massey and M. K. Sam, “Inverses of Linear Sequential Circuits,” IEEE Trans.on Comp., vol. C-17, pp. 330—337, April 1968.[41] J. L. Massey, “Reversible Codes,” Inform. and Control, vol. 7, pp. 369—380, 1964.[42] J. P. Robinson, “Reversible Convolutional Codes,” IEEE Trans. on Inform. Theoty,pp. 609—610, July 1968.[43] R. Johannesson, “Robustly-Optimal Rate One-Half Binary Covolutional Codes,”IEEE Trans. on Inform. Theory, vol. IT-21, pp. 464—468, July 1975,[44] R. Johannesson and E. Paaske, “Further Results on Binary Covolutional Codeswith an Optimum Distance Profile,” IEEE Trans. on Inform. Theory, vol. IT-24,pp. 264—268, March 1978.[45] M. Cedervall and R. Johannesson, “A Fast Algorithm for Computing DistanceSpectrum of Convolutional Codes,” IEEE Trans. on lnform. Theory, vol. 35,pp. 1146—1159, Nov. 1989.[46] D. Haccoun, “A Markov Chain Analysis of the Sequential Decoding Metric,” IEEETrans. on Inform. Theory, vol. IT-26, pp. 109—113, January 1980.[47] J. B. Anderson and S. Mohan, Source and Channel Coding.’ an AlgorithmicApproach. Kluwer Academic Publishers, 1991.[48] J. K. L. Jordan, “The Performance of Sequential Decoding in Conjunction withEfficient Modulation,” IEEE Trans. Commun. Technol., vol. COM-14, pp. 283—297,June 1966.[49] A. Papoulis, Probability, Random Variables, and Stochastic Processes. McGraw-Hill Book Company, 1965.148[50] J. B. Anderson, “Limited Search Trellis Decoding of Convolutional Codes,” IEEETrans. on Inform. Theory, vol. IT-35, pp. 944—955, Sept. 1989.[51] J. B. Anderson, “Sequential Decoding Based on an Error Criterion,” IEEE Trans.on Inform. Theory, vol. IT-38, pp. 987—1001, May 1992.[52] D. Haccoun, “A Branching Process Analysis of the Average Number of Computations of Stack Algorithm,” IEEE Trans. on Inform. Theory, vol. IT-30, pp. 497—508,May 1984.[53] E. Arikan, “An Upper Bound on the Cutoff Rate of Sequential Decoding,” IEEETrans. on Inform. Theory, vol. IT-34, pp. 55—63, January 1988.[54] F. Jelinek, “Upper Bounds on Sequential Decoding Performance Parameters,” IEEETrans. Inform. Theory, vol. IT-20, pp. 227—239, March 1974.[55] K. S. Zigangirov, “On the Error Probability of Sequential Decoding on the BSC,”IEEE Trans. inform. Theory, vol. IT-18, pp. 199—202, January 1972.[56] K. S. Zigangirov, “Algorithm of Sequential Decoding in which the Error ProbabilityIncreases in Agreement with the Random Coding Bound,” Probi. Peredach. Inform.,vol. 4, pp. 83—85, 1968.[57] K. S. Zigangirov, “Procedures of Sequential Decoding,” Coding and Complexity -C1SM Courses and Lectures No. 216, Springer-Verlag, Wien, New York, pp. 109—130, 1975.[58] F. Q. Wang, “Efficient Sequential Decoding of Trellis Codes,” PhD. Diss., Dept.Elec. Eng., University of Notre Dame, Notre Dame, Indiana, Dec. 1992,[59] C. Y. Chang and K. Yao, “Systolic array architecture for the sequential stackdecoding algorithm,” Advanced algorithms and architectures for signal processing,vol. 696, pp. 196—203, 1986.[60] F. Xiong, A. Zerilc and E. Shwedyk, “Sequential Sequence Estimation for Channelswith Intersymbol Interference of Finite or Infinite Length,” IEEE Trans. Commun.,vol. 38, pp. 795—804, June 1990.[61] B. Hajek, “Hitting-time and Occupation-time Bounds Implied by Drift Analysis withApplications,” Advances in Applied Probability, vol. 14, pp. 502—525, 1982.149Appendix AProof of Theorem 4.4In the proof of Theorem 4.4, we need a lemma (see Lemma 2.2 of [61] for a moregeneral result).Lemma Ad: Suppose a real-valued random variable V satisfies E[V] 7> 0 andI V < M almost surely,for some constants and M. Then for any > 0,E [eV]—TI( + M) + eflM, (Ad)In particular, for,= (1 /M) in (1 + /M),E [e] <(1 + /M)[1 — ln(1 + /M)] <1. (A2)Proof The proof follows from noting that e”=(TI/k! 1—V +[e1M— (1 + TIM)]. Q.E.D.Proof of Theorem 4.4: Similar to the proof of the computational lower boundin Theorem 4.3, separate every received block into two almost equal parts, i.e., levels [1, [(L — m)/2J] for forward tree search operations and [[(L — m)/2j + m, L] forbackward tree search. Here, L and m denote any L and m for any code j. Alsowith the assumption of correct decoding, CfSD > mm [C_m)/2j,CRL_m)/21] whereand C_m)/2ldefined in the proof of Theorem 4.3, are independent random variables.Now, let l be an arbitrary integer of multiple n and less than [(L — m)/2J . n, anddefine U [(L — m)rI/2l?lj. For I j U, define NJ to be the set of nodes in the150forward tree at segment j1, that are descendants of the correct end node of segment (j —1)1. Given that the end node of the segment (j—l)l, is correctly decoded, let cf be thenumber of distinct nodes in NJ that are hypothesized (extended) by the idealized forwarddecoder before correctly decoding the segment jl. Then c$. Define asimilar quantity, n, to be the number of nodes in NJ that are at least as likely to bethe correct nodes at segment j1,. Then, because the codewords are a priori equiprobableand correct decoding is assumed, one can use Lemma 3.1 in [53]21 to show thatE[cr] E[n]. (A3)Lemma 3.1 of [53] assumes that the stack algorithm is used. However, its proof remainsvalid for any sequential decoder for which one can determine which of two nodes at thesame level is hypothesized (extended) first, by considering only the received messageup to that level.Now let K denote the DMC channel. For every Q > 0 and 0 < o 1, one can alwayschoose a finite l(K, R,60Q) sufficiently large to satisfy the condition of Lemma 6.1in [53], and thus[exp (l(R — R1)/8) —2] > 32Q(1 + ‘)n’. (A.4)Therefore, using Lemma 6.1 in [53] into (A.3), we can writeE[cf]_E[n2]exp [l(R — R)/8]> l6Ql(1 + + 1, 1 <j <U. (A.5)21 The quantities defined by Arikan in [53j do not count the correct node itself (and thus equal one less than the quantities definedhere), but clearly the result still holds151Let = 16Ql(1 + E)n’, JI = 1, and M = X where Xj denote the sizeof the channel input alphabet. Then, E—> 1 and cr — ILOI < M sinceo < cr < exp (Rl), 0 < .uo < E [c’j < exp (Rl), and for distinct codewords,exp (Rl,) X. Let = (1/M) In (1 + l/M). By Lemma A.l,E[exp (_ii(cr — <8, (A.6)where 8 = (1 + l/M)[l — ln(l + l/M)], and 8 < 1.Since cr, 1 <j < U are identically distributed, independent random valuables, onecan use Chemoff bound and getP{(F) o} E[exP (-(cro))]= {E[exp (-(cr - ))] }(A.7)Now choose Lo(K, R,E0 Q) so that:(1) L0 > 4(1 +e1)l/n and(A.8)(2) L0 > 2l(l +o)(l —1n2/lnO)/n.SupposeL L0 and L (1 + e’)m. ThenP(C_m)/2j> uo) p( cr > Uo)= Uo)1 —> 1/2. (A.9)152In the same way, one can showP(C_m.)/21 > Ui0) > 1/2, (A.1O)FC and (BCSince C[(Llfl)/2j j(L—mj)/21 are independent, we haveE[C’’] =IFC BCE [mm CL(Lm.)/2j, G1(L.m)/2])] /L> (1/2)(1/2)U0/L>Q. Q.E.D,(A.11)153Appendix BList of AcronymsAID Analogy-to-DigitalAWGN Additive White Gaussian NoiseBD Backward DecoderBMSA Bidirectional Multiple Stack AlgorithmBODP Bidirectional Optimum Distance ProfileBS Backward StackBSC Binary Symmetric ChannelBSD Bidirectional Sequential DecodingBSD-merge Bidirectional Sequential Decoding With a Merging TestBSD-no-merge Bidirectional Sequential Decoding Without a Merging TestCDF Column Distance FunctionDIA Digital-to-AnalogyDMC Discrete Memoryless ChannelEb/No Energy Per Bit to Noise RatioFD Forward DecoderFEC Forward Error CorrectionFS Forward StackGCD Greatest Common DivisorISI Intersymbol Interference154modem Modulator/DemodulatorMSA Multiple Stack AlgorithmODP Optimum Distance ProfileQPSK Quaternary Phase Shift KeySABODP Symmetric Almost Bidirectional Optimum Distance ProfileSBODP Symmetric Bidirectional Optimum Distance ProfileUSD Unidirectional Sequential DecodingVA Viterbi Algorithm155
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Bidirectional sequential decoding
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Bidirectional sequential decoding Li, Kaiping 1994
pdf
Page Metadata
Item Metadata
Title | Bidirectional sequential decoding |
Creator |
Li, Kaiping |
Date Issued | 1994 |
Description | The main drawback of sequential decoding is the variability of its decoding effort which could cause decoding erasures. We propose and analyze in this dissertation efficient bidirectional sequential decoding (B SD) techniques to alleviate this drawback, In the proposed BSD, two decoders are used; one is called forward decoder (PD), and is used to search the tree from forward direction; while the other is called backward decoder (BD), and is used for the backward search of the tree. Forward decoding and backward decoding are performed simultaneously, and stop somewhere in the tree. In one class of BSD, to which we refer as BSD-merge, decoding stops whenever FD and BD merge at a common encoder state some where in the tree. In the other class of BSD, that is BSD no-merge, no common encoder state is required, and decoding stops when FD meets BD. Different BSD algorithms based on the stack algorithm are constructed; namely Algorithm TAmeet which belongs to the class of BSD-no-merge, Algorithm TAmerge and Algorithm Timerge which belong to BSD-merge, and finally Algorithm HTTmerge which is a hybrid version of BSD-merge and BSD-no-merge. The relationships between backward coding and forward coding are examined in detail. Good convolutional codes, with memory m ranging from 2 to 25, suitable for bidirectional decoding, found through extensive computer search, are provided. These codes possess the same distance properties from both forward and backward directions. It is found by analysis and computer simulations that the distribution of the total number of computations per decoded block of the proposed BSD is still Pareto, as that of unidirectional sequential decoding (USD). However, the advantage of the proposed BSD appears as an increase in the Pareto exponent, and hence as a decrease in the computational variability and erasure probability. More specifically, we prove by using the random coding approach that the Pareto exponent of BSD using Algorithm TAmeet is asymptotically twice that of USD, and conjecture that this also applies to Algorithm TAmerge. On the other hand, it is found that the computational cutoff rate remains unchanged, but the use of our BSD reduces the average number of computations per decoded bit. Using the random coding approach, we show that the error performance of BSD merge is asymptotically the same as that of USD. Moreover, we show that the bit error probability of Algorithm TAmeet satisfies the random coding bound for block codes. Computer simulations are provided to confirm the analytical findings. The use of BSD-merge substantially reduces the computational variability of con ventional sequential decoding without compromising the error performance, However, BSD does not completely eliminate the erasure problem. We therefore combine our BSD idea in conjunction with the multiple stack algorithm (MSA), which is an erasure-free decoding algorithm. It is shown through analysis and computer simulations that the new bidirectional multiple stack algorithm (BMSA) offers substantial advantages over the MSA in terms of computational effort, memory requirements and error performance. The BMSA appears as an attractive alternative to the Viterbi algorithm (VA) where low error probabilities and high decoding speeds are required. |
Extent | 3605157 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
FileFormat | application/pdf |
Language | eng |
Date Available | 2009-04-14 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0065074 |
URI | http://hdl.handle.net/2429/7040 |
Degree |
Doctor of Philosophy - PhD |
Program |
Electrical and Computer Engineering |
Affiliation |
Applied Science, Faculty of Electrical and Computer Engineering, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 1994-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |
Download
- Media
- 831-ubc_1994-953578.pdf [ 3.44MB ]
- Metadata
- JSON: 831-1.0065074.json
- JSON-LD: 831-1.0065074-ld.json
- RDF/XML (Pretty): 831-1.0065074-rdf.xml
- RDF/JSON: 831-1.0065074-rdf.json
- Turtle: 831-1.0065074-turtle.txt
- N-Triples: 831-1.0065074-rdf-ntriples.txt
- Original Record: 831-1.0065074-source.json
- Full Text
- 831-1.0065074-fulltext.txt
- Citation
- 831-1.0065074.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
https://iiif.library.ubc.ca/presentation/dsp.831.1-0065074/manifest