@prefix vivo: <http://vivoweb.org/ontology/core#> .
@prefix edm: <http://www.europeana.eu/schemas/edm/> .
@prefix ns0: .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

vivo:departmentOrSchool "Applied Science, Faculty of"@en, "Electrical and Computer Engineering, Department of"@en ;
edm:dataProvider "DSpace"@en ;
ns0:degreeCampus "UBCV"@en ;
dcterms:creator "Li, Kaiping"@en ;
dcterms:issued "2009-04-14T18:44:29Z"@en, "1994"@en ;
vivo:relatedDegree "Doctor of Philosophy - PhD"@en ;
ns0:degreeGrantor "University of British Columbia"@en ;
dcterms:description """The main drawback of sequential decoding is the variability of its decoding effort, which can cause decoding erasures. We propose and analyze in this dissertation efficient bidirectional sequential decoding (BSD) techniques to alleviate this drawback. In the proposed BSD, two decoders are used: one is called the forward decoder (FD) and is used to search the tree from the forward direction, while the other is called the backward decoder (BD) and is used for the backward search of the tree. Forward decoding and backward decoding are performed simultaneously, and stop somewhere in the tree. In one class of BSD, to which we refer as BSD-merge, decoding stops whenever FD and BD merge at a common encoder state somewhere in the tree. In the other class of BSD, that is, BSD-no-merge, no common encoder state is required, and decoding stops when FD meets BD. Different BSD algorithms based on the stack algorithm are constructed, namely Algorithm TAmeet, which belongs to the class of BSD-no-merge; Algorithm TAmerge and Algorithm TTmerge, which belong to BSD-merge; and finally Algorithm HTTmerge, which is a hybrid version of BSD-merge and BSD-no-merge. The relationships between backward coding and forward coding are examined in detail. Good convolutional codes, with memory m ranging from 2 to 25, suitable for bidirectional decoding, found through extensive computer search, are provided. These codes possess the same distance properties in both the forward and backward directions. It is found by analysis and computer simulations that the distribution of the total number of computations per decoded block of the proposed BSD is still Pareto, as is that of unidirectional sequential decoding (USD). However, the advantage of the proposed BSD appears as an increase in the Pareto exponent, and hence as a decrease in the computational variability and erasure probability. More specifically, we prove by using the random coding approach that the Pareto exponent of BSD using Algorithm TAmeet is asymptotically twice that of USD, and conjecture that this also applies to Algorithm TAmerge. On the other hand, it is found that the computational cutoff rate remains unchanged, but the use of our BSD reduces the average number of computations per decoded bit. Using the random coding approach, we show that the error performance of BSD-merge is asymptotically the same as that of USD. Moreover, we show that the bit error probability of Algorithm TAmeet satisfies the random coding bound for block codes. Computer simulations are provided to confirm the analytical findings. The use of BSD-merge substantially reduces the computational variability of conventional sequential decoding without compromising the error performance. However, BSD does not completely eliminate the erasure problem. We therefore combine our BSD idea with the multiple stack algorithm (MSA), which is an erasure-free decoding algorithm.
It is shown through analysis and computer simulations that the new bidirectional multiple stack algorithm (BMSA) offers substantial advantages over the MSA in terms of computational effort, memory requirements and error performance. The BMSA appears as an attractive alternative to the Viterbi algorithm (VA) where low error probabilities and high decoding speeds are required."""@en ;
edm:aggregatedCHO "https://circle.library.ubc.ca/rest/handle/2429/7040?expand=metadata"@en ;
dcterms:extent "3605157 bytes"@en ;
dc:format "application/pdf"@en ;
skos:note "BIDIRECTIONAL SEQUENTIAL DECODING

by

Kaiping Li

M.A.Sc., Northern Jiao-Tong University, Beijing, P. R. of China, 1986
B.A.Sc., Northern Jiao-Tong University, Beijing, P. R. of China, 1983

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES, DEPARTMENT OF ELECTRICAL ENGINEERING

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
June 1994
(c) KAIPING LI, 1994

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Abstract

The main drawback of sequential decoding is the variability of its decoding effort, which can cause decoding erasures. We propose and analyze in this dissertation efficient bidirectional sequential decoding (BSD) techniques to alleviate this drawback. In the proposed BSD, two decoders are used: one is called the forward decoder (FD) and is used to search the tree from the forward direction, while the other is called the backward decoder (BD) and is used for the backward search of the tree. Forward decoding and backward decoding are performed simultaneously, and stop somewhere in the tree. In one class of BSD, to which we refer as BSD-merge, decoding stops whenever FD and BD merge at a common encoder state somewhere in the tree. In the other class of BSD, that is, BSD-no-merge, no common encoder state is required, and decoding stops when FD meets BD. Different BSD algorithms based on the stack algorithm are constructed, namely Algorithm TAmeet, which belongs to the class of BSD-no-merge; Algorithm TAmerge and Algorithm TTmerge, which belong to BSD-merge; and finally Algorithm HTTmerge, which is a hybrid version of BSD-merge and BSD-no-merge.

The relationships between backward coding and forward coding are examined in detail. Good convolutional codes, with memory m ranging from 2 to 25, suitable for bidirectional decoding, found through extensive computer search, are provided. These codes possess the same distance properties in both the forward and backward directions.

It is found by analysis and computer simulations that the distribution of the total number of computations per decoded block of the proposed BSD is still Pareto, as is that of unidirectional sequential decoding (USD). However, the advantage of the proposed BSD appears as an increase in the Pareto exponent, and hence as a decrease in the computational variability and erasure probability.
More specifically, we prove by using the random coding approach that the Pareto exponent of BSD using Algorithm TAmeet is asymptotically twice that of USD, and conjecture that this also applies to Algorithm TAmerge. On the other hand, it is found that the computational cutoff rate remains unchanged, but the use of our BSD reduces the average number of computations per decoded bit.

Using the random coding approach, we show that the error performance of BSD-merge is asymptotically the same as that of USD. Moreover, we show that the bit error probability of Algorithm TAmeet satisfies the random coding bound for block codes. Computer simulations are provided to confirm the analytical findings.

The use of BSD-merge substantially reduces the computational variability of conventional sequential decoding without compromising the error performance. However, BSD does not completely eliminate the erasure problem. We therefore combine our BSD idea with the multiple stack algorithm (MSA), which is an erasure-free decoding algorithm. It is shown through analysis and computer simulations that the new bidirectional multiple stack algorithm (BMSA) offers substantial advantages over the MSA in terms of computational effort, memory requirements and error performance. The BMSA appears as an attractive alternative to the Viterbi algorithm (VA) where low error probabilities and high decoding speeds are required.

Contents

Abstract  ii
List of Tables  vii
List of Figures  ix
List of Symbols  xii
Acknowledgment  xvii
Chapter 1  Introduction  1
  1.1  Outline of the Dissertation  8
  1.2  Claim of Contributions of this Dissertation  9
Chapter 2  Convolutional Coding and Sequential Decoding  11
  2.1  Convolutional Coding  11
  2.2  Sequential Decoding  16
  2.3  Backward Coding/Decoding  20
  2.4  Search for Good Symmetric Codes  27
Chapter 3  Bidirectional Sequential Decoding Algorithms  33
  3.1  Algorithm TAmeet, a BSD-no-merge  34
  3.2  Algorithm TAmerge, a BSD-merge  41
  3.3  Coarsening on Algorithm TAmerge  50
Chapter 4  Computational Performance of BSD  55
  4.1  A Simple Approach to Computational Distribution  55
  4.2  Upper Bounds to Computational Distribution of Algorithm TAmeet  58
  4.3  Extension to Other BSD Algorithms  64
  4.4  Lower Bound to Computational Distribution  66
  4.5  Average Number of Computations  68
  4.6  Numerical and Simulation Results  71
Chapter 5  Error Performance of BSD  91
  5.1  Error Performance of USD  91
  5.2  Error Performance of BSD-merge  93
  5.3  Error Performance of Algorithm TAmeet  109
  5.4  Numerical and Simulation Results  110
Chapter 6  Bidirectional Multiple Stack Algorithm  115
  6.1  The Multiple Stack Algorithm (MSA)  116
  6.2  Bidirectional Multiple Stack Algorithm (BMSA)  122
  6.3  Computational Properties of the BMSA  125
  6.4  Error Performance of the BMSA  135
    6.4.1  Effect of Parameter Z1  136
    6.4.2  Effects of Parameters T and Z  138
  6.5  Comparison with Viterbi Decoding  138
Chapter 7  Conclusions and Suggestions for Further Research  141
References  145
Appendix B  Proof of Theorem 4.4  150
Appendix C  List of Acronyms  154

List of Tables

Table 2.1  Symmetric bidirectional optimum distance profile (SBODP) codes  28
Table 2.2  Distance spectra of SBODP codes  29
Table 2.3  Symmetric almost bidirectional optimum distance profile (SABODP) codes  31
Table 2.4  Distance spectra of SABODP codes  32
Table 4.1  Comparison of average computations for L = 400, Clim = 8000 and p = 0.0409  75
Table 4.2  Comparison of average computations for L = 200, Clim = 4000 and p = 0.0594  81
Table 4.3  Comparison of average computations for L = 400, Clim = 4000 and p = 0.0289  83
Table 4.4  Comparison of computational and error performances of different m with the new definition of one computation  88
Table 5.1  Error performance comparison of different algorithms for the case ρr = 1.1, i.e., R < Rcomp  112
Table 5.2  Error performance comparison of different algorithms for the case ρr = 0.68, i.e., R > Rcomp  113
Table 5.3  Error performance comparison of Algorithm TAmeet (Δ = 1) for different block lengths L  114
Table 5.4  Error performance comparison of Algorithm TAmerge (Δ = 1) for different block lengths L  114
Table 6.1  Average number of computations as a function of Z1  134
Table 6.2  Average number of computations as a function of T  135
Table 6.3  Average number of computations as a function of Clim  135
Table 6.4  Error performance as a function of parameter T  138
Table 6.5  Error performance as a function of parameter Z  138

List of Figures

Figure 1.1  General digital communication system  1
Figure 1.2  Binary symmetric channel  3
Figure 2.1  Convolutional encoder for the (3, 1, 2) code  12
Figure 2.2  Forward code trellis diagram for the encoder of Figure 2.1  13
Figure 2.3  Forward code tree diagram of the encoder of Figure 2.1  14
Figure 2.4  Backward code trellis diagram for the encoder of Figure 2.1  21
Figure 2.5  Backward code tree diagram of the encoder of Figure 2.1  22
Figure 3.1  Illustration of the stopping rules of Algorithm TAmeet  35
Figure 3.2  Illustration of the overlapping in BSD-merge  42
Figure 3.3  Example of opposing paths passing each other  43
Figure 3.4  Typical impossible event in Algorithm TAmerge  44
Figure 3.5  Illustration of a typical impossible event in Figure 3.4  45
Figure 3.6  Illustration of the possible event  45
Figure 3.7  Illustration of another impossible event  46
Figure 3.8  Distinct forward and backward paths in the FS and BS  47
Figure 3.9  Merging test in Algorithm TTmerge  52
Figure 4.1  (a) Correct path metric with forward decoding. (b) Correct path metric with backward decoding. (c) Correct path metric with bidirectional decoding  57
Figure 4.2  Distribution of the total number of computations per decoded block for the case L = 400 and p = 0.0409, i.e., ρr = 1.1  73
Figure 4.3  Distribution of the total number of computations per decoded block for different codes using the same parameters as in Figure 4.2  77
Figure 4.4  Distribution of the total number of computations per decoded block for a systematic ODP code (ρr = 1.1)  78
Figure 4.5  Distribution of the total number of computations per decoded block for the case L = 200 and p = 0.0594, i.e., ρr = 0.68  80
Figure 4.6  Distribution of the total number of computations per decoded block for the case L = 400 and p = 0.0289, i.e., ρr = 1.5  82
Figure 4.7  Probability mass of merging and meeting when code rate R < Rcomp  86
Figure 4.9  Distribution of the overlapping length  87
Figure 4.10  Distribution of the total number of computations for different m with the new definition of one computation  89
Figure 5.1  Example of error events with a correct merging encoder state  94
Figure 5.2  Example of error events with an incorrect merging encoder state  94
Figure 5.3  Illustration of the merging error event  97
Figure 6.1  Illustration of the multiple stack algorithm  117
Figure 6.2  Empirical computational distribution for the final decision of the MSA  120
Figure 6.3  The first stack saving as a function of P1  128
Figure 6.4  Empirical computational distribution for the final decision using Z1 as a parameter  130
Figure 6.5  Empirical computational distribution for the final decision using T as a parameter  131
Figure 6.6  Empirical computational distributions for the final decision using Z as a parameter  132
Figure 6.7  Bit error probability of the BMSA and MSA as a function of Z1  137
Figure 6.8  Comparison on error performance for different algorithms  140

List of Symbols

⌈r⌉  Smallest integer not less than r
⌊r⌋  Largest integer not greater than r
α  Defined by R = Ē₀(α)/α
Δ  Substack spacing
γ  Fano metric
…  One branch metric drop
ρr  Defined by R = E₀(ρr)/ρr
…  Indicator function
a_dfree+i  Total number of paths of weight dfree + i
C  Defined as 2 max_x [min (C_x^F, C_{L−x}^B)]
C^BMSA  Total number of computations needed to decode one branch using the BMSA
C^BSD  Total number of computations needed to decode one branch using BSD
C^MSA  Total number of computations needed to decode one branch using the MSA
C^USD  Total number of computations needed to decode one branch using USD
C_crit  Critical number of computations
C  Channel capacity
c_dfree+i  Total number of nonzero information bits on all weight-(dfree + i) paths
C_L^BSD  Total number of computations required to decode a block of L branches by any BSD algorithm
C^F, C^B  Total number of computations required by the forward and backward decoders to decode a whole block of L branches, respectively
C_x^F, C_{L−x}^B  Total number of computations needed by the forward and backward decoders to decode x or (L − x) branches without searching beyond the boundary at level x, respectively
Ĉ_x^F, Ĉ_{L−x}^B  Total number of computations required by the forward and backward decoders to correctly decode x or (L − x) branches, respectively
C_i  The i-th branch computations
C_L  Total number of computations needed to decode a block of L branches by using Algorithm TAmeet
C̃_L^TAmerge  Total number of computations required to decode a block of L branches by using Algorithm TAmerge with the assumption of correct decoding
C_L^TAmerge  Total number of computations required to decode a block of L branches by using Algorithm TAmerge
C_L^BMSA  Total number of computations required to decode a block of L branches by using the BMSA
C_L^MSA  Total number of computations required to decode a block of L branches by using the MSA
C_lim  Computational limit
C_Λ  Total number of computations required to correctly decode the first Λ branches using USD
D  Unit delay
d  Distance profile
dfree  Free distance
d[·], d_r[·]  Column distance function of forward and backward codes, respectively
…  Incorrect path subset generated from level i
E[·]  Expectation function
E₀(α)  Gallager function
Ē₀(α)  Smallest concave function greater than or equal to E₀(α)
G(D), Gr(D)  Transfer function matrix of forward and backward codes, respectively
H(·)  Entropy function
I  Identity matrix
K  Code constraint length
k  Number of bits per unit time at the input of an encoder
L  Block length (including a known tail)
l  Meeting point
lF, lB  The farthest level reached by the forward and backward decoders, respectively
lFTOP, lBTOP  The length of the top path in the FS and BS, respectively
l₀  Merging point
M[·]  Path metric function
M  Total encoder memory
m  Encoder memory order
mh  Number of matching information symbols in the merging test
Nb  Total number of bit errors given Smg incorrect
n  Number of bits per unit time at the output of an encoder
nb(τ)  Expected number of bit errors caused by an incorrect path diverging at level τ = l₁ + l − L at the end of decoding
P1  The first (primary) stack overflow probability
Pb  Bit error probability
…  Block error probability
Per  Erasure probability
p  Crossover probability of the BSC
R  Code rate (nats per channel symbol)
r  Code rate (bits per channel symbol)
Rcomp  Computational cutoff rate
SF, SB  Current number of forward and backward stacks, respectively
Si  Encoder state i
Smg  Common encoder state at l₀
…  The first (primary) stack size saving ratio
x(i)  Correct path from root node to level i
x̃(i; t)  Incorrect path which diverges from the correct path at level i and remerges at level (i + t)
…  Incorrect path which diverges from the correct path at level i and stretches out t branches from the correct node at level i
𝒳(i)  The set of all possible incorrect paths which diverge from the correct path at level i
X(i; j)  The set of all incorrect paths which diverge and remerge with the correct path at levels i and (L − j)
‖X(i; j)‖  Total number of elements in X(i; j)
Z1 = 2Z1^MSA; Zi = 2Zi^MSA, i = 2, 3, …
Zi^BMSA  The i-th forward or backward stack size in the BMSA
Zi^MSA  The i-th stack size in the MSA

Acknowledgment

I would like to thank my wife, Libing, for her constant encouragement, support, and understanding during these years. I am also greatly indebted to my research supervisor, Dr. Samir Kallel, for his continual encouragement and highly constructive comments. I wish to express my sincere thanks for his guidance throughout this research.

I would like to thank Mr. Dimitrios P. Bouras, Mr. Siavash Massoumi, Mr. Longxiang Dai, and Mr. Weimin Sun for their helpful discussions. I would also like to thank Dr. Samir Kallel, Mr. Siavash Massoumi, and Mr. Dimitrios P. Bouras for their help with transcript changes after I joined OKI Telecom. Finally, I would like to thank Mrs. Vicki Waidro-Snider and Mr. Peter Howard for their invaluable help in perfecting the use of the English language throughout this text.

Chapter 1
Introduction

Digital communication deals primarily with the process of transmitting digital information from one point to another through a channel which is subject to noise disturbances. This implies that the information received at the destination (prior to processing) is not always exactly the same as transmitted. Shannon [1] showed in his famous coding theorem that digital information can be transmitted with arbitrarily low error probability provided that the data rate is smaller than the channel capacity. Since then, numerous coding and decoding techniques have been proposed in the attempt to approach Shannon's promise of reliable communication near the channel capacity.
The increasing demand for efficient and reliable digital communications over noisy channels has led to the use of sophisticated coding and decoding techniques for the purpose of error correction. These techniques, called forward error correction (FEC), are useful especially with power-limited systems.

[Figure 1.1 General digital communication system: source, source encoder, channel encoder, modulator, waveform channel, demodulator, channel decoder, source decoder, destination.]

A general block diagram of a digital communication system is given in Figure 1.1. An arbitrary source generates a signal which is then transformed by a source encoder into a redundancy-free digital information sequence. If the source is analog, analog-to-digital (A/D) conversion is performed at the source encoder. Generally, the source and source encoder are treated as a digital source which emits an equally likely binary sequence. The information sequence then enters the channel encoder, which adds redundancy to combat the channel noise. The outputs from the channel encoder, referred to as codewords, are then mapped into a waveform signal by the modulator for transmission over the waveform channel. The waveform channel may represent a telephone line, a wireless radio link, a space communication link, etc. The demodulator processes the received waveform signal and produces an output that may be discrete (quantized) or continuous (unquantized). The output of the demodulator is called the received sequence. The channel decoder, based on the received sequence and the known structure of the channel encoder, has to select the best possible transmitted sequence. The purpose of the channel encoder and decoder is to allow the digital information sequence to be efficiently and reliably reproduced at the output of the channel decoder. The decoded sequence then enters the source decoder, which reproduces a signal that is delivered to the user. If the source signal is continuous, this also involves digital-to-analog (D/A) conversion.

To focus attention on the channel encoder and channel decoder, the modulator/demodulator (modem) and the waveform channel are combined into a coding channel. In this dissertation, a discrete memoryless channel (DMC) is assumed, which constitutes the simplest class of coding channel models and is defined as follows. The input is a sequence of elements from a finite alphabet, and the output is a sequence of elements from the same or a different alphabet. Moreover, each element in the output sequence is statistically dependent only on the element in the corresponding position of the input sequence and is determined by a fixed conditional probability assignment. For example, the binary symmetric channel (BSC), which is shown in Figure 1.2, is a DMC with binary input and output alphabets, where each digit at the channel input is reproduced correctly at the channel output with some fixed probability (1 − p) and is altered by noise into the opposite digit with probability p.

[Figure 1.2 Binary symmetric channel: input and output alphabets {0, 1}, crossover probability p.]

A multitude of coding and decoding techniques which permit low error probability in digital communications have been proposed and investigated. Since excellent introductory literature on coding/decoding theory is available [2-8], the following survey will be limited to matters relevant to this thesis, that is, convolutional encoding with probabilistic decoding. Supporting literature is found in the references.
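To make the BSC of Figure 1.2 concrete before turning to convolutional codes, here is a minimal simulation sketch (ours, not from the thesis; the crossover probability p = 0.05 and sequence length are arbitrary choices for illustration):

```python
import random

def bsc(bits, p, rng=random.Random(1)):
    """Pass a bit sequence through a binary symmetric channel:
    each bit is flipped independently with crossover probability p."""
    return [b ^ (rng.random() < p) for b in bits]

tx = [random.getrandbits(1) for _ in range(10000)]
rx = bsc(tx, p=0.05)
flips = sum(a != b for a, b in zip(tx, rx))
print(f"empirical crossover rate: {flips / len(tx):.4f}")  # close to 0.05
```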
A convolutional code is described by three integers: n, k, and m. At each unit time, the encoder accepts k bits of the information sequence and produces a codeword (encoded sequence) of n bits. The integer m is a parameter known as the code memory length. An important characteristic of an (n, k, m) convolutional code is that the encoder has memory: the n bits emitted by the convolutional encoding procedure are not only a function of the current k-bit input, but are also a function of its previous m inputs.

Viterbi decoding [9] and sequential decoding [11] are the two main probabilistic decoding methods for convolutional codes. (In this dissertation, sequential decoding is used in a narrow sense; that is, it only includes the metric-first and depth-first algorithms, although the breadth-first algorithm can be categorized as sequential decoding in a wide sense [10].) The Viterbi algorithm (VA) is an optimum maximum likelihood decoding scheme, but its optimality is obtained at a large decoding effort. The number of computations performed per information digit is constant, but increases exponentially with the code memory length. In practice, the VA can thus be employed only with short memory codes.

Historically, sequential decoding was proposed in 1957 by Wozencraft and Reiffen [11] as the first practical decoding procedure for convolutional codes. In sequential decoding, the tree-like structure of the convolutional code is used in a step-by-step search for the most likely transmitted sequence. As long as the data rate does not exceed a quantity called the computational cutoff rate Rcomp, which is less than the channel capacity, the average number of computations required to decode one information digit is small and independent of the code memory length.

The error probability of sequential decoding decreases exponentially with the code memory length [12], and it is lower bounded by that of the VA. As m → ∞, however, both decoding techniques yield the same error performance. In addition, the decoding complexity of sequential decoding is practically insensitive to the code memory. Consequently, convolutional encoding with sequential decoding is at present one of the best known methods to achieve an arbitrarily low error probability over a DMC [7, 8, 13, 14]. Convolutional coding with sequential decoding has been used in various systems such as satellite communications, deep space communications, etc.

There are two principal sequential decoding algorithms: the Fano algorithm [13] and the stack algorithm, proposed independently by Zigangirov [15] and Jelinek [16]. Regardless of the algorithm, a major problem with sequential decoding is the computational variability. As a consequence of this variability, the decoding effort for a given received sequence may occasionally exceed the physical limitations of the decoder, leading inevitably to buffer overflows and information erasures. Thus, while the undetected error probability of sequential decoding can be made arbitrarily small, the erasure probability or buffer overflow remains substantial [7, 8, 17]. Although erasures may even be exploited in systems with retransmission capabilities, it is desirable to make the erasure probability as small as possible without compromising the error performance, in order to achieve higher throughput and reliability. In addition, there are many other cases where erasures are undesirable.
For example, in real-time systems with long path delays, erasures can cause complete data loss. In this case, the main contribution of errors comes from buffer overflow. Thus, it is very important to reduce the erasure probability. The prime objective of this research is to alleviate the computational variability, and thus minimize erasures, in sequential decoding.

In the past, several modifications of sequential decoding to reduce the erasure probability have been reported and implemented. Excessively long tree searches can be terminated early by placing a backsearch limit on the Fano and the stack algorithms, thereby lowering the erasure probability [14]. A similar effect is achieved with the stack algorithm by deleting the path with the lowest metric from the stack whenever the stack fills up [15, 16]. Although these methods accelerate decoding and reduce the erasure probability, the resulting increase in the error probability is found to be substantial [18]. Haccoun and Ferguson [19] proposed a class of generalized (or multiple path) stack algorithms which extend several paths simultaneously from the top of the stack. It was shown that the variability of the computational effort is reduced, but at the cost of a larger average number of computations compared to the ordinary stack algorithm.

In order to eliminate erasures entirely, Chevillat and Costello [20, 21] introduced an interesting multiple stack algorithm (MSA). It is a modification of the stack algorithm and uses several stacks to accommodate the problem of stack overflow. However, it is found that the resulting increase in stack and buffer storage is substantial.

Forney [22] in 1967 suggested that sequential decoding can be started from the end of a block which is terminated by a known tail. He proposed a bidirectional decoding scheme which uses a forward decoder and a backward decoder to search a terminated tree from the initial and final encoder states, respectively. Decoding with Forney's scheme stops whenever either the forward or the backward decoder reaches the end of its tree. Computer simulations revealed that this scheme does not offer much improvement in terms of computational performance.

The idea of bidirectional search has also been used for computing the free distance of convolutional codes [23, 24] and the distance spectrum of trellis codes [25]. Recently, a bidirectional decoding search has been applied to the decoding of block codes [26]. Most recently, Haccoun and Belzile [27, 28] investigated a bidirectional breadth-first M algorithm for the decoding of convolutional codes. Although erasures are avoided by the nature of the breadth-first M algorithm used, the loss-of-the-correct-path problem, which is inherent to the M algorithm, limits the performance of this scheme.

We propose and analyze in this dissertation efficient bidirectional sequential decoding (BSD) techniques [29]. It is shown that the proposed BSD techniques substantially alleviate the drawbacks of conventional or unidirectional sequential decoding (USD). In the proposed BSD, two decoders are used, as in Forney's scheme: one is called the forward decoder (FD) and is used to search the tree from the forward direction; the other is called the backward decoder (BD) and is used for the backward search of the tree. Forward decoding and backward decoding are performed simultaneously.
However, unlike Forney's scheme, decoding in the proposed BSD techniques stops somewhere in the tree. In one class of BSD, which we refer to as BSD-merge, decoding stops whenever FD and BD merge at a common encoder state somewhere in the tree. In the other class of BSD, that is, BSD-no-merge, decoding stops when FD meets BD. Different BSD algorithms based on the stack algorithm are constructed, namely Algorithm TAmeet, which belongs to the class of BSD-no-merge; Algorithm TAmerge and Algorithm TTmerge, which belong to BSD-merge; and Algorithm HTTmerge, which is a hybrid version of BSD-merge and BSD-no-merge. Finally, to eliminate erasures entirely, we combine Algorithm TTmerge with the MSA of Chevillat and Costello [20, 18, 21] to obtain a more efficient and reliable erasure-free algorithm, which we call the bidirectional multiple stack algorithm (BMSA).

1.1 Outline of the Dissertation

Chapter 2 is a brief summary of the basic elements of convolutional coding and sequential decoding. The idea and basic properties of backward coding/decoding are also presented in this chapter. Moreover, good convolutional codes suitable for bidirectional decoding, which are found through computer search, are listed in this chapter.

In Chapter 3, the proposed BSD algorithms are described and their fundamental properties are investigated. Furthermore, our proposed BSD algorithms are compared with the existing BSD algorithms, which can be classified under the category of BSD-no-merge in a wide sense.

A theoretical analysis of the computational performance of the proposed BSD algorithms is given in Chapter 4, together with extensive computer simulation results. It is found, both by theoretical analysis and computer simulations, that the distribution of the total number of computations per decoded block by any of the proposed BSD algorithms is still Pareto, as is that of USD. However, the advantage of the proposed BSD appears as an increase in the Pareto exponent, and hence as a decrease in the computational variability. More specifically, we prove by using the random coding approach that the Pareto exponent of BSD using Algorithm TAmeet is asymptotically twice that of USD, and conjecture that this also applies to Algorithm TAmerge. On the other hand, it is found that the computational cutoff rate remains unchanged, but the use of the proposed BSD reduces the average number of computations per decoded bit. This reduction in computational variability and average number of computations translates into a substantial decrease in the erasure probability (or buffer overflow problem), which is the main drawback of sequential decoding.

The error performance of the proposed BSD techniques is explored in Chapter 5 by both theoretical analysis and computer simulations. It is found that BSD-merge has asymptotically the same error performance as USD, while the bit error probability of Algorithm TAmeet follows the random coding bound for block codes.

In Chapter 6, an erasure-free BSD technique based on the MSA [18, 21], which we call the bidirectional multiple stack algorithm (BMSA), is proposed and analyzed.
It is found that the BMSA performs better than the conventional MSA in terms of memory requirements, computational effort and error performance.

Chapter 7 contains final conclusions and suggestions for future research.

Appendix A contains a rigorous proof of Theorem 4.4, which suggests that under the assumption of correct decoding the computational cutoff rate of BSD is the same as that of USD.

Appendix B lists acronyms used in this dissertation.

1.2 Claim of Contributions of this Dissertation

The major contributions of this research are summarized below.

The properties of backward coding/decoding are examined in great detail. More specifically, the relationships between a backward code and the corresponding forward code are given through Theorems 2.1 and 2.2, Corollaries 2.1 and 2.2, and Properties 2.1 and 2.2. Good convolutional codes suitable for bidirectional decoding, found through computer search, are provided.

Efficient BSD techniques that alleviate the drawbacks of conventional sequential decoding are proposed and examined by analysis and extensive computer simulations. The main theoretical contributions in the analysis of the proposed BSD techniques are as follows:

1. A computational upper bound for Algorithm TAmeet is found for a specific time-invariant convolutional code (Theorem 4.1).
2. A computational upper bound for Algorithm TAmeet is derived for an ensemble of trellis codes (Theorem 4.2). It is conjectured that this bound also applies to Algorithm TAmerge.
3. A computational lower bound is obtained for any code and any type of BSD (Theorem 4.3).
4. The computational cutoff rate of BSD is found to be the same as that of USD (Theorem 4.4).
5. The error performance of BSD-merge is proved to be asymptotically the same as that of USD (Theorems 5.1 to 5.3, Corollaries 5.1 and 5.2).
6. The bit error probability of Algorithm TAmeet is found to satisfy the random coding bound for block codes.

An erasure-free algorithm, the BMSA, is proposed and analyzed. Through computer simulations and analysis, we demonstrate that the new BMSA performs better than the conventional MSA.

Chapter 2
Convolutional Coding and Sequential Decoding

2.1 Convolutional Coding

An (n, k, m) convolutional encoder can be implemented with k shift registers, n modulo-2 adders and a commutator that scans the outputs of the adders. The length of each of the k shift registers is less than or equal to the encoder memory order m [8]. The connections of the n adders to the k shift registers define the code. This is generally specified with a so-called k × n transfer function matrix G(D) [8], given by

          | g_1^(1)(D)   g_1^(2)(D)   ...   g_1^(n)(D) |
G(D) =    | g_2^(1)(D)   g_2^(2)(D)   ...   g_2^(n)(D) |      (2.1)
          | ...                                         |
          | g_k^(1)(D)   g_k^(2)(D)   ...   g_k^(n)(D) |

where g_i^(j)(D) is a polynomial of degree at most m that indicates the connections of the j-th adder to the i-th shift register. For example, Figure 2.1 represents a (3, 1, 2) encoder with G(D) = [1 + D, 1 + D², 1 + D + D²].

Encoding is performed k bits at a time. Each k-bit block to be encoded is fed to the encoder from the left side, and the content of the k shift registers is shifted one position to the right. The modulo-2 adders are then sampled in sequence by the commutator, producing n output coded bits. For convenience, we refer to this as forward encoding.
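The forward encoding procedure just described is easy to express in code. The following is a minimal sketch (ours, not the dissertation's) for the (3, 1, 2) code of Figure 2.1 with G(D) = [1 + D, 1 + D², 1 + D + D²]; with the information sequence (1101) and a two-bit zero tail it reproduces the code sequence (111, 010, 110, 100, 101, 011) of the highlighted path in Figure 2.3:

```python
def conv_encode(info_bits, generators, m):
    """Forward encoding for a rate-1/n convolutional code.
    generators: coefficient tuples (g_0, ..., g_m), one per adder g^(j)(D)."""
    state, branches = [0] * m, []
    for u in info_bits + [0] * m:            # append a known zero tail of m bits
        window = [u] + state                 # [u_t, u_{t-1}, ..., u_{t-m}]
        branches.append(tuple(sum(c & b for c, b in zip(g, window)) & 1
                              for g in generators))
        state = window[:m]                   # shift-register update
    return branches

# (3,1,2) code of Figure 2.1: G(D) = [1+D, 1+D^2, 1+D+D^2]
G = [(1, 1, 0), (1, 0, 1), (1, 1, 1)]
print(conv_encode([1, 1, 0, 1], G, m=2))
# -> [(1,1,1), (0,1,0), (1,1,0), (1,0,0), (1,0,1), (0,1,1)],
#    i.e. the code sequence (111, 010, 110, 100, 101, 011) of Figure 2.3
```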
Defining M = Σ_{i=1}^{k} max_j [deg g_i^(j)(D)] as the total encoder memory [8] and the state [...]

2.2 Sequential Decoding

[...] the distribution of the number of computations is lower-bounded as

P(C₁ ≥ N) > N^(−α) exp{−o(√(ln N))},      (2.6)

where o(√(ln N)) is an asymptotically unimportant term, and the Paretian exponent α is defined by the following parametric equation:

R = Ē₀(α)/α.      (2.7)

(A computation in the stack algorithm is an extension of the top node, while in the Fano algorithm it is usually defined to be a forward move [5]. The notation o(b_i) indicates a value that can be upper-bounded by A·b_i, where A is a finite constant for all indexes i.)

The function Ē₀(α) in (2.7) is the smallest concave function greater than or equal to the Gallager function E₀(α) [5]. Using the random coding technique over an ensemble of trellis codes [35], Savage, Falconer and Jelinek showed that for a DMC, the first-branch computations C₁ of sequential decoding are upper-bounded as

P(C₁ ≥ N) ≤ A N^(−ρ),      (2.8)

where ρ < ρr, which is defined by R = E₀(ρr)/ρr, and A is a finite constant independent of N [36]. (ρr = α if the channel is a BSC, but ρr ≤ α in general.) Moreover, it was noticed that asymptotically, the lower bound gives an accurate description of the computational distribution for good codes and practical sequential decoding algorithms [31].

The computational performance of USD with a specific convolutional code depends on the decoding algorithm employed and the distance properties of the forward code. In particular, optimum forward sequential decoding performance requires rapid initial column distance growth of the forward code, for fast decoding and minimum erasure probability, and a large free distance (dfree), for minimum error probability [37, 38]. Specifically, for a code with rate R and a BSC with crossover probability p satisfying

R < ln 2 + p ln [p/(1 − p)] − H(p),      (2.9)

where H(p) = −p ln p − (1 − p) ln (1 − p) is the entropy function, the distribution P(C₁ ≥ N) for a specific time-invariant convolutional code is upper-bounded by [38]

P(C₁ ≥ N) < σ exp{−δ d[log N] + β log N},      (2.10)

where σ, δ and β are parameters which are functions of p and R, but independent of N, as defined in [38].

The distance profile d of a forward code is defined as the column distance function (CDF) over only the first constraint length (i.e., d = [d₀, d₁, ..., d_m]) [8]. Since the distance profile determines the initial column distance growth of the forward code, and is easier to compute than the entire CDF, it is usually used instead of the entire CDF as a criterion for selecting codes for use with sequential decoding. A distance profile d is said to be superior to the distance profile d̃ of another forward code of the same memory order m when there is some i ≤ m such that

d_j = d̃_j for j = 0, 1, ..., i − 1, and d_i > d̃_i.      (2.11)

A forward code is said to have an optimum distance profile (ODP) [8] if its d is superior to that of any other forward code of the same memory order.
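For small m, the distance profile just defined can be computed by brute force. Below is a naive sketch (ours; exponential in m and intended only to make the definition concrete, not an efficient search tool) for rate-1/n codes; for the m = 2 ODP code with octal generators (7, 5) it returns [2, 3, 3]:

```python
from itertools import product

def branch_weight(window, generators):
    """Hamming weight of one output branch for window [u_t, ..., u_{t-m}]."""
    return sum(sum(c & b for c, b in zip(g, window)) & 1 for g in generators)

def distance_profile(generators, m):
    """d_i = minimum weight of the first i+1 branches over all paths that
    diverge from the all-zero path at the root (u_0 = 1); returns [d_0..d_m]."""
    profile = []
    for i in range(m + 1):
        best = None
        for tail in product([0, 1], repeat=i):     # u_1 ... u_i free; u_0 = 1
            u = [1] + list(tail)
            w = sum(branch_weight([u[t - j] if t >= j else 0
                                   for j in range(m + 1)], generators)
                    for t in range(i + 1))
            best = w if best is None else min(best, w)
        profile.append(best)
    return profile

# m = 2 code, octal (7, 5): g1 = 1+D+D^2, g2 = 1+D^2
print(distance_profile([(1, 1, 1), (1, 0, 1)], m=2))   # -> [2, 3, 3]
```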
In addition, the20Time unitsFigure 2,4 Backward code trellis diagram for the encoder of Figure 2.1.corresponding information sequences are reversals of each other, as well. For example,the highlighted path on theChapter 2 Convolutional Coding and Sequential Decoding0 1 2 3 4 5 621Chapter 2 Convolutional Coding and Sequential Decoding100 010 111100 1010 111 000011001 101 111010111 000 000110011 010 111001111000 000 000100 010 111011V 010 111 0000 110001 101 111101111 000 000000011 010 111110101 111 000000110 101 111000000 000 000Figure 2.5 Backward code tree diagram of the encoder of Figure 2.1.22Chapter 2 Convolutional Coding and Sequential Decodingforward code tree of Figure 2,3 corresponds to the information sequence (1101) and thecode sequence (111, 010, 110, 100, 101, 011). Their equivalent reversed informationand code sequences are (1011) and (110, 101, 001, 011, 010, 111), respectively (see thehighlighted path on the backward tree of Figure 2.5).It can be readily shown that the transfer function matrix Gr(D) of an (n, k, m)backward convolutional code can be expressed in terms of the transfer function matrixof the forward code [39] byg(D’) g’)(D_l) ,, g(D’)Gr(D) = Dm g(D’) g(D_’)::‘1_1), (2.12)g (D’) g’(D’) .0 (D’)As an example, consider the encoder of Figure 2.1. The transfer function matrix of itscorresponding backward encoder is Gr(D) = [1 + D + D2, 1 + D2, D + D2].It appears from the above discussion that, given a convolutional encoder, one canperform backward encoding by feeding the information bits to be encoded from the rightside of the registers, and set the registers to shift left.A forward (n, k, m) convolutional code is said to be invertible if there exists an nx k matrix G(D) such thatG(D)G’(D) = ID6 (2.13)for some 6 0, where I is the k x k identity matrix [8]. For an (n, 1, m) convolutionalcode, this means that there exists a feedforward inverse G(D) of delay S (S 0) ifand only if [40]GCD [g(1)(D),g2(D),’ ‘ ‘ , g)(D)] = D6, (2.14)23Chapter 2 Convolutional Coding and Sequential Decodingwhere GCD denotes the greatest common divisor. For an (n, k, m) convolutional codewith k > 1, let A ( D), i = 1, 2,..., (), be the determinants of the () distinct k x ksubmatrixes of the transfer function matrix G(D). Then a feedforward inverse G’ (D) ofdelay 6 exists if and only ifGCD[z(D) : i = l,2,•••, ()] = (2.15)for some 6 0 [401. Also, (2.14) or (2.15) is a necessary and sufficient condition fora code to be noncatastrophic. We now show that a backward convolutional code isinvertible provided that the forward code is invertible.Lemma 2.1: For an invertible (n, 1, m) convolutional code, the delay S in (2.14) isalways no greater than m, since deg [g(i)(D)] mfor all i.Lemma 2.2: For an invertible (n, k, m) convolutional code with k > 1, the delay S in(2.15) is always no greater than km, since deg[A(D)] km for all i.Theorem 2,1: A backward code corresponding to an invertible (n, 1, m) convolutionalcode with delay S is invertible and its delay is equal to (m — 5).Proof: From (2.14), we can writeGCD [g() (D’), g(fl_1) (D’), . . . , g(’) (D’)] = D, (2.16)andGCD [Drng(n) (D’),. . . , Dmg(l) (D—’)] = (2.17)From (2.12), we can write (2.17) asGCD [g.’(D),g2(D),’’’ ,g1c(D)] = (2.18)24Chapter 2 Convolutional Coding and Sequential DecodingFrom Lemma 2.1, we know that m -6 0. 
Theorem 2.2: A backward code corresponding to an invertible (n, k, m) convolutional code (k > 1) with delay δ is also invertible and its delay is equal to (km − δ).

Proof: According to (2.12), we can write

GCD [Δ_r,i(D) : i = 1, 2, ..., (n choose k)] = GCD [D^(km) Δ_i(D⁻¹) : i = 1, 2, ..., (n choose k)].      (2.19)

From (2.15), we can write

GCD [Δ_i(D⁻¹) : i = 1, 2, ..., (n choose k)] = D^(−δ).      (2.20)

Therefore, from (2.19) and (2.20) we can write

GCD [Δ_r,i(D) : i = 1, 2, ..., (n choose k)] = D^(km−δ),      (2.21)

and from Lemma 2.2, we know km − δ ≥ 0. Q.E.D.

According to the above discussion we can write the following.

Corollary 2.1: A backward convolutional code is noncatastrophic if and only if its forward code is noncatastrophic.

Corollary 2.2: Decoding of a forward invertible convolutional code can be performed using the corresponding backward code. This will be called backward decoding, as opposed to the usual forward decoding.

Property 2.1: The distance spectrum [8] of a backward convolutional code is identical to that of the corresponding forward code. This is because the Hamming weight of any path on the forward code tree or trellis is the same as that of the corresponding path on the backward code tree or trellis. As a consequence of this property, when maximum likelihood decoding is used, the error performance of a backward code is the same as that of the corresponding forward code.

Property 2.2: The distance profile of a backward code is usually different from that of the corresponding forward code. This is rather obvious and can be seen from the forward and backward code trees of Figures 2.3 and 2.5. For this example, the first coefficient of the distance profile of the forward code is equal to 3, whereas for its corresponding backward code, it is equal to 2. As a consequence of this property, backward sequential decoding is in general not as efficient as the corresponding forward sequential decoding when a forward ODP code is used.

With our bidirectional sequential algorithms to be discussed later, decoding is performed in both forward and backward directions. It is thus desirable to use codes that possess the same good distance profile in both directions (see Theorem 4.1 in Chapter 4). A sufficient condition for a forward code to have the same distance profile as its corresponding backward code is symmetry. A code is said to be symmetric or reversible [41, 42] if the code generators of the forward code are the same as those of the corresponding backward code. For example, the code with generators g^(1)(D) = 1 + D + D² and g^(2)(D) = 1 + D² is symmetric. Moreover, since this code has an optimum distance profile [43], we call it a symmetric bidirectional optimum distance profile code. A code is said to have a symmetric bidirectional optimum distance profile (SBODP) if it is symmetric and its distance profile is optimum. Unfortunately, most known ODP codes do not possess the property of symmetry. However, one can search for new symmetric codes that have a good distance profile.

2.4 Search for Good Symmetric Codes

Using a computer search procedure [44, 45], we have found good symmetric (2, 1, m) non-systematic codes with m ranging from 2 to 25. In our computer search, we first attempt to find a symmetric code with the same distance profile as the corresponding known ODP code. If more than one code is found, we select the one with the highest dfree and the minimal number of paths of weight dfree. Of course, we considered only noncatastrophic codes.
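A skeleton of such a search (ours; a hedged sketch, not the actual search program of [44, 45]) can enumerate symmetric (2, 1, m) generator-pair candidates; each survivor would then be screened for noncatastrophicity via the GCD test of (2.14) and ranked by distance profile and dfree as described above. For simplicity it assumes both generators have g₀ = g_m = 1:

```python
def reverse_poly(p, m):
    """Coefficient reversal: D^m * p(1/D), with p stored as an integer bit mask."""
    return int(format(p, f'0{m + 1}b')[::-1], 2)

def symmetric_pairs(m):
    """Candidate (2,1,m) generator pairs with g_0 = g_m = 1 in both generators
    whose generator set is invariant under reversal (symmetric codes)."""
    lo, hi = (1 << m) | 1, 1 << (m + 1)
    for g1 in range(lo, hi, 2):
        for g2 in range(g1 + 2, hi, 2):
            if {reverse_poly(g1, m), reverse_poly(g2, m)} == {g1, g2}:
                yield g1, g2

# Candidates would next be filtered for noncatastrophicity (GCD = D^0, (2.14))
# and ranked by distance profile and dfree, as described in the text above.
print([(oct(a), oct(b)) for a, b in symmetric_pairs(2)])   # -> [('0o5', '0o7')]
```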
Results are given in Table 2.1. Also included in the table are the free distances of known ODP codes. Distance spectra of the found SBODP codes are listed in Table 2.2. In Table 2.2, a_dfree+i denotes the number of paths of weight dfree + i, while c_dfree+i denotes the total number of nonzero information bits on all these paths. From Table 2.1, one can immediately notice that SBODP codes do not exist for some values of m, which are marked by dash signs. Moreover, as can be seen from Table 2.1, to find a SBODP code, dfree usually had to be sacrificed. In the following, some new codes which slightly compromise the distance profile for a better dfree are presented.

Table 2.1 Symmetric bidirectional optimum distance profile (SBODP) codes

m    G¹          G²          dfree   dfree of ODP
2    7           5           5       5
3    —           —           —       6
4    23          31          6       7
5    —           —           —       8
6    127         165         8       10
7    —           —           —       10
8    —           —           —       12
9    1157        1731        12      12
10   2251        3003        9       14
11   5247        7125        12      14
12   —           —           —       15
13   27537       37275       12      16
14   56777       77735       12      17
15   134277      176435      16      18
16   222511      304443      14      19
17   475417      741571      16      20
18   1346477     1762635     18      21
19   —           —           —       22
20   5734323     6261675     18      22
21   12022373    15744405    20      24
22   31065423    23340731    21      24
23   44407043    61070111    18      26
24   176153077   134442235   24      26
25   221446737   373544611   24      27

Note: the m = 2 SBODP code is the known ODP code found by Johannesson [43] and is listed here for completeness.

Table 2.2 Distance spectra of SBODP codes, giving (a_dfree+i, i = 0, 1, 2, ...) and [c_dfree+i, i = 0, 1, 2, ...]

m    G¹, G²                 dfree
4    23, 31                 6      (1,0,4,0,22,0,124,0,682,0)  [1,0,10,0,96,0,778,0,5616,0]
6    127, 165               8      (1,0,5,0,35,0,187,0,1074,0)  [2,0,15,0,188,0,1275,0,9350,0]
9    1157, 1731             12     (8,0,16,0,159,0,741,0,5027,0)  [32,0,78,0,1150,0,6390,0,52470,0]
10   2251, 3003             9      (1,0,1,1,1,18,21,27,110,256)  [1,0,5,4,9,88,123,186,796,2060]
11   5247, 7125             12     (3,0,10,0,33,0,216,0,1216,0)  [10,0,42,0,228,0,1702,0,11248,0]
13   27537, 37275           12     (1,0,2,0,11,0,54,0,376,0)  [2,0,6,0,68,0,352,0,3278,0]
14   56777, 77735           12     (2,0,0,0,10,0,16,0,208,0)  [4,0,0,0,46,0,118,0,1420,0]
15   134277, 176435         16     (13,0,0,0,166,0,96,0,3651,0)  [60,0,0,0,1330,0,734,0,37790,0]
16   222511, 304443         14     (2,0,3,0,7,0,59,0,237,0)  [5,0,13,0,36,0,455,0,1966,0]
17   475417, 741571         16     (2,0,1,0,33,0,97,0,751,0)  [6,0,7,0,220,0,753,0,7386,0]
18   1346477, 1762635       18     (1,0,12,0,51,0,302,0,1801,0)  [5,0,70,0,365,0,2720,0,18953,0]
20   5734323, 6261675       18     (1,0,1,0,12,0,89,0,474,0)  [5,0,6,0,92,0,782,0,5034,0]
21   12022373, 15744405     20     (3,0,8,0,36,0,196,0,1246,0)  [24,0,56,0,288,0,1782,0,14312,0]
22   31065423, 23340731     21     (1,5,7,18,30,74,175,341,863,2097)  [5,36,37,168,298,670,1721,3686,9883,25366]
23   44407043, 61070111     18     (1,0,2,0,4,0,19,0,54,0)  [1,0,10,0,20,0,116,0,406,0]
24   176153077, 134442235   24     (2,3,26,42,93,173)  [16,13,234,350,884,1769]
25   221446737, 373544611   24     (4,0,17,0,78,0,508,0,2780,0)  [26,0,135,0,756,0,5540,0,34644,0]

In the event that no SBODP code can be found, we then slightly relax the distance profile to find a good code (large dfree and good distance profile). The same method can be applied when the dfree of a SBODP code is inferior to that of the known ODP code. We shall refer to such a symmetric non-ODP code as a symmetric almost BODP (SABODP) code. The computer search results are listed in Table 2.3. A dash sign in Table 2.3 indicates that no code with a better dfree was found after slightly decreasing the distance profile. Distance spectra of the found SABODP codes are listed in Table 2.4. The search for SABODP codes is conducted as follows.
We first examine all codes obtained through possible modifications of the last four coefficients of the distance profile of the corresponding ODP code. Then, among these codes, we select the one with the highest dfree and the best distance profile. The last four coefficients of the distance profile of the obtained codes are given in Table 2.3. When the distance profile is slightly decreased, a code with a better dfree may be found. For example, dfree of the SBODP code of memory m = 23 is equal to 18. When only the last coefficient of the distance profile is decreased by one, a SABODP code with dfree = 24 is found.

Notice that for any (2, 1, m) systematic code to be symmetric, the second adder of the encoder would have to be connected to only the last stage of the shift register. This would result in a very poor code in terms of both the free distance and the distance profile.

Table 2.3 Symmetric almost bidirectional optimum distance profile (SABODP) codes

m    G¹         G²         d_{m−3}, ..., d_m   dfree   dfree of ODP
2    —          —          —                   —       5
3    13         15         2,3,3,3             6       6
4    —          —          —                   —       7
5    57         75         3,4,4,4             8       8
6    —          —          —                   —       10
7    247        345        4,5,5,5             10      10
8    507        705        5,5,5,6             10      12
9    —          —          —                   —       12
10   2617       3615       6,6,6,6             12      14
11   —          —          —                   —       14
12   16767      12345      6,7,7,7             14      15
13   24433      33045      6,7,7,7             14      16
14   47263      63271      7,7,8,8             16      17
15   —          —          —                   —       18
16   232357     367131     8,8,8,8             16      19
17   723705     507627     8,8,8,8             18      20
18   1167671    1712517    8,8,9,9             19      21
19   2714477    3744635    9,9,9,9             20      22
20   5267447    7117325    9,9,9,9             20      22
21   17242231   11444257   9,9,9,10            22      24
22   32427561   21675053   9,10,10,10          22      24
23   55231643   61346255   10,10,10,10         24      26
24   —          —          —                   —       26
25   —          —          —                   —       27

Table 2.4 Distance spectra of SABODP codes, giving (a_dfree+i, i = 0, 1, 2, ...) and [c_dfree+i, i = 0, 1, 2, ...]

m    G¹, G²                dfree
3    13, 15                6      (2,0,10,0,49,0,241,0,1185,0)  [4,0,38,0,277,0,1806,0,11063,0]
5    57, 75                8      (3,0,12,0,70,0,397,0,2223,0)  [8,0,46,0,400,0,2925,0,20446,0]
7    247, 345              10     (4,0,19,0,95,0,539,0,3111,0)  [10,0,86,0,649,0,4400,0,30287,0]
8    507, 705              10     (2,0,10,0,51,0,287,0,1560,0)  [4,0,36,0,289,0,2118,0,13984,0]
10   2617, 3615            12     (2,0,15,0,71,0,383,0,2333,0)  [6,0,65,0,494,0,3053,0,22486,0]
12   16767, 12345          14     (2,0,20,0,93,0,612,0,3283,0)  [6,0,111,0,701,0,5456,0,35447,0]
13   24433, 33045          14     (3,0,10,0,47,0,280,0,1687,0)  [11,0,52,0,323,0,2478,0,17675,0]
14   47263, 63271          16     (6,0,27,0,140,0,808,0,4745,0)  [34,0,151,0,1120,0,7520,0,52014,0]
16   232357, 367131        16     (1,0,2,0,38,0,239,0,1131,0)  [10,0,6,0,246,0,1917,0,11114,0]
17   723705, 507627        18     (1,0,29,0,119,0,578,0,3433,0)  [3,0,182,0,953,0,5816,0,39833,0]
18   1167671, 1712517      19     (4,7,1,25,74,138,353,953,2090,5351)  [20,58,3,182,612,1242,3369,10278,23840,66680]
19   2714477, 3744635      20     (5,0,45,0,160,0,851,0,5184,0)  [24,0,319,0,1440,0,8775,0,61394,0]
20   5267447, 7117325      20     (1,0,19,0,101,0,423,0,2492,0)  [4,0,143,0,884,0,4295,0,29122,0]
21   17242231, 11444257    22     (9,0,58,0,275,0,1290,0,7381,0)  [49,0,488,0,2695,0,14502,0,94513,0]
22   32427561, 21675053    22     (1,0,44,0,128,0,793,0,4094,0)  [11,0,368,0,1078,0,8648,0,50150,0]
23   55231643, 61346255    24     (14,0,60,0,348,0,1959,0,10600,0)  [108,0,592,0,3602,0,24395,0,145826,0]
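The generators in Tables 2.1-2.4 are octal integers. Our reading (an assumption, but consistent with the m = 2 entry (7, 5) ↔ g⁽¹⁾ = 1+D+D², g⁽²⁾ = 1+D², and with the m = 4 pair (23, 31)) is that the binary expansion of each octal number lists the coefficients g₀ ... g_m, most significant bit first. A small helper to decode them and sanity-check symmetry:

```python
def octal_to_poly(s, m):
    """Map an octal generator from Tables 2.1-2.4 to the coefficient list
    (g_0, ..., g_m), reading the binary expansion most-significant bit first."""
    return [int(b) for b in format(int(s, 8), f'0{m + 1}b')]

def poly_str(coeffs):
    terms = [("1" if j == 0 else "D" if j == 1 else f"D^{j}")
             for j, c in enumerate(coeffs) if c]
    return " + ".join(terms)

g1, g2 = octal_to_poly("23", 4), octal_to_poly("31", 4)
print(poly_str(g1), "|", poly_str(g2))   # 1 + D^3 + D^4 | 1 + D + D^4
print(g1[::-1] == g2)                    # True: the m = 4 pair is symmetric
```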
Chapter 3
Bidirectional Sequential Decoding Algorithms

In this chapter, BSD algorithms in which the code tree is explored from both forward and backward directions are presented. Operations of all the proposed algorithms are essentially the same, except for their stopping rules. In our BSD algorithms, two separate stacks are used. One is used for the forward search of the tree and is called the forward stack (FS), while the other is used for the backward search and is called the backward stack (BS). Starting from the root nodes of the forward and backward trees (trellises), respectively, forward and backward search operations are performed alternately according to the usual stack algorithm. Define the total number of computations needed to decode one block by any BSD algorithm as the sum of the computations performed by the forward and backward decoders.

The proposed BSD algorithms can be classified into two categories: BSD-no-merge and BSD-merge. In BSD-merge, decoding stops whenever a path in one stack (FS or BS) merges with a path in the other stack. A path in one stack is said to merge (or agree) with a path in the other stack if the two paths share the same encoder state at a common tree level. In the class of BSD-no-merge, forward and backward portions of the decoded path do not have to agree with each other.

In the following, Algorithm TAmeet, which belongs to the class of BSD-no-merge, is first presented and analyzed. Next, two algorithms, Algorithm TAmerge and Algorithm TTmerge, which belong to the class of BSD-merge, are presented and analyzed. Finally, Algorithm HTTmerge, which is a hybrid scheme of BSD-merge and BSD-no-merge, is presented. For the convenience of describing the following algorithms, let lFTOP denote the level of the top node in the FS, and lF the farthest level reached by the forward decoder (the farthest level is not necessarily equal to the level of the top node because the decoder allows backtracking). Similarly, let lBTOP denote the level of the top node in the BS, and lB the farthest level reached by the backward decoder.

3.1 Algorithm TAmeet, a BSD-no-merge

Algorithm TAmeet stops decoding whenever a path in one of the two stacks meets a path in the opposite direction; these two meeting paths do not have to agree with each other. The level of the forward tree at the meeting point is denoted by l. As a consequence, the portion of the tree explored by the forward decoder does not overlap with the portion explored by the backward decoder, as illustrated in Figure 3.1. Algorithm TAmeet is as follows.

Algorithm TAmeet:

Step 0: Place the root nodes of the forward and backward trees in the FS and BS, respectively.

Step 1: Compute the metrics of all successors of the top node in the FS, and enter the new nodes in the FS, ordered with respect to their cumulative metrics.

Step 2: Remove from the FS the node whose successors were just inserted.

Step 3: Check if lF + lBTOP = L. If yes, stop decoding and go to step 7. Otherwise, go to the next step.

[Figure 3.1 Illustration of the stopping rules of Algorithm TAmeet: forward and backward trees of L branches grown toward each other, with extended and unextended nodes marked; decoding stops at the meeting point l if one of the meeting nodes is at the top of the FS or BS.]

Step 4: Compute the metrics of all successors of the top node in the BS, and enter the new nodes in the BS, ordered with respect to their cumulative metrics.

Step 5: Remove from the BS the node whose successors were just inserted.

Step 6: Check if lFTOP + lB = L. If yes, stop decoding and go to step 10. Otherwise, return to step 1.
If min(⌊m/2⌋, l_BTOP) = l_BTOP, retreat m - l_BTOP branches from the end node of the forward portion of the decoded path, then take the information symbols corresponding to the rest of the path as the decoded symbols, and go to step 13. Otherwise go to the next step.

Step 8: If min(⌈m/2⌉, l_F) = l_F, retreat m - l_F branches from the end node of the backward portion of the decoded path, then take the information symbols corresponding to the rest of the path as the decoded symbols, and go to step 13. Otherwise go to the next step.

Step 9: Retreat the last ⌈m/2⌉ and ⌊m/2⌋ branches from the forward and backward portions of the decoded path, respectively. Take the information symbols corresponding to the rest of the two paths as the decoded symbols, and go to step 13.

Step 10: The top path in the FS is accepted as the forward portion of the decoded path. The backward portion of the decoded path is selected from the BS among the paths whose end node is at level l_B (select the one with the highest metric value if there is more than one path at level l_B). If min(⌊m/2⌋, l_FTOP) = l_FTOP, retreat m - l_FTOP branches from the end node of the backward portion of the decoded path, then take the information symbols corresponding to the rest of the path as the decoded symbols, and go to step 13. Otherwise go to the next step.

Step 11: If min(⌈m/2⌉, l_B) = l_B, retreat m - l_B branches from the end node of the forward portion of the decoded path, then take the information symbols corresponding to the rest of the path as the decoded symbols, and go to step 13. Otherwise go to the next step.

Step 12: Retreat the last ⌊m/2⌋ and ⌈m/2⌉ branches from the forward and backward portions of the decoded path, respectively. Take the information symbols corresponding to the rest of the two paths as the decoded symbols.

Step 13: Output the decoded information symbols.

When the channel is not too noisy, the forward and backward portions of the decoded path may actually merge (agree) with each other, since a sequential decoder always follows the most likely path. Should this not be the case, a valid path (not necessarily the correct one) can be assembled by retreating a total of m branches from the forward and backward portions of the decoded path, as described above.

Note that in the absence of a burst in a block, the meeting point l would most of the time fall near the middle of the tree. However, should there be a single burst error within a block, it is expected that the meeting point would occur near the middle of the burst. Thus, by retreating about m/2 branches from the forward and backward meeting paths, the burst error may be totally eliminated if its length is less than m.

The meeting point l at the end of decoding in Algorithm TAmeet is equal to l_FTOP if decoding is stopped at step 6, or (L - l_BTOP) if decoding is stopped at step 3. Note that the forward and backward decoders always meet somewhere in the tree, i.e., 1 <= l <= L - 1.
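The alternating structure of steps 1 to 6 can be summarized in a few lines. The sketch below is an illustrative skeleton, not code from the thesis; it assumes an extend routine that pops the best node of a stack, pushes its successors, and returns the depth of the newly created nodes. Stack entries are (negated metric, level, node id) tuples so that Python's min-heap orders them by decreasing metric.

    import heapq

    def tameet_loop(extend, L):
        FS, BS = [], []
        heapq.heappush(FS, (0.0, 0, 0))       # forward root: (neg. metric, level, id)
        heapq.heappush(BS, (0.0, 0, 0))       # backward root
        l_F = l_B = 0                         # farthest levels reached so far
        while True:
            l_F = max(l_F, extend(FS))        # steps 1-2: extend top node of the FS
            if l_F + BS[0][1] >= L:           # step 3: top of BS meets a forward path
                return 'stopped at step 3', L - BS[0][1]   # meeting point l
            l_B = max(l_B, extend(BS))        # steps 4-5: extend top node of the BS
            if FS[0][1] + l_B >= L:           # step 6: top of FS meets a backward path
                return 'stopped at step 6', FS[0][1]       # meeting point l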
Let x be an arbitrary level in the forward code tree, and imagine cutting the tree into two separate subtrees at this level. Let C_x^F denote the number of computations needed by the forward decoder to decode x branches without searching beyond level x of the forward tree, i.e., until the top node in the FS reaches depth x for the first time. This is equivalent to decoding an x-branch tree without a terminating tail. Similarly, let C_{L-x}^B denote the number of computations needed by the backward decoder to decode an (L - x)-branch tree without searching beyond level (L - x) of the backward tree. Clearly, for any DMC and any fixed arbitrary level x, the random variables C_x^F and C_{L-x}^B are independent because the forward and backward search areas are disjoint.

Let us assume that the amount of time to hypothesize (extend) one node is the same for both the forward and the backward decoders. Let C_L denote the total number of computations needed to decode one block by Algorithm TAmeet. By the nature of Algorithm TAmeet, we can write the following property.

Property 3.1: Let l be the meeting point. Then

    C_L = 2 C_l^F and C_{L-l}^B >= C_l^F, if decoding is stopped at step 6;
    C_L = 2 C_{L-l}^B + 1 and C_l^F >= C_{L-l}^B + 1, if decoding is stopped at step 3.   (3.1)

Proof: According to Algorithm TAmeet, if decoding is stopped at step 6, the forward decoder must have extended the same number of nodes as the backward decoder. If decoding is stopped at step 3, however, the forward decoder must have extended one more node than the backward decoder. Let us first assume that decoding is stopped at step 6. By the definition of the meeting point l (see Figure 3.1), we know that l_FTOP = l, which means C_L = 2 C_l^F. Moreover, since l_BTOP <= L - l, then C_{L-l}^B >= C_l^F. Now, suppose that decoding is stopped at step 3. In this case, C_L = 2 C_{L-l}^B + 1 since l_BTOP = L - l, where the additional computation is due to the last extension by the forward decoder. Moreover, since l_FTOP <= l, then C_l^F >= C_{L-l}^B + 1. Q.E.D.

Corollary 3.1: Let l be the meeting point. Then

    C_L - 1 <= 2 min(C_l^F, C_{L-l}^B),   (3.2)

since C_L - 1 <= 2 C_l^F and C_L - 1 <= 2 C_{L-l}^B according to Property 3.1.

Obviously, C_L = 2 C_l^F = 2 C_{L-l}^B if and only if both top nodes in the FS and BS are at forward tree level l at the end of decoding, i.e., both top nodes are at the meeting point l. It is worth noticing that the minus one in (3.2) can actually be ignored, and thus C_L is essentially equal to 2 min(C_l^F, C_{L-l}^B).

Now, we introduce an important property which is the foundation of our analysis of the computational performance of the proposed BSD in Chapter 4.

Property 3.2: Let the integer i ∈ [1, L - 1]. Then

    C_L - 1 <= 2 max_{1<=i<=L-1} [min(C_i^F, C_{L-i}^B)].   (3.3)

Proof: First of all, note that there must exist a meeting point in Algorithm TAmeet, i.e., 1 <= l <= L - 1. By Corollary 3.1, one can write

    C_L - 1 <= 2 min(C_l^F, C_{L-l}^B).   (3.4)

Moreover, we can write

    max_{1<=i<=L-1} [min(C_i^F, C_{L-i}^B)] >= min(C_l^F, C_{L-l}^B),   (3.5)

since 1 <= l <= L - 1. Q.E.D.

In fact, it is easy to see that C_L is virtually equal to 2 max_{1<=i<=L-1} [min(C_i^F, C_{L-i}^B)] (note that C_L >= 2 min(C_i^F, C_{L-i}^B) - 1 for all possible i). This approximation, however, is not necessary for the proof of the upper bounds on the computational distribution of Algorithm TAmeet in Chapter 4.

It is interesting to point out that Forney's scheme [22] belongs to BSD-no-merge, since the forward and backward decoders do not need to agree with each other, and its total number of computations is equal to 2 min(C_L^F, C_L^B), where C_L^F and C_L^B denote the number of computations needed by the forward and backward decoders to decode the whole block, respectively.

Let us now compare Algorithm TAmeet with the bidirectional M algorithm [28]. It is obvious that they both belong to the same category of BSD-no-merge, since merging between the forward and backward portions of the decoded path is not required in either algorithm.
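A toy numerical reading of Property 3.2, with made-up computation counts rather than thesis data: C_i^F grows with i while C_{L-i}^B shrinks, and the block total is essentially twice the largest of the minima over all cut levels.

    L = 10
    C_F = [1, 2, 3, 5, 9, 17, 33, 40, 44]   # C_F[i-1] = C_i^F for i = 1..L-1 (toy values)
    C_B = [1, 2, 3, 4, 6, 10, 18, 34, 41]   # C_B[j-1] = C_j^B for j = 1..L-1 (toy values)

    bound = 2 * max(min(C_F[i - 1], C_B[L - i - 1]) for i in range(1, L))
    print(bound)   # by Property 3.2, C_L - 1 cannot exceed this value (here 12)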
The main difference, however, is in the nature of the search algorithm employed in the forward and backward decoders. In the bidirectional M algorithm the number of computations per decoded branch is constant, whereas in Algorithm TAmeet that number is variable since backtracking is allowed. As a consequence, erasure-free decoding is possible in the bidirectional M algorithm, while Algorithm TAmeet cannot guarantee it. On the other hand, the bidirectional M algorithm suffers from the correct path loss problem. Another difference is how they handle a burst. In Algorithm TAmeet, the forward and backward decoders search into the forward and backward trees alternately, no matter how many bursts there are or where they are located. At the end of decoding, both decoders in Algorithm TAmeet retreat about m/2 branches. In the bidirectional M algorithm, however, in order for the first decoder encountering a burst to wait for the other decoder's help from the opposite direction, a burst detection method is required. Moreover, only the bad decoder retreats m branches at the end of decoding. The BER of the bidirectional M algorithm degrades with increasing block length L [28], while in Algorithm TAmeet the BER improves as L increases (see Chapter 5 for analysis and simulation results).

3.2 Algorithm TAmerge, a BSD-merge

In BSD-merge, decoding does not stop whenever two opposing paths meet each other as in the algorithm described above. Instead, a merging test is applied to opposing paths, and decoding stops only when the test is successful. We present in the following Algorithm TAmerge, which belongs to the class of BSD-merge.

Algorithm TAmerge works in the same way as Algorithm TAmeet, except that at the meeting point l decoding is not stopped unless a pair of forward and backward paths merge with each other. Instead, decoding is continued until a pair of merged forward and backward paths is found. Let l_0 denote the level of the forward tree when this happens. Obviously, here the portion of the tree searched by the forward decoder does overlap with the portion searched by the backward decoder, as illustrated by Figure 3.2. More specifically, Algorithm TAmerge is as follows.

Algorithm TAmerge:

Steps 0 to 2 are exactly the same as in Algorithm TAmeet.

Step 3: If l_BTOP + l_F > L, check the top path in the BS against all paths in the FS with depths equal to L - l_BTOP. If one (or more) merging path is found, stop decoding; among all merging paths, select the one with the highest cumulative metric as the decoded path. Otherwise, go to the next step.

Steps 4 and 5 are exactly the same as in Algorithm TAmeet.

Step 6: If l_FTOP + l_B > L, check the top path in the FS against all paths in the BS with depths equal to L - l_FTOP. If one (or more) merging path is found, stop decoding; among all merging paths, select the one with the highest cumulative metric as the decoded path. Otherwise, go to step 1.

[Figure 3.2 Illustration of the overlapping in BSD-merge: correct and incorrect paths of the L-branch forward and backward trees around the merging point l_0.]

It is worth pointing out that all nodes in one stack (FS or BS) can easily be linked with respect to their depths. Hence, in the merging test (steps 3 and 6), it is easy to find all those nodes which should be tested against the top node in the opposite direction. During the merging test at steps 3 and 6, the top path in either the FS or BS is only tested against end nodes of paths in the opposite direction which are at the same level as the tested top path.
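A sketch of the depth-indexed merging test of steps 3 and 6 follows (the data layout is assumed for illustration; the thesis only requires that nodes be linked by depth). Here opposite_by_depth maps a depth to the (metric, encoder state, path) triples stored in the opposite stack at that depth:

    def merge_test(top, opposite_by_depth, l_far_opposite, L):
        """Test the top path of one stack against all opposite paths whose
        depth complements it to L; return the best merged pair, or None."""
        metric, depth, state, path = top
        if depth + l_far_opposite <= L:          # the decoders have not met yet
            return None
        best = None
        for o_metric, o_state, o_path in opposite_by_depth.get(L - depth, []):
            if o_state == state:                 # same encoder state at a common level
                total = metric + o_metric
                if best is None or total > best[0]:
                    best = (total, path, o_path) # highest cumulative metric wins
        return best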
A natural question that may arise is whether two opposing paths may pass each other, since intermediate nodes are not examined during the merging test. For example, Figure 3.3 illustrates such a situation, where two opposing paths may have common encoder states at some of their intermediate nodes. We now show that the events shown in this figure cannot take place in Algorithm TAmerge.

[Figure 3.3 Example of opposing paths passing each other: cases (a) to (d), with common encoder states marked.]

First of all, events like (c) and (d) in Figure 3.3 cannot happen unless event (a) or (b) occurs in the first place. This is because both forward and backward decoders must reach their predecessors before any further extensions. We are going to show that events (a) and (b) in Figure 3.3 cannot occur in Algorithm TAmerge. Clearly, events (a) and (b) are the same type of event. Hence, only event (a), which is shown in more detail in Figure 3.4, will be considered in the following.

[Figure 3.4 Typical impossible event in Algorithm TAmerge: a backward path with end node B0 at level x and successor B1, and a forward path through F0 at level x with successors F1 and F1'.]

Figure 3.4 implies that the backward path with end node B0 merges with the forward path at its intermediate node F0. Moreover, F0 and B0 are the only common nodes between the forward and backward paths. As Figure 3.4 indicates, (I_1, I_2, ..., I_m) denotes the common encoder state. Moreover, the encoder state of node B1 is either (I_2, I_3, ..., I_m, 0) or (I_2, I_3, ..., I_m, 1). Let I_B1 denote the last bit of the encoder state of B1, and similarly let I_F1 denote the last bit of the encoder state of F1. Since F0 is an intermediate node of the forward path, all successors of node F0 must have been reached, and F0 was consequently removed from the FS (steps 1 and 2 in Algorithm TAmerge). Let F1 and F1' be the successors of F0, as illustrated in Figure 3.4. Since F0 and B0 are the only common nodes between the forward and backward paths, F1 and B1 are common nodes, i.e., I_F1 = I_B1; otherwise, nodes F1' and B1 would share a common encoder state, as indicated in Figure 3.5.

[Figure 3.5 Illustration of a typical impossible event in Figure 3.4.]

Now, let us assume that F1 is still in the FS, which means F1 had not yet been extended before the backward path reached node B0. According to step 3 in Algorithm TAmerge, however, decoding must have been stopped at the time when B1 was at the top of the BS. Thus, B0 never has a chance to be reached, because B1 must be at the top of the BS before B0 can be reached. This is illustrated in Figure 3.6, which clearly shows that end node F1 of the forward path merges with end node B1 of the backward path.

[Figure 3.6 Illustration of the possible event: decoding stops when B1 becomes the top node in the BS, with end nodes F1 and B1 merging at level x + 1.]

Next, let us assume that F1 is not in the FS, which means F1 has already been extended. As shown in Figure 3.7, let F2 and F2' denote the successors of F1, and assume that F2 and B2 have the same encoder state. Clearly, Figure 3.7 shows the same type of event as that in Figure 3.3 (d), since there is more than one common encoder state. Based on our discussion above, this event cannot occur before an event like that in Figure 3.3 (a) or (b) happens. Following this same line of reasoning, we get the following property.
[Figure 3.7 Illustration of another impossible event, with the common encoder states at level x marked.]

Property 3.3: In Algorithm TAmerge, forward and backward paths in the FS and BS can only merge at their end nodes. That is, merging paths cannot overlap each other.

Since the decoded path is a merged path, the forward and backward portions of the decoded path will never overlap in Algorithm TAmerge. Furthermore, we can conclude that overlapped forward and backward paths in Algorithm TAmerge do not have the same encoder state at any of their common tree levels, because otherwise Property 3.3 would be breached.

Two forward and backward paths are said to be totally distinct as long as they do not overlap, or do not share the same encoder state at any common tree level in their overlapping portion. Figure 3.8 illustrates all possible types of paths in the FS and in the BS. In Figure 3.8 (a), the forward and backward paths do not have a chance to meet. In Figure 3.8 (b), merged forward and backward paths do not overlap, while in Figure 3.8 (c) the forward and backward paths do not share any common encoder state in their overlapping area. Thus, we can write the following property.

[Figure 3.8 Distinct forward and backward paths in the FS and BS: (a) paths that do not meet, (b) merged paths that do not overlap, (c) paths overlapping between forward tree level l_F and backward tree level l_B with no common encoder state.]

Property 3.4: In Algorithm TAmerge, all paths in the FS are totally distinct from all paths in the BS.

As in Algorithm TAmeet, the forward and backward decoders cannot reach their terminal nodes in Algorithm TAmerge, because all successors of the root nodes of the forward and backward trees were reached at the beginning. Thus, the forward and backward decoders must merge somewhere in the tree, i.e., 1 <= l_0 <= L - 1. In the following analysis of Algorithm TAmerge, correct decoding is assumed, which is a good approximation, especially for long memory codes. For an arbitrary forward tree level x, let us define C_x^F as the number of computations required to correctly decode the first x branches by the forward decoder. Similarly, define C_{L-x}^B as the number of computations required to correctly decode the last (L - x) branches by the backward decoder. Unlike in Algorithm TAmeet, the forward decoder may extend incorrect paths beyond level x before it finally decodes all x branches. The backward decoder may also extend incorrect paths beyond level (L - x) before it finally decodes all (L - x) branches.

Now, let C_L denote the total number of computations needed by Algorithm TAmerge under the assumption of correct decoding. By analogy to Algorithm TAmeet, we can claim the following properties.

Property 3.5: Let l_0 be the merging point of Algorithm TAmerge. Then

    C_L = 2 C_{l_0}^F and C_{L-l_0}^B >= C_{l_0}^F, if decoding is stopped at step 6;
    C_L = 2 C_{L-l_0}^B + 1 and C_{l_0}^F >= C_{L-l_0}^B + 1, if decoding is stopped at step 3.   (3.6)

Proof: As in Algorithm TAmeet, if decoding is stopped at step 6, the forward decoder must have extended the same number of nodes as the backward decoder. If decoding is stopped at step 3, however, the forward decoder must have extended one more node than the backward decoder. Without loss of generality, let us assume that decoding is stopped at step 6. Thus, the top node in the FS is at level l_0 when decoding is stopped, and the forward decoded path up to this top node is correct. Hence, the forward decoder must have performed exactly C_{l_0}^F computations. Furthermore, the algorithm does not require that the backward decoder totally decode the last (L - l_0) branches, i.e., C_{L-l_0}^B >= C_{l_0}^F.
Q.E.D.

Corollary 3.2: Let l_0 be the merging point in Algorithm TAmerge. Then

    C_L - 1 <= 2 min(C_{l_0}^F, C_{L-l_0}^B)   (3.7)

according to Property 3.5. Similar to the discussion of Corollary 3.1, C_L is almost equal to 2 min(C_{l_0}^F, C_{L-l_0}^B).

Property 3.6: Let the integer i ∈ [1, L - 1]. Then

    C_L - 1 <= 2 max_{1<=i<=L-1} [min(C_i^F, C_{L-i}^B)].   (3.8)

Proof: Same as the proof of Property 3.2. Q.E.D.

As for Algorithm TAmeet, one can easily conclude that C_L is practically equal to 2 max_{1<=i<=L-1} [min(C_i^F, C_{L-i}^B)]. Strictly speaking, the number of computations without the assumption of correct decoding is less than or equal to that with the assumption.

We now examine the merging test (step 3 and step 6) of Algorithm TAmerge in more detail. In Algorithm TAmerge, decoding stops whenever a top path either in the FS or in the BS merges with any path from the opposite direction. Clearly, the number of paths being tested in the merging test is a random variable. This variable is small under good channel conditions and increases as the channel gets noisier. Suppose that the top path in the BS merges with a forward path FP which is not at the top of the FS (step 3). The end node of FP is at the merging point l_0 according to Property 3.3. As a consequence of this merging, the forward decoder does not need to extend all forward paths whose metrics are higher than M[FP] in order to decode the forward portion of the decoded path FP. In other words, the merging test at step 3 effectively reduces the number of computations needed by the forward decoder to decode FP. Similarly, the merging test at step 6 may reduce the number of computations needed by the backward decoder if the top path in the FS merges with an opposite path that is not at the top of the BS. Hence, the multiple examinations of the top path in one stack against all paths in the opposite stack result in a reduction in computational variability, since the overall number of computations per decoded block is reduced. The parallel to this is the multiple path extensions in the generalized stack algorithms of Haccoun and Ferguson [19], which allow a reduction in computational variability at the cost of an increase in the average number of computations per decoded block. Note that in Algorithm TAmerge the average number of computations is not increased, but is even reduced, as will be shown in Chapter 4.

3.3 Coarsening on Algorithm TAmerge

In order to perform the merging test in Algorithm TAmerge efficiently, nodes have to be linked together with respect to their depths. However, this introduces the drawback of extra processing time and memory needed for storage in the forward and backward decoders. In the following, we provide two modified algorithms, quite similar to Algorithm TAmerge but without the aforementioned drawback. These two algorithms work best with quantized stacks.

The modification is done in the merging test of Algorithm TAmerge (steps 3 and 6). Here, the merging test is only performed among all nodes in the highest non-empty substacks in the FS and BS, instead of testing the top node in each stack against all possible merging nodes in the opposite direction. As a consequence, however, two opposing paths may pass each other, because the forward and backward portions of the merged path may not coexist in the highest non-empty substacks at the same time. In other words, the forward and backward portions of the merged path may overlap, as illustrated in Figure 3.3.
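For reference, a minimal sketch of the quantized stack assumed by the two coarsened algorithms follows (an illustrative structure, not from the thesis): nodes whose metrics fall within the same spacing Δ share a substack, and the merging test of Algorithm TTmerge operates on the highest non-empty substack as a whole.

    from collections import defaultdict

    class QuantizedStack:
        def __init__(self, delta):
            self.delta = delta                    # substack spacing Δ
            self.substacks = defaultdict(list)    # bucket index -> nodes

        def push(self, metric, node):
            self.substacks[metric // self.delta].append((metric, node))

        def top_substack(self):
            """All nodes of the highest non-empty substack (tested jointly)."""
            return self.substacks[max(self.substacks)] if self.substacks else []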
In order to reduce the possibility of two opposite paths passing each other, not only the end points of the two possible merging paths should be tested, but also other possible merging points in the overlapping area, as described below.

Coarsening 1 (Algorithm TTmerge):

Steps 0 to 2 are the same as in Algorithm TAmeet, except that the top node is now any node in the highest non-empty substack.

Step 3: If l_F + l_B > L, check all paths in the highest non-empty substack in the BS against all paths in the highest non-empty substack in the FS. If there is an overlap between the forward and backward paths being tested, check all encoder states in the overlapping area. If one or more merging paths are found, stop decoding; among all merging paths, select the one with the highest cumulative metric as the decoded path. Otherwise, go to the next step.

Steps 4 and 5 are the same as in Algorithm TAmeet, except that the top node is now any node in the highest non-empty substack.

Step 6: If l_F + l_B > L, check all paths in the highest non-empty substack in the FS against all paths in the highest non-empty substack in the BS. If there is an overlap between the forward and backward paths being tested, check all encoder states in the overlapping area. If one or more merging paths are found, stop decoding; among all merging paths, select the one with the highest cumulative metric as the decoded path. Otherwise, go to the next step.

Step 7: Check if the node to be extended in the highest non-empty substack in the FS is a terminal node in the forward tree, and check if the node to be extended in the highest non-empty substack in the BS is a terminal node in the backward tree. If either one is a terminal node, stop decoding and accept that top path as the decoded path. In the event that both top paths in the FS and BS are terminal nodes in the forward and backward trees, respectively, the path with the highest cumulative metric is selected as the decoded path. Otherwise, go to step 1.

If two opposing paths pass each other and one of them reaches the end of its corresponding tree, the decoding should clearly be stopped. Therefore, step 7 is needed in Algorithm TTmerge. However, step 7 can be eliminated if we treat the forward and backward root nodes as connected with zero paths. For example, if the top node in the FS is at level L, it can be viewed as a node merging with the backward root node.

[Figure 3.9 Merging test in Algorithm TTmerge: (a) non-overlapping paths, where only the end points are tested; (b) paths overlapping over t_0 branches, with all possible merging nodes examined.]

The merging test in Algorithm TTmerge is illustrated in Figure 3.9. In Figure 3.9 (a), the forward and backward paths do not overlap, and thus only the end points are tested. In Figure 3.9 (b), however, the forward and backward paths overlap over t_0 branches. In this case all intermediate nodes are examined. This involves t_0 + 1 comparisons, which can be done by simply performing one exclusive-OR operation between the last m + t_0 information symbols of the tested paths and checking if there exist m consecutive zeros. In case the depth of one of the tested paths is smaller than (m + t_0), the shorter one can be viewed as being rooted by a zero path of any length.
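The exclusive-OR test just described can be written directly. The sketch below assumes one information bit per branch, with the most recent bit in the least significant position:

    def merges(fwd_bits, bwd_bits, m, t0):
        """True iff the two opposing paths agree on m consecutive information
        bits somewhere in the m + t0 positions around the overlap."""
        x = fwd_bits ^ bwd_bits                  # one exclusive-OR operation
        window = (1 << m) - 1
        return any((x >> s) & window == 0        # m consecutive zeros found
                   for s in range(t0 + 1))       # t0 + 1 candidate merging points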
It is interesting to notice that the number of nodes in the highest forward and backward substacks varies according to the channel conditions. When the channel is quiet, only the correct node is likely to float in the highest substack. As the channel condition degrades, the number of nodes in the highest substacks tends to increase. This means that the merging test involves a variable number of examinations, which depends on the channel conditions.

Coarsening 2 (Algorithm HTTmerge):

We now describe Algorithm HTTmerge, which is a hybrid version of Algorithm TTmerge and Algorithm TAmeet. Algorithm HTTmerge is exactly the same as Algorithm TTmerge, except that the number of information symbols required to be the same in the merging test of Algorithm TTmerge (steps 3 and 6), denoted m_h, is less than m. The idea is that if two opposing paths agree on their last m_h < m branches, it is anticipated that they would eventually agree on all last m branches if decoding were continued. Of course this may not happen all the time, especially if m_h is relatively small, and this premature stopping may introduce additional decoding errors. If m_h = m, Algorithm HTTmerge becomes equivalent to Algorithm TTmerge. In the other extreme, if m_h = 0, it becomes equivalent to Algorithm TAmeet. Clearly, Algorithm HTTmerge reduces the amount of computations by compromising the error performance, since the number of matching information symbols m_h is less than m. Hence, Algorithm HTTmerge can provide a trade-off between computational effort and error performance by controlling the number of matching information symbols in the merging test.
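In implementation terms, the only change relative to Algorithm TTmerge is the width of the comparison window (a sketch under the same bit-layout assumption as before):

    def htt_merges(fwd_bits, bwd_bits, m_h):
        """Relaxed merging test: only the last m_h information bits must agree.
        m_h = m recovers Algorithm TTmerge; m_h = 0 degenerates to a pure
        meeting test as in Algorithm TAmeet."""
        return (fwd_bits ^ bwd_bits) & ((1 << m_h) - 1) == 0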
Chapter 4
Computational Performance of BSD

In this chapter, we analyze the computational performance of the proposed BSD algorithms. It is assumed that the channel is a DMC. Also, BSD using the stack algorithm is assumed; however, all results in this chapter can be applied to arbitrary BSD sequential decoders. Let C_L^BSD denote the total number of computations performed by any BSD algorithm for decoding one block of L branches (including a known tail), where one computation is defined as in Chapter 3, i.e., the extension of one node from either the forward or the backward direction. We are mainly concerned with the distribution of C_L^BSD, i.e., P(C_L^BSD > N). In the following, we first quickly demonstrate the property of the computational distribution by a simple approach. Then, Algorithm TAmeet is analyzed in detail. More specifically, upper bounds on the computational distribution are given by Theorem 4.1 (for a specific time-invariant convolutional code) and by Theorem 4.2 (for an ensemble of trellis codes). Moreover, we conjecture that Theorem 4.2 can be applied to Algorithm TAmerge. A lower bound on the computational distribution for any code and any type of BSD is also derived. The average number of computations and the computational cutoff rate of BSD are then discussed. Finally, extensive computer simulation results are provided, confirming our theoretical analysis.

4.1 A Simple Approach to Computational Distribution

Before giving the general derivation, we first quickly show interesting results on the computational distribution of BSD when the code rate is below the computational cutoff rate R_comp. Let us define a dip [36, 46] on the correct path as the largest metric difference between two consecutive non-adjacent breakout nodes (see Figure 4.1 (a)). Notice that the dip is almost the same whether decoding is performed from the forward or the backward direction, because the metric of the backward correct path is the mirror image of the corresponding metric of the forward correct path (see Figure 4.1 (a) and (b)).

[Figure 4.1 (a) Correct path metric with forward decoding, with breakout and non-breakout nodes and the dip marked. (b) Correct path metric with backward decoding. (c) Correct path metric with bidirectional decoding, with the optimum meeting (merging) point marked.]

Alternatively, the dip can be viewed as a "noise burst" [30]. It is admitted that the Pareto distribution P(C_L^USD > N) ∝ N^{-ρ_r} is caused by the following two opposing effects: (i) the probability of a dip of size (burst length) γ decreases exponentially with γ; (ii) the number of computations necessary to decode the correct path increases exponentially with γ [5, 15, 30, 32, 36, 46, 47]. All this suggests the inherent inability of USD to deal effectively with a large dip (long burst). These difficulties arise from the essential nature of USD algorithms, i.e., searching only from one direction, and not from any fundamental limitations on convolutional codes or from any basic difficulties arising from the dip (burst) [30]. In the following, we demonstrate that the proposed BSD relieves this inefficiency of USD by attacking the dip (burst) from both the forward and backward directions.

Suppose that the code rate is below R_comp; experimental evidence [48] indicates that with a very high probability only one important dip exists in a block. Moreover, it is easy to see that C_x^F is an increasing function of x, while C_{L-x}^B is a decreasing function of x. Hence, according to Property 3.2, C_l^F ≈ C_{L-l}^B in Algorithm TAmeet, where l is the meeting point. Similarly, in Algorithm TAmerge, C_{l_0}^F ≈ C_{L-l_0}^B, where l_0 is the merging point. Thus, Algorithm TAmeet and Algorithm TAmerge optimally break every received block into two portions that require roughly the same number of computations for decoding. Also, the portions of the decoded forward and backward paths in the FS and BS will never overlap in Algorithm TAmeet and Algorithm TAmerge.

In summary, the proposed BSD algorithms attack the dip (burst) from both sides, and halt somewhere near the middle of the dip. Hence, with the assumption of correct decoding, the meeting (or merging) event can be viewed as the breaking of the dip in an almost optimal way, i.e., into two almost equal parts (see Figure 4.1 (c)). It seems plausible that the amount of computations required by the proposed BSD is approximately the amount required by a USD to move through two dips (bursts) of half the original size. This is why our BSD needs substantially fewer computations. Therefore, if USD needs N ∝ exp γ computations for a given correct path dip γ, our BSD asymptotically needs about twice exp(γ/2) ∝ √N computations for the same dip. If this characterization is valid, we can conclude that

    P(C_L^BSD > N) ≈ P(C_L^USD > (N/2)^2) ∝ N^{-2ρ_r}.   (4.1)

Note the factor of two in the exponent of the distribution (4.1).
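The dip-halving argument can be checked numerically with a toy model (assumed for illustration only, not part of the thesis): draw exponentially distributed dip sizes, charge USD exp(γ) computations and BSD twice exp(γ/2), and observe that the empirical tail exponent doubles.

    import math, random

    random.seed(1)
    rho = 1.1                                  # illustrative USD Pareto exponent
    dips = [random.expovariate(rho) for _ in range(200000)]
    usd = [math.exp(g) for g in dips]          # USD: N ~ exp(gamma)
    bsd = [2 * math.exp(g / 2) for g in dips]  # BSD: two half-size dips

    def tail(samples, N):
        return sum(s > N for s in samples) / len(samples)

    # P(C > N) falls off as N**(-rho) for USD and roughly N**(-2*rho) for BSD:
    for N in (10.0, 100.0):
        print(N, tail(usd, N), tail(bsd, N))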
4.2 Upper Bounds to Computational Distribution of Algorithm TAmeet

The computational effort to decode a given block depends on the strategy of the algorithm and on the transmitted and received sequences. For the convenience of analysis, define a random variable C as C ≜ 2 max_{1<=x<=L-1} [min(C_x^F, C_{L-x}^B)]. We first derive upper bounds to the distribution function of the random variable C; we will then apply them to C_L of Algorithm TAmeet.

Lemma 4.1:

    P(C > N) <= Σ_{x=1}^{L-1} P(C_x^F > N/2) P(C_{L-x}^B > N/2),   (4.2)

where the summation is over all possible values of x, i.e., 1 <= x <= L - 1.

Proof: By the definition of the random variable C, one can immediately write

    P(C > N) = P(max_{1<=x<=L-1} {min[C_x^F, C_{L-x}^B]} > N/2) <= Σ_{x=1}^{L-1} P(min[C_x^F, C_{L-x}^B] > N/2),   (4.3)

and since for any fixed x the random variables C_x^F and C_{L-x}^B are independent, P(min[C_x^F, C_{L-x}^B] > N/2) = P(C_x^F > N/2) P(C_{L-x}^B > N/2). Q.E.D.

Lemma 4.2: The distribution P(C_Λ^USD > N) of the number of computations of USD to decode Λ branches in a finite Λ-branch code tree (or an infinite code tree) is upper bounded by

    P(C_Λ^USD > N) <= Λ P(C_1 > N/Λ - 1),   (4.4)

where C_1 denotes the number of computations performed by USD in order to decode the first branch in an infinite code tree.

Proof: For an infinite code tree, according to the union bound, we have [20]

    P(C_Λ > N) <= Σ_{x=1}^{Λ} P(C_x > N/Λ),   (4.5)

where C_x denotes the number of computations attributable to the x-th branch, each of which is stochastically dominated by C_1 + 1, so that

    P(C_Λ > N) <= Λ P(C_1 > N/Λ - 1),   (4.6)

and (4.4) follows because C_Λ^USD <= C_Λ. Q.E.D.

Lemma 4.2 confirms that there is no essential difference between P(C_Λ^USD > N) and P(C_1 > N), as Jacobs and Berlekamp suggested in [30]. From now on, we let C_1^F and C_1^B denote the number of computations required to decode the first branch in the infinite forward and backward code trees, respectively.

Theorem 4.1: For a specific time-invariant convolutional (i.e., linear trellis) code with rate R and a BSC with a crossover probability p satisfying (2.9), the computational distribution of Algorithm TAmeet is upper bounded by

    P(C_L > N) <= A_1 exp{-t (d_F[log_u N - log_u(2uL)] + d_B[log_u N - log_u(2uL)]) + 2η log_u N},   (4.11)

where d_F(·) and d_B(·) denote the column distance functions of the forward and backward codes, u >= 2, and A_1, t and η are constants that depend on the channel and the code rate but not on N.

Proof: By Lemma 4.2, P(C_x^F > N/2) <= x P(C_1^F > N/(2x) - 1) and P(C_{L-x}^B > N/2) <= (L - x) P(C_1^B > N/(2(L - x)) - 1), and the latter probabilities decrease exponentially with the column distances d_F(·) and d_B(·) evaluated near log_u N [38]. Substituting these bounds into Lemma 4.1 and collecting the terms that do not grow with N into the constant A_1 yields (4.11). According to Property 3.2, C_L - 1 <= C; hence P(C_L > N) <= P(C > N). Q.E.D.

Theorem 4.1 shows that rapid column-distance growth of both the forward and backward codes is essential to make BSD efficient. Using a computer search procedure, we have provided in Chapter 2 SBODP and SABODP codes which have the same rapid column-distance growth rate from both the forward and backward directions.

However, if the convolutional code used is optimum from one direction but very bad from the other direction, one can expect that the distribution of computations using BSD falls close to that using USD from the optimum direction, as the above bound indicates. A systematic ODP code is one typical example. Therefore, systematic codes are in general not suitable for BSD algorithms.

Corollary 4.1: For a symmetric convolutional code (e.g., a SBODP or SABODP code), d_F = d_B and Theorem 4.1 reduces to

    P(C_L > N) <= A_2 N^{-2ρ'},   (4.13)

where ρ' = (a·n - 2η)/(2 ln u) - 1 is the same exponent as that of USD [38], so that the advantage of BSD appears as the doubling of this exponent in (4.13). Moreover, A_2 and ρ' are functions of the channel and the code rate but independent of N.

The above arguments are more clearly demonstrated by the following theorem. Notice that Lemmas 4.1 and 4.2 can be applied to an ensemble of codes.

Theorem 4.2: Let ρ_r be defined by R = E_0(ρ_r)/ρ_r. On any DMC and for large N, the computational distribution of Algorithm TAmeet for an ensemble of trellis codes, P(C_L > N), is upper bounded by

    P(C_L > N) <= A_3 N^{-2ρ_r},   (4.14)

where A_3 is a constant independent of N; for N >= 2L, the terms in the last two brackets of the underlying bound are very small and can be bounded by a constant.
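The channel parameters used in the simulations of section 4.6 can be reproduced from the standard Gallager function of the BSC with equiprobable inputs (a textbook formula, not derived in this chapter): E_0(ρ) = ρ - (1 + ρ) log2(p^{1/(1+ρ)} + (1-p)^{1/(1+ρ)}), with R_comp = E_0(1) and ρ_r solving R = E_0(ρ_r)/ρ_r.

    import math

    def E0(rho, p):
        s = p ** (1 / (1 + rho)) + (1 - p) ** (1 / (1 + rho))
        return rho - (1 + rho) * math.log2(s)

    def pareto_exponent(R, p, lo=1e-6, hi=20.0):
        for _ in range(100):                   # bisection on E0(rho)/rho = R
            mid = (lo + hi) / 2
            if E0(mid, p) / mid > R:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    for p in (0.0409, 0.0594, 0.0289):
        print(p, E0(1, p), pareto_exponent(0.5, p))
    # p = 0.0409 gives R_comp ~ 0.52 and rho_r ~ 1.1; p = 0.0594 gives ~ 0.44
    # and ~ 0.68; p = 0.0289 gives ~ 0.58 and ~ 1.5, matching section 4.6.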
4.3 Computational Distribution of Algorithm TAmerge

By Property 3.6, the chain of arguments leading to Lemma 4.1 applies to Algorithm TAmerge as well. Therefore,

    P(C_L^TAmerge > N) <= P(C_L > N) <= Σ_{x=1}^{L-1} P(min[C_x^F, C_{L-x}^B] > N/2),   (4.18)

where C_L^TAmerge is the true number of computations of Algorithm TAmerge, i.e., without the assumption of correct decoding. Hence, the only question is whether, for any fixed arbitrary x, C_x^F and C_{L-x}^B are independent or not.

As defined in Chapter 3, for an arbitrary forward tree level x, the random variables C_x^F and C_{L-x}^B are the numbers of computations required to correctly decode x and (L - x) branches by the forward and backward decoders, respectively. Furthermore, unlike in Algorithm TAmeet, both the forward and backward decoders may extend incorrect paths beyond level x before the end of decoding. Thus, some incorrect nodes searched by the forward decoder may also be visited by the backward decoder.

Let D_i^F be the forward incorrect path subset generated from forward tree level i. Similarly, let D_j^B denote the backward incorrect path subset generated from backward tree level j. According to the operation of sequential decoding, C_x^F is a function of the metrics of the correct path up to level x and the metrics of the incorrect paths in the subsets D_i^F, i = 1, 2, ..., x [31, 46, 52]. Thus, one can write

    C_x^F = f(M[x_F(h)], M[x̂_F]), where h ∈ [1, x] and x̂_F ∈ ∪_{i=1}^{x} D_i^F.   (4.19)

Here, x_F(h) is the forward correct path up to forward tree level h, and x̂_F is a forward incorrect path. Similarly, one can write

    C_{L-x}^B = f(M[x_B(h)], M[x̂_B]), where h ∈ [1, L - x] and x̂_B ∈ ∪_{j=1}^{L-x} D_j^B.   (4.20)

Here, x_B(h) is the backward correct path up to backward tree level h, and x̂_B is a backward incorrect path.

Strictly speaking, on a random code tree, C_x^F and C_{L-x}^B are not truly independent, because an incorrect forward path x̂_F may share some common nodes with an incorrect backward path x̂_B. Fortunately, the probability of this event gets smaller as the code memory m increases (it is of the same order as the probability of error, which is shown in Chapter 5 to decrease exponentially with m). Hence, by neglecting the above event, we can consider C_x^F and C_{L-x}^B essentially independent, and thus conjecture that Theorem 4.2 also applies to Algorithm TAmerge.

In Algorithm TTmerge, forward and backward merged paths may overlap each other, and hence the analysis of the computational distribution is even more difficult (if not impossible). However, as our computer simulation results suggest (see Figure 4.9 in section 4.6), the overlapping area is insignificant, especially when the code rate R <= R_comp. Thus, it seems reasonable to suggest that the computational performance of Algorithm TTmerge can be approximated by that of Algorithm TAmeet if the code rate R <= R_comp.

Finally, it is easy to see that the computational distribution of Algorithm HTTmerge is upper bounded by that of Algorithm TTmerge and lower bounded by that of Algorithm TAmeet. Clearly, Algorithm HTTmerge is closer to Algorithm TTmerge when the number of required matching information symbols m_h is close to the code memory m. Otherwise, it is more like Algorithm TAmeet.
This phenomenon is clearly observed in the computer simulations (see Figure 4.2 in section 4.6).

4.4 Lower Bound to Computational Distribution

Following Jacobs and Berlekamp [30], we show that, with the assumption of correct decoding, no algorithm of BSD type can have a computational distribution better than N^{-2α}.

Theorem 4.3: For all convolutional (or trellis) codes, any DMC, and with the assumption of correct decoding, the distribution P(C_L^BSD > N) of any BSD algorithm is lower bounded by

    P(C_L^BSD > N) >= 2^{2α} N^{-2α} exp{-O(√(ln N))},   (4.21)

where α is defined in the parametric equation (2.7).

Proof: Let us separate every received block into two almost equal parts, i.e., forward tree search operations will not go beyond forward tree level ⌊(L - m)/2⌋ and backward tree search will not go beyond backward tree level ⌈(L - m)/2⌉, with the help of a genie as defined by Jacobs and Berlekamp [30]. Notice that there are m branches in the codeword tree separating the portions considered by the genie-aided forward and backward decoders, allowing them to choose codeword paths independently. Let C_{⌊(L-m)/2⌋}^F and C_{⌈(L-m)/2⌉}^B denote the total number of computations required before first correctly decoding the ⌊(L - m)/2⌋ branches by the idealized forward decoder and the ⌈(L - m)/2⌉ branches by the idealized backward decoder, respectively. Obviously, under the assumption of correct decoding, C_L^BSD >= 2 min[C_{⌊(L-m)/2⌋}^F, C_{⌈(L-m)/2⌉}^B] for any BSD algorithm. Therefore, we can write

    P(C_L^BSD > N) >= P(2 min[C_{⌊(L-m)/2⌋}^F, C_{⌈(L-m)/2⌉}^B] > N) = P(C_{⌊(L-m)/2⌋}^F > N/2, C_{⌈(L-m)/2⌉}^B > N/2).   (4.22)

With the assistance of a genie, the forward and backward search areas do not overlap, i.e., the events (C_{⌊(L-m)/2⌋}^F > N/2) and (C_{⌈(L-m)/2⌉}^B > N/2) are independent for any DMC. Therefore, we finally get

    P(C_L^BSD > N) >= P(C_{⌊(L-m)/2⌋}^F > N/2) · P(C_{⌈(L-m)/2⌉}^B > N/2)
                   >= {(N/2)^{-α} exp[-O(√(ln(N/2)))]}^2 = 2^{2α} N^{-2α} exp[-O(√(ln N))].   (4.23)

Q.E.D.

Combining the lower bound (Theorem 4.3) and the upper bound (Theorem 4.2), we have demonstrated that the computational distribution of the proposed BSD is Pareto, and that its Pareto exponent lies between 2ρ_r and 2α. Since the Pareto exponent of USD is approximately ρ_r (or α), we conclude that the Pareto exponent of BSD is approximately twice that of USD.

4.5 Average Number of Computations

In this section, we examine the average number of computations of BSD and show that it is smaller than that of USD. The computational cutoff rate of BSD is then discussed. We define the average number of computations per decoded branch, denoted by C_b, as the total number of computations used to correctly decode one block divided by the block length L. Using Jacobs and Berlekamp's approximation [30] for the distribution of C_Λ^USD, which is given by

    P(C_Λ^USD > N) ≈ Λ N^{-α},   (4.24)

we can approximate the distribution of BSD as

    P(C_L^BSD > N) ≈ [(L/2)(N/2)^{-α}]^2 = 2^{2α-2} L^2 N^{-2α},   (4.25)

where we assume L >> m, and ignore the difference between ⌊(L - m)/2⌋ and ⌈(L - m)/2⌉, since both terms are almost equal to L/2 when L is large.

The total number of computations per decoded block of BSD can be any integer in the interval [L, M], where M is the maximum number of computations in BSD. Thus,

    E[C_b^BSD] = (1/L) Σ_{N=L}^{M} N P(C_L^BSD = N)
               = (1/L) Σ_{N=L}^{M} N [P(C_L^BSD > N - 1) - P(C_L^BSD > N)]
               = (1/L) [Σ_{N=L-1}^{M-1} (N + 1) P(C_L^BSD > N) - Σ_{N=L}^{M} N P(C_L^BSD > N)]
               = (1/L) Σ_{N=L}^{M-1} P(C_L^BSD > N) + P(C_L^BSD > L - 1) - (M/L) P(C_L^BSD > M).   (4.26)

In (4.26), P(C_L^BSD > L - 1) = 1 and P(C_L^BSD > M) = 0. Therefore, using (4.25), (4.26) becomes

    E[C_b^BSD] = 1 + (1/L) Σ_{N=L}^{M-1} P(C_L^BSD > N) ≈ 1 + 2^{2α-2} L Σ_{N=L}^{M-1} N^{-2α}.   (4.27)

Moreover, similar to [30], we can write

    Σ_{N=L}^{M-1} N^{-2α} ≈ (L^{1-2α} - M^{1-2α})/(2α - 1), for α ≠ 1/2;
    Σ_{N=L}^{M-1} N^{-2α} ≈ ln M - ln L, for α = 1/2.   (4.28)
Finally, using (4.28), we obtain

    E[C_b^BSD] ≈ 1 + (2^{2α-2}/(2α - 1)) (L^{2(1-α)} - L M^{1-2α}), for α ≠ 1/2;
    E[C_b^BSD] ≈ 1 + 2^{2α-2} L (ln M - ln L), for α = 1/2.   (4.29)

Similarly, one can get the average number of computations per decoded branch for USD as

    E[C_b^USD] ≈ 1 + (L^{1-α} - M_1^{1-α})/(α - 1), for α ≠ 1;
    E[C_b^USD] ≈ 1 + ln M_1 - ln L, for α = 1,   (4.30)

where M_1 is the maximum number of computations in USD.

From (4.29) and (4.30), it can be seen that E[C_b^BSD] is smaller than E[C_b^USD] for any α. However, the difference becomes less significant when α is much larger than one. This is because the average computation for both USD and the proposed BSD is essentially one computation per branch when α > 1. However, it is expected that the higher-order moments of the computation will be significantly improved by the proposed BSD even for high α.

Let us now discuss the computational cutoff rate. The computational cutoff rate of sequential decoding is defined as the supremum of rates for which the average computation per branch is bounded [30]. It is well known that, with the assumption of correct decoding, the computational cutoff rate R_comp of USD is equal to E_0(1) [30, 53]. R_comp is a truly critical channel parameter, and with the assumption of correct decoding the average number of computations per decoded branch of USD is unbounded if the code rate R > R_comp [30, 53]. In the following theorem, it is shown that the computational cutoff rate for BSD with the assumption of correct decoding is the same as that of USD.

Theorem 4.4: Consider an infinite sequence of convolutional (or trellis) codes {C_i} and any DMC. For code C_i, let R_i denote its rate and m_i its memory. Suppose each code has n symbols per branch in its codeword tree, and R_i >= R for all i, for some R > R_comp. Code C_i is used with block length L_i (including a known tail to terminate the encoder in a known state) for BSD. Equiprobable codewords and an arbitrary sequential decoder are assumed. Under the assumption of correct decoding (this assumption is equivalent to defining the number of computations to be infinite if BSD makes a decoding error), if L_i → ∞ and L_i >= (1 + ε_0) m_i for any i and some 0 < ε_0 <= 1, then the expected number of computations per branch satisfies

    E[C_b^BSD] → ∞.   (4.31)

Proof: See Appendix A. Q.E.D.

As a simple approach, from (4.29) one can actually show that E[C_b^BSD] → ∞ when L → ∞ and α < 1. However, it is important to notice that the above computational cutoff rate is based on the assumption of correct decoding. As suggested by Anderson [51], this assumption is unrealistic, and in fact there is no cutoff rate phenomenon for sequential decoding as long as P_e > 0. Nevertheless, the cutoff rate is a bound such that above this rate the average computation will increase exponentially. From (4.29) and (4.30), one can easily see that when α < 1, i.e., above the cutoff rate, E[C_b^BSD] is much smaller than E[C_b^USD], although it is also much larger than one. Hence, the useful range of code rates with the proposed BSD is increased compared to that with USD, according to Anderson's definition of a "useful decoder" [51].

4.6 Numerical and Simulation Results

Computer simulations have been performed in order to verify the above theoretical results. Non-systematic rate r = 1/2 (bits per channel symbol) convolutional codes were used. All algorithms were run over a BSC with a transition probability p under strictly identical conditions (i.e., using the same code and noise sequences), in order to make them comparable. In our computer simulations, metrics were scaled into integers.
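As an illustration of the integer scaling (the thesis does not list its scaled metric values; the Fano metric below is the customary choice for sequential decoding on a BSC):

    import math

    def fano_bit_metrics(p, R):
        hit  = math.log2(2 * (1 - p)) - R        # received bit agrees with the path
        miss = math.log2(2 * p) - R              # received bit disagrees
        return hit, miss

    hit, miss = fano_bit_metrics(0.0409, 0.5)    # about +0.44 and -4.11
    scale = 9 / -miss                            # choose the penalty to be -9
    print(round(hit * scale), round(miss * scale))   # scaled integers: +1, -9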
In Algorithm TAmeet, Algorithm TAmerge and Algorithm HTTmerge, the substack spacing was chosen equal to one, i.e., using unquantized stacks. Different substack spacing values, namely Δ = 1 and Δ = 7, were chosen in Algorithm TTmerge. Notice that there may exist more than one path in the highest non-empty substack even when Δ = 1.

Figure 4.2 shows the distributions of the total number of computations per decoded block of both the USD and BSD algorithms for the case p = 0.0409, corresponding to a Pareto exponent ρ_r = 1.1 (i.e., R_comp = 0.52 bits per channel symbol). A SBODP code of memory m = 23 was used, and 200,000 blocks each of length L = 400 bits (377 information bits) were simulated for each algorithm. The Pareto approximations for USD (stack algorithm, Δ = 1) and BSD (Algorithm TAmerge) are also indicated on the figure. As expected, the use of BSD results in a significant reduction of the computational variability of sequential decoding. More specifically, the slope of the straight line that approximates the tail of the distribution of Algorithm TAmerge is equal to 2.16, which is more than twice that of USD. This is in agreement with our derived upper bound on the computational distribution (Theorem 4.2). Moreover, it also agrees with the lower bound (Theorem 4.3), i.e., it is less than 2α. It can be noticed from Figure 4.2 that the slopes of the computational distributions of all BSD-merge algorithms are similar to that of Algorithm TAmeet, which suggests that the additional computations associated with the merging test are asymptotically unimportant compared to the computations of Algorithm TAmeet when the code rate R <= R_comp.

[Figure 4.2 Distribution of the total number of computations per decoded block for USD and the proposed BSD algorithms (ρ_r = 1.1).]

For comparison purposes, the computational distribution of Forney's scheme [22] built on the stack algorithm is also depicted in Figure 4.2. As we pointed out in Chapter 3, the total number of computations of Forney's scheme is equal to 2 min(C_L^F, C_L^B). Hence, the minimum number of computations with this scheme is equal to twice that of USD. Clearly, as indicated in the figure, Forney's scheme does not provide any significant advantage over conventional USD. This is mainly because the dip size in the correct path metric is essentially the same whether decoding is performed from the forward or the backward direction (see Figure 4.1). Thus, the number of computations, which is mainly determined by the dip size of the correct path metric, is basically the same for both forward and backward decoders.

Table 4.1 shows the average number of computations per decoded branch of the different algorithms. Clearly, the average number of computations is reduced by using the proposed BSD. Table 4.1 also shows the number of erasure blocks (undecoded blocks) with a maximum allowed computation C_lim = 8000. These results suggest that our BSD can significantly alleviate the erasure or overflow problem, which is the main obstacle in the application of sequential decoding. Furthermore, Table 4.1 shows that with an increase in computations of about 3% with Algorithm TAmerge compared to Algorithm TAmeet, decoding errors (18.3% of the blocks decoded by Algorithm TAmeet were in error) were practically eliminated. Thus, with the use of the merging test in BSD, decoding errors can be significantly reduced. For Algorithm HTTmerge, Figure 4.2 and Table 4.1 suggest that there exists a substantial computational gain by just letting m_h = m - 1 in the merging test.
Thus, Algorithm HTTmerge can easily provide a trade-off between computational effort and error performance by adjusting the number of matching information symbols m_h (for a more detailed analysis and computer simulation results on the error performance of the proposed BSD, refer to Chapter 5).

Table 4.1 Comparison of average computations for L = 400, C_lim = 8000 and p = 0.0409 (SBODP code, m = 23, ρ_r = 1.1, 2 × 10^5 runs)

Algorithm                                    Ave. comp. per branch   Erasure blocks   Error blocks
USD (unquantized stack algorithm, Δ = 1)     1.621                   1324             0
USD (quantized stack algorithm, Δ = 7)       1.776                   1743             0
Algorithm TAmeet (Δ = 1)                     1.264                   7                36614
Algorithm TAmerge (Δ = 1)                    1.303                   21               1
Algorithm TTmerge (Δ = 1)                    1.390                   84               1
Algorithm TTmerge (Δ = 7)                    1.393                   59               0
Algorithm HTTmerge (m_h = m - 1, Δ = 1)      1.333                   39               1
Algorithm HTTmerge (m_h = m - 2, Δ = 1)      1.319                   31               3

Figure 4.3 compares the computational distributions of Algorithm TAmerge for ODP, SBODP and SABODP codes, all of memory m = 23, using the same parameters as in Figure 4.2. It can be seen from this figure that the distribution of the number of computations per decoded block using a SBODP code is better than that with an ODP code. However, the distribution of the number of computations per decoded block using a SABODP code is almost identical to that with a SBODP code. With a SABODP code, no errors were found after decoding, because its free distance (equal to 24) is much larger than the free distance of the SBODP code (equal to 18). These simulation results are clearly in agreement with our theoretical observations.

Figure 4.4 shows the distributions of the total number of computations per decoded block of different algorithms using a systematic ODP code (m = 23). As expected, this figure shows that the computational distribution of BSD is almost the same as that of USD. This is due to the fact that the backward code of an ODP forward code is too poor, and hence the backward decoder can hardly ever help in decoding. Figure 4.4 also demonstrates that the number of computations needed in Forney's scheme is essentially equal to twice that of USD.

Figure 4.5 shows the distributions of the total number of computations per decoded block of both USD and our BSD algorithms for the case p = 0.0594, corresponding to a Pareto exponent ρ_r = 0.68 (i.e., R_comp = 0.44 bits per channel symbol). The same code as in Figure 4.2 was used again, and 10,000 blocks each of length L = 200 bits were simulated for each algorithm. The Pareto approximations of these distributions are also indicated on the figure. In Figure 4.5, the computational distribution of Algorithm TAmeet (a BSD-no-merge) clearly violates the lower bound given by Theorem 4.3, because there were too many decoding errors (44.1% of the decoded blocks were in error, as illustrated in Table 4.2; notice that the lower bound in Theorem 4.3 is based on the assumption of correct decoding). However, no errors were noticed after decoding by Algorithm TTmerge, while only one block was in error after decoding by Algorithm TAmerge. The slope of the straight line that approximates the tail of the computational distribution of Algorithm TAmerge is equal to 0.89, which is more than twice that of USD. Again, it agrees with our upper and lower bounds. Moreover, the slopes that approximate the computational distributions of Algorithm TTmerge and Algorithm TAmerge are almost the same.
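The quoted tail slopes are obtained by fitting a straight line to the empirical distribution on a log-log scale; a sketch of such a fit follows (an illustrative helper, not the thesis's own script; tail_points should lie inside the observed range of the samples):

    import math

    def pareto_slope(samples, tail_points):
        n = len(samples)
        xs, ys = [], []
        for N in tail_points:
            frac = sum(c > N for c in samples) / n
            if frac > 0:
                xs.append(math.log(N))
                ys.append(math.log(frac))
        mx = sum(xs) / len(xs)
        my = sum(ys) / len(ys)
        beta = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
               sum((x - mx) ** 2 for x in xs)
        return -beta                            # estimated Pareto exponent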
[Figure 4.3 Distribution of the total number of computations per decoded block for different codes, using the same parameters as in Figure 4.2.]

[Figure 4.4 Distribution of the total number of computations per decoded block for a systematic ODP code (ρ_r = 1.1).]

Table 4.2 compares the average number of computations of the different algorithms for the case p = 0.0594. The number of erasure blocks and the number of error blocks are also shown in the table. It can be seen that the number of erasure blocks is substantially reduced by using the proposed BSD. Also, by using the merging test, almost all errors associated with Algorithm TAmeet were avoided. Furthermore, by comparing Table 4.1 and Table 4.2, one can notice that the improvements in the average number of computations of our BSD over that of USD are more significant when the Pareto exponent ρ_r is less than one. As anticipated, the useful range of code rates [51] with our BSD is increased as compared to the case with USD. These observations agree with our suggestions made in section 4.5.

[Figure 4.5 Distribution of the total number of computations per decoded block for the case L = 200 and p = 0.0594, i.e., ρ_r = 0.68.]

Table 4.2 Comparison of average computations for L = 200, C_lim = 4000 and p = 0.0594 (SBODP code, m = 23, ρ_r = 0.68, 10^4 runs)

Algorithm                                    Ave. comp. per branch   Erasure blocks   Error blocks
USD (unquantized stack algorithm, Δ = 1)     4.561                   1175             0
USD (quantized stack algorithm, Δ = 7)       5.106                   1382             0
Algorithm TAmeet (Δ = 1)                     1.541                   0                4406
Algorithm TAmerge (Δ = 1)                    2.284                   166              1
Algorithm TTmerge (Δ = 1)                    2.799                   293              0
Algorithm TTmerge (Δ = 7)                    2.894                   306              0

Figure 4.6 shows the distributions of the total number of computations per decoded block of both USD and our BSD algorithms for the case p = 0.0289, corresponding to a Pareto exponent ρ_r = 1.5 (i.e., R_comp = 0.58 bits per channel symbol). The same code as in Figure 4.2 was used, and 5,000,000 blocks each of length L = 400 bits were run for each algorithm. Again, the slopes that approximate the computational distributions of the different BSD algorithms are quite similar. Once again, these simulation results are in agreement with our analytical results.

[Figure 4.6 Distribution of the total number of computations per decoded block for the case p = 0.0289, i.e., ρ_r = 1.5.]

Table 4.3 compares the average number of computations of the different algorithms for the case p = 0.0289. The average number of computations, the number of erasure blocks and the number of error blocks are also shown in Table 4.3. It can be noticed that BSD-merge eliminates all decoding errors associated with Algorithm TAmeet (4.7% error blocks). Moreover, the additional number of computations due to the merging test is only 0.5% with Algorithm TAmerge compared to Algorithm TAmeet. All results show that the additional computations due to the merging test are worth the reward, and that those computations can be ignored when the code rate R <= R_comp.

Table 4.3 Comparison of average computations for L = 400, C_lim = 4000 and p = 0.0289 (SBODP code, m = 23, ρ_r = 1.5, 5 × 10^6 runs)
Algorithm                                    Ave. comp. per branch   Erasure blocks   Error blocks
USD (unquantized stack algorithm, Δ = 1)     1.139                   575              0
USD (quantized stack algorithm, Δ = 7)       1.167                   4336             0
Algorithm TAmeet (Δ = 1)                     1.107                   5                233321
Algorithm TAmerge (Δ = 1)                    1.113                   11               0
Algorithm TTmerge (Δ = 1)                    1.122                   42               0
Algorithm TTmerge (Δ = 7)                    1.137                   15               0

Figure 4.7 shows the empirical probability masses of the merging points for Algorithm TAmerge and Algorithm TTmerge and that of the meeting point of Algorithm TAmeet when the code rate is below R_comp. The probability masses are almost identical, which confirms that the additional number of computations when the merging test is used is negligible when the code rate R < R_comp. Moreover, when the code rate is below R_comp (i.e., ρ_r > 1), the merging (or meeting) point is more likely to be near L/2, which means that the lower bound on the computational distribution (Theorem 4.3) is quite accurate in this case.

[Figure 4.7 Probability mass of merging and meeting when the code rate is below R_comp.]

Figure 4.8 shows the empirical probability masses of the merging points for Algorithm TAmerge and Algorithm TTmerge and that of the meeting point of Algorithm TAmeet, for the case ρ_r = 0.68. As can be seen, the probability mass of the merging point with Algorithm TAmerge is quite similar to that with Algorithm TTmerge. However, unlike the above case (i.e., R < R_comp), here the probability mass of the meeting point starts to differ from that of the merging point. This suggests that the additional number of computations needed for merging from the first meeting point increases as the code rate increases beyond R_comp.

[Figure 4.8 Probability mass of merging and meeting when the code rate is above R_comp.]

Figure 4.9 shows the empirical distributions of the overlapping lengths in Algorithm TAmerge and Algorithm TTmerge, where the overlap is defined as O_v = l_F + l_B - L at the end of decoding. O_v represents the portion of the tree over which the forward search and the backward search overlap each other. In Figure 4.9, the block length is L = 200 for the case ρ_r = 0.68, and L = 400 for ρ_r = 1.1 and ρ_r = 1.5. As expected, the distribution of the overlapping length decays faster as the channel becomes less noisy. Figure 4.9 suggests that the distribution of the overlapping length decays in an exponential fashion. This is because O_v is essentially the sum of the deepest depths reached by the forward and backward decoders, and it is known that the distribution of the depth x of an incorrect path explored by a sequential decoder decreases exponentially with x [4, 38].

[Figure 4.9 Empirical distributions P(O_v > x) of the overlapping length in Algorithm TAmerge and Algorithm TTmerge.]
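The exponential decay can be checked directly from simulated overlaps (an illustrative helper, not from the thesis): if P(O_v > x) is exponential, the log-tail returned below is roughly linear in x.

    import math

    def overlap_tail_log(overlaps, xs):
        """Empirical log-tail of O_v = l_F + l_B - L over the thresholds xs."""
        n = len(overlaps)
        out = []
        for x in xs:
            k = sum(o > x for o in overlaps)
            if k:
                out.append((x, math.log(k / n)))
        return out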
Table 4.4 Comparison of computational and error performances for different m with the new definition of one computation (SABODP codes; Pr = 1; L = 400; 2×10^5 runs).

  Algorithm                                           Ave. comp./branch   Erasure blocks   Error blocks
  USD (unquantized stack algorithm, Δ = 1, m = 12)    1.571               2761             100
  USD (quantized stack algorithm, Δ = 1, m = 16)      1.564               2793             11
  USD (quantized stack algorithm, Δ = 1, m = 23)      1.530               2595             0
  Algorithm TTmerge (Δ = 1, m = 12)                   0.710               42               383
  Algorithm TTmerge (Δ = 1, m = 16)                   0.707               73               27
  Algorithm TTmerge (Δ = 1, m = 23)                   0.700               77               0

Moreover, the number of erasure blocks, the number of error blocks, and the average number of computations (using the new definition of one computation with BSD) are shown in Table 4.4. Similar to USD, the number of computations of BSD is essentially not related to m. However, the error performance is improved with increasing m.

Figure 4.10 Distribution of the total number of computations for different m with the new definition of one computation (unidirectional stack algorithm (Δ = 1) and Algorithm TTmerge (Δ = 1); SABODP codes, m = 12, 16, 23).

In summary, we have demonstrated that the use of the proposed BSD can reduce both the computational variability and the average number of computations of sequential decoding. It should be pointed out at this stage that in our comparison of BSD-merge algorithms with USD, we did not take into account the additional processing time needed to perform the merging test in BSD-merge. However, one can use extra processors to perform this task so that it would not slow down the node extension operations of sequential decoding. Therefore, it seems that the additional complexity due to the merging test is unimportant compared to the significant improvements in computational performance obtained by using the proposed BSD.
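For reference, the Pareto exponent used throughout this chapter is the solution δ of E_0(δ) = δR, and R_comp = E_0(1). The sketch below (not thesis code) computes both for a BSC; the rate R = 1/2 and the crossover probabilities are assumptions chosen to match the simulation parameters quoted above, and they reproduce Pr ≈ 0.68 for p = 0.0594, and R_comp ≈ 0.58 with Pr ≈ 1.5 for p = 0.0289.

```python
import math

def E0(rho, p):
    """Gallager function E0(rho), in bits, for a BSC with crossover
    probability p and equiprobable inputs."""
    s = p ** (1.0 / (1.0 + rho)) + (1.0 - p) ** (1.0 / (1.0 + rho))
    return rho - (1.0 + rho) * math.log2(s)

def pareto_exponent(R, p):
    """Solve E0(rho) = rho * R for rho by bisection on (0, 10]."""
    lo, hi = 1e-6, 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if E0(mid, p) - mid * R > 0:
            lo = mid          # exponent still above the line: increase rho
        else:
            hi = mid
    return 0.5 * (lo + hi)

for p in (0.0594, 0.0289):
    print(f"p = {p}: Rcomp = {E0(1.0, p):.3f} bits,"
          f" Pr = {pareto_exponent(0.5, p):.2f}")
# p = 0.0594 gives Pr ~ 0.68; p = 0.0289 gives Rcomp ~ 0.58 and Pr ~ 1.5.
```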
Chapter 5
Error Performance of BSD

In this chapter, the error performance of the proposed BSD algorithms is analyzed through the random coding argument. We assume that the channel is a DMC and that BSD is based on the stack algorithm. However, all results obtained in this chapter can also be applied to arbitrary BSD sequential decoders. It is obvious that the error performance of any BSD is lower bounded by that of Viterbi decoding, since sequential decoding is suboptimum. Thus, we only need to develop upper bounds on the error performance of the proposed BSD. First, the basic elements of the error performance of USD are summarized. Then, we deal with the probability of decoding error of BSD-merge (Algorithm TAmerge and Algorithm TTmerge). Finally, the error performance of BSD-no-merge (Algorithm TAmeet) is discussed. It is found that for an ensemble of linear trellis codes the error performance of BSD-merge is asymptotically the same as that of USD, while the bit error probability of Algorithm TAmeet satisfies the random coding bound for block codes.

5.1 Error Performance of USD

The basic results on the error performance of USD are summarized in the following. Let x(j) denote the correct path from the root node to level j. Let x̂_i(t) denote an incorrect path which diverges from the correct path x at level i and remerges at level (i + t). A proto-error event, as defined by Forney [36], is an event for which some incorrect path x̂_i(t) has a metric M[x̂_i(t)] at the point of convergence with the correct path x such that

    M[x̂_i(t)] ≥ min_{i≤j≤i+t} M[x(j)].    (5.1)

By definition, no error can occur without a proto-error event, whereas a proto-error event may not necessarily lead to a decoding error for a unidirectional unquantized stack algorithm [36]. Furthermore, for an incorrect path to remerge with the correct path, t must be no less than the code constraint length K. For an ensemble of linear trellis codes, the block error probability P_e^USD and the bit error rate P_b^USD with a unidirectional unquantized stack algorithm can be bounded by [6, 12, 36]

    P_e^USD ≤ A_ε e^{−nK E_0(1)},    0 ≤ R ≤ (1 − ε)R_comp,
    P_e^USD ≤ A_ε e^{−nK E_0(δ)},    (1 − ε)R_comp ≤ R ≤ (1 − ε)C_∞,    (5.2)

and

    P_b^USD ≤ A_ε e^{−nK E_0(1)},    0 ≤ R ≤ (1 − ε)R_comp,
    P_b^USD ≤ A_ε e^{−nK E_0(δ)},    (1 − ε)R_comp ≤ R ≤ (1 − ε)C_∞,    (5.3)

where A_ε is a finite constant (the detailed form of the prefactors is given in [6, 12, 36]), ε is a positive number, R is the code rate as defined in (2.2), E_0(δ) is the Gallager function [5], C_∞ is the channel capacity, and for R_comp < R < C_∞, δ ∈ [0, 1] is the solution of

    R = (1 − ε) E_0(δ)/δ,    (5.4)

while for 0 ≤ R ≤ R_comp, δ = 1.

Now, define a proto-error event with a bias B as the event for which some incorrect path x̂_i(t) has a metric M[x̂_i(t)] at the point of convergence with the correct path x such that

    M[x̂_i(t)] ≥ min_{i≤j≤i+t} M[x(j)] − B,    (5.5)

where B is a positive number. Thus, for the unidirectional quantized stack algorithm with a substack spacing Δ, no error can occur without a proto-error event with a bias B = Δ. Hence, it can be shown that the error performance of the unidirectional quantized stack algorithm is asymptotically the same as that of an unquantized one, but with an insignificant increase factor of A_Δ = e^{σδΔ} in (5.2) and (5.3) [6, 36].

5.2 Error Performance of BSD-merge

In this section, the error performance of Algorithm TAmerge using unquantized stacks is analyzed in detail. The results are then extended to other BSD algorithms. Although the following analysis assumes unquantized stacks, it can easily be generalized to the case of quantized stacks, in a similar way as for USD [6, 36].

Let S_mg denote the common encoder state of the forward and backward portions of the merged path at the merging point l_0. Obviously, S_mg can be either correct or incorrect. Figures 5.1 and 5.2 show some typical situations in BSD where the common encoder state S_mg is correct and incorrect, respectively. As indicated in Figure 5.1, there is always a possibility that more than one forward (backward) path may merge with more than one path in the opposite direction at encoder state S_mg. Should this occur, Algorithm TAmerge always chooses the one (perhaps not the correct one) with the highest cumulative metric. In any case, the encoder state S_mg at the merging point l_0 in Figure 5.1 is assumed to be the correct one. In Figure 5.2, however, S_mg is an incorrect encoder state. In the following, we divide the derivation of the block and bit error probabilities into three steps to prove that the error performance of Algorithm TAmerge is asymptotically the same as that of USD. In the first step, we show that if the encoder state S_mg is correct, decoding errors can be upper-bounded by the proto-error event with a bias λ, where λ is the largest possible one-branch metric drop. In step 2, we investigate the probability that the encoder state S_mg is incorrect. Finally, the overall error performance is derived in step 3.
Figure 5.1 Example of error events with a correct merging encoder state.

Figure 5.2 Example of error events with an incorrect merging encoder state.

Before investigating the above situations in detail, we need the following lemmas. Consider an incorrect path x̂_i(t) which diverges from the correct path x at level i and stretches out t branches from the correct node at level i. First of all, an incorrect path will be extended by a sequential decoder using an unquantized stack algorithm only if [6, 36]

    M[x̂_i(t)] ≥ min_j M[x(j)].    (5.6)

Thus, one can immediately write the following lemma.

Lemma 5.1: An incorrect path x̂_i(t) (either in forward decoding or in backward decoding), which diverged from the correct path at level i and remained unmerged up to level (i + t), may be on the top of the stack only if

    M[x̂_i(t)] ≥ min_{i≤j≤i+h} M[x(j)],    (5.7)

where (i + h) denotes the farthest level of the correct path reached by the unquantized stack algorithm.

Furthermore, notice that for any path present at any position in the stack, the parent node of that path must have been at the top of the stack at an earlier time. Thus, we can write the following lemma.

Lemma 5.2: An incorrect path x̂_i(t) (either in forward decoding or in backward decoding), which diverged from the correct path at level i and remained unmerged up to level (i + t), could be in the stack only if

    M[x̂_i(t)] ≥ min_{i≤j≤i+h} M[x(j)] − λ,    (5.8)

where (i + h) denotes the farthest level of the correct path reached by the unquantized stack algorithm.

We now proceed with the derivation of the error probability of Algorithm TAmerge.

Step 1: Analyze the error performance of Algorithm TAmerge given that the encoder state S_mg is correct. According to the operations of Algorithm TAmerge, merging can occur at either step 3 or step 6. Without loss of generality, we suppose that decoding is stopped at step 3. This means that the end node of the backward portion of the decoded path at the merging level l_0 is at the top of the BS, whereas the end node of the forward portion of the decoded path could be at any position in the FS. As indicated in Figure 5.1, a decoding error in this case is clearly upper-bounded by proto-error events.

More specifically, let us suppose that there exists a forward incorrect path that merges with a backward incorrect path which is at the top of the BS. Clearly, decoding errors in the backward portion of the decoded path belong to proto-error events without a bias. Furthermore, according to Lemma 5.2, the metric of the forward incorrect path in the FS cannot be lower than the minimum cumulative metric of the forward correct path minus λ. Hence, the decoding error event near the merging point l_0 made by the forward decoder is bounded by the proto-error event with a bias B = λ. Clearly, any other decoding errors (if any) in the forward decoder can be upper-bounded by the proto-error event without a bias. In summary, all error events in Algorithm TAmerge when the encoder state S_mg is correct can be upper-bounded by the proto-error events with a bias B = λ.
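The metric bookkeeping behind Lemmas 5.1 and 5.2 amounts to a single comparison. The fragment below only illustrates the proto-error-event test of (5.5); the function name and the metric values are hypothetical, not from the thesis:

```python
def proto_error_event(incorrect_metric, correct_metrics, bias=0.0):
    """Test the proto-error-event condition with bias B (eq. 5.5):
    M[xhat(t)] >= min_j M[x(j)] - B, where correct_metrics holds the
    cumulative metrics M[x(i)], ..., M[x(i+t)] of the correct path."""
    return incorrect_metric >= min(correct_metrics) - bias

# Toy numbers: an incorrect path staying within one branch-metric drop
# (bias = lam) of the correct path's minimum triggers the event.
lam = 2.5   # largest possible one-branch metric drop
print(proto_error_event(-4.0, [-1.0, -3.0, -5.0], bias=lam))  # True
print(proto_error_event(-9.0, [-1.0, -3.0, -5.0], bias=lam))  # False
```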
Thus, we can write the following lemma.

Lemma 5.3: If in the merging test the encoder state S_mg is correct, decoding error events in Algorithm TAmerge can be upper-bounded by the proto-error events with a bias λ, which is asymptotically the same as that of USD.

Step 2: Investigate the probability that the encoder state S_mg is incorrect. Since the encoder state S_mg is incorrect, a decoding error must exist around the merging point l_0 (see Figure 5.2). Let us define this decoding error event between the last diverging points of the correct forward and backward paths as the merging error event (see Figure 5.3). Clearly, only one such event can occur within each decoded block. Let P^Merge denote the probability of the merging error event. Obviously, outside the merging error event, everything is the same as in USD, as illustrated in Figure 5.2.

Figure 5.3 Illustration of the merging error event.

In Figure 5.3, we assume that forward and backward incorrect paths diverge from the correct path at forward tree level i and backward tree level j, respectively. As indicated in the figure, t1 is the length of the forward portion of the decoded path from level i, and t2 is the length of the backward portion of the decoded path from level j. Let the ensemble of all possible forward incorrect paths which diverge from the correct path at forward tree level i be denoted by X̂_F(i). Similarly, let X̂_B(j) denote the ensemble of all possible backward incorrect paths which diverge from the correct path at backward tree level j. Let x̂_{F,i}(t1) denote a forward incorrect path in X̂_F(i) whose end node is at forward tree level (i + t1). Analogously, let x̂_{B,j}(t2) denote a backward incorrect path in X̂_B(j) whose end node is at backward tree level (j + t2). According to Property 3.3, t1 + t2 = L − j − i and the merging point l_0 = i + t1 = L − (j + t2), since the forward and backward portions of the decoded path will never overlap. Let (i + h1) denote the farthest depth of the forward correct path reached by the forward decoder. Similarly, let (j + h2) denote the farthest depth of the backward correct path reached by the backward decoder. Obviously, h1 + h2 ≥ t1 + t2 according to Property 3.3.

Clearly, not all paths in X̂_F(i) and X̂_B(j) can pairwise merge. Let X̂(i; j) contain all paths in X̂_F(i) and X̂_B(j) which are pairwise merging paths. It is obvious that X̂(i; j) can be viewed as the set of all incorrect paths which diverge from and remerge with the correct path at forward tree levels i and L − j. Define I(i; j) as the total number of pairwise merging incorrect paths contained in the set X̂(i; j). Hence, t1 + t2 ≥ K, and I(i; j) ≤ (u − 1)u^{L−j−i−K} = (u − 1)u^{t1+t2−K} [6]. Now, let P^Merge(i, t1, h1, t2, h2) be the probability of the merging error event as shown in Figure 5.3. Note that if the parameters i, t1, t2 are given, the parameter j is also determined, since j = L − (i + t1 + t2). We will demonstrate that the upper bound on P^Merge(i, t1, h1, t2, h2) is only related with (t1 + t2) and (h1 + h2) after averaging over the ensemble of linear trellis codes.

Lemmas 5.1 and 5.2 are all that we need to determine the upper bound on the probability of the merging error event, and the method used in our derivation follows closely that of Viterbi and Omura [6].
Also, it is important to notice that the conditions in Lemmas 5.1 and 5.2 are only necessary, not sufficient.

Lemma 5.4: Given i, t1, h1, t2, h2, the probability of the merging error event in Algorithm TAmerge is upper-bounded by

    P^Merge(i, t1, h1, t2, h2) ≤ e^{λ} Σ_y P(y|x) e^{−min_{i≤τ≤i+h1} M[x_F(τ)]} e^{−min_{j≤τ≤j+h2} M[x_B(τ)]}
        × Σ_{(x̂_{F,i}(t1), x̂_{B,j}(t2)) ∈ X̂(i;j)} e^{M[x̂_{F,i}(t1)]} e^{M[x̂_{B,j}(t2)]},    (5.9)

where j = L − (i + t1 + t2).

Proof: Without loss of generality, we assume that decoding is stopped at step 3 in Algorithm TAmerge; that is, merging occurs between a top path x̂_B(t2) in the BS and a path x̂_F(t1) which could be at any position in the FS. According to Lemmas 5.1 and 5.2, we can write

    M[x̂_{F,i}(t1)] − min_{i≤τ≤i+h1} M[x_F(τ)] + λ ≥ 0,    (5.10)

together with the corresponding condition for the backward path. Bounding the indicator functions of these events by the corresponding exponentials and summing over all merging pairs in X̂(i; j) yields (5.9). Q.E.D.

Lemma 5.5: For an ensemble of linear trellis codes,

    P̄^Merge(i, t1, h1, t2, h2) ≤ e^{σδλ} (e^{nR} − 1)^δ β^{n(h1+h2)} β^{n(t1+t2)} e^{−nKδR},    (5.24)

where β = β(σ, δ) < 1, σ ≥ 0, δ ∈ [0, 1], t1 + t2 ≥ K, h1 ≥ 1, h2 ≥ 1, and h1 + h2 ≥ t1 + t2.

Proof: According to Lemma 5.4, and letting δ ∈ [0, 1] as in [4, 6, 36, 54], raising the sum over merging pairs to the power δ and using (Σ_k a_k)^δ ≤ Σ_k a_k^δ, we can write

    P̄^Merge(i, t1, h1, t2, h2) ≤ e^{σδλ} Σ_y P(y|x) [Σ_{τ=i}^{i+h1} e^{−σδM[x_F(τ)]}] [Σ_{τ=j}^{j+h2} e^{−σδM[x_B(τ)]}]
        × Σ_{(x̂_{F,i}(t1), x̂_{B,j}(t2)) ∈ X̂(i;j)} e^{σδM[x̂_{F,i}(t1)]} e^{σδM[x̂_{B,j}(t2)]}
    ≤ e^{σδλ} (u − 1)^δ u^{(t1+t2−K)δ} Σ_y P(y|x) [Σ_{τ=i}^{i+h1} e^{−σδM[x_F(τ)]}] [Σ_{τ=j}^{j+h2} e^{−σδM[x_B(τ)]}]
        × E{e^{σM[x̂_{F,i}(t1)]}}^δ E{e^{σM[x̂_{B,j}(t2)]}}^δ.    (5.25)

Obviously, (5.25) is independent of the indexes i and j due to the ensemble average. For the sake of simplicity, let i = j = 0 in the above summations, and omit the subscripts i and j in (5.25). Also, note that there is no difference between the ensemble averages of the forward and backward codes. Thus

    P̄^Merge(i, t1, h1, t2, h2) ≤ e^{σδλ} (u − 1)^δ u^{(t1+t2−K)δ} T(h1, t1) T(h2, t2),    (5.26)

where

    T(h, t) = Σ_y Σ_{x(h)} q[x(h)] P(y|x(h)) e^{−σδM[x(h)]} ( Σ_{x̂(t)} q[x̂(t)] e^{σM[x̂(t)]} )^δ,    (5.27)

and where x(h) and x̂(t) are codeword segments of h and t branches, respectively. Notice the similarity between equation (5.27) and equation (6.2.9) in Viterbi and Omura [6]. Hence, we can write T(h, t) as

    T(h, t) ≤ exp n{(h − t)[f1(σ, δ) + δR] + t f2(σ, δ)},    t ≤ h,
    T(h, t) ≤ exp n{(t − h)[f3(σ, δ) − δR] + h f4(σ, δ)},    t ≥ h.    (5.28)

Substituting (5.21) through (5.23) into (5.28), we obtain

    T(h, t) ≤ β^{nh} β^{nt} e^{−ntδR}.    (5.29)

Finally, substituting (5.29) into (5.26), and noting that u = e^{nR} by the definition of the code rate in (2.2), we obtain (5.24). Q.E.D.

In the following, we show that for an ensemble of linear trellis codes P^Merge and P_b^Merge have the same exponential exponent as P_e^USD and P_b^USD, but with an insignificant increase factor of e^{σδλ} at most.

Theorem 5.1: For an ensemble of linear trellis codes, the probability of the merging error event is upper-bounded by

    P̄^Merge ≤ A7 e^{−nKR},        0 ≤ R ≤ R_comp,
    P̄^Merge ≤ A7 e^{−nKE_0(δ)},   R_comp ≤ R ≤ C_∞,    (5.30)

where

    A7 = (e^{nR} − 1)    (5.31)

is a finite constant, and for R_comp ≤ R ≤ C_∞, δ ∈ [0, 1] is the solution of

    R = E_0(δ)/δ,    (5.32)

while for 0 ≤ R ≤ R_comp, δ = 1.

Proof: The probability of the merging error event without any condition is given by

    P̄^Merge = Σ_{∀ i, t1, h1, t2, h2} P̄^Merge(i, t1, h1, t2, h2).    (5.33)

According to Lemma 5.5, given the parameters i, t1, h1, t2, and h2, the ensemble average probability of the merging error event is upper-bounded by

    P̄^Merge(i, t1, h1, t2, h2) ≤ e^{σδλ} (e^{nR} − 1)^δ β^{n(h1+h2)} β^{n(t1+t2)} e^{−nKδR},    (5.34)

where σ ≥ 0, δ ∈ [0, 1], t1 + t2 ≥ K, h1 ≥ 1, h2 ≥ 1, and h1 + h2 ≥ t1 + t2. Summing (5.34) over all admissible parameters, and noting that β < 1 so that the geometric sums converge, yields (5.30). Q.E.D.

A parallel development for the bit error probability (Theorems 5.2 and 5.3) bounds the expected number of bit errors and leads to a bound whose right-hand side ends in

    e^{σδλ} Σ_τ E[n_b(τ)].    (5.45)

Notice that, aside from the factor e^{σδλ}, the right side of (5.45) is the upper bound on the bit error probability of a unidirectional unquantized stack algorithm [6, pp. 312]. As discussed earlier, e^{σδλ} has the same effect as using a unidirectional quantized stack algorithm with Δ = λ. Q.E.D.

Similar to the proof of Theorem 5.2, one can write

    P(A) = P(A|B)P(B) + P(A|B′)P(B′)
         = P(A|B) + P(B′)[P(A|B′) − P(A|B)],    (5.46)

where B still denotes the event that the encoder state S_mg is correct.
Now, let P(A) be the bit error probability with Algorithm TAmerge. Then P(A|B) is the bit error rate of Algorithm TAmerge given that the encoder state S_mg is correct, and P(A|B′) the bit error rate given that S_mg is incorrect.

In (5.46), P(B′) is the probability that S_mg is incorrect, which is upper-bounded by Theorem 5.1. Furthermore, according to Lemma 5.3, P(A|B) follows the same upper bound as for a unidirectional quantized stack algorithm with Δ = λ. Similarly, according to Theorem 5.3, P(A|B′) also follows the same upper bound as for a unidirectional quantized stack algorithm with Δ = λ. Although P(A|B) ≠ P(A|B′) in general, we can still conclude that the second term in (5.46), compared to the first term, is asymptotically unimportant. Thus, the following corollary follows.

Corollary 5.2: For an ensemble of linear trellis codes, the bit error probability of Algorithm TAmerge is asymptotically upper-bounded by the same random coding bound (5.3) as that of USD.

Simulation results comparing the error and computational performances of the various algorithms are summarized in Table 5.2.

Table 5.2 Error and computational performance comparison (SABODP code, m = 10; Pr = 0.68; L = 200; 5×10^… runs).

  Algorithm                                    BER       Error blocks   Erasure blocks   Max. comp.   Ave. comp./branch
  USD (unquantized stack algorithm, Δ = 1)     1.99e-2   413            4                2e5          18.434
  USD (quantized stack algorithm, Δ = 14)      3.88e-2   770            6                2e5          32.514
  Algorithm TAmeet (Δ = 1)                     4.88e-2   3047           0                2545         1.559
  Algorithm TAmerge (Δ = 1)                    1.58e-2   591            0                3095         1.898
  Algorithm TTmerge (Δ = 1)                    9.27e-3   300            3                2e4          3.026
  Algorithm TTmerge (Δ = 14)                   2.91e-2   871            0                19107        4.144
  Algorithm HTTmerge (mh = m − 1, Δ = 1)       1.39e-2   515            0                6758         2.108
  Algorithm HTTmerge (mh = m − 2, Δ = 1)       1.70e-2   716            0                4258         1.938

As the table indicates, a large substack spacing Δ in Algorithm TTmerge does not benefit either the error performance or the average computations. Thus, a small value of Δ seems to be a better choice for the overall performance of Algorithm TTmerge as well as Algorithm HTTmerge.

Finally, we examine the effect of the block length L on the error performance of the proposed BSD algorithms. Simulation results of the error performance of Algorithm TAmeet and Algorithm TAmerge are shown in Table 5.3 and Table 5.4, respectively. It can be noticed from Table 5.3 that the BER of Algorithm TAmeet improves slightly as L increases, which confirms the findings in section 5.3. The BER of all algorithms with a merging test is not sensitive to L. However, as expected, the block error probability increases with increasing L, for all the algorithms under study.

Table 5.3 Error performance comparison of Algorithm TAmeet (Δ = 1) for different block lengths L (SABODP code, m = 10; Pr = 1.1; 5×10^… runs).

                          L=100     L=200     L=300     L=400     L=500
  Bit error rate (BER)    1.03e-2   9.84e-3   8.90e-3   8.09e-3   7.56e-3
  # of error blocks       8869      11944     13482     14552     15348
  # of erasure blocks     0         0         0         0         0
  Max. # of comp.         583       2805      5021      9278      11200
  Ave. comp. / branch     1.142     1.202     1.244     1.274     1.302

Table 5.4 Error performance comparison of Algorithm TAmerge (Δ = 1) for different block lengths L (SABODP code, m = 10; Pr = 1.1; 5×10^… runs).

                          L=100     L=200     L=300     L=400     L=500
  Bit error rate (BER)    3.07e-4   4.62e-4   5.50e-4   6.08e-4   6.52e-4
  # of error blocks       151       409       664       962       1256
  # of erasure blocks     0         0         0         0         0
  Max. # of comp.         1133      3074      6465      9364      11362
  Ave. comp. / branch     1.189     1.247     1.283     1.310     1.335

Tables 5.3 and 5.4 indicate that Algorithm TAmerge and Algorithm TAmeet have basically the same maximum and average number of computations when the code rate is lower than R_comp. This argument is especially true when L is not too small. Thus, the additional computations introduced by the merging test are asymptotically insignificant, and are worth the improvements in error performance. Moreover, the maximum and average computations are much smaller than those of USD for all values of L. In other words, the number of erasure blocks with the proposed BSD is much smaller than that of USD under the same maximum number of computations.
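The benefit of the merging test can be read directly off Tables 5.3 and 5.4; the snippet below simply recomputes the BER ratio of Algorithm TAmeet to Algorithm TAmerge from the tabulated values:

```python
# BER values transcribed from Tables 5.3 and 5.4 (SABODP code, m = 10).
L_values = [100, 200, 300, 400, 500]
ber_tameet = [1.03e-2, 9.84e-3, 8.90e-3, 8.09e-3, 7.56e-3]
ber_tamerge = [3.07e-4, 4.62e-4, 5.50e-4, 6.08e-4, 6.52e-4]

for L, a, b in zip(L_values, ber_tameet, ber_tamerge):
    print(f"L = {L}: merging test lowers BER by a factor of {a / b:.1f}")
# The factor shrinks from ~34 at L = 100 to ~12 at L = 500, since TAmeet
# improves slightly with L while TAmerge degrades slightly, as noted above.
```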
In otherwords, the number of erasure blocks with the proposed BSD is much smaller than thoseof USD under the same maximum number of computations.114Chapter 6Bidirectional Multiple Stack AlgorithmThe main disadvantage of sequential decoding algorithms is their inherent overflowproblem. The number of computations in sequential decoding is a random variable witha Pareto distribution. Hence, there is always a small fraction of blocks which require aninfeasible number of computations and thus cannot be completely decoded. This ratio ofundecodable blocks is often called the erasure probability Per. Several known techniqueshave been proposed to alleviate the overflow problem of sequential decoding [14—16,18, 19, 58J,In Chapter 3, we proposed BSD algorithms which were shown to significantlyreduce the erasure probability. The proposed BSD algorithms improve the computationalperformance by approximately doubling the Pareto exponent of conventional sequentialdecoding techniques. However, the decoding effort is still a random variable with aPareto distribution, and the overflow or erasure problem is still not eliminated.Chevillat and Costello [18] proposed an interesting decoding algorithm, called multiple stack algorithm (MSA), that allows erasure-free decoding. The basic idea is to makeuse of higher order stacks for blocks that require an excessive number of computationsto be decoded. However, to achieve a good error performance with the MSA, a largeamount of memory is needed.In this chapter we propose to modify the MSA to accommodate bidirectional treesearch [29], as described in Chapter 3. We refer to this decoding algorithm as bidirectionalmultiple stack algorithm (BMSA). It is found by analysis as well as computer simulations115Chapter 6 Bidirectional Multiple Stack Algorithmthat the use of bidirectional decoding with the MSA improves the performance in termsof computational effort, memory requirements and decoding errors.First, the basic elements of the conventional MSA are reviewed briefly. The BMSAis then discussed in detail and its performance is evaluated by analysis and computersimulations. Finally, different decoding algorithms, including the MSA, the new BMSAand the Viterbi algorithm, are compared on the basis of their memory requirements, biterror performances and computational efforts.61 The Multiple Stack Algorithm (MSA)The multiple stack algorithm (MSA) eliminates the overflow problem entirely atthe expense of a substantial increase in stack and buffer storage requirements. It is amodification of the stack algorithm which employs several stacks to alleviate the problemof stack overflow. The basic functions of the MSA are illustrated in Figure 6.1. Initially,the MSA decoder operates as a conventional stack algorithm. If a terminal node inthe tree is reached before the first (primary) stack with size Zfvls is filled, decoding iscomplete, and the decoded sequence is identical to that of a conventional stack algorithm.However, if the first stack is filled and an extensive search is needed, the top T nodes ofthe first stack are transferred to the second stack and decoding resumes using only the Ttransferred nodes. If the end of the tree is reached before this stack is full, the terminalnode is accepted as a tentative decision and is stored in a tentative register. 
The major difference between the single stack algorithm and the multi-stack algorithm is well described by their names. The single stack algorithm stops the decoding process when the stack is full and erases the undecoded block. The MSA continues decoding in secondary stacks to reach a decision. When a tentative decision is made in one of the secondary stacks, and the computational limit C_lim has not been reached, the MSA returns to previous stacks to refine the tentative decision. Decoding stops either after satisfactorily achieving first-stack decoding or by reaching the computational limit.

In [18], the block error probability of the MSA is upper-bounded by

    P_e^MSA ≤ P_e^USD + P_1^USD,    (6.1)

where P_e^USD is the block error probability of the single stack algorithm with an infinite stack size, and P_1^USD is the probability of the first stack overflow. It is shown in [18] that

    P_1^USD ≤ A_9 (Z_1 − 1)^{−Pr},    (6.2)

where A_9 is a constant and Pr is the Pareto exponent of USD as defined in Chapter 2. Similarly, the bit error probability of the MSA can be upper-bounded by

    P_b^MSA ≤ P_b^USD + P_1^USD.    (6.3)

The distribution of the total number of computations C_f^MSA required for the final decision is then

    P(C_f^MSA > N) ≤  P(C^USD > N),   N ≤ Z_1^MSA,
                      P_1^USD,        Z_1^MSA < N ≤ C_lim,
                      0,              N > C_lim.    (6.4)

That is, the distribution follows that of USD until the first stack fills; it then stays essentially at the floor P_1^USD until the computational limit C_lim is reached, beyond which it becomes zero.

The behavior of the first and last segments of the final computational distribution is obvious. The behavior of the second segment (i.e., Z_1^MSA < N ≤ C_lim) can be explained as follows. Whenever the first stack fills up, the algorithm transfers T nodes to the second stack. The transferred nodes may either include the correct one, or they can all be incorrect nodes. In the first case, the correct node is amongst the transferred nodes in the secondary stacks. As described before, the decoding process is stopped only if a tentative decision is reached in the first stack, or if the computational limit is reached. Since the current decoding path is only accepted as a tentative decision, the MSA will not stop decoding even after the correct final node is located in a secondary stack. Let us assume that the correct final node is in the i-th (i > 1) stack. The MSA empties the i-th stack and returns to the (i − 1)-th stack.
Since the correct node is removed from the i-th stack, there is no correct node in any of the stacks.17 It is therefore unlikely to return to the first stack and reach a tentative decision. In other words, with a high probability, decoding will not stop until the computational limit is reached.

Next, assume that all the transferred T nodes are incorrect. Since there is no correct node in the second stack, it will take many computations to find one tentative decision, and it is therefore very likely to require higher order stacks. Assume that the tentative decision is made in the i-th (i > 1) stack. The MSA empties the i-th stack, returns to the (i − 1)-th stack, and tries to find another decision. After many computations, if the computational limit is not yet reached, the MSA will finally return to the first stack. However, the MSA can only extend T nodes before the first stack fills up again. This implies that there is a very small chance that a tentative decision can be reached within the T extensions in the first stack. If the first stack fills up again, another T nodes will be transferred to the second stack. Hence, if the correct node is not transferred into the second stack the first time, it may be transferred into the second stack during the second, third, etc. times. In summary, whenever the first stack fills up, the MSA will not, most of the time, stop the decoding process until the computational limit is reached.

17 Here, we ignore the fact that an incorrect path may merge with the correct path, since this only occurs with a small probability.

6.2 Bidirectional Multiple Stack Algorithm (BMSA)

A bidirectional multiple stack algorithm (BMSA) is constructed in this section, which combines the ideas of BSD and the use of multiple stacks. Any of the BSD algorithms described in Chapter 3 can be used. However, Algorithm TTmerge is chosen due to its ease of implementation and good error and computational performances.

The BMSA based on Algorithm TTmerge consists of a number of forward and backward finite-size memory stacks. We assume that the pairs of forward and backward stacks are of the same size, denoted by Z_i^BMSA. Initially, decoding with the BMSA is exactly the same as with Algorithm TTmerge. If decoding is terminated before the two stacks fill up, the decoded sequence is the same as that with Algorithm TTmerge. However, should one of the stacks (FS or BS) fill up, as in the conventional MSA, the T top nodes in that stack are transferred to a secondary stack, and decoding resumes using only the T transferred nodes. Once a merged path is found,18 that path is taken as a tentative decision, and is stored in a register. The secondary stack is then cleared, and decoding continues on the primary stacks. If, however, one of the stacks fills up before a decision is reached, the T top nodes in that stack are transferred to yet another secondary stack, and decoding resumes once again using only the T transferred nodes. This process is continued; that is, secondary stacks are used as needed. Once a tentative decision is found in a given pair of forward and backward stacks, these two stacks are cleared, and decoding resumes on the predecessor stacks in the attempt to refine that tentative decision. Decoding is stopped whenever a decision is reached in the pair of primary stacks or when a predetermined computation limit C_lim is reached. In any event, the best tentative decision is accepted as the decoded path. A sketch of the merging test that produces these tentative decisions is given below.

18 For simplicity and ease of implementation, the merging test is only conducted between nodes of the top non-empty substacks of the current pair of forward and backward stacks.
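The merging test itself reduces to checking whether a forward node and a backward node specify the same encoder state on complementary tree levels. The following fragment is a minimal sketch under assumed conventions (nodes carry an m-bit encoder state, and a merge requires l_F + l_B ≥ L together with matching states); it is not the thesis's code, and the state-compatibility convention of Chapter 3 is abstracted away:

```python
def merge_candidates(fwd_nodes, bwd_nodes, L):
    """Return the merged (forward, backward) pair with the best total metric.

    Each node is (metric, level, state); `state` is the m-bit encoder state,
    and a forward node at level l_F may merge with a backward node at level
    l_B when l_F + l_B >= L and both specify the same encoder state. These
    conventions are illustrative assumptions, not the thesis's code.
    """
    best = None
    for mf, lf, sf in fwd_nodes:
        for mb, lb, sb in bwd_nodes:
            if lf + lb >= L and sf == sb:        # merging test
                cand = (mf + mb, lf, sf)
                if best is None or cand[0] > best[0]:
                    best = cand
    return best   # None if no pair merges yet

# Toy usage with hypothetical 3-bit states:
fwd = [(-1.0, 7, 0b101), (-3.5, 6, 0b011)]
bwd = [(-0.8, 5, 0b101), (-2.0, 5, 0b110)]
print(merge_candidates(fwd, bwd, L=12))   # (-1.8, 7, 5)
```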
A detailed description of the BMSA is given below. Let S_F denote the current number of the forward stack, and S_B the current number of the backward stack. The steps are as follows:

Step 0: Place the root nodes of the forward and backward trees in the 1st FS and the 1st BS, respectively. Set S_F = S_B = 1.

Step 1: Compute the metrics of all successors of any node from the highest non-empty substack in the S_F-th FS, remove this node from the S_F-th FS, and enter the new nodes in their proper substacks in the S_F-th FS.

Step 2: If l_F + l_B ≥ L, check all paths in the highest non-empty substack in the S_B-th BS against all paths in the highest non-empty substack in the S_F-th FS. If one or more merging paths are found, select the one, including the one already in the tentative register, if any, with the highest cumulative metric as a tentative decision and go to step 13. Otherwise, go to the next step.

Step 3: If the S_F-th FS is full, transfer T nodes from the highest non-empty substack to the (S_F + 1)-th FS,19 and set S_F = S_F + 1. Go to the next step.

19 If the highest non-empty substack contains less than T nodes, pick the rest from the next highest non-empty substack.

Step 4: Check if the node to be extended in the highest non-empty substack in the S_B-th BS is a terminal node of the backward tree. If yes, compare that path with the one in the tentative register, if any, select the better one, and go to the next step. Otherwise, go to step 6.

Step 5: If S_B = 1, stop decoding and take the path in the tentative register as the decoded path. Otherwise, go to step 15.

Step 6: If the number of computations is equal to C_lim, stop decoding and take the path in the tentative register as the decoded path. Otherwise, go to the next step.

Step 7: Compute the metrics of all successors of any node from the highest non-empty substack in the S_B-th BS, remove this node from the stack and enter the new nodes in their proper substacks in the S_B-th BS.

Step 8: If l_F + l_B ≥ L, check all paths in the highest non-empty substack in the S_F-th FS against all paths in the highest non-empty substack in the S_B-th BS. If one or more merging paths are found, select the one, including the one already in the tentative register, if any, with the highest cumulative metric as a tentative decision and go to step 16. Otherwise, go to the next step.

Step 9: If the S_B-th BS is full, transfer T nodes from the highest non-empty substack to the (S_B + 1)-th BS, and set S_B = S_B + 1. Go to the next step.

Step 10: Check if the node to be extended in the highest non-empty substack in the S_F-th FS is a terminal node of the forward tree. If yes, compare that path with the one in the tentative register, if any, select the better one, and go to the next step. Otherwise, go to step 12.

Step 11: If S_F = 1, stop decoding and take the path in the tentative register as the decoded path. Otherwise, go to step 18.

Step 12: If the number of computations is equal to C_lim, stop decoding and take the path in the tentative register as the decoded path. Otherwise, go to step 1.

Step 13: If S_F = 1 and S_B = 1, stop decoding and take the path in the tentative register as the decoded path.
Otherwise, go to the next step.

Step 14: If S_F > 1, empty the S_F-th FS, set S_F = S_F − 1, and go to the next step. Otherwise, remove the merged node from the 1st FS, and go to the next step.

Step 15: If S_B > 1, empty the S_B-th BS, set S_B = S_B − 1, and go to step 6. Otherwise, remove the merged node from the 1st BS, and go to step 6.

Step 16: If S_F = 1 and S_B = 1, stop decoding and take the path in the tentative register as the decoded path. Otherwise, go to the next step.

Step 17: If S_B > 1, empty the S_B-th BS, set S_B = S_B − 1, and go to the next step. Otherwise, remove the merged node from the 1st BS, and go to the next step.

Step 18: If S_F > 1, empty the S_F-th FS, set S_F = S_F − 1, and go to step 12. Otherwise, remove the merged node from the 1st FS, and go to step 12.

6.3 Computational Properties of the BMSA

Define the total number of computations C_f^BMSA needed to decode one block by the BMSA as the sum of the computations performed by the forward and backward decoders. Let Z_1 = 2Z_1^BMSA = Z_1^MSA, and Z_i = 2Z_i^BMSA = Z_i^MSA for i = 2, 3, …. As in the MSA, the size of the first stack Z_1^BMSA should be made large enough so that only the very noisy blocks, which are potential erasures in conventional BSD, use higher order forward and backward stacks. On the other hand, the size of the secondary stacks Z^BMSA should be made substantially small for the BMSA to quickly advance through the forward and backward trees, and find a reasonably good tentative decision.

For T = 1, at most L pairs of stacks are formed before reaching the first tentative decision. Once a tentative decision is made, the decoder returns to previous stacks to refine the tentative decision. The BMSA is therefore erasure-free if one or more tentative decisions are made before reaching the computational limit C_lim. In the worst case scenario, where the forward and backward decoders do not merge before the final forward or backward node is reached, the number of computations needed by the BMSA to guarantee erasure-free decoding would be twice the critical number of computations for the MSA, C_crit, which is given by [18]

    C_crit = Σ_{i=1}^{L−m−1} (i) + 2m.    (6.5)

Therefore, to ensure erasure-free decoding using the BMSA with T = 1, we must have

    C_lim > 2 × C_crit.    (6.6)

Although the above criterion for selecting C_lim is extremely conservative, it offers an easy and safe design rule.

Clearly, the computational properties of the first tentative decision of the BMSA are similar to those of the MSA. Thus, following [18] and using the approximation on the computational distribution of Algorithm TTmerge (see Chapter 4), one can write

Property 6.1: The first stack overflow probability for the BMSA, P_1^BSD, is given by the erasure probability of Algorithm TTmerge with primary stack size Z_1. That is,

    P_1^BSD = P(C_f^BSD > Z_1 − 1) ≈ A_10 (Z_1 − 1)^{−2Pr},    (6.7)

where A_10 is a constant independent of Z_1, and Pr is the Pareto exponent.

From (6.7) and (6.2), it can be noticed that P_1^BSD is much smaller than P_1^USD for a given Z_1. In other words, to achieve a given overflow probability, the required Z_1 with the BMSA is much smaller than that with the MSA. To quantify this reduction in stack size, let Z_1^MSA(P_1, Eb/N0) and Z_1^BMSA(P_1, Eb/N0) denote, respectively, the stack sizes needed with the MSA and the BMSA to achieve a certain first stack overflow probability P_1 at a given signal-to-noise ratio Eb/N0. The first (primary) stack size saving ratio20 can be expressed by

    S_Z = Z_1^MSA(P_1, Eb/N0) / Z_1^BMSA(P_1, Eb/N0),    (6.8)

which can be simplified to

    S_Z = (A_9/P_1)^{1/Pr} / (A_10/P_1)^{1/(2Pr)} = A_11 P_1^{−1/(2Pr)},    (6.9)

where A_11 is a constant that depends on the channel condition and the decoder used, and which can be determined by simulation. For comparison purposes, from now on, we let A_11 = 1.

Figure 6.3 shows a plot of S_Z as a function of P_1 for different values of the Pareto exponent Pr. As shown in this figure, the smaller the required first stack overflow probability P_1 is, the greater is the saving in Z_1. Moreover, the saving in the primary stack size increases as the Pareto exponent (or Eb/N0) decreases. For instance, when Pr = 1 (Eb/N0 = 4.58 dB) and P_1 = 10^−3, the primary stack size saving is about 32 times. For the same P_1, S_Z = 18 for Pr = 1.2 (Eb/N0 = 5.0 dB).

Figure 6.3 The first stack saving as a function of P_1.

20 The first stack size saving ratio represents approximately the saving in the total memory size, since the first stack is much larger than the secondary stacks.
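A quick numerical check of (6.9) with A_11 = 1 (a sketch, not thesis code) reproduces the savings quoted above:

```python
def stack_saving(P1, Pr, A11=1.0):
    """First-stack size saving ratio S_Z = A11 * P1**(-1/(2*Pr)), eq. (6.9)."""
    return A11 * P1 ** (-1.0 / (2.0 * Pr))

for Pr in (1.0, 1.2):
    print(f"Pr = {Pr}: S_Z = {stack_saving(1e-3, Pr):.1f}")
# Pr = 1.0: S_Z = 31.6  (about 32 times)
# Pr = 1.2: S_Z = 17.8  (about 18 times)
```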
As for the total number of computations for the final decision of the BMSA, following [18], we can write the following property.

Property 6.2: The final computational distribution of the BMSA is upper-bounded by

    P(C_f^BMSA > N) ≤  A_10 N^{−2Pr},   N ≤ Z_1 − 1,
                       P_1^BSD,        Z_1 − 1 < N ≤ C_lim,
                       0,              elsewhere,    (6.10)

where Z_1 = 2Z_1^BMSA.

The computational distribution of the final decision for the BMSA reaches a floor beyond a certain threshold, like that of the conventional MSA. However, since P_1^BSD is much smaller than P_1^USD, the floor of the BMSA is much lower than that of the MSA for the same primary stack size. The average number of computations for the final decision can be written as

    E[C_f^BMSA] = Σ_{N=L} P(C_f^BMSA > N),    (6.13)

where P(C_f^BMSA > N) is upper-bounded by (6.10).

Appendix A

Lemma A.1: Let V be a random variable with E[V] = μ ≥ 0 and |V| < M almost surely, for some constants μ and M. Then, for any η > 0,

    E[e^{−ηV}] ≤ e^{ηM} − η(μ + M).    (A.1)

In particular, for η = (1/M) ln(1 + μ/M),

    E[e^{−ηV}] ≤ (1 + μ/M)[1 − ln(1 + μ/M)] < 1.    (A.2)

Proof: The proof follows from noting that e^{−ηV} = Σ_k (−ηV)^k / k! ≤ 1 − ηV + [e^{ηM} − (1 + ηM)]. Q.E.D.

Proof of Theorem 4.4: Similar to the proof of the computational lower bound in Theorem 4.3, separate every received block into two almost equal parts, i.e., levels [1, ⌊(L − m)/2⌋] for forward tree search operations and [⌊(L − m)/2⌋ + m, L] for backward tree search. Here, L and m denote the L and m of any code j. Also, with the assumption of correct decoding, C_f^BSD ≥ min[C^FC_⌊(L−m)/2⌋, C^BC_⌈(L−m)/2⌉], where C^FC_⌊(L−m)/2⌋ and C^BC_⌈(L−m)/2⌉, defined in the proof of Theorem 4.3, are independent random variables.

Now, let l be an arbitrary integer multiple of n less than ⌊(L − m)/2⌋ · n, and define U = ⌊(L − m)n/(2l)⌋. For 1 ≤ j ≤ U, define N_j to be the set of nodes in the forward tree at segment jl that are descendants of the correct end node of segment (j − 1)l. Given that the end node of segment (j − 1)l is correctly decoded, let c_j^F be the number of distinct nodes in N_j that are hypothesized (extended) by the idealized forward decoder before correctly decoding segment jl. Then C^FC_⌊(L−m)/2⌋ ≥ Σ_{j=1}^{U} c_j^F. Define a similar quantity, n_j, to be the number of nodes in N_j that are at least as likely to be the correct node at segment jl. Then, because the codewords are a priori equiprobable and correct decoding is assumed, one can use Lemma 3.1 in [53]21 to show that

    E[c_j^F] ≥ E[n_j].    (A.3)

21 Lemma 3.1 of [53] assumes that the stack algorithm is used. However, its proof remains valid for any sequential decoder for which one can determine which of two nodes at the same level is hypothesized (extended) first, by considering only the received message up to that level.
For every Q > 0 and 0 < o 1, one can alwayschoose a finite l(K, R,60Q) sufficiently large to satisfy the condition of Lemma 6.1in [53], and thus[exp (l(R — R1)/8) —2] > 32Q(1 + ‘)n’. (A.4)Therefore, using Lemma 6.1 in [53] into (A.3), we can writeE[cf]_E[n2]exp [l(R — R)/8]> l6Ql(1 + + 1, 1 1 and cr — ILOI < M sinceo < cr < exp (Rl), 0 < .uo < E [c’j < exp (Rl), and for distinct codewords,exp (Rl,) X. Let = (1/M) In (1 + l/M). By Lemma A.l,E[exp (_ii(cr — <8, (A.6)where 8 = (1 + l/M)[l — ln(l + l/M)], and 8 < 1.Since cr, 1 4(1 +e1)l/n and(A.8)(2) L0 > 2l(l +o)(l —1n2/lnO)/n.SupposeL L0 and L (1 + e’)m. ThenP(C_m)/2j> uo) p( cr > Uo)= Uo)1 —> 1/2. (A.9)152In the same way, one can showP(C_m.)/21 > Ui0) > 1/2, (A.1O)FC and (BCSince C[(Llfl)/2j j(L—mj)/21 are independent, we haveE[C’’] =IFC BCE [mm CL(Lm.)/2j, G1(L.m)/2])] /L> (1/2)(1/2)U0/L>Q. Q.E.D,(A.11)153Appendix BList of AcronymsAID Analogy-to-DigitalAWGN Additive White Gaussian NoiseBD Backward DecoderBMSA Bidirectional Multiple Stack AlgorithmBODP Bidirectional Optimum Distance ProfileBS Backward StackBSC Binary Symmetric ChannelBSD Bidirectional Sequential DecodingBSD-merge Bidirectional Sequential Decoding With a Merging TestBSD-no-merge Bidirectional Sequential Decoding Without a Merging TestCDF Column Distance FunctionDIA Digital-to-AnalogyDMC Discrete Memoryless ChannelEb/No Energy Per Bit to Noise RatioFD Forward DecoderFEC Forward Error CorrectionFS Forward StackGCD Greatest Common DivisorISI Intersymbol Interference154modem Modulator/DemodulatorMSA Multiple Stack AlgorithmODP Optimum Distance ProfileQPSK Quaternary Phase Shift KeySABODP Symmetric Almost Bidirectional Optimum Distance ProfileSBODP Symmetric Bidirectional Optimum Distance ProfileUSD Unidirectional Sequential DecodingVA Viterbi Algorithm155"@en ; edm:hasType "Thesis/Dissertation"@en ; vivo:dateIssued "1994-11"@en ; edm:isShownAt "10.14288/1.0065074"@en ; dcterms:language "eng"@en ; ns0:degreeDiscipline "Electrical and Computer Engineering"@en ; edm:provider "Vancouver : University of British Columbia Library"@en ; dcterms:publisher "University of British Columbia"@en ; dcterms:rights "For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use."@en ; ns0:scholarLevel "Graduate"@en ; dcterms:title "Bidirectional sequential decoding"@en ; dcterms:type "Text"@en ; ns0:identifierURI "http://hdl.handle.net/2429/7040"@en .