The University of British Columbia
Vancouver, Canada

Abstract

Built-in Self-Test (BIST) is becoming a widely accepted means for testing VLSI circuits. BIST usually consists of two major functions known as on-chip test pattern generation and test response evaluation. There are two major difficulties regarding test response evaluation. The first is reducing the error escape rate, or aliasing, while still maintaining reasonably small hardware requirements. The other is accurately assessing the impact of aliasing on the overall test quality of a BIST scheme. This dissertation addresses these two difficulties by developing a group of techniques known as multiple intermediate signature analysis. Compared to conventional single signature analysis, multiple intermediate signature analysis has many advantages, e.g., smaller aliasing, easier exact fault coverage computation, shorter average test time, and increased fault diagnosability.

Based on the investigation of an aliasing model, this dissertation develops a comprehensive fault coverage model for predicting the fault coverage performance with multiple intermediate signature analysis. In addition to the parameters used in the aliasing model, such as the number of intermediate signatures and the length of each signature, the proposed model also includes information on the scheduling of intermediate signatures.

In addition to the studies on the conventional multiple intermediate signature analysis, referred to as CMS schemes, this dissertation also describes two novel multiple intermediate signature analysis techniques. The first is a fuzzy multiple intermediate signature analysis, or simply the FMS scheme.
Unlike the CMS schemes, where each checked signature must correspond to a specific reference on a one-to-one basis for a circuit under test (CUT) to be declared good, the FMS scheme declares a CUT good if each checked signature maps to any element of the same set of references. In comparison, the FMS scheme is very simple and easy to implement. A complete theory for predicting the aliasing performance and hardware requirements of the FMS scheme is derived.

The second novel multiple intermediate signature analysis proposed in this dissertation is single reference multiple intermediate signature analysis, simply referred to as the SMS scheme. Conventionally, checking n signatures requires n references. With the SMS scheme, however, regardless of the number of checked signatures, only a single reference is needed. The SMS scheme requires minimal hardware for multiple intermediate signature analysis, i.e., essentially the same amount of hardware as conventional single signature analysis. To efficiently implement the SMS scheme, a systematic approach is developed based on the discovery of some identical signature properties. This implementation approach of the SMS scheme does not require any circuit modification of the CUTs. The cost of implementing the SMS scheme is a non-recurring CPU time overhead in the design phase. In return, the SMS scheme yields significant recurring silicon area savings as well as reduced aliasing. With the algorithms provided in this dissertation, the CPU time overhead for implementing the SMS scheme is very small. For example, if the SMS scheme is used to check two 16-bit signatures, which yields 65,536 times smaller aliasing at no extra hardware cost compared to conventional single signature schemes, the total CPU time overhead required for implementing the SMS scheme is less than 4 seconds on a Sun Sparc 2 workstation for a test length of 2^20, independent of the size of the CUT.
Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgement
Claims of Originality

1 Introduction
1.1 Dissertation Objective and Outline
1.2 Fault Models
1.3 Test Quality Measures
1.4 Conventional Approach to IC Testing
1.5 Design for Testability
1.6 Built-in Self-Test
1.7 Test Quality Problems of BIST

2 Built-in Self-Testing
2.1 Hardware Models of BIST
2.2 An Important BIST Component — LFSR
2.3 Test Pattern Generation
2.3.1 Exhaustive and Pseudo-exhaustive Testing
2.3.2 Pseudorandom Testing
2.4 Test Response Evaluation
2.4.1 LFSR-based Data Compaction
2.4.2 Counter-based Data Compaction

3 Aliasing and Aliasing Reduction Techniques
3.1 Aliasing Measures and Error Models
3.1.1 Aliasing Measures
3.1.2 Error Models
3.2 Advanced Compaction Techniques
3.2.1 Multiple Signature Analysis
3.2.2 Output Data Modification (ODM)
3.2.3 Zero Aliasing Techniques
3.2.4 Modified LFSR

4 Multiple Intermediate Signature Analysis — I
4.1 An Aliasing Model
4.2 Fault Coverage Models
4.2.1 Preliminaries
4.2.2 A Comprehensive Fault Coverage Model
4.2.3 A Simplified Fault Coverage Model

5 Multiple Intermediate Signature Analysis — II
5.1 Possible Implementations
5.1.1 Conceptual Understanding of Multiple Signature Analysis
5.1.2 Straightforward Implementations
5.1.3 An Implementation by Resource Sharing
5.2 Test Result Observation
5.3 Control of Multiple Signature Analysis
5.4 Applications
5.4.1 Exact BIST Fault Coverage Calculation
5.4.2 Test Time Reduction
5.4.3 BIST Failure Diagnosis

6 Fuzzy Multiple Signature Analysis
6.1 Basis and Implementations
6.1.1 Basis
6.1.2 Implementation
6.2 FMS Aliasing Performance Analysis
6.3 FMS Hardware Requirement Analysis
6.4 Comparative Evaluation of the FMS Scheme
6.4.1 FMS vs. SS
6.4.2 FMS vs. M-LFSR
6.4.3 FMS vs. CMS
6.5 Experimental Results
6.6 Conclusions

7 Single Reference Multiple Signature Analysis
7.1 Basis
7.2 Preliminaries
7.2.1 Signature Analysis
7.2.2 Non-singular LFSRs
7.3 Identical Signature Properties
7.4 Fast Realization of the SMS Scheme
7.4.1 Efficient Sample Sequences Generation
7.4.2 IPG and SA Seeds Selection
7.5 Cost and Performance
7.5.1 L vs. Aliasing Performance
7.5.2 CPU Time Overhead for Implementing the SMS Scheme
7.5.3 A Special Case: k = 1
7.6 Experimental Results
7.7 Discussions
7.8 Extensions
7.8.1 Applications to MISR
7.8.2 Extensions to the FMS
7.8.3 Applications to Weighted Random Testing
7.9 Conclusions

8 Conclusions
8.1 Summary
8.2 Future Work

Bibliography
Appendix A
Appendix B

List of Tables

6.1 The FMS scheme vs. the M-LFSR scheme when an LFSR is used
6.2 The FMS scheme vs. the M-LFSR scheme when a MISR is used
6.3 Fault coverage enhancements
6.4 Fault simulation time reductions
7.5 Example RIS probabilities
7.6 L vs. lower bounds on confidence, given P_al ≈ 2^-nk
7.7 Examples of n, k, confidence and required L
7.8 Example CPU time overheads for the SMS scheme
7.9 Example average CPU time overheads of the SMS scheme
7.10 L vs. lower bounds on confidence for p = 2^-(n-1)k
7.11 Experimental results for L = 2^16
7.12 Experimental results for L = 2^20
7.13 Experimental results for n = 16 and k = 1

List of Figures

2.1 General BIST scheme
2.2 The parallel-parallel BIST model
2.3 The parallel-serial BIST model
2.4 The serial-parallel BIST model
2.5 The serial-serial BIST model
2.6 An example LFSR
2.7 An example MISR
2.8 A parity checker
2.9 General counter-based compaction process
2.10 Example event detectors
3.11 Multiple LFSRs signature analysis
3.12 Non-uniform deception volume of counter-based schemes
3.13 An ODM scheme
3.14 An example of zero aliasing compaction
3.15 Zero aliasing transition counting
5.16 Conceptual representation of the CMS scheme
5.17 A straightforward CMS implementation with ROM
5.18 A straightforward CMS implementation with CL
5.19 A CMS implementation with resource sharing
5.20 An example local control circuit
5.21 Test result observation for multiple signature analysis
5.22 Example controllers for signature analysis
5.23 Fault simulation time to determine fault coverage before data compaction using fault dropping
5.24 Fault simulation time to determine fault coverage after data compaction
5.25 Fault simulation time with multiple signature analysis
6.26 Conceptual representation of the FMS scheme
6.27 The FMS Data Compactor
6.28 An Example of the FMS Scheme
6.29 Aliasing performance of the FMS scheme
6.30 The FMS scheme vs. the SS scheme
7.31 Example state transition diagrams
7.32 LFSR for the example
7.33 Confidence vs. (n-2)k
7.34 Confidence vs. L
7.35 Output compaction using a partial-length MISR

Acknowledgement

I would like to take this opportunity to thank my supervisor, Dr. Andre Ivanov, whose friendship, constant encouragement, support and valuable guidance I was fortunate to receive all these years.
I would also like to thank my colleague Chun Zhang for providing access to her fault simulator, which initiated the work reported here. My appreciation goes to Victor Wong, Carly Wong, Kaiping Li and Gang Li for the helpful discussions on various topics. I would like to thank Peter Bonek, who included my work in his VLSI design, for being my first customer even though he has not paid me a penny yet. I am indebted to many other people who helped me these past few years, e.g., Jeff Chow, William Low and Barry Tsuji, who were always there when I had English problems, and Andrew Bishop, who helped me put his nice frames on my slides. I would also like to thank my examination committee, especially Dr. N. Saxena, for helping me correct the errors that used to be in Chapter 4.

The work reported here would have been impossible without the financial support of the University of British Columbia's Center for Integrated Computer Systems Research, the British Columbia Advanced Systems Institute, and the Natural Sciences and Engineering Research Council of Canada.

Finally, I would like to express my sincere thanks to my family. I am very grateful to my dear wife, Yan Sa, who paved the way to the successful completion of this dissertation with her love and support. I do not know if she fully realizes just how much she has contributed to this work. I am extremely grateful to my parents, Yilee Wang and Zude Wu, who always encourage and support whatever I do. Without them, nothing would have been possible. Many thanks also go to my sister, Yueyan Wu, and brother-in-law, Tong Zhou, who gave me their great support by providing the extra care for my parents that I would have provided. I would like to mention my son, Eric Fenghua Wu, who brought special joys to my life even though he was time-consuming.

Claims of Originality

The author claims originality for the following contributions of this dissertation.

• In Chapter 4, the comprehensive fault coverage model is developed.
This model is based on a fault detection probability density function of a circuit under test (CUT). In addition to the possible fault coverage loss due to aliasing, this model also takes into account the impact of the test vectors applied to the CUT.

• In Chapter 5, an efficient implementation of conventional multiple intermediate signature analysis is presented. This scheme shares some hardware existing in a conventional signature analysis BIST scheme. The detailed discussions of test control and test result observation have also not been published previously.

• In Chapter 6, the proposed concept of fuzzy multiple intermediate signature analysis is novel. Unlike conventional multiple intermediate signature analysis, fuzzy signature analysis does not contain the strict one-to-one signature-reference correspondence. The complete theory for aliasing performance prediction with the fuzzy multiple intermediate signature analysis is given. Considerations for practical implementations of fuzzy signature analysis, as well as analysis of the hardware requirements, are presented.

• In Chapter 7, a novel concept referred to as single reference multiple intermediate signature analysis is proposed. The proposed single reference signature analysis checks multiple signatures against a single reference, thus reducing the hardware requirements for implementation to a minimum. Therefore, this scheme is also referred to as minimal hardware multiple intermediate signature analysis.

• In Chapter 7, the discovery of the identical signature properties is a contribution to knowledge. Based on the identical signature properties, a systematic method for implementing single reference multiple intermediate signature analysis is developed. Two algorithms are given, one for the efficient generation of a large number of fault-free sequences and the other for the fast identification of the fault-free sequence that possesses the identical signature properties.
• In Chapter 7, the classification of LFSRs according to their singularity (a useful property for implementing the single reference multiple intermediate signature scheme) is also a contribution to knowledge.

Chapter 1

Introduction

For Very Large Scale Integrated (VLSI) circuits, testing is a difficult and expensive process. With improving semiconductor technology and computer-aided design (CAD) techniques, circuits with a very large number of devices can now be fabricated on a single chip, e.g., the Intel Pentium microprocessor has 3.1 million transistors on a single chip [Barr93]. The increasing packing density of VLSI chips not only makes the chips more powerful, but also causes dramatic reductions in chip production cost. On the other hand, the percentage of the chip production expenditure consumed by testing has greatly increased. The increasing packing density of VLSI chips has made the problem of testing extremely difficult.

1.1 Dissertation Objective and Outline

When testing VLSI circuits, conventional test strategies have encountered a number of problems, such as the requirement of prohibitive CPU efforts for test pattern generation, low test quality due to limited accessibility to the internal circuit nodes, and the requirement of expensive Automatic Test Equipment (ATE). Built-in Self-Test (BIST) is one of the most promising solutions to these problems. BIST is the capability of a circuit to test itself without requiring external ATE. In BIST, both the test pattern generation and test response evaluation are performed on the same chip as the circuit under test (CUT). However, when implementing BIST, several technical difficulties exist. Among these, two major difficulties are the so-called error escape, or aliasing, and the inability to compute the exact fault coverage with reasonable CPU time efforts. The error escape problem is that some faults detected by the test patterns escape detection during the test response
evaluation process, thus causing some faulty circuits to be mistakenly declared good. Due to aliasing, some BIST schemes may result in poor test quality [Saxena85][Zorian86][Argwal83]. This dissertation addresses these two problems by developing a group of techniques known as multiple intermediate signature analysis.

The dissertation is organized as follows. The first three chapters are basically introductory chapters. Thus, readers familiar with VLSI testing may skip these chapters. The remainder of this chapter provides a general review of the state of the art of VLSI testing. Chapter 2 is an introduction to BIST which introduces typical BIST hardware models, important BIST components, and commonly used test pattern generation and response evaluation techniques. Since this dissertation concentrates on BIST test response evaluation, a more detailed discussion of this issue and its current status is provided in Chapter 3.

The contribution to knowledge of this dissertation starts from Chapter 4. Chapters 4 and 5 discuss the basis of multiple intermediate signature analysis. Chapter 4 studies the aliasing and fault coverage performance of multiple intermediate signature analysis. More specifically, several models for aliasing and fault coverage prediction are developed. In Chapter 5, other issues associated with multiple intermediate signature analysis are discussed. The chapter begins with a discussion of the possible implementations of multiple signature analysis, in which a new scheme that shares the hardware resources of conventional BIST schemes is developed. This is followed by discussions of test control, test result observation, and the advantages and drawbacks of multiple intermediate signature analysis compared to other data compaction techniques. Chapter 6 is devoted to the development of a fuzzy multiple intermediate signature analysis technique for BIST.
It discusses the basic concept of introducing fuzziness into signature analysis so as to simplify the conventional way of checking multiple intermediate signatures. It also discusses possible implementations, their hardware requirements, models for aliasing performance prediction, and comparisons with some other techniques. Experimental results obtained while using the fuzzy multiple intermediate signature scheme are also reported in Chapter 6.

Chapter 7 presents a single reference multiple intermediate signature scheme that requires a minimal amount of hardware for multiple signature analysis. In addition to showing how the minimal hardware requirement is achieved, Chapter 7 also develops several techniques that help to efficiently implement the proposed scheme. Feasibility studies as well as experiments on benchmark circuits are reported. Finally, Chapter 8 summarizes this dissertation and discusses possible future work in these areas.

1.2 Fault Models

Being physical devices, VLSI circuits are subject to failures. A failure is defined to occur when the service delivered by a circuit deviates from its specified service [Abraham86]. The cause of a failure is an error, which is defined to be any discrepancy between the actual circuit output sequence and the specified or expected output sequence [Breuer76]. The cause of an error is said to be a fault [Abraham86].

The causes of a fault can be numerous. First, incorrect design or design specification of a circuit can lead to a fault in the final fabricated circuit. Secondly, a fault can occur due to manufacturing defects, such as open and poor interconnections, shorts between conductors, excess leakage current, etc. [Bardell87]. Thirdly, even if a circuit is "perfectly" manufactured, it could subsequently wear out in the field due to electromigration, hot-electron injection, spreading charge loss, electrical overload, etc. [Abraham86].
Even during storage, faults may occur in a circuit due to factors such as temperature, humidity, leakage of sealed elements, and aging [Breuer76]. Lastly, even for a perfectly "good" circuit, a fault may occur temporarily in the field due to physical or environmental causes, such as lightning, radiation, stress, vibration, heat dissipation, etc. [Breuer76][Savaria86][Johnson89].

Regardless of the causes of a fault, in order for a fault's effect to be assessed, it must be modeled in a manner that is consistent with the representation of the circuit. In general, a fault described at a lower level can more accurately represent failure mechanisms, but involves a much greater degree of complexity [Abraham86]. For example, a fault described at the transistor level may become intractable because of the extremely large number of transistors in a VLSI chip, though it can very accurately describe the physical phenomena causing the fault. On the other hand, a fault described at a higher level, such as the gate level or functional level, can significantly reduce the complexity of treatment but, due to the loss of information, may result in some lower level failures not being considered [Abraham86].

For different requirements, many fault models have been developed. Some of them are simple while others are sophisticated. Among them, the stuck-at fault model [Breuer76][Bardell87] is one of the simplest and most commonly used. A stuck-at-0 (stuck-at-1) fault is defined to be any fault condition that causes a signal line to behave as if it were stuck at logical 0 (1). This model is a logical fault model, and is thus technology independent. Faults can occur singly or in multiples. A special class of these stuck-at faults is the single stuck-at fault model, which assumes the existence of at most one such fault in a circuit. The single stuck-at fault model is the most commonly used model in practice.
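The effect of a single stuck-at fault can be made concrete with a small sketch. The following Python fragment (purely illustrative; the two-gate circuit and signal names are invented for this example and do not come from the dissertation) evaluates a tiny netlist with an optional stuck-at fault pinned on one internal line:

```python
# Hypothetical gate-level netlist y = (a AND b) OR (NOT c), with an
# optional single stuck-at fault injected on one named line.

def eval_circuit(a, b, c, fault=None):
    """Evaluate the circuit; 'fault' = (line_name, stuck_value) or None."""
    def line(name, value):
        # A stuck-at fault forces the line to its stuck value,
        # regardless of the value actually driven onto it.
        if fault is not None and fault[0] == name:
            return fault[1]
        return value

    n1 = line("n1", a & b)      # AND gate output
    n2 = line("n2", 1 - c)      # inverter output
    y = line("y", n1 | n2)      # OR gate output
    return y

# The vector (a, b, c) = (1, 1, 1) drives n1 = 1 and makes y sensitive
# to n1, so it detects the fault "n1 stuck-at-0":
good = eval_circuit(1, 1, 1)              # fault-free response: 1
faulty = eval_circuit(1, 1, 1, ("n1", 0)) # n1 stuck-at-0 response: 0
print(good, faulty)
```

Note that the fault is detected only because the chosen vector both activates the fault (drives n1 to the value opposite its stuck value) and propagates the difference to an output.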
The popularity of this model is mainly due to its simplicity and its ability to cover many common defects in ICs, e.g., bridging faults [Bardell87] and multiple stuck-at faults [Kubiak91]. However, there exist some defects that the stuck-at fault model cannot represent very well, e.g., delay faults and CMOS stuck-open faults. Nevertheless, the single stuck-at fault model is still the most popularly used, and still the model against which all other more complex fault models are compared [Bardell87]. As in much other research in this field, this dissertation assumes the single stuck-at fault model. But, as will be noted, the methodology and techniques developed in this dissertation are also applicable to other fault models.

1.3 Test Quality Measures

Testing of digital circuits consists of applying a sequence of input vectors to a circuit, observing the output sequence, and comparing it with a precomputed or expected output sequence. The presence of a given fault is said to be detected when an appropriate input vector or vectors, applied to the CUT, causes an incorrect logic output at one or more of the CUT's output lines. The input vectors are called test patterns or test vectors.

Although a test procedure can be generic, its quality measures are usually associated with a specific fault model. In VLSI testing, the quality is usually measured by the ratio of the number of detected faults to the total number of possible faults under the assumption of a specific fault model. This ratio is termed fault coverage. Due to its dependence on fault models, for the same CUT and the same set of test patterns applied in the same order, the fault coverages obtained for different fault models can be significantly different. The fault coverage that a given set of test vectors can achieve is usually computed by a process called fault simulation.
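The fault coverage ratio and the fault simulation process that computes it can be sketched in a few lines. The toy circuit, fault list and test set below are invented for illustration only; real fault simulators operate on large netlists with many optimizations:

```python
# Minimal fault-simulation sketch: apply each test vector to the
# fault-free circuit and to every single-stuck-at faulty copy, and
# report fault coverage = (detected faults) / (total faults).

def simulate(a, b, fault=None):
    """Toy circuit y = NOT(a AND b); 'fault' = (line, stuck_value)."""
    def line(name, value):
        return fault[1] if fault and fault[0] == name else value
    n1 = line("n1", a & b)
    return line("y", 1 - n1)

lines = ["n1", "y"]
faults = [(l, v) for l in lines for v in (0, 1)]  # all single stuck-at faults
tests = [(0, 0), (1, 1)]

detected = set()
for vec in tests:
    good = simulate(*vec)                      # fault-free response
    for f in faults:
        if simulate(*vec, fault=f) != good:    # response differs -> detected
            detected.add(f)

coverage = len(detected) / len(faults)
print(f"fault coverage = {coverage:.0%}")
```

For this toy circuit the two vectors detect all four single stuck-at faults, so the reported coverage is 100%; with fewer vectors the ratio drops accordingly.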
Fault simulation consists of simulating the application of every pattern in the test set to the fault-free circuit as well as to a set of faulty circuits (each corresponding to a circuit into which a fault is injected), and comparing the simulated test response of the fault-free circuit with that from each of the faulty circuits. Fault simulation is the only way to determine the exact fault coverage [Wagner87]. In the past few years, gate-level stuck-at fault simulation for combinational circuits has been made extremely fast [Blank84][Waicukauski85][Mamari90][Keller90][Lee91]. A report from industry even declares that further speedup may not be necessary [Atken90]. For other fault models, such as delay faults, fast fault simulation techniques have also been developed [Waicukauski87b][Schulz87][Fink90][Wu92a]. For fault coverage computation, an alternative to fault simulation is the use of analytical techniques for fault coverage estimation [Wagner87][Savir84][Savir84b]. These techniques are based on knowledge of the detectabilities of the faults in the CUT. Two major difficulties in using these techniques are the obstacle of obtaining the detectability profile of the CUT, and their inadequate accuracy.

1.4 Conventional Approach to IC Testing

Testing a digital circuit requires three major steps, namely the generation of a set of test vectors, the application of these vectors to a CUT, and the analysis of the collected test response from the CUT. In conventional IC testing, an external tester is employed to apply test vectors and to collect and analyze the test response. When dealing with large circuits, the conventional approach encounters a number of difficulties. First, the computational time requirements for test pattern generation may be prohibitive.
(The detectability of a fault is defined to be the probability that the fault is detected by a randomly chosen test vector.)

Due to the fact that signals at internal circuit nodes are easier to control in combinational circuits than in sequential circuits, test pattern generation for combinational circuits is much easier. Many test pattern generation algorithms have been developed, e.g., [Goel81][Fujiwara83][Rajski87][Schulz88]. To speed up the test pattern generation process, fault simulation is usually employed to determine whether a test pattern generated for one fault is also able to detect other faults. If so, these detected faults are dropped from further consideration. Unfortunately, even with the best algorithm, generating a complete test set (the set of test vectors that covers all detectable faults in a CUT) is still prohibitively expensive for today's large circuits [Sedmak85]. Secondly, the generated test set usually cannot achieve adequate fault coverage. This is because the controllability and observability of the internal circuit nodes through I/O pins have been significantly reduced as a result of the increased complexity and density of VLSI circuits. The reduced accessibility to the internal nodes not only makes test pattern generation more difficult, but results in a large number of undetectable faults, and hence poor test quality. Thirdly, external testers or so-called ATEs are not only expensive, but impose a limitation on the speed at which the CUTs can be tested. Conventionally, IC chips are tested at low speed to verify their static state functionality. However, a CUT that passed the low speed test may not be able to work properly at its operational speed due to the existence of AC faults [Schulz87][Maxwell91]. The speed at which a circuit can run during test is limited by the speed of the ATE. High speed ATEs are extremely expensive (easily millions of dollars). Moreover, all ATEs are made with existing IC technology.
Thus, their speed is likely to be slower than the latest technology. Besides, the amount of test data is becoming too large to be handled efficiently by ATEs. (AC faults are faults that cause a timing failure of a system but may not affect the system's steady state functionality.) To ease these problems, a group of new techniques known as design-for-testability has been proposed [William83][McCluskey85].

1.5 Design for Testability

In general, design-for-testability (DFT) is any design technique that helps make a circuit more testable. For example, one can either enhance the controllability of internal circuit nodes, or enhance their observability, or both. Many DFT techniques are now available. Some are structured or generic design techniques. Some are ad hoc.

A group of well-known structured DFT techniques is scan path design. As discussed earlier, testing a sequential circuit can be much more difficult than testing a combinational circuit. Scan design defines two operational modes of a circuit: normal mode and test mode. In the normal mode, the circuit performs its normal function as a sequential circuit. In the test mode, the circuit is reconfigured into a combinational circuit and a scan register. The problem of testing a sequential circuit is thus reduced to that of testing a combinational circuit and a scan register. The conversion is accomplished by connecting all the circuit's internal state memory elements into a shift register, and connecting one end of the register to an input pin while connecting the other to an output pin. With the scan register, any internal state memory element can be fully controlled by shifting a bit into that specific element from the shift register's input pin. The state of the memory element can easily be observed by shifting out its content to the shift register's output pin.
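The shift-in/shift-out mechanism just described can be modeled in a few lines. The following sketch (a 4-bit chain with invented values; not a description of any particular scan cell design) shows why controlling and observing the chain costs one clock cycle per flip-flop:

```python
# Behavioral model of a scan chain in test mode: the flip-flops form
# one shift register, so new state enters bit by bit at scan-in while
# the previously captured state leaves bit by bit at scan-out.

def scan_shift(chain, bits_in):
    """Shift 'bits_in' into 'chain'; return (new_chain, bits_out)."""
    chain = list(chain)
    out = []
    for b in bits_in:
        out.append(chain[-1])      # last stage drives the scan-out pin
        chain = [b] + chain[:-1]   # every stage moves one position down
    return chain, out

state, observed = scan_shift([0, 0, 0, 0], [1, 0, 1, 1])
print(state)     # chain now holds the shifted-in pattern
print(observed)  # previous chain contents, observed at scan-out
```

After shifting in as many bits as there are stages, the chain holds the new pattern (the first bit shifted in sits deepest in the chain) and the entire old state has appeared at the output, which is the serial test-time cost mentioned above.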
In this way, the circuit nodes or lines that are connected to the internal memory elements can be treated as primary inputs and outputs in the test mode.

Although scan design significantly simplifies the problem of IC testing, it has several drawbacks. A possible drawback is that some faults may not be detected since the testing is performed in the test mode instead of the circuit's normal mode. In addition, in exchange for the reduced complexity in testing, scan design implies extra hardware requirements. For example, as estimated in [Nagle89], full scan design usually entails 10% to 20% additional hardware requirements. Furthermore, scan register cells sometimes require a more complex clocking system. Take the Level Sensitive Scan Design (LSSD) technique [Eichelberger78] for instance, which is now a compulsory requirement for all of IBM's designs: two clock signals are required. In addition to extra hardware requirements and possible performance degradation, scan design also requires substantially longer test time because the test vectors must be serially shifted into the scan path bit by bit. However, experience has shown that the price paid by scan design is well compensated for by the significantly reduced effort in testing [McCluskey85].

It is worth mentioning another type of scan design for testing multichip assemblies, even though this dissertation focuses on chip level testing. This type of scan design is known as boundary scan, which assigns a memory element to each I/O pin of a chip, and connects these elements into a shift register called the boundary scan register. When testing a multichip assembly, a long shift register is formed by connecting each chip's boundary scan register together. Through the long shift register, the I/O pins of each chip can be easily accessed. Boundary scan design successfully solves the problem of assembly interconnection testing [Hassan88][Jarwala89] without requiring the use of bed-of-nails equipment.
Boundary scan design has now become an IEEE standard [IEEE90].

1.6 Built-in Self-Test

DFT techniques such as scan design may ease the difficulty of test pattern generation and application. However, the core problems of limited internal circuit node accessibility and the requirements of expensive test pattern generation and ATEs still remain [Sedmak85]. Another DFT approach, which can well be coupled with scan design, is built-in self-test (BIST) [Sedmak85][McCluskey85]. BIST is the capability of a product (wafer, chip, multichip assembly, system) to test itself without requiring external ATEs [Sedmak85]. This simple but very promising idea attacks not only the limited accessibility problem, but also the costly test pattern generation and ATE problems. A BIST scheme must consist of a strategy for generating test vectors, a strategy for evaluating test responses, and the implementation mechanisms [McCluskey85]. In chip level BIST, both test pattern generation and test response evaluation are performed by some simple hardware on the same chip that is under test [Gelsinger86][Katoozi92][Zorian91][Hagihara92][Kuban84]. The conventional test approach's expensive requirements for test pattern generation and ATEs are no longer necessary with BIST. Furthermore, the on-chip test pattern generation and output evaluation significantly enhance the accessibility of internal circuit nodes, thus possibly yielding better test quality. Another strength of BIST is the ability to test circuits at their operational speed, which enables the detection of many AC defects in addition to the detection of static state failures. BIST can also be used for in-field test, and for multichip assembly fault diagnosis, which requires faulty chip localization. As pointed out in [McCluskey85], all BIST methods have some associated cost.
Since BIST circuitry uses chip area, there is a decrease both in the yield¹ and in the reliability [McCluskey85], and there is an increase in the power consumption [Levy91]. These costs are compensated for, however, by the significant savings in testing and especially in system maintenance over the product life-cycle [Agrawal93]. The extra silicon area consumed by BIST used to be considered an overhead. Recently, however, with increasing demands for higher test quality, more and more people in the VLSI community tend to consider BIST a necessary function of a chip. Thus, the silicon area associated with BIST is considered a natural requirement, and not an overhead. Of course, it is still desirable to keep the hardware requirements of BIST as low as possible for the sake of cost, yield, reliability, and power consumption.

¹ Yield is usually defined as the ratio of the number of good chips to that of all the chips produced.

1.7 Test Quality Problems of BIST

BIST is a simple and powerful idea for solving the problems of VLSI testing. However, it has a distinctive technical difficulty. In BIST, due to the on-chip test response evaluation, the bit-by-bit comparison technique usually adopted in the conventional approaches is normally no longer practical. To evaluate the test response efficiently while consuming a reasonable amount of silicon area, the test response sequence is usually compacted into a small sequence of bits, called a signature. At the end of a test, the signature collected from the CUT is compared with an expected signature, or reference, to determine whether the CUT is fault-free. A major drawback of compaction is the loss of information, which may result in some erroneous sequences being compacted into the same signature as the fault-free one, thus causing an incorrect diagnosis whereby a faulty circuit declares itself good.
This problem is well known as aliasing or error masking [McCluskey85][Bardell87]. Due to the aliasing problem, the overall test quality of a BIST scheme depends not only on the quality of the test vectors generated, but also on the quality of the adopted data compaction technique. Many recent research efforts have been aimed at reducing aliasing while still maintaining reasonably small hardware requirements, e.g., [Zorian86], [Agarwal87], [Li87], [Robinson87], [Robinson88], [Gupta90] and [Raina91]. But, as will be discussed in the subsequent chapters, most compaction techniques suffer from either excessively large aliasing or substantially high hardware requirements. Another difficulty associated with output data compaction is the assessment of the test quality of a BIST scheme. As discussed in Section 1.3, fault coverage is an accepted quality measure of VLSI testing. Fault coverage is usually computed by fault simulation. An important technique that makes fault simulation fast is dropping a fault from further consideration once it is detected by a test vector. This technique is known as fault dropping. In BIST, however, fault dropping is not usually possible. This is because, due to possible aliasing, once a fault is detected there is no guarantee that it will still be detected at the end of the test. Without fault dropping, fault simulation is not generally computationally feasible for large circuits. Due to the inability to accurately quantify the test quality of a BIST scheme, the test quality issue of BIST has traditionally been split into two parts. The first part is to use the fault coverage before compaction to measure the quality of the test vectors generated. The second part is to characterize the possible loss of coverage due to aliasing. Unlike fault coverage measures, which are deterministic, measures for aliasing are usually probabilistic.
Although many advanced probabilistic techniques have been developed, such techniques are difficult to use confidently when dealing with a specific CUT because of the statistical uncertainty. Evidence of this uncertainty can be found in experimental reports, e.g., [Aitken89], [Xavier92], [Rajski91b] and [Debany92].

Chapter 2

Built-in Self-Testing

BIST is becoming a widely-used means for VLSI testing. Many commercial products [Kuban84][Daniels85][Gelsinger86][Gelsinger89][Hagihara92] illustrate how far the BIST concept has become a reality. Testing different types of circuits usually requires different types of BIST [Zorian91], e.g., BIST for random logic, BIST for RAMs, BIST for ROMs, etc. This dissertation focuses on BIST schemes designed for testing random logic. However, the schemes developed in this dissertation still apply to BIST for other types of circuits. As discussed in Chapter 1, the two major functions of BIST are on-chip test pattern generation and output data evaluation. This chapter provides a general review of both these BIST functions. The subsequent chapter will provide a more detailed discussion of BIST output data evaluation, since it is the major topic of this dissertation.

2.1 Hardware Models of BIST

BIST is the capability of a chip to test itself. In general, BIST consists of the generation and application of test vectors to the CUT, as well as the evaluation of the test response, on the same chip as the CUT. In implementation, there are some commonly-required components, namely an input pattern generator, an output data compactor, a pre-calculated fault-free signature or reference, and a comparator. Fig. 2.1 shows a generic BIST scheme. After applying the test patterns to the CUT, the comparator compares the final content of the data compactor with the reference stored on-chip, and produces a pass/fail (or go/nogo) signal as the test result. Different structures are used in different instances.
Depending on how the test patterns are applied and how the output data are collected, Figs. 2.2 - 2.5 give four typical BIST structures. They are the parallel-parallel (Fig. 2.2) [Konemann79], parallel-serial (Fig. 2.3) [Pomeranz92][Li87], serial-parallel (Fig. 2.4), and serial-serial (Fig. 2.5) [Lambidonis91b] structures.

Figure 2.1: General BIST scheme.

Figure 2.2: The parallel-parallel BIST model.

In Figs. 2.2 - 2.5, the P-IPG is a parallel input pattern generator; the P-ODC is a parallel output data compactor; the SC is a space compactor that compacts an m-bit vector into a k-bit vector, where m > k; the S-IPG is a serial input pattern generator; and the S-ODC is a serial output data compactor. The S-to-P and P-to-S blocks in the figures are serial-to-parallel and parallel-to-serial converters, respectively.

Figure 2.3: The parallel-serial BIST model.

A P-IPG generates one multi-bit test vector each clock cycle, and applies each vector to the CUT in parallel. A test vector generated by an S-IPG, however, consists of a series of bits applied to the CUT via a serial-to-parallel converter. The most commonly used test pattern generators are Linear Feedback Shift Registers (LFSRs) [Bardell87]. Cellular Automata (CA) [Hortensius89] can also be used as test pattern generators. A P-ODC collects multiple bits concurrently from the CUT. A S-ODC collects bits serially. S-ODCs are often used with P-to-S converters as shown in Figs. 2.3 and 2.5, unless the CUT has only a single output line. The most commonly used P-ODC is the Multiple Input Shift Register (MISR) [Bardell87], which is in fact a LFSR with multiple inputs. In general, for an N-output CUT, one can use an N-stage MISR for data compaction.
However, the use of an N-stage MISR can be very expensive in silicon area for large CUTs. In comparison, the use of a space compactor followed by a short MISR is more economical [Reddy88][WuM92][Zorian93]. The most commonly used S-ODCs are LFSRs [Bardell87]. Regarding the S-to-P and P-to-S converters, the scan path discussed in Chapter 1 is the most popular. When a scan chain is used, after the application of a test vector, the response at the CUT's output is shifted out of the scan chain for examination.

Figure 2.4: The serial-parallel BIST model.

For a scan chain P-to-S converter, each test vector usually yields multiple bits serially. Besides the scan chain, other components can also be used as the P-to-S converter, e.g., Multiple Input Non-feedback Shift Registers (MINSR) [Agarwal87], XOR trees [Katoozi92][Li87], and MISRs [Gelsinger86].

2.2 An Important BIST Component — LFSR

A LFSR is simply a shift register with linear feedback, i.e., feedback that is an exclusive OR (XOR) of the contents of selected memory elements of the shift register. Fig. 2.6 shows an example LFSR with feedback polynomial x^5 + x^2 + 1. A LFSR can have two operational modes. When used as a test pattern generator, the LFSR usually works in its autonomous mode, where no external input is applied to it. When such a LFSR is initialized with a non-zero seed, with each state transition or shift, its content is different from its previous ones. Thus, each register state can serve as a test pattern. Each LFSR has its specific period: after the LFSR shifts a certain number of cycles, its state returns to its initial state (seed). It has been shown that, for a

Figure 2.5: The serial-serial BIST model.
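The autonomous-mode behaviour described above can be sketched in a few lines. The snippet below is an illustrative model only: since Fig. 2.6 is not reproduced here, a Fibonacci-style shift register is assumed for the feedback polynomial x^5 + x^2 + 1.

```python
def lfsr_step(state, exps, width):
    """One shift of a width-bit autonomous LFSR whose feedback polynomial
    has the given nonzero exponents (the constant term 1 is implicit)."""
    fb = 0
    for e in exps:
        fb ^= (state >> (width - e)) & 1   # tap the stage for the x^e term
    return (state >> 1) | (fb << (width - 1))

def period(exps, width, seed=1):
    """Number of shifts until the LFSR returns to its seed state."""
    state, n = lfsr_step(seed, exps, width), 1
    while state != seed:
        state, n = lfsr_step(state, exps, width), n + 1
    return n

# x^5 + x^2 + 1 is primitive, so every non-zero seed cycles through all
# 2^5 - 1 = 31 non-zero states before repeating:
print(period([5, 2], 5))   # 31
```

Because the polynomial is primitive, the 31 successive register states are exactly the non-zero 5-bit patterns, which is what makes such a LFSR useful as a pseudo-random test pattern generator.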

∞ for combinational faults, where k is the length of the binary counter [Ivanov92b][Pilarski92]. However, the rate at which they converge to 2^-k is much slower as compared to the LFSR-based techniques [Pilarski92]. Other studies also showed that the counter-based techniques generally yield poorer aliasing performance than the LFSR-based techniques [Robinson87][Saxena87][Aitken88][Yih91][Pilarski92].

3.2 Advanced Compaction Techniques

The aliasing performance of the adopted compaction technique has a significant impact on the final test quality. A k-stage LFSR signature analyzer can achieve an aliasing probability of 2^-k. If k = 16, P_al ≈ 2^-16 = 1/65,536 ≈ 0.000015. This number may seem very small, and is in fact typically accepted in practice [LeBlanc84][Kuban84][Gelsinger86]. However, for large VLSI circuits and high quality demands, it has been argued that P_al ≈ 2^-16 is far from adequate [Agarwal83][Zorian86]. By increasing k, one can easily reduce this number. However, the impact on hardware requirements is substantial. For BIST applications, the test circuitry's hardware requirement must be taken seriously into account, since a higher hardware requirement implies not only higher cost, but also lower yield [McCluskey85], lower reliability [McCluskey85], and higher power consumption [Levy91]. In the past decade, significant efforts have been made toward improving the aliasing performance of data compactors while maintaining reasonable hardware requirements, and many advanced data compaction techniques have been proposed. Some of these are briefly surveyed next.

3.2.1 Multiple Signature Analysis

Multiple LFSRs Signature Analysis

A straightforward way to check multiple signatures is to employ two or more LFSRs, each with a different feedback polynomial [Bhavsar84]. Fig. 3.11 shows a scheme employing two compactors with polynomials f1(x) and f2(x).
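The two-LFSR scheme can be checked exhaustively on a toy case. The sketch below is illustrative only: the length-8 fault-free sequence and the polynomials f1(x) = x^3 + x + 1 and f2(x) = x^2 + x + 1 are arbitrary assumed choices, and signature collection is modelled as GF(2) polynomial division.

```python
from itertools import product

def signature(bits, poly, k):
    """k-bit LFSR signature = remainder of GF(2) polynomial division of the
    bit sequence by poly (poly includes its leading x^k term)."""
    r = 0
    for b in bits:
        r = (r << 1) | b
        if r >> k:          # degree k reached: reduce modulo poly
            r ^= poly
    return r

ref = [1, 0, 1, 1, 0, 0, 1, 0]            # assumed fault-free sequence, l = 8
s1 = signature(ref, 0b1011, 3)            # f1(x) = x^3 + x + 1, k1 = 3
s2 = signature(ref, 0b111, 2)             # f2(x) = x^2 + x + 1, k2 = 2
aliased = sum(1 for bits in product([0, 1], repeat=8)
              if list(bits) != ref
              and signature(bits, 0b1011, 3) == s1
              and signature(bits, 0b111, 2) == s2)
print(aliased)   # 7 = 2^(8-3-2) - 1
```

The count equals the deception volume of a single (k1 + k2)-stage analyzer, which is exactly the limitation discussed next.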
Obviously, the error sequences aliased by the multiple-LFSR compaction scheme are those that are aliased by both f1(x) and f2(x). The aliasing probability of this scheme is P_al = P1 P2 [Bhavsar84], where P1 and P2 are the aliasing probabilities of the two signature analyzers. It would be desirable to choose f1(x) and f2(x) such that the error sequences aliased by f1(x) and those aliased by f2(x) be disjoint. Unfortunately, such f1(x) and f2(x) do not exist [Bhavsar84]. Assuming that the lengths of f1(x) and f2(x) are k1 and k2, respectively, yields P1 ≈ 2^-k1 and P2 ≈ 2^-k2.

Figure 3.11: Multiple LFSRs signature analysis.

Then, P_al ≈ 2^-k1 2^-k2 = 2^-(k1+k2). Apparently, P_al for two LFSRs is equivalent to that of a single (k1 + k2)-stage LFSR. In terms of aliasing performance against hardware requirements, this scheme does not achieve any improvement over simply increasing the LFSR length in the conventional single signature analysis scheme. As an alternative to using multiple LFSRs, different types of data compaction techniques can be combined [Robinson87]. In [Robinson87][Robinson88], a scheme that simultaneously uses a LFSR and a counter was proposed. Its P_al is equal to the product of the aliasing probabilities of the LFSR and the ones counter. Compared to using two LFSRs, it is less attractive. This is because it requires a similar amount of silicon area, but its aliasing probability is higher on average, since the aliasing probability of a ones counter is in general higher than that of a LFSR.

Multiple Test Sets Signature Analysis

Another method to check multiple signatures is to employ several test sets [Bhavsar84][Hassan84]. This scheme requires only a single LFSR. A signature collected by the LFSR is checked after each test set has been applied to the CUT. Assume that two test sets, T1 and T2, are applied.
Let S1 and S2 be the fault-free signatures obtained under T1 and T2, respectively. In the occurrence of a fault, the probability that it would be aliased by this scheme is the probability that its signature under T1 matches S1, and its signature under T2 matches S2. Assuming the length of the LFSR to be k yields P_al ≈ 2^-k 2^-k = 2^-2k [Bhavsar84]. Multiple test sets signature analysis uses a single LFSR. Its extra hardware requirement compared to single signature analysis is the silicon area for storing the additional reference signatures. An extra advantage of multiple test sets signature analysis is the possibly higher fault coverage, especially for unmodeled faults, due to the use of more test vectors. However, the use of multiple test sets not only significantly increases test time, but also implies a more complex test pattern generator. In [Hassan84], instead of using multiple test sets, a single test set is used twice in different orders, thus making test generation easier.

Multiple Intermediate Signature Analysis

Multiple intermediate signature analysis is sometimes called split or segmented sequence testing [Bhavsar84][Bardell87]. The basis of this scheme is to sample intermediate signatures collected by a single LFSR in addition to checking the final one. For example, if we want to check a total of two signatures, we can check one intermediate signature after the first l1 bits of the sequence have been shifted into the signature analyzer, and check the second signature after the entire sequence of length l has been compacted. Assume the two corresponding good signatures to be S1 and S2. The aliasing probability in this case is equal to the probability that an error sequence produces S1 when the first signature is checked, and S2 when the second one is checked. As will be shown in the next chapter, for this scheme P_al ≈ 2^-2k.
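The 2^-2k behaviour can be verified exhaustively on a toy case. The sketch below assumes a hypothetical 3-stage analyzer with f(x) = x^3 + x + 1, a length-8 sequence, and check points l1 = 4 and l = 8; none of these values come from the text.

```python
from itertools import product

POLY, K = 0b1011, 3          # assumed f(x) = x^3 + x + 1

def passes(bits, ref, checkpoints):
    """Compact bits serially in a K-stage LFSR model, comparing the running
    signature with the fault-free one at every check point."""
    r = r_ref = 0
    for i, (b, rb) in enumerate(zip(bits, ref), start=1):
        r = (r << 1) | b
        if r >> K:
            r ^= POLY
        r_ref = (r_ref << 1) | rb
        if r_ref >> K:
            r_ref ^= POLY
        if i in checkpoints and r != r_ref:
            return False             # CUT declared faulty at this check point
    return True

ref = [1, 1, 0, 1, 0, 0, 1, 0]       # assumed fault-free sequence
seqs = [list(b) for b in product([0, 1], repeat=8) if list(b) != ref]
one_sig = sum(passes(s, ref, {8}) for s in seqs)       # final signature only
two_sigs = sum(passes(s, ref, {4, 8}) for s in seqs)   # S1 at l1 = 4, S2 at l = 8
print(one_sig, two_sigs)   # 31 = 2^(8-3) - 1 versus 3 = 2^(8-2*3) - 1
```

Checking one extra intermediate signature shrinks the deception volume by a factor of 2^k, with the same single LFSR.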
Like the multiple test sets scheme, multiple intermediate signature analysis uses a single LFSR. Unlike the multiple test sets scheme, however, this scheme uses only one test set. Therefore, this scheme will not incur any test time increase. On the contrary, it can significantly reduce test time on average. This is because a test session can be terminated and the CUT declared faulty as soon as an incorrect intermediate signature is found [Lee88]. In addition to reducing aliasing and test time, recent studies have demonstrated many other advantages of this scheme, e.g., easier fault coverage computation [Lambidonis91], increased fault diagnosability [Waicukauski87], and possible zero aliasing for modeled faults [Pomeranz92]. Like all the multiple signature analysis techniques, the disadvantages of this scheme include the increased implementation and control complexities, and the increased hardware requirements for storing the multiple reference signatures.

3.2.2 Output Data Modification (ODM)

ODM is a counter-based compaction scheme that takes advantage of the non-uniformity of aliasing of counter-based techniques. As shown earlier, a ones counter's deception volume is DVe = C(l, w) - 1, where l and w are the length and weight of a fault-free sequence. For a given l, DVe is a function of w, as depicted in Fig. 3.12.

Figure 3.12: Non-uniform deception volume of counter-based schemes.

If the sequence weight w is either very low or very high, the DVe can be extremely small. The basic concept of ODM is to reduce the weight of the fault-free output sequence, and then use a ones count as the signature of the "modified" sequence. One method to do this is to generate a reference sequence such that the sequence obtained by bit-wise XOR of the reference sequence and the actual fault-free sequence has a reduced weight [Agarwal83][Zorian86][Agarwal87]. This is illustrated in Fig. 3.13.
Ideally, the reference sequence is identical to the fault-free sequence, thus making DVe = 0. Unfortunately, the generation of such an ideal sequence may generally require too much silicon area. Therefore, in general, only a reference sequence that approximates the fault-free sequence can be generated for the purpose of ODM. In [Agarwal87], a systematic approach for generating such reference sequences was proposed. Although it claims to reduce a ones counter's aliasing by a factor of 2 raised to hundreds or even thousands, the corresponding hardware requirement is generally considerable.

Figure 3.13: An ODM scheme.

In [Li87], a different way of implementing ODM was proposed, where the output sequence modification is carried out in the process of space compaction. However, it does not provide any general method to reduce the aliasing to meet a given requirement. In [Zorian92], the concept of ODM is applied to ROM testing. Instead of generating the reference sequence on-chip, the reference sequence is pre-calculated and stored in an extra column of the ROM under test.

3.2.3 Zero Aliasing Techniques

Zero Aliasing by Monitoring the Quotient Sequence

Signature analysis is based on the concept of polynomial division. Denote the feedback polynomial by f(x), and the sequence to be compacted by Ri(x). Signature analysis can be represented as

Ri(x) = q(x)f(x) + r(x),   (3.1)

where q(x) is the quotient of the division, which is also the sequence shifted out from the last stage of the LFSR, and r(x) represents the remainder, taken as the signature. Conventionally, the signature obtained above is compared with a predetermined fault-free signature given by

R*(x) = q*(x)f(x) + r*(x),   (3.2)

where R*(x) represents the good circuit response. Aliasing arises when r(x) = r*(x) but Ri(x) ≠ R*(x).
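Eqn. 3.1 can be sketched in code: dividing the response polynomial by f(x) yields both the quotient stream and the remainder (signature). The 9-bit sequence below is the Ri(x) of Fig. 3.14, but f(x) = x^3 + x + 1 is an assumed divisor (the figure's own f(x) is not given here), so the q and r printed below differ from the figure's values.

```python
def gf2_divmod(bits, poly, k):
    """GF(2) polynomial long division: returns (quotient bits, remainder).
    poly includes its leading x^k term, e.g. 0b1011 for x^3 + x + 1."""
    r, q = 0, []
    for b in bits:
        r = (r << 1) | b
        if r >> k:                 # subtract (XOR) the divisor
            r ^= poly
            q.append(1)
        else:
            q.append(0)
    return q, r

Ri = [1, 0, 1, 0, 0, 0, 1, 1, 1]   # Ri(x) = 101000111, as in Fig. 3.14
q, r = gf2_divmod(Ri, 0b1011, 3)
print(''.join(map(str, q)), format(r, '03b'))
# Since Ri(x) = q(x) f(x) + r(x) reconstructs the sequence exactly,
# monitoring both q(x) and r(x) leaves no room for aliasing.
```

With this assumed divisor, the division is exact (r = 0); the key point is that q(x) carries all the information that r(x) alone discards.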
Obviously, this problem is caused by the information loss due to the discarding of the quotient sequence. Thus, if q(x) and q*(x) are compared as well, i.e., r(x) and q(x) are compared with r*(x) and q*(x), respectively, aliasing can be avoided. In general, directly storing the quotient sequences is not practical in BIST. In [Gupta90], instead of storing the quotient sequences, a new approach was proposed to select a specific f(x) for a given R*(x) such that q*(x) is periodic. By being periodic, it is easy to use a simple periodic sequence checker to determine whether the actual quotient sequence from the LFSR is error-free. For instance, in the example shown in Fig. 3.14, where the error-free quotient sequence is all 1s, the periodic sequence checker is simply a 0-detector. In the case shown in Fig. 3.14, if the signature obtained is 010, and the quotient obtained is all 1s, the sequence under compaction is fault-free. Otherwise, it is faulty. Here no aliasing can occur, since both q(x) and r(x) are monitored. Unfortunately, this data compaction scheme generally requires too much hardware for practical use. For example, it was shown in [Gupta90] that the required LFSR that corresponds to the selected f(x) can have l/2 stages if the sequence to compact is of length l. Since l is usually on the order of hundreds of thousands or even millions, this technique is clearly not feasible.

Figure 3.14: An example of zero aliasing compaction: Ri(x) = 101000111, r(x) = 010, q(x) = 111111.

Zero Aliasing by Transition Counting

Another zero aliasing technique was proposed in [Diamantaras91]. It is based on transition counting, whose deception volume is DVe = C(l-1, t) - 1, with l and t being the length and number of transitions of the fault-free sequence [Bardell87]. Suppose all the test vectors that produce 0s at the fault-free CUT's output are applied first, and the others are applied afterwards.
This yields a fault-free output sequence of the form 00...0011...11. There is only a single transition in the sequence. Thus, the aliasing is very small, but still non-zero. In [Diamantaras91], a special 0-detector, as shown in Fig. 3.15, was proposed to monitor the output sequence. The switch shown in Fig. 3.15 is first set to the inverting terminal, and all the test vectors that would produce 0s at the CUT's output are applied. If a fault is detected by some of these vectors, the CUT will output some 1s. In this case, the 0-detector will detect the possible errors. After applying the test patterns that produce 0s at the CUT's output, the switch is set to the non-inverting terminal, and all the vectors that have the CUT output 1s in the absence of faults are applied. Similarly to the earlier case, any error, i.e., a 0 in this case, in the output sequence will be caught by the 0-detector.

Figure 3.15: Zero aliasing transition counting.

This scheme requires little hardware on the data compaction side. However, a major problem exists with the test pattern ordering. Test pattern generators with small hardware requirements that can produce test vectors in such an order have not yet been found. Another drawback of reducing the transition count is that it also reduces the effectiveness of the test vectors for testing AC faults. This is because AC fault testing usually requires the generation of a large number of transitions at the fault-free circuit's output lines.

Zero Aliasing for Modeled Faults

During signature analysis, the first error in an error sequence will always be captured by the signature analyzer LFSR [Bardell87]. However, as the compaction process carries on, the error captured in the LFSR might be cancelled by other errors afterwards, thus causing aliasing.
One technique described in [Pomeranz92] proposes to check an intermediate signature before a captured error is cancelled. Therefore, for each possible error sequence, if at least one of its errors can be detected by checking intermediate signatures, aliasing can be eliminated. To find the possible error sequences, fault simulation assuming a certain fault model must be used. Thus, unlike the zero aliasing techniques discussed earlier, this technique is only valid for modeled faults. To reduce the hardware requirements, careful scheduling of the intermediate signatures is achieved using an algorithm developed in [Pomeranz92] that minimizes the total number of required signatures. In general, the signatures are not periodically scheduled. Thus, besides the silicon area required for storing the reference signatures, the control for the signature checks may be complex [Wu92c].

Zero Aliasing by Sequence Identification

Given a fault-free sequence, a nonlinear machine can be designed to identify the sequence. Thus, if such a machine is used for data compaction, no aliasing will occur. Such a technique was proposed in [Chakrabarty93]. However, a major problem of this method is that no upper bound on the hardware requirements has been derived so far for long output sequences. Therefore, in general, this method is not practical.

3.2.4 Modified LFSR

Recently, a modified LFSR (M-LFSR) was proposed to reduce aliasing [Raina91]. The M-LFSR has two modes. One is the regular LFSR mode. The other is the modified LFSR mode, which converts the LFSR into a non-linear machine. The basic idea of this approach is to use the normal LFSR to compact the first (l - s) bits of the sequence, and then use the non-linear machine to recognize the remaining s bits. If s = 1, i.e., if the last sequence bit alone can be identified, the M-LFSR can reduce a LFSR's aliasing by half.
This is because half of the 2^l possible sequences have a last bit with the opposite value to the fault-free one. By identifying the last bit alone, half of the 2^l error sequences can therefore be detected. The remaining 2^(l-1) - 1 error sequences would be detected as in regular signature analysis. Thus, the deception volume of the M-LFSR scheme is DVe = 2^(l-k-1) - 1. Compared to 2^(l-k) - 1, which is the deception volume of a k-bit LFSR, the M-LFSR yields half the aliasing. In general, if the last s bits of the sequence can be identified, the M-LFSR scheme can reduce a LFSR's aliasing by a factor of 2^s. According to [Raina91], for an r-input k-stage LFSR, the M-LFSR can achieve the following aliasing:

P_M-LFSR = 2^-k (1 - p)^rs,  r ≤ k,   (3.3)

where p is the probability for a bit to be in error given a CUT failure, independently of the other bit errors. For the equally likely error model, p = 0.5, which is also assumed in the experiments in [Raina91]. Assuming p = 0.5 yields:

P_M-LFSR = 2^-(k+rs),  r ≤ k.   (3.4)

The worst case of the M-LFSR is when r = 1, which corresponds to a single-input LFSR. In this case, P_M-LFSR = 2^-(k+s). The best case occurs when r = k, which corresponds to a k-input MISR. In the best case, P_M-LFSR = 2^-k(s+1).

In terms of hardware requirements, the M-LFSR requires ⌈(s/2)(8 + r + k)⌉ 4-input NAND/NOR gates, which is equivalent to ⌈(3s/2)(8 + r + k)⌉ 2-input NAND/NOR gates, since each 4-input gate requires the same hardware as three 2-input gates. In general, the hardware requirement of this method is high. For example, to reduce the aliasing probability from 2^-16 to 2^-32, the M-LFSR requires 400 to 600 2-input gates in addition to a 16-stage LFSR.

Chapter 4

Multiple Intermediate Signature Analysis — I

As discussed in the preceding chapter, aliasing reduction is usually achieved at the cost of high hardware requirements.
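As a small numerical illustration of the preceding chapter's aliasing arithmetic, the s = 1 case of the M-LFSR can be confirmed exhaustively. The sketch assumes a hypothetical 3-stage LFSR (f(x) = x^3 + x + 1) and l = 8; error patterns are the XOR of a sequence with the fault-free one, so an aliased pattern is a non-zero pattern the compactor maps to zero.

```python
from itertools import product

POLY, K = 0b1011, 3            # assumed f(x) = x^3 + x + 1

def remainder(bits):
    """Signature of a bit sequence: GF(2) remainder modulo POLY."""
    r = 0
    for b in bits:
        r = (r << 1) | b
        if r >> K:
            r ^= POLY
    return r

# Non-zero length-8 error patterns that the plain LFSR maps to zero:
aliased = [e for e in product([0, 1], repeat=8)
           if any(e) and remainder(e) == 0]
# Identifying the last sequence bit (s = 1) additionally rejects every
# aliased pattern whose final bit is in error; only these survive:
survivors = [e for e in aliased if e[-1] == 0]
print(len(aliased), len(survivors))   # 31 and 15
```

The count drops from 2^(l-k) - 1 = 31 to 2^(l-k-1) - 1 = 15, i.e., exactly half the aliasing, as claimed for s = 1.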
Furthermore, except for the zero aliasing schemes, the difficulty of test quality assessment associated with data compaction (see Section 1.7) still remains. Unfortunately, the zero aliasing schemes are not generally feasible in terms of silicon area. Among the non-zero aliasing schemes, multiple intermediate signature analysis seems to be superior to the others, since recent work has shown that this scheme lends itself well to exact fault coverage computation [Lambidonis91] and is able to significantly reduce aliasing with little extra hardware cost [Wu93]. Furthermore, multiple intermediate signature analysis can also shorten average test time [Lee88] and increase fault diagnosability [Waicukauski87]. In this chapter, we will investigate models for predicting the aliasing and fault coverage of multiple intermediate signature analysis. In the remainder of this dissertation, unless otherwise stated, multiple intermediate signature analysis is simply referred to as multiple signature analysis. If a total of two signatures are checked, the sequence is divided into two segments of lengths l1 and l2 = l - l1, and the signature analyzer is checked at the end of each segment, i.e., the first signature is checked after the first l1 bits of the sequence have been compacted and the second signature is checked after the entire sequence has been compacted. In general, assume that a total of n signatures, S1, S2, ..., Sn, are checked after the first l1, l2, ..., ln (ln = l) bits of the sequence have been shifted into the signature analyzer, respectively. In the remainder of this dissertation, the positions li where the signatures are checked are referred to as check points, while the subsequence between each pair of adjacent check points is referred to as a segment.
4.1 An Aliasing Model

In multiple signature analysis, the error sequences that escape detection are those that escape detection at each intermediate and final signature checking. For example, assume an output sequence of length l is compacted by a k-stage LFSR. Let the sequence be divided into two segments of lengths l1 and l2 = l - l1, and let the LFSR be checked at the end of each segment, i.e., at the check points l1 and l. Assume the corresponding fault-free signatures of the two segments are S1 and S2. Under the equally-likely error model, the number of possible sequences that produce S1 at l1 is 2^l1 / 2^k, while the number of sequences that produce S2 at l is 2^(l-l1) / 2^k [Bhavsar84][Bardell87], where both l1 and l - l1 are greater than k. Therefore, the deception volume in this case, denoted by DVe(2), is [Bhavsar84][Bardell87]:

DVe(2) = (2^l1 / 2^k)(2^(l-l1) / 2^k) - 1 = 2^(l-2k) - 1.   (4.5)

Considering the existence of the 2^l - 1 possible error sequences yields the aliasing probability P_al(2):

P_al(2) = (2^(l-2k) - 1) / (2^l - 1).   (4.6)

Obviously, when 2^l ≫ 1, P_al(2) ≈ 2^-2k. In general, if n signatures are checked at the arbitrary positions l1, l2, ..., ln, with the constraints li > lj and li - lj > k for i > j, and ln = l, it can be shown [Bhavsar84][Bardell87] that the deception volume is:

DVe(n) = 2^(l-nk) - 1.   (4.7)

The corresponding aliasing probability is:

P_al(n) = (2^(l-nk) - 1) / (2^l - 1).   (4.8)

Assuming 2^l ≫ 1 yields:

P_al(n) ≈ 2^-nk.   (4.9)

From Eqns. 4.7 - 4.9, the deception volume associated with multiple signature analysis, and the corresponding aliasing probability, are independent of the positions where the signatures are checked.

4.2 Fault Coverage Models

In reality, some error sequences are more likely to occur than others.
For example, if two signatures are checked at the check points l1 and l, error sequences of the form e[l] = 00...0 e_{l1+1} e_{l1+2} ... e_l would occur more often than those of the form e[l] = e_1 e_2 ... e_{l1} 00...0. This is because most faults that produce errors in the first segment are random-easy faults, and thus would also have a good chance of producing errors during the second segment, thus generating an error sequence of the form e[l] = e_1 e_2 ... e_{l1} e_{l1+1} ... e_l. To more precisely predict the aliasing performance, this section discusses fault coverage models that take into account the detection probability profile of a CUT.

4.2.1 Preliminaries

This section introduces definitions and briefly reviews some results presented in [Seth90] and [Rajski91b].

Basic Definitions

Definition 4.1: The detection probability of a fault is the probability of detecting the fault by applying a random vector to the CUT.

Detection probabilities of faults in a CUT can be represented by a probability density function p(x) such that p(x)dx corresponds to the fraction of testable faults with detection probability between x and x + dx. Since x represents a probability:

∫_0^1 p(x) dx = 1.   (4.10)

Fault Coverage of Random Vectors without Compaction

Since there are p(x)dx faults with detection probability x, the mean coverage among these faults by a random vector is x p(x)dx. Suppose we apply a sequence of random vectors to the circuit. The mean coverage by the first vector is:

y1 = ∫_0^1 x p(x) dx.   (4.11)

The actual coverage by a random vector might differ from the mean by a random quantity. However, the variance will be small for almost all circuits [Seth90]. If all faults are assumed testable, after removing the faults detected by the first vector, the normalized number of remaining undetected faults is:

UDT = 1 - ∫_0^1 x p(x) dx = ∫_0^1 (1 - x) p(x) dx.   (4.12)
Thus, the distribution of the detection probabilities of the remaining undetected faults is (1 - x)p(x). Hence, the coverage of two random vectors is:

y_2 = y_1 + \int_0^1 x(1 - x) p(x) dx = \int_0^1 x[1 + (1 - x)] p(x) dx.   (4.13)

Similarly, the coverage of l vectors is [Seth90]:

y_l = \int_0^1 x[1 + (1 - x) + (1 - x)^2 + ... + (1 - x)^{l-1}] p(x) dx = 1 - \int_0^1 (1 - x)^l p(x) dx = 1 - I(l),   (4.14)

where I(l) = \int_0^1 (1 - x)^l p(x) dx. Eqn. 4.14 gives the fault coverage achieved with l random vectors without data compaction. Obviously, when l → ∞, y_l → 1. That is, one can detect all testable faults in a CUT if a sufficiently large number of vectors is applied.¹

¹Note that, for simplicity, Eqn. 4.14 considers only testable faults [Seth90].

Fault Coverage with Single Signature Analysis

Now assume a k-bit signature is checked after applying l random vectors to a CUT. Denote the aliasing probability by p, i.e., p ≈ 2^{-k}, and the probability of no aliasing by β, i.e., β = 1 - p. According to [Rajski91b], the fault coverage with single signature analysis can be well estimated by β × FC = (1 - p)FC, where FC is the fault coverage before data compaction. Thus, from Eqn. 4.14, the fault coverage of single signature analysis can be represented as:

FC_1 = β[1 - \int_0^1 (1 - x)^l p(x) dx] = β(1 - I(l)).   (4.15)

4.2.2 A Comprehensive Fault Coverage Model

In the following, for a better understanding of the fault coverage model derivation, we first label the aliasing probability of the ith signature, checked at check point l_i, by p_i, and hence the probability of no aliasing by β_i = 1 - p_i, where i = 1, ..., n and l_n = l. Then, at the end of this section, we remove the subscript i by defining p = p_i = 2^{-k} for i = 1, ..., n. Similar to the analysis in Sec.
4.2.1, the fault coverage after checking the first signature is:

FC_1 = β_1[1 - \int_0^1 (1 - x)^{L_1} p(x) dx] = β_1[1 - I(L_1)].   (4.16)

The portion of the faults that remain undetected after checking the first signature is:

UDT_1 = 1 - FC_1 = 1 - β_1(1 - \int_0^1 (1 - x)^{L_1} p(x) dx) = \int_0^1 [1 - β_1 + β_1(1 - x)^{L_1}] p(x) dx.   (4.17)

Therefore, the new distribution of the detection probabilities of the remaining faults is p_1(x) = [1 - β_1 + β_1(1 - x)^{L_1}] p(x). The fault coverage after checking the second signature at check point l_2 is:

FC_2 = FC_1 + β_2[UDT_1 - \int_0^1 (1 - x)^{L_2} p_1(x) dx]
     = FC_1 + β_2{UDT_1 - \int_0^1 (1 - x)^{L_2} [1 - β_1 + β_1(1 - x)^{L_1}] p(x) dx}
     = [β_2 + β_1(1 - β_2)] - [β_1(1 - β_2) I(L_1) + β_2(1 - β_1) I(L_2)] - β_1 β_2 I(L_1 + L_2).   (4.18)

Similarly, we can find the distribution of the detection probabilities of the remaining faults, and the fault coverage after checking the nth signature:

FC_n = \sum_{i=1}^{n} β_i \prod_{j=i+1}^{n} p_j - \prod_{i=1}^{n} p_i \sum_{j=1}^{n} [\sum_{m_1=1}^{n-j+1} (β_{m_1}/p_{m_1}) \sum_{m_2=m_1+1}^{n-j+2} (β_{m_2}/p_{m_2}) ... \sum_{m_j=m_{j-1}+1}^{n} (β_{m_j}/p_{m_j}) I(\sum_{g=1}^{j} L_{m_g})],   (4.19)

where m_0 = 0, and \sum_a^b = 0 and \prod_a^b = 1 if a > b.

As an example, the fault coverage expression for checking three signatures is given below:

FC_3 = [\sum_{i=1}^{3} β_i \prod_{j=i+1}^{3} p_j] - p_1 p_2 p_3 [\sum_{m_1=1}^{3} (β_{m_1}/p_{m_1}) I(L_{m_1}) + \sum_{m_1=1}^{2} (β_{m_1}/p_{m_1}) \sum_{m_2=m_1+1}^{3} (β_{m_2}/p_{m_2}) I(L_{m_1} + L_{m_2}) + \sum_{m_1=1}^{1} (β_{m_1}/p_{m_1}) \sum_{m_2=m_1+1}^{2} (β_{m_2}/p_{m_2}) \sum_{m_3=m_2+1}^{3} (β_{m_3}/p_{m_3}) I(L_{m_1} + L_{m_2} + L_{m_3})]
     = [β_1 p_2 p_3 + β_2 p_3 + β_3] - p_1 p_2 p_3 {[(β_1/p_1) I(L_1) + (β_2/p_2) I(L_2) + (β_3/p_3) I(L_3)] + [(β_1 β_2 / p_1 p_2) I(L_1 + L_2) + (β_1 β_3 / p_1 p_3) I(L_1 + L_3) + (β_2 β_3 / p_2 p_3) I(L_2 + L_3)] + [(β_1 β_2 β_3 / p_1 p_2 p_3) I(L_1 + L_2 + L_3)]}
     = [β_1 p_2 p_3 + β_2 p_3 + β_3] - {[β_1 p_2 p_3 I(L_1) + p_1 β_2 p_3 I(L_2) + p_1 p_2 β_3 I(L_3)] + [β_1 β_2 p_3 I(L_1 + L_2) + β_1 p_2 β_3 I(L_1 + L_3) + p_1 β_2 β_3 I(L_2 + L_3)] + [β_1 β_2 β_3 I(L_1 + L_2 + L_3)]}.
(4.20)

Assuming each signature to be of the same length, i.e., p_i = 2^{-k}, and hence β_i = (1 - 2^{-k}), for i = 1, ..., n, yields:

FC_n = (1 - 2^{-k}) \sum_{i=1}^{n} 2^{-(n-i)k} - \sum_{j=1}^{n} [(1 - 2^{-k})^j 2^{-(n-j)k} \sum_{m_1=1}^{n-j+1} \sum_{m_2=m_1+1}^{n-j+2} ... \sum_{m_j=m_{j-1}+1}^{n} I(\sum_{g=1}^{j} L_{m_g})].   (4.21)

If 2^{-k} << 1, we have:

FC_n ≈ 1 - \sum_{j=1}^{n} [2^{-(n-j)k} \sum_{m_1=1}^{n-j+1} \sum_{m_2=m_1+1}^{n-j+2} ... \sum_{m_j=m_{j-1}+1}^{n} I(\sum_{g=1}^{j} L_{m_g})].   (4.22)

For example, when n = 3 and 2^{-k} << 1, the above equation yields:

FC_3 ≈ 1 - \sum_{j=1}^{3} [2^{-(3-j)k} \sum_{m_1=1}^{3-j+1} \sum_{m_2=m_1+1}^{3-j+2} ... \sum_{m_j=m_{j-1}+1}^{3} I(\sum_{g=1}^{j} L_{m_g})]
     = 1 - 2^{-2k} \sum_{m_1=1}^{3} I(L_{m_1}) - 2^{-k} \sum_{m_1=1}^{2} \sum_{m_2=m_1+1}^{3} I(L_{m_1} + L_{m_2}) - \sum_{m_1=1}^{1} \sum_{m_2=m_1+1}^{2} \sum_{m_3=m_2+1}^{3} I(L_{m_1} + L_{m_2} + L_{m_3})
     = 1 - 2^{-2k}[I(L_1) + I(L_2) + I(L_3)] - 2^{-k}[I(L_1 + L_2) + I(L_1 + L_3) + I(L_2 + L_3)] - I(L_1 + L_2 + L_3).   (4.23)

The comprehensive fault coverage model developed above is based on the density function of fault detection probabilities of a CUT. Like the other aliasing analysis techniques based on the detection probability profile, a major difficulty in using this model is the prohibitive computational effort required to obtain a precise detection probability density function for large CUTs.

4.2.3 A Simplified Fault Coverage Model

As discussed in Section 4.2, most faults detected in a segment i are also likely to be detected in some of the later segments i+1, i+2, ..., n. In [Zhang93][Zhang93b], a simplified fault coverage model was proposed which simply assumes that all faults detected in a segment i also produce errors in the later segments i+1, i+2, ..., n. This simplified model is based on the aliasing probability and the easily obtainable fault coverage curve before compaction.
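Before turning to the simplified model, the structure of the comprehensive model of Sec. 4.2.2 can be checked numerically. The sketch below is a minimal illustration, assuming a hypothetical three-point discrete detection-probability profile (an assumption of this sketch, not data from the text): it evaluates Eqn. 4.19 in its expanded subset form, where the coefficient of I(\sum_{m \in S} L_m) is \prod_{m \in S} β_m \prod_{m \notin S} p_m, and compares it with the stage-by-stage recursion p_i(x) = [p_i + β_i(1-x)^{L_i}] p_{i-1}(x); the two agree.

```python
from itertools import combinations
from math import prod

# Hypothetical discrete detection-probability profile: (weight, x) pairs,
# with weights summing to 1.  An illustrative assumption only.
PROFILE = [(0.5, 0.30), (0.3, 0.05), (0.2, 0.01)]

def I(L, profile=PROFILE):
    # I(L) = integral of (1 - x)^L p(x) dx, for the discrete profile.
    return sum(w * (1.0 - x) ** L for w, x in profile)

def fc_closed(betas, Ls, profile=PROFILE):
    # Eqn. 4.19 expanded over subsets: the coefficient of I(sum L_m, m in S)
    # is prod(beta_m for m in S) * prod(p_m for m not in S).
    n = len(betas)
    ps = [1.0 - b for b in betas]
    lead = sum(betas[i] * prod(ps[i + 1:]) for i in range(n))
    loss = 0.0
    for r in range(1, n + 1):
        for S in combinations(range(n), r):
            rest = [m for m in range(n) if m not in S]
            coeff = prod(betas[m] for m in S) * prod(ps[m] for m in rest)
            loss += coeff * I(sum(Ls[m] for m in S), profile)
    return lead - loss

def fc_recursive(betas, Ls, profile=PROFILE):
    # Stage-by-stage recursion of Sec. 4.2.2: after stage i the undetected
    # mass of a fault with detection probability x is scaled by
    # p_i + beta_i * (1 - x)^L_i.
    undetected = [w for w, _ in profile]
    for b, L in zip(betas, Ls):
        undetected = [u * ((1.0 - b) + b * (1.0 - x) ** L)
                      for u, (_, x) in zip(undetected, profile)]
    return 1.0 - sum(undetected)
```

For n = 1 both reduce to β(1 - I(L)), i.e., Eqn. 4.15, which provides a quick sanity check of the recursion.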
Assume the aliasing probabilities at the check points l_1, l_2, ..., l_n to be p_1, p_2, ..., p_n, respectively. Assume that, at the check points, the corresponding fault coverages before data compaction are known to be F_1, F_2, ..., F_n. The portion of faults detected in segment i is thus F_i - F_{i-1}, for i = 1, ..., n, defining F_0 = 0.

After checking the first signature at l_1, F_1 p_1 of the faults detected in the first segment would escape detection. If all the aliased faults are assumed to be re-detected in the later segments 2, 3, ..., n, then after checking the second signature at l_2, the portion of faults aliased from the portion F_1 would be F_1 p_1 p_2. Thus, the portion of faults that would be aliased from F_1 after checking n signatures is:

FCL_1 = F_1 \prod_{j=1}^{n} p_j.   (4.24)

Similarly, for the (F_2 - F_1) faults first detected in the second segment, after checking the second signature at l_2, (F_2 - F_1) p_2 of them would be aliased. Again, assuming that these aliased faults are re-detected in the later segments 3, 4, ..., n yields the fault coverage loss from (F_2 - F_1) after checking n signatures:

FCL_2 = (F_2 - F_1) \prod_{j=2}^{n} p_j.   (4.25)

In general, for the (F_i - F_{i-1}) faults first detected during the ith segment, the portion of the faults aliased is:

FCL_i = (F_i - F_{i-1}) \prod_{j=i}^{n} p_j,   i = 1, ..., n.   (4.26)

Therefore, the total fault coverage loss with n signatures is:

FCL = \sum_{i=1}^{n} FCL_i = \sum_{i=1}^{n} (F_i - F_{i-1}) \prod_{j=i}^{n} p_j.   (4.27)

Since the final fault coverage before data compaction is F_n, the fault coverage when checking n signatures is [Zhang93][Zhang93b]:

FC_n = F_n - FCL = F_n - \sum_{i=1}^{n} (F_i - F_{i-1}) \prod_{j=i}^{n} p_j.   (4.28)

Assuming p_i = 2^{-k}, for i = 1, ..., n, yields:

FC_n = F_n - \sum_{i=1}^{n} (F_i - F_{i-1}) 2^{-(n-i+1)k}.   (4.29)

Eqns. 4.28 and 4.29 predict the fault coverage with multiple signature analysis. This model is solely based on the knowledge of the fault coverage before data compaction and
the aliasing probability of the signature analyzer. It is thus much easier to use than the one developed in Sec. 4.2.2. However, this simplified model is optimistic, since it assumes that all the faults detected in segment i are re-detected in all later segments i+1, i+2, ..., n. In practice, this assumption may not hold: some faults may be re-detected in some of the later segments, but not necessarily in all of them, and some faults may not be re-detected at all. However, if L_i ≤ L_{i+1} for i = 1, ..., n-1, as in [Lee88][Lambidonis91], the assumption is well justified, as shown in [Zhang93][Zhang93b].

Chapter 5

Multiple Intermediate Signature Analysis — II

The previous chapter studied the aliasing and fault coverage performance of multiple signature analysis. This chapter addresses other issues of multiple signature analysis, namely possible implementations, test control, test result observation, and applications.

5.1 Possible Implementations

5.1.1 Conceptual Understanding of Multiple Signature Analysis

Although multiple signature analysis can be implemented in different ways, some functional blocks are commonly required. Assume that the n signatures generated by the LFSR at the check points l_1, l_2, ..., l_n are s_1, s_2, ..., s_n, respectively. Once the check points are fixed, one can easily determine, by logic simulation, the n corresponding fault-free signatures or references, r_1, r_2, ..., r_n. Except for the fuzzy multiple signature (FMS) analysis [Wu92] and the single-reference multiple signature (SMS) analysis [Wu93], which we will introduce in the next two chapters, when testing a CUT, the ith signature s_i is compared with a specific reference r_i at the ith check point. Thus, in conventional multiple signature (CMS) analysis, the signatures and references must correspond on a one-to-one basis for a CUT to be declared good, i.e., signature s_i must match r_i for all i.
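The one-to-one CMS decision rule can be sketched behaviorally. The following is a minimal illustration, not the dissertation's hardware: the serial signature analyzer uses an arbitrary illustrative characteristic polynomial (x^8 + x^4 + x^3 + x^2 + 1, encoded as 0x1D), the reference generator stands in for fault-free logic simulation, and the checker aborts at the first mismatched intermediate signature.

```python
def lfsr_step(sig, bit, k=8, poly=0x1D):
    # One serial step of a k-bit signature analyzer; the polynomial is an
    # illustrative choice, not one prescribed by the text.
    msb = (sig >> (k - 1)) & 1
    sig = ((sig << 1) & ((1 << k) - 1)) ^ bit
    return sig ^ (poly if msb else 0)

def make_references(bits, checkpoints, k=8, poly=0x1D):
    # Stand-in for fault-free logic simulation: record the LFSR contents
    # at each check point as the references r_1, ..., r_n.
    refs, sig = [], 0
    for t, b in enumerate(bits, start=1):
        sig = lfsr_step(sig, b, k, poly)
        if t in checkpoints:
            refs.append(sig)
    return refs

def cms_test(bits, checkpoints, references, k=8, poly=0x1D):
    # One-to-one CMS checking: signature s_i must equal reference r_i;
    # the session aborts at the first mismatch (CUT declared bad early).
    sig, i = 0, 0
    for t, b in enumerate(bits, start=1):
        sig = lfsr_step(sig, b, k, poly)
        if i < len(checkpoints) and t == checkpoints[i]:
            if sig != references[i]:
                return False
            i += 1
    return True
```

Because the compactor is linear over GF(2), a single flipped response bit always produces a nonzero syndrome at the next check point, so the early abort fires.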
Thus, in implementation, besides the hardware required for storing the n references, an index generator is also required to establish the one-to-one correspondence between signatures and references. Fig. 5.16 conceptually illustrates the CMS scheme. In Fig. 5.16, a ring counter is used to provide the index. When the test reaches the ith check point, the ith bit of the ring counter is "1", and the signature collected in the LFSR is compared with the ith reference r_i. If they match, the ring counter advances by shifting the "1" to the (i+1)th bit. Otherwise, the test can be terminated, and the CUT declared bad. If signature s_i matches r_i for all i, the CUT is declared good.

Figure 5.16: Conceptual representation of the CMS scheme.

5.1.2 Straightforward Implementations

A straightforward implementation of the CMS scheme is shown in Fig. 5.17. As shown, in addition to the ROM for storing the references, extra circuitry is required to generate the index and to compare the references with the signatures. In this implementation, the index is provided by the ROM address generator, which consists of a log2(n)-stage binary counter and a log2(n)-to-n index decoder. The comparator can be composed of k 2-input XOR gates and a k-input NAND or NOR gate. Usually, the extra circuitry consumes more silicon area than the ROM itself. Furthermore, the connections between these functional blocks can also take considerable silicon area.

Figure 5.17: A straightforward CMS implementation with ROM.

Alternatively, instead of storing the references in a ROM and using a separate comparator to generate the Pass/Fail signal, as shown in Fig. 5.18, one can use a combinational
logic (CL) block both for storing the references and for comparing them with the collected signatures. Assume the ith output line of the index decoder to be 1 at the ith check point l_i. Then, at check point l_i, if the signature collected in the LFSR matches the ith reference, the CL generates a Pass signal; otherwise, it generates a Fail. If the CL generates Pass signals at all check points, the CUT is declared good. Otherwise, the CUT is bad.

Figure 5.18: A straightforward CMS implementation with CL.

5.1.3 An Implementation by Resource Sharing

In [Wu91], a CMS implementation that shares some of the resources used for test pattern generation was described. Compared to the straightforward implementations discussed in the preceding section, the resource-sharing CMS implementation requires less silicon area. Fig. 5.19 shows the block diagram of the CMS implementation by resource sharing.

Figure 5.19: A CMS implementation with resource sharing.

In Fig. 5.19, the dotted box containing the CUT shows the standard BIST scheme, except that the pseudorandom input pattern generator (IPG) is split into two segments, IPG_1 and IPG_2, which are controlled by separate clock signals. The extra circuitry required by the scheme is a reference generator (RG) and a k-to-1 multiplexer (MUX), where k is the signature length. There is some local wiring overhead, but it is not significant. The scheme works as follows. Each time a signature is checked, the RG generates a k-bit reference. Then, the MUX converts the generated reference into a k-bit serial sequence. Meanwhile, the XOR gate compares the signature generated in the signature analyzer, bit by bit, with the reference sequence from the MUX, and gives a Pass/Fail signal. IPG_1 is a LFSR of length log2(k).
Besides generating input vectors to the CUT, IPG_1 also provides the log2(k) control signals to the MUX. If the test length applied to the CUT is l, then the number of input lines to the RG is log2(l). To check k-bit signatures, the RG has k output lines. The RG is basically a decoder, which can easily be implemented with a PLA or another type of logic.

Regarding the control of the scheme, if a centralized controller can provide two clock signals to IPG_1 and IPG_2, then no extra control circuitry is required. When applying test patterns to the CUT, the two clock signals are identical. When checking a signature, the clock to IPG_2 is stopped, and k clock cycles are provided to IPG_1. If the central controller provides only one clock signal, some extra local control circuitry is required. Fig. 5.20 shows a possible local controller for the case k = 4. It works as follows. When a signature is checked, the RG outputs a logic 1 signal to the local controller, which resets IPG_1 to a known state and cuts off the clock signal to IPG_2. After IPG_1 is reset, Gate 1 outputs a logic 0 signal to cut off the reset signal (rs). Now, the clock feeds only IPG_1. After IPG_1 has shifted k clock cycles, Gate 2 outputs a signal to reopen the clock to IPG_2. The overhead of the local controller is about the size of a 1-bit LFSR.

Figure 5.20: An example local control circuit.

A deficiency of this implementation is that its hardware requirement increases with the test length l, since the size of the RG block is a function of log2(l). Thus, for large CUTs that require long test lengths, this implementation may impose substantial hardware requirements.
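The bit-serial comparison at the heart of the resource-sharing scheme can be sketched as follows. This is a behavioral illustration only, assuming LSB-first serialization of the reference (the actual bit ordering is set by the IPG_1/MUX wiring, which the text does not fix): the MUX emits one reference bit per clock, and a single XOR gate latches any mismatch as Fail.

```python
def serial_compare(signature_bits, reference, k):
    # Behavioral sketch of the resource-sharing checker: the MUX serializes
    # the k-bit reference (selection driven by IPG_1), and one XOR gate
    # compares it bit by bit against the signature; a mismatch is latched.
    fail = 0
    for i in range(k):
        ref_bit = (reference >> i) & 1   # MUX output at clock i (LSB first)
        fail |= ref_bit ^ signature_bits[i]
    return fail == 0  # True means Pass
```

This replaces the k parallel XOR gates of the straightforward implementation with a single gate, at the cost of k extra clock cycles per check point.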
5.2 Test Result Observation

In BIST, after a test session terminates, the test result must be accessible externally through one or more I/O pins. A simple way to report the test result is to produce a single-bit Pass/Fail signal when the test is complete. In single signature analysis, after the entire test output sequence has been compacted into the signature analyzer, the content of the LFSR is compared with the reference, and the result of the comparison forms the Pass/Fail signal. In multiple signature analysis, however, the final Pass/Fail signal must be based on "intermediate Pass/Fail" signals produced at each of the check points. For a CUT to generate a "Pass" signal, all of the "intermediate Pass/Fail" signals must be in the "Pass" state. Otherwise, a "Fail" signal is generated. The following considers two different cases for test result observation in multiple signature analysis. Assume that a logic "0" corresponds to an "intermediate Pass" signal, while a logic "1" corresponds to an "intermediate Fail" signal.

In the first case, if a Pass/Fail pin is available, a "zero" detector may be used, as shown in Fig. 5.21, to detect the "intermediate Pass/Fail" signals. Prior to testing a CUT, the "zero" detector is preset to "1", thus setting the Pass/Fail pin to 0. When a signature is checked, the controller temporarily sets the chk signal to 1. This sensitizes the "zero" detector to the intermediate Pass/Fail signal. Once an intermediate Fail signal is detected, the detector outputs and holds a Fail signal, "1", at the Pass/Fail pin. The "zero" detector can be shared by all self-testable blocks on a chip. When shared, the detector outputs and holds a Fail signal if any of the self-testable blocks is found faulty. This Fail signal can be used to terminate testing, thereby saving test time.
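The sticky behavior of the "zero" detector can be sketched behaviorally (a minimal model of the latch just described, not its gate-level circuit): the intermediate Pass/Fail line is sampled only while chk is asserted, and once a Fail is captured the output stays at Fail for the rest of the session.

```python
class ZeroDetector:
    # Behavioral sketch of the sticky "zero" detector of Fig. 5.21.
    def __init__(self):
        self.fail_latched = False  # preset state: no Fail seen yet

    def clock(self, chk, intermediate_fail):
        # Sample the intermediate Pass/Fail signal only while chk = 1;
        # a captured Fail (logic 1) is held until the session ends.
        if chk and intermediate_fail:
            self.fail_latched = True
        return self.fail_latched  # True at the Pass/Fail pin means Fail
```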
In the second case, where a specific protocol for control and observation of BIST is available [Scholz88], the test result of each self-testable block may consist of a single-bit flag stored in a status register [Ravinder89][Zorian91]. Once a faulty signature from a self-testable block is detected, the flag for that block is set. The status can be accessed through the IEEE 1149.1 Test Access Port [IEEE90][Zorian91]. In this case, the flag can serve as the "zero" detector required in the previous case.

Figure 5.21: Test result observation for multiple signature analysis.

5.3 Control of Multiple Signature Analysis

For the control of conventional single signature schemes, an on-chip log2(l)-bit binary counter, with l being the test length, is required to count the applied test vectors (see Fig. 5.22a) [Breuer88][Gelsinger86]. When the final count is reached, a simple decoder detects it and generates a signal to stop the test and perform signature evaluation. In multiple signature analysis, however, multiple control signals are required, one for each of the check points. Therefore, the hardware requirement in this case is in general greater than that of the single signature schemes. However, as will be shown next, if the check points are scheduled periodically, the hardware requirement for the control of multiple signature analysis can be made as small as that of any single signature scheme. Three cases, depending on the scheduling of the check points, are considered.

Case 1. The check points are arbitrarily scheduled, i.e., the test length between check points l_i and l_{i+1} is arbitrary. To control multiple signature analysis in this case, as in any single signature scheme, an on-chip counter is required to count the applied test vectors.
When a check point is reached, a decoder detects the corresponding count and outputs the chk signal to enable the evaluation of a signature. Unlike the single signature scheme's controller, where the decoder detects only the final count, the decoder in this case must decode all the counts corresponding to the check points, and is thus considerably larger. Assuming the use of a single log2(l)-input NAND or NOR gate for decoding the final count in the single signature case, the decoder in the multiple signature case may require up to n such gates, plus an n-input gate, for checking n signatures in the worst case, if no logic minimization can be performed. Among the three control implementations considered in this section, this is the worst in terms of hardware overhead.

Case 2. The scheduling of the check points follows a periodic pattern, i.e., the test length between l_i and l_{i+1} is constant for all i. A convenient constant is 2^q, where q is an integer. To control multiple signature analysis in this case, one may simply split the binary counter required in the single signature scheme into two segments, say C_1 and C_2, as shown in Fig. 5.22b. C_1, which is q bits long, counts the test length between two adjacent check points, i.e., 2^q. C_2 counts the number of signatures to be checked. Assume the final count to be 0 for both C_1 and C_2. Each time C_1 decrements to 0, a decoder decodes the 0 and generates the chk signal to enable the evaluation of a signature. If the signature is incorrect, testing can be terminated and the CUT declared faulty. Otherwise, C_2 is decremented by one, and testing continues. When both C_1 and C_2 reach 0, the test is complete. Obviously, the controller in this case requires the same amount of hardware as that for single signature analysis, since the total length of C_1 and C_2 is the same as that of the counter used in single signature analysis.
The total complexity of the two decoders required in the multiple signature case is also the same as that of the decoder in single signature analysis. This is the best case in terms of hardware overhead.

Case 3. The check points are selected such that the test length between two adjacent check points, l_i and l_{i+1}, is variable but constrained to values of 2^{q_i}, where q_i is an integer. This case lies between the first two. In this case, one may still use C_1 and C_2 to count the applied test vectors. C_1 must be of length q_s, where q_s = min_i q_i. Two decoders are required. One decodes the 0 state from C_1, while the other decodes the counts of 2^{q_i - q_s} from C_2, i = 1, ..., n-1. Every time C_1 reaches 0 and C_2 reaches 2^{q_i - q_s}, i ∈ {1, ..., n-1}, a signature is checked. The hardware overhead in this case lies between the first two cases, since the decoder for C_1 detects only one count, but the decoder for C_2 has to detect n counts corresponding to the check points.

Figure 5.22: Example controllers for signature analysis: (a) a controller for single signature analysis; (b) a controller for multiple signature analysis.

The above discussion assumes a counter for controlling the signature analysis. A LFSR in autonomous mode can also be used to count the applied test vectors, e.g., as in the BIST controller of the Intel 80386 [Gelsinger86]. The control implementations discussed above apply equally well to the case where the conventional counter is replaced by a LFSR-type counter. Splitting an LFSR into two smaller LFSR-based counters, C_1 and C_2, will not affect the randomness of the input pattern generator if concatenable polynomials are used [Bhavsar85].
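The Case 2 split-counter controller can be sketched behaviorally (an illustration of the counting discipline only, with C_1 modeled as a down-counter as described above; counter encodings and decoder details are abstracted away):

```python
def periodic_chk_schedule(q, n):
    # Behavioral sketch of the Case 2 controller: C1 spans the 2**q test
    # vectors between check points and C2 counts the n signatures; the
    # chk signal fires whenever C1 counts down to 0.
    chk_points = []
    c1, c2, t = 1 << q, n, 0
    while c2 > 0:
        t += 1
        c1 -= 1
        if c1 == 0:              # decoder sees C1 == 0: assert chk
            chk_points.append(t)
            c1 = 1 << q          # reload C1 for the next segment
            c2 -= 1              # one fewer signature left to check
    return chk_points
```

With q = 3 and n = 4 the chk signal fires at vectors 8, 16, 24, and 32, i.e., the total count q + log2(n) bits is exactly the log2(l)-bit counter of the single signature scheme, split in two.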
5.4 Applications

As shown in the previous chapter, by careful arrangement of the check points, checking multiple signatures can yield significantly smaller aliasing than conventional single signature analysis. Moreover, recent research has revealed many other advantages of multiple signature analysis. This section provides a brief review of its possible applications.

5.4.1 Exact BIST Fault Coverage Calculation

Calculating the exact fault coverage implies fault-simulating the CUT with a specific test pattern generator and output data compactor. Without output data compaction, fault simulation can exploit fault dropping to accelerate the process. In this case, the corresponding computational effort is proportional to the shaded area above the fault coverage curve (see Fig. 5.23). However, to determine the fault coverage after compaction when only the final signature is checked, no fault dropping is possible. This is because aliasing makes it possible that a fault detected at some point still escapes detection at the end of the test. Without fault dropping, fault simulation implies simulating each fault for the entire test length. In this case, the computational effort is proportional to the total shaded area above and below the fault coverage curve (see Fig. 5.24). For large CUTs, the CPU time required for such a fault simulation may become prohibitive [Waicukauski87][Lambidonis91][Zhang93].

Figure 5.23: Fault simulation time to determine fault coverage before data compaction using fault dropping.

Figure 5.24: Fault simulation time to determine fault coverage after data compaction.

However, if multiple signature analysis is used, some amount of fault dropping can be applied to reduce the fault simulation time.
After checking an intermediate signature, all the faults that are detected by this signature can be dropped from further consideration. If a total of two signatures are checked, the required fault simulation time is illustrated by the shaded area shown in Fig. 5.25. Clearly, compared to the case shown in Fig. 5.24, this simple example shows how drastic reductions in total simulation time can be obtained using multiple signature analysis. If more intermediate signatures are checked, the reduction in computational time can be even more significant.

Figure 5.25: Fault simulation time with multiple signature analysis (assuming two signatures here).

5.4.2 Test Time Reduction

Short test time is desirable in VLSI testing since it implies higher productivity and hence lower production costs. Reducing test time is another possible application of multiple signature analysis. In single signature analysis, a bad signature cannot be identified until the entire test response sequence has been compacted. In comparison, with multiple signature analysis, a test session can be terminated, and the CUT declared bad, as soon as an incorrect intermediate signature is found. For good CUTs, testing with multiple signature analysis takes the same time as with a single signature scheme. Thus, on average, the test time can be significantly reduced if the yield is not too high. In general, the test time reduction depends on the yield: the lower the yield, the shorter the average test time. Furthermore, the reduction also depends on the scheduling of the check points and on the shape of the fault coverage curve [Robinson85][Lee88]. An algorithm that computes the optimal check point scheduling to minimize average test time was developed in [Lee88]. In general, the algorithm tends to schedule the check points at early stages of the test session.
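The yield dependence described above can be made concrete with a small illustrative model (an assumption of this sketch, not a formula from the text): a bad CUT is assumed to abort at the first check point by which its fault has been detected, aliasing is ignored, and good CUTs always run to completion.

```python
def expected_test_length(yield_, checkpoints, coverage):
    # Illustrative average-test-length model: coverage[i] is the fault
    # coverage reached by checkpoints[i]; a bad CUT stops at the first
    # check point whose segment detects its fault (aliasing ignored),
    # and an undetected bad CUT runs the full test length.
    l = checkpoints[-1]
    e_bad, prev = 0.0, 0.0
    for li, fi in zip(checkpoints, coverage):
        e_bad += (fi - prev) * li
        prev = fi
    e_bad += (1.0 - prev) * l
    return yield_ * l + (1.0 - yield_) * e_bad
```

For example, with check points at 500 and 1000 vectors and coverages 0.9 and 0.98, a 90% yield gives an average of 955 vectors, and a 50% yield gives 775, versus a fixed 1000 for single signature analysis: the lower the yield, the larger the saving, in line with the discussion above.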
But it was shown that even with periodic check point scheduling, the average test time can be significantly shorter than with single signature schemes [Lee88].

Traditionally, BIST with output data compaction has mainly been used for off-line testing, e.g., manufacturing testing and in-field testing [Gelsinger86][Gelsinger89]. For off-line testing, test time affects only the production costs. Recently, researchers have also proposed using BIST with signature analysis for concurrent, or "on-line", testing [Saluja88][Katoozi92]. Unlike off-line testing, in concurrent testing the time required to detect a fault once it occurs is crucial to the dependability of the system, which may include reliability, availability, safety, maintainability, and performability, depending on the application of the system [Johnson89]. Due to the shorter average test time, segmented testing, and hence multiple signature analysis in BIST, is highly recommended for concurrent testing [Robinson85].

5.4.3 BIST Failure Diagnosis

Fault diagnosis is the process of locating the fault or faults in a faulty system. For systems at the level of circuit boards and multichip module packages, fault diagnosis is a necessary process for repair. At the VLSI chip level, fault diagnosis is often used in the analysis of failure mechanisms for fault modeling and process improvement [Waicukauski87][Aitken89b][Rajski91]. In addition to distinguishing good CUTs from bad ones, the signatures obtained from CUTs also provide much information for fault diagnosis [McAnney87][Waicukauski87][Karpovsky91][Rajski91]. A straightforward diagnostic method is to build a fault dictionary that contains the correspondence between each possible faulty signature and the detectable faults that would produce that signature.
Upon completion of a test session, the faulty signature obtained is used as an index into the fault dictionary to look up the location of the fault. Diagnostic fault simulation is the only way to build the fault dictionary. Unlike ordinary fault simulation, no fault dropping is allowed in diagnostic fault simulation [Waicukauski87]. Thus, this method is usually impractical for large CUTs.

In reality, however, only a few faults are ever actually used for diagnosing failures during the life of a product [Waicukauski87]. Thus, it is highly desirable to consider only these few faults, so as to cut down the computational effort required for fault diagnosis and to increase the diagnostic resolution. Unfortunately, there is no way of knowing beforehand what these faults will be, since the defects to be diagnosed are scattered throughout the design. Thus, diagnosis methods based on post-test fault simulation have been proposed [Waicukauski87][Aitken89b]. These methods all collect some kind of intermediate "signatures". After a test session is completed, a set of intermediate signatures is obtained. If any of them is faulty, the CUT is faulty. To locate the fault, all possible modeled faults are simulated up to the first check point. Then, all the faults whose signatures obtained in simulation differ from the corresponding signature obtained in testing are removed from further consideration. The remaining faults, i.e., those whose signature matches the signature obtained at the first check point in testing, are further simulated up to the second check point. A similar process continues until the final check point, or until the fault is located. This methodology has been in use at IBM for some time [Waicukauski87].

Chapter 6

Fuzzy Multiple Signature Analysis

In the CMS scheme, checking n signatures requires n references.
Each time a signature is checked, it is compared to a specific reference. For a CUT to be declared good, each of the signatures must match its corresponding reference. This strict one-to-one correspondence between the checked signatures and the references makes the implementation of the CMS scheme relatively complex and expensive in terms of silicon area. In this chapter, we develop a Fuzzy Multiple Signature (FMS) analysis scheme which does not require the aforementioned one-to-one correspondence. The FMS scheme is simple to implement and requires little hardware.

6.1 Basis and Implementations

As discussed in Chapter 4, the complexity of checking multiple signatures is mainly due to the requirement of a one-to-one correspondence between the references and the signatures. By removing this strict requirement, one can obtain a much simpler data compactor. This is the basic idea of the FMS scheme. The scheme is referred to as a fuzzy multiple signature scheme because it checks multiple signatures but does not impose the one-to-one correspondence between checked signatures and references; a degree of fuzziness is thus introduced into the reference-signature relationship.

6.1.1 Basis

Like the CMS scheme, the FMS scheme checks n signatures at the check points l_1, l_2, ..., l_n. However, unlike the CMS scheme, where a signature s_i is compared with a specific reference r_i at check point l_i, in the FMS scheme each signature s_i is compared with the whole set of references {r_1, r_2, ..., r_n}. A signature s_i is considered good if it matches any of the references in the reference set. Therefore, with the FMS scheme, for a CUT to be declared good, it suffices that the signature obtained from the LFSR at each check point match any of the references r_1, r_2, ..., r_n. Fig. 6.26 conceptually illustrates the FMS scheme.
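The FMS decision rule just described can be sketched in a few lines (a behavioral illustration of the rule, not of the Signature Observer hardware): a CUT passes if and only if every checked signature belongs to the reference set, with no signature tied to any particular reference.

```python
def fms_test(signatures, references):
    # FMS decision rule: every checked signature must belong to the
    # reference set; no one-to-one correspondence is required.
    ref_set = set(references)
    return all(s in ref_set for s in signatures)
```

Note the contrast with CMS: an output sequence whose signatures are, say, a permutation or repetition of the references is rejected by CMS but accepted by FMS, which is precisely the source of both the simpler hardware and the slightly larger aliasing discussed next.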
The fuzziness introduced may result in a small increase of aliasing compared to the CMS scheme for given k and n, but this is easily compensated for by the reduced complexity of the FMS scheme. Otherwise, the FMS scheme possesses all the advantages that the CMS scheme has over the single signature schemes.

Figure 6.26: Conceptual representation of the FMS scheme.

6.1.2 Implementation

Since the one-to-one correspondence between references and signatures no longer exists, implementing the FMS scheme is very simple. As shown in Fig. 6.27, the FMS data compactor consists of a Signature Observer (SO) and an LFSR. The LFSR collects signatures. The SO checks each signature and generates Pass/Fail signals. At each check point, if the signature generated by the LFSR matches any of the references, the SO outputs a Pass signal, say logic 0. Otherwise, the SO outputs a Fail signal, say logic 1. The Fail signal can be fed to a test controller to terminate the testing and declare the CUT faulty. If the SO outputs Pass signals at all the check points, the CUT is declared good.

Figure 6.27: The FMS Data Compactor.

From the above discussion, the SO is a decoder, i.e., a k-input, 1-output combinational circuit which outputs a 0 when its input vector belongs to the reference set, and outputs a 1 otherwise.

Example 5.1: Assume n = k = 3, i.e., checking three 3-bit signatures. If the references are r1 = 111, r2 = 110, and r3 = 100, and the three bits of a signature are denoted b1, b2, and b3 (with ' denoting complement), the function of the SO can be described as

Pass/Fail = (b1·b2·b3 + b1·b2·b3' + b1·b2'·b3')' = (b1·b2 + b1·b3')' = (b1·(b2'·b3)')'.

Thus, the SO can be implemented with two 2-input NAND gates as shown in Fig. 6.28.

6.2 FMS Aliasing Performance Analysis

Assume the compaction of an l-bit random sequence into a k-bit signature, and r1 to be the only valid reference.
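The minimized SO function of Example 5.1 can be sanity-checked exhaustively. The sketch below assumes, as the two-gate implementation does, that the complemented bit b2' is available directly from the LFSR flip-flop:

```python
# Verify the two-NAND SO of Example 5.1 against its reference set.
REFS = {(1, 1, 1), (1, 1, 0), (1, 0, 0)}   # r1, r2, r3

def so_two_nands(b1, b2, b3):
    """Two 2-input NAND gates; 0 = Pass, 1 = Fail."""
    g1 = 1 - ((1 - b2) & b3)      # NAND(b2', b3) = b2 + b3'
    return 1 - (b1 & g1)          # NAND(b1, g1)

for b1 in (0, 1):
    for b2 in (0, 1):
        for b3 in (0, 1):
            expected_fail = 0 if (b1, b2, b3) in REFS else 1
            assert so_two_nands(b1, b2, b3) == expected_fail
print("SO function matches the reference set on all 8 input vectors")
```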
The total number of sequences (including the fault-free one) that map onto r1 is 2^(l−k). If we assume that the distinct references r1 and r2 are both acceptable, then the total number of sequences that map onto either r1 or r2 is 2·2^(l−k). In general, with m acceptable distinct references, m·2^(l−k) sequences map onto the reference set, which yields an aliasing probability Pal ≈ m·2^(−k) for a single check.

Figure 6.28: An Example of the FMS Scheme where the references are 111, 110 and 100.

The FMS scheme checks n signatures at check points l1, l2, ..., ln against a set of m references, with m ≤ n in general (the reason that m ≤ n will be given later). Using the arguments presented in Chapter 4 for the aliasing probability of CMS schemes, the following aliasing probability results for the FMS scheme:

P_FMS = ((m·2^(l1−k) − 1)/(2^l1 − 1)) · ((m·2^(l2−k) − 1)/(2^l2 − 1)) ··· ((m·2^(ln−k) − 1)/(2^ln − 1)),

which yields:

P_FMS ≈ [m·2^(−k)]^n,    (6.32)

where m ≤ n in general because there can be at most n distinct references if we check n signatures. However, some references may happen to be, or be made, identical [Wu92b][Wu93][Wu93c], thus making m < n. Clearly, for fixed k and n, the best-case aliasing occurs for m = 1, for which P_FMS ≈ 2^(−nk) [Wu93]. m = 1 is also the best case in terms of hardware requirements for implementing the FMS scheme [Wu93]. When m = n, the worst-case aliasing occurs, for which P_FMS ≈ [n·2^(−k)]^n. The following analysis assumes the worst-case scenario (i.e., m = n).

To study the aliasing performance of the FMS scheme, we define the FMS scheme equivalent length, L_eq^FMS, as a figure of merit. For a given aliasing probability in the FMS scheme, L_eq^FMS is a continuous variable whose value corresponds to the length of an LFSR that yields the same aliasing probability in a single signature (SS) scheme. Ideally, L_eq^FMS should be as large as possible to minimize aliasing. Since P_SS ≈ 2^(−k) and P_FMS ≈ [n·2^(−k)]^n, setting 2^(−L_eq^FMS) = [n·2^(−k)]^n gives

L_eq^FMS = n·(k − log2(n)).
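The two approximations above can be exercised numerically (a sketch; the function names are ours):

```python
import math

# Worst-case FMS aliasing estimate and its SS-equivalent LFSR length,
# using P_FMS ~ [m * 2**-k]**n and L_eq = n * (k - log2(n)).

def p_fms(k, n, m=None):
    """Approximate FMS aliasing probability; m defaults to worst case m = n."""
    m = n if m is None else m
    return (m * 2.0 ** -k) ** n

def equivalent_length(k, n):
    """Length of a single-signature LFSR with the same aliasing probability."""
    return n * (k - math.log2(n))

# k = 9, n = 32 (cf. Example 5.3): P_FMS = (32 * 2**-9)**32 = 2**-128,
# i.e., equivalent to a 128-bit LFSR in an SS scheme.
assert math.isclose(p_fms(9, 32), 2.0 ** -128)
print(equivalent_length(9, 32))  # prints 128.0
```

Since k must exceed log2(n) for the equivalent length to grow with n, the formula also makes precise the trade-off visible in Fig. 6.29.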
Figure 6.29: Aliasing performance of the FMS scheme, for k = 8, 9, 12 and 16, as a function of the number of signatures n.

the references may happen to be, or be made, identical, which also results in h < n. In the worst case, i.e., h = n, the PLA has n cubes.

If the SO is implemented with logic gates, instead of a PLA, the hardware requirements are as follows. If there is only a single reference of length k, a k-input NAND gate is required to decode this reference from the LFSR. This k-input gate can be composed of (k − 1) 2-input NAND gates in a tree-structured form. (Note that this is also the hardware requirement of a SS scheme.) If there exist two distinct references, assuming that they are not logically minimizable, then two k-input gates are needed to decode the two references. In addition, combining the outputs of the two gates to form the Pass/Fail signal of the SO requires an extra 2-input gate. Thus, a total of 2(k − 1) + 1 = 2k − 1 2-input NAND gates are required, since each k-input gate consists of (k − 1) 2-input NAND gates. In general, for m distinct references, at most mk − 1 2-input gates are needed. If m = n, as assumed in the worst-case scenario earlier, the worst-case hardware requirement of the SO is nk − 1 2-input NAND gates.

6.4 Comparative Evaluation of the FMS Scheme

This section compares the aliasing performance and hardware requirements of the FMS scheme with those of the SS scheme, the Modified LFSR (M-LFSR) [Raina91], and the CMS scheme. Here, only the worst case of the FMS scheme is considered, i.e., it is assumed that m = n and that no logic minimization is performed for the SO's function.

6.4.1 FMS vs. SS

To achieve an aliasing probability of 2^(−k), the SS scheme requires a k-bit LFSR. To achieve the same aliasing probability, the FMS scheme only requires a (k/n + log2(n))-bit LFSR if n signatures are checked.
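The worst-case SO gate count and the reduced LFSR length follow directly from the formulas above (a sketch; function names are ours):

```python
import math

# Worst-case SO size: nk - 1 two-input NAND gates for n distinct k-bit
# references; LFSR length matching SS aliasing 2**-k: k/n + log2(n) bits.

def so_nand_gates(k, n):
    """Worst-case 2-input NAND count of the SO (no logic minimization)."""
    return n * k - 1

def fms_lfsr_bits(k, n):
    """LFSR length for the FMS scheme to match SS aliasing of 2**-k."""
    return k / n + math.log2(n)

# An SS scheme needs a 128-bit LFSR for aliasing 2**-128; checking
# n = 32 signatures, the FMS scheme needs only a 9-bit LFSR.
assert fms_lfsr_bits(128, 32) == 9.0
print(so_nand_gates(9, 32))  # prints 287
```

The gate count grows only linearly in n and k, while the aliasing probability shrinks exponentially in n, which is the core of the FMS area argument.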
For the following detailed area comparisons, a PLA implementation of the SO is assumed. Since each PLA input variable corresponds to two lines in the AND plane of a PLA, and since the drivers in the PLA take an area of about 8 cubes [Gagne91], the normalized area of a k-input, s-output, n-cube PLA is (n + 8) × (2k + s) units. The following area estimate comparison is based on the actual layout of a PLA and a 16-bit LFSR, using the Cadence™ automatic place and route tool, and a 3 µm double-metal CMOS technology. The LFSR was built with static D flip-flops, and measured approximately 1.38 × 10^6 µm^2. Actual layout revealed that a 12-input, 4-output, 64-cube PLA takes approximately the same area as a 16-bit LFSR. According to the above analysis, this PLA requires an area of (64 + 8) × (12 × 2 + 4) = 2016 units. Therefore, we assume that a PLA of 2016 units corresponds to the area of a 16-bit LFSR. The comparison of the FMS and SS schemes is illustrated by the following examples.

Example 5.3: If k = 9 and n = 32, P_FMS = [32 × 2^(−9)]^32 = 2^(−128). In this case, the required hardware is a 9-stage LFSR, and a 9-input, 1-output, 32-cube PLA to implement the SO. The PLA requires (32 + 8) × (9 × 2 + 1) = 760 units of area, which is approximately 37.7% of the size of a 16-bit LFSR, or about the size of a 6-bit LFSR. Thus, the total area for the FMS scheme to achieve P_FMS = 2^(−128) is approximately the area of a 6 + 9 = 15-bit LFSR. In comparison, a SS scheme would require a 128-bit LFSR to achieve P_SS = 2^(−128).

More examples are summarized in Fig. 6.30, where the area for achieving a given aliasing probability is given in terms of equivalent LFSR sizes. As shown in Fig. 6.30, with the FMS scheme, small aliasing can be obtained at very small hardware cost compared to what is required by SS schemes.

6.4.2 FMS vs. M-LFSR

According to [Raina91], for a r-input k-stage LFSR, and assuming the equally likely error model, the aliasing probability of the M-LFSR is: P_M-LFSR ≈ 2^(−(k+r)), with r