ON SELECTING PROGRAMMABLE SPACE COMPACTORS FOR BUILT-IN SELF-TEST USING GENETIC ALGORITHMS

By

Barry Kazuto Tsuji
B. A. Sc., University of British Columbia, Canada, 1989

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE in THE FACULTY OF GRADUATE STUDIES, ELECTRICAL ENGINEERING

We accept this thesis as conforming to the required standard.

THE UNIVERSITY OF BRITISH COLUMBIA
1993
© Barry Kazuto Tsuji, 1993

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Electrical Engineering
The University of British Columbia
Vancouver, Canada

Abstract

With the increasing complexity of modern VLSI circuits, achieving high-quality built-in self-test requires monitoring an increasingly large number of internal nodes. Because large numbers of nodes cannot all be observed directly, it has become increasingly necessary to compact the output from a large number of lines onto a small number of lines, in a process known as space compaction. In the past, space compaction has usually been performed by circuit-independent compactors, such as the popular parity function. Recently, a class of circuit-specific space compactors, known as programmable space compactors (PSCs), has been proposed. A PSC is a highly effective space compactor tailored for a specific circuit under test (CUT) subjected to a specific set of test stimuli.
Circuit-specific information, such as the fault-free and expected faulty behaviour of a circuit, can be used to choose compactors which have better fault coverage and/or lower area costs than the parity function. The drawback of PSCs is the difficulty involved in finding optimal PSCs for a circuit. The space of possible PSC functions is extremely large and grows exponentially with the number of lines to be compacted. Although searching for optimal PSC functions is a difficult problem, it is not impossible.

This thesis presents a method for searching for combinational PSCs based upon genetic algorithms (GAs), a randomized search technique. Randomized search techniques, such as simulated annealing and genetic algorithms, have become increasingly popular in recent years as methods for solving search and optimization problems beyond the capabilities of classical techniques. This thesis develops a procedure for using circuit-specific information in the application of GAs to the search for optimal PSCs for a given CUT and a given set of test stimuli. A measure of the effectiveness, or fitness, of PSCs is developed as a means for guiding the genetic search. The factors used to assess the effectiveness of a PSC are its fault coverage (estimated by its probability of aliasing) and its area (measured in terms of gate equivalents). The nature of GAs allows searches to be completed in a fixed amount of CPU time. The availability of greater computing resources in the future will enable larger-scale searches and improve small-scale ones. The results of genetic searches (limited to approximately 30 minutes of CPU time on a Sun Sparc 10 workstation) for PSCs with a compaction ratio of 4:1 for the ISCAS '85 benchmark circuits are presented here. These results show that searches based upon genetic algorithms can find PSCs with better fault coverage at a lower cost than the parity function while using limited computing resources.
In some cases, cost savings of up to 80% were realized while simultaneously achieving improved fault coverage over similar space compactors based upon the parity function.

Table of Contents

Abstract
List of Tables
List of Figures
Acknowledgement

1 Introduction
  1.1 Conventional Test and Design for Testability
  1.2 VLSI Built-in Self-Test
  1.3 Genetic Algorithms and Randomized Search Techniques
  1.4 Approach Taken for Study and Guide to Rest of Thesis

2 Built-in Self-Test
  2.1 Failures, Fault Models and Conventional Test
    2.1.1 Modeling Faults
    2.1.2 Test Quality
    2.1.3 Conventional Testing
    2.1.4 Design for Testability
    2.1.5 Built-In Self-Test
  2.2 Test Pattern Generation
  2.3 Test Response Evaluation
    2.3.1 Compaction
    2.3.2 Error Models and Aliasing Measures
    2.3.3 Time Compaction
    2.3.4 Space Compaction

3 Programmable Space Compactors
  3.1 Programmable Space Compactors
  3.2 Realizability/Implementation Cost
  3.3 Error Propagation Measure and Aliasing
  3.4 Implementing PSCs

4 Genetic Algorithms
  4.1 An Introduction to Genetic Algorithms
    4.1.1 Traditional Search and Optimization Methods
    4.1.2 Simulated Annealing
    4.1.3 Genetic Algorithms
  4.2 Fundamental Components
    4.2.1 Chromosomal Representation
    4.2.2 Generating Initial Populations
    4.2.3 Evaluating Fitness
    4.2.4 Genetic Operators
    4.2.5 Parameters for the Genetic Algorithm
  4.3 Mathematical Foundations
    4.3.1 Fundamental Theorems
    4.3.2 Problems and Limitations
  4.4 Advanced Techniques
    4.4.1 Fitness Scaling
    4.4.2 Advanced Operators and Concepts

5 Genetic Search for Programmable Space Compactors
  5.1 Genetic Search
    5.1.1 Representing Boolean Functions as Chromosomes
    5.1.2 Generating the Initial Population
    5.1.3 Evaluating Fitness
    5.1.4 Genetic Operators and Parameters
  5.2 Analysis of the Circuit Under Test
    5.2.1 The ISCAS '85 Benchmark Circuits
    5.2.2 Circuit Behaviour
    5.2.3 Estimating Line Bit Error Probabilities
  5.3 Results of Searches with ISCAS '85 Benchmark Circuits
    5.3.1 Fault Coverage and Cost
    5.3.2 Weighting Cost
    5.3.3 Grouping/Partitioning of Output Lines

6 Conclusions
  6.1 Results
  6.2 Future Work

Bibliography

A Standard Cell Library

List of Tables

2.1 Irreducible polynomials and quadratic compaction functions for 2 ≤ k ≤ 4
3.1 Probabilities of occurrence
5.1 Robustness: best fitness values
5.2 Convergence: average final and peak dominance values
5.3 The ISCAS '85 benchmark circuits
5.4 Accuracy of line error probability estimates
5.5 Summary of FC' (1k vectors) vs. c'
5.6 Summary of FC' (8k vectors) vs. c'
5.7 Summary of FC' (64k vectors) vs. c'
A.1 Cell Library Elements and Costs

List of Figures

2.1 Conventional test using automatic test equipment
2.2 General scan path architecture
2.3 General BIST structure
2.4 LFSR realizing a feedback polynomial
2.5 LFSR used as a signature analyzer
2.6 Multiple-input shift register
2.7 Spatial quadratic compaction block diagram
3.1 PSC-based space compaction
4.1 Simulated annealing algorithm
4.2 Basic genetic algorithm
4.3 Ideal search: unscaled fitness vs. generation number
4.4 Premature convergence: dominance vs. generation number
4.5 Post-convergence wander: dominance vs. generation number
5.1 Relative fault coverage FC' (1k vectors) vs. relative cost c'
5.2 Relative fault coverage FC' (8k vectors) vs. relative cost c'
5.3 Relative fault coverage FC' (64k vectors) vs. relative cost c'
5.4 Number of patterns for 98% fault coverage vs. relative cost c'
5.5 Relative fault coverage FC' vs. number of patterns
5.6 Relative total cost of PSCs c' vs. weight w
5.7 Relative observability usage NL' vs.
weight w

Acknowledgement

I would like to take this opportunity to thank my supervisor, Dr. André Ivanov, without whose support, encouragement, patience, and guidance the completion of this thesis would not have been possible. I am also very grateful to Yuejian Wu for giving me his insight on numerous topics in VLSI test and BIST during many helpful discussions, and for the use of his software tools, which helped make this work possible. I am greatly indebted to Jeffrey Chow, William Low, and Robert Lam for their friendship, moral support, and help over the past two years. I would also like to express my gratitude to Oswaldo Antezana, Peter Bonek, Mohammad Darwish, Vikram Devdas and John Jay Tanlimco for their part in making UBC such a wonderful working environment. I would also like to thank Dr. Carl Seger for his help in exploring non-traditional aliasing computation techniques. The work reported here would not have been possible without the financial support of MPR Teltech Ltd.; PMC-Sierra, Inc.; and the Natural Sciences and Engineering Research Council of Canada. Finally, I would like to express my sincerest thanks to my family, for their unwavering support, encouragement, and patience during this time.

Chapter 1

Introduction

Testing of Very Large Scale Integrated (VLSI) circuits is an important part of their design and manufacture, and an exponentially increasing amount of effort is being expended on test development [Tsui87]. In the past, testing of integrated circuits (ICs) was considered a completely separate task from their design. It has become clear, in the face of rapidly increasing circuit complexity, that effective and high-quality testing of present-day ICs can only be achieved by designing some or all of the ability to test a circuit into the circuit itself. It is not sufficient to simply design circuits that function correctly.
Circuits must also be designed so that they can be efficiently tested to ensure that they have been properly fabricated.

1.1 Conventional Test and Design for Testability

In a conventional test methodology, automatic test equipment (ATE) [Turino86] is used to test a circuit by applying stimuli to the input pins and observing the circuit under test's (CUT's) responses on its output pins. Circuits which do not respond as expected are deemed faulty, since they failed to behave as specified. If the design is assumed to be correct, or if the expected behaviour is arrived at through circuit simulation, the cause of the failure is a manufacturing defect and is called a fault [Abraham86]. The goal of testing is to exercise a circuit so as to detect as many potential faults as possible with a reasonable amount of effort. A fault is considered detected if the application of specific test stimuli would cause an observable change in the circuit's response. The ratio of the number of faults detectable by a program of test stimuli to the number of all faults under consideration is termed fault coverage [Williams81], and is the most popular quantitative measure of the quality of a test.

Two major requirements of effective testing are controllability and observability [Tsui87]. The incredible advances in semiconductor and computer-aided design (CAD) technology have made it possible to fabricate chips containing millions of transistors, e.g., the Intel Pentium processor. These extremely high levels of integration have not only served to increase the functionality of chips and lower their cost, but have also made testing more difficult due to the reduced accessibility of internal circuit nodes. While the earliest integrated circuits, such as the TTL quad 2-input NAND gate 7400 part, were simple to test since they had gate-to-pin ratios of 1 to 3, today's modern VLSI chips, such as Intel's Pentium processor, have gate-to-pin ratios on the order of tens of thousands to one.
Clearly, the accessibility of internal nodes by external ATE is rapidly diminishing, and along with it the ability to efficiently and effectively test VLSI chips using older methods.

In an effort to improve test quality in the face of reduced access to internal nodes due to ever-increasing integration, the concept of specifically designing circuits to simplify the testing procedure, i.e., design for testability (DFT) [DasGupta, Williams83], was developed. These DFT techniques are sets of design constraints and guidelines whose use serves to improve the controllability and/or observability of internal nodes of CUTs. Ad hoc DFT techniques, which depend on the ability of the designer to manually enhance the accessibility of the circuit, are often used for small, simple designs, where the cost (in terms of design effort and silicon overhead) of more advanced structured techniques is considered too high. Structured techniques define sets of standard design rules or guidelines to be applied to the circuit to enhance testability. One popular class of structured DFT techniques is scan design [Eichelberger78, McCluskey84], where shift registers are used to transport test stimuli and responses between the chip I/O and various portions of the CUT.

1.2 VLSI Built-in Self-Test

One of the promising methods for testing modern VLSI circuits is the concept of built-in self-test (BIST) [McCluskey81, Sedmak85, McCluskey85]. As the most ambitious form of DFT, BIST is the ability of a circuit to test itself and determine whether it is faulty, due to incorrect fabrication, external damage or design flaws, or whether it is "good." Generally, with VLSI BIST, both the generation of the test stimuli, i.e., test patterns or test vectors, to be applied to the CUT, and the evaluation of the responses are performed on the same chip as the CUT. Clearly, the on-chip nature of BIST is an advantage over conventional test in terms of controllability and observability of internal circuit nodes.
There are a number of other advantages to using BIST, including simplified test operation, the ability to perform testing in the field, the ability to test at normal operational speed, and hierarchical partitioning of the test problem. Disadvantages of BIST include the area overhead, which results in increased die size, lowered yields and higher costs, and the design effort required to implement the BIST scheme.

One of the key elements of BIST is the compaction of test responses [Frohwerk77, McCluskey81, Bardell87]. Since test pattern generation and test response evaluation are situated on the chip, it is necessary to keep the hardware cost of those functions low. The generation of pseudo-random patterns, commonly used in BIST schemes, can be performed by simple linear feedback shift register (LFSR) structures, which require little silicon area. On the other hand, effective, economical evaluation of the test response is typically more difficult. Test response data, kept in external ATE memory in conventional test, is far too voluminous to be stored on-chip. Therefore, it is necessary to compact or compress¹ the data [McCluskey85b, Gupta90]. Compaction of test responses can result in aliasing [McCluskey85], a phenomenon where the compacted output of a faulty device matches the expected output of a fault-free one and thus makes the faulty device appear good. Practical applications of BIST must combine a low probability of aliasing with low hardware cost.

Compaction of test responses may occur with respect to the number of parallel outputs, i.e., space compaction, and/or the number of time periods, i.e., time compaction [Saluja83]. Both space and time compaction are discussed in detail in Section 2.3.

¹For the purposes of this thesis, compression refers only to the lossless reduction of test response information, while compaction refers to possibly lossy reduction.
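The LFSR-based pseudo-random pattern generation mentioned above can be modelled in a few lines of software. The sketch below is a hypothetical illustration rather than anything taken from this thesis: it uses a 4-bit Fibonacci-style LFSR, and the seed, width, and tap positions are all illustrative assumptions.

```python
def lfsr_patterns(seed, taps, width, count):
    """Yield `count` pseudo-random patterns from a Fibonacci-style LFSR.

    `taps` lists the bit positions XORed together to form the feedback bit;
    a well-chosen tap set yields a maximal-length (2**width - 1) sequence."""
    state = seed
    mask = (1 << width) - 1
    for _ in range(count):
        yield state
        feedback = 0
        for t in taps:
            feedback ^= (state >> t) & 1
        state = ((state << 1) | feedback) & mask

# Taps [3, 2] realize the primitive polynomial x^4 + x^3 + 1, so the
# 4-bit register cycles through all 15 nonzero states before repeating.
patterns = list(lfsr_patterns(seed=1, taps=[3, 2], width=4, count=15))
```

A hardware LFSR implements the same recurrence with one flip-flop per bit and a small XOR network, which is why pseudo-random pattern generation is so cheap in silicon.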
The focus of this thesis is space compaction, in particular, programmable space compaction as proposed by Zorian and Ivanov in [Zorian93]. As discussed in Chapter 3, programmable space compactors (PSCs) are a class of space compactors tailored for specific CUTs subjected to specific test stimuli. The customized nature of PSCs enables them to be more efficient, in terms of lower probabilities of aliasing, lower hardware costs, or both, than commonly used generic space compactors, such as parity trees. Unfortunately, the space of possible Boolean functions which may serve as effective PSCs is enormous for any compactor of non-trivial size. One way to find effective PSCs, proposed in [Zorian93], is to limit the search to simple low-cost functions and search this subspace. However, restricting the search space in this manner may easily prevent the discovery of optimal PSCs. Another way to find effective PSCs, proposed in this thesis, is to use non-classical, randomized search techniques to explore the entire space.

1.3 Genetic Algorithms and Randomized Search Techniques

In the modern world, there are a vast number of problems for which there are no simple solutions, yet solutions are needed. Scenarios seemingly as simple as the three-body problem in dynamics have no closed-form solution. If, for every situation, there were a reasonably fast algorithm leading to a reasonable solution, the world would be a boring place and one devoid of the need for engineers. One might say that interesting problems can be characterized by the fact that there are no simple, practical algorithmic ways to solve them. NP-complete problems are a class of problems for which a proposed solution can be verified in polynomial time, but for which all known methods of finding an optimal solution require exponential time.
The search for optimal programmable space compactors is an example of a problem where the size of the search space grows exponentially with the size of the compactor, but the effort to evaluate the effectiveness of a candidate solution grows only polynomially.² Many search and optimization problems, such as the search for optimal PSCs, are difficult, but even without the ability to solve them exactly, it is often possible to find efficient algorithms which will provide reasonably good solutions.

1.4 Approach Taken for Study and Guide to Rest of Thesis

For a given circuit and a given set of pseudorandom test patterns, there exist space compactors with better cost and/or fault coverage characteristics than popular circuit-independent space compactors such as the parity tree. Searching the space of all Boolean functions is not computationally feasible. However, a search method based upon genetic algorithms and using circuit- and pattern-specific information can find space compaction functions with better cost and/or fault coverage qualities than traditional space compactors.

The remainder of this thesis presents background material regarding test and genetic algorithms, and describes a method for using genetic algorithms to search for programmable space compactors. Chapter 2 reviews faults, conventional test methodology and BIST. Chapter 3 presents the concept of programmable space compactors which, through the use of circuit-specific information, can be tailored to perform better than traditional circuit-independent compactors. Chapter 4 describes genetic algorithms and their applicability to search problems such as the search for PSCs. Chapter 5 proposes a new method of searching for efficient, cost-effective PSCs by using genetic algorithms. The results from experiments performed on a number of benchmark circuits are also presented. Conclusions and suggestions for future work are given in Chapter 6.
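To make the genetic-search idea concrete before the detailed treatment later in the thesis, the skeleton below shows a generic GA loop over fixed-length bit strings: tournament selection, one-point crossover, and bitwise mutation. Everything here is a placeholder assumption — the population size, rates, and especially the one-max stand-in fitness; in the thesis proper, a bit string would encode a candidate compaction function and the fitness would combine estimated aliasing probability and area.

```python
import random

def genetic_search(fitness, length=16, pop_size=20, generations=50,
                   p_cross=0.8, p_mut=0.02, seed=0):
    """Generic GA skeleton: tournament selection, one-point crossover,
    and bitwise mutation over fixed-length bit strings."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        nxt = []
        while len(nxt) < pop_size:
            # Tournament selection: the fittest of 3 random individuals wins.
            a = max(rng.sample(pop, 3), key=fitness)
            b = max(rng.sample(pop, 3), key=fitness)
            # One-point crossover with probability p_cross.
            if rng.random() < p_cross:
                cut = rng.randrange(1, length)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            # Bitwise mutation: each bit flips with probability p_mut.
            for child in (a, b):
                nxt.append([bit ^ (rng.random() < p_mut) for bit in child])
        pop = nxt[:pop_size]
    return max(pop, key=fitness)

# Stand-in fitness: number of 1 bits ("one-max") -- NOT the aliasing/area
# fitness developed in this thesis, purely a test of the search loop.
best = genetic_search(fitness=sum)
```

The loop never enumerates the search space; it only ever evaluates the individuals it breeds, which is what makes the approach usable on an exponentially large space with a polynomial-cost fitness evaluation.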
Finally, the Appendix contains information on the cell library used to implement the programmable compaction functions in the genetic search.

²Although not formally proven here, the search for optimal PSCs is treated here as an NP-complete (or, strictly speaking, NP-hard) problem.

Chapter 2

Built-in Self-Test

This chapter provides an introduction to the built-in self-test (BIST) of VLSI circuits. Before discussing BIST, it is necessary to review the motivation behind VLSI test, the basic problem of testing VLSI circuits, and the conventional approach to testing. Then, the two fundamental components of BIST, test pattern generation and test response evaluation, are discussed. Finally, various compaction techniques used in test response evaluation and the issues relevant to analyzing their performance are examined.

2.1 Failures, Fault Models and Conventional Test

2.1.1 Modeling Faults

Like any manufacturing process, VLSI technology is not perfect and is incapable of producing perfectly working devices in every fabrication run. In fact, the breakneck speed of the advance of semiconductor technology creates a situation, unlike that in other manufacturing sectors, where market pressures commonly force the use of immature, leading-edge technology, which exacerbates the problem of consistently fabricating working devices. Dealing with defective chips requires an understanding of the causes of failures.

The following discussion uses terminology commonly accepted by the testing community [Abraham86]. Assuming a correct design, the failure of a VLSI device to behave as specified is caused by a fault. VLSI faults are caused by defects present in the base material or introduced during the fabrication of the devices. Some common failure mechanisms, or causes of faults, include gate oxide breakdown, wire bond failure, electromigration, and even dust particles stuck to the wafer [Mourad88].
In metal-oxide semiconductor (MOS) technology, the most common failure modes, i.e., the manifestations of physical failure mechanisms at the circuit level, are short circuits, open interconnections, and parameter degradations [Mourad88]. Fault models are attempts to represent the effects of failure modes at the logical level [Abraham86].

The need for fault models arises from the inability to use functional tests to verify the correct manufacture of devices. Exhaustive testing of all of the possible states of an m-input, p-output circuit with s bits of memory would require 2^(m+p+s) tests, assuming complete controllability of the internal nodes of the circuit, and even more tests otherwise. Since the number of tests required for exhaustive functional testing is unwieldy for all but trivial values of m, p and s, structural tests, first proposed in [Eldred59], provide a way to verify correct functionality of a circuit by verifying the correct functionality of its components and their interconnections. The effort involved in structural tests is related linearly to m, p and s, rather than exponentially. Testing the structure of a device requires not only detailed knowledge of the structure itself, but also of the possible faults which may occur.

One of the earliest and most widely used fault models, the stuck-at model, was developed for board-level testing of early electromagnetic relay circuits and thus, ironically, predates VLSI and IC testing as it is presently known [Russell89]. Later, this model was found to be applicable to diode-transistor logic (DTL) and transistor-transistor logic (TTL). The stuck-at model assumes that faults cause lines in a circuit to become fixed at logic 0, thus stuck-at-0 (sa0), or logic 1, thus stuck-at-1 (sa1).
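As a toy illustration of the stuck-at abstraction just defined, the sketch below evaluates a tiny gate-level netlist and optionally forces one net to a fixed logic value. The netlist format and function names are invented for this example; they do not come from the thesis.

```python
def eval_netlist(gates, inputs, fault=None):
    """Evaluate a combinational netlist given as (output_net, op, input_nets)
    triples in topological order. `fault` is an optional (net, value) pair
    modelling a stuck-at-0 or stuck-at-1 on that net."""
    ops = {"AND": lambda a, b: a & b,
           "OR":  lambda a, b: a | b,
           "NOT": lambda a: 1 - a}
    values = dict(inputs)

    def read(net):
        # A stuck-at fault overrides whatever value drives the net.
        if fault is not None and net == fault[0]:
            return fault[1]
        return values[net]

    for out, op, ins in gates:
        values[out] = ops[op](*[read(n) for n in ins])
    return {out: read(out) for out, _, _ in gates}

# c = AND(a, b): the input pattern (a=0, b=1) distinguishes the fault-free
# circuit (c=0) from one with net "a" stuck-at-1 (c=1).
netlist = [("c", "AND", ["a", "b"])]
good = eval_netlist(netlist, {"a": 0, "b": 1})
bad = eval_netlist(netlist, {"a": 0, "b": 1}, fault=("a", 1))
```

The comment in the example also shows what "detection" means operationally: a vector detects a fault exactly when the faulty evaluation differs from the fault-free one at an observable output.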
The stuck-at model can be applied at the pin level, where only the I/O pins of a module are assumed to be stuck; the gate level, where the pins of any gate can be stuck; or the transistor level, where any line in the transistor circuit may be stuck.

A subset of the stuck-at model is the single stuck-at (SSA) model, wherein, at any given time, only a single node in a circuit is fixed at logic 1 or logic 0. Obviously, this is a very strong assumption, particularly in light of ever-increasing circuit densities; however, the motivation for this assumption is not accuracy, but practicality. It is simply not feasible to model multiple stuck-at faults for non-trivial circuits. For example, the number of double stuck-at faults in a circuit of one thousand nodes (containing 2000 single stuck-at faults) is

2^2 · 1000! / (2! · 998!) = 1,998,000.    (2.1)

The computational cost of performing fault simulation for even single stuck-at faults is large; the cost for multiple stuck-at faults is enormous.

The stuck-at model does not model many failure modes that are common in MOS devices, such as stuck-open [Chandramouli83], bridging [Mei74], and delay faults [Koeppe86]. Around the time the stuck-at model was proposed, the dominant IC technologies, such as TTL, were based on bipolar junction transistors. In TTL technology, all stuck-open and most bridging faults manifest themselves as stuck-at inputs or outputs [Abraham86]. The same cannot be said for MOS technology. Stuck-open faults in MOS technology can isolate nodes, causing the circuit to store charge at the node and function as a memory element. Bridging faults occur when two or more normally independent lines are shorted together to form wired-OR or wired-AND connections, depending on the relative drive strengths of the lines.
Delay faults model structurally correct circuits that exhibit unacceptably degraded performance.

Based upon the fact that the stuck-at model does not capture many failure modes common in MOS devices, it has been argued that the stuck-at model is insufficient for testing [Galiay80]. Test stimuli developed under the assumption of a combinational circuit, which detect stuck-at faults in TTL circuits, do not necessarily detect faults in functionally equivalent CMOS circuits if sequential faults are created in the CMOS circuit due to the storage of charge at a node isolated by a stuck-open fault. The memory effect created by open interconnections necessitates the use of sequences of test vectors to detect faults in CMOS gates [Reddy84, Hayes84].

Despite its shortcomings, the now-classic stuck-at model has become, and still is, the standard by which all other fault models are judged. While tests based upon other fault models may be used to augment SSA testing, they are unlikely to replace it in the near future.

2.1.2 Test Quality

The testing of digital circuits involves stimulating the circuit with test patterns and comparing the observed behaviour with the expected response. Circuits which do not behave as expected are considered faulty. A fault, such as a single stuck-at, is considered detected if the test stimuli cause the CUT to produce any observable response other than the expected one. In general, test quality is measured by fault coverage, or fault grade: the proportion of faults, in the assumed fault model, detected by a set of test patterns. Given the dominance of the single stuck-at fault model, fault coverage of SSA faults is the de facto standard performance metric used by the VLSI industry to measure test quality.
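The fault-coverage metric just defined is a simple ratio. A minimal sketch, assuming a hypothetical detection table that maps each modeled fault to the set of test vectors exposing it:

```python
def fault_coverage(detections):
    """Fraction of modeled faults detected by at least one test vector.

    `detections` maps each fault name to the set of vectors exposing it;
    an empty set means the fault went undetected."""
    if not detections:
        raise ValueError("no modeled faults")
    detected = sum(1 for vectors in detections.values() if vectors)
    return detected / len(detections)

# Hypothetical detection table for four single stuck-at faults:
# two are detected, two are not, giving 50% coverage.
table = {"n1/sa0": {3, 7}, "n1/sa1": set(), "n2/sa0": {7}, "n2/sa1": set()}
coverage = fault_coverage(table)
```

Measured against the 95–100% goals quoted in the text, a table like this one would of course be far from adequate.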
Typical fault coverage goals for chips range from 95% to 100%, with fault coverages as low as 85% considered adequate for some applications.

The fault coverage of a set of test patterns for a given circuit is generally determined through a process called fault simulation [Tsui87]. In fault simulation, faulty versions of the circuit, created by injecting targeted faults into the fault-free circuit, are simulated with the test program. Divergence between the observable behaviour of the faulty circuit and that of the fault-free circuit under the stimulus of the test vectors implies the detection of the fault which was injected to create the faulty circuit. In general, fault simulation is an expensive procedure, since it involves what amounts to the simulation of many different circuits, which are really variations of the same circuit, rather than just one. When a fault is detected by a set of patterns, the variation of the circuit due to that fault no longer needs to be simulated, and thus is removed from the list of faulty circuits in a process known as fault dropping [Tsui87]. Even with fault dropping, the large number of possible faults (two per line in a simple gate-level SSA model) in a circuit places a tremendous burden on fault simulators. Unfortunately, despite its cost, fault simulation is the only way to determine the exact fault coverages [Wagner87] that are desired by the VLSI industry.

2.1.3 Conventional Testing

Conventional test methodology uses external equipment to control and monitor the device via its I/O pins to determine if the device has been correctly fabricated. Given the dominance of the single stuck-at model in industry, the main goal in test development is to achieve maximum coverage of SSA faults. Additional test patterns are sometimes included in the test set to detect shorts and/or opens [Reddy84], or other test methods, such as IDDQ testing [Horning87, Soden92], are used to improve test quality.
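The fault-simulation loop with fault dropping described above can be sketched as follows. Here `simulate` stands in for a real gate-level simulator and is an assumption of this example, as are the fault names.

```python
def fault_simulate(simulate, faults, vectors):
    """Serial fault simulation with fault dropping.

    `simulate(vector, fault)` returns the circuit response; passing
    `fault=None` gives the fault-free response. A fault whose response
    diverges from the fault-free one is detected and dropped."""
    remaining = set(faults)
    detected = set()
    for v in vectors:
        good = simulate(v, None)
        for f in list(remaining):
            if simulate(v, f) != good:
                detected.add(f)
                remaining.discard(f)  # fault dropping: never simulate f again
    return detected

# Toy 2-input AND circuit with hypothetical stuck-at-1 faults on its inputs.
def and_sim(vector, fault):
    a, b = vector
    if fault == "a/sa1":
        a = 1
    if fault == "b/sa1":
        b = 1
    return a & b

found = fault_simulate(and_sim, ["a/sa1", "b/sa1"], [(0, 1), (1, 0)])
```

The dropping step is what keeps the cost tolerable in practice: each fault is simulated only until its first detection, so easy-to-detect faults disappear from the workload after the first few vectors.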
Test vectors are developed to stimulate the circuit into producing observable responses which would be adversely affected by the presence of stuck-at faults. The effectiveness of the test vectors is evaluated using fault simulators, which report the coverage of SSA or other modeled faults. During production test, automatic test equipment (ATE) applies the test stimulus to the input pins of the device and monitors the response on the output pins (see Figure 2.1). Devices which do not respond as expected are considered faulty, and those which do are assumed to be correctly manufactured. It is in the best interests of the manufacturer to detect faulty devices as early as possible in the production process, since the cost associated with discovering faulty devices generally increases by an order of magnitude with each packaging stage [Bardell87]. Therefore, the quality of the test program, with respect to the number of faulty devices it screens out, is vital to the economic health of a manufacturing process, particularly considering how early chip fabrication occurs in the entire systems production line.

One of the difficulties with conventional testing is the growing lack of internal circuit accessibility. In order to perform stuck-at testing, every line must be directly or indirectly controllable, and the results of the stimulus must be observable. With conventional testing, all access is through the I/O pins. The earliest SSI circuits, such as the TTL quad 2-input NAND gate 7400 part, had an I/O pin dedicated to each line (in the gate-level model).

[Figure 2.1: Conventional test using automatic test equipment.]

However, modern devices, such as the Intel Pentium processor, have gate-to-pin ratios on the order of tens of thousands to one.
Gate-to-pin ratios will continue to grow as device counts grow quadratically, while pin counts on the most advanced packages grow only linearly, if they have not already reached their practical limits [Tsui87].

2.1.4 Design for Testability

To combat the decreasing accessibility of the internal circuit nodes of growing VLSI designs, it has become necessary to design circuits with testability, in addition to functionality, in mind. Some sort of design for testability (DFT) technique, if only an ad hoc one, must be used in the design of any modern VLSI circuit. Generally, DFT techniques define a test mode, separate from the normal mode of operation, in which the circuit is reconfigured to enhance testability. Commonly, circuits are reconfigured in test mode to reduce coupling between combinational circuits and memory elements, such as flip-flops, in order to reduce the sequence dependence of the test vector stimuli.

In ad hoc techniques, multiplexers are often placed near the I/O pins. These multiplexers are activated in test mode to route signals from the input pins to specific modules under test and to route signals from the modules to output pins. By defining several sub-modes of the test mode wherein different logic modules are accessed by the I/O pins, test development is greatly simplified.

As circuit complexity increases, ad hoc techniques become less effective as the logistics behind their effective application grow unmanageable. Structured design techniques, such as scan path design, define a set of design rules based upon special sequential memory elements (flip-flops, latches, etc.) which are designed to provide controllability and observability [McCluskey84]. Since the sequential memory elements of any design following these rules are fully accessible in test mode, the circuit is reduced to a more testable combinational network.
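The serial access that scan techniques provide can be mimicked in software: shifting a new pattern in while the old contents emerge at scan-out. This is a behavioural sketch of the test-mode shift register only, with illustrative names, not a model of any particular scan style.

```python
def scan_shift(chain, serial_in):
    """Shift `serial_in` bits through a scan register (a list of
    flip-flop values). Returns the new register contents and the bits
    observed at the scan-out pin; loading the next pattern and reading
    out the old state happen in the same pass, one bit per test clock."""
    chain = list(chain)
    observed = []
    for bit in serial_in:
        observed.append(chain[-1])     # last flip-flop drives scan-out
        chain = [bit] + chain[:-1]     # each flip-flop takes its neighbour
    return chain, observed

# Shifting an all-zero pattern into a 3-bit chain reads out the
# previously captured state while clearing the register.
new_state, scan_out = scan_shift([1, 0, 1], [0, 0, 0])
```

The sketch also makes the speed penalty noted below concrete: observing or loading an n-bit scan register costs n test clocks per pattern.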
With the popular scan path design technique, the circuit is reconfigured, in test mode, into a scan register and a combinational circuit. The scan register is a shift register formed from the flip-flops, using auxiliary I/O ports to handle the additional "test input," "scan out" and "Clock 2" signals which control the shift register in test mode (see Figure 2.2). Connecting the two ends of the scan register to real I/O pins in test mode enables the control of every bit of the register, by shifting the appropriate bit sequence into the register, and the observation of every bit, by shifting the contents out to the output pin. Level sensitive scan design (LSSD) [Eichelberger78] is a well-documented scan path technique adopted by IBM for use in all VLSI designs.

[Figure 2.2: General scan path architecture.]

The cost of the accessibility provided by scan path techniques is paid in terms of extra hardware, additional circuit complexity, and additional test procedures. It has been estimated that the use of full scan path design usually adds anywhere from 10% to 20% to the hardware cost of the circuit [Nagle89]. The use of scan path also requires an additional clock signal, or two in the case of LSSD. Additionally, circuits employing scan paths must include the necessary signal routing to chain all sequential memory elements into a shift register, regardless of how unnatural or inefficient such a configuration may be to design or lay out. The use of scan paths in testing also slows down the test procedure, because test vectors must be serially shifted into the scan register.

Another effect of ever-increasing circuit density and complexity is increased difficulty in developing effective sets of test vector stimuli. Test vector development for small,
Built-in Self-Test 15structurally simple circuit blocks can be performed manually, but it is a tedious, labour-intensive task even under the best conditions. Or, development can be assisted or automatedthrough the use of deterministic test pattern generation algorithms and tools. Unfortunately, automatic test pattern generation (ATPG) is known to be an NP-complete problem[Ibarra75]. Still, effective ATPG algorithms for combinational circuits have been developed[Jain83, Bottorif, Rajski87]. (Although the development of ATPG algorithms for sequentialcircuits has lagged behind [Bardell87].) Even with the best tools, test vector developmentis becoming increasingly costly as devices are becoming larger and more complex.One problem not addressed by DFT techniques is that of at-speed testing. DigitalVLSI chips are usually tested at low-speeds, generally less than 5 MHz, relative to theiroperational speeds. Typically, high-speed parametric testing is only done on a few criticalpaths to verify the gross performance of the device. Since the majority of the circuitry istested at low-speed, devices which contain performance related faults, such as delay faults,may pass. Increasing the testing speed can be a simple solution to at-speed testing in somecases. However, this solution is generally impractical due to the exorbitant cost of high-speed ATE and because the speed of ATE is limited by the existing IC technology used tobuild the equipment itself. Conventional test methodology, even with DFT, generally failsto adequately deal with the problem of at-speed testing.2.1.5 Built-In Self-TestBuilt-in self-test proposes to improve VLSI testing by providing enhanced test point accessibility, simplifying test procedures, and allowing testing at operational speeds at theexpense of minimal added design complexity and overhead. BIST, the ability of a device totest itself, can be considered the ultimate evolution of the DFT concept. 
As illustrated in Figure 2.3, devices incorporating BIST contain on-chip circuitry to generate test patterns, apply those patterns to on-chip circuits under test, observe their response, and determine if the response indicates the presence of any faults. Usually, the test response is compacted and compared with on-chip references. The on-chip nature of the test circuitry offers an unmatchable advantage in controllability and observability over any conventional non-BIST technique.

Besides the improved test quality achievable from the enhanced controllability and observability of BIST, BIST offers several other benefits. One of the attractive features of BIST is the possibility of reducing the time required for test development, which favourably impacts the time-to-market for a semiconductor product. BIST eliminates the need for time-consuming test pattern generation, whether manual or assisted by ATPG tools. Since all of the structures necessary for testing the chip are built directly on the chip, no external equipment is required for testing. As a result, devices incorporating BIST have the ability to test themselves while in the field. In-field chip test can be a tremendous asset in diagnosing board-level faults. Also, since the testing of a device incorporating BIST is not dependent on (nor limited by) external test equipment, BIST allows for testing at the full speed of the device.

The costs associated with BIST include design overhead and area overhead. As in DFT, the responsibility for adding the BIST circuitry falls upon the designer. He must expend considerable effort choosing and evaluating an appropriate BIST structure to incorporate into the design. The addition of BIST circuitry also has the potential to degrade the normal mode performance of the circuit due to added loads caused by the observation of a node or extra levels of gate delays caused by the use of multiplexing circuitry to enhance controllability.
The most obvious cost of BIST is the extra area required by all of the test circuitry (e.g., test pattern generator and response compactor). Designers are often faced with rigid die size limits, due to foundry costs or packaging constraints. Also, the impact of additional area reaches beyond the simple economics of the monetary cost of increased die size. Additional area also decreases yield, increases power consumption and reduces reliability [McCluskey85, Levy91].

Figure 2.3: General BIST structure.

2.2 Test Pattern Generation

As mentioned in Chapter 1, since it is impractical to store precomputed test patterns (generated off-line) on the chip, test pattern generation for BIST must be performed during the test, i.e., concurrently. The two most common concurrent test pattern generation approaches are exhaustive and pseudorandom pattern generation. Exhaustive testing entails the use of all possible input combinations as test patterns. Since the order in which exhaustive test patterns are applied is irrelevant (for SSA testing), the sequence can be generated by binary counters or by maximum-length linear feedback shift registers (LFSRs), modified to include the all-zeroes state, operating in the autonomous mode, in which no external input is applied [McCluskey85b]. LFSRs are typically less costly in terms of area than binary counters, and can operate at higher speeds. As shown in Figure 2.4, LFSRs realize feedback polynomials such as σ(x) = a_{k-1}x^{k-1} + a_{k-2}x^{k-2} + ... + a_2x^2 + a_1x + a_0 through the use of feedback taps. LFSRs configured to realize primitive polynomials cycle through the maximum possible number of states, 2^n − 1 for an n-stage LFSR. (The all-zeroes state is never entered since its successor is also the all-zeroes state.)
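This maximum-length behaviour can be sketched in a few lines (Python is used here purely for illustration; the 4-stage register and its tap positions are chosen for the example, with the feedback taken from stages corresponding to a primitive polynomial of degree 4):

```python
def lfsr_states(n, taps, seed=1):
    """Enumerate the states of an n-stage autonomous LFSR whose feedback
    bit is the XOR of the stage bits at positions in `taps`."""
    state, states = seed, []
    for _ in range(2 ** n - 1):
        states.append(state)
        fb = 0
        for t in taps:
            fb ^= (state >> t) & 1
        state = ((state << 1) | fb) & ((1 << n) - 1)
    return states

# A 4-stage LFSR with primitive feedback visits all 2^4 - 1 = 15
# non-zero states before repeating; the all-zeroes state is never entered.
states = lfsr_states(4, taps=[3, 0])
assert len(set(states)) == 15 and 0 not in states
```

Seeding the register with a different non-zero value simply starts the same maximum-length cycle at a different point, which is the property exploited for pseudorandom pattern generation below.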
Unfortunately, the number of test patterns required for exhaustive testing grows exponentially with the number of inputs, thereby rendering exhaustive testing impractical for circuits with large numbers of inputs that are difficult to partition into smaller testable CUTs.

Since it is usually impractical to use an exhaustive set of test patterns, pseudorandom patterns are a good alternative. For most CUTs, high fault coverage can be obtained by simply applying a large number of pseudorandom vectors [Bardell87] as opposed to, for example, a small set of deterministically generated test patterns. In addition to being useful as an exhaustive pattern generator, the LFSR is also the basic structure for pseudorandom pattern generation. When a pseudorandom sequence of a length much less than 2^n − 1 is desired, an n-stage maximum-length LFSR, configured to realize a primitive polynomial, can be used to generate such a sequence. The initial state of the register determines the starting point of the maximum-length sequence. Patterns produced from maximum-length LFSRs exhibit good randomness properties [Golumb82] and are reproducible.

2.3 Test Response Evaluation

As opposed to conventional test with external automatic test equipment, the onus of evaluating the response of the circuit under test falls upon the chip itself. Just as the high storage cost makes it impractical to store precomputed test vectors on-chip, it is also impractical to store raw precomputed circuit responses as references. It is necessary to compress or compact the responses of the CUT to the test stimuli to a manageable level. Generally, compaction results in the loss of information, which introduces the possibility of misdiagnosis. As mentioned in Chapter 1, the phenomenon where the compacted test response of a faulty device matches the expected output of a fault-free one and thus makes the faulty device appear good is called aliasing or error masking [McCluskey85, Bardell87].
The ideal compaction technique would have a very low probability of aliasing and hence cause a very small loss in fault coverage, provide a very high compaction ratio, and cost very little in terms of design effort and area overhead. Unfortunately, no single compaction technique presently achieves such an ideal. Tradeoffs between quality and cost (both design effort and area overhead) must be made. This section begins with a general discussion of compaction, includes a discussion of aliasing and error models, and ends with a survey of various compaction techniques.

2.3.1 Compaction

The objective of compaction is to reduce the enormous volume of output test response data from a circuit under test. Following the notation used in [Zorian93], compaction is the process of transforming an m × n matrix of test response data D = [d_ij], resulting from the m observed nodes of a CUT over n time periods, into a p × q matrix C = [c_ij] of bits, representing p parallel outputs over q time units, where p < m and/or q < n. The compaction function is denoted by Φ, where C = Φ(D). Reduction in the number of columns in the matrices is referred to as time compaction and the ratio n : q is termed the time compaction ratio; similarly, reduction in the number of rows is referred to as space compaction and the ratio m : p is termed the space compaction ratio. The distinction between time and space compaction is the most commonly used classification for compaction techniques.

Figure 2.4: LFSR realizing a feedback polynomial σ(x).

Other classifications of compactors are based upon attributes such as circuit specificity and linearity. Circuit specificity of the compaction function refers to the use of knowledge of the functional behaviour, based on the expected output, of the CUT in choosing Φ. Compaction functions selected without regard to the expected fault-free or expected faulty output are circuit-independent.
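The matrix formulation of Section 2.3.1 can be made concrete with a short sketch (Python for illustration; the 3 × 4 response matrix is invented, and the parity function stands in for Φ's per-column space compaction):

```python
from functools import reduce

def space_compact(D, phi):
    """Pure space compaction: apply phi to each column (time unit) of the
    m x n response matrix D, yielding a 1 x n compacted matrix C."""
    m, n = len(D), len(D[0])
    return [phi([D[i][t] for i in range(m)]) for t in range(n)]

# m = 3 observed nodes over n = 4 time periods; here phi is the parity
# function, so p = 1 and the space compaction ratio is 3 : 1.
D = [[0, 1, 1, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 1]]
C = space_compact(D, lambda col: reduce(lambda a, b: a ^ b, col))
assert C == [1, 0, 0, 1]
```

Time compaction would instead (or additionally) shrink the n columns of C, e.g., by feeding each compacted bit into an LFSR.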
Intuitively, circuit-specific compactors have the potential to be more efficient than circuit-independent compactors. Yet, due to the difficulty in effectively applying circuit-dependent information to the design of compaction functions, relatively few schemes, such as output data modification [Zorian90] and zero aliasing compression [Gupta90], have been proposed, as opposed to the many circuit-independent schemes such as the well-known signature analyzer [Frohwerk77] and parity tree [Saluja83].

Linear compaction functions are based upon Boolean addition over GF(2) using exclusive-OR gates. When the compaction function Φ results in a compacted output matrix C such that every bit in C can be expressed as the Boolean sum of any number of bits in the uncompacted input matrix D, Φ is considered to be a linear function. Intuitively, it would be detrimental to restrict the design of compaction functions to linear functions only as opposed to non-linear functions. Non-linear functions have the potential to be more efficient compactors than linear functions. As with the case of circuit-specific vs. circuit-independent compaction functions, because of the difficulties in searching the large space of non-linear compactors, relatively few schemes, such as quadratic compactors [Karpovsky87], have been developed, as opposed to the well-known linear signature analyzers and parity trees.

2.3.2 Error Models and Aliasing Measures

One of the difficulties in evaluating the effectiveness of compactors is measuring the fault coverage. High compaction ratios, often seen with time compaction (time compaction ratios are typically 10^N : 1, where N ≥ 3), reduce the ability to use fault dropping, which causes the required fault simulation effort to become too large to determine exact post-compaction fault coverage. Even with low compaction ratios, the time and resources required to perform fault simulation are typically too great to evaluate exact fault coverages when designing compactors.
Without knowledge of exact fault coverages, easily calculated measures such as the probability of aliasing, denoted by P_al, can be used to evaluate the effectiveness of compactors. Since aliasing can cause faulty devices to appear good, it is desirable to have compactors with P_al as low as possible, given a reasonable cost. Low probabilities of aliasing are assumed to translate into low losses in fault coverage due to the introduction of the compactor. Post-compaction fault coverage can be estimated as (1 − P_al) · FC, where FC is the fault coverage before compaction [Rajski91b].

In order to develop effective test response evaluation schemes with low probabilities of aliasing, it is necessary to understand the nature of the possible responses expected from faulty circuits. There are several error models which have been extensively employed in studies of aliasing. These models attempt to quantify the probability of the occurrence of errors in the bits of the uncompacted output data matrix D, due to faults. If the fault-free output is given by D, and the faulty output by D̂, then D̂ = D ⊕ E, where E = [e_ij] is a matrix of error bits of the same dimensions (m × n) as D.

One of the simplest models is the equally-likely error or uniform distribution error [Frohwerk77] model. In this model, all 2^{mn} possible error matrices are equally likely to appear, i.e., pr(e_ij = 1) = pr(e_ij = 0) = 1/2. Although this model is widely used because of its simplicity, it is not considered realistic [Ivanov92]. In two other simple models, the single-bit error and odd-numbered errors models, the error matrix may contain only one or an odd number of errors, respectively. Under these two models, the presence of any errors can be detected with parity checking.
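The detection guarantee under the single-bit and odd-numbered error models, and the aliasing that occurs outside them, can be illustrated in a few lines (Python for illustration; the response vectors are invented for the example):

```python
from functools import reduce

def parity(bits):
    """Parity (XOR) of a bit list -- the check computed by a parity tree."""
    return reduce(lambda a, b: a ^ b, bits)

fault_free = [1, 0, 1, 1, 0]

# Flipping an odd number of bits always changes the parity, so the error
# is detected; an even number of flips leaves the parity unchanged, so
# the faulty response aliases to the fault-free one.
odd_error  = [1, 0, 0, 1, 0]   # one bit in error
even_error = [1, 1, 0, 1, 0]   # two bits in error
assert parity(odd_error)  != parity(fault_free)   # detected
assert parity(even_error) == parity(fault_free)   # aliases
```

This is exactly why the parity tree is a perfect compactor under the odd-numbered error model but not under the more general multiple bit error models discussed next.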
One of the most general error models is the non-homogeneous independent multiple bit error model [Ivanov92], in which all bit errors are assumed to be statistically independent and occur with a probability pr(e_ij = 1) = δ_ij. A variation of the non-homogeneous independent multiple bit error model, the homogeneous independent multiple bit error model, assumes that only the rows of the output data matrix D, rather than all of the bits, are statistically independent. Therefore, in this model, the probability that an output bit in row i of D̂ is in error is given by pr(e_ij = 1) = δ_i, i.e., ∀ i, j, k: δ_ij = δ_ik. The assumption behind the homogeneous multiple bit error model is that there is a constant probability of error associated with each output of a CUT.

2.3.3 Time Compaction

Although time compaction is not the main topic of this thesis, time compaction techniques are the most well-studied class of compactors. Various schemes bear mentioning as examples of the necessary tradeoff between cost and quality.

Signature analysis, a circuit-independent, linear, time compaction technique, has been by far the most well-studied compaction scheme [Ivanov92] since it was first proposed in [Benowitz75]. The implementation of a signature analyzer (see Figure 2.5) is based upon the same maximum-length LFSR (i.e., an LFSR configured to realize a primitive feedback polynomial) structure used to implement the pseudorandom pattern generators discussed in Section 2.2. However, where the LFSR used in pseudorandom test pattern generation runs in autonomous mode, the signature analyzer uses the output of the CUT as an input. The signature analyzer shown in Figure 2.5 realizes the irreducible feedback polynomial σ(x) = x^5 + x^3 + 1. When an l-bit input sequence is considered as a polynomial R_l(x), the k-stage LFSR divides the input polynomial by the feedback polynomial σ(x) to form a quotient polynomial q(x) and a k-bit remainder r_k(x), as shown by

R_l(x) = q(x)σ(x) + r_k(x). (2.2)

This k-bit remainder r_k(x) is represented by the contents of the LFSR after all of the response bits in the test have been shifted into the LFSR, and is termed the "signature." A well-known result for the probability of aliasing of a k-stage signature analyzer, assuming the feedback polynomial is irreducible, is P_al ≈ 2^{−k} for l >> k [Bardell87, Damiani89, Ivanov92]. This result is intuitively satisfying since there are 2^k possible signatures and only one is fault-free. One of the major drawbacks of signature analysis is the general inability to evaluate the fault coverage after compaction. The time compaction ratio l : k is typically very high (10^N, where N ≥ 3) and since the signature is only checked at the end of the fault simulation, fault dropping cannot be exploited, which makes the fault simulation effort to determine post-compaction fault coverage prohibitive [Ivanov92].

Multiple signature analysis (MSA) is a variation of signature analysis in which several intermediate signatures are checked throughout the test, rather than just a single one at the end. One of the advantages of checking multiple signatures is a reduction in P_al for the same amount of area, or alternatively, a reduction in area overhead for the same probability of aliasing. Generally, when r k-bit intermediate signatures are checked, the probability of aliasing is P_al ≈ 2^{−rk} for large values of l (assuming the feedback polynomial is irreducible) [Bhavsar84]. Recently, a scheme known as single reference multiple intermediate signature analysis (SMS), which consists of finding r identical k-bit signatures, has been developed by Wu in [Wu93]. In the SMS scheme, a minimal hardware implementation of MSA can be realized by choosing the length k of the signatures as 1 bit, in which case the LFSR need only consist of a single stage. Another advantage of MSA is a reduction in the fault simulation effort.
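The polynomial division of Equation 2.2 can be checked directly in software (a Python sketch; the 13-bit response bitstream is invented, and polynomials over GF(2) are represented as integers whose bit i is the coefficient of x^i):

```python
def gf2_divmod(dividend, divisor):
    """Divide one GF(2) polynomial by another, returning (quotient,
    remainder) -- the computation the signature analyzer performs
    bit-serially."""
    q, r = 0, dividend
    deg = divisor.bit_length() - 1
    while r.bit_length() - 1 >= deg:
        shift = r.bit_length() - 1 - deg
        q ^= 1 << shift
        r ^= divisor << shift
    return q, r

def gf2_mul(a, b):
    """Carry-less (GF(2)) polynomial multiplication."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a, b = a << 1, b >> 1
    return p

sigma = 0b101001        # sigma(x) = x^5 + x^3 + 1, a k = 5 stage analyzer
R = 0b1101100111010     # an arbitrary 13-bit response polynomial R_l(x)
q, r = gf2_divmod(R, sigma)
assert gf2_mul(q, sigma) ^ r == R   # R_l(x) = q(x) sigma(x) + r_k(x)
assert r < 1 << 5                   # the k-bit remainder is the signature
```

Any faulty response that differs from R by a multiple of σ(x) yields the same remainder r_k(x), which is precisely the aliasing case counted by P_al ≈ 2^{−k}.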
Since several intermediate signatures are checked during the test, some fault dropping can be exploited, which can make fault simulation feasible [Lambidonis91]. Additionally, judicious scheduling of the intermediate signatures can reduce the fault simulation effort [Zhang93]. After fabrication, BIST schemes based upon MSA can take less time to test, since faulty devices can be diagnosed at one of the intermediate signatures rather than always waiting for a final signature [Lee88]. The additional costs associated with multiple signature analysis are the increased design effort to determine the additional signatures and the additional area required to implement the additional checks.

In an effort to improve the quality of compaction, several methods have been proposed to eliminate the information loss due to compaction and thus achieve zero aliasing compression, e.g., [Zorian90, Gupta90]. These methods typically have limited practical applicability, but are examples of how circuit-specific information can be used to design compactors with improved aliasing. For example, the periodic quotient method put forth by Gupta, Pradhan and Reddy in [Gupta90] uses knowledge of the fault-free response of a CUT to design a signature analyzer with zero aliasing. Normally in signature analysis, the quotient polynomial q(x) is discarded and only the k-bit remainder polynomial r_k(x) is checked, since the order of q(x) makes its storage infeasible for BIST. The periodic quotient method attempts to make monitoring the quotient sequence feasible by choosing a feedback polynomial σ(x) which will result in a periodic quotient polynomial q_T(x) representing a repetitive sequence with a period T (ideally T is small). Having a periodic quotient sequence greatly reduces the effort required to monitor it, and by checking the quotient sequence as well as the remainder, no information is lost in the compaction process.

Figure 2.5: LFSR used as a signature analyzer.

Gupta et al. have shown that given an l-bit fault-free response sequence represented by R_l(x), a feedback polynomial σ(x) can be found which can be realized by an LFSR with a bounded number of stages. Unfortunately, even though periodic quotient sequences can always be found, it is difficult to find sequences with sufficiently short periods T for large values of l. Therefore, for most applications, the hardware requirement is too large to be practical.

2.3.4 Space Compaction

A parallel signature analyzer or multiple-input shift register (MISR) is a linear, circuit-independent compactor which performs both time and space compaction concurrently (see Figure 2.6). A MISR is configured like an LFSR, except that the input of each stage of the MISR is exclusive-ORed with an external input provided by an output of the CUT. As in LFSR-based compaction, the contents of the MISR after all test response bits have been processed form the signature. Due to the similarities in structure, the MISR has aliasing characteristics very similar to those of the LFSR, i.e., the probability of aliasing P_al asymptotically approaches 2^{−k} for a k-stage MISR, assuming the feedback polynomial is irreducible [Damiani89b, Damiani91, Kameda93].

Although the MISR performs both space and time compaction, it is not necessarily more cost effective than a structure using a separate space compactor and a single-input signature analyzer (LFSR). A MISR requires one XOR gate and one flip-flop per input. For compacting a number of CUT outputs less than or equal to k, the additional cost of a MISR is only one XOR gate. For large numbers of outputs, using only a MISR is impractical due to the cost in area incurred by the flip-flops.

Figure 2.6: Multiple-input shift register.
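A behavioural sketch of a k-stage MISR may clarify the structure (Python for illustration; the tap positions, seed, and response vectors are invented for the example):

```python
def misr_signature(vectors, taps, k, seed=0):
    """Compact a stream of k-bit parallel output vectors: on each clock
    the register shifts with LFSR feedback (XOR of the stage bits at
    `taps`) and the incoming vector is XORed into the stages."""
    state, mask = seed, (1 << k) - 1
    for vec in vectors:
        fb = 0
        for t in taps:
            fb ^= (state >> t) & 1
        state = (((state << 1) | fb) & mask) ^ vec
    return state

fault_free = [0b0011, 0b1010, 0b0110]
faulty     = [0b0011, 0b1110, 0b0110]   # one bit error in the second vector
s0 = misr_signature(fault_free, taps=[3, 0], k=4)
s1 = misr_signature(faulty,     taps=[3, 0], k=4)
assert s0 != s1   # this particular error is not masked
```

A single-bit error always perturbs the final MISR contents in this sketch; aliasing requires a later error that cancels the propagated effect of an earlier one, which happens with probability approaching 2^{−k} for long tests.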
When compacting a large number of CUT outputs, pure space compaction is necessary before MISR- or LFSR-based time compaction.

Parity checkers are a well-studied class of linear, circuit-independent space compactors based upon linear error-detecting codes. Space compactors based upon Hamming codes, BCH codes, etc., have been studied [Saluja83, Reddy88], although this thesis is mainly concerned with simple parity generators which can be used as k : 1 space compactors. It has been shown that linear space compactors are almost optimal with respect to error detection/propagation [Saluja83]. Under the assumption that no more than t CUT outputs can be in error simultaneously, a parity checker can be designed to detect or propagate any such errors.

Parity checkers are easily implemented with trees of XOR gates. However, the cost of an XOR gate is typically among the highest for 2-input Boolean functions. The cost of a k : 1 parity tree compactor is k − 1 XOR gates. One of the difficulties with parity checkers is the need for input partitioning. It is desirable to group the inputs of a parity checker such that they do not depend on the same CUT inputs [Benowitz75] in order to avoid systematic aliasing caused by faults manifesting themselves as multiple errors on related lines. Such partitioning is typically nontrivial and can lead to additional area overhead in the form of routing.

The popularity of parity checkers as space compactors is due to the fact that they are effective, easy to analyze, and simple to implement. On the other hand, they are also quite costly in terms of area, and not optimal under all conditions.

An alternative to linear space compactors, such as the parity tree, is the quadratic compactor described in [Karpovsky87].
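The quadratic compactor named above is built on multiplication in a finite field GF(2^k). As a minimal sketch of such a multiplication (Python for illustration; k = 2 and the irreducible polynomial σ(x) = x^2 + x + 1 are chosen for the example):

```python
def gf2k_mul(a, b, sigma, k):
    """Multiply two elements of GF(2^k); elements are ints whose bit i is
    the coefficient of x^i, and sigma is the irreducible modulus given
    with its leading x^k term."""
    p = 0
    while b:
        if b & 1:
            p ^= a          # add (XOR) the current multiple of a
        b >>= 1
        a <<= 1
        if (a >> k) & 1:
            a ^= sigma      # reduce modulo sigma(x)
    return p

# k = 2, sigma(x) = x^2 + x + 1:
assert gf2k_mul(0b10, 0b10, 0b111, 2) == 0b11   # x * x = x + 1
assert gf2k_mul(0b11, 0b11, 0b111, 2) == 0b10   # (x + 1)^2 = x
```

In the spatial compactor, the 2k CUT outputs supply the two operands a and b, and the k product bits f_{k−1}..f_0 are the compacted outputs.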
The quadratic compactor, also a circuit-independent compactor, has been shown to optimally compact test output data, with respect to the total error masking probability, on the condition that the fault-free response z of the circuit under test and the error sequence r are statistically independent, uniformly distributed random variables.

The implementation of quadratic compactors is based upon finite-field multiplication over a Galois field. The implementation of a spatial quadratic compactor, as shown in Figure 2.7, consists of a combinational multiplier over GF(2^k). The space compaction circuit multiplies two polynomials,

z_0 = a_{k−1}x^{k−1} ⊕ a_{k−2}x^{k−2} ⊕ ... ⊕ a_2x^2 ⊕ a_1x ⊕ a_0 (2.3)

and z_1 = b_{k−1}x^{k−1} ⊕ ... ⊕ b_2x^2 ⊕ b_1x ⊕ b_0, (2.4)

each of degree less than k, to yield a product

z_01 = f_{k−1}x^{k−1} ⊕ f_{k−2}x^{k−2} ⊕ ... ⊕ f_2x^2 ⊕ f_1x ⊕ f_0 modulo σ(x), (2.5)

where σ(x) is an irreducible polynomial of degree k.

Figure 2.7: Spatial quadratic compaction block diagram.

Table 2.1 lists irreducible polynomials of degree k and combinational multiplier circuit functions over GF(2^k) for 2 ≤ k ≤ 4.

k = 4, σ(x) = x^4 + x + 1:
    f3 = a3b3 ⊕ a3b0 ⊕ a2b1 ⊕ a1b2 ⊕ a0b3
    f2 = a3b3 ⊕ a3b2 ⊕ a2b3 ⊕ a2b0 ⊕ a1b1 ⊕ a0b2
    f1 = a3b2 ⊕ a3b1 ⊕ a2b3 ⊕ a2b2 ⊕ a1b3 ⊕ a1b0 ⊕ a0b1
    f0 = a3b1 ⊕ a2b2 ⊕ a1b3 ⊕ a0b0
k = 3, σ(x) = x^3 + x + 1:
    f2 = a2b2 ⊕ a2b0 ⊕ a1b1 ⊕ a0b2
    f1 = a2b2 ⊕ a2b1 ⊕ a1b2 ⊕ a1b0 ⊕ a0b1
    f0 = a2b1 ⊕ a1b2 ⊕ a0b0
k = 2, σ(x) = x^2 + x + 1:
    f1 = a1b1 ⊕ a1b0 ⊕ a0b1
    f0 = a1b1 ⊕ a0b0

Table 2.1: Irreducible polynomials and quadratic compaction functions for 2 ≤ k ≤ 4.

Chapter 3

Programmable Space Compactors

As mentioned in Section 2.3.4, there are several well-known space compaction techniques. These techniques are all circuit-independent and were developed under certain assumptions with respect to the fault-free output of the CUT and/or the error sequence. This chapter discusses programmable space compactors (PSCs) proposed by Zorian and Ivanov in [Zorian93].
These PSCs are a class of circuit-specific space compactors designed to provide highly cost-effective space compaction.

3.1 Programmable Space Compactors

Of the many space compaction techniques proposed for BIST, few are circuit-specific, e.g., [EWu93]. Most circuit-independent techniques were developed with abstract assumptions about the test response and error sequences, and are costly in terms of area. Under certain error models, certain compaction techniques are optimal. For example, the parity tree is a perfect compactor under the odd-numbered error model. The concept behind programmable space compactors is to find the best, most cost-effective compactor for a given circuit and test pattern sequence, without restricting the compactor to specific compaction ratios or to only linear or even combinational logic. This thesis deals mainly with relatively low-cost (less than the cost of a parity tree) combinational PSCs with space compaction ratios of k : 1 where k is small, i.e., k < 7, with an emphasis on k = 4.

The general structure of a BIST scheme employing PSCs for space compaction is presented in Figure 3.1. A CUT with p · k outputs can be tested by using p not necessarily identical k : 1 PSCs that feed into a time compactor. The difficulty lies in choosing space compaction functions which can provide high fault coverage at an acceptable hardware cost. For any given k, there can be an enormous number, i.e., 2^{2^k}, of possible combinational functions.

Figure 3.1: PSC-based space compaction.

The key attribute of PSCs is their circuit-specificity. The programmability of PSCs refers to the fact that they are tailored to the functional behaviour of a specific CUT, not to any implied dynamic changeability. The circuit-dependent information used in selecting PSCs is the fault-free output of a CUT to the input test patterns to be used in a test, and the expected errors caused by targeted faults. Straightforward logic simulation can be used to determine the fault-free output.
Unfortunately, obtaining complete knowledge of the expected errors would require an impractical amount of fault simulation. Therefore, the manifestations of modeled faults in the error sequence must be estimated. As a simple example of a PSC, if the fault-free behaviour of a CUT results in all-zeroes vectors from the observed nodes, an optimal PSC function would be an OR gate. The OR gate would propagate all ones produced by a faulty CUT, and it would cost less than a generic circuit-independent space compactor such as a parity tree or quadratic compactor.

3.2 Realizability/Implementation Cost

There are two measures of cost which must be accounted for in the use of PSCs. These are the area overhead of the compactor, a recurring engineering (RE) cost, and the design effort required in selecting it, a non-recurring engineering (NRE) cost. Usually, the importance of RE vs. NRE costs depends on market considerations and expected sales volumes. High NRE costs, perhaps due to extensive searches for PSCs, can be recovered through lower per-piece RE costs, perhaps due to the use of PSCs with very low overhead, if the sales volumes are sufficiently high. Conversely, for low volume parts, higher RE costs can be tolerated if significant NRE savings are achieved. Also, higher NRE or RE costs can be tolerated if the time-to-market of a product can be shortened or if higher quality can be achieved.

There are numerous ways of estimating the area required to implement a Boolean function. These range from simply counting the number of terms in the Boolean expression to actually mapping and routing the circuit. Counting the number of simple terms in the function is a quick but usually poor measure of circuit complexity and size. Conversely, laying out a circuit provides an exact measure of (technology-dependent) area, but the time and effort required to lay out a circuit makes this impractical.
Since the search space for PSCs is very large (2^{2^k} functions for a k : 1 compaction function), a quickly calculable measure of area must be used. However, since much effort is being placed in the search process, an accurate measure of area is greatly desired. Boolean functions of a small number of variables (k < 7) can be minimized and mapped into gate-level implementations by programs such as misII and SIS in times on the order of seconds [Brayton87]. By mapping functions into gate-level implementations, exact gate counts are obtained for a particular technology represented by the cell library used in the mapping process. A faster, but less accurate, measure of area is the literal count of the factored Boolean expression. The literal count is a common metric for evaluating the effectiveness of decomposing and factoring Boolean expressions [Vasudevamurthy90]. Although factoring and decomposition are integral to minimizing the function for mapping, skipping the mapping process itself saves a significant amount of CPU time.

The greatest drawback of PSCs relative to circuit-independent space compactors such as parity trees or quadratic compactors is the effort required to select appropriate PSC functions. Since PSCs are circuit-dependent, the CUT for which the PSC is being selected must be characterized. The fault-free output of the CUT in response to the input test patterns used in a test must be obtained through logic simulation. Also, estimates of the expected errors caused by targeted faults must be obtained through fault simulation. The fault simulation process may take several hours or days depending upon the fault model and the desired accuracy of the expected error rate. As mentioned above, evaluating the precise cost of prospective PSCs may require prohibitive CPU resources.
Estimating the effectiveness of prospective PSC functions by calculating the probability of aliasing may also require significant, although generally not prohibitive, CPU time.

3.3 Error Propagation Measure and Aliasing

The most accurate measure of the effectiveness of a PSC (or any test scheme) is defect coverage. Unfortunately, defect coverage is difficult to measure [Williams81]. The most accurate practical metric for measuring the effectiveness of a potential PSC is exact fault coverage obtained through fault simulation. As is usual when dealing with compactors, this is not computationally feasible, especially in the context of searching through thousands of potential PSC candidates. Since the effectiveness of a PSC depends upon its ability to produce an output vector different from the fault-free output given an errored input vector, a good measure of effectiveness is the probability of aliasing P_al. One measure of the probability of aliasing proposed by Zorian and Ivanov in [Zorian93], the probabilistic error propagation measure (EPM), is discussed below.

The probabilistic EPM uses the expected fault-free output data matrix and the homogeneous independent multiple bit error model [Ivanov92] discussed in Section 2.3.2 to characterize the expected m × n faulty output data matrices D̂. Deterministically obtaining the fault-free output data matrix D requires only a one-time logic simulation effort. Since only compaction in the spatial dimension is desired, only the column vectors of the data matrices are of interest. Of particular interest are the number of times c_i each fault-free data vector i to be compacted appears in the fault-free output matrix, and the total number of columns l.
Using the homogeneous independent multiple bit error model requires a fairly significant one-time fault simulation effort to characterize the bit error probabilities δ_j for each output line (i.e., each row in the output data matrix) j of the CUT in the presence of the targeted (typically, single-stuck-at) faults. With the line bit error probabilities δ_j, the probabilities of occurrence of all 2^m − 1 (non-zero) error vectors and the probabilities of occurrence γ_i of each (fault-free or errored) output vector can be calculated. The probability that a given set of (possibly faulty) vectors produces the expected (fault-free) results is given by ∏_i γ_i^(c_i). Note that the more often a vector v_i appears in the fault-free sequence, the less likely it will always remain fault-free in the face of finite probabilities of errors for the output lines in a faulty circuit. The probability that a response vector is actually error-free is

    β = ∏_{j=1}^{m} (1 − δ_j),    (3.6)

and the probability that all l input vectors are error-free is β^l. The probability of aliasing is the probability that an output sequence matches that from the fault-free input sequence less the probability that the sequence was actually fault-free (see Equation 3.7):

    P_al = ∏_{i=0}^{2^m − 1} γ_i^(c_i) − β^l.    (3.7)

    i    a2   a1   a0   F    c_i   γ_i
    0    0    0    0    0    0     0.971
    1    0    0    1    0    2     0.809
    2    0    1    0    0    1     0.819
    3    0    1    1    0    0     0.801
    4    1    0    0    0    1     0.884
    5    1    0    1    1    0     0.764
    6    1    1    0    1    2     0.724
    7    1    1    1    1    5     0.796

    Table 3.1: Probabilities of occurrence.

Table 3.1 lists the probabilities of occurrence γ_i for the vectors a_i, 0 ≤ i ≤ 7, and line error probabilities δ_0 = 0.05, δ_1 = 0.1, and δ_2 = 0.2. With these line error probabilities, β = (0.95)(0.90)(0.80) = 0.684. For the input vector distribution c_i of the eleven input vectors, also listed in Table 3.1, the probability of aliasing for the function F = (a0 + a1) · a2 is given by

    P_al = ∏_i γ_i^(c_i) − β^11    (3.8)
         = (0.809)^2 · (0.819) · (0.884) · (0.724)^2 · (0.796)^5 − (0.684)^11    (3.9)
         = 0.064041.    (3.10)
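The worked example above can be checked numerically. The following Python sketch (the helper names and bit ordering are our own) recomputes the occurrence probabilities γ_i of Table 3.1 and the probability of aliasing from the line error probabilities alone:

```python
from itertools import product

# Line bit error probabilities delta_j for lines a0, a1, a2 (Table 3.1 example).
DELTA = {0: 0.05, 1: 0.10, 2: 0.20}

def F(a2, a1, a0):
    """The example compactor F = (a0 + a1) . a2."""
    return (a0 | a1) & a2

def gamma(v):
    """Probability that an errored version of v = (a2, a1, a0) still
    produces the fault-free compactor output F(v)."""
    total = 0.0
    for e in product((0, 1), repeat=3):        # error vector (e2, e1, e0)
        p = 1.0
        for line, bit in zip((2, 1, 0), e):    # line j flips with prob delta_j
            p *= DELTA[line] if bit else 1.0 - DELTA[line]
        w = tuple(vb ^ eb for vb, eb in zip(v, e))
        if F(*w) == F(*v):
            total += p
    return total

vectors = list(product((0, 1), repeat=3))      # v_i = (a2, a1, a0), i = 0..7
counts = [0, 2, 1, 0, 1, 0, 2, 5]              # c_i, with l = sum = 11
beta = (1 - 0.05) * (1 - 0.10) * (1 - 0.20)    # = 0.684
match = 1.0
for v, c in zip(vectors, counts):
    match *= gamma(v) ** c                     # product of gamma_i ** c_i
P_al = match - beta ** sum(counts)             # Equation 3.7
print(round(P_al, 6))                          # → 0.064041
```

Running this reproduces γ_0 = 0.971, γ_1 = 0.809, and so on, and the final value P_al = 0.064041 of Equation 3.10.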
This measure of the probability of aliasing is straightforward to calculate. It takes into account the functional behaviour of both the fault-free and faulty CUT, and is a reasonably accurate estimate of fault coverage. A positive correlation between low P_al and high fault coverage of SSA faults has been empirically shown in [Zorian93], and is supported by experiments presented in Chapter 5.

3.4 Implementing PSCs

Due to the enormous number of possible combinational PSC functions (2^(2^k) functions for a k : 1 compaction function), it is not computationally feasible to evaluate every one for any non-trivial value of k. For example, for k = 4, there are 65,536 Boolean functions, and for k = 5, there are 4.29 x 10^9 functions! When testing circuits with regular structures, e.g., ROMs, RAMs, FIFOs, etc., PSCs can be intelligently chosen/designed to effectively compact the regular output from such structures [Zorian93]. For finding simple k : 1 PSCs for random logic circuits, the focus of this thesis, it is necessary to limit the space of PSC functions searched. Since cost is an important criterion in evaluating the suitability of a PSC candidate, one strategy would be to restrict PSC searches to the set of low-cost or easily realizable compactors [Zorian93]. Unfortunately, this tactic places arbitrary limits or bounds on the search space. One method of limiting the search to a computationally feasible number of functions, while still sampling the entire space of Boolean functions rather than an arbitrarily chosen region, is to base the search on a randomized search technique such as genetic algorithms.

Chapter 4

Genetic Algorithms

This chapter provides an introduction to genetic algorithms, outlines their basic application as a search technique, explains the underlying theory behind them, and explores advanced concepts used to improve their effectiveness in searches. Throughout the chapter, examples are used to illustrate various concepts.
Wherever possible, the basis for these examples is the application of genetic algorithms to the search for space compactors.

4.1 An Introduction to Genetic Algorithms

Genetic algorithms are one of a class of problem solving techniques based upon analogies to natural processes [Davis87]. The motivation behind the use of these techniques, such as evolutionary programming [Michalewicz92], simulated annealing [Kirkpatrick83], and neural networks, is the hope that artificial modeling of natural processes will provide simple and elegant solutions, like those found in nature, to problems too complex for traditional, "man-made" techniques.

Genetic algorithms (GAs), first proposed by John Holland in [Holland75], model the process of evolution by natural selection in nature. The power of GAs lies in the ability of simple structures, e.g., bit strings, to represent complex objects, characteristics, or structures, and the ability of a simple set of transformations (modeled after genetic operations) to improve those structures. GAs use the notion of "survival of the fittest," where fitness is a measure of how good or bad a string is, along with a randomized, but structured, method of exchanging information among strings.

Genetic algorithms are often referred to as search algorithms, a class of machine-learning techniques, or classifier systems [Goldberg89]. The variety of terms used to describe genetic algorithms is a tribute to their versatility.
They have been applied to problems as diverse as numerical optimization [Michalewicz92], the Prisoner's Dilemma [Axelrod87], the Travelling Salesman problem [Grefenstette85], classroom scheduling [Colorni90], image processing [Geman84], and pattern recognition [Forrest85, Gillies85].

Along with versatility, genetic algorithms demonstrate a high degree of efficiency. The efficient search of a large space involves making a tradeoff between robust exploration of the space (widespread coverage of the search space) and exploitation (focused scanning of regions where good solutions are believed to be clustered together) of the regions with the best available solutions and their variants. Theoretical analysis of genetic algorithms suggests that they can make near-optimal tradeoffs [Holland75]. Although GAs are considered weak, i.e., problem-independent, search methods, problem-specific knowledge can be incorporated into the GA to improve both exploration and exploitation. Through the use of appropriate problem-specific information, GAs can be robustly applied to many different types of applications to efficiently find reasonable solutions to complex problems [Booker87, Grefenstette87].

4.1.1 Traditional Search and Optimization Methods

Optimization seeks to improve the performance of a solution to approach some optimal solution. Reaching the optimal solution, though seemingly the ultimate goal, must not be over-emphasized. The cost to reach the optimal solution or other near-optimal solutions is an equally important factor in judging an optimization process. The process of continually finding improved solutions, i.e., the journey, is as important as the destination. The best optimization method is one which efficiently uses available resources to provide satisfactory solutions within the time allotted.

There are three main classes of search methods: calculus-based, enumerative, and random [Goldberg89].
Calculus-based methods use the derivatives of a function to either directly or indirectly locate the extrema of the function. Indirect methods solve the set of equations resulting from setting the gradient of the function to zero. Direct methods employ a hill-climbing strategy and travel along the function in directions suggested by the local gradient. One weakness of calculus-based methods is their narrow scope. The methods search for optimal solutions only in the neighbourhood of the current point. Thus, they easily find local optima but do not robustly explore the space enough to find global (or merely better local) optima. Additionally, calculus-based methods rely upon the existence of derivatives. Therefore, the function must be continuous. Unfortunately, solving complex, discontinuous, multimodal problems is beyond the capability of this classical technique.

Enumerative methods simply evaluate every possible solution in a finite search space to find the optimal solution. While simple, and obviously robust in their exploration, these methods are woefully inefficient for large spaces. This lack of efficiency makes enumerative methods infeasible due to the lack of resources: the size of a search space can easily exceed the capability of the fastest computer to evaluate every point in any reasonable length of time.

Random search methods, such as purely random walks through a search space, are no more efficient than enumerative searches. They merely randomly distribute, around an unchanged mean, the effort required to find the optimal solution. As problems become increasingly large and complex, reducing the average problem-solving effort, something purely random search methods do not do, becomes increasingly important.

Unlike purely random search methods, randomized search techniques are increasingly popular alternatives to calculus-based and enumerative techniques.
Search techniques which strategically use random processes to direct the search can provide highly efficient searches which exhibit both robust exploration and high exploitation. Genetic algorithms and simulated annealing [Kirkpatrick83] are two schemes which use randomized search procedures. Not coincidentally, both methods are metaphors for processes found in nature, namely annealing in solids and evolution.

4.1.2 Simulated Annealing

Simulated annealing is a randomized search technique whose underlying metaphor is annealing in solids. This technique exploits the relationship between the principles of statistical mechanics, which describe the probable behaviour of the enormous number of molecules (on the order of 10^23 per cubic centimetre) in a liquid or solid, and combinatorial optimization. Simulated annealing has been found to be capable of finding near-optimal cost solutions to large problems featuring many degrees of freedom [Kirkpatrick84].

Statistical mechanics [Serway89], the cornerstone of condensed matter physics, probabilistically applies the principles of mechanics to large systems of interacting particles on the microscopic level in order to predict the macroscopic behaviour of such systems. This discipline relies upon the fact that, while a system is in thermal equilibrium, due to the huge number of particles in the system, only the most statistically probable behaviour of that system is macroscopically observable. The Boltzmann distribution describes the fluctuations of a system about its average behaviour.
In particular, it describes the probability π_T(s) that a system in thermal equilibrium, at a temperature T, is in a particular configuration s, given the energy E(s) of the configuration, i.e.,

    π_T(s) = e^(−E(s)/kT) / Σ_{w ∈ S} e^(−E(w)/kT),    (4.11)

where k is Boltzmann's constant and S is the set of all possible configurations [Davis87].

Of fundamental concern is the behaviour of a system at or near its transition (or critical) temperature T_c, and in particular whether the liquid forms a crystalline or polycrystalline solid. To form a crystalline solid, the molecules of the fluid must enter minimal or near-minimal energy configurations. These low-energy configurations comprise only a small fraction of the total number of configurations, yet they are extremely common at low temperatures. The dominance of these configurations is the result of the Boltzmann distribution [Kirkpatrick83].

The process of finding minimal energy configurations is characteristic of the annealing process, used, for example, to grow a single crystal from a melt. Simply lowering the temperature of a melt is not sufficient to ensure the growth of a single crystal. The temperature of the melt must be lowered slowly, allowing enough time at each interval for the melt to achieve thermal equilibrium. Additionally, the temperature intervals must become increasingly smaller near the freezing point T_c. If done properly, the system will find the minimal energy configuration and result in a single crystal. Otherwise, the resulting solid will be a crystal with structural defects or will be glassy [Kirkpatrick83].

Simulated annealing draws an analogy between man's search for optimal combinatorial configurations and nature's search for minimal energy configurations of molecules in a solid. The configuration of particles is analogous to the abstract representation of a combinatorial problem. The energy function can be likened to a cost function.
Temperature represents the control parameter for the overall search procedure.

The four parts of a simulated annealing system are:

1. a system configuration description,
2. a set of randomized rearrangements of elements of the configuration,
3. a cost (objective or goal) function, which evaluates the desirability of a potential solution, and
4. an annealing schedule, which consists of a decreasing sequence of "temperatures" and a specific number of rearrangements of the configuration in an attempt to reach equilibrium at each temperature.

The basic simulated annealing algorithm is illustrated in Figure 4.1. The search begins by initializing the effective temperature T to a high value in order to "melt" the system, and randomly or heuristically choosing an initial system configuration. Annealing of the system is accomplished by slowly "cooling" the system, allowing it to reach equilibrium at each successively lower temperature, until it "crystallizes."

    initialize temperature T
    initialize configuration s
    while (solution is not good enough)
    {
        while (not reached equilibrium)
        {
            select a new configuration s' by randomly altering s
            evaluate the cost of s, F(s)
            evaluate the cost of s', F(s')
            ΔE = F(s') − F(s)
            if (ΔE < 0)
                s ← s'
            else if (Random[0,1] < e^(−ΔE/T))
                s ← s'
        }
        lower T according to annealing schedule
    }

    Figure 4.1: Simulated annealing algorithm.

Equilibrium, at a given temperature, is achieved by applying the Metropolis algorithm [Metropolis53], which simulates a steady-state system of particles, at each temperature. In every iteration of the Metropolis algorithm, a new configuration consisting of a small random rearrangement of the present configuration is created and the resulting change ΔE in the cost (energy) of the system is calculated. If ΔE < 0, the new configuration is accepted as the starting point for the next iteration. On the other hand, if ΔE > 0, the new configuration is given a probability of P(ΔE) = e^(−ΔE/T) of being accepted.
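As a concrete illustration, the loop of Figure 4.1 might be sketched in Python as follows. The cost function, move generator, and geometric cooling schedule used here are illustrative assumptions, not part of the algorithm as originally stated:

```python
import math
import random

def simulated_annealing(initial, cost, neighbour,
                        T=10.0, cooling=0.95, moves_per_T=100, T_min=0.01):
    """Annealing loop of Figure 4.1 with a geometric cooling schedule.

    `cost` and `neighbour` are problem-specific; the schedule parameters
    here are illustrative defaults, not recommendations.
    """
    s = initial
    while T > T_min:                         # until the system "freezes"
        for _ in range(moves_per_T):         # approach equilibrium at this T
            s_new = neighbour(s)
            dE = cost(s_new) - cost(s)
            # Metropolis rule: always accept improvements, and accept
            # worse configurations with probability e^(-dE/T)
            if dE < 0 or random.random() < math.exp(-dE / T):
                s = s_new
        T *= cooling                         # lower T per the schedule
    return s

# Toy usage: minimize a one-dimensional quadratic with minimum at x = 3.
best = simulated_annealing(
    initial=8.0,
    cost=lambda x: (x - 3.0) ** 2,
    neighbour=lambda x: x + random.uniform(-1.0, 1.0))
print(abs(best - 3.0) < 1.0)
```

The toy run almost always finishes near x = 3; since the search is randomized, individual runs vary.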
This selection of P(ΔE) encourages the evolution of the system towards the Boltzmann distribution. With this strategy, less desirable configurations have a finite probability of being accepted as long as T > 0. Repetition of this basic step of the Metropolis algorithm simulates the thermal motion of particles in equilibrium. Once equilibrium is achieved for a given temperature, the temperature is lowered according to the annealing schedule until the system "freezes" in a particular configuration.

Simulated annealing has several advantages over traditional search and optimization methods. Its main strength is its ability to avoid becoming trapped in local optima. Since the Metropolis algorithm probabilistically allows for the acceptance of less desirable configurations, rather than only more desirable configurations (cf. hill climbing), at each step of the search, the algorithm can pop out of local optima. Also, control over the annealing schedule grants the ability to control the trade-off between the amount of resources used and the quality of the solution found. In particular, CPU time can be limited by quickening the temperature drop or relaxing the conditions for declaring thermal equilibrium.

Unfortunately, simulated annealing also has its drawbacks. Simulated annealing, like genetic algorithms, is a very versatile problem-independent technique which can be applied to a wide range of problems. However, such versatility can be a weakness. First, the necessary framework in which a problem or search must be formulated can sometimes be unnatural. An awkward representation of a problem can make it difficult to incorporate useful problem-specific information into the search effort. In particular, designing an appropriate cost function which satisfactorily quantifies all of the qualities desired in a potential solution and accounts for all possible trade-offs can be difficult.
Also, developing a suitable annealing schedule of temperatures and equilibrium conditions is not trivial. As undesirable as it may seem, trial and error is frequently the only method. Secondly, optimization problems may consist of systems with unique, non-regular components, unlike the system simulated annealing attempts to model, in which all particles are identical and the optimal configuration is a regular crystal. Finally, simulated annealing has been criticized as being inherently slow [Geman84]. The lower bound for annealing schedules necessary to guarantee convergence is high, but it is often possible to use more aggressive schedules to produce reasonable, if not optimal, results.

Simulated annealing has already proved to be a valuable technique for solving a variety of complex optimization problems. It has been applied to problems as diverse as the Travelling Salesman problem [Kirkpatrick84] and image restoration [Geman84]. One of the first applications of simulated annealing, component placement, has turned out to be one of its most successful. Simulated annealing is a key ingredient of the place and route algorithms used in modern VLSI layout tools.

4.1.3 Genetic Algorithms

Genetic algorithms differ greatly from traditional search and optimization methods, and have much in common with simulated annealing. Both GAs and simulated annealing model natural processes; even their basic algorithms bear striking resemblances to each other. They both employ objective functions measuring the cost or fitness of potential solutions, rather than derivatives or other ancillary information, and use them to direct their searches probabilistically, rather than deterministically, to avoid becoming trapped in local optima.
Both techniques can be applied to almost any kind of problem, even discontinuous, multimodal problems with huge search spaces that are beyond the capabilities of traditional methods.

Some of the ways in which GAs differ from simulated annealing include the use of a population of potential solutions instead of a single solution, which makes GAs inherently parallel in nature, the use of "standard" genetic operators to rearrange configurations, and the (subtle) differences between their respective frameworks. The use of a population of potential solutions, rather than just a single candidate, is the basis for the robustness of GAs. The diversity of the distinct members of the population permits the GA to robustly explore the extents of the search space, especially during the initial stages of the search. As the search converges and the population becomes dominated by the fittest individuals, the population encourages the exploitation of the region of the search space around the dominant individual.

4.2 Fundamental Components

Genetic algorithms are randomized search techniques whose underlying metaphor is evolution by natural selection. Chromosomes, which encode the characteristics of biological organisms, are represented by strings. In genetics, chromosomes are composed of genes residing in various positions (loci) which may take on values called alleles. For example, the eye colour gene of an animal's chromosome may reside at locus eight and take on the allele values brown, blue, or green. In GAs, strings, analogous to chromosomes, are composed of genes which may take on allele values according to the symbol alphabet (e.g., {0, 1} for binary strings).

A system using genetic algorithms to solve a problem consists of five components:

1. a chromosomal representation of possible solutions to the problem,
2. a method for generating an initial population of solutions,
3. an objective or goal (or cost) function which evaluates the fitness of a potential solution, i.e., a measure of how good the solution is,
4. genetic operators which change the composition of potential solutions during "reproduction," and
5. values for various parameters of the genetic algorithm (population size, probabilities of applying genetic operators, etc.).

The basic structure of genetic algorithms is illustrated in Figure 4.2. The search begins by generating, randomly or heuristically, an initial population of potential solutions. In every iteration of the algorithm, i.e., for every generation, the fitness of every individual of the population is evaluated. The fitness of each individual relative to the population as a whole determines the probability of that individual's selection as a member of the population's next generation.

    initialize population P(0)
    evaluate fitness of the members of the initial generation P(0)
    while (termination condition is not met)
    {
        select members of next generation of population P(g + 1) from P(g)
        perform crossover with selected members of P(g + 1)
        mutate members of P(g + 1)
        evaluate fitness of members of current generation P(g)
        g ← g + 1
    }

    Figure 4.2: Basic genetic algorithm.

For example, in the most commonly used mechanism, roulette wheel selection [Goldberg89], each member of the current population is assigned a selection probability (the width of its slot in the roulette wheel) proportional to the ratio of its fitness to the total fitness of the population (the total circumference of the wheel). The (g + 1)th generation of a population of size p is selected from the gth generation by "spinning the wheel" p times. With the fittest members of the population having greater probabilities of selection than unfit members, fitter individuals have exponentially increasing exposure in successive populations.
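The roulette wheel mechanism just described can be sketched directly. The following Python fragment is an illustrative sketch of one "spin"; it assumes non-negative fitness scores (which the normalization discussed later is meant to ensure):

```python
import random

def roulette_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to its fitness.

    Illustrative sketch; assumes all fitness scores are non-negative.
    """
    total = sum(fitnesses)
    spin = rng.uniform(0.0, total)        # where the "ball" lands on the wheel
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit                    # each slot's width is its fitness
        if spin <= running:
            return individual
    return population[-1]                 # guard against floating-point slack

# selecting the next generation of a population of size p = 4
pop = ["0110", "1011", "0001", "1110"]
fit = [1.0, 4.0, 0.5, 2.5]
next_gen = [roulette_select(pop, fit) for _ in range(len(pop))]
print(len(next_gen))   # → 4
```

With these fitness values, "1011" occupies half the wheel and is expected to appear about twice among the four spins, though any composition is possible.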
However, since selection is done probabilistically, even individuals with poor overall fitness scores, but potentially valuable genetic material, have a finite probability of living. This selection strategy preserves the potentially valuable genetic material of unfit chromosomes until they can be "mated" with fitter counterparts. After the selection step, the members of the new population are randomly selected for binary crossover. Binary crossover mixes or swaps the genetic material between two individuals to create two offspring sharing the qualities of both parents. The final genetic operation is mutation, which randomly alters a small proportion of the genetic material in the population in an attempt to introduce genetic sequences which have been lost or otherwise do not exist in the population. The end of the search can be based on a number of criteria, singly or in combination. Most commonly, genetic searches simply run through a specific number of generations. Alternatively, the search may be programmed to end if the aggregate fitness of the population reaches a predetermined level, if the aggregate fitness is stagnant (fails to improve after a certain number of generations), or if the population becomes dominated by a single individual which comprises a certain proportion of the population. Ideally, the end of the search will result in a highly evolved, very fit "survivor."

4.2.1 Chromosomal Representation

The fundamental data structure used in genetic algorithms to represent chromosomes is the bit string [Michalewicz92]. Alternate chromosomal representations such as linked lists have been used for specific applications, but the bit string is the most widely used and extensively studied [Goldberg89].

For any given problem, the possible chromosomal encodings of its configurations are endless.
It is usually best to use the most natural, and often simplest, encoding so as not to impede the incorporation of problem-specific information into the search. In numerical optimization problems, a bit string may represent a number such as an unsigned integer, for example,

    v = (01101),    01101_2 = 13_10.

In a more complex example, a chromosome may represent a Boolean function, and the bits in the string may represent the presence or absence of minterms in the function. In this encoding scheme, the minterm binary string (MBS) scheme, all combinational Boolean functions of k variables can be represented by a 2^k-bit string. For example, for a 3-bit Boolean function f(b2, b1, b0) encoded by a 2^3-bit string v, the value 10001101 represents the function:

    f(b2, b1, b0) = m7 + m3 + m2 + m0

where m_j represents the jth minterm, i.e.,

    f(b2, b1, b0) = b2·b1·b0 + b2'·b1·b0 + b2'·b1·b0' + b2'·b1'·b0'.

4.2.2 Generating Initial Populations

Since genetic algorithms are relatively robust, the composition of the initial population should not critically affect the convergence of the algorithm on a final solution [Goldberg89]. Therefore, it is often sufficient to merely generate random bit vectors to serve as the initial population. Having a random initial population has the benefit of beginning the search with diverse genetic material, which translates into a set of widespread initial points in the search space.

In the interest of efficiency, problem-specific knowledge can be used by the initialization scheme to speed up convergence [Grefenstette87]. Some strategies include weighting the random initialization toward known solutions and seeding the initial population with potential solutions (educated guesses).
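For the MBS encoding, generating a random initial population and evaluating an encoded function are both one-liners. The sketch below is our own; in particular, the convention that the leftmost chromosome bit is the coefficient of the highest-numbered minterm is an assumption made for illustration:

```python
import random

K = 3                       # number of lines compacted (k : 1 compactor)
CHROM_LEN = 2 ** K          # one bit per minterm (MBS encoding)

def random_population(size, rng=random):
    """Random bit-string chromosomes for the initial population."""
    return [[rng.randint(0, 1) for _ in range(CHROM_LEN)]
            for _ in range(size)]

def evaluate(chrom, b2, b1, b0):
    """Output of the encoded function; the leftmost chromosome bit is
    taken as the coefficient of minterm m7, the rightmost as m0
    (an illustrative convention)."""
    minterm = (b2 << 2) | (b1 << 1) | b0
    return chrom[CHROM_LEN - 1 - minterm]

# 10001101 encodes f = m7 + m3 + m2 + m0
f = [1, 0, 0, 0, 1, 1, 0, 1]
pop = random_population(6)
print(len(pop), evaluate(f, 1, 1, 1), evaluate(f, 1, 0, 0))   # → 6 1 0
```

Seeding, as described above, then amounts to replacing a few of the random chromosomes with hand-chosen bit strings before the first generation is evaluated.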
For example, if the parity function is commonly found to be a near-optimal space compaction function, seeding the initial population with one or more individuals representing the parity function could expedite the search process.

4.2.3 Evaluating Fitness

The evaluation function, which rates potential solutions in terms of fitness, is a metaphor for the environment. The results of the evaluation determine which individuals in the population live to reproduce or die. Individuals with high fitness are more likely to be selected for reproduction than those with low fitness, thus ensuring the survival of the desirable genes. Individuals with low fitness are more likely to be simply discarded from the population to make room for the children of the more fit.

The evaluation function is an abstract measure of how good a potential solution represented by a chromosome is. This function, sometimes referred to as the objective function, is the sole embodiment of the principles used to direct the search. Although genetic algorithms cannot derive the solutions to a problem, they can determine how close they are to better solutions through the use of the objective function. The design of the evaluation function must take into account all of the desirable qualities of a solution. In the case of a search for a space compactor, the objective function F must account for fault coverage (estimated by the probability of aliasing P_al) and area overhead (implementation cost C_area). For example:

    F(P_al, C_area) = log(1/P_al) / √C_area,    (4.12)

where potential fault coverage is represented by the probability of aliasing P_al, and the implementation cost by C_area. As shown in Equation 4.12, the fitness function F is related to the inverse of the probability of aliasing and inversely related to the implementation cost. The logarithm of the inverse of the probability of aliasing, log(1/P_al), is taken since 1/P_al tends to vary by many orders of magnitude.
The square root of the cost C_area is taken to modify the emphasis between the probability of aliasing and the cost. (See Section 5.1.3 for a more detailed discussion of fitness evaluation.)

To properly guide a genetic search, the evaluation function must be normalized. The normalization process must scale "raw" fitness scores so that good individuals have a fairly high (but not too high) probability of reproducing, and poor individuals have a low (but not too low) probability of reproducing. If the difference between the fitness of a very good individual and a very poor one is low, say a few percent, it will take many generations for the descendants of the good individual to outnumber the descendants of the poor one. Conversely, if the difference in fitness is several orders of magnitude, the good individual will quickly dominate the population, and the possibly valuable genetic material of the poor individual will be lost to the crossover operator, thereby limiting the search to the local area of the dominant individual. Various normalization methods are presented in Section 4.4.1.

4.2.4 Genetic Operators

Bit string operations, based upon genetic operations, that alter the configuration of the members of the population provide the basis for the evolution of solutions. While the list of possible bit string operations is endless, the most commonly used operators are binary crossover and mutation. Although the use of problem-specific heuristics in an application often leads to the use of specialized genetic operators, the operators are invariably based upon the basic crossover and mutation operators [Davis87].

Binary crossover mixes the genetic material of two parents to create two children who share the traits of both parents. The crossover operator takes the first set of bits, up to a randomly chosen "crossover point," from one parent and the remaining bits, from the crossover point onward, from the other parent. In effect, the parents swap bits from the crossover point.
For example, applying the crossover operator to two chromosomes, v1 and v2,

    v1 = (0011001|10100111)
    v2 = (1011011|00100110)

with seven as the crossover point results in the children:

    v1' = (0011001|00100110)
    v2' = (1011011|10100111).

Mutation, a unary operation, involves the probabilistic alteration of the allele at a specific locus to another allele value. Where bit strings are used, mutation simply involves the flipping of a bit from 1 to 0 or vice versa. Typically, each bit in a bit string has an equal probability of mutating. Variations of the basic crossover operator, such as multi-point crossover, and other varieties of mutation, such as non-uniform mutation, are discussed in Section 4.4.2.

4.2.5 Parameters for the Genetic Algorithm

Determining the most effective set of parameter values for using genetic algorithms to solve a problem is likely to be as difficult as solving the problem itself. It is only necessary to choose a reasonable set of parameter values, dependent as much on the available problem-solving resources as on the nature of the problem. Population sizes typically used in academic research are on the order of fifty to several hundred. Generation limits for genetic algorithms are similar. Since the population size multiplied by the maximum number of generations gives an upper bound on the computational effort for a search (assuming fitness evaluation is the most costly operation), population size and generation limits are often made as large as possible to run within desired time constraints.

The probabilities of applying the genetic operators are dependent on the population size and chromosome length. Typical probabilities P_cross for the application of the crossover operator to population members range from 0.1 to 0.33. With a crossover probability of P_cross = 0.1, every individual selected for survival into the next generation has a 10% chance of being selected for crossover.
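The two basic operators can be sketched in a few lines of Python. The fragment below (an illustrative sketch, assuming 15-bit chromosomes as in the crossover example of this section) applies one-point crossover and per-bit mutation:

```python
import random

def crossover(p1, p2, point):
    """One-point binary crossover: the children swap tails at `point`."""
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def mutate(chrom, p_mut, rng=random):
    """Flip each bit independently with probability p_mut."""
    return [bit ^ 1 if rng.random() < p_mut else bit for bit in chrom]

# the 15-bit crossover example, with seven as the crossover point
v1 = [0, 0, 1, 1, 0, 0, 1,  1, 0, 1, 0, 0, 1, 1, 1]
v2 = [1, 0, 1, 1, 0, 1, 1,  0, 0, 1, 0, 0, 1, 1, 0]
c1, c2 = crossover(v1, v2, 7)
# c1 keeps the first seven bits of v1 and the last eight of v2
print(c1 == [0, 0, 1, 1, 0, 0, 1,  0, 0, 1, 0, 0, 1, 1, 0])   # → True
```

Applying `mutate` with P_mut = 0.01 to every chromosome in every generation gives each bit the 1% chance of flipping described below.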
The crossover step replaces the selected parent individuals, which are randomly paired together, with their offspring created by the binary crossover operator. A typical mutation rate P_mut may be 0.01. With such a mutation rate, each bit in every chromosome of the population has a 1% chance of changing value in every generation. Since mutation is applied on a per-bit basis, mutation is more prevalent when longer bit strings are used.

4.3 Mathematical Foundations

4.3.1 Fundamental Theorems

The principal assumption behind the operation of GAs is that the best solutions to a problem are clustered together with other good solutions, and that these regions can be found by robust exploration of the search space. Once these regions are found, exploitation of those regions of the space leads to the best solution.

Similarity templates, or schemata, provide the theoretical framework for studying the theory behind the operation of GAs [Holland75]. A schema is a template which describes a subset of strings which match each other in certain positions. When dealing with binary strings over an alphabet C = {0, 1, *}, where * is a don't care symbol, a schema represents the set of all bit strings which match the schema in all positions except for those with a * symbol. For example, the schema (01*01*) matches the set of strings {(010010), (010011), (011010), (011011)}. Any schema containing r * symbols matches 2^r strings. Theoretical analysis of the reproductive growth of schemata results in the schema theorem. The Schema Theorem [Goldberg89] states that short schemata (those with little distance between the first and last fixed positions) of low order (those with a small number of fixed positions) that possess above-average fitness receive an exponentially increasing number of trials in successive generations of a genetic search.
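Schema membership is easy to check mechanically. The short Python sketch below (our own helper, not part of the schema theory itself) enumerates the strings matched by the example schema:

```python
def matches(schema, string):
    """True if the bit string fits the schema ('*' is a don't care)."""
    return all(s == '*' or s == b for s, b in zip(schema, string))

schema = "01*01*"
candidates = [format(i, "06b") for i in range(64)]   # all 6-bit strings
matched = [c for c in candidates if matches(schema, c)]
print(matched)   # → ['010010', '010011', '011010', '011011']
```

With r = 2 don't-care positions, the schema matches 2^2 = 4 strings, as the output confirms.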
Unfortunately, since implementations of genetic algorithms use finite populations (sets of sampling points) and probabilistic rules for selecting the progenitors of successive populations, premature loss of desirable schemata, which can lead to sub-optimal search results, is possible. Alternative sampling mechanisms, developed to reduce the sampling error introduced by the selection mechanism, are discussed in Section 4.4.2.

In general, searches using GAs tend to run the course predicted by the schema theorem, because, probabilistically, the short, low-order schemata of above-average fitness tend to survive. These desirable schemata serve as the building blocks for the creation of longer, highly-fit schemata through the crossover operator, as the desirable attributes of the shorter schemata are inherited by the offspring. The Building Block Hypothesis [Goldberg89] states that:

"a genetic algorithm seeks near-optimal performance through the juxtaposition of short, low-order, high-performance schemata, called the building blocks."

Although intuitively sound, the building block hypothesis does not always hold true. For example, in the search for a 3-bit space compaction function f to compact a sequence of vectors consisting only of v0 = 000 and v7 = 111, it is necessary for the compactor to recognize minterms 0 and 7. Therefore, given the MBS chromosomal representation of Boolean functions shown in the example of Section 4.2.1, the short, low-order schemata

s0 = (*******1)

and

s7 = (1*******)

may each be judged to have a high level of fitness.
Unfortunately, the combination of the two building blocks, i.e., (1******1), will have very low fitness, since the only vectors in the sequence, v0 and v7, evaluate to the same output f(v0) = f(v7) = 1, which results in aliasing. On the other hand, the combination of the two schemata

s0 = (*******1)

and

s7' = (0*******)

will result in a schema (0******1) that will have a very high fitness, since the outputs from the two possible input vectors are different, i.e.,

f(v7) = 0 and f(v0) = 1,

which leads to good aliasing performance. In spite of the potential theoretical pitfalls, in general, the growth of above-average population members assured by the schema theorem, and the construction of longer, higher-order, fitter solutions from building blocks suggested by the building block hypothesis, are the foundations for the optimism behind genetic algorithms.

4.3.2 Problems and Limitations

Despite their promise, genetic algorithms possess several weaknesses. Because of the lack of inter-generational memory, GAs search through the combinations of genes even if only the permutations of the genes are of interest. Depending on the chromosomal representation, some bit strings may not represent solutions in the valid search space. In this case, the genetic operators must be modified to avoid transforming individuals out of the valid search space, or a mechanism which converts illegal configurations into legal ones must be used. The effects of both weaknesses can be minimized or even eliminated through effective selection of chromosomal representations [Goldberg89].
In the case of the MBS minterm encoding scheme presented in Section 4.2.1, each combination of minterms forms a function with unique characteristics, and so potential exploration of all bit combinations is required. Also, all bit strings in the MBS scheme represent valid Boolean functions, so no operation performed on a chromosome can result in an illegal configuration.

4.4 Advanced Techniques

4.4.1 Fitness Scaling

A common problem of search techniques is premature convergence on local rather than global optima [Booker87]. Despite their promise of robust exploration, genetic algorithms can often exhibit this weakness. If we consider that the objective of genetic algorithms is to encourage the proliferation of the more desirable schemata at the expense of the less desirable schemata of the population, then eventually the best schemata occur in dominating proportions and convergence is achieved. There is the chance that extraordinarily fit individuals may overly dominate a population, especially if they appear early in the search when other individuals are not as highly developed. In this situation, the extreme dominance of the fittest individual causes the majority of currently undeveloped individuals, which may contain some valuable genes, to be dropped from the population. The resulting loss of useful schemata from a diverse population is known as genetic drift.

Another problem that may occur in a genetic search is post-convergence wander [Michalewicz92]. Although near the end of a search the population is ideally composed of a high proportion of a single very fit individual (or a single good schema), there may still be significant diversity in the population.
If the average fitness of the population is comparable to the fitness of its dominant member, further evolution of the population may see the downfall of the dominant member and the proliferation of mediocre individuals.

One commonly used method to combat both premature convergence and post-convergence wander is to control the proliferation of the fittest individuals by controlling the selection process via a method known as fitness scaling [Goldberg89, Michalewicz92]. The most well studied scaling method is linear scaling [Goldberg89]. With linear scaling, the raw, unscaled fitness F (F > 0) of an individual is related to its scaled fitness F' by

F' = a · F + b,    (4.13)

where the scaling coefficients a and b (not necessarily greater than 0) are chosen to meet the conditions that the average scaled fitness equals the average raw fitness (over all population members in the current generation), and the maximum scaled fitness is a set multiple C_mult of the average raw fitness, i.e.,

F'_avg = F_avg    (4.14)

and

F'_max = C_mult · F_avg.    (4.15)

As a consequence of these relationships, the average number of copies of the best individual selected for the next generation is given by C_mult, and average members of the population are most likely to promote a single copy to the next generation. The scaling process is repeated every generation (new scaling factors a and b are computed every generation) to maintain control over the selection process.

In the early stages of a search, scaling discourages extreme domination by extraordinary individuals. With linear scaling, if in a given generation the single fittest individual is 100 times as fit as the average member of the population, the next generation will contain C_mult copies of that individual. Without linear scaling, the next generation would contain 100 copies of the super-fit individual, which would prematurely end the search if the total population size was 100 or less.
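Solving the two conditions of Equations 4.14 and 4.15 for a and b gives a = (C_mult − 1)·F_avg / (F_max − F_avg) and b = F_avg·(1 − a). A minimal sketch (the function names and the zero-clamping of negative values are choices of this illustration, the latter following the treatment of negative scaled fitness described later in this section):

```python
def linear_scaling_coefficients(f_avg, f_max, c_mult):
    """Solve F' = a*F + b (Eq. 4.13) subject to F'_avg = F_avg (Eq. 4.14)
    and F'_max = c_mult * F_avg (Eq. 4.15)."""
    a = (c_mult - 1.0) * f_avg / (f_max - f_avg)
    b = f_avg * (1.0 - a)
    return a, b

def scale(fitnesses, c_mult):
    """Linearly scale a generation's raw fitness values, clamping any
    negative scaled values to zero."""
    f_avg = sum(fitnesses) / len(fitnesses)
    a, b = linear_scaling_coefficients(f_avg, max(fitnesses), c_mult)
    return [max(0.0, a * f + b) for f in fitnesses]
```

For example, scaling the raw fitness values [1, 2, 3, 4, 10] with C_mult = 1.5 preserves the average (4.0) while limiting the maximum scaled fitness to 1.5 × 4.0 = 6.0.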
Scaling limits the growth rate of the number of copies of the fittest individual to C_mult times the previous amount per generation. Limiting the growth rate encourages more robust exploration of the search space with a genetically diverse population.

In the later stages of a search, linear scaling encourages competition among near-equals. A mature population may consist of highly-developed individuals of near-equal fitness, which may each represent local optima. If the raw fitness of every individual is close to the mean fitness of the population, every generation will see each individual selected once for the next generation. The result is a stagnant population. In this case, linear scaling of the raw fitness will magnify the differences in fitness among the near-equals so that the fittest will be favoured over the less fit. Controlling the growth rate encourages greater exploitation of the region of the search space containing the local optima in search of the global optimum.

One of the problems with linear scaling is dealing with negative values when scaling population members with raw fitness values far below the population average and maximum. This situation often occurs toward the end of a run, after the population has converged to a few dominant schemata but a small number of very unfit individuals remain. Since GAs require fitness values to be nonnegative, the minimum scaled fitness is limited to 0, i.e., F_min is scaled to F'_min = 0. This situation limits the maximum scaled fitness to values less than C_mult · F_avg.

Other well-known scaling techniques include sigma (σ) truncation and power law scaling [Goldberg89]. Sigma truncation is similar to linear scaling but uses knowledge of the variance of the population fitness to avoid the problem of negative scaled fitness values.
With this technique, the raw fitness is prescaled according to the relationship

F'' = F − (F_avg − c · σ),    (4.16)

where σ is the standard deviation of the fitness values of the population and c is some constant (usually 1 ≤ c ≤ 3). Negative values of F'' are clamped to zero. Simple linear scaling is then applied to the truncated fitness F'' rather than the raw fitness F. The prescaling performed in Equation 4.16 eliminates the possibility of negative scaled fitness when linear scaling is performed. With power law scaling, the scaled fitness is some power of the raw fitness, i.e.,

F' = F^k,    (4.17)

where k is some generally problem-dependent constant close to 1. It has been suggested [Michalewicz92] that k need not be a constant, and varying k according to the age of the population can enhance the search. While it is clear that some sort of normalization or scaling operation must be performed on raw fitness values in order to control the rate of population convergence, without empirical evidence it is unclear whether either sigma truncation or power law scaling is useful in the general case.

Figure 4.3: Ideal search: unscaled fitness vs. generation number.

4.4.2 Advanced Operators and Concepts

This section provides a brief survey of some advanced topics in the field of genetic algorithms. Although a detailed discussion of the theoretical aspects of GAs beyond the basics is not within the scope of this thesis, these topics are important to note for possible future work.

In addition to binary crossover and mutation, there are many other genetic operators used in genetic algorithms. One commonly used operator, inversion, is based upon the concept of reordering the bits in a section of a chromosome. It should not be confused with bitwise binary inversion, where ones are changed to zeroes and vice versa. The inversion operator takes the substring of bits between two randomly chosen inversion points and reverses their order.
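A minimal sketch of the inversion operator, using 0-based, half-open string indices (so the thesis's inversion points 4 and 9 in the example that follows correspond to p1 = 3, p2 = 8 here):

```python
def invert(chromosome, p1, p2):
    """Inversion operator: reverse the order of the bits in the substring
    chromosome[p1:p2], leaving the rest of the chromosome unchanged.
    Not to be confused with bitwise binary inversion."""
    return chromosome[:p1] + chromosome[p1:p2][::-1] + chromosome[p2:]
```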
For example, applying the inversion operator to a chromosome v = (001|10110|100111) with inversion points 4 and 9 yields the chromosome v' = (001|01101|100111).

Most advanced genetic operators are simply variations of the basic crossover, mutation and reordering operators. Multi-point crossover exchanges bits of two chromosomes in several sections, rather than just one. Arithmetical crossover [Michalewicz92] forms offspring that are linear combinations of the two parent vectors, i.e.,

v1' = a · v2 + (1 − a) · v1    (4.18)
v2' = a · v1 + (1 − a) · v2,    (4.19)

where a is a constant.

Figure 4.4: Premature convergence: dominance vs. generation number.

Variations of mutation affect either the rate of mutation or its scope. With non-uniform mutation, the probability of mutation changes according to the age of the population. Typically, the mutation rate is larger at the beginning of a search, in order to encourage a more uniform search of the space, and smaller as the population ages, in order to reduce interference in a search of the local space. When dealing with large bit strings, the sheer number of bits may result in a disproportionately high number of bits changed via mutation relative to the number changed via crossover, particularly if only single-point binary crossover is used. To reduce the effect of mutation on large bit strings, it may be useful to apply mutation to the strings rather than the individual bits. Instead of using the probability of mutation to represent the chance that each individual bit in the vector is changed, it can represent the probability that the entire vector is mutated. In this case, mutation of a vector can involve changing a fixed or variable number of bits.

Figure 4.5: Post-convergence wander: dominance vs. generation number.

Besides the simple, basic roulette wheel selection scheme, more formally known as stochastic sampling with replacement, other schemes proposed in [Brindle81], such as deterministic sampling, remainder stochastic sampling with replacement, and remainder stochastic sampling without replacement, have been studied. The motivation behind the development and use of the various alternative selection schemes is to reduce the stochastic sampling error of the basic roulette wheel selection scheme due to the finite size of the population and the finite number of generations in a search [DeJong75]. Deterministic sampling is a scheme where the probabilities of selection are converted to expected numbers of copies e_i of each individual i. The integer portion of e_i determines (non-probabilistically) the number of copies of individual i that advance to the next generation of the population. The remaining members of the next generation of the population are chosen from the top portion of a list sorted by the fractional portions of e_i. Remainder stochastic sampling schemes are related to deterministic schemes in that both ensure that the new population contains the integer portion of the expected number of copies of each individual. In remainder stochastic sampling with replacement, the fractional portions of e_i are used to form a roulette wheel selection scheme for filling the remaining spots in the new population. In remainder stochastic sampling without replacement, individual random trials with probabilities of success equal to the fractional portion of e_i are used to determine if one additional copy of individual i is to become part of the next generation of the population.

There are many mechanisms at work in biological genetics which can be modeled by genetic algorithms. However, their utility in simple applications of GAs for search problems is dubious.
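As an illustration of the sampling schemes described above, remainder stochastic sampling without replacement can be sketched as follows (an illustrative sketch: the expected copy count e_i is taken as f_i / f_avg, and the resulting population size is only approximately that of the original, as the fractional copies are probabilistic):

```python
import random

def remainder_stochastic_sampling(population, fitnesses, rng=random):
    """Remainder stochastic sampling without replacement: each individual
    deterministically receives the integer part of its expected copy count
    e_i = f_i / f_avg; the fractional part of e_i is the probability of one
    additional copy, decided by an independent random trial."""
    f_avg = sum(fitnesses) / len(fitnesses)
    new_population = []
    for individual, f in zip(population, fitnesses):
        expected = f / f_avg
        whole = int(expected)
        new_population.extend([individual] * whole)
        if rng.random() < expected - whole:
            new_population.append(individual)
    return new_population
```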
One of the basic concepts in natural genetics is dominance. All but the simplest organisms are based upon double-stranded chromosomes, e.g., human DNA, in which pairs of genes carry information about a single characteristic, like eye colour. Nature resolves the conflict between the two genes through dominance, wherein certain dominant alleles, e.g., brown eyes, are given precedence over other recessive alleles, e.g., blue eyes. When incorporated into GAs, the use of dominance can serve to enhance the longevity of useful schemata and preserve schemata which are currently undesirable. The motivation behind mechanisms such as dominance is the preservation of diverse genetic material in highly redundant chromosomes, to prevent potentially disastrous excess evolution that may endanger the survival of a population of a species in the face of a changing environment. Generally, such concerns are unimportant for most (simple) applications of GAs to search and optimization problems, and most of the desired effects of applying advanced genetic concepts can usually be achieved by tuning the parameters of the simpler genetic operators and scaling mechanisms.

Chapter 5

Genetic Search for Programmable Space Compactors

As mentioned in Section 1.3, the search for optimal programmable space compactors is considered (although not formally proven) to be an NP-complete (or, strictly speaking, NP-hard) problem. The performance and cost characteristics of one particular PSC can be verified in polynomial time, but the size of the space of potential PSCs is exponentially related to the number of lines to be compacted. For simple combinational PSCs with a compaction ratio of k : 1, the space consists of all 2^(2^k) possible k-variable Boolean functions. Even for small values of k (e.g., k = 4), the size of the search space may be too large (65,536 possible functions) to exhaustively search with limited resources.
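The doubly exponential growth of the search space is easy to see numerically (an illustrative sketch; the function name is a choice of this sketch):

```python
def psc_search_space(k):
    """Number of k-variable Boolean functions, i.e. candidate k:1
    combinational PSC functions: 2**(2**k)."""
    return 2 ** (2 ** k)
```

For k = 4 this gives the 65,536 functions quoted above, while k = 5 already yields over four billion candidates.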
This chapter proposes a new method for selecting effective (high fault coverage at a low area cost) combinational PSCs based upon genetic algorithms. The method is shown to be highly applicable to the PSC search problem with presently available computing resources, and will benefit from future increases in computing speed. First, the search problem is expressed in the necessary framework for the application of genetic algorithms. Secondly, methods for obtaining circuit-specific information, such as the fault-free functional behaviour and error characteristics, are presented. Finally, results from applying genetic searches to benchmark circuits are discussed.

5.1 Genetic Search

In order to apply genetic algorithms to the problem of searching for combinational programmable space compactors, the problem must be phrased in the proper framework, as discussed in Section 4.2.

5.1.1 Representing Boolean Functions as Chromosomes

The first step in the application of genetic algorithms to the problem of searching for programmable space compactors is to define a chromosomal representation of possible compaction functions. The space of all possible k-bit combinational PSC functions is the set of all 2^(2^k) k-variable Boolean functions. The chromosomal encoding chosen for representing Boolean functions is the same encoding scheme, the minterm binary string (MBS) scheme, presented in Section 4.2.1. In the MBS scheme, a k-bit Boolean function f is encoded as a 2^k-bit string. The value of the Boolean function f for the input vector corresponding to minterm i is represented by the value of the i-th bit; i.e., each bit in the string represents the presence or absence of the corresponding minterm of f.

One of the strengths of the MBS chromosomal encoding scheme is its simplicity. The encoding is natural and problem-specific enough to be well representative of the search space.
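Evaluating an MBS-encoded function is a single bit lookup. A minimal sketch, assuming (consistent with the 3-variable example f = m7 + m5 = 10100000 given later in this section) that the leftmost character of the string encodes the highest-numbered minterm:

```python
def mbs_evaluate(mbs, inputs):
    """Evaluate a k-variable Boolean function stored as a 2**k-character
    MBS bit string; `inputs` is a tuple of k bits (a_{k-1}, ..., a_0)."""
    assert len(mbs) == 2 ** len(inputs)
    minterm = 0
    for bit in inputs:          # assemble the minterm number from the inputs
        minterm = (minterm << 1) | bit
    # Leftmost character corresponds to the highest-numbered minterm.
    return int(mbs[len(mbs) - 1 - minterm])
```

For the string "10100000" (minterms m7 and m5 present, i.e. f = a2 · a0), the lookup returns 1 only for input vectors 111 and 101.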
Although genetic algorithms are robust enough to perform reasonably well in spite of the use of non-optimal coding schemes [Goldberg89], using contrived or overly-general encodings can lead to problems such as illegal chromosomal configurations. For example, in a genetic search for the optimal speed (in km/h) of rush hour highway traffic, the use of 16-bit signed integers in two's complement notation may be convenient for writing software, but inappropriate for representing chromosomes, since the speed of traffic will never approach 32,000 km/h! In the MBS chromosomal encoding scheme, since all 2^k-bit strings map to valid Boolean functions and all Boolean functions are in the search space, the MBS scheme does not suffer from any problem with illegal configurations. Also, with MBS encoding, evaluating Boolean functions involves only simple, fast bit-string lookup operations. Additionally, functions can be easily constructed manually, and easily converted both to and from external representations. Theoretically, the robustness of GAs makes them very forgiving of inefficient chromosomal encodings, so assuming highly robust exploration of the search space and the use of unlimited computing resources, the choice of encoding scheme does not adversely affect the outcome of a search [Goldberg89]. However, for all practical purposes, the use of a natural encoding enhances the search by simplifying the choice of genetic operators (as discussed in Section 4.2.4), and by more easily allowing the incorporation of problem-specific knowledge [Grefenstette87].

One of the drawbacks of the MBS chromosomal encoding scheme is that all possible k-bit Boolean functions are represented, which is twice as many as is absolutely necessary. In the search for k-bit combinational PSCs (under the SSA fault model), the minimum search space need only consist of one half of all possible k-bit combinational Boolean functions rather than all of them.
The size of the minimum search space is due to the fact that, as far as SSA fault coverage is concerned, a compactor need only propagate deviations of the output values from the fault-free values, not the actual values themselves. Therefore, the SSA fault coverage performance of the inverse f̄ of a compaction function f is identical to that of the original function f. Although the functions f and f̄ have identical SSA fault coverage performance, they differ in cost. For example, if a 3-variable function f can be represented as

f = a2 · a0 (equation form)
  = m7 + m5 (sum of minterms form)
  = 10100000 (MBS encoded form)

and its inverse f̄ as

f̄ = ¬(a2 · a0) (equation form)
   = m6 + m4 + m3 + m2 + m1 + m0 (sum of minterms form)
   = 01011111 (MBS encoded form),

then f̄ typically costs less than f, because the NAND function typically costs less to implement than the AND function. In a minimal search space (MSS) encoding scheme, by ignoring the higher cost function of each (f, f̄) pair, it is only necessary to encode half of the functions in the Boolean function space, 2^(2^k − 1) rather than 2^(2^k). The advantages of using the MSS encoding scheme are a memory saving of one bit per chromosome and the saving of any computational effort duplicated when both f and f̄ are part of the population. The disadvantages of the scheme would be the additional computational effort involved in evaluating functions represented by the MSS scheme, compared to the effort involved in the use of a lookup table in the simpler MBS encoding, and any overhead in converting to or from other representations. Also, because the space of Boolean functions is so large, the probability of having both a function and its inverse in a relatively small population in the same generation, especially in appreciable numbers, is very small.
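The pairing of f with its complement can be made concrete; the canonical-representative convention below (keeping the lexicographically smaller encoding of each pair) is a choice of this sketch, not the thesis's, but it illustrates how an MSS-style scheme could identify the two members of a pair:

```python
def mbs_complement(mbs):
    """MBS encoding of the complement function: flip every minterm bit."""
    return "".join("1" if b == "0" else "0" for b in mbs)

def mss_representative(mbs):
    """Canonical member of the (f, f-bar) pair: since f and its complement
    have identical SSA aliasing behaviour, keep one fixed representative
    (here, the lexicographically smaller encoding)."""
    return min(mbs, mbs_complement(mbs))
```

Both "10100000" (f = a2 · a0) and "01011111" (its complement) map to the same representative, halving the effective space.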
Since the additional complexity of using the MSS encoding scheme seemed to outweigh its slight advantages, the simpler, more straightforward MBS encoding was chosen.

5.1.2 Generating the Initial Population

The second component in a search employing genetic algorithms is a method of generating an initial population of potential solutions to the problem. As stated in Section 4.2.2, the robustness of GAs makes the composition of the initial population of PSCs a non-critical factor in finding a solution. For the purposes of finding a PSC for a circuit, a set of randomly generated bit vectors was used to provide a genetically diverse initial population. In an effort to enhance convergence, the initial population was seeded with several potentially good compaction functions. Without any a priori knowledge of possible circuit-specific solutions, the chosen seeds consisted of circuit-independent compaction functions based upon the parity function and the quadratic compactors discussed in Section 2.3.4. One seed was the simple 4-bit (even) parity function,

f_parity = a3 ⊕ a2 ⊕ a1 ⊕ a0,    (5.20)

which is a very commonly used function [Carter82]. The parity function is used as a benchmark here. Other seeds used were the individual quadratic multiplier functions fq_i (see Table 2.1 in Chapter 2) for k = 2, i.e.,

fq1 = a1·b1 ⊕ a1·b0 ⊕ a0·b1    (5.21)
fq0 = a1·b1 ⊕ a0·b0.    (5.22)

Since quadratic compaction as proposed in [Karpovsky87] results in a 2k : k compaction ratio (since all k individual multiplier functions are assumed to be observed outputs) rather than the 2k : 1 ratio desired, the (even) parity function of the k individual functions fq_i,

fq_parity = fq1 ⊕ fq0    (5.23)
          = a1·b0 ⊕ a0·b1 ⊕ a0·b0,    (5.24)

was also included as a seed.

One of the advantages of using popular circuit-independent compactors as seeds is that the search is guaranteed to yield a PSC with a fitness no worse than that of any of those popular compactors.
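The seed functions of Equations 5.20-5.24 can be written out directly (an illustrative sketch; the exact forms of fq1 and fq0 follow the reconstruction of the garbled equations above, which is consistent with Equation 5.24 since the a1·b1 terms cancel under XOR):

```python
def parity4(a3, a2, a1, a0):
    """4-bit even parity function (Equation 5.20)."""
    return a3 ^ a2 ^ a1 ^ a0

def fq1(a1, a0, b1, b0):
    """Quadratic multiplier function fq1 (Equation 5.21)."""
    return (a1 & b1) ^ (a1 & b0) ^ (a0 & b1)

def fq0(a1, a0, b1, b0):
    """Quadratic multiplier function fq0 (Equation 5.22)."""
    return (a1 & b1) ^ (a0 & b0)

def fq_parity(a1, a0, b1, b0):
    """Even parity of the individual quadratic functions (Equation 5.23)."""
    return fq1(a1, a0, b1, b0) ^ fq0(a1, a0, b1, b0)
```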
The result of a genetic search is the most fit solution encountered during the course of the entire search, which may not necessarily be the converged-upon result, due to the randomized nature of the search. Therefore, the compactors resulting from a genetic search are guaranteed to be at least as good (in terms of fitness, discussed in the next section) as any of the initial seeds.

5.1.3 Evaluating Fitness

The third, and perhaps most important, component in a genetic search for PSCs is a method of evaluating the fitness of a potential solution. Similar to Equation 4.12 used in the example of Section 4.2.3, the raw objective function used is

F_w(P_al, C_area) = log(1/P_al) / (C_area)^(1/w).    (5.25)

Varying the weighting parameter w in Equation 5.25 alters the emphasis between low probability of aliasing P_al and low implementation cost C_area. The logarithm of the inverse of the probability of aliasing, 1/P_al, was taken since an empirical survey of the values of 1/P_al encountered in the experiments discussed in this chapter showed that 1/P_al tended to vary by many orders of magnitude (e.g., from 10 to 10^40). The w-th root of the cost C_area was chosen to provide polynomial, rather than linear, scaling of the cost relative to the log(1/P_al) term. The use of the w-th root provides a greater range of scaling ability than the use of a simple linear factor. Also, since linear fitness scaling (see Section 4.4.1) was being used, any benefit from linearly scaling the cost would be nullified by the overall linear scaling of the fitness.

The probability of aliasing P_al of a compaction function is calculated using Equation 3.7. Estimates of the line error probabilities of a circuit, used in the P_al calculation, were obtained through fault simulation as discussed in Section 5.2.3.

The implementation cost C_area is measured by the number of gates (in terms of equivalent 2-input NAND gates, or four transistors) required to realize the function.
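The raw objective of Equation 5.25 can be sketched as follows (an illustrative sketch: the base-10 logarithm is an assumption, chosen to match the "10 to 10^40" range quoted above, as the thesis does not state the base):

```python
import math

def raw_fitness(p_al, c_area, w):
    """Raw objective function (Equation 5.25):
    F_w = log(1/P_al) / C_area**(1/w).
    Larger w de-emphasises area cost relative to aliasing performance."""
    return math.log10(1.0 / p_al) / (c_area ** (1.0 / w))
```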
Gate count was chosen as the measure of area because it gives an accurate measure of area and designers easily identify with it. In the popular media and in the semiconductor industry, the sizes of VLSI devices are often quoted in terms of transistor count or equivalent gate count, where one gate equals four transistors. While other, more abstract measures, such as the number of literals in a factored Boolean expression, can often serve well to estimate complexity, they can sometimes be misleading. For example, the functions

f_buf = a
f_ao4 = (a + b) · (c + d)

contain one and four literals, respectively. Using the cell library presented in Table A.1 of Appendix A, f_buf costs one gate equivalent and f_ao4 costs two gate equivalents. In terms of the number of literals, f_ao4 is four times as costly as f_buf, but in terms of equivalent gates, f_ao4 is only twice as expensive. Complex functions, such as AND-OR-INVERT functions, are often efficiently implemented (particularly in CMOS technology) with cost-effective standard cells or gate array macros. The savings in area due to the use of these cost-effective complex function cells should not be ignored. Gate costs were calculated using misII to minimize each function and map it to a circuit using a standard cell library (see Table A.1 in Appendix A).

The raw fitness values computed by Equation 5.25 tend to vary by several orders of magnitude depending on the data used in the search. In an effort to reduce sensitivity to this variation in raw fitness, and to reduce premature convergence and post-convergence wander, linear scaling was used. Scaling is a commonly accepted practice in the application of GAs, and linear scaling is generally considered to be problem independent [Michalewicz92]. The results of a search tend to depend more upon parameters such as the population size and the number of generations a search is to run than on the parameters used in the linear scaling process.
Considering the additional complexity and associated overhead of other, more advanced scaling techniques, such as sigma truncation [Forrest85] and power law scaling [Gillies85], and without solid empirical evidence to support their use, those advanced scaling techniques were not used.

With the linear scaling technique, the growth rate was controlled by setting the multiplicative factor C_mult (see Section 4.4.1) of the best individuals to C_mult = 1.5. This value was chosen through experimentation. Several values of C_mult, ranging from 1.25 to 2.5 in increments of 0.25, were tested. For each value of C_mult, genetic searches for PSCs were performed on one particular set of outputs of the ISCAS '85 benchmark circuit c432 [Brglez85] (see Section 5.2.1) over three different cost weightings (w1 = 1.5, w2 = 2.0, and w3 = 2.5), using data from five different 1024-pattern test sets (p1, p2, p3, p4, and p5). The individual searches were each run for 100 generations. In each of the fifteen experiments, each using one specific weight w_i and one specific 1024-pattern test set p_j, the highest value of raw fitness R_ij possessed by any compactor encountered in any generation throughout the entire search and the peak dominance D_ij were recorded. The dominance level in a population at the g-th generation, d(g), is defined as the fraction of the population comprised of the most numerous, i.e., dominant, member of the population. The peak dominance D in a genetic search is the highest level of dominance d(g) encountered over all generations.

C_mult     R̄
Unscaled   55.3
1.25       57.4
1.50       55.5
1.75       46.4
2.00       47.1
2.25       37.7

Table 5.1: Robustness: best fitness values (R̄).

C_mult     d̄(100)   D̄
Unscaled   63%      71%
1.25       19%      36%
1.50       88%      90%
1.75       89%      90%
2.00       79%      90%
2.25       89%      90%

Table 5.2: Convergence: average final (d̄(100)) and peak dominance values (D̄).
The dominance level in the final (100th) generation of each search, d_ij(100), was also recorded. Table 5.1 lists the average (over the fifteen searches) of the peak raw fitness values R_ij, given by

R̄ = (1/15) · Σ(i=1..3) Σ(j=1..5) R_ij,

found in each of the 100-generation long searches for several different values of C_mult. Table 5.2 lists the average final (at the end of the 100th generation) dominance, i.e.,

d̄(100) = (1/15) · Σ(i=1..3) Σ(j=1..5) d_ij(100),

and the average peak (achieved at any point in the 100-generation search) dominance, i.e.,

D̄ = (1/15) · Σ(i=1..3) Σ(j=1..5) D_ij,

found in the same set of searches. As seen in Table 5.1, the average best fitness value tends to drop as C_mult rises. The trend in Table 5.1 suggests the occurrence of premature convergence in searches employing larger values of C_mult, as the fittest individuals in the early population tend to dominate the population more quickly for larger values of C_mult. The premature convergence can be explained by the probable loss of valuable genetic material early in the searches, which prevented more thorough exploration of the search spaces and resulted in convergence on individuals with lower overall fitness. Therefore, excessively high values of C_mult can compromise the robustness of the search for effective PSCs. On the other hand, using excessively low values of C_mult can cause searches to fail to converge. This possibility is exhibited in Table 5.2, where the searches tended to fail to converge when the value C_mult = 1.25 was used, and in that respect performed even worse than searches in which no fitness scaling (and therefore no control of the growth rate of population members) was used. Since slow convergence usually results in more robust exploration of the search space and seemed to result in higher overall fitness values, the lowest value of C_mult which exhibited consistent convergence was chosen, i.e., C_mult = 1.5.

5.1.4 Genetic Operators and Parameters

The fourth component of a genetic search is a set
of genetic operators which alter the composition of potential solutions in the population. For the search for PSCs, only the two basic genetic operators, binary crossover and mutation, were used. Because the positions of the bits encoding the Boolean function chromosome are very important, no reordering operators, such as inversion (not to be confused with the inversion of bits used in mutation), were used.

The fifth and final component of the genetic search for PSCs is the set of parameters used in the search. The probability of crossover was 0.1 and the probability of mutation was 0.01, which offer a medium level of genetic mixing and diversity [Goldberg89]. The size of the population was 200, and the limit on the number of generations in each search was set to 100. These parameters were chosen to limit the CPU time required for genetic searches for 4-bit PSCs to approximately 30 minutes (30 ± 6 minutes) on a Sparc 10 workstation.

5.2 Analysis of the Circuit Under Test

The key aspect of programmable space compactors is their circuit specificity. Knowledge of the behaviour of the CUT for a given set of test vectors and a given set of targeted faults is used to select a PSC which is more effective and/or less costly than circuit-independent space compaction techniques such as the parity tree.

5.2.1 The ISCAS '85 Benchmark Circuits

All experiments reported in this thesis were performed using the ISCAS '85 benchmark circuits. These circuits form a set of ten combinational circuits first proposed at the International Symposium on Circuits and Systems in 1985 [Brglez85]. Table 5.3 lists the structural characteristics of the ten benchmark circuits. The last column in the table refers to the number of uncollapsed single-stuck-at faults [Tsui87] modeled in each circuit.

5.2.2 Circuit Behaviour

One important piece of circuit-specific information used in the search for a PSC is the functional behaviour of the fault-free circuit.
Whereas the structural circuit-specificity of a compactor deals with factors such as matching the number of inputs to a compactor with the number of CUT outputs, functional circuit-specificity deals with the actual output expected from a CUT when stimulated with a set of test patterns [Zorian93]. Using any sort of space compactor requires structural information. Selection of an effective PSC requires functional circuit-specific information as well.

Circuit   Function            # of Inputs   # of Outputs   # of SSA Faults
c432      Priority Decoder         36             7               864
c499      ECAT                     41            32               998
c880      ALU and Control          60            26              1760
c1355     ECAT                     40            32              2710
c1908     ECAT                     33            25              3168
c2670     ALU and Control         233           140              4448
c3540     ALU and Control          50            22              6188
c5315     ALU and Control         178           123              9400
c6288     16-bit Multiplier        32            32             12576
c7552     ALU and Control         207           108             13048

Table 5.3: The ISCAS '85 benchmark circuits.

Since the goal of fabrication is to produce as high a proportion of fault-free devices as possible, the most often expected behaviour of devices produced by any successful fabrication line is the fault-free behaviour. Fortunately, obtaining the distribution of fault-free output vectors is a straightforward, relatively simple task involving logic simulation. The CUT is simply simulated with the set of test vectors, assumed to be generated pseudorandomly by a BIST test pattern generator, and the CUT outputs are recorded and analyzed. Because this thesis deals with combinational space compaction, only the distribution of the output vectors of the CUT is of interest, not their order or sequence. It is necessary to save the set of input test vectors for later use in fault grading the compactors.

5.2.3 Estimating Line Bit Error Probabilities

One of the key pieces of circuit-specific knowledge in the search for a PSC is the expected distribution of faulty (under the single-stuck-at model, for the purposes of this thesis) output vectors.
To estimate the distribution of faulty output vectors, the homogeneous independent multiple bit error model [Ivanov92] is adopted. Relating targeted SSA faults and output line bit errors requires fault simulation. The most accurate estimates of the probabilities of errors on line outputs can be obtained by performing fault simulation with the exact set of test vectors to be applied to the CUT.

Accuracy (% difference)   Frequency   Cumulative %age
        < 5                  1767         78.71%
        < 10                  288         91.54%
        < 15                  133         97.46%
        < 20                   41         99.29%
        < 25                    8         99.64%
        < 30                    6         99.91%
        < 35                    1         99.96%
        < 40                    1        100.00%

Table 5.4: Accuracy of line error probability estimates.

The fault simulation effort depends linearly on the size of the test vector set. If that set is large, fault simulation may not be computationally feasible. Since, for the purposes of this thesis, the test vectors are assumed to be pseudorandomly generated, large test vector sets (up to, or even beyond, 2^20 vectors) are of great interest.

A way to achieve reasonable estimates of line error probabilities without an exorbitant amount of fault simulation effort is to perform fault simulation on a subset of the test vectors. Since the test vectors are generated pseudorandomly, it would intuitively seem that any sequence of vectors behaves very similarly to any other sequence of the same length. Also, the distribution of errored output vectors should be fairly independent of the length of the test sequence. (However, output vectors due to the detection of hard faults, i.e., random-pattern-resistant faults [Bardell87], would not likely appear in short test sequences.)
With these assumptions, it should be possible to estimate the line error probabilities for a long test vector sequence with a short test sequence, or a set of short test sequences.

Table 5.4 presents the results of an attempt to estimate the line error probabilities (for all of the outputs of the ten ISCAS '85 benchmark circuits) for a 1024-pattern test sequence by taking the mean probabilities from a set of ten short (96-pattern) fault simulations. Table 5.4 is a histogram which summarizes, in 5% increments, the distribution of the accuracy of the line probability estimates compared to the exact line error probabilities obtained from full 1024-pattern fault simulations. For example, 1767 (or 78.71%) of the line bit error probability estimates obtained from the set of short fault simulations were within 5% of the actual line error probabilities found by a full 1024-pattern fault simulation. An additional 288 estimates were within 10% (but not within 5%), for a cumulative total of 91.54% of estimates accurate to within 10%. All fault simulations were performed with the UBC fsim fault simulator, developed by Yuejian Wu [Wu92], on the entire uncollapsed set of SSA faults. Each of the ten short fault simulations used test vectors generated from an LFSR starting from a different 32-bit pseudorandom seed. Since only one of the seeds from the 96-pattern fault simulations was the same as the seed for the 1024-pattern fault simulation, the test vector sequences had little duplication or overlap. Although the aggregate lengths of the test vector sets are similar (960 vectors vs. 1024 vectors), considering the small amount of overlap between the vectors, the results of Table 5.4 indicate that it is possible to estimate the line error probabilities of a CUT for a large test vector set with a series of shorter fault simulations.
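The estimation step described above reduces to averaging per-line error probabilities over the short runs and binning each estimate's percent difference from the long-run value. A minimal sketch follows; the function names and data layout are illustrative, not taken from the thesis software.

```python
def estimate_line_error_probs(short_runs):
    # Average the per-line bit error probabilities observed in several
    # short fault simulations: one inner list per run, one entry per
    # output line of the CUT.
    n_runs = len(short_runs)
    n_lines = len(short_runs[0])
    return [sum(run[line] for run in short_runs) / n_runs
            for line in range(n_lines)]

def percent_difference(estimate, exact):
    # Accuracy measure used to bin the estimates as in Table 5.4.
    return abs(estimate - exact) / exact * 100.0
```

For instance, averaging two runs that report probabilities 0.10 and 0.30 on a line yields the estimate 0.20 for that line, and an estimate of 0.21 against an exact value of 0.20 would fall in the "< 5%" bin of Table 5.4.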
In over 99% of cases, the line error probabilities estimated from the ten short fault simulations were within 20% of the actual value obtained from a single long simulation.

5.3 Results of Searches with ISCAS '85 Benchmark Circuits

This section discusses the results of using the circuit-specific information to search for 4:1 combinational PSCs for seven of the ISCAS '85 benchmark circuits (c499, c880, c1355, c1908, c3540, c5315, and c6288). Two of the benchmark circuits, c2670 and c7552, were not used due to the unavailability of netlists in the proper formats. For these two benchmark circuits, the netlists in the format necessary for fault grading by the BNR automatic test pattern generation (ATPG) software did not have all of the primary I/O nodes present in the netlists in the format used by the fsim tool. Searches were not performed with the c432 circuit since it has only seven outputs, which would require only a single compactor. (As discussed in Section 5.1.3, the c432 circuit was used in experiments to empirically determine a good value of C_mult to use in these searches.) Line error probabilities for the benchmark circuits were estimated using the method (and associated data) from Section 5.2.3. The distribution of the 1024 fault-free output response vectors due to 1024 pseudorandom input test patterns was obtained, for each circuit, through logic simulation. The output lines of the benchmark circuits were grouped arbitrarily into sets of four outputs; no strategy was used to partition or group the outputs. Genetic searches for combinational PSCs were performed for each group of four outputs using the estimated line bit error probabilities and the fault-free output vectors for those lines. For each circuit, several searches using different values of the cost vs.
aliasing weighting parameter w, ranging from 1.0 to 3.0, were used.

5.3.1 Fault Coverage and Cost

The PSCs resulting from the genetic searches were incorporated into the netlists of the benchmarks, and fault coverages (SSA faults) were obtained with the BNR ATPG tool. All fault grades reported here are those achieved after compaction by a set of 4-bit PSCs found for a particular benchmark circuit subjected to a particular test vector set. Additionally, all fault coverage figures include the set of possible faults in the compactors themselves.

In order to compare the cost and performance of the PSCs, the parity tree, which generally has high fault coverage at a high cost, was used as a reference. If the fault coverage of a compactor f is FC and the fault coverage of the parity tree is FC_parity, the relative fault coverage FC' is defined as

FC' = \frac{FC}{FC_{parity}}    (5.26)

Similarly, if the cost of a compactor f is c and the cost of the parity tree is c_parity, the relative cost c' is defined as

c' = \frac{c}{c_{parity}}    (5.27)

Figure 5.1: Relative fault coverage FC' (1k vectors) vs. relative cost c'.

Figures 5.1, 5.2, and 5.3 are scatter plots of relative fault coverage FC' vs. relative cost c' (achieved by the compactors found in all of the genetic searches) for input test pattern sequences of length 1k, 8k, and 64k, respectively. Each point in each of the three graphs plots the relative fault grade FC' of the set of PSCs found for a particular ISCAS benchmark circuit, for a certain number of test vectors (1k, 8k, or 64k), vs. the relative cost c' of those PSCs. For example, the genetic search with the weighting parameter w = 1.25 for the 32-output c499 circuit yielded a set of eight 4-bit PSCs with a total cost of 14 gates, which is considerably less than the 72-gate cost of eight 4-bit parity trees.
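The cost comparison just quoted reduces to the ratio of Equation 5.27 (14/72 ≈ 0.194). A trivial helper, illustrative rather than part of the thesis software, computes both normalizations at once:

```python
def relative_metrics(fc, cost, fc_parity, cost_parity):
    # Eq. 5.26 and 5.27: normalize a compactor set's fault coverage
    # and gate cost to the parity-tree reference values.
    return fc / fc_parity, cost / cost_parity
```

Feeding in the c499 figures (fault grades 98.11% vs. 95.22%, costs 14 gates vs. 72 gates) returns the relative cost c' ≈ 0.194 plotted in Figure 5.1.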
The parity tree compactors achieve a fault grade of 95.22% for a test sequence of 1024 vectors, but the eight PSCs achieved a fault grade of 98.11% for the same sequence. In this case, the eight PSCs achieved 1.034 times the fault coverage of the parity tree, at a total cost of only 0.194 times the total cost of eight 4-bit parity trees! The result of this particular case is plotted in Figure 5.1, at coordinates (c', FC') = (0.194, 1.034). In total, 42 genetic searches for PSCs were performed on the seven ISCAS benchmark circuits (six different weights w were used for each circuit). The fault coverage results of using quadratic compactors for each of the seven benchmark circuits were also evaluated and included in the plots for comparison.

Figure 5.2: Relative fault coverage FC' (8k vectors) vs. relative cost c'.

Since the parity tree is the reference compactor, its results are represented by the (c', FC') coordinates (1, 1). The best PSCs, those with high fault coverage at a low cost, are situated in the upper left-hand corner of each graph. PSCs with a better fault coverage than the parity tree (FC' > 1.0) and which cost less than the parity tree (c' < 1.0) are represented by all the points above the FC' = 1.0 line, since all of the PSCs found had relative cost less than or equal to 1. The worst PSCs are situated in the lower right-hand corner of each of the graphs. These PSCs have a fault coverage slightly lower than the parity tree and cost almost as much.

Figure 5.3: Relative fault coverage FC' (64k vectors) vs. relative cost c'.
These PSCs may be satisfactory in applications where lower area is of great importance even at the expense of fault coverage; however, PSCs with better fault coverage and/or lower cost can often be found. In each graph, the line of points with a relative cost c' = 0.89 represents the compactors based upon the quadratic compaction function of Equation 5.24.

Although one might expect more costly functions to provide better fault coverage, from Figures 5.1, 5.2, and 5.3 there does not appear to be a clear correlation between high cost and high fault coverage. In fact, many low-cost PSCs exhibit high relative fault-coverage characteristics, particularly for a large number (64k) of vectors. Clearly, there exist many PSCs which outperform parity trees in terms of fault coverage at a lower cost, and these efficient, cost-effective PSCs can be found by a search based on genetic algorithms.

                             Relative Cost (c')
                      < 0.2   < 0.4   < 0.6   < 0.8   < 1.0
Relative FC  ≥ 1.02      2       2       3       3       3 (1Q)
(FC')        ≥ 1.00      2       2       3       5       6 (3Q)
             ≥ 0.98      6      10      15      24      28 (6Q)
             ≥ 0.96      8      14      19      28      32 (7Q)

Table 5.5: Summary of FC' (1k vectors) vs. c'.

                             Relative Cost (c')
                      < 0.2   < 0.4   < 0.6   < 0.8   < 1.0
Relative FC  ≥ 1.02      2       2       3       3       3 (1Q)
(FC')        ≥ 1.00      4       4       6       9      10 (3Q)
             ≥ 0.98      8      14      19      28      32 (7Q)
             ≥ 0.96      8      14      19      28      32 (7Q)

Table 5.6: Summary of FC' (8k vectors) vs. c'.

There does appear to be a trend toward higher relative fault coverage FC' as the test set length increases from 1k to 8k to 64k. This trend suggests that the fault coverage performance of PSCs (even low-cost ones) tends to approach or sometimes exceed the performance of the parity tree as the number of test vectors increases. Therefore, a designer can make a tradeoff between the cost of the compactors and the test set length to achieve the desired fault coverage.
This tradeoff is discussed in greater detail and illustrated by Figure 5.5.

Cumulative summaries of the results plotted in Figures 5.1, 5.2, and 5.3 are presented in Tables 5.5, 5.6, and 5.7. These tables are two-dimensional cumulative histograms that summarize the total number of cases where the PSCs for a benchmark circuit achieve a certain level of relative fault coverage (0.96, 0.98, 1.00, or 1.02), for a given number of test vectors (1k, 8k, or 64k), without exceeding a certain relative cost (0.2, 0.4, 0.6, 0.8, or 1.0). For example, for a test set length of 1k vectors (see Table 5.5), there were two cases where PSCs achieved FC' ≥ 1.00 at a cost of c' < 0.4, but three cases, i.e., one additional case, where that level of relative fault coverage was achieved at a cost c' < 0.6. For a test set length of 64k vectors (Table 5.7), there were five cases where PSCs achieved FC' ≥ 1.00 at a cost of c' < 0.4, and eight cases where that level of relative fault coverage was achieved at a cost c' < 0.6. These tables include the 32 cases where the PSC solution for a benchmark circuit found by a search is not composed entirely of parity trees. All 32 cases achieved a relative fault grade of at least 0.96 at a cost less than the parity tree.

                             Relative Cost (c')
                      < 0.2   < 0.4   < 0.6   < 0.8   < 1.0
Relative FC  ≥ 1.02      2       2       3       3       3 (1Q)
(FC')        ≥ 1.00      4       5       8      11      14 (3Q)
             ≥ 0.98      7      13      19      26      32 (7Q)
             ≥ 0.96      7      13      19      26      32 (7Q)

Table 5.7: Summary of FC' (64k vectors) vs. c'.

The last column in each of the tables lists the cumulative number of quadratic compaction functions which achieved the relative fault coverages of 0.96, 0.98, 1.00, and 1.02, as a number followed by the character 'Q'. For example, Table 5.5 shows that there were three quadratic compactors (out of the seven cases) which achieved a relative fault coverage of 1.00, for a set of 1024 pseudorandom test patterns.
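Cumulative summaries of this kind can be regenerated mechanically from the scatter-plot points. The sketch below is illustrative (the thesis does not describe its tabulation code); it assumes each point is a (c', FC') pair and uses the thresholds of Tables 5.5 through 5.7:

```python
def cumulative_summary(points,
                       fc_levels=(1.02, 1.00, 0.98, 0.96),
                       cost_bounds=(0.2, 0.4, 0.6, 0.8, 1.0)):
    # Two-dimensional cumulative histogram: entry [i][j] counts the
    # cases whose relative fault coverage is at least fc_levels[i]
    # and whose relative cost is below cost_bounds[j].
    return [[sum(1 for c, fc in points if fc >= level and c < bound)
             for bound in cost_bounds]
            for level in fc_levels]
```

Because both axes are cumulative, each row is non-decreasing left to right and each column non-decreasing top to bottom, which is a quick sanity check on any such table.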
Quadratic compactors are included in the tables to provide an alternative comparison point. The quadratic compaction functions are listed in the last column because the cost of the quadratic compactor is 0.89 times that of a parity tree compactor, i.e., c' = 0.89.

It is quite apparent from Figures 5.1, 5.2, and 5.3, and Tables 5.5, 5.6, and 5.7, that there exist space compaction functions that provide higher fault coverage than the circuit-independent parity tree compactor at a lower cost. In fact, with 64k pseudorandom test patterns, there were 14 cases (out of 32) where PSCs provided fault coverages meeting or exceeding those of the parity tree at an equal or lower cost. In all of the test cases, relative fault coverages no less than 0.96 were achieved at a cost less than that of the parity tree.

Figure 5.4: Number of patterns for 98% fault coverage vs. relative cost c' (for c1908, c1355, and c6288).

Figure 5.4 illustrates a graph of the approximate number of test patterns required to achieve 98% (absolute, not relative) fault coverage vs. the relative cost of those compactors, for three different ISCAS '85 benchmark circuits (c1908, c1355, and c6288). This figure supports the correlation between the number of patterns needed for reasonable fault coverage and the cost of the compactors. For each circuit, the number of test vectors needed to achieve 98% fault coverage tends to drop when more expensive PSCs are used.

Figure 5.5: Relative fault coverage FC' vs. number of patterns.

Figure 5.5 illustrates a graph of the average, maximum, and minimum relative fault coverages FC' achieved by the PSCs for the benchmark circuits vs. the number of test patterns required to achieve that fault coverage.
Maximum relative FC' is the greatest value of FC' achieved by the PSCs in any of the experiments (for any of the seven circuits and for any of the six weighting values). Similarly, the minimum relative FC' is the least value of FC' achieved by the PSCs in any of the experiments. The average relative FC' is the simple mean of all of the values of FC' found in all 42 of the experiments (six different weighting factors w for each of the seven circuits). This graph suggests that, on average, the fault coverage performance of PSCs tends to converge to the same fault coverage as the parity tree as the number of pseudorandom test patterns increases. Figure 5.5 also suggests that the variance of the relative fault coverage decreases as the number of patterns increases, so less expensive PSCs which exhibit poor fault coverage for short test vector sets may have satisfactory fault coverage for long test vector sets. Therefore, designers can make a tradeoff between the number of patterns needed to achieve a desired fault coverage and the likelihood of finding inexpensive PSCs, i.e., test length vs. cost.

5.3.2 Weighting Cost

From Equation 5.25, the emphasis on cost vs. P_al can be altered. The effect of changing this weighting factor can be seen in Figure 5.6, which graphs the average, maximum, and minimum relative total costs c' of the PSCs found for the benchmark circuits vs. the weighting factor w used in the searches. Maximum relative total cost is the largest value of c' of the PSCs found for any of the circuits, for a given weight. Similarly, the minimum relative total cost is the smallest value of c' of the PSCs found for any of the circuits, for a given weight.
The average relative total cost is the simple mean of all of the values of c' found in all seven of the searches (one for each circuit) for the given weight w, i.e.,

\bar{c}' = \frac{1}{7} \sum_{i=1}^{7} c'_i.

Figure 5.6 shows that the relative cost of the PSC solution selected for a circuit increases as the weight w increases to emphasize lower P_al over lower C_area. The figure also implies that the cost of the parity tree, c' = 1, is a "soft" asymptotic limit for the cost of PSCs as w increases. None of the experiments yielded a PSC with a cost greater than that of the parity tree. It seems that the parity tree provides an upper bound on the cost of space compactors that provide reasonable probabilities of aliasing. Figure 5.6 suggests that the weighting parameter w gives a prospective designer a good deal of control over the cost of a space compaction solution for a CUT. In conjunction with the ability to make a tradeoff between test length and cost, the ability to control cost gives the designer excellent control over the efficiency (cost vs. test length vs. fault coverage) of a PSC solution.

Figure 5.6: Relative total cost of PSCs c' vs. weight w.

5.3.3 Grouping/Partitioning of Output Lines

As mentioned above, the output lines of the benchmark circuits were grouped without using any partitioning or grouping strategy. It has been suggested that the outputs of a CUT should be grouped so that the inputs to a compactor do not depend on the same CUT inputs and thus introduce the possibility of systematic aliasing [Benowitz75]. Such partitioning can be performed using a tool such as PEST [EWu90]. However, such partitioning may not be necessary. Since the genetic search for a PSC here is free to choose any Boolean function of four variables, it often chooses functions which can be reduced to functions of fewer than four variables.
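Whether a chosen truth table really depends on all four lines can be checked mechanically: an input line is unused if flipping it never changes the output. A minimal sketch, assuming the truth table is indexed so that bit i of the index corresponds to input line i (the normalization by k anticipates the relative observability usage NL' discussed below):

```python
def lines_used(truth_table, k=4):
    # Count the input lines the function actually depends on: line i
    # is used iff flipping input bit i changes the output for some
    # minterm of the 2^k-entry truth table.
    used = 0
    for line in range(k):
        mask = 1 << line
        if any(truth_table[m] != truth_table[m ^ mask]
               for m in range(1 << k)):
            used += 1
    return used

def relative_observability(truth_table, k=4):
    # NL' = NL / NL_parity, with NL_parity = k for a k-input parity tree.
    return lines_used(truth_table, k) / k
```

For example, the 4-bit parity function depends on all four lines (NL' = 1.0), while a function that simply passes one line through uses a single line (NL' = 0.25).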
Functions of three or fewer variables will tend to be simpler and have a high probability of costing less than functions of four variables. The number of lines NL used by a space compactor can be thought of as its observability usage. PSCs with lower observability usage require a lesser proportion of nodes to be observed to provide a certain level of fault coverage than PSCs with higher observability usage.

In order to gauge the performance of the PSCs, the parity tree, which uses all four input lines to form the output, was again used as a reference compactor. If the observability usage of a PSC f is NL, and the observability usage of the parity tree is NL_parity (NL_parity = 4 for k = 4), the relative observability usage NL' is defined as

NL' = \frac{NL}{NL_{parity}} = \frac{NL}{4}    (5.28)

Figure 5.7 graphs the average, maximum, and minimum relative observability usage NL' of the PSCs for a circuit vs. the weighting factor w used in the search. The average relative NL' is the simple mean of all of the values of NL' found in all 42 of the experiments (six different weighting factors w for each of the seven circuits). Maximum relative observability usage is the greatest value of NL' found for the PSCs in any of the experiments (for any of the seven circuits and for any of the six weighting values). Similarly, the minimum relative observability usage is the least value of NL' found for the PSCs in any of the experiments. The relationship between the relative number of lines NL used by a compactor and the weighting factor w is another facet of the weighting of cost, since reducing the number of lines used by a compactor results in lower routing costs. In fact, the trends exhibited by Figures 5.6 and 5.7 are very similar.
This figure suggests that the weighting parameter w gives a designer some control over how much observability is to be sacrificed.

In Figure 5.7, as the weight w decreases (to emphasize cost over P_al), the observability usage of the PSCs decreases. The genetic search chooses PSC functions of fewer than k bits if one or more of the output lines contains a low amount of information relative to the cost of including that line in the function. This PSC selection method automatically drops observation nodes (within a local k-bit grouping) which have low fault detection value. For example, Figure 5.7 shows that an average of approximately 2.4 lines (NL' ≈ 0.6 and NL_parity = k = 4) were used per PSC function when the weighting factor was set to w = 1.0. Therefore, with w = 1.0, most individual PSCs were Boolean functions of three or fewer variables. With PSCs found using genetic algorithms, partitioning of the output lines can be argued as unnecessary, because when dealing with lines which may cause systematic aliasing, it is likely that one of the lines will be dropped due to low fault detection value.

Figure 5.7: Relative observability usage NL' vs. weight w.

Chapter 6

Conclusions

6.1 Results

The experiments reported in this thesis show that for a given circuit and set of pseudorandom test patterns, circuit-specific programmable space compaction functions (PSCs) with higher fault coverages and/or lower costs than circuit-independent functions do exist and can be found, with a reasonable amount of effort, by applying genetic algorithms to the search process.
Circuit-specific information, such as the fault-free behaviour of the circuit under test (CUT) and the line bit error probabilities (under the single-stuck-at fault model), for a given circuit and a given set of pseudorandom test patterns, can be obtained through simulation. It was demonstrated that line error probabilities for a CUT stimulated by a large set of test patterns can be estimated, with a level of accuracy tolerable by a genetic search algorithm, from a set of short fault simulations.

A method, based upon genetic algorithms, for searching for PSCs was proposed as a way to effectively search the large space of Boolean functions for PSCs. This method provided a mechanism for varying the emphasis between the fault coverage and the cost of desired PSCs. Experiments showed that lower-cost compactors do not necessarily have lower fault coverage than higher-cost compactors, although a greater number of test patterns may be required to achieve equivalent fault coverage. These experiments demonstrated the ability of a search method employing genetic algorithms to often find PSCs with lower cost and/or higher fault coverage than the parity tree with a reasonable amount of computing resources. Cost savings of up to 80% were achieved with no loss of (or even improved) fault coverage. These savings are achievable for a single, one-time expenditure of limited computing effort. The availability of greater computing resources in the future can only enhance the ability of the search method to cover a larger sample space of Boolean functions and thereby improve the quality of the PSCs found in the search. Additionally, the results of the PSC searches provide insight concerning the relative utility of observing various nodes in the circuit under test.
Such information can be used to improve the selection of nodes to provide optimal observability.

6.2 Future Work

Although the results of searches for programmable space compactors presented here were for combinational PSCs of four variables, the method is fully applicable to PSCs of any number of variables for which the computational resources for searches remain practical. Any future work should include searches for PSCs of greater than four variables, and the development of methods for dealing with the larger chromosome sizes and larger search space. Speeding up the search is a crucial factor in enabling searches for PSCs of a greater number of variables.

One of the limiting factors in the current implementation of the genetic search is the amount of time required to calculate the probability of aliasing. A possible method for speeding up this calculation is to use binary decision diagrams (BDDs) [Seger92]. Although the use of BDDs would not reduce the overall complexity of the probability-of-aliasing calculations, they can be used to trade additional memory resources for a reduction in the average time required to make the calculations. An additional limiting factor in the current software is the time required to calculate the cost of a PSC. Marginal improvement can be obtained by implementing mapping and minimization routines within the search software. However, for greater improvement, other ways to estimate the cost of a PSC function should be explored. Even if cost estimates are not as accurate as the gate counts of minimized and mapped circuits, the improvement in the speed of the search may more than compensate for the loss of accuracy.

Although the method for searching for PSCs was developed for combinational PSCs, there is no reason why the basic method cannot be applied to searches for sequential PSCs. As noted, genetic algorithms are a very versatile search technique.
If a suitable method for estimating the probability of aliasing of small sequential PSCs can be developed, in conjunction with an appropriate chromosomal representation of sequential circuits, it should be possible to use GAs to search for sequential PSCs with better cost or aliasing characteristics than commonly used maximum-length LFSRs (if any exist).

As well as being versatile, GAs are extremely extensible. Biological models provide an inexhaustible source of new ideas and variations on existing genetic search techniques. There are a number of advanced techniques developed for GAs which can be experimented with to determine their utility in this particular search application. Some techniques may have the potential to significantly improve the PSC search process. Another randomized search method, simulated annealing, may also prove to be applicable to the search for PSCs.

The work presented in this thesis provides a method for searching for programmable space compactors. The method gives designers a good alternative to blindly using circuit-independent space compactors such as the parity tree. The genetic search method presented here is possibly only a starting point for the development of randomized search techniques to find optimal circuit-specific compactors of all types and sizes.

Bibliography

[Abraham86] Abraham, J.A., "Fault Modeling in VLSI," in VLSI Testing, Ed. Williams, T.W., Elsevier Science Publishers, Amsterdam, The Netherlands, pp. 1–27, 1986.

[Axelrod87] Axelrod, R., "Genetic Algorithm for the Prisoner's Dilemma Problem," in Genetic Algorithms and Simulated Annealing, Ed. Davis, L., Pitman, London, 1987, pp. 32–41.

[Bardell87] Bardell, P.H., McAnney, W.H., and Savir, J., Built-In Test for VLSI: Pseudorandom Techniques, John Wiley & Sons, New York, 1987.

[Benowitz75] Benowitz, N., Calhoun, D.F., Anderson, G.E., Bauer, J.E., and Joeckel, C.T., "An Advanced Fault Isolation System for Digital Logic," IEEE Transactions on Computers, Vol. C-24, No. 5, May 1975, pp.
489–497.

[Bhavsar84] Bhavsar, D.K. and Krishnamurthy, B., "Can We Eliminate Fault Escape in Self-Testing by Polynomial Division?" Proc. Int. Test Conf., 1984, pp. 134–139.

[Booker87] Booker, L., "Improving Search in Genetic Algorithms," in Genetic Algorithms and Simulated Annealing, Ed. Davis, L., Pitman, London, 1987, pp. 61–73.

[Bottorff86] Bottorff, P.S., "Test Generation and Fault Simulation," in VLSI Testing, Ed. Williams, T.W., Elsevier Science Publishers, Amsterdam, The Netherlands, pp. 29–64, 1986.

[Brayton87] Brayton, R.K., Rudell, R., Sangiovanni-Vincentelli, A., and Wang, A.R., "MIS: A Multiple-Level Logic Optimization System," IEEE Trans. Computer-Aided Design, Vol. 6, No. 6, Nov. 1987, pp. 1062–1081.

[Brindle81] Brindle, A., "Genetic Algorithms for Function Optimization," Doctoral Dissertation, University of Alberta, Edmonton, 1981.

[Brglez85] Brglez, F. and Fujiwara, H., "A Neutral Netlist of 10 Combinational Benchmark Circuits and a Target Translator in FORTRAN," Proc. IEEE Int. Symp. on Circuits and Systems, 1985, pp. 151–158.

[Carter82] Carter, W.C., "The Ubiquitous Parity Bit," Proceedings of the 12th Fault-Tolerant Computing Systems Conference, June 1982, pp. 289–296.

[Chandramouli83] Chandramouli, R., "On Testing Stuck-Open Faults," Proceedings of the 13th Fault-Tolerant Computing Systems Conference, June 1983, pp. 258–265.

[Colorni90] Colorni, A., Dorigo, M. and Maniezzo, V., "Genetic Algorithms and Highly Constrained Problems: The Time-Table Case," in Proc. First Int. Conf. on Parallel Problem Solving from Nature, Eds. Schwefel, H.-P. and Männer, R., Dortmund, Germany, 1990, pp. 55–59.

[Damiani89] Damiani, M., Olivo, P., Favalli, M. and Riccò, B., "An Analytical Model for the Aliasing Probability in Signature Analysis Testing," IEEE Trans. Computer-Aided Design, Vol. 8, No. 11, Nov. 1989, pp. 1133–1144.

[Damiani89b] Damiani, M., Olivo, P., Favalli, M., Ercolani, S. and Riccò, B., "Aliasing in Signature Analysis Testing with Multiple Input Shift Registers," Proc.
European Test Conf., April 1989, pp. 346—353.[Damiani9l] Damiani, M., Olivo, P. and Riccó, B., “Analysis and Design of LinearFinite State Machines for Signature Analysis Testing,” IEEE Trans.on Computers, Vol. 40, No. 9, Sept. 1991, pp. 1034—1045.[Davis87] Davis, L. and Steenstrup, M., “Genetic Algorithms and SimulatedAnnealing: An Overview,” In Genetic Algorithms and Simulated Annealing, Ed. Davis, L., Pitman, London, 1987, pp. 1—11.[DasGupta] DasGupta, S., Eichelberger, E.B., and Williams, T.W., “LSI Chip Design for Testability,” Dig. Tech. Papers, 1978 mt. Solid-State CircuitsConf., Feb. 1978, pp. 216—217.[DeJong75] DeJong, K.A., “An Analysis of the Behavior of a Class of GeneticAdaptive Systems,” Doctoral Dissertation, University of Michigan,Ann Arbor, 1975.[Eichelberger78] Eichelberger, E.B. and Wiffiams, T.W., “A Logic Design Structurefor LSI Testability,” Journal of Design Automation, Fault TolerantComputing, Vol. 2, No. 2, 1978, pp. 165—178.[EldredS9] Eldred, R.D., “Test Routines Based on Symbolic Logical Statements,”Journal of the ACM, 6, pp. 33—36, January 1959.Bibliography 92[Forrest85] Forrest, S., “Implementing Semantic Networks Structures using theClassifier System,” Proceedings of the First International Conferenceon Genetic Algorithms, Lawrence Erlbaum Associates, Hillsdale, NJ,1985, pp. 24—44.[Frohwerk77] Frohwerk, R.A., “Signature Analysis: A New Digital Field ServiceMethod,” Hewlett Packard Journal, May 1977, pp. 2—8.[Galiay8o] Galiay, J., Crouzet, Y., and Verniault, M., “Physical Versus LogicalFault Model in MOS LSI Circuits: Impact on their Testability,’ IEEETransactions on Computers, Vol. C-29, No. 6, June 1980, pp. 527—531.[Geman84] Geman, S., and Geman, D., “Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images,” IEEE Transactionson Pattern Analysis and Machine Intelligence PAMI-6, pp. 
721—741.[Gillies85] Giffies, A.M., “Machine Learning Procedures for Generating Image Domain Feature Detectors,” Doctoral Dissertation, University ofMichigan, Ann Arbor, 1981.[Goldberg89] Goldberg, D.E., Genetic Algorithms in Search, Optimization and Machine Learning, Addison Wesley, Reading, MA, 1989.[Golumb82] Golomb, S.W., Shift Register Sequences, Languna Hills, CA, AegeanPark, 1982.[Greffenstette85] Grefenstette, J.J., Gopal, R., Rosmaita, B. and Van Gucht, D., “Genetic Algorithms for the Traveffing Salesman Problem,” Proceedings ofthe First International Conference on Genetic Algorithms, LawrenceErlbaum Associates, Hillsdale, NJ, 1985, pp. 160—168.[Grefenstette87] Grefenstette, J.J., “Incorporating Problem Specific Knowledge intoGenetic Algorithms,” In Genetic Algorithms and Simulated Annealing,Ed. Davis, L., Pitman, London, 1987, pp. 42—60.[Gupta9o] Gupta, S.K., Pradhan, D.K. and Reddy, S.M., “Zero Aliasing Compression,” Proc. mt. Symp. on Fault-Tolerant Computing (FTCS2O),1990, pp. 254—263.[Hayes84] Hayes, J.P., “Fault Modeling for Digital MOS Integrated Circuits,”IEEE Transactions on Computer-Aided Design, Vol. CAD-3, No. 3,July 1984, pp. 200—208.[Holland75] Holland, J., Adaptation in Natural and Artificial Systems, Universityof Michigan Press, Ann Arbor, 1975.Bibliography 93[Horning87] Horning, L.K., Soden, J.M., Fritzemier, R.R., and Hawkins, C.F.,“Measurements of Quiescent Power Supply Current for CMOS ICs inProduction Testing,” Proc. mt. Test Conf., 1987, pp. 300—309.{Ibarra75] Ibarra, 0.11., and Sahni, S.K., “Polynomially Complete Fault Detection Problems,” IEEE Trans. on Computers, Vol. C-24, No. 3, March1975, pp. 242—249.[Ivanov92] Ivanov, A. and Pilarski, S., “Performance of Signature Analysis: ASurvey of Bounds, Exact, and Heuristic Algorithms,” Integration, theVLSI Journal, 13, 1992, pp. 17—38.[Jain83] Jam, S.K. 
and Agrawal, V.D., “Test Generation for MOS Circuits Using D-Algorithm,” Proceedings of thç 20th Design Automation Conference, The, 1983, pp. 64—70.[Kameda93] Kameda, Tiko, Pilarski, S. and Ivanov, A., “Notes on Multiple InputSignature Analysis,” IEEE Trans. Computers, Feb. 1993, pp. 228—234.[Karpovsky87] Karpovsky, M.G., and Nagvajara, P., “Optimal Time and Space Compression of Test Responses for VLSI Devices,” Proc. mt. Test Conf.,1987, pp. 523—529.[Kirkpatrick83] Kirkpatrick, S., Gelatt, C.D. Jr., and Vecchi, M.P., “Optimization bySimulated Annealing,” Science, Vol. 220, No. 4598, May 13, 1983.[Kirkpatrick84] Kirkpatrick, “Optimization by Simulated Annealing: QuantitativeStudies,” Journal of Statistical Physics, Vol. 34, Nos. 5—6, pp. 975—986.[Koeppe86] Koeppe, S., “Modeling and Simulation of Delay Faults in CMOS LogicCircuits,” Proc. mt. Test Conf., 1986, pp. 530—536.[Lambidonis9l] Lambidonis, D., Agarwal, V.K., Ivanov, A. and Xavier, D., “Computation of Exact Fault Coverage for Compact Testing Schemes,” Proc.IEEE mt. Symp. on Circuits and Systems, 1991, pp. 1873—1876.[Lee88] Lee, Y.H. and Krishna, C.M., “Optimal Scheduling of Signature Analysis for VLSI Testing,” Proc. mt. Test Conf., 1988, pp. 443—447.[Levy9l] Levy, Paul S., “Designing In Power-Down Test Circuits,” IEEE Design&4 Test of Computers, Sept. 1991, pp. 31—35.[McCluskey8l] McCluskey, E.J., and Bozorgui-Nesbat, S., “Design for AutonomousTest,” IEEE Trans. on Computers, Vol. C-30, No. 11, Nov. 1981, pp.866—874.[McCluskey84] McCluskey, E.J., “A Survey of Design for Testability Scan Techniques,” VLSI Design, Vol. 5, No. 12, Dec. 1984, pp. 38—61.Bibliography 94[McCluskey85] McCluskey, E.J., “Built-In Self-Test Techniques,” IEEE DesignTest, Vol. 2, No. 2, April 1985, pp. 21—28.[McCluskey85b] McCluskey, E.J., “Built-In Self-Test Structures,” IEEE Design Test,Vol. 2, No. 2, April 1985, pp. 29—36.{Mei74] Mei, K., “Bridging and Stuck-at Faults,” IEEE Trans. on Computers,Vol. 23, No. 7, July 1974, pp. 
720—727.[Metropoliss3] Metropolis, N., Rosenbluth, A., Rosenbluth, M., Teller, A., and Teller,E. J. Chem. Phys., Vol. 21, p. 1087, 1953.[Michalewicz92] Michalewicz, Z., Genetic Algorithms + Data Structures EvolutionPrograms, Springer-Verlag, New York, NY, 1992.[Mourad88] Mourad, S., and McCluskey, E.J., “Fault Models,” In Testing andDiagnosis of VLSI and ULSI, Eds. Lombardi, F. and Mariagiovanna,S., Kluwer Academic Publishers, Dordrecht, The Netherlands pp. 49—68, 1987.[Nagle89] Nagle, H.T., et al., “Design for Testability and Built-In Self Test:A Review,” IEEE Trans. Industrial Electronics, Vol. 36, No. 2, May1989, pp. 129—140.[Rajski87] Rajski, J. and Cox, H., “A Method of Test Generation and Fault Diagnosis in Very Large Combinational Circuits,” Proc. mt. Test Couf.,1987, pp. 932—943.[Rajski9lb] Rajski, J. and Tyszer, J., “Experimental Analysis of Fault Coverage inSystems with Signature Registers,” Proc. European Test Conf., 1991,pp. 45—48.[Reddy84] Reddy, S.M., Reddy, M.M. and Agrawal, V.D., “Robust Tests forStuck-Open Faults in CMOS Combinational Logic Circuits,” Proceedings of the Lth Fault-Tolerant Computing Systems Conference, 1984,pp. 44—49.[Reddy88] Reddy, S.M., Saluja, K.K., and Karpovsky, M.G., “A Data Compression Technique for Built-In Self-Test,” IEEE Trans. on Computers,Vol. 37, No. 9, Sept. 1988.[Russell89] Russell, G., Advanced Simulation and Test Methodologies for VLSIDesign, Van Nostrand Reinhold (International), London, 1989.ESaluia83] Saluja, K.K., and Karpovsky, M.G., “Testing Computer HardwareThrough Data Compression in Space and Time,” Proc. kit. Test Conf.,1983, pp. 83—88.Bibliography 95[Sedmak85] Sedmak, R.M., “Built-In Self-Test: Pass or Fail?” IEEE Journal ofDesign .4 Test, April 1985, pp. 17—19.[Seger92] Seger, C.-J.H., Private Communication, 1992.[Serway89] Serway, R.A., Moses, C.J. and Moyer, C.A., Modern Physics, SaundersCollege Publishing, Philadelphia, 1989.[Soden92] Soden, J.M., Hawkins, C.F., Gulati, R.K. 
and Mao, W., “IDDQ Testing: A Review,” Journal of Electronic Testing: Theory and Applications, Vol. 3, No. 4, Dec. 1992, pp. 5—17.[Turino86] Turino, J., “Semiconductor Device Test Equipment,” In VLSI Testing, Ed. Wiffiams T.W., Elsevier Science Publishers, Amsterdam, TheNetherlands pp. 229—238, 1986.{Tsui87] Tsui, F.F., LSI/VLSI Testability Design, McGraw-Hill, New York,1987.[Vasudevamurthy9O] Vasudevamurthy, J., and Rajski, J., “A Method for Concurrent Decomposition and Factorization of Boolean Expressions,” Proc. IC-CAD, 1990, pp. 510—513.[Wagner87] Wagner, K.D., Chin, C.K., and McCluskey, .E.J., “PseudorandomTesting,” IEEE Trans. on Computers, Vol. C-36, No. 3, Mar. 1987,332—343.[Wiffiams8l] Wiffiams, T.W., and Brown, N.C., “Defect Level as a Function ofFault Coverage,” IEEE Trans. on Computers, Vol. C-30, No. 12, Dec.1981, 987—988.[Williams83] Wiffiams, T.W., and Parker, K.P., “Design for Testability A survey,” Proceedings of the IEEE, Vol. 71, No. 1, Jan. 1983, pp. 98—112.[EWu9O] Wu, E. and Autkowski, P.W., “PEST A Tool for ImplementingPseudo-Exhaustive Self Test,” Proc. IEEE Custom IC Conf., 1990,pp. 28.4.1—28.4.3.[EWu93] Wu, E., Moskowitz, M.E., “A Cost-Effective Test Data CompactionScheme,” Proc. IEEE Custom IC Conf., 1992, pp. 13.7.1—13.7.3.[Wu92] Wu, Y. and Ivanov, ‘A., “Accelerated Path Delay Fault Simulation,”Proc. IEEE Test Symp., April 1992, pp. 1—6.[Wu93] Wu, Y., “On Multiple Intermediate Signature Analysis for BuiltIn Self-Test,” Doctoral Dissertation, University of British Columbia,Vancouver, 1993.Bibliography 96[Zhang93] Zhang, C., “On Fault Coverage and Fault Simulation for Multiple liitermediate Signature Analysis BIST Schemes,” M.A.Sc. Thesis, University of British Columbia, Vancouver, 1993.[Zorian86] Zorian, Y. and Agarwal, V.K., “A General Scheme to Optimize ErrorMasking in BIST,” Proc. mt. Symp. on Fault-Tolerant Computing(FTCS-16), July 1986, pp. 410—415.[Zorian9o] Zorian, Y. 
and Agarwal, V.K., “Optimizing Error Masking in BISTby Output Data Modification,” Journal of Electronic Testing: Theoryand Applications, No. 1, 1990, pp. 59—71.[Zorian93] Zorian, Y. and Ivanov, A., “Programmable Space Compaction forBIST,” Proc. mt. Symp. on Fault-Tolerant Computing (FTCS23),1993, pp. 340—349.Appendix AStandard Cell LibraryTable A.1 lists the standard cells and their associated costs (expressed in equivalent gates)used by misl]JM to map implementations of programmable space compaction functions.97ozzDzczc±±± ±± ±± ± ±± ± ±± ±‘ThC) CD CD CD C2 C iD±±C(ID C C CDC) C CiD CD CD crq CDcc
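As a rough illustration of how a cell library like Table A.1 is used, the area of a mapped compactor can be scored by summing the gate-equivalent costs of its standard cells. The cell names and per-cell costs below are hypothetical placeholders, not the actual contents of Table A.1:

```python
# Hypothetical gate-equivalent costs for a few standard cells.
# These values are illustrative only, NOT the contents of Table A.1.
GATE_EQUIVALENTS = {
    "inv": 0.5,
    "nand2": 1.0,
    "nor2": 1.0,
    "xor2": 2.5,
}

def area_cost(mapped_netlist):
    """Area of a mapped compactor, in equivalent gates: the sum of
    the gate-equivalent costs of its standard cells."""
    return sum(GATE_EQUIVALENTS[cell] for cell in mapped_netlist)

# A 4:1 parity-style compactor maps to a tree of three 2-input XORs:
parity_4_to_1 = ["xor2", "xor2", "xor2"]
print(area_cost(parity_4_to_1))  # 7.5 equivalent gates under these costs
```

This sum is the area term that, together with the estimated probability of aliasing, enters the fitness measure used to guide the genetic search.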
Item Metadata
Title | On selecting programmable space compactors for built-in self-test using genetic algorithms |
Creator |
Tsuji, Barry Kazuto |
Date Issued | 1993 |
Description | With the increasing complexity of modern VLSI circuits, achieving high quality built-in self-test requires monitoring an increasingly large number of internal nodes. Due to the limitations in observing large numbers of nodes, it has become increasingly necessary to compact the output from a large number of lines to a small number of lines in a process known as space compaction. In the past, space compaction has usually been performed by circuit-independent compactors, such as the popular parity function. Recently, a class of circuit-specific space compactors, known as programmable space compactors (PSCs), has been proposed. A PSC is a highly effective space compactor tailored for a specific circuit under test (CUT) subjected to a specific set of test stimuli. Circuit-specific information such as the fault-free and expected faulty behaviour of a circuit can be used to choose compactors which have better fault coverage and/or lower area costs than the parity function. The drawback of PSCs is the difficulty involved in finding optimal PSCs for a circuit. The space of possible PSC functions is extremely large and grows exponentially with the number of lines to be compacted. Although searching for optimal PSC functions is a difficult problem, it is not impossible. This thesis presents a method for searching for combinational PSCs based upon genetic algorithms (GAs), a randomized search technique. Randomized search techniques, such as simulated annealing and genetic algorithms, have become increasingly popular in recent years as methods for solving search and optimization problems beyond the capabilities of classical techniques. This thesis develops a procedure for using circuit-specific information in the application of (GAs) to the search for optimal PSCs for a given CUT and a given set of test stimuli. A measure of the effectiveness, or fitness, of PSCs is developed as a means for guiding the genetic search. 
The factors used to assess the effectiveness of a PSC are its fault coverage (estimated by its probability of aliasing) and area (measured in terms of gate equivalents). The nature of GAs allows for searches to be completed in a fixed amount of CPU time. The availability of greater computing resources in the future will enable larger scale searches and improve small-scale searches. The results of genetic searches (limited to approximately 30 minutes of CPU time on a Sun Sparc 10 workstation) for PSCs, with a compaction ratio of 4 : 1, for the ISCAS ‘85 benchmark circuits are presented here. These results show that searches based upon genetic algorithms can find PSCs with better fault coverage at a lower cost than the parity function while using limited computing resources. In some cases, cost savings of up to 80% were realized while simultaneously achieving improved fault coverage over similar space compactors based upon the parity function. |
Extent | 1431942 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
FileFormat | application/pdf |
Language | eng |
Date Available | 2018-05-08 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
IsShownAt | 10.14288/1.0065016 |
URI | http://hdl.handle.net/2429/5023 |
Degree |
Master of Applied Science - MASc |
Program |
Electrical and Computer Engineering |
Affiliation |
Applied Science, Faculty of Electrical and Computer Engineering, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 1994-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |