Reducing Post-Silicon Coverage Monitoring Overhead with Emulation and Statistical Analysis

by

Ricardo Ochoa Gallardo

B.Eng., Instituto Tecnológico y de Estudios Superiores de Occidente, 2010

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE in The Faculty of Graduate and Postdoctoral Studies (Electrical and Computer Engineering)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

August 2015

© Ricardo Ochoa Gallardo, 2015

Abstract

With increasing design complexity, post-silicon validation has become a critical problem. In pre-silicon validation, coverage is the primary metric of validation effectiveness, but in post-silicon, the lack of observability makes coverage measurement problematic. On-chip coverage monitors are a possible solution, where a coverage monitor is a hardware circuit that sets to one when the coverage event of interest occurs. However, prior research has shown that the overhead is prohibitive for anything beyond a small number of coverage monitors. In this thesis, I explore techniques that reduce the number of instrumented coverage monitors, while still being able to imply full coverage with high probability. These techniques use a deterministic forward feature selection algorithm with objective functions based on statistical information. To gather the required statistical information, the method relies on emulation, where all coverage monitors of interest are instrumented on a version of the design. On this emulator, such as an FPGA, I stress the design with randomly generated tests to collect the data from the instrumented coverage monitors.
I propose three objective functions for the feature selection algorithm: the first estimates the probability of a coverage monitor being set during a test; the second builds a Bayesian Network (BN), then takes advantage of the relationship information between nodes (coverage monitors) that the network provides; the last directly estimates the conditional probability of coverage from the gathered data. I demonstrate these techniques on a non-trivial system-on-chip, by measuring the code coverage achieved after executing randomly generated test programs. Depending on the objective function, results show a 67.7% to 95.5% reduction in the number of required coverage monitors. In the ASIC implementation, this would translate into an impact of 0.33-1.96% in silicon area overhead and 0.40-2.70% in static power overhead. These results show that my technique works, by proving it is possible to track a smaller number of coverage events that statistically represent the whole set.

Preface

Part of the work in this thesis will be presented in a conference paper [22] to appear at ICCAD in November 2015. Specifically, the paper covers the methodology that uses a Bayesian network to estimate probabilities of coverage (Section 3.3.2 in this thesis) and the related results. The paper was done collaboratively by Dr. Alan J. Hu, Dr. André Ivanov, Dr. Maryam S. Mirian, and myself. I performed the main research, gathered all data, and performed all experiments. Dr. Mirian provided support with machine-learning-related work, in particular Bayesian networks. Dr. Hu did the background research and the final review. All the work was done under the supervision of Dr. Hu and Dr. Ivanov. Since this thesis expanded from the paper, part of the writing from the paper was used in this thesis.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
Acknowledgments
1 Introduction
  1.1 Motivation
  1.2 Contributions
  1.3 Related Work
  1.4 Thesis Organization
2 Background
  2.1 Emulation for Post-Silicon Validation
  2.2 Reasoning About Probabilities
    2.2.1 Conditional Probability
    2.2.2 Bayesian Networks
3 Proposed Method
  3.1 General Method
  3.2 Sampling
  3.3 Objective Functions
    3.3.1 Objective Function 1: Assuming Independent Probabilities
    3.3.2 Objective Function 2: Bayesian Network
    3.3.3 Objective Function 3: Direct Conditional Probability
4 Implementation and Evaluation Methodology
  4.1 Implementation Tools
  4.2 Evaluation Platform
  4.3 Test Generation for Training and for Evaluation
  4.4 Coverage Instrumentation
5 Experimental Results
  5.1 Coverage Monitoring Reduction
  5.2 Accuracy of Coverage Probability Predictions
  5.3 Accuracy of Full Coverage Predictions
  5.4 Real Application: Linux Boot
  5.5 Comparison to Static Source Code Analysis
  5.6 Area and Power
6 Conclusion and Future Work
Bibliography

List of Tables

4.1 Evaluation Coverage Monitor Types
5.1 Reduction Achieved Using Objective Function 1
5.2 Reduction Achieved Using Objective Function 2
5.3 Reduction Achieved Using Objective Function 3
5.4 Time to Compute CMinst
5.5 IU CTRL Linux Boot Test Results
5.6 MUL Linux Boot Test Results
5.7 DIV Linux Boot Test Results

List of Figures

2.1 Post-Silicon Validation with Emulation Flow
2.2 Example of Generic Bayesian Network
3.1 Sequential Forward Selection Algorithm
3.2 Proposed General Method
4.1 Bayesian Network Structures learnt using the BNT toolbox
4.2 LEON3 Processor Core Block Diagram
4.3 LEON3 Integer Unit Datapath Diagram
5.1 Expected number of coverage monitors for IU CTRL
5.2 Expected number of coverage monitors for MUL
5.3 Expected number of coverage monitors for DIV
5.4 Coverage Latency
5.5 Impact on the Design's Area
5.6 Impact on the Design's Static Power

Acknowledgments

I would like to dedicate some words to thank the faculty members at UBC in the ECE and CS departments for their help during the course of my graduate work. In particular, I thank my supervisors Dr. Alan J. Hu and Dr. André Ivanov. I truly appreciate the guidance and support given by Dr. Alan J. Hu, from the time spent thinking about the research to the help provided writing my thesis; without it, my graduate experience would have been completely different. Furthermore, I would like to thank Dr. Ivanov because, despite having severe time constraints due to his obligations as head of the department, he always supported me and stepped in when required.

I greatly appreciate the support provided by my family and friends during this time. I thank my parents, who supported me from distant and warmer lands. Also, I thank my grandmother, who constantly encouraged me to get a Master's degree, even if she now regrets motivating me to move away from home. I also give a special thanks to Dr. Lino Coria, my professor and friend, who motivated me to start a Master's and helped me get into UBC and move to Vancouver.

Thank you all for your support.

Chapter 1

Introduction

1.1 Motivation

Post-silicon validation encompasses all validation activities between tape-out of first silicon and ramping up to high-volume manufacturing. In contrast to pre-silicon validation, the target of validation is the physical silicon chip instead of a model (RTL, gate-level, circuit-level, timing, etc.), resulting in the challenges of extremely limited observability and controllability; in contrast to manufacturing test, the goal is to catch difficult design errors that escaped pre-silicon validation and analysis, instead of catching random manufacturing defects during high-volume production.

Ideally, all bugs would be caught and fixed pre-silicon, but the complexity of modern designs is such that some bugs inevitably escape into silicon.
Furthermore, inaccuracies in electrical and timing modeling (e.g., [5]) also create bug escapes. All of these must be caught during post-silicon validation, to avoid the potentially disastrous consequences of high-volume production and shipment of defective products. With increasing design complexity and integration, and decreasing time-to-market, post-silicon validation has become a critical problem: in 2006, Abramovici et al. [1] reported that post-silicon validation consumed an average of 35% of the entire development cycle of a new chip, and the problem continues to get worse [1, 17, 19].

In both pre-silicon validation and manufacturing test, coverage is the primary metric of effectiveness, and similarly, coverage information would be extremely valuable for post-silicon validation as well. Without a method to quantify the quality of a given test set, it is impossible to assert that a design has been adequately verified. Unfortunately, the unique characteristics of post-silicon validation create unique challenges for post-silicon coverage measurement. Unlike pre-silicon validation, there is extremely limited off-chip observability of what is happening on-chip, making it difficult to monitor coverage events. Unlike manufacturing test, post-silicon validation tests are typically extremely long (e.g., booting an operating system, or billions of cycles of random instructions), so it is infeasible to establish coverage by construction or via fault simulation. Mitra et al. [19] highlight post-silicon coverage as one of the most important open research problems in post-silicon validation.

A direct approach to post-silicon coverage measurement is to add on-chip coverage monitors, which consist of some logic to determine whether a coverage event has occurred, a latch to record whether the event has been covered, and some mechanism to read this latch (e.g., via a scan chain). This solves the problem of off-chip observability, but at the cost of area, delay, and power.
For example, Bojan et al. [6] report on-chip post-silicon coverage monitoring on Intel's Core 2 Duo processor family, but due to the overhead, only very minimal coverage information was monitored. Karimibiuki et al. [16] used FPGA-based emulation of a system-on-chip (SoC) to measure the overhead needed for extensive on-chip monitoring of code coverage, concluding that the overhead is impractically high, even after sophisticated techniques to eliminate redundant monitors. Adir et al. [2] bypass the problem of on-chip overhead by proposing to add coverage monitors only in pre-silicon emulation, but to use this emulation to develop good post-silicon tests. However, their methodology loses any on-chip assurance that the expected coverage was attained.

1.2 Contributions

This thesis proposes a novel solution for post-silicon coverage monitoring: use statistics gathered via emulation to choose a small set of coverage monitors to be instrumented on silicon. As in Adir et al. [2], I assume an emulation-based methodology, and instrument the complete set of coverage monitors on the emulated design. In emulation, I sample the relationships between coverage monitors as various tests are run. From this data, statistical techniques can be used to choose a small set of coverage events to monitor in silicon.

I propose a straightforward feature selection algorithm that analyzes the gathered data to select a small subset of coverage monitors whose full coverage predicts (according to the learned model) with high probability that all other coverage monitors would be covered as well. Only this small subset would be instrumented on the actual silicon, thereby providing a low-overhead mechanism to corroborate that post-silicon tests are indeed achieving high coverage in silicon.

This thesis investigates three different variants on how the gathered data can be analyzed.
The first option creates the subset of coverage monitors by measuring the probability of each coverage monitor being covered on a test and selecting the ones least probable to occur. The second option creates a Bayesian Network (BN) to handle the statistical analysis, then generates the subset by performing queries and feeding evidence back into the BN. The third option builds the subset by directly measuring the conditional probability among the coverage monitors from the data collected on the emulator. Chapter 3 explains the objective functions in more detail.

1.3 Related Work

Most related work has already been cited in the Motivation section (Section 1.1). However, other work has been published on using machine learning techniques to aid in coverage and post-silicon debug. My thesis shares the spirit of these other works, but addresses a completely different problem. For example, DeOrio et al. [8] use machine learning clustering techniques to help find suspicious segments of error traces. Like me, Fine and Ziv [9] use Bayesian networks, but their application is to try to learn to control the parameters of a test generator so as to attain higher coverage. In contrast, my goal is to learn how to safely minimize on-chip coverage monitors, yet retain high probability of full coverage.

1.4 Thesis Organization

This thesis is organized as follows. Chapter 2 presents information required for a correct understanding of the thesis: an overview of how emulation is used as part of the design flow, and a description of some probability theory concepts and probability representations. Chapter 3 explains the general method and the different objective functions explored to select the coverage monitors to instrument. Chapter 4 covers the evaluation methodology and tools used to test my method. Chapter 5 shows results of how the different objective functions performed, and what the impact on real silicon would be.
Finally, Chapter 6 concludes and presents some possible future work.

Chapter 2

Background

This chapter covers various topics that are relevant for understanding the rest of this thesis.

2.1 Emulation for Post-Silicon Validation

Emulation¹ has become a standard technique for the validation of large integrated circuits. Although RTL software simulation is used extensively, simulation runs approximately eight orders of magnitude slower than actual silicon [21]. At this speed, it would take more than three years to simulate just one second of silicon execution. FPGA-based emulation technology allows "executing" the design orders of magnitude faster than simulation. For example, Intel's Atom processor was mapped to a single FPGA running at 50 MHz [25], and the i7 Nehalem processor was mapped to five FPGAs running at 520 kHz [24]. Although orders of magnitude slower than the actual chip, these speeds are fast enough to perform validation tasks such as booting an operating system and to support early software development — tasks which are infeasible in simulation.

¹In this thesis, I include in this term all reconfigurable hardware techniques that accelerate the simulation of a design, e.g., FPGA prototyping, dedicated emulation systems, and other hardware accelerators.

Figure 2.1: Post-Silicon Validation with Emulation Flow. Two separate tool flows map the same design to silicon and to an emulator.

Emulation can also support post-silicon validation, as documented by Adir et al. [2] for the IBM Power7. Figure 2.1 shows the flow. The key idea is to add coverage monitors to (only) the emulated version of the design, and to use the resulting coverage feedback to develop post-silicon tests that achieve high coverage by design.
Such a flow can even be used to develop post-silicon tests for non-functional properties, like timing [3]. However, there is no guarantee the test run on the emulator will perfectly match the run on real silicon, and hence coverage might not be the same.

In theory, if we assume that the emulation and silicon versions of the design truly behave identically, then there is no need for monitoring post-silicon coverage, because we have already established in emulation that the tests achieve high coverage. However, in practice, two designs of such complexity will never run cycle-by-cycle identically on all signals. For example, slight differences in memory timing, communication with the environment, clock-domain crossing, etc. will result in (hopefully) subtle changes in the cycle-by-cycle behavior. Furthermore, what is developed via emulation is not specific post-silicon tests, but rather test templates, in order to exploit the vastly faster execution speed of the real silicon chip [2]. Accordingly, although the emulation-based post-silicon validation flow gives us valuable data to develop high-quality tests, we still need true silicon coverage monitoring.

2.2 Reasoning About Probabilities

This thesis makes extensive use of probabilities. Hence, this section covers some concepts and representations of probabilities that are important for understanding the thesis.

2.2.1 Conditional Probability

Conditional probability measures the probability of an event occurring given that another event (by assumption, or observation) has happened. For example (taken from Kinney [18, p. 14]), if we have a deck of cards, the probability of drawing a Jack is 4/52 — a deck has a total of 52 cards and 4 Jacks. But if we know the card is a face card (K, Q, or J), then the sample space reduces to 12 while the number of possible Jacks is still 4.
Now we can conclude that the conditional probability of drawing a Jack given that we know we are getting a face card is 4/12.

Formally, the conditional probability of event A given that event B has happened is defined as the quotient of the probability of A and B happening together (known as the joint probability) by the probability of B:

    P(A|B) = P(A ∩ B) / P(B)    (2.1)

In the previous example, event A represents getting a Jack, while event B represents getting a face card. If we want to use the definition in Eq. 2.1, we need to measure the joint probability of getting a Jack and a face card, P(A ∩ B) = 4/52, and the probability of getting a face card, P(B) = 12/52. If we perform the quotient, we compute the same conditional probability as before, P(A|B) = (4/52)/(12/52) = 4/12.

2.2.2 Bayesian Networks

A way to capture complex relationships between events is to use Bayesian networks [23]. A Bayesian network (BN) is a graphical representation of the joint probability distribution for a set of events. The joint probability distribution represents the probability of each event taking on any particular value.

Figure 2.2: A Bayesian Network. This network represents a joint probability distribution over the events A, B, C, and D, broken down into distributions P(A), P(B|A), P(C|A), and P(D|B,C).

A BN has two components. The first is a directed acyclic graph (DAG) in which each node corresponds to an event. In this graph, each node is connected to other dependent nodes as a child or parent. A node is probabilistically independent of its non-descendants in the graph given the state of its parents. For example, in Figure 2.2, node D is probabilistically independent of node A given the state of node B and node C. The second component is the quantitative part that describes the conditional probability P(Xi | pa(Xi)) of each event Xi given its parents pa(Xi).
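To make the two components concrete, the small network of Figure 2.2 over binary events can be written down directly: the DAG as a parent list, and the quantitative part as conditional probability tables (CPTs). This is only an illustrative sketch — all numeric values are made up, and it is not part of the thesis methodology — but it shows that the product of the local conditional probabilities defines a proper distribution.

```python
from itertools import product

# The DAG of Figure 2.2: each node maps to its list of parents.
parents = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}

# The quantitative part: conditional probability tables (CPTs).
# cpt[node][parent_values] = P(node = 1 | parents = parent_values).
# All numbers below are invented for illustration.
cpt = {
    "A": {(): 0.30},
    "B": {(0,): 0.10, (1,): 0.80},
    "C": {(0,): 0.20, (1,): 0.60},
    "D": {(0, 0): 0.05, (0, 1): 0.50, (1, 0): 0.40, (1, 1): 0.95},
}

def joint(assignment):
    """Joint probability of one full assignment, as the product of local terms."""
    p = 1.0
    for node, pa in parents.items():
        pa_vals = tuple(assignment[q] for q in pa)
        p1 = cpt[node][pa_vals]  # P(node = 1 | parents)
        p *= p1 if assignment[node] == 1 else 1.0 - p1
    return p

# The factored joint distribution must sum to 1 over all 16 assignments.
total = sum(joint(dict(zip("ABCD", vals))) for vals in product([0, 1], repeat=4))
print(round(total, 10))  # 1.0
```

Enumerating all assignments is only feasible for toy networks; the standard BN packages mentioned below avoid this blow-up with specialized inference algorithms.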
From these two components, a unique joint probability distribution over the complete set of nodes is represented. The joint probability distribution is given by Eq. 2.2, where n is the number of events.

    P(X1, X2, ..., Xn) = ∏_{i=1}^{n} P(Xi | pa(Xi))    (2.2)

Take Figure 2.2 as an example of a BN, where node A is a parent of nodes B and C, and nodes B and C are parents of D. Using Eq. 2.2, the joint probability distribution is given by:

    P(A, B, C, D) = P(D|B,C) P(C|A) P(B|A) P(A)    (2.3)

This shows that the joint distribution of a BN has a factored representation composed of the product of individual local probabilities for each node. If the edges in the network structure correspond to causal relationships, where a node's parents represent the direct causal influences on that node (as A-to-B or B-to-D in Figure 2.2), then the resulting networks are good descriptions of the problem.

There is a vast literature on Bayesian networks. For this work, it suffices to know that standard, efficient algorithms and software packages are available to perform many BN computations, such as heuristically learning a BN from observed data, and inferring the updated probability distributions on nodes given observations on other nodes.

Chapter 3

Proposed Method

This chapter proposes the overall general method, with three variants for analyzing the gathered data.

3.1 General Method

My method builds on the emulation-based post-silicon validation methodology described in Section 2.1. In particular, my method assumes that the emulated design has been fully instrumented with monitors that check, for each coverage event of interest, whether that event has been covered yet by a test or set of tests. These coverage monitors are formed by some hardware logic that sets to 1 when the event of interest occurs. The method further assumes that existing methods have been used to develop good tests — which achieve high coverage on the emulated design — and to eliminate uncoverable coverage monitors.
My goal is to identify a small subset of the coverage monitors whose coverage implies full coverage of all coverage monitors, with high probability. Only this small subset would be instrumented on the real silicon.

More formally, I can represent the full set of all n coverage monitors by a set CM of n Boolean variables c1, ..., cn that indicate whether the i-th coverage target has been covered. Tests are drawn from some universe of possible tests T, and when any test t ∈ T is run, it produces some n-dimensional probability distribution over the 2ⁿ possible values of c1, ..., cn. I model the effects of a test as a probability distribution to account for randomness and non-determinism in the system, e.g., timing differences with respect to the environment, clock domain crossings, PLL lock time, etc. Furthermore, my method assumes that sets of tests will be chosen with some randomness, e.g., via directed-random test templates, random instruction tests, or other techniques to boost coverage during extended testing. The combined effect is that the method models the post-silicon testing process as inducing a complicated, unknown n-dimensional probability distribution over c1, ..., cn, and it can consider each ci a Bernoulli random variable. My goal is to find a smaller subset CMinst ⊂ CM such that the conditional probability of full coverage, given that the coverage monitors in CMinst are covered, is greater than a threshold, as expressed in Eq. 3.1, where p_threshold is some user-specified bound on the probability of achieving full coverage.

    P(∀c∈CM (c = 1) | ∀c∈CMinst (c = 1)) ≥ p_threshold    (3.1)

The goal is obviously an instance of the "feature selection problem" in machine learning (e.g., [14, 15]): find a small subset of features (in this case, coverage monitors) to construct a good model to use as a classifier (in this case, whether a test suite achieves full coverage or not).
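The greedy selection view above can be sketched as a generic loop with a pluggable objective function. The skeleton below is illustrative only — the names, the toy weights, and the stopping rule are invented for the example, not taken from the thesis implementation:

```python
def sfs(candidates, objective, stop):
    """Sequential forward selection: greedily grow `selected` until stop() holds."""
    selected = []
    remaining = list(candidates)
    while remaining and not stop(selected):
        # Pick the candidate that maximizes the objective given what is selected.
        best = max(remaining, key=lambda c: objective(selected, c))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy example: pick items until their weights sum to at least 10,
# preferring the heaviest remaining item at each step.
weights = {"c1": 1, "c2": 7, "c3": 5, "c4": 2}
picked = sfs(
    weights,
    objective=lambda sel, c: weights[c],
    stop=lambda sel: sum(weights[c] for c in sel) >= 10,
)
print(picked)  # ['c2', 'c3']
```

The interesting part is entirely in the `objective` argument, which is exactly where the three objective functions of Chapter 3 plug in.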
The central problem, though, is finding a good objective function so that the feature selection algorithm can maximize the probability of full coverage. With a good objective function, my method can use any standard feature selection algorithm to select a good, small set of coverage monitors.

For this thesis, I chose a deterministic sequential forward selection (SFS) algorithm for its simplicity and speed. The algorithm starts with an empty CMinst set and greedily adds, one at a time, the coverage monitor that gives the maximum increase in the objective function, until reaching the desired probability of full coverage. Figure 3.1 gives pseudocode of the feature selection algorithm. Only the coverage monitors in CMinst would be instrumented on the ASIC implementation.

The objective functions are covered in detail in Sec. 3.3. In general, all of my objective functions compute approximations of the probability of full coverage given the information CMinst provides. Accordingly, to attack the problem, they need knowledge about the joint probability distribution over c1, ..., cn. Under the assumption that there are no large systematic biases between emulation and silicon coverage, I can sample the joint probability distribution by running tests on the emulator and observing which CMs are covered.²

²This is a central assumption behind any use of emulation for post-silicon validation. The success of the emulation-based approach on the Power7 project reported by Adir et al. [2] gives strong support for this assumption, but I acknowledge that it is an assumption that might not hold perfectly.

For clarity and simplicity, I define the following notation for the probabilities of coverage. The notation P(full) denotes the probability of all coverage monitors being set to one:

    P(full) = P(∀d∈CM (d = 1))    (3.2)

The notation P(CMinst) denotes the probability of all coverage monitors in CMinst being set to one:

    P(CMinst) = P(∀d∈CMinst (d = 1))    (3.3)

The notation P(full|CMinst) denotes the conditional probability of all coverage monitors being set to one given that the coverage monitors in CMinst are already set to one:

    P(full|CMinst) = P(∀d∈CM (d = 1) | ∀d∈CMinst (d = 1))    (3.4)

The notation P(full|CMinst, c = 1) denotes the conditional probability of all coverage monitors being set to one given that the coverage monitors in CMinst and coverage monitor c are already set to one:

    P(full|CMinst, c = 1) = P(∀d∈CM (d = 1) | ∀d∈CMinst (d = 1) ∧ c = 1)    (3.5)

I will use P̂ to denote the estimate of the probabilities of coverage based on the samples gathered during emulation.

3.2 Sampling

There are different sampling methodologies that could be used to gather the data from emulation to estimate the coverage probabilities. The decision on how to sample the coverage monitors can affect the number of tests required for a good estimation of a coverage probability and the correlations between coverage events the method is able to extract.

1: CMinst ← ∅
2: while P̂(full|CMinst) < p_threshold do
3:     for all c ∈ CM − CMinst do
4:         Compute objective function: O(c) = P̂(full|CMinst, c = 1) − P̂(full|CMinst)
5:     Let cmax be the c that produced the largest gain.
6:     Add cmax to CMinst.

Figure 3.1: Sequential Forward Selection Algorithm for Selecting Coverage Monitors to be Instrumented. The algorithm greedily adds coverage monitors to the instrumented set CMinst until a specified threshold is reached. The notation P̂(full|CMinst) denotes the estimate of the probability of full coverage given that all coverage monitors in CMinst are covered.

A direct approach is to sample the coverage monitors at the end of every test. The problem with this sampling is that it does not capture the correlations that occur at the start or middle of the test.
In other words, sampling just the final coverage loses details on how the coverage monitors were covered through the test. To overcome this issue, the methodology would require more test programs to estimate a good full-coverage probability, with the hope of other tests capturing the lost relations, and even then some correlations might not be possible to capture.

Another approach is to sample the test run on every clock cycle. This sampling would produce many samples per test run — billions or more depending on the length of the test. Since the samples represent the complete run, they provide information about the behaviour of the coverage monitors through time. This temporal information would allow reasoning about coverage points related through time. For example, I could estimate the probability p of a coverage monitor ci being covered n cycles after another coverage monitor.

Relating coverage monitors through time is something my method needs, but reasoning about these relationships cycle-by-cycle, or even over ranges of cycles, is expensive. Such a methodology would dramatically increase the complexity by adding the extra dimension of time. For example, instead of estimating only the conditional probability between two coverage monitors, the methodology would need to estimate the probabilities of a coverage monitor being covered n cycles after another coverage monitor was covered, where n ranges from 1 to the maximum length, in number of clock cycles, of a test run.

Since the cycle-by-cycle sampling would require a more complex method and more memory resources on the sampling device (either external or internal to the design), I propose to use a compressed sampling. I get this compression by sampling only when a new coverage monitor in CM is covered. For example, let us assume there are four coverage monitors in CM, [c1, c2, c3, c4].
It could be the case that on a certain test, c2 is covered first; some time later, c1 and c3 are covered together; and then c4 is covered at the end. This would produce the sampling data in Eq. 3.6, where each column represents a coverage monitor, each row represents a sample at a specific time, a value of 1 represents a covered coverage monitor, and a value of 0 represents a coverage monitor that is not yet covered.

              c1   c2   c3   c4
    sample1    0    0    0    0
    sample2    0    1    0    0
    sample3    1    1    1    0
    sample4    1    1    1    1        (3.6)

Additionally, I split the data into individual samples and append the samples from other tests. This sampling is a compressed version of the cycle-by-cycle sampling, but it loses some of the temporal information. However, part of the temporal information is still in the data, represented by the frequency of a coverage monitor being set in the samples. For example, c4 from the sampling example in Eq. 3.6 is covered very late in the test run, so this shows up as multiple data points in which it is not yet set. On the other hand, c2 has multiple samples in which it is set, since it is covered early in the test run.

Moreover, unlike the data from the simple sampling at the end of each test, the compressed data contains information about the relations among coverage monitors covered through the test. Although the compressed sampling fails to capture the probability of a coverage monitor being covered a specific time after another coverage monitor is covered, as cycle-by-cycle sampling would, the compressed data does have the information to estimate the probability of a coverage monitor being covered given that another one is already covered. For example, from the data in Eq. 3.6, it could
be possible to estimate the probability of $c_2$ being covered given that $c_1$ is covered with a simple conditional probability, $\hat{P}(c_2 = 1 \mid c_1 = 1)$, using the samples from the whole training set.

My method uses this compressed sampling, since it allows the objective functions to easily reason about the relationships between coverage monitors without moving to a more complex and expensive method. Figure 3.2 gives the overall flow of my method. In short, during pre-silicon emulation of post-silicon tests, coverage information is collected as training data. Then, the feature selection algorithm selects a subset of coverage monitors to instrument on the ASIC implementation.

[Figure 3.2: General Method. Flow: input tests for training are run on the emulator with all coverage monitors instrumented; the sampled statistical information feeds an objective function that drives sequential forward selection (SFS), producing the instrumented subset $CM_{inst}$. The method uses information gathered in emulation to create a subset of coverage monitors to instrument on the ASIC implementation.]

3.3 Objective Functions

3.3.1 Objective Function 1: Assuming Independent Probabilities

The first objective function captures the intuitively appealing idea of instrumenting on silicon the coverage monitors that were the hardest to hit, considering each coverage monitor in $CM$ by itself. Formally, this means treating the random variables $c_i$ as if they were independent, and computing the estimate of each $\hat{P}(c_i = 1)$.

From the general method in Figure 3.1, the objective function is based on two terms: the probability of full coverage given that all coverage monitors in $CM_{inst}$ are covered, $\hat{P}(\mathit{full} \mid CM_{inst})$, and the probability of full coverage given that all coverage monitors in $CM_{inst}$ plus another coverage monitor are covered, $\hat{P}(\mathit{full} \mid CM_{inst}, c = 1)$.
Let us first analyze $\hat{P}(\mathit{full} \mid CM_{inst})$:

$$
\begin{aligned}
\hat{P}(\mathit{full} \mid CM_{inst}) &= \hat{P}\big(\forall_{d \in CM}(d = 1) \mid \forall_{d \in CM_{inst}}(d = 1)\big) \\
&= \frac{\hat{P}\big(\forall_{d \in CM}(d = 1) \wedge \forall_{d \in CM_{inst}}(d = 1)\big)}{\hat{P}\big(\forall_{d \in CM_{inst}}(d = 1)\big)}, \quad \text{by the definition of conditional probability, Eq. 2.1} \\
&= \frac{\hat{P}\big(\forall_{d \in CM}(d = 1)\big)}{\hat{P}\big(\forall_{d \in CM_{inst}}(d = 1)\big)}, \quad \text{since } CM_{inst} \subset CM \\
&= \frac{\hat{P}(\mathit{full})}{\hat{P}(CM_{inst})}
\end{aligned}
\tag{3.7}
$$

Because of the independence assumption, the estimate of the conditional probability of full coverage becomes the product of the probabilities of the coverage monitors not in $CM_{inst}$ being covered:

$$
\begin{aligned}
\hat{P}(\mathit{full} \mid CM_{inst}) &= \frac{\hat{P}(\mathit{full})}{\hat{P}(CM_{inst})}, \quad \text{by Eq. 3.7} \\
&= \frac{\prod_{d \in CM} \hat{P}(d = 1)}{\prod_{d \in CM_{inst}} \hat{P}(d = 1)}, \quad \text{by the independence assumption} \\
&= \prod_{d \notin CM_{inst}} \hat{P}(d = 1)
\end{aligned}
\tag{3.8}
$$

By the same procedure, I derive the estimate of the conditional probability $\hat{P}(\mathit{full} \mid CM_{inst}, c = 1)$:

$$
\begin{aligned}
\hat{P}(\mathit{full} \mid CM_{inst}, c = 1) &= \frac{\hat{P}(\mathit{full})}{\hat{P}(CM_{inst}, c = 1)}, \quad \text{by Eq. 3.7} \\
&= \frac{\prod_{d \in CM} \hat{P}(d = 1)}{\prod_{d \in (CM_{inst} \cup \{c\})} \hat{P}(d = 1)}, \quad \text{by the independence assumption} \\
&= \prod_{d \notin (CM_{inst} \cup \{c\})} \hat{P}(d = 1)
\end{aligned}
\tag{3.9}
$$

I can now replace the objective function on line 4 of the general method in Figure 3.1 with the following:

1: Let $c$ be the coverage monitor to check for instrumentation.
2: Compute the current full coverage probability estimate (Eq. 3.8): $\hat{P}(\mathit{full} \mid CM_{inst}) = \prod_{d \notin CM_{inst}} \hat{P}(d = 1)$
3: Compute the full coverage probability estimate if $c$ is instrumented (Eq. 3.9): $\hat{P}(\mathit{full} \mid CM_{inst}, c = 1) = \prod_{d \notin (CM_{inst} \cup \{c\})} \hat{P}(d = 1)$
4: return: $\hat{P}(\mathit{full} \mid CM_{inst}, c = 1) - \hat{P}(\mathit{full} \mid CM_{inst})$

This objective function requires the fewest samples (emulation tests) to achieve a specified precision of the probability of coverage, because it depends only on estimates of one-dimensional probabilities $\hat{P}(c_i = 1)$.

However, one obvious drawback of this objective function is that it ignores any information about correlation among coverage monitors, e.g., a group of coverage monitors that all tend to be covered at the same time. This lack of correlation information could miscompute the probability of full coverage. For this reason, the assumption of independence may be problematic.
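To make the selection loop concrete, here is a minimal Python sketch of sequential forward selection driven by Objective Function 1 (the thesis implementation is in Matlab; the monitor names and probability estimates below are invented for illustration). Under the independence assumption, the gain of adding $c$ reduces to the difference of the Eq. 3.8 and Eq. 3.9 products, so the hardest-to-hit monitor always wins.

```python
from math import prod

def of1_gain(p, inst, c):
    """O(c): gain in the estimated full-coverage probability if monitor c
    joins CM_inst, under the independence assumption (Eqs. 3.8 and 3.9)."""
    without_c = prod(p[d] for d in p if d not in inst)
    with_c = prod(p[d] for d in p if d not in inst and d != c)
    return with_c - without_c

def forward_select(p, p_threshold):
    """Sequential forward selection: greedily add the monitor with the
    largest gain until the estimated full-coverage probability,
    prod of p[d] over uninstrumented d, meets the threshold."""
    inst = set()
    while prod(p[d] for d in p if d not in inst) < p_threshold:
        best = max((c for c in p if c not in inst),
                   key=lambda c: of1_gain(p, inst, c))
        inst.add(best)
    return inst

# Hypothetical per-monitor coverage probability estimates:
p = {'c1': 0.99, 'c2': 0.95, 'c3': 0.30, 'c4': 0.60}
print(sorted(forward_select(p, 0.9)))  # picks the hardest-to-hit monitors
```

Since the gain for $c$ equals $(1/\hat{P}(c=1) - 1)$ times the current product, the monitor with the lowest coverage probability is always selected first, matching the "hardest to hit" intuition.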
3.3.2 Objective Function 2: Bayesian Network

To overcome the lack of correlation information in the previous objective function, the second objective function considers the coverage monitors to be possibly dependent on each other and captures the relations between them using a Bayesian network (BN). I use off-the-shelf Bayesian network structural and parameter learning algorithms to construct a BN that approximates the joint probability distribution of the gathered data. For simplicity, and because I am only interested in full coverage, I add a "full coverage" node (FC node) to the BN, which is dependent on all CMs and indicates whether they are all covered. This node lets me conveniently handle all probabilistic reasoning with only one query.

Now armed with the BN, the objective function sets to 1 all the coverage monitors in $CM_{inst}$ and queries the FC node to get the current estimate of the probability of full coverage, $\hat{P}(\mathit{full} \mid CM_{inst})$. To get the probability $\hat{P}(\mathit{full} \mid CM_{inst}, c = 1)$, the objective function also sets the node of the coverage monitor being checked for instrumentation to 1, $c = 1$, and performs another query to the FC node. The objective function returns the difference of the two as the gain if $c$ were instrumented.

Again, the objective function using a Bayesian network replaces line 4 of the general method in Figure 3.1 with the following:

1: Let $c$ be the coverage monitor to check for instrumentation.
2: Clear all evidence on the BN (set to unknown).
3: Set all $CM_{inst}$ nodes to 1 on the BN.
4: $\hat{P}(\mathit{full} \mid CM_{inst})$ = query the FC node for full coverage on the BN.
5: Set the $c$ node to 1 on the BN.
6: $\hat{P}(\mathit{full} \mid CM_{inst}, c = 1)$ = query the FC node for full coverage on the BN.
7: return: $O(c) = \hat{P}(\mathit{full} \mid CM_{inst}, c = 1) - \hat{P}(\mathit{full} \mid CM_{inst})$

Objective Function 2 requires more samples than Objective Function 1 to achieve a specified precision of the probabilities of coverage.
The reason Objective Function 2 requires more samples is that, to learn the BN, it needs estimates of relations among multiple coverage points, like $\hat{P}(c_i = 1 \mid c_j = 1)$, whereas Objective Function 1 uses estimates of only one-dimensional probabilities, $\hat{P}(c_i = 1)$.

Objective Function 2 is able to compute a more precise estimate of the full coverage probability, but to do so, it has to generate a complete BN with both the structure and the probability parameters. Learning the BN is slow and produces more data than needed, since I do not directly require the BN structure.

3.3.3 Objective Function 3: Direct Conditional Probability

To avoid learning a BN, I propose a new objective function that directly estimates the high-dimensional conditional probability from the collected data.

As in Objective Function 1, I use Eq. 3.7 to estimate the probability of full coverage given that the coverage monitors in $CM_{inst}$ are covered. However, instead of simplifying the probability $\hat{P}(\mathit{full} \mid CM_{inst})$ down to estimates of independent probabilities $\hat{P}(c_i = 1)$, Objective Function 3 simplifies it down to estimates of conditional probabilities given that $CM_{inst}$ is covered, $\hat{P}(c_i = 1 \mid CM_{inst})$.

To do so, I use the definition of the probability of the intersection of two events, derived from the conditional probability definition:

$$P(A \cap B) = P(A \mid B)\,P(B), \quad \text{from Eq. 2.1} \tag{3.10}$$

Then, I express the probability of full coverage, $\hat{P}(\mathit{full})$, in terms of two sets of events: coverage monitors in $CM_{inst}$ being covered, and coverage monitors not in $CM_{inst}$ being covered:

$$
\begin{aligned}
\hat{P}(\mathit{full}) &= \hat{P}\big(\forall_{d \notin CM_{inst}}(d = 1) \wedge \forall_{d \in CM_{inst}}(d = 1)\big) \\
&= \hat{P}\big(\forall_{d \notin CM_{inst}}(d = 1) \mid \forall_{d \in CM_{inst}}(d = 1)\big)\,\hat{P}\big(\forall_{d \in CM_{inst}}(d = 1)\big), \quad \text{by Eq. 3.10} \\
&= \hat{P}\big(\forall_{d \notin CM_{inst}}(d = 1) \mid CM_{inst}\big)\,\hat{P}(CM_{inst}), \quad \text{replacing with notation}
\end{aligned}
\tag{3.11}
$$

I now replace $\hat{P}(\mathit{full})$ in Eq. 3.7 using Eq.
3.11, which cancels $\hat{P}(CM_{inst})$:

$$\hat{P}(\mathit{full} \mid CM_{inst}) = \frac{\hat{P}\big(\forall_{d \notin CM_{inst}}(d = 1) \mid CM_{inst}\big)\,\hat{P}(CM_{inst})}{\hat{P}(CM_{inst})} = \hat{P}\big(\forall_{d \notin CM_{inst}}(d = 1) \mid CM_{inst}\big) \tag{3.12}$$

In Objective Function 1, because of the independence assumption among all coverage points, Eq. 3.12 would reduce to $\hat{P}(\forall_{d \notin CM_{inst}}(d = 1))$; moreover, it would be expressed as the product of the probabilities $\hat{P}(d = 1)$ for all coverage monitors not yet instrumented. In contrast, Objective Function 3 assumes independence only among the coverage monitors not in $CM_{inst}$.

To continue the simplification, I use a standard derivation of the conditional probability relating three events, in particular, the probability of two events occurring given that a third event has occurred:

$$
\begin{aligned}
P(x \wedge y \mid z) &= \frac{P(x \wedge y \wedge z)}{P(z)} \\
&= \left(\frac{P(x \wedge y \wedge z)}{P(y \wedge z)}\right)\left(\frac{P(y \wedge z)}{P(z)}\right) \\
&= P(x \mid y \wedge z)\,P(y \mid z)
\end{aligned}
\tag{3.13}
$$

If we take Eq. 3.13 and define $x$ and $y$ to be conditionally independent given $z$, then knowing $y$ gives no additional information about $x$. Hence, Eq. 3.13 reduces to:

$$P(x, y \mid z) = P(x \mid z)\,P(y \mid z) \tag{3.14}$$

For Objective Function 3, I consider coverage monitors to be dependent only on coverage monitors in $CM_{inst}$. This means that I am making an assumption of independence among the coverage points that are not yet instrumented. Using Eq. 3.14 and given the independence assumption, Eq. 3.12 simplifies to:

$$\hat{P}(\mathit{full} \mid CM_{inst}) = \hat{P}\big(\forall_{d \notin CM_{inst}}(d = 1) \mid CM_{inst}\big) = \prod_{d \notin CM_{inst}} \hat{P}(d = 1 \mid CM_{inst}) \tag{3.15}$$

So far we have considered $CM_{inst}$ to be fully covered. However, it might be the case that there are two mutually exclusive coverage monitors. For example, let us consider the data, $D_1$ and $D_2$, collected from two tests, where every column represents a coverage monitor and every row a sample of the coverage monitors at a specific time:

$$
D_1 = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 \end{bmatrix}
\qquad
D_2 = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 \end{bmatrix}
$$

For example, $c_1$ is covered at the end of $D_1$, but is never covered in $D_2$. As explained in Section 3.2, the compressed sampling will form one data
collection $D$ from the two tests:

$$
D = \begin{bmatrix}
0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
1 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 1 & 1 & 1
\end{bmatrix}
\tag{3.16}
$$

Notice that the coverage monitor $c_1$, which is set in $D_1$, and the coverage monitor $c_3$, which is set in $D_2$, are mutually exclusive: they are never set at the same time. Still, from this data we could conclude that $c_2 = 1$ and $c_4 = 1$ if $c_1 = 1$ and $c_3 = 1$.

However, if $CM_{inst} = [c_1, c_3]$ and I estimate the probabilities of coverage using Eq. 3.15 and the data in Eq. 3.16, any estimate $\hat{P}(d = 1 \mid CM_{inst})$ would be undefined, since there is no sample where both coverage monitors are covered (the empirical estimate is 0/0). If $\hat{P}(d = 1 \mid CM_{inst})$ is undefined, then the probability of full coverage $\hat{P}(\mathit{full} \mid CM_{inst})$ is also undefined, and the objective function would fail to optimize $CM_{inst}$.

To work around this issue, I make an extra assumption of independence, which could affect the accuracy of the estimated probabilities of coverage. I assume every coverage monitor can be dependent on only a subset of $CM_{inst}$. Furthermore, I choose the subset of $CM_{inst}$ that gives the maximum probability of coverage for a given coverage monitor $d$. I denote this by $\overline{P}(d = 1 \mid CM_{inst})$, which can be expressed as:

$$\overline{P}(d = 1 \mid CM_{inst}) = \max_{cm_{inst} \subseteq CM_{inst}} \left[\hat{P}(d = 1 \mid cm_{inst})\right] \tag{3.17}$$

The following recursive function is a simple implementation of Eq. 3.17:

1: function Pmax($d = 1$, $CM_{inst}$)
2:   $p_{max} = \hat{P}(d = 1 \mid CM_{inst})$
3:   for all $e \in CM_{inst}$ do
4:     Assume $e$ is independent of $d$: $cm_{instTemp} = CM_{inst} - e$
5:     $p_{temp}$ = Pmax($d = 1$, $cm_{instTemp}$)
6:     if $p_{temp} > p_{max}$ then
7:       $p_{max} = p_{temp}$
8:   return $p_{max}$

I simply replace the previous estimate in Eq. 3.15 with Eq. 3.17:

$$\hat{P}(\mathit{full} \mid CM_{inst}) = \prod_{d \notin CM_{inst}} \overline{P}(d = 1 \mid CM_{inst}) \tag{3.18}$$

By considering $c$ as part of the coverage monitors in $CM_{inst}$, I use Eq.
3.18 to get $\hat{P}(\mathit{full} \mid CM_{inst}, c = 1)$:

$$\hat{P}(\mathit{full} \mid CM_{inst}, c = 1) = \prod_{d \notin (CM_{inst} \cup \{c\})} \overline{P}\big(d = 1 \mid CM_{inst} \cup \{c\}\big) \tag{3.19}$$

As with the previous objective functions, I now replace line 4 of the general method (Figure 3.1) with the following:

1: Let $c$ be the coverage monitor to check for instrumentation.
2: Use Eq. 3.18 to compute the current full coverage probability.
3: Use Eq. 3.19 to compute the full coverage probability if $c$ is instrumented.
4: return: $O(c) = \hat{P}(\mathit{full} \mid CM_{inst}, c = 1) - \hat{P}(\mathit{full} \mid CM_{inst})$

Similar to Section 3.3.1, where Objective Function 1 directly estimates the probabilities $\hat{P}(c_i = 1)$ from the data, this objective function directly estimates the conditional probabilities $\hat{P}(c_i = 1 \mid CM_{inst})$. However, the new estimate requires more samples to reach the precision achieved with the one-dimensional probabilities of Objective Function 1. Nonetheless, Objective Function 3 is able to compute a better approximation of the full coverage probability by capturing the extra dependencies between coverage points and the instrumented coverage points. Furthermore, Objective Function 3 avoids computing a complete approximation of the joint probability distribution (as Objective Function 2 does) by directly estimating only the meaningful probabilities.

Chapter 4
Implementation and Evaluation Methodology

This chapter provides the details of the implementation and evaluation of my method. First, Section 4.1 explains the tools used to implement the algorithms of the method presented in Chapter 3. Section 4.2 presents the design and emulator used to demonstrate my method. Then, Section 4.3 explains how tests are generated to stress the design, both for gathering the statistical information and for evaluating the selection. The last section, Section 4.4, explains the type of coverage used in my experiments.

4.1 Implementation Tools

The main tool used to implement the algorithms and compute the probabilities is Matlab.
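For illustration, the Eq. 3.17 workaround can be sketched in Python (the thesis implementation is in Matlab); here I enumerate subsets iteratively instead of using the recursive Pmax, which is equivalent for the small per-IP monitor sets, and I return None for the undefined 0/0 case. The sample matrix is the one from Eq. 3.16, with columns indexed 0 to 3 for $c_1$ to $c_4$.

```python
from itertools import combinations

def cond_prob(samples, d, cond):
    """Empirical P(d = 1 | all monitors in cond = 1); None when the
    conditioning event never occurs in the samples (the 0/0 case)."""
    rows = [s for s in samples if all(s[c] == 1 for c in cond)]
    if not rows:
        return None
    return sum(s[d] for s in rows) / len(rows)

def p_max(samples, d, cm_inst):
    """Eq. 3.17: maximise the empirical conditional probability over all
    subsets of cm_inst, skipping subsets whose conditioning event never
    occurs.  Iterative stand-in for the thesis's recursive Pmax."""
    best = 0.0
    for k in range(len(cm_inst) + 1):
        for sub in combinations(cm_inst, k):
            p = cond_prob(samples, d, sub)
            if p is not None and p > best:
                best = p
    return best

# The compressed data D of Eq. 3.16; c1 (col 0) and c3 (col 2) are
# mutually exclusive, so conditioning on both at once is undefined.
D = [[0, 0, 0, 0], [0, 1, 0, 0], [1, 1, 0, 1],
     [0, 0, 0, 0], [0, 1, 0, 0], [0, 1, 1, 1]]
print(p_max(D, 1, [0, 2]))  # c2 given subsets of {c1, c3}
```

Note the subset enumeration is exponential in |CM_inst|, which is one reason the method partitions monitors per IP block (Section 4.4) to keep the sets small.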
The whole general method and almost all the objective functions are implemented using the libraries and functions provided with the regular distribution of Matlab. The only exception is Objective Function 2, where I use the Bayes Net Toolbox (BNT), developed by Murphy [20], to learn and use the Bayesian network in the algorithm. The BNT toolbox is conveniently implemented in Matlab, so I directly integrate it into the Objective Function 2 algorithm.

[Figure 4.1: Bayesian Network Structures. Three Bayesian network structures created by the BNT toolbox, based on data collected from the units of the design described in Sections 4.2 and 4.4: (a) the integer unit, (b) the multiplier unit, and (c) the divider unit.]

I set up the BNT toolbox to use the most common algorithms to learn both the structure and the probability distribution. In particular, BNT uses the K2 algorithm [7] for structure learning and a greedy search-and-score algorithm for learning the tabular parameters. The greedy search-and-score algorithm uses a fixed order, maximum-likelihood parameter estimation, and the BDeu (Bayesian Dirichlet equivalent uniform [13]) prior initialization. As an example, Figure 4.1 shows the learnt structures for three different data collections. The numbered nodes represent coverage monitors, and the FC node represents the event of full coverage, as explained in Section 3.3.2. Each Bayesian network shows the relations of the coverage monitors of one of the instrumented units of the design described later in Sections 4.2 and 4.4. Specifically, Figure 4.1(a) used data from the integer unit, Figure 4.1(b) used data from the multiplier unit, and Figure 4.1(c) used data from the divider unit.

Figure 4.2: LEON3 Processor Core Block Diagram.
The central component is a 7-stage integer unit that connects to other processing units, such as the multiplication and division hardware and the floating point unit. The core implements separate instruction and data caches. (Figure taken from [11, p. 843])

4.2 Evaluation Platform

To demonstrate the proposed method, I need a benchmark design and an emulation platform. I use an SoC design that was previously used in my lab for other post-silicon validation research [3, 4, 16]. The SoC is composed of the open-source LEON3 SPARCv8 core, provided by Aeroflex Gaisler³ [10, 11], plus various peripherals supplied by Aeroflex Gaisler and the open-source community, such as UART, video, DDR2 controller, Ethernet, and flash memory. It is roughly comparable to a netbook or tablet device, and Aeroflex Gaisler claims it can be fabricated in 0.13 µm ASIC technology at a speed of 400 MHz.

³ The aerospace company Cobham plc acquired Aeroflex on September 15, 2014. Shortly after the acquisition, Aeroflex Gaisler was renamed Cobham Gaisler AB. However, I will continue calling them Aeroflex Gaisler for the rest of this thesis, since most of the work was done prior to this change.

Because of the processor-centric nature of the training tests (described later in Section 4.3), my experiments focus on three IP blocks from the LEON3 core: the integer unit (IU CTRL), the multiplier (MUL), and the divider (DIV).

The integer unit IU CTRL implements a single instruction pipeline and executes ALU, logical, branch, and shift operations. The LEON3 integer IP has the following main features:

• 7-stage instruction pipeline
• Separate instruction and data cache interfaces
• Support for 2-32 register windows (8 register windows are used in my experiments)
• Static branch prediction
• Single-vector trapping for reduced code size

The IU CTRL IP dispatches multiplication and division instructions to the multiplication IP and the division IP, respectively.
MUL performs 32x32-bit integer multiplication (signed and unsigned) and returns a 64-bit result. DIV performs 64-by-32-bit division, implementing the radix-2 non-restoring algorithm, and returns a 32-bit result without remainder.

The method proposed in my thesis requires a way to emulate the design. Aeroflex Gaisler [10] provides templates for several Xilinx and Altera commercial boards. I chose one that was already available in the lab, the Xilinx ML505 (XUPV5-LX110T) evaluation platform with the general-purpose Virtex-5 FPGA. The Virtex-5 FPGA has 17280 slices, each with four LUTs and four flip-flops. The design uses 9065 slices, 52% of the total available. The SoC can be loaded onto the Virtex-5 with an operating frequency of 80 MHz. The emulated SoC is fast enough to boot Linux and run the X11 windowing system.

[Figure 4.3: LEON3 Integer Unit Datapath Diagram. It is implemented as a 7-stage pipeline. The instruction cache feeds the pipeline, which is able to execute ALU, branch, and shift operations. It connects to the register file, as well as to the multiplication and division hardware if available. (Figure taken from [11, p. 845])]

The goal of my method is to select a small set of coverage monitors to instrument in silicon. Therefore, I need to evaluate the selected set. The most compelling evaluation would be to fabricate the design in silicon with full on-chip instrumentation of all coverage monitors; full instrumentation is required to check to what degree coverage of $CM_{inst}$ implies full coverage of all CMs. But due to time and resource constraints, instrumenting the design in real silicon is not feasible for my thesis. Instead, I use the same emulator used for implementing my method to evaluate the selection of coverage monitors.
Since I am using emulation to evaluate the $CM_{inst}$ selected on emulation, the method samples the coverage monitors for the training data using a training set of tests, and then I evaluate $CM_{inst}$ using completely independent sets of tests.

4.3 Test Generation for Training and for Evaluation

As explained in Chapter 3, the method assumes a training set of tests with some randomness. To generate the training set, I used the tool CSmith [26], which generates random C programs. Although CSmith is commonly used to test compilers, I take advantage of the randomness of the generated programs. Using CSmith provides an easy mechanism to generate (constrained) random instruction tests, as are commonly used for post-silicon validation.

I generated 100 random C programs and compiled them using the Aeroflex Gaisler BCC cross-compiler to run directly on the LEON3 processor. This set is used to stress the SoC; I then sample the coverage monitors to estimate the probabilities of coverage.

Since I am not fabricating the design in a custom integrated circuit, the evaluation methodology needs new sets of tests. I create two new evaluation sets using CSmith. The first evaluation set is composed of 400 short tests. The second evaluation set is composed of 10 longer tests (equivalent to roughly 500-700 tests of the first set). I use one final evaluation test: booting Linux.
I created a custom Linux version based on kernel 3.10 and some extra patches provided by Aeroflex Gaisler.

I run these completely new evaluation tests on the SoC and measure how well the instrumented coverage monitors $CM_{inst}$ imply full coverage of $CM$. This emulates what would happen on-chip in real post-silicon validation. However, in my experiments, since the new tests are also run via emulation, I can observe all coverage monitors and measure how well my proposed method worked.

4.4 Coverage Instrumentation

My method could be applied to any coverage model, but in order to run experiments, it needs specific coverage targets. As mentioned before, the selected SoC has previously been used for other post-silicon validation research, and several functional units (including the selected IPs in Section 4.2) had already been instrumented by others for code coverage [4, 16]. Code coverage is known to be a weak coverage model (poor coverage indicates poor validation, but high coverage is no guarantee of adequate validation), but it is also standard, well known, and well accepted. Unlike functional coverage, where the verification engineer's expertise is vital in devising good functional coverage targets, the definitions of code coverage are precise and completely objective, making it a good choice for experimental evaluations. And despite its shortcomings, it is actively used in industry and has proven valuable (e.g., [12]).

For the probability estimators, an important methodological and practical question is how many coverage monitors to include in the analysis. The larger the set of coverage monitors analyzed together, the more complex and more accurate the relationships that can be discovered in the analysis; Objective Function 1 is the exception, since it assumes independence among all coverage monitors.
On the other hand, scalability suffers as the set grows, because the number of pairwise interactions grows quadratically with the number of elements. The number of interactions impacts the complexity of the BN (Section 3.3.2) and of the direct conditional probabilities (Section 3.3.3). For this reason, in all my experiments, the method partitions the monitors at the level of IP blocks, creating a separate $CM_{inst}$ for each IP block. Such a partitioning allows my methodology to scale to arbitrarily large designs without complexity blow-up, but it fails to capture relationships between monitors in different IP blocks.

In my experiments, I found it useful to classify the coverage monitors into three types: "active" CMs, "instantly covered" CMs, and uncoverable CMs. Perhaps because the test tool is not tuned to stress the complete design, there are some coverage monitors that are never covered by the training tests. In real life, considerable effort would be spent analyzing each of these coverage monitors to determine whether they are truly uncoverable, or to augment the test set to cover them. For my experiments, I simply labeled them "uncoverable" and precluded them from further consideration. Of the remaining, coverable coverage monitors, I found that many were covered at the start of every test, so every time I sampled them they were set to 1. I label these "instantly covered." In principle, my algorithm would correctly identify these coverage monitors and exclude them from the instrumented set $CM_{inst}$, because monitoring them would provide no useful information, but for efficiency, I simply excluded them manually, as I expect would be done in practice.
I call the remaining coverage monitors "active." Table 4.1 shows the distribution of the three kinds of coverage monitors in the chosen IP blocks, based on emulation of the training tests.

In my experiments, the method focuses on the active CMs, since my algorithm only selects coverage monitors from this set.

Table 4.1: Breakdown of Coverage Monitor (CM) Types

Unit      Total   Uncoverable   Coverable:           Coverable:
                                Instantly Covered    Active
IU CTRL   118     8             92                   18
MUL       46      1             21                   24
DIV       31      6             9                    16
Sum       195     15            122                  58

Chapter 5
Experimental Results

This chapter presents the results of applying my method on the evaluation platform described in Chapter 4. First, Section 5.1 presents the reductions achieved by the method using the objective functions and the time required to get those reductions. Then, Sections 5.2 to 5.4 evaluate how well the subset of coverage monitors $CM_{inst}$ represents the complete set of coverage monitors $CM$. Section 5.5 compares the reductions achieved by my method with the reductions achieved by others using static analysis of the source code. Finally, Section 5.6 shows the impact of implementing the coverage monitors selected by the method presented in this thesis.

5.1 Coverage Monitoring Reduction

The first question is how much reduction in $CM$ is achievable using my method with the proposed objective functions. Tables 5.1 to 5.3 show the results for each of the three objective functions, using the training set of 100 random programs. The first set of columns, labeled "Size of CMinst," presents the number of coverage monitors required to predict at least 90%, 98%, and 99% probability of full coverage ($p_{threshold}$ in the general method) given that all the coverage monitors in $CM_{inst}$ are set. The second set of columns,
The third set of columns labeled “OverallReduction” shows the reduction achieved by the complete method, due tothe feature selection algorithm and emulation analysis (removing instantlycovered coverage monitors). The tables present the results for each of theinstrumented units. In the last row, the tables present the sum of the sizesof CMinst and the reductions considering the sum of the CM sets of eachunit.The tables show how the lower pthreshold gets the more reduction of cov-erage monitors the method achieves. However, it is also true that the lowerpthreshold is the least assurance of full coverage we get from covering theselected coverage monitors CMinst.Notice that Objective Function 1 yields the worst results of all threeobjective functions. As mentioned earlier, because of the inaccurate estimateof the probabilities of coverage, Objective Function 1 produced no reduction— except in IU CTRL with pthreshold “ 0.9. However, since there is no extrareduction achieved by the feature selection algorithm, the last two of columnsin Table 5.1 represent the reduction achieved by removing only “instantlycovered” coverage monitors.By capturing the relations among coverage monitors, Objective Function2 and Objective Function 3 yield a smaller subset of coverage monitorsto instrument than Objective Function 1. However, the best result wasachieved using Objective Function 3 with a total reduction in the coverablecoverage monitors of 95.5% with pthreshold “ 0.99. Objective Function 3reduced the set by an extra 27.5% compared to just removing instantly425.1. 
covered coverage monitors.

Table 5.1: Reduction Achieved Using Objective Function 1.

          Size of CMinst       FS Reduction:             Overall Reduction:
          at pthreshold:       1 - |CMinst|/|Active|     1 - |CMinst|/|Coverable|
Unit      0.90   0.98   0.99   0.90   0.98   0.99        0.90   0.98   0.99
IU CTRL   17     18     18     6%     0%     0%          85%    84%    84%
MUL       24     24     24     0%     0%     0%          47%    47%    47%
DIV       16     16     16     0%     0%     0%          36%    36%    36%
Sum       57     58     58     2%     0%     0%          68%    68%    68%

Table 5.2: Reduction Achieved Using Objective Function 2.

          Size of CMinst       FS Reduction:             Overall Reduction:
          at pthreshold:       1 - |CMinst|/|Active|     1 - |CMinst|/|Coverable|
Unit      0.90   0.98   0.99   0.90   0.98   0.99        0.90   0.98   0.99
IU CTRL   3      3      3      83%    83%    83%         97%    97%    97%
MUL       3      4      4      87%    83%    83%         93%    91%    91%
DIV       6      7      8      62%    56%    50%         76%    72%    68%
Sum       12     14     15     79%    76%    74%         93%    92%    92%

Table 5.3: Reduction Achieved Using Objective Function 3.

          Size of CMinst       FS Reduction:             Overall Reduction:
          at pthreshold:       1 - |CMinst|/|Active|     1 - |CMinst|/|Coverable|
Unit      0.90   0.98   0.99   0.90   0.98   0.99        0.90   0.98   0.99
IU CTRL   3      3      3      83%    83%    83%         97%    97%    97%
MUL       1      1      1      96%    96%    96%         98%    98%    98%
DIV       4      5      5      75%    69%    69%         84%    80%    80%
Sum       8      9      9      86%    84%    84%         96%    95%    95%

Unless otherwise specified, in all subsequent experiments I use the $CM_{inst}$ that predicts at least 98% probability of full coverage; in other words, I use $p_{threshold} = 0.98$ in the general method.

Table 5.4 shows the time required by the complete method to create the subset $CM_{inst}$. The column labeled "Sample Time" presents the time required for running the training tests and sampling the coverage monitors in emulation. For all the objective functions, this was the most time-consuming task. However, this cost could be amortized over the entire emulation test plan, as many tests need to be emulated anyway and the method could reuse those tests.

The fastest objective function is Objective Function 1, but this is irrelevant considering it is unable to yield a smaller subset.
On the other hand, while only one or two orders of magnitude slower than Objective Function 1, Objective Function 3 yields the smallest subset. Objective Function 2 is four or five orders of magnitude slower than Objective Function 3; it is the slowest because of the time required to learn the BN (querying the BN is not time-consuming).

Table 5.4: Time to Compute CMinst. The times for the method were measured on a machine with an Intel Core i5-4670 running at 3.4 GHz, on a 64-bit Windows 7 OS.

                        Method Compute Time
Unit      Sample Time   Objective     Objective     Objective
                        Function 1    Function 2    Function 3
IU CTRL   15.78 hr      9.86 ms       103.97 s      99.77 ms
MUL       15.61 hr      9.38 ms       129.49 min    30.63 ms
DIV       1.1 hr        9.63 ms       27.39 s       471.39 ms
Sum       32.49 hr      28.87 ms      131.68 min    83.14 ms

5.2 Accuracy of Coverage Probability Predictions

This section and the following two sections present experiments to evaluate the quality of the sets $CM_{inst}$ produced by each of the three objective functions.

The first experiment is a fine-grained test of the individual coverage probability predictions. As described in Section 4.3, I used an evaluation test set of 400 individual random programs. For each test, I measure the final coverage over all coverable coverage monitors, as well as the coverage in $CM_{inst}$. Based on the coverage of $CM_{inst}$, I compute the predicted number of covered coverage monitors. For Objective Function 1, this is simply the number of instrumented coverage monitors set to 1, plus the sum of the estimates of the probability $\hat{P}(d = 1)$ of every uninstrumented coverage monitor $d$; however, since there are no uninstrumented coverage monitors when using Objective Function 1, the prediction is simply the real coverage of the active CMs. For Objective Function 2, I give the coverage of $CM_{inst}$ as evidence to the Bayesian network and query the updated predicted probabilities of covering each coverage monitor.
Summing these probabilities gives the predicted number of covered coverage monitors. For Objective Function 3, I sum the conditional probability $\overline{P}(d = 1 \mid CM_{inst})$ (Eq. 3.17) for every uninstrumented coverage monitor $d$, plus the number of instrumented coverage monitors set to 1.

Figures 5.1 to 5.3 show how well $CM_{inst}$ represents the complete set $CM$. The x axis represents the predicted number of active coverage monitors set to 1, while the y axis represents the number of measured active coverage monitors set to 1. On these bubble graphs, the area of each circle represents the number of tests with the same number of predicted and measured coverage monitors set to 1. The solid lines represent Objective Function 1 (OF1) results, the dashed lines represent Objective Function 2 (OF2) results, and the dotted lines represent Objective Function 3 (OF3) results; the regression lines are $R^2$ regression lines. For reference, the number next to each legend entry shows the size of the set $CM_{inst}$, while the number on the grid shows the size of the set of active CMs.

Notice that the best prediction is given by Objective Function 1, but this is not a real prediction, since all active coverage monitors are instrumented. Of the other two objective functions, Objective Function 2 provides the best prediction. The reason is that the BN in Objective Function 2 estimates a higher-dimensional joint probability distribution, but it is also because of the difference in the number of instrumented coverage monitors: the extra reduction achieved by Objective Function 3 affects the predictions. In particular, Figure 5.2 shows a large bias in the prediction from Objective Function 3 because it instruments only one coverage monitor.

For all objective functions, the predictions are more accurate for higher-coverage tests. In particular, the objective functions give the worst predictions for the MUL and DIV IPs in the test cases with actual zero coverage.
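The per-test prediction just described can be sketched in a few lines of Python (the names and numbers below are hypothetical; for Objective Function 3 the probabilities would come from Eq. 3.17, and for Objective Function 2 from the BN query):

```python
def predicted_coverage(inst_observed, uninst_prob):
    """Predicted number of covered active monitors: the count of
    instrumented monitors actually observed set to 1, plus the sum of
    the estimated coverage probabilities of the uninstrumented ones."""
    return sum(inst_observed.values()) + sum(uninst_prob.values())

# Hypothetical test outcome: both instrumented monitors observed covered,
# plus conditional estimates for the two uninstrumented monitors.
print(predicted_coverage({'c1': 1, 'c3': 1}, {'c2': 0.9, 'c4': 0.75}))
# 3.65
```

The measured value for the same test is simply the count of active monitors set to 1, which is what the bubble graphs plot the prediction against.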
[Figure 5.1: IU CTRL Expected Number of Coverage Monitors Based on Active CM Coverage. The number of coverage monitors in active CM for the IU CTRL unit is 18. CMinst sizes: OF1 = 18, OF2 = 3, OF3 = 3.]

[Figure 5.2: MUL Expected Number of Coverage Monitors Based on Active CM Coverage. The number of coverage monitors in active CM for the MUL unit is 24. CMinst sizes: OF1 = 24, OF2 = 4, OF3 = 1.]

[Figure 5.3: DIV Expected Number of Coverage Monitors Based on Active CM Coverage. The number of coverage monitors in active CM for the DIV unit is 16. CMinst sizes: OF1 = 16, OF2 = 7, OF3 = 5.]
However, my method's goal is to get good predictions of full coverage, so we mostly care about the predictions at high coverage.

5.3 Accuracy of Full Coverage Predictions

As noted above, the worst predictions occur at low coverage, but my entire methodology is aimed at achieving full coverage. Accordingly, my next experiment assesses how well full coverage of CMinst implies full coverage of all coverage monitors.

For this experiment, I measure the latency between achieving coverage of CMinst and full coverage. Ideally, this latency should be 0. The longer it is, the greater the risk that we will erroneously believe we have achieved full coverage.

In this experiment, I use an evaluation test set formed by 10 long random programs. Each of these tests is long enough to achieve full coverage. The tests are run until all instrumented coverage monitors are set, and then continue until full coverage is reached. Every test is run three times, once for each IP block. I define latency as the difference between the time when all coverage monitors in CMinst are set to 1 and the time when full coverage is reached; the time is measured in clock cycles.

Remarkably, on every test and IP block, the latency was zero, as shown in Figure 5.4. Zero latency means that the last coverage monitor to be set to 1 is in the subset CMinst. In this experiment at least, the instrumented coverage monitors were a perfect surrogate for full coverage monitoring. Note that this result held for the instrumented coverage monitors chosen by all three objective functions.

[Figure 5.4: Coverage Latency. For reference, the number of coverage monitors in CMinst for each objective function is shown next to the legend: OF1 = 58, OF2 = 14, OF3 = 9.]
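The latency measurement above can be sketched as follows, assuming a cumulative per-cycle trace of which monitors have fired so far; the trace representation and all names are illustrative assumptions, not the thesis's actual instrumentation.

```python
from typing import List, Optional, Set

def coverage_latency(trace: List[Set[str]], cm_inst: Set[str]) -> int:
    """Latency in clock cycles between the cycle at which every monitor
    in CMinst has fired and the cycle at which full coverage is reached.
    trace[t] is the cumulative set of monitors fired by cycle t; the test
    is assumed long enough that trace[-1] represents full coverage."""
    all_cms = trace[-1]
    t_inst: Optional[int] = None
    t_full: Optional[int] = None
    for cycle, fired in enumerate(trace):
        # First cycle at which all of CMinst has fired.
        if t_inst is None and cm_inst <= fired:
            t_inst = cycle
        # First cycle at which every monitor has fired.
        if t_full is None and all_cms <= fired:
            t_full = cycle
    assert t_inst is not None and t_full is not None
    return t_full - t_inst  # 0 means CMinst was a perfect surrogate
```

A zero result on every test, as observed in the experiments, means the last monitor to fire always belonged to CMinst.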
So, considering the achieved reduction and the time required to obtain it, the most promising methodology uses Objective Function 3.

5.4 Real Application: Linux Boot

The Csmith tests are somewhat artificial, so for the last experiment, I measured coverage during a Linux boot, a very typical, real post-silicon validation test. This also shows results on something very different from the training set. The Linux boot does not exhibit much random variance, but it is much longer than the other evaluation tests, so instead I show the coverage achieved over time.

Tables 5.5 to 5.7 summarize the coverage during a Linux boot on each of the instrumented units. For this experiment, a sample was taken every 5 million cycles. The tables compare the coverage in CMinst and in coverable CM (number of coverage monitors in CMinst set to 1 / number of coverage monitors in CM set to 1).

In general, the results show that the sets CMinst created by the objective functions never reported a false full coverage. In more detail, there were two types of outcomes: a unit that achieved full coverage and units that did not.

The first type of outcome occurred in the MUL unit (Table 5.6), where full coverage happened at the same time as the coverage of CMinst. In other words, full coverage was achieved exactly when all instrumented coverage monitors were set.

The second type of outcome occurred in the IU CTRL and DIV units (Tables 5.5 and 5.7). In both cases, one coverage monitor remains unset before full coverage is reached. Yet the missing coverage monitor is part of CMinst, meaning full coverage would occur only when the last member of CMinst is set.
This gives strong reassurance that the selection of instrumented coverage monitors is indeed a good one.

Table 5.5: IU CTRL Linux Boot Test Results.

Cycle (million)   OF1     OF2   OF3   Real
0                 0/18    0/3   0/3   92/110
5                 13/18   0/3   0/3   105/110
10                14/18   0/3   0/3   106/110
15                14/18   0/3   0/3   106/110
20                14/18   0/3   0/3   106/110
25                14/18   0/3   0/3   106/110
30                15/18   1/3   1/3   107/110
40                15/18   1/3   1/3   107/110
80                16/18   2/3   2/3   108/110
650               17/18   2/3   2/3   109/110
860               17/18   2/3   2/3   109/110

Table 5.6: MUL Linux Boot Test Results.

Cycle (million)   OF1     OF2   OF3   Real
0                 0/24    0/4   0/1   21/45
5                 23/24   3/4   0/1   44/45
10                23/24   3/4   0/1   44/45
15                23/24   3/4   0/1   44/45
20                24/24   4/4   1/1   45/45
25                24/24   4/4   1/1   45/45
30                24/24   4/4   1/1   45/45
40                24/24   4/4   1/1   45/45
80                24/24   4/4   1/1   45/45
650               24/24   4/4   1/1   45/45
860               24/24   4/4   1/1   45/45

Table 5.7: DIV Linux Boot Test Results.

Cycle (million)   OF1     OF2   OF3   Real
0                 0/16    0/7   0/5   9/25
5                 0/16    0/7   0/5   9/25
10                0/16    0/7   0/5   9/25
15                0/16    0/7   0/5   9/25
20                0/16    0/7   0/5   9/25
25                4/16    4/7   2/5   22/25
30                4/16    4/7   2/5   22/25
40                6/16    6/7   4/5   24/25
80                6/16    6/7   4/5   24/25
650               6/16    6/7   4/5   24/25
860               6/16    6/7   4/5   24/25

5.5 Comparison to Static Source Code Analysis

Given that the experiments were run using code coverage, a natural question is how the method proposed in this thesis compares to what could be achieved via static analysis of the source code.

It turns out that the static analysis approach has been tried previously, using sophisticated source code analysis techniques (based on pre- and post-dominator trees) to determine which code coverage monitors imply other code coverage monitors.
In fact, the prior work [16] used the same SoC as a testbed, with the same initial code coverage instrumentation.

For the same IP blocks that I use, the static analysis reported reductions of 19–31% in the number of coverage monitors, with a guarantee of no information loss at all, versus my method's reductions of approximately 80–98%, with a predicted 99% probability of full coverage using Objective Function 3. Clearly, the additional information available via dynamic execution of actual test cases, plus the flexibility of a probabilistic guarantee, results in substantial savings. In practice, static techniques should be applied first, as much as possible, and then my approach can be applied to the remaining coverage monitors to achieve substantial further overhead reductions.

5.6 Area and Power

To get an idea of the real impact of implementing CMinst in custom silicon, I did a straightforward synthesis and mapping of the design. I measured the impact on area and power using the Synopsys Design Compiler tool and TSMC's 180 nm technology libraries.

The area and power impact are shown in Figures 5.5 and 5.6. The figures show the impact of instrumenting CMinst on each of the selected units (IU CTRL, MUL, and DIV), as well as on all three units considered as one (Sum). To measure the impact, I created five versions of the design: the original design without any instrumentation; the design with the Objective Function 1 CMinst instrumented; the design with the Objective Function 2 CMinst instrumented; the design with the Objective Function 3 CMinst instrumented; and a version of the design instrumenting all coverable coverage monitors ("active" plus "instantly covered" coverage monitors).

If the complete set of coverable coverage monitors were instrumented in silicon, the impact would be 5.00% area overhead and 6.90% more static power.
Of the three objective functions, Objective Function 1 has the largest impact on the original design in both area and power, but it still reduces the impact to 1.96% in area and 2.70% in power.

Objective Functions 2 and 3 have similar results when the three units are considered together. The reason is that both objective functions achieve the same reduction on the largest unit, IU CTRL. However, Objective Function 3 has the least impact on the MUL and DIV units. Specifically, in the MUL unit, the area overhead is 1.39% for Objective Function 2 and 0.59% for Objective Function 3, and the power overhead is 5.30% for Objective Function 2 and 0.70% for Objective Function 3. Clearly, Objective Function 3 has the least impact on both area and power. Furthermore, considering that Objective Function 3 is also much faster, it is clear that for the current benchmark the method using Objective Function 3 is the best of the three proposed objective functions.

The results presented in this chapter show that it is possible to extract a small subset of coverage monitors to instrument in silicon with a small impact on area and power. Furthermore, the impact may be reduced even more by lowering the predicted probability of full coverage, pthreshold, of the proposed method.

[Figure 5.5: Impact on Silicon Area of the Custom Implementation of the Design.]

[Figure 5.6: Impact on Static Power of the Custom Implementation of the Design.]

Chapter 6: Conclusion and Future Work

I have presented a novel approach to select a small subset of coverage monitors to instrument in silicon, such that covering this subset provides strong evidence that all coverage monitors were hit as well.
The technique uses emulation to sample the relationships between coverage monitors, and then uses statistical techniques to select a small but highly informative subset of coverage monitors.

The thesis proposed three different statistical techniques. The first technique assumes that all monitors are independent; because of this independence assumption, it failed to reduce the number of coverage monitors. The second technique uses a Bayesian network that approximates the joint probability distribution; although this approach produced a smaller subset, it was also the slowest. The third and last technique assumes some dependencies among coverage monitors and directly estimates the conditional probabilities from the gathered data. This direct approximation is much faster than the second technique, and by considering dependencies among coverage monitors, it produced the smallest subset. For example, the implementation in this thesis achieves a 95.5% reduction in the number of coverage monitors to instrument using this methodology.

There are many avenues for future work. The thesis presented a general framework, but only demonstrated the method via specific sampling choices, three particular probability estimators, and one of the simplest feature selection algorithms available. Each of these is an avenue for further research and improvement. For example:

• Other feature selection algorithms could be used instead of simple Sequential Forward Selection. For example, the method could use a "floating" selection strategy that adds and removes features to avoid getting stuck in a local minimum, and could therefore find a better solution.

• The method used a specific type of sampling.
However, it would be interesting to compare how a similar method would perform using sampling with no information loss, or even sampling only at the end of each test.

• In Objective Function 3, there is an exponential search over possible subsets of CMinst in Eq. 3.17. In my evaluation, because of the small size of CMinst, the compute time was not an issue for this objective function. However, this search could become restrictive on larger designs. To prevent the exponential growth in compute time, the algorithm could restrict the search space by limiting the dependencies, for example by considering only subsets of CMinst with at most 5 coverage monitors. A study of the implications of this change would be required.

• This thesis used code coverage for two main reasons: it is an objective and accepted coverage metric, and it was already implemented on the available SoC. However, code coverage has an intrinsic structure that may not be present in other types of coverage metrics. Therefore, it would be interesting to compare the results with different kinds of coverage (e.g., functional coverage).

• Due to resource and time constraints, I decided not to instrument the coverage monitors in silicon. However, comparing the results with experiments on real post-silicon tests would be the ultimate test for the method in this thesis.

Furthermore, another methodology might extend the information available in the subset so that it can predict the state of all coverage monitors, instead of just implying full coverage or not. This would be especially useful for detecting validation holes in detail, rather than only certifying full coverage. Intuitively, such a method should produce a larger subset, since it must contain more information than the subsets generated by my method.

Finally, the proposed method assumes the selected coverage monitors will be instrumented in silicon.
However, this is not the only possible way to track coverage. My method should apply to any coverage tracking mechanism, since it simplifies the problem by reducing the number of coverage events that must be covered to imply full coverage. For example, one could imagine an assertion coverage model that uses programmable hardware to check for coverage events on internal signals; having fewer coverage events to track would simplify the work of the post-silicon validation engineer, especially given that such hardware may have a limited number of events it can watch simultaneously.

Bibliography

[1] Miron Abramovici, Paul Bradley, Kumar Dwarakanath, Peter Levin, Gerard Memmi, and Dave Miller. A Reconfigurable Design-for-Debug Infrastructure for SoCs. In Design Automation Conference, pages 7–12. ACM/IEEE, 2006.

[2] A. Adir, A. Nahir, A. Ziv, C. Meissner, and J. Schumann. Reaching coverage closure in post-silicon validation. In Haifa Verification Conference, pages 60–75. Springer, 2011.

[3] K. Balston, A.J. Hu, S.J.E. Wilton, and A. Nahir. Emulation in post-silicon validation: It's not just for functionality anymore. In High Level Design Validation and Test Workshop (HLDVT), 2012 IEEE International, pages 110–117, Nov 2012.

[4] K. Balston, M. Karimibiuki, A.J. Hu, A. Ivanov, and S.J.E. Wilton. Post-silicon code coverage for multiprocessor system-on-chip designs. Computers, IEEE Transactions on, 62(2):242–246, Feb 2013.

[5] Pouria Bastani, Benjamin N. Lee, Li-C. Wang, Savithri Sundareswaran, and Magdy S. Abadir. Analyzing the risk of timing modeling based on path delay tests. In Intl. Test Conference. IEEE, 2007.

[6] T. Bojan, M.A. Arreola, E. Shlomo, and T. Shachar. Functional Coverage Measurements and Results in post-Silicon validation of Core 2 Duo family. In High Level Design Validation and Test Workshop, pages 145–150. IEEE, 2007.

[7] Gregory F. Cooper and Edward Herskovits.
A Bayesian method for the induction of probabilistic networks from data. Mach. Learn., 9(4):309–347, October 1992.

[8] Andrew DeOrio, Qingkun Li, Matthew Burgess, and Valeria Bertacco. Machine learning-based anomaly detection for post-silicon bug diagnosis. In Design Automation and Test in Europe (DATE), 2013.

[9] Shai Fine and Avi Ziv. Coverage directed test generation for functional verification using Bayesian networks. In Proceedings of the 40th Design Automation Conference, pages 286–291, 2003.

[10] Aeroflex Gaisler. Leon3 SoC. http://www.gaisler.com/products/grlib/grlib-gpl-1.3.7-b4144.tar.gz, 2012.

[11] Aeroflex Gaisler. Leon3 SoC IP Core Manual. http://www.gaisler.com/products/grlib/grip.pdf, 2014.

[12] J. Goodenough and R. Aitken. Post-silicon is too late: avoiding the $50 million paperweight starts with validated designs. In Proceedings of the 47th Design Automation Conference, pages 8–11. ACM/IEEE, 2010.

[13] David Heckerman and David M. Chickering. Learning Bayesian networks: The combination of knowledge and statistical data. Machine Learning, 20(3):197–243, 1995.

[14] A. Jain and D. Zongker. Feature selection: evaluation, application, and small sample performance. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 19(2):153–158, Feb 1997.

[15] George H. John, Ron Kohavi, and Karl Pfleger. Irrelevant features and the subset selection problem. In Machine Learning: Proceedings of the Eleventh International Conference, pages 121–129. Morgan Kaufmann, 1994.

[16] M. Karimibiuki, K. Balston, A.J. Hu, and A. Ivanov. Post-silicon Code Coverage Evaluation with Reduced Area Overhead for Functional Verification of SoC. In High Level Design Validation and Test Workshop, pages 92–97. IEEE, Nov. 2011.

[17] J. Keshava, N. Hakim, and C. Prudvi. Post-silicon validation challenges: How EDA and academia can help. In Proceedings of the 47th Design Automation Conference, pages 3–7. ACM/IEEE, 2010.

[18] John J. Kinney.
Probability: An Introduction with Statistical Applications. John Wiley & Sons, Inc., Hoboken, New Jersey, 2015.

[19] Subhasish Mitra, Sanjit A. Seshia, and Nicola Nicolici. Post-silicon validation opportunities, challenges and recent advances. In Design Automation Conference, pages 12–17. ACM, 2010.

[20] Kevin P. Murphy. The Bayes Net Toolbox for MATLAB. Computing Science and Statistics, 33, 2001.

[21] A. Nahir, A. Ziv, and S. Panda. Optimizing Test-Generation to the Execution Platform. In Asia and South Pacific Design Automation Conference, pages 304–309. IEEE, Feb 2012.

[22] R. Ochoa Gallardo, A.J. Hu, A. Ivanov, and M.S. Mirian. Reducing post-silicon coverage monitoring overhead with emulation and Bayesian feature selection. In International Conference on Computer-Aided Design (ICCAD), to appear, Nov 2015.

[23] Judea Pearl. Bayesian networks: A model of self-activated memory for evidential reasoning. Technical Report CSD-850017, UCLA, April 1985.

[24] Graham Schelle, Jamison Collins, Ethan Schuchman, Perry Wang, Xiang Zou, Gautham Chinya, Ralf Plate, Thorsten Mattner, Franz Olbrich, Per Hammarlund, Ronak Singhal, Jim Brayton, Sebastian Steibl, and Hong Wang. Intel Nehalem Processor Core Made FPGA Synthesizable. In FPGA, pages 3–12. ACM/SIGDA, 2010.

[25] P.H. Wang, J.D. Collins, C.T. Weaver, B. Kuttanna, S. Salamian, G.N. Chinya, E. Schuchman, O. Schilling, T. Doil, S. Steibl, et al. Intel Atom Processor Core Made FPGA-Synthesizable. In FPGA, pages 209–218. ACM/SIGDA, 2009.

[26] X. Yang, Y. Chen, E. Eide, and J. Regehr. Finding and understanding bugs in C compilers. In Programming Language Design and Implementation, pages 283–294. ACM/SIGPLAN, 2011.