UBC Theses and Dissertations
Conflict-Driven Symbolic Execution: How to Learn to Get Better. Celina Gomes do Val, 2014.


Conflict-Driven Symbolic Execution
How to Learn to Get Better

by

Celina Gomes do Val

B.Sc., Universidade Federal de Minas Gerais, 2009
M.Sc., Universidade Federal de Minas Gerais, 2011

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in The Faculty of Graduate and Postdoctoral Studies (Computer Science)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

March 2014

© Celina Gomes do Val 2014

Abstract

Due to software complexity, manual and automatic testing are not enough to guarantee the correct behavior of software. One alternative to this limitation is known as Symbolic Execution.

Symbolic Execution is a formal verification method that simulates software execution using symbolic values instead of concrete ones. The execution starts with all input variables unconstrained, and assignments that use any input variable are encoded as logical expressions. Whenever a branch is reached, the symbolic execution engine checks which values the branch condition can assume. If more than one valid evaluation is possible, the execution forks, and a new process is created for each possibility.

In cases where the program execution is finite, symbolic execution is complete, and potentially executes every reachable program path. However, the number of paths is exponential in the number of branches in the program, and this approach suffers from a problem known as path explosion.

This thesis presents a novel algorithm that can dynamically reduce the number of paths explored during symbolic execution in order to prove a given set of properties. The algorithm is capable of learning from conflicts detected while symbolically executing a path.

I have named this algorithm Conflict-Driven Symbolic Execution (CDSE), since it was inspired by the conflict-driven clause learning (CDCL) insights introduced by modern boolean satisfiability solvers.
The proposed algorithm takes advantage of two features responsible for the success of CDCL solvers: conflict analysis and non-chronological backtracking. In a nutshell, CDSE prunes the search space every time a certain branch is proven infeasible by learning the reason why there is a conflict.

In order to assess the proposed algorithm, this thesis presents a proof-of-concept CDSE tool named Kite, and compares its performance to the state-of-the-art symbolic execution tool Klee [10]. The results are encouraging, and present practical evidence that conflict-driven symbolic execution can perform better than regular symbolic execution.

Preface

The work presented in this thesis was primarily done by myself, except for the encoding of the search constraints (Section 3.2.1), which I designed in collaboration with Sam Bayless.

None of the text of this thesis was taken from previously published articles.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
Acknowledgments
Dedication
1 Introduction
  1.1 Motivating Example
  1.2 Text Organization
2 Background
  2.1 Notations and Definitions
  2.2 Symbolic Execution
    2.2.1 Solving Strategies
    2.2.2 Query Optimizations
  2.3 SAT Solvers
    2.3.1 Definitions
    2.3.2 DPLL SAT Solvers
    2.3.3 CDCL SAT Solvers
  2.4 CDCL Applications
3 Conflict-Driven Symbolic Execution
  3.1 Overview
  3.2 Learning Scheme
    3.2.1 Search Constraints
    3.2.2 Conflict Analysis
  3.3 CDSE Algorithm
    3.3.1 Preprocess
    3.3.2 Decide
    3.3.3 Propagate
    3.3.4 Conflict Analysis
    3.3.5 Learn and Backtrack
  3.4 Multiple Properties
  3.5 Loop Handling
4 Kite: A Conflict-Driven Symbolic Execution Tool
  4.1 Kite's Architecture
  4.2 Preprocessor
    4.2.1 CFG Construction
    4.2.2 fCFG Construction
  4.3 Instruction Interpreter
  4.4 Constraint Solver
  4.5 Decision Engine
  4.6 Known Limitations
5 Results
  5.1 Testing Environment
  5.2 Overall Evaluation
  5.3 Evaluation of the Decision Engines
6 Related Work
  6.1 Interpolation-Based Learning Schemes
  6.2 Clause Learning Schemes
7 Conclusion
  7.1 Contributions
  7.2 Future Work
Bibliography
Appendix A: Extended Results

List of Tables

5.1 Table showing the execution time (in seconds) spent by Klee, CBMC and Kite to prove that each test case respects their assertions. Since every tool spent more than 10s to solve the 7 marked instances, they will be classified as hard, in contrast to the other instances, classified as easy, that could be solved in less than 1s by at least one tool.
5.2 Table showing the execution time (in seconds) spent by Klee, CBMC and Kite to find a bug in each test case.
5.3 Total time spent (in seconds) by each tool to solve the benchmarks. Overall, Kite had the best performance, and CBMC had the worst performance.
5.4 Table showing the effect of different decision engines on Kite's execution time (in seconds) to verify the safe instances.
5.5 Table showing the effect of different decision engines on Kite's execution time (in seconds) to find a property violation in the unsafe instances. In the instance unsafe 07, two engines did not finish before the 200s timeout. Thus, the total times reported for these engines are not precise.
A.1 Renaming of Safe Instances.
A.2 Renaming of Unsafe Instances.
A.3 The execution time (in seconds) spent by the two versions of Klee to solve the safe instances. Klee_f corresponds to the forked version, while Klee_l corresponds to the latest one.
A.4 The execution time (in seconds) spent by the two versions of Klee to solve the unsafe instances. Klee_f corresponds to the forked version, while Klee_l corresponds to the latest one.
A.5 Table showing the number of instructions that were symbolically executed by all versions of Klee and Kite to solve the safe instances. In instances safe 05 and safe 06, Kite was capable of proving the property's correctness just by applying the code optimization passes and the slicing algorithm.
A.6 Table showing the number of instructions that were symbolically executed by all versions of Klee and Kite to solve the unsafe instances.

List of Figures

1.1 The execution tree for Listing 1.1, where P represents the property being proved.
2.1 The control flow graph and the execution tree for the program described in Listing 2.1. (a) The CFG represents every basic block in the program, along with all possible transitions between them. (b) The execution tree represents every feasible execution path. It can be obtained by unwinding the CFG, and checking which paths are feasible.
3.1 The control flow graph and the execution tree for the program described in Listing 1.1. (a) In the CFG, each branch b and each basic block B has a unique ID. (b) The execution tree represents the 8 paths in the CFG that would lead the program to check the assertion. The execution tree edges represent only conditional branches; thus, one node may represent the execution of more than one basic block.
3.2 An example of a satisfying assignment to the fCFG that no longer guarantees the existence of a complete path in a CFG due to the existence of loops. (a) Represents a program CFG with two loops. (b) Represents a satisfying assignment to the corresponding fCFG. In this representation, each CFG element's identification is replaced by the value assigned to the corresponding fCFG variable. Any complete path in the given CFG should include b3; however, a loop in the CFG allows a node to succeed its predecessor, and rules 4 and 5 can be satisfied without guaranteeing that b3 must be taken in order to reach BD.
4.1 Kite's main modules, their composition and communication. The preprocessor module provides information to the instruction interpreter and the decision engine. The instruction interpreter symbolically executes the code guided by the decision engine. The interpreter and the constraint solver share information about path condition feasibility.
5.1 Number of instructions symbolically executed by Klee and Kite on safe and unsafe instances. As expected, Kite symbolically executed fewer instructions in the majority of the tests, even in unsafe instances where Klee has better performance.
6.1 The control flow graph for the program described in Listing 1.1, where basic block BH represents the assertion failure.

Acknowledgments

First, I would like to express my gratitude towards my supervisor, Professor Alan J. Hu, for his guidance and his enthusiastic encouragement throughout this work. I am also grateful for all the remarkable advice given by Professor Mark Greenstreet.

I wish to acknowledge financial support from Intel Corporation and from the Semiconductor Research Corporation, which were crucial to this project's development.

Furthermore, I would like to thank all my fellow colleagues in the Integrated Systems Design Laboratory, who always made this journey so delightful.
I am particularly grateful for Sam Bayless's valuable insights.

I cannot overstate how much I appreciate the support I received from my friends and family, especially my parents. Despite the distance, I have always been blessed by their love.

Finally, I would like to thank Ian for his patience, his kindness, and his help.

To my parents

Chapter 1

Introduction

Software complexity is increasing, and current testing techniques are not enough to guarantee correct functionality. Bugs, such as the one responsible for NASA's Mars climate orbiter crash [1], can be very costly [37]. In 2002, Tassey [47] estimated the cost of software bugs to the U.S. economy at $59.5 billion per year.

Moreover, failures in safety-critical software not only have an impact in economic terms, but can also endanger lives. For example, a software bug in the Therac-25 radiation therapy machine controller caused at least five patients' deaths due to excessive X-ray administration in the 1980s [33].

To prevent bugs from reaching the market, verification has become a crucial stage of software development.

Software testing is by far the most used technique for software verification due to its simplicity and scalability. The code is executed with different inputs in order to assure quality and look for bugs. The inputs can be generated randomly, or so as to imitate expected usage scenarios. In addition, coverage-directed techniques can be used to generate inputs that aim at increasing test-case coverage; e.g., these techniques can generate inputs that force the program to execute specific lines of code [10].

Nevertheless, testing is not enough to ensure correct behavior. For instance, a test suite for any program with only one input of 32 bits would require 2^32 different test inputs to be complete.
Assuming one program from this set executes in 10 ms, a complete test suite would take more than a year to finish (2^32 × 10 ms is roughly 497 days).

Complementary to automated testing, static analysis techniques have become part of the software verification process. Any analysis performed without concretely executing the program can be classified as static. For example, modern compilers use static analysis to detect irregular and unsafe constructions before generating the executable code. These detection algorithms trigger compilation warnings or errors whenever they identify code that appears buggy.

Recently, static analysis has also been used to build automatic formal program verification tools. In contrast to previous techniques, formal program verification proves some correctness properties instead of just finding bugs. Two predominant techniques used for the construction of formal program verification tools are Model Checking and Symbolic Execution:

Model Checking was introduced by Clarke and Emerson [12] and by Queille and Sifakis [42], independently. It is a technique for verifying temporal logic properties of a given transition system (including software) by exploring its state space. In its original form, model checking implementations used explicit representations of the state transition graph. However, the number of states grows exponentially with the number of system variables, a problem known as state explosion.

Symbolic state representation was later introduced to ameliorate the state explosion problem [8]. In particular, the employment of model checking spread widely in industry after the introduction of Bounded Model Checking (BMC) [6] and the adoption of satisfiability solvers (SAT solvers). Bounded model checking modifies model checking to encode only a finite sequence of state transitions that reaches certain states of interest as a Boolean formula.
SAT solvers are used to determine whether the formula is satisfiable or not, which proves that the states of interest are reachable or not, respectively.¹ The generated boolean formula is linear in the length of the sequence of state transitions, and determining whether this formula is satisfiable is an NP-complete problem. In spite of the high worst-case complexity, BMC has scaled to larger problems than other approaches, largely because it benefits directly from ongoing advances in SAT solvers.

¹ There exist different model checking techniques, known as Unbounded Model Checking, that can often prove properties of infinite execution sequences. However, they go beyond the scope of this thesis.

Symbolic Execution was introduced by King [28]. It is a method to simulate software execution using symbolic values instead of concrete ones.² While simulating the program with symbolic values, branch statements might have more than one feasible evaluation. When this is the case, the symbolic execution engine must consider every feasible possibility; consequently, it forks. Every time the execution forks, a new constraint derived from the branch condition is added to each simulation process according to the direction taken. These constraints are accumulated during the execution, and they comprise the path condition. Symbolic execution employs constraint solvers to determine the possible values of a branch statement given a certain path condition. In cases where the program execution is finite, symbolic execution is complete, and potentially executes every reachable program path. Symbolic execution tools may also bound program loops to guarantee that the analysis terminates, but like BMC the analysis is then restricted to paths that respect the loop bound.
In both scenarios where the execution is finite, the number of paths is still exponential in the number of branches in the program; consequently, this approach suffers from path explosion.

² In this thesis, symbolic execution refers to the path-by-path exploration with symbolic values as it was first presented by [28]. Such terminology is also used by many other authors [5, 25, 48]. In contrast, symbolic execution has been used as a broader term by [3, 29], to include techniques this thesis classifies as model checking.

Both methods represent state transitions in programs as propositional logic formulas, and use constraint solvers to check their satisfiability.³ Bounded model checking, however, typically builds one single formula to represent the entire program, while symbolic execution builds a different formula for each execution path.

³ Even though this chapter emphasizes the usage of SAT solvers as constraint solvers, other solvers can be used for model checking and symbolic execution, such as Satisfiability Modulo Theories (SMT) solvers. The emphasis on SAT solvers is because most algorithmic advances were first applied to SAT solvers.

This difference gives two advantages to symbolic execution. First, symbolic execution is usually more efficient for finding deep bugs, because its formula size doesn't grow as fast as model checking's to reach a certain depth in the program. Thus, if lucky, the symbolic execution might find a bug at a depth where the model checking formula is too big and complex to be solved in a reasonable time.

The second advantage is that symbolic execution can be used to partially verify software. Whenever symbolic execution fails to prove whether one or more properties hold within a certain time or memory budget, it can report which paths were completely executed. Thus, symbolic execution can guarantee those paths' correctness with respect to the properties tested.
Furthermore, the path condition can be used to generate test inputs or constraints to guide automated tests. In contrast, when a model checker does not reach a conclusion, it doesn't produce any intermediate result.

Symbolic execution is at a disadvantage compared to model checking when it tries to prove that a certain property holds. Not only are symbolic execution decision heuristics not as optimized as SAT solvers', but, more importantly, SAT solvers can learn from the decisions previously made, whereas most symbolic execution engines can't.

Ideally, any symbolic execution engine would also take advantage of its previous decisions in order to reduce the verification effort. After executing a path, the engine can learn facts regarding the path's fragments. Furthermore, these facts can be used to predict whether other paths will satisfy the properties tested.

In this thesis, I present an efficient technique that can reason about symbolically executed paths and prune the solution search space. This technique is based on the learning scheme used in Conflict-Driven Clause Learning (CDCL) SAT (satisfiability problem) solvers, a scheme largely responsible for the success and scalability of SAT solvers.

1.1 Motivating Example

Before presenting the proposed learning technique, this section presents the intuition behind the learning scheme through an example. I illustrate how to learn facts from symbolically executed paths, and how to prune a large number of paths.

 1  int main() {
 2
 3      symbolic int x, y;
 4
 5      if (x < 0)
 6          x = -x;
 7      if (y < 0)
 8          y = -y;
 9
10      int result = x - y;
11
12      if (x < y)
13          result = y - x;
14
15      assert(result >= 0);
16
17  }

Listing 1.1: An example to illustrate that not all paths are needed to prove a property. In this example, the property to be proven is an assertion statement embedded in the code. (In this example, the assertion always holds.)

In Listing 1.1, the property to be verified is encoded as an assertion statement, which should hold for every possible execution. There are 8 distinct execution paths that reach the assertion, as shown by the execution tree in Figure 1.1. Most symbolic execution tools, like Klee [10] and Java PathFinder [48], would simulate all 8 paths before concluding that the given property always holds. However, it is possible to prove that the assertion always holds by executing only 2 paths.

For example, if the first path to be executed is the one where all if statements (x < 0, y < 0 and x < y) are True (the red path), symbolic execution will reach the conclusion that result is always greater than or equal to 0. Such a conclusion can be derived from the constraint x < y and the assignment in line 13 alone; thus, the other constraints and statements are irrelevant to proving the property in this path. Once that fact is learned, every other path that includes the true branch of the "if x < y" statement doesn't have to be executed, because those paths will include the two constraints that are enough to guarantee the property tested.

[Figure 1.1: The execution tree for Listing 1.1, where P represents the property being proved.]

Now, suppose the second path to be executed is the one where the first two if statements (x < 0 and y < 0) are still True, but the last one, x < y, is False (the blue edge). The result definition comes from line 10, and the path condition includes the constraint ¬(x < y), or equivalently x >= y. Together, the definition and the constraint imply that result is always greater than or equal to 0. This new implication is enough to prune every path where the last branch, x < y, is False.

Since every path that reaches the assertion will contain either the constraint x < y and the result = y − x definition, or the constraint x >= y and the result = x − y definition, those two paths are enough to prove the property result >= 0.

1.2 Text Organization

This thesis introduces a technique to systematically learn facts (like the ones in the preceding example) from each symbolically executed path. These facts can be used to predict the behavior of non-executed paths. I also present a proof-of-concept tool named Kite, which is an implementation of the learning technique built on top of the symbolic execution tool Klee [10].

This thesis is organized as follows: Chapter 2 provides background on Symbolic Execution and on the CDCL algorithm. Chapter 3 explains how these two approaches can be combined to build a more efficient software verification tool. Chapter 4 introduces Kite, as well as implementation details and optimizations that make such an approach more scalable. Chapter 5 presents the tests and the results obtained from executing Kite, as well as a comparison with other state-of-the-art verification tools. Chapter 6 presents the recent works that are related to the proposed method. Finally, Chapter 7 discusses this thesis's main contributions, its limitations, and future work.

Chapter 2

Background

This chapter presents the background on two important techniques that form the basis of this work. The first is symbolic execution, a software analysis technique usually used to generate tests, as well as to find run-time errors and assertion failures. The second is Conflict-Driven Clause Learning (CDCL) SAT solving, a powerful constraint solving framework that has been used as an underlying engine for symbolic execution, model checking and many other industrial applications.

Symbolic execution engines and SAT solvers are both search tools. Symbolic execution engines search for an assignment to program variables that will lead the execution along a certain path, while SAT solvers search for an assignment to boolean variables that will make a formula evaluate to true. However, there is one main difference between these search mechanisms that is relevant for this thesis: modern SAT solvers learn from failed search attempts, while symbolic execution tools usually don't, even though, as we will see, they can.

2.1 Notations and Definitions

This section establishes the notation followed throughout this thesis. In order to distinguish SAT instance variables from program variables, this thesis will represent the former with a lower-case v, and the latter with Greek letters. In particular, boolean expressions will be represented with φ; distinct Greek letters denote program variables, program properties, and property violations.

Additionally, this thesis assumes the program under verification follows the imperative or procedural programming paradigm, where the program execution follows a sequence of commands (an execution path) that change the program state. A program state is defined as the collection of program variables and their associated values, together with the control location, which is known as the program counter. Each instruction (command or program statement) modifies either program variable values or the control location, i.e., each instruction is either an assignment or a branch (conditional or unconditional). For simplicity, sequences of consecutive instructions with a single entry and a single exit point, and no branches, are grouped as basic blocks.

Without loss of generality,⁴ this thesis considers programs that contain one method only (all method calls are inlined), and all conditional branches are represented by if-statements with the following format:

    if φ then [goto True BB] else [goto False BB]

where if the given condition φ is true, the next basic block (BB) to be executed is the True BB; otherwise, the next one is the False BB. The else statement is optional, and when omitted, the False BB is equivalent to the basic block that follows the if statement.

⁴ The structured program theorem [7] states that any program can be described using three transition techniques: sequence, selection, and repetition (or loop).

The flow of control within a program can be captured as a directed graph, known as the control flow graph (CFG). The control flow graph represents the sequence of commands and the possible transitions between them. Thus, in a program's CFG, the nodes represent basic blocks and the edges indicate possible control flow transitions between basic blocks. Every branch instruction destination, associated with the instruction location, originates an edge in the control flow graph. In this thesis, basic block identifiers start with a B, and branch identifiers start with a b. Consequently, the same notation is used to represent the CFG nodes and edges, B and b respectively.

1 int main(int argc, char** argv) {
2     symbolic int a, b;
3     int ret = a + b;
4     if (argc > 1)
5         ret = ret / 2;
6     return ret;
7 }

Listing 2.1: A small example in C to illustrate how basic blocks are defined. The program has 3 basic blocks: the entry basic block BA goes from line 1 to 4, while BB contains only the statement in line 5, and BC is composed of the return statement in line 6.

Consider the small example in Listing 2.1, and its CFG depicted in Figure 2.1(a). The CFG has 3 nodes, representing basic blocks BA, BB and BC, and 3 edges, representing branches b0 = (BA, BB), b1 = (BA, BC) and b2 = (BB, BC). The if-statement is a conditional branch statement that connects BA to either BB or BC.
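To make the graph structure concrete, here is a minimal sketch of the CFG of Listing 2.1 in Python, with a naive unwinding that enumerates the possible entry-to-exit paths. This is illustrative only: the class and method names are my own, not data structures from Kite or Klee, and the unwinding assumes a loop-free graph.

```python
# A minimal CFG sketch (hypothetical names, not Kite's actual data structures).
# Nodes are basic blocks; directed edges are branches, as in Listing 2.1.

class CFG:
    def __init__(self):
        self.successors = {}  # basic block -> list of successor basic blocks

    def add_edge(self, src, dst):
        self.successors.setdefault(src, []).append(dst)
        self.successors.setdefault(dst, [])

    def paths(self, entry, exit):
        """Enumerate every possible (not necessarily feasible) path from
        entry to exit by unwinding the graph (assumes no loops)."""
        if entry == exit:
            return [[exit]]
        result = []
        for nxt in self.successors[entry]:
            for tail in self.paths(nxt, exit):
                result.append([entry] + tail)
        return result

# The CFG of Listing 2.1: b0 = (BA, BB), b1 = (BA, BC), b2 = (BB, BC).
cfg = CFG()
cfg.add_edge("BA", "BB")  # b0: taken when argc > 1
cfg.add_edge("BA", "BC")  # b1: taken when argc <= 1
cfg.add_edge("BB", "BC")  # b2: the implicit fall-through

print(cfg.paths("BA", "BC"))  # -> [['BA', 'BB', 'BC'], ['BA', 'BC']]
```

Note that the enumeration yields possible paths only; deciding which of them are feasible requires reasoning about the branch conditions, which is exactly the job of the symbolic execution engine described next.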
The transition from BB to BC is implicit,because the unconditional branch statements at the end of a basic block canbe omitted in some languages like C.Although every feasible execution path in a program is represented inthe CFG, not every path represented in the CFG is necessarily feasible. Apath is only considered feasible if there exists at least one set of input valuesthat leads the program execution through the path?s command sequence. Ifno such set of values exists, the path is considered infeasible. This thesiswill use the the term possible execution paths for those that may or maynot be feasible (i.e., it hasn?t been proven yet whether the path is feasibleor not).The set of all feasible execution paths in a program can be representedas a tree, known as the execution tree (see example shown in Figure 2.1(b)).Thus, the subset of paths represented in the execution tree can be obtainedby unwinding the CFG, and checking their feasibility. Due to loops in aprogram, and in the CFG, an execution tree can be infinite.102.2. Symbolic ExecutionBABB b1BC b2 b3(a) The control flow graph.BABB b1BC b2BC b3(b) The execution tree.Figure 2.1: The control flow graph and the execution tree for the programdescribed in Listing 2.1. (a) The CFG represents every basic block in theprogram, along with all possible transitions between them. (b) The execu-tion tree represents every feasible execution path. It can be obtained byunwinding the CFG, and checking which paths are feasible.2.2 Symbolic ExecutionIn a nutshell, symbolic execution is a program analysis method that simu-lates program execution with symbolic values instead of concrete ones. Thesymbolic execution engine treats each input variable as a symbolic value;hence, the engine can analyze the program behavior under every combi-nation of input values. The usage of symbolic values provides a safe andsound method to find feasible execution paths, as well as to establish underwhich conditions they are executed. 
If the program doesn?t have an infiniteexecution tree, then symbolic execution is also a complete method.In order to handle symbolic values, the program?s execution semanticsmust be extended. First, variables assignments are extended, and a variablecan either have a concrete value or a symbolic expression. The symbolicexpression is defined as an expression that represents a function over thesymbolic variables. The symbolic execution engine produces a symbolic ex-112.2. Symbolic Executionpression while evaluating a program?s data manipulation instruction. Theevaluation expands each variable used and translates the instruction opera-tion into an arithmetic operator.The semantics of if-statement instructions is also extended, and theif-condition might have more than one possible evaluation in the currentexecution, i.e., the if-condition may be true, false or undefined. In thelast case, the engine explores both possibilities separately. To do so, theexecution forks into two distinct executions, named processes: one thatfollows the true-branch, and another one that follows the false-branch.Each process assumes that the if-condition is either true or false accordingto the direction taken.The if-statement evaluation depends on the previous assumptions madeby the current process. Therefore, the program state is extended in order tomaintain the set of conditions that have been assumed true during a processexecution. This set is known as the path condition, or just pc5.The symbolic execution engine evaluates an if-condition under the pathcondition in order to establish which branches are feasible. Given a condition? of an if-statement and the current pc, the engine checks whether one ofthe branch conditions ? or ?? is true under the constraints in pc. In otherwords, the symbolic execution checks the following expressions:pc =? ? (2.1)pc =? ?? (2.2)Assuming pc is not strictly false, at most one expression is true. 
If one expression is valid, the execution follows the respective branch. In this case, it is optional whether the engine adds the branch condition to pc. However, if both expressions are satisfiable, both branches are feasible. As mentioned before, the execution forks, and the child processes inherit the program state from their parent. The engine adds the branch condition to each process's pc according to the direction taken.

Given these new semantics, the symbolic execution of a program runs as follows:

1. Initialization: The symbolic execution starts with a single process from the program's entry block, as in any concrete execution. Each input variable is considered symbolic, and the path condition is initialized as true.

2. Sequential execution: Next, the engine executes each instruction sequentially until it finds an if-instruction (step 3), or until it finds an exit instruction (step 6).

3. If-statement execution: The if-condition is evaluated under the path condition. If only one branch is feasible, the engine follows that branch, and continues the sequential execution described in step 2. If both branches are feasible, the execution must fork.

4. Forking: A process forks by generating two child processes, which inherit their parent's program state. The execution engine sets each child to follow one specific branch, and adds the corresponding branch condition to pc. Both children are added to a worklist, and the engine selects a process to follow (step 5).

5. Process selection: Conceptually, the engine can pick any process from the worklist to follow. Once the engine chooses a process, the engine restarts the process execution from the instruction pointed to by the process program counter (step 2).

6.
Process termination: Whenever an exit instruction is reached, the engine terminates the current process. The engine uses the terminated process's pc to generate a concrete test input vector that leads the program execution under the completed path. After terminating a process, the engine checks whether there are any processes left in the worklist. If there are not, the symbolic execution is done. Otherwise, the engine goes back to step 5.

(This step division is not standard, but didactic.)

Thus, the symbolic execution engine keeps executing each forked process until one of the following cases:

- The engine executes every forked process until the end. In this case, it completely executed every feasible path in the program.

- The engine picks a process that never ends, i.e., the engine follows an infinite execution path, and it never terminates.

- The execution produces an infinite number of processes; thus, it never terminates.

The set of paths executed during the symbolic execution can be represented as a symbolic execution tree. In cases where the tree is finite, it also represents the program execution tree, and it represents every feasible path in the program. Moreover, every node in the symbolic execution tree represents a forking if-statement execution, and each leaf represents a process termination. The pc associated with any leaf is unique, and each concrete input generated in step 6 satisfies one and only one leaf pc. Additionally, there is at least one concrete input for each feasible execution path.

The description given in this section is a high-level overview of the symbolic execution algorithm introduced by King [28] in 1976, and which is still the base of most symbolic execution tools available today such as Klee [10], and Java PathFinder [48].
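The worklist loop described in steps 1 through 6 can be sketched in a few lines of Python. This is a toy illustration, not a real engine: the program encoding, the function names, and the brute-force `feasible()` stand-in for a constraint solver are all assumptions made for this sketch.

```python
from collections import deque

# Toy "program": branches fork on a condition over one symbolic input x,
# and "exit" terminates a process. (Hypothetical encoding, illustration only.)
PROGRAM = [
    ("branch", lambda x: x > 0),   # if (x > 0) ...
    ("branch", lambda x: x < 10),  # if (x < 10) ...
    ("exit",),
]

def feasible(pc, domain=range(-50, 50)):
    """Stand-in for a constraint solver: brute-force over a small domain."""
    return any(all(c(x) for c in pc) for x in domain)

def symbolic_execute(program):
    # Step 1: a single initial process; the path condition starts as true
    # (an empty constraint set).
    worklist = deque([(0, [])])      # (program counter, path condition)
    completed = []                   # pc of each terminated process
    while worklist:
        ip, pc = worklist.popleft()  # Step 5: process selection (FIFO here)
        while True:                  # Step 2: sequential execution
            instr = program[ip]
            if instr[0] == "exit":   # Step 6: process termination
                completed.append(pc)
                break
            cond = instr[1]
            taken = pc + [cond]
            other = pc + [lambda x, c=cond: not c(x)]
            if feasible(taken) and feasible(other):
                # Step 4: fork; both children inherit the parent's state.
                worklist.append((ip + 1, taken))
                worklist.append((ip + 1, other))
                break
            # Step 3: only one branch is feasible; follow it.
            pc = taken if feasible(taken) else other
            ip += 1
    return completed

paths = symbolic_execute(PROGRAM)
print(len(paths))  # prints 3: the path with x <= 0 and x >= 10 is infeasible
```

Note that the infeasible combination (x <= 0 with x >= 10) is never forked, so only three of the four syntactic paths are ever executed, matching the discussion of feasible versus possible paths above.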
State-of-the-art symbolic execution tools implement additional optimizations that further improve symbolic execution scalability.

2.2.1 Solving Strategies

Symbolic execution is mainly used today for two purposes: to generate test cases, and as a bug finding tool. In these cases, symbolic execution can be used as a search algorithm that seeks feasible paths that lead the execution to some points of interest in the program. In other words, symbolic execution traverses the program's CFG, while trying to find a feasible path that reaches a certain node. These nodes can represent program exit points, assertion failures or just uncovered program lines.

King's original description of symbolic execution does not define any systematic search strategy to explore the execution tree automatically [28]. King also introduced an interactive symbolic execution tool, named Effigy [27], where the user is responsible for choosing the process the tool will follow at step 5.

Since symbolic execution's introduction, many search heuristics have been introduced in order to choose which path the engine should follow, and to systematically explore the execution tree. The main search strategies implemented by symbolic execution software are described below:

Depth-first: In this strategy, the engine implements the worklist as a stack, always visiting the process that has been most recently created. After forking, the engine picks one of the process's children, and pushes the other one to the top of the worklist; thus, it always follows one complete execution path, before switching to a different path. Once a process terminates, the engine picks the process generated in the most recent fork.

This strategy is implemented by many tools [10, 22, 48] due to its simplicity. It also allows for easy integration with other approaches that may enumerate the paths to be checked, such as concolic execution [22], and model checking [48].
However, in cases where the execution tree is unbalanced, the search can spend a lot of effort in one small portion of the tree, or worse, get stuck in an infinite path.

Breadth-first: A breadth-first search strategy is used to avoid process starvation [48], and it gives equal opportunities for the processes to run. The worklist is processed in FIFO order, always selecting the oldest process.

The main drawback of this approach is that if the point of interest can only be reached deep in the execution tree, then the process may take a very long time to find it.

Uniform Random: In this approach, the engine randomly chooses which process to follow, with a probability determined by the process's depth in the execution tree [9, 10]. The probability of each process being picked is equal to 2^(-L), where L represents the fork level on which the process was generated. Thus, this strategy favors processes that are shallow in the execution tree. As with any random search, each execution explores a different part of the execution tree, and results are hard to reproduce.

Coverage-Optimized: Many different greedy approaches have been proposed to select a process that is likely to cover new code [9, 10, 30]. These techniques usually compute a weight for each state based on the minimum distance to an uncovered instruction, and the recently covered new code. Then, the engine chooses the process randomly based on each state's weight.

This approach adds some overhead to the process selection step; nevertheless, it is usually one of the most efficient strategies to reach uncovered program statements.

Modern symbolic execution tools usually apply one or more of the strategies described above.
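As an illustration of the uniform random strategy above, the following sketch weights each process by 2^(-L), where L is its fork level. The worklist of (name, level) pairs and the helper name are invented for this illustration.

```python
import random

def pick_uniform_random(worklist, rng):
    """Select a process with probability proportional to 2**(-fork_level),
    favoring processes that are shallow in the execution tree."""
    weights = [2.0 ** -level for _, level in worklist]
    return rng.choices(worklist, weights=weights, k=1)[0]

# Three processes created at fork levels 1, 2 and 5: the level-1 process is
# 16 times more likely to be chosen than the level-5 one.
worklist = [("p1", 1), ("p2", 2), ("p3", 5)]
rng = random.Random(0)                      # fixed seed for reproducibility
counts = {name: 0 for name, _ in worklist}
for _ in range(10_000):
    name, _ = pick_uniform_random(worklist, rng)
    counts[name] += 1
print(counts["p1"] > counts["p2"] > counts["p3"])  # prints True
```

Sampling many selections makes the bias visible: the shallow process dominates, which is exactly the behavior the strategy is designed to produce.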
Some tools support a combined strategy, where more than one search strategy is interleaved using a round-robin scheduling technique.

2.2.2 Query Optimizations

In order to evaluate an if-statement, the symbolic execution engine employs some sort of constraint solver to check the branch conditions under the path constraint. Conceptually, the symbolic execution engine invokes a constraint solver every time a process execution reaches an if-statement, and this can impact the engine's performance. Because query complexity is exponential in the number of variables included in the formula, the number of queries, and the size of each query, have a major impact on the symbolic execution engine's performance. For this reason many optimizations have been introduced in order to reduce query complexity, or to attempt to solve queries without invoking the constraint solver.

Constant Propagation: This optimization affects the instructions' semantics. When an instruction is executed, a simplification stage follows the expression generation. In this simplification, any arithmetic operation that only involves concrete values is replaced by the operation's result, which is also constant. Thus, instead of propagating formulas that involve only concrete values, this method propagates the results of those operations.

Constant propagation can be applied both statically, as a compiler pass, and dynamically in each individual execution path, according to the values assigned to the program variables. Constant propagation may completely eliminate queries that reference only variables with concrete values. Failing that, this operation may still reduce the size and complexity of queries.

Concrete Execution: In this strategy (also known as concolic execution), symbolic execution can be interleaved with concrete executions [22, 31, 44].
The concrete executions help symbolic execution by determining the feasibility of entire execution paths, as well as the feasibility of any subset of each path constraint. Each concrete execution is monitored to record the command sequence followed.

Implied Value Concretization: Some assumptions made along an execution path may imply variable assignments. For example, if a process takes a branch that has the condition αi = 0, the engine can assign 0 to αi, as well as replace any occurrence of αi in the path condition by the constant 0. After that, it can apply constant propagation to try to simplify the path condition even more.

Constraint Independence: Instead of testing whether a branch condition φ is feasible under the complete path condition, the engine can try to simplify the query by identifying which constraints in pc are independent from φ. Two constraints φi and φj are classified as independent if they don't share any variables, and if there is no other φm that is dependent on both φi and φj.

This optimization removes constraints that are completely disjoint from the branch condition being checked, and reduces the formula complexity without affecting its satisfiability.

Query Cache: This last optimization is also used to reduce the number of calls to the constraint solver. Query caches store the results of previous queries, to avoid making repeated (expensive) calls to the solver for the same conditions. When combined with constraint independence, queries might represent a sub-path condition; therefore, the cache can eliminate queries from different paths that include the same sub-path condition.

Besides the optimizations described in this section, researchers have also focused on the extension of symbolic execution to support different language features, such as: pointers, heap modeling, environment modeling and concurrency. These optimizations are out of the scope of this thesis.
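Among the query optimizations above, constraint independence is easy to illustrate. The sketch below keeps only the constraints that transitively share variables with the branch condition; the (text, variable-set) constraint encoding and the helper name are invented for this illustration.

```python
def relevant_constraints(pc, branch_vars):
    """Return the subset of pc that transitively shares variables with the
    branch condition. Each constraint is modeled as (text, set_of_variables)."""
    relevant, frontier = [], set(branch_vars)
    pending = list(pc)
    changed = True
    while changed:
        changed = False
        for c in pending[:]:
            text, vars_ = c
            if vars_ & frontier:       # shares a variable: dependent
                relevant.append(c)
                frontier |= vars_      # dependence is transitive
                pending.remove(c)
                changed = True
    return relevant

pc = [("x < y",     {"x", "y"}),
      ("y + z > 0", {"y", "z"}),       # dependent on "x < y" through y
      ("a == 3",    {"a"})]            # disjoint from the branch condition
# Branch condition over {x}: "a == 3" is independent and can be dropped.
print([text for text, _ in relevant_constraints(pc, {"x"})])
```

Dropping the independent constraint shrinks the query without affecting its satisfiability, which is the point of the optimization.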
This thesis presents an approach that is orthogonal to these advances, and it should benefit from them without any conceptual change.

Existing search techniques that employ some kind of learning to prune the number of paths executed during symbolic execution will be discussed in more detail in Chapter 6. Chapter 6 also contrasts these learning techniques with the one presented in this thesis.

2.3 SAT Solvers

The problem of deciding the satisfiability of propositional formulas, known as Boolean satisfiability or just SAT, is one of the most important open problems in complexity theory. SAT was the first problem proven to be NP-complete, and there are no known algorithms for solving it with better than exponential worst case complexity. Despite this, modern SAT solvers are often highly effective in practice. State-of-the-art solvers are capable of solving industrial problems with over a million variables and several million constraints.

While work on SAT goes back to the 1960s, the main advances in this area started in the late-90s with the introduction of Conflict-Driven Clause Learning (CDCL) SAT solvers. Since then, SAT solvers have been producing remarkable results as constraint solvers in areas such as software and hardware verification.

2.3.1 Definitions

A propositional formula is a logic expression defined over a set of n Boolean variables, {v1, v2, ..., vn}, and that contains only three logic operations: and (∧), or (∨) and not (¬). Modern SAT solvers usually assume that the propositional formula is represented in a specific form: Conjunctive Normal Form (CNF).

A CNF formula consists of a conjunction of m clauses, {c1, c2, ..., cm}, where each clause is a disjunction of one or more literals. A literal represents the occurrence of a variable vi or its complement ¬vi. Formula 2.3 is an example of a CNF formula with 4 clauses and 3 variables:

(v1) ∧ (¬v2 ∨ ¬v3) ∧ (v1 ∨ v3) ∧ (¬v1 ∨ ¬v2 ∨ v3) (2.3)

From now on, this thesis assumes that any propositional formula φ
is represented in CNF, and CNF formulas will simply be called formulas.

Each variable in φ is a Boolean variable, and can either be assigned to true (1) or false (0). Assigning a variable to true (resp. false) is equivalent to assigning all positive occurrences of its literals to true, and all negative occurrences to false (resp. true), and this thesis will make assignments to literals or variables interchangeably. When a variable hasn't been assigned any value, it is a free variable. A set A of the variables in φ together with their corresponding assigned value is a truth assignment, or just assignment, to φ. An assignment A is complete if it assigns a value to every variable in the given formula (|A| = n); if |A| < n, then A is a partial assignment.

A formula φ evaluated under a complete assignment A is either satisfied (true) or unsatisfied (false) by A. Additionally, if φ is evaluated under a partial assignment, φ can also be undefined, i.e., the evaluation cannot conclude the value of φ from the given assignment.

The satisfiability problem can be defined as the problem of finding whether there exists at least one complete assignment that satisfies φ. For a formula with n variables, there are 3^n possible partial assignments, of which 2^n are complete assignments.

2.3.2 DPLL SAT Solvers

The first efficient search procedure applied to SAT was introduced by Davis, Logemann and Loveland [14], and it was based on the resolution decision procedure presented by Davis and Putnam [15]. This search algorithm, known as DPLL, is the base of most modern complete SAT solvers, including CDCL solvers.

DPLL solvers take advantage of some useful properties of CNF formulas, which emerge from the fact that each clause in a formula φ can be evaluated separately under an assignment A. If any literal in a clause is assigned to true, the entire clause is assigned to true. If every literal in a clause is assigned to false, the clause is false, and so is φ. Otherwise, a clause is undefined under A.
Thus, for a formula to be unsatisfied, it only requires one unsatisfied clause.

Another useful property emerges from clauses that have one literal only, such as the clause (v1) in formula 2.3. These clauses are known as unit clauses, and any unit clause implies that its unique literal must be assigned to true in order for the formula to be evaluated to true.

Besides that, if a literal l occurs in a formula, and its negation doesn't, the literal is classified as a pure literal. In this case, this literal can be safely assigned to true. This is due to the fact that a satisfying assignment A either assigns l to true, or A can be modified by assigning l to true, and the resulting assignment will also satisfy φ.

DPLL traverses the search space in a top-down approach. It starts from an empty assignment, and it iteratively assigns values to free variables, and checks whether the new assignment is a satisfying or an unsatisfying (partial) assignment.

At each iteration, the algorithm tries to find a variable assignment that is implied by the detection of unit clauses (known as unit propagation), or of pure literals. If no value is implied, the solver arbitrarily chooses a free variable vi, and assigns either vi or ¬vi to true. An arbitrary assignment is called a decision, and splits the search space in two. A DPLL solver maintains the set of decisions previously made as a decision tree, and a decision level is associated with every arbitrary assignment to denote its depth in the decision tree.

After any variable assignment, the algorithm creates a new formula φ′ by modifying φ to:

- Remove the occurrences of the literal assigned to false.

- Remove the clauses with the literal assigned to true.

If φ′ is empty, then the current assignment is a satisfying one.
If φ′ has an empty clause, then the current assignment is an unsatisfying one. Otherwise, the original formula is undefined under the current assignment, and the search continues by picking another variable assignment.

The occurrence of an empty clause is known as a conflict, and the algorithm backtracks. In DPLL, the search always backtracks to the previous decision level by reverting the formula φ′. The search then checks whether it has tried both branches of the decision tree. If it hasn't, the search assigns the opposite value to the decision variable. If the search has tried both values, it has to backtrack one more level.

The DPLL search algorithm terminates when it finds a satisfying assignment, or when it backtracks past level 1 (there is no satisfying assignment).

Algorithm 2.1 CDCL framework implemented by modern SAT solvers. Given a CNF formula, CDCL is a search procedure that tries to find a satisfying assignment. The CDCL algorithm traverses the decision tree applying conflict analysis, clause learning, and non-chronological backtracking to reduce the search space.

Input: Set of clauses.
Output: SAT if there exists an assignment that satisfies all the clauses simultaneously, UNSAT otherwise.

function CDCL()
    status ← preprocess()
    if status ≠ UNKNOWN then
        return status
    loop
        l ← decide()
        if l ≠ ⊥ then
            status ← propagate(l)
            if status = CONFLICT then
                if level = 0 then
                    return UNSAT
                else
                    level, cause ← analyze_conflict()
                    learn(cause)
                    backtrack(level)
        else
            return SAT

2.3.3 CDCL SAT Solvers

The DPLL algorithm was proposed in 1962, and it remained the framework of state-of-the-art complete solvers until the introduction of Conflict-Driven Clause Learning (CDCL) SAT solvers. In 1996 [45], Silva and Sakallah presented a new SAT solver named GRASP, based on DPLL solvers. Unlike other DPLL solvers, GRASP was capable of analyzing and learning from conflicts reached during the search.
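The formula reduction and unit propagation at the heart of both DPLL and CDCL solvers can be sketched as follows, using Formula 2.3 and DIMACS-style integer literals. This is a minimal illustration, not a full solver, and the helper names are hypothetical.

```python
def reduce_formula(clauses, lit):
    """Apply one assignment: drop clauses containing lit, and strip -lit from
    the rest. Literals are nonzero ints, DIMACS-style (-v negates v)."""
    reduced = []
    for clause in clauses:
        if lit in clause:
            continue                  # clause satisfied: remove it
        rest = [l for l in clause if l != -lit]
        if not rest:
            return None               # empty clause: conflict
        reduced.append(rest)
    return reduced

def unit_propagate(clauses):
    """Repeatedly assign the literals forced by unit clauses."""
    implied = []
    while clauses is not None:
        units = [c[0] for c in clauses if len(c) == 1]
        if not units:
            break
        implied.append(units[0])
        clauses = reduce_formula(clauses, units[0])
    return clauses, implied

# Formula 2.3: (v1) AND (-v2 OR -v3) AND (v1 OR v3) AND (-v1 OR -v2 OR v3)
phi = [[1], [-2, -3], [1, 3], [-1, -2, 3]]
clauses, implied = unit_propagate(phi)
print(implied, clauses)  # prints [1] [[-2, -3], [-2, 3]]
```

The unit clause (v1) forces v1 to true, two clauses become satisfied and disappear, and the remaining clauses shrink; a decision would then be needed, exactly as in the search described above.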
CDCL solvers are named after GRASP's conflict-driven clause learning procedure.

Algorithm 2.1 presents pseudo-code for the CDCL framework. Like DPLL, CDCL starts from an empty assignment, and the solver implements an iterative search. However, the search is modified to support conflict analysis, clause learning, and non-chronological backtracking.

Preprocess: In the beginning of the algorithm, a CDCL solver may perform some preprocessing to find out whether the formula's satisfiability can be trivially determined, or if any variable value can be implied.

Decide: Like DPLL solvers, if the solver can't deduce any free variable assignment, the solver picks a literal to assign to true. A CDCL solver also records the level of each decision.

Propagate: Once a decision is made, CDCL reduces the formula being assessed by removing any clauses containing the selected literal, and removing from each remaining clause any occurrences of the complement of the decision literal. In practice, the solver uses efficient data-structures to manage these operations. During propagation, if the solver identifies a unit clause, its unique literal is assigned to true, and the new assignment is propagated; if it detects an empty clause, then it raises a conflict.

Analyze Conflict: If the search reaches a conflict, then the solver analyzes the set of decisions, and the assignments implied by them. To do so the solver either maintains and analyzes an implication graph, or implements a sequence of selective resolution operations. More details about how this operation is performed can be found in [45, 50].

Learn: Once the solver finds a subset of the assigned literals that together are sufficient to cause a conflict, the solver generates a new clause containing the negation of these literals.
By adding this learned clause into the clause database, the solver will be prevented from returning to the same search space in the future.

Backtrack: The last step after a conflict is detected is a non-chronological backtracking. The solver has to backtrack to a level that reverts at least one conflicting assignment. If an earlier decision is responsible for the conflict, the search can backtrack more than one level.

The search ends if it successfully assigned a value to each variable in the formula without raising a conflict, i.e., it found a satisfying assignment. Otherwise, if the formula is unsatisfiable, the solver terminates when a conflict is raised at level 0.

Efficient CDCL solvers also include many other implementation improvements introduced after GRASP. One example is the new unit propagation detection algorithm introduced by Moskewicz et al. [36], which reduced the solving time of hard instances by up to two orders of magnitude. Their approach does not investigate every clause at each iteration while searching for new unit clauses. Instead, their approach picks two literals in every clause that have not yet been assigned false, and watches them. After assigning a new value to a variable, only the clauses watched by the corresponding false literal need to be visited (during which either a new watching literal will be found for each clause, or unit propagation will be triggered, or a conflict will be raised). This BCP algorithm is known as the 2-literal watching scheme. Other improvements include restarts [24], sophisticated decision heuristics [23, 36], and learned clause deletion [23].

In the last 20 years, SAT solver advances have not been restricted to performance improvements. New features have been added to SAT solvers to improve their usability in different applications.
The following ones are the most relevant to this thesis:

Unsatisfiability Core: In the process of generating an unsatisfiability proof for a formula φ, many modern SAT solvers can identify a sub-formula φ′, φ′ ⊆ φ, that is also unsatisfiable, known as an unsatisfiability core. This core can be very helpful in diagnosing the causes of a formula's infeasibility.

In practice, the goal is to generate a small unsatisfiability core, known as a minimal unsatisfiability core. However, this is a hard problem, harder than proving that a formula is unsatisfiable, and there exist many different strategies that attempt to efficiently identify small and minimal cores (for more details see [38, 51]).

Incremental Solvers: An incremental SAT solver is a solver capable of reusing its solving state after solving a formula φi to successively solve similar formulas. The variable activity used in decision heuristics, deletion / restart strategies and others can be safely reused independently of the formulas. However, learned clauses can only be reused if no clause of φi is deleted. A solver can keep the learned clauses to solve φj after solving φi if:

φi ⊆ φj

Because φj contains every clause in φi, this technique can be used to solve problems incrementally. This process is much faster than solving each formula independently.

Assumption Mechanism: Eén and Sörensson [19, 20] observed that it is safe to keep the clauses learned even if some clauses are deleted. However, this only applies if all the removed clauses are satisfied at the top level (such as unit clauses). Given this observation, the authors introduced a modification in the preprocessing stage that allows a solver to check a formula's satisfiability under initial assumptions (or just assumptions). An assumption is a unit clause that represents a variable assignment that should be propagated before a formula is solved.
This assumption mechanism allows the solver to simulate the effect of removing clauses from a formula.

For example, a user can invoke an incremental solver successively to check φi and φj, even if φi ⊈ φj. For that, he/she adds a new variable vk to both formulas, and adds the literal vk to every clause that is in φi but not in φj. Then, the user invokes the solver to check whether the modified φ′i under the assumption ¬vk is satisfiable. During preprocessing, the solver propagates the assumption, which removes every occurrence of the literal vk, and φ′i is reduced to φi.

After solving φi, the user can add the clauses that are exclusive to φj, and invoke the same solver; however, the user passes vk as an assumption. The solver propagates this new assumption, which trivially satisfies all clauses that contain vk. Thus, the solver can remove them from the search.

The assumption mechanism also provides a powerful and lightweight mechanism to detect the unsatisfiability reason, or even the unsatisfiability core. Because each assumption is treated as a unit clause, the solver will still represent it in the implication graph. Once the solver proves that a formula is unsatisfiable, the solver can analyze the conflict, and check if the final conflict is a consequence of any of the assumptions given. If there is no assumption involved, the solver can conclude that the formula is unsatisfiable. On the other hand, if there are one or more assumptions involved in the final conflict, the solver concludes that the formula is unsatisfiable under the assumptions given. Additionally, the solver provides a subset of the assumptions that is sufficient to make the formula unsatisfiable [18].

2.4 CDCL Applications

The unquestionable efficiency of CDCL SAT solvers has inspired many researchers to apply the same reasoning behind CDCL to solve problems in different domains.
For example, Satisfiability Modulo Theories (SMT) solvers employ SAT solvers to enumerate abstract solutions, and a theory solver, which is able to handle atomic constraints in some decidable first-order theory, to test and refine each solution. The efficiency of modern SMT solvers comes from the integration of the CDCL SAT solver and the theory solver. The SAT solver consults the theory solver after every decision, and the theory solver may return a new deduced assignment, as well as a conflict clause, if a conflict is detected. This tight integration, known as lazy SMT (in contrast to eager SMT, which encodes the SMT formula into an equivalently satisfiable Boolean formula before invoking a SAT solver), is responsible for a great advance in SMT performance, and it has a direct impact on symbolic execution, since most symbolic execution engines employ some kind of SMT solver as their constraint solver [43].

Another example is the Conflict Driven Fixed Point Learning (CDFL) technique [17], which instantiates a CDCL architecture over abstract domains. This technique was successfully applied for bounds analysis, and it was mathematically generalized to lattice-based abstractions [16].

In the software verification context, the insights of modern SMT solvers, and consequently CDCL solvers, have inspired the introduction of a program analysis named Satisfiability Modulo Path Programs [25]. Chapter 6 describes this analysis technique at length, and compares it to the method introduced in this thesis.

Chapter 3

Conflict-Driven Symbolic Execution

Symbolic execution has been highly optimized to generate lightweight queries to the constraint solver, which allows tools to successfully explore many execution paths even in large programs. Nevertheless, symbolic execution tools still try to exhaustively execute every feasible path in order to prove/disprove certain properties.
This strategy resembles DPLL SAT solvers; thus, symbolic execution might benefit from CDCL techniques, in the same way that SAT solvers did.

Therefore, this chapter presents a novel symbolic execution algorithm that avoids executing every execution path, named Conflict-Driven Symbolic Execution (CDSE). In a nutshell, CDSE prunes the search space every time a certain branch is proven infeasible, by learning the reason why there is a conflict.

The initial CDSE description considers programs that have no loops and only one property. Section 3.4 describes how CDSE can handle multiple properties, while Section 3.5 presents different CDSE solutions to verify programs with loops.

In addition, this thesis assumes that a program property ψ is encoded as a boolean expression φψ. The property ψ is embedded in the source code as an assert statement:

assert(φψ)

which is semantically equivalent to the following if-statement:

if ¬φψ then abort()

The branch that represents the transition to the abort() statement, i.e. a property violation, will be represented with the marker ε.

3.1 Overview

Like any symbolic execution algorithm, CDSE is a search algorithm. In CDSE, each program branch statement and a possible direction represents a decision. From now on, this pair will be called a "branch", inspired by the program's control flow graph (CFG) representation. Thus, CDSE explores the program's search space by choosing which branches should be executed in sequence.

In order to reduce the search space, the CDSE search engine prunes paths by employing the same insights from CDCL solvers: if a certain subset of decisions raises a conflict, this conflict can be used to backtrack non-chronologically, and to preempt the occurrence of similar conflicts.

The engine uses the control flow graph (CFG) and facts learned from previous attempts to guide its future decisions.
The engine iteratively chooses which branch to take, and then evaluates this decision's consequences before making a new decision.

For example, going back to Listing 1.1 at page 5, conflict-driven symbolic execution exploits the same intuition presented in Section 1.1 to prove that the given assertion always holds by executing only 2 out of 8 execution paths. In addition, it extracts information from the program's CFG to optimize the learning process, as well as the decision procedure.

In order to prove that the assertion always holds, the CDSE engine searches for a feasible path between the entry node and the assertion failure BB (the True BB of the translated if-statement). First, the engine extracts from the program's CFG (shown in Figure 3.1(a)) which branches could never be included in any path that connects the entry node (BA) and the assertion True BB (BH). As a result, the engine learns branch b11 would never be included; therefore, it should never be taken.

After that, the engine symbolically executes the program by alternately choosing which branch to follow, and checking whether it is feasible or not.

Figure 3.1: The control flow graph and the execution tree for the program described in Listing 1.1. (a) In the CFG, each branch b and each basic block B has a unique ID. (b) The execution tree represents the 8 paths in the CFG that would lead the program to check the assertion. The execution tree edges represent only conditional branches; thus, one node may represent the execution of more than one basic block.

Following the description in Section 1.1, in order for the CDSE engine to execute the path marked in red in Figure 3.1(b), the engine must choose to follow the branches b1, b3, b4, b6, b7, b9, and b10. When the engine reaches b10, and checks if its condition is feasible, the engine finds a conflict.

The branch b10 condition is not feasible because of the constraint "x < y" and the result definition from the assignment "result = y - x" (inside BF). The engine adds the constraint "x < y" to the path condition as a consequence of the decision to follow b7. Additionally, the result value is defined in BF because the engine takes b9. Consequently, the decision of taking branches b7, b9 and b10 led the symbolic execution into a conflict, and the engine learns that it should never take them together again. This learning prunes 4 possible execution paths (including the red path) from its search.

In order to leave the conflicting state, the engine has to backtrack and remove at least one of the conflicting decisions. The engine can keep the decisions that led it to take b1, b3, b4, b6, but it cannot take either b7 or b9. Taking b7 implies that execution will follow b9, and b9's execution must follow b7's. Moreover, the engine knows that only paths that include b10 are relevant to the proof, since all paths either include b10 or b11, and previous analysis excluded b11's.

If the engine decides to keep the same decisions until b6, and to take b8 and b10, the execution follows the path with the blue edge. Once again, the engine reaches a conflict when it checks whether b10 is feasible. This conflict emerges because of branches b8 and b10. As a consequence of the decision of taking b8, the engine adds the constraint "x > y" to the path condition, and the definition of variable result (result = x - y) from BE reaches BG. This constraint and this definition together conflict with the branch b10 condition ¬(result >= 0).
Therefore, the conflict detected is an invariant of every path that includes b8 and b10.

CDSE concludes that the assertion always holds by combining what the engine learned from the two conflicts it reached with what it learned from the program's CFG.

3.2 Learning Scheme

In order to reduce the search space, the CDSE engine must be able to learn from its bad decisions, as well as to make new decisions based on the facts learned in previous iterations. Thus, the CDSE learning scheme provides a mechanism to extract the decisions that led the search to a conflict, and to create a new constraint that blocks them. Moreover, this learning scheme also provides a mechanism to restrict the engine's choices according to the constraints previously learned. Consequently, every time a conflict is reached, the engine learns a new constraint and the available choices become more restricted.

3.2.1 Search Constraints

Every decision made by the CDSE engine respects the constraints imposed by the program's CFG and the facts learned along the search. In other words, the engine should pick only branches that compose a CFG path and that have not been blocked yet.

The engine represents these constraints as a set of clauses in conjunctive normal form (CNF). The CNF representation allows the engine to use a SAT solver to find decisions that do not violate the search constraints. From now on, this CNF will be called the f_CFG.

An f_CFG is a formula used to represent the constraints that guide the CDSE search, including those extracted from the program's CFG. Each variable in the f_CFG formula has a one-to-one correspondence with an element of the CFG, either a branch or a basic block. Therefore, the engine can easily map elements of one set to the other.

The engine uses a SAT solver to solve the f_CFG and consequently to enumerate CFG paths that would lead the program execution to a property violation, given that no conflict is detected by the engine.
Thus, the f_CFG is encoded to be satisfiable if and only if there is a path in the CFG between the program's entry node and a property failure φ. Consequently, the reachability problem becomes a satisfiability problem, where:

1. The variable corresponding to the entry node must be true.
2. Any variable corresponding to an exit node (excluding the abort call from the property encoding) must be false.
3. A branch variable v_b can be true if and only if the origin and target variables, when they exist, are also true.
4. A basic block variable v_B can be true if and only if at least one incoming edge variable is true.
5. A basic block variable v_B can be true if and only if at least one outgoing edge variable is true.
6. The variable corresponding to the property violation φ must be true.

Note that the constraints described above do not restrict satisfying assignments to represent only one path. However, they guarantee that a satisfying assignment represents at least one non-blocked path that connects the entry node to the property violation in the CFG. Moreover, if a variable v_b is assigned true in a satisfying assignment, these constraints guarantee that one of its paths includes the branch b.

Initially, the f_CFG includes only constraints derived from the CFG. These constraints are enough to block the decisions that could lead the search to an ill-defined path in the CFG, or to a path that is irrelevant to the property proof. The initial f_CFG is satisfiable unless every assertion statement is trivially marked as dead code during CFG construction.

During the search, the engine tries to execute a CFG path that respects the current assignment to the f_CFG. If the path is feasible, then it finds an assertion failure. On the other hand, if the path is infeasible, the engine performs a conflict analysis to establish a set of branches that led to the conflict.
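To make the encoding concrete, the sketch below instantiates rules 1-6 for a tiny diamond-shaped CFG: entry node A, branch b1 leading to a violation block B, and branch b2 leading to an exit block C. The clause representation, the variable names, and the brute-force enumeration standing in for a real SAT solver are all illustrative, not the thesis's actual implementation.

```python
from itertools import product

# One boolean variable per CFG element of a small CFG (names are made up):
#   entry A --b1--> B (property violation)      A --b2--> C (exit)
VARS = ["vA", "vB", "vC", "vb1", "vb2"]

# A clause is a list of (variable, polarity) literals; it holds if any
# literal matches the assignment. Implications p -> q become (!p or q).
clauses = [
    [("vA", True)],                                 # rule 1: entry node true
    [("vC", False)],                                # rule 2: exit node false
    [("vb1", False), ("vA", True)],                 # rule 3: b1 needs origin A
    [("vb1", False), ("vB", True)],                 # rule 3: b1 needs target B
    [("vb2", False), ("vA", True)],                 # rule 3: b2 needs origin A
    [("vb2", False), ("vC", True)],                 # rule 3: b2 needs target C
    [("vB", False), ("vb1", True)],                 # rule 4: B needs an incoming edge
    [("vC", False), ("vb2", True)],                 # rule 4: C needs an incoming edge
    [("vA", False), ("vb1", True), ("vb2", True)],  # rule 5: A needs an outgoing edge
    [("vB", True)],                                 # rule 6: violation node true
]

def models(clauses):
    """Brute-force stand-in for the SAT solver: yield satisfying assignments."""
    for values in product([False, True], repeat=len(VARS)):
        asn = dict(zip(VARS, values))
        if all(any(asn[v] == pol for v, pol in c) for c in clauses):
            yield asn

sat = list(models(clauses))
# Every satisfying assignment routes the search through b1, never through b2.
assert sat and all(m["vb1"] and not m["vb2"] for m in sat)
```

In this toy instance the rules admit exactly one assignment, which is exactly the one path reaching the violation; in larger CFGs many assignments remain, and the learned clauses described next prune them.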
This conflict analysis, which is described in the next section, returns a new constraint, represented as a clause, that blocks any path containing every branch in the given set. The engine adds this clause to the f_CFG, and invokes the SAT solver to find a new assignment that respects the enlarged set of constraints.

3.2.2 Conflict Analysis

In order to determine the reason for a conflict, the symbolic execution of a path is slightly modified. The changes allow the CDSE engine to represent the conflict reason as a clause, and to learn from it by adding the clause to the f_CFG.

In CDSE, the engine chooses which branches to take in sequence, and it executes a path following these decisions. The engine executes a program statement as a consequence of past decisions. Therefore, the CDSE expression generation reflects not only the instruction semantics, but also the decisions that led to the instruction's execution.

In order to represent this dependency, the expression encoding includes instruction guards, from now on represented with γ. An instruction guard is a boolean symbolic variable that expresses the condition that the search must meet in order to execute the guarded instructions. In CDSE, an instruction guard is true if and only if the immediately preceding branch is taken.

The engine encodes an instruction guard as a boolean symbolic variable γ_i that carries the same identification number i as the last branch taken, b_i. After taking branch b_i, and before taking any other branch, the engine guards the instructions with the same variable γ_i. In CDSE, the engine may guard assignments or boolean conditions.

Consider an assignment

    ν := <expr>

where ν is a program variable, and <expr> is the expression that would be assigned by regular symbolic execution. The engine guards such an assignment by extending the expression to be assigned into a select expression. A select expression represents a ternary operation that contains a boolean condition and two possible assignments.
The evaluation of the boolean condition determines which assignment is chosen. In this new expression, the variable ν gets <expr> if the guard γ_i is true. Otherwise, ν can take any value, i.e., its value is nondeterministic, which is represented by a <nondet> symbol. The original assignment is then replaced by the following guarded assignment:

    ν := γ_i ? <expr> : <nondet>

Consequently, the variable ν will be defined as <expr> if the program execution reaches the assignment instruction, that is, if it takes the preceding branch b_i; otherwise, ν can have any value.

The engine can use data-flow analysis to reduce the number of assignments that need to be guarded. CDSE does not need to guard a variable assignment if its definition dominates every possible use.

Besides assignments, CDSE also guards the constraints generated from conditional branches. Before executing a conditional branch instruction, the engine chooses a branch b_i that it will follow. Then, the engine creates a new expression from the original branch condition <cond> and the instruction guard γ_i. In this new expression, γ_i implies that <cond> must be true. In other words, if the branch b_i is taken, the branch condition must hold. Thus, the following constraint is added to the path condition:

    γ_i ⇒ <cond>

Given the new encoding, the engine invokes the constraint solver passing all guards on the current path as assumptions, i.e., the solver has to solve the given formula assuming every guard is true. Consequently, CDSE still passes an equisatisfiable problem to the constraint solver. Additionally, if the path is infeasible, CDSE leverages the constraint solver's conflict reasoning to determine which guards, and hence which branches,
are responsible for a conflict.

Once the constraint solver returns a conflict clause, CDSE has to translate the guards involved in the returned conflict to their respective branches. The translation follows a lightweight mapping mechanism, since a guard's identification number is the same as that of its corresponding branch. Finally, CDSE generates a conflict clause based on the branches that led to the current conflict, and adds this clause to the learned-clause database.

Thus, CDSE explores the program structure in three different contexts. The first one is the program CFG, which represents the set of all possible transitions in the program. CDSE uses the CFG to build initial knowledge about the program. Second, CDSE represents the CFG as a formula, the f_CFG, to which the algorithm can easily add new facts (learned clauses). The f_CFG represents each CFG element as a boolean variable. The last context is the symbolic execution encoding itself. CDSE introduces the concept of instruction guards, which are symbolic variables that represent branches in this context. CDSE employs these guards in order to perform conflict analysis. Therefore, each branch b_i in the program CFG has a corresponding f_CFG variable v_i and a corresponding guard γ_i.

3.3 CDSE Algorithm

The conflict-driven symbolic execution algorithm systematically explores the search space by deciding which branches to take given the knowledge it has built from previous decisions. CDSE starts from the entry node of a program, and symbolically executes the program until it reaches a branch statement. Then, CDSE chooses which branch should be taken next, and checks whether this decision is feasible. CDSE iteratively executes the instructions between the chosen branches, until either it reaches the violation branch b_φ, or the branch chosen is infeasible. If the b_φ branch is feasible, a bug has been found and a concrete counterexample (CEX) is extracted.
On the other hand, if a branch is infeasible, a reason for the conflict is determined. The CDSE engine then learns which set of branches should not be taken together in order to avoid the same conflict, and backtracks.

Algorithm 3.1 presents pseudo-code for the algorithm described above.

Algorithm 3.1 Pseudo-code for the conflict-driven symbolic execution proposed in this thesis.
Input: Program P, Bound B and Assertion A
Output: TRUE if the assertion cannot be violated, a CEX otherwise
 1: P' ← analyze_and_optimize(P)
 2: cfg, f_CFG ← build_cnf(P', B, A)   // Build CFG and translate to CNF
 3: paths ← solve(f_CFG)
 4: if paths = ∅ then
 5:     return True
 6: pcond ← ∅   // Path conditions
 7: dstack ← ∅   // Decision stack
 8: loop
 9:     branch ← decide_next_branch(cfg, paths, dstack)
10:     Push(dstack, branch)
11:     pcond ← pcond ∧ get_cond(P', branch)
12:     if is_feasible(pcond) = True then
13:         if next = φ_A then
14:             return compute_cex(P', pcond)
15:         else
16:             sym_exec_step(branch)   // Current branch → next branch
17:     else
18:         lclause ← analyze_conflict()
19:         add_clause(f_CFG, lclause)
20:         paths, pcond, dstack ← backtrack(f_CFG, pcond, dstack)
21:         if paths = ∅ then
22:             return True

Like CDCL solvers, the algorithm can be divided into the following main steps:

Preprocess: In lines 1-7, CDSE employs different static analysis techniques to optimize the code and to extract information relevant to the next steps.

Decide: At every iteration, symbolic execution reaches a branch statement, and decides which direction to take. Lines 9-10 represent the decision procedure.

Propagate: Once a decision has been made, the algorithm propagates it. From line 11 to 16, the path condition is extended and checked. If it is feasible, the algorithm checks whether it has found a bug. If a bug was found, the algorithm returns a counterexample. Otherwise, CDSE executes the instructions from the last branch until the next branch.

Conflict Analysis: If a conflict was reached, the algorithm establishes
which decisions led to this conflict (Line 18).

Learn and Backtrack: In lines 19-22, the algorithm learns to avoid the detected conflict, and backtracks to some previous branch where this conflict is no longer implied. If there is no path left, the algorithm returns True.

3.3.1 Preprocess

This step precedes the symbolic execution, and its goal is to extract information relevant to the following steps, as well as to simplify the program instructions. During Preprocess, the engine can apply many static analysis techniques included in modern compilers to optimize the program, such as constant propagation, loop unrolling, dead code elimination, etc. Additionally, the engine gathers two pieces of static information that are required in further steps: the control flow graph and the use-def chain.

The control flow graph can be easily retrieved by visiting each basic block in the program and connecting it to its immediate successors, i.e., its branch targets. Because decisions are based on branches, the CFG contains an entry edge that points to the entry node. As mentioned in Section 3.2, the engine also represents the CFG as a CNF (the f_CFG), which facilitates the decide and learn steps. Thus, the engine also initializes the f_CFG, and uses a SAT solver to find an initial assignment during preprocessing.

The second piece of information needed is the set of all definitions of a program variable (namely, variable assignments) that can reach a certain usage, known as the use-def chain. Modern compilers already compute this data-flow information while transforming the program into static single assignment (SSA) form. In SSA form, whenever a variable's value can come from different sources, due to join points in the CFG, phi-functions (or phi-nodes) are added to the beginning of basic blocks. Semantically, phi-nodes are conditional assignments, where a variable is assigned its latest value in the preceding block.
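For intuition, a phi-node's conditional-assignment semantics can be sketched as follows. This is a simplified model, not LLVM's actual representation; the block names BE and BF follow the earlier example, where result is defined as x - y in BE and as y - x in BF, and the function name is made up.

```python
def phi(pred_block, incoming):
    """A phi-node assigns the value that flows in from the executed predecessor.

    `incoming` maps each predecessor basic block to the value the variable
    would hold on the edge coming from that block.
    """
    return incoming[pred_block]

# result = phi [BE: x - y, BF: y - x]  -- two definitions meet at a join point.
x, y = 7, 3
assert phi("BE", {"BE": x - y, "BF": y - x}) == 4    # path through BE
assert phi("BF", {"BE": x - y, "BF": y - x}) == -4   # path through BF
```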
The phi-nodes carry information about where variable values may come from, and the CDSE engine employs this knowledge to decide which assignments have to be guarded.

3.3.2 Decide

The search starts after the preprocess step. At every iteration, the first step is to decide which branch to take next. Each decision is based on the CFG and the current assignment to the f_CFG variables.

First, the decision procedure identifies which choices are available by using the CFG to find the basic block transitions that originate from the last basic block executed. The procedure then refines the set of available choices using the current assignment to the f_CFG: the decision procedure can only pick a branch b_i if the f_CFG variable v_i has been assigned True. If more than one choice remains after the refinement, the decision procedure could apply different strategies to try to guess which one is the best choice, or choose randomly.

After a decision is made, the f_CFG variable picked is pushed onto a stack, named the decision stack. The new size of this stack determines the level associated with this last decision.

3.3.3 Propagate

After the decision procedure picks a branch b_i, CDSE propagates every change implied by this choice. This includes checking whether the branch b_i is feasible, and symbolically executing the instructions that follow it. Symbolic execution continues until it reaches the next conditional branch statement.

First, the engine has to check the branch condition under the current path constraints. For that, the engine derives the branch condition from the current branch statement and the direction to be taken. The engine guards the constraint with the γ_i that is related to the branch b_i, and extends the path condition by adding this guarded constraint.

Given this new path condition and the current decisions, the engine invokes the constraint solver to check whether the path condition is satisfiable.
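A minimal model of this feasibility check is sketched below, with a brute-force enumerator standing in for the real constraint solver (STP); the guard names, the predicate representation, and the tiny integer domain are all illustrative. Each guarded constraint is a pair (guard, predicate), and the solver must satisfy every predicate whose guard is assumed true. The constraints mirror the red-path conflict of Section 3.1: b7 adds x < y, b9 routes through BF where result = y - x, and b10 requires ¬(result >= 0).

```python
from itertools import product

# Guarded path constraints: pairs (guard id, predicate over program state).
path_condition = [
    ("g7",  lambda s: s["x"] < s["y"]),                 # taking b7 adds x < y
    ("g9",  lambda s: s["result"] == s["y"] - s["x"]),  # via b9, BF defines result
    ("g10", lambda s: not (s["result"] >= 0)),          # taking b10 adds !(result >= 0)
]

def is_feasible(constraints, assumed_guards, domain=range(-3, 4)):
    """Brute-force stand-in for a constraint solver with assumptions:
    search for a state satisfying every constraint whose guard is assumed."""
    for x, y, r in product(domain, repeat=3):
        state = {"x": x, "y": y, "result": r}
        if all(pred(state) for g, pred in constraints if g in assumed_guards):
            return True
    return False

# Assuming g7, g9 and g10 together is infeasible: x < y forces
# result = y - x > 0, contradicting !(result >= 0).
assert not is_feasible(path_condition, {"g7", "g9", "g10"})
# Dropping the decision guarded by g10 restores feasibility.
assert is_feasible(path_condition, {"g7", "g9"})
```

Because only the assumed guards activate their constraints, backtracking a decision simply removes its guard from the assumption set; the constraints themselves can stay in the formula.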
If the new path condition is infeasible, CDSE has found a conflict, and the engine proceeds to the conflict analysis step. Otherwise, the propagation continues.

CDSE propagation continues by checking whether it has reached a bug, in which case it retrieves a valid counterexample from the constraint solver and returns it to the user.

If propagation reaches neither a conflict nor a bug, the engine symbolically executes the instructions that follow b_i, using γ_i to guard instructions whenever necessary. If, during the symbolic execution, the engine reaches an unconditional branch b_j, the engine adds b_j to the decision stack as an implied decision, and starts using γ_j instead.

3.3.4 Conflict Analysis

The goal of conflict analysis is to establish a subset of the decisions that led the search into a conflicting state, where the path being explored is no longer feasible. CDSE exploits the assumption mechanism adopted in many CDCL-based constraint solvers to implement a lightweight and efficient conflict analysis.

As explained in Section 3.2.2, the CDSE engine encodes the instruction guards as assumptions to be respected by the constraint solver. Therefore, if the constraint solver finds that the current path condition is unsatisfiable, the solver returns a reason for the conflict as a conflict clause over the given assumptions.

Because the conflict clause returned by the constraint solver is expressed over guards and not over the f_CFG variables, the engine has to translate this clause into a function of the f_CFG variables. In order to differentiate the conflict clauses returned by the constraint solver from the ones returned by the CDSE conflict analysis stage, from now on this thesis will use execution conflict clause for the former and CDSE conflict clause for the latter.
Hence, an execution conflict clause is based on instruction guards, while a CDSE conflict clause is based on f_CFG variables.

As mentioned in Sections 3.2 and 3.3.1, the CDSE engine uses the same identification number i to identify a branch b_i, its corresponding f_CFG variable v_i, and its guard γ_i. Thus, the engine uses this identification number to map guards to f_CFG variables, and to perform the translation from an execution conflict clause to a CDSE conflict clause.

As a result, the conflict analysis procedure returns a clause with the negation of every f_CFG variable involved in the conflict. For example, if branches b_i and b_j are responsible for the conflict, the conflict analysis will return the CDSE conflict clause (¬v_i ∨ ¬v_j). This clause summarizes which decisions can be taken to avoid the same conflict.

3.3.5 Learn and Backtrack

In this last step, the CDSE engine applies the information returned by the conflict analysis to undo previous decisions, and to preempt the occurrence of similar conflicts. First, the engine extends the f_CFG by adding the CDSE conflict clause retrieved from the conflict analysis. This clause represents a new constraint that excludes the decision subset responsible for the last conflict. Therefore, future satisfying assignments to the f_CFG cannot assign true to every f_CFG variable responsible for the conflict.

Once the conflict has been learned, the engine needs to backtrack its previous decisions in order to leave the conflicting state. CDSE backtracks by solving the extended f_CFG, looking for new paths to follow. If the f_CFG is no longer satisfiable, every path has been blocked, and CDSE has proven that the target property always holds.

In contrast, if the SAT solver finds a satisfying assignment to the f_CFG, the engine checks which variables in the decision stack are no longer assigned true, and backtracks to a level where all these decisions are removed.
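The guard-to-variable translation and the blocking behavior of the learned clause can be sketched as follows; the "g&lt;i&gt;"/"v&lt;i&gt;" string naming and the tuple-based clause representation are illustrative, not the thesis's actual data structures.

```python
def to_cdse_conflict_clause(execution_conflict_clause):
    """Map each guard in an execution conflict clause to the negated f_CFG
    variable with the same identification number (g<i> -> not v<i>)."""
    return [("v" + guard[1:], False) for guard in execution_conflict_clause]

def blocks(clause, assignment):
    """A learned clause blocks an assignment iff none of its literals holds."""
    return not any(assignment[var] == pol for var, pol in clause)

# Guards g8 and g10 were responsible for the conflict (the blue path of Sec. 3.1):
learned = to_cdse_conflict_clause(["g8", "g10"])
assert learned == [("v8", False), ("v10", False)]

# Any f_CFG assignment taking both b8 and b10 is now blocked...
assert blocks(learned, {"v8": True, "v10": True})
# ...while assignments avoiding one of the two branches remain allowed.
assert not blocks(learned, {"v8": False, "v10": True})
```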
The engine also removes from the path condition the constraints added by the backtracked decisions, and the algorithm goes back to the decision step, starting a new iteration.

3.4 Multiple Properties

Conflict-driven symbolic execution can verify multiple properties in the target program simultaneously. Adding support for multiple properties requires only a few changes to the preceding algorithm description. First, the unit clause v_φ, defined in Section 3.2, item 6, is no longer added to the f_CFG. Instead, the f_CFG translation adds a property clause to the f_CFG. The property clause is defined as the disjunction of every variable corresponding to a property failure φ_i:

    ⋁_i v_φi

This change is enough if the goal of CDSE is to prove whether the program respects all of the given properties, i.e., whether it can find any property violation. However, if the goal is to prove or disprove every property given, Multiple Property CDSE includes a second change. Whenever the symbolic execution reaches a violation of a property φ_i, the engine saves the counterexample found and learns never to look for the same violation again. For that, the engine adds to the f_CFG the unit clause (¬v_φi), which blocks every complete path that could violate φ_i again.

Even though this unit clause blocks every assignment to the f_CFG that assigns v_φi to true, it does not restrict the search for a violation of a non-blocked property. Note that there is no path in the CFG that can include two property violations. A satisfying assignment to the f_CFG that includes more than one property violation must also include independent paths that connect each one of them to the entry node. Thus, for any satisfying assignment to the f_CFG that assigns v_φi and v_φj to true, there is a second satisfying assignment that does not include v_φi, but that still includes the same paths to φ_j.
Adding the unit clause (¬v_φi) to the f_CFG will block the first assignment, but not the second.

3.5 Loop Handling

So far, the CDSE description has relied on the absence of loops in the program's control flow graph. In the presence of loops, an assignment to the f_CFG no longer guarantees the existence of a path that connects the entry node to one of the target nodes (see Figure 3.2 for an example). Therefore, CDSE must handle the loops in the CFG before converting it to a CNF.

[Figure 3.2 here: (a) a CFG with two loops; (b) a satisfying assignment to the corresponding f_CFG.]

Figure 3.2: An example of a satisfying assignment to the f_CFG that no longer guarantees the existence of a complete path in a CFG, due to the existence of loops. (a) represents a program CFG with two loops. (b) represents a satisfying assignment to the corresponding f_CFG; in this representation, each CFG element's identification is replaced by the value assigned to the corresponding f_CFG variable. Any complete path in the given CFG should include b3; however, a loop in the CFG allows a node to succeed its predecessor, and rules 4 and 5 can be satisfied without guaranteeing that b3 must be taken in order to reach BD.

For example, the CDSE engine can apply source code modifications to the target program P, unrolling all loops up to a given bound n. In this case, the symbolic execution is bounded, and it can prove property correctness only for execution paths that respect the loop bound; no conclusion can be drawn about the program's behavior past the bound.

However, source code loop unrolling is expensive, since it generates n copies of the same instruction sequence, and it can be easily avoided in CDSE.
Because the CDSE engine takes advantage of the CFG to guide its search, the engine can apply more sophisticated approaches to handle loops, such as:

Static CFG Unrolling: Instead of unrolling the program's source code, the engine can unroll loops only in the CFG.

Dynamic CFG Unrolling: The engine can also unroll the CFG dynamically. The algorithm starts by unrolling every loop twice, and assuming that a loop's second iteration should not be taken. Then, CDSE proceeds with the search until it either finds a counterexample or blocks every available path. In the latter case, the engine checks whether any added assumption causes the current f_CFG to be unsatisfiable. If so, the engine removes the assumption, unrolls the loops that were involved in the conflict, and continues the search.

Chapter 4

Kite: A Conflict-Driven Symbolic Execution Tool

This chapter presents my proof-of-concept conflict-driven symbolic execution (CDSE) tool, named Kite, which was used to evaluate the performance of this new algorithm. This chapter describes Kite's implementation details, including which changes were necessary to employ many of the symbolic execution optimizations described earlier.

4.1 Kite's Architecture

Kite was developed to verify embedded assertions in sequential programs. Kite verifies programs described in the LLVM assembly language, known as LLVM IR, which can be obtained by parsing C files using the LLVM C front-end [41].

Kite was mainly constructed on top of a state-of-the-art symbolic execution tool named Klee [10, 39]. Kite's implementation can be divided into four main modules:

Preprocessor: The preprocessor is responsible for reading the input file, as well as analyzing and optimizing the program before the symbolic execution starts.

Instruction Interpreter: This module interprets the program instructions following the symbolic execution semantics.
The interpreter implements the entire CDSE propagation step.

[Footnote: Kite is an open-source project and its source code can be retrieved from www.cs.ubc.ca/labs/isd/Projects/Kite]

Decision Engine: The decision engine chooses which branches to follow, as well as the backtrack level whenever a conflict is detected. Thus, it implements the decision step, as well as the learning and backtracking steps.

Constraint Solver: The constraint solver is responsible for analyzing the path constraints and deciding whether they are valid. In cases where the constraints are not valid, it returns an execution conflict clause. Moreover, the constraint solver also extracts an assignment to the program input variables when the execution reaches an assertion failure.

[Figure 4.1 here: Kite's modules: the preprocessor (extended LLVM), the decision engine (with MiniSat), the constraint solver (extended STP), and the instruction interpreter (based on Klee).]

Figure 4.1: Kite's main modules, their composition and communication. The preprocessor module provides information to the instruction interpreter and the decision engine. The instruction interpreter symbolically executes the code, guided by the decision engine. The interpreter and the constraint solver share information about path condition feasibility.

Figure 4.1 depicts how these four modules are organized inside Kite, as well as the communication between them. The preprocessor is the first module to act in Kite's execution, and it provides information to the instruction interpreter and the decision engine. The symbolic execution is then coordinated by the instruction interpreter. This module consults the decision engine every time a decision is required, or after any conflict is detected. Besides that, the interpreter invokes the constraint solver to check whether any modification to the path condition is feasible.

Kite inherited three of its modules from Klee: the preprocessor, the constraint solver, and the instruction interpreter.
Consequently, Kite also inherited three tools used in Klee: LLVM [32], STP [21], and CryptoMiniSat2 [46]. LLVM is the base of the preprocessor, while STP and CryptoMiniSat2 comprise the constraint solver.

Kite introduces some extensions to the preprocessor and the constraint solver in order to support the CDSE preprocessing requirements and execution conflict detection. These two modifications are described in Sections 4.2 and 4.4, respectively. Kite's instruction interpreter is the result of many modifications applied to Klee's symbolic execution engine to implement a conflict-driven approach. These changes are described in Section 4.3.

The decision engine is a new module exclusive to Kite. The decision engine includes an underlying SAT solver in order to solve the search constraints. The underlying SAT solver is an extended version of MiniSat [20], which was modified in order to implement different decision heuristics, as described below.

4.2 Preprocessor

The preprocessor is responsible for reading, analyzing, and optimizing the program under verification. As previously mentioned, Kite's input is a program described using the LLVM IR, and the preprocessor module is mainly composed of the LLVM compiler itself.

The LLVM (Low Level Virtual Machine) compiler was introduced in 2004 by Lattner and Adve [32]. The authors designed LLVM to be a flexible and easy-to-extend compiler framework. Since then, LLVM has grown, and it incorporates many program analysis procedures and optimization passes. Furthermore, LLVM also behaves as a virtual machine capable of interpreting its own assembly language, the LLVM IR.

LLVM provides optimization passes that reduce the number and cost of instructions in the final executable. These passes can also be used to optimize the IR for symbolic execution. Many operations that are hard for computer processors are also hard for constraint solvers. For example, memory accesses and division operations usually demand many machine cycles.
These operations also cause large overheads during symbolic execution: memory access is hard to model, while the translation of a division operation produces symbolic expressions that are hard to solve.

Therefore, Kite includes many of LLVM's optimization and analysis passes. Kite uses these passes to transform the code into SSA (static single assignment) form, to reduce memory accesses, to apply constant propagation, to simplify arithmetic expressions, to canonicalize loop induction variables, etc.

Additionally, Kite introduces two new passes to retrieve the information necessary to build the program's control flow graph (CFG) required by CDSE: Loop Analysis and CFG Analysis. The first pass identifies every loop in the program, as well as its structure (back edge, entry block, and exit blocks). The second pass is a function analysis pass, which is used to collect the program's structure, i.e., every basic block, as well as the transitions between them (branches and function calls).

4.2.1 CFG Construction

The CDSE learning scheme relies on knowledge of the program's control flow graph. Since LLVM contains only separate per-function CFG representations, Kite's preprocessor implements a method to build a single CFG for the entire program.

The CFG construction can be divided into two stages. In the first one, the preprocessor collects the required information about the program's structure using the Loop Analysis and CFG Analysis passes. In the second stage, the preprocessor builds the CFG from the information collected. It encodes every basic block as a node, and every branch as an edge. The preprocessor also decodes every function call in a process similar to function inlining. For that, the preprocessor splits the calling basic block into two nodes, which represent the instructions executed before and after the call. Then, the preprocessor duplicates the called function's CFG and connects it to the two new basic blocks.
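This call-site splitting can be sketched on a plain adjacency-list CFG. The representation and the helper name are assumptions for illustration; the real pass operates on LLVM data structures, and it would also rename the duplicated callee nodes uniquely for each call site.

```python
def inline_call(cfg, call_block, callee_entry, callee_exits, callee_cfg):
    """Split `call_block` into pre-call and post-call nodes, then splice a
    copy of the callee CFG between them. `cfg` and `callee_cfg` map each
    node name to its list of successors."""
    pre, post = call_block + ".pre", call_block + ".post"
    out = {n: list(s) for n, s in cfg.items() if n != call_block}
    # Redirect every edge that targeted the call block to its pre-call half.
    for succs in out.values():
        succs[:] = [pre if s == call_block else s for s in succs]
    out[pre] = [callee_entry]            # pre-call node enters the callee copy
    out[post] = list(cfg[call_block])    # post-call node keeps the original successors
    for n, succs in callee_cfg.items():  # splice in the callee CFG
        out[n] = [post] if n in callee_exits else list(succs)
    return out

# A caller BA -> BCall -> BZ, where BCall invokes a function with blocks F0 -> F1.
caller = {"BA": ["BCall"], "BCall": ["BZ"], "BZ": []}
callee = {"F0": ["F1"], "F1": []}
g = inline_call(caller, "BCall", "F0", {"F1"}, callee)
assert g["BA"] == ["BCall.pre"] and g["BCall.pre"] == ["F0"]
assert g["F1"] == ["BCall.post"] and g["BCall.post"] == ["BZ"]
```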
The preprocessor also unrolls every loop and every recursive function up to a certain depth. In both cases, the iterations past the depth are represented with an exit node, which is handled in the f_CFG construction.

Additionally, the preprocessor employs an optimization to reduce the CFG to the nodes that are relevant to the search. This optimization follows the intuition behind program slicing [49]. A program slice is a subset of instructions that can directly or indirectly influence a program component. Since the assertion statements are the symbolic execution targets during program verification, basic blocks that cannot precede these statements are removed from the CFG representation. This optimization consists of a simple backward traversal originating from every assertion statement, followed by the removal of every node that the traversal did not visit. Exit nodes are connected to the ends of the edges that connect a visited node to a non-visited one.

4.2.2 f_CFG Construction

The preprocessor is also responsible for generating the f_CFG representation, which is used by the decision engine to guide the CDSE search. Once the preprocessor generates the CFG, it translates this CFG into a conjunctive normal form (CNF) formula. The preprocessor traverses the CFG, and applies the rules described in Section 3.2.1 (page 32) in order to perform this translation.

Every exit node is represented by one single variable v_exit, and a unit clause ¬v_exit is added to the f_CFG. This excludes from the search every path that reaches the program's return statement, as well as every path sliced out of the CFG.

4.3 Instruction Interpreter

The instruction interpreter is responsible for symbolically executing the program following the decisions made by the decision engine.
The instruction interpreter was developed on top of the symbolic execution tool Klee [10]. Klee was mainly developed to generate test cases automatically, but it is also capable of searching for runtime errors, including assertion failures. Klee generates a test case every time it reaches the end of an execution path. This input can be used to lead the program's execution along a certain path, and it allows the user to debug errors with regular debugging tools. Kite maintains Klee's test case generation, but it only generates a test case when an assertion failure is detected. Furthermore, Kite's interpreter takes advantage of many of Klee's features to implement the CDSE algorithm.

Execution Backtrack: The instruction interpreter starts the execution at the program's entry point, with a single process. Every time the interpreter reaches an if-statement, the current process forks after checking if the next branch is feasible. However, the interpreter maintains one child process as a copy of the current program state before it takes any branch. The interpreter saves this copy in a stack, which represents the program states at each decision level.

Thus, the interpreter can easily backtrack to any level by popping all program states above the backtrack level, and by setting the current program state to be the state at this level. Because any program state stored by the interpreter is a copy of another state before a branch is taken, the execution always restarts from an if-statement, and this triggers another decision step.

Notice that the interpreter discards all program states that are popped during the backtrack step. However, due to the nonchronological backtrack strategy implemented in Kite, some of these states might still be explored in future iterations. Consequently, Kite is penalized for discarding them, and it might have to re-execute some redundant sequence of instructions to restore a discarded state.
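The stack-based backtracking just described can be sketched as below. This is a toy model, not Klee/Kite internals: the state is a plain dictionary, and the class and field names are invented.

```python
# Minimal sketch of backtracking over a stack of saved program states.

import copy

class Interpreter:
    def __init__(self, initial_state):
        # stack[k] holds the state saved at decision level k (level 0 = entry).
        self.stack = [initial_state]

    def decide(self, state_before_branch):
        # Save a copy of the state *before* the branch is taken, so a later
        # backtrack restarts exactly at this if-statement.
        self.stack.append(copy.deepcopy(state_before_branch))

    def backtrack(self, level):
        # Pop (and discard) every state above `level`; execution resumes
        # from the if-statement saved at that level.
        del self.stack[level + 1:]
        return self.stack[level]
```

Popped states are simply discarded here, which mirrors the re-execution penalty discussed above when a discarded state turns out to be reachable again.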
This penalty could be removed with further optimizations, which are presented in Section 7.2.

Instruction Guards: Klee represents every symbolic expression as an abstract syntax tree (AST), which links together the operations to be performed in order. For that, Klee represents each operation, symbolic variable, and concrete value as a specific class that extends an expression base class. In order to represent instruction guards, Kite includes two new expression classes, one to represent guarded expressions, and one to represent guarded constants. These new types are named γ-expression and γ-constant, respectively.

A γ-expression simulates a select operation where a guard γ is the boolean condition, the true side is the guarded expression, and the false side is a non-deterministic value. This allows the interpreter to replace an expression by its guarded version in any AST.

The γ-constant is a specialization of both the γ-expression and the constant class, which represents concrete values. Thus, a γ-constant stores a concrete value and a list of guards. This type of expression allows the symbolic execution to eliminate queries, to propagate constants, and to simplify expressions when the guarded expression is a constant.

The implementation of γ-constants is enough to guarantee constant propagation during expression creation. Because this class extends and re-implements every method of the constant class, it can be used anywhere a concrete value is used. Additionally, a γ-constant always propagates its guards. For example, given an assignment:

νi = νj + νk

where both used variables νj and νk have guarded concrete values, if νj is assigned to 10 under guard γa, and νk is assigned to 5 under
guard γb, the interpreter will assign νi to 15 under both guards γa and γb. The constant propagation triggered by the implied value optimization is described in the following item.

Query Optimizations: Klee implements query optimizations that have a great impact on symbolic execution scalability, and can reduce the runtime by more than an order of magnitude [10]. Therefore, Kite also implements these optimizations to show that most of them are orthogonal to the CDSE approach. Kite's interpreter extends the query optimization methods in order to comply with the guards' usage.

The implied value concretization, for example, has to propagate the guards involved in the concretization. If the execution adds a constraint such as νi = 0, this optimization replaces every occurrence of νi by the value 0. However, before replacing the occurrences of νi, this method first checks if the added constraint is guarded. If it is, the method collects all guards, and it creates a γ-constant representing the concrete value 0, which is used instead of the plain concrete value 0.

The query cache adaptation is more complicated. The cache implemented in Klee stores all queries its interpreter sends to the constraint solver, together with the computed results. The interpreter consults the cache before sending a new query to the constraint solver. In case of a cache hit, the interpreter can use the cached result instead of invoking the solver.

However, the instruction guards used in Kite can affect the chance of a cache hit, because similar queries that originate from different paths could have different guards. Therefore, Kite's cache does not keep any information about guards. Kite implements a lightweight guard removal method, which the cache invokes to generate a copy of a query without its guards. The cache uses this extraction mechanism before looking a query up, or before adding a new entry.

Since the guards are used only during conflict analysis, where a query
is unsatisfiable, Kite's cache only stores satisfiable queries. This restriction allows the cache to safely ignore all guards in a query. In cases where the query is unsatisfiable, CDSE requires information about the conflicting guards, which the cache doesn't provide. This forces the interpreter to call the constraint solver in order to obtain this information.

Although Kite could extend the cache in order to keep the guards involved in a previously found conflict, that could increase the complexity of the cache lookup mechanism, and potentially decrease the cache efficiency. Moreover, the CDSE learning scheme already eliminates many redundant unsatisfiable queries.

The constraint independence optimization did not require any change. This optimization determines constraint dependence by analyzing the set of symbolic variables that are included in the leaves of each expression AST. Since the guards never appear in a leaf, and this method doesn't change any expression, the guards can be safely ignored.

Therefore, the implemented data structure allows Kite to safely apply all the query optimizations included in Klee. Additionally, it allows Kite to learn facts whenever the interpreter reaches a branch with a concrete condition, without calling the constraint solver. If a branch condition evaluates to false, its guards represent the conflict to be learned. If a branch condition evaluates to true, the interpreter knows the non-taken branch is infeasible under the same path; consequently, Kite can add a new learned clause to the f CFG even though the execution did not reach any conflict. Kite derives the learned clause from the condition guards, excluding the guard that represents the last branch taken, and the guard relative to the non-taken branch.

4.4 Constraint Solver

Kite employs the constraint solver to prove whether a certain branch can be taken or not. The instruction interpreter passes a formula to the constraint
solver that defines the branch condition and the current path condition. Besides that, the interpreter also provides a set of assumptions that must be considered during the proof.

Then, the constraint solver determines whether the formula is satisfiable or not. If the formula is satisfiable, the constraint solver produces variable values that satisfy the constraints. The instruction interpreter uses these values if the branch taken leads to a bug; in that case, these values represent a valid counterexample to an assertion. In cases where the formula is proven unsatisfiable, Kite's constraint solver returns the set of assumptions that led the proof to an unsatisfiable result.

Kite inherits the pair STP / CryptoMiniSat from Klee as the constraint solver employed during symbolic execution to check the feasibility of a branch condition. STP is a constraint solver developed to efficiently solve constraints generated by program analysis, bug finding and test generation tools [21]. STP implements different algorithms based on the abstraction-refinement paradigm to reduce the size of the target formula, and then it employs a SAT solver to check the formula's satisfiability. CryptoMiniSat is one of the SAT solvers supported by STP, and it is the one chosen in Klee's implementation. CryptoMiniSat extends MiniSat to support the XOR operation that is common in cryptography [46].

STP receives as input a quantifier-free formula in the theory of bit-vectors and arrays. STP is capable of determining whether a given formula is satisfiable or not, as well as producing an assignment in the first case. Nevertheless, STP does not support solver assumptions; consequently, STP is not capable of generating a final conflict clause, as required by the CDSE algorithm. Therefore, Kite's implementation required some modifications to STP's interface and internal processing to support assumptions.
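The assumption interface Kite needed can be illustrated with a toy solver. This sketch is not STP's real API: it uses a brute-force SAT check over DIMACS-style signed-integer literals purely to show the contract of "solve under assumptions; on UNSAT, report which assumptions conflicted".

```python
# Toy illustration of solving under assumptions and extracting the
# conflicting assumptions on UNSAT (invented interface, not STP's).

from itertools import combinations

def solve_with_assumptions(clauses, assumptions):
    """clauses: list of clauses; each clause is a list of signed ints
    (DIMACS-style literals). Returns ('sat', model) or
    ('unsat', conflicting_assumptions)."""
    all_clauses = clauses + [[a] for a in assumptions]
    n = max(abs(l) for c in all_clauses for l in c)

    def sat(cls):
        # Brute-force model search over variables 1..n (toy scale only).
        for bits in range(2 ** n):
            model = {v: bool(bits >> (v - 1) & 1) for v in range(1, n + 1)}
            if all(any(model[abs(l)] == (l > 0) for l in c) for c in cls):
                return model
        return None

    model = sat(all_clauses)
    if model is not None:
        return "sat", model
    # Smallest subset of assumptions already unsatisfiable with the clauses.
    for k in range(len(assumptions) + 1):
        for subset in combinations(assumptions, k):
            if sat(clauses + [[a] for a in subset]) is None:
                return "unsat", list(subset)
    return "unsat", assumptions
```

For example, with the clause 1 → 2 (encoded as [-1, 2]) and assumptions [1, -2, 3], the conflicting subset is [1, -2]; assumption 3 plays no role in the conflict.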
Kite's constraint solver extends the STP interface, and it includes methods to create non-deterministic variables, to add assumptions to the proof, and a method to retrieve the conflicting assumptions in case of an unsatisfiable query.

4.5 Decision Engine

The decision engine is the module responsible for choosing which branch the instruction interpreter should follow. This module includes a SAT solver based on MiniSat [20] to find satisfying assignments to the f CFG whenever they exist.

MiniSat was chosen because it is a highly efficient⁹ and robust open-source implementation of an incremental CDCL SAT solver that has a well-defined and clean interface. Because of these characteristics, MiniSat has been widely used [11, 21], and it has also served as the base for many other successful solvers, such as Glucose [2] and CryptoMiniSat [46].

During the initialization, the decision engine receives the program CFG and the f CFG from the preprocessor. The engine uses the CFG to determine where the execution is, and which literals represent the next branches. The decision engine also maintains a decision stack, with all literals that represent the decisions taken in the current path.

Every time the interpreter reaches an if-statement, the interpreter invokes the decision engine to determine which branch it should follow. The decision engine first checks the current assignment to the f CFG to check if both branches could potentially be taken. If that is the case, the engine arbitrarily chooses which one the interpreter should follow.

After a decision, the interpreter might find that the branch chosen is infeasible. In this case, the interpreter passes a conflict clause to the decision engine, which adds the clause to the f CFG. The decision engine invokes its SAT solver to determine whether a new path can be taken, and to which level the execution must backtrack. Because MiniSat is incremental, it reuses the previous solving state to solve the new formula.
This mechanism reduces the cost of successive calls to MiniSat performed by the decision engine. When the f CFG becomes unsatisfiable, the decision engine informs the interpreter that the backtrack level is 0, which implies that the symbolic execution can't reach any assertion failure.

If a new satisfying assignment is found, the decision engine visits each literal kept in the decision stack, starting from level 1, until it finds a literal that was assigned to false. Consequently, the backtrack level corresponds to the level of the last visited literal that is still true, i.e., the engine keeps every decision that is common between the last executed path and the new assignment.

⁹ MiniSat won all the industrial categories of the SAT 2005 competition.

In order to reduce the backtrack depth, Kite introduces a modification to MiniSat that allows the solver to restart the search from a certain set of decisions. From now on, this modified MiniSat will be named kMiniSat.

Besides assumptions, kMiniSat receives a vector of literals that should be assigned to true if possible. Thus, before starting the regular search, kMiniSat includes a preprocessing step that initiates the search by propagating these preferred decisions, whenever possible. Preprocessing finishes when all the preferred decisions have been propagated, or when a conflict is detected. In contrast to assumptions, the decisions derived from this vector are not mandatory, and the solver is free to revert them during the regular search loop.

Kite's decision engine may use this new mechanism to try to guide the SAT solver to search for a new path similar to the current path taken. For that, the engine may provide the literals corresponding to the branches taken as the preferred decisions. Otherwise, the engine passes an empty vector, which leaves the SAT solver free.
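The backtrack-level computation described above amounts to finding the longest prefix of the decision stack that the new SAT assignment preserves. A minimal sketch, with invented names (not Kite's code):

```python
# Sketch: keep the longest prefix of the decision stack still true under the
# new assignment; the backtrack level is the length of that prefix.

def backtrack_level(decision_stack, assignment):
    """decision_stack: one signed literal per decision level (level 1 first).
    assignment: dict variable -> bool from the SAT solver.
    Returns the level to backtrack to."""
    level = 0
    for lit in decision_stack:
        if assignment.get(abs(lit)) != (lit > 0):
            break          # first decision the new assignment flips
        level += 1         # this decision is common to both paths, keep it
    return level
```

With stack [1, -2, 3] and a new assignment where variable 2 becomes true, only level 1 survives, so the engine backtracks to level 1; guiding the solver with preferred decisions tends to make this surviving prefix longer.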
Kite implements three different strategies to determine when to guide the SAT solver, and when to let it free:

Black Box: In this strategy, Kite always leaves the SAT solver free to decide, i.e., no preferred assignment is given, and the SAT solver is treated as a black box.

Always Guide: This is the opposite strategy, where Kite always tries to minimize the backtrack level as much as possible. Thus, the decision engine always sets kMiniSat's preferred decisions to the last set of branch literals followed.

Maximize Learning: Following the intuition that the smaller the learned clause, the more paths it will potentially prune, this strategy chooses when to guide kMiniSat according to the size of the last learned clause. The decision engine guides kMiniSat if the learned clause size relative to the path size is smaller than or equal to a constant k.

4.6 Known Limitations

Kite is still a prototype, developed to assess the conflict-driven symbolic execution algorithm. Therefore, it is not mature software, and it has some known limitations. These limitations are not inherent to CDSE, but are the result of development time constraints. Kite's main limitations are the following:

Memory Handling: Using Klee and LLVM's structure to implement the proposed technique had one main drawback: memory handling is usually conservative; thus, global memory and pointers are harder to reason about. The absence of phi-nodes and the lack of pointer analysis impact the conflict analysis, making it less efficient and requiring special handling.

Because the use-def analysis implemented in LLVM doesn't include memory accesses, Kite cannot predict which join points in the CFG could potentially change the value of a memory location. Consequently, Kite guards the expression generated from a memory read with the guards that correspond to all branches taken between the read and the write operations.
Since this approach over-approximates the possible interference other paths could have on a value read from a memory position, it is safe, but not very efficient. Therefore, Kite's preprocessing step always includes the LLVM optimization passes that reduce the number of memory accesses. Section 7.2 suggests and discusses other preprocessing techniques that delay memory writes, anticipate memory reads, or use memory shadows to improve CDSE learning.

Symbolic Pointer Dereference: Kite does not support the dereference of pointers with symbolic values. Whenever a pointer is dereferenced, Kite checks whether the pointer's value is constant or not. In the former case, the instruction execution can access only one memory object through this pointer. In the latter, more than one memory object could be reached through this pointer, and this would require some adaptations in the expression generation, or in the CDSE learning scheme.

Originally, Klee forks its execution whenever it reaches a pointer dereference that may result in different memory accesses. Klee creates a new process for each memory position that can be accessed. In order to reuse Klee's approach, Kite could include an f CFG variable vi associated with the memory position it chose to follow, and guard the dereference command with a guard γi. This would allow Kite to learn and backtrack to this decision, and explore paths with different memory dereferences.

Another solution to this problem would be the usage of conditional accesses inspired by model checking. The result of the dereference operator "*" is the value of a variable ν if the pointer's value is equal to the address of ν (the reference &ν). For example, if a pointer ρ may hold the address of two variables νi or νj, the dereference of ρ would be decoded as:

*ρ ⇒ (ρ = &νi) ? νi : νj        (4.1)

Furthermore, this change could be applied statically by a code transformation pass executed during preprocessing.
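The conditional-access encoding of Equation (4.1) generalizes to any number of candidate objects as a chain of if-then-else selects. A small sketch, using invented tuple-based expression constructors rather than Kite's real expression classes:

```python
# Sketch: encode a symbolic pointer dereference as nested if-then-else terms
# over the candidate objects, as in Equation (4.1). Expressions are plain
# tuples here, purely for illustration.

def encode_deref(pointer, candidates):
    """candidates: list of (address, value) pairs the pointer may alias.
    Builds (p == &v1) ? v1 : ((p == &v2) ? v2 : ...)."""
    addr, value = candidates[0]
    if len(candidates) == 1:
        return value          # only one object can be accessed
    rest = encode_deref(pointer, candidates[1:])
    return ("ite", ("eq", pointer, addr), value, rest)
```

A pointer analysis run during preprocessing would shrink the candidate list, keeping the encoded term small.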
Different pointer analyses could be applied to restrict the number of variables that could be dereferenced in a certain instruction.

Function Pointer Call: The same intuition behind the symbolic pointer dereference could be applied to encode function pointer calls. The control flow transition will follow a certain path depending on the pointer's value, i.e., it can be translated as nested if-statements that compare function addresses to the pointer's value. Once more, Kite could replace every function call by applying a code transformation pass together with a pointer analysis during the preprocessing stage.

Chapter 5

Results

This chapter presents practical evidence that conflict-driven symbolic execution (CDSE) can perform better than regular symbolic execution, as well as model checking. For that, Kite is compared to the symbolic execution tool Klee [10], and to the model checker CBMC [11]. Additionally, this chapter presents the impact of the decision heuristics on Kite's performance on the benchmarks used. These results support the choice of Kite's best overall configuration.

5.1 Testing Environment

In order to assess Kite, all tests were performed using a benchmark composed of the ntdrivers-simplified and ssh-simplified instances retrieved from the 2013 Competition on Software Verification (svcomp13) [4]. The tests were slightly modified to comply with Klee's standards and limitations, without breaking CBMC compatibility. The non-deterministic functions were renamed, and every loop was bounded to 50 iterations. In these benchmarks, the ERROR label represents a bug, and the label was replaced by an assert(false) statement.

Because the names of the tests can exceed 50 characters, the tests were renamed to make all tables and graphs more readable. The renaming followed the test classification according to the existence of a property violation. A test can either be considered safe or unsafe.
The former indicates that all assertions should hold, while the latter indicates that the test has a bug, and at least one assertion is violated. Thus, the tests were renamed to either safe n or unsafe n, where n is a unique identification number. Appendix A presents a table with the mapping between the original test names and their simplified names.

The CBMC version used was 4.7, which corresponds to the latest release by the date the tests were performed. Since Klee doesn't have any officially numbered releases, every test was performed using two different changesets. The first version corresponds to Klee's changeset from which Kite was forked,¹⁰ and the second one corresponds to Klee's latest changeset by the time the tests were executed.¹¹ However, this chapter only presents the results obtained by the forked version, because this version performed better than the latest version. Additionally, the forked version is closer to Kite's implementation. See Appendix A for the performance obtained using Klee's latest changeset.

Additionally, this chapter presents many different versions of Kite according to the different decision heuristics employed (see Section 4.5 for definitions). This chapter will use the following names to differentiate each heuristic:

Kite KG: Represents the Always Guide strategy.

Kite KB: Represents the Black Box strategy.

Kite KM(k): Represents the Maximize Learning strategy, where a constant k is given as the threshold parameter. For example, KM(10) corresponds to the heuristic that guides the SAT search if the last learned clause size is smaller than or equal to 10% of the explored path size (in number of branches taken).

All the results presented were obtained from a single machine with an Intel(R) Core(TM) i7-2600K processor and 16GB of memory. The machine's operating system was openSUSE version 12.1. Additionally, the tests had
Additionally, the tests hadno memory limit, but they had a time limit of 200s per test.10Changeset 9b5e99905e6732d64522d0efc212f3f1ce290ccc from September 12th, 201211Changeset e49c1e1958e863195b01d99c92194289b4034bbb from January 21st, 2014615.2. Overall Evaluation5.2 Overall EvaluationThis first analysis compares Kite?s performance to Klee?s [10] and CBMC?s[11] performance. As will be discussed in Section 5.3, engine KM(10) hadthe best performance overall; thus, this engine is used by default, and it isthe one presented in this section.Kite and Klee were executed with the following switches:--optimize: Runs LLVM optimization passes before symbolic execution.--exit-on-error: Aborts symbolic execution whenever an error is found.--no-output: Doesn?t generate test cases for each complete path explored.CBMC was run using the following switch:--no-unwinding-assertions: Doesn?t generate unwinding assertions forthe program loops.These switches were chosen to optimize performance, and to mimic the samebehavior for all tests. Every tool was set to run until it found an assertionfailure, or proved that all assertions hold.All tools concluded the verification before the time limit established,and every result was consistent to the properties safety. No false errors weredetected in any safe test, and every tool reported one assertion violation inevery unsafe test.Table 5.1 presents the time taken by each tool to solve each safe instance,and the total time to solve the entire safe test set. In this test set, Kitehad the fastest solving time (203.55s), and it was 1.55? faster than Klee(315.60s). CBMC presented the worst performance, and it spent 1410.99sto solve the test set; thus, CBMC was 6.93? slower than Kite and 4.47?slower than Klee.Even though the focus of this thesis is on the impact of adding learn-ing capability to symbolic execution, it is important to highlight that bothsymbolic execution tools performed better than the model checking tool. 
As seen in Table 5.1, Klee and Kite were superior to CBMC in all instances.

Test       CBMC     Klee    Kite
safe 01      1.35    0.16    0.28
safe 02      0.41    0.45    0.17
safe 03      0.30    0.06    0.04
safe 04      0.51    0.14    0.25
safe 05      0.11    0.02    0.02
safe 06      0.16    0.04    0.02
safe 07     52.51    0.93    1.28
safe 08     53.92    0.92    1.26
safe 09     67.37    0.95    1.25
safe 10     54.45    0.92    1.27
safe 11*   169.76   68.58   11.60
safe 12      2.82    0.09    0.61
safe 13      0.61    0.09    0.13
safe 14*   157.06   40.72   18.76
safe 15*   158.55   40.36   15.91
safe 16*   159.71   40.39   42.01
safe 17*   184.12   40.62   34.36
safe 18*   172.86   40.30   38.33
safe 19*   174.41   39.86   36.00
Total     1410.99  315.60  203.55

Table 5.1: Execution time (in seconds) spent by Klee, CBMC and Kite to prove that each test case respects its assertions. Since every tool spent more than 10s to solve the 7 instances marked with a "*", they will be classified as hard, in contrast to the other instances, classified as easy, which could be solved in less than 1s by at least one tool.

This shows that using symbolic execution to prove properties given an upper loop bound is feasible, and that symbolic execution can be more efficient in certain cases.

Comparing Kite against Klee, the former performed significantly better in almost all hard instances, i.e., those where every tool spent more than 10s. The greatest speed-up in these instances happened while solving test safe 11, where Kite was 5.9× faster than Klee. Kite was slower than Klee in only one hard instance, safe 16, but the time difference was only 4%.

However, Kite did not outperform Klee in the easy instances. The two main reasons for this behavior are Kite's initial overhead, and Kite's backtracking implementation. First, the time spent by Kite's preprocessor depends on the size of the program's control flow graph, and it is significant only when the total solving time is small.
Besides that, the cost of backtracking past unblocked branches may overcome the benefit of eliminating some paths from the entire analysis (refer to Section 4.3, page 50, for more details). Section 7.2 suggests an optimization that can be applied to reduce the number of instructions executed due to backtracking.

In the unsafe instances, Klee had the best overall performance, as shown in Table 5.2. Klee took 4.23s, while Kite spent 5.50s and CBMC spent 1602.30s. Both Kite's and Klee's search heuristics were efficient at finding the assertion violations, and they solved every instance in less than 2s. As mentioned before, in instances where the solving time is short, Kite's implementation overhead had a significant impact on the solving time.

Test         CBMC     Klee    Kite
unsafe 01      0.31    0.24    0.03
unsafe 02      0.49    0.14    0.06
unsafe 03      0.18    0.04    0.02
unsafe 04      1.83    0.07    0.19
unsafe 05    130.86    0.04    0.03
unsafe 06    178.63    0.62    0.78
unsafe 07    197.32    1.60    1.62
unsafe 08    185.79    0.27    1.35
unsafe 09    172.38    0.15    0.05
unsafe 10    167.25    0.09    0.06
unsafe 11    154.35    0.10    0.07
unsafe 12     52.92    0.13    0.46
unsafe 13    183.94    0.04    0.04
unsafe 14     67.48    0.18    0.43
unsafe 15     54.34    0.14    0.06
unsafe 16     54.23    0.38    0.25
Total       1602.30    4.23    5.50

Table 5.2: Execution time (in seconds) spent by Klee, CBMC and Kite to find a bug in each test case.

CBMC had worse performance than Kite and Klee. This result supports the intuition that symbolic execution is usually faster than model checking at finding property violations. This big difference is also a consequence of CBMC's loop unrolling strategy. CBMC unrolls every program loop to its upper bound prior to solving the instance. Therefore, it generates a big formula, even though the assertion violation might be reached by executions that perform fewer loop iterations. Because of that, CBMC's performance is closely related to the loop bound provided by the user.

Overall, Kite had the best performance on the benchmarks used. As shown in Table 5.3, Kite solved all instances in 209.05s, while Klee spent 319.83s, and CBMC took 3013.29s.

Instances Type     CBMC     Klee    Kite
Safe             1410.99   315.60  203.55
Unsafe           1602.30     4.23    5.50
Total            3013.29   319.83  209.05

Table 5.3: Total time spent (in seconds) by each tool to solve the benchmarks. Overall, Kite had the best performance, and CBMC had the worst performance.

Besides the improvement in symbolic execution performance, the benefit of CDSE becomes clearer when viewed in terms of the number of instructions symbolically executed. This number can be reduced when the engine prunes paths out of its analysis. Figures 5.1(a) and 5.1(b) depict the impact of Kite's learning scheme on the number of instructions symbolically executed in safe and unsafe instances, respectively.

Despite the backtracking cost, Kite still tends to execute fewer instructions than Klee, for both instance types. In all hard instances, Kite executed fewer instructions, and this gain came close to one order of magnitude. See Appendix A for a complete table with the exact number of instructions symbolically executed by Klee and Kite.

5.3 Evaluation of the Decision Engines

The decision engine plays an important role in any conflict-driven clause learning (CDCL) approach, and the same applies to CDSE. Tables 5.4 and 5.5 show the performance differences for safe and unsafe instances obtained by changing Kite's decision heuristic.

Engine Always Guide (KG) had the best performance in the safe test set. This engine tries to minimize the number of levels backtracked by guiding its SAT solver; consequently, it reduces the penalty of discarding unblocked processes. However, this engine tends to explore the same search space for a long time. Therefore, this strategy impacted the time to find the assertion violation in test unsafe 7.

In contrast to KG, the engine derived from the Black Box strategy (KB) never guides the SAT solver.
Because the SAT solver used has no knowledge about the branch precedence, nor which path was taken, this engine's backtracking strategy is more random. Thus, this engine tends to explore different parts of the execution tree, and pays a high price for discarding many unblocked processes. This impacted the engine's performance in all hard safe instances. This engine also did not perform well in the test unsafe 7, although it was almost 10× faster than the other engines on the unsafe 8 instance.

The Maximize Learning strategy tries to balance the cost and benefit of guiding the SAT solver during CDSE. This heuristic chooses when to guide the SAT solver according to the "quality" of the search space. The engine quantifies the quality of the search space by computing the ratio between the learned clause size and the current path size. The search space is classified as good enough if this ratio is smaller than a given threshold k.

Three different thresholds were tested: 5%, 10% and 15%. The higher the threshold, the more often the decision engine tends to guide the SAT solver. Thus, the engine KM(15) had a behavior similar to the KG engine. Among the three Maximize Learning engines, KM(15) had the best performance in the safe instances, but spent a significant amount of time to solve unsafe 7.

[Figure 5.1: two log-log scatter plots (axes from 10^2 to 10^6) of the number of instructions symbolically executed, Klee on the x-axis and Kite on the y-axis; (a) safe instances, (b) unsafe instances.]

Figure 5.1: Number of instructions symbolically executed by Klee and Kite on safe and unsafe instances. As expected, Kite symbolically executed fewer instructions in the majority of the tests, even in unsafe instances where Klee has better performance.

The poor performance of KM(15) on unsafe 7 had a large impact on its overall performance, and this one long runtime made KM(15) overall worse than either KM(10) (the best performer) or KM(5).
The KM(10) engine was also superior to KM(5). In fact, KM(10) outperformed KM(5) in both the safe and unsafe test sets.

Therefore, the best overall result was obtained using the Maximize Learning strategy with a 10% threshold, KM(10). Although this engine was not the best in the safe instances, KM(10) still performed well on them. Moreover, this engine was the best one in the unsafe test set by a significant margin. KM(10) was more than 2× faster than the other engines in the unsafe instances.

Notice that the testing shown in this section is limited to two benchmarks; although KM(10) had the best performance on these benchmarks, no conclusion can be drawn about other instances. Further testing is necessary to establish a better pattern, and to establish which engine is better in which cases.

Test       KB      KG    KM(5)  KM(10)  KM(15)
safe 01    0.62    0.27    0.30    0.28    0.27
safe 02    0.34    0.19    0.25    0.17    0.18
safe 03    0.04    0.04    0.04    0.04    0.05
safe 04    0.76    0.25    0.25    0.25    0.25
safe 05    0.01    0.01    0.01    0.02    0.02
safe 06    0.02    0.02    0.02    0.02    0.02
safe 07    2.02    1.30    1.35    1.28    1.27
safe 08    2.31    1.33    1.31    1.26    1.46
safe 09    1.98    1.25    1.29    1.25    1.30
safe 10    2.08    1.25    1.32    1.27    1.27
safe 11   33.26   11.57   11.78   11.60   12.66
safe 12    0.58    0.56    0.57    0.61    0.58
safe 13    0.17    0.12    0.13    0.13    0.13
safe 14   46.48   16.93   17.26   18.76   19.07
safe 15   41.42   15.41   16.75   15.91   17.17
safe 16   71.25   31.73   44.70   42.01   40.14
safe 17   64.35   24.91   37.10   34.36   30.22
safe 18   58.39   25.68   40.74   38.33   34.51
safe 19   67.48   27.28   39.36   36.00   32.41
Total    393.56  160.10  214.53  203.55  192.98

Table 5.4: Effect of the different decision engines on Kite's execution time (in seconds) to verify the safe instances.
Test        KB       KG       KM(5)   KM(10)  KM(15)
unsafe 01    0.05     0.03     0.04    0.03    0.04
unsafe 02    0.07     0.06     0.06    0.06    0.06
unsafe 03    0.03     0.02     0.02    0.02    0.02
unsafe 04    0.52     0.19     0.11    0.19    0.19
unsafe 05    0.05     0.03     0.03    0.03    0.03
unsafe 06    0.49     0.55     0.74    0.78    0.56
unsafe 07    Timeout  Timeout  6.77    1.62    27.88
unsafe 08    0.17     1.35     2.73    1.35    1.36
unsafe 09    0.06     0.05     0.04    0.05    0.04
unsafe 10    0.50     0.06     0.07    0.06    0.06
unsafe 11    0.52     0.09     0.09    0.07    0.08
unsafe 12    0.37     0.45     1.09    0.46    0.44
unsafe 13    0.04     0.04     0.03    0.04    0.04
unsafe 14    0.69     0.43     0.45    0.43    0.43
unsafe 15    0.59     0.06     0.07    0.06    0.06
unsafe 16    0.44     0.46     0.08    0.25    0.24
Total      >204.59  >203.87   12.42    5.50   31.53

Table 5.5: Effect of the different decision engines on Kite's execution time (in seconds) to find a property violation in the unsafe instances. On the instance unsafe 07, two engines did not finish before the 200s timeout; thus, the total times reported for these engines are not precise.

Chapter 6

Related Work

There is a vast literature on software verification, which goes beyond the scope of this thesis to survey. This chapter considers only works that aim to reduce the number of paths explored during symbolic execution. To the best of my knowledge, there are only four other approaches reported in the literature that can successfully reduce this number. All of them learn facts once a conflict is reached, i.e., when a certain branch is proven infeasible under a certain path condition. These approaches, however, differ on which facts are learned, and on how they employ those facts in order to prune paths from the search.

This chapter describes these approaches, and contrasts their learning schemes with the one introduced in this thesis.
These approaches are described in two sections: the first introduces learning schemes based on interpolants, while the second describes approaches based on conflict clauses, which are closer to conflict-driven symbolic execution (CDSE).

Before presenting the different learning schemes, it is important to highlight the distinction between early conflict detection and conflict preemption. In both cases, the symbolic execution engine stores all the knowledge it acquires from previous executions in a database, and uses this knowledge to prune the search space. However, the two scenarios differ in how the engine can employ this knowledge.

Early Conflict Detection: In early conflict detection, the symbolic execution engine uses its knowledge to interrupt the symbolic execution of a path before reaching an infeasible branch. In other words, the engine uses the database to constrain the symbolic execution, and it may trigger a conflict while symbolically executing a feasible branch. Therefore, the engine can determine whether the next branch would never lead the current exploration into a target violation.

Conflict Preemption: In conflict preemption, the symbolic execution engine uses its database to restrict the set of paths that can be followed. Thus, the database offers a mechanism to avoid the paths that would trigger a known conflict. Once the engine starts executing a new path, it will not hit a conflict that is kept in the database: it will either find a valid counterexample, or it will reach an infeasible branch, which represents a new conflict.

For example, Section 3.1 explains how CDSE explores only two execution paths to prove that the program in Listing 1.1 (at page 5) does not violate its embedded assertion, even though there are 8 feasible paths that lead the program to the execution of the assertion statement.
Consider the program control flow graph (CFG) depicted in Figure 6.1. After exploring the paths {b1, b3, b4, b6, b7, b9, b10} and {b1, b3, b4, b6, b8, b10}, CDSE learns the following clauses:

(¬b7 ∨ ¬b9 ∨ ¬b10) ∧ (¬b8 ∨ ¬b10)

Each learned clause is derived from a unique conflict.

Figure 6.1: The control flow graph for the program described in Listing 1.1. Its basic blocks BA through BI are connected by the branch edges b1 through b11; basic block BH (abort) represents the assertion failure, and BI (exit) represents normal termination.

Notice that these two clauses alone are not sufficient to prove the property's correctness without exploring other paths: the two clauses by themselves form a satisfiable formula. However, because the CDSE engine initially populates its database with the constraints extracted from the CFG, it can conclude that there isn't any feasible path that will trigger the assertion failure in this case. Thus, the CDSE engine uses the f_CFG to prune all paths that violate a certain learned clause. CDSE can preempt which decisions would lead the search to hit a known conflict, and it will never try to follow conflicting decisions.

However, the same is not true for a learning scheme that can learn only these two clauses, but that does not integrate any other knowledge about the CFG. Such an engine does not know yet that all paths that lead the execution to b10 have to include either b7 and b9, or b8. Thus, it has to continue exploring the search space to guarantee that the assertion failure can't be reached. Nevertheless, these two clauses can still be used to detect a conflict before reaching an infeasible branch. For example, if the engine follows the path {b1, b3, b5, b7, b9}, these decisions are enough for the engine to know that b10 must be false.
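The deduction that b10 must be false is ordinary unit propagation over the learned clause; a minimal sketch follows, using an illustrative (variable, polarity) literal encoding rather than Kite's internal representation.

```python
def unit_implied(clause, assignment):
    """Unit propagation on one clause: if the assignment falsifies every
    literal but one, the remaining literal is forced true."""
    unassigned = [(v, p) for v, p in clause if v not in assignment]
    rest_falsified = all(assignment[v] != p
                         for v, p in clause if v in assignment)
    if rest_falsified and len(unassigned) == 1:
        return unassigned[0]
    return None

# Learned clause (not b7 or not b9 or not b10): one (var, polarity) pair
# per literal, where polarity False stands for a negated variable.
clause = [("b7", False), ("b9", False), ("b10", False)]

# Branch decisions made along the path {b1, b3, b5, b7, b9}.
decisions = {"b7": True, "b9": True}

# The clause forces b10 = False before the branch is ever queried.
print(unit_implied(clause, decisions))  # ('b10', False)
```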
Thus, it will be able to detect the conflict after executing BG, but before checking if the branch b10 is feasible.

6.1 Interpolation-Based Learning Schemes

Given a pair of logical formulas φA and φB, where φA ∧ φB is unsatisfiable, a Craig interpolant [13] is a set of constraints I with the following properties:

1. φA implies I;
2. I ∧ φB is unsatisfiable;
3. I contains only variables that occur in both φA and φB.

The interpolant I can be used to abstract φA while keeping the invariant that the conjunction with φB cannot be satisfied. Therefore, interpolation-based approaches extract interpolants from unsatisfiable queries, where a certain branch condition is infeasible under a path condition pc, and use them to block state transitions that trigger conflicts previously learned.

There are two interpolation approaches described in the research literature, Lazy Annotation [34] and Abstraction Learning [26]. These approaches were developed independently, but both roughly implement the same algorithm to verify loop-free programs. The algorithm description below follows the Lazy Annotation implementation.

For loop-free programs, Lazy Annotation proceeds as follows. The engine symbolically executes paths in a depth-first manner, and it follows a path until a conflict arises, or until a violation is reached. The interesting case happens when there is a conflict. In this case, the engine extracts an interpolant that summarizes a reason for this conflict, and that blocks the current execution state. The algorithm annotates the infeasible edge with the extracted interpolant, and backtracks to a non-blocked state. Whenever the execution reaches an edge with some annotation, the engine uses the annotated interpolant I to determine if pc will trigger a conflict detected in previous iterations. If pc implies I, then pc would trigger the same conflict used to generate I, i.e., I blocks the current execution state; thus, the engine interrupts the current search, and backtracks.

If the engine detects a new conflict at an edge already annotated, the engine extracts a new interpolant, and extends the previous annotation by creating a disjunction with the new interpolant. Moreover, whenever all outgoing edges of a node are infeasible under a certain path, the engine extracts a new interpolant for the given node. This interpolant is derived from the conjunction of the conditions that block the outgoing edges.

On one hand, interpolants offer a safe abstraction that summarizes why certain path conditions lead the program execution into a conflicting state. Because the interpolants express path conditions that may trigger a conflict in certain parts of the code, interpolants offer a more fine-grained abstraction than the conflict clauses extracted in CDSE. Thus, they can potentially detect conflicts that CDSE can't.

Additionally, interpolation can be used to perform unbounded analysis. As shown by McMillan [35], interpolation provides an approximate image operator, which can be used iteratively to compute an over-approximation of the set of reachable states that do not trigger a property violation. If this computation reaches a fixpoint, the over-approximate set is an inductive invariant that proves that the property won't be violated by any reachable state. This unbounded analysis in the context of symbolic execution is better explored in Abstraction Learning, which is capable of proving correctness of programs with loops in a reasonable amount of time [26].

On the other hand, neither interpolation-based approach is capable of reasoning about the restrictions represented in the CFG, and they cannot preempt future conflicts as CDSE does.
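The three interpolant properties can be checked by brute force on a toy pair of formulas (the formulas below are illustrative, not taken from the thesis): φA = a ∧ (a → b) and φB = ¬b are jointly unsatisfiable, and I = b is a Craig interpolant for them.

```python
from itertools import product

def models(formula, variables):
    """All satisfying assignments of `formula` over `variables`."""
    return [dict(zip(variables, bits))
            for bits in product([False, True], repeat=len(variables))
            if formula(dict(zip(variables, bits)))]

VARS = ["a", "b"]
phi_A = lambda v: v["a"] and ((not v["a"]) or v["b"])  # a and (a -> b)
phi_B = lambda v: not v["b"]                           # not b
I = lambda v: v["b"]                                   # candidate interpolant

# Property 1: phi_A implies I (every model of phi_A satisfies I).
print(all(I(m) for m in models(phi_A, VARS)))          # True
# Property 2: I and phi_B together are unsatisfiable.
print(models(lambda v: I(v) and phi_B(v), VARS))       # []
# Property 3 holds by construction: I mentions only b, the one
# variable shared by phi_A and phi_B.
```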
In Lazy Annotation, and in Abstraction Learning, the execution identifies a blocked state only when the execution reaches a branch where the current state implies the annotated interpolant. The engine has to visit every feasible edge at least once.

The program shown in Listing 6.1, retrieved from the Lazy Annotation paper [34], illustrates how the interpolation-based approaches employ their knowledge for early conflict detection.

diamond:
L1   assert(p);
L2   if (*)
L3     a = 1;
L4   else
L5     a = 0;
L6   if (a)
L7     x = x + 1;
L8   else
L9     x = x - 1;
L10  if (!p)
L11    error();

Listing 6.1: A program fragment retrieved from [34]. In this example, the error() function represents an assert(false), which cannot be reached during any program execution. The assert(p) statement states that every execution considered in this example should assume that p is initially set to true.

For this example, Lazy Annotation explores three different conflicts, and one blocked path, in order to prove that the error function is unreachable. Following the original description [34], the algorithm could start by choosing to execute the path that visits L3 and L7. The first conflict is detected when the engine tries to follow branch L10 → L11. The algorithm then backtracks to node L6 and annotates L6 → L7 with label p.

Then, a new conflict is detected while trying to execute L6 → L9. The engine annotates this edge with the label a, and annotates L6 with p ∧ a. The engine backtracks until L2, and it executes the third path via L5 until it reaches L6.

The engine does not follow the branch L6 → L7, since it is blocked by label p, and the current path constraints imply p. Then, the engine follows only the branch L6 → L9. It eventually triggers a conflict while trying to reach L11. Thus, it labels L6 → L9 with p. Location L6 is also labeled with p, since L6's label is derived from the disjunction of its old label p ∧ a and p.

This example shows how Lazy Annotation can prune the search space by detecting a conflict once it reaches L6 through L5, even though the edge L6 → L7 is feasible. However, this is also a good example of the benefit of integrating structural information about the program into the learning scheme. Notice that L11 is only reachable if !p is true at L10, but the variable p is never modified, and its value comes from the statement at L1. Thus, it is possible to prove this property by executing only one single path.

6.2 Clause Learning Schemes

In the same direction taken by this thesis, [30] and [25] also present symbolic execution approaches that have a structure similar to CDCL SAT solvers. At a high level, both approaches are closely related to CDSE, since they also summarize a conflict found as a function of the branches taken in the currently executed path. Unfortunately, neither clause learning approach has a publicly available implementation; therefore, it was not possible to include them in the tests presented in Chapter 5.

The solution presented by Krishnamoorthy, Hsiao, and Lingappan [30] explores the program in a depth-first search manner, and implements nonchronological backtracking. Whenever a conflict is detected, this approach applies the conflict analysis included in its constraint solvers to detect which assignments in a query (the conjunction of a path condition pc with a branch condition) are responsible for the conflict. The engine has to map the conflicting assignments to the branch conditions where they were made. Then, the algorithm learns how to avoid this conflict, and backtracks to the most recently visited node that does not imply this conflict, and that has a non-blocked outgoing edge. This approach does not encode the CFG as a CNF, and it does not use a SAT solver to enumerate paths.
Consequently, it only detects a conflict when the engine selects a new branch to explore. After choosing which branch to take next, the engine checks if the path violates one of the clauses in the database. If a violation is detected, the engine does not follow the chosen branch. Instead, the engine tries to take the other branch that originates at the current location. If all branches at a location under the current path are blocked, the engine backtracks. Otherwise, it continues executing the chosen branch.

Because the learned clauses are checked at every decision, this algorithm is able to perform early conflict detection. However, this learning scheme only stores the constraints that were learned during conflict analysis; consequently, the engine does not employ any knowledge about the unblocked paths that are available in a certain search space, and it cannot preempt known conflicts.

The algorithm description is not clear about how conflicting assignments are mapped to the branches. Additionally, this approach does not apply any other information about the CFG to reduce the number of assignments that can be considered conflicting. The encoding may have a negative impact on the strength of the clauses learned.

The Satisfiability Modulo Path Programs [25] technique, or just SMPP, is the other approach that learns facts about the branches taken in a path that led the execution into a conflict. For loop-free programs, SMPP resembles CDSE, since SMPP also encodes the program CFG as a conjunctive normal form (CNF) formula, and it also uses a SAT solver to enumerate the paths. For each satisfying assignment, the engine follows the branches that determine the chosen path, and the execution stops whenever an infeasible branch or an error is found. In the case of an infeasible path, the algorithm also learns a clause that forces the algorithm to avoid the refined set of conflicting branches.
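Both clause learning schemes block any path that would complete a previously learned conflict. A minimal sketch of such a database check follows, with a hypothetical clause representation (a learned conflict is stored as the set of branch decisions that must not all be repeated); it is not the representation used by either tool.

```python
def completes_conflict(path, conflict):
    """True if `path` contains every branch decision of a learned
    conflict; both map a branch id to the direction taken."""
    return all(path.get(b) == d for b, d in conflict.items())

def branch_allowed(path, branch, direction, database):
    """Early conflict detection: reject the next branch choice if taking
    it would complete any conflict stored in the database."""
    candidate = dict(path)
    candidate[branch] = direction
    return not any(completes_conflict(candidate, c) for c in database)

# Hypothetical learned conflict: taking branch 1 true together with
# branch 3 true is known to be infeasible.
database = [{1: True, 3: True}]
path = {1: True, 2: False}

print(branch_allowed(path, 3, True, database))   # False: known conflict
print(branch_allowed(path, 3, False, database))  # True: safe to follow
```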
Thus, SMPP can also preempt conflicts in the same way that CDSE does.

SMPP has been proposed for different verification approaches, and extended to symbolic execution. Despite the similarity between SMPP and CDSE, the two algorithms have subtle differences that can greatly impact their performance and efficiency in practice, such as:

Conflict Analysis: One key difference between SMPP and CDSE is the conflict analysis procedure. SMPP requires the constraint solver to determine an unsatisfiability core for an unsatisfiable query. (As defined in Section 2.3.3, given an unsatisfiable formula φ, an unsatisfiability core is a sub-formula φ′, φ′ ⊆ φ, that is also unsatisfiable.) Additionally, the SMPP engine has to map each conflicting assignment returned by the constraint solver to the branch previously taken (conditional or unconditional). Then, the algorithm performs an interference analysis to refine the set of branches responsible for the conflict. This analysis determines which other paths intersect with the current one, which could result in a different set of assignments that might not lead the execution to trigger the same conflict.

Therefore, SMPP demands an interference analysis every time a new conflict is detected, while CDSE doesn't. CDSE avoids the extra work because its symbolic encoding includes some information about the program structure. Additionally, this encoding leverages SSA analysis to reduce the information included, and consequently reduces the conflict clause detected by the constraint solver. Because CDSE uses the assumption mechanism implemented by incremental constraint solvers, it also does not require the constraint solver to determine the unsatisfiability core of a conflicting query.

Query Optimizations: In CDSE, the instruction guards provide a clean way to implement all query optimizations included in modern symbolic execution tools.
As described in Section 4.3, these optimizations are applied not only to accelerate the solving process, as in Klee: constraint simplification techniques often find that, on certain paths, some branch conditions are constant. In these cases, the guard mechanism allows the CDSE engine to learn what caused the condition to be constant. The engine uses such information to learn facts even before attempting to take an infeasible branch, i.e., before the execution reaches a conflict.

f_CFG Encoding: Another difference between SMPP and CDSE is the CFG formula encoding. SMPP encodes the CFG formula to be satisfied only if the variables assigned to true represent one single path. In other words, the constraints added to the CFG are stronger: a node variable can be assigned to true only if exactly one of its predecessors is true, and if exactly one of the node's successors is also true. Therefore, these constraints are translated to XOR functions. Since the description of an XOR function over n variables in CNF requires 2^(n-1) clauses, the SMPP encoding results in a bigger and more complex CNF formula.

Loop Handling: For programs with loops, SMPP can use an alternative CFG without loops by reducing the original graph using its maximal strongly connected components. Although the authors suggest loop unrolling to deal with loops in symbolic execution, their implemented approach performs an over-approximated symbolic execution, where all the assignments inside a loop are replaced by non-deterministic values.

Chapter 7

Conclusion

The classic symbolic execution algorithm introduced by King [28] suffers from a problem known as path explosion. Assuming that a program to be verified has a finite execution tree, symbolic execution explores every feasible execution path.
In the worst-case scenario, the number of execution paths is exponential in the number of conditional branches in a program.

As presented in Chapter 6, there are just a few approaches that address this problem, and most symbolic execution tools still explore the entire execution tree. This thesis presents a novel symbolic execution algorithm that successfully reduces the number of paths necessary to prove property correctness.

7.1 Contributions

This thesis's main contribution is a novel algorithm that can dynamically reduce the number of paths explored during symbolic execution. This algorithm is capable of learning from conflicts detected while symbolically executing a path. Because this algorithm was inspired by the insights introduced in conflict-driven clause learning (CDCL) SAT solvers, I named it Conflict-Driven Symbolic Execution (CDSE).

The algorithm presented in this thesis has many interesting properties that make it a unique symbolic execution extension. These properties are outlined below:

Lightweight learning scheme: The CDSE learning scheme requires just a few changes to symbolic execution, and it takes advantage of the compiler's SSA format to reduce the number of guarded instructions. Additionally, this learning scheme leverages incremental solver techniques to perform lightweight conflict analysis, and to efficiently solve the control flow graph formula (f_CFG) after adding new learned clauses.

Small CFG formula encoding: In contrast to the SMPP algorithm [25], CDSE does not use XOR operations to encode the program control flow graph (CFG). Consequently, the CDSE encoding (f_CFG) can be easily satisfied, and the size of the f_CFG is linear in the size of the CFG.

Error preemption: Because CDSE combines the CFG structure and the facts learned into a single formula, CDSE can use a SAT solver to restrict the search.
Consequently, CDSE can preempt conflicts already learned, and it only executes paths that have not been blocked yet.

Learning without conflicts: The CDSE algorithm can leverage the constant propagation method not only to eliminate queries, but also to learn potential conflicts without hitting them. A constant branch condition always implies that one direction cannot be taken; thus, the CDSE engine can learn why one direction is infeasible under the current constraints, even if this direction is not the one being followed.

Another important contribution of this thesis is the proof-of-concept CDSE tool, Kite. Kite is an open-source tool which can verify software described in C. Furthermore, Kite can be used in the development of future work.

I developed Kite in order to assess the CDSE algorithm, and to compare it to existing verification techniques. The results presented in Chapter 5 give empirical evidence that CDSE can be better than regular symbolic execution at proving property correctness. These results also show that there is more room for improving the CDSE algorithm.

7.2 Future Work

As future work, this thesis suggests the following topics that could improve the CDSE algorithm:

Decision Heuristics: The results in Section 5.3 show how the engine's efficiency is highly dependent on the decision heuristic. Kite introduced a small change in the SAT solver included in the decision engine, which allows the engine to induce the SAT solver to start its search along a path similar to the last one executed. This small change was associated with a mechanism to choose when this induction should be applied. The engines derived from this change performed better than leaving the SAT solver free to decide the f_CFG assignments.

CDSE may still benefit from more sophisticated approaches.
For example, one interesting research direction would be changing the SAT solver's decision heuristic to prioritize assignments to branch variables that are closer to the entry node.

Additionally, the decision heuristics introduced in this thesis could benefit from more diverse test cases. This would improve the heuristics' tuning, and it may provide more information about their strong and weak points.

Loop Handling: Section 3.5 presents two techniques to handle loops, static and dynamic CFG unrolling. Another interesting approach would be combining CDSE with some unbounded model checking technique. In the same way that dynamic CFG unrolling uses assumptions in order to detect which loops may influence a property's proof, this mechanism can be used to find which loops are relevant to a property's proof. For these loops, CDSE could integrate unbounded model checking (UMC) techniques instead of unrolling them. Potentially, this approach would allow a reduction of the logic that has to be analyzed by UMC, since this technique is usually more expensive than bounded symbolic execution.

Parallel Computation: One last important improvement would be the use of multiple CDSE engines at the same time. Inspired by parallel SAT solvers, these CDSE engines could be configured to search in different search spaces, and to share every conflict learned. Furthermore, the engines could run different decision heuristics, creating a portfolio of CDSE solvers.

Additionally, there are more practical directions that could be followed to improve Kite's applicability and efficiency. Some of these improvements would be:

State Discarding: Currently, Kite always discards every state in the decision stack that is higher than the backtrack level. Consequently, Kite has to re-execute a redundant sequence of instructions to restore a discarded state, if needed. However, this limitation could be overcome by keeping track of all unblocked processes, together with a mechanism to recover them.
The main challenge of this approach would be how to determine which processes are alive, and which ones have been completely blocked.

Memory Handling: The CDSE learning scheme employs knowledge of use-def chains to efficiently determine which merging points in the CFG could carry different variable definitions. In Kite, this knowledge is not available for memory accesses; thus, Kite assumes that any merging point between a memory write and a memory read could impact the memory value.

As discussed in Section 4.6, this approach could be improved by adding a use-def analysis of memory accesses. Even though this use-def analysis is sometimes imprecise, it would allow Kite to optimize simple, but common, memory accesses with constant indexes. For more complicated cases, Kite could also take advantage of pointer analysis, and determine a superset of the merging points relevant to some memory region.

Kite would benefit from these changes when solving instances where memory accesses are more complex. These memory accesses cannot be eliminated by the optimization passes included in LLVM.

Symbolic Pointers: This is another limitation of Kite's implementation presented in Section 4.6. This limitation restricted the choice of tests used in Chapter 5.

As discussed in Section 4.6, Kite could either fork at each pointer dereference, one process per possible address, or Kite could encode all possibilities as a conditional access. The first solution, which is implemented in Klee, would require a change in the f_CFG: the f_CFG would need to include the points in the program where these forks could occur. The second approach would require Kite to enumerate all possible values of a pointer, and to encode conditional reads, as well as conditional writes.

Because larger examples tend to require more complete memory handling and support for symbolic pointers, these improvements would let Kite extend the range of problems it can solve. Moreover, Kite could then be used to investigate how scalable CDSE is.
Since Kite excels over other tools forlarger examples, it is reasonable to expect that CDSE may scale to problemsthat traditional symbolic execution can?t.83Bibliography[1] Mars Climate Orbiter Mishap Investigation Board. Phase I Report,NASA, November 1999.[2] Gilles Audemard and Laurent Simon. Predicting Learnt Clauses Qual-ity in Modern SAT Solvers. In Proceedings of the 21st InternationalJoint Conference on Artificial Intelligence, IJCAI?09, pages 399?404,San Francisco, CA, USA, 2009. Morgan Kaufmann Publishers Inc.[3] Domagoj Babic and Alan J. Hu. Calysto: Scalable and Precise Ex-tended Static Checking. In Proceedings of the 30th international con-ference on Software engineering, ICSE ?08, pages 211?220, New York,NY, USA, 2008. ACM.[4] Dirk Beyer. Competition on Software Verification. http://sv-comp.sosy-lab.org/2013/, 2013.[5] Dirk Beyer, Thomas A. Henzinger, Ranjit Jhala, and Rupak Majum-dar. The Software Model Checker Blast: Applications to Software En-gineering. Int. J. Softw. Tools Technol. Transf., 9(5):505?525, October2007.[6] Armin Biere, Alessandro Cimatti, Edmund M. Clarke, and YunshanZhu. Symbolic Model Checking Without BDDs. In Proceedings of the5th International Conference on Tools and Algorithms for Constructionand Analysis of Systems, TACAS ?99, pages 193?207, London, UK,1999. Springer-Verlag.[7] Corrado Bo?hm and Giuseppe Jacopini. Flow Diagrams, Turing Ma-84Bibliographychines and Languages with Only Two Formation Rules. Commun.ACM, 9(5):366?371, May 1966.[8] J. R. Burch, E. M. Clarke, K. L. McMillan, D. L. Dill, and L. J. Hwang.Symbolic Model Checking: 1020 States and Beyond. Inf. Comput.,98(2):142?170, June 1992.[9] J. Burnim and Koushik Sen. Heuristics for Scalable Dynamic TestGeneration. In Automated Software Engineering, 2008. ASE 2008. 23rdIEEE/ACM International Conference on, pages 443?446, 2008.[10] Cristian Cadar, Daniel Dunbar, and Dawson Engler. 
KLEE: Unas-sisted and Automatic Generation of High-Coverage Tests for ComplexSystems Programs. In Proceedings of the 8th USENIX Conference onOperating Systems Design and Implementation, OSDI?08, pages 209?224, Berkeley, CA, USA, 2008. USENIX Association.[11] Edmund Clarke, Daniel Kroening, and Flavio Lerda. A Tool for Check-ing ANSI-C Programs. In Kurt Jensen and Andreas Podelski, editors,Tools and Algorithms for the Construction and Analysis of Systems(TACAS 2004), volume 2988 of Lecture Notes in Computer Science,pages 168?176. Springer, 2004.[12] Edmund M. Clarke, Orna Grumberg, and Doron A. Peled. Model Check-ing. The MIT Press, Cambridge, MA, 1999.[13] William Craig. Three Uses of the Herbrand-Gentzen Theorem in Re-lating Model Theory and Proof Theory. J. Symb. Log., 22(3):269?285,1957.[14] Martin Davis, George Logemann, and Donald Loveland. A machine pro-gram for theorem-proving. Commun. ACM, 5(7):394?397, July 1962.[15] Martin Davis and Hilary Putnam. A Computing Procedure for Quan-tification Theory. J. ACM, 7(3):201?215, July 1960.85Bibliography[16] Vijay D?Silva, Leopold Haller, and Daniel Kroening. Abstract ConflictDriven Learning . In Principles of Programming Languages (POPL),pages 143?154. ACM, 2013.[17] Vijay D?Silva, Leopold Haller, Daniel Kroening, and MichaelTautschnig. Numeric Bounds Analysis with Conflict-Driven Learn-ing . In Tools and Algorithms for the Construction and Analysis ofSystems (TACAS), volume 7214 of LNCS, pages 48?63. Springer, 2012.[18] Niklas Ee?n, Alan Mishchenko, and Nina Amla. A Single-Instance In-cremental SAT Formulation of Proof- and Counterexample-based Ab-straction. In Proceedings of the 2010 Conference on Formal Methodsin Computer-Aided Design, FMCAD ?10, pages 181?188, Austin, TX,2010. FMCAD Inc.[19] Niklas Ee?n and Niklas So?rensson. Temporal Induction by IncrementalSAT Solving. Electr. Notes Theor. Comput. Sci., 89(4):543?560, 2003.[20] Niklas Ee?n and Niklas So?rensson. An Extensible SAT-solver. 
In EnricoGiunchiglia and Armando Tacchella, editors, SAT 2003, volume 2919of Lecture Notes in Computer Science, pages 502?518. Springer, 2004.[21] Vijay Ganesh and David L. Dill. A Decision Procedure for Bit-Vectorsand Arrays. In Proceedings of the 19th International Conference onComputer Aided Verification, CAV?07, pages 519?531. Springer-Verlag,2007.[22] Patrice Godefroid, Nils Klarlund, and Koushik Sen. DART: DirectedAutomated Random Testing. In Proceedings of the 2005 ACM SIG-PLAN Conference on Programming Language Design and Implementa-tion, PLDI ?05, pages 213?223. ACM, 2005.[23] Eugene Goldberg and Yakov Novikov. BerkMin: A Fast and RobustSAT-solver. Discrete Appl. Math., 155(12):1549?1561, June 2007.[24] Carla P. Gomes, Bart Selman, and Henry Kautz. Boosting Combina-torial Search Through Randomization. In Proceedings of the Fifteenth86BibliographyNational/Tenth Conference on Artificial Intelligence/Innovative Appli-cations of Artificial Intelligence, AAAI ?98/IAAI ?98, pages 431?437,Menlo Park, CA, USA, 1998. American Association for Artificial Intel-ligence.[25] William R. Harris, Sriram Sankaranarayanan, Franjo Ivanc?ic?, and AartiGupta. Program analysis via satisfiability modulo path programs. InProceedings of the 37th Annual ACM SIGPLAN-SIGACT Symposiumon Principles of Programming Languages, POPL ?10, pages 71?82, NewYork, NY, USA, 2010. ACM.[26] Joxan Jaffar, Jorge Navas, and Andrew Santosa. Abstraction Learning.In Proceedings of the 8th International Conference on Automated Tech-nology for Verification and Analysis, ATVA?10, pages 17?17. Springer-Verlag, 2010.[27] James C. King. A New Approach to Program Testing. SIGPLAN Not.,10(6):228?233, April 1975.[28] James C. King. Symbolic execution and program testing. Communica-tions of the ACM, 19(7):385?394, July 1976.[29] Alfred Koelbl and Carl Pixley. Constructing efficient formal modelsfrom high-level descriptions using symbolic simulation. 
InternationalJournal of Parallel Programming, 33(6):645?666, 2005.[30] S. Krishnamoorthy, M.S. Hsiao, and L. Lingappan. Tackling the PathExplosion Problem in Symbolic Execution-Driven Test Generation forPrograms. In 19th IEEE Asian Test Symposium (ATS), 2010, pages59?64, 2010.[31] Eric Larson and Todd Austin. High coverage detection of input-relatedsecurity facults. In Proceedings of the 12th Conference on USENIXSecurity Symposium - Volume 12, SSYM?03, pages 9?9, Berkeley, CA,USA, 2003. USENIX Association.87Bibliography[32] Chris Lattner and Vikram Adve. LLVM: A Compilation Frameworkfor Lifelong Program Analysis & Transformation. In Proceedings ofthe International Symposium on Code Generation and Optimization,CGO ?04, pages 75?86, Washington, DC, USA, 2004. IEEE ComputerSociety.[33] Nancy G. Leveson. An Investigation of the Therac-25 Accidents. IEEEComputer, 26:18?41, 1993.[34] Kenneth L. McMillan. Lazy Annotation for Program Testing andVerification. In Proceedings of the 22nd International Conference onComputer Aided Verification, CAV?10, pages 104?118. Springer-Verlag,2010.[35] K.L. McMillan. Interpolation and SAT-Based Model Checking. In War-ren A. Hunt Jr. and Fabio Somenzi, editors, Computer Aided Verifica-tion, volume 2725 of Lecture Notes in Computer Science, pages 1?13.Springer-Verlag, 2003.[36] M.W. Moskewicz, C.F. Madigan, Ying Zhao, Lintao Zhang, and S. Ma-lik. Chaff: engineering an efficient SAT solver. In Design AutomationConference, pages 530?535, 2001.[37] NASA-JPL. Mars Climate Orbiter Fact Sheet. http://web.archive.org/web/20130921065519/mars.jpl.nasa.gov/msp98/orbiter/fact.html, 1999. Archived from the original onSeptember, 2013.[38] Yoonna Oh, Maher N. Mneimneh, Zaher S. Andraus, Karem A.Sakallah, and Igor L. Markov. Amuse: A minimally-unsatisfiable sub-formula extractor. In Proceedings of the 41st Annual Design Automa-tion Conference, DAC ?04, pages 518?523, New York, NY, USA, 2004.ACM.[39] Hristina Palikareva and Cristian Cadar. 
Multi-solver support in symbolic execution. In Natasha Sharygina and Helmut Veith, editors, Computer Aided Verification, volume 8044 of Lecture Notes in Computer Science, pages 53–68. Springer-Verlag, 2013.

[40] Corina S. Păsăreanu and Willem Visser. A Survey of New Trends in Symbolic Execution for Software Testing and Analysis. Int. J. Softw. Tools Technol. Transf., 11(4):339–353, October 2009.

[41] The LLVM Project. LLVM Download Page. http://web.archive.org/web/20131107071125/http://llvm.org/releases/download.html, 2013. Archived from the original on November, 2013.

[42] Jean-Pierre Queille and Joseph Sifakis. Specification and verification of concurrent systems in CESAR. In Proceedings of the 5th Colloquium on International Symposium on Programming, pages 337–351, London, UK, 1982. Springer-Verlag.

[43] Roberto Sebastiani. Lazy satisfiability modulo theories. Journal on Satisfiability, Boolean Modeling and Computation (JSAT), 3(3-4):141–224, 2007.

[44] Koushik Sen, Darko Marinov, and Gul Agha. CUTE: A concolic unit testing engine for C. In Proceedings of the 10th European Software Engineering Conference / 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering, ESEC/FSE-13, pages 263–272, New York, NY, USA, 2005. ACM.

[45] João P. Marques Silva and Karem A. Sakallah. GRASP – A New Search Algorithm for Satisfiability. In Proceedings of the 1996 IEEE/ACM International Conference on Computer-Aided Design, ICCAD '96, pages 220–227, Washington, DC, USA, 1996. IEEE Computer Society.

[46] Mate Soos, Karsten Nohl, and Claude Castelluccia. Extending SAT Solvers to Cryptographic Problems. In Proceedings of the 12th International Conference on Theory and Applications of Satisfiability Testing, SAT '09, pages 244–257. Springer-Verlag, 2009.

[47] Gregory Tassey. The Economic Impacts of Inadequate Infrastructure for Software Testing. Planning Report 02-3, National Institute of Standards and Technology, 2002.

[48] Willem Visser, Corina S.
Păsăreanu, and Sarfraz Khurshid. Test Input Generation with Java PathFinder. SIGSOFT Softw. Eng. Notes, 29(4):97–107, July 2004.

[49] Mark Weiser. Program slicing. In Proceedings of the 5th International Conference on Software Engineering, ICSE '81, pages 439–449, Piscataway, NJ, USA, 1981. IEEE Press.

[50] Lintao Zhang. Searching for Truth: Techniques for Satisfiability of Boolean Formulas. PhD thesis, Princeton, NJ, USA, 2003.

[51] Lintao Zhang and Sharad Malik. Validating SAT Solvers Using an Independent Resolution-Based Checker: Practical Implementations and Other Applications. In Proceedings of the Conference on Design, Automation and Test in Europe - Volume 1, DATE '03, pages 10880–10885, Washington, DC, USA, 2003. IEEE Computer Society.

Appendix A

Extended Results

This appendix presents the result data that was not included in Chapter 5. First, Tables A.1 and A.2 show the mapping between each instance used and its simplified name.

Benchmark             Instance                     Simplified Name
ntdrivers-simplified  cdaudio_simpl1_safe.cil.c    safe 01
                      diskperf_simpl1_safe.cil.c   safe 02
                      floppy_simpl3_safe.cil.c     safe 03
                      floppy_simpl4_safe.cil.c     safe 04
                      kbfiltr_simpl1_safe.cil.c    safe 05
                      kbfiltr_simpl2_safe.cil.c    safe 06
ssh-simplified        s3_clnt_1_safe.cil.c         safe 07
                      s3_clnt_2_safe.cil.c         safe 08
                      s3_clnt_3_safe.cil.c         safe 09
                      s3_clnt_4_safe.cil.c         safe 10
                      s3_srvr_1_safe.cil.c         safe 11
                      s3_srvr_1a_safe.cil.c        safe 12
                      s3_srvr_1b_safe.cil.c        safe 13
                      s3_srvr_2_safe.cil.c         safe 14
                      s3_srvr_3_safe.cil.c         safe 15
                      s3_srvr_4_safe.cil.c         safe 16
                      s3_srvr_6_safe.cil.c         safe 17
                      s3_srvr_7_safe.cil.c         safe 18
                      s3_srvr_8_safe.cil.c         safe 19

Table A.1: Renaming of Safe Instances.

Tables A.3 and A.4 show the performance of both versions of Klee: Kleef corresponds to the changeset from which Kite was forked (changeset 9b5e99905e6732d64522d0efc212f3f1ce290ccc, from September 12th, 2012), and Kleel
corresponds to the latest changeset available at the time the tests were executed (changeset e49c1e1958e863195b01d99c92194289b4034bbb, from January 21st, 2014).

Benchmark             Instance                      Simplified Name
ntdrivers-simplified  floppy_simpl3_unsafe.cil.c    unsafe 01
                      floppy_simpl4_unsafe.cil.c    unsafe 02
                      kbfiltr_simpl2_unsafe.cil.c   unsafe 03
                      cdaudio_simpl1_unsafe.cil.c   unsafe 04
ssh-simplified        s3_srvr_10_unsafe.cil.c       unsafe 05
                      s3_srvr_11_unsafe.cil.c       unsafe 06
                      s3_srvr_12_unsafe.cil.c       unsafe 07
                      s3_srvr_13_unsafe.cil.c       unsafe 08
                      s3_srvr_14_unsafe.cil.c       unsafe 09
                      s3_srvr_1_unsafe.cil.c        unsafe 10
                      s3_srvr_2_unsafe.cil.c        unsafe 11
                      s3_clnt_1_unsafe.cil.c        unsafe 12
                      s3_srvr_6_unsafe.cil.c        unsafe 13
                      s3_clnt_3_unsafe.cil.c        unsafe 14
                      s3_clnt_4_unsafe.cil.c        unsafe 15
                      s3_clnt_2_unsafe.cil.c        unsafe 16

Table A.2: Renaming of Unsafe Instances.

Kleef had better performance than Kleel for all safe instances. For the unsafe instances, the two Klee versions had similar performance. Thus, Kleef had the best overall performance, which explains why this was the version used in Chapter 5.

Finally, Tables A.5 and A.6 show the number of instructions symbolically executed by both Klee versions and all Kite versions to solve the safe and unsafe instances, respectively. As expected, the number of instructions executed by both versions of Klee to solve the safe instances is the same, since they both execute all feasible paths in the program (Table A.5).

Test       Kleef    Kleel
safe 01     0.16     0.25
safe 02     0.45     0.47
safe 03     0.06     0.09
safe 04     0.14     0.19
safe 05     0.02     0.03
safe 06     0.04     0.05
safe 07     0.93     0.98
safe 08     0.92     0.98
safe 09     0.95     1.07
safe 10     0.92     0.98
safe 11    68.58   102.84
safe 12     0.09     0.16
safe 13     0.09     0.12
safe 14    40.72    66.16
safe 15    40.36    66.52
safe 16    40.39    64.99
safe 17    40.62    66.30
safe 18    40.30    66.06
safe 19    39.86    65.73
Total     315.60   503.97

Table A.3: The execution time (in seconds) spent by the two versions of Klee to solve the safe instances.
Kleef corresponds to the forked version, while Kleel corresponds to the latest one.

Test        Kleef   Kleel
unsafe 01    0.24    0.14
unsafe 02    0.14    0.16
unsafe 03    0.04    0.04
unsafe 04    0.07    0.11
unsafe 05    0.04    0.03
unsafe 06    0.62    0.53
unsafe 07    1.60    2.29
unsafe 08    0.27    0.18
unsafe 09    0.15    0.07
unsafe 10    0.09    0.10
unsafe 11    0.10    0.10
unsafe 12    0.13    0.16
unsafe 13    0.04    0.03
unsafe 14    0.18    0.16
unsafe 15    0.14    0.16
unsafe 16    0.38    0.15
Total        4.23    4.45

Table A.4: The execution time (in seconds) spent by the two versions of Klee to solve the unsafe instances. Kleef corresponds to the forked version, while Kleel corresponds to the latest one.

Test       Kleef     Kleel     KiteB     KiteG  KiteM(5)  KiteM(10)  KiteM(15)
safe 01   14,095    14,095    23,099     6,465    11,357      6,465      6,465
safe 02   12,876    12,876     6,046     3,602     4,122      3,143      3,421
safe 03    4,003     4,003       723       736       736        736        736
safe 04    6,945     6,945    13,913     3,257     3,326      3,292      3,257
safe 05      655       655         0         0         0          0          0
safe 06    1,909     1,909         0         0         0          0          0
safe 07   17,298    17,298    16,557     6,243     7,481      6,243      6,243
safe 08   17,284    17,284    18,008     5,642     6,034      5,642      5,642
safe 09   17,322    17,322    14,819     5,790     6,077      5,790      5,790
safe 10   17,284    17,284    16,413     5,726     6,074      5,726      5,726
safe 11  549,056   549,056   277,252    50,935    46,063     50,935     50,935
safe 12    2,684     2,684     6,166     2,824     5,846      5,399      5,612
safe 13    1,683     1,683     3,977     1,563     2,490      2,424      2,314
safe 14  614,650   614,650   480,643    82,164   101,893    116,105    102,153
safe 15  615,130   615,130   411,546    71,751    95,342     76,877     86,574
safe 16  613,162   613,162   793,898   219,609   416,020    387,664    351,395
safe 17  625,116   625,116   722,191   187,700   377,336    338,916    283,695
safe 18  616,034   616,034   677,706   207,993   423,542    399,795    346,147
safe 19  615,124   615,124   770,891   204,738   412,356    359,172    306,548

Table A.5: The number of instructions symbolically executed by all versions of Klee and Kite to solve the safe instances.
In instances safe 05 and safe 06, Kite was capable of proving the property's correctness just by applying the code optimization passes and the slicing algorithm.

Test        Kleef    Kleel    KiteB    KiteG  KiteM(5)  KiteM(10)  KiteM(15)
unsafe 01   2,183    2,285      334      334      334        334        334
unsafe 02   4,762    5,215      819      543      543        543        543
unsafe 03   1,736    1,597      244      244      244        244        244
unsafe 04   3,873    5,016   16,064    6,056    1,085      6,056      6,056
unsafe 05     467      468      602      414      414        414        414
unsafe 06  11,549   12,710    2,658    2,071    2,858      2,224      2,071
unsafe 07  46,911   45,692  Timeout  Timeout   79,136      5,034    228,371
unsafe 08   4,406    3,022    1,052    3,033   12,038      3,033      3,033
unsafe 09     539      556      500      489      496        489        489
unsafe 10   1,153    1,200    3,959      610      677        610        610
unsafe 11   1,156    1,207    6,427      632      632        632        632
unsafe 12   2,162    2,330    2,710    2,638    6,280      2,638      2,638
unsafe 13     378      378      377      377      377        377        377
unsafe 14   2,339    2,177    4,200    2,513    2,501      2,513      2,513
unsafe 15   2,397    2,377    2,964      649      729        649        649
unsafe 16   2,397    2,155    2,550    1,183      759      1,183      1,183

Table A.6: The number of instructions symbolically executed by all versions of Klee and Kite to solve the unsafe instances.

