A Metric-based Theory of Test Selection and Coveragefor Communication ProtocolsbyJADRANKA ALILOVIC-CURGUSB.Sc., Sarajevo University, 1977M.Sc., Sarajevo University, 1987A THESIS SUBMITTED IN PARTIAL FULFILLMENT OFTHE REQUIREMENTS FOR THE DEGREE OFDOCTOR OF PHILOSOPHYinTHE FACULTY OF GRADUATE STUDIESDEPARTMENT OF COMPUTER SCIENCEWe accept this thesis as conforming to the required standard THE UNIVERSITY OF BRITISH COLUMBIAJULY 1993© JADRANKA ALILOVIC-CURGUS, 1993In presenting this thesis in partial fulfillment of the requirements for an advanced degree atthe University of British Columbia, I agree that the Library shall make it freely availablefor reference and study. I further agree that permission for extensive copying of this thesisfor scholarly purposes may be granted by the head of my department or by his or herrepresentatives. It is understood that copying or publication of this thesis for financialgain shall not be allowed without my written permission.Signature Department of Computer ScienceThe University of British ColumbiaVancouver, CanadaDatefiAbstractA metric-based theory is developed that gives a solution to the problem of test selection andcoverage evaluation for the control behaviour of network protocol implementations. The keyidea is that a fast, completely automated process can uniformly cover the execution subspaceof a network protocol control behaviour when characterized by appropriate metric functions,each concerned with some aspect of the protocol behaviour.Efficient systematic approximation of complex systems behaviours is a crucial problem insoftware testing. This thesis gives a theoretically sound and completely automated solutionto the approximation problem for the control behaviour space of network protocols generatedby many concurrent and highly recursive network connections. This objective is accomplishedin a series of steps. First, a metric-based theory is developed which introduces a rigorousmathematical treatment of the discipline of testing, through the definition of testing distance,test coverage metrics, and metric-based test selection method. It involves a metric characteri-zation of infinite trace sets of protocol behaviour within complete and totally bounded metricspaces, and captures approximations of different patterns of system behaviour due to recur-sion and parallelism. It is shown that classes of fault coverages of well known protocol testmethods form a metric hierarchy within these metric spaces. Next, a general mathematicalframework is developed for reasoning about the interoperability of communicating systems.An interoperability relation is obtained which gives a theoretical upper bound for the testselection process. The two threads are drawn together in a specific test selection algorithm,showing that the generation of arbitrarily dense sets of test sequences that approximate someoriginal test suite to some target accuracy within the theoretical upper bound, is a convergentprocess. Finally, the theory itself is tested on the examples of a multi-media network protocol.It is shown that very high densities of selected sets and coverage calculation can be achievedwithin reasonable time limits in a completely automated manner. These results indicate thatthere is no practical impediment to applying rigorous theoretical treatment to the disciplineof testing for the case of systems that derive their complexity from highly concurrent andrecursive subprocesses.ContentsAbstract^ iiTable of Contents^ iiiList of Tables viiList of Figures^ viiiAcknowlegement xi1 Introduction 11 - 1 The Problem ^ 21-2 The Approach 41-3 Limitations and Key Assumptions ............. . 72 Background 112-1 Testing for Reliability^ 122-1.1^General Software Testing ^ 122-1.2^Theoretical Investigations of Testing^. . ....... . 132-1.3^Protocol Testing Theory and Practice 142-1.4^The Extension of the Protocol Testing Problem ^ 162-2 Theoretical Background ^ 172-2.1^Theoretical Setting 17iiiiv34CONTENTS2-2.2^Extensional Equivalences ^202-3^The Metric Based Model of Testing ^26A Metric Space of Protocol Control Behaviour^ 283-1^General Issues ^303-1.1^Testing Real Protocol Implementations ^303-1.2^Approximating Protocol Systems by Traces ^333-1.3^Recursion and Test Selection in Protocol Systems ^353-2^The Metric Space of Execution Sequences of Protocols ^383-2.1^The Definition of Testing Distance dt ^403-2.2^The Interpretation of the Distance dt ^433-2.3^The Finiteness of &dense Covering for the Metric Space (S, dt) .^ 493-2.4^The Completness of Space (S, dt) ^513-3^Control Coverage Evaluation^. ................ .^. 543-3.1^A Coverage Metric for Pseudo-Exhaustive Testing Strategies^553-3.2^A Coverage Metric for Strategies With Pre-defined Target Test Se-quences ^573-4^Metric Based Test Selection ^593-4.1^Metric Based Test Selection Algorithm ^593-4.2^Test Selection as a Convergent Approximating Process ^633-4.3^Mixed Level Hierarchical Test Selection ^65A Metric Hierarchy of Fault Coverage Models^ 724-1^Some General Results ^734-2^Analytical Comparison of Test Selection Methods^....... .^754-2.1^A Metric Characterization of FSM-based Methods ^764-2.2^A Metric Characterization of Some Software Testing Covers^814-3^Approximating Infinite Systems^. .^. .^...... .^. .^. . 834-3.1^A Metric Characterization of Trace Equivalence . . 83CONTENTS4-3.2^A Metric Hierarchy of Testing ^5^A Framework for Interoperability Testingv86905-1 Implementation relations and interoperability ^ 925-1.1^Strings Equivalence ^ 935-1.2^Implementation Relations eon f and red ^ 965-1.3^Implementation Relations and Interoperability ....... . 965-2 The Interoperability Relations ^ 1025-2.1^The intop Relation^.............^.^. 1025-2.2^Properties of the intop Relation . .^.^.^. . .^. .^. . . .^. . . 1045-2.3^The intop red Relation .^.^.^.^.^.^.................^. 1065-3 The Interoperability Tester Design 1085-3.1^Architectural considerations ^ 1085-3.2^Formal Network Protocol Specification Issues ^ 1105-3.3^Interoperability Tester Design ^ 1125-4 Interoperability Relation and the MB method ^ 1156 Tests of the Theory 1186-1 Internet Stream Protocol ^ 1196-1.1^General Protocol Features ^ 1196-1.2^ST-II from the Testing Point of View ^ 1206-2 The Implementation^. .^. ...^ . ^ 1216-3 Experimental Results 1236-3.1^Computational Complexities and Performance Measurements . 1236-3.2^Sensitivity Analysis of the Metric Based Method^ 1246-4 Discussion ^ 1577 Conclusion 1607-1^Summary of Contributions ^ 160vi^ CONTENTS7-1.1 A Metric Characterization of Protocol Control Behaviour Space^. 1607-1.2 Method-independent Coverage Evaluation ^ 1617-1.3 The MB Test Selection Method ^ 1637-1.4 Interoperability Relation ^ 1647-1.5 Contributions to Specific Areas of Study ^ 1657-2 Suggestions for Further Research ^ 166Bibliography^ 168Appendices 173A Labelled Transition Systems^ 173B LOTOS^ 175C Theory of Metric Spaces - Mathematical Definitions^ 177D A Formal Description of ISO 8072 in LOTOS^180E A Specification of ST-II Protocol in Estelle-Y 182F Test Files and Sample Results^ 194F-1 Seed Files ^ 194F-2 Experiment 1.1 201F-3 Experiment 4 ^ 203List of Tables6.1 Experiment 3 - Characteristics of randomly generated sets ^ 1346.2 Experiment 5 - Characteristics of test sets ^ 1396.3 Experiment 6 - Characteristics of randomly generated test sets ^ 142B.1 Axioms and Inference Rules in LOTOS ^ 176viiList of Figures2.1 Some illustrative examples of strings equivalence ^ 222.2 Non-determinism and testing relations ......... . . .^232.3 Divergence and testing relations . . . . . . . . ............. .^242.4 Analytic Model of Testing ^ 273.1 An example of faulty implementations of a simple protocol ^ 323.2 Execution sequences of ISO 8072 Transport Service ^ 383.3 Increasing Similarity Information by Recognizing Recursion^ 443.4 Evaluation of the function pk ^ 463.5 Evaluation of the function 8k 473.6 Evaluation of the function dt ^ 493.7 The Subspaces Sr, of Trace Truncations for the ISO 8072 Transport Service 563.8 Generating Cauchy sequences ^ 624.1 Generating a T-tour using metrics only ^ 784.2 A Metric Hierarchy of Protocol Testing within infinite metric space (Tr(S), dt) 885.1 Connection Establishment Phase of ST-II protocol ^ 955.2 Different implementations of a process S ^ 975.3 An interoperable implementation of S 1005.4 Interoperability Test Architecture ....... .^ . . . ^ 109viiiLIST OF FIGURES^ ix5.5 A specification, its canonical tester T(S), and its interoperability testerIT(S) ^ 1136.1 The Functional Structure of the Test Development System TESTGEN++ 1216.2 Timing results (in user seconds) for control sequences of length 20 ^ 1246.3 Timing results (in user seconds) for control sequences of length 100^1256.4 Granularity of MB test selection with event name identification ^ 1286.5 Granularity of MB test selection with event name, gate and data parame-ters identification ^ 1296.6 Selecting Transition Tours with moderate vertical recursion ^ 1326.7 Selecting Transition Tours with moderate vertical recursion 1336.8 Pseudo-random versus random test generation ^ 1356.9 Selected subset size for random and backtracking algorithms ^ 1376.10 Sensitivity to vertical recursion . . . ............. .^1406.11 Mutual densities of some sets with moderate and low number of simulta-neous connections ^ 1436.12 Mutual densities of some sets with moderate and low number of simulta-neous connections ^ 1446.13 Mutual densities of some sets with moderate and low number of simulta-neous connections ^ 1456.14 Sensitivity to missing events - single connection with recursion ..... . ^ 1476.15 Sensitivity to missing events - general metric function, multiple connectionswith recursion ^ 1496.16 Sensitivity to missing events - general metric function, moderate parallelismand recursion ^ 1506.17 Sensitivity to missing events - special metric, moderate parallelism andrecursion ^ 151x^ LIST OF FIGURES6.18 Sensitivity to missing events - general metric function, moderate parallelismand recursion ^ 1526.19 Sensitivity to missing events - special metric function, moderate parallelismand recursion ...... . . .^. ...... . . .^ . 1536.20 Sensitivity to missing events - special metric, moderate parallelism andrecursion ^ 1546.21 Sensitivity to missing events - special metric, moderate parallelism andrecursion .......... .^. . . . ....... . .^155^6.22 Sensitivity to missing events - special metric on random sets 156To my dearest StipeSarajevo, - May 1993xiChapter 1IntroductionMeeting the growing technical requirements for reliable computer systems is limited bythe ability to develop, test, and maintain increasingly complex software designs. In theabsence of effective tools and techniques for disciplined software engineering, the endproduct can be a costly source of delays and unreliability. The reliability of complexcomputerized systems is enhanced if they can be thoroughly tested, and released only ifthey satisfy stringent test coverage criteria.In assessing the reliability, the users of computer systems are interested in the overallevaluation of systems composed of hardware and software, within their entire operationalenvironment, and not just an isolated set of features of a particular software module.Software is now often highly concurrent, and features new complexities such as distributedcomputing, non-deterministic, and event-driven processes.Extending the practice of software engineering to meet both current users' require-ments for reliable software and deal with the new set of features of modern software is achallenging problem in software testing.The goal of this thesis is to provide a solution to the problem of modelling and au-12^ CHAPTER 1. INTRODUCTIONtomating the test selection and test coverage evaluation process with respect to both ofthese requirements for the control behaviour of network protocol implementations. Sinceoverall evaluation of software implementations is our goal, the testing model presented inthis thesis is geared towards a type of testing often termed coverage testing. Coveragetesting strives for a thorough coverage of the entire execution space of tested systems ascan be found under a wide range of operational conditions. Such are various forms ofintegration, pre-release, and acceptance tests, although we believe coverage testing can bea valuable component in any form of testing. The models and methods supporting cov-erage testing are currently exceptions, in spite of their immediate impact on the qualityof released software.1-1 The ProblemIn this thesis, a protocol implementation under test is considered to be a concurrent,communicating and nondeterministic system capable of supporting any number of simul-taneous network connections. Similarly, no limit is imposed by the theory on the depthof recursion that such a system may reach during its operation. In what follows, the kindof testing that we are considering is the coverage-oriented type of testing, restricted tothe control protocol behaviour: the coverage target is some criterion defined on the entirecontrol behaviour space of a protocol implementation under test. A test selection pro-cess must select some approximation of the coverage target that can be executed withina fixed amount of time. Test coverage is the degree of approximation of the coveragetarget. The goal of this work is to develop a mathematical theory of this process, onewhich precisely defines the coverage target for coverage-oriented type of network protocolimplementations testing and describes a meaningful approximation of such a coveragetarget.There are several reasons for this choice of problem. First, there is empirical evidence1-1. THE PROBLEM^ 3that present partition-based and heuristic test selection techniques are an inadequatesolution for testing complex software with highly unvisualizable event interleavings [36,26]. Theoretical investigations of such techniques indicate that they are weak even withregard to developing confidence in sequential programs [25, 24, 14]. Both results suggestthat there is no alternative to coverage testing for increasing confidence in the reliabilityof tested software. However, there is a considerable limitation in the amount and varietyof test scenarios needed for coverage tests that current systematic or manual techniquescan develop in a limited amount of time. Based on this research in general software testingtheory and practice, we conclude that the success of such methods for testing protocolimplementations with many simultaneous connections is very much in question.Second, experimental results and software error experience with complex systems re-veal a strong statistical dependency of software component failure rates on several commonmeasures of utilization and load [38, 33]. These experimental findings underline the im-portance of testing for coverage of multiple simultaneous connections and high levels ofrecursion as major sources of increased utilization and load for protocol implementations.Finally, of all modern complex software systems, communication protocols are perhapsthe most valuable candidate for investigation of the coverage testing problem. On onehand, there is a considerable additional interest in the correctness aspect of communicationprotocols, in that they must be correctly implemented in order to interoperate at all.Additionally, modern communication systems depend on programmable computers andmicroprocessors, and exhibit complex event-driven behaviour that is composed of manyconcurrent and recursive processes stimulated from the layer above, from the layer below orby timers internal to protocols. Thus, if an efficient solution is found for test selection andcoverage evaluation for communication protocols, it seems plausible that similar methodscan be applied for a large class of complex real-time software systems.4^ CHAPTER 1. INTRODUCTION1-2 The ApproachThe problem of designing communication protocol testers in various theoretical settingshas been well researched. The successful execution of specification-based testers guaran-tees that the protocol implementation under test is a valid implementation of the protocolspecification it implements, according to the testers specific criteria of validity. All testers,however, that would attempt to satisfy the target test coverage criteria in the presence ofhigh level of parallelism and recursion would either test exhaustively or even be infinite.The crucial issue in testing such systems is the identification of appropriate approximat-ing subsets of test cases, including such that would be finite even if the original set of(generated) test cases is infinite. We approach this problem from the viewpoint of a math-ematical characterization of all objects and activities involved in the problem, such that,when drawn together, these theoretical results will yield a practically applicable testingmethodology. Whereas we will strive to illustrate the practicality of the approach by pro-viding algorithms and tester design, these are not optimized and are meant as illustrationonly - this indeed is one area where much improvement can still be made.The first step in the approach is the definition of the theoretical framework for test se-lection and coverage evaluation for protocol control behaviour, consisting of the followingthree factors: (i) the relation which describes the validity link between a protocol im-plementation and its specification (based on the specific validity criteria that the testingprocess attempts to verify), (ii) the target test coverage criteria derived from the valid-ity relation (this becomes the theoretical upper bound for the testing process), and (iii)an efficient approximation method for such criteria (approximation mode based on someselected set of properties that hold for the implementation under test).This work examines these three factors at the levels of theory, algorithm and imple-mentation.Chapter 2 provides the background material for this analysis. It begins with the sum-1-2. THE APPROACH^ 5mary of the major empirical and theoretical results in the general software testing theory.The emphasis is placed on recent results which suggest new directions for testing theoryand practice. This is followed by an overview of important results in the testing theoryfor concurrent, communicating and nondeterministic processes and their application tospecification-based protocol testing theory. A discussion is then presented of the waysin which these two threads can be drawn together. A testing model is derived, which isbased on formal specifications of communication protocols, is mathematically sound andallows full automation of activities within it. The generality of the model is demonstratedby the fact that, in the development of the thesis, first the approximation process is char-acterized, and then the validity relation is defined. The validity relation is then shownto generate coverage criteria that can be successfully approximated and evaluated by thepreviously defined approximation process.The analysis of the approximation process itself begins in Chapter 3. All testing ac-tivities are considered in the theoretical setting generated by the formal specifications ofcommunication protocols. Formally defined objects are amenable to objective measure-ment. We exploit this relationship of formal methods to metrication to set up a metricmodel of the control space of process behaviour, which serves as the basis of coveragemetrics and provides a guide for the selection of test cases. A minimal structure of theexecution space of protocol control behaviour is assumed, that of sets of sequences ofobservable interactions. This enables us to use the same metric spaces for all target testrelations and coverage criteria that are primarily based, or can be interpreted in terms ofexecution sequences. This metric aims at capturing the recursion and parallelism, whichare the main sources of complexity of the protocol behaviour. A number of constraints inthe metric definition are introduced, to allow a potentially infinite set of infinitely long ex-ecution sequences to be mapped into finite measures. Every Cauchy sequence of executionsequences in the metric space is also convergent with the limit in the metric completionof the same space by infinite sequences. This serves as the basis for a convergent test se-6^ CHAPTER 1. INTRODUCTIONlection algorithm that approximates the entire execution space of a protocol in a uniformmanner with respect to the metric definition. In addition, the metric space generated bythis metric is totally bounded, which guarantees the existence of finite covers for infinitesets of execution sequences. The next step is to develop a notion of test coverage mea-sures within such metric spaces. The definition of test coverage measures for average andworst case control behaviour coverage estimates is suitable for general and special purposetesting strategies. In the conclusion of this chapter, we discuss the applicability of resultsto the mixed level hierarchical test selection and coverage evaluation strategies.In the closely related Chapter 4, we use this metric system to investigate and classifythe relative power of some well understood test selection methods in protocol testing. Weshow how the hierarchy of test covers formed by the metric based theory subsumes previ-ously investigated theoretical coverages based on different fault models for certain coverageclasses. Furthermore, the same metric based methodology naturally extends from finiteto infinite domains and encompasses the full recursion and parallelism of real protocolimplementations, approximating the theoretical upper bound of trace equivalence.The question of the actual target test relation and an efficient upper bound for thetesting process has been delayed to Chapter 5. In this chapter, we extend the testingtheory based on formal specifications by formalizing testing for interoperability with anew relation intop. The discriminating power of some existing notions of testing relationsin protocol testing theory for identifying processes which will interoperate is examined.It turns out that the intop relation is predominantly characterized by trace sets of pro-tocol control behaviour, and that the testers built to test protocol implementations forsuccessful satisfaction of this relation are indeed trace equivalent to the specification ofprotocol control behaviour. Therefore, as anticipated, this relation is a valid candidatefor successful approximation in the previously defined metric spaces. It is also a moreefficient theoretical upper bound to the testing process than existing conformance rela-tions for the case of protocol implementations supporting large numbers of simultaneous1-3. LIMITATIONS AND KEY ASSUMPTIONS^ 7network connections.The final step of the work is to test the theory on an actual protocol specification. InChapter 6, process is tested on several excerpts of a multi-media communication protocol.It is shown that high target densities can be achieved in very little time for most of thetraditionally considered cases. Also, the process degrades gracefully as the parallelismand recursion are allowed to increase to levels corresponding to up to 600 simultaneousnetwork connections. The theory is shown to be capable of dealing time efficiently witha large number of situations that were not previously dealt with systematically in thetesting theory or practice. This particularly includes identification of variations in ver-tical recursion, parallelism and missing events or event combinations, for environmentssupporting multiple network connections and higher levels of recursion.The thesis results show that there is no impediment to creating a testing methodthat is in concert with the newest results in testing theory and empirical findings, ismathematically sound, and allows full automation of the activities in it. In Chapter 7, thecontributions are summarized, and a critical look is cast back at the results achieved. Wealso briefly point out some improvements that the experimentation experience suggests,as well as further directions in which this theory can be developed.1-3 Limitations and Key AssumptionsBefore embarking on the development of the theory, it is important to acknowledge anumber of limitations and assumptions that could potentially limit its relevance.The key assumption of this thesis is that the description of the tested protocol system isgiven in terms of a formal technique that allows labelled transition systems as its semanticmodel. Within this model, we further assume that concurrency is represented by indepen-dent interleaving of concurrently executed events - an assumption that establishes a link8^ CHAPTER 1. INTRODUCTIONbetween the behaviour of a concurrent system and testing through observation. Although" concurrency via interleaving" is a commonly made assumption in protocol testing, weshow how it can become a stumbling block in the development of the theoretical upperbound for interoperability testing. This assumption is relaxed to a certain extent throughour definition of interoperability relation. The accompanying formal idea of testing byobservation (of specified events), also precludes taking into account some common testsystem functions, such as run-time error indications, local timestamps, timeouts, and soon, which are commonly employed in testing activities. This limitation serves to keepthe theoretical framework both simple and focused on the single issue of detecting de-viations in the temporal order of specified and implemented event exchanges. This hasadditional impact on the practicality of the approach, in that checking for the existenceof the (specified) trace will typically supply us with less knowledge about the implemen-tation behaviour than would be the case if the former features (exception handling inparticular), were to be taken into account.The underlying idea of the research carried out in this thesis, is that a protocol systemshould be tested as a concurrent system, within the entire execution space which can begenerated in a real operating environment. It is also assumed that all faults in this spacewhich are observable on event interaction with the environment are equally probable.We have based our work on the work of Hamlet [25, 25],Parnas [45], and others, andtheoretical results that indicate that partition testing does not inspire confidence in mostrealistic situations. Where this is not the case, i.e. where highly vulnerable areas ofcode can be determined in advance of testing, more specialized test selection methodstargeting such partitions may be more efficient.' Since this is an initial investigationof a completely new approach, we leave all additional knowledge that could potentiallyaid in test development (including the knowledge of operational profiles) outside of ourconsiderations - we let the formal specification of the communicating system be the sole'Note that this implies that we either know the location of errors in advance, or amounts to provingthe correctness of the tested system.1-3. LIMITATIONS AND KEY ASSUMPTIONS^ 9source of our knowledge about the implementation control space.The possibility of divergent behaviour in the tested systems (the question of fairnessin process algebras), has not been dealt with. For the development of the theory, wehave chosen a theoretical setting in which all systems are assumed non-divergent - anassumption that severely limits the validity of the thesis' results in case of systems whichmay perform unobservable actions ad infinitum. This is an assumption that is frequentlyemployed in the testing theory for labelled transition systems, for the ease of theoreticaldevelopment. It can be removed, for instance, by introducing the notion of time intoformal framework.Another potential limitation is our choice to represent the control behaviour of thetested systems by sequences of observable interactions (formally, traces). Tree represen-tations (e.g. trace-refusal formalism) are a more adequate representation for concurrent,nondeterministic communicating systems such as we are considering. In making this de-cision, we have been guided by the intuitiveness and simplicity of implementation of traceset representation. Also, it is not clear that the satisfaction of validity relations based ontree representations can be directly testable. Notice, however, that so far as the metric-based theory is concerned, a simple extension of metric function definition is neededto accomodate approximations of the tree-based representations of the protocol controlspace.Finally, the mode of approximation within the metric-based method is entirely basedon the similarity of behaviour evaluated through positional variations in event patternand recursion. This choice has two implications on the possible applicability of this theory.First, the theory is only effective for systems with relatively few events and large amountsof recursion, which is the case for most protocol systems. Other methods would probablybe more efficient for systems for which this is not the case. Other possible relevantfeatures of protocol systems (such as special events in safety testing), can be treatedwithin the theory only with substantial modifications of the metric function definition,10^ CHAPTER 1. INTRODUCTIONwhich is the cornerstone of the theoretical development of this thesis.In conclusion, a limitation of a quite practical character is due. Although this thesisdoes not concern itself with the test generation, or tester design in particular, its industrialapplicability is dependent on the existence of a theoretically valid and practically fast testgenerator and executable tester. One possible design that we provide (based on existingdesign of a canonical tester), is theoretically sound and used as an illustration of howtest selection is to be viewed with respect to interoperability, both being quite practicalnotions. Improvements on this front in terms of scalability are much needed.Chapter 2BackgroundThe purpose of this chapter is twofold - to survey the related work and establish thetheoretical setting of the thesis.First, we present a collection of results from different study areas, that lead to thisthesis research. These are divided into three groups. Some reasons for the inadequacyof present testing methods for modern, complex software are presented in some detailfirst. They show that a fresh approach to testing may be needed in order to overcomethe indicated limitations. Next, the newest advances in two rather separate testing areas:(i) theoretical investigations of testing methods for gaining overall confidence in softwaresystems and (ii) extensional characterizations of testing and observation for concurrent,communicating and nondeterministic systems, are presented in more detail. We drawthese two together for the specific case of communication protocols in the section titled"The Extension of the Protocol Testing Problem", which also gives the broadest outlineof the thesis work.Next, the theoretical setting of the thesis research is presented. Although we believethat the metric based theory of test selection and coverage is applicable to a wider rangeof theoretical settings, we needed to anchor our discussion for the ease of presentation. We1112^ CHAPTER 2. BACKGROUNDchose process algebras (as the specification mechanism), labelled transition systems (astheir semantic model), and a rather general notion of control behaviour captured by theexecution sequence space of systems, that may be derived from, but need not necessarilybe tied to this theoretical setting only. The immediately following section on extensionalequivalences illustrates the kinds of problems that are encountered in systems testing, andhow present theory proposes to deal with them. The new testing relation derived in thisthesis is entirely based on this work.In the end of this chapter, we introduce a model for communication protocols testing,in which we draw all these results together and explicate the proposed thesis goal.2-1 Testing for Reliability2-1.1 General Software TestingCurrent testing methods are inadequate when applied to todays complex concurrent soft-ware with highly unvisualizable event interleavings ([26]).In practice, the majority of test design, generation, and selection is still done manually,often left to the least experienced members of the team or those who joined the projectlast. The manual (heuristic) approach to test selection is advocated even when the restof the testing problem is treated mathematically [4]. Besides being time consuming andcostly, manual (even if expert) test selection in the presence of complex software becomesimpossible to solve systematically and with objectively measurable outcome. Not surpris-ingly, some recent experimental results in fault tolerance techniques [36] detect errors inimplementations that evaded detection by every programmer in testing.Recently, there has been a substantial investment by industry in test tools, standardtest methods, languages and test suites, and even specialized testing corporations whose2-1. TESTING FOR RELIABILITY^ 13mandate is to provide (conformance) testing services. Despite significant improvementsto testing practice, the following investigations seem to point out that the expected gainsmay soon reach the limit due to the increasing complexity of current software systems.A number of more recent testing methods [56, 55] extend the structural testing criteriathat is traditionally applied to sequential software to concurrent programs. Taylor, Levine,and Kelly [54], point to at least two drawbacks of this type of approach: (i) structuraltechniques are among the weakest available with regard to developing confidence evenin sequential programs (see also [24]); (ii) the scalability of methods for large systems isvery much in question. Reports on hands-on experience with test tools that are based onthese and other partition-based sequential software testing paradigms warn of the sameproblem [26, 36, 37, 43, 57].Experimental results and software error experience with complex systems also reveal astrong statistical dependency of software component failure rates on several common mea-sures of utilization and load [5, 33, 38]. These experimental findings are not systematicallyaddressed by current testing methods.Because of these problems, it does not seem justifiable to bridge admittedly weaktraditional testing methods to complex concurrent software engineering.2-1.2 Theoretical Investigations of TestingRecent theoretical advances in the investigation of testing, and experimental results onthe dependability of the released software, offer fresh insight into the perspectives formore scientific, and hopefully more successful, treatment of this problem.In their independent investigations of testing for true reliability or confidence in re-leased software, both Parnas [45] and Hamlet [23] reach similar conclusions with respect tothe sampling basis and sampling mode for software system tests. The negative answers to14^ CHAPTER 2. BACKGROUNDthe question of the existence of a valid operational profile for todays complex software andto the validity of successful partition delineation for test methods that essentially targetfaults (which include all structural and functional test methods), lead to the conclusionthat the tests should be chosen from the entire state space of the operating process andfrom a distribution in which all state-input combinations are considered equally likely.This choice of the sampling basis and the sampling mode puts the recommended teststrategies into the coverage oriented class of testing. The theory does not suggest a prac-tical testing method, recognizing among other difficulties that there are far too many testsrequired if the sampling basis is a very large state space. The theory has been used fortheoretical comparison in [25]. The important result of this theoretical investigation (seealso [14]) is that, assuming that tests are selected so that the state space is uniformlysampled, it is shown that all test methods that can broadly be categorized as partitiontesting are far inferior to the uniform sampling of the entire state space. In addition, thedisadvantage is magnified by more partitions or by lower fault rates. Lower fault ratesare expected for coverage tests, which are applied in acceptance, release and integrationtesting of software systems. In the conclusion of another related research [23], Hamletstates that for the true applicability of this theory a fully automated support for testdevelopment is needed to handle the much larger number of sampling points than usuallyused in software testing.This thesis research is based on these theoretical advances, adopted for the specificcase of specification based testing of communication protocols.2-1.3 Protocol Testing Theory and PracticeIn the conclusion of this analysis, we turn our attention to the above issues in the com-munication protocols domain.For communication protocols, the major factors that influence resource utilization and2-1. TESTING FOR RELIABILITY^ 15load are the prolonged exchange of data during a communication link lifetime and thenumber of simultaneous connections an implementation supports at a particular momentof operation. Exactly the same factors are the major source of the state space explosionfor a protocol implementation operating under realistic circumstances and therefore affectthe scalability of test methods.Current protocol test methods are not well equipped to deal with this dual problemof load and space complexity. General software and Finite State Machine (FSM) basedprotocol test methods are not designed to systematically generate and select tests for largemulticonnection environments. These selection methods include transition and transferfault checking experiments for classes of FSMs (a summary in [52]) and combined controland data flow analysis (examples in [49, 55]). Other test selection methods reported inliterature, which are based on expert opinion, valuations and probabilities ([4, 62, 51]), apriori assume that the sampling of the execution space should not be uniform but basedon some specific notion of a value of an individual test case. All of the above protocoltest methods involve exact evaluations of test selection criteria, and as such must imposelimiting assumptions about the recursion and parallelism in protocol systems in order to becomputable for very large or infinite systems([21]). Since these methods are not designedto cover the entire execution space of the operating protocol process, the relationship oftheir achieved coverage with respect to the entire execution space of a multiconnectionprotocol implementation has not been theoretically characterized.Such protocol test generation methods exist that can be viewed as candidates forcoverage oriented testing strategies. Specifically, mutation-based testing [47, 22] and all-paths testing [48, 56], are methods whose theoretical upper bound may be defined to beentire execution space of the operating process. Appropriate test selection and coverageevaluation theories and algorithms which would be provably convergent if such a space isvery large or infinite, have not been developed.The theoretical milieu of process algebraic research includes theoretical investigations16^ CHAPTER 2. BACKGROUNDof the validity links between an implementation and its specification [39]. The processalgebraic theory for concurrent systems testing, as well as the related theory for commu-nication protocols testing imply the exhaustive test generation as the theoretical upperbound of the testing process. Test selection, however, remains an open problem in thesetheories of testing, if multiconnection protocol environments are the test generation do-main.2-1.4 The Extension of the Protocol Testing ProblemIt is precisely this correspondence of the newest theoretical results in the theory of generalsoftware testing and process algebraic theory for communication protocols testing thatlead to the research described in this thesis. We have defined the coverage testing problemfor communication protocols in the Section 1-1, and presented its characteristics in Section2-1.3. The theoretical background for exhaustive testing based on observation, which fitswell into the concept of coverage testing, has been extensively researched. If a methodcan be found which is able to address the problem of the systematic, automated testselection and coverage evaluation based on the process algebraic theory of testing, itwould yield a coverage oriented test development methodology for the case of executionspace of large multiconnection protocol implementations. This thesis concerns itself withthe development of a theoretical and algorithmic basis for such a method.For the development of the subsequent theory, we will place the research work in one ofthe possible theoretical settings, which we hope is the most general one: that of LabelledTransition Systems (LTS) and extensional equivalences for process algebras adapted tothe labelled transition systems framework. While the theoretical setting of LTS-s is usedthroughout the thesis, extensional equivalences serve as a basis for initial approximationof the control space and for more specific testing relations developed later.2-2. THEORETICAL BACKGROUND^ 172-2 Theoretical BackgroundIn this chapter, we present the general theoretical setting of the thesis and the theoryof extensional equivalences in process algebras in sufficient detail to allow for the furtherdevelopment of the metric-based test selection and coverage theory for communicationprotocols.2-2.1 Theoretical SettingA. The Formal Description TechniqueFormal description techniques are a fundamental step towards disciplined software en-gineering. The framework chosen in this thesis is Milner's Calculus of CommunicatingSystems (CCS), and its standardized derivative LOTOS 1 . CCS [42] is an algebraic lan-guage for specifying concurrent communicating processes and LOTOS [2, 31] is a FormalDescription Technique (FDT) intended for formal specification of communication proto-cols and distributed systems. In the theory of concurrency that applies for process algebrasand derived formalisms, two agents are composed via their mutual communication whichdetermines the composite; their independent actions are treated as occurring in arbitraryorder but not simultaneously [42]. The reasoning behind this is that an (external) observercan only record the effects of actions as though they took place sequentially 2 .Communication protocols and distributed systems for which LOTOS is specificallydesigned are a typical representative of this class of systems. The above notions of con-currency and observation are particularly valuable in the formal analysis and the closelyassociated testing theory and practice for concurrent communicating systems. That is1 Since only protocol control behaviour is examined, both are considered without value passing (i.e.data exchange).2 Note that this observation applies strictly to the formal theoretical setting used in this thesis. Otherformalisms may use timestamps, etc. to record concurrent effects in the test system.18^ CHAPTER 2. BACKGROUNDbecause the concept of observation represents a natural interpretation of the process thattakes place during the testing activities. In this thesis we use the entire set of FDToperators that generate concurrent behaviour, and examine all of its aspects throughobservation3 .B. The Semantic ModelWe study the validity relation and the approximating process for communication pro-tocol testing in the setting of Labelled Transition Systems (LTS). Labelled transition sys-tems represent a natural semantic model for the formal description technique LOTOS 4 .LOTOS is the specification formalism that has a particular class of language expressionsfor the description of system behaviour, which we refer to as processes. The inferencerules for LOTOS operators generate a labelled transition system for each operator ofthe language. Therefore, from now on, we will make no distinction between processes inLOTOS and a labelled transition system.Transition systems are an abstract model based on two primitive notions, state andtransition. Given any other model for which it is possible to define a notion of globalstate and a notion of indivisible action causing a state transition, for each object of themodel a transition system can be defined. This correspondence determines an interleavingsemantics for the model. We will consider a particular class of nondeterministic transitionsystems with interleaving semantics of concurrency, which exchange interactions with theenvironment and are capable of performing an internal action. This model is formallydefined in the following manner.Definition 2.1 ([12])A Labelled Transition System is a quadrupole (Q, E, —a^q0 ) where Q is a countable3 Other related research does not always consider all parallel composition operators, so care should betaken in comparing the appropriate results.4The results derived in this thesis for labelled transition systems can be interpreted for a wider rangeof formalisms for which labelled transition systems are a semantic model [2].'2-2. THEORETICAL BACKGROUND^ 19set of states, E is a countable set of elementary actions, —a —* with a E E U {i}, is a setof binary relations on Q and qo E Q is the initial state. ^In this definition, each of the relations —a -f, a E E, describes the effect of theexecution of the elementary action a and, if q, q' E Q, then q — a ---+ q' indicates that,by performing action a , a system, when in state q, can reach state q'. In accordance toLOTOS, the special symbol i is used to denote internal actions. q — i --4 q' indicates thata system in state q can perform a silent move to state q'.A transition system can obviously be unrolled into a tree. The initial state is the rootand the transition relation is represented by the arcs labelled with elements from EU {i};the various nodes will stand for other states. One graph model which is frequently usedrepresents transition systems in terms of synchronization trees (see [42]).C. Execution Sequence SpaceThe operational semantics of labelled transition systems and some formal specificationtechniques for protocols enable us to derive sequences of actions, in which a communicationsystem can engage, from the structure of the process itself (for details see [31]).Let E be a finite set of symbols, which represents (atomic) statements of a concurrentsystem. For each alphabet E, E* denotes the set of all strings which are formed fromsymbols in E. These strings may then be interpreted as a sequence of (atomic) interac-tions, execution sequence, that may take place between the system and its environment.In this thesis we propose a theoretical framework for the test selection and coverage ofprotocols, on the basis of very limited assumptions on the Formal Description Techniques(FDT) used to describe a communication protocol, and very limited assumptions on thekind of observation that an observer in the environment of the tested system may record.We therefore use, as the most basic building block for the control behaviour of protocols,the notion of execution sequence space.20^ CHAPTER 2. BACKGROUNDD. The Formal Thesis SettingFor the ease of the theoretical development of the thesis, we assume that protocolsystems are given a precise description by means of a formal specification in LOTOS.The formal description of a system is given in terms of behaviour expressions, which wewill refer to as processes and make no distinction between them and a correspondinglabelled transition system. The testing activities will be investigated in the context ofobservation, sometimes refered to as black-box testing in practice. In Chapters 3-4, wewill carry out the investigation of the approximation process in the execution spacesof all execution sequences derived by interleaving semantics of concurrency for formaldescription techniques that allow specifications of concurrent communicating protocolsystems and have a finite labelset of atomic actions that we call events. The full powerof labelled transition systems will be used from Chapter 5 on, to investigate the validityrelations and define the test selection algorithm. In most examples, we will representlabelled transition systems by synchronization trees [42].The formal definitions of concepts and notation used in this thesis can be found inAppendix A: Labelled Transition Systems, and Appendix B: LOTOS.2-2.2 Extensional EquivalencesThe theory of extensional equivalences for processes aims at establishing whether twosystems are equivalent or whether one system is a satisfactory "approximation" of theother. If some formalism is used to describe the system requirements (its specification),then it is possible to use theories based on equivalences or approximations which aredefined for the formalism, to show that a particular implementation is a satisfactoryapproximation of the given specification.2-2. THEORETICAL BACKGROUND^ 21A. Testing and ObservationThe basic idea behind the extensional equivalences is that two systems are equivalentwhenever no external observation can distinguish them on the basis of some crucial aspectsof system's behaviour that the particular equivalence relation takes into account. Theeffect of a system on the environment and its reactions to stimuli from the environmentconstitute its extensional or observable behaviour.This behaviour of processes can be investigated by a series of tests. In the testingtheory based on observation we think of an observable execution environment consistingof a set of processes and a set of relevant tests. Then two processes are equivalent (withrespect to this set of tests) if they pass exactly the same set of tests. For example, withsequential programs we can associate a test with a pair consisting of a predicate on theinput domain and a predicate on the output domain. For more general programminglanguages more general kinds of tests are needed.In this thesis we are concerned with communicating, concurrent, and non-deterministicprocesses. For such systems, various notions of systems equivalence based on the reactionsof systems to stimuli from the outside world have been presented in literature. We willhere present the two that are closely related to the thesis work.B. Strings EquivalenceThe most straight forward approach for systems equivalence is to consider as equivalenttwo systems which can perform exactly the same sequences of observable actions.Definition 2.2 ([12]) If B1 and B2 are two processes, then:B1 --,, B2 iff for all t E E* : t E Tr(Bi ) if and only if t E Tr(B2 ). ^It is obvious that --,, is an equivalence relation, and it is referred to as strings or traceequivalence.22^ CHAPTER 2. BACKGROUNDThe following simple example ( Figure 2.1) taken from [12] illustrates the importantaspects of this equivalence relation. It is obvious that strings equivalence is not suitableST1^ T2^T3Some illustrative examples of strings equivalenceFigure 2.1for testing in an environment where the possibility of nondeterministic interaction withthat environment exists. In particular, this equivalence does not differentiate betweensystems for which certain communications with the environment will always be accepted(e.g. events b and c after a in system Ti of the example), or only at some times (e.g.always only one of the b or c after a for system T2, or possibly neither of b or c for systemT3, after event a) .C. Acceptance/Refusal-based Testing EquivalencesAlternative approaches to systems equivalence have been summarized and further ex-panded in [13] and [12]. Informally speaking, these equivalences also explore the "branch-ing structure" of the system. When the processes involved may be nondeterministic,it is important not only to know whether given a particular test a process respondsfavourably or unfavourably, but also if a process responds consistently every time the testis performed. This natural notion of equivalence of two systems can be broken into twopreorders on processes. The first is formulated in terms of the ability of a process to2-2. THEORETICAL BACKGROUND^ 23respond positively to a test, the second in terms of its inability not to respond positivelyto a test [12].Figure 2.2 and Figure 2.3 illustrate two essential considerations in defining suchrelations. A number of extensional equivalences and the testing equivalence [12] considerTI^ T2Non-determinism and testing relationsFigure 2.2the two systems in Fig. 2.2 as equivalent. Indeed, no external observer can distinguishthem upon any experimentation that can be designed for testing through observation,unless rollback copies of the systems are available'. Similarly, acceptance/refusal basedtesting equivalences treat the systems T1 and T2 of Fig. 2.1 as non-equivalent, as wouldthe previously outlined intuition suggest. However, the possibility of divergence is treatedby different equivalences in different ways. Some of the theories treat the possibility ofdivergence (an example given in Fig. 2.3) as catastrophic, while others ignore it. In thisthesis, we assume that the specifications of communication protocols are non-divergent,5 Some initially defined (bisimulation) equivalences did indeed distinguish between the two systemssuch as those given in Fig. 2.2. It was however recognized as a drawback that the internal structurewould need to be taken into account when testing such systems.24^ CHAPTER 2. BACKGROUNDa 1T1^ T2Divergence and testing relationsFigure 2.3and that the same holds for their implementations. In practice, timeouts avoid this typeof divergence.As a specific example, we here quote the formal definition of the equivalence which isextensively researched in the conformance testing for communication protocols.D. The Conformance Problem for Communication ProtocolsFollowing is a formalization of the conformance problem.Definition 2.3 ([2]) Let B1 and B2 be two processes. Let E represent a universe of labelsof atomic protocol actions. B 1 conf B2 ifsVa E Tr(B2 ) VA C E if 3/3; Va E A Bi = a B; athen 3B; Va E A B2 = a B; a=. ^B1 conf B2 expresses that testing the traces of B2 against the behaviour B 1 will notlead to deadlocks that could not occur if the same test was performed with the B2 itself.6All subsequent notations are defined in Appendix 1.2-2. THEORETICAL BACKGROUND^ 25What is not guaranteed by the con! relation, is that B1 does not contain traces that B2does not specify. This particular problem is not possible to examine in a systematic andmathematical manner in general. All possible observers (universally quantified) wouldhave to be examined, which clearly cannot be done by any analytic or systematic process.Conf is not an equivalence relation. However, it turns out that it is closely related toanother relation, that can be used for the definition of an equivalence relation. To seethis, consider the following definition and proposition.Definition 2.4 ([2]) Let B1 and B2 be two processes. Let E represent a universe of labelsof atomic protocol actions. B1 red B2 ifVcr E E* VA C E if ]B; Va E A Bi = a. B; athen 3B21 Va E A B2 = CI B2 a . ^This is precisely the earlier requirement related to the universal quantification of observers:by this definition, B1 cannot have any traces that B2 does not specify.Proposition 2.1 ([2])1. B1 red B2 if Bi con f B2 and Tr(Bi ) g Tr(B2);2. red is a pre-order. ^Since red is a pre-order, it can be used for the definition of equivalence.Definition 2.5 Let B1 and B2 be two processes.B1 -- B2 iff B1 red B2 and B2 red B1 . ^This thesis research in validity relations and theoretical upper bounds for testing processesfor multiconnection environments will be based on this type of relation.26^ CHAPTER 2. BACKGROUND2-3 The Metric Based Model of TestingWe use these newest results from general software testing theory and practice, and thecommunication protocols testing theory, to propose a new model for communication pro-tocols testing. This model is briefly sketched in Figure 2.4. Fundamental to the modelis the idea that if the theoretical upper bound of the testing process is derived from anextensional validity relation between a specification and an implementation, and it is sys-tematically approximated over the entire execution space, it will yield a coverage orientedtype of testing and address load conditions for large multiconnection protocol systems.The rest of this section details the essential requirements that we impose on this testingmodel.Theoretical SoundnessThe necessity of the theoretical soundness of test methods that target the reliability ofreleased systems is the fundamental requirement in the model. The reliability of testedsoftware cannot be addressed through testing with ad-hoc methods whose precise char-acteristics are not known. The method is therefore based on formal specifications ofprotocols and theoretical results in process algebras. This provides a strict mathemati-cal foundation for subsequent analysis. All testing activities in the model are subject torigorous mathematical treatment and evaluation.Automated TreatmentLittle incentive exists to use formal description techniques and formal methods based onthese techniques if they do not increase the speed of development of the end product andreduce cost. This model is entirely based on the generic information about the processes,which is contained in the formal specification of the system. The model is thereforegeared towards full automation. Full automation of testing activities is a requirement forcoverage testing in order to handle the large number of test points needed, many morethan are usually selected in classical testing methods.formal specifications^testing theory^metric models forof protocols^for concurrent processes^concurrent processes'. ^ IMATHEMATICAL TREATMENT OF TESTINGconvergent test^method independent^completeselection process comparisons and automationwithin entire execution^coverage measures^of test selectionspace^ and coverage evaluation2-3. THE METRIC BASED MODEL OF TESTING^ 27Analytic Model of TestingFigure 2.4Chapter 3A Metric Space of Protocol ControlBehaviourEfficient analysis of the behaviour of communication protocol systems is very difficult,since these systems involve large amounts of parallelism and recursion. If the behaviourof such systems is given in terms of all global states that can be reached on the interactionof the system and its environment, then this space is so large that it can be consideredinfinite for purposes of practical analysis.For the successful test development and evaluation of such systems, a method is neededwhich can deal with the complexity of approximation of the global state space of suchsystems in a simple and intuitive manner, without sacrificing the mathematical soundnessof the results.Two problems are central to this approach. The first has to do with the formalrepresentation of the global state space of a system which is to be approximated, andthe second one is the choice of the mode of approximation of such a space. The choiceof trace characterization of execution space of a system and its implications with respectto coverage testing, the generality of the approach, and finally its relationship to the2829testing theory for process algebras is elaborated on following this section. The choiceof the mode of approximation in this space, which singles out the temporal ordering ofclusters of events generated by recursion as the primary approximation characteristics, isdiscussed next.Based on this analysis, we develop a metric model of execution sequence space foranalysing the control behaviour of protocol systems. Two specific simultaneous goalswere to be achieved by the definition of the metric function. On one hand, the metricincorporating the notion of approximation of different temporal ordering of externallyobservable events must be well defined with respect to the operational semantics of thespecification language. On the other hand, this metric is to be not only mathematicallytractable, but also applicable in practical setting involving finite (and infinite) behaviourswith recursion and parallelism.This chapter includes a collection of results that show how these goals have been met.The theoretical development is organized in the following manner. We introduce a classof metric dt which captures both the event pattern and the recursive characteristics ofsequences of interactions in protocol control spaces. The metric model is based on theunderlying notion of concurrency via interleaving and trace equivalence within the testingtheory for labelled transition systems. The method involves a metric characterizationof (infinite) trace sets S of a protocol control behaviour. The metric dt allows us tocharacterize all infinite execution sequences as limits of sequences of finite traces. Themetric completion of the trace set of finite traces yields the space of all (finite and infinite)execution sequences of a process. This metric generates totally bounded metric spacesof execution sequences of protocol control behaviour providing the theoretical foundationfor test selection and coverage for infinite systems. The metric interpretation of objectsin this theory is used to derive metric-based definitions of average and maximum coverageof an execution sequence space by a selected set of tests. The existence of finite coverswithin complete metric spaces offers theoretical foundation for test selection techniques30 CHAPTER 3. A METRIC SPACE OF PROTOCOL CONTROL BEHAVIOURwhere the selected test cases approximate the (infinite) set from which the selection isdone to an arbitrary degree of precision.3-1 General IssuesSeveral general issues are involved in the formulation of the metric characterization ofprotocol control behaviour space. This section discusses three most important ones: (i)a fault model applicable for coverage testing of real protocol implementations, and itsrelationship to the formal characterization of the protocol control behaviour space bytraces (ii) the implications of the selection of trace sets as the test selection and coverageevaluation domain, and (iii) the role of recursion in the protocol control behaviour.3-1.1 Testing Real Protocol ImplementationsIn this section we summarize some observations on testing complex concurrent systemsin general and communication protocols in particular. This discussion explicates thelink between coverage testing, the fault model it subsumes, and the proposed formalcharacterization and approximation mode of the execution sequence space.We first turn our attention to some aspects of testing protocol implementations inthe setting where bugs really are: that is the entire (infinite) execution space of protocolimplementation behaviour.Protocols are real-time systems that run continuously. During a (long) run, an imple-mentation accumulates state and resource information. Although an implementation isin a common state (for formal specifications whose semantics is representable by labelledtransition systems), or traverses the same node in the design graph (in general softwaregraph representation techniques), differences in accumulated information can cause im-3-1. GENERAL ISSUES^ 31plementation to behave differently. Consequently, no state can really ever be consideredthe same as another previous state unless we can guarantee the identical accumulation ofknowledge. For this reason, testing only with short test sequences is not representative ofthe accumulated protocol behaviour, and may be misleading ([45]).Second, software system implementations are really representable by discontinuousfunctions. This has several implications on how we test. First, a valid testing strategyshould not a priori exclude any point in the protocol space as nonrelevant for testing.Second, we should choose many more points than is usual [25], to expose as many faultysituations as possible. Of course, singular points may be hard to identify, but many, infact, will be exposed by some sufficiently similar situation in the protocol software system.The final aspect we address in such systems has to do with our understanding of testcoverage. In systems of this complexity, "subsumes" relationship between test strategiescannot hold; indeed, the same test suite can be less powerful than itself if the subsumesrelationship is understood as exposing more faulty implementations [24]. This is because atest suite may expose faults on one execution but not on another. This behaviour appearsas a nondeterministic protocol implementation, but is the consequence of the practicalimpossibility to model and test all aspects of a complex implementation. Unmodelledaspects then surprise us as nondeterministic test outcomes.As an illustration, consider the possible implementations of a generic protocol dataexchange phase and a connection establishment phase of a multiconnection protocol,depicted in Figure 3.1.This simple example illustrates how valid input events (data packets or incomingconnection requests) may easily introduce additional states (in this case labelled as crash)in faulty protocol implementations. The behaviour (which differs from the one formallyspecified), is both discontinuous and testable only by test sequences longer than a certainminimum. The possible non-determinism of a faulty implementation indicates that any32 CHAPTER 3. A METRIC SPACE OF PROTOCOL CONTROL BEHAVIOURAn example of faulty implementations of a simple protocolFigure 3.1ordering of test suites by "subsume" relationship would be flawed'.If a test selection and coverage evaluation method is to be applicable to a broad rangeof protocols, then it must address the issues mentioned above in some manner. Thetechniques which assume a finite fault model can apply exact selection techniques, faulttargeting and subsume relationships for coverage. Exact selection techniques and finitefault models become impractical in infinite execution spaces with above characteristics,so we adopt trace equivalence to characterize an infinite fault model: every string is apossible fault. The choice of the trace equivalence as the fault model has several immediateimplications on the resulting test development process.First, trace preorders and equivalences in process algebras target all faults, since all1 Note, however, that these errors could be detected by data flow analysis, using the knowledge of theinternal details of the implementation under test (i.e. "white box" testing).3-1. GENERAL ISSUES^ 33strings are potential faults. This eliminates the drawbacks of the fault targeting basedon non-uniform value weighting of possible tests which is characteristic of partition basedtesting techniques.Secondly, the fault model does not exhibit preference for test cases based on test'slength, or puts any restrictions on the type or frequency of event occurrences. For proto-cols, this means that any number of network connections or prolonged data exchanges arepossible candidates for test cases: these are exactly the main sources of load generatingconditions in communication systems environments.Finally, we observe that starting with the initial theoretical setting of labelled transi-tion systems, the trace characterization of execution space yields traces that lead to allpossible global system states in the control space of protocol behaviour.Therefore, the type of testing subsumed by this fault model is exactly coverage testingas defined earlier.3-1.2 Approximating Protocol Systems by TracesThere are two additional important aspects involved in the choice of the trace character-ization of the execution space and trace equivalence as the applicable fault model.The first is the obvious simplicity of using trace sets as test selection and coverageevaluation domain. The notion of traces is also well known to the practitioners in industry,and is considerably simpler for automated manipulation than more elaborate character-izations given in traces and refusal sets, their synchronization trees model, and otherrelated formalisms. Trace refusal formalisms are, however, more accurate characteriza-tion of execution space of nondeterministic systems (communication protocols are almostalways such). It is worth noting that it is possible to extend the metric theories of concur-rent behaviour to tree equivalences, using the full power of extensional equivalences for34 CHAPTER 3. A METRIC SPACE OF PROTOCOL CONTROL BEHAVIOURconcurrent non-deterministic systems for metric characterization of execution space (e.g.direct application of the metric characterization of synchronization trees given in [18]).It is not clear that such theoretical results would be directly applicable in practice. Inorder to use the full power of such techniques in testing, it must be possible to establishthe existence of the exact branching structure of the corresponding tree, after a tracehas occurred. This means, that either (i) multiple copies of a system are available aftera trace has been observed, or (ii) rollback copies of a system are available after a traceand one subsequent event have been observed. Neither of these are readily available inpractice. In this thesis we restrict ourselves to deriving the strongest theoretical resultswhich are also practically feasible, leaving the aspects of the system behaviour that arenot theoretically covered, to an essentially engineering approach - designing testers suchthat the validity of the behaviour (which is possibly given in trace-refusal formalism), isindirectly determined through repetitive execution of test cases specified as traces [12, 2].The second aspect is the generality of the approach. Trace equivalence, as a faultmodel, provides a link between test selection and testing theories [12, 2] which are appli-cable for specification techniques that admit labelled transition systems as their semanticmodel. All extensional equivalences proposed in the theoretical setting of testing protocolimplementations for conformance, as well as the preorders defined in Chapter 6 of thisthesis for testing protocol implementations for interoperability, are essentially based ontrace preorders. This purely theoretical characteristic has two implications on practicaltesting and protocol tester design: (i) in order to test for any extensional preorder orequivalence, the suggested testing methodology [12] is to check for the existence of a tracefirst, and subsequently its possible further characteristics (i.e. stability of the occurrenceof a trace, or the refusal sets associated with the trace) are examined and (ii) all exhaus-tive (e.g. the canonical) testers are trace equivalent to the original protocol specification.By deciding on the trace characterization of execution space, we are able to guarantee abroad applicability of metric based selection and evaluation techniques for all testing the-3-1. GENERAL ISSUES^ 35ories that are essentially based on trace preorders, although their more advanced featuresmay involve some other formalism.3-1.3 Recursion and Test Selection in Protocol SystemsBased on the forgoing analysis, the investigation of the approximation process will becarried out in the spaces of all traces derived by interleaving semantics of concurrencyfor formal description techniques that allow specifications of concurrent nondeterministicprotocol systems and have a finite labelset of actions that we call events. With the traceequivalence as the fault model in this space, there is a very large and possibly infinitenumber of potential fault strings from which a finite number of test cases must be selected.For the general case, there is obviously no exact method to deal with this fault modeland terminate in finite time. Test selection must therefore viewed as an approximatingprocess within infinite control behaviour spaces. The approximation must be based ona function which provides a measure of how well one execution sequence approximatesanother, and consequently how well one set of execution sequences approximates someother reference set. Such functions are metric functions, which evaluate the similarity ofexecution sequences based on some pre-defined criteria.As mentioned earlier, the problem in defining the metric function is twofold. First, thecriteria which will characterize the mode of approximation must be defined. Second, thenotion of approximation resulting from these criteria must be well defined with respect tothe operational semantics of the specification formalism - in particular, since the specifi-cation formalism allows for infinite sets of infinite traces, the mode of approximation mustallow for approximating such infinite elements by finite ones. The second requirement ofthe metric function definition, i.e. the well-definedness of the metric function with respectto operational semantics of the formal model of labelled transition systems, is proven ina number of theorems following the definition of the testing distance. We here give some36 CHAPTER 3. A METRIC SPACE OF PROTOCOL CONTROL BEHAVIOURobservations which are relevant for defining the first requirement.Many different properties of nondeterministic concurrent systems may be of interest,allowing for different notions of approximation based on intuitive perceptions of valueof an execution sequence. In this thesis, we concentrate on the criteria which can becomputed on the basis of the formal specification itself: in the theoretical setting ofexecution sequence spaces, these are individual events, their temporal ordering and theirlevel of recursion.The general notion of behaviour in LOTOS, CCS or other labelled transition systembased specification formalisms is closely related to the operational semantics of step-wise evolution of a system. This allows the simple definitions of distance da (AppendixC) to be directly applicable to such specifications. Informally, this distance is the wellknown distance on strings which measures the longest common prefix of two strings. Inthe context of these distances, as soon as two execution sequences differ in one step,their distance is evaluated, without regard to the future behaviour of these processes orthe characteristics of that step. These distances are based on the individual events andtheir temporal ordering. They can be used to describe depth bound algorithms for testselection and coverage evaluation, and are directly applicable to testing protocols withoutmulticonnection capabilities or prolonged repetitive event exchanges.For the specifications of complex distributed systems and protocols, which typically in-volve multiple concurrent connections and other recursively specified processes, the metriccharacterization of execution space through individual events and their temporal orderingis not sufficient'. Steps generated by recursive instantiations of processes constitute major2 We here note that in the specification formalism LOTOS both of these critical features: simultaneousnetwork connections and prolonged repetitive event exchanges, are specified using recursion. Simultaneousmultiple connections are specified by process instantiations within parallel composition operator, andrepetitive event exchanges by process instantiations within sequential operator. If the distinction isneeded, recursion derived through multiple parallel instantiations of a process will be termed "horizontal"recursion, and recursion derived from sequential consecutive instantiations of a process will be termed"vertical" recursion.3-1. GENERAL ISSUES^ 37part of protocol behaviour. It is generally recognized that testing all levels of consecutiveexchanges of the same event should be carefully balanced against testing new events orevent combinations. Additionally, the main problem in comparing and selecting tests isdue to time and space limitations, caused by the execution sequence explosion, which isexactly due to recursive calls to processes. For these two reasons, recursion has tradition-ally been a major factor influencing test selection. By capturing its characteristics it ispossible to design more efficient and intuitive test selection and evaluation algorithms.We illustrate this point on an excerpt from a typical communication protocol speci-fication: ISO 8072 Transport Service [50] (excerpt from a LOTOS specification given inAppendix D, and a simplified version in Section 3-4). Figures 3.2 (b) and (c) each showthree execution sequences s, t and u. Suppose that s, t and u in each of Figures 3.2 (b)and (c) are all valid test sequences generated by some test generation system. Applyingthe test selection techniques which do not recognize recursion, or metric models which donot allow for recursion in the definition of distance, one could be lead to give preferenceto test suites of the type {s, t}, which is intuitively a poor selection from the testing pointof view. Since the test sequences s and t are very similar, the selection of {s, u} or { t, u}instead of { s, t} would be a prefered choice according to most test selection criteria (e.g.distinct-paths criterion, branch testing, as well as switch-cover paradigm).A testing-oriented metric must therefore be defined in such a way to take into accountrecursive characteristics of test sequences, in order to allow the possibility of tradeoff be-tween selecting a new event combination and the finer distinction in the level of recursion.Based on these considerations, the following criteria are used to characterize the ap-proximation mode of execution sequence space of communication protocols in this thesis:the occurrence of individual events in the execution sequence, their temporal ordering,and their level of recursion.38 CHAPTER 3. A METRIC SPACE OF PROTOCOL CONTROL BEHAVIOURExecution sequences of ISO 8072 Transport ServiceFigure 3.23-2 The Metric Space of Execution Sequences ofProtocolsProtocol systems are highly recursive in nature. Consecutive actions may be identicaland may originate from a single source (data packets arriving over a network) or frommultiple sources (connection requests from several users.) We define a concise notationfor the control component of an execution sequence by condensing consecutive identicalactions into a single tuple: the action a and its recursion level a yield the pair (a, a).3-2. THE METRIC SPACE OF EXECUTION SEQUENCES OF PROTOCOLS 39The recursion level is the number of actions which have been condensed. For example,the execution sequence bbbaccbaaa has the concise form (b, 3)(a, 1)(c, 2)(b, 1)(a, 3). Thismapping of execution sequences of a process onto the space of the sequences of pairs(ak,ak ) we will denote as 4 . . We give the following formal definition of the mappingDefinition 3.1 (of the mapping Let s be a sequence of observable events of a process,s = {rni}fL 1 ,L E N U {oo}, ( N denotes the set of natural numbers). Denote by s(k)mk E E, the k-th element of .s (E is the set of elementary protocol actions). With thissequence we shall associate a sequence of pairs c (s) = {(ak,ak)}f—i, K E NU{oo} inthe following way:a1 = s(1) =^= 3(04)s(ai + 1)a 2 = s(cti + 1) =^= s(ai ce2)^s(cei^ce2 + 1)k-1ak = s(E a.; + 1) = = s(E a j )j=1^j=1kaj +1), k = 3,4,...j=1whereak E E, ak E N.If for some k,s(Ejkj ai + 1) = s(i) for all i > 3. cf.; + 1— =then we set40 CHAPTER 3. A METRIC SPACE OF PROTOCOL CONTROL BEHAVIOURk-1ak = s(E °=1 a3 + 1), and ak = oo.In this mapping, a pair (ak , ak ) represents a recursive cluster of the event ak, whoselevel of recursion in the initial sequence s is ak . The recursive clusters are temporallyordered in the sequence c(s) by the relative position, in the sequence s, of the correspond-ing events ak which generate each recursive cluster. A single occurrence of an event hasa level of recursion equal to 1; oo is used to characterize any infinite execution sequence.3-2.1 The Definition of Testing Distance dtFrom this point on, we assume that all execution sequences are represented in concisenotation unless otherwise noted.We will let a, b,c,... denote elements of the event labelset E, and use Greek lettersa, /3,7, ... to denote the recursion level of an event cluster. Likewise, we will use S todenote the set of all execution sequences of a protocol system specification, T, U, . . . todenote its subsets. By s, t, u, o, ... we denote execution sequences, and e an emptysequence. A dot is used as a concatenation symbol. Further details of the notation canbe found in Appendix A.Definitions of concepts in metric spaces that are used in this chapter can be found inAppendix C.Let p, and r be functions satisfying the following properties:P1 {pk }ck°_, is a sequence of positive numbers such that Eci,"1 1 pk = p < oo .P2 frk h'cLo is an increasing sequence in [0, 1] such that^rk = 1. Put roo = 1.Definition 3.2 (distance dt) Let s, t be two (finite or infinite) execution sequences in Sand let s = {(ak,ak)} 1 1 , and t = {(bk,ifik)}fr i , K, LENU{ 04 The testing distance3-2. THE METRIC SPACE OF EXECUTION SEQUENCES OF PROTOCOLS 41between two execution sequences s and t is defined asmax{K,L}dt(s,t) = E pk8k(s,t)k=1whereSk(s, t) =^Irak — rf3k1 if ak = bk1^if ak 0 bkIf s and t are of different lengths then the shorter sequence is padded to match the lengthof the longer sequence, so that 6k = 1 for all k in the padded tail.We can now prove the following theorem.Theorem 3.1 dt is a distance in S.Proof.1. dt(s,t) is a non-negative real number for each s, t.Indeed, since0 < bk(s,t) < 1 for all k = 1, 2,... ,max{K, Lbwe haven^n^co0< Epk6k(s,t) < Epk < Epk = p < oo for all n E N.k=1.^k=1^k=1Thus, Er_ i pk bik (s,t) converges and consequentlymax{K,L}0 < dt(s,t) = E pk8k (s,t) < 00.k=12. dt(s,t) = 0 if and only if s = t.Indeed, dt(s, t) = 0 iff 45k(s,t) = 0 for all k = 1, 2, ... ,max{K, L}. Clearly, 8k(s,t) = 0for all k = 1,2, ... , max{K, L} is equivalent to K = L and ak = bk and ak = /3k for allk = 1,2, ... , K = L, what gives s = t.42 CHAPTER 3. A METRIC SPACE OF PROTOCOL CONTROL BEHAVIOUR3. dt(s,t) = dt(t,^)It follows from the definition of Sk(s,t), k = 1,2, ... ,max{K,L}, thatk(s^= k (t, s), k =Consequentlymax{K,L}^max{K,L}dt(s,t) = E pok(s,o= E pok(t,^) = dt(t,^)k=1 k=14. Now we will show that the triangular inequality holds for dt.Let s,t,u be three (finite or infinite) execution sequences in S. The following holds:45k (S) U)^46k(U, t) > 81c(S, t), k = 1,2,...,max{K, L,Consequently,dt(s, u) + dt(u, t)max{K,M}^max{L,M}> ^pkak(3,0+ : pk6k(u,t)k=1 k=1max{K,L}>^> pok(s,t)k=1= dt(s,t)Hencedt(s, u) + dt(u, t) > dt(s, t)and the triangular inequality indeed holds. ^3-2. THE METRIC SPACE OF EXECUTION SEQUENCES OF PROTOCOLS 433-2.2 The Interpretation of the Distance dtA. An Intuitive Comparison of Distance FunctionsPerhaps the most straight forward way to illustrate the distance dt, is to compare it tothe other well known distance, denoted as da, used in the theoretical investigations ofthe theory of concurrency. This distance was previously mentioned, (Ch. 3-1.3), and itis evaluated on the basis of individual events and their temporal ordering only. We showwhat has been gained by introducing the notion of recursion into the definition of distancedt.In Figure 3.3 we give a simple example to motivate the definition of the distance dt.We will let da(s, t) = 2 -k , where k is the length of the longest common prefix of stringss and t. (Note that strings in this metric are not represented concisely.) We will let dtbe defined as in Definition 3.2, and pk = 1, k = 1, 2, 3, otherwise pk and rk satisfy theconstraints P1 and P2.Then the following evaluations of the distances between representative execution se-quences hold.1. da(s, u) < da(s,t), dt(s, t) < dt(s, u)2. da(t, s) = da(t, w) = da(t, u)3. dt(t, s) < dt(t, w) < dt(t, u)By failing to recognize recursion, the distance da, if applied in testing, would generate:1. counterintuitive preference for test cases, where the occurrence of a completely newevent (abort) and new test scenario (connect - data - abort) is treated as "closer"or more similar, therefore less valuable, than a sequence involving exactly the same44 CHAPTER 3. A METRIC SPACE OF PROTOCOL CONTROL BEHAVIOURconnect I^connectconnectdatadatadatareleaseconnectreleasedata data datadatadatadataabortreleases t^u wIncreasing Similarity Information by Recognizing RecursionFigure 3.3events (connect, data and release) and exactly the same event ordering (connect -data - release), except for one extra occurence of one event (case 1.)2. ambiguity in test case preference, where an entirely new event combination (data.abort)and additional event occurrence (abort) is treated the same as all other previouslyencountered event orderings and event occurrences, and ambiguity with respect tothe extent of differing in the level of recursion (case 2., vs. case 3.)One can therefore view the distance da as approximating similar behaviors by basingits decision on the granularity of a single event, and the first occurrence of the differenceat that level of granularity. The distance dt does the approximation on a more coarselevel, that of clusters of events generated by consecutive occurrences of the same event.3-2. THE METRIC SPACE OF EXECUTION SEQUENCES OF PROTOCOLS 45Therefore, a new event cluster (abort,l) will be recognized by the metric, and consequentlythe sequence u will differ by at least p 3 from any other sequence in the example. Inaddition, the decision is not reached based on the first occurrence of the difference at thatlevel of granularity. Rather, the cumulative effect of differences in individual clusters andtheir respective levels of recursion is scanned along compared sequences and assigned anappropriate measure of mutual approximation.B. The Parameterization of dtThe distance dt therefore requires two parameters. The similarity of a temporal ordering ofclusters of events is measured by the sequence {pk } kx2.1. , where each individual pk representsthe weight related to differing with respect to that position of the cluster in an executionsequence. Several options are allowed, according to the Definition 3.2. In particular, theweight of the clusters of events related to their absolute position within the executionsequence may be defined such that1. the weights converge in a monotonous decreasing sequence (Figure 3.43 , functionPi)2. the weights are equal up to any finite n E N, after which they converge in a de-creasing sequence ( Figure 3.4, function /4)3. the weights are different but finite, up to any finite n E N, after which they convergein a decreasing sequence (Figure 3.4, function ADThe similarity of recursive depth for the same position of event clusters and identicalevents ak, along an execution sequence is measured by the function 8k. Notice that in thedefinition of dt, this measure is not constant for a constant difference in recursive depth.3Although we use the functions pk and rk only at discrete intervals, we plot them as continuousfunctions to better observe trends in their behaviour.46 CHAPTER 3. A METRIC SPACE OF PROTOCOL CONTROL BEHAVIOUREvaluation of the function pkFigure 3.4The values of rk, by property P2, may be computed from a function that fluctuatesbetween concave/convex on a finite subdomain, (i.e. for (initial) rk values), but is convexon its subdomain starting at some large enough k. For simplicity of discussion, from nowon we will assume that rk is convex on its entire domain. Under such assumption, if twosequences differ at two event clusters in recursion only, more weight will be given if theevent clusters are carrying recursion depths of 2 and 3 (for the two sequences respectively),than if the event clusters carry recursion depths of one thousand and 2 and one thousand3-2. THE METRIC SPACE OF EXECUTION SEQUENCES OF PROTOCOLS 47and 3 (again respectively to the two sequences) 4 . To see this, consider the evaluation ofthe function Sk on an example of the function rk, given in Figure 3.5, where a3 — al =a7 — a5 , 8k, =Ir«5 —^6k2 = Ira i — ra3 10aa 1^a3^as^a 7Evaluation of the function SkFigure 3.5Under the convexity assumption, the following result holds.Proposition 3.1 Let s be a sequence s = {(akoak)} ik‘_ 1 , and let t be a sequence t ={(bk,N)}i'-1 such that K = L and ak = bk for all k 1, 2, . , K = L. Let ako No forone specific ko E 11,2,...,10, and let ak = Ok for all k EI. for a fixed ako , and for No > ako , the distance dt(s, t) increases as No increases2. for a fixed ako , and for No < ako , the distance dt(s,t) increases as N o decreases'This characteristic of the function 8k enables the generation of totally bounded metric spaces. Forthe completeness result of the metric spaces, it would be sufficient to consider a simpler function 6k whichis constant with respect to the constant difference in recursion depth [59].48 CHAPTER 3. A METRIC SPACE OF PROTOCOL CONTROL BEHAVIOUR3. for a fixed difference ako — Oh, the distance dt(s,t) decreases as ako increasesProof. The proof follows directly from the Definition 3.2. and the convexity assump-tion. ^Notice that, if the convexity assumption for function rk does not hold, then Proposition3.1 holds for large enough ako and No .The practical significance of the distance dt being parametrized by two functions,weighing the difference in (i) event clusters at a position k within two sequences, and(ii) recursion levels within respective event clusters, is in that it enables test designers tospecify their individual criterion of the relative importance of testing new event patternversus a finer distinction in the level of recursion. Figure 3.6 illustrates several simplecases. The specific functions used are pi (Figure 3.6 (a)) and pt (Figure 3.6 (b)) (bothfunctions are taken from Figure 3.4), and constant values for Sk = Si = 0.5 and Sk82 = 0.25 for all k E N U { oo}. The area under each curve pik Si (pik 82 ), for i = 1, 2, isthe distance between any two sequences that (i) have ak the same for all k, and (ii) allevent clusters constantly differ in recursion level by the value Si (82 ). The area underthe curve pi (p) is the distance between any two sequences that constantly differ in akfor each k (i.e. each event cluster (ak , ak)). The difference area between the lines pikand piSi (the unshaded area in Figure 3.6), is the value of the difference between testingnew event clusters at all positions k, versus refining the recursion to the value Si withineach event cluster of some existing test sequence. Similar holds for the difference areas ofother pk and pikki combinations, for i, j = 1, 2. The difference area between the curves,between the vertical lines i and i 1 is the value by which a single cluster at the positioni contributes to the distance between two execution sequences, in cases where these differin that cluster by recursion 8 1 (82 ), or in the event a i itself.(a)p2 81kPi 82plk 8,pi 82k(b)3-2. THE METRIC SPACE OF EXECUTION SEQUENCES OF PROTOCOLS 4911/2^ 1/21/4Pk21 - Cr7 -1 -rrll - rT1Evaluation of the function dtFigure 3.63-2.3 The Finiteness of &dense Covering for the Metric Space(S, dt)The choice of trace equivalence as the applicable fault model for coverage testing of com-munication protocol implementations, leaves us with one outstanding problem: selectinga finite number of test cases from the infinite number of strings in the trace set of aprotocol process. The following theoretical result resolves that problem.50 CHAPTER 3. A METRIC SPACE OF PROTOCOL CONTROL BEHAVIOURTheorem 3.2 The space (S,dt) is totally bounded.Proof. Let c > 0 be arbitrary. Let ke be such that Er- k6-1-1 Pk < E/2. Put p = E ckti Pk .Let 74 E N, be such that rk > 1 —^= 6/2p for all k > rt e . The set^F = { {(ai,aj)},14 , aj E E, cri E^,n,} }is finite. The set of all spheres centered at elements in F and of radius e covers the entirespace.Indeed, let s = {(b;,^be arbitrary. Then there exists t = {(aj, aj)} /3t 1 in F, such1that a; = bj, for j = I,^, k,. Put ai = Oi if pj < n, or cri = n, if Pi > Ti e . It followsthatdt(s,t)max{k,L}E pk 8k (s,t)k=1k,^ oo< Epk Ira„ — roki + E Pkk=1 k=14-1-1k,^oo• Epk Ir., - 1I + E Pkk=1k,< E Pk + c/2 < rip 4- 6/2 = Ek= 1which completes the proof of this theorem. ^The total boundness result which implies the existence of finite covers for infinitemetric spaces (S, dt), offers theoretical foundation for test selection techniques where theselected test cases approximate the (infinite) set from which the selection is done to anarbitrary degree of precision.3-2. THE METRIC SPACE OF EXECUTION SEQUENCES OF PROTOCOLS 513-2.4 The Completness of Space (S, dt)Recall that S is the space of all sequences {(ak,ak)}ic_ i where K E N U {oo}. If K = 00then ak E E, ak E N for all 1 < k < oo. If K is finite, then ak E E, for all 1 < k < K,and ak E N for all 1 < k < K — 1, and aK E N U fool.Define the space :5 to consist of all sequences {(ak,ao}r_ i where ak E E, ak EN U {oo} for all 1 < k < oo if K = oo, and 1 < k < K, if K is finite.In a natural way the metric dt extends to the space kAs a consequence of the following theorem we will obtain that an extension S of thespace S is complete.Theorem 3.3 Let {807_ 1 , where 8„ = {(a7,cenbtii , be a sequence in k The followingstatements are equivalent:(i) The sequence {s„} °° 1 is a Cauchy sequence.(ii) The sequence {K„}„,"_ 1 is eventually constant (Definition C.9) or K r,^oo (noo), the sequence {a7},.,"=1 is eventually constant, and the sequence lanne° 1 is even-tually constant or c —+ oo (n^oo).(iii) The sequence {s„},n°°_ 1 , converges in the space (k dt).Proof. The implication (iii)^(i) is well known. We now prove the implication (i)(ii). Assume that {.3,-J°1 1 is a Cauchy sequence.We first prove that the sequence {K.}°°-1 is eventually constant or Kn —> 00 (n —>oo). For arbitrary m E N consider the setsUm ={nEN:Kn >m} and Lm ={nEN:Kn <m}.52 CHAPTER 3. A METRIC SPACE OF PROTOCOL CONTROL BEHAVIOURClearly Urn is a nonincreasing and Lm is nondecreasing sequence of subsets of naturalnumbers. If j E Um and k E L„, then K.; > m > Kk. Therefore by the definition ofSm we have that (5,,,(s i , sk ) = 1. Consequently dt(si, sk ) > pm > 0. Since {s r,} 7,"_ i is aCauchy sequence not both sets Um and L, can be infinite. We distinguish the followingtwo cases:(a) All Um , m E N, are infinite (i.e. all L„, m E N, are finite). This implies thatoo (n^oo).(b) At least one Um is finite. Let mo be a maximum natural number for which Um° isinfinite. Then Lmo is finite. If no is an upper bound for L ino then Kn = mo for alln > no , i.e. {KO is eventually constant.Now we prove that the sequence {q}„"_, is eventually constant. For every c > 0 thereexist Ne such that dt(s,,,s,,) < c for all n, m > NE . Choose f such that e < pk. ThenPk (5 k(s n , Sm ) < dt(sn , s in ) < c < pk , for all n, m > NE .^(3.1)Consequently Sk(s„,s„,) < 1 for all n, m > N. From the definition of Sk we conclude thatak = aln for all n, m > NE . Therefore the sequence 141°° 1 is eventually constant.Now we prove that the sequence {a7}„"_ 1 is eventually constant or (x7^oo (noo). The proof is similar to the first part of this proof. For arbitrary m E N consider thesetsUm ={n E N:ak >m} and L„ = fn E N : < ml.Clearly Um is a nonincreasing and Lm is nondecreasing sequence of subsets of naturalnumbers. If j E Urn and 1 E L„, then a'k > m > a lk . Since the function r is increasing bythe definition of dt we have thatdt(si,si) > pk(r — rat) pk(r, — rm _ i ) > O.3-2. THE METRIC SPACE OF EXECUTION SEQUENCES OF PROTOCOLS 53Since Is 7,17_ 1 is a Cauchy sequence not both sets Um and Ln, can be infinite. We distin-guish two cases:(a) All Um , m E N, are infinite (i.e. all Lm , m E N, are finite). This implies thatak^oo (n^oo).(b) At least one Um is finite. Let mo be a maximum natural number for which Um„, isinfinite. Then Lmo is finite. If no is an upper bound for Lmo then a'ki = mo for alln > no , i.e. lot7,1 7,"_ i is eventually constant. This proves the implication (i)^(ii).^Assume (ii). The first part of (ii) implies that we can define KEO =^Kn, whereE N U { oo}. For 1 < k < 1(,, put ak = limn_,,, al' and ak =^ak, whereak E N U {oo}. Put s = {(ak, ak )} ik(21. It is clear from this definition that if Ifec, isfinite than the sequence {.9 7,} 1 is eventually constant. In fact it eventually equals .s.Therefore it converges. If K,„„ = oo the sequence {sn }„°1 1 converges to s. Indeed, let> 0 be arbitrary. Let lc, E N be such that EZ:+1 pk < E/2. Since each of the sequences{aZ}n"_ 1 , 1 < k < k, are eventually constant there exists n e E N such that all = ak forall 1 < k < lc, and for all n > n e . Since^= ak for all 1 < k < lc, we can choosen, such that ra k — raz = Sk(8,-„s) < E/2p (here p =^1 pk) for all 1 < k < k, and for alln > n,. Therefore for all n > Ti e we havedt(s„, s)Epok(s.,^)k=1ke^ooE Pk 6k(Snl S)^: Pkk=1^k=ke+1E kepk + —2k.12p P + = E.54 CHAPTER 3. A METRIC SPACE OF PROTOCOL CONTROL BEHAVIOURThis proves that the sequence 1..5.0,7_ 1 converges and the implication (ii)^(iii) is proved.An immediate consequence of this theorem is:Corollary 3.4 The metric space (S, dt) is complete.It is a general result in the theory of metric spaces, that a metric space which is totallybounded and complete, is also compact. Therefore, we state the following theorem as asimple consequence of the previous results.Theorem 3.5 The metric space (S, dt) is compact.3-3 Control Coverage EvaluationBy defining a metric which measures how far a test sequence is from its closest potentialneighbor (another test sequence) in some other set of execution sequences of the specifi-cation or in a given test suite, the intuitive notion of similarity of test cases is describedin analytic terms. Intuitively, the smaller the distance, the closer or more similar a testsequence is to some specified or already selected test sequence. Such a notion of distancebetween two sequences can be extended to sets of sequences. With this extension, themetric dt allows us to view the selected test suite not as an arbitrary or subjective col-lection of test cases, but rather a metric space for which the distance from the originalspecification or oracle test suite can be calculated to facilitate its analysis and comparisonwith other test suites.The definition of the control coverage is based on the distance between each element ofthe set of execution sequences (i.e. the superset (of tests) derived from the specification),and its closest approximant in the set of selected sequences (or selected test suite). The3-3. CONTROL COVERAGE EVALUATION^ 55formal definition of a distance of a point from a set in metric spaces, is given in AppendixC, Definition C.6.Also, the comparisons which are based on averages, in general have to be evaluatedfor all elements of the infinite execution space. The problem is resolved by basing theevaluation on the metric comparison of truncations of execution sequences. The followingdefinition formally specifies the concept of truncations of execution sequences.Definition 3.3 Let { ak}L i be a sequence, of events ak , in the space S of sequencesof observable interactions (i.e. not in concise representation), and let n be a naturalnumber. A truncation of length n of this sequence is defined as follows: If L > n, thenthe truncation is the sequence { ak}k=i , and if L < n, then the truncation is the sequenceitself, {a k}L i . [:3Denote the set of all truncations of length n by Sn . Note that it is not necessarily truethat Sn C Sn+1 . Figure 3.7 illustrates the example ISO 8072 specification trace space,where the stratification of the space is based on the length of truncations.3-3.1 A Coverage Metric for Pseudo-Exhaustive Testing Strate-giesGeneral acceptance testing, final system integration testing, and some availability testingstrategies, will produce a final verdict on the quality of the implementation system interms of how well it performs (and conforms) according to the entire specification. This isof course impossible to judge precisely since the domain of execution sequences is infinitein general. In terms of an objective measure, however, the following estimate can bedetermined through the metric dt.To measure the closeness of approximation of a (super) set of execution sequences of56 CHAPTER 3. A METRIC SPACE OF PROTOCOL CONTROL BEHAVIOURs„The Subspaces S„ of Trace Truncations for the ISO 8072 Transport ServiceFigure 3.7a specification S by its subset, a test suite T, we calculate the supremum of the averagedistances from elements of S, to the test suite T.Definition 3.4 Let S be the set of all execution sequences of a protocol specification andT its selected test suite. We assume that for any T there exists a natural number nT suchthat T C Sn, (i.e. we assume that T is finite and that there always exists the longest testsequence).Let kn be the number of different elements of Sn . For a natural number n, n > nT, wedefine vn to be the average of distances' from elements of Sn, to T. Therefore, for infinite5 The distance dt is, of course, measured on concise representations of execution sequences.3-3. CONTROL COVERAGE EVALUATION^ 57s,vn = Eidt(x, T), x E SOIcy,As a measure of how well T approximates S, we will take the normalized supremum ofVn6sup{vn : n > nT}a(T) =cic)°=1, PkDefinition 3.5 The average coverage of a set S by a selected set T is defined asCovAve(T) =1 — a(T)Notice that, removing a point from a set T cannot decrease the infimum in the cal-culation of vn , and converse, adding a point to the set S cannot decrease the supremumof the distances calculated from the points of that set. Therefore, we can formulate thefollowing remark.Remark 3.1 (Properties of CovAve)1. If T1 C T2 then CovAve(Ti ) < CovAve(T2 ).2. 0 < CovAve(T) <13. CovAve(Sn ) -* 1, n .-- 004. Coverage of an empty suite is 0 by definition.3-3.2 A Coverage Metric for Strategies With Pre-defined Tar-get Test SequencesIf a certain subset of a specification is pre-selected as a focal point of the testing campaign,as in safety-critical testing, test-purpose driven testing, estimated-usage profile testing,6 vn are positive real numbers that lie in an interval bounded from above, so supremum always exsits,58 CHAPTER 3. A METRIC SPACE OF PROTOCOL CONTROL BEHAVIOURthe knowledge of specific execution sequences that need to be well approximated in theoriginal specification may be available. In these cases, we are interested in knowing howwell a given test suite performs in the worst case.For the measure of the quality of coverage of a test suite T with respect to sets ofexecution sequences S, we will take the supremum of distances from the elements of thecomplement of T with respect to S, to the set T.Definition 3.6 Let S be the set of all execution sequences of a protocol specification andT its proposed test suite. (Note that the set S may be infinite.)Let S\T be the complement of the set T with respect to S. We define m(T) to be thenormalized supremum of distances from the elements of the complement S\T to the setT:m(T) suP{dt(x,T): x E S\T}EL-iPkDefinition 3.7 The worst-case coverage of a set S by a selected set T is defined asCovMax(T) = 1 — rn(T)Using the same arguments as in formulating the Remark 3.1., we state the following.Remark 3.2 (Properties of CovMax)1. If T1 C T2 then CovMax(Ti ) < CovMax( 712)•2. 0 < CovMax(T) < 13. Coverage of an empty suite is 0 by definition.In both definitions of coverage, it is implicit that sets S are infinite. If sets S are finite,the calculation of average distance a(T) can be done without intermediate Si, sets, andsupremums in both definitions are maximum distances.3-4. METRIC BASED TEST SELECTION^ 59It is important to note that, in metric based theory, test coverage is expressed inindependent objective terms rather than by "subsumes" relationships. The meaning ofcoverage figures is therefore purely analytic. We expect that poor coverage figures in thistheory will help point out undertested areas and prompt examination of test suites thatdiffer widely in their coverage of each other. Similarly, we expect that good coveragefigures will increase our confidence in the tested implementations. This is in keeping withthe theoretical results already considered in the introductory part of this chapter (Section3-1.1), and the intention of the theory to provide theoretical basis for testing systemswhich are possibly non-deterministic in their implementation.3-4 Metric Based Test SelectionThe ability of the theory to support building effective tools and techniques for commu-nication protocols engineering is essential in the development of reliable communicationsoftware. In this section, the testing distance metric is incorporated into a test selectionalgorithm. A simple procedure using a stratified selection approach for the control spaceof system behaviour is presented. The essence of the selection method is the generationof &dense sets of test sequences (the definition of &density is given in Appendix C, Defi-nition C.5), that approximate some original test suite to some target accuracy of distancec in a convergent manner, ultimately establishing the string equivalence between the twosets involved.3-4.1 Metric Based Test Selection AlgorithmThe selection algorithm for the metric based method (MB selection method for short), is amulti-pass process in which each pass is realized by the selection function SELECT(T o , G, e, C),which returns a selected set T, being an &dense (Definition C.5) subset of the original set60 CHAPTER 3. A METRIC SPACE OF PROTOCOL CONTROL BEHAVIOURG of test cases generated by some test generator, such that the cost of T (which includesthe initially selected set To ) is less than some given threshold cost C.The cost function of a test case can be defined to represent the resources (time andspace) required to execute that test case, e.g. its length. The cost of a set of test casescan be defined simply as the sum of the cost of the individual test cases in the set.Metric-Based Test Selection AlgorithmStep 1.Step 2.Initially, the selected set T isempty, G is the given (generated)set of test cases, f is theinitial target distance, and C isthe given cost threshold forthe selected set.While Cost(T) < C do T =SELECT(T,G,e = Elk, C)for some scaling factor k > 1,applied at each iteration(that is, each pass).Step 3. Stop. No further pass is possiblebecause any other test case addedto the set T of test casesselected so far violates thecost constraint.The function SELECT(T, G, 6, C) (that is, each pass in Step 2 of the above general algo-3-4. METRIC BASED TEST SELECTION^ 61rithm) can be specified as follows:^Step 2.1^Let X = G\T, i.e. G excluding T.^Step 2.2^If X is empty return T and exitElse remove (randomly)some test case t from X.^Step 2.3^If dt(t, T) < e goto Step 2.2.^Step 2.4^If Cost(T U {t}) < C then addt to the selected set T.^Step 2.5^Goto Step 2.2.In this algorithm we use a greedy heuristic in selecting the next test case t (Step 2.2).In fact, we simply select the first randomly encountered test case which is sufficiently far(greater than e) from the already selected test cases (Step 2.3), and which satisfies thecost constraint (Step 2.4).The scaling factor k is such that the target distance e decreases with each pass ofthe algorithm. The multipass algorithm therefore generates Cauchy sequences over thesuccessive passes. This process of Cauchy sequence generation is illustrated in Figure3.8, where test sequences are selected in the following order:Pass 1: 4 1 , density e l .Pass 2: 42 , 43 , density 62 .Pass 3: 44 ,^, 46 , density €3.62 CHAPTER 3. A METRIC SPACE OF PROTOCOL CONTROL BEHAVIOURThen, the initial segments of generated Cauchy sequences are, e.g., ts„t„,t„, also t„,tss ,t„and t„, t„.Generating Cauchy sequencesFigure 3.8The multipass algorithm presented above eventually terminates if and only if the initialset of test sequences G is finite. One possibility is that G can also be generated from arandom test generator, and the MB test selection algorithm run simultaneously with it.Assuming unlimited resources, set G would be infinite. In this case, the termination of theloop of Step 2 of the algorithm must be insured by an additional variable bounding fromabove the number of iterations ("tries") for a particular pass. Notice that terminatingthe loop prematurely (before the target density for the pass is provably achieved), doesnot affect the convergence properties of the algorithm in the limit. To see this, recall that3-4. METRIC BASED TEST SELECTION^ 63the algorithm always generates Cauchy sequences which are convergent. In the limit (i.e.given unlimited resources), every Cauchy sequence will be identified and its latest selectedelement approximated to an arbitrarily small c > 0, even if some intermediate membersin some sequences are skipped.The MB selection algorithm presented above can also be a basis for another viewon test selection from infinite sets. Recall that the metric spaces (S, dt) have a finitediameter. The proof of Theorem 3.2 gives a method for computing the elements of the setS, that cannot be further to already selected test cases than the current target value ofe. Given a finite representation of possibly infinite test sets G, this result may be used inthe Step 2.1. of the algorithm, to discard all but finitely many test cases that are releventto the current pass of the algorithm.3-4.2 Test Selection as a Convergent Approximating ProcessIn this section, we present and prove some interesting properties of the MB selectionmethod.Proposition 3.2 Let e denote the target density (distance) of the test selection algorithm,and assume the cost threshold is unlimited, i.e. cost is not a constraint. If the input set, G,of test cases for this algorithm is es-dense in the original specification, then the algorithmwill produce an c sd-dense cover of the original specification, wheref s el = f G + f •Proof. The proof follows immediately from the specification of the MB selection algo-rithm and the triangular inequality property of the distance dt.64 CHAPTER 3. A METRIC SPACE OF PROTOCOL CONTROL BEHAVIOURConsequence 3.1 The algorithm is able to select a test suite Ts whose density (6 sel ) inthe original specification can be made arbitrarily close to the density (EG) of the input(generated) set G of test cases with respect to the original specification. (For example, ifE = 11nEG, then E sei = EG + 1 I neG, and .so, E sei --- EG as n -4 oo).Theorem 3.6 Every (infinite) sequence in the space of execution sequences of a protocolis a limiting sequence of some Cauchy sequence in the same space, when this space isviewed as a metric space (S,dt).Proof. Let t = {(ak,ak)})"1 1 be an infinite sequence. For n E N, by to we denote thefinite sequence {(ak,ak)}Z=1 . This sequence is a Cauchy sequence (by the dt distancedefinition), and is convergent (by the Completeness Theorem). Moreover, we prove thatEZ--n+ithe sequence Itn*M i converges to I. In fact, we have dt(t,,,, t) = pk• Since theseries Er_ i pk converges, for every e > 0, 3N, such that Er--.+1 Pk < e for all n > Are .Thus Vf > 0, 3N, such that dt(t„, t) < e for all n > N. This proves that {tn*} convergesto t.These results imply that the MB test selection algorithm is a convergent approximationprocess such that the more test cases selected (through successive passes), the closer theselected set comes to the original set, with no relevant test cases or regions of test casesbeing missed out. The last theorem implies that in the approximation process, the selectedset may contain only finite test sequences even though the original set may contain infinitesequences, and still, the previously established approximation properties will hold. It isinteresting that we have arrived to the same result in the metric based theory, as isobserved in [27]. There, Hennessy observes (without proving) that one can show that,although infinite tests are allowed (by the theory of testing based on observation), finitetests suffice: if two processes are distinguishable, they are distinguishable using a finitetest.3-4. METRIC BASED TEST SELECTION^ 65As observed in [39], an exhaustive tester like a canonical tester ([2]), can be viewedas a theoretical upper bound on the testing process for communication protocols. Thetrace set of the canonical tester is equal to the trace set of the protocol specification ittests. The selection algorithm based on the metric dt, defined on sequences of atomicprotocol events, can be seen as a convergent approximating process for this theoreticalupper bound. The degree of approximation of such a theoretical upper bound is onlylimited by the amount of testing resources available. Given adequate resources, it willapproximate every (finite and infinite) sequence in the starting set to an arbitrarily smallc > 0. The exact mode of approximation depends on the characteristics of the metricparameters selected by a test designer, but the convergence always exists if the constraintsof the metric definition are satisfied.3-4.3 Mixed Level Hierarchical Test SelectionThe results in the previous section show that the MB test selection method is suitable forapproximations of theoretical upper bounds of testing processes derived from formal testrelations based on atomic event granularity. In this section we show that the MB methodcan also be used for a more general case of mixed level hierarchical test selection.Large (and infinite) systems are specified using hierarchical descriptions of their be-haviour. LOTOS, for instance, describes distributed systems in terms of processes asfollows:- a system is always described by a process- a process can be defined in terms of (interacting) subprocesses.A recursion mechanism allows for the definition of large (and infinite) processes.Given such a hierarchical system description, corresponding fault models may be es-tablished using these different levels of abstraction. As pointed out in [56], it may be thatthe consideration of hierarchical system description and corresponding hierarchical fault66 CHAPTER 3. A METRIC SPACE OF PROTOCOL CONTROL BEHAVIOURmodels is essential for dealing with fault diagnosis for more complex systems.Since the metric dt is defined in terms of behaviour graph only, specifically the se-quences of interactions derived from it, recursion is not limited to repetition of a singleevent, but of a subsequence, or even a set of subsequences. The following set of resultsgives the theoretical foundation for mixed level hierarchical test selection using the MBmethod.Let S be a set of execution sequences of atomic events. Let S, = {s i ,...,sn }, si =(n, L , yi E N), be a finite set of finite subsequences of elements of S. Theneach t = {(tio3i)} i E S can be written, in the obvious way, as a concatenation ofsubstrings of t and elements of Ss (substrings of t may be empty):t =^. • • Siko tlength(sio )-F...-1-length(s, k0 )-1-rno • • • tlength(.9. 0 )-F...-Flength(ssko )-Fmi S iko -fi • • '.. tmtSikl ^ Sikmfor sik E Ss or sik = e (empty sequence), for all k = ko ,^,The length of a sequence is taken, as usual, to be the number of its elements (inconcise notation, the number of pairs (ci , 7i )). Denote by t( ss ) the sequence (t') obtainedby substituting, in the sequence t', all occurrences of nonempty sequences si, by theiridentifier in Ss ', in a concise representation.Example 3.1 Consider the following example.A := b; BB := c; CC := a; And; DD := stopand Ss =^= {(b,1)(c,1)(a,1)}Then for some sequence t = {(b, 1)(c, 1)(a, 1)(b, 1)(c, 1)(a, 1)(b, 1)(c, 1)(d, l)}7The identification of substrings is suboptimal in case of sequential composition of an event a and aprocess P, where P can also evolve first .by the same event a.3-4. METRIC BASED TEST SELECTION^ 67t ( 5, ) = { (si, 2) ( b, 1) (c, 1)(d, 1) }Both sequences t and 1(53 ) obviously describe the same behaviour given in terms oftemporal ordering of atomic events in an execution sequence. From now on, we shall referto them as the t and the t( ss) representations of that behaviour.Theorem 3.7 Let (S, dt) be a metric space. Let S 3 be a finite set of finite subsequences ofexecution sequences t = {(bi ,^E S. Let S ( 58 ) = ft (s8)It E S}, t (53) = {(a i ,Then1. (S(5,), dt) is a totally bounded metric space2. the metric space (§(58 ), dt), obtained by completing the set 454 with ideal elementsin the manner described in Section 3-2.j, is complete3. the metric spaces (S, dt) and (458), dt) have the following characteristic: if a se-quence {tk }ck'l l is a Cauchy sequence in the space (S, dt), then the sequence { t(sok }°,1 1 ,where t(s.), = tk(s3) ,Vk, is a Cauchy sequence in the space (4 58 ), dt) and vice versa.Proof. The proof of 1. and 2. is straightforward after observing that the properties ofmetric spaces of strings of identifiers with the metric dt are independent of the granularityof events in execution sequences.The proof of 3. follows from the following observations.Case 1: Let each sequence t(s.), be obtained by substituting at most an initially deter-mined finite number r of occurrences of strings from S3 in the corresponding sequencestk.Differing in, at most, a fixed finite number of occurrences of finite substrings within theirindividual elements, does not affect the convergence properties of either Cauchy sequencein the limit. Therefore, if t-representations of execution sequences form a Cauchy sequence68 CHAPTER 3. A METRIC SPACE OF PROTOCOL CONTROL BEHAVIOURin the space (S, dt), their t( s8 )-representations also form a Cauchy sequence in the space(450, dt), and vice versa.Case 2: Let the sequences t (sok {(4, alic)}Ki_ki be obtained by substituting an arbitrarynumber of occurrences of strings from Ss in sequences tk =Case (i). In this case, sequences t(ss ) k may also contain elements of the form (a il, an,where each all = si E Ss , and such that a': = m, for some arbitrary m E N. Let onesuch element be (4, aV. It follows that in sequences tk there exist subsequences witharbitrary numbers of repetitions of the sequence 4 s i E Ss . Let {tk}ckt i form a Cauchysequence. From the property of Cauchy sequences in the space (S, dt) (Theorem 3.3. (ii),second part), and the fact that sequences tk allow arbitrarily large number of substitu-tions of (finite) substrings, it follows that the sequences tk must be such that their lengthtends to infinity. Also, by the same theorem, each of the sequences {1 ,1 }Z?_i is eventuallyconstant, and { /3P}Zl i is eventually constant or /3P oo(k oo). Infinite number ofsubstitutions cannot be derived from cases where /3I oo, since the set S, is finite.It follows that, the arbitrary number of substitutions of the string si is obtained, fromeventually constant strings Vinci:3_ 1 , 1 < k < ke , where ke (E N) -* oo, with eventuallyconstant {OP} /),°_,. Therefore, for {4}1'1 1 , in the t (s.,) representations of tk strings, thesequence {at }r_ i is such that cti o (= m) —+ oo, as k oo. By the second property inTheorem 3.3, (ii), the sequence {t(s.,),} kxl i , is Cauchy.The reasoning in the opposite direction (i.e. if {t( ss )k lixL i is Cauchy, then {tk} k)°_. 1 is alsoCauchy), is carried out using the same arguments.Case (ii). Alternatively, the arbitrary number of substitutions of finite strings sj E Ssin tk, does not generate elements in t(ss), with arbitrary recursion levels. (For instance,every other substring is substituted, in an infinite string.) It is easy to see that, in thiscase, if either representation is Cauchy in its space, by Theorem 3.3., sequences {bn(k)?_1and lanr_ 1 , as well as l itl i°,1 1 and {(4}ck<L i , are eventually constant, and lengths ofsequences tk and t(s3 ),, tend to infinity, as k oo. Therefore, by Theorem 3.3., the otherone of the sequences (precisely, the other representation) is also Cauchy, in its metric3-4. METRIC BASED TEST SELECTION^ 69space. ^From the proof of the previous theorem it is obvious that, irrespective of the repre-sentation of execution sequences in S (i.e. in t or 4,54-representations), if one set T1 ofexecution sequences approximates (in terms of the maximum distance m(Ti )) some setG C S to an arbitrary c i > 0 in its metric space, then there exists such € 2 > 0, andthe set T2 (the alternative representation of T1 ) in another metric space representation,such that it approximates the same set G with E2 . The details of the proof reveal that,this is in particular true of sets T1 and T2 of Cauchy sequences, which are t and 4,54representations of each other. The consequence of this observation is that, the MB se-lection method, which generates Cauchy sequences over its consecutive iterations, reallydoes the approximations in both metric space representations simultaneously: the speedof the approximations of a particular granularity level differs, however, depending on themetric space representation (and metric parameters, as before).In the introduction to this section, we mentioned that the results of the MB methodmay also be applied to metric spaces that contain sets of subsequences inside an executionsequence representation. We now formalize this result, which is essentially an extensionof the previous theorem.Let S be a set of execution sequences of atomic events, and (S, dt) a metric space.Let S = {S1, , SO, where Si = {sili E N}, and si = (n, L, rya E N), bea family of a finite number of sets of subsequences of elements of S, such that all setsand subsequences are finite. Then each t = {(ti, i3i)}1_ 1 E S can be written, in the waypreviously detailed for the set Ss , as a concatenation of substrings of t and elements ofSi E S.Denote by 4,5) the sequence obtained by substituting, in a sequence t, all occurrencesof nonempty sequences s ik E Si E S by the identifier of the set S; (i.e. the set they belongto in the family of sets), in a concise representation.70 CHAPTER 3. A METRIC SPACE OF PROTOCOL CONTROL BEHAVIOURTheorem 3.8 Let (S, dt) be a metric space. Let S be a finite family of sets of subse-quences of execution sequences of S, and all sequences and sets in the family are finite.Let S(S) = {t(s)It E S}. ThenI. (Sp), dt) is a totally bounded metric space2. the metric space (S(s), dt), obtained by completing the set S(S) with ideal elementsin the manner described in Section 3-2.4, is complete3. if all Si E S are disjunct, then every sequence that is Cauchy in (S, dt) is also aCauchy sequence in the space (S(s), dt).Proof. The proof of 1. and 2. is the same as in the previous theorem. Also, the prooffor 3. follows directly from the proof for 3. in the Theorem 3.7, the only difference beingpurely syntactic with respect to identifiers (of strings in Theorem 3.7, and sets in thistheorem).Notice, however, that the opposite implication in Theorem 3.8, 3., i.e. if a sequence isCauchy in (S(s), dt), then it is Cauchy in (S, dt), does not necessarily hold. (It only holdsin cases when this statement reduces to Theorem 3.7, 3., i.e. only one string in a set isused for substitution). To see this consider the following example.Example 3.2 Let S =^= {s 1 , s2 }, s 1 = {(a,1)(b,1)} and s 2^{(b,1)(a,1)}.Let s be the identifier for set Si..Let a sequence {tk}ckl i be an infinite sequence in (S, dt), wheretk = {(a,1)(b,1)(a,1)(b,1)...}, k = 1,3,5,...tk = {(b,1)(a,1)(b,1)(a,1)...}, k = 2,4,6,....Then tk(s) = {(s, a)}, a = 00, Vk = 1, 2, ....3-4. METRIC BASED TEST SELECTION^ 71The sequence ft(s)1 k`xL i , is obviously Cauchy (with the distance between any two se-quences equal to 0), whereasdt(tk, tk+1) = p, Vk = 1, 2, ... (p is the sum of the series pk).The last theorem implies that the convergence properties of strings in the (S(s), dt)space are strictly tied to their new syntactic representation and consequently, nothing canbe said about their convergence properties with respect to the space (S, dt).The final results in this chapter enable us to use the same MB-method and selectionalgorithm for mixed level hierarchical test selection and coverage evaluation. If event sub-sequences of execution sequences, derived from behaviour graphs, relevant to testing canbe identified, then the selection of 6-dense sets can be done relative to "direct" recursion,"mutual" recursion, or even at the level of individual processes or groups of processess.Additionally, Theorem 3.8 and the characteristics of the MB method are essential fortreatment of theoretical upper bounds on the testing process, which are essentially, butnot entirely based on the atomic event temporal orderings. Whereas the relevance of thefirst observation is evident in testing, the latter point will be further detailed in Chapter5.'For conciseness, we here use the terminology mainly inherited from process algebras.Chapter 4A Metric Hierarchy of FaultCoverage ModelsIn this chapter we apply the metric theory of the protocol execution space to investi-gate and classify the relative power of some well understood test selection methods inprotocol testing. These selection methods include transition and transfer fault checkingexperiments for classes of Finite State Machines [52], and control flow analysis [49, 55].These methods often involve exact evaluations of test selection criteria, and may need toimpose limiting assumptions about the recursion and parallelism in protocol systems inorder to be computable for very large or infinite systems. We show how these test selec-tion methods can be metrically characterized in their (finite) domains. The same metricbased methodology naturally extends from finite to infinite domains to encompass the fullrecursion and parallelism of real protocol implementations. This extension then charac-terizes all-paths infinite software test coverage criteria and trace equivalence for labelledtransition systems. This theoretically well-founded extension is supported by the proofof the existence of finite covers for infinite systems as well as the systematic convergenceof the metric based cover hierarchy for infinite systems. Finally, we put all the results in724-1. SOME GENERAL RESULTS^ 73a common theoretical setting of trace sets of protocols, to show how the hierarchy of testcovers formed by metric based theory can be related to previously investigated fault cov-erage models for certain coverage classes, and approximates the theoretical upper boundof trace equivalence.4-1 Some General ResultsThe characterization of coverage of protocol test methods in metric terms is investigatedusing two related concepts from the general theory of metric spaces: the concept of &densesets in metric spaces (Definition C.5), and Hausdorff distance (Definition C.8).Let (M, dt) be a metric space and T, S be two subsets of that metric space. Informally,T is f-dense in S if the family of all open spheres with centers at the points in T and withradius e forms a covering for S.Remark 4.1 Let the selected set of test sequences, T, be &dense in S. Then the coverageCovMax(T) can be related to the achieved density c, by observing thatsup{dt(s,T) : s E S\T} < cTherefore, according to the definition of rn(T), (Definition 3.6),m(T) = sup{dt(s,T): .s E S\T} Er-i Pk^< ECkli PkThus, (definition 3.7),CovMax(T) = 1 — m(T) > 1Eni Pkwhere Pk is the appropriate sequence for the distance dt used in the selection process.74^CHAPTER 4. A METRIC HIERARCHY OF FAULT COVERAGE MODELSTherefore, we will often use f-density to characterize the coverage of a selected test set,whenever the sequence {pk }(k)1 1 is fixed or normalization is not relevant.We also remark that the Hausdorff distance, takes on a particular meaning in theenvironment of sets of tests and their selected subsets.Remark 4.2 Let T represent a selected set of test sequences and S represent the set fromwhich the selection is done. Recall that the Hausdorff distance is assigned the greater ofthe two distances,dtH(S, T) = max{sup{dt(s, T) : s E S}, sup -NW, 5) : t E T}}Since T is a subset of S, then the second supremum in the definition of dtH need not beevaluated (it is always 0). Therefore if dtH(S, T) = e, it follows that T is e dense in S.Notice also that if the Hausdorff distance between a test set T and its superset S is equalto zero (or i-density of T in S approaches zero), the CovMax(T) becomes 1 (approaches1).We quote some general observations which hold in metric spaces of sequences of ob-servable actions characterized by the metric function dt.In the following propositions, let T and S denote subsets of a metric space (M, dt),where M is a (finite or infinite) set of execution sequences of a protocol behaviour. Fur-thermore, let t = (ak , ak ) be an element in T. Likewise, let s = (bk , /3k) be an element ofthe set S. We will also take that the sequence pk is a monotonously decreasing sequence,pm+1 < Pm, for all m. Similarly, let the sequence r, in the definition of dt be such thatr; — ri_ 1 is decreasing for j = 2, 3, .... These assumptions greatly simplify the subsequentanalysis.Proposition 4.1 If dtH (T, S) < pk for some 0 < k < oo, then for each s E S there exists4-2. ANALYTICAL COMPARISON OF TEST SELECTION METHODS^75t E T such that aj = bj , j = 1, 2, . . . , k.Proof. From dtH(S, T) < /:)k it follows that dt(s, T) < pk. Consequently, there existst E T such that dt(s,t) < pk. If for some j E {1, 2, ... , k}, a .; bj then dt(s,t) > pa.Therefore we must have cti = b.; for all j = 1, 2, ... , k.Proposition 4.2 If dtH(T,S) < pkpk for some 0 < k < oo, 0 < pk < 1, then for eachs E S there exists t E T such that aj = bj and Irc, — r,3,1 < pk, j = 1,2, ... k.Proof. From dtH (T, S) < pkpk it follows that dt(s, T) < pkpk. Consequently, there existst E T such that dt(s, t) < pkpk. If for some j E {1, 2, ... , k} aj^bi or Ir,„, — 7733 1 > Pk,then dt(s, t) > pkpi > pkpk. Therefore, aj = b.; and Irc,, — r,3 I < pk, for all j = 1, 2, ... , k.4-2 Analytical Comparison of Test Selection Meth-odsIn this section, we provide an analytical characterization of the error classes that can bedetected by a hierarchy of c-dense test covers in metric spaces of execution sequences ofthe protocol control behaviour. These covers do not involve automata-theoretic or othersoftware fault-model concepts and are not proposed as an alternative to such methods.Our purpose is to investigate the effect of the increasing coverage CovMax(T) of a selectedtest set on the error detection capability in software designs for protocols within a unifiedframework of infinite metric systems.The environment of execution sequences enables us to investigate the behaviour ofprotocol systems in a unified framework for all (formal) description techniques that al-low derivation of execution sequences in which the systems they specify may engage,76^CHAPTER 4. A METRIC HIERARCHY OF FAULT COVERAGE MODELSand without regard to the atomicity of their interactions (see also Section 3-4.3). Thenon-exhaustive list includes traces [17] for LOTOS and other LTS-based specification tech-niques, with the labelset of atomic actions or events, paths of state transition diagramscorresponding to Finite State Machines(FSM) labelled with transitions [29], derivationsand computations [7] for process algebras with labelset of (action,state) pairs, etc. Weuse this fact to our advantage, so that in each subsequent section, for each test methodand corresponding formal description model that the method is based on, we redefine theexecution sequence to consist of (not necessarily atomic) actions that are derivable withinthe model.4-2.1 A Metric Characterization of FSM-based MethodsFinite State Machine based methods for detecting errors in protocol implementations area major research area in protocol testing. We first turn our attention to the metric char-acterization of some basic fault models for finite state machines descriptions of protocols.This characterization is given in terms of the increasing coverage CovMax(T) of selectedtest set T with respect to S, in the metric space of paths generated by the state transitiondiagram that corresponds to the specification FSM (see [29] for details).A FSM is defined as a 5-tuple (S, E, so , (5, F) where S is a finite set of states, > is afinite set of input/output operations, s o E S is the initial state, and 6 : S x E --- S is astate transition function, and F is a set of final states. In the simplest case, one assumesa completely specified FSM. The type of basic faults considered in this model are thefollowing [56].Transition errors: A machine is said to have a transition error, if for a given state andinput pair it produces a different output than the specification machine it is compared to.4-2. ANALYTICAL COMPARISON OF TEST SELECTION METHODS^77Transfer errors: A machine is said to have a transfer error, if, for the correspondingstate and input received, the machine enters a different state than the specification ma-chine it is compared to.We assume a minimal, completely specified, deterministic FSM considered in protocolchecking experiments [52]. As noted in [52], interaction paths recorded by the statespace exploration procedure can be used to generate conformance test sequences. We usea similar methodology here for analysis. To every transition we assign a distinct labelt E L, where L is a finite labelset of transitions. We build a reachability tree, labelling theedges with transitions. This reachability tree obviously decomposes into distinct pathsoriginating at the root of the tree (state s o ), and traversing over the nodes (states in S)and edges (transitions in L) to the leaves of the tree. For uniformity of discussion, we callthese paths execution sequences. These paths are uniquely labelled by their edges. Weuse the simple technique outlined in the previous chapter, to represent these executionsequences concisely (Definition 3.1).We shall designate the metric space built on top of the set of all execution sequencesof a protocol FSM, supplied with metric dt, as (ExecSeqFsm, dt).Let NT , Ns be the minimum levels of the reachability tree corresponding to a FSMspecification, by which all transitions, respectively states, have been traversed by aninteraction path. (We take that the level of the root node s o within the reachability treeis zero.)Theorem 4.1 (Metric characterization of transition faults)Let (ExecSeqFsm,dt) be a metric space. IfETF < PNTthen every cTF -dense test suite in (ExecSeqFsm ,dt) has the same fault detection power78^CHAPTER 4. A METRIC HIERARCHY OF FAULT COVERAGE MODELSwith respect to transition faults as transition checking experiments on FSM-s (i.e. T-method [52]).Proof. The proof of this theorem follows directly from Proposition 4.1.We illustrate the generality of this approach in Figure 4.1 where a T-tour is generatedusing the knowledge of NT and metric only. (The metric based method is of course not asefficient for selection T-tours as other methods specially designed to deal with it.) Notealso that from now on we always assume that generated test sequences will be appro-priately mapped to input events only and supplied with other intermediate subsequences(such as reset sequence) in order the form the exact sequence required by a particularmethod.One e -dense set is^{ t0 t2 t3 tO,tO tl tl t2 t3 t0 }E < 1/4^p k = 1/ 2k-1^rk -- 1-1/2 k2Generating a T-tour using metrics onlyFigure 4.1Let the length of the longest state signature (MO or distinguishing sequence [29]) fora particular FSM be uiol.4-2. ANALYTICAL COMPARISON OF TEST SELECTION METHODS^79Theorem 4.2 (Metric characterization of transfer faults)Let (ExecSeqFsm ,dt) be a metric space. IfESF < (rNs-1- 1-1-uiol^rNs-Fuio l )PN9 4-1 -Fuiolthen every EsF dense test suite in (ExecSeqF sm,dt) has the same fault detection powerwith respect to transfer faults as transfer checking experiments on FSM-s using state sig-natures (e.g. U-method, D-method [52.1).Proof. The proof of this theorem is straight forward after observing that rNs+1+uiol —rNs+ thioi is the smallest value that can appear as the factor Sk in the evaluation of thedistance between any two sequences, whose length is bounded from above by the valueNs +1-1-uiol. Proof for this theorem then follows directly from the definitions of methodsand Propositions 4.1 and 4.2.Consider an application of metric dt in the above setting.Consequence 4.1 (A metric characterization of UI0 sequences)Consider a protocol finite state machine with n+1 states, and designate with sp o ,sp,,, the shortest paths to those states [52]. We construct a reachability tree forsuch a FSM so that the root of the tree is so E S, and edges originating at s o lead-ing to so ,s i ,...,sn are labelled spo ,spi ,...,spn, respectively. The reachability tree thencan further be constructed as previously described, and the generated set of all execu-tion sequences originating at so we denote as ExecS eqFsm sp . Consider the metric space(E x ecS eqFsm.„, dt).Then any suffix sui, of an execution sequence s = spisuio E ExecSeqFsm,,, , (wherespisuio denotes the concatenation of sequences sp i and suic,), is a UI0 sequence for states i if and only if80^CHAPTER 4. A METRIC HIERARCHY OF FAULT COVERAGE MODELSdt(s, E xecS eqFsm,\Itit = spitrestn >Proof. The proof of this consequence follows directly from the observation that the tailsuic, of any sequence starting at a node s i will differ from any sequence starting at anyother node si if and only if the distance between such two sequences is greater than /31 .Similar to [6, 52], we formulate a result relating the fault detection power of transitionand transfer checking experiments for a FSM satisfying the basic assumptions as statedearlier.Consequence 4.2 Every csF-dense test suite in the space (ExecSeqFsm,dt), is also eTF-dense in it. The opposite is not true, i.e. there always exist E TF -dense test suites whichare not csF-dense in the same space.Proof. The proof is based on the Theorems 4.1 and 4.2, as well as the observation thatNs +1 > NT and uiol > 1, under the assumptions about FSMs that we adopted.In other words, transfer cover is always a transition cover. Notice that the oppositedoes not hold.Finally, we formulate a general theorem relating the fault detection power of a test suiteselected by the MB method with respect to transition and transfer faults of specificationFSMs.Theorem 4.3 (Minimal fault detection power of an c-dense test suite)Let Ns and uiol be defined as before.Then every c-dense test suite in (ExecSeqFs m ,dt), whereNT' = max{kle < pk}Ns , = max{k le < (rNs+i-Fuiot — rNs-Fuiol)Pk4-2. ANALYTICAL COMPARISON OF TEST SELECTION METHODS^81can uncover all transition faults of transitions which appear up to level NT' in the reach-ability tree, and all transfer faults for states that appear up to level Ns, —1 — uiol, for acompletely specified, minimal and deterministic FSM.Proof. The proof for this theorem follows directly from Theorems 4.1, 4.2 and Proposition4.2.4-2.2 A Metric Characterization of Some Software Testing Cov-ersIn this subsection we examine some general software testing covers and selection strategieswhich have been applied in protocol testing for identifying sequencing errors [56]. As iscustomary in general software testing, we take that the protocol designs are representedby graphs. We label the edges of such graphs with observable protocol actions. For furtherdetails and general applicability of the method see [49, 55].We first present the brief definitions of basic software test covers [6] and then discussthe appropriate metric coverage hierarchy which would result in those covers.Branch Cover: The number of executable paths in a design with loops may be infinite.A common practice is to require that the test sequences form a "branch cover", i.e. everybranch in the design is traversed by at least one test sequence.Path Cover: A path cover consists of test sequences such that every path in the designgraph is traversed by some test sequence.In the following propositions, let T and S denote subsets of a metric space (M, dt)generated by the paths of the protocol design graph and metric dt. We omit most of theproofs as they are simple extensions of the above results.82^CHAPTER 4. A METRIC HIERARCHY OF FAULT COVERAGE MODELSProposition 4.3 IfdtH(T,S) < eBF < pk, for some 0 < kt < oo, T C Sthen the set T is a branch cover for the branches up to the level k = kt for the designgraph of S.Consider (as in [6]) that the design graph is generated by a FSM, and label the edgesof the graph with transitions. Then, if kt = NT, all branches in the design graph havebeen traversed at least once, by sequences in any EBF-dense sets T in S.We can therefore formulate the following result.Consequence 4.3 For branches in the design graph labelled with transitions of a min-imal, deterministic and fully specified FSM, for every eTF -dense set T in S, there is aneBF-dense set T' in S that is arbitrarily dense in it. The opposite is also true.We immediately obtain the same result as in [6], where automata theoretic methodswere used to derive this result.Consequence 4.4 A branch cover can detect all transition errors and not all transfererrors.Proof. The proof follows directly from Consequence 4.2 and Consequence 4.3.If the generation of the set S is constrained by the number of times a loop in theprotocol design graph is entered, (which is the usual technique for dealing with designgraphs with loops), then the set S is closed. In that case the following proposition holds.Proposition 4.4 Let the set S be closed. If4-3. APPROXIMATING INFINITE SYSTEMS^ 83dtH(T, S) = 0then T forms a path cover for S.Various sequencing coverages of the types characterized above have been used in protocoltesting, improved by combining them with data flow analysis [49, 55].4-3 Approximating Infinite SystemsIn this section we extend previous results by comparing the methods of test selection andcoverage for real protocol implementations which are specified as infinite systems. Thetechniques described in Section 4.2 assume a finite fault model which permits them to useexact selection techniques, fault targeting and subsume relationships for coverage. Exactselection techniques and finite fault models become impractical in infinite execution spacesgenerated by discontinuous and nondeterministic behaviour. In this section we adopt traceequivalence to characterize an infinite fault model: every string is a possible fault.It turns out that the effect of increasing coverage (decreasing e), on the fault detec-tion power of selected tests carries over for from finite to infinite systems, but taking ona slightly different meaning. Exploiting the link provided by the metric based frame-work, we summarize the main theoretical results within the framework provided by traceequivalences of labelled transition systems.4-3.1 A Metric Characterization of Trace EquivalenceIn the analysis that follows, we will change the theoretical setting to the most generalone, in order to compare techniques that differ in control space representation and testcharacterization. First, we assume that the specification formalism that is used has a84^CHAPTER 4. A METRIC HIERARCHY OF FAULT COVERAGE MODELSparticular class of language expressions for the description of system behaviour. The op-erational semantics of labelled transition systems and some formal specification techniquesfor protocols enable us to derive sequences of actions, in which a communication systemcan engage, from the structure of the expression describing the system behaviour itself(for details see [31]). Sequences of actions are also the fundamental control behaviourdescription element, that can be derived from a number of other specification formalisms.From this point on, we will carry investigation in metric spaces of all execution se-quences for any specification formalism that allows their derivation. For the particularcase of formal description techniques that allow specifications of concurrent systems andhave a finite labelset of actions, we will carry the investigation for trace sets derived byinterleaving semantics of concurrency.We view test selection process as an approximating process within infinite controlbehaviour spaces and base the approximation on the dt metric function.The dt metric function provides a measure of how well one execution sequence approx-imates another. An initial approximation of this infinite metric space can be achievedthrough an c-dense topological cover as demonstrated in Section 4.2. Such an approx-imation would have several benefits: similar behaviour patterns would be grouped andtested as a unit, the degree of approximation is easily controlled through the c parame-ter and parameters of the metric dt, and, intuitively, classes of similar faults will exposethemselves in similar situations of event pattern and recursion.The main concern of the proposed approximation technique must be the selection ofa finite set of test cases according to the selection criteria. Theorem 3.2 proves that themetric space is totally bounded which implies a finite c-dense cover for all values of E.We are therefore justified to turn our attention to characterizing any (finite or infinite)theoretical upper bound of approximations by c-dense covers, as c tends to 0.Let T and S be two subsets of a metric space (E*, dt), where E is the labelset of4-3. APPROXIMATING INFINITE SYSTEMS^ 85observable protocol actions. The following propositions are reformulations of results inthe general theory of metric spaces, given an interpretation in the setting of controlexecution spaces of protocols and their tests.Proposition 4.5 If S is closed then the following relation holds:sup{dt(t, S) : t E T} = 0 - T C SLet B1 and B2 be two protocol processes and T = Tr(Bi ) and S = Tr(B2 ) the setsof their traces. Then the above proposition gives metric characterization of the tracepreorder between processes Bi. and B2.Proposition 4.6 Let T and S be any subsets of the same metric space. IfdtH(T, S) = 0then the sets of all truncations of sequences in T and S are the same.Again, let B1 and B2 be two protocol processes and T = Tr(Bi ) and S = Tr(B2 )the sets of their traces. Then the above proposition characterizes the concept of traceequivalence between processes B i. and B2.Note that, in either case, the set T can be viewed as a set of selected tests for aspecification of a protocol system whose trace set is S. Then the above two propositionscharacterize the inclusion relation and the level of approximation of a specification by aset of tests with respect to trace equivalence.The above theorems characterize trace preorder and trace equivalence in terms ofthe metric function dt. As e tends to 0 in &covers ( and so test coverage CovMax(T)86^CHAPTER 4. A METRIC HIERARCHY OF FAULT COVERAGE MODELStends to 1), these characterizations imply that the covers approximate precisely the traceequivalence, thus providing the theoretical foundation for metric based theory of testselection and coverage. This last property is then used in a later section to build a metrichierarchy of protocol testing.4-3.2 A Metric Hierarchy of TestingIn Section 4.2, we examined different protocol test selection methods in the theoreticalsettings they are designed for. We now show that similar results hold if these methodsare viewed in infinite metric spaces of traces of protocol specifications.Let Tr(S) be a trace set of a protocol specification. Let the set of observable protocolactions E be the union of two sets which we call I (input) and 0 (output) events, E =I U 0. For the results of Section 4.2 to hold, we require that the sets I and 0 of inputand output actions of a protocol be disjoint. (This should not be too restrictive in caseof protocols). For NT, Ns and uiol defined as in Section 4.2, defineNStr = 2NsNTtr = 2NT= 2uiolConsider the metric space (Tr(S), dt) and letET < PNTtrES < (r N Str -1-21-uiolt r^rNStr i-utioltr )PNStr +2+uiolt rProposition 4.7 Let TST be any CT-dense test suite in (Tr(S),dt), and let T Ss be anyes-dense test suite in (Tr(S),dt). Then TST (7 1 Ss ), when applied to a minimal, com-pletely specified and strongly connected FSM, have the same fault detection power with4-3. APPROXIMATING INFINITE SYSTEMS^ 87respect to transition faults (resp. transfer faults) as ETF (resp, esF) dense test covers in(ExecS evsm, dt).Sketch of the proof: The proof of the above proposition is essentially a directconsequence of Theorems 4.1 and 4.2. However, one needs to notice that, by writing thesequences from the set Tr(S) in the concise notation as is customary with metric dt, werun the risk of collapsing together actions which may represent distinct actions in othermodels. We eliminate such danger in case of overlapping identifiers for the sets U and Iby imposing the above artificial restriction. We also implicitly assume that, (and this isthe case for models considered), if a FSM performs a transition with null output event,it will remain in the same state. This ensures that considering traces with observableevents broken down to atomic actions, and then collapsing them through their recursion,will not result in information loss. If (the maximum of) twice the length of subsequenceslabelled with transitions is examined in the same manner as before, the equivalent resultswill be obtained.A similar result holds for the branch cover (let the reformulation of EBF be calledcB-dense cover) of a design graph which corresponds to a specification FSM in the senseof Proposition 4.3, when it is viewed in the space (Tr(S), dt).Based on the analytical investigation carried out in previous sections, we are able toestablish the metric hierarchy of fault coverage models in protocol testing as illustratedin Figure 4.2. The fault coverage models are placed lower in the hierarchy, if the c-densecover needed to identify the class of faults identifiable in that model is coarser (i.e. eis larger). Conversely, the finer distinction in terms of c-dense covers needed, places themodels higher in the metric based hierarchy. Arrows, then, point towards fault modelsfor which the detectability of errors is granted by &dense covers higher in the hierarchy.Note that, for finite models, fault detection capability is strictly related to the appropriatefault model definition and assumptions about the implementations within such models.INFINITE MODELS^FINITE MODELStrace equivalence^all finite pathsE -> 0£ -denseStransition errors^branch coverEET -dense B -denseE -' 0^II../transfer errors88^CHAPTER 4. A METRIC HIERARCHY OF FAULT COVERAGE MODELSIn other words, transition errors indicate the fault detection capability for deterministicimplementations of finite state machine descriptions of protocols with assumptions as inSection 4.2. Nothing is said about the fault detection capability of such covers for general,infinite and possibly non-deterministic implementations.A Metric Hierarchy of Protocol Testing within infinite metric space (Tr(S), dt)Figure 4.2The above results show that the metric based theory of test selection and coveragecan be viewed as a general theoretical framework for some seamingly unrelated testingtheories.From the practical point of view, we would like to point out that approaches based onpattern identification and pattern similarity evaluation have often been used in areas ofphysical sciences for problems that defy exact evaluation for reasons of complexity. Similar4-3. APPROXIMATING INFINITE SYSTEMS^ 89problem is found in test selection and coverage evaluation for communication protocols.In the light of above results, the metric based theory of test selection and coverage canbe used to reformulate protocol test selection problem as an approximation technique inabstract terms of various measures, for different fault models. These measures are easierfor automated manipulation in very large systems than models based on detailed analysisof particular atomic events, states or other specific test selection criteria.Chapter 5A Framework for InteroperabilityTestingThe theory of test selection and coverage requires that two factors be defined: the targettheoretical upper bound and the approximation mode. In the development of the metricbased theory for test selection and coverage, we have defined the metric approximationthat will tend to the theoretical upper bound of strings equivalence between the executionspace of the protocol control behaviour and selected test sequences. This approach givesus the flexibility, to define more specific target test criteria if needed, provided they arebased on strings equivalence.In this chapter we extend the formal theory of testing protocol implementations alongthese lines. In investigating the control space of the protocol behaviour, we have con-cluded that a more efficient test selection and execution process may be obtained forimplementations supporting many network connections, if the target coverage criteria isnot strings equivalence, but a conveniently related notion. We propose a new relation,from which the new target test criteria is defined. The corresponding test architecture,formal protocol specification style and tester design are also specified.9091First, existing notions of implementation preorders and equivalences in testing theoryare examined with respect to their discriminating power for identifying processes whichwill interoperate. We show that all of the proposed relations are too strict if the interop-erability of implementations is desired. Moreover, the practical testers designed to testwith these relations as target criteria may be difficult to design and run, typically gen-erating many inconclusive test runs for test cases where these relations impose too strictrequirements (see also [2]). In particular, we show that the question of test selection turnsout to be impossible to solve adequately within the proposed theories themselves.Next, we introduce a formal framework for interoperability testing of (two or more)communicating systems. We extend the testing theory based on formal specifications bydefining a new intop relation which is defined to hold between an implementation and aspecification of a system. It insures that implementations which are valid in intop sensecan interoperate with other such implementations of the same specification. This relationimproves the efficacy of the test selection process by distinguishing the traces which areeither unexecutable or irrelevant, from those that are essential for basic interoperability,without the need to step outside the theory for an adequate solution. In particular, it isa tighter upper bound to the testing process than the previously defined relations, sincethe number of inconclusive test runs is significantly reduced.This result is particularly significant for designing testers for systems which are capa-ble of running independent concurrent processes (i.e., for testing simultaneous networkconnections). When it comes to testing modern network protocols such as ST-II (multi-media) communication protocol [15], some features emerge (shown on the example ST-IIspecification throughout the chapter), that were not critical in testing classical protocolsusually considered in the protocol test theory. For instance, even some basic functionalityof an ST-II agent cannot be tested without testing at least three simultaneous connec-tions. This then poses a problem in protocol test design (which typically deals with oneconnection or linear test sequences), which we feel is best solved by redefining the ba-92^CHAPTER 5. A FRAMEWORK FOR INTEROPERABILITY TESTINGsic test relations to better suit the design of modern protocols. At the same time, thenew testing framework greatly facilitates test development and tester design for multiplesimultaneous network connections, paving the way to meeting stricter reliability require-ments for tested communication systems.5-1 Implementation relations and interoperabilityThe goal of this chapter is to develop a general mathematical framework for reasoningabout the interoperability of communicating systems. Interoperability is a pivotal notionwithin Open Distributed Processing (ODP) concept in general, and network protocols inparticular. If we assume that a (formal) abstract specification of a communicating systemis available, then the interoperability of its different implementations is dependent on thedesign decisions taken during the implementation process leading from the specificationto an executable implementation. Viewed this way, the question of interoperability ofcommunicating systems is closely related to the formalization of the notion of validity,i.e., the nature of the relation which should hold between the implementation of a systemand its (formal) specification.This formal relation has been a subject of extensive research (especially within processalgebraic techniques), and the results can broadly be summarized in two categories.Equivalence RelationsExtensional equivalences play a central role in reasoning about the external behaviourof systems described by process algebraic languages (see, for example [42, 31]). Twoprocesses are considered equivalent in this context if they cannot be distinguished byexternal observation, i.e. short of taking them apart. Examples of these equivalences areobservation equivalence [42] or various testing equivalences [12].5-1. IMPLEMENTATION RELATIONS AND INTEROPERABILITY^93Certain testing theories are naturally based on the notion of indistinguishability byobservation. Testing based on equivalence relations of this kind ensures that the imple-mentation will behave exactly as prescribed by specification. For testing communicationprotocols, equivalence relations conf-eq [39] and to [2] have been proposed as a criterionfor establishing that a certain protocol implementation is a valid implementation of agiven specification.Implementation RelationsImplementation relations [39] allow the notion of validity between an implementationand its specification to be extended to those implementations which are not externallyequivalent to their specification. These relations have been less studied within the processalgebraic techniques than equivalences.Some implementation relations based on the idea of reducing nondeterminism as avalid implementation choice have been proposed so far for the use in testing. Examplesare conf, conf*, red, ext [2, 39] defined for communication protocols and distributedsystems in the context of conformance testing.We proceed by first defining and briefly examining the strings equivalence and theimplementation relations conf and red. We then evaluate these relations informally withrespect to the level of interoperability achievable between implementations which arevalid in the sense of these relations. Our results apply to other relations mentioned in theintroduction in much the same way.5-1.1 Strings EquivalenceThe most straight forward proposal for systems equivalence is to consider as equivalenttwo systems which can perform exactly the same sequences of visible actions. Let P1 andP2 be processes. It is clear that P1 ,,,, P2 if Tr(Pi ) = Tr(P2 ) and it is obvious that it is94^CHAPTER 5. A FRAMEWORK FOR INTEROPERABILITY TESTINGan equivalence relation, which is termed strings equivalence ([12] and Definition 2.2).Consider the following example.Example 5.1 S:= EstStreamRe q{7'1,T2}1( (SC onnectn; SC onnectT 2 ; S')1](SC onnectn; SC onnectn;S':= (SAcceptTl ; S Re f useT 2 ; S")(51 Re fuseT2 ; SAcceptTl ; S")S" := onC on f Indn;ConRej IndT 2 ; D)(ConRej IndT2 ;C onC on f Indn; D)The above is an example of a process (an excerpt from the connection establishmentphase of the ST-II communication protocol [15], for the origin agent only). The ST-IIprotocol at origin may receive a stimuli in the form of a (multimedia) stream establish-ment request from an application above, towards a number of targets (T1 and T2 in theexample). For simplicity, we have instantiated one possible situation (where one targetaccepts and the other target rejects the connection), resolved the parallel compositionof two simultaneous connections into a simpler choice ( [] ) structure, and shown someof the events and interleavings only. The synchronization tree of this process is givenin Figure 5.1 (a). ( Figure 5.1 (b) represents an ST-II agent as a target agent, againshowing a simple instantiated case). An implementation Is which is strings equivalentto the specification S will have the following trace set, for traces of length < 3, (e is theempty trace)Tr(Is) = {e,EstStreamReggi,T21)EstStreamRe q{T1,T2}•SConnectn,5-1. IMPLEMENTATION RELATIONS AND INTEROPERABILITY^95((a)^ b)Connection Establishment Phase of ST-II protocolFigure 5.1EstStreamReq{n,T21.SConnectT2 ,EstStreamReq{n,T21.SConnectn.SConnectT2 ,EstStreamRegai,T21.SConnectT2.SConnect n }96^CHAPTER 5. A FRAMEWORK FOR INTEROPERABILITY TESTING5-1.2 Implementation Relations conf and redMore involved concepts of equivalence between a specification and an implementationinclude the consideration of the traces that can be observed at the interface of a systemwith its environment as well as the sets of actions that may be refused after a certain trace.Such relations are conf and red. We here briefly recall their definitions in trace-refusalformalism (notation given in Appendix A). A more detailed discussion of these relationscan be found in Chapter 2.Definition 5.1 ([39])Picon/P2 iff Va E Tr(P2) we have Ref (Pi, a) C Re .1. (P2, cr)or equivalentlyP1 conf P2 iff Vo- E Tr(P2 ) fl Tr(Pi ) we have Ref a) C Ref (P2, a).Informally, Pl conf P2 iff, placed in any environment whose traces are limited to thosein P2 P1 cannot deadlock when P2 cannot deadlock. This relation is known as confor-mance relation in protocol testing theory and is used for conformance testing of protocolimplementations [2, 39].Red is the reduction relation as defined in Chapter 2. It limits the traces of a " red"-valid process P1 to those of P2 , but the essential deadlock property remains the same asin Def. 2.3. These two relations are the basis for the equivalence relations conf- eq and to[39, 2] proposed in protocol testing.5-1.3 Implementation Relations and InteroperabilityConsider the implementation relations defined above on the example ST-II origin spec-ification S and the sample implementations /1 and /2 of this specification, depicted in(b)ITI,T2)Cr EstStreansReqSRefuse T2ConConffndniConRejlni ConCon Ail ConConfindTTSConnect T2SConnect TiSAccept TiSRefusenonCon/Ind TI0SConnectSConnea TISAcceptConRejlndrejInd T2Ae LASConnect T1TIonIZejlnd 72EstStreamReq(TI,T2)00(a)5-1. IMPLEMENTATION RELATIONS AND INTEROPERABILITY^97Figure 5.2 (a) and (b).Different implementations of a process SFigure 5.2Example 5.2 The process Il ( Figure 5.2(a))Il := EstStreamReq{n,T2};((SConnectT l ; SC onnectT2 ; I1')[](SC onnectT2 ; SConnectT l ; Il'))Il' := (SAcceptT i ; S Re f usen, S")98^CHAPTER 5. A FRAMEWORK FOR INTEROPERABILITY TESTINGis neither in relation conf nor red with the process (specification) S. This is because therefusal set Ref (I1, o-i ), where o f = EstStreamReg{T 1x2}. SConnectT l . SC onnectT2 (orcrl, = EstStreamReg{T1 ,T2)• SC onnectT2 . SConnectTl ) includes, among other sets, alsothe set {SRefuseT2}, which is not in the refusal set for the same trace in the process(specification) S.Example 5.3 The process 12 ( Figure 5.2 (b))12 := EstStreamReg{n,T2};((SConnectTl ; SConnectT2 ; .12')[1(SC onnectT2 ; SConnectTl ; 12'))12' := (SAcceptn; S Re fuseT 2 ;1-2")[1(S Re fuseT2 ; S Acceptn; I2")12" := ConRej IndT2 ; C onC on f Indn; Dis neither in relation conf nor red with the process (specification) S. This is because therefusal set Ref (12, a2 ), where o2 = EstStreamRegm,T21 . SConnectT l . SConnectn.SAcceptn . S Re fuseT2 , contains, among other sets, also the set {C onC on f Indn}, whichis not in the refusal set for the same trace in the specification of the process S. (It is easyto see that there are three more traces which share the same characteristic with the trace0.2.)Notice also, that neither of the equivalences generated by these implementation rela-tions holds.This is because an observer (in the environment of these implementations) will beunable to observe some traces at the interface of these implementations, although theyshould be present in all implementations of S that are valid in the sense of con f and red.But are the implementations /1 and /2 equal with respect to their ability to interoperate,5-1. IMPLEMENTATION RELATIONS AND INTEROPERABILITY^99as origin agents, with other ST-II implementations? A brief informal investigation (wedefer the formal interoperability analysis for a later section) and some knowledge of thesemantics of the ST-II protocol specification shows that the implementation /1 may failto interoperate with a full implementation of ST-II, if it is presented with the eventSRefuseT2 before the event SAcceptTl . However, the implementation /2 will alwayssuccessfully interoperate with any other implementation which fully implements ST-II.Even more significantly, the implementation /3 of the same specification S, depicted inFigure 5.3, also possesses the same ability to interoperate with full ST-II implementationsunder all circumstances, although it is not even strings equivalent to S for very short traces(refer to traces Tr(Is) in Example 5.1).There are frequent situations in network protocols where the choice to drop some ofthe traces in an implementation of S will affect the externally observable behaviour of theimplementation but not its ability to successfully interoperate with other implementationsof S. We quote some of those:• one external stimuli (of the type EstStream request) which generates multipleprotocol events which may be arbitrarily interleaved (e.g., events SConnectTl ,SConnectn, ..., SConnectT,-, and implementation /3)• accumulation of external stimuli (of the type S Accept or S Re f use Protocol DataUnits in the example) which generates protocol events which may be arbitrarily tem-porally ordered, provided they individually satisfy timing correctness (e.g., eventsConConfInd and ConRejlnd and implementation /2)• multiple independent network connections (represented by the arbitrary interleavingin the interleaving model of concurrency) where any one of many possible interleav-ings of certain events may be sufficient for interoperability100^CHAPTER 5. A FRAMEWORK FOR INTEROPERABILITY TESTINGAn interoperable implementation of SFigure 5.3Note: Notice that, although many events initiated by the implementation under obser-vation will be in one of these categories, not all are: consider, for example, certain prioritycontrol PDUs or express PDUs.The observed problem is due to the fact that the formal theory of protocol testingbased on observation fails to distinguish between different aspects of the inability of animplementation to evolve via specific events. The consequence is that, in testing whichis based on these equivalences and implementation relations, the inability of implemen-tations such as /1 and /2 to synchronize on some specific observable events is treated inthe same manner.5-1. IMPLEMENTATION RELATIONS AND INTEROPERABILITY^101This characteristic has a profound impact on the design of the testing process itself.1. In such a theory a tester must be able and will try to synchronize on all traces andobservable events as described by the specification of the tested system, which mayresult in numerous inconclusive test runs if an implementation fails to react withthe exact ordering of events that the tester expects to see. Other solutions [1] mayinclude imposing artificial requirements on the testing process (such as establishinga stable state before proceeding with testing), which may reduce the error expositionprobability of the testing process.2. In addition, test selection under such premises becomes difficult to solve effectivelywithin the theory itself. A possible test selection scenario could drop some tracesthat are crucial for interoperability (intuitively, an implementation, such as /1, mustbe tested to be able to synchronize on SAccept PDU followed by a SRefuse PDUat any time, as well as vice versa), in favour of such traces which cannot even beobserved by testing for a particular implementation (for instance, a ConConfIndfollowed by ConRejlnd, in case of the implementation /2), and which may beirrelevant for the interoperability of such implementation.To overcome these problems, we propose a new formal approach aimed at resolving somepractical test design issues and improve the efficiency of multiconnection protocol testingin particular. The testing will still be based on observation, but the underlying formalrelation of validity will be greatly relaxed to include such implementations which, onexperimentation, are observed to have many fewer traces than the specification. The suc-cessful termination of the testing process (if the theoretical upper bound were reachable)would guarantee that all implementations of the same specification which pass will beable to interoperate.102^CHAPTER 5. A FRAMEWORK FOR INTEROPERABILITY TESTING5-2 The Interoperability RelationsIn this section we define a new formal notion of validity between an implementation and aspecification of a communication system. Although the definition appears more involvedcompared to other implementation relations, it turns out that the test design based onthis relation differs only slightly than the design based on other implementation relations,whereas the testing based on this relation is much more efficient. We delay these practicalconsiderations to the next section, and concentrate on theoretical development next.5-2.1 The intop RelationWe presuppose the existence of two subsets Lreq and Lait of labelsets of visible actions inL, such that:Lreq n Lait = CI) and Lreq U Lait = LIntuitively, we shall think of the elements of Lreq as events which must be observableat the interface of an implementation whenever the specification allows the possibility ofsynchronizing on that event in the state in which the implementation is at the momentof observation (i.e., the "required" synchronization). On the contrary, the elements ofLait are such events, which may not necessarily be observable at the interface of animplementation, although the specification allows the possibility of synchronizing on onesuch event at that point (i.e. the "alternative" synchronization).Similarly, let Re f„q (P, o-) = Re f (P, a) n P(L„q ) and Re fait (P, a) = Ref (P, a) nP(Lait) (where P(A) denotes the powerset (the set of all subsets) of a set A) be theC-maximal subsets of the refusal set Ref (P, a) which are completely contained in thepowersets of Lreq and Lait, respectively.Notice that these two sets always exist and are unique. Observe that the refusal setssatisfy the following properties.5-2. THE INTEROPERABILITY RELATIONS^ 103Proposition 5.1 (Properties of refusal sets):1. Re f„q (P, o-) and Re f au(P, a) are subset closed2. Re f„ q (P, a) U Re fa ii (P, a) C Ref (P, a) (not necessarily .,- --- )3. Re f„q (P, a) = 0^Re fa it (P, a) = Ref (P, o-) andRe fatt(P, a) = 0^Re freq (P, o- ) = Re f (P, a-)4. (Re f„ q (P, a) C Ref (P, a)VRefa lt (P, a) C Ref (P, a) ) Re f„,(P, o-)U Re fau(P, a) CRef (P, a) (proper subsets)5. Ref (P, a) c Ref (Q, a)^( Re fr„(P, a) c Re fr„(Q , a) A Re fau(P, a) c Re f au(Q , cr))(the opposite does not necessarily hold)6. 2( { U RERe freq(PM R} U {uRERefa,ap,a)R}) 2 Ref (P, a)Let I be an implementation of a specification S.Definition 5.2 I intop S if bra E Tr(S) fl T r (I) we have1. Re f„ q (I , a) g Re f„q (S , a) , and2. L au fl Out(S, a) fl Out(I , a) 0 0 V L alt E Re fa it (S , a)3. For the set A = Lau fl (Out(S, a)\Out(I, a)) and R E Ref (I, a) we haveRef(I,o-)\IR 1 R fl A 01 g Re f(S,a)Informally, I intop S if, when placed in an environment whose traces are limited to thetraces common to S and I, (i) the implementation I cannot deadlock on any events fromL„q on which S cannot deadlock; (ii) the implementation I cannot deadlock on all eventsfrom L cat on which S cannot deadlock.104^CHAPTER 5. A FRAMEWORK FOR INTEROPERABILITY TESTINGAs an example, consider the validity of the implementation /3 in the sense of intop(it is easy to observe that /3 is neither in con f nor red relation with S).Example 5.4 Let Lreq = IEStStreaMReq{T1,T2} , SAcceptn , SRefuseT2 } andLau = {SConnectT l , SC onnectT 2 , C onC on f Indn , C onRej IndT21.Consider the implementation /3 of S, after the trace a = EstStreamReq{n,T2}•Re freq (13,a) = 7, ({EstStreamReq{n,T2} , SAcceptn , SRefuseT2})Re fa it (I3,a) = P({SConnectT 2 , ConConflndT l , ConRejlndT2 })Then,1. Refreq (I3, a) C 'PaEstStreamReq{n,T2} , SAcceptn , SRefuseT 2 } = Ref„q (S,a),and2. SConnectTl E //cat, {SConnectT l} E Out(S,a) and {SConnectTl} E Out(i,a),and3. Ref (13,a) = P ((URERe.fau(1,a)R)U(URERefreq(I,c)R)) = P({SC onnectT 2 , C onC on f I ndTi,ConRejlndT2 } U { EstStreamReq{n,T2} , SAcceptn , SRefuseT2 }) ,and A = {SConnectT 2}. Therefore,Ref (I3,a)\{R I Rfl {SConnectT 2} 0} = P({ ConConflndT l , ConRejlndT2 ,EstStreamReq{Tix2} , S Acceptn , S Re f useT2}) g Ref (S, a).Therefore, the conditions of Definition 5.2 are satisfied for I, S and this trace a.5-2.2 Properties of the intop RelationWe collect some easy facts about the intop relation.5-2. THE INTEROPERABILITY RELATIONS^ 105Proposition 5.2 Let I intop S and I = a .1. Re fa it (I , a) C Re f adt (S , a)^Ref (I, a) C Ref (S, a)2. ( V a- E Tr(S) fl Tr(I), Re fait(' , a) .g Re fau(S, a))^I conf S.3. I conf S^I intop S (i.e. intop D conf )4. intop is reflexive5. intop is not transitiveProof. The proof for 1. follows directly from the observation that in the case whenRe fa it (I , a) C Re fa it (S, a) then the set A from Def. 5.2, condition 3., is empty. Since alsoI intop S, the condition in Def. 5.2, 3., holds which proves that Ref (I, a) C Re f (S, a)holds. (Notice that this is exactly the condition needed for the opposite implication inProp. 5.1, 5). 2. follows directly from 1. and the definition of conf relation.3. Assume that Ref (I, a) C Re f (S, a) Va E T r(S) fl Tr(I).• We have Re f„q (I , a) = Re f (I , a) n'P (L„ q ) and Re f„q (S, cr) = Re f (S, cr)nP (L„q ).Therefore,Re f„ q (I , a) = Ref (I, a) fl P(L„ q ) C Ref (S, a) fl P(L„ q ) = Re f„q (S, a).• Assume that the first part of Def. 5.2, 2., is not true. Then Va E Lait such that a EOut(S, a) we have a ' Out(/, a). Hence {a} E Re fait(', a) C Refait(s, a) by Prop.5.1, 5. We clearly have that the union of Out(S, c)nL a it and all one-element subsetsof Re fau(S , a) must be equal to Lait. However, since Out(S, a) fl Lait E Re fa it (S , a),it follows that La it E Re fau(S, a), which proves that the second condition of theDef. 5.2 holds.• The part 3. of Def. 5.2 holds trivially.106^CHAPTER 5. A FRAMEWORK FOR INTEROPERABILITY TESTINGThis proves that intop D con f .4. and 5. follow directly from the definition of intop relation.5-2.3 The intopred RelationThe results of Proposition 5.2 (4) and (5) are in keeping with the theory developed in[39], that a valid implementation relation must be reflexive (because specification is avalid implementation of itself), but not necessarily symmetric or transitive (because theimplementation and specification are not in general interchangeable). However, noticethat similarly to con f , the relation intop allows that additional traces may exist in theimplementation, that are not part of the specification. This feature becomes even morecritical when interoperability of different implementations is examined, since, in general,such implementations may synchronize on traces that are not in their common specifica-tion. For such traces, the concept of interoperability is really hard to define both formallyand informally. We therefore extend the formal notion of interoperability by defininga relation which restricts the traces in an implementation to those of the specification.Unlike intop, this relation is also transitive.Definition 5.3 I intopred S iff1. Tr(I) C Tr(S)2. I intop SProposition 5.3 The relation intop„d has the following properties:1. intop D intopred2. intop red D red^5-2. THE INTEROPERABILITY RELATIONS^1073. intopred is a preorderProof. The proofs for 1. and 2. are easy and follow directly from the definitions of thecorresponding implementation relations.We will prove that intopred is a preorder.-s1. intopred is reflexive since Tr(I) C Tr(I) and I intop I (by Proposition 5.2., 4. )2. intopred is transitive: Let I intopred J and J intopred K. Then:(a) ( Tr(I) C Tr(J) and Tr(J) C Tr(K) )^Tr(I) C Tr(K)(b) Let a E Tr(K). Theni. Re f„q (I , a) C Re f req (J, a) (from I intop J) and similarlyRe freq (J, o- ) C Re f„q (K , cr). These two relations imply Re f„q (I , or) CRe f„ q (K , o-).ii. By the definition of the intop relation, 3a E Lau n Out(J, a) such thata E Out(I,o). Since o.a E Tr(J) C Tr(K) we conclude thata E L au n Out(K , o) and a E Out(I , a). Therefore, the second requirementof Def. 5.2 is satisfied.iii. Denote by B = Latt n(Out(K , a)\Out(I , a)) and by C = La ttn(Out(K , o- )\Out(J, a))Then, a E C^o.a E Tr(K), but o.a 0 Tr(J). Therefore, o.a 0Tr(I). Thus a Out(I,o). Hence C C B. Since J intop K we haveRe f(J, cr)\IR I RnC ckl C Ref(K , cr)and consequently Ref (J, u)\{R I RnB 01 _ C_ Ref (K, a). (*)Put A = Lau n (Out(J, a)\Out(I , a)). Then,Ref (I, u)\ {R I Rn A 0} g Ref (J, a).We have aE A a .a 0 T r(I) a .a E T r(K) , a .a 0 Tr(I) a E B .Thus, Ac B. So, {RIRnAO}c {R1RnB ¢)} and thereforeRe f(I,a)\{R i Rin B 0} g Ref(I,a)\{1? 1 R n A i 0} C Ref(J,a).108^CHAPTER 5. A FRAMEWORK FOR INTEROPERABILITY TESTINGConsequently, Ref (I, cr)\{R I R 11 B^Ref (J, c)\{R RnBO 0}.By (*) it follows thatRef (I, o)\{R I R fl B 0} C Ref (K, a).This proves the third property in Def. 5.2, therefore I intop K.(a) and (b) together prove that intop„d is transitive. Therefore, intop„d is a preorder.The above theoretical considerations are sufficient as a basis for specifying the neces-sary architectural and design requirements in interoperability testing. It is worth notingthat the formal notion of interoperability as an implementation relation can be extendedin the sense of imp- eq and other formal theory given in [39].5-3 The Interoperability Tester DesignAfter establishing the necessary theoretical basis in the previous section, we turn ourattention towards some practical considerations in interoperability testing of networkprotocols. Based on these considerations, we outline the design of a network protocolinteroperability tester whose theoretical upper bound is the satisfaction of the interoper-ability relation intop between an Implementation Under Test (IUT) and its specificationS.5-3.1 Architectural considerationsThe general theory of protocol testing allows for different test architectures and differenttest interfaces. Consider the test architecture given in Figure 5.4. Generally, the Pointsof Control and Observation (PCOs) may be positioned at the upper IUT interface (PCO1,PCO4) as in the system SUT1 in Fig. 5.4, or at the lower IUT interface (PCO2, PCO3).5-3. THE INTEROPERABILITY TESTER DESIGN^ 109aPPerinterfacelowerinterface Interoperability Test ArchitectureFigure 5.4For interoperability testing of protocol implementations it is necessary to observe both up-per interface (service) PCOs and lower interface PCOs (as in SUT2 of Fig. 5.4.) in orderto ensure the proper internetworking of different implementations in all environments. Inthis chapter, we concentrate on the interoperability of different implementations of thesame (peer) protocols. (It may be interesting to study to what extent the same theoryapplies towards the interoperability of any two implementations that share the same in-terface.) To take advantage of our theory, we model the interoperability test architecturein the following manner:I. I is a protocol implementation2. IT is the interoperability tester3. NET is the underlying (network) connection between the I and IT. NET behavesas a reliable FIFO channel in either direction (FIFO without loss)''In our example specification, we adopt a simple strategy to prefix the name of each primitive by theaddress at which it occurs including S for the source (calling) ST-II agent and T for the target (called)110^CHAPTER 5. A FRAMEWORK FOR INTEROPERABILITY TESTING4. A tester IT is capable of observing at least one set of PCOs at the upper interfaceof I and one set of PCOs at the lower interface of I or IT5. A tester IT is capable of controlling PCO1 and either PCO2 or PCO3.We will also assume that an executable tester (an implementation of the interoperabilitytester IT) is capable of executing strong control over the PCOs it controls in the followingmanner: it will always be able to synchronize on any events that are its output events,before synchronizing on any events that are its input. In particular, we expect that anexecutable tester is able to send a PDU into network or request a service from an IUT atany time. If this assumption holds, then the possibility of inconclusive test runs linked tothe tester trying to observe a particular event in Lreq as the next event is eliminated. Thismay or may not hold for actual implementations, but will simplify our further analysis.5-3.2 Formal Network Protocol Specification IssuesThe requirements of interoperability testing regarding the observability and controllabilityof PCOs impose strict requirements on the formal specification style of a protocol processto be tested. In LOTOS, this style requires that gates modelling the PCO1 and one ofPCO2 or PCO3 not be hidden (i.e., event synchronization at these gates is visible). Theobservability of PC04 is entirely optional and depends on the executable tester design.We require that all the protocol processes be fully synchronized with the underlying pro-cess NET representing the network.For illustration, we complete our example ST-II specification of the stream establish-ment phase with the specification of the target process, supplying the remaining visibleinteractions at these gates. We simplify the specification in the manner similar to ExampleST-II agent, in order to allow for a simple NET process modelling. Other specification possibilities thatprovide distinctness of interactions exist and are equally suitable for this setting (see [19] for architecturaldetails of interaction points in various specification formalisms).5-3. THE INTEROPERABILITY TESTER DESIGN^ 1115.1. (We will nor worry about the details of the negotiation of connection establishmentwith the multimedia application above and let the ST-II target agent decide on acceptanceor refusal of the connection itself.)Example 5.5 The target ST-II agent T is the processT := (TConnectTl ; TAcceptTl ; D)III(TConneetT2 ;TRefusen;T9The full specification of our example ST-II process is the independent parallel compositionof the processes S and T,ST := S III Tand is represented as a parallel composition of the synchronization trees given in Figure5.1.The specification of the NET FIFO process can be given as in [19]. For the purposes ofour brief example we will informally observe that for every event prefixed by T, i.e. eventTe, on which the NET process synchronizes at interaction points PCO2 (PCO3), it willsubsequently synchronize on an event Se at PCO3 (PCO2 respectively) distinguishablefrom the event Te by its prefix S only, after which it is ready for a new interaction atPCO3 or PCO2. Similarly, if the NET process synchronizes on an Se event at theinteraction points PCO2 (PCO3) first, then it will subsequently synchronize on a Teevent at PCO3 (PCO2) and the process NET is ready for a new interaction. Notice thatthe parallel composition of the processes ST and NET with all gates observable, willyield exactly the traces of our full example ST-II specification. The following exampleillustrates this behaviour.Example 5.6 Consider the application of EstStreamReggi,T21 at the PCO1, and as-sume that the additional revealed PCOs are PCO2, PCO3 and PC04 (the system specifiedis exactly SUT2 in Fig. 5.4). Then the following trace may be observable at these PCOs:112^CHAPTER 5. A FRAMEWORK FOR INTEROPERABILITY TESTINGa=EstStreamReq{n,T2}. SConnectn. SConnectT 2 . TConnectT l . TConnectn. T Acceptn.TRefusen. SAcceptn . SRefusen . ConRejIndT2 . ConConf Indn.5-3.3 Interoperability Tester DesignIn this section we introduce the design of the interoperability tester IT(S) for protocolimplementations. The purpose of the interoperability tester is to properly distinguishbetween implementations that do or do not satisfy the intop relation with respect to theirspecification S.The construction of the interoperability tester IT (S) is based on the the canonicaltester [2, 39]. The canonical tester T(S) is constructed systematically from the specifica-tion of the system S and is defined in the following manner.Definition 5.4 Let S be a specification. The canonical tester of S, T(S), is definedimplicitly as a solution X satisfying the following two equations:1. Tr(X) = Tr(S)2. VI I con f S iff (Va E L* we have (L E Ref (I X, a) L E Ref(X, a)))We first observe that by the Proposition 5.2, 3., intop D con f . Using this observation,we can transfer the problem of finding the interoperability tester IT(S) to "relaxing"the structure of the canonical tester T(S). We here do not repeat the theoretical workof [2, 39], but assume that the canonical tester T(S) of a specification S is given. Thecrucial observation in the construction of the IT(S) is that the only behaviours that aretreated differently in the relations con f and intop are precisely the ones that allow thechoice over actions that are in Lau , after some observed trace a.In the canonical tester, the choice over the different actions in L is replaced by theinternal choice, i.e. these actions are prefixed by the internal action i. Consider the5-3. THE INTEROPERABILITY TESTER DESIGN^ 113example given in Figure 5.5, where S, T(S) and IT(S) denote the specification, itscanonical tester and its interoperability tester. In the derivation of the canonical testerS T(S) IT(S) L=ta, b, 4Lalt =(6' c)A specification, its canonical tester T(S), and its interoperability tester /T(5)Figure 5.5T(S), the purpose of this choice over i is to allow the tester to attempt synchronizationon each action ( a, b, c in the example) in L at that point in the specification.By the definition 5.2 (of the intop relation), evolving via actions in Lreq (I = a^I'of the example) is also mandatory for every implementation I, and we therefore leave thedesign of T(S) intact with respect to such actions (the trace i.a.8). The tester IT(S) willalso require synchronization on each such sequence.However, by the definition of the intop relation, for actions in Lait, not all possibleactions need to be observable or observed at that particular point. In the example, eitherI = b I' or I = c I" must be observable. It follows that the tester IT(S) mustattempt synchronization on either IT (S) = b^or IT(S) = c =. In such cases, IT (S)is derived from the canonical tester T(S) by substituting the n internal choices of T(S),114^CHAPTER 5. A FRAMEWORK FOR INTEROPERABILITY TESTINGeach followed by one of the n different actions in A C Latt, by one internal choice in IT(S)followed by the external choice over the n different actions in A C L ait . Therefore, T(S) :=i; a[]i; b[]i; c in the example becomes IT (S) := i; a[]i; (b[]c). The interoperability testerwill therefore resolve such a choice in the course of the interaction with the environment(for example, driven by the choice of the peer protocol implementation), rather than byattempting to synchronize on each one of the actions. More specifically, if a node in thesynchronization tree of the canonical tester T(S) has the general formNap; • • - I P E Plft{i;b g ; ...1qE(2)then it is transformed into a node of the synchronization tree of the interoperability testerIT(S) of the form[]{a;... I p E PElli; br;... I br E R g Qilii; ([i{ba; • — I ba E A g Q}where R is the set of all q E Q such that bq E Lreq and A is the set of all q E Q such thatbq E Lalt•All other nodes and branches of T(S) are left intact.We observe that the execution of IT(S) against an implementation I of a protocolwill have the following impact on the testing process:• Within the theory itself, such a selection process can be designed, which will guar-antee not to sacrifice traces needed for interoperability in favour of possibly unob-servable tracesObserve that all the events that can happen alternatively are collected under thenodes which have all emanating branches labelled with the events in Lait, and thebranches leading to such nodes are labelled i. Such nodes are uniquely distinguish-able and should participate in the test selection with the weight representative ofone test case only (other test selection criteria assumed to contribute separately).5-4. INTEROPERABILITY RELATION AND THE MB METHOD^115• fewer traces need to be examined or observed (even if they happen to be imple-mented), resulting in a more efficient upper bound of the testing processIt follows immediately, from the construction of the tester and the definition of therelation intop, if a branch labelled i leads to a node whose all emanating branchesare labelled with the events in A C Lait, then the number of tests is reduced fromthe number of events in A to a subtree which is to be considered as one test caseonly.• elimination of inconclusive test runs for test cases where one (of many) possibletemporal orderings of events is sufficient to guarantee the interoperability of imple-mentationsThis observation follows directly from the fact that, for such events, the multipleinternal choices of the canonical tester are substituted with one choice which is al-ways possible to be resolved on interaction of the interoperability tester with theenvironmentFinally, we would like to note that we have illustrated the features of the intop relationon a theoretically sound exhaustive test generator (canonical tester). If the theory is to bepractically applicable, it is crucial that fast test generators for complex systems (needlesto say, that are theoretically sound in the coverage testing context), be available.5-4 Interoperability Relation and the MB methodIn this section, we finally draw the two ingredients of the coverage testing process fornetwork protocols together: the theoretical upper bound for testing and the mode ofapproximation of this bound.Recall that the choice of the testing domain (the entire control space of systems, Chap-ter 2-1.4), leaves us with two open questions: (1) when is a system a valid implementation116^CHAPTER 5. A FRAMEWORK FOR INTEROPERABILITY TESTINGof its specification, or equivalently, what is the theoretical upper bound on the testingprocess whose satisfaction guarantees such validity, and (2) if the theoretical upper boundis not practically reachable, then what is the mode of its approximation.The first question is the question of the validity relation, and the second is the questionof test selection.We suggest that the interoperability relation gives an efficient upper bound on thetesting process for communication protocols. For the approximation mode, we suggestthat the MB test selection method is a theoretically sound way to approximate finiteand infinite spaces of protocol execution sequences based on pattern similarity and therecursion level of recursive pattern clusters.In this context, the metric space for interoperability testing of communication proto-cols takes on the following topological structure.Let (S, dt) be a metric space of execution sequences t (E S) of atomic events of acommunication protocol system. Let S = {S1 , , SO be a family of sets of sequences ofevents at E Lait. Then, according to Theorem 3.8, (S( s), dt) is a totally bounded metricspace. MB selection algorithm will, in this space, tend to approximate all sequencest(s) E G(s) ( C S(s)) by selected sequences s(s) to a target density c (Proposition 3.2,Consequence 3.1, Theorem 3.8). When E tends to zero, the selected set will tend tostrings equivalence with the starting set, within the new metric space (S(s), dt). At thesame time, the testing process based on the execution of selected sequences and theinteroperability tester design, will tend to establish the interoperability relation betweenthe implementation under test, and the set G(s).This analysis shows, that the MB method can indeed be considered as a generalmethod for test selection and coverage evaluation. It can support all target test relationsfor protocols which are primarily based, or can be interpreted in terms of string equivalenceat different granularity levels of protocol interactions. Indeed, the work summarized in5-4. INTEROPERABILITY RELATION AND THE MB METHOD^117[27], seems to indicate that this is the direction in which the testing of (distributed)systems should proceed. Non-sequential behavioural equivalences based on the notion ofthe labelled partial order (where in p — e —* q, e is not an atomic event but a partial orderof events), or timed equivalences (again, a multitude of possibly interleaved action labelscarries the system from one state to the other), introduce a much weaker notion of tests.It would be interesting to examine, to what extent these equivalences can be interpretedin the metric spaces (S(s,), dt) and (S(S), dt), and supported by the MB method.Chapter 6Tests of the TheoryIn this chapter the theoretical framework of the thesis is experimentally evaluated throughthe sensitivity analysis of the metric-based test selection and coverage evaluation pro-cesses. The sensitivity analysis is carried out with respect to a set of relevant protocolfeatures. The performance and stability of the experimental results are also considered.Based on the experimental results, a feasibility analysis of the metric-based approach ispresented in the conclusion of the chapter.We first provide a brief overview of the ST-II protocol which is the case study used inthe experimentation. This is followed by relevant implementation details of the experi-mental test generation, selection and evaluation tool called TESTGEN++ which was usedthroughout the experiments. After this introductory information on the experimentalenvironment, we give a detailed evaluation of the problems considered in experimentationand the results reached.1186-1. INTERNET STREAM PROTOCOL^ 1196-1 Internet Stream ProtocolThe Internet Stream Protocol has been used for several years as the primary protocol ina voice and video conferencing system which has operated over the Wideband and Terres-trial Wideband Networks. The revised experimental specification was issued in 1990, andis known as ST-II [15]. ST-II is a candidate protocol for the support of high-bandwidth,delay-constrained, multimedia applications such as conferencing, virtual environment sup-port, and telepresence.Its application area and novel design solutions make ST-II protocol interesting forexperimentation in software engineering. There are currently three implementations ofST-II reported: (i) Swedish Institute of Computer Science (MultiG Project), (ii) IBM Eu-ropean Networking Center (HeiTS Project), (iii) BBN Systems and Technologies, Cam-bridge, MA. The research and development issues regarding the implementation of ST-IIhave been reported in literature [46]. The ST Control Message Protocol (SCMP), mainlyused in our experiments, is reported to be a relatively complex communication protocolfor implementation [46]. There has been, to our knowledge, no theoretical or practicalresearch effort reported regarding the testing of ST-II.6-1.1 General Protocol FeaturesThe basic functionality of SCMP is to build routing trees between a sender, called origin,and one or more receiving applications, called targets. One such routing tree will typicallyinclude one or more additional nodes along each of its paths, called intermediate agents.ST (the data phase) then simply transmits data down the routing tree, from an origin toone or more targets. The data flow over the routing tree is referred to as a stream. Eachof the communicating entities within a routing tree runs an ST-II protocol and is calledan ST-II agent. One ST-II agent can simultaneously act as one or more origins, routers120^ CHAPTER 6. TESTS OF THE THEORYand targets.6-1.2 ST-II from the Testing Point of ViewThe following design features make ST-II challenging for testing (for details see [9]):1. The concurrency in ST-II is due to two different sources: a) multiple protocol connec-tions, and b) simultaneous execution of origin, target and intermediate agent functionalentities within one ST-II agent.2. With up to 17 different types of messages for SCMP part of ST-II, ST-II becomes oneof the more complex protocols, especially if the combined state space composed of oneorigin, one target and one intermediate agent module is considered.3. The usual concept of a "peer" partner in communication, as a convenient symmetri-cal counterpart in testing, has to be abandoned. The protocol is non-symmetrical sinceindividual connections are trees.We carried out the experimentation under the assumption that an ST-II protocolimplementation is tested as an entity, composed of independent parallel executions of(several) origin, intermediate or target agents. Because of the non-symmetricity of thecommunication pattern, the temporal order of protocol primitives arriving from IUT as aresponse to different upstream or downstream agents (even in one-stream environments)is unpredictable and should be abstracted from. Therefore, the interoperability relationis the proper theoretical upper bound for the testing process. The lack of the upperPCO-s for intermediate agents rules out some commonly used test architectures. Thetest architecture applicable in this setting is as given in Figure 5.4 (Interoperability TestArchitecture), where a set of lower PCO-s must be observable.6-2. THE IMPLEMENTATION^ 1216-2 The ImplementationThe design of the functional structure of the Test Development System used for experi-mentation is depicted in Figure 6.1.The Functional Structure of the Test Development System TESTGEN++Figure 6.1The TESTGEN moduleTESTGEN [61] is the test generator module in an experimental protocol TEST Gen-eration ENvironment for conformance testing developed at the University of BritishColumbia. The generated test suites incorporate both control flow testing and data flowtesting with parameter variation. Both types of testing are controlled by a set of user-defined constraints which allows a user to focus on the protocol as a whole or just on122^ CHAPTER 6. TESTS OF THE THEORYrestricted areas of the protocol.TESTGEN begins with a protocol description in an extended transition system for-malism (closely related to Estelle [40]) and ASN.1 description of the input service primi-tives, output service primitives, and protocol data units. Once all constraints are defined,TESTGEN identifies subtours within the specified protocol which satisfy the minimumand maximum usage constraints. Each subtour then undergoes the parameter variationdefined by the types of service primitives in the subtour and the parameter variationconstraints.The TESTSEL moduleThe TESTSEL module (which optionally includes an independent test coverage evalu-ator TESTCOV EVAL module), is an implementation of test case selection and test casecoverage evaluation based on the distance and coverage metrics within the metric-basedmethod. The TESTSEL algorithm uses a greedy approach for the test case selection.Since the test coverage metric we are using is guaranteed to be convergent when Cauchysequences are followed, and the greedy algorithm produces such sequences, the greedyapproach will iteratively yield a set of test cases which converges to the initial test suite.The implementation of the algorithm follows the design presented in Chapter 3 and [41].The two modules, TESTGEN and TESTSEL have been integrated into a test devel-opment environment similar to the one already used for the development of the InResprotocol test sequences [41]. The seed files for test suites used in experimentation aregenerated by TESTGEN. These can be input into the TESTSEL module directly, or firstpassed through an independent interleaving algorithm. This algorithm produces randomindependent interleavings and random recursive levels of a specified number of streamsand target agents, in order to obtain test suites exercising multiple simultaneous connec-tions, concurrency of different agents, and higher levels of recursion.6-3. EXPERIMENTAL RESULTS^ 1236-3 Experimental ResultsWe first present some performance measurements for the integrated test developmentsystem used in the experimental ST-II protocol test generation. The sensitivity analysisof the metric based method, carried out through a series of experiments, is then presentedin detail. Whereas the sensitivity analysis of test methods with respect to their abilityto detect certain kinds of faults is customary, the characteristics of the MB method(that it does not target faults, but aims to cover the complete execution space in whichfaults are of unknown location and type), has lead us to develop a different kind ofsensitivity analysis'. In our experimentation, we examine sensitivity of the method todetecting certain characteristics of the execution space covered by the test suite. We chosesensitivity to event pattern variation (Experiment 4), level of recursion (Experiment 5),multiconnection level (experiment 6) and missing events (Experiment 7), as most relevantto examine. The stability of the results observed and the feasibility of the approach arediscussed in the summary following the experiments.6-3.1 Computational Complexities and Performance Measure-mentsComputational complexities and performance measurements of the test selection and cov-erage evaluation algorithms were already thoroughly investigated and results reported in[41]. Figures 6.2 and 6.3 summarize the timing results for control sequences of length20 and 100 (concise notation), by plotting the time needed for single iteration of the testselection algorithm to select different sizes of test sets.' This testing was performed on aSun4/75 with 16 Mb of available memory. Performance results were further confirmed in1 Only Experiment 2 may be interpreted in the context of the sensitivity to fault identification.2 The original implementation of test selection and coverage evaluation algorithms, as well as theseperformance measurements were done by Michael McAllister./■e////////0'/.....0"- .... .....-./.^.."-N. ..... .---• .. '-•-----........4'^w• "la:: '....4)...".„^... AP'. • °.. ...... 0 ..... a........... El. .......^......ITte-as110 tests40 tests70 ;Zs —200 teststest suite size x 103250.00200.00150.00100.0050.000.00124^ CHAPTER 6. TESTS OF THE THEORYthe experiments that follow, which were carried out on the same machine. The repeatedmeasurements are not being shown separately.user seconds0.50^1.00^1.50^2.00Timing results (in user seconds) for control sequences of length 20Figure 6.26-3.2 Sensitivity Analysis of the Metric Based MethodIntroductory InformationThe experimental input to the TESTGEN module was the Estelle-Y specification of ST-IIorigin, intermediate and target agents (Appendix E), and the data parameter descriptionin ASN.1 (not given). Our ST Control Message Protocol functional description is relativelycomplete. Stream Setup, Data Transfer and Stream Tear Down have been specified,but Modifying an Existing Stream, Exceptional Cases and Ancillary functions have beenomitted. The Estelle-Y specification deals with 11 out of 17 ST Protocol Data Units.The remaining 6 are related to the Exceptional cases and Ancillary functions (4), andaiXeiio0Je/i 1..0.07ti ,'..."..)i....10 teststests•0 testsTO teststest suite size140.00120.00100.0080.0060.0040.0020.000.006-3. EXPERIMENTAL RESULTS^ 125user seconds200.00^400.00^600.00^800.00Timing results (in user seconds) for control sequences of length 100Figure 6.3Modifying an existing stream (2).3 . The specification has not been verified. We do notexpect that minor inconsistencies with the RFC 1190 text should affect the validity of theexperimental results of the thesis.Most of the experiments initially use sets of transition tours generated by TESTGENmodule for each of the agents separately. These sets are named originseeds, intermseedsand targetseeds (Appendix F). The purpose of these sets is to serve as the starting setsfor the random interleaving and recursion generation process that we implemented specif-ically for the use in these experiments. These sets should preferably provide a thoroughcoverage of the relevant tours through each individual agent's state machine, in their sim-plest form (i.e. without concurrency or recursion). The random independent interleavingmodule generates test suites with concurrency of: a.) multiple streams (simultaneousconnections) within the same ST-II agent, b.) multiple agent functional entities withinthe same ST-II agent, c.) multiple targets within the same stream. Concurrency due tothe interleaving of multiple intermediate agents within the same stream has been omitted.3The first 4 PDU-s are also often not implemented [46], or their function is not clear [46].126^ CHAPTER 6. TESTS OF THE THEORYIn the experiments, we examine the effects of the recursion and concurrency of the STControl Message Protocol only, since the data phase of this protocol is trivial.We finally point out that each protocol event carries an identification a) to the streamit belongs to, and b) to the original TESTGEN transition tour it came from. On execution,or for the purposes of selection, identification a) identifies SAP (address), to which andevent should be applied (or expected), and b) identifies data parameters of that particularPDU (SP). These can also be removed from test files if not needed. We have experimentedwith test suites containing input/output events, and input events only. The latter is moreappropriate if the into relation is the theoretical upper bound to the testing process.Individual events of a test sequence should then be interspersed with subsets S1,. . . , Snof events in La it , (Chapter 5). This phase of the testing process has not been implemented.Stability and Granularity of the Test Selection ProcessThe next 3 experiments investigate the granularity and stability of the test selectionprocess guided by two typical metric functions.Experiment 1.1The first experiment serves as an illustration of the test selection process. We start withthe set originseeds (Appendix F), which contains 16 test cases. We select a metric suchthat Pk = 1, k = 1, ... , 7, and ^k $ < 1. We also set I rn — rm = 0, dn, m < 30, andrn =1-2, for n>30.Succesive passes of the test selection algorithm are shown in Appendix F, as the processselects 6.56 dense, 5.90 dense, 4.78 dense, 3.87 dense, 2.82 dense, 1.85 dense and 0.98dense test suites in originseeds.In the next two experiments the function Pk takes values Pk = 210-k, for k = 1, ... ,10,and Pk = 2- 6, for k = 11,12, .... The function rk takes values rk = 1 — for allk = 1, 2, .... 20 sets of 100 test cases were generated, representing random interleavings of6-3. EXPERIMENTAL RESULTS^ 1271 through 10 simultaneous network connections, and 10 through 80 ST-II target agents.Test sets are labelled tsc.t, indicating the number of simultaneous connections c andtargets t that were allowed in their generation.Experiment 1.2The first series of 10 test sets contained test cases with individual events labelled by eventnames only. (This would be suitable for testing event variations at global levels, e.g. doesan implementation correctly react to an event e after some event d.) Figure 6.4 showsthe number of test cases selected versus refinements in the test density. x axis representsthe density range of 0 through 1024.00, the diameter of the metric space generated bythis metric. For this metric, results for ts10.35 and ts10.50 coincide. Also, the line stylesof ts10.30 and ts10.35 are reused for ts1.80 and ts2.70 - the lower lines belong to ts1.80and ts2.70.Experiment 1.3The second series of 10 test sets contained test cases with individual events labelled byevent names, transition tour they belong to and the stream identifier. (This is a muchfiner identification which would allow detailed testing of event variations within individualconnections and with respect to event parameter data variations). Figure 6.5 shows thegranularity of the selection process for this case.In both cases, i.e. given both very fine (many distinct event labels), and much coarser(relatively few distinct event labels) event space, there is a steady increase in the numberof test cases selected as the density approaches 0. Figures at the low end of density spec-trum indicate, that even at very fine granularity levels of density, this metric function isstill capable of distinguishing still finer test suites. More test cases are selected for testsuites in the second series of tests, for the same levels of density, number of simultaneousconnections and recursion, since more distinct event labels can be identified. In bothgraphs, ts2.60 and ts2.70 occupied the middle portion of the graph, with the suites tsl.t( ts5.t and ts10.t) almost entirely below (above, resp). Given same density levels and128^ CHAPTER 6. TESTS OF THE THEORYnumber of test cases100.95.90.85.80.0075.0070.0065.0060.0055.0050.0045.0040.0035.0030.0025.0020.0015.0010.005.000.00000000 ,,,,1111■1■11111111IM11111111■1111III _ 1,__0.00 0.20 0.40 0.60 0.80^1.00ts10.30ts10.3ts10.50Es3:10-Fs CO. Toro3oF:s2.030ts1.80ts2.70ty x 103Granularity of MB test selection with event name identificationFigure 6.46-3. EXPERIMENTAL RESULTS^ 129number of test cases) ,.^.....,.• , • • • • ".• ..............---::--- ...... .....1 . EllMIMIMIIIMMIIMIll\^....... - - - ......111.111111MEIIhOINI --"1.NI ..... \-.. \......... -.. ._,___0.00 0.20 0.40 0.60100.95.90.85.80.0075.0070.0065.0060.0055.0050.0045.0040.0035.0030.0025.0020.0015.0010.005.000.000.80^1.00ts10.30ts10.3ts10.50rs3.50-Ts I0-rs130-ts2.60ts1.80ts2.70ty x 103Granularity of MB test selection with event name, gate and data parametersidentification Figure 6.5130^ CHAPTER 6. TESTS OF THE THEORYevent labelling, fewer test cases are almost always selected for test suites involving fewersimultaneous connections, indicating a poorer event pattern. At the same time, given thesame number of simultaneous connections, the effect of higher vertical recursion (moretargets) on the number of test cases selected is not very noticable, given the low valuesof the rk series for moderate values of k in this example.The stability and granularity of the MB selection which are observed in these two ex-periments, are confirmed throughout the test selection and coverage analysis experimentsthat follow.Identifying Transition ToursExperiment 2This experiment investigates the efficiency of applying metric based method for transitiontour selection, for single connection environments, with multiple agent functionality andwith moderate vertical recursion. We allow 3 different Finite State Machines to representindividual connections.We use the seed files from Appendix F, for each of the three agents of the ST-II protocolmachine. We hypothesize that they identify all transitions of each agent. We did not havea tool to generate or verify the transition tours, consequently the value of the test is inthe efficiency estimate rather than the exactness of the transition tour identification. Thelargest NT, (Chapter 4), is for the intermediate agent, where NT = 9, for ACK packetsto DISCONNECT or REFUSE PDU-s. We generated 8 test suites, containing randomlyselected one-connection environments with the upper bound on vertical recursion (numberof targets) equal to 30. All transition tours from seed files were added to these sets inorder to simulate test suites which have transition coverage. Generated test suites had177, 277, 377, 577, 677, 777, 877 and 977 test cases, and are designated ts.n, where n isthe number of test cases they contained. Figure 6.6 (enlarged portion in Figure 6.7)shows the number of test cases selected by the progressive passes of the test selection6-3. EXPERIMENTAL RESULTS^ 131algorithm, for each of the test sets, as the density covers the range from 511 to 0.998047.At 0.998047 the algorithm is guaranteed to have identified transition cover of the threeagents, when they work in isolation, provided the experiment's hypothesis is satisfied.The metric used is pk = 29-k for k less than 10, and pk = ik , for k > 10. Also, therecursion was given a minimal weight through the r function as in experiments 1.2 and1.3.The greater variety in event patterns and recursion is expected, as the randomly generatedtest sets grow larger. This accounts for more test cases selected, at equal density levels,for larger test sets.The results show that, even with larger test suites, the algorithm is not overly wasteful,when used for identifying test suites with fault coverage at least equal to that of thetransition tour method. Since the metric used in the experiment yields a complete metricspace, the experiment simulates a test selection environment where, first, a test suite oftransition tours equivalent coverage is selected, after which the process further proceedsby identifying still finer subsets of the initial set, with respect to variations in patterns ofrecursion and event sequencing. Completeness guarantees complete coverage of the initialset in the limit.Comparison of Pseudo-random Test Generation and Random Test Generationby MB MethodExperiment 3This experiment compares pseudo-random test generation, realized through refining ran-domly generated test sets by f-dense subsets, with random test generation. The goal isto investigate if metric-based test selection can help derive test sets with better coveragein a general case of single- and multi-connection environments, with light to moderate0.00 100.00 200.00 300.00 400.00 500.00i #- —densityts.277ts.377ts.577ts.677Ts . -777g.877—ts.977132^ CHAPTER 6. TESTS OF THE THEORYnumber of selected test cases180.00170.00160.00150.00140.00130.00120.00110.00100.0090.0080.0070.0060.0050.0040.0030.0020.0010.000.00Selecting Transition Tours with moderate vertical recursionFigure 6.66-3. EXPERIMENTAL RESULTS^ 133number of selected test cases175.00170.00165.00160.00155.00150.00145.00140.00135.00130.00125.00120.00115.00110.00105.00100.0095.001II11III II I1 1 . 1'^I,it p' 11 11101%. 1, 0 1■i ittil—1 1.4 - 1;1 ■ 1it illi"0,14Ito i.',111.41I ',1,N .1%4:II I%..a%10.Otti...■ .,',,,t%'..4ta• g.*:, #VAts.277ts.377ts.577ts.677Fs.777ts.877rs.977density0.00^50 00Selecting Transition Tours with moderate vertical recursionFigure 61134^ CHAPTER 6. TESTS OF THE THEORYtest set id connections targets cases generated cases selected1 10 30 300 982 10 30 1000 1003 10 30 1000 1004 10 30 500 985 1 10 500 966 1 10 500 867 1 10 500 938 1 2 500 1009 1 2 500 10010 5 10 500 100Experiment 3 - Characteristics of randomly generated setsTable 6.1recursion.2 series of test sets were generated, psrid and rid, id = 1, . . . ,10. Sets labelled psrl,...,psr10were obtained by randomly generating larger test sets (details in Table 6.1), and thenrefining them through test selection to obtain their most dense subsets containing approx-imately 100 test cases ( Table 6.1, column 5). Sets labelled r1,...,r10 were obtained byrandomly generating 100 test cases in each set, using the same bounds for simultaneousconnections and number of targets as their psr counterparts with the same id. Sets withequal ids were compared for their mutual coverage, and results plotted in Figure 6.8.The metric used was the same as in the experiment 1.2. In Figure 6.8, x axis is labelledwith id, id = 1, ... , 10. y axis represents the density of test sets rid in test sets psrid,and vice versa. Plots are in fact scatter graphs. We have connected points representingdensities of rid in psrid sets by a line labelled psr/r, and points representing densities ofpsrid in rid sets by a line labelled r/psr, for easier viewing of the results.For 9 out of 10 cases, psr sets were more dense in their r counterparts, than vice versa.Even with very modest effort and size of the starting set, the results show almost consistent6-3. EXPERIMENTAL RESULTS^ 135Pseudo-random versus random test generationFigure 6.8136^ CHAPTER 6. TESTS OF THE THEORYgain in coverage if randomly generated test suites are subjected to 1 or 2 iterations of thetest selection algorithm.Comparison of Pseudo-random Test Selection and Systematic Backtrackingby MB MethodExperiment 4This experiment compares test suites generated by pseudo-random test generation andsystematic backtracking on the basis of the number of test cases selected at a particulardensity level. The pseudo-random test sets used in comparison were generated usingparameters that favoured the backtracking algorithm The metric function is the same asin the previous experiment.First, a systematic backtracking algorithm within the TESTGEN++ system was run withcertain ISP, OSP and PDU test generation constraints that produced 134 transition tours.Next, a random test generation algorithm was invoked on the same FSM (ORIGIN agentof ST-II), and run to produce 134 test cases, with 2 connections and 5 targets, which isroughly the characteristic of the test suite produced by TESTGEN. Figure 6.9 shows theresults of the progressive steps of test selection algorithm, run on the sets rtest and btest,produced by random and backtracking algorithms, respectively.The experiment was repeated once more, for a more loose set of constraints on thebacktracking algorithm which produced 456 test cases. The random test generator wasrun with 3 connections and 20 targets to produce 456 test cases. Similar results (notshown) were observed. For instance, a 3.5 dense subset of rtest contained 116 test cases,whereas a 3.5 dense subset of btest contained 48 test cases.The informal inspection of test sets generated by the two methods reveals, that at thesesizes of test sets, the systematic backtracking algorithm produced poorly structured testsuites. (The first 12 test cases generated by the two algorithms can be found in AppendixF.) The suites of these sizes, produced by systematic backtracking, typically involved6-3. EXPERIMENTAL RESULTS^ 137Selected subset size for random and backtracking algorithmsFigure 6.9138^ CHAPTER 6. TESTS OF THE THEORYsubtours with very little variation in the initial tour segment, among different tours, andlittle or no vertical recursion. Another characteristics of this test suite was that if morethan 1 connection is exercised in the same test case, it was after the previous one closed(i.e. no real simultaneity). The experiment indicates that the MB selection can be usedto detect poorly structured test suites, through detecting low sizes of selected subsets atparticular levels of density, for the levels of multiple connections and vertical recursion atest suite is known to exercise.Sensitivity to Vertical RecursionExperiment 5In this experiment we investigate the effect of vertical recursion on the mutual density oftest sets. We use test suites of 100 test cases each, randomly generated, with low numbersof simultaneous connections (1 - 3), and moderate vertical recursion (10 - 30) targets. Theexperiment shows that sets generated with a certain number of simultaneous connectionsand a certain event pattern are less dense in the sets with the same characteristics, if theycontain less recursion than those sets.Figure 6.10 shows this effect of vertical recursion on mutual coverage of two series of 6 testsuites each. Sets r 1 , ... , r6 are randomly generated sets of test cases, and s i , ... , s6 aretheir corresponding sets (same number of simultaneous connections and event patterns),but with a certain reduction in recursion when calculated with respect to the correspond-ing set (i.e. the same subscript) from the r series. The test suite generation specifics andthe amount of reduction in recursion (in percentages) are shown in Table 6.2. No specialeffort was taken to fairly space the recursion levels in either r or s series of test sets.The effect is plotted by representing the mutual density for each pair of sets (rk , ss k ),k = 1, ... , 6 running along the x-axis. Connecting the points of the scatter graph allowsfor easier viewing. The density points of sets s i , ... , s6 in sets r1 , ... , r6 are connected bythe r/s line, and the density points of sets r 1 , ... , r6 in sets s i , ... , 8 6 are connected by6-3. EXPERIMENTAL RESULTS^ 139set connections targets rec. reduction for sets skrl 1 10 30r2 2 10 50r3 3 30 100r4 1 30 25r5 2 30 100r6 3 20 70Experiment 5 - Characteristics of test setsTable 6.2the s/r line. The metric function used is the same as in the previous experiment.We generally found that with random generation, sets of 100 test cases were sufficient,under given protocol characteristics, for the mutual coverage figures to show sensitivitytowards vertical recursion at reduction levels of in the range of 25-100 percent. Thisis due to the fact that, given a sufficient number of test cases in a test suite and a testgeneration algorithm that does not necessarily cluster recursive events in few clusters withpoor distribution, lower bounds of recursion are more likely to generate less variety in eventcluster recursion. Consequently, it will be easier for the set which has higher bounds onrecursion to approximate event patterns and recursive levels of poorly recursive sets thanvice versa, under such conditions.In a related experiment (not shown in this thesis), pairs of randomly generated sets,with moderate numbers of simultaneous connections, one with low and one with highrecursion, were compared for mutual coverage. Under the same metric function as in thisexperiment, the results were inconclusive. We identified the cause by examining sequenceswhich contributed to high maximum distances. We concluded that a new event pattern ina test sequence, likely to appear early in a sequence with many simultaneous connections,could easily offset the effect of better recursion distribution under this metric.Joint conclusion from both experiments is that the quality of test suite expressed throughdensity600.00550.00500.00450.00400.00350.00300.00250.00200.00150.00100.0050.000.00r/ssir0.00^1.00^2.00^3.00^4.00^5.00^6.00k140^ CHAPTER 6. TESTS OF THE THEORYSensitivity to vertical recursionFigure 6.106-3. EXPERIMENTAL RESULTS^ 141metric based coverage functions is sensitive towards vertical recursion, however a metricwith adequate sensitivity to recursion should be used if it is to be a decisive factor of thequality of test suite.Sensitivity to ParallelismExperiment 6In this experiment we investigate the impact of testing with low and moderate networkconnection parallelism on the mutual coverage of test suites that differ with respect to thelevel of parallelism. The experiments were conducted under quite general circumstances.All test sets, 6 with low and 6 with moderate parallelism of network connections, wererandomly generated, containing 100 test cases each. Therefore a certain potential existedfor inconclusive results due to choosing a particularly rare combination of events in eithercase. However, the mutual coverage results were consistent in 3 different metric spacesand throughout 36 different comparisons. We therefore concluded that the results were agood indication of the MB coverage function to identify suites with greater simultaneityof network connections, in a general case. The results are reported in Figures 6.11, 6.12,6.13.The characteristics of individual test suites, with respect to the number of simultaneousconnections and the number of targets are given in Table 6.3. Suites involved moderatevertical recursion (10-30 targets), and were put into "low simultaneous connection" (lscid)set if involving 1 to 3 simultaneous connections. Likewise, "moderate simultaneous con-nection" sets (mscid) involved 10 - 35 simultaneous connections. Suites with equal id-swere compared.The comparisons were carried out in metric spaces generated by metrics with1. pk = 1, for k < 400, and Pk = 22400) for k > 400.2. pk = 1, for k < 25, and pk = 2, 1 25 , for k > 25.142^ CHAPTER 6. TESTS OF THE THEORYtest set id connections targets cases generatedmsc0 10 30 100lsc0 1 30 100mscl 15 30 100lscl 1 30 100msc2 20 30 100lsc2 2 30 100msc3 25 10 100lsc3 3 30 100msc4 30 10 100lsc4 1 10 100msc5 35 10 100lsc5 2 10 100Experiment 6 - Characteristics of randomly generated test setsTable 6.33. Pk =--- 1, for k < 15, and pk — 2kL5 , for k > 15.and the series rk were defined as in Experiment 1.2.The plots for each of these metric spaces are in Figures 6.11, 6.12, 6.13, respectively.x axis is labelled with set id-s, from 0 through 6. The plots show the density of setslsc0,. .. , lsc5, in sets msc0, ... , msc5, resp. These scatter points are connected by a linelabelled msc/lsc. Similarly, the line labelled lsc/msc connects points that show densitiesof sets msc0 msc5 in sets lsc0,. . . , lsc5, resp.Although the test sets with moderate parallelism are consistently more dense in low-parallelism sets in all three metric spaces considered, significant difference exists in themagnitude of the mutual density variation. Metric 1. evaluates all patterns of very longsequences with equal value, up to a large k = 400. Therefore, the density variation in thespace generated by this metric can be attributed to the much larger length of sequencesinvolving many connections. This, however, is a definite indication of the number of con-nections exercised, given fixed vertical recursion, but not necessarily of the simultaneity of6-3. EXPERIMENTAL RESULTS^ 143Mutual densities of some sets with moderate and low number of simultaneousconnectionsFigure 6.11144^ CHAPTER 6. TESTS OF THE THEORYnue/6cIschrisc.....-N. ......."-., ...,......---,......--- **---.......-F.,..--.-...-..-..... ...... -...-0.00^1.00^2.00^3.00^4.00^5.00Mutual densities of some sets with moderate and low number of simultaneousconnectionsFigure 6.12density100.0095.0090.0085.0080.0075.0070.0065.0060.0055.0050.0045.0040.0035.0030.0025.0020.0015.0010.005.000.00d6-3. EXPERIMENTAL RESULTS^ 145densitymsalsc5.000.00^ lsc/msc90.005.000.005.000.005.000.005.000.005.001.005.00).00i.00).00).00i.00^ ........^........... - ............^.^..... ..... . ...........^..—...-......i.00...-------........-1.000.00^1.00^2.00^3.00^4.00^5.00Mutual densities of some sets with moderate and low number of simultaneousconnections Figure 6.13188775543:311146^ CHAPTER 6. TESTS OF THE THEORYsuch connections. Therefore, the next two metric spaces favoured the average and upperend case length of sets lscid, which was found to be in the range of 11 - 36. (Lengths ofmscid test cases were in the range of 60 - 145.) Consequently, they really evaluated thegreater variation in pattern of test sequences generated by many simultaneous connec-tions, especially in view of the fact that all sequences longer than 15 (25) would contributeat most 1 to the final distance. This is an extreme case which definitely did not give apriori advantage to longer test sequences of sets mscid. It is therefore conceivable thatalso a more moderate metric would yield results which are consistent with this experi-ment's results.Sensitivity to a Missing Event or Event CombinationThe purpose of the next two experiments is to investigate the sensitivity of mutual den-sities of two test suites, when one of the test suites does not contain an event of eventcombination, or does so only to a certain lesser extent. All test sets contain 100 test cases.Metric is the same as in Experiment 1.2, unless otherwise noted.Experiment 7.1First, the sensitivity analysis was carried out for the environments involving no parallelismand moderate vertical recursion. In this experiment, 20 randomly generated sets of 100test cases each were compared for mutual coverage. Test sets s i through s th are randominterleavings of one connection, up to 30 ST-II targets. Test sets 01 through 010 arerandom interleavings of one connection, up to 30 ST-II targets, where ST-II agent origindoes not include sequences with a PDU HAO (HID APPROVE) as a response to a CN(CONNECT) PDU. Figure 6.14 shows the density deviation for the densities of sets ok inrandomly generated test sets s k , connected by the line s/o. For comparison, also shownare plots of all other density figures available in the experiments: a) densities of sets s kin sets ok, connected by a line o/s, b) densities of sets s k in sets sk+1 , connected by a lines'..---•■•............- -•-•".".'s. ..............-...-— — — —AI ‘— — — ii- 4-i^■—/- - - 1,- - I -: ,•14•Ir'.. , •e.„•../•i\i:II1 7 ;,^.,,./(..^I\ 1.. •kI \ ..1A. .,..--\\ .- „ 4..i::-,%...„ .7 ‘./ ■ •,,'VN ...• e\ i.,\•\••....'600.00550.00500.00450.00400.00350.00300.00250.00200.00150.00100.0050.000.00s/os/so/so/originseedsk6-3. EXPERIMENTAL RESULTS^ 147s/s, and c) densities an oracle set originoracle in sets ok , line o/originoracle.density0.00^2.00^4.00^6.00^8.00^10 00Sensitivity to missing events - single connection with recursionFigure 6.14Experiment 7.2In this experiment we investigated environments with moderate parallelism and moderatevertical recursion. This experiment shows the effects of a missing event (SP or PDU) ina a more complex test suite. 3 groups of test sets, 100 random test cases each, weregenerated to randomly contain between 5 to 10 simultaneous connections, and 10 to 30148^ CHAPTER 6. TESTS OF THE THEORYtargets. The set s i through s6 was generated taking into account all available SPs andPDUs for all ST-II agents. Sets r 1 through r6 were generated with DSI (DISCONNECTPDU from the direction of origin) missing from intermediate seed sets. Likewise, sets t 1through t6 were generated with HCT PDU missing from target agent seed sets.Figure 6.15 shows that applying general metric to mutual coverage of such sets yieldsinconclusive results. Lines s/r, r/s, s/t and t/s connect scatter points of densities of setsr, s, t, and s in sets s, r, s and t resp., where densities are calculated only between setswith equal subscripts.We did observe, however, that at these levels of parallelism and recursion, the mutualcoverage of randomly generated test sets with all events included into seed files, consis-tently showed better density figures.In a related experiment, we improved sets t from the previous experiment, by adding10 percent and 70-80 percent of sequences, involving the HCT PDU. Figure 6.16 showsthe range of densities of sets t 1 , , t6 (all HCT event sequences deleted), tiaehet, • , t 6aehc,with almost all HCT events missing (10 percent sequences with it), and s l , , s 6 with70-80 percent sequences with HCTs (which is the typical content in a randomly generatedtest set of these characteristics), in randomly generated test sets rs i , , rs6 , applying ageneral metric function (from Experiment 1.2).Although the results explain the densities in Figure 6.15 to a certain extent, it stillwould be necessary to have a knowledge of the expected densities at particular levels ofvertical recursion and parallelism, in order to use general metric functions for missingevent identification.In the following experiments, a special metric function was designed to identify the cov-erage of a particular event (HCT PDU in this case). Its definition is the same as inExperiment 1.2., except for the cases were neither of the events at a position k in twosequences is HCT: if this is the case, the summand pk8k for this position k is zero. Thismetric function measures the coverage of the event HCT both with respect to its position .6-3. EXPERIMENTAL RESULTS^ 1497:-........r_ ----- -I ,,. ..^.-^••• •" ' ...'".. ‘ •.'",,,...^.•••0"\ •••.%%s ss • .../.>• .. 1^•I••.• /^...• % e•.■■%I..;'gri''.s•/',N.\\/0,or,'...., %.."•.•.1.'0.00^1.00^2.00^3.00^4.00^5.00^6.00density600.00550.00500.00450.00400.00350.00300.00250.00200.00150.00100.0050.000.00sirVssitWkSensitivity to missing events - general metric function, multiple connections withrecursionFigure 6.15150^ CHAPTER 6. TESTS OF THE THEORY-^•..........—..„^N. •.,,,•• :il/....„„„‘\•..^'`. '., /^.'.„....**".. ...•-. ...,,ii\ •/,^/...1..•\1; • Ivl0.00^1.00^2.00^3.00^4.00^5.00^6.00density600.00550.00500.00450.00400.00350.00300.00250.00200.00150.00100.0050.000.00rs/srs/taehctrs/tkSensitivity to missing events - general metric function, moderate parallelism andrecursionFigure 6.166-3. EXPERIMENTAL RESULTS^ 151in sequences and the level of recursion.Figure 6.17 shows the application of the metric function to specially identify the coverageof HCT event, and its improvement with improving t-test sets with sequences containingevent HCT, up to its average 70-80 percent representativeness as observed in randomlygenerated test sets. These are calculated as densities of t sets in randomly generated ssets. Figure 6.18 shows the effect of improving sets t 1 to t6 with 10, 20,50 and 70-80densityslid600.00 s2/t2s3/t3550.00 s4/t4T5F.5—500.00 s6/t6450.00400.00350.00300.00250.00200.00\150.00100.0050.00—••••-"Z-Z...,,,... ....an :::-....=-- -...: ----------_ ... ..^...0.00percent HCT in t0.00 20 00 40 00 60 00Sensitivity to missing events - special metric, moderate parallelism and recursionFigure 6.17percent of sequences which contain HCT in case of general metric. These are calculated152^ CHAPTER 6. TESTS OF THE THEORYas densities of t sets in randomly generated s sets. These do not improve as fast as in theprevious metric.All subsequent examples use the special metric designed to identify missing event content.sl/tls2/t2WO_-..- --,..--.. ^i4747;57i5'. \% ./v7._',_\ \%%\\\\^%%%,Y' density600.00550.00500.00450.00400.00350.00300.00250.00200.00150.00100.0050.000.00rcent HCT in t0.00^10 00^20 00^30 00^40 00^50 00Sensitivity to missing events - general metric function, moderate parallelism andrecursionFigure 6.18The following figures show the sensitivity of this metric in various situations.Figure 6.19 shows the gain in the quality of t by increasing its HCT content, by plottingthe density of randomly generated test suites, in sets tid, whose special event content_6-3. EXPERIMENTAL RESULTS^ 153(HCT content) is improved by adding 10 -70 percent test sequences involving the eventconsidered. The next 2 figures show how the plots for 2 combinations of randomly gen-density600.00550.00500.00450.00400.00350.00300.00250.00200.00150.00100.0050.000.00tl/s1t2/s2t3/s3t4/s4r5/;5—t6/s6percent HCT in t0.00 20 00 40 00 60 00Sensitivity to missing events - special metric function, moderate parallelism andrecursionFigure 6.19erated test sets s and sets t from this same example, meet in the range of 16-64, as thecoverage of the event HCT increases. (These are excerpts for fixed combinations of testsets from Figures 6.17 and 6.20.) Figure 6.22 shows mutual coverages of the randomlygenerated sets and set combinations in the same metric. Sets typically include 70-80 per-cent tests with HCT. Random sets were generated using 5-10 connections, 10-30 targets.154^ CHAPTER 6. TESTS OF THE THEORYdensity600.00550.00500.00450.00400.00350.00300.00250.00200.00150.00100.0050.000.00N ^.. — ............. ,......................0.00 20 00 40 00 60 00s3/t3t3/s3percent content HCTSensitivity to missing events - special metric, moderate parallelism and recursionFigure 6.20density600.00550.00500.00450.00400.00350.00300.00250.00200.00150.00100.0050.000.000.00 20 00^40 00^60 00............_ ..............6-3. EXPERIMENTAL RESULTS^ 155percent content HCTSensitivity to missing events - special metric, moderate parallelism and recursionFigure 6.21s4/t4t4/s4156^ CHAPTER 6. TESTS OF THE THEORYExperiments show that in such cases, mutual coverages of randomly generated sets alsotend to settle in this same range.density x 1031.051.000.950.900.850.800.750.700.650.600.550.500.450.400.350.300.250.200.150.100.050.00-0.05 s/scombinations of s0.00^5.00^10 00^15 00Sensitivity to missing events - special metric on random setsFigure 6.226-4. DISCUSSION^ 1576-4 DiscussionThe sensitivity analysis of the MB-method in the previous section is intended to provideonly a rough assessment of its ability to identify or select test suites with certain levelsof vertical recursion, multiconnection capabilities and event, transition or event patterncontent. While we have endeavored to give a reasonably complete assessment of thesensitivity of the method with respect to these protocol features, there are at least twoadditional factors that may influence the validity of the results and the suitability of thealgorithm implementation in a general case.The first has to do with the complexity of the state space we have considered in theexperiments. The spaces we have experimented with typically include 5 - 20 simultane-ous streams (horizontal recursion complexity), and 10 - 30 target applications (verticalrecursion complexity). 4 For more complex state spaces (involving hundreds of streamsand target agents), an independent set of experimental results should be obtained.Secondly, one may argue that the sampling basis of the experiments (in the rangeof 10 - 30 sample test suites or comparisons) is not statistically significant to draw anyconclusion about the stability of the sensitivity results. However, since results of theseexperiments are also analysed and supported by the theoretical explanations whereverpossible, we feel that they may be considered as an indication that, under similar protocoland experimental setting, repeated results would tend to settle in the same range.Based on these experiments, MB-method appears attractive for test selection andcoverage evaluation in a general case of a relatively complex protocol, and moderatelycomplex state space. Since no expert knowledge is required for the successful applicationof the method (except, perhaps, the knowledge of Ns or NT), then it is feasible for asingle implementation to be utilized over a range of protocols - greatly improving test4 For a typical test case, this may mean up to 20*30 simultaneous network connections from source todestination Service Access Point.158^ CHAPTER 6. TESTS OF THE THEORYdevelopment time compared to specialized protocol test generators.It is useful to discuss the efficiency of MB-method in terms of the size and complexityof the state space it can successfully handle. As discussed in [36, 43, 26], known testgeneration and coverage analysis tools do not scale well to larger "programs". It is notknown to us that there exists another implementation, that can systematically select andanalyse test suites with the level of concurrency and recursion that we have used in theseexperiments.In contrast, MB-method handled "moderately" complex state spaces 5 reasonablywell, with performance results settling in the range of Section 6-3.1. The process de-graded gracefully as the number of simultaneous connections and vertical recursion wereallowed to rise to levels considered in experiments. Although neither (static) test selec-tion or coverage evaluation are time critical processes, the experiments showed a need forbetter algorithm and implementation, if larger space analysis is to be attempted. Thisis especially true if test suites larger than 1000 test cases in moderately complex statespaces are to be mutually compared for coverage.The performance of the method was greatly improved in the experiments when inputevents only were considered. This is however applicable only in cases where interoper-ability relation is the theoretical upper bound on the testing process. In contrast, allexperiments were carried out at the level of granularity of an atomic event. The metric-based theory does not constrain us to such granularity - a mixed level hierarchical testselection and analysis is equally supported by the theoretical results. It is expected thatthe algorithm would be more efficient for coarser granularity of test subsequences or setsof subsequences, or at module level.The efficiency of the MB-method could be improved somewhat further by the following5 We characterize state spaces of 10 - 600 simultaneous network connections as moderately complex,as compared to "low" complexity test suites generated by typical systematic methods [16], which contain1 - 5 connections.6-4. DISCUSSION^ 159steps.Partitioning of the input sets into disjunct regions (Voronoi diagrams were consideredin the early algorithm implementation), processing the regions separately and than puttingthe results together, would possibly alleviate the performance problem related to largenumbers of initial test cases.Computing the distance only to a certain depth of the execution sequence, usingthe knowledge of the maximum sum of the tail of the series, would shorten the time ofindividual sequence comparison in spaces with increased recursion and simultaneity ofconnections.All computations in the present implementation were carried out in floating pointmode. Scaling the values for pk and rk to large integers, and carrying out the computationin integer mode has been observed to noticeably shorten the time needed for individualalgorithm passes.Chapter 7ConclusionIn this concluding chapter we summarize the contributions reported in the previous chap-ters, and suggest some topics for further research.7-1 Summary of Contributions7-1.1 A Metric Characterization of Protocol Control BehaviourSpaceMetric characterizations of the behaviour of systems have been proposed earlier ([7, 18])for theoretical investigations of some aspects of systems' behaviour in the theory of con-currency. However, metrics used in earlier work were typically the simple metrics onstrings (da, Appendix C, Def. C.2). This metric distinguishes strings of events with re-spect to the first event at which they differ in a temporally ordered sequence of events ofa system's control behaviour.Large (and infinite) systems are ordinarily not specified (or cannot be directly speci-fied) in terms of strings of events. Specification formalisms use hierarchical representation1607-1. SUMMARY OF CONTRIBUTIONS^ 161of such systems, in which recursively specified objects may generate large and infinite con-trol behaviour spaces of strings of events. Analysis of the behaviour of such systems isnot possible, in the previously researched metric, at different levels of hierarchy that spec-ification mechanisms allow.This thesis gives an improved metric characterization of the control behaviour spaceof such systems. The metric dt defined in this thesis includes recursive characteristics ofstrings of events into its definition. This allows analysis of (infinite) systems which arespecified hierarchically, with recursion, at various levels of granularity with respect to sucha hierarchy: atomic actions, sequences of actions and their sets, processes (modules) andgroups of processes. Being primarily concerned with the practical aspects of the approach,we stop short of expanding these results to "any level of granularity" with respect to thespecification hierarchy.' In a series of results we show the theoretical validity of the newmetric characterization of the control behaviour of systems. The results on completeness(Theorem 3.3), total boundedness (Theorem 3.2), the existence of finite approximants toinfinite sequences of interactions (Theorem 3.6), and the metric characterization of tracepreorder and equivalence (Propositions 4.5 and 4.6), derived in this thesis, are comparableto those obtained earlier for simpler metric spaces.7-1.2 Method-independent Coverage EvaluationA simple method for measuring mutual coverage of the control behaviour of test suiteshas been defined based on the metric dt. This method of test coverage evaluation hasseveral advantages over other test coverage evaluation methods.First, the metric based coverage measures are defined for infinite behaviour spaces ofsystems (Def. 3.4. together with the total boundness result provide theoretical foundationfor a coverage evaluation method in such spaces). With respect to other coverage criteria1 The constraint is contained in Theorems 3.7 and 3.8, where only finite sets are allowed to be extracted.162^ CHAPTER 7. CONCLUSIONdefined for infinite test domains, e.g. number of paths covered (which is inconclusive forrecursive systems), or the all paths criteria (which is practically infeasible), this definitionis both theoretically sound for recursive systems (the coverage of low, high, and thoroughcoverage of recursion in between the extremes is evaluated, Chapter 3-2.2), and providestheoretical basis for coverage evaluation methods for large and infinite metric spaces whichare totally bounded (through the stratification of infinite spaces into finite intermediatesets of truncations of traces, Definition 3.3, and 3.4).Secondly, the definition of coverage is analytical in character. It does not involve anytesting concept (e.g. mutants, transitions or branches) or non-specified 2 information (e.g.valuations or expert opinion). The generality of the method resulting from its analyticalcharacter is further demonstrated by establishing the metric hierarchy of some seeminglyunrelated fault coverage models in protocol testing: FSM methods with transition andtransfer fault coverage, general software test methods with branch and path coverage, andfinally trace equivalence as a fault model for infinite labelled transition systems (Chapter4).Next, the definition of MB coverage is applicable to mutual coverage evaluation of testsuites derived by different test methods. Whereas any method can be (either theoreticallyor experimentally) used to evaluate another, there are two original features of this methodthat make it a fair comparison tool.The fact that it is defined on the entire space of a specified behaviour in atomic terms(i.e. unconstrained by particular requirements of some method), and that it evaluatescoverage on the basis of generic behaviour information, make it an unbiased method forany underlying test suite structure. In this way, a frequent error [24], of judging one testmethod inside the space generated by the constraints of another method, is avoided.Also, the method is not based on a relationship in which one set of test cases "sub-sumes" another in terms of coverage (e.g. maximum depth of paths covered in depthbound methods, or inclusion of test sets ([4, 20])). Subsumes relationship is both inap-2 We consider that a system is fully specified by a valid (formal) specification of its behaviour.7-1. SUMMARY OF CONTRIBUTIONS^ 163plicable to non-deterministic systems (a subset of test cases on an execution may exposea fault that a superset did not), and practically obscure(for depth bound methods onrecursive systems it forces counterintuitive test preference, whereas the set inclusion cri-teria is bound to yield inconclusive results most of the time). The MB coverage methodis both always computable, and avoids the caveat of enforcing the subsumes relationshipin test coverage. This is further underscored by the fact that test sets may be comparedin a family of metric spaces parameterized by functions pk and rk , (Chapter 3-2.2B, alsoChapter 6, Experimental Results), and with respect to two different coverage measures,CovMax(T) and CovAve(T) (Chapter 3-3). It is therefore geared towards examiningthe quality of test suites, rather than passing a definitive judgement on coverage in largenondeterministic environments.7-1.3 The MB Test Selection MethodThe metric based test selection method for the control behaviour of protocol behaviour,presented in this thesis, has several features that set it apart from other test selectionalgorithms.First, the method is equipped with both the coverage target (all finite and infinitesequences, in metric spaces S, 45,), and S(S), Theorems 3.7 and 3.8), and the testingrelation (which may be strings equivalence, conf (Chapter 3-4.2), or intop, intop„d,(Chapter 5-4)). The testing relation will hold between the tested and specified systems,after the completion of the testing process based on selected test cases, if the coveragetarget is achieved.This original way of looking at the testing process is theoretically sound. The iterativecharacter of test selection algorithms that metric spaces permit, through achieving finer&dense approximations of original sets in a series of steps, provides for their practicalfeasibility for large systems. The thesis also gives a theoretical basis for the feasibility of164^ CHAPTER 7. CONCLUSIONthe method for infinite systems. The proof of the existence of finite &dense covers forinfinite systems in Theorem 3.2, which offers a computation method for discarding allbut a finite number of elements of infinite spaces that should participate in any specificpass of the algorithm, together with the systematic Cauchy sequence generation by thealgorithm itself, show that this indeed is possible. The ability to deal with infinite testsand infinite spaces in a systematic, intuitive and theoretically sound manner is a uniquefeature of the MB method.This method is entirely based on the information contained in the description of thesystem, and can therefore be fully automated. Full automation is crucial if thoroughtesting of large systems is the goal, and improves the practice of software engineering byshortening the software development time.7-1.4 Int erop erability RelationCurrent testing theory of nondeterministic, concurrent and communicating systems in-volves relational characterizations of how well one system implements a valid specifica-tion of its behaviour. The characterizations are based on the observation of system'sbehaviour. The underlying notion is that, if two systems cannot be told apart by ob-serving their behaviour, then they cannot be told apart by testing the same behaviour.Therefore, they are testing equivalent. One system is then, in particular, "less" than theother with respect to a testing relation, if fewer traces of identified kind can be observed onits interaction with environment (tester). The only possible exception are traces resultingfrom internal nondeterministic choices, and those related to the treatment of divergence.In the definition of the interoperability relations intop and intop, we challenge thisbasic premise of the theory of testing in the interleaving model of concurrency. In exam-ining specifications of real systems in environments more complex than usually consideredin theoretical investigations, we have determined, that a different kind of testing relationwould be more appropriate for testing of communication protocols. The testing of pro-7-1. SUMMARY OF CONTRIBUTIONS^ 165cesses with the theoretical upper bound of intop relation is still based on observation, butthe possibility of observing and testing a behaviour are not seen as necessarily equivalentnotions. In particular, the possibility that a trace may be observed (by specification ofa system) does not necessarily imply that it may be observed in testing (i.e. we allowthat it may not). The intop relation allows the definition of sets of events (Lau andDef. 5.2-2), and refusal sets (Def. 5.2-3), such that the observation of one behaviour ina set of alternative behaviours is sufficient to establish the validity of a system and itsspecification with respect to intop relation. This, in particular, may mean that the twosystems are not even trace equivalent. The intop relation is a more accurate definition ofvalidity of an implementation for testing multiconnection protocol environments for in-teroperability. Previously defined testing relations based on observation require that suchbehaviours be also observable in testing, even if they are irrelevant for interoperability.At the same time, it is a more efficient upper bound on the testing process. It eliminatesinconclusive test runs due to non-synchronized next event selection between the tester andimplementation under test, and redundant test runs of alternative behaviours which mayeven be unobservable. It supports fair selection strategies in that possibly unobservableand behaviours irrelevant for interoperability are uniquely identified in the interoperabil-ity tester tree, and may be dropped in favour of required behaviours which are essentialfor interoperability.7-1.5 Contributions to Specific Areas of StudyThe results of this thesis are relevant to several areas of study.The new metric characterization of the behaviour of concurrent systems allows forthe same theoretical investigations in the theory of concurrency to be carried out in amore finely characterized space. Issues such as fair and unfair behaviours, divergence andequivalences can be reformulated to allow for the level of detail currently not possible.The MB theory can be used as a theoretical basis of algorithms that may allow system-166^ CHAPTER 7. CONCLUSIONatic selection, analysis and possibly generation of test suites with the levels of recursionand parallelism usually not attempted in practice, thereby improving the practice of pro-tocol engineering.The theory of extensional equivalences in process algebras with interleaving semanticsof concurrency, is challenged in this work with respect to the predominant view of obser-vation and testing being equivalent notions.Finally, the research in this thesis has been inspired by the related work in softwaretesting. Whereas the related work states the superiority and need for coverage testing(of complete state spaces) in a "uniform" manner (i.e. no a priori prefered test cases), itdoes not discuss a method, other than stating that one is presently unavailable [25]. Webelieve that this thesis is a step towards a possible solution to this general problem andthat it does bring the theory and practice of software testing that much closer.7-2 Suggestions for Further ResearchA number of interesting research problems arise in connection with the results derived inthis thesis.A challenging and long standing problem in software research [53] is to relate testcoverage to the reliability of tested implementations. We believe that this problem issolvable within our proposed analytical framework.In the thesis we stop short of repeating the work of [18] in demonstrating connectionsof the two theories of concurrency: metric spaces and synchronization trees, for the metricspaces (S, dt). However, the results in Chapter 3, Section 4.3 are a good indication thatthis may be possible. This would be of relevance to protocol testing, if efficient methodsare found for execution of tree-specified test sets.The experimentation experience suggests that the MB method may also be used as7-2. SUGGESTIONS FOR FURTHER RESEARCH^ 167a test generation method, perhaps on infinite sets defined by specification. A pseudo-random test generation of &dense sets in complete spaces is one possible solution thatdeserves investigation. Another possibility is to generate &dense sets using finite descrip-tions of infinite systems in the style of [171, and the constraints derived in the proof ofthe total boundness theorem.Experimental extension of the work of this thesis in the direction of experimentalcalibration of MB method against some well known testing tools, on very large systems,would be of definite interest for the comparison of this method to other established testmethods.It may be interesting to use this metric function (or other metric functions), for theanalysis of the specified control behaviour space of protocols. Using metric functions toinvestigate and possibly characterize protocol features such as testability, distribution ofevents in the metric space, error detection probability of faults, hard to detect faultsand other issues may help improve the efficiency and quality of the testing process ofcommunication protocols.Bibliography[1] N. Arakawa, M. Phalippou, N. Risser, and T. Soneoka. Combination of confor-mance and interoperability testing. In 5th International Conference FORTE '92.Lannion,France, 1992.[2] E. Brinksma. A theory for the derivation of tests. In Aggarwal and Sabnani, editors,Protocol Specification, Testing and Verification VIII. North-Holland, 1988.[3] E. Brinksma, G. Scollo, and C. Steenbergen. LOTOS specifications, their imple-mentation and their tests. In IFIP 6.1 Sixth Int. Workshop on Protocol Specification,Testing and Verification, St. Jovite, Canada, 1987.[4] Ed Brinksma, Jan Tretmans, and Louis Verhaard. A framework for test selection. InB. Jonnson, J. Parrow, and B. Pehrson, editors, Protocol Specification, Testing andVerification XI. North-Holland, 1992.[5] X. Castillo and D. P. Siewiorek. Workload, performance and reliability of digitalcomputing systems. In IEEE International Symposium on Fault Tolerant Computing,1981[6] Tsun S. Chow. Testing software design modeled by finite-state machines. IEEETransactions on Software Engineering, SE-4(3):178-187, 1978.[7] G. Costa. A metric characterization of fair computations in CCS. Technical Report169-84, Dept. of Comp. Sci., University of Edinburgh, 1984.[8] J. Curgus. Analytic methods in coverage testing of communications software. InProceedings of CASCON 92 - IBM Conference on Advances in Software. 1992.[9] J. Curgus. Development of a TTCN test suite for ST-II protocol. IBM Studie, 1992.[10] J. Curgus and S. T. Vuong. A metric based theory of test selection and coverage.In Protocol Specification, Testing and Verification XIII North-Holland, Amsterdam,1993.168BIBLIOGRAPHY^ 169[11] J. Curgus and S.T. Vuong. A framework for interoperability testing of communicationprotocols. International Conference on Network Protocols ICNP-93, 1993.[12] R. de Nicola. Extensional equivalences for transition systems. Acta Informatica,(24):211-237, 1987.[13] R. de Nicola and M. C. B. Hennesy. Testing equivalences for processes. TheoreticalComputer Science, (34):83-133, 1984.[14] J. Duran and S. Ntafos. An evaluation of random testing. IEEE Transactions onSoftware Engineering, pages 438-444, July 1984.[15] C. Topolcic (ed). Experimental Internet Stream Protocol, Version 2 (ST-II); RFC-1190. Internet Requests for Comments, (1190), October 1990.[16] International Organization for Standardization (ISO). X.25-DTE Conformance Test-ing, Part 3, Packet Level Conformance Test Suite. IS 8882, 1992.[17] S. Gallouzi, L. Logrippo, and A. Obaid. An expressive trace theory for LOTOS. InB. Jonnson, J. Parrow, and B. Pehrson, editors, Protocol Specification, Testing andVerification XI. North-Holland, 1992.[18] W. G. Golson and W. C. Rounds. Connections between two theories of concurrency:Metric spaces and synchronization trees. Information and Control, (57):102-124,1983.[19] R. Gotzhein. Formal definition and representation of interaction points. ComputerNetworks and ISDN Systems, (25):3-22, 1992.[20] J. S. Gourlay. A mathematical framework for the investigation of testing. IEEETransactions on Software Engineering, SE-9(6), 1983.[21] D. Gueraichi and L. Logrippo. Derivation of test cases for LAP-B from a LOTOSspecification. Technical Report 89-18, Dept. of Comp. Sci., University of Ottawa,1989.[22] F. Guo and R. L. Probert. Mutation testing of communication protocols: Method-ology and assessment of coverage. Technical Report 91-05, Dept. of Comp. Sci.,University of Ottawa, 1991.[23] D. Hamlet. Are we testing for true reliability. IEEE SOFTWARE, July 1992.[24] Dick Hamlet. Probable correctness theory. Inform. Processing Lett., 25:17-25, 1987.[25] R. Hamlet and R.Taylor. Partition testing does not inspire confidence. In SecondWorkshop on Software Testing, Verification, and Analysis, Banff, Canada, 1988.170^ BIBLIOGRAPHY[26] D. Harel. Biting the silver bullet. IEEE Computer, 25:8-25, 1992.[27] M. Hennessy. Observing processes. In J. W. de Bakker, W. P. Roever, and G. Rozen-berg, editors, Linear Time, Branching Time and Partial Order in Logics and Modelsfor Concurrency, pages 173-200. 1989.[28] E. Hewitt and K. Stromberg. Real and Abstract Analysis. Springer-Verlag, NewYork, 1965.[29] Gerard J. Holzmann. Design and validation of computer protocols. Prentice Hall,Englewood Cliffs, New Jersey, 1991.[30] ISO/IEC-JTC1/SC21/WG1/FDT/B. Information Processing Systems - Open Sys-tems Interconnection - Estelle, a Formal Description Technique Based on an ExtendedState Transition Model. IS 907, 1989.[31] ISO/IEC-JTC1/SC21/WG1/FDT/C. Information Processing Systems - Open Sys-tems Interconnection - LOTOS, a Formal Description Technique Based on the Tem-poral Ordering of Observational Behaviour. IS 8807, February 1989.[32] ISO/IEC-JTC1/SC21/WG6. Specification of Abstract Syntax Notation ONE(asn.1).IS 8824, 1987.[33] R. K. Iyer, S. E. Butner, and E. J. McCluskey. A statistical failure/load relationship:Results of a multicomputer study. IEEE Transactions on Computers, C-31(7), July1982.[34] G. Karjoth. Implementing process algebra specifications by state machines. In Ag-garwal and Sabnani, editors, Protocol Specification, Testing and Verification VIII.North-Holland, 1988.[35] R. Keller. Formal verification of parallel programs. Commun. ACM, (21):666-677,1978.[36] J. P. J. Kelly, T. I. McVittie, and W. I. Yamamoto. Implementing design diversityto achieve fault tolerance. IEEE SOFTWARE, pages 10-12, July 1991.[37] E Krauser, A. P. Mathur, and V.J. Rego. High performance software testing on SIMDmachines. IEEE Transactions on Software Engineering, (SE-17):403-423, 1991.[38] Jean-Claude Laprie and Karama Kanoun. X-ware reliability and availability model-ing. IEEE Transactions on Software Engineering, SE-18:130-147, 1992.[39] Guy Leduc. Conformance relation, associated equivalence, and new canonical testerin LOTOS. In B. Jonnson, J. Parrow, and B. Pehrson, editors, Protocol Specification,Testing and Verification XI. North-Holland, 1992.BIBLIOGRAPHY^ 171[40] Y. Lu. On TESTGEN, an environment for protocol test sequence generation andits application to the FDDI MAC protocol. Master's thesis, University of BritishColumbia, 1991.[41] M. McAllister, S. T. Vuong, and J. Curgus. Automated test case selection based ontest coverage metrics. In Protocol Test Systems V. Elsevier Science Publishers B.V.(North Holland), 1992.[42] R. Milner. A Calculus of Communicating Systems. In Lecture Notes in ComputerScience 92. Springer-Verlag, New York/Berlin, 1980.[43] M. T. Norris and S. G. Stockman. Industrializing formal methods for telecommunica-tions. In Lecture Notes in Computer Science 387. Springer-Verlag, New York/Berlin,1990.[44] S. Ntafos. Software testing: Theory and practice. In IEEE International Test Con-ference, 1992.[45] David L. Parnas, John A. van Schouwen, and Shu Po Kwan. Evaluation of safety-critical software. Commun. ACM, (33):636-648, 1990.[46] C. Partridge and S. Pink. An implementation of the revised Internet Stream Protocol(ST-2). Internetworking: Research and Experience, (3):27-54, 1992.[47] R. L. Probert and F. Guo. Mutation testing of protocols: Principles and preliminaryexperimental results. In Protocol Test Systems III. Elsevier Science Publishers B. V.(North Holland), 1991.[48] S. Rapps and E. J. Weyuker. Selecting software test data using data flow information.IEEE Transactions on Software Engineering, pages 367-375, April 1985.[49] B. Sarikaya, G.v.Bochmann, and E.Cerny. A test design methodology for protocoltesting. IEEE Transactions on Software Engineering, pages 518-531, April 1987.[50] SC6/WG4/N498. Formal description of Transport Service, ISO 8072, in LOTOS.revised ISO/PDTR 10023, 1987.[51] D. P. Sidhu and C. -S. Chang. Probabilistic testing of protocols. ACM SIGCOMMComputer Communication Review, 19-4:295-302, 1989.[52] D. P. Sidhu and T. -K. Leung. Formal methods for protocol testing: A detailed study.IEEE Transactions on Software Engineering, SE- 15:413-427, 1989.[53] P. K. Srimani and Y. K. Malaiya. Steps to practical reliability measurement. IEEESOFTWARE, pages 10-12, July 1992.172^ BIBLIOGRAPHY[54] R. N. Taylor, D. L. Levine, and C. D. Kelly. Structural testing of concurrent pro-grams. IEEE Transactions on Software Engineering, SE-18:206-215, 1992.[55] H. Ural and Bo Yang. A test sequence selection method for protocol testing. IEEETransaction on Communications, april 1991.[56] G. v. Bochmann, D. A. Dssouli, M. Duboc, M. Ghedamsi A., and G. Luo. Faultmodels in testing. In Protocol Test Systems IV. Elsevier Science Publishers B. V.(North Holland), 1992.[57] A. von Mayrhauser. Software testing: Opportunity and nightmare. In IEEE Inter-national Test Conference, 1992.[58] S.T. Vuong and J. Alilovic-Curgus. Metric-based test selection for communicationprotocols. Technical report, University of British Columbia, February 1992. CICSRand Dept. of Computer Science.[59] S.T. Vuong and J. Alilovic-Curgus. Test coverage metrics for communication proto-cols. In Protocol Test Systems IV. Elsevier Science Publishers B. V. (North Holland),1992. Invited Paper.[60] S.T. Vuong and J. Curgus. Metric characterization of execution sequences in LOTOS.In ICPADS 92 - International Conference on Parallel and Distributed Systems, 1992.[61] S.T. Vuong, H. Janssen, Y. Lu, C. Mathieson, and B. Do. Testgen: An environmentfor test suite generation and selection. The Computer Communication Journal, 1992.[62] W. Yi and K. G. Larsen. Testing probabilistic and nondeterministic processes. InR. J. Linn and M. U. Uyar, editors, Protocol Specification, Testing and VerificationXII. North-Holland, 1992.Appendix ALabelled Transition SystemsDefinition A.1 ([12]) A Labelled Transition System is a quadrupole (Q, E, -a -*, go )where Q is a countable set of states, E is a countable set of elementary actions, -a -*with a E E U {i}, is a set of binary relations on Q and go E Q is the initial state. ^Notation A.1 ([39])Processes ( denoted by P, and ranged over Ti, T2, T1, T2, P1, P2, P',... ) will be sets oflabelled transition systems over an alphabet E U {i} (i is the unobservable action) ofelementary actions, ranged over by a, b,c,....P - a -* P' means that process P may engage in an action a E E and, afterdoing so, behaves like process P'.P - ik --- P' means that process P may engage in the sequence of k internalactions and, after doing so, behaves like process P'.P - a.b --4 P' means HP", such that P - a -> P" and P" - b --4 P'P = a^P' means 3k0 , k1 E N such that P - i k°.a.i kl -.), P'P = a =means 3P' such that P = a P', i.e. P may engage in an action aP $ a means -, (P = a =) i.e. P cannot engage in an action aP = a P' means that process P may engage in a sequence of observable actions173174^ Appendix A. Labelled Transition Systemsa and, after doing so, behaves like process P'.P = a means that 3P' such that P = a P'P= ac° means Vk, we have P = akTr(P) is the trace set of P, i.e. {a I P = a^Tr(P) is a subset of E*P after a = {P' I P =^PT i.e. the set of all behaviour expressions (orstates) reachable by aOut(P, a) is the set of possible observable actions after the trace a, i.e.Out(P, a) = {a E E a.a E Tr(P)}Ref (P, a) is the refusal set of P after trace a, i.e.Ref(P,c) = {X C E^E P after a, such that P' a^E X}Appendix BLOTOSLOTOS (Language of Temporal Ordering Specification) is a Formal Description Techniquedeveloped within ISO (International Organization for Standardization) for the formalspecification of open distributed systems. The systems are specified in LOTOS by defininga temporal relation among the interactions , that constitute the externally observablebehaviour of the system. Labelled Transition Systems (Appendix A) are used as modelsfor LOTOS processes.The operational semantics is given by a system of inference rules that correspondto the language operators. These rules generate a Labelled Transition System for eachbehaviour expression of the language. Let B1 , B i , B2, B21 represent behaviour expressions,and m E E observable actions. The special action i represents the unobservable, internalaction that can be used to model nondeterminism.The purpose of this annex is just to present briefly the mapping of a LOTOS speci-fication onto such model. We give axioms and inference rules for the subset of LOTOSthat is used in the thesis in Table B.1. A detailed presentation may be found in [31].175176^ Appendix B. LOTOSCombinator Axioms or Inference Rulesstop nonem; B m;B—m—*Bi; B i;B—i—*BBiDB2 Bi — m —> B; I — B i 0B2 — m --+ B;orB2 - rn -4 B2I I - B1 [] B2 - rn -> B12131 I [gl , • • • , gni I B2 B1 - m -* Bii , name(m) V Igi, • • • , thili -I — Bii[gi, • • • , fin]IB2 — m —÷ 141[gi, • • • ,g2]1B2OrB2 - m -* B21 , name(m) cE {m., . . . , gi, } I —I — B11[91, • • - , gn] l/32 — m -4 Bi. I [M., • • • , gd1B21BillB2 Bi - m -* B; ) B2 - ni -4 B21 1 -Bi I I B2 - m -> B;jIB;131111B2 B1l[]1B2Axioms and Inference Rules in LOTOSTable B.1Appendix CTheory of Metric Spaces -Mathematical DefinitionsDefinition C.1 (Metric space)A metric space is a pair (M, d) with M a non-empty set and d a mapping (a metric ordistance) of M x M onto the set of non-negative real numbers, that satisfies the followingproperties:(a) Vx, y E M d(x,y) = 0 ^ x = y(b) Vx, y E M d(x,y) = d(y, x)(c) Vx, y, z E M d(x , y) < d(x, z) + d(z,y).Definition C.2 Let A be an alphabet, and let A°° denote the set of all finite and in-finite words over A. Let, for x E A°°, x(n) denote the prefix of x of length n, in caselength(x)> n, and x otherwise. We putda(x, y) = 2-supfnix(n)=y(n)}177178^Appendix C. Theory of Metric Spaces - Mathematical Definitionswith the convention that 2 -" = 0. Then (A", da) is a metric space.Definition C.3 Let (M, d) be a metric space, let {xi}°_o be a sequence in M.(a) We say that {xi} itc, is a Cauchy sequence whenever we have:Ve > 0 3N E N Vn, m > N^x, -,) <(b) Let x E M. We say that {x i }i converges to x and call x the limit of {xi}i wheneverwe have:Ve > 0 3N E N Vn > N d(x, x„) < cSuch a sequence we call convergent. Notation: /imi_,„„xi = x.(c) The metric space (M, d) is complete whenever each Cauchy sequence converges to anelement of M.Let (M, dt) be a metric space and T, S be two subsets of that metric space.Definition C.4 A family .F of subsets of M is said to be a covering for S if S C U{FF E Y}, i.e. if S is contained in the union of all sets in Y.Definition C.5 A set T is said to be &dense in S (for E > 0) if for every s E S there ist E T such that d(s,t) < E.Definition C.6 A metric space S is totally bounded if for every E > 0 it is possible tocover S by a finite number of spheres with radius E.Definition C.7 Let T be a subset of a metric space S . Distance of a point x E S fromT is defined asd(x,T) = inf{d(x,y)ly E T}Appendix C. Theory of Metric Spaces - Mathematical Definitions^179Definition C.8 The Hausdorff distance dH [28], between two subsets S, T of a metricspace is defined as follows:dif (S,T) . max{sup{d(s, T) : s E S},sup{d(t,S): t E T}}Definition C.9 A sequence {x,i }n°°_ 1 is said to be eventually constant if there existsN E N such that xn = xn, Vn,m > N.Appendix DA Formal Description of ISO 8072in LOTOSprocess TConnection [t] :^exit(TCEP [t] (CallingRole) III TCEP [t] (CalledRole))II(TCEPAssociation [t]^f), exit )endproc (* TConnection *)process^TCEP [t] (role:TSUserRole)TCEPAddress [t]IITCEPIdentification [t]IITCEPSPOrdering [t] (role)endproc (* TCEP *)behaviourTConnections [t]IITCldentification [t]IITCAcceptance [t]IITBackpressure [t]whereprocess TConnections [t] :TConnection [t]IIITConnections [t]endproc (* TConnections *)noexit: exit180Appendix D. A Formal Description of ISO 8072 in LOTOS^ 181process^TCEPSPOrdering [t] (role:TSUserRole)^:exit^process TCEPConnecti[t](role:TSUserRole)^: exit (TSP)TCEPConnect1(role)^»accept tsp:TSP^in^[role=CallingRole]^-2. t?ta:TAddress ?tcei:TCEI ?tcr:TSP[IsTCONreq(tcr) and (ta IsCallingOf tcr)];exit^(tcr)D[role=CalledRole] -> t?ta:TAddress ?tcei:TCEI ?tci:TSP[IsTCONind(tci) and (ta IsCalled0f tci)];exit (Id)endproc (* TCEPConnect *)process^TCEPConnect2 [t] (tc1:TSP)^:exit (TEXOption)t?ta:Taddress ?tcei:TCEI ?tc2:TSP[tc2 IsValidTCON2For tot];( choice x:TEXOption^1] [x IsTEXOptionOf tc21^->exit (x) )endproc (` TCEPConnect2*)process^TCEPDataTransfer [t] (x:TEXOption)^:noexit^process TCEPRelease [t]:^exit:= :=TCEPNormalDataTransfer [t]^ t?ta:TAddress ?tcei:TCEI ?tsp:TSP [IsTDIS(tsp)];^exitIII[x = UseTEX] -> TCEPExpeditedDataTransfer [t]endproc (' TCEPDataTransfer')process TCEPNormalDataTransfer [t]:^noexitt?ta:TAddress ?tcei:TCEI ?tsp:TSP [IsTDT(tsp)];TCEPNormalDataTransfer [t]endproc (' TCEPNormalDataTransfer')endproc (*TCEPRelease *)(( TCEPConnect2N(tsp)^»accept x:TEXOption^inTCEPDataTransfer [1] (x)) [2. TCEPRelease [t])LI[role = CalledRole]^-> exitendproc (' TCEPSPOrdering')Appendix EA Specification of ST-II Protocol inEstelle-YOrigin AgentSpecification ST2ORIGIN;CONSTVARISPS00^ST2OSAP; /* STREAM OPEN */SCO^ST2OSAP;^/* STREAM CLOSE */SDO^ST2OSAP; /* STREAM DISCONNECT */AIO^ST2OSAP; /* APPLICATION INFORM MESSAGE */CNO recv_in SOO;^/* CONNECT */DSO recv_in SCO;^/* DISCONNECT */OSPPDU182Appendix E. A Specification of ST-II Protocol in Estelle-Y^ 183ACO sent_in AIO;^/* ACCEPT */RFO sent_in AIO;^/* REFUSE */EQO sent_in AIO;^/* ERROR IN REQUEST */ACO ST2OPDU;^/* ACCEPT */AKO ST2OPDU;^/* ACKNOWLEDGE */DTO ST2OPDU;^/* DATA */HAO ST2OPDU;^/* HID APPROVE */HRO ST2OPDU;^/* HID REJECT */HCO ST2OPDU;^/* HID CHANGE */TIMERSTATEidle, wfst, hidc, hida, cona, dtph, wfcl;INITIALIZATIONTO idleBEGINEND;TRANSfrom idleto^wfstwhen SOOOUTPUT CNO;TRANSfrom wfstto idlewhen RFOOUTPUT AKO,AIO;to idle184^ Appendix E. A Specification of ST-II Protocol in Estelle-Ywhen EQOOUTPUT AKO,AIO;to idlewhen SCO;to hidcwhen HROOUTPUT HCO;to hidawhen HAO;TRANSfrom^hidcto hidcwhen HROOUTPUT HCO;to hidawhen HAO;to idlewhen SCO;TRANSfrom^hidato^hidawhen^ACOOUTPUT AKO;to^dtphwhen^ACOOUTPUT AKO,AIO;to^hidawhen^RFOAppendix E. A Specification of ST-II Protocol in Estelle-Y^ 185OUTPUT AKO;to dtphwhen RFOOUTPUT AKO,AIO;to wfclwhen SCOOUTPUT DSO;TRANSfrom dtphto dtphwhen SDOOUTPUT DTO;to wfclwhen SCOOUTPUT DSO;to idlewhen RFOOUTPUT AKO,AIO;TRANSfrom^wfclto idlewhen AKO;end.186^ Appendix E. A Specification of ST-II Protocol in Estelle-YIntermediate AgentSpecification ST2INTERM;CONSTVARISPOSPAKH ST2DTSAP;^/* ACKNOWLEDGE (outgoing) */HAH ST2DTSAP;^/* HID APPROVE (outgoing) */PDUCNI ST2IPDU;CNH ST2IPDU;DSI ST2IPDU;DSH ST2IPDU;ACI ST2IPDU;ACH ST2IPDU;RFI ST2IPDU;RFH ST2IPDU;DTI ST2IPDU;DTH ST2IPDU;AKI ST2IPDU;HAI ST2IPDU;HRI ST2IPDU;HRH ST2IPDU;HCI ST2IPDU;HCH ST2IPDU;/* CONNECT (incoming) *//* CONNECT (outgoing) *//* DISCONNECT (incoming) *//* DISCONNECT (outgoing) *//* ACCEPT (incomming) *//* ACCEPT (outgoing) *//* REFUSE (incoming) *//* REFUSE (outgoing) *//* DATA (incoming) *//* DATA (outgoing *//* ACKNOWLEDGE (incoming) *//* HID APPROVE (incoming) *//* HID REJECT (incoming) *//* HID REJECT (outgoing) *//* HID CHANGE (incoming) *//* HID CHANGE (outgoing) */TIMERAppendix E. A Specification of ST-II Protocol in Estelle-Y^ 187STATEidle, wfst, hidc, hida, hidn, dtph, wfac, wfcl;INITIALIZATIONTO idleBEGINEND;TRANSfrom idleto hidawhen CNIOUTPUT CNH,HAH;to hidcwhen CNIOUTPUT HRH;to idlewhen CNIoutput RFH;TRANSfrom hidato hidnwhen HRIoutput HCH;to wfstwhen HAI;to wfclwhen RFIoutput RFH, AKH;TRANS188^ Appendix E. A Specification of ST-II Protocol in Estelle-Yfrom hidnto hidnwhen HRIoutput HCH;to wfstwhen HAI;TRANSfrom^hidcto hidcwhen HCIOUTPUT HRH;to hidawhen HCIOUTPUT HAH,CNH;to wfclwhen HCIOUTPUT RFH;to idlewhen DSIOUTPUT AKH;TRANSfrom^wfstto^wfacwhen^ACIOUTPUT AKH,ACH;to wfclwhen RFIOUTPUT RFH, AKH;Appendix E. A Specification of ST-II Protocol in Estelle-Y^ 189to wfclwhen DSIOUTPUT DSH, AKH;TRANSfrom^wfacto dtphwhen AKI;TRANSfrom dtphto^wfclwhen RFIOUTPUT RFH, AKH;to wfclwhen DSIOUTPUT DSH, AKH;to dtphwhen DTIOUTPUT DTH;TRANSfrom^wfclto idlewhen AKI;end.190^ Appendix E. A Specification of ST-II Protocol in Estelle-YTarget AgentSpecification ST2TARGET;CONSTVARISPSAT^ST2TSAP;^/* STREAM ACCEPT */SRT^ST2TSAP;^/* STREAM REFUSE */SIT^ST2TSAP;^/* STREAM OPEN INFORM */AIT^ST2TSAP;^/* APPLICATION INFORM MESSAGE */DIT^ST2TSAP;^/* DATA INFORM */OSPPDUCNT ST2TPDU;DST ST2TPDU;ACT ST2TPDU;RFT ST2TPDU;EQT ST2TPDU;AKT ST2TPDU;HAT ST2TPDU;HRT ST2TPDU;HCT ST2TPDU;DTT ST2TPDU;TIMERSTATE/* CONNECT *//* DISCONNECT *//* ACCEPT *//* REFUSE *//* ERROR IN REQUEST *//* ACKNOWLEDGE *//* HID APPROVE *//* HID REJECT *//* HID CHANGE *//* DATA */idle, wfst, hidc, hida, cona, dtph, wfcl;Appendix E. A Specification of ST-II Protocol in Estelle-Y^ 191INITIALIZATIONTO idleBEGINEND;TRANSfrom idleto hidawhen^CNTOUTPUT HAT,SIT;to^hidcwhen CNTOUTPUT HRT;to idlewhen CNTOUTPUT RFT;TRANSfrom^hidcto hidcwhen HCTOUTPUT HRT;to hidawhen HCTOUTPUT HAT,SIT;to idle192^ Appendix E. A Specification of ST-II Protocol in Estelle-Ywhen DST;TRANSfrom^hidato wfstwhen^SATOUTPUT ACT;to^wfstwhen^SRTOUTPUT RFT;to idlewhen DSTOUTPUT AKT,AIT;TRANSfrom wfstto idlewhen DSTOUTPUT AKT,AIT;to idlewhen EQTOUTPUT AIT;to wfstwhen SATOUTPUT ACT;to wfstwhen SRTOUTPUT RFT;to wfstwhen AKT;Appendix E. A Specification of ST-II Protocol in Estelle-Y^ 193to dtphwhen AKT;TRANSfrom^dtphto^dtphwhen DTTOUTPUT DIT;to idlewhen DSTOUTPUT AIT, AKT;to wfclwhen SRTOUTPUT RFT;TRANSfrom wfclto idlewhen AKT;end.Appendix FTest Files and Sample ResultsF-1 Seed FilesOriginoriginseedsSOD RFOSOD EQOSOO SCOSOO HRO HAO ACO SDO SDO SCO AKOSOO HRO HAO ACO SDO SDO RFOSOO HRO HAO ACO SDO SCO AKOSOD HRO HAO ACO SDO RFOSOD HRO HAD ACO SCO AKOSOD HRO HAO ACO RFOSOO HRO HAO SCO AKOSOO HRO SCO194F-1. SEED FILES^ 195SOO HAO ACO SDO SDO RFOSOO HAO ACO SDO SCO AKOSOO HAO ACO SDO RFOSOO HAO ACO SCO AKOSOO HAO ACO RFOSOO HAO SCO AKOoriginseeds.atomicSOO CNO RFO AKO AIOSOO CNO EQO AKO AIOSOO CNO SCOSOO CNO HRO HCO HAO ACO AKO AIO SDO DTO SDO DTO SCO DSO AKOSOO CNO HRO HCO HAO ACO AKO AIO SDO DTO SDO DTO RFO AKO AIOSOD CNO HRO HCO HAO ACO AKO AIO SDO DTO SCO DSO AKOSOO CNO HRO HCO HAO ACO AKO AIO SDO DTO RFO AKO AIOSOO CNO HRO HCO HAO ACO AKO AIO SCO DSO AKOSOO CNO HRO HCO HAO ACO AKO AIO RFO AKO AIOSOO CNO HRO HCO HAO SCO DSO AKOSOO CNO HRO HCO SCOSOO CNO HAO ACO AKO AIO SDO DTO SDO DTO RFO AKO AIOSOO CNO HAO ACO AKO AIO SDO DTO SCO DSO AKOSOO CNO HAO ACO AKO AIO SDO DTO RFO AKO AIOSOO CNO HAO ACO AKO AIO SCO DSO AKOSOO CNO HAO ACO AKO AIO RFO AKO AIOSOO CNO HAO SCO DSO AKO196^ Appendix F. Test Files and Sample ResultsIntermediateintermseedsCNI HRI HAI ACI AKI RFI AKICNI HRI HAI ACI AKI DTI RFI AKICNI HRI HAI ACI AKI DTI DSI AKICNI HRI HAI RFI AKICNI HAI ACI AKI DTI RFI AKICNI HAI ACI AKI DTI DSI AKICNI HAI RFI AKICNI RFI AKICNI HCI HRI HAI ACI AKI DTI RFI AKICNI HCI HRI HAI ACI AKI DTI DSI AKICNI HCI HRI HAI RFI AKICNI HCI HAI ACI AKI DTI RFI AKICNI HCI HAI ACI AKI DTI DSI AKICNI HCI HAI RFI AKICNI HCI RFI AKICNI HCI AKICNI DSICNICNI HRI HAI DSI AKICNI HAI DSI AKICNI HCI HRI HAI DSI AKICNI HCI HAI DSI AKIF-1. SEED FILES^197int ermseeds.at omicCNI CNH HAH HRI HCH HAI ACI ACH AKH AKI RFI RFH AKH AKICNI CNH HAH HRI HCH HAI ACI ACH AKH AKI DTI DTH RFI RFH AKH AKICNI CNH HAH HRI HCH HAI ACI ACH AKH AKI DTI DTH DSI DSH AKH AKICNI CNH HAH HRI HCH HAI RFI RFH AKH AKICNI CNH HAH HAI ACI ACH AKH AKI DTI DTH RFI RFH AKH AKICNI CNH HAH HAI ACI ACH AKH AKI DTI DTH DSI DSH AKH AKICNI CNH HAH HAI RFI RFH AKH AKICNI CNH HAH RFI RFH AKH AKICNI HRH HCI CNH HAH HRI HCH HAI ACI ACH AKH AKI DTI DTH RFI RFH AKH AKICNI HRH HCI CNH HAH HRI HCH HAI ACI ACH AKH AKI DTI DTH DSI DSH AKH AKICNI HRH HCI CNH HAH HRI HCH HAI RFI RFH AKH AKICNI HRH HCI CNH HAH HAI ACI ACH AKH AKI DTI DTH RFI RFH AKH AKICNI HRH HCI CNH HAH HAI ACI ACH AKH AKI DTI DTH DSI DSH AKH AKICNI HRH HCI CNH HAH HAI RFI RFH AKH AKICNI HRH HCI CNH HAH RFI RFH AKH AKICNI HRH HCI RFH AKICNI HRH DSI AKHCNI RFHCNI CNH HAH HRI HCH HAI DSI DSH AKH AKICNI CNH HAH HAI DSI DSH AKH AKICNI HRH HCI CNH HAH HRI HCH HAI DSI DSH AKH AKICNI HRH HCI CNH HAH HAI DSI DSH AKH AKI198^ Appendix F. Test Files and Sample ResultsTargett arget seedsCNT DSTCNT SAT AKT DSTCNT SAT AKT DTT DSTCNT SAT AKT DTT SRT AKTCNT SAT AKT EQTCNT SAT AKT SRT AKTCNT SAT DSTCNT SAT EQTCNT SAT SRT AKT DSTCNT SAT SRT AKT DTT DSTCNT SAT SRT DSTCNT SAT SRT EQTCNT DSTCNT HCT DSTCNT HCT SAT AKT DSTCNT HCT SAT AKT DTT DSTCNT HCT SAT AKT DTT SRT AKTCNT HCT SAT AKT EQTCNT HCT SAT AKT SRT AKTCNT HCT SAT DSTCNT HCT SAT EQTCNT HCT SAT SRT AKT DSTCNT HCT SAT SRT DSTCNT HCT SAT SRT EQTCNT HCT DSTF-1. SEED FILES^ 199CNT HCT HCT DSTCNT HCT HCT SAT AKT DSTCNT HCT HCT SAT AKT DTT DSTCNT HCT HCT SAT AKT DTT SRTCNT HCT HCT SAT AKT EQTCNT HCT HCT SAT ACT AKT SRT RFT AKTCNT HCT HCT SAT DSTCNT HCT HCT SAT EQTCNT HCT HCT SAT SRT AKT DSTCNT HCT HCT SAT SRT AKT DTT DSTCNT HCT HCT SAT SRT DSTCNT HCT HCT SAT SRT EQTCNT RFTtargetseeds.atomicCNT HAT SIT DST AKT AITCNT HAT SIT SAT ACT AKT DST AKT AITCNT HAT SIT SAT ACT AKT DTT DIT DST AKT AITCNT HAT SIT SAT ACT AKT DTT DIT SRT RFT AKTCNT HAT SIT SAT ACT AKT EQT AITCNT HAT SIT SAT ACT AKT SRT RFT AKTCNT HAT SIT SAT ACT DST AKT AITCNT HAT SIT SAT ACT EQT AITCNT HAT SIT SAT ACT SRT RFT AKT DST AKT DITCNT HAT SIT SAT ACT SRT RFT AKT DTT DIT DST AKT AITCNT HAT SIT SAT ACT SRT RFT DST AKT AITCNT HAT SIT SAT ACT SRT RFT EQT AITCNT HRT DST200^ Appendix F. Test Files and Sample ResultsCNT HRT HCT HAT SIT DST AKT AITCNT HRT HCT HAT SIT SAT ACT AKT DST AKT AITCNT HRT HCT HAT SIT SAT ACT AKT DTT DIT DST AKT AITCNT HRT HCT HAT SIT SAT ACT AKT DTT DIT SRT RFT AKTCNT HRT HCT HAT SIT SAT ACT AKT EQT AITCNT HRT HCT HAT SIT SAT ACT AKT SRT RFT AKTCNT HRT HCT HAT SIT SAT ACT DST AKT AITCNT HRT HCT HAT SIT SAT ACT EQT AITCNT HRT HCT HAT SIT SAT ACT SRT RFT AKT DST AKT AITCNT HRT HCT HAT SIT SAT ACT SRT RFT DST AKT AITCNT HRT HCT HAT SIT SAT ACT SRT RFT EQT AITCNT HRT HCT HRT DSTCNT HRT HCT HRT HCT HAT SIT DST AKT AITCNT HRT HCT HRT HCT HAT SIT SAT ACT AKT DST AKT AITCNT HRT HCT HRT HCT HAT SIT SAT ACT AKT DTT DIT DST AKT AITCNT HRT HCT HRT HCT HAT SIT SAT ACT AKT DTT DIT SRT RFT AKTCNT HRT HCT HRT HCT HAT SIT SAT ACT AKT EQT AITCNT HRT HCT HRT HCT HAT SIT SAT ACT AKT SRT RFT AKTCNT HRT HCT HRT HCT HAT SIT SAT ACT DST AKT AITCNT HRT HCT HRT HCT HAT SIT SAT ACT EQT AITCNT HRT HCT HRT HCT HAT SIT SAT ACT SRT RFT AKT DST AKT AITCNT HRT HCT HRT HCT HAT SIT SAT ACT SRT RFT AKT DTT DIT DST AKT AITCNT HRT HCT HRT HCT HAT SIT SAT ACT SRT RFT DST AKT AITCNT HRT HCT HRT HCT HAT SIT SAT ACT SRT RFT EQT AITCNT RFTF-2. EXPERIMENT 1.1^ 201F-2 Experiment 1.1e e • • e .........seloriginseedsseloriginseeds 6.56(S00, 1) (HRO, 1) (HAO, 1) (ACO, 1) (SDO, 2) (SCO, 1) (AKO, 1)seloriginseeds 5.90(S00, 1) (HRO, 1) (HAO, 1) (ACO, 1) (SDO, 2) (SCO, 1) (AKO, 1)(S00, 1) (RFO, 1)seloriginseeds 4.78(S00, 1) (HRO, 1) (HAO, 1) (ACO, 1) (SDO, 2) (SCO, 1) (AKO, 1)(S00, 1) (RFO, 1)(500, 1) (HAO, 1) (ACO, 1) (SDO, 1) (SCO, 1) (AKO, 1)seloriginseeds 3.87(500, 1) (HRO, 1) (HAO, 1) (ACO, 1) (SDO, 2) (SCO, 1) (AKO, 1)(S00, 1) (RFO, 1)(500, 1) (HAO, 1) (ACO, 1) (SDO, 1) (SCO, 1) (AKO, 1)(500, 1) (HRO, 1) (HAO, 1) (SCO, 1) (AKO, 1)202^ Appendix F. Test Files and Sample Resultsseloriginseeds 2.82(500,^1)^(HRO,^1)^(HAO, 1) (ACO, 1) (SDO, 2) (SCO, 1) (AKO, 1)(500,^1)^(RFO,^1)(500,^1)^(HAO,^1)^(ACO, 1) (SDO, 1) (SCO, 1) (AKO, 1)(500, 1)^(HRO,^1)^(HAO, 1) (SCO, 1) (AKO, 1)(500,^1)^(HRO, 1)^(HAO, 1) (ACO, 1) (SCO, 1) (AKO, 1)(500,^1)^(HAO, 1)^(ACO,seloriginseeds 1.851) (RFO, 1)(500,^1)^(HRO,^1)^(HAO, 1) (ACO, 1) (SDO, 2) (SCO, 1) (AKO, 1)(500,^1)^(RFO,^1)(500,^1)^(HAO,^1)^(ACO, 1) (SDO, 1) (SCO, 1) (AKO, 1)(SOO,^1)^(HRO,^1)^(HAO, 1) (SCO, 1) (AKO, 1)(500,^1)^(HRO, 1)^(HAO, 1) (ACO, 1) (SCO, 1) (AKO, 1)(500,^1)^(HAO,^1)^(ACO, 1) (RFO, 1)(S00,^1)^(HRO,^1)^(HAO, 1) (ACO, 1) (SDO, 2) (RFO, 1)(500,^1)^(HRO,^1)^(HAO, 1) (ACO, 1) (RFO, 1)(500,^1)^(HRO,^1)^(SCO, 1)(500,^1)^(HAO, 1)^(ACO, 1) (SDO, 2) (RFO, 1)(500,^1)^(HAO,^1)^(ACO, 1) (SCO, 1) (AKO, 1)(500,^1)^(HAO,^1)^(SCO,seloriginseeds 0.981) (AKO, 1)(S00 ,^1)^(HRO,^1)^(HAO, 1) (ACO, 1) (SDO, 2) (SCO, 1) (AKO, 1)(SOO,^1)^(RFO,^1)F-3. EXPERIMENT 4(S00, 1) (HAO, 1) (ACO, 1) (SDO, 1) (SCO, 1) (AKO, 1)(500, 1) (HRO, 1) (HAO, 1) (SCO, 1) (AKO, 1)(500, 1) (HRO, 1) (HAO, 1) (ACO, 1) (SCO, 1) (AKO, 1)(500, 1) (HAO, 1) (ACO, 1) (RFO, 1)(SOO, 1) (HRO, 1) (HAO, 1) (ACO, 1) (SDO, 2) (RFO, 1)(S00, 1) (HRO, 1) (HAO, 1) (ACO, 1) (RFO, 1)(500, 1) (HRO, 1) (SCO, 1)(500, 1) (HAO, 1) (ACO, 1) (SDO, 2) (RFO, 1)(S00, 1) (HAO, 1) (ACO, 1) (SCO, 1) (AKO, 1)(500, 1) (HAO, 1) (SCO, 1) (AKO, 1)(S00, 1) (EQ0, 1)(500, 1) (SCO, 1)F-3 Experiment 4An excerpt from a randomly generated test suiteSOD CNO HAO ACO ACO ACO AKO AIO SDO DTO RFO SOO CNO HRO HCO HAO ACO ACOACO AKO AKO SOO CNO HRO HCO HAO AIO SCO ACO ACO ACO AKO AIO RFO AKO AIODSO AKO AIOSOD CNO HAO RFO ACO AKO AIO SDO DTO SOD CNO HRO HCO HAO ACO AKO SCO AIOSDO DTO SCO DSO AKO DSO AKOSOO SOO CNO HRO HCO HAO ACO CNO HRO HCO HAO ACO AKO RFO RFO ACO AKO AIOAIO RFO AKO AIO HRO HCO HAO ACO AKO SCO AIO SDO DTO SCO DSO SOD CNO HROHCO HAO ACO RFO AKO AIO AKO ACO ACO ACO AKO AIO SCO DSO AKO DSO AKOSOO CNO HRO HCO HAO ACO SOO CNO HRO HCO HAO ACO ACO ACO ACO AKO AIO SDODTO RFO ACO AKO AIO RFO AKO AIO AKO AIO203204^ Appendix F. Test Files and Sample ResultsSOO CNO HRO HCO SCO ACO ACO SOO CNO HAO SCO DSO ACO ACO AKO AIO SDO DTORFO AKO AIOSOO CNO HRO SOO CNO HRO HCO HAO ACO AKO HCO AIO SDO DTO SDO DTO RFO AKOAIO AIOSOO CNO HAO ACO AKO AIO SDO DTO SDO DTO RFO AKO AIO AKO AIO AIO SOO CNOHRO HCO HAO ACO AKO AIO RFO AKO AIOS00 CNO HAO ACO ACO AKO AIO RFO AKO AIO SOO CNO HRO HCO HAO SOO CNO HROHCO HAO SCO DSO ACO ACO ACO AKO AIO SDO DTO RFO AKO AIOSOO CNO HAO ACO ACO ACO AKO AIO SDO DTO SDO DTO SOD CNO HAO SCO DSO RFOAKO AIOSOO CNO HRO HCO HAO ACO ACO SOO CNO HAO SCO DSO AKO AKO AIO SDO DTO SDODTO SOO CNO HAO SCO DSO RFO AKO AIO SOO CNO HAO ACO ACO ACO ACO AKO AIOSDO DTO ACO ACO ACO RFO AKO AIO AKO AIO RFO AKO AIOS00 CNO HRO HCO HAO ACO ACO ACO AKO AIO RFO AKO AIO SOO CNO HRO HCO HAOACO AKO AIO RFO AKO AIOSOO CNO HRO HCO HAO ACO ACO ACO ACO ACO AKO AIO SCO DSO AKO SOO CNO HAOACO ACO ACO ACO AKO AIO SDO DTO SDO DTO SOO CNO HAO ACO ACO ACO ACO AKOAIO SCO DSO AKO RFO AKO AIOAn excerpt from a test suite generated by systematic backtrackingS00 CNO HRO HCO HRO HCO HAO ACO AKO SCO DSO AKOSOO CNO HRO HCO HRO HCO HAO ACO AKO SCO DSO AKO SOO CNO SCOSOO CNO HRO HCO HRO HCO HAO ACO AKO AIO SDO DTO SCO DSO AKOSOO CNO HRO HCO HRO HCO HAO ACO AKO AIO SDO DTO SCO DSO AKO SOO CNO SCOSOO CNO HRO HCO HRO HCO HAO ACO AKO AIO SDO DTO RFO AKO AIOSOO CNO HRO HCO HRO HCO HAO ACO AKO AIO SDO DTO RFO AKO AIO SOO CNO SCOSOO CNO HRO HCO HRO HCO HAO ACO AKO AIO SCO DSO AKOSOO CNO HRO HCO HRO HCO HAO ACO AKO AIO RFO AKO AIOF-3. EXPERIMENT 4^ 205SOO CNO HRO HCO HRO HCO HAO ACO AKO AIO RFO AKO AIO SOO CNO SCOSOO CNO HRO HCO HRO HCO HAO RFO AKO SCO DSO AKOSOO CNO HRO HCO HRO HCO HAO RFO AKO SCO DSO AKO SOD CNO SCOSOO CNO HRO HCO HRO HCO HAO RFO AKO AIO SDO DTO SCO DSO AKO
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- A metric-based theory of test selection and coverage...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
A metric-based theory of test selection and coverage for communication protocols Alilovic-Curgus, Jadranka 1993
pdf
Page Metadata
Item Metadata
Title | A metric-based theory of test selection and coverage for communication protocols |
Creator |
Alilovic-Curgus, Jadranka |
Date Issued | 1993 |
Description | A metric-based theory is developed that gives a solution to the problem of test selection and coverage evaluation for the control behaviour of network protocol implementations. The key idea is that a fast, completely automated process can uniformly cover the execution subspace of a network protocol control behaviour when characterized by appropriate metric functions, each concerned with some aspect of the protocol behaviour. Efficient systematic approximation of complex systems behaviours is a crucial problem in software testing. This thesis gives a theoretically sound and completely automated solution to the approximation problem for the control behaviour space of network protocols generated by many concurrent and highly recursive network connections. This objective is accomplished in a series of steps. First, a metric-based theory is developed which introduces a rigorous mathematical treatment of the discipline of testing, through the definition of testing distance, test coverage metrics, and metric-based test selection method. It involves a metric characterization of infinite trace sets of protocol behaviour within complete and totally bounded metric spaces, and captures approximations of different patterns of system behaviour due to recursion and parallelism. It is shown that classes of fault coverages of well known protocol test methods form a metric hierarchy within these metric spaces. Next, a general mathematical framework is developed for reasoning about the interoperability of communicating systems. An interoperability relation is obtained which gives a theoretical upper bound for the test selection process. The two threads are drawn together in a specific test selection algorithm, showing that the generation of arbitrarily dense sets of test sequences that approximate some original test suite to some target accuracy within the theoretical upper bound, is a convergent process. Finally, the theory itself is tested on the examples of a multi-media network protocol. It is shown that very high densities of selected sets and coverage calculation can be achieved within reasonable time limits in a completely automated manner. These results indicate that there is no practical impediment to applying rigorous theoretical treatment to the discipline of testing for the case of systems that derive their complexity from highly concurrent and recursive subprocesses. |
Extent | 8519582 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
FileFormat | application/pdf |
Language | eng |
Date Available | 2008-09-11 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0051361 |
URI | http://hdl.handle.net/2429/1871 |
Degree |
Doctor of Philosophy - PhD |
Program |
Computer Science |
Affiliation |
Science, Faculty of Computer Science, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 1993-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |
Download
- Media
- 831-ubc_1993_fall_phd_alilovic_curgus_jadranka.pdf [ 8.12MB ]
- Metadata
- JSON: 831-1.0051361.json
- JSON-LD: 831-1.0051361-ld.json
- RDF/XML (Pretty): 831-1.0051361-rdf.xml
- RDF/JSON: 831-1.0051361-rdf.json
- Turtle: 831-1.0051361-turtle.txt
- N-Triples: 831-1.0051361-rdf-ntriples.txt
- Original Record: 831-1.0051361-source.json
- Full Text
- 831-1.0051361-fulltext.txt
- Citation
- 831-1.0051361.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0051361/manifest