Reasoning Tools to Support Systems Analysis and Design

by

James Daniel Paulson

B.Sc.(Eng.), The University of Saskatchewan, 1977
M.Sc.(Physics), The University of Saskatchewan, 1981
B.Ed., The University of Saskatchewan, 1982

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE STUDIES (Faculty of Commerce)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
February, 1989
© James Daniel Paulson, 1989

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

The University of British Columbia
1956 Main Mall
Vancouver, Canada V6T 1Y3

Abstract

Some parts of the systems analysis and design process are not well structured and rely heavily on human judgement and experience. This is particularly true for decomposition and the validation of system specifications. Decomposition has long been considered a fundamental part of systems analysis and design. However, ensuring that a decomposition is optimal is nearly impossible. Ensuring that a system specification is complete and consistent is an inherently difficult task. Most existing systems analysis and design methodologies allow only the use of techniques such as code walk-throughs and post-implementation testing. Analysis errors discovered at such late stages can be quite expensive to correct.
Existing methodologies cannot support automated completeness and consistency testing because they lack the degree of formalism required to allow automation.

The primary objective of this research was to increase understanding of system decomposition. To aid in achieving this objective, a formalism for representing a system specification and a set of computer-based specifications analysis tools were developed. The tools support decomposition and provide completeness and consistency testing of a system specification.

An existing system modelling formalism was extended to provide the basis for the specification formalism. This extended formalism will allow an analyst to describe a system with the degree of precision necessary for automated testing and decomposition.

The ability to create a complete and consistent system model facilitated the development of a general theory of system decomposition. A system model created using the specifications analysis tools can be analyzed using a decomposition algorithm based on this theory. The algorithm incorporates a number of commonsense software design rules and decomposition heuristics drawn from the literature, and has been included in the specifications analysis tools.

Experience has shown that the specifications analysis tools may suggest system decompositions not previously considered by the analyst. Alternative decompositions may arise in two situations:

1. The system has a valid alternative structure which may not have been considered by the analyst. This alternative structure may be superior to the original structure envisioned by the analyst when the system model was constructed.

2.
The system specification does not contain enough information to rule out certain unreasonable decompositions. The missing information should be explicitly included in the specification to avoid problems of interpretation later in the system development life cycle.

Analysis of several test systems (including the IFIP Working Conference system often used as a standard problem in the systems analysis literature) using the specifications analysis tools has proven the feasibility of automated consistency and completeness testing and decomposition.

Further research is required in two areas:

1. Enhancement of the specifications analysis tools. The tools are not user friendly. An analyst will require extensive training to use them effectively. As well, the computational speed of the tools must be improved. Automated decomposition is too slow to allow easy interaction between the analyst and the tools.

2. A hierarchical analysis technique must be developed to support application of the specification formalism and the theory of decomposition to larger systems.

Table of Contents

Abstract
Table of Contents
Acknowledgements
Chapter 1: Introduction
1.1. General
1.2. Background
1.2.1. System Theory
1.2.2. Computer Software
1.2.2.1. Characteristics of Good Decomposition
1.2.2.1.1. Myers
1.2.2.1.2. Parnas
1.2.2.1.3. Yourdon and Constantine
1.2.2.1.4. Cluster Analysis
1.2.2.2. Decomposition Techniques
1.2.2.2.1. Structured Design Decomposition Techniques
1.2.2.2.2. HOS Decomposition
1.2.2.2.3. Formal Models of Computer Programming
1.3. Conclusions
Chapter 2: System Modelling
2.1. Introduction
2.2. The Formalism
2.2.1. General
2.2.2.
Systems
2.2.3. An Intuitive Beginning
2.2.4. Definitions
2.2.4.1. The Basics
2.2.4.2. Completeness, Consistency and Correctness of Sublaws
2.2.4.2.1. Conceptual Definitions for Completeness, Consistency and Correctness
2.2.4.2.2. Operational Definitions of Completeness and Consistency
2.2.5. A Simple Example
2.3. Implementation: The Specifications Analysis Tools
2.3.1. Entering a System Model
2.3.1.1. State Variables
2.3.1.2. Values
2.3.1.3. Sublaws
2.3.1.3.1. Stability Conditions
2.3.1.3.2. Corrective Actions
2.3.1.4. External Events
2.3.2. A More Extended Example
2.4. Conclusions
Chapter 3: A Theory of Decomposition
3.1. General
3.2. The Decomposition Formalism
3.3. Decomposition Syntax
3.4. Limiting the Search Space
3.4.1. General
3.4.2. Heuristics and Theorems
3.4.2.1. Subsystems should have outputs
3.4.2.2. Subsystems should be small
3.4.2.3. Subsystems should show emergence
3.4.2.4. Subsystems should not show redundant dependencies
3.4.2.5. Bad Subsystems
3.4.3. Relationship to the Heuristics of Simon and Ando
3.5. Automation of Decomposition
3.5.1. An Algorithm for Decomposition
3.5.2. A Simple Example
3.5.3. Importance of the External Event Space
3.5.4. Decomposition of the Payroll System
3.5.5. Intermediate State Variables
3.6. Conclusions
Chapter 4: System Complexity, Maintenance, and Goals
4.1. General
4.2. Complexity
4.2.1. Variety
4.2.2. Modified Variety
4.2.3. Entropy
4.2.4. Computational Work
4.2.5. States or State Variables?
4.3. Heuristic Guided Search
4.4. Maintenance
4.4.1. Changes to Sublaws
4.4.2. Changes to External Events
4.4.3. Implications for Design
4.5. The System Goal
4.6.
Conclusions
Chapter 5: Conditional Decomposition
5.1. Introduction
5.2. Conditional Decomposition Basics
5.3. Heuristics
5.4. Using Conditional Decomposition to Test a Model
5.5. Conclusions
Chapter 6: SELMA Applied
6.1. General
6.2. Applying SELMA
6.2.1. State Variable Identification
6.2.2. External Event Identification
6.2.3. Sublaw Identification
6.2.4. Consistency and completeness testing
6.2.5. Decomposition
6.2.5.1. Parallel/Sequential Decomposition
6.2.5.2. Conditional Decomposition
6.3. Jackson System Development (JSD)
6.4. Active and Passive Component Modelling (ACM/PCM)
6.5. Conclusions
Chapter 7: Conclusions and Future Research
7.1. Introduction
7.2. Definition of some key modelling concepts
7.2.1. Coupling
7.2.2. Cohesion
7.2.3. System, Statics and Dynamics
7.3. Conclusions
7.4. Future Research
7.4.1. Enhancement of the Specifications Analysis Tools
7.4.2. Additional Applications of SELMA
7.4.3. Extensions to the Theory of Decomposition
References
Appendix A: The Parable of Hora and Tempus
Appendix B: Myers' Taxonomy of Coupling and Cohesion
Appendix C: The Decomposition Rules of Mili, Desharnais and Gagne
Appendix D: System Specification Testing
Appendix E: The Stable State Space and System Law
Appendix F: A Simple "Batch" Payroll System
Appendix G: A Simple "Interactive" Payroll System
Appendix H: Decomposition of the Four-Lights System
Appendix I: Possible Decompositions for the "Batch" Payroll System
Appendix J: The Modified Payroll System
Appendix K: Decompositions of the Modified Payroll System
Appendix L: A Schematic Diagram of the Four-Lights System
Appendix M: Modified Variety and Two Independent Subsystems
Appendix N: The Combined Payroll System Model
Appendix O: Calculation of Total Pay in the "Combined" Payroll System
Appendix P: The IFIP Working Conference Case
Appendix Q: State Variable Identification for the IFIP Working Conference Problem
Appendix R: External Event Identification for the IFIP Working Conference Problem
Appendix S: Sublaw Identification for the IFIP Working Conference System Problem
Appendix T: The IFIP Working Conference System Model
Appendix U: Functional Forms of Some IFIP Working Conference Subsystems

Acknowledgements

I would like to thank all the members of my dissertation committee for their advice and support during the preparation of this thesis. In particular, I would like to express sincere gratitude to my research supervisor, Dr. Yair Wand, for his patience and countless hours of valuable discussion, and Dr. Richard Mattessich for his many helpful criticisms and suggestions with respect to the ontological foundations of this research. I would also like to thank my wife, Kim, for her endless encouragement and tolerance during a tremendously stressful period in both of our lives.

Chapter 1: Introduction

1.1. General

The notion of decomposition is central to most methodologies for systems analysis and design. For example, Structured Analysis (DeMarco, 1978), Warnier-Orr Diagrams (Warnier, 1974; Orr, 1977), JSD[1] (Jackson, 1983), and HOS[2] (Hamilton and Zeldin, 1976) all require the analyst to construct a hierarchical structure for a proposed computer-based system.
Courtois (1985) notes the importance of decomposition:

"Decomposition has long been recognized as a powerful tool for the analysis of large and complex systems. The technique of decomposing a system, studying the components, and then studying the interactions of those components has been successfully used in many areas of engineering and science."

Despite this, there exists no theory to guide the process of system decomposition. Decomposition has always been considered a heuristic activity.

An important objective of the research is to increase our understanding of system decomposition. In order to achieve this objective a formalism has been developed for representing an information system based on the states and laws system model developed by Wand and Weber (1988, 1989). Not only does this formalism provide a basis for the development of a theory of system decomposition, but it includes operational definitions for completeness and consistency of system specifications.

Tangible results of the research include a set of Prolog-based specification analysis tools. These tools can support formal description of a system model and suggest possible decompositions for that system. The decompositions will reveal the inherent structure of a system specification and can be used to identify deficiencies in the model[3].

[1] Jackson System Development
[2] Higher Order Software
[3] In this research, a system model is considered to be part of a full system specification. System models give the parts of the real world to be represented in the implemented information system and their relationships.

It is generally assumed that the structure and behaviour of an implemented system should closely mirror that of the specification.
In the Structured Design literature, Myers (1978) states "...the program structure should closely model the problem structure." (p. 73) JSD (Jackson, 1983, p. 5) defines system modelling to consist of two activities:

a. first, making an abstract description of the real world, and
b. second, making a realization, in the computer, of that abstract description.

Therefore, it seems reasonable to assume that the structure of the specification can provide a basis for the further design of a system implementation. A suggested decomposition will be "good" in the sense that it will support modular construction and restrict the effects of system maintenance to easily identifiable segments of the overall system. Since the decompositions suggested by the tools describe the structure of the specification, they will aid in both the understanding of a complex real system and in system design.

The contributions of this research include the following:

a. Integration of existing decomposition heuristics.
b. Development of a theory-based modelling technique for system specification and decomposition.
c. Development and implementation of a set of computerized tools for constructing a complete and consistent model of a system.
d. A formal theory of system decomposition.
e. Development and implementation of an algorithm capable of decomposing a system without human participation.

[3, continued] System specifications may also include requirements such as minimum response times, required throughput, cost, etc.

This chapter presents the results of a literature search for material related to decomposition in the fields of general system theory and computer science. The second chapter presents a system modelling formalism suitable for
use as the basis for a general theory of system decomposition. This formalism is an extension of the works of Bunge (1977, 1979) and Wand and Weber (1988, 1989), and supports automated testing of completeness and consistency. The theory of decomposition is developed in Chapter 3 and later extended in Chapter 5. Chapter 4 includes a discussion of various measures of system complexity. One of these measures is shown to be useful for discriminating between competing decompositions of a given system, and is adopted for use in this research. To illustrate their usefulness, the system modelling formalism and techniques stemming from the theory of decomposition are applied to the IFIP Working Conference problem (Olle, 1982, pp. 8-9) in Chapter 6. Definitions for some terms of general importance to systems analysis and design are suggested in Chapter 7, prior to a presentation of the conclusions reached in this research.

The next section describes previous research on system decomposition. Ideas from both system theory and software design are considered. The decomposition rules of Myers (1978), Yourdon and Constantine (1979), Hamilton and Zeldin (1976), and Mili et al. (1986) are described in detail.

1.2. Background

1.2.1. System Theory

Outside the systems analysis and design literature, the concept of system decomposition is viewed from two different, but complementary, perspectives. Simon and Ando (1961) and Courtois (1985) suggest the use of decomposition as an aid in analyzing an existing system. Alexander (1967) and Simon (1981) argue that a suitable decomposition could provide the basis for design of a new system.

From a systems analysis perspective, Simon and Ando consider the aggregation of variables in dynamic systems where short-run dynamics are separable from long-run dynamics.
Their necessary criteria for a system to be decomposable are as follows (from Courtois, 1985):

a. In a short-term period, as a result of stronger internal bonds, subsystems tend to reach an internal equilibrium "approximately" independently of one another.
b. In a long-term period, when a whole structure evolves toward a global equilibrium state under the influence of weak interactions among subsystems, the internal equilibriums reached at the end of the short-term period are approximately maintained in relative value.

Simon and Ando illustrate these rules by considering thermal equilibrium in an office building (p. 117). The building is divided into a number of rooms. The rooms are separated from each other by walls which are good, but not perfect, insulators. The rooms are further divided into offices by poorly insulating partitions. Suppose that initially there is a large variation in the temperatures of the offices. After a relatively short time, the temperatures of each office in a particular room will be approximately equal. After a much longer time, the temperatures of each room will approach some common value. The thermal behaviour of individual rooms is a relatively short-term phenomenon. The rooms may be treated as independent subsystems with respect to this behaviour.

The problem of how different short-term and long-term periods need be was addressed by Courtois (1985). He presented intuitively derived mathematical criteria for the decomposition of queuing networks. In Chapter 3, it will be suggested that Simon and Ando's criteria approximate more general rules governing "good" decomposition.
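The office-building illustration can be explored numerically. The following is a minimal sketch, not Simon and Ando's model: it assumes a simple linear relaxation in which each office moves toward its room average at a "strong" rate and toward the building average at a "weak" rate. The function names, the two rates, and the starting temperatures are all assumptions chosen for illustration.

```python
def simulate(temps, rooms, strong=0.2, weak=0.002, steps=50):
    """Relax office temperatures toward room and building averages.

    temps  -- initial temperature of each office
    rooms  -- list of rooms, each a list of office indices
    strong -- assumed exchange rate between offices in the same room
    weak   -- assumed exchange rate across the (well insulated) room walls
    """
    temps = list(temps)
    for _ in range(steps):
        building_mean = sum(temps) / len(temps)
        updated = list(temps)
        for room in rooms:
            room_mean = sum(temps[i] for i in room) / len(room)
            for i in room:
                updated[i] += strong * (room_mean - temps[i])
                updated[i] += weak * (building_mean - temps[i])
        temps = updated
    return temps


def spread(values):
    """Difference between the warmest and coolest value."""
    return max(values) - min(values)


# Two rooms of three offices each (hypothetical layout and temperatures).
rooms = [[0, 1, 2], [3, 4, 5]]
start = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
after = simulate(start, rooms)

within_room = [spread([after[i] for i in room]) for room in rooms]
room_means = [sum(after[i] for i in room) / len(room) for room in rooms]
between_rooms = spread(room_means)
print(within_room, between_rooms)
```

Under these assumed rates, the offices within each room become nearly indistinguishable after 50 steps while the two rooms remain far apart, which is the short-run/long-run separation that criteria (a) and (b) describe.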
From a system design perspective, Alexander suggests that a good decomposition will lead to a design which exhibits a "good fit" with its environment. His examples are architectural, but most of his arguments are applicable to system design in general. He also defines a mathematical method for clustering system variables such that information transfer is minimized between clusters of modules.

Simon argues that a large proportion of naturally occurring systems in the world exhibit hierarchical structures, and that "On theoretical grounds we could expect complex systems to be hierarchies in a world in which complexity had to evolve from simplicity" (Simon, 1981, p. 229). Thus, a hierarchical structure is seen not only as a useful tool, but as a fundamental feature of the universe. Simon uses a parable of two watchmakers, named Hora and Tempus, to illustrate this point. Both men constructed watches consisting of 1,000 parts. Tempus constructed his watches in such a way that if he was interrupted and had to put it down, it immediately fell to pieces and assembly had to begin again. Hora's watches performed precisely the same functions as Tempus', but he designed his to have stable subassemblies of 10 parts each. Ten of these subassemblies could be put together in another stable assembly, and ten of these final assemblies could be put together to form a completed watch. If Hora was interrupted, previously completed subassemblies would not be affected. If the probability of being interrupted while adding a part to a watch is 0.01, a simple calculation (see Appendix A) shows that it will take Tempus on average 4000 times as long to complete a watch as Hora.

1.2.2. Computer Software

A decomposed or "modular" computer program is seen to be superior to a monolithic program.
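The watchmaker parable of Section 1.2.1 gives a rough quantitative intuition for this superiority. The sketch below is not the calculation of Appendix A. It assumes that every attempted placement of a part costs one unit of time, that an interruption (probability 0.01 per placement) destroys only the assembly currently in hand, and that Hora builds 100 + 10 + 1 assemblies of ten pieces each. The function name and this cost model are assumptions made for illustration.

```python
def expected_placements(parts, p_interrupt):
    """Expected number of attempted placements needed to finish one
    assembly of `parts` pieces when any interruption resets that
    assembly to zero (expected trials for `parts` consecutive successes)."""
    q = 1.0 - p_interrupt
    return (q ** (-parts) - 1.0) / p_interrupt


P_INTERRUPT = 0.01

# Tempus: a single assembly of 1,000 parts.
tempus = expected_placements(1000, P_INTERRUPT)

# Hora: 100 subassemblies, 10 larger assemblies, and 1 final watch,
# each built from ten pieces.
hora = (100 + 10 + 1) * expected_placements(10, P_INTERRUPT)

ratio = tempus / hora
print(round(tempus), round(hora), round(ratio))
```

Under these assumptions the ratio is in the thousands, the same order of magnitude as the figure quoted from Appendix A. The exact value depends on how lost work is counted, so the sketch need not reproduce that figure.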
Yourdon (1975, p. 97) outlines the arguments in favour of modularity as follows[4]:

a. A modular program is easier to write and debug. Functional components can be written and debugged separately.
b. A modular program is easier to maintain and change. Functional components can be changed, rewritten, or replaced without affecting other parts of the program.
c. A modular program is easier for a manager to control. For example, more difficult modules can be given to the better programmers.

Most authors advocate the use of decompositions which exhibit high cohesion within modules and low coupling between modules. Methodologies for achieving this goal vary greatly in both scope and degree of rigor. Several authors suggest "rules of thumb" for decomposing computer programs (Stevens, Myers and Constantine, 1974; Myers, 1978; Yourdon and Constantine, 1979) and some define rules for ensuring that a given decomposition is consistent (Hamilton and Zeldin, 1976). Since a computer program is a system, some insights into a general theory of system decomposition might be gained through a close examination of this body of knowledge.

[4] Yourdon (pp. 97-99) also describes some performance related arguments against the use of modular programs. Subroutine calls consume CPU time. Working storage allocations for each module may cause a modular program to require more space than an equivalent monolithic program. In virtual memory systems where only some modules may be in physical memory at one time, some time may be wasted while waiting for the operating system to retrieve required modules from disk.

1.2.2.1. Characteristics of Good Decomposition

It is generally acknowledged that large computer software systems tend to be difficult to maintain or change[5] (Bubenko, 1986, p. 292). Parnas (1972, p.
1058) writes that when designing a program "one begins with a list of . . . design decisions which are likely to change". Myers (1975) suggests examining the impact of future maintenance in order to determine the "best" structure for software. This impact is determined by the number of related changes to the system made necessary by coupling between its modules. However, modules must be defined before such an analysis can be performed. Myers' search for the best decomposition is conducted by trial and error. The relationship between decomposition and software maintenance is examined in the light of a general theory of decomposition in Chapter 4.

Several suggested properties of good computer program decompositions will be discussed in this section. Techniques for producing decompositions which possess these properties will be described in the next section.

1.2.2.1.1. Myers

Myers (1975) appears to have been the first to propose a framework for analyzing coupling and cohesion within an existing program. He identifies five forms of coupling and seven forms of cohesion. They are defined in order of decreasing desirability and have been included as Appendix B. The ranking is Myers' and was derived from experience. "Measurement" of coupling and cohesion is largely arbitrary.

Myers (1975) develops a quantitative measure of the independence of two modules based on the type of coupling between them. This measure of decomposition quality depends on a subjective assignment of weights to the various forms of coupling and cohesion. He uses a matrix of probabilities to express the overall modular independence of a decomposition. These probabilities represent the likelihood of a change to one module forcing a change to another module. The matrix could
be used to generate scores permitting quantitative comparison of two decompositions. A designer could use these scores to guide system decomposition in a post-hoc manner. Ideally, a design methodology should force the first design of a system to be correct. Of course, this would require a definition of decomposition correctness.

[5] The term "maintenance" shall refer to both the correction of implementation errors and the modification or enhancement of software.

The levels of coupling and cohesion within a program are related. Page-Jones (1980) claims that lower coupling tends to result in higher cohesion within modules.

1.2.2.1.2. Parnas

Parnas (1972, p. 1056) introduces the concept of information hiding. Ensuring that higher-level modules do not have unnecessary knowledge about the internal workings of lower-level modules is an important step in the reduction of coupling. He also compares alternative decompositions by examining the impact of future modifications. "Good" decomposition results from minimizing this impact.

1.2.2.1.3. Yourdon and Constantine

Yourdon and Constantine (1979, chapter 9) suggest a number of decomposition heuristics. Rules affecting module size, span of control, fan-in, scope of effect, and scope of control are suggested as a basis for judging the quality of a decomposition. All of these are described below.

Module Size

Module size has been discussed extensively in the literature. One early suggestion for module size comes from Baker (1972). He suggested that modules should be no longer than 50 statements, so that they could be shown on a single page of a printer listing. Weinberg (1970) showed that programmer comprehension of a module is reduced if the size exceeds about 30 lines. Yourdon (1975, pp.
94-95) mentions a number of other module size recommendations proposed by other researchers and practitioners. Many of these recommendations are incompatible.

a. Modules should fit into 4096 bytes, or 512 words, or 1024 words, or 2048 words of memory, etc.
b. Modules are anything that can be written and debugged by one programmer in one month.
c. Modules should be no more than 100-200 statements in length. (This suggestion is attributed to Larry L. Constantine.)
d. Modules should consist of no more than 20 high-level language statements.
e. Modules should be no longer than 500 COBOL statements.

When discussing the construction of hierarchical program structures, Steward (1987, p. 98) suggests that no limb have more than 5 to 7 branches off of it[6]. The lowest level of Steward's structures corresponds to program code. Therefore, he is suggesting a maximum module length of 5 to 7 statements. He cites Miller's Principle (Miller, 1956) which asserts an upper limit to the number of concepts a human being may consider simultaneously. However, it is not immediately obvious why a programmer should be expected to consider all the statements of a module simultaneously.

Steward also claims that high cohesion is indicated "by whether what is done within the module can be given a short description" (p. 98). Myers (1978) defines a module which has functional cohesion (his most desirable form of cohesion) as a module which performs a single specific function. Similarly, a common rule for module size suggests that a module should consist of a single functional idea. Unfortunately, the phrase "single functional idea" is difficult to define. While this rule is superior to any of the size maximums mentioned earlier, it still contains an undesirable degree of arbitrariness.
Any module size rule based on function is bound to be language dependent. Alexander (1967, p. 205) suggests a module to deal with the acoustical requirements of a system to illustrate this problem. It could be argued that the term acoustics "is not arbitrary but corresponds to a clearly objective collection of requirements -- namely those which deal with auditory phenomena. But this only serves to emphasize its arbitrariness. After all, what has the fact that we happen to have ears got to do with the problem's causal structure?".

If anything is clear from the above suggestions it must be that there is no consensus as to either the optimal or maximal size of a module.

[6] Steward exempts CASE structures from this rule.

Span of Control

Span of control refers to the number of immediate subordinates[7] to a module. Yourdon suggests that spans of control of two or less or more than ten should be carefully reconsidered. He claims that abnormally small or large spans of control are usually indicators of poor design. Small spans of control correspond to either insufficient decomposition at the subordinate level or too much decomposition at the superordinate level. Large spans of control are usually the result of a failure to define intermediate levels in the decomposition. No theory or empirical evidence is presented to support this heuristic.

Fan-in

Fan-in refers to the use of modules at more than one point of the program's structure. The use of these common modules reduces the amount of programming effort required. There is clearly a trade-off between module simplicity and generality. For example, consider a point of sale (POS) inventory system. A single module could be written to handle all forms of input to the system.
This module could be called from any point in the overall structure. However, some designers might argue that such a module would be unnecessarily complicated. Input from a POS terminal and keyboard input from, say, the receiving dock could be sufficiently different to warrant separate modules.

[7] A module X is subordinate to module Y if Y controls the activation of X. Activation may be accomplished by a subroutine CALL statement, for example.

Scope of Effect and Scope of Control

Scope of effect refers to the location of decision events in the program's structure. The scope of effect of a module is the collection of all modules containing any processing that is conditional upon the processing in that module. The scope of control of a module is the module itself and all of its subordinates. Yourdon and Constantine (1979, p. 178) state "for any given decision, the scope of effect should be a subset of the scope of control of the module in which the decision is located". In other words, any modules that are affected by a decision should be subordinate to the module which makes that decision. Again, no theory or empirical evidence is presented to support this heuristic.

1.2.2.1.4. Cluster Analysis

Hutchens and Basili (1985) have proposed the use of cluster analysis to analyze the structure of an existing computer program. All cluster analysis algorithms require the definition of a similarity or difference measure. This measure represents the "distance" between modules and is the basis for decisions to group modules together. Hutchens and Basili suggest several such measures based on data bindings[8]. There is no theoretical reason for the selection of one measure over another.
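To make the idea of a binding-based measure concrete, the sketch below takes a data binding to exist when two modules reference the same variable, and derives a simple distance from the number of shared variables. It is only an illustration: the module names, the variable sets, and the distance formula are hypothetical, and the formula is not one of the measures proposed by Hutchens and Basili.

```python
from itertools import combinations

# Hypothetical modules and the variables each one references.
references = {
    "read_order": {"order", "customer"},
    "price_order": {"order", "price_table", "tax_table"},
    "print_report": {"report_buffer"},
    "bill_order": {"order", "customer", "price_table", "tax_table"},
}


def data_bindings(refs):
    """Map each pair of modules to the set of variables both reference."""
    return {
        (a, b): refs[a] & refs[b]
        for a, b in combinations(sorted(refs), 2)
    }


def distance(shared):
    """Illustrative distance: the fewer shared variables, the further apart."""
    return 1.0 / (1 + len(shared))


bindings = data_bindings(references)

# The pair at the smallest distance would be the first candidate for merging.
closest = min(bindings, key=lambda pair: distance(bindings[pair]))
print(closest, sorted(bindings[closest]))
```

Replacing `distance` with a different formula can change which pair is merged first, which is the arbitrariness noted above.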
Clustering algorithms are also sensitive to the "black hole" effect. As more and more modules are combined into a single cluster, the number of linkages to other not yet clustered modules increases. This means that modules and small clusters are more likely to be grouped with growing super-clusters than with each other. The fine structure of the system may be obscured. Weighting schemes can be used to reduce this effect, but a suitable assignment of weights must usually be found by trial and error. It is also interesting to note that, in order to perform cluster analysis, it is necessary to remove common modules as they cause disparate subroutines to cluster at low levels in the hierarchy.

[8] A data binding exists when two modules reference the same variable.

1.2.2.2. Decomposition Techniques

Several computer program decomposition techniques have been proposed. The Structured Design (Myers, 1975, 1979; Yourdon and Constantine, 1979) literature describes several decomposition heuristics. Myers defines source/transform/sink (STS) decomposition, transactional decomposition, and functional decomposition. Yourdon and Constantine define transform analysis and transaction analysis. Hamilton and Zeldin (1976) have developed a methodology based on an analysis of the inputs and outputs of a module. Their decomposition rules are embodied in constructs called JOIN, INCLUDE and OR. These constructs are referred to as "primitive control structures". In addition, Mili, Desharnais, and Gagne (1986) have formally defined the process of program decomposition as performed by programmers.

1.2.2.2.1. Structured Design Decomposition Techniques

1.2.2.2.1.1. STS Decomposition and Transform Analysis

STS decomposition is Myers' principal decomposition technique.
Transform analysis, as defined by Yourdon and Constantine (1979), is essentially identical [9]. The steps for applying STS decomposition to a high-level module are as follows:

a. Outline the structure of a module.
b. In this module structure, identify the major stream of input data and the major stream of output data.
c. Identify the point in the module structure where the input data stream last exists as a logical entity and the point where the output data stream first exists as a logical entity.
d. Using these points as the dividing points in the module structure, describe each division of the problem as a single function. These become the functions of the immediate subordinate modules.

[9] This decomposition technique was first outlined in Stevens, Myers and Constantine (1974).

Figure 1: A program structure for illustrating Myers' STS decomposition. Lettered data items alternate with the functions that process them:
   a. user's request
      Read a request
   b. request
      Form a search query
   c. search query
      Search keyword database
   d. list of titles
      Obtain abstracts
   e. abstracts
      Display abstracts
   f. display of abstracts

For example, consider a module which accepts a request to search an abstracts database by keyword and then displays selected abstracts. The structure of the module might be as illustrated in Figure 1. The major (and only) input stream consists of user requests "a". The major (and only) output stream consists of the retrieval results "f". Point "c" is the last point where the input stream exists as a distinct entity. At point "d" there exists a list of abstract titles retrieved from the keyword database. There is a one-to-one correspondence between the final retrieval results and this list of titles.
Therefore, Myers claims that point "d" is where the output stream first exists as a distinct entity. The module would then be broken into three submodules. The "source" submodule would read a request and form the search query. The "transform" submodule would search the keyword database. The "sink" submodule would obtain and display the selected abstracts.

Yourdon and Constantine's transform analysis appears to be identical to STS decomposition. They refer to locating point "c" as identifying the "afferent data elements" and locating point "d" as identifying the "efferent data elements". Afferent data elements are defined as follows:

   Afferent data elements are those high-level elements of data that are furthest removed from physical input, yet still constitute inputs to the system.

Efferent data elements are similarly defined, but for outputs.

Clearly, the points at which the original module is to be split are subject to some degree of ambiguity. For example, an argument could be made for splitting the above example at points "c" and "e" as it is only at point "e" that the final result is clearly seen. Yourdon and Constantine (1979, p. 194) are aware of this problem, but claim that experienced designers will not differ by more than one or two transforms (i.e. functions) when identifying afferent and efferent data elements.

It should be noted that all decomposition techniques discussed in this section, including STS decomposition, are intended to be applied recursively to the newly created submodules. This recursion is to be carried out until the lowest-level modules may be easily converted into code.

1.2.2.2.1.2. Transactional Decomposition and Transaction Analysis

Myers' transactional decomposition is applied when a module takes the form of a selection process. If a module receives different types of transactions, and the processing which follows is dependent on the type of transaction received, the module is a candidate for transactional decomposition. For example, a module whose purpose is to process a merchandise transaction might be decomposed as shown in Figure 2.

Figure 2: An example of Myers' transactional decomposition. The module "Apply merchandise transaction to file" has four subordinates:
   Apply sales transaction to file
   Apply return transaction to file
   Apply new item transaction to file
   Apply discontinuance transaction to file

Transactional decomposition is similar to Yourdon and Constantine's transaction analysis. However, they introduce the concept of a "transaction center". A transaction center must be able to

a. obtain transactions in raw form,
b. analyze each transaction to determine its type,
c. dispatch each type of transaction, and
d. complete the processing of each transaction.

Myers would apply STS decomposition prior to transactional decomposition in order to identify the modules concerned with getting the transaction and determining its type.

Yourdon and Constantine also provide an operational definition of a transaction.

   A transaction is any element of data, control, signal, event, or change of state that causes, triggers, or indicates some action or sequence of actions.

This definition makes it apparent that transactional decomposition may be applied in cases where there is no "traditional" transaction evident as there was in the above example.

1.2.2.2.1.3. Functional Decomposition

Myers describes functional decomposition as "an ad hoc process of pulling single subfunctions from a module to achieve certain purposes". He suggests two possible purposes:

a. Isolating common functions, and
b. Isolating functions within informational cohesion modules.

Figure 3: Myers' functional decomposition: Isolating common functions.
   Before: two modules, "calculate average and write results to screen" and "calculate standard deviation and write results to screen".
   After: "calculate average" and "calculate standard deviation", each referencing a common module "write to screen".

The first purpose reflects the desirability of removing a subfunction which is contained in a number of larger modules, and making it into a separate module referenced by each. For example, the modules shown in the "before" part of Figure 3 might be restructured as shown.

The second purpose refers to splitting a function which references a number of data structures into modules which reference only one data structure each. An informational cohesion module is one that hides "some concept, data structure, or resource" (Myers, 1978, p. 37). Modules with informational cohesion are considered as desirable as ones with functional cohesion. Myers provides the following example of a situation where this splitting is desirable. Suppose there exists a module whose function is "build table of underpaid employees". The module sequentially examines the personnel file, and if an employee meets the underpaid criteria, it places the employee's name in the table. The structure of the module might be as shown in the "before" part of Figure 4. Myers would not have applied STS decomposition to this module "because its logic is easily visualized". Easy visualization is Myers' decomposition stopping criterion.
The first and the last subfunctions refer to separate data structures: the personnel file and the output table. Functional decomposition of the above would lead to the structure shown in the "after" part of Figure 4. The two newly created modules could be combined with other modules referencing the data structures, thus either adding to or creating informational cohesion modules. Testing to determine whether a given employee is underpaid would be performed in the "Build table of underpaid employees" module.

Yourdon and Constantine do not refer to any decomposition method which is analogous to Myers' functional decomposition.

1.2.2.2.2. HOS Decomposition

The HOS design methodology developed by Hamilton and Zeldin (1976) is capable of generating computer code through the use of "mathematically provable constructs". Specifically, three primitive control structures are identified: JOIN, INCLUDE, and OR. The HOS methodology does not provide specific techniques for actually performing system decomposition. The methodology provides formal tools for ensuring that a given decomposition is consistent with certain axioms governing the relationships between modules. Therefore, the HOS methodology can provide some insights into the nature of "good" decomposition, but cannot add to the decomposition heuristics found in the Structured Design literature.

JOIN is used to support the decomposition of a function into two sequentially executed subfunctions.
The outputs of one module must be inputs to the other. For example, if the function of the original module was "make a stool", it might be decomposed to "make parts" and "assemble parts". HOS uses a functional notation combined with a binary tree to represent decompositions as shown in Figure 5a. Inputs to the system are TOPWOOD and LEGWOOD and the system's output is STOOL. The output from the first, or right most, module is TOP and LEGS. TOP and LEGS form the inputs to the second module. JOIN is analogous to STS decomposition when no transform module is identified. The original module will be broken into only two submodules.

Figure 4: Myers' functional decomposition: Creating informational strength modules.
   Before: a single module with the steps "get next personnel record", "extract salary fields", "compute this employee's lowest valid salary", and "if underpaid, add to table".
   After: "build table of underpaid employees" with two subordinates, "obtain salary data for next employee" and "add entry to underpaid table".

If some set of desired outputs can be obtained in more than one way, OR is used to separate the methods. For example, if LEGS can be constructed from either HARD wood by turning or SOFT wood by carving, a "make legs" function could be decomposed as shown in Figure 5b. OR is similar to transactional decomposition. Its use implies that one and only one of the identified subsystems may be activated by a single transaction.

INCLUDE is used to separate independent subfunctions. For example, if the functions "make legs" and "make top" were independent of one another, the function "make parts" could be decomposed as shown in Figure 5c. This sort of decomposition is neither STS nor transactional. Nor does it fall under either of the circumstances Myers suggests for functional decomposition.
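The three control structures can be mimicked with ordinary function composition. The sketch below is an illustration only and is not HOS notation: the function names follow the stool example in the text, while the string "results" and the (hardness, wood) tuple used for LEGWOOD are assumptions made so that the example runs.

```python
# Hypothetical stand-ins for the leaf functions; each simply labels its input.
def make_top(topwood):
    return f"top({topwood})"


def turn(wood):
    return f"turned({wood})"


def carve(wood):
    return f"carved({wood})"


def assemble_parts(top, legs):
    return f"stool({top}, {legs})"


def make_legs(legwood):
    """OR: exactly one alternative runs, selected by a property of the input."""
    hardness, wood = legwood
    if hardness == "HARD":
        return turn(wood)
    if hardness == "SOFT":
        return carve(wood)
    raise ValueError(f"unknown hardness: {hardness}")


def make_parts(topwood, legwood):
    """INCLUDE: two independent subfunctions; neither uses the other's data."""
    return make_top(topwood), make_legs(legwood)


def make_a_stool(topwood, legwood):
    """JOIN: two sequential subfunctions; outputs of the first feed the second."""
    top, legs = make_parts(topwood, legwood)
    return assemble_parts(top, legs)


print(make_a_stool("pine", ("HARD", "oak")))
print(make_a_stool("pine", ("SOFT", "fir")))
```

In this rendering JOIN appears as one call feeding another, OR as a mutually exclusive branch, and INCLUDE as two calls that share no data.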
Perhaps Myers and Yourdon and Constantine considered this form of decomposition too obvious to mention. That is, if a module consists of subfunctions which do not interact with each other in any way, separate them.

Figure 5:
   a) HOS JOIN:
         Stool = Make-A-Stool(TopWood, LegWood)
            Stool = Assemble-Parts(Top, Legs)
            Top, Legs = Make-Parts(TopWood, LegWood)
   b) HOS OR:
         Legs = Make-Legs(LegWood)
            LegWood is HARD: Legs = Turn(LegWood)
            LegWood is SOFT: Legs = Carve(LegWood)
   c) HOS INCLUDE:
         Top, Legs = Make-Parts(TopWood, LegWood)
            Top = Make-Top(TopWood)
            Legs = Make-Legs(LegWood)

The HOS methodology has been lucidly described by Martin (1985). He claims that "The technique has been automated so that bug-free systems can be designed by persons with no knowledge of either mathematics or programming. The software automatically generates bug-free program code. Whereas most mathematical techniques have been applied only to small programs, Hamilton and Zeldin's technique has been used successfully with highly complex systems. The technique is used not only for program design but, perhaps more important, for high-level specification of systems. The design is extended all the way from the highest-level statement of system functions down to the automatic generation of code." (pp. 39-40)

Although Martin is clearly concerned with "selling" HOS, there is no doubt that it represents a major departure from the relatively informal methodologies of Structured Design.

1.2.2.2.3. Formal Models of Computer Programming

Mili, Desharnais, and Gagne (1986) describe three formal models of the process of computer programming. They present three formalisms for program specifications:

a. As a pair of assertions (p,q), where p, the input assertion, defines the set of admissible input states and q, the output assertion, defines the set of correct output states.
b. As a function mapping admissible input states into correct final states.
c. As a relation containing all the pairs (input/output state) considered to be correct by the specifier.

The third formalism, and its associated relational decomposition, is quite similar to the one developed in the remainder of this document. Mili et al. define a relation R as a subset of S × S, where S is the set of all possible states of a program [10]. That is, R is a subset of all possible pairs (s,s') where the input state s and the output state s' are elements of S. A program specification can be described by a relation where each tuple consists of an input/output state pair. Three rules for program decomposition are defined. These rules ensure that an original relation can be reconstructed from a set of simpler [11] relations. It is the programmer's responsibility to find the simpler relations. No procedure for obtaining these simpler relations is described.

[10] Program states reflect the values of the program's variables. For example, if a program contains three variables "a", "b" and "c", a state s of the program could be represented by a triplet of values <a(s),b(s),c(s)> where a(s) is the value of variable "a" when the system is in state s, etcetera. If "a" is an integer variable and "b" and "c" are real, possible states of the program might include <3,2.1,3.1>, <1,0.1,0.2>, and <0,0.1,0.2>.

[11] That is, less complex code is required for implementation.
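The relational formalism can be illustrated with a small sketch. It represents a state as a triple of values for the variables a, b and c, and a specification as a set of (input state, output state) pairs. The function names, the sample specification ("set a to zero, then exchange b and c"), and the correctness test in `satisfies` are assumptions chosen for illustration, not definitions taken from Mili et al. The `compose` function shows one way an original relation might be rebuilt from two simpler ones.

```python
def domain(relation):
    """Input states for which the specification says anything."""
    return {s for (s, _) in relation}


def satisfies(program, relation):
    """Assumed criterion: every admissible input maps to an output the relation allows."""
    return all((s, program(s)) in relation for s in domain(relation))


def compose(r1, r2):
    """Relational composition: (s, s2) whenever (s, s1) is in r1 and (s1, s2) is in r2."""
    return {(s, s2) for (s, s1) in r1 for (t, s2) in r2 if s1 == t}


# Hypothetical steps used to build the example relations.
def zero_a(state):
    a, b, c = state
    return (0, b, c)


def swap_bc(state):
    a, b, c = state
    return (a, c, b)


inputs = [(3, 2.1, 3.1), (1, 0.1, 0.2)]

# Original specification R and two simpler relations R1 and R2.
R = {(s, swap_bc(zero_a(s))) for s in inputs}
R1 = {(s, zero_a(s)) for s in inputs}
R2 = {(zero_a(s), swap_bc(zero_a(s))) for s in inputs}

print(compose(R1, R2) == R)                       # R is rebuilt from R1 and R2
print(satisfies(lambda s: (0, s[2], s[1]), R))    # a program meeting the specification
```

Finding R1 and R2 in the first place is the step the text identifies as left to the programmer; the sketch only checks a decomposition that is already given.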
Explanations of the decomposition rules are relatively involved and have been included as Appendix C. The rules are described briefly, and informally, below.

Sequence Statement Rule

A relation R may be decomposed into two relations R1 and R2 where:

1) The input states of R1 are the same as the input states of R.
2) The output states of R are the same as the output states of R2.
3) The output states of R1 are the same as the input states of R2.

Alternation Statement Rule

A relation R may be split into two smaller relations R1 and R2 where R1 consists of those states of R which satisfy some condition, and R2 consists of those states of R which do not.

The Iteration Statement Rule

Decomposition by iteration involves finding a relation which, when applied to itself recursively using the sequence statement rule, will yield the original relation.

It is difficult to see how the iteration rule qualifies as a form of decomposition. The rule is primarily intended to avoid coding a large number of input/output pairs, by applying a smaller amount of code iteratively. Iteration can be viewed as a programming tool used to save source code space and programmer typing time. Decomposition using the iteration rule need not produce programs which are more easily visualized than a program implementing the original relation R. In fact, the operation of iterative programs can be much harder to grasp than equivalent, but longer, "linear" programs.

1.3. Conclusions

Decomposition is recognized in the general systems theory literature as an important tool for both systems analysis and design. Systems exhibiting "good" decompositions are seen to be more stable than monolithic systems. However, Simon and Ando's short-run and long-run dynamic criteria appear to be the only clear contributions to an understanding of what constitutes "good" decomposition. On the other hand, computer program decomposition has long been a major problem in software engineering, and as a result some practical solutions have been devised.

There are three basic types of computer program decomposition. STS decomposition, transform analysis, HOS JOIN, and the sequence statement rule refer to the separation of sequentially activated functions into separate modules. These techniques could be referred to as "sequential decomposition". Transactional decomposition, transaction analysis, HOS OR and the alternation statement rule are used to decompose a set of functions that are activated conditionally. These techniques could be referred to as "conditional decomposition". The HOS INCLUDE can be used to separate functions that are independent of each other. This technique could be referred to as "parallel decomposition".

The Structured Design methodologies of Myers and Yourdon and Constantine are derived from experience and require human intelligence for their application. They place some structure on the process of finding the lower-level modules, but their precise definition is left to the program designer. The dictum stating that a module should contain at most one functional idea is both highly subjective and language dependent.
Myers' framework of coupling and cohesion along with his measure of a decomposition's quality is only useful after the system has been coded.

The HOS methodology does not consider how a module is to be decomposed. Rather, it is concerned with ensuring that the decomposition is good with respect to the HOS axioms, namely, it can be represented using primitive control structures.

The decomposition rules of Mili et al. can be used to ensure that given modules can be combined to form the original program. They do not provide an algorithm for finding the modules.

The next chapter describes a system modelling formalism which will support both automated decomposition and completeness and consistency verification.

Chapter 2: System Modelling

2.1. Introduction

Bubenko (1986, p. 289) notes that the practice of information systems analysis and design is characterized by hundreds of different methodologies. Yet there is general agreement that most large information systems are difficult to maintain and change, and that assessing their correctness and completeness is usually impossible (p. 292). Several undesirable characteristics possessed by many methodologies are identified (p. 298). These include the following:

a. Fuzzy Concepts
   Many of the concepts advocated in analysis and design methodologies are not well defined. It is difficult to know which ones to use, and how to use them in varying, non-trivial design situations.
b. No Verification
   There is usually no way to verify the correctness, completeness, and consistency of conceptual specifications.

Examples of poorly-defined concepts include: system, decomposition, subsystem, statics, and dynamics. It is impossible to develop a theory of system decomposition without exact definitions of these concepts.
The main purpose of this chapter is to present a formalization of the modelling constructs deemed necessary for the automation of system decomposition. These constructs have been implemented in the form of computer-based specifications analysis tools. The tools are described and their use will be demonstrated using two rather simple examples in later sections of this chapter. A more complicated "real" system will be examined in Chapter 6.

There are no generally accepted definitions for correctness, completeness, and consistency with respect to system models. Roman (1985) claims that "a requirements specification is complete if some relevant aspect has not been left out and is consistent if the parts of the specification do not contradict each other. Both completeness and consistency require the existence of criteria against which one may evaluate the model. Completeness and consistency checks...presuppose the analyzability of the requirements by mechanical or other means. The higher the degree of formality the more likely it is that requirements may be analyzed by mechanical means." (p. 16). It is not surprising that few methodologies provide for any form of verification since few produce formal requirement specifications.

Definitions of completeness, consistency and correctness will be suggested in this chapter. These definitions are sufficiently formal to allow computerized analysis. Tests for completeness and consistency have been included in the specifications analysis tools.

Bubenko (1986, p. 298) also notes that there appears to be an underlying assumption, among the various analysis and design methodologies, that "in the early stages, conceptual specification and analysis of the behaviour (dynamics) of the information system is less important (for the purpose of understanding) than the description of its 'structure'". It is not clear how any analysis of structure can be performed without some knowledge of behaviour. Part of a system's structure consists of a collection of objects [12]. In the object-oriented programming literature, Nierstrasz (1987) notes that "perhaps the most difficult task is deciding how to naturally decompose a problem into objects" (p. 11-12). In order to separate two objects in a system, the analyst must be aware of a circumstance in which the objects behave independently. For example, consider an employee's first and last names in a personnel system. If a decision is made to represent them as separate objects, the analyst must know that they could be separated. The analyst must know of some process which does not require both parts of an employee's name. This knowledge could come from his or her understanding of the system's operation or previous experience. In Chapter 4, it will be argued that previous experience is not a sufficient basis for good design decisions. The formalism presented here explicitly models system dynamics, and decomposition decisions (as described in the next chapter) are based solely on the characteristics of the system as described by the analyst.

[12] "Objects" as used in this research are related to the objects of object-oriented programming, but they are not identical. The relationship shall be discussed in Chapters 3 and 6.

2.2. The Formalism

2.2.1. General

The process of systems analysis and design can be viewed as a three-stage transformation as shown in Figure 6.

Figure 6: The system analysis and design process (adapted from Wand and Weber, 1988). The figure relates the Real System, the Model of the Real System, the Design Model of the Information System, and the Implemented Information System.

To illustrate these stages, consider an analogy with the design and construction of an office building. An architect creates a set of drawings and specifications which reflect the desires of his or her client. An engineer translates these into a detailed plan for the construction of the building. Finally, a contractor constructs the office building itself. In this example, the "real system" consists of the client's desires. The activities of the architect are called "analysis". The architect's drawings and specifications are a "model of the real system". The activities of the engineer are called "design". The set of detailed construction drawings forms a model of the office building and is analogous to a "model of the information system". The activities of the contractor are called "implementation" and the office building itself is analogous to an "implemented information system".

This research is primarily concerned with the transformations from the "real system" to the "model of the real system", and from the "model of the real system" to the "model of the information system", that is, with both the "analysis" and "design" transformations. System decomposition is the process by which an analyst identifies the parts of the real system which should be reflected in his model.
These parts, and their relationships with each other, have a direct influence on the structure of the information system.

2.2.2. Systems

In order to support automated system decomposition a modelling formalism must be able to represent the following:

a. The parts of the system [13],
b. The allowed states of the system, and
c. The manner in which these states may change.

The first requirement refers to system statics; the last two refer to system dynamics. Most existing analysis and design methodologies meet these requirements at least implicitly. However, the basic constructs of most methodologies (with the notable exception of HOS) are not clear.

An important premise of this research is that understanding of system properties, in particular decomposability, will be greatly facilitated by carefully defining what we are studying. That is: What exactly is a system and what governs its behaviour?

[13] The modelling formalism selected for use in this research does not deal directly with the "parts" or things belonging to a system. As will be shown, only knowledge of the properties which are used to describe the things is required to specify a system. However, at this stage it may be more convenient to visualize a system based on a collection of things, rather than a set of properties.

2.2.3. An Intuitive Beginning

The system modelling formalism used in this research is largely based on the works of Bunge (1978, 1979) and Wand and Weber (1988, 1989). The system modelling approach based on the formalism will be called SELMA (for States, Events, and Laws Modelling Approach) [14].

One of the goals of system decomposition is the identification of the objects comprising a system.
As shall be illustrated in the next chapter, there is no unique set of objects describing most interesting systems. In general, labelling of the objects from which a system is constructed depends upon the analyst's point of view. Von Bertalanffy (1974) notes that identification of objects in the real world is not a trivial task. "The spatial boundaries of even what appears to be an obvious object or 'thing' actually are indistinct. From a crystal consisting of molecules, valences stick out, as it were, into the surrounding space; the spatial boundaries of a cell or an organism are equally vague because it maintains itself in a flow of molecules entering and leaving, and it is difficult to tell just what belongs to the 'living system' and what does not. Ultimately all boundaries are dynamic rather than spatial." (p. 22).

There is a real danger that an analyst may be tempted to decompose a system on the basis of spatial relationships (i.e. relative positions in space). As will be further discussed in the next chapter, if component objects discovered in this way are used to form the structure of the information system, it is likely that alternative, and possibly superior, structures will not be considered. Spatial relationships are primarily static in nature. The theory of decomposition presented in the next chapter is based on an analysis of both system statics and dynamics.

There is no generally accepted definition for the term "system" and it will not be rigorously defined here.
In the modelling formalism, and in the theory of decomposition presented in the next chapter, "the system" shall mean whatever collection of objects and processes the analyst chooses to consider. A system is described by properties¹⁵ and relations between these properties. Of course, a system may itself be considered to be an object, and as such suffers from the same identification problems discussed above. It is assumed that the system is described by a well defined set of properties, and that all relevant¹⁶ interactions between the system and the rest of the universe (i.e. its environment) are known.

¹⁴ Just as a point of interest, "selma" is derived from the Arabic word for "secure" and is the short form of "anselma" which is Old Norse for "divinely protected" (Browder, 1987, p. 185). Given that automation of consistency and completeness testing is one of the major advantages of SELMA over other modelling schemes, these are not entirely inappropriate meanings.

¹⁵ These properties will also describe the things from which the system is composed. However, the modelling formalism is not concerned with identification of the component things of a system.

It should be noted that Bunge (1979, p. 6) defines a system as an object consisting of at least two different connected¹⁷ things. This definition was found to be too restrictive. For the purposes of system modelling, it is sufficient to accept as a system anything which the analyst claims is a system. As far as this research is concerned, it does not matter whether each component of the system is connected directly, or indirectly, to any other component of the system.
For example, consider a system consisting of two independent subsets of things, but where the things in each subset are interconnected¹⁸. As we shall see, the decomposition algorithm (described in the next chapter) will find that the system consists of two independent subsystems¹⁹. The stance taken here is that analysts know what systems are and that too detailed a definition will only confuse matters.

¹⁶ Relevant to the purpose of the analysis effort.

¹⁷ Bunge also defines the term "connected". Unfortunately, any discussion of connection or interaction degenerates into a discussion of causality. Such a discussion is not appropriate here.

¹⁸ Normally there should be some reason to consider independent subsets as parts of the same system. Perhaps the independent subsets are parts of another subsystem defined at a higher level of abstraction. That is, using the terminology to be introduced in the next chapter, the two subsets may each contribute input state variables to a subsystem which determines the value of some emergent state variable at a higher level.

¹⁹ Note that in reality the subsystems may not be independent. It could be argued that in some sense all parts of the universe are interconnected. However, it is possible for two subsystems to be independent with respect to a particular model. The model is a man-made abstraction of some aspects of the real world. Not all interactions will be described.

2.2.4. Definitions

2.2.4.1. The Basics

Definition: System State
At a given time, the values attained by the properties of a system σ comprise a STATE s of σ.

Definition: State Variable and Value
State variables are the properties required to describe some part of the real world for some given purpose²⁰.
A system σ is that part of the real world described by the set of STATE VARIABLES $\{v_1, \ldots, v_n\}$ selected by an analyst²¹. A state variable is a function mapping the set of all system states into the set of VALUES. That is, the value of state variable $v_i$ at time t is $v_i(t)$. For example, a state variable called "employee-type", describing some part of a personnel system, might have values of "full-time" or "part-time". A system state s can be represented by a vector of state variable values.

\[ s = [v_1(t), \ldots, v_n(t)] \]

²⁰ This implies an appropriate choice of level of abstraction. That is, it is not necessary to include all properties of the part of the real world being modelled. For example, when modelling a company's payroll system, the analyst may choose not to include a state variable describing the size of an employee's desk.

²¹ The concept of "system" is discussed further in Chapter 7.

Definition: Possible State Space
The POSSIBLE STATE SPACE S of a system σ is the Cartesian product of the sets of all possible values of each state variable of σ. Bunge (1979, p. 20) calls this space the "conceivable state space of σ".

For example, consider a system which can be described by three state variables, "a", "b" and "c". A state s of this system could be described by the vector [a(t), b(t), c(t)], where a(t), b(t), and c(t) are functions returning the values of state variables a, b, and c at time t respectively. If each state variable could have values of 0 and 1, the possible state space of the system would consist of [0,0,0], [0,0,1], [0,1,0], [0,1,1], [1,0,0], [1,0,1], [1,1,0], and [1,1,1].

Definition: System Law
The interactions between the properties of a system σ comprise the SYSTEM LAW L of σ (Wand and Weber, 1989).
Given any state s of σ, L is a function²² such that

\[ s' = L(s) \]

where $L(s) = s$ if the system may remain indefinitely in s, and $L(s) \neq s$ if the system state must change, and s' will be the next state of the system where $s' = L(s')$ (i.e. the law does not change the next state).

Every system has one and only one system law. This law completely defines the behaviour of the system. Full knowledge of a system law is generally impossible or at least very difficult to obtain²³. The concept of a system law is seen as a useful tool for theory building, but practical problems will require more operational definitions. These will be developed later in this section.

Definition: Stable and Unstable System States
Given a state s of a system with system law L:
If $s = L(s)$ then s is said to be STABLE with respect to L.
If $s \neq L(s)$ then s is said to be UNSTABLE with respect to L.

²² This research deals only with deterministic systems. That is, each initial system state transforms into one and only one final state.

²³ For example, consider the physical universe. One could think of the universe as being governed by a single all-encompassing physical law, which mankind is struggling to understand through science. We currently have only a partial understanding of this law. This partial understanding is expressed by our chemical, biological and mathematical principles and laws of physics.

For example, consider a simplified accounting system described by state variables representing "account balance" and "value of assets". Assume the system law simply states that the values of the two state variables should be equal, and if they are not, the value of "account balance" must be set equal to the value of "value of assets".
That is, if the values of "account balance" and "value of assets" are not equal, the system law would alter the system state by setting the value of "account balance" equal to the value of "value of assets". This means the system was in an unstable state with respect to the system law, because the law maps the original state into a different state. On the other hand, when the values of "account balance" and "value of assets" are equal, the system is in a stable state because the law is fulfilled.

Definition: Stable State Space
The set of all stable states of a system σ is called the STABLE STATE SPACE of σ.

Definition: External Event
The environment²⁴ acts on a system in the form of EXTERNAL EVENTS. An external event e occurs when the environment acts to set the value of some state variables within the system. This change of value might move the system into another stable state, an unstable state, or the system might remain in the same state. In other words, if s is a system state and S is the possible state space of the system, e is a function²⁵ of the following form.

\[ e : \{\, s \mid s = L(s),\ s \in S \,\} \longrightarrow \{\, s \mid s \in S \,\} \]

If the new state is stable, no further state changes occur. However, if the new state is unstable, the system must respond so as to return to a stable state. These system state changes in response to external events define the system's dynamics.

²⁴ The environment of a system is described by all properties of the real world which are not properties of the system.

²⁵ It is assumed that external events can only occur when the system is in a stable state (i.e. $L(s) = s$).

For example, a system initially in a stable state "$stable_{i1}$" may be moved to an unstable state "$unstable_1$" by an external event e. The system will respond by moving to another stable state "$stable_{f1}$". This is shown graphically in Figure 7a.
It is also possible that the same event may move the system from a stable state "$stable_{i2}$" to another stable state "$stable_{f2}$" directly, as shown in Figure 7b.

[Figure 7: The action of external event "e" on a system. a) The event moves the system into an unstable state. b) The event moves the system into a stable state.]

An analyst may find it difficult to specify a monolithic system law which describes the overall behaviour of all state variables. Fortunately, a system law may be decomposed into smaller SUBLAWS, and perhaps more importantly, a system law may be synthesized from a number of sublaws.

Definition: Sublaw
A SUBLAW l is a function defined on a subset of the state variables describing a system, such that for any stable state s of the system σ with system law L, $s = l(s)$ and for any unstable state s of the system, $s \neq l(s)$. That is,

\[ s = l(s) \ \text{if}\ s = L(s) \quad\text{and}\quad s \neq l(s) \ \text{if}\ s \neq L(s) \]

Notice that if s is unstable, a sublaw need not map s into the same stable state as the system law. That is, $s' = l(s)$ and $s \neq s'$ and $s'' = L(s)$ does not imply $s' = s''$. Also notice that there are two parts to any system law or sublaw:

a. Stability Conditions
This part applies when a state s is stable (i.e. $s = L(s)$ or $s = l(s)$). The condition specifies the system states allowed by the sublaw.

b. Corrective Actions
This part applies when a state s is unstable (i.e. $s \neq L(s)$ or $s \neq l(s)$). The action specifies how the values of the state variables must change should the system enter an unstable state.

Before an example of a sublaw is provided, one more definition is required.

Definition: Rule
A sublaw may be expressed as a set of RULES.
Each rule specifies a single stable condition or corrective action.

For example, a description of a very simple accounting system (with real-time asset change posting) might include the following rules.

1. The value of the "account balance" state variable must equal the value of the "value of assets" state variable.
2. The "last change status" state variable must indicate that the last change to asset value has been posted (i.e. the value of the last change to the value of the assets has been added to or subtracted from the account balance).
3. If the value of the "account balance" state variable is not equal to the "value of assets" state variable (i.e. the system is out of balance), then adjust the value of the "account balance" state variable to equal the value of the "value of assets" state variable, and set the value of the "last change status" state variable to indicate that the last change has been posted.

The above rules constitute a sublaw. There may be other sublaws describing other parts of the system. The first two rules specify stability conditions and the last specifies a corrective action. Notice that this sublaw assumes that the only way the system can become out of balance is by altering the value of the assets. That is, the sublaw could not handle a situation where the account balance was changed directly, either by accident or deliberate tampering. In other words, the above rules have been formulated with a specific set of external events in mind. The system modelling tools, described later in this chapter, require explicit identification of external events so that deficiencies in the rules can be immediately identified.
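The three accounting rules can also be read as a small state-transition function. The following Python sketch is only an illustration and is not the notation accepted by the specifications analysis tools: the dictionary keys, the status values "posted" and "unposted", and the `asset_change` event are assumptions introduced here so that the example can be executed.

```python
# Illustrative sketch of the accounting sublaw; all names are assumptions.

def is_stable(state):
    """Rules 1 and 2: the stability conditions of the sublaw."""
    return (state["account_balance"] == state["value_of_assets"]
            and state["last_change_status"] == "posted")


def apply_sublaw(state):
    """Rule 3: the corrective action. Other states are returned unchanged."""
    if state["account_balance"] != state["value_of_assets"]:
        return {**state,
                "account_balance": state["value_of_assets"],
                "last_change_status": "posted"}
    return dict(state)


def asset_change(state, new_value):
    """Hypothetical external event: the environment sets a new asset value."""
    return {**state,
            "value_of_assets": new_value,
            "last_change_status": "unposted"}


initial = {"account_balance": 100, "value_of_assets": 100,
           "last_change_status": "posted"}
after_event = asset_change(initial, 120)   # unstable: balance lags the assets
final = apply_sublaw(after_event)          # corrective action restores balance

print(is_stable(initial), is_stable(after_event), is_stable(final))
```

Run as written, the sketch prints `True False True`: the event moves the system out of a stable state and the corrective action returns it to one.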
Traditionally, the behaviour of office information systems has been described in terms of procedures. In SELMA, the behaviour of systems is entirely defined in terms of sublaws. Sublaws are not equivalent to procedures. Panko (1984, p. 227) defines a procedure as a program "in which there is a predetermined flow of work involving many steps, whether the flow consists of the same steps each time or involves a more complex logic flow". In a study involving the creation of computerized systems to support executive work, Panko notes that none of the executives interviewed "could articulate definite processes, much less well-defined procedures, to describe how their goals were achieved" (p. 228). The order of activation of sublaws is not predetermined. It is hypothesized that in many cases it may be easier to discover the sublaws under which an executive operates, than to determine all the procedures he or she may choose to follow. However, empirical testing of this hypothesis is beyond the scope of this research.

Before formal definitions of correctness, completeness, and consistency can be given, one more basic definition is required.

Definition: Response Path and Response Function
Let Q be the set of sublaws describing the behaviour of some system. An ordered list of sublaws $[l_1, \ldots, l_j]$ where $\{l_1, \ldots, l_j\} \subset Q$ is called a RESPONSE PATH and the composition of those sublaws²⁶

\[ P_Q(s) = l_j(\ldots l_2(l_1(s)) \ldots) = l_j \circ \cdots \circ l_2 \circ l_1(s) \]

is called a RESPONSE FUNCTION defined on Q.

²⁶ The symbol $\circ$ is used to denote the composition of functions.

2.2.4.2. Completeness, Consistency and Correctness of Sublaws

Brooks (1987) notes that "the hardest part of the software task is arriving at a complete and consistent specification..." (p. 16).
SELMA supports formal definition of completeness and consistency. As will be demonstrated later in this chapter, these definitions can be computerized to automatically test a system model consisting of state variables and sublaws. Informally, the notions of sublaw completeness and consistency can be described as follows:²⁷

Completeness: All system states may be transformed to stable states by the sublaws (i.e. no states have been "left out" in the analysis of system dynamics).

Consistency: Every system state, which may be transformed to a stable state by the sublaws, may be changed into one and only one stable state (i.e. the sublaws do not contradict one another).

In addition, the notion of correctness is informally described as follows:

Correctness: Taken together, the sublaws transform the initial system states to exactly the same final states as the system law (i.e. all the sublaws combined describe the actual behaviour of the system).

Each of these definitions depends to some extent on the system law. Completeness and consistency require that stable states be identified. Stable and unstable states were defined in terms of the system law. Correctness requires that the "operation" of the sublaws be the same as the system law. Unfortunately, system laws are generally unknowable²⁸. The best an analyst can hope for is an approximation to the system law in terms of sublaws. This does not imply that completeness, consistency and correctness are useless notions. While correctness is usually impossible to verify, two levels of completeness and consistency are formally defined below. At a conceptual level, global completeness, global consistency and correctness are defined in terms of the system law. At an operational level, local completeness and local consistency are defined using a more restricted definition of stability called local stability²⁹.

²⁷ These informal notions are similar to those of Roman (1985, p. 16).

²⁸ Olive (1983, p. 73) states "it is not possible to formally verify the validity of the conceptual model with respect to the user's 'real' requirements...".

²⁹ Conceptual and operational levels are concerned with aspects of the "real system" and the "model of the information system" (as defined earlier in this chapter), respectively.

2.2.4.2.1. Conceptual Definitions for Completeness, Consistency and Correctness

Definition: Global Completeness of Sublaws
A set of sublaws Q of system σ with system law L completely describes the behaviour under L of σ with respect to the possible state space S of σ, if for every state s in S there exists a response function $P_Q(s)$ which maps that state into a stable state s'. That is, Q is globally complete with respect to S if and only if

\[ \forall s \in S,\ \exists P_Q \ \text{such that}\ s' = P_Q(s) \ \text{and}\ s' = L(s') \]

Notice that while s' must be stable, it need not be the same stable state into which the system law maps s (i.e. s' is not necessarily equal to $L(s)$). Equivalence of the sublaws to the system law is assured by sublaw correctness as defined later.

Definition: Global Consistency of Sublaws
A set of sublaws Q of system σ with system law L is globally consistent with respect to the possible state space S of σ, if all response functions which map a state s in S into a stable state, map s into the same stable state. That is, Q is globally consistent with respect to S if and only if

\[ \forall s \in S,\ \forall P_Q, P'_Q :\quad L(P_Q(s)) = P_Q(s) \ \text{and}\ L(P'_Q(s)) = P'_Q(s) \ \Rightarrow\ P_Q(s) = P'_Q(s) \]
Again, notice that while the final states $P_Q(s)$ and $P'_Q(s)$ must be stable and equal, they need not be equal to the state into which the system law would have mapped s.

Definition: Correctness of Sublaws
A set of sublaws Q of system σ with system law L correctly describes the behaviour under L of σ with respect to the possible state space S of σ, if for every state s in S every response function $P_Q(s)$, which maps s into a stable state, maps s into the same state that L maps s. That is, Q is globally correct with respect to S if and only if

\[ \forall s \in S,\ \forall P_Q :\quad L(P_Q(s)) = P_Q(s) \ \Rightarrow\ P_Q(s) = L(s) \]

Global completeness and consistency are prerequisites for correctness. That is, the definition of correctness implies that every state can be mapped into one and only one stable state. However, notice that global completeness and consistency do not imply correctness. That is, the definitions of global completeness and consistency do not ensure that the mappings provided by the sublaws and the system law are the same. This observation may be expressed by the following corollary.

Corollary: If a set of sublaws Q is correct with respect to a possible state space S, then Q is globally consistent and globally complete with respect to S.

2.2.4.2.2. Operational Definitions of Completeness and Consistency

Knowledge of the system law is required to test a set of sublaws for global completeness, global consistency, and correctness. In practice, the system law governing the behaviour of most real systems is approximated by the sublaws themselves. Notice that the global completeness and global consistency conditions require only knowledge of whether a system state is stable. While knowledge of the system law is required to assess stability (i.e. s is stable if $s = L(s)$), if a weaker definition of stable state is employed, a form of completeness and consistency testing becomes possible. Consider the following definition for LOCALLY STABLE STATE where $P_Q$ is some response function derived from the set of sublaws Q.

Definition: Locally Stable State
A system state s is locally stable if and only if there is no composition $P_Q$ of sublaws Q which can map the state into a different state. That is, s is locally stable with respect to Q if and only if

\[ \neg \exists P_Q \ \text{such that}\ s \neq P_Q(s) \]

A weaker form of completeness, called LOCAL COMPLETENESS, of the sublaws could be guaranteed by ensuring that there exists some response function mapping each system state into a locally stable state.

Definition: Local Completeness of Sublaws
Let Q be a set of sublaws describing the system σ which may enter states S'³⁰, then Q is locally complete if and only if

\[ \forall s \in S',\ \exists P_Q \ \text{such that}\ s' = P_Q(s) \ \text{is locally stable} \]

A weaker form of consistency, called LOCAL CONSISTENCY, of the sublaws could be established by ensuring that all possible response paths lead to the same final locally stable state.

³⁰ S' may not equal the possible state space S. S' is the set of states, both stable and unstable, which the sublaws and external events included in the model of the system are designed to consider. As will be discussed in more detail later, the stability conditions of the sublaws define the stable states of S' and the corrective actions define the unstable states.
Definition: Local Consistency of Sublaws
Let Q be a set of sublaws describing the system σ which may enter states S', then Q is locally consistent if and only if

\[ \forall s \in S',\ \forall P_Q, P'_Q :\quad P_Q(s) \ \text{and}\ P'_Q(s) \ \text{locally stable} \ \Rightarrow\ P_Q(s) = P'_Q(s) \]

Tests for local consistency and completeness are clearly inferior to the tests possible if the system law is known. However, local consistency and completeness testing does ensure that all known information is consistent and complete with respect to itself. Olive refers to local consistency and completeness as the "logical consistency of the model" (p. 73). A model is "logically consistent" if the outputs of the system are derivable from the inputs.

Most systems analysis and design methodologies do not provide any way to systematically verify the logical consistency of a model. CIAM (Gustafson, et al., 1982) and DADES (Olive, 1982) are notable exceptions. CIAM refers to tests for local completeness and consistency as "checking the satisfiability of information requirements". Each output must be expressible in terms of system inputs or information derived from those inputs. DADES refers to these tests as "derivability analysis" (p. 229). Derivability analysis is a formal method to show that outputs are derivable from inputs.

2.2.5. A Simple Example

Consider a hypothetical system consisting of four interconnected lights. Light "a" is connected in series with "b" so that if "a" is on then "b" will be on and if "a" is off, "b" will be off. If light "a" is off then light "c" will be on, and if "a" is on, light "d" will be on. Only the state of light "a" may be set manually.
The schematic diagram of a system implementation using digital logic and light emitting diodes is included as Appendix L.

The system may be described by the following state variables and states. The "on" state of a light will be represented by the integer 1 and the "off" state by 0.

    State Variables    States
    a                  1 or 0
    b                  1 or 0
    c                  1 or 0
    d                  1 or 0

One of many possible sets of sublaws, describing the stable states of the system and the actions to be taken should the system find itself in an unstable state, is given below. The corrective action rules are numbered for future reference when describing system response paths.

Sublaws

1. Stability Conditions:
        a   b
        0   0
        1   1
   Corrective Actions:
             Conditions      Actions
                 a     -->      b
        R1:      1              1
        R2:      0              0

2. Stability Conditions:
        a   c
        0   1
        1   0
        1   1
   Corrective Action:
             Conditions      Actions
                 a     -->      c
        R3:      0              1

3. Stability Conditions:
        a   d
        1   1
        0   1
        0   0
   Corrective Action:
             Conditions      Actions
                 a     -->      d
        R4:      1              1

Since only light "a" may be switched by the environment, there are two possible external events.

External Events

1. Set a = 1
2. Set a = 0

The stable³¹ state space of this system is shown below. Each state is labelled for future reference.

Stable States

             State Variable
    Label    a   b   c   d
    A        0   0   1   0
    B        0   0   1   1
    C        1   1   0   1
    D        1   1   1   1

Response paths are generated by first applying each event to each stable state, thus obtaining a state which might be unstable. Then the sublaws are used to try to bring the system to a final stable state. For example, a possible response path corresponding to the application of the event "set a = 1" to the first stable state A is as follows:

                                      a   b   c   d   Label
    Initial stable state              0   0   1   0   A
    Unstable state after event 1      1   0   1   0   E
    Unstable state after rule R1      1   1   1   0   F
    Stable state after rule R4        1   1   1   1   D

There may be more than one possible response path associated with each unstable state.
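The remark about multiple response paths can be checked mechanically for the four-lights model. The following Python sketch is an illustration only (the thesis tools themselves are Prolog-based); the tuple encoding of states and the helper names `fire`, `response_paths` and `final_states` are assumptions made for this example. It tries the corrective action rules R1 to R4 in every possible order until no rule changes the state, and treats "complete" and "consistent" in the local sense, as one reading of the definitions given earlier.

```python
# Illustrative sketch only; encodings and helper names are assumptions.
INDEX = {"a": 0, "b": 1, "c": 2, "d": 3}

# Corrective action rules: (name, required value of a, variable set, new value).
RULES = [("R1", 1, "b", 1), ("R2", 0, "b", 0), ("R3", 0, "c", 1), ("R4", 1, "d", 1)]

# Stable states A-D as (a, b, c, d) tuples, copied from the stable state table.
STABLE = {"A": (0, 0, 1, 0), "B": (0, 0, 1, 1), "C": (1, 1, 0, 1), "D": (1, 1, 1, 1)}


def fire(rule, state):
    """Return the state after the rule fires, or None if it would change nothing."""
    _, a_value, target, value = rule
    i = INDEX[target]
    if state[0] != a_value or state[i] == value:
        return None
    return state[:i] + (value,) + state[i + 1:]


def response_paths(state):
    """List every sequence of (rule, resulting state) steps ending where no rule fires."""
    steps = [(rule[0], fire(rule, state)) for rule in RULES
             if fire(rule, state) is not None]
    if not steps:
        return [[]]
    return [[(name, nxt)] + tail
            for name, nxt in steps
            for tail in response_paths(nxt)]


def final_states(initial, a_value):
    """Apply the external event 'set a = a_value', then collect all end states."""
    after_event = (a_value,) + initial[1:]
    return {path[-1][1] if path else after_event
            for path in response_paths(after_event)}


# Unstable state E = [1, 0, 1, 0], reached from A by the event "set a = 1".
paths_from_E = response_paths((1, 0, 1, 0))
rule_orders = [[name for name, _ in path] for path in paths_from_E]

outcomes = {(label, a): final_states(s, a)
            for label, s in STABLE.items() for a in (1, 0)}
complete = all(ends <= set(STABLE.values()) for ends in outcomes.values())
consistent = all(len(ends) == 1 for ends in outcomes.values())

print(rule_orders)
print(complete, consistent)
```

For state E the sketch finds two rule orders, R1 then R4 and R4 then R1, and both end in state D; over all eight combinations of stable state and event, every path ends in exactly one of the stable states A to D.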
In the above example, rule R4 could have been activated before rule R1. The precise ordering of activation of sublaws is not important so long as each response path ends in the same stable state (i.e. the sublaws are consistent).

³¹ For the remainder of this thesis, the terms "stable", "complete" or "consistent" shall mean locally stable, locally complete or locally consistent with respect to the defined sublaws.

Response paths are described using the following notation.

    [(initial = State_0), |Event_i, State_1>, |Rule_1, State_2>, |Rule_2, State_3>, ..., |Rule_n, State_n+1>]

where State_0 is the initial state to which event Event_i is applied, Rule_k is the name of the corrective action rule which moves the system from State_k to State_k+1, and State_n+1 is the final (and therefore stable) state of the system.

If the above analysis is repeated for the remaining stable states and events, the following unstable states and response paths may be generated.

Unstable States

    Label    a   b   c   d
    E        1   0   1   0
    F        1   1   1   0
    G        1   0   1   1
    H        0   1   0   1
    I        0   0   0   1
    J        0   1   1   1

Response Paths

    Path #   Event    Response Path
    1        a = 1    [(initial = A), |Event 1,E>, |R1,F>, |R4,D>]
    2        a = 1    [(initial = B), |Event 1,G>, |R1,D>]
    3        a = 1    [(initial = C), |Event 1,C>]
    4        a = 1    [(initial = D), |Event 1,D>]
    5        a = 0    [(initial = A), |Event 2,A>]
    6        a = 0    [(initial = B), |Event 2,B>]
    7        a = 0    [(initial = C), |Event 2,H>, |R2,I>, |R3,B>]
    8        a = 0    [(initial = D), |Event 2,J>, |R2,B>]

Every state entered as a result of an external event is transformed into a stable state. This means the sublaws are complete. The sublaws are consistent because all alternative response paths lead to the same stable states.

2.3. Implementation: The Specifications Analysis Tools

A set of Prolog-based specifications analysis tools³² has been created to facilitate the construction of a system model based on the system theory concepts presented in the previous section. These tools provide the following functions:

a. Testing for:
   1) syntactic errors in the system model (e.g. misplaced punctuation, inconsistent naming, etc.),
   2) stable condition coverage of the state variables (i.e. each state variable is referenced in at least one rule from the stability conditions of a sublaw),
   3) state variable variance (i.e. each state variable is assigned all of its defined values),
   4) conflicting sublaws, and
   5) local completeness and local consistency of the sublaws.
b. Determination of the stable state space of the system.
c. Determination of the unstable state space and response paths of the system.
d. Suggestion of possible decompositions.

³² The tools were implemented using Turbo Prolog (Borland, 1986) running on an IBM AT compatible microcomputer.

The modelling syntax required by the specifications analysis tools, and the various tests which can be applied to the model, will be described in the context of the four-lights example. The tests are further described in Appendix D. The procedures used to find the stable and unstable state spaces as well as the response paths of a system are described in Appendix E. System decomposition is described in the next chapter.

2.3.1. Entering a System Model

The user is required to create a text file listing all of the state variables, state variable values, sublaws, and external events to be included in the model. The four-lights example described earlier would be entered as described in the following subsections.
Text enclosed by /*...*/ is added for explanation only and is ignored by the tools.

2.3.1.1. State Variables

A simple one-place predicate is used to inform the tools that certain variables are to be included in the model. All state variables must be declared in this way. No additional state variables may be included in any further description of the model (e.g. in the sublaws or events). Predicates declaring the state variables used to describe the four-lights example would be created as follows:

    /* state variables */
    state_variable(a).
    state_variable(b).
    state_variable(c).
    state_variable(d).

2.3.1.2. Values

Each state variable may be assigned only a limited number of values³³. All possible values must be declared using the binary predicate "values(StateVariableName,Values)", where StateVariableName is the name of the state variable and Values is a list of possible values. If a state variable which was not declared using the "state_variable()" predicate is used here, an error message will be generated by the specifications analysis tools. Another error message will be produced if a state variable never assumes one of its defined values during the determination of the system's response paths. This last test ensures that the state variable value declarations are consistent with the defined dynamics of the system and is referred to as testing "state variable variance". Any mismatch would indicate either insufficiently defined dynamics (in the form of sublaws) or too many defined values.

³³ The problem of state variables which may be assigned a large number (perhaps infinite) of different values is addressed later in this chapter.

    /* state variable values */
    values(a,[0,1]).
    values(b,[0,1]).
    values(c,[0,1]).
    values(d,[0,1]).

2.3.1.3. Sublaws

The two components of sublaws (namely, stability conditions and corrective actions) are defined separately. Stability conditions describe the allowed combinations of state variable values in stable system states, and are used to determine the stable state space of the system. Corrective actions specify actions to be taken if the system is not in a stable state, and are used to find all response paths of the system.

There is some duplication of information in the two parts of a sublaw. The stable state space of the system could be determined by generating all possible combinations of state variable values and testing to see whether some corrective action rule could alter each of the possible states. If there is no corrective action rule which could alter a state, that state would be added to the stable state space of the system. Such a "generate and test" algorithm becomes computationally intractable as the number of combinations of state variable values increases. The use of stability conditions as formulated above allows a much more efficient method of determining the stable state space of the system.

Also, if there were a large number of possible system states, it would be easy for the analyst to accidentally omit a corrective action rule required to "correct" an unstable state. Should such an error occur, the unstable state would be incorrectly assumed to be stable. When separate definitions of both stability conditions and corrective actions are required, tests for local consistency and completeness can point to accidentally omitted rules [34].
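
The "generate and test" idea described above can be illustrated with a minimal Python sketch. It is not the Turbo Prolog implementation of the tools; the four-lights rules are transcribed by hand from the "static" and "dynamic" listings given later in this section (2.3.1.3.1 and 2.3.1.3.2), values are written as strings as in those listings, and every Python name below is an assumption made for illustration. Both functions enumerate all 16 possible states, so the second is only a cross-check against the stability conditions, not the more efficient procedure of Appendix E.

```python
from itertools import product

# Four-lights model, transcribed by hand from the listings in this section.
# The Python names and data layout are illustrative assumptions.
VALUES = {name: ["0", "1"] for name in "abcd"}

# dynamic(LawName, Conditions, Actions)
DYNAMIC = [
    ("D1", {"a": "0"}, {"b": "0"}),
    ("D1", {"a": "1"}, {"b": "1"}),
    ("D2", {"a": "0"}, {"c": "1"}),
    ("D3", {"a": "1"}, {"d": "1"}),
]

# static(LawName, Conditions)
STATIC = [
    ("S1", {"a": "0", "b": "0"}),
    ("S1", {"a": "1", "b": "1"}),
    ("S2", {"b": "0", "c": "1"}),
    ("S2", {"b": "1", "c": "0"}),
    ("S2", {"b": "1", "c": "1"}),
    ("S3", {"b": "1", "d": "1"}),
    ("S3", {"b": "0", "d": "0"}),
    ("S3", {"b": "0", "d": "1"}),
]


def matches(state, pairs):
    """True if every (variable, value) pair holds in the state."""
    return all(state[var] == val for var, val in pairs.items())


def all_states(values):
    """Generate every combination of declared state variable values."""
    names = sorted(values)
    for combo in product(*(values[n] for n in names)):
        yield dict(zip(names, combo))


def stable_by_generate_and_test(values, dynamic):
    """A state is stable if no corrective action rule would alter it."""
    stable = []
    for state in all_states(values):
        altered = any(
            matches(state, cond) and not matches(state, actions)
            for _, cond, actions in dynamic
        )
        if not altered:
            stable.append(state)
    return stable


def stable_by_conditions(values, static):
    """Filter states with the stability conditions: rules sharing a law
    name are ORed together, and different law names are ANDed."""
    laws = {}
    for name, cond in static:
        laws.setdefault(name, []).append(cond)
    return [
        state for state in all_states(values)
        if all(any(matches(state, c) for c in rules) for rules in laws.values())
    ]
```

Under these assumptions both functions return the same four states for the four-lights model. A disagreement between them would be the kind of mismatch between corrective actions and stability conditions that the local completeness and consistency tests are meant to expose.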
Explicit statement of stability conditions also allows the system to be described by a smaller number of rules. There may be some possible states which the system should never enter. In a natural system, where the system law reflects fundamental properties of the physical universe, such states could be impossible (e.g. it is impossible for a mass on the surface of a planet to be falling up). However, in a man-made system where the system law may be imperfectly enforced, it is possible for the system to enter an unexpected state. For example, in a simple accounting system the value of the assets represented by some account may not equal the balance of that account, if some user manually altered the balance. Most accounting systems would have controls in place to prevent such alterations. However, if this sort of tampering was not included among the defined external events for the system, the model might not include any sublaws to deal with the situation. It would be extremely difficult, if not impossible, to anticipate all such undesirable events. When dealing with man-made systems, a model can only approximate the operation of the original system since the external events considered comprise only a subset of all possible external events.

[Footnote 34: The tests for local completeness and consistency provide a kind of cross-check between the stability conditions and corrective actions of the sublaws as well as the defined external events.]

2.3.1.3.1. Stability Conditions

Stability conditions are represented using the binary predicate "static(LawName,Conditions)", where LawName is some arbitrary name for the law and Conditions is a list of state variable name and value pairs, all of which must occur together in each stable state of the system. A stable condition may consist of more than one rule.
This is modelled using several clauses with the same LawName parameter. A stable state need satisfy only one of the rules forming the stability conditions of a particular sublaw [35]. That is, "static()" clauses with the same name are combined using an inclusive OR condition. Clauses with different LawNames are combined using an AND condition. If a state variable name or value which was not declared with a "state_variable()" or "values()" predicate is used to define a stable condition rule, the specifications analysis tools will issue an error message.

Other tests of the stability conditions are described in Appendix D. These ensure that every defined state variable is mentioned in at least one stable condition rule (referred to as "stable condition coverage"), and that the stable condition rules do not conflict with each other.

[Footnote 35: The relationship between stable states and stability conditions is described in more detail in Appendix E.]

The stability conditions for the four-lights example are defined as follows:

    /* stability conditions */
    static("S1",[v(a,"0"),v(b,"0")]).
    static("S1",[v(a,"1"),v(b,"1")]).
    static("S2",[v(b,"0"),v(c,"1")]).
    static("S2",[v(b,"1"),v(c,"0")]).
    static("S2",[v(b,"1"),v(c,"1")]).
    static("S3",[v(b,"1"),v(d,"1")]).
    static("S3",[v(b,"0"),v(d,"0")]).
    static("S3",[v(b,"0"),v(d,"1")]).

2.3.1.3.2. Corrective Actions

Corrective actions are represented using the ternary predicate "dynamic(LawName,Conditions,Actions)". Again, LawName is some arbitrary name.
Conditions is a list of activation conditions, consisting of state variable name and value pairs, which must be satisfied by an unstable state before the corrective action rule is allowed to affect the system state (i.e. to "fire"). Actions is a list of state variable name and value pairs which specify the final values of certain state variables after the sublaw is allowed to "fire". For example, the corrective action rule

    dynamic("D3",[v(a,"1")],[v(d,"1")]).

means that if the value of "a" is "1" then "d" should be set to "1". If a state variable is not mentioned in the list of final values, it is assumed to keep the value it had before the sublaw was fired. All corrective action rules, whether they have the same name or not, are combined using an OR condition. If a state variable name or value which was not declared with a "state_variable()" or "values()" predicate is used to define a corrective action rule, the specifications analysis tools will issue an error message.

The corrective actions for the four-lights example are defined as follows:

    /* corrective actions */
    dynamic("D1",[v(a,"0")],[v(b,"0")]).
    dynamic("D1",[v(a,"1")],[v(b,"1")]).
    dynamic("D2",[v(a,"0")],[v(c,"1")]).
    dynamic("D3",[v(a,"1")],[v(d,"1")]).

2.3.1.4. External Events

External events are defined using the binary predicate "event(EventName,Actions)", where EventName is some arbitrary name for the event, and Actions is a list of the state variables altered by the external event together with their altered values.
As in the case of sublaws, if a state variable name or value which was not declared with a "state_variable()" or "values()" predicate is used to define an external event, the specifications analysis tools will issue an error message. The external events affecting the four-lights example are defined as follows:

    /* External Events */
    event("E1",[v(a,"0")]).
    event("E2",[v(a,"1")]).

The various tests performed by the specifications analysis tools are summarized in Table I.

Table I: Tests performed by the specifications analysis tools.

| Diagnostic Type | When Identified | How Identified | Possible Meaning |
|---|---|---|---|
| syntax | when model is loaded into Prolog | illegal Prolog syntax | typing errors |
| naming consistency | before generation of stable state space | name or value in sublaw or event does not match declarations | 1. spelling error; 2. insufficient defined state variables; 3. insufficient defined values |
| static sublaw coverage | before generation of stable state space | a state variable is not referenced in a static sublaw | 1. missing static sublaw; 2. too many defined state variables |
| static sublaw conflict | before generation of stable state space | see Appendix D | inconsistent specification of static sublaws |
| state variable variation | after generation of response paths | a state variable does not obtain all of its defined values | 1. too many defined values; 2. missing external events |
| local completeness | during generation of response paths | dynamic sublaws cannot move system to a stable state after application of some event | 1. too many defined external events; 2. missing or incorrect dynamic sublaws; 3. missing or incorrect static sublaws |
| local consistency | during generation of response paths | dynamic sublaws can move the system to more than one stable state following the application of some event | improperly defined dynamic sublaws |

2.3.2. A More Extended Example

The above example was simplified by the fact that each state variable had a small number of discrete values. What happens if there exists a state variable with a very large or even infinite number of possible values? Complete explication of sublaws in the manner described above would be impossible. To model systems described by such state variables it is necessary to reduce the level of detail of the sublaws. The statics and dynamics must be described qualitatively, where each state variable may take on only a small number of values. These state variable values are called SUBREGIONS in keeping with the work of De Kleer and Brown (1985). Subregions are bounded by critical values of the real-world state variable. For example, in an inventory management system, knowledge of the exact quantity on hand of a particular item is probably not important in order to describe the operation of the system. It is likely that certain actions will be taken if the quantity is either above or below a certain critical value, say the economic order quantity. In this case the state variable "quantity_on_hand" might be modelled as having two discrete values: "under_eoq" and "over_eoq".

Use of state variables with values that are actually subregions is illustrated in the following example adapted from Wand and Weber (1989). Consider a payroll system for a company [36]. The company has two types of jobs: office and sales. An employee may be in either a regular or a managerial position.
Salaries consist of base pay, overtime pay and commissions. The way in which total salary is calculated depends on the job type and employee position. Company policy is as follows:

- the office staff is entitled to overtime pay but not to commissions.
- the sales staff is entitled to commissions but not to overtime pay.
- managers are entitled to neither overtime pay nor commissions.
- hours and sales are recorded for all employees. (This might happen if managers are required to report hours and office workers may take a telephone order.)
- all employees receive benefits.

Also assume that all payroll processing takes place at the end of some period.

[Footnote 36: This system will later be referred to as the "initial" payroll system to distinguish it from a similar system to be referred to as the "modified" payroll system. These systems are fairly simple. There is no intention to suggest that real payroll systems can be as easily modelled as these examples. A more complicated "real" system will be examined in Chapter 6.]

This system would be entered into the specifications analysis tools as shown in Appendix F. The only external events modelled affect the state variable "end". Its value changes from "0" to "1" at the end of the period and from "1" to "0" at the start of the next period. Most continuous state variables are represented using two subregions. For example, the "sales" state variable may have values of either zero or positive ("0" or "nz" in the model). The state variable for "hours worked" is somewhat more complicated. An employee may work sufficient hours to qualify for overtime pay and base pay, a lesser number of hours for which he will only receive base pay, or no hours at all.
In the model, the state variable representing "hours worked" may take on any of three values:

    hours = ot   - sufficient hours to qualify for overtime pay and base pay.
    hours = reg  - employee to receive base pay only.
    hours = "0"  - no hours worked.

This system model has ninety-six stable states. Forty-eight of these states represent the initial states of the system when the state variable "end" has a value of "0". In these states all quantities to be calculated at the end of the period have a value of "0". The other forty-eight states represent the final system states when "end" has a value of "1" and base pay, overtime, benefits, commissions and total pay have been calculated.

Alternatively, the payroll system could be modelled without use of the state variable "end" as shown in Appendix G. It is a somewhat more abstract representation, in that the concept of end-of-period processing has been eliminated. In a sense, the above model is "batch" and this model is "interactive" [37]. The model describes the allowable configurations (i.e. stable states) of state variable values after all processing has been completed. External events then become those occurrences which can alter these stable configurations, as opposed to the massive transition represented by end-of-period processing. The five events defined for this system occur when an employee is reported to have worked a number of hours or made some sales. Also notice that the benefits state variable has only one possible value: non-zero. This is because its value does not depend on the value of any other state variable in the new model.
As shall be shown in the next chapter, the decompositions generated automatically by the specifications analysis tools are similar, but not the same, for the models of the batch and interactive systems. In particular, the benefits state variable does not appear in any decomposition of the interactive system. The reason for this will be discussed in the next chapter.

[Footnote 37: Pick (1986) defines "batch" and "interactive" as follows. "Batch" describes systems where a number of similar input items are grouped together for processing during the same machine run (p. 622). In the batch example, the machine run occurs at the end of the period. "Interactive" describes systems where the user has rapid two-way communication with a computer (p. 670). In the interactive example, calculated values are updated without an end-of-period external event.]

This new model has forty-eight stable states. These states are identical to the stable states of the "batch" system when the state variable "end" has the value "1", except that "end" is not included. It should be noted that although this representation has fewer stable states, it is not necessarily more efficient than the "batch" representation. There were only two events defined for the first model, whereas this model has five. This means that 192 (96 times 2) response paths had to be determined for the "batch" model, and 240 (48 times 5) had to be found for the "interactive" model.

2.4. Conclusions

A formalism for the representation of systems has been developed. SELMA is notable for its focus on laws rather than on procedures.
Consistent representation of the linkages between the properties of the system, in the form of sublaws, facilitates tests of both completeness and consistency of the system description. Sublaws are seen as a practical way to formulate a global system law. The analyst may focus his or her attention on small parts of the system, and still ensure that the sublaws form a complete and consistent model of the system.

A basic implementation of a set of Prolog tools to support SELMA has been described. While its use has been shown to be feasible for some small problems, further testing is required. A larger "real" system needs to be considered. It is possible that, even with the use of state variable subregions, as the number of state variables increases, there could be an unacceptably rapid increase in the number of stable system states. However, it should be noted that a relationship of the form:

    number of system states = (number of values for state variable 1)
                              * (number of values for state variable 2)
                              * ...
                              * (number of values for state variable n)

could only occur if each state variable were "independent" of every other state variable. In other words, a very large number of system states is only expected if for every possible value of each state variable every other state variable could have each of its possible values. This would mean that the size of the stable state space of the system equals the size of its possible state space. This sort of behaviour is not expected for most interesting systems, as it implies no coupling among the state variables. For example, the simple four-lights example has a possible state space with 16 (= 2^4) states, but there are only four stable states.
Also, the "batch" payroll system has a possible state space with 3072 (= 2^10 * 3) states, but there are only 96 stable states.

As currently implemented, the specifications analysis tools have a very limited syntax. In some cases, coding of sublaws could be made more efficient if state variables could be described as not having a certain value, rather than specifying all the values they may have. Also, many systems are likely to require qualitative addition and multiplication (e.g. the payroll system example). The formats of the sublaws which represent these operations are well defined. De Kleer et al. (1985) define qualitative addition and multiplication as follows:

[Tables: "Addition" and "Multiplication" of qualitative values, with entries drawn from 0, + and ?]

Ambiguities may arise when adding quantities of different sign. However, they can probably be avoided through careful definition of state variable values. Avoiding ambiguities then becomes the responsibility of the analyst and not the specifications analysis tools. The specifications analysis tools could be enhanced to support simple rendering of addition and multiplication operations. However, a limited syntax is sufficient to illustrate the feasibility of SELMA.

Chapter 3: A Theory of Decomposition

The problem of identifying the subsystems from which a system is composed is not trivial. This chapter begins with an intuitive example illustrating the difficulty of decomposition. This is followed by formal definitions of several concepts related to system decomposition. A number of heuristics and theorems, used to limit the search space of possible decompositions, are also presented.
Finally, a decomposition algorithm compatible with SELMA is described and its use is demonstrated on several simple systems.

3.1. General

Bunge (1979, p. 11) describes system decomposition on the basis of identifiable things. However, only by observing a system's behaviour can a designer hope to discover into what parts the system may be decomposed. The behaviours of the properties [38] describing a system, and not the things from which it is constructed, are of primary importance to decomposition (Simon and Ando, 1961).

[Footnote 38: It could be argued that only things can exhibit behaviour. However, SELMA does not explicitly model things. For the purposes of this research, it is sufficient to describe a system's behaviour by describing the observed changes of the properties.]

Consider a simplified bicycle system. Many people would recognize the following things as being parts of a bicycle.

    Things
        front wheel
        rear wheel
        pedals
        frame
        front forks
        handle bars
        chain

Some state variables representing the properties of the bicycle are listed below. Notice that only normal operation of a bicycle is being modelled. That is, we are not concerned with skidding, falling over, etcetera.

    State Variables
        front wheel angle [39]
        front wheel rotational speed
        rear wheel rotational speed
        pedal rotational speed
        front fork angle
        handle bar angle
        frame speed
        chain rotational speed

A reasonable decomposition on the basis of things might include the following subsystems:

    front end:
        front wheel
        front forks
        handle bars
    rear end:
        rear wheel
        pedals
        chain

It is not clear with which subsystem the frame should be associated as it spans both the front and rear ends. Also notice that the behaviour of the front wheel will be related to the behaviour of the rear wheel.
Under normal operating conditions the rotational speed of the two wheels will be the same [40]. This dependency implies that the two subsystems are coupled. In general, coupling between two subsystems exists when the behaviours of the two subsystems are not independent. In this case, coupling can be avoided if the subsystems are selected on the basis of steering and forward motion state variables as shown:

    steering subsystem:
        front wheel angle
        front fork angle
        handle bar angle
    forward motion subsystem:
        front wheel rotational speed
        rear wheel rotational speed
        pedal rotational speed
        chain rotational speed
        frame speed

[Footnote 39: Front wheel angle, front fork angle, and handle bar angle are all horizontal angles measured relative to the frame of the bicycle.]

[Footnote 40: Assuming front and rear wheels of the same radius.]

Insofar as these last two subsystems can be given meaningful names, they do represent things. However, it is argued that the things represented are not intuitively obvious. Many analysts would not consider "splitting" a physical object (e.g. the front wheel) between two subsystems. The only "behaviour" suggesting the first decomposition occurs during bicycle assembly. Assembly contexts are far too tempting a criterion for decomposition. An analyst needs to consider the behaviour of a system in all contexts of interest. In SELMA, different contexts are represented by different external events. Analysts who consider only decompositions consisting of obvious things may miss "superior" alternative decompositions. It may happen that the "good" subsystems, identified by the decomposition methodology presented here, will have state variables corresponding to the properties of an intuitively obvious thing, but this is by no means certain.

3.2. The Decomposition Formalism

The meaning of decomposition will be formally defined in this section, but first some terms for describing system dynamics must be introduced.

Definition: Subsystem

Any subset X of the state variables describing a system σ will describe a SUBSYSTEM of σ. For convenience, X may be referred to as a subsystem [41].

Not all subsets of state variables will describe reasonable subsystems. For the bicycle example, one possible unreasonable subsystem would be described by "front wheel angle" and "pedal rotational speed". The development of criteria for selecting reasonable subsystems is the major purpose of this chapter.

[Footnote 41: A subsystem consists of more than just a set of descriptive state variables. There must also be rules for governing subsystem behaviour. However, for purposes of system decomposition, it is sufficient to identify a subsystem by a set of state variables.]

Definition: Projection of a Subsystem

The state x of a subsystem X of a system σ when σ is in state s is called the PROJECTION of s onto X, x = proj(s,X). For example, consider the following state of the bicycle system.

    State Variable                  Value
    front wheel angle               turning left
    front wheel rotational speed    positive
    rear wheel rotational speed     positive
    front fork angle                turning left
    handle bar angle                turning left
    pedal rotational speed          zero
    chain rotational speed          zero
    frame speed                     positive

That is, the bicycle is coasting around a left turn. The projection of this state onto the previously identified steering subsystem is

    State Variable                  Value
    front wheel angle               turning left
    front fork angle                turning left
    handle bar angle                turning left

It should be noted that there may be many system states having the same projection.
The state of the steering subsystem would be the same if the bicycle was pedalled (as opposed to coasted) around a left corner.

Definition: Deterministic Subsystem

Wand and Weber (1988) hypothesize that all good decompositions will satisfy the following requirement:

    The behaviour of each subsystem is determined only by those state variables describing the subsystem.

This means a decomposition is good if each subsystem behaves deterministically. A subsystem behaves deterministically if its final state is functionally determined by its initial state, that is, if for every initial state of the subsystem there is only one possible final state of the subsystem. This implies that all the information necessary to determine the final state of the subsystem is already contained in the subsystem. It is not necessary to consider the states of other subsystems in order to decide how the subsystem will behave. If the state of a subsystem depends on the state of another, the subsystems are coupled. Therefore, this requirement will ensure that there is no coupling between the subsystems of a good decomposition. Wand and Weber's requirement may be formally expressed as follows:

Let σ be a system with system law L, and let R be a set of states of σ. A subsystem X of σ is deterministic with respect to R and L if and only if all system states s in R having the same initial subsystem state proj(s,X) have the same final subsystem state proj(L(s),X). That is:

    X is deterministic with respect to R and L if and only if
        FOR ALL s1, s2 such that s1 ∈ R and s2 ∈ R,
        and such that proj(s1,X) = proj(s2,X),
            proj(L(s1),X) = proj(L(s2),X)

A subsystem is characterized by a set of descriptive state variables.
The behaviour of a deterministic subsystem can be defined by a function involving only these state variables. This function may be expressed by a sublaw after considering the subsystem state changes between initial and final states. For example, consider a system described by binary state variables {x,y,z}. Assume that the system dynamics are defined by the following unstable state space and corresponding final stable states.

    Unstable States        Corresponding Final Stable States
    x  y  z                x  y  z
    0  1  0                1  1  1
    0  1  1                1  1  1
    1  1  1                1  0  0
    1  1  0                1  0  0

We see that {y,z} is not a deterministic subsystem, since the initial subsystem state (1,1) corresponds to final subsystem states (1,1) and (0,0) (from system states (0,1,1) and (1,1,1) respectively). However, {x,y} is a deterministic subsystem in that no initial subsystem state corresponds to more than one final state. The state transitions for the subsystem {x,y} are as shown:

    x  y   -->   x  y
    0  1         1  1
    1  1         1  0

The corrective actions of a sublaw describing this behaviour could be expressed as follows:

    Corrective Actions:
        Conditions          Actions
        X  Y        -->     X  Y
        0  1                1  1
        1  1                1  0

Definition: Internal Event

External events alter the values of some of the state variables describing a system. The system responds to the external event by further altering the values of its state variables until it enters a stable state. This further alteration of state variables is accomplished through INTERNAL EVENTS. The action of a sublaw, as described in Chapter 2, corresponds to an internal event. The actions of external and internal events both result in system state changes. The sequence of state variable value changes as the system moves towards a stable state constitutes a response path. Since there may exist many system response paths leading to the same stable state, an external event need not be followed by a unique sequence of internal events.
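
The remark that several response paths may lead to the same stable state can be seen in the four-lights model of Chapter 2. The following Python sketch is illustrative only: it is not the procedure of Appendix E, the rule and event data are transcribed by hand from the Chapter 2 listings, and all function names are assumptions. Each firing of a corrective action rule is treated as one internal event.

```python
# Four-lights corrective actions and external events, transcribed by hand
# from the Chapter 2 listings. All Python names here are assumptions.
DYNAMIC = [
    ("D1", {"a": "0"}, {"b": "0"}),
    ("D1", {"a": "1"}, {"b": "1"}),
    ("D2", {"a": "0"}, {"c": "1"}),
    ("D3", {"a": "1"}, {"d": "1"}),
]
EVENTS = {"E1": {"a": "0"}, "E2": {"a": "1"}}


def fire(state, assignments):
    """Return a copy of the state with the given variables overwritten."""
    new_state = dict(state)
    new_state.update(assignments)
    return new_state


def enabled(state, dynamic):
    """Rules whose conditions hold and whose actions would change the state."""
    return [
        (name, actions)
        for name, cond, actions in dynamic
        if all(state[v] == x for v, x in cond.items())
        and any(state[v] != x for v, x in actions.items())
    ]


def response_paths(state, dynamic, _seen=()):
    """List every (sequence of fired rule names, final stable state) pair
    reachable from an unstable state, firing one rule at a time."""
    key = tuple(sorted(state.items()))
    if key in _seen:
        # Simplifying assumption: a revisited state is reported as an error.
        raise ValueError("corrective actions cycle; no stable state reached")
    rules = enabled(state, dynamic)
    if not rules:
        return [([], state)]  # no rule can alter the state, so it is stable
    paths = []
    for name, actions in rules:
        next_state = fire(state, actions)
        for tail, final in response_paths(next_state, dynamic, _seen + (key,)):
            paths.append(([name] + tail, final))
    return paths
```

For instance, applying event "E2" to the stable state a = "0", b = "0", c = "1", d = "0" gives two response paths under these assumptions, D1 then D3 and D3 then D1, and both end in the same stable state a = b = c = d = "1". Paths ending in different stable states would correspond to the local consistency diagnostic of Table I.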
The change of state from an unstable to a stable state can be viewed as a sequence of internal events. For example, consider the bicycle system at rest (i.e. "frame speed" = zero). If the pedals are made to rotate, the chain, rear wheel and front wheel must also begin to rotate. However, the bicycle might be modelled such that the chain and rear wheel begin to rotate before the front wheel and frame begin to move (e.g. some "play" in the free wheel mechanism). In this case two internal events would follow the external event as shown:

    external event:    rotate pedals
    internal event 1:  rotate chain and rear wheel
    internal event 2:  rotate front wheel and move frame

The bicycle system can be viewed as entering a number of unstable states after the action of an external event before once again achieving a stable state. One possible series of unstable states leading to a stable state is shown below. Other sequences of internal events are possible, but if the system is complete and consistent, all such sequences will lead to the same stable state. Changes to system states are indicated with a "*".
Initial Stable State: stopped, with front wheel pointing straight ahead

    State Variable                  Value
    front wheel angle               straight
    front wheel rotational speed    zero
    rear wheel rotational speed     zero
    front fork angle                straight
    handle bar angle                straight
    pedal rotational speed          zero
    chain rotational speed          zero
    frame speed                     zero

External Event: start pedalling

First Unstable State:

    State Variable                  Value
    front wheel angle               straight
    front wheel rotational speed    zero
    rear wheel rotational speed     zero
    front fork angle                straight
    handle bar angle                straight
    pedal rotational speed          positive *
    chain rotational speed          zero
    frame speed                     zero

First Internal Event: set values of chain and rear wheel rotational speed

Second Unstable State:

    State Variable                  Value
    front wheel angle               straight
    front wheel rotational speed    zero
    rear wheel rotational speed     positive *
    front fork angle                straight
    handle bar angle                straight
    pedal rotational speed          positive
    chain rotational speed          positive *
    frame speed                     zero

Second Internal Event: set values of front wheel speed

Final Stable State: moving straight ahead

    State Variable                  Value
    front wheel angle               straight
    front wheel rotational speed    positive *
    rear wheel rotational speed     positive
    front fork angle                straight
    handle bar angle                straight
    pedal rotational speed          positive
    chain rotational speed          positive
    frame speed                     positive *

The external event altered the value of "pedal rotational speed". The first internal event changed the values of state variables "chain rotational speed" and "rear wheel rotational speed" based on the value of "pedal rotational speed".
The second internal event updated the values of state variables "frame speed" and "front wheel rotational speed". The final values of any of the previously altered state variables could have been used as the basis for this second change. For the sake of argument, assume that the final value of "rear wheel rotational speed" was used.

The process of altering the values of state variables through internal events shall be called an UPDATE. Updates involve sets of state variables, or subsystems. In the above example, the subsystem {pedal rotational speed, chain rotational speed, rear wheel rotational speed} was used to update the first unstable state to the second unstable state. Then the subsystem {rear wheel rotational speed, frame speed, front wheel rotational speed} was used to update the second unstable state to the final stable state. The notion of system updates can be formalized as follows:

Definition: Updating

Let σ be a system with system law L, R be a set of states of σ, and U be the state variables used to describe a set of subsystems of σ [42]. A set of states R' is UPDATED with respect to U and R by setting, in each system state s in R, the values of those state variables which are elements of U equal to their values in the final stable state L(s). That is, if SV is the set of all state variables describing σ, then

    R' = {s'} such that
        THERE EXISTS s such that s ∈ R and
        FOR ALL v such that v ∈ SV,
            (proj(s',{v}) = proj(L(s),{v}) and v ∈ U) or
            (proj(s',{v}) = proj(s,{v}) and v ∉ U)

[42] In the previous example, updates involved single subsystems only. In general, a set of subsystems may be used to perform an update.

The "or" separates two possibilities for the value of each state variable in an updated system state.
The first possibility occurs when the state variable is found in the set of subsystems used to update the initial system state. In this case, the value of the state variable in the updated system state is equal to its value in the final stable state of the system. The second possibility occurs when the state variable is not used to describe any subsystem used to update the initial system state. In this case, the value of the state variable is left unchanged.

As another example, consider a bicycle beginning to move to the left after the rider begins to pedal. A possible initial unstable state/final stable state pair for this situation is shown below.

    State Variable                  Initial Unstable Values    Final Stable Values
    front wheel angle               straight                   turning left
    front wheel rotational speed    zero                       positive
    rear wheel rotational speed     zero                       positive
    front fork angle                straight                   turning left
    handle bar angle                turning left               turning left
    pedal rotational speed          positive                   positive
    chain rotational speed          positive                   positive
    frame speed                     zero                       positive

The initial state is clearly unstable as the pedals are turning but the wheels are not yet spinning. The state could be updated with respect to the forward motion subsystem, identified earlier, by setting "front wheel rotational speed" and "rear wheel rotational speed" to "positive". However, the resulting updated system state would still not be stable, since the handle bars and the wheels are not pointing in the same direction. If the system were further updated with respect to the steering subsystem, the resulting system state would be stable.
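The update operation applied to this bicycle example can be sketched as follows. The function name `update` and the dictionary representation are illustrative choices. The exact membership of the forward motion and steering subsystems is not restated in this section, so the two sets below are assumptions made only so that the sketch runs; in particular, "frame speed" is assumed to belong to the forward motion subsystem.

```python
from typing import Dict, Set

State = Dict[str, str]

INITIAL: State = {
    "front wheel angle": "straight",
    "front wheel rotational speed": "zero",
    "rear wheel rotational speed": "zero",
    "front fork angle": "straight",
    "handle bar angle": "turning left",
    "pedal rotational speed": "positive",
    "chain rotational speed": "positive",
    "frame speed": "zero",
}

# Final stable state L(s) associated with INITIAL, as tabulated in the text.
FINAL: State = {
    "front wheel angle": "turning left",
    "front wheel rotational speed": "positive",
    "rear wheel rotational speed": "positive",
    "front fork angle": "turning left",
    "handle bar angle": "turning left",
    "pedal rotational speed": "positive",
    "chain rotational speed": "positive",
    "frame speed": "positive",
}

# ASSUMPTION: subsystem membership is chosen for illustration only.
FORWARD_MOTION: Set[str] = {
    "pedal rotational speed", "chain rotational speed",
    "rear wheel rotational speed", "front wheel rotational speed",
    "frame speed",
}
STEERING: Set[str] = {"handle bar angle", "front fork angle", "front wheel angle"}


def update(state: State, final: State, variables: Set[str]) -> State:
    """Set variables in U to their final stable values; leave the rest unchanged."""
    return {name: (final[name] if name in variables else value)
            for name, value in state.items()}


after_forward = update(INITIAL, FINAL, FORWARD_MOTION)
after_steering = update(after_forward, FINAL, STEERING)

print(after_forward == FINAL)   # False: wheels turn, but steering is inconsistent
print(after_steering == FINAL)  # True: the state is now stable
```

The two branches of the conditional expression in `update` mirror the two alternatives joined by "or" in the formal definition.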
Updating refers to altering a set of states to reflect the completion of some activities within the system. As shall be shown, it is the update which reflects sequential decomposition. A few more definitions will make it easier to discuss updates as they pertain to system decomposition.

Definition: First Intermediate State Space and System Relation

The set of all system states which result from the action of any external event in the set of external events E on a stable state of the system σ is called the FIRST INTERMEDIATE STATE SPACE (First ISS) of σ with respect to E. E will always include the NULL EVENT. The null event does not change the value of any state variable. The first ISS and the final stable system states associated with each member state comprise the FIRST SYSTEM RELATION of σ with respect to E.

These concepts were used in Chapter 2. The first intermediate state space is the set of states for which response paths leading to unique stable states must be found, if the system model is to be complete and consistent. The initial unstable states and the associated stable states comprise the first system relation. The first system relation shows to which stable state the system will move should it be in an unstable state as the direct result of the action of an external event.

Definition: Nth Intermediate State Space and System Relation

The set of all system states s, where s results from a given set of N updates being applied to each state of the first intermediate state space of σ, is called an Nth INTERMEDIATE STATE SPACE (Nth ISS). The Nth ISS and the final stable system states associated with each member state comprise the Nth SYSTEM RELATION.
Definition: Level

The set of subsystems used to update an intermediate state space will be called a LEVEL.

Decomposition, itself, may now be formally defined.

Definition: Decomposition

If a series of updates is begun with the first ISS of σ with respect to external events E, and ends when the updated ISS contains only stable states, the resulting sequence of levels is called a DECOMPOSITION of system σ with respect to external events E. If only deterministic subsystems (as defined above) are used to perform the updates, the resulting sequence of levels is called a DETERMINISTIC DECOMPOSITION.

Unfortunately, there will be, in general, a very large number of deterministic subsystems with respect to any ISS of a system. For example, any subset of state variables whose values do not change between states in the ISS and the corresponding final stable states will describe deterministic subsystems. Consider the following first system relation for a system described by four binary state variables {a,b,c,d}:

    First Intermediate        Corresponding
    State Space               Final Stable States
    a  b  c  d       -->      a  b  c  d
    0  0  0  0                0  0  1  1
    0  1  0  0                0  1  0  0
    1  0  0  0                1  0  0  0
    1  1  0  0                1  1  1  1

The subsystems {a}, {b}, {a,b}, {a,b,c}, {a,b,d}, and {a,b,c,d} are all deterministic. Any subset of these deterministic subsystems may be used to update the first ISS. Any ISS thus created may be further updated using any subsystem that is deterministic with respect to the new space. This process will lead to at least (2^6)! ≈ 1.3 × 10^89 deterministic decompositions [43]. Most of these deterministic decompositions will be of no interest to the analyst. Several rules for avoiding these "useless" decompositions will be discussed following the next section of this chapter.
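The count quoted above can be checked with a short brute-force sketch. The helper names and the tuple representation of states are assumptions made for illustration; the thesis tools are not claimed to work this way. The sketch enumerates every non-empty subset of {a,b,c,d}, keeps those that are deterministic with respect to the first system relation, and evaluates (2^n)! for the resulting n.

```python
from itertools import combinations
from math import factorial

VARIABLES = ("a", "b", "c", "d")

# First system relation of the example: (first ISS state, final stable state).
FIRST_RELATION = [
    ((0, 0, 0, 0), (0, 0, 1, 1)),
    ((0, 1, 0, 0), (0, 1, 0, 0)),
    ((1, 0, 0, 0), (1, 0, 0, 0)),
    ((1, 1, 0, 0), (1, 1, 1, 1)),
]


def proj(state, names):
    """Project a state tuple onto the named state variables."""
    return tuple(state[VARIABLES.index(n)] for n in names)


def is_deterministic(relation, names):
    """True if no initial subsystem state maps to two final subsystem states."""
    seen = {}
    for initial, final in relation:
        after = proj(final, names)
        if seen.setdefault(proj(initial, names), after) != after:
            return False
    return True


deterministic = [
    set(names)
    for size in range(1, len(VARIABLES) + 1)
    for names in combinations(VARIABLES, size)
    if is_deterministic(FIRST_RELATION, names)
]

# Lower bound on the number of deterministic decompositions: (2^n)!
lower_bound = factorial(2 ** len(deterministic))

print(len(deterministic))    # 6
print(f"{lower_bound:.1e}")  # 1.3e+89
```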
[43] If n is the number of good subsystems, there are 2^n ways to select a subset of the good subsystems. Each permutation of these subsets will correspond to a good decomposition. Therefore, there are at least (2^n)! good decompositions for a system with n good subsystems. There may be even more good decompositions if subsystems, which are not good with respect to the first intermediate state space, become good as a result of an update operation.

3.3. Decomposition Syntax

In this and later chapters it will be necessary to discuss, and even compare, many decompositions. A consistent representation scheme is required. Two such schemes will be defined in this section. The first conveys the most information, but is somewhat difficult to interpret without practice. The second is diagrammatic and emphasizes the linkages between subsystems. Both will be used as appropriate.

Consider the system described by binary state variables {a,b,c,d} with a system relation as shown above. If the first ISS is updated using {a,b,c} and {b}, the new or second ISS contains the following states.

    First               Second              Corresponding
    ISS                 ISS                 Final Stable States
    a  b  c  d   -->    a  b  c  d   -->    a  b  c  d
    0  0  0  0          0  0  1  0          0  0  1  1
    0  1  0  0          0  1  0  0          0  1  0  0
    1  0  0  0          1  0  0  0          1  0  0  0
    1  1  0  0          1  1  1  0          1  1  1  1

The subsystems {a}, {b}, {c}, {a,b}, {a,c}, {b,c}, {a,b,d}, and {c,d} are all deterministic with respect to this second ISS. If {c,d} is selected for an update, the third ISS becomes

    Second              Third               Corresponding
    ISS                 ISS                 Final Stable States
    a  b  c  d   -->    a  b  c  d   -->    a  b  c  d
    0  0  1  0          0  0  1  1          0  0  1  1
    0  1  0  0          0  1  0  0          0  1  0  0
    1  0  0  0          1  0  0  0          1  0  0  0
    1  1  1  0          1  1  1  1          1  1  1  1

The states of the third ISS are the same as the corresponding final stable states. Two levels have been defined: {{a,b,c},{b}} and {{c,d}}. Together they form a deterministic decomposition of the system.
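The two-level decomposition just described can be replayed mechanically. The sketch below is a minimal illustration, assuming states are stored as tuples ordered (a,b,c,d) and that a level is a list of sets of variable names; the function names are hypothetical. Each level is first checked for determinism against the current ISS and then used to update it.

```python
from typing import List, Set, Tuple

VARIABLES = ("a", "b", "c", "d")
State = Tuple[int, ...]
Relation = List[Tuple[State, State]]  # (intermediate state, final stable state)

FIRST_RELATION: Relation = [
    ((0, 0, 0, 0), (0, 0, 1, 1)),
    ((0, 1, 0, 0), (0, 1, 0, 0)),
    ((1, 0, 0, 0), (1, 0, 0, 0)),
    ((1, 1, 0, 0), (1, 1, 1, 1)),
]


def proj(state: State, names: Set[str]) -> State:
    return tuple(state[VARIABLES.index(n)] for n in sorted(names))


def is_deterministic(relation: Relation, names: Set[str]) -> bool:
    seen = {}
    for state, final in relation:
        after = proj(final, names)
        if seen.setdefault(proj(state, names), after) != after:
            return False
    return True


def update(relation: Relation, level: List[Set[str]]) -> Relation:
    """Set every variable used by the level's subsystems to its final value."""
    used = set().union(*level)
    return [
        (tuple(f if n in used else s
               for n, s, f in zip(VARIABLES, state, final)), final)
        for state, final in relation
    ]


def apply_decomposition(relation: Relation,
                        levels: List[List[Set[str]]]) -> List[Relation]:
    """Return the sequence of system relations produced by successive levels."""
    spaces = [relation]
    for level in levels:
        if not all(is_deterministic(spaces[-1], sub) for sub in level):
            raise ValueError(f"level {level} contains a non-deterministic subsystem")
        spaces.append(update(spaces[-1], level))
    return spaces


LEVELS = [[{"a", "b", "c"}, {"b"}], [{"c", "d"}]]
first, second, third = apply_decomposition(FIRST_RELATION, LEVELS)

print([s for s, _ in second])         # second ISS
print(all(s == f for s, f in third))  # True: third ISS holds only stable states
```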
The decomposition may be represented as shown below. State variables with values that change during an update are underlined [44].

    2: {c,d}
    1: {a,b,c} {b}

[44] These state variables will be later defined as OUTPUT state variables.

This decomposition has the following associated semantics.

1. Subsystems {a,b,c} and {b} are deterministic subsystems with respect to the first ISS (or {a,b,c} and {b} are deterministic at level 1).
2. Subsystem {c,d} is a deterministic subsystem with respect to a second ISS (or {c,d} is deterministic at level 2). This state space is formed by updating the first ISS using subsystems {a,b,c} and {b}.
3. The third ISS formed by updating the second ISS using the subsystem {c,d} will contain only stable states.

Decompositions will sometimes be displayed using diagrams similar to Figure 8. Subsystems are represented by boxes containing sets of state variable names. It is easier to see the linkages, or communication, between subsystems in this sort of diagram. Communication (if any) between subsystems is shown by lines between boxes. The lines are labelled with the name of the state variable whose value is passed. Values are passed from lower to higher subsystems only.

[Figure 8: An alternative representation for the parallel/sequential decomposition. Boxes shown: level 2, {c,d}; level 1, {a,b,c}.]

3.4. Limiting the Search Space

3.4.1. General

A number of rules have been found which can considerably limit the number of deterministic decompositions which should be considered by the analyst. Some of these rules are heuristics, in that they cannot be formally proved. Others follow directly from formal definitions and are called theorems. Before the rules may be presented, three more definitions are required. These definitions, and many of the theorems and heuristics, will be illustrated using the simple system described by state variables {a,b,c,d} as introduced in the previous section.

Definition: Output State Variable

A state variable is an OUTPUT STATE VARIABLE with respect to some intermediate state space R, of a system with law L, if its value in some system state s in R is different from its value in the final stable system state L(s). That is, if v is a state variable then v is an output state variable with respect to R if and only if

    THERE EXISTS s such that s ∈ R and
        proj(s,{v}) ≠ proj(L(s),{v})

Definition: Input State Variable

The set of state variables whose values are required to predict the final values of the output state variables with respect to some intermediate state space R is called the set of INPUT STATE VARIABLES with respect to R [45].

Definition: Constant State Variable [46]

Any state variable which is not an output state variable with respect to some intermediate state space R is a CONSTANT STATE VARIABLE with respect to R.

The sets of input and output state variables with respect to some intermediate state space are not necessarily mutually exclusive. The final value of some output state variable could depend on its initial value. Such a state variable would be both an input and an output. A state variable which is both an input and an output will be named twice in the set of state variables describing a subsystem. For example, {x,y,z,z} indicates that values of "x" and "y" and the initial value of "z" are all required to determine the final value of "z".
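The output and constant classifications follow directly from comparing each state with its final stable state, as the sketch below illustrates for the four-variable example system. The function names are assumptions made here. Input state variables are not computed, since identifying them requires searching for a set of variables that predicts the outputs.

```python
from typing import List, Set, Tuple

VARIABLES = ("a", "b", "c", "d")
State = Tuple[int, ...]
Relation = List[Tuple[State, State]]  # (intermediate state s, final stable state L(s))

FIRST_RELATION: Relation = [
    ((0, 0, 0, 0), (0, 0, 1, 1)),
    ((0, 1, 0, 0), (0, 1, 0, 0)),
    ((1, 0, 0, 0), (1, 0, 0, 0)),
    ((1, 1, 0, 0), (1, 1, 1, 1)),
]

# Second system relation, obtained by updating the first ISS using {a,b,c}.
SECOND_RELATION: Relation = [
    ((0, 0, 1, 0), (0, 0, 1, 1)),
    ((0, 1, 0, 0), (0, 1, 0, 0)),
    ((1, 0, 0, 0), (1, 0, 0, 0)),
    ((1, 1, 1, 0), (1, 1, 1, 1)),
]


def output_variables(relation: Relation) -> Set[str]:
    """Variables whose value differs from the final value in at least one state."""
    return {
        name
        for i, name in enumerate(VARIABLES)
        if any(state[i] != final[i] for state, final in relation)
    }


def constant_variables(relation: Relation) -> Set[str]:
    """Every state variable that is not an output state variable."""
    return set(VARIABLES) - output_variables(relation)


print(sorted(output_variables(FIRST_RELATION)))     # ['c', 'd']
print(sorted(constant_variables(FIRST_RELATION)))   # ['a', 'b']
print(sorted(output_variables(SECOND_RELATION)))    # ['d']
print(sorted(constant_variables(SECOND_RELATION)))  # ['a', 'b', 'c']
```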
[46] In many of the examples to be considered in this and later chapters, the set of input state variables will equal the set of constant state variables. The sets only differ when the initial value of an output state variable is required to determine its own final value.

[Figure 9: The relationship between output, input, and constant state variables with respect to a given intermediate state space. Diagram labels: Constant State Variables, Output State Variables, Input State Variables.]

The relationships between the sets of output, input, and constant state variables with respect to a given intermediate state space are diagrammed in Figure 9. For example, consider the system described by binary state variables {a,b,c,d}.

    First Intermediate        Corresponding
    State Space               Final Stable States
    a  b  c  d       -->      a  b  c  d
    0  0  0  0                0  0  1  1
    0  1  0  0                0  1  0  0
    1  0  0  0                1  0  0  0
    1  1  0  0                1  1  1  1

State variables {a,b} are input and constant state variables and {c,d} are output state variables with respect to the first ISS. Now consider the second ISS formed by updating the first ISS using the subsystem {a,b,c}.

    First               Second              Corresponding
    ISS                 ISS                 Final Stable States
    a  b  c  d   -->    a  b  c  d   -->    a  b  c  d
    0  0  0  0          0  0  1  0          0  0  1  1
    0  1  0  0          0  1  0  0          0  1  0  0
    1  0  0  0          1  0  0  0          1  0  0  0
    1  1  0  0          1  1  1  0          1  1  1  1

State variables {a,b,c} are all inputs with respect to the second ISS. State variable "c" is no longer an output since its value does not change in any initial/final state pair. Thus output state variables need not remain output state variables after an update.

These definitions of "input" and "output" are not the same as those in common use. It is more usual to refer to external events as inputs and the actions of the system on its environment as outputs [47]. In SELMA, the state variables affected by external events are constant state variables. Their values do not change between states in the first ISS and the corresponding final stable states of the system [48]. They may also be input state variables if they are required to determine the final values of the output state variables. However, there may be other input state variables which are not affected by external events.

The definition of output state variable is also somewhat unusual. Outputs are defined for every ISS with the exception of the space containing only stable states. Interaction between output state variables and the environment is not modelled. Any such interactions would form the external events to another system located in the environment of the system under study, and so are not considered.

The rules for limiting the number of deterministic decompositions to be considered by the analyst may now be presented.

[47] Bunge (1979, p. 25) defines input and output in this way.

[48] It is assumed that the values of state variables may be set only once during the system's response to an external event. This assumption is discussed, in detail, later in this chapter.

3.4.2. Heuristics and Theorems

3.4.2.1. Subsystems should have outputs

Deterministic subsystems are subsystems whose final states can be predicted knowing only their initial states. By definition, constant state variables do not change their values between the initial and final states of the system. Therefore, it is a trivial exercise to predict the final state of a subsystem described by only constant state variables. The following theorem is suggested by this fact [49].
Theorem 1: Any subsystem X, described only by constant state variables with respect to some intermediate state space R and the corresponding final stable states, will be a deterministic subsystem with respect to R. That is:

    IF FOR ALL s such that s ∈ R,
        proj(s,X) = proj(L(s),X)
    THEN X is deterministic with respect to R and L.

Such deterministic subsystems are unlikely to be interesting to an analyst as they contain no information about the dynamics of the system. A program module based on this sort of subsystem would always return the same values it received. This fact leads to the first heuristic for limiting the number of deterministic subsystems which may be used to update an ISS.

Heuristic 1: All deterministic subsystems used to update an intermediate state space must be described by at least one OUTPUT state variable.

For example, Heuristic 1 will ensure the subsystems {a} and {b} are not used to update the first ISS of the system described by binary state variables {a,b,c,d}.

[49] Proofs for the theorems included in this document are straightforward and proceed directly from the definitions. They have not been included here.

3.4.2.2. Subsystems should be small

Any subsystem formed by adding a constant state variable to a deterministic subsystem will be deterministic. The value of a constant state variable does not change between initial and final states, and so cannot cause a deterministic subsystem to behave non-deterministically. For example, consider the following second intermediate, and final stable, state spaces of the system described by binary state variables {a,b,c,d}. This second ISS was created by updating the first ISS using the subsystem described by state variables {a,b,c}.
    First               Second              Corresponding
    ISS                 ISS                 Final Stable States
    a  b  c  d   -->    a  b  c  d   -->    a  b  c  d
    0  0  0  0          0  0  1  0          0  0  1  1
    0  1  0  0          0  1  0  0          0  1  0  0
    1  0  0  0          1  0  0  0          1  0  0  0
    1  1  0  0          1  1  1  0          1  1  1  1

State variables "a", "b", and "c" are constant state variables with respect to this ISS, since their values do not change between the intermediate and corresponding final stable states. The subsystem {a,b,d} is deterministic with respect to this ISS, since no initial subsystem state leads to two different final subsystem states. If the state variable "c" is added to {a,b,d}, the resulting set of state variables also describes a deterministic subsystem. This result is expressed by the following theorem.

Theorem 2: Let X be a set of state variables containing output state variables O, and let X' be a subset of X also containing O. If X' describes a deterministic subsystem, with respect to some intermediate state space R and system law L, then X will be a deterministic subsystem with respect to R and L. That is:

    IF X' is deterministic with respect to R and L and
        X' ⊂ X and
        FOR ALL o such that o ∈ X and
            proj(s,{o}) ≠ proj(L(s),{o}) and s ∈ R,
            o ∈ X'
    THEN X is a deterministic subsystem with respect to R and L.

In the above example, the set of state variables {a,b,c,d} contained one output state variable. It also contained a subset {a,b,d} which described a deterministic subsystem. Since this subset also contained the output state variable {d}, the state variables {a,b,c,d} had to describe a deterministic subsystem.

Deterministic subsystems formed by adding constant state variables to existing deterministic subsystems are probably not interesting to an analyst.
A program module corresponding to such a subsystem would contain a redundant variable, since the outputs of the subsystem could have been determined by the original variables. This fact suggests the following heuristic.

Heuristic 2: Let X be a set of state variables containing output state variables O, and let X' be a subset of X also containing O. If X' describes a deterministic subsystem, with respect to some intermediate state space R and system law L, then X may not be used to update R.

This rule ensures that the subsystems used to update an ISS are described by as small a number of input state variables as possible. It is required to avoid trivial decompositions formed by adding constant state variables to deterministic subsystems. For example, without this heuristic both of the following would be considered as possible decompositions of the system described by binary state variables {a,b,c,d}. Output state variables are underlined.

    2: {a,b,d}        and        2: {a,b,c,d}
    1: {a,b,c}                   1: {a,b,c}

The second decomposition does not add any information as it could have been deduced from the first decomposition and Theorem 2.

Now recall that the subsystems {a,b,c} and {a,b,d} are both deterministic with respect to the first ISS. So is the union of the two subsystems. That is, the subsystem {a,b,c,d} is also deterministic. This result is generalized in the following theorem.

Theorem 3: A subsystem described by the union of the state variables describing two deterministic subsystems will be deterministic.

Two possible deterministic decompositions of the system are

    1: {a,b,c} {a,b,d}        and        1: {a,b,c,d}

The second decomposition implies that four state variables are required to predict the final state of the subsystem. The first decomposition contains more information than the second.
It tells the analyst that the final values of "c" and "d" may be predicted if the initial values of only "a" and "b" are known. The second decomposition does not indicate whether the values of "a" and "b" are both required to predict the final value of both "c" and "d", or whether just one state variable could serve to predict one of the outputs. Since the second decomposition can be deduced from the first through the use of Theorem 3, the following heuristic is suggested.

Heuristic 3: Do not generate alternative decompositions resulting from the union of smaller subsystems.

Together, Heuristics 2 and 3 ensure that the subsystems presented to the analyst for consideration will be described by as small a number of state variables as possible.

3.4.2.3. Subsystems should show emergence

Consider the example system described by binary state variables {a,b,c,d}. The sets of state variables {a,b,c} and {a,b,d} describe subsystems which are deterministic with respect to the first ISS. An update using these subsystems would lead to an ISS containing only stable states. The first ISS could also be updated using only the subsystem {a,b,c}, to obtain the second ISS shown below.

    First               Second              Corresponding
    ISS                 ISS                 Final Stable States
    a  b  c  d   -->    a  b  c  d   -->    a  b  c  d
    0  0  0  0          0  0  1  0          0  0  1  1
    0  1  0  0          0  1  0  0          0  1  0  0
    1  0  0  0          1  0  0  0          1  0  0  0
    1  1  0  0          1  1  1  0          1  1  1  1

The subsystem {a,b,d} is still deterministic with respect to this ISS. If this second ISS were further updated using {a,b,d}, the resulting ISS would contain only stable states. Therefore, the following are both deterministic decompositions of the system:

    1: {a,b,c} {a,b,d}        and        2: {a,b,d}
                                         1: {a,b,c}

The second decomposition does not add to the information provided by the first, and should not have to be considered by the analyst.
This observation may be generalized with the following theorem.

Theorem 4: Let X and Y be deterministic subsystems with respect to an intermediate state space R. If X is used to update R to obtain intermediate state space R' then Y will be deterministic with respect to R'.

This theorem expresses the commutativity of the update. That is, if two subsystems X and Y are deterministic with respect to some ISS, the final ISSs resulting from updating using X and then Y and using Y and then X will be the same.

Theorem 4 suggests another heuristic for limiting the number of deterministic decompositions which have to be considered by the analyst. However, the following definition will make its formulation easier.

Definition: Emergent State Variable

Let x be an output state variable used to describe a subsystem at the nth level of some decomposition. If x is not used to describe any subsystem at any mth level where m < n, then x is an EMERGENT STATE VARIABLE at level n.

The concept of an emergent state variable is analogous to the notion of a holistic property. That is, holistic properties are "those characteristics of a particular system that go beyond the qualities of the individual system components" (Mattessich, 1978, p. 31). Holistic properties are a manifestation of the fact that a system is more than the sum of its parts.

For example, consider a payroll system. "Total pay" might be determined by the values of "regular pay" and "overtime". The values of "regular pay" and "overtime" might be determined by the values of "hours worked" and "pay rate". Such a system could be decomposed as shown:

    2: {regular pay, overtime, total pay}
    1: {hours worked, pay rate, regular pay} {hours worked, pay rate, overtime}

In this case "total pay" is an emergent state variable at level 2.
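The definition of an emergent state variable can be applied to the payroll decomposition with a short sketch. The representation is an assumption made for illustration: each subsystem is a pair (all describing variables, output variables), with the outputs taken from the dependencies stated in the text, and levels are numbered from 1 as in the decomposition notation.

```python
from typing import List, Set, Tuple

# A subsystem is (all state variables describing it, its output state variables).
Subsystem = Tuple[Set[str], Set[str]]
Decomposition = List[List[Subsystem]]  # index 0 holds level 1

PAYROLL: Decomposition = [
    [
        ({"hours worked", "pay rate", "regular pay"}, {"regular pay"}),
        ({"hours worked", "pay rate", "overtime"}, {"overtime"}),
    ],
    [
        ({"regular pay", "overtime", "total pay"}, {"total pay"}),
    ],
]


def emergent(decomposition: Decomposition, n: int) -> Set[str]:
    """Output variables at level n that describe no subsystem at any level m < n."""
    lower: Set[str] = set()
    for level in decomposition[: n - 1]:
        for variables, _ in level:
            lower |= variables
    outputs: Set[str] = set()
    for _, outs in decomposition[n - 1]:
        outputs |= outs
    return outputs - lower


print(sorted(emergent(PAYROLL, 1)))  # ['overtime', 'regular pay']
print(sorted(emergent(PAYROLL, 2)))  # ['total pay']
```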
Emergent state variables allow the analyst to focus his or her attention on higher-level abstractions of the system under study. The "total pay" emergent state variable could be considered an abstraction of the "hours worked" and "pay rate" state variables. If the analyst were not interested in the degree of detail provided by these state variables, "total pay" may be a perfectly adequate substitute. Decompositions which show the emergence of state variables whenever possible are assumed to be superior to those that do not. The following heuristic is based on this assumption.

Heuristic 4a: All subsystems used to update an nth intermediate state space must be described by at least one state variable which is emergent at level n.

However, this heuristic alone is not enough to avoid redundant alternative decompositions as discussed above. In the decomposition

    2: {a,b,d}
    1: {a,b,c}

state variable "d" is emergent at level 2. Unfortunately, state variable "d" is not a useful abstraction of any state variables found at lower levels. While state variables "a" and "b" are found at level 1, they are also found at level 2. They are not abstracted out of the view of the system presented to the analyst at any level of the decomposition. Only if the values of emergent state variables are determined by output state variables at a lower level do they become useful abstractions for the analyst. The following heuristic embodies this notion.

Heuristic 4b: Any deterministic subsystem used to update the nth intermediate state space must be described by at least one state variable which is an output state variable with respect to the (n-1)th intermediate state space.
In other words, subsystems used to update an ISS must be described by at least one state variable which was an output state variable with respect to the previous ISS. This ensures that outputs from deterministic subsystems at a lower level will be used as inputs at a higher level whenever possible. Consider the following decomposition of the example subsystem described by state variables {a,b,c,d}.

    2: {c,d}
    1: {a,b,c}

State variable "c" is an output state variable at level 1, and it is an input state variable at level 2. State variable "d" is emergent at level 2. Therefore, this decomposition satisfies Heuristics 4a and 4b. Now consider the following decomposition.

    2: {a,b,d}
    1: {a,b,c}

While state variable "d" is emergent at level 2, no output state variable from level 1 appears at level 2. Therefore, Heuristic 4b would lead to rejection of this decomposition.

When emergent state variables are used to form abstractions of other state variables, some state variables may be "hidden". The concept of a hidden state variable is analogous to "information hiding" as defined by Parnas (1972, p. 1056). Subsystems at higher levels in the decomposition do not have to be "aware" of all state variables considered by lower-level subsystems. For example, consider the payroll system.

    2: {regular pay, overtime, total pay}
    1: {hours worked, pay rate, regular pay} {hours worked, pay rate, overtime}

Here, the state variables "hours worked" and "pay rate" are hidden with respect to "total pay". This means that an analyst, interested only in the final value of "total pay", would be concerned with the view of the system shown at the top of Figure 10.
The arrows between "regular pay" and "total pay" and between "overtime" and "total pay" indicate value dependencies (e.g. the final value of "total pay" depends on the value of "regular pay"). On the other hand, if the analyst were interested in both "total pay" and "overtime", he or she would require the view shown at the bottom of Figure 10. No state variables are hidden in this view of the system. The formal definition of a hidden state variable is somewhat obscure, but is equivalent to the above "intuitive" description.

[Figure 10: Two possible views of a hypothetical payroll system. Top view, variable(s) of interest TOTAL PAY: regular pay, overtime, total pay. Bottom view, variable(s) of interest TOTAL PAY, OVERTIME: regular pay, total pay, pay rate, hours worked, overtime.]

Definition: Hidden State Variable
Let x be an emergent state variable at level n of a decomposition satisfying Heuristics 4a and 4b. If, at level n, the state variable y is not used to describe any subsystem also described by x, then y is HIDDEN with respect to x.

3.4.2.4. Subsystems should not show redundant dependencies

The final value of an output state variable may be functionally determined by more than one subsystem that is deterministic with respect to some ISS. Consider the example system described by binary state variables {a,b,c,d}. The second ISS, formed by updating the first ISS using the subsystem {a,b,c}, is shown below.

    First            Second           Corresponding
    ISS              ISS              Final Stable States
    a b c d   -->    a b c d   -->    a b c d
    0 0 0 0          0 0 1 0          0 0 1 1
    0 1 0 0          0 1 0 0          0 1 0 0
    1 0 0 0          1 0 0 0          1 0 0 0
    1 1 0 0          1 1 1 0          1 1 1 1

The final value of output state variable "d" may be functionally determined by either {a,b,d} or {c,d}. However, a decomposition of the form

    2: {a,b,d} {c,d}
    1: {a,b,c}

would not be considered desirable in that it indicates redundant updating at level 2. There is no need to have "d" set by two subsystems. This observation leads to the following heuristic.

Heuristic 5: The set of deterministic subsystems used to update an intermediate state space may not contain more than one subsystem described by a given output state variable.

This is not meant to imply that an analyst should not be made aware of alternative methods for calculating the final values of output state variables. The heuristic only forces such alternatives to be shown in different candidate decompositions of the same system. That is,

    1: {a,b,c} {a,b,d}

and

    2: {c,d}
    1: {a,b,c}

are possible decompositions of the example system. Both would be suggested by the specifications analysis tools.

3.4.2.5. Bad Subsystems

Theorem 2 specifies a condition under which it is not necessary to scan the ISS in order to see if a subsystem behaves deterministically. The next theorem serves a similar function. Both are used by the specifications analysis tools to speed the search for deterministic subsystems. Consider the first ISS of the example system.

    First Intermediate      Corresponding
    State Space             Final Stable States
    a b c d          -->    a b c d
    0 0 0 0                 0 0 1 1
    0 1 0 0                 0 1 0 0
    1 0 0 0                 1 0 0 0
    1 1 0 0                 1 1 1 1

The subsystem {b,c,d} is not deterministic with respect to this state space. Neither is the subsystem formed by dropping a constant state variable. That is, the subsystem {c,d} is not deterministic either.
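The claim about {b,c,d} and {c,d} can be checked mechanically. The sketch below, in Python, is only an illustration and assumes one reading of determinism: a subsystem is deterministic with respect to an ISS if the ISS values of its descriptive state variables always fix the final values of its output state variables. The function name and the data layout are hypothetical.

```python
# Illustrative sketch; the function name and data layout are assumptions.
# Each pair maps a state of the first ISS to its final stable state,
# with variables in the order a, b, c, d.
FIRST_ISS = [
    ((0, 0, 0, 0), (0, 0, 1, 1)),
    ((0, 1, 0, 0), (0, 1, 0, 0)),
    ((1, 0, 0, 0), (1, 0, 0, 0)),
    ((1, 1, 0, 0), (1, 1, 1, 1)),
]
NAMES = "abcd"

def is_deterministic(relation, described_by, outputs):
    """True if the ISS values of `described_by` fix the final values of `outputs`."""
    idx = [NAMES.index(v) for v in described_by]
    out = [NAMES.index(v) for v in outputs]
    predicted = {}
    for initial, final in relation:
        key = tuple(initial[i] for i in idx)
        value = tuple(final[i] for i in out)
        if predicted.setdefault(key, value) != value:
            return False  # same visible state, different final outputs
    return True

print(is_deterministic(FIRST_ISS, "abc", "c"))   # True
print(is_deterministic(FIRST_ISS, "bcd", "cd"))  # False
print(is_deterministic(FIRST_ISS, "cd", "cd"))   # False
```

In the first ISS the states 0000 and 1000 look identical to {b,c,d} but lead to different final values of "c" and "d", which is why the check fails. Dropping the constant state variable "b" cannot remove that ambiguity.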
This result may be generalized with the following theorem:

Theorem 5: Let a subsystem X, which is not deterministic with respect to some intermediate state space R, be described by the set of output state variables O and constant state variables C. If X' is another subsystem described entirely by O and a subset of C, then X' will not be a deterministic subsystem with respect to R.

This means that a subsystem which is not deterministic cannot be made deterministic by dropping some of its constant state variables.

3.4.3. Relationship to the Heuristics of Simon and Ando

The concept of a deterministic subsystem together with the above heuristics are related to the intuitive notions of Simon and Ando (1961) as presented in Chapter 1. Consider their office building example first discussed in Chapter 1. Simon and Ando consider each room to be a subsystem of the building and each office to be a subsystem of a room. They were concerned with describing the thermal equilibrium of the building. Using the decomposition syntax of this chapter, the building system might be characterized by the following decomposition:

    3: {tr1, ..., trN, tb}
    2: {to11, ..., to1j, tr1} ... {toN1, ..., toNk, trN}
    1: {..., to11} ... {..., to1j} ... {..., toN1} ... {..., toNk}

where:

    tb   = equilibrium temperature of the office building
    tri  = equilibrium temperature of the ith room
    toij = equilibrium temperature of the jth office of the ith room

The "tb" is an emergent state variable at level 3 and each "tr" is emergent at level 2. Recall that Simon and Ando's necessary criteria for a decomposable system were

a.
in a short-term period, as a result of stronger internal bonds, subsystems tend to reach an internal equilibrium "approximately" independently of one another, and

b. in a long-term period, when a whole structure evolves toward a global equilibrium state under the influence of weak interactions among subsystems, the internal equilibriums reached at the end of the short-term period are approximately maintained in relative value.

Suppose a bond is interpreted as a dependency between state variables. Also suppose that bonds are directed [50]. If the subsystems are deterministic, there cannot be bonds between subsystems at the same level [51], but there may be bonds to lower-level subsystems. However, the number of bonds between a subsystem and subsystems at other levels will never be greater than the number of bonds within the subsystem. This can be shown as follows:

1. Heuristic 1 ensures that each subsystem must be described by at least one output state variable.
2. Heuristic 2 ensures that the values of the output state variables are dependent on the values of all input state variables.
3. Points 1 and 2 imply that if there are n state variables describing the state of a subsystem, there must be at least n-1 bonds between them.
4. There can never be more bonds to other subsystems than there are input state variables (i.e. at most n-1).
5. Therefore, the ratio of the number of internal to external bonds must always be greater than or equal to 1.

If there are stronger links within a subsystem than between that subsystem and the rest of the system, Simon and Ando argue it is likely to reach equilibrium faster than the whole system.
If the number of bonds is assumed to be proportional to the strength of the link, subsystems can never be more strongly linked together than they are internally. This satisfies Simon and Ando's interaction strength requirement.

Now suppose that a subsystem is in equilibrium when no descriptive state variable is an output state variable. That is, all state variables have attained their final values. A subsystem will be in equilibrium after it is used to perform an update. By definition, lower-level subsystems will always reach equilibrium before the system as a whole. This behaviour is the same as that predicted by Simon and Ando for a decomposable system.

[50] For example, consider a system where the value of some state variable "b" depends on the value of some state variable "a" and not vice versa. A bond is assumed to exist between the subsystem which determines the value of "a" and the subsystem described by "b", but not the reverse.

[51] A bond between subsystems X and Y at the same level would imply that the value of some state variable in Y is dependent on the value of some other state variable in X. Therefore, Y could not be a good subsystem, since the behaviour of a good subsystem is predictable knowing only the values of its own state variables.

3.5. Automation of Decomposition

3.5.1. An Algorithm for Decomposition

An algorithm [52] employing the notions formally defined in the previous sections has been developed. An implementation of this algorithm comprises a large part of the computerized specifications analysis tools. Operation of the algorithm will be illustrated using a simple system. The algorithm requires as input an explication of the system law in the form of initial unstable and final stable state pairs.
The Decompose() procedure is then called recursively until a set of alternative decompositions has been generated. Each alternative will be a deterministic decomposition, and it will satisfy each of the heuristics described earlier in this chapter.

[52] There is no intention to suggest that the algorithm described here represents the "best" way to operationalize the theory of decomposition. It is possible that more efficient algorithms exist. This particular algorithm is described to show that operationalization is possible. The most important contributions of this research are to be found in the construction and analysis of system models.

Required functions:

Outputs(R) - returns a list of the output state variables with respect to the intermediate state space R.

Subsystems(R,Outputs,PreviousOutputs) - returns a list of deterministic subsystems with respect to some intermediate state space R. Outputs is a list of output state variables with respect to R. PreviousOutputs is a list of the output state variables describing the subsystems used in the update which produced R. Each deterministic subsystem will be described by a set of state variables such that:

    1) As required by Heuristic 1, the set of state variables will contain an element of the list of state variables assigned to Outputs.
    2) As required by Heuristics 2 and 3, the set of state variables will be as small as possible.
    3) As required by Heuristic 4b, the set of state variables will contain an element of the list of state variables assigned to PreviousOutputs (unless there are no previous outputs, as will be the case with the first ISS).

Theorem 5 is used to further reduce the number of subsystems which must be tested by scanning the ISS.
Subsets(Subsystems,Outputs) - returns a list of all subsets of the set of subsystems assigned to Subsystems. Outputs is a list of the output state variables used to describe the subsystems assigned to Subsystems. As the subsets will be used to perform updates on some ISS, care must be taken to ensure Heuristic 5 is not violated. That is, no subset may contain two subsystems which are described by the same output state variable.

Update(R,U) - returns the ISS formed by updating R with respect to the subsystems U.

The body of the algorithm:

    Begin
      {Set R1 equal to the set of initial unstable states. R1 is the ISS
       formed by applying each defined external event to each stable state
       of the system. The symbol [] refers to a list with no members. The
       first time the Decompose() procedure is called there are no outputs
       with respect to a previous level and no deterministic subsystems
       have been found.}
      Decompose(R1,[],[]);
    End.

The decomposition procedure:

    Procedure: Decompose(R,PreviousOutputs,DecompSoFar)

    Arguments:
      R               - an intermediate state space.
      PreviousOutputs - the output state variables of the subsystems used
                        for the update which produced R.
      DecompSoFar     - a list of the sets of subsystems used to obtain R
                        from the first ISS via a series of updates.

    Begin
      Step 1: {Find the output state variables with respect to the ISS.}
        Outputs := Outputs(R);

      Step 2: {If Outputs is empty all the states in the ISS are stable.
               This means that the sets of subsystems used to perform
               updates define a deterministic decomposition.}
        If Outputs is empty
          Begin
            Output DecompSoFar as a possible decomposition;
            Exit;
          End;

      Step 3: {Find the deterministic subsystems with respect to the ISS
               subject to certain conditions described for the
               Subsystems() function.}
        DetSubsystems := Subsystems(R,Outputs,PreviousOutputs);

      Step 4: {Find all the subsets of the set of deterministic subsystems
               suitable for updating the ISS. These subsets must meet
               certain criteria as described for the Subsets() function.}
        PossibleUpdates := Subsets(DetSubsystems,Outputs);

      Step 5: {Perform a depth-first search for deterministic
               decompositions. Call the Decompose() procedure recursively
               for each new ISS formed by updating the current ISS using
               the subsets identified in step 4.}
        If PossibleUpdates is not empty then
          For each element U of PossibleUpdates do
            Begin
              R' := Update(R,U);
              NewDecompSoFar := DecompSoFar ∪ U;
              Decompose(R',Outputs,NewDecompSoFar);
            End;
    End.

As illustrated by the following example, the algorithm will find all deterministic decompositions subject only to the rather elementary heuristics [53] described earlier. The order of discovery of the decompositions does not imply any form of ranking. Even moderately complex systems are likely to have a very large number of possible decompositions. Further heuristics are needed to present the decompositions in some meaningful order.

3.5.2. A Simple Example

Recall the hypothetical system consisting of four interconnected lights. Light "a" is connected in series with "b" so that if "a" is on then "b" will be on and if "a" is off, "b" will be off. If light "a" is off then light "c" will be on, and if "a" is on, light "d" will be on.
Only the state of light "a" may be set manually. The "on" state of a light will be represented by the integer 1 and the "off" state by 0. Sublaws describing the stable states of the system, and the actions to be taken should the system find itself in an unstable state, are given below.

[53] The heuristics are embedded in the functions Subsystems() and Subsets().

Sublaws

    1. Stability Conditions:
         a b
         0 0
         1 1
       Corrective Actions:
         Conditions     Actions
         a       -->    b
         1              1
         0              0

    2. Stability Conditions:
         a c
         0 1
         1 0
         1 1
       Corrective Action:
         Conditions     Actions
         a       -->    c
         0              1

    3. Stability Conditions:
         a d
         1 1
         0 1
         0 0
       Corrective Action:
         Conditions     Actions
         a       -->    d
         1              1

There are two external events.

External Events

    1. Set a = 1
    2. Set a = 0

The stable state space of this system is shown below.

    Stable States
    a b c d
    0 0 1 0
    0 0 1 1
    1 1 0 1
    1 1 1 1

The first system relation may be obtained by applying the events "set a=1" and "set a=0" to each of the four stable states. This yields the first ISS or R1. The final stable states corresponding to each of the states in the first ISS are obtained by examining the response paths of the system (these paths follow directly from the sublaws and are listed in Chapter 2).

    First Intermediate      Corresponding final
    State Space             stable state
    a b c d          -->    a b c d
    0 0 1 0                 0 0 1 0
    0 0 1 1                 0 0 1 1
    0 1 0 1                 0 0 1 1
    0 1 1 1                 0 0 1 1
    1 0 1 0                 1 1 1 1
    1 0 1 1                 1 1 1 1
    1 1 0 1                 1 1 0 1
    1 1 1 1                 1 1 1 1

Four deterministic decompositions will be suggested when the algorithm is applied to this system. The steps leading to the first deterministic decomposition are shown below. The full solution, showing the generation of all four decompositions, is included as Appendix H. Each step is labelled using the following convention:

    x(Ay|Lz)

where

    x = Step number starting with 1 and increasing by 1 until the algorithm finishes.
    y = Algorithm step number.
    z = The current level of recursion with respect to the Decompose() procedure.

START

1(A1|L1) Find the output state variables with respect to the current ISS. The only state variables which change their values between the first ISS and the corresponding final stable states are {b,c,d}.

2(A2|L1) The set of output state variables is not empty.

3(A3|L1) Find the deterministic subsystems. The first system relation was as follows:

    First Intermediate      Corresponding final
    State Space             stable state
    a b c d          -->    a b c d
    0 0 1 0                 0 0 1 0
    0 0 1 1                 0 0 1 1
    0 1 0 1                 0 0 1 1
    0 1 1 1                 0 0 1 1
    1 0 1 0                 1 1 1 1
    1 0 1 1                 1 1 1 1
    1 1 0 1                 1 1 0 1
    1 1 1 1                 1 1 1 1

The smallest deterministic subsystems, with respect to the first ISS, which are described by at least one output state variable are {a,b}, {a,c,c}, and {a,d,d}. Notice that state variables "c" and "d" are both inputs and outputs in their respective subsystems. That is, the final values of "c" and "d" are dependent on their initial values.

4(A4|L1) Find subsets of the deterministic subsystems for ISS update. The subsets of this set of deterministic subsystems are

    {{a,b}}
    {{a,c,c}}
    {{a,d,d}}
    {{a,b},{a,c,c}}
    {{a,b},{a,d,d}}
    {{a,c,c},{a,d,d}}
    {{a,b},{a,c,c},{a,d,d}}

5(A5|L1) Update the current ISS using one subset of the set of deterministic subsystems, and call the Decompose() procedure. The first ISS will eventually be updated using all the sets found in step 4. The first set selected is {{a,b}}. The second ISS created by this update is as shown below.
    First            Second           Corresponding final
    ISS              ISS              stable states
    a b c d   -->    a b c d   -->    a b c d
    0 0 1 0          0 0 1 0          0 0 1 0
    0 0 1 1          0 0 1 1          0 0 1 1
    0 1 0 1          0 0 0 1          0 0 1 1
    0 1 1 1          0 0 1 1          0 0 1 1
    1 0 1 0          1 1 1 0          1 1 1 1
    1 0 1 1          1 1 1 1          1 1 1 1
    1 1 0 1          1 1 0 1          1 1 0 1
    1 1 1 1          1 1 1 1          1 1 1 1

6(A1|L2) Find the output state variables with respect to the current ISS. The only state variables with values which change between the second ISS and the corresponding final stable states are {c,d}.

7(A2|L2) The set of output state variables is not empty.

8(A3|L2) Find the deterministic subsystems with respect to the second ISS. The only deterministic subsystems described by at least one output state variable, and by one output state variable from the subsystems used in the last update, are {b,c,c} and {b,d,d}. Notice that while {a,c,c} and {a,d,d} are still deterministic subsystems, they are not described by an output state variable from the subsystems used to create the second ISS (i.e. they are not described by state variable "b").

9(A4|L2) Find subsets of the deterministic subsystems for ISS update. The subsets of this set of deterministic subsystems are

    {{b,c,c},{b,d,d}}
    {{b,c,c}}
    {{b,d,d}}

10(A5|L2) Update the current ISS using one subset of the set of deterministic subsystems, and call the Decompose() procedure. The second ISS will eventually be updated using all the sets found in step 9. The first set selected is {{b,c,c},{b,d,d}}. The third ISS created by this update is as shown below.

    Second           Third            Corresponding final
    ISS              ISS              stable states
    a b c d   -->    a b c d   -->    a b c d
    0 0 1 0          0 0 1 0          0 0 1 0
    0 0 1 1          0 0 1 1          0 0 1 1
    0 0 0 1          0 0 1 1          0 0 1 1
    0 0 1 1          0 0 1 1          0 0 1 1
    1 1 1 0          1 1 1 1          1 1 1 1
    1 1 1 1          1 1 1 1          1 1 1 1
    1 1 0 1          1 1 0 1          1 1 0 1
    1 1 1 1          1 1 1 1          1 1 1 1

11(A1|L3) Find the outputs with respect to the current ISS. There are no output state variables.
12(A2|L3) Since there are no output state variables, output a deterministic decomposition. The sets of subsystems used to transform the first ISS into stable states define a decomposition. The second ISS was formed using {a,b}. The third was formed using {b,c,c} and {b,d,d}. Therefore, the first discovered decomposition is

    2: {b,c,c} {b,d,d}
    1: {a,b}

This invocation of the Decompose() procedure is now complete. Execution will continue by updating the second ISS using the next subset of the set of deterministic subsystems identified at Step 9.

{The example is completed in Appendix H.}

A full list of the decompositions produced by the decomposition algorithm is shown below.

    Decomposition #1
    2: {b,c,c} {b,d,d}
    1: {a,b}

    Decomposition #2
    2: {b,d,d}
    1: {a,b} {a,c,c}

    Decomposition #3
    2: {b,c,c}
    1: {a,b} {a,d,d}

    Decomposition #4
    1: {a,b} {a,c,c} {a,d,d}

Each of these decompositions represents a different view of the same system. The suitability of a particular decomposition will depend upon such considerations as:

    1) Which state variables is the analyst interested in? (i.e. What is the goal of the system?), and
    2) What maintenance changes are anticipated? (Maintenance considerations will be discussed in the next chapter.)

Decomposition #1 allows state variable "a" to be hidden with respect to outputs "c" and "d". An analyst interested in state variables "c" and "d" need only be concerned with the view of the system illustrated in the first part of Figure 11. The arrows indicate value dependencies. For example, the arrow between "a" and "c" means that the final value of "c" depends on the value of "a".

Decompositions #2 and #3 cannot hide any information from an analyst interested in "c" and "d". In neither decomposition is "a" hidden with respect to both "c" and "d". For example, decomposition #2 yields the view shown in the centre of Figure 11 with respect to "c" and "d".

Decomposition #4 shows that the final values of state variables "b", "c", and "d" can be calculated concurrently if the initial values of {a}, {a,c}, and {a,d} are known. It is also the decomposition inherent in the sublaws. This decomposition yields the view shown in the bottom of Figure 11 with respect to "c" and "d". In this case, state variable "b" is hidden with respect to "c" and "d". It is not immediately clear whether decomposition #1 or #4 is superior.

[Figure 11: Three possible views of the four lights system. a) Decomposition #1; b) Decomposition #2; c) Decomposition #4.]

3.5.3. Importance of the External Event Space

Proper explication of the events which may act upon the system from the environment is crucial. Consider the following simple example. The system is intended to model the addition of two continuous quantities "a" and "b". State variable "c" contains the result of the addition. All three state variables are modelled as having only two values: zero and positive.

    Stability Conditions:
      a    b    c
      pos  -    pos
      -    pos  pos
      0    0    0

    Corrective Actions:
      Conditions       Actions
      a    b    -->    c
      pos  -           pos
      -    pos         pos
      0    0           0

The only intuitively reasonable decomposition for this system is as follows:

    1: {a,b,c}

However, if only one external event is defined as

    Set a = pos

the specifications analysis tools will yield the following decomposition:

    1: {a,c,c}

The defined external event is not sufficient to force the system to exhibit all of its dynamic properties. As a result, knowledge of the initial values of "a" and "c", as well as the value of "a" after the application of the external event, is sufficient to predict the final value of "c".
Therefore, {a,c,c} is a deterministic subsystem. Defining external events

    1. Set a = pos
    2. Set a = 0

will yield the intuitively expected decomposition [54].

The four light example exhibits similar behaviour. If the only defined external event is

    Set a = 1

the specifications analysis tools will identify two possible decompositions.

    Decomposition 1:
    1: {a,b} {a,d,d}

    Decomposition 2:
    2: {b,d,d}
    1: {a,b}

The rules whose responsibility it is to set the value of state variable "c" are never activated. The value of "c" is never changed, therefore "c" cannot be identified as an output state variable.

To help ensure that the defined external events are sufficient to force the system to exhibit all behaviour implied by the defined state variable values, they should cause the affected state variables to assume all of their defined values. The specifications analysis tools perform a test to ensure that this is so. If the test fails, a warning message is issued to the analyst. This heuristic is based on the assumption that if an analyst defines several values for a state variable affected by an external event, he or she is interested in seeing that state variable assume each of these values as a result of external events [55].

[54] Defining further external events to alter the value of state variable "b" has no effect on the generated decomposition. No additional decomposition information is provided by such an event.

[55] State variables are not allowed to change their values twice during any response to an external event. Therefore, the values of state variables affected by external events can only be set by external events. The reason for this rule is described in this chapter under the heading "Intermediate State Variables".

3.5.4. Decomposition of the Payroll System

The simple "interactive" payroll system described in the previous chapter decomposes as shown below. To save space, variable names have been abbreviated as indicated. There are seven possible decompositions [56].

[56] Careful examination of these decompositions will reveal that #1 through #6 may be deduced from #7 by simple substitutions of state variables. This issue will be addressed in Chapter 4.

    State Variable Abbreviations
    hours     = hours worked          pay_r = pay rate
    emp_p     = employee position     emp_t = employee type
    sales     = sales                 base  = base pay
    com       = commissions           over  = overtime pay
    total_pay = total pay

Payroll System Decompositions

    Decomposition #1
    1: {emp_t,emp_p,hours,over} {emp_t,emp_p,pay_r,hours,sales,total_pay}
       {pay_r,hours,base} {emp_t,emp_p,sales,com}

    Decomposition #2
    2: {emp_t,emp_p,pay_r,hours,com,total_pay}
    1: {emp_t,emp_p,hours,over} {pay_r,hours,base} {emp_t,emp_p,sales,com}

    Decomposition #3
    2: {emp_t,emp_p,hours,sales,base,total_pay}
    1: {emp_t,emp_p,hours,over} {pay_r,hours,base} {emp_t,emp_p,sales,com}

    Decomposition #4
    2: {emp_t,emp_p,hours,base,com,total_pay}
    1: {emp_t,emp_p,hours,over} {pay_r,hours,base} {emp_t,emp_p,sales,com}

    Decomposition #5
    2: {emp_t,emp_p,sales,base,over,total_pay}
    1: {emp_t,emp_p,hours,over} {pay_r,hours,base} {emp_t,emp_p,sales,com}

    Decomposition #6
    2: {pay_r,hours,over,com,total_pay}
    1: {emp_t,emp_p,hours,over} {pay_r,hours,base} {emp_t,emp_p,sales,com}

    Decomposition #7
    2: {base,over,com,total_pay}
    1: {emp_t,emp_p,hours,over} {pay_r,hours,base} {emp_t,emp_p,sales,com}

Figure 12: A diagrammatic representation of Decomposition #7 for the "interactive" payroll system. This is the decomposition reflected in the sublaws.
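The substitutions mentioned in footnote 56 can be illustrated with a short Python sketch. It is a reading of the listed decompositions and not the method used by the specifications analysis tools: the helper `substitute` is hypothetical, and the choice of which level-1 outputs to replace is made by hand. Each level-1 output in the "total pay" subsystem of Decomposition #7 is replaced by the inputs that determine it.

```python
# Illustrative sketch; `substitute` is a hypothetical helper, not part of the tools.
# Level-1 subsystems of the "interactive" payroll system: output -> inputs.
LEVEL_1 = {
    "over": {"emp_t", "emp_p", "hours"},
    "base": {"pay_r", "hours"},
    "com": {"emp_t", "emp_p", "sales"},
}
# Level-2 subsystem of Decomposition #7.
LEVEL_2_OF_7 = {"base", "over", "com", "total_pay"}

def substitute(subsystem, replaced):
    """Replace each named level-1 output by the inputs that determine it."""
    result = set(subsystem)
    for output in replaced:
        result.remove(output)
        result |= LEVEL_1[output]
    return result

# Replacing "base" gives the level-2 subsystem of Decomposition #6.
print(sorted(substitute(LEVEL_2_OF_7, ["base"])))
# Replacing "over" and "com" gives the level-2 subsystem of Decomposition #3.
print(sorted(substitute(LEVEL_2_OF_7, ["over", "com"])))
```

Replacing all three outputs yields {emp_t,emp_p,pay_r,hours,sales,total_pay}, which is the "total pay" subsystem that appears at level 1 of Decomposition #1.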
Although it is used in the system model, the "benefits" state variable does not appear in any of the decompositions of this system. Since it has only one possible value, its value cannot change. Therefore, "benefits" is not an output state variable. Also, it is not included in the calculation of any other state variable included in the model. Therefore, it does not appear as an input state variable in any deterministic subsystem.

However, suppose that for some reason "total pay" were to include "benefits". The value of the state variable "total_pay" would be dependent on the value of "benefits", and the model would have to be modified to reflect this dependency. For example, if the "benefits" state variable could have values "0" and "nz", the rules describing the calculation of "total pay" would have to be modified as shown below. Italicized text indicates the changes to the model described in Appendix G.

    /* calculate total pay */
    dynamic("calculate total pay",
            [v(base,nz)],
            [v(total_pay,nz)]).
    dynamic("calculate total pay",
            [v(over,nz)],
            [v(total_pay,nz)]).
    dynamic("calculate total pay",
            [v(com,nz)],
            [v(total_pay,nz)]).
    dynamic("calculate total pay",
            [v(ben,nz)],
            [v(total_pay,nz)]).
    dynamic("calculate total pay",
            [v(base,"0"),v(over,"0"),v(com,"0"),v(ben,"0")],
            [v(total_pay,"0")]).

The specifications analysis tools would now suggest decompositions which included "benefits" as an input to the "total pay" subsystem.

The "batch" payroll system, described in the previous chapter, decomposes as above. However, the "batch" model gives rise to many decompositions which are not generated using the "interactive" model [57]. The specifications analysis tools produce a total of 168 decompositions.
All suggested decompositions of the "batch" payroll system have been included as Appendix I. Most are a direct result of the batch orientation. The state variable "end" is not the only state variable which may be used to indicate the end of the period. Any output state variable which has had its value calculated may be used to trigger the calculation of another output state variable. For example, consider the following decomposition:

    Decomposition #2
    2: {pay_rate,hours,benefits,base}
    1: {end,benefits} {end,emp_t,emp_p,sales,com} {end,emp_t,emp_p,hours,over}
       {end,emp_t,emp_p,pay_rate,hours,sales,total_pay}

The calculation of all output state variables at level 1 is triggered by the value of the state variable "end". This fact is indicated by the inclusion of "end" in each subsystem at that level. The subsystem at level 2 indicates, as expected, that "base pay" may be calculated from "pay rate" and "hours worked". However, the calculation of "base pay" is triggered by the calculation of "benefits". As soon as the value of "benefits" becomes non-zero, "base pay" is calculated. Generation of decompositions of this form is not considered to be an error.

[57] A "benefits" subsystem is included in the "batch" model. As shown in the appendix, this subsystem is independent of all other subsystems and is not responsible for the increase in the number of alternative decompositions.

3.5.5. Intermediate State Variables

Consider a modification of the "interactive" payroll system. Assume the company makes some changes in its payroll policy (Wand and Weber, 1989).

1. Both office staff and sales employees are entitled to both overtime pay and sales commissions.

2.
An office employee cannot receive more in commissions than in overtime.
3. A sales employee cannot receive more in overtime than in commissions.

An analyst might be tempted to define a system which, after calculating both commissions and overtime, modifies these amounts to reflect the restrictions resulting from changes 2 and 3. The sublaws could still pass the tests for local completeness and consistency. Response paths could be determined and the first ISS could be created. The decomposition algorithm could be applied, but none of the resulting decompositions would indicate the calculation of the intermediate values for "overtime" and "commissions". That is, no decompositions of the following form would be found. (Notice that "overtime" and "commissions" are output state variables at two different levels of the decomposition.)

    n+2: {...,commissions,overtime,total_pay}
    n+1: {employee_type,commissions,overtime,commissions,overtime}
    n:   {...,hours,overtime} {...,sales,commissions}

Instead, subsystems at levels n and n+1 would be combined together as shown:

    m+1: {...,commissions,overtime,total_pay}
    m:   {...,hours,sales,employee_type,commissions,overtime}

This will happen for the following reason. When an update is performed on an ISS, the state variables in each of the updating subsystems are set to their final values. There is no way to identify the intermediate values of "overtime" and "commissions" which would have been calculated at level n. Only by updating "commissions" and "overtime" to their intermediate values could the two-step nature of the calculation be discovered. Unfortunately, knowledge of these intermediate values is not contained in the information input to the decomposition algorithm. The only information available to the algorithm is the first system relation (i.e.
the list of initial unstable states and their corresponding final stable states).

This observation may be stated more generally. The specifications analysis tools will never suggest a decomposition where a state variable is an output state variable at more than one level of a system. If an analyst wishes to show the "multiple-step" nature of a calculation, he or she must identify INTERMEDIATE STATE VARIABLES. In the above example, such a state variable might be called "additional_payments". Suppose "employee type", "commissions" and "overtime" are used to calculate "additional payments". Also suppose "additional payments" is input to the subsystem calculating "total pay". The algorithm would identify a decomposition with the following form.

    n+2: {...,additional_payments,total_pay}
    n+1: {employee_type,commissions,overtime,additional_payments}
    n:   {...,hours,overtime} {...,sales,commissions}

[Figure 13: tree diagram with nodes total_pay, base, add_pay, pay_r, hours, over, com, sales, emp_t and emp_p.]

The required use of intermediate state variables is not a restriction on the generality of SELMA. In fact, it could be argued that the commissions and overtime amounts before and after the restrictions are applied are not the same properties of the system. That is, "commissions before restrictions" is not the same state variable as "commissions after restrictions". Perhaps an analyst wishing to model them as the same state variable is actually making a mistake. This mistake might be caused by thinking about the system in procedural rather than sublaw oriented terms. Thus the required use of intermediate state variables can be seen as a kind of semantic integrity check.
That is, if none of the decompositions suggested by the tools exhibit the structure intuitively expected by the analyst, some state variables may be serving dual roles and additional state variables may be required.

A listing of the formal model for the modified payroll system has been included as Appendix J. A total of 48 decompositions are suggested by the specifications analysis tools. These are also included as Appendix K. The decomposition matching the structure of the sublaws is shown in diagrammatic form in Figure 13. The fact that so many decompositions are generated highlights the need for additional heuristics to reduce the selection task faced by a designer. Some additional heuristics will be discussed in the next chapter.

Figure 13 caption: A diagrammatic representation of Decomposition #27 for the modified payroll system. This is the decomposition reflected in the sublaws.

3.6. Conclusions

System decomposition can be performed by considering the manner in which the values of the state variables describing the system change under the influence of external events. A theory of decomposition embodying the concepts of emergent and hidden system state variables has been developed. An algorithm for decomposing systems utilizing Wand and Weber's requirement for deterministic decompositions has been described.

The goal of this theory of decomposition is quite different from the formalisms of HOS (Hamilton and Zeldin, 1976) and Mili et al. (1986). These formalisms focus on ensuring that given subsystems are combined in a consistent manner. They are not concerned with the identification of subsystems. Computerized tools implementing their ideas would be "passive" in nature. That is, the tools would merely test the consistency of given decompositions.
Myers (1978) and Yourdon and Constantine (1979) were concerned with developing a methodology for actively finding deterministic subsystems. Similarly, the specifications analysis tools are "active" in the sense that they can suggest decompositions for a system. However, while the techniques of Myers and Yourdon and Constantine are informal and depend to a great extent on human judgement, the algorithm used by the specifications analysis tools is derived from a theory of decomposition and may be completely automated.

Two of the three basic forms of decomposition identified in Chapter 1 are supported by the theory. Wand and Weber's requirement is used in conjunction with several heuristics to identify subsystems which are candidates for parallel decomposition. The processes⁵⁸ associated with the subsystems at any level of a decomposition may be executed in parallel. No subsystem will have an associated process which depends on the output of another subsystem at the same level. The update is the essence of sequential decomposition. The ISS formed by an update using certain subsystems represents the states of the system after the processes associated with those subsystems have been completed. Parallel decomposition may be performed following the construction of the first ISS or after any update. A possible interpretation of the third generic form of decomposition, namely conditional decomposition, is discussed in Chapter 5.

⁵⁸ The relational form (i.e. initial/final state pairs) of these processes could be obtained directly from the system relation with which the subsystem is associated.
The decompositions generated by the specifications analysis tools could provide a basis for either analysis of the system or design of some artifact intended to represent the system (as in the case of a computerized information system). For analysis, the subsystems identified by the specifications analysis tools will be guaranteed to behave in a deterministic way. This will reduce the cognitive load required to comprehend the operation of the entire system. For design, the decomposition can provide the basis of a hierarchy of program modules as required by structured programming. As well, a deterministic subsystem identified by the specifications analysis tools could be easily implemented as an object in an object-oriented programming system. The state variables describing the subsystem would comprise the state vector of an object type. The processes or methods encapsulated with this state vector could be described by a sublaw specifying the relationships between state variables.

The example systems considered in this section were quite small. It is likely that larger systems will give rise to even greater numbers of deterministic decompositions. The next chapter suggests a "ranking" heuristic which could be used to present the analyst with the "best" decompositions first. Another heuristic for reducing the size of the decomposition search space is also suggested.

Chapter 4: System Complexity, Maintenance, and Goals

4.1. General

The last chapter showed how the internal structure of a system may be discovered given only the states resulting from the action of external events.
This internal structure is found through the use of a decomposition algorithm based on a number of heuristics. The heuristics serve to limit the number of "possible" decompositions which must be considered by the analyst. However, as illustrated by the simple payroll system examples, these heuristics still allow a large number of decompositions. Some method of ranking these decompositions is required so that only the best need be presented to the analyst.

To this point, all subsystems produced by the specifications analysis tools have been considered to be equally suitable as bases for the construction or understanding of a system. For example, the subsystems

    {emp_p,emp_t,hours,pay_r,sales,total_pay}

and

    {base,add_pay,total_pay}

where

    hours = hours worked       emp_p     = employee position
    sales = amount of sales    total_pay = total pay
    pay_r = pay rate           emp_t     = employee type
    base  = base pay           add_pay   = additional payments

suggested for the modified payroll system, are considered to describe equally suitable modules for the calculation of total pay. The first subsystem suggests calculation of total pay given only the initial inputs (or the state variables affected by external events), whereas the second suggests making use of the intermediate values: base pay and additional payments. Most analysts would agree that implementation (or understanding) of the first subsystem would be more difficult than the second.
The subsystem calculating total pay from initial inputs would be more complicated than the subsystem utilizing the intermediate values. Therefore, the second subsystem likely describes a superior subsystem in that its complexity is less than the first. Unfortunately, the intermediate values of base pay and additional payments must be calculated before the second subsystem may begin calculation of total pay. These lower-level subsystems might increase the complexity of the system beyond the complexity of a calculation from initial inputs. A quantitative measure of decomposition complexity is required so that different candidate decompositions may be compared and ranked.

This chapter is primarily concerned with the selection of such a complexity measure. As there is no general consensus on the meaning of the term "complexity", the chapter will begin with a discussion of some necessary characteristics for a measure of complexity suitable for use with systems modelled using SELMA. The final selection will be rationalized by tracing the logical development of the measure beginning with Ashby's (1956) definition of system "variety". Variety will be modified to provide some additional desirable properties. After a logarithmic transformation, the modified variety measure is identical to "entropy" as defined by Shannon (1948). While entropy will be shown to be unsuitable as a measure of complexity, its problems can be overcome with a simple modification. This modification was first made by Hellerman (1972). He called the resulting measure "computational work"⁵⁹. Computational work has been adopted as the measure of complexity for this research.
In summary, rationalization of the complexity measure shall consist of four major stages:

1. Ashby's Variety
2. Modified Variety
3. Shannon's Entropy
4. Hellerman's Computational Work

⁵⁹ Hellerman's choice of the label "computational work" is in many ways unfortunate. His measure does not reflect the number of machine operations required to perform a calculation. This sort of machine work would be highly implementation dependent. Hellerman uses a multiplication subroutine as an example. The subroutine could calculate 38 * 73 by adding 73 to itself 38 times. However, there are easier ways to perform multiplication which require far less machine work. Hellerman was interested in finding a measure of the difficulty of a calculation which would be implementation independent. He also notes that, in the Computer Science literature, complexity is a quantity which varies directly with work, "and so may be identified, loosely, with it" (Hellerman, 1972, p. 439).

Application of the complexity measure will be demonstrated using the modified payroll system example of the previous chapter. The complexity of anticipated future changes to a system (or system maintenance) is expected to influence design. These influences will be illustrated using the modification of the payroll system described in the previous chapter. The chapter will close with the definition of another heuristic for pruning the decomposition search tree. This heuristic uses the measure of complexity and depends upon knowledge of the system's purpose or goal.

4.2. Complexity

The Oxford Dictionary defines something as complex if it "consists of parts". Most people would agree that something is complex if it is made up of many parts.
A block of ice is usually not considered to be a complex object, whereas a space shuttle is very complex. Thus "many-partedness" does seem to be an essential ingredient for complex things. However, a mountain need not be considered to be a complex object even though it consists of a very large number of individual pieces of rock, and the block of ice would be a complex object if the motions of individual electrons and nuclei were considered. Clearly, although many-partedness is important, it must be many-partedness at the level of abstraction where the behaviour of interest is manifested. That is, if we are only concerned with the static behaviour of mountains, they are indeed simple things. However, if we are interested in patterns of erosion, such as landslides, or even geological uplift, then mountains become fairly complex systems of interacting strata and faults. Similarly, if we are interested in the gross (or emergent) properties of a block of ice, the block may be treated as a simple thing. But if we are concerned with "lower-level" properties of ice, such as molecular bonding via electron sharing, the same block must be regarded as a complex system.

Therefore, the following necessary criterion for a definition of complexity is proposed.

    Complexity must be related to the behaviour of a system.

Any acceptable definition must recognize that complexity is related to the dynamics of the system⁶⁰.

4.2.1. Variety

The first step in the rationalization of a measure of complexity will be variety. Ashby (1956) notes that most systems of interest have outputs. He defined variety to be the number of different output states exhibited by a system.
For example, consider the following two subsystems from the modified payroll system. A table of input and output states for each subsystem is provided.

Subsystem #1: {hours,pay_r,base}

Base pay will only be non-zero (abbreviated "nz") if the pay rate is non-zero and the hours worked is non-zero (i.e. hours worked is either less than the limit for regular hours, "reg", or sufficient for overtime pay, "ot").

    Inputs            Output
    hours   pay_r     base
    0       0         0
    0       nz        0
    reg     0         0
    reg     nz        nz
    ot      0         0
    ot      nz        nz

⁶⁰ Notice that this requirement is somewhat at odds with the common usage of the term "complex". Many people would consider an assembly consisting of two parts to be more complex than an assembly consisting of only one part. Further, they might continue to support this ranking even if the assemblies exhibited no behaviour other than simple "existence". The notion of complexity, as presented in this research, is more restrictive in that it does not address this sort of "static complexity". It will be argued that only "dynamic complexity" is important in assessing the quality of a decomposition.

Subsystem #2: {emp_p,sales,com}

Commissions will only be non-zero if the employee has a regular position (abbreviated "r") as opposed to a management position (abbreviated "m") and some sales have been made.

    Inputs            Output
    emp_p   sales     com
    r       0         0
    r       nz        nz
    m       0         0
    m       nz        0

The variety of both subsystems is 2 since both base pay and commissions may exhibit two distinct values.

Variety is at least similar to complexity. It seems intuitively reasonable to expect a system exhibiting a large number of output states to be more complex than one which shows only a small number of output states. However, complexity appears to be a function of more than just output states.
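The variety count for the two subsystems tabulated above can be sketched in a few lines. This is only an illustration, not part of the specifications analysis tools: the names `BASE_PAY`, `COMMISSIONS` and `variety` are hypothetical, and each table is simply transcribed as (input state, output state) pairs.

```python
# Transition tables transcribed from the text: (input state, output state).
# The constant and function names are illustrative assumptions.
BASE_PAY = [
    (("0", "0"), "0"),      # (hours, pay_r) -> base
    (("0", "nz"), "0"),
    (("reg", "0"), "0"),
    (("reg", "nz"), "nz"),
    (("ot", "0"), "0"),
    (("ot", "nz"), "nz"),
]
COMMISSIONS = [
    (("r", "0"), "0"),      # (emp_p, sales) -> com
    (("r", "nz"), "nz"),
    (("m", "0"), "0"),
    (("m", "nz"), "0"),
]


def variety(table):
    """Ashby's variety: the number of distinct output states in the table."""
    return len({output for _inputs, output in table})


print(variety(BASE_PAY), variety(COMMISSIONS))
```

Both tables yield the same count even though they differ in the number of input states, which is the limitation of plain variety noted above.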
Consider the following possible partial implementations of the base pay and commissions subsystems:

    procedure base_pay(hours,pay_r,base);
    begin
      case pay_r of
        0:  base := 0;
        nz: case hours of
              0:   base := 0;
              reg: base := nz;
              ot:  base := nz;
            endcase;
      endcase;
    end;

    procedure commissions(emp_p,sales,com);
    begin
      case sales of
        0:  com := 0;
        nz: case emp_p of
              m: com := 0;
              r: com := nz;
            endcase;
      endcase;
    end;

The base pay calculation procedure is slightly longer than the one for commissions because there are more input states to consider. Because there are three possible values for hours worked and two for pay rate, the base pay subsystem has six input states. The commissions subsystem has only four input states. This suggests that a measure of complexity which is not only a function of output states, but of input states as well, is required. Such a measure will form the next step in the development of a measure of system complexity.

4.2.2. Modified Variety

Each of a deterministic system's input states will lead to one and only one output state. The probability of observing a particular output state is equal to the probability of observing any of the input states leading to that output state. Variety can be modified to be a function of the probabilities of observing each output state. Thus the modified measure would be a function of both input and output states.

For systems modelled using SELMA, the probability of observing a particular output state is determined by the frequencies of the external events. The analyst could be asked to estimate these frequencies. They are, after all, likely required to facilitate implementation-level decisions relating to such things as data storage location or file access method.
However, for purposes of analysis and design, an analyst is not concerned with the probability of an external event, only with understanding or designing the system's response to that event. For example, a computer program must contain routines to handle all anticipated inputs. The fact that a particular input may occur more often than another does not influence the difficulty of the code written to handle that input. Therefore, for purposes of analysis, it shall be assumed that the probability of observing each external event is the same. For subsystems, this is the same as assuming the probabilities of observing all input states are equal. Therefore, for this research, the probability $p_i$ of observing a given output state $O_i$ will be defined as

$$p_i = I_i / I$$

where $I_i$ is the number of input states leading to $O_i$ and $I$ is the total number of input states.

There are many possible ways of incorporating output state probabilities into a modified measure of variety. However, for consistency, the modified measure should yield the same value as Ashby's variety when probabilities do not matter (i.e. when they are all equal). It should also satisfy a somewhat less intuitive requirement. By definition, if the probability of observing output state $O_1$ is less than that of observing output state $O_2$, there are fewer input states leading to $O_1$ than to $O_2$. As illustrated below, this means that fewer decisions must be made before moving the system to state $O_1$ than to $O_2$. Therefore, when two systems exhibit the same input and output states, the system in which the output probabilities are most unequal will be the least complicated.
That is, modified variety should be a maximum when the probabilities of observing each output are the same. Consider two simple systems with four input and two output states.

[Figure 14: Probabilities of observing output states given equal input state probabilities. If $P(I_i) = P(I_j)$ for all $i$ and $j$, then $P(O_1) = 2/5$ and $P(O_2) = 3/5$.]

System 1: {a,b,c}
Probabilities of observing each output state are NOT equal.

    Inputs     Output
    a    b     c
    0    0     0
    0    1     0
    1    0     0
    1    1     1

System 2: {d,e,f}
Probabilities of observing each output state are equal.

    Inputs     Output
    d    e     f
    0    0     1
    0    1     0
    1    0     0
    1    1     1

Possible implementations for these systems are as follows:

    procedure c(a,b,c);
    begin
      case a of
        0: c := 0;
        1: case b of
             0: c := 0;
             1: c := 1;
           endcase;
      endcase;
    end;

    procedure f(d,e,f);
    begin
      case d of
        0: case e of
             0: f := 1;
             1: f := 0;
           endcase;
        1: case e of
             0: f := 0;
             1: f := 1;
           endcase;
      endcase;
    end;

In the first system the probabilities of observing outputs of 0 and 1 are 0.75 and 0.25 respectively. In the second system the probabilities are both 0.50. The implementation for the second system is slightly longer (or more complex) than the one for the first.

One measure exhibiting both of the above properties is as follows:

$$\text{Modified Variety} = \prod_{i=1}^{n} (1/p_i)^{p_i}$$

where
    $n$ is the number of output states,
    $p_i$ is the probability of observing output $i$,
and $p_i = I_i / I$, where
    $I_i$ is the number of input states leading to output $i$,
    $I$ is the total number of input states.

If all the $I_i$'s are the same, all the $p_i$'s will be equal to $1/n$ and

$$\text{Modified Variety} = \prod_{i=1}^{n} n^{1/n} = n = \text{Variety}$$

as desired.

In the base pay subsystem there are 6 different input states. Two of these states lead to a non-zero output state and 4 lead to output states of zero.
Therefore the probabilities of observing output states of non-zero and zero are 0.33 and 0.66 respectively. Therefore, the modified variety of the base pay subsystem is 1.89 (= $(1/0.33)^{0.33} \times (1/0.66)^{0.66}$). The modified variety of the commissions subsystem may be similarly calculated to be 1.76. This implies that the commissions subsystem is in some sense less complex than the base pay subsystem, but both are less complex than a subsystem which would produce the values "nz" and "0" with equal probability. Because there are two output states, the modified variety of such a system would be 2.00.

At this point a digression is in order. In the above implementations, the only language primitives assumed were a selection structure in the form of "case" and an assignment operator in the form of ":=". Different languages are likely to have different primitives. For example, most languages have a multiplication operator. Such an operator would greatly simplify calculation of base pay since it is merely the product of hours and pay rate. The case structures could be eliminated. However, a multiplication operator would not much simplify the calculation of commissions as a selection depending on employee position is required. That is, no commissions are calculated for management employees. As the complexity measure is to be used to help select a decomposition for use as a basis for system implementation, the primitives of the implementation language are obviously important. It is possible to imagine a fourth-generation language which provides a primitive for the calculation of total pay given the initial inputs of employee position, employee type, hours, pay rate and sales.
If this language was to be used for implementation of the payroll system, software written using any of the more detailed decompositions would likely be more complex than software written for the monolithic subsystem.

Also consider a system with the following inputs and output.

    Inputs     Output
    a    b     c
    1    1     2
    1    2     3

This system has a modified variety of 2. Now consider another system.

    Inputs     Output
    d    e     f
    1    1     2
    1    2     3
    2    1     3
    2    2     4

This system has a modified variety of 2.83. Yet both systems could be implemented identically using a simple addition primitive. But is addition really simple? Decimal addition requires the use of a 100 entry look-up table (0+0=0, 0+1=1, ..., 9+9=18) and a set of rules for "carrying" (or "borrowing" in the case of negative addition or subtraction). Of course, addition in binary is simpler than addition in decimal but is still a non-trivial exercise. In the case of the base pay subsystem, multiplication was suggested to be a simple operation. In fact, not too many years ago some publishers were able to make a profit selling large look-up tables of logarithms which could be used in conjunction with the addition look-up table and rules (hopefully contained within the user's brain) to simplify multiplication.

Modified variety provides a measure of the basic difficulty of a procedure. It is independent of whatever language primitives will be available during implementation. It is often argued that implementation issues, such as language selection, should not be considered during the early stages of systems analysis. It is these early stages that SELMA is designed to support.
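The modified variety figures quoted so far can be checked numerically. The sketch below is an illustration only: the function name `modified_variety` and the list constants are assumptions, each list holds the output state reached from one input state, and input states are taken to be equally likely as in the text.

```python
from collections import Counter
from math import prod


def modified_variety(outputs):
    """Product over output states of (1/p_i) ** p_i, with p_i = I_i / I.

    `outputs` holds one entry per input state (the output it leads to),
    so equal input-state probabilities are assumed.
    """
    total = len(outputs)
    counts = Counter(outputs)          # I_i for each distinct output state
    return prod((total / k) ** (k / total) for k in counts.values())


# One output per input state, transcribed from the tables in the text.
BASE_PAY = ["0", "0", "0", "nz", "0", "nz"]
COMMISSIONS = ["0", "nz", "0", "0"]
ADD_TWO_ROWS = [2, 3]                  # outputs c of the first addition table
ADD_FOUR_ROWS = [2, 3, 3, 4]           # outputs f of the second addition table

for name, outputs in [
    ("base pay", BASE_PAY),
    ("commissions", COMMISSIONS),
    ("addition, two rows", ADD_TWO_ROWS),
    ("addition, four rows", ADD_FOUR_ROWS),
]:
    print(f"{name}: {modified_variety(outputs):.4f}")
```

With exact fractions rather than the rounded probabilities 0.33 and 0.66, the commissions value comes out slightly below the 1.76 quoted earlier; the other three values agree with the text to two decimal places.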
In fact, mathematical operations such as addition and multiplication are not likely to appear in the early stages of systems analysis but are more likely to be found in later stages where the procedures are developed to calculate emergent state variables. The payroll systems used as examples here are quite "low-level" in their focus. That is, the actual procedures used to calculate total pay are likely to be of interest only in the later stages of the analysis of an entire personnel and accounting system. This is not to say that SELMA is not applicable to such a low-level system. Rather, it is the complexity heuristic which is of questionable use at levels of analysis close to implementation because of the variety of different implementation primitives available.

Back to the discussion of variety. It would also be nice if the modified variety of a system formed by merging two independent subsystems could be found by combining the modified varieties of the subsystems in some simple way. In fact, as shown in Appendix M, the modified variety of such a system is simply the product of the modified varieties of the subsystems. Consider the system formed by merging the
base pay and commissions subsystems:

    {emp_p,hours,pay_r,sales,base,com}

The input and corresponding output states for this system are as follows:

    Inputs                           Outputs
    emp_p   hours   pay_r   sales    base   com
    r       0       0       0        0      0
    r       0       0       nz       0      nz
    r       0       nz      0        0      0
    r       0       nz      nz       0      nz
    r       reg     0       0        0      0
    r       reg     0       nz       0      nz
    r       reg     nz      0        nz     0
    r       reg     nz      nz       nz     nz
    r       ot      0       0        0      0
    r       ot      0       nz       0      nz
    r       ot      nz      0        nz     0
    r       ot      nz      nz       nz     nz
    m       0       0       0        0      0
    m       0       0       nz       0      0
    m       0       nz      0        0      0
    m       0       nz      nz       0      0
    m       reg     0       0        0      0
    m       reg     0       nz       0      0
    m       reg     nz      0        nz     0
    m       reg     nz      nz       nz     0
    m       ot      0       0        0      0
    m       ot      0       nz       0      0
    m       ot      nz      0        nz     0
    m       ot      nz      nz       nz     0

The modified variety of this combined system is

$$(24/12)^{12/24} \times (24/4)^{4/24} \times (24/6)^{6/24} \times (24/2)^{2/24} = 3.32$$

which is also equal to the product of the modified varieties of the original subsystems (1.89 * 1.76 = 3.33), ignoring some round-off error. This result can be easily generalized to systems formed by merging more than two independent subsystems.

It was proven that the modified variety of a system formed by merging independent subsystems could be found by multiplying the modified varieties of the components. However, since multiplication is not as easily visualized as addition⁶¹, a logarithmic transformation of the modified variety measure is commonly used. This transformed measure is called entropy.

4.2.3. Entropy

Shannon (1948) was the first to propose a definition of information entropy⁶², although Ashby did not suggest the notion of variety until several years later. Shannon was looking for a measure H of the "degree of choice or uncertainty" in the selection or occurrence of an output state which would be a function of the probabilities of observing each output state $p_1, p_2, \ldots, p_n$. He also wanted the measure to have a number of desirable properties (Shannon, 1948, pp. 392-393).

1. H should be continuous in the $p_i$.
2.
I f a l l the Pi's are equal, p A = 1/n, then H should be a monotonically increasing function of n. With equally l i k e l y events there i s more choice, or uncertainty, when there are more possible events. 3. I f a choice can be broken down into two successive choices, the o r i g i n a l H should be the weighted sum of the i n d i v i d u a l values of H. He concluded that the only H s a t i s f y i n g a l l of these conditions i s of the form 6 3. n H = S P i * l o g d / p j i = l where again I t i s r e l a t i v e l y easy to v i s u a l i z e the r e s u l t of adding two things to a c o l l e c t i o n of four. I t i s much harder to v i s u a l i z e two things m u l t i p l i e d by three. Many people w i l l v i s u a l i z e the m u l t i p l i c a t i o n as a s e r i e s of additions. Addition i s seen to be more i n t u i t i v e than m u l t i p l i c a t i o n . Entropy i s an additive measure while modified v a r i e t y i s m u l t i p l i c a t i v e . Therefore, entropy i s considered superior. 6 2 For the remainder of t h i s research, "information entropy" w i l l be r e f e r r e d to as simply "entropy". 6 3 The formula for H may be m u l t i p l i e d by, or added to, a constant and s t i l l possess the required properties. 114 n i s the number of output states Pi i s the p r o b a b i l i t y of observing output i The base of the logarithm determines the units of entropy. The usual base i s 2, and the u n i t s are b i t s . Logarithms i n t h i s document are always to base 2, although the actual units of entropy are i r r e l e v a n t to t h i s research. Entropy i s equal to the logarithm of the modified v a r i e t y introduced i n the previous section. H = log(Modified Variety) Since modified v a r i e t y was introduced as a possible measure of system complexity, Shannon's "degree of choice", or entropy, of a system i s also a possible measure of complexity. In Shannon's work the pj/s were given. 
Here it is assumed that every input state is equally likely to be observed, so that each $p_i$ may be calculated as follows:

$$p_i = I_i / I$$

where

    $I_i$ is the number of input states leading to output i
    $I$ is the total number of input states

Shannon also noted that H has other properties which make it a reasonable measure of choice (pp. 394-395):

1. H = 0 if and only if all the $p_i$'s but one are zero, this one having the value 1. That is, a system with only one output state has zero entropy.
2. Suppose there are two subsystems A and B with m and n output states respectively. Let $p_{ij}$ be the probability of the joint occurrence of i as the output of the first subsystem and j as the output of the second. The entropy of the system C formed by merging the two subsystems is

$$H_C = \sum_{i,j} p_{ij} \log(1/p_{ij})$$

while

$$H_A = \sum_{i,j} p_{ij} \log\Big(1 \Big/ \sum_{j} p_{ij}\Big), \qquad H_B = \sum_{i,j} p_{ij} \log\Big(1 \Big/ \sum_{i} p_{ij}\Big)$$

and it is easily shown that

$$H_C \le H_A + H_B$$

Note: The previous examples of merged subsystems were for independent subsystems where $p_{ij} = p_i p_j$, so that $H_C = H_A + H_B$.

Unfortunately, as a measure of complexity in the sense of coding or understanding difficulty, entropy is flawed. The second property seems to run counter to the heuristic in Chapter 3 which suggested that subsystems be kept as small as possible. The entropy of a merged process can be smaller than the sum of the entropies of the component processes. This is one undesirable property of entropy as a measure of system complexity. There is another.
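The product rule for modified variety, and the additive and subadditive behaviour of entropy, can be checked numerically. The Python sketch below is illustrative only: the function names and the dictionary encoding of the look-up tables are assumptions made here and are not part of the specifications analysis tools. It rebuilds the base pay and commissions tables of the payroll example, merges them as independent subsystems, and then constructs a pair of fully dependent subsystems for which the inequality in property 2 is strict.

```python
from collections import Counter
from itertools import product
from math import isclose, log2, prod


def output_probabilities(table):
    """Return p_i = I_i / I for each output state, assuming equally likely inputs."""
    counts = Counter(table.values())
    total = sum(counts.values())
    return [n / total for n in counts.values()]


def modified_variety(table):
    """Product over output states of (1 / p_i) ** p_i."""
    return prod((1 / p) ** p for p in output_probabilities(table))


def entropy(table):
    """Entropy in bits: sum over output states of p_i * log2(1 / p_i)."""
    return sum(p * log2(1 / p) for p in output_probabilities(table))


# Look-up tables keyed by input state (hypothetical encoding of the example).
base_pay = {(h, r): "nz" if h != "0" and r == "nz" else "0"
            for h, r in product(["0", "reg", "ot"], ["0", "nz"])}
commission = {(e, s): "nz" if e == "r" and s == "nz" else "0"
              for e, s in product(["r", "m"], ["0", "nz"])}

# Independent subsystems: every pairing of their input states can occur.
merged = {(ka, kb): (va, vb)
          for (ka, va), (kb, vb) in product(base_pay.items(), commission.items())}

print(round(modified_variety(merged), 2))
print(round(entropy(base_pay) + entropy(commission), 4), round(entropy(merged), 4))

# Fully dependent subsystems: both outputs copy the same input bit.
copy_a = {x: x for x in (0, 1)}
copy_b = {x: x for x in (0, 1)}
joint = {x: (x, x) for x in (0, 1)}
print(entropy(joint), entropy(copy_a) + entropy(copy_b))
```

Under these assumptions the merged variety rounds to the 3.32 quoted in the text, the two entropy figures for the independent case agree, and the dependent pair has a joint entropy of 1 bit against a component sum of 2 bits.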
Consider the following systems and possible implementations:

{a,b}

    Input   Output
    a       b
    1       1
    2       2

    procedure b(a,b);
    begin
      case a of
        1: b := 1;
        2: b := 2;
      endcase;
    end;

and

{c,d}

    Input   Output
    c       d
    1       1
    2       2
    3       1
    4       2

    procedure d(c,d);
    begin
      case c of
        1: d := 1;
        2: d := 2;
        3: d := 1;
        4: d := 2;
      endcase;
    end;

The entropy of {a,b} is 1, but the entropy of {c,d} is also 1 despite the fact that its implementation is twice as long. This is a result of the fact that entropy is based on the probability of observing a given output. It is not dependent on the absolute number of input states which give rise to those outputs, but only on their ratios.

Solutions to both of these undesirable properties of entropy were suggested by Hellerman (1972). His measure has been selected as the estimate of system complexity for this research.

4.2.4. Computational Work

Hellerman (1972) was interested in estimating the amount of work done by a process independent of its implementation. His measure is equal to the amount of information stored in the look-up table implementation of a process. Look-up tables are a list of input and corresponding output states, and have been used to describe the dynamics of the systems discussed in this chapter.

To determine the amount of information in a look-up table, Hellerman suggests performing an experiment. First the table is implemented in a computer memory by utilizing the concept of a DOMAIN CLASS. A domain class is the set of input states which map into a single output state. If there are N output states, there are N domain classes. If there are I input states, the look-up table may be implemented in a computer memory consisting of I locations by placing the output value corresponding to the jth input state in the jth location. An arbitrary memory location may then be selected and its contents examined.
If $I_i$ is the number of input states leading to the ith output state, the particular contents found in the selected memory location occur in $I_i$ locations. Therefore, its probability of selection was $I_i/I$. According to information theory, the selection provided $\log(I/I_i)$ bits of information. The total information which may be extracted from the memory is then

$$\sum_{i=1}^{N} I_i \log(I/I_i)$$

This is the total amount of information stored in the memory or the total information required by the process. Hellerman called this quantity the computational work (W) and it is equal to the number of input states multiplied by the entropy of the process.

$$W = \sum_{i=1}^{N} I_i \log(I/I_i) = I \cdot H$$

He also notes that, in the computer science literature, complexity is a quantity that varies directly with work and "so may be identified, loosely, with it" (p. 439). In this research, an absolute value for the complexity of a process is not required. The measure need only provide relative levels of complexity, and be additive [64]. As noted earlier, the selection of "computational work" as the name for this quantity is perhaps unfortunate as its value is independent of any particular computer implementation. For the purposes of this research, "computational work (W)" shall be renamed "complexity (C)".

$$C = \sum_{i=1}^{N} I_i \log(I/I_i)$$

This formulation of complexity avoids the two problems noted for entropy. The complexity of a system formed by merging several subsystems will always be greater than or equal to the sum of the complexities of the component subsystems. That is, if A and B are subsystems with complexities $C_A$ and $C_B$ respectively, and D is the system, with complexity $C_D$, formed by merging A and B, the following will be true.
$$C_D \ge C_A + C_B$$

In fact, Hellerman notes that if A and B have no input states in common, the complexity of D is given by

$$C_D = I_B C_A + I_A C_B$$

where $I_A$ and $I_B$ are the numbers of input states of subsystems A and B respectively (p. 442). Therefore, the heuristic calling for small subsystems can be justified from the standpoint of reducing overall complexity. The second problem, relating to entropy's reliance on only the proportions of input states leading to each final state, is also solved. Recall the two systems {a,b} and {c,d} described above. The complexity of {a,b} is equal to 2 (I = 2, H = 1). The complexity of {c,d} is twice this amount or 4 (I = 4, H = 1). This difference in complexity is intuitively reasonable when the possible implementations (given earlier) are considered.

[64] Therefore, if A, B and D are processes with complexities $C_A$, $C_B$, and $C_D$ respectively, and $C_A > C_B$, then $C_A + C_D > C_B + C_D$. This property is possessed by both entropy and computational work.

4.2.5. States or State Variables?

In software cost estimation, a common input to module complexity calculations is the number of input variables (Halstead, 1977; Albrecht, 1979; Bailey and Basili, 1981; Rubin, 1983). If the sort of complexity being estimated in these calculations is the same as that described in this chapter, such a practice can only be justified if the number of variables is monotonically related to the number of input states. Perhaps, on average, this will be close to the truth, but it is only correct when all input variables have the same degree of interdependence and the same number of possible values.
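Returning briefly to the complexity measure of Section 4.2.4, the figures quoted for {a,b} and {c,d} can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions: the function name `complexity` and the representation of a look-up table as a plain list of outputs (one entry per input state) are choices made here for illustration, not Hellerman's or the thesis's notation. The last lines also check the merge formula $C_D = I_B C_A + I_A C_B$ on the base pay and commissions subsystems, which share no input variables.

```python
from collections import Counter
from math import isclose, log2


def complexity(outputs):
    """C = sum_i I_i * log2(I / I_i); `outputs` holds one output per input state."""
    counts = Counter(outputs)
    total = len(outputs)
    return sum(n * log2(total / n) for n in counts.values())


# Look-up tables from the text, written as the column of outputs.
ab = [1, 2]                               # {a,b}: 2 input states
cd = [1, 2, 1, 2]                         # {c,d}: 4 input states
base = ["0", "0", "0", "0", "nz", "nz"]   # {hours,pay_r,base}: 6 input states
com = ["0", "0", "0", "nz"]               # {emp_p,sales,com}: 4 input states

# Merging two subsystems with no inputs in common pairs every input state
# of one with every input state of the other.
merged = [(b, c) for b in base for c in com]

c_base, c_com, c_merged = complexity(base), complexity(com), complexity(merged)
print(complexity(ab), complexity(cd))
print(round(c_base, 2), round(c_com, 2), round(c_merged, 2))
print(isclose(c_merged, len(com) * c_base + len(base) * c_com))
```

With these tables the sketch gives 2 and 4 for {a,b} and {c,d}, and 5.51 and 3.25 for the two payroll subsystems, matching the values used in this chapter. The merged complexity is well above the sum of the parts, which is the behaviour that motivates keeping subsystems small.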
For example, if there are three input variables with 2 possible values each, and if there is no relationship relating the variables to each other, the number of input states is 2^3 = 8. If another similar variable is added, the number of states would become 16, and so on. However, if a fourth variable with three possible values is added, the number of input states would become 24. Therefore, the number of input states to a module need not be monotonically related to the number of input variables, and a basic assumption of software cost estimation techniques is shown to be questionable.

4.3. Heuristic Guided Search

A measure of system complexity was required so that the decompositions generated by the algorithm of Chapter 3 might be presented to the analyst in a meaningful order [65]. Alternative decompositions will be presented in order of increasing complexity. The complexity of a decomposition is defined as being equal to the sum of the complexities of its constituent subsystems at all levels of the decomposition [66]. The algorithm performs updates on an intermediate state space (ISS) using every possible subset of the subsystems which were deterministic with respect to that ISS. The complexity measure can provide the basis for a heuristic to select the subset most likely to lead to a "high quality" decomposition, where "high quality" is defined as low complexity.

[65] As indicated by a footnote in Chapter 3, there is no suggestion that the algorithm and ranking heuristic (i.e. complexity) described here are the "best". It is possible that more efficient algorithms and more appropriate heuristics exist. This section is intended to show that automated decomposition and some sort of meaningful ranking of alternatives is possible.
[66] Notice that the sum of the subsystem complexities is only the lower limit to Hellerman's complexity of the system. However, one of the important reasons for decomposing a system is to avoid having to visualize the entire system at once. An analyst deals with individual subsystems at each level of the decomposition. Therefore, the sum of the subsystem complexities is a reasonable estimate of the overall effort required to understand the system.

For example, suppose the modified payroll system has been updated once producing the partial decomposition shown below. Complexities of individual subsystems are shown following a "|".

State variable abbreviations:

    hours = hours worked
    pay_r = rate of pay
    emp_p = employee position
    sales = amount of sales
    com   = commission pay
    over  = overtime pay

    1: {hours,pay_r,base}|5.51  {emp_p,sales,com}|3.25  {emp_p,hours,over}|3.90

This is not a full decomposition in that the second ISS, formed by the update at level 1, still contains unstable states. In particular, the state variables representing additional payments and total pay have not been updated to reflect their final values. The subsystems which are deterministic with respect to the second ISS (and which satisfy the other heuristics presented in Chapter 3) are

    {base,com,emp_t,over,total_pay}|12.98
    {base,emp_p,emp_t,over,sales,total_pay}|22.04
    {base,com,emp_p,emp_t,hours,total_pay}|29.61
    {com,emp_t,over,add_pay}|8.00
    {emp_p,emp_t,over,sales,add_pay}|9.71
    {com,emp_p,emp_t,hours,add_pay}|15.34

where

    emp_t     = employee type
    total_pay = total pay
    add_pay   = additional payments

A good search heuristic should indicate the subset of this set of subsystems which is most likely to lead to the lowest-complexity decomposition [67]. There are 15 subsets [68].
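The count of 15 follows from the restriction that no two subsystems used in the same update may share an output state variable. A short Python sketch of that enumeration is given below. It is illustrative only: the letter labels, the dictionary layout and the function name `update_subsets` are assumptions introduced for the example, and the only facts taken from the text are each candidate's output variable and complexity.

```python
from itertools import combinations

# Candidate subsystems for the second ISS: label -> (output variable, complexity).
# The labels A-F are shorthand chosen for this sketch.
candidates = {
    "A": ("total_pay", 12.98),
    "B": ("total_pay", 22.04),
    "C": ("total_pay", 29.61),
    "D": ("add_pay", 8.00),
    "E": ("add_pay", 9.71),
    "F": ("add_pay", 15.34),
}


def update_subsets(cands):
    """List every non-empty subset in which all output variables are distinct."""
    names = sorted(cands)
    subsets = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            outputs = [cands[name][0] for name in combo]
            if len(set(outputs)) == len(outputs):
                subsets.append(combo)
    return subsets


subsets = update_subsets(candidates)
print(len(subsets))
print(subsets)
```

The result consists of the six single subsystems plus the nine pairings of a "total_pay" subsystem with an "add_pay" subsystem; pairs such as A with B are rejected because both would determine "total_pay".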
For each subset, the specifications analysis tools determine the minimum and maximum possible decomposition complexities, where that subset comprises level 2. That is, each possible update has, associated with it, a minimum and a maximum possible decomposition complexity. These minimum and maximum complexities are based on information already obtained during the search. The minimum possible decomposition complexity is equal to the sum of the subsystem complexities at all lower levels plus the total complexity of the subsystems used for update at the current level. In other words, the minimum possible complexity is determined by assuming that all potential higher-level subsystems have zero complexity. The maximum possible decomposition complexity is equal to the minimum possible complexity plus the complexities of the least complex subsystems known so far which can determine the final values of any remaining output state variables. The next update will be performed using the subset with the lowest associated minimum complexity. Minimum and maximum complexities can be used together to "prune" the search tree.

[67] This sort of heuristic search is sometimes called the "best bud" method (Sandewall, 1971).
[68] Recall that no two deterministic subsystems chosen for use in an update operation may contain the same output state variable. Therefore, there are only 3 * 3 + 6 = 15 possible update subsets.

For example, consider an update using the subset

    {{base,com,emp_t,over,total_pay}|12.98, {com,emp_t,over,add_pay}|8.00}.

The total complexity of level 1 is 12.66 (= 5.51 + 3.25 + 3.90). Therefore, the minimum possible complexity of any decomposition arising from this update is 33.64 (= 12.98 + 8.00 + 12.66). After this update there will be no remaining output state variables. That is, in the third intermediate state space created by this update, all state variables will have reached their final values. Therefore, for this update subset, the minimum and maximum possible complexities are equal.

As another example, consider an update using the subset

    {{com,emp_t,over,add_pay}|8.00}

The minimum possible complexity of any decomposition arising from this update is 20.66 (= 8.00 + 12.66). The state variable "total_pay" will still be an output with respect to the third intermediate state space created by this update. The lowest-complexity deterministic subsystem discovered thus far which can calculate the final value of "total_pay" is

    {base,com,emp_t,over,total_pay}|12.98

Therefore, the maximum possible complexity arising from this update is 33.64 (= 12.98 + 20.66).

Minimum and maximum complexities for all possible update subsets are listed below. To reduce the size of the table, subsystems are coded as follows:

    Subsystem                                  Complexity   Code
    {base,com,emp_t,over,total_pay}            12.98        A
    {base,emp_p,emp_t,over,sales,total_pay}    22.04        B
    {base,com,emp_p,emp_t,hours,total_pay}     29.61        C
    {com,emp_t,over,add_pay}                    8.00        D
    {emp_p,emp_t,over,sales,add_pay}            9.71        E
    {com,emp_p,emp_t,hours,add_pay}            15.34        F

    Update     Minimum Possible   Maximum Possible
    Subset     Complexity         Complexity [69]
    {A}        25.63              33.63
    {B}        34.70              42.70
    {C}        42.27              50.27
    {D}        20.66              33.64
    {E}        22.37              35.35
    {F}        28.00              40.98
    {A,D}      33.64              33.64
    {A,E}      35.35              35.35
    {A,F}      40.98              40.98
    {B,D}      42.70              42.70
    {B,E}      44.41              44.41
    {B,F}      50.04              50.04
    {C,D}      50.27              50.27
    {C,E}      51.98              51.98
    {C,F}      57.61              57.61

[69] Minimum and maximum possible decomposition complexities will be the same whenever the update subset contains all of the remaining output state variables. For example, the subset {A,D} contains both "total_pay" and "add_pay". Therefore, the minimum and maximum possible decomposition complexities following an update operation using {A,D} are equal (33.64).

The update leading to the smallest minimum possible complexity uses the subsystem {com,emp_t,over,add_pay} (i.e. update subset {D}). If this update is performed, the partial decomposition becomes

    2: {com,emp_t,over,add_pay}|8.00
    1: {hours,pay_r,base}|5.51  {emp_p,sales,com}|3.25  {emp_p,hours,over}|3.90

The only subsystems which are deterministic with respect to the third ISS (and satisfy the other heuristics of Chapter 3) are {base,add_pay,total_pay}|3.25 and {add_pay,hours,pay_r,total_pay}|11.02. The minimum and maximum possible complexities associated with an update using the first subsystem are both 23.91. If the second subsystem is used, they are both 31.68. The first subsystem is selected for the next update. Thus the first decomposition reached, starting from the given level 1, is as follows:

    3: {base,add_pay,total_pay}|3.25
    2: {com,emp_t,over,add_pay}|8.00
    1: {hours,pay_r,base}|5.51  {emp_p,sales,com}|3.25  {emp_p,hours,over}|3.90

The complexity of this decomposition is equal to the sum of the complexities of the individual subsystems or 23.91 (= 5.51 + 3.25 + 3.90 + 8.00 + 3.25). This is, in fact, the lowest-complexity decomposition of the modified payroll system.

Alternative decompositions, with the same subsystems at level 1, can be found by performing updates using the other subsets of subsystems which were deterministic with respect to the second and third intermediate state spaces. These alternative subsets will be selected in order of increasing minimum possible complexity.

Using subsystem complexity to guide the search for decompositions with low complexity is not quite as straightforward as it appears.
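Before turning to that complication, the bound calculations used in the worked example can be sketched in Python. The sketch rests on several assumptions that are not taken from the specifications analysis tools: the names `bounds` and `all_subsets`, the dictionary of candidates, and the simplification that the "remaining output state variables" are exactly the outputs of the candidate subsystems ("total_pay" and "add_pay"). Because it sums the rounded complexities printed in the text, individual results may differ from the table in the last digit.

```python
from itertools import combinations

LEVEL_1_TOTAL = 5.51 + 3.25 + 3.90   # complexities already committed at level 1

# label -> (output variable, complexity), as coded in the table above
candidates = {
    "A": ("total_pay", 12.98), "B": ("total_pay", 22.04), "C": ("total_pay", 29.61),
    "D": ("add_pay", 8.00), "E": ("add_pay", 9.71), "F": ("add_pay", 15.34),
}


def bounds(subset, cands, lower_total):
    """Return (minimum, maximum) possible decomposition complexity for one update."""
    minimum = lower_total + sum(cands[name][1] for name in subset)
    covered = {cands[name][0] for name in subset}
    remaining = {out for out, _ in cands.values()} - covered
    # Cheapest known subsystem able to finish each output still outstanding.
    extra = sum(min(c for out, c in cands.values() if out == var)
                for var in remaining)
    return minimum, minimum + extra


def all_subsets(cands):
    """Non-empty subsets in which no two subsystems share an output variable."""
    names = sorted(cands)
    return [combo
            for size in range(1, len(names) + 1)
            for combo in combinations(names, size)
            if len({cands[n][0] for n in combo}) == len(combo)]


for subset in all_subsets(candidates):
    low, high = bounds(subset, candidates, LEVEL_1_TOTAL)
    print(subset, round(low, 2), round(high, 2))

best = min(all_subsets(candidates),
           key=lambda s: bounds(s, candidates, LEVEL_1_TOTAL)[0])
print(best)
```

Selecting the subset with the smallest minimum reproduces the choice of update subset {D} made in the text, with bounds of 20.66 and 33.64.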
The above example started from a given set of subsystems at level 1, or a given second ISS. In fact, the specifications analysis tools, when applied to the modified payroll system, suggest a large number of different possible level 1's. The following subsystems are all minimal (i.e. described by as small a number of state variables as possible) and deterministic with respect to the first ISS:

1. {hours,pay_r,base}
2. {emp_p,sales,com}
3. {emp_p,hours,over}
4. {hours,emp_p,emp_t,sales,add_pay}
5. {hours,emp_p,emp_t,pay_r,sales,total_pay}

With these five subsystems there are 31 subsets which might be used to form the second ISS. That is, there are 31 possible sets of level 1 subsystems, or 31 possible partial decompositions resulting from an analysis of the first ISS. The subset of subsystems selected for the first update in the above illustration was

    {{hours,pay_r,base},{emp_p,sales,com},{emp_p,hours,over}}.

Such an update eventually leads to the decomposition with the lowest complexity, but this subset does not have the lowest minimum possible complexity at level 1. While decompositions with lower complexities are generally suggested first, there will often be exceptions.

It was mentioned earlier that maximum and minimum complexity information could be used together to "prune" the search tree [70]. If the analyst is able to specify an upper bound to the complexity of a decomposition he or she is interested in, some possible updates need never be performed. The specifications analysis tools allow the analyst to input the maximum percentage difference between the minimum complexity decomposition and any other suggested decompositions.
If the percentage difference between the minimum complexity associated with a possible update and the maximum complexity associated with the lowest minimum complexity found so far is greater than the specified maximum percentage difference, that update will never be performed [71]. For example, recall from the above illustration, the possible update using the subsystem

    {add_pay,hours,pay_r,total_pay}|11.02

The minimum possible complexity associated with the above subsystem is 31.68. The smallest minimum, and associated maximum, possible complexity found thus far were both 23.91. The percentage difference is 32% (= [31.68 - 23.91] / 23.91). If the analyst had specified 20% as the maximum percentage difference, this possible update would never be performed. Of course, should an analyst wish to see all possible decompositions irrespective of complexity, he or she can simply enter a very large number as the maximum allowed percentage difference.

[70] Such pruning cannot result in the loss of the lowest complexity decomposition. The algorithm will still find the "optimal" decomposition of the system.
[71] This is a modified form of SSS* minimax search (Charniak and McDermott, 1985, pp. 286-290). The concept of "maximum allowable percentage difference" is added because the complexity measure is imperfect and, as will be shown in Chapter 6, other decompositions can help to identify shortcomings of the system model. In other words, higher-complexity decompositions can sometimes serve a useful purpose.

4.4. Maintenance

In this research, SYSTEM MAINTENANCE refers to any changes to a system after implementation [72]. It will be shown that when system maintenance is considered, the optimal (or lowest-complexity) decomposition for a system may change. It will be assumed that all future changes to a system may be defined to the degree of detail present in the system model [73]. This is a fairly strong assumption. In many cases, changes cannot be anticipated. In such cases, the best the analyst can do is select a low complexity decomposition for the original system ignoring possible changes.

Parnas (1972) and Myers (1977) suggest that the quality of a decomposition may be assessed by observing its behaviour in the face of maintenance changes. They assume the best decompositions will limit the effects of a change to a small number of subsystems.

In this section, a framework for classifying maintenance changes will be developed. A technique for assessing the impact of maintenance on a given decomposition will also be proposed. It will be shown that in some cases it is best to construct parts of the original system with maintenance in mind, while in others it is best to ignore maintenance during initial construction, and to create entirely new subsystems when the maintenance must be done.

Because it is based on a limited number of basic constructs, SELMA provides a unique framework for classifying possible changes to an existing system. All possible changes to a system model can be categorized as follows [74]:

1. Changes to sublaws
       add a sublaw
       delete a sublaw
2. Changes to the set of external events
       add an event
       delete an event

As shall be shown, a change may cause a subsystem in a given decomposition to be no longer deterministic or no longer minimal (i.e. the subsystem is now described by more state variables than are required to predict its behaviour). The above are SIMPLE CHANGES. A real MAINTENANCE OPERATION is likely to consist of several simple changes.

[72] In common usage, the term "maintenance" does not include enhancements to a system. The Oxford Dictionary defines "maintenance" as "being maintained", and "maintain" as "cause to continue". However, in keeping with the terminology in the field of information systems, system maintenance shall include any change to a system including possible enhancements.
[73] That is, changes to the functional relationships between state variables, defined in the original system model, must be known before the original system is implemented.
[74] Modification of a sublaw or an event can be accomplished by deleting the old version and adding the new.

4.4.1. Changes to Sublaws

Consider the sublaw for calculation of base pay in the payroll system.

Stability Conditions:

    base   hours      pay_r
    0      -          0
    0      0          -
    nz     regular    nz
    nz     overtime   nz

Corrective Actions:

    Conditions                Actions
    hours      pay_r    -->   base
    0          -              0
    -          0              0
    regular    nz             nz
    overtime   nz             nz

where

    0        = zero
    nz       = not zero
    regular  = less than that required for overtime pay
    overtime = sufficient for overtime pay
    -        = any value, or "don't care"

The analyst must specify each line of the above sublaw. During maintenance, any change to the sublaw can be represented as a sequence of additions and/or deletions of individual lines or rules. Therefore, at the lowest level, an analyst does not add and delete sublaws. Rather, he or she adds and deletes rules.

A change to a rule may or may not introduce new state variables to the system. Changes which do not introduce new state variables will be considered first. With respect to a given decomposition, the state variables affected by such a change will be

1. contained within a single subsystem, or
2. not contained within a single subsystem.
If the system after the change is still locally complete and consistent, the addition of a rule where all the state variables covered by the rule are included in a single subsystem will not affect the decomposition. For example, consider the effect, on the subsystem {hours,pay_r,base}, of adding the following rule to the base pay sublaw.

Corrective Actions:

    Conditions              Actions
    hours   pay_r     -->   base
    nz      unknown         unknown

The addition of such a rule merely results in altering the functional relationship between the input and output state variables of the subsystem. On the other hand, if a rule is added which covers state variables not found in any single subsystem, the subsystem may no longer be deterministic. For example, consider the addition of the following rule to the base pay sublaw.

Corrective Actions:

    Conditions                            Actions
    hours      emp_p        pay_r   -->   base
    overtime   management   -             unknown

Hours and pay rate are no longer sufficient to determine the value of base pay. Knowledge of the employee's position is also required. Therefore, the subsystem {hours,pay_r,base} would be no longer deterministic.

Rules may also be deleted from a sublaw. When a rule is deleted, one or more of the subsystems in a decomposition may no longer be required. For example, consider the commissions sublaw from the original payroll system as shown below.

Stability Conditions:

    emp_p        emp_t    com   sales
    regular      sales    nz    nz
                          0     0
                 office   0
    management   -        0     -

Corrective Actions:

    Conditions                     Actions
    emp_p     emp_t   sales  -->   com
    regular   sales   nz           nz
                      0            0

If the first corrective action rule were deleted, employee position and employee type would no longer be required to determine the final value of commissions. Therefore, the subsystem {emp_p,emp_t,sales,com} would no longer be minimal [75].
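The two properties at stake in these examples, determinism and minimality, can be tested mechanically once a system relation is written out as a table. The Python sketch below is a hypothetical illustration rather than the algorithm used by the specifications analysis tools: the function names, the list-of-dictionaries encoding, and the reduced relation over three variables are all assumptions. It models the second added rule above, in which overtime hours for a management employee yield the placeholder value "unknown" for base pay.

```python
from itertools import product


def is_deterministic(rows, inputs, output):
    """True if each combination of input values maps to a single output value."""
    seen = {}
    for row in rows:
        key = tuple(row[v] for v in inputs)
        if seen.setdefault(key, row[output]) != row[output]:
            return False
    return True


def is_minimal(rows, inputs, output):
    """True if the subsystem is deterministic and no input variable can be dropped."""
    if not is_deterministic(rows, inputs, output):
        return False
    return all(
        not is_deterministic(rows, [v for v in inputs if v != dropped], output)
        for dropped in inputs
    )


def base_pay(hours, pay_r, emp_p, with_new_rule):
    # Added rule (from the text): overtime + management -> "unknown".
    if with_new_rule and hours == "overtime" and emp_p == "management":
        return "unknown"
    return "nz" if hours != "0" and pay_r == "nz" else "0"


def relation(with_new_rule):
    """Enumerate every input state with its final base pay (illustrative values)."""
    return [
        {"hours": h, "pay_r": r, "emp_p": e,
         "base": base_pay(h, r, e, with_new_rule)}
        for h, r, e in product(["0", "regular", "overtime"],
                               ["0", "nz"],
                               ["regular", "management"])
    ]


before, after = relation(False), relation(True)
print(is_deterministic(before, ["hours", "pay_r"], "base"))
print(is_deterministic(after, ["hours", "pay_r"], "base"))
print(is_minimal(after, ["hours", "pay_r", "emp_p"], "base"))
print(is_minimal(before, ["hours", "pay_r", "emp_p"], "base"))
```

In this sketch {hours,pay_r,base} is deterministic before the rule is added and not afterwards, while the enlarged subsystem with emp_p as a third input is minimal only once the new rule makes employee position relevant.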
When a rule in a system model is deleted, the state variables used in the rule may or may not be contained in a single subsystem of the given decomposition. That is, there is no reason to suppose that the sublaws specified in the system model will match exactly the sublaws describing the subsystems produced by the specifications analysis tools. The input to the tools is simply the first ISS (Intermediate State Space) of the system and the corresponding final stable states (i.e. the first system relation). The algorithm has no direct "knowledge" of the rules specified by the analyst. However, the effects of deleting a rule spanning more than one subsystem in the given decomposition will be the same as described above. One or more subsystems may be no longer required because some state variable is no longer an output, or some subsystem may no longer be minimal.

There is one more way in which a sublaw may be altered. A rule may be added which contains a state variable not previously used to describe the system. This was the case when the additional payments state variable was added to form the modified payroll system. The new state variable will be either an input state variable (i.e. its value does not change between the first ISS and the final stable states) or an output state variable. If it is an input state variable, it will simply be added to some subsystem or subsystems in the decomposition. For example, decomposition #27 of Appendix K shows "add_pay" added as an input state variable to the subsystem responsible for calculating the value of "total_pay".
If it is an output state variable, it may be added as an output to an existing subsystem or a new subsystem may be formed to determine its final value. For example, decomposition #1 of Appendix K shows the creation of the subsystem {com,emp_t,over,add_pay} to determine the value of "add_pay". In either case the basic structure of the decomposition will not be altered.

[75] In fact, if the first corrective action rule is deleted, the subsystem will not only no longer be minimal, it will no longer be required. After such a maintenance operation, the value of commissions will never change. That is, commissions is no longer an output state variable, and the subsystem whose responsibility it was to calculate commissions will no longer be required.

4.4.2. Changes to External Events

The effects of changes to external events are similar to those of sublaws. After all, external events are functionally quite similar to sublaws. In fact, external events can be thought of as sublaws with activation conditions located in the system's environment (and, therefore, not included in the system specification) and actions affecting state variables within the system.

If an external event affecting an existing state variable is added to a system, some new states may be added to the first ISS (since the first ISS is essentially the cross product of the stable state space of the system and the set of external events). As was shown in Chapter 3 under "Importance of the External Event Space", these new states may represent system behaviours which were not evident under the smaller set of external events.
New subsystems may appear (because state variables which were previously constant may become outputs) and other subsystems may no longer be deterministic (because new behaviours may be exhibited).

If an external event is deleted from the system, some system behaviours may no longer be exhibited. This means that some subsystems of the given decomposition may disappear (because some state variables are no longer outputs), or some subsystems may no longer be minimal (because fewer input state variables are required to determine the final values of the output state variables).

Prediction of the effect of adding an external event which affects a state variable not previously used to describe the system is trivial. No changes to the decomposition are expected. The system can only respond to events through the activation of rules. Since no rules mentioning the affected state variable exist, adding such an event cannot affect system behaviour. Changes in behaviour will only occur if rules are added to respond to the new state variable. The effects of such changes were described earlier.

4.4.3. Implications for Design

When a system designer has knowledge of planned maintenance operations, he or she must decide whether to construct a system which can support these changes, or modify the initial system at a later date. This decision can be simplified by considering the possible effects of a maintenance operation. The possible effects of changes to a system model are summarized in Table II.

Table II: Possible effects of simple changes to a system model.

Change                                             Effect
Sublaws:
1. add rule covering one subsystem                 change form of sublaw associated with some subsystem
2. add rule covering more than one subsystem       some subsystems may no longer be deterministic
3. add rule with new state variables               expand existing subsystems or create new ones
4. delete rule covering one subsystem              some subsystems may no longer be minimal
5. delete rule covering more than one subsystem    some subsystems may no longer be minimal
Events:
6. add event affecting existing state variable     some subsystems may no longer be deterministic; new subsystem may be added
7. add event affecting new state variable          none, unless new rules are added
8. delete event                                    some subsystems may no longer be minimal

A maintenance operation which results only in the removal of some subsystems will require very little maintenance effort. However, if the maintenance requires the addition of new subsystems, the inclusion of new state variables in old subsystems, or a change in the relationship between inputs and outputs of a subsystem, a great deal of effort may be required. Notice that the serious effects all occur when a rule or event is added to a subsystem. The problem of maintenance becomes one of identifying these serious effects when the initial system is designed.

Merely identifying the simple changes involved in a maintenance operation, and looking up their effects in the above table, is not sufficient to predict specific effects. No change is certain to have an effect and the extent of effects which do occur cannot be easily determined. There is, however, a way to identify all specific effects. The analyst must construct models for three systems [76]:

1. A model of the initial system.
2. A model of the modified system.
3. A single model describing the behaviours of both the initial and modified systems.
This model will include a state variable to distinguish between the two versions of the system for use in subsystems where calculations are performed differently after the modification. Use of such a state variable is illustrated in Appendix N.

[Footnote 76] The actual effort required to construct the three models is not likely to be prohibitive. Except when major maintenance changes are expected, the models will probably be quite similar.

Decompositions produced for the combined system will contain only subsystems which behave deterministically with respect to the behaviours exhibited by both the initial and modified systems. The analyst must then compare decompositions for all three models and decide how the initial system should be implemented. Subsystems which are deterministic with respect to the behaviours of both systems could be implemented initially, or subsystems which are only deterministic for the first system could be implemented and then reconstructed when the maintenance operation is actually performed.

For example, consider the initial and modified payroll systems. The original payroll system described in Chapter 2 was modified in Chapter 3 to reflect the following changes in company policy:

1. Both office staff and sales employees are entitled to both overtime pay and sales commissions.
2. An office employee cannot receive more in commissions than in overtime.
3. A sales employee cannot receive more in overtime than in commissions.

Appendix N contains a model for a system which will exhibit the behaviours of both payroll systems. The following is the lowest-complexity decomposition [77] for the combined system. Complexities are noted beside each subsystem. The decomposition is shown using the diagrammatic format in Figure 14.

[Footnote 77] The complexity of a decomposition is equal to the sum of the complexities of the individual subsystems.

[Figure 15: Lowest-complexity decomposition for the combined payroll system.]

Lowest-Complexity Decomposition for the Combined Payroll System:

2: {add_pay,com,emp_t,over,add_pay}|19.02  {base,com,emp_t,over,total_pay}|12.98
1: {hours,pay_r,base}|5.51  {emp_p,emp_t,sales,sys,com}|11.14  {emp_p,emp_t,hours,sys,over}|13.05

where

emp_t     = employee type (sales or office)
emp_p     = employee position (management or regular)
hours     = hours worked
sales     = sales
over      = overtime pay
com       = commissions
add_pay   = additional payments
total_pay = total pay

and "sys" is a state variable representing the version of the system. This state variable is used to avoid problems which might arise if rules for the two versions of the system conflict with each other. For example, in the commissions subsystem {emp_p,emp_t,sales,sys,com} there are two different ways to calculate commissions. The value of the "sys" state variable is used to determine which set of rules is to be activated. Notice that this decomposition is structurally similar to the lowest-complexity decomposition for the initial payroll system.
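As a small illustration of the rule that the complexity of a decomposition is the sum of its subsystem complexities, the following Python sketch totals the figures listed for the combined payroll system. The data layout (tuples of state variable names with the output last) and the function name are assumptions made for this example, not part of the specifications analysis tools.

```python
# Subsystem complexities as listed for the combined payroll system.
# Each key is a tuple of state variable names; the output variable is last.
combined = {
    ("add_pay", "com", "emp_t", "over", "add_pay"): 19.02,
    ("base", "com", "emp_t", "over", "total_pay"): 12.98,
    ("hours", "pay_r", "base"): 5.51,
    ("emp_p", "emp_t", "sales", "sys", "com"): 11.14,
    ("emp_p", "emp_t", "hours", "sys", "over"): 13.05,
}


def decomposition_complexity(subsystems):
    """Sum the complexities of the individual subsystems."""
    return sum(subsystems.values())


total = decomposition_complexity(combined)
print(f"{total:.2f}")
```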
Lowest-Complexity Decomposition for the Initial Payroll System:

2: {base,com,over,total_pay}|3.90
1: {hours,pay_r,base}|5.51  {emp_p,emp_t,sales,com}|4.35  {emp_p,emp_t,hours,over}|4.97

[Figure 16: Lowest-complexity decomposition for the initial payroll system.]

This decomposition is shown using the diagrammatic format in Figure 16. A subsystem for additional payments has been added and the "sys" state variable is included in two subsystems to show that the behaviour of these subsystems depends on the version of the system.

The lowest-complexity decomposition for the modified payroll system is shown below. Figure 17 displays this decomposition using the diagrammatic format.

Lowest-Complexity Decomposition for the Modified Payroll System:

3: {base,add_pay,total_pay}|3.25
2: {com,over,emp_t,add_pay}|8.00
1: {hours,pay_r,base}|5.51  {emp_p,sales,com}|3.25  {emp_p,hours,over}|3.90

[Figure 17: Lowest-complexity decomposition for the modified payroll system.]

The decomposition of the combined system reveals three surprising aspects of the modification:

1. The lowest-complexity decomposition for the combined system places calculations of additional payments and total pay at the same level. That is, additional payments does not become an input to the total pay calculation as in the modified system.
2. The calculation of total pay in the modified system does not require information as to the version of the system.
3. The calculation of additional payments in the combined system is more complex than in the modified system.

The first observation shows that the structure of the initial system is in some sense "dominant". The modification does not require major changes to the composition of any subsystem. Indeed, system version information is not even required in order to calculate total pay. This fact results from a relationship between the initial system's method of calculating commissions and overtime pay, and the modified system's method of calculating additional payments and total pay. This relationship was unlikely to be foreseen intuitively, and is described in detail in Appendix O [78].

[Footnote 78] Briefly, the input states which would lead to an error in the initial system's calculation of "total_pay", if the modified system's method were used, simply cannot occur. This allows the combined system to use the same method of calculating "total_pay" for both versions of the system.

The calculation of additional payments is more complex in the combined system, because the model specifically insisted that the calculation NOT be performed if the "sys" state variable indicated the initial system. This additional decision increased the complexity. Also notice that the "sys" state variable is not explicitly required by the subsystem. If the incoming value of additional payments is "not calculated" then the initial system is being simulated and the final value should also be "not calculated". The additional payments state variable is both an input and an output with respect to this subsystem.

Design decisions must be made subsystem by subsystem. The analyst needs to decide whether it is more complicated to construct a subsystem which will not require changes during maintenance, than it is to construct initial and modified subsystems. That is, if C_i, C_m and C_c are the complexities of a subsystem in the initial system and the corresponding subsystems in the modified and combined systems, the following decision rule applies [79]:

    IF C_c > C_i + C_m
    THEN construct a new subsystem during maintenance
    ELSE construct the combined subsystem initially

After examining the decompositions of the three systems the following design decisions might be made:

1. The complexities of the commissions and overtime pay subsystems in the combined system are greater than the sums of their complexities in the initial and modified systems (11.14 > 3.25 + 4.35 and 13.05 > 3.90 + 4.97). Therefore, it will be simpler to reconstruct these subsystems when the maintenance is performed than to initially construct subsystems which will not require changes.
2. Assuming the analyst is interested in knowing the value of additional payments:
   a. Additional Payments: The subsystem is simpler in the modified system than in the combined system (8.00 < 19.02). It is also not required in the initial system. Therefore, this subsystem should be constructed during maintenance.
   b. Total Pay: The sum of the complexity of the initial subsystem and the complexity of the subsystem in the second decomposition of the modified system is less than the complexity of the subsystem in the combined system (12.98 > 3.90 + 3.25). Therefore, this subsystem should be reconstructed during maintenance to take advantage of the new additional payments state variable.
3. Assuming the analyst is NOT interested in knowing the value of additional payments:
   a. Additional Payments: This subsystem should never be constructed.
   b. Total Pay: The subsystem in the combined system is less complex than the sum of the initial subsystem's complexity and the complexity of the total pay and additional payments subsystems in the modified system (12.98 < 3.25 + 8.00 + 3.90). Therefore, the combined subsystem should be constructed initially.

[Footnote 79] Other factors are likely to be important in determining the optimal maintenance strategy. For example, since the available implementation language primitives can affect subsystem implementation difficulty, they may also influence the selection of a maintenance strategy. Identification of other such factors is a possible subject for future research. The fact that other important factors undoubtedly exist means that the simple decision rule should not be automated. Intervention by the system designer must be allowed.

It should be noted that the above discussion applies to single maintenance operations only. In reality, a system is likely to undergo a series of such operations before it is finally discarded. This scenario might be diagrammed as follows:

         m_1        m_2        m_3               m_(n-1)
    σ_1 ----> σ_2 ----> σ_3 ----> ... ----> σ_n

where σ_m is the mth version of the system resulting from the (m-1)th maintenance operation. It is possible that σ_3 might be more easily constructed by modifying σ_1 than by modifying σ_2. In this case, the design chosen for σ_1 would be affected by m_1 and m_2, but the design for σ_2 would not be influenced by m_2. In general, the problem of finding an optimal set of system designs and changes could be quite complex.

4.5. The System Goal

As shown in Chapter 3, the same system model can have several alternative decompositions. However, not all of them may be equally acceptable to the analyst in that state variable emergence and hiding varies between alternatives. In this section, this notion is formalized through the concept of a SYSTEM GOAL.

The existence of a goal is one of the distinguishing features of an artificial system (Simon, 1981, p. 8) [80]. A system designer creates a system to fulfil some goal. This is the raison d'etre for artificial systems. Definition of the system goal is a very important part of systems analysis and design. The system's goal, as perceived by the analyst or envisioned by the designer, influences level of abstraction and selection of boundaries for the system model [81].

[Footnote 80] According to Simon, artificial systems are deliberately created by man. They are qualitatively different from systems resulting from "natural" forces (e.g. biological systems formed through natural selection).

[Footnote 81] A system model consists of state variables and values, external events, and sublaws. The state variables selected for inclusion in a model will be determined by the system goal. For example, a model created to analyze or describe the financial efficiency of a point-of-sales terminal system is unlikely to include state variables representing the work schedule of the terminal operator. These state variables are probably irrelevant with respect to the stated goal. Similarly, state variables describing the operation of the individual electronic and mechanical components of the terminal will not be included. Thus the system goal influences both the system boundaries and level of abstraction found in the system model.

The notion of a system goal is readily supported by SELMA. When an analyst creates a system model, he or she is likely to define two different kinds of state variables. Some variables will be indispensable, others will be dispensable. Indispensable state variables represent the essential properties of the system (e.g. in the modified payroll system "total pay" is likely to be indispensable). Such state variables will be examined by the user or by other systems which refer to the system being modelled. Dispensable state variables are defined by the analyst to simplify creation of the system model (e.g. in the modified payroll system, "additional payments" may have been added merely to facilitate definition of the sublaw describing the computation of "total pay"). The indispensable state variables define the purpose or goal [82] of the system as perceived by the analyst.

[Footnote 82] An analyst may find it more convenient to visualize a system as having a set of subgoals. However, definition of subgoals would imply that some form of decomposition has already been performed by the analyst. To avoid prejudicing the operation of the specification analysis tools, the analyst is asked to provide only the highest-level goal of the system. If the system has more than one high-level goal, the sets of state variables describing these goals must be merged.

Definition: Goal State Variable
Any state variable which the analyst requires to be included in a decomposition is called a GOAL STATE VARIABLE.

Definition: System Goal
The set of all the goal state variables of a system is called the SYSTEM GOAL.
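The two definitions can be made concrete with a short Python sketch that treats the system goal as a set of state variable names and checks whether a decomposition includes all of them. The representation (a list of levels, each a list of tuples with the output variable last) and the function name covers_goal are assumptions for illustration only; the second decomposition is a hypothetical variant in which no subsystem mentions "add_pay".

```python
def covers_goal(decomposition, system_goal):
    """Return True if every goal state variable appears in some subsystem."""
    present = {sv for level in decomposition
               for subsystem in level
               for sv in subsystem}
    return set(system_goal) <= present


# Lowest-complexity decomposition of the modified payroll system, level 1 first.
modified = [
    [("hours", "pay_r", "base"), ("emp_p", "sales", "com"), ("emp_p", "hours", "over")],
    [("com", "over", "emp_t", "add_pay")],
    [("base", "add_pay", "total_pay")],
]

# Hypothetical variant in which no subsystem mentions "add_pay".
without_add_pay = [
    modified[0],
    [("base", "com", "emp_t", "over", "total_pay")],
]

print(covers_goal(modified, {"total_pay", "add_pay"}))
print(covers_goal(without_add_pay, {"total_pay", "add_pay"}))
print(covers_goal(without_add_pay, {"total_pay"}))
```

Under these assumptions the variant is acceptable only when "add_pay" is left out of the system goal.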
Notice that, as defined here, a goal is not an inherent property of a system. Rather, a goal is a function of both a system and the analyst's expectations for that system [83].

[Footnote 83] The specifications analysis tools allow the analyst to include a predicate of the form:

    system_goal(SVList)

where SVList is a list of state variables which must be included in all suggested decompositions. If no goal state variables are defined, it is assumed that all state variables are indispensable.

The definition of goal state variables can influence system decomposition and hence system design. Sometimes, if a state variable is not part of the system goal, subsystems which determine its value may be dropped from a decomposition. If a subsystem is dropped, the complexity of the decomposition will be reduced.

For example, consider the suggested decompositions of the modified payroll system listed in Appendix K. There are forty-eight decompositions, all satisfying the heuristics defined in Chapter 3. The intuitive decomposition is #27. All other decompositions are seemingly trivial transformations of decomposition #27, the transformations being simple substitutions. For example, consider decompositions #27 and #1 (both are shown in diagrammatic form in Figure 18).

Decomposition #27:
3: {add_pay,base,total_pay}
2: {com,emp_t,over,add_pay}
1: {hours,pay_r,base}  {emp_p,sales,com}  {emp_p,hours,over}

Decomposition #1:
2: {com,emp_t,over,add_pay}  {base,com,emp_t,over,total_pay}
1: {hours,pay_r,base}  {emp_p,sales,com}  {emp_p,hours,over}

where

emp_t     = employee type (sales or office)
emp_p     = employee position (management or regular)
hours     = hours worked
sales     = sales
over      = overtime pay
com       = commissions
add_pay   = additional payments
total_pay = total pay

[Figure 18: Two decompositions of the modified payroll system.]

There are two basic differences between these decompositions:

1. the subsystems responsible for calculating the final value of total pay utilize different input and constant state variables, and
2. the output state variables at the top level are different.

In the total pay subsystem of decomposition #1, the additional payments state variable "add_pay" has been replaced by the state variables representing the values of commissions, employee type and overtime. Since it is already known from decomposition #27 that {com,emp_t,over,add_pay} is a deterministic subsystem, this would seem to be a trivial substitution. Moreover, since increasing the number of state variables in a subsystem can never decrease that subsystem's complexity, it would seem to be a useless substitution. However, such a substitution does reduce the system's dependence on the emergent state variable "add_pay". In decomposition #1, the total pay subsystem no longer requires knowledge of the additional payments state variable. If the subsystem responsible for the calculation of the final value of "add_pay" is dropped from the decomposition, total complexity will be reduced from 33.64 to 25.64. This is still greater than the 23.91 complexity of decomposition #27. Thus, in this case, the substitution may not be useful.

In general, the total complexity of a decomposition may be reduced if the subsystems responsible for the calculation of state variables removed by substitution are dropped. Such a course of action may only be justified if the analyst is not interested in knowing the final values of the removed state variables [84].

[Footnote 84] That the analyst may not be interested in knowing the final values of all state variables was suggested earlier. In the previous section, different system designs were recommended depending on whether he or she was interested in knowing the final value of additional payments. Also, in Chapter 3, system views were found which "hid" different state variables depending on which state variables were of interest to the analyst.

No reduction of complexity is possible in the payroll systems. However, the four-lights system does present such an opportunity. Two of the suggested decompositions for this system were as follows. Subsystem complexities are as indicated.

2: {b,c,c}|2.75  {b,d,d}|2.75
1: {a,b}|2.00

and

1: {a,b}|2.00  {a,c,c}|2.75  {a,d,d}|2.75

The complexity of both of these decompositions is 7.51 (there is some round-off error in the complexities of the individual subsystems). Therefore, there is no clear advantage in making substitutions for "b" in the subsystems which determine the values of "c" and "d". However, if "b" is not part of the system goal, the decomposition

1: {a,c,c}|2.75  {a,d,d}|2.75

with complexity 5.51 becomes a viable alternative decomposition. If complexity had not been reduced, as was the case for the overtime pay substitution in the modified payroll system, there would have been no need to suggest this alternative to the analyst. This suggests the following heuristic:

Heuristic 6: Avoid useless substitutions
Do not suggest decompositions formed by state variable substitutions unless
1. the substitution allows the removal of a subsystem where a non-goal state variable is an output, and
2. after removal of the subsystem, the complexity of the decomposition has been reduced.

This means that in the case of the four-lights system, if "b", "c", and "d" are all specified as goal state variables, only one decomposition will be suggested to the analyst.

Decomposition #1:
2: {b,c}  {b,d}
1: {a,b}

If "b" is not specified as a goal state variable an additional decomposition will be suggested.

Decomposition #2:
1: {a,c}  {a,d}

4.6. Conclusions

This chapter has presented an intuitively justifiable measure of complexity. Rationalization of the measure was combined with the examination of four possible measures of complexity:

1. Ashby's Variety
2. Modified Variety
3. Shannon's Entropy
4. Hellerman's Computational Work

Hellerman's measure of computational work was finally selected for use in this research. This measure is particularly well suited for use with SELMA. Through the use of this measure, the decompositions suggested by the specifications analysis tools may be presented to the analyst in a meaningful order. It should be noted that the complexity measure can be used only to guide the search so that low-complexity decompositions are found relatively early in the search. That is, the full set of possible decompositions can be found by specifying a very large maximum allowable percentage difference between the least and most complex decompositions.

While the measure was first suggested as a means to guide the search for decompositions so that they might be presented to the analyst in some meaningful order, it has proved itself useful in other ways as well.
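The ordering and cut-off just described can be sketched in Python as follows. This is only an illustration: the function name is hypothetical, and the way the percentage difference is computed (relative to the least complex decomposition) is an assumption, since the text does not give the formula used by the tools. The two totals are the ones quoted in this chapter for decompositions #27 and #1 of the modified payroll system.

```python
def within_tolerance(complexities, max_pct_diff):
    """Keep decompositions whose complexity exceeds the minimum by at most
    max_pct_diff percent, ordered from least to most complex.

    Assumption: the percentage is measured relative to the least complex
    decomposition found.
    """
    least = min(complexities.values())
    limit = least * (1 + max_pct_diff / 100)
    kept = [(name, c) for name, c in complexities.items() if c <= limit]
    return sorted(kept, key=lambda item: item[1])


# Total complexities quoted for two decompositions of the modified payroll system.
candidates = {"#1": 33.64, "#27": 23.91}

print(within_tolerance(candidates, 10))
print(within_tolerance(candidates, 1000))
```

With a small tolerance only the lowest-complexity decomposition is kept; a very large tolerance returns the full set in order of increasing complexity.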
The quantitative nature of the measure has supported the detailed analysis of maintenance operations. Use of the complexity measure in conjunction with the decomposition algorithm allows a system designer to select a decomposition which will reduce the total effort required for initial implementation and maintenance [85].

[Footnote 85] Future maintenance must be predictable, but need only be known to the degree of detail represented in the system model.

It was noted that the same system model can have several alternative decompositions, but not all of them may be equally acceptable to the analyst. Alternative decompositions will hide different state variables. Definition of the system goal was recognized as an important part of systems analysis and design. The goal, as perceived by an analyst, influences both the system boundaries and the level of abstraction of the system model. SELMA allows the analyst to explicitly define the system goal, so as to distinguish between indispensable state variables, which are used to define the goal, and those state variables created merely to facilitate creation of the model by simplifying the specification of sublaws. The complexity measure, coupled with system goal information, can be used to reject many alternative decompositions which would otherwise have been presented to the analyst by the specifications analysis tools.

Chapter 5: Conditional Decomposition

5.1. Introduction

Three basic forms of decomposition were identified in Chapter 1: parallel, sequential, and conditional. Chapter 3 showed how parallel and sequential decomposition can be automated provided that the system to be decomposed has been specified using SELMA.
Automation of conditional decomposition will be described in this chapter.

Recall that parallel decomposition involves identifying subsystems which behave deterministically with respect to some intermediate state space. Subsystems which are deterministic with respect to a given system relation (i.e. initial system states with their corresponding final stable states) may perform their functions at the same time (or in parallel [86]), so long as the system is in one of the states of the relevant intermediate state space. Sequential decomposition, on the other hand, was not so much a matter of identifying deterministic subsystems, but of arranging the subsystems found by parallel decomposition into a meaningful sequence of levels. This sequence had to satisfy a number of heuristics and showed how each system relation associated with a deterministic subsystem might be created.

[Footnote 86] Actually, such deterministic subsystems may perform their functions in any order. Simultaneity is not required.

For example, consider the following decomposition of the modified payroll system.

3: {add_pay,base,total_pay}
2: {com,emp_t,over,add_pay}
1: {hours,pay_r,base}  {emp_p,sales,com}  {emp_p,hours,over}

where

hours     = hours worked           pay_r   = pay rate
emp_p     = employee position      emp_t   = employee type
sales     = amount of sales        base    = base pay
com       = commissions            over    = overtime pay
total_pay = total pay              add_pay = additional payments

This means that the subsystems {hours,pay_r,base}, {emp_p,sales,com}, and {emp_p,hours,over} at level 1 behave deterministically with respect to the first system relation. The first system relation results from the action of all external events on every stable state of the system.
Therefore, these subsystems are deterministic with respect to all anticipated responses of the system caused by interaction with its environment. The subsystem {com,emp_t,over,add_pay} at level 2 also exhibits deterministic behaviour, but only with respect to the system relation created when the subsystems at level 1 have performed their functions. In the model these functions cause changes in the values of the output state variables of each subsystem (in this case, state variables "base", "com", and "over") in each initial state of the system relation. The end result of the actions of the subsystems at level 1 is the second system relation. The second system relation differs from the first system relation only in the values of the output state variables of the subsystems at level 1. Similar observations can be made for level 3. Parallel decomposition identifies deterministic subsystems at each level. Sequential decomposition identifies the levels themselves.

Before it may be automated, conditional decomposition must be clearly defined. The definition adopted for this research is essentially that of the "alternation statement rule" from Mili et al. (1986). This rule is described in detail in Appendix C. Briefly, it depends on finding two relations R_1 and R_2 such that

a) R = R_1 ∪ R_2, and
b) domain(R_1) ∩ domain(R_2) = {}

where R is the system relation [87] describing the behaviour of the system. The alternation statement rule is used to decompose a program specification into two conditionally executed specifications.
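Conditions a) and b) can be checked mechanically. The following Python sketch represents a relation as a set of (initial state, final state) pairs and tests both conditions; the state labels and function names are hypothetical and serve only to illustrate the two conditions, not the procedure of Mili et al. or of the specifications analysis tools.

```python
def domain(relation):
    """Set of initial states appearing in a relation of (initial, final) pairs."""
    return {initial for initial, _ in relation}


def satisfies_alternation(r, r1, r2):
    """Check a) R = R1 union R2 and b) domain(R1) and domain(R2) are disjoint."""
    return r == (r1 | r2) and domain(r1).isdisjoint(domain(r2))


# Hypothetical relations over abstract state labels.
r1 = {("s0", "t0"), ("s1", "t1")}
r2 = {("s2", "t2")}
overlapping = {("s1", "t9")}  # shares the initial state "s1" with r1

print(satisfies_alternation(r1 | r2, r1, r2))
print(satisfies_alternation(r1 | overlapping, r1, overlapping))
```

The second call fails condition b), because the initial state "s1" would belong to both parts.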
The programmer is required to find some predicate $t(s)$, where $t(s)$ is true when $s \in \mathrm{domain}(R_1)$ and false when $s \in \mathrm{domain}(R_2)$, which can be used to split the original relation into two non-intersecting parts. Conditional decomposition, therefore, involves partitioning a system relation into two (or more) parts. The major problem lies in deciding where to place the partitions[88].

Footnote 87: Mili et al.'s system relations are similar to the system relations defined in Chapter 3. The domains of their relations consist of initial states of the system. However, the domains of the relations used in this research need not contain only initial states. They may contain system states where the values of some state variables have been changed by the actions of some subsystems. That is, the domains of the system relations used in this research may contain intermediate states.

5.2. Conditional Decomposition Basics

Before entering into an involved discussion of the partitioning problem, it may be best to consider a simple example. The system relation for a system described by four state variables ("a", "b", "sw", and "c") is shown below. The state variable sw is intended to represent the position of an SPDT (single pole/double throw) switch which makes a connection between "a" and "c" or between "b" and "c" as illustrated in Figure 19.

Intermediate            Corresponding Final
State Space             Stable States
a  b  sw  c      ->     a  b  sw  c
1  -  0   -             1  -  0   1
0  -  0   -             0  -  0   0
-  1  1   -             -  1  1   1
-  0  1   -             -  0  1   0

where "-" means "don't care" or "any value".

Figure 19: The SPDT switch used to illustrate conditional decomposition.

There is only one parallel/sequential decomposition of this system.
1: {a,b,sw,c}

That is, the system may not be decomposed into smaller subsystems using only parallel and sequential techniques. However, the system may be conditionally decomposed. If the above relation is partitioned using the conditions sw = 0 and sw = 1, the following two smaller relations result.

Footnote 88: It is always possible to produce a trivial partitioning of the system relation by creating a dummy state variable with value "1" for some states and "0" for the rest. However, an analyst is unlikely to create such an arbitrary state variable, and the specifications analysis tools use only the state variables included in the system model in the search for possible conditional decompositions.

sw = 0:
Intermediate            Corresponding Final
State Space             Stable States
a  b  sw  c      ->     a  b  sw  c
1  -  0   -             1  -  0   1
0  -  0   -             0  -  0   0

sw = 1:
Intermediate            Corresponding Final
State Space             Stable States
a  b  sw  c      ->     a  b  sw  c
-  1  1   -             -  1  1   1
-  0  1   -             -  0  1   0

When the value of "sw" is 0, one deterministic subsystem satisfying all the heuristics of the previous chapter is {a,c}. State variable "b" can be any value for each value of "c", and state variable "sw" is a constant providing no information; therefore, neither is required to determine the value of "c". Similarly, when the value of "sw" is 1, {b,c} is a deterministic subsystem.

Conditional decompositions will be expressed using the following syntax:

[CondSVs = CondVals_1]Subsystems_1 ... [CondSVs = CondVals_n]Subsystems_n

where
CondSVs: the CONDITIONAL STATE VARIABLES, or the set of state variables which are tested to partition the system relation.
CondVals_i: a set of sets of values of the conditional state variables.
Subsystems_i:
a set of subsystems[89] which are deterministic with respect to the part of the partition identified by the condition CondSVs = CondVals_i. Such subsystems will be referred to as CONDITIONAL SUBSYSTEMS.

Footnote 89: This is a set of subsystems because partitioning of the system relation may allow further parallel decomposition of the system being decomposed. For example, the subsystem {i,j,k,l} might conditionally decompose to
1: [{i} = {{0}}]{{i,k},{j,l}}
   [{i} = {{1}}]{{j,k,l}}

For example, the conditional decomposition for the SPDT switch system may be expressed as follows:

[{sw} = {{0}}]{{a,c}}|2.00
[{sw} = {{1}}]{{b,c}}|2.00

The complexity of {a,b,sw,c} is 8.00. As indicated following the |, the complexities of {a,c} and {b,c} are both 2.00. Therefore, the complexity of the original system has been reduced (by a factor of 2) through conditional decomposition.

The above syntax requires several layers of bracketing. In order to improve the readability of conditional decompositions, brackets will be dropped whenever possible so long as the meaning is preserved. The conditional decomposition of the SPDT switch system can be simplified to the following:

[sw = 0]{a,c}
[sw = 1]{b,c}

Conditional decompositions may also be presented diagrammatically as shown in Figure 20.

5.3. Heuristics

Conditional decomposition of the SPDT switch system was trivial. A glance at the system relation sufficed to identify "sw" as a suitable conditional state variable, and discovery of the conditional subsystems quickly followed. In most cases things will not be so simple. In fact, had the rows of the system relation been randomly rearranged, conditional decomposition of even this system would have required some effort.
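The SPDT example can be reproduced with a few lines of code. The sketch below is illustrative only and is not the algorithm used by the specifications analysis tools: it assumes the system relation is fully enumerated (no "don't care" entries), treats "c" as the single output, and searches for the smallest sets of input state variables that determine "c" within each part of the partition. All function and variable names are hypothetical.

```python
from itertools import combinations, product

INPUTS = ("a", "b", "sw")

# Fully enumerated SPDT relation: c follows a when sw == 0 and b when sw == 1.
ROWS = [
    {"a": a, "b": b, "sw": sw, "c": a if sw == 0 else b}
    for a, b, sw in product((0, 1), repeat=3)
]


def determines(rows, inputs, output):
    """True if equal values of `inputs` always give the same `output`."""
    seen = {}
    for row in rows:
        key = tuple(row[v] for v in inputs)
        if seen.setdefault(key, row[output]) != row[output]:
            return False
    return True


def minimal_subsystems(rows, candidates, output):
    """Return the smallest variable sets (plus the output) that are deterministic."""
    for size in range(len(candidates) + 1):
        found = [
            set(subset) | {output}
            for subset in combinations(candidates, size)
            if determines(rows, subset, output)
        ]
        if found:
            return found
    return []


def conditional_decomposition(rows, cond_var, output):
    """Partition on each value of cond_var and decompose each part."""
    others = [v for v in INPUTS if v != cond_var]
    result = {}
    for value in sorted({row[cond_var] for row in rows}):
        part = [row for row in rows if row[cond_var] == value]
        result[value] = minimal_subsystems(part, others, output)
    return result


print(minimal_subsystems(ROWS, INPUTS, "c"))
print(conditional_decomposition(ROWS, "sw", "c"))
```

Under these assumptions the undivided relation needs all three inputs, whereas partitioning on "sw" yields {a,c} for sw = 0 and {b,c} for sw = 1, matching the decomposition given in the text.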
While simple, the example did illustrate the general procedure to be followed when conditionally decomposing a system.

Footnote 89 (continued): indicating that the final values of the output state variables need not necessarily be determined together.

Conditional Decomposition Procedure:
1. Perform parallel/sequential decomposition.
2. Select a subsystem[90] for further conditional decomposition.
3. Select a state variable.
4. Partition the system relation on the basis of the values of this state variable.
5. Find the subsystems which behave deterministically with respect to each part of the partition.

Figure 20: An alternative representation for conditional decomposition.

As was the case for parallel/sequential decomposition, given only the deterministic subsystem requirement, this procedure could lead to an extremely large number of conditional decompositions. After all, in the above example, when the value of "sw" is 0, {a}, {sw}, {a,sw,c} as well as {a,c} are all deterministic subsystems. Clearly, some heuristics to limit the search for conditional decompositions are required. The heuristics of Chapter 3 which dealt with individual subsystems are applicable here. The others were concerned with arranging subsystems in a level structure and are not useful for conditional decomposition.

Conditional Heuristic #1: Outputs Required
Each conditional subsystem must be described by at least one output state variable.

Footnote 90: There is no fundamental distinction between a system and a subsystem.
A subsystem of a system $\sigma_1$ is a system $\sigma_2$ where the remainder of $\sigma_1$ is in the environment of $\sigma_2$. Conditional decomposition may be applied to both systems and subsystems in exactly the same fashion. If parallel/sequential decomposition is applied to a system which may not be decomposed either in parallel or sequentially, only one deterministic subsystem will be found (as happened in the case of the SPDT switch system). This subsystem will be equal to the system.

The rationale for this heuristic is the same as for parallel/sequential decomposition. Basically, subsystems without outputs are not very interesting.

Conditional Heuristic #2: Must be Small
Each conditional subsystem must not be described by any state variable which is not required to ensure deterministic behaviour.

Again, the rationale for this heuristic is the same as for parallel/sequential decomposition. An analyst is interested in knowing the minimal amount of information (in the form of state variable values) necessary to perform some task.

Conditional Heuristic #3: Must be Different
The sets of state variables describing conditional subsystems must differ by at least one state variable.

Consider the following conditional decomposition. Some redundant brackets have been removed for clarity.

[h = 0]{j,k}
[h = 1]{j,k}

Such a structure does not provide any information beyond the fact that {j,k} is deterministic with respect to the entire first system relation[91]. This could be more succinctly represented using the simpler parallel/sequential syntax.

1: {j,k}

While it is important to know what conditional decomposition is, it is equally important to know what it is not.
This heuristic implies that some conditional decompositions, which would be considered by Mili et al. using their alternation statement rule, will not be considered here. Conditional decomposition will not find alternative functional forms for the same subsystem.

Footnote 91: The Mili et al. requirement that $R = R_1 \cup R_2$ implies that "h" has no values other than "0" and "1".

For example, suppose that total pay (total_pay) is a function of hours worked (hours) and the pay rate (pay_r). Also suppose that the employee receives 1.5 times his or her regular pay for each hour in excess of 40. Such a situation is easily coded using an IF/THEN/ELSE structure.

    IF hours < 40
      THEN total_pay := hours*pay_r
      ELSE total_pay := pay_r*(1.5*hours-20);

This use of an IF/THEN/ELSE structure is not the sort of conditional decomposition being described here. The subsystem describing both the THEN and ELSE parts of the structure is {hours,pay_r,total_pay}. Therefore, a partition of the system relation using hours worked as the conditional state variable would be rejected because of Conditional Heuristic #3. This is an important difference between the sort of decomposition embodied by Mili's alternation statement rule and conditional decomposition. Mili et al. do not explicitly consider state variables in their decompositions. They look only at the system relation resulting from a partition. They would see a partition using the rule hours < 40 as useful because it allows different program implementations for the THEN and ELSE portions of the structure. Whether a partition allows different program implementations is determined by the primitives available in a given language.
If a language primitive to calculate total pay directly from any values of hours worked and pay rate existed, partitioning on the basis of hours worked would not lead to different implementations, and Mili et al. would not consider such a partition useful. As argued in the previous chapter, this research is not concerned with available language primitives, and as such is primarily useful at a fairly high level of analysis.

Conditional Heuristic #4: Same Conditional State Variables
The conditional state variables associated with each conditional subsystem must be the same.

This heuristic helps to ensure that there is no overlap between the parts of the system relation associated with each conditional subsystem. Consider the following:

[h = 0]{j,k}
[i = 0]{i,k}

There is no reason why such a decomposition should be rejected. As long as "i" cannot be 0 when "h" is 0 and vice versa, this decomposition will not violate the Mili et al. condition of non-intersecting domains. However, in this case the decomposition could be replaced by

[h = 0]{j,k}
[h ≠ 0]{i,k}.

The non-intersecting nature of such a decomposition is far more apparent, and is preferred.

Conditional Heuristic #6: Complexity may not Increase
The total complexity of the conditional subsystems may not exceed the complexity of the subsystem being decomposed.

There is no point in suggesting a conditional decomposition which is more difficult to understand or build than the original system. An example of a conditional decomposition which increases the complexity of the system is given near the end of this chapter.

The next heuristic cannot be intuitively justified.
It is introduced solely to keep the problem of conditional decomposition computationally tractable.

Conditional Heuristic #7: Single Conditional State Variables
The set of conditional state variables used to partition a system relation may have no more than one member.

This means that conditional decompositions such as

[{x,y} = {{0,0},{1,1}}]{j,k}
[{x,y} = {{0,1},{1,0}}]{i,k}

will not be considered. The major computational problem with conditional decomposition lies in testing all possible partitions of the system relation with respect to the values of the conditional state variables. The number of partitions of a set[92] increases dramatically as the number of elements in the set increases (see Figure 21).

Figure 21: Number of partitions of a set of N things.

Footnote 92: The number of partitions of a set containing N elements where all N elements occur in one and only one part (also called a "class" or "block") is called the "Nth Bell number". Bell numbers are given by the following recurrence relation (Krishnamurthy, 1986, pp. 16 and 22).

$$B(0) = 1, \qquad B(N+1) = \sum_{K=0}^{N} C_{N,K}\, B(K)$$

where $C_{N,K}$ is the number of combinations of N things taken K at a time.

Experience gained during this research has shown that most systems may be described using state variables with between two and five values[93]. A state variable with five values leads to only 52 partitions of the system relation. This number of partitions can be easily examined for subsystems meeting the other heuristics. On the other hand, if more than one conditional state variable is allowed, the number of partitions quickly becomes unmanageable.
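The growth described here can be seen by evaluating the Bell number recurrence directly. The following is a small sketch using only the Python standard library; the function name is an illustrative choice.

```python
from math import comb


def bell_numbers(limit):
    """Return [B(0), ..., B(limit)] using B(N+1) = sum_K C(N,K) * B(K)."""
    bell = [1]  # B(0) = 1
    for n in range(limit):
        bell.append(sum(comb(n, k) * bell[k] for k in range(n + 1)))
    return bell


for n, b in enumerate(bell_numbers(8)):
    print(n, b)
```

A single state variable with five values gives B(5) = 52 partitions, whereas eight joint value combinations (three state variables with two values each) already give B(8) = 4140.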
As shown below, conditional decomposition of a trivial system described by only three state variables with two values each would require testing of 4184 partitions.

State Variable    Values
a                 0,1
b                 0,1
c                 0,1

Conditional        Partitioning Values                       Partitions
State Variables                                              to Test[94]
a                  0,1                                       1
b                  0,1                                       1
c                  0,1                                       1
ab                 {0,0},{0,1},{1,0},{1,1}                   14
ac                 {0,0},{0,1},{1,0},{1,1}                   14
bc                 {0,0},{0,1},{1,0},{1,1}                   14
abc                {0,0,0},{0,0,1},{0,1,0},{0,1,1},
                   {1,0,0},{1,0,1},{1,1,0},{1,1,1}           4139
                                  total partitions to test:  4184

note: If certain combinations of state variable values can never occur together, the total number of partitions to test can be reduced. For example, if the system relation contains no state where a=0 and b=0, the total number of partitions to test drops to 1+1+1+4+14+14+202 = 237. Thus, the number of partitions shown in the above table is an upper limit only.

Footnote 93: Recall that continuous real-world variables are modelled using ranges. The system responds to all values in a given range in similar ways.

Footnote 94: The number of partitions which must be tested is one less than the number of possible partitions. The partition which consists of only one part, where that part is the set to be partitioned, need not be tested. Since the system relation would not be split in this case, examining such a partition would be equivalent to testing whether the original system is deterministic. It is assumed that all systems to which conditional decomposition is to be applied are already known to be deterministic.

The effect of restricting partitions to those involving only one conditional state variable is not as serious as might be first imagined.
The analyst can always decide to further conditionally decompose a conditional subsystem. The effect of this is essentially the same as partitioning with more than one conditional state variable. For example, suppose the following was suggested as a conditional decomposition of a system described by state variables h, i, j, and k:

[h = 0]{i,j,k}
[h = 1]{i,k}

Now suppose the analyst suspects that the first conditional subsystem can be further decomposed. The conditional decomposition procedure can be applied again to the subsystem {i,j,k}. This might result in the decomposition

[i = 0]{i,j,k}
[i = 1]{j,k}.

The two levels of conditional decomposition can be combined as follows:

[{h,i} = {0,0}]{i,j,k}
[{h,i} = {0,1}]{j,k}
[h = 1]{i,k}

In this case, full search of all possible partitions has been replaced by selective search guided by the judgement of the analyst.

5.4. Using Conditional Decomposition to Test a Model

None of the three major examples developed so far (namely the four light, the payroll and the modified payroll systems) offers any subsystems which are obvious candidates for conditional decomposition. However, the conditional decomposition procedure can help to uncover some modelling errors. This will be demonstrated using the modified payroll system. The error discovered in the modified payroll system is symptomatic of one potential problem with using qualitative modelling techniques (i.e. state variable ranges) for state variables which represent continuous quantities in the real world.

Recall the description of the modified payroll system presented in Chapter 2:

1.
both office staff and sales employees are entitled to both overtime pay and sales commissions,
2. an office employee cannot receive more in commissions than in overtime, and
3. a sales employee cannot receive more in overtime than in commissions.

The specifications analysis tools identified the following subsystem capable of determining the value of the additional payments state variable. This state variable was introduced to represent the total overtime and commission pay to which an employee is entitled after these rules have been applied.

2: {com,emp_t,over,add_pay}|8.00

where
com = commissions
over = overtime pay
emp_t = employee type (o = office worker, s = sales employee)
add_pay = additional payments

When conditional decomposition is applied to this subsystem an unreasonable suggestion is made by the specifications analysis tools:

[emp_t = o]{over,add_pay}|2.00
[emp_t = s]{com,add_pay}|2.00

In order to calculate additional payments, the amount of commissions, overtime pay and the type of the employee must be available. There is no way that additional payments can be calculated given only the employee type and the amount of overtime pay. Why, then, is the above conditional decomposition suggested? The functional form of this subsystem may be represented by the following table:

com   emp_t   over   ->   add_pay
nz    o       nz          nz
0     o       nz          nz
nz    s       nz          nz
nz    s       0           nz
nz    o       0           0
0     o       0           0
0     s       nz          0
0     s       0           0

where
0 = a value of zero
nz = some non-zero value

Notice that the additional payments state variable is modelled with only two values: 0 and non-zero.
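The reason for the suggestion can be checked directly against the table above. The following sketch is illustrative only: it encodes the eight rows as tuples and tests, for each employee type, which single remaining state variable is enough to fix the qualitative value of add_pay. The helper names are hypothetical and the check is not the procedure used by the specifications analysis tools.

```python
COLUMNS = ("com", "emp_t", "over", "add_pay")

# The eight rows of the functional form, using "0" and "nz" as in the text.
TABLE = [
    dict(zip(COLUMNS, row))
    for row in [
        ("nz", "o", "nz", "nz"),
        ("0", "o", "nz", "nz"),
        ("nz", "s", "nz", "nz"),
        ("nz", "s", "0", "nz"),
        ("nz", "o", "0", "0"),
        ("0", "o", "0", "0"),
        ("0", "s", "nz", "0"),
        ("0", "s", "0", "0"),
    ]
]


def determines(rows, inputs, output="add_pay"):
    """True if equal values of `inputs` always give the same `output`."""
    seen = {}
    for row in rows:
        key = tuple(row[v] for v in inputs)
        if seen.setdefault(key, row[output]) != row[output]:
            return False
    return True


def sufficient_single_inputs(emp_type):
    """List the single variables that determine add_pay for one employee type."""
    part = [row for row in TABLE if row["emp_t"] == emp_type]
    return [v for v in ("com", "over") if determines(part, (v,))]


print(sufficient_single_inputs("o"))
print(sufficient_single_inputs("s"))
```

With only the values 0 and non-zero available, overtime pay alone fixes add_pay for office workers and commissions alone fix it for sales employees, which is exactly the decomposition the tools propose.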
It is possible to predict whether additional payments is going to be zero or non-zero given only the employee type and either the amount of commissions or overtime pay. That is, if the employee is an office worker and his or her overtime pay is non-zero, then additional payments will be non-zero. On the other hand, if his or her overtime pay is zero, then additional payments will also be zero, since he or she may not make more in commissions than in overtime. A similar argument applies for members of the sales staff. The above table may be rewritten to make this relationship obvious.

com   emp_t   over   ->   add_pay
-     o       nz          nz
-     o       0           0
nz    s       -           nz
0     s       -           0

where "-" = any value or "don't care"

The problem lies in the choice of values for the additional payments state variable. The rules specified in the system model for the calculation of additional payments may be represented in tabular form as shown:

com   emp_t   over   ->   add_pay
nz    -       nz          nz
0     -       0           0
-     o       nz          nz
-     o       0           0
nz    s       -           nz
0     s       -           0

These rules are not concerned merely with determining whether the value of additional payments is zero or non-zero. The rules specify the conditions under which the total of commissions and overtime pay is to be reduced because of the employee's position. For example, if the employee is part of the sales staff and he or she potentially makes more in overtime pay than in commissions, not all of the overtime should actually be paid. This concept of "pay reduction" should be made explicit in the values of the additional payments state variable.
The rules could be rewritten as follows:

com   emp_t   over   ->   add_pay
nz    -       nz          nr
0     -       0           0
0     o       nz          nr
nz    o       0           r
nz    s       0           nr
0     s       nz          r

where
r = pay has been reduced
nr = no reduction in pay

When parallel/sequential decomposition is applied after such a change, the decompositions are structurally the same as for the original system. The only difference is that the complexity of the additional payments subsystem has been increased to 12.00. However, when conditional decomposition is applied to the additional payments subsystem, the offending conditional decomposition is no longer suggested.

The changed subsystem can also be used to illustrate the need for the conditional decomposition heuristic which requires that total complexity not be increased by conditional decomposition. If total complexity is allowed to increase, the following is suggested by the specifications analysis tools for the changed system:

[add_pay = {0,nr}]{com,emp_t,over,add_pay}|12.00
[add_pay = {r}]{com,over,add_pay}|6.00

That is, if the system relation is split using the initial values of the additional payments state variable, in all cases where the additional pay was previously[95] reduced, only knowledge of commissions and overtime is required to determine the new value of additional payments. The complexities of these two conditionally-activated subsystems are 12.00 and 6.00 respectively. The complexity of the original additional payments subsystem was 12.00. Conditional decomposition increased the overall complexity of the subsystem.
Complexity is increased because the additional payments state variable is now an input to the conditionally-decomposed system as well as an output. Since the number of input state variables has increased, so has the number of input states and, therefore, so has the complexity.

Although the above decomposition increases complexity and can, therefore, be rejected, it is worth examining how it could even be a possibility. How can the value of additional payments be calculated without knowing the position of the employee? This decomposition results from the fact that each external event defined in the model changes the value of only one state variable. There are external events which alter the value of the hours worked state variable, and external events which alter the value of the sales state variable, but there are no external events which alter both together. This means that whenever the additional payments subsystem is activated, the value of either commissions or overtime pay is equal to its previous value. This, along with the old dollar value of additional payments and knowledge that pay was previously reduced, is sufficient information to calculate the new value of additional payments. For example, suppose the old values of commissions, overtime pay, and additional payments were as follows:

old commissions = $400
old overtime pay = $500
old additional payments = $800 and pay reduced

Footnote 95: That is, before the external event which led to a new value for either "com" or "over".

The employee was obviously a member of the sales staff since additional payments is less than the sum of sales commissions and overtime pay, and overtime pay exceeds sales commissions.
Now suppose an external event alters the value of sales[96] such that commissions are increased to $600. The additional payments subsystem now has access to the following information:

new commissions = $600
old overtime pay = $500
old additional payments = $800 and pay reduced

Since the old value of additional payments is less than twice the old value of overtime pay and pay was reduced, the employee is a salesperson. Therefore, the new value of additional payments should be $1,100 with no pay reduction.

If the system model is changed so that both hours worked and amount of sales can change at the same time (as would likely be the case in a batch processing system, where transaction records contained information about both hours and sales), this conditional decomposition will not be suggested. Once again, the sensitivity of decomposition to the defined external events is demonstrated.

5.5. Conclusions

Three basic forms of decomposition were identified in Chapter 1: parallel, sequential, and conditional. Parallel and sequential decomposition were discussed in Chapter 3. This chapter has investigated the remaining basic form of decomposition: conditional decomposition. While the basics of conditional decomposition are adapted from the alternation statement rule of Mili et al. (1986), procedures for actually decomposing a system are original to this research.

Two types of conditional decomposition were identified. One type led to different functional forms for a calculation using the same state variables. The other found subsystems described by different sets of state variables. The first type was seen to be primarily useful during the implementation phase of the system development life cycle.
While Mili et al. were concerned with both types, the specifications analysis tools deal only with the second.

Footnote 96: This might happen, for example, if a correction to the amount of sales were entered within the same pay period.

A number of heuristics to limit the search for suitable conditional decompositions have been suggested. Two of these heuristics are derived from those suggested for parallel decomposition, one is justified on the basis of complexity, and another is suggested for reasons of computational efficiency. The remainder follow directly from the meaning of conditional decomposition.

The modified payroll system of Chapter 3 was reexamined, and conditional decomposition was shown to be a useful tool for finding inadequacies in a system model. While the small systems used as examples cannot illustrate useful conditional decomposition, the IFIP system, analyzed in the next chapter, can.

Chapter 6: SELMA Applied

6.1. General

This chapter is intended to show the feasibility of applying the SELMA formalism and specifications analysis tools to a "real" system. The validity of the modelling approach will be assessed by comparing the results to those obtained by using more established systems analysis and design methodologies. In order to do this, each technique must be applied to the same system. Fortunately, there exists a system which has been analyzed by a large number of methodologies. This is the IFIP Working Conference system.

In 1982, the International Federation for Information Processing (IFIP) held a conference intended to provide a comparative review of a number of information system design methodologies. In order to facilitate comparison, a single test case was provided.
The proponents of each methodology then produced a specification for an information system designed to solve the problem presented in the case. The problem was to design an information system to support an IFIP Working Group Conference. The information system was to support several activities of the Program Committee and the Organizing Committee (Olle, 1982, pp. 8-9). The case is described in greater detail in Appendix P.

Activities of the Program Committee to be supported:
1. Preparing a list to whom the call for papers is to be sent.
2. Registering the letters of intent received in response to the call.
3. Registering the contributed papers on receipt.
4. Distributing the papers among those undertaking the refereeing.
5. Collecting the referees' reports and selecting the papers for inclusion in the program.
6. Grouping selected papers into sessions for presentation and selecting a chairman for each session.

Activities of the Organizing Committee to be supported:
1. Preparing a list of people to invite to the Conference.
2. Issuing priority invitations to National Representatives, Working Group members and members of associated working groups.
3. Ensuring all authors of each selected paper receive an invitation.
4. Ensuring authors of rejected papers receive an invitation.
5. Avoiding sending duplicate invitations to any individual.
6. Registering acceptance of invitations.
7. Generating a final list of attendees.

SELMA has been applied to the IFIP Working Conference system.
The application technique comprises five major steps:

Step 1: State variable identification
Step 2: External event identification
Step 3: Sublaw identification
Step 4: Consistency and completeness testing
Step 5: Decomposition

Full application of these steps to the IFIP Working Conference system is far too lengthy to be demonstrated in this chapter. The reader would be overcome by details. The identification of three state variables, one external event, and one sublaw will be described here. These examples were selected to show some interesting aspects of SELMA and to suggest the flavour of its application to a real system. Construction of the entire model is described in Appendices Q, R, and S.

The specifications analysis tools were used to verify the consistency and completeness of the resulting system model. Many errors were made during the construction of the model; however, only one consistency and one completeness error will be described in detail. The intent is to illustrate to the reader the process by which a complete and consistent model may be constructed, but not to overwhelm him or her with details. The errors also illustrate the usefulness of the specifications analysis tools for ensuring model integrity.

The specifications analysis tools suggest three decompositions for the IFIP Working Conference system. As will be discussed later, the differences result from the limited amount of system information incorporated in the model. One decomposition will be selected for comparison to the decompositions produced by Jackson System Development (JSD) (McNeile, 1982, pp. 225-246) and Active and Passive Component Modelling (ACM/PCM) (Brodie and Silva, 1982, pp. 41-91).
JSD and ACM/PCM were selected for comparison with SELMA for a number of reasons:

1. Both JSD and ACM/PCM have been used to solve the Working Conference problem.
2. JSD is notable for its explicit focus on real-world modelling and simulation. In particular it provides some guidelines for the selection of suitable "communicating sequential processes" or entities. These entities are similar to the objects of Object-Oriented Programming. As will be described later in this chapter, SELMA decompositions may be used to identify objects. It will be interesting to see how closely the objects automatically identified by the specifications analysis tools match those identified by JSD.
3. ACM/PCM carefully distinguishes between static and dynamic system modelling. This separation of static and dynamic behaviour, or of data and programs, is common to many methodologies. SELMA makes no such distinction. It will be argued that the separation of system statics and dynamics is not only unnecessary, but may even lead to specification errors.
4. ACM/PCM is typical of many of the system development techniques which depend on object hierarchies representing "is-a" and "part-of" relationships.
5. ACM/PCM uses condition and action statements to describe dynamic behaviour. These statements are similar to sublaws.

It should be clarified from the outset that SELMA is not intended as a replacement for either JSD or ACM/PCM. Both JSD and ACM/PCM support detailed system design down to the implementation level. The SELMA methodology does not do this. SELMA is intended for use at a relatively high level of abstraction during the real-world modelling phase of the systems analysis and design process.
When used with the specifications analysis tools, SELMA can provide automated system verification and decomposition. In ACM/PCM the system's decomposition is a function of the objects selected for inclusion in the specification. No advice is given on how to make the selections. JSD provides a number of rules to aid in object (or entity) identification, but they would be very difficult to automate. These rules will be examined later in this chapter. It will be shown that the subsystems suggested by the specifications analysis tools can be used to form objects which will satisfy all of the JSD rules. Thus SELMA is seen as a possible addition to existing systems analysis and design methodologies, rather than as a methodology in itself.

6.2. Applying SELMA

The five major steps for applying SELMA may be diagrammed as in Figure 22.

Figure 22: Block diagram of the States, Events, and Laws Modelling Approach (SELMA). [Boxes: State Variable Identification; External Event Identification; Sublaw Identification; Completeness and Consistency Testing; Decomposition.]

Note that these steps need not be performed sequentially. That is, it is quite likely that while an analyst is identifying sublaws, he or she may decide that another state variable is required or that an external event has been missed. Also, should the model fail the tests for local consistency and completeness, changes to state variables, external events, and/or sublaws will be required. Finally, as was illustrated in the last chapter[97], if the decompositions suggested by the tools are not considered reasonable by the analyst, changes to the model may be required.

[97] The specifications analysis tools suggested a conditional decomposition of the additional payments subsystem of the modified payroll system which conflicted with reality. Changes to the sublaw describing the calculation of additional payments were required.

A brief description of each step is provided below. Detailed examples of the activities performed during each step will follow. Construction of the IFIP Working Conference model[98] is described in full in Appendices Q, R, and S.

[98] In order to illustrate the utility of tests for completeness and consistency, the model constructed in Appendices Q, R, and S contains several errors.

Step 1: State variable identification

State variable identification is accomplished through a detailed examination of the system functions (these correspond to the activities listed above and in Appendix P for the IFIP Working Conference Problem). The system functions (or requirements) are combined with the analyst's knowledge of system behaviour to identify those properties which should be represented in the information system[99].

[99] Strictly speaking, SELMA state variables represent properties of the system. They do not have to be tied to any particular things (i.e. objects or entities) in the real world. However, it is likely to be difficult for most analysts to visualize a property of the system, as opposed to a property of some thing. There is no harm in visualizing a system as consisting of some set of things before deciding on relevant properties. For example, when analyzing the IFIP Working Conference problem, an analyst may wish to visualize people and papers before deciding on specific properties such as Group membership or paper quality. But it must be remembered that these things are merely a first approximation to a decomposition of the system. The specifications analysis tools will identify deterministic groups of state variables (i.e. subsystems) which can provide the basis for identifying a system's things. There is no reason to expect that an analyst's initial list of things will always be the same as that derived from use of the tools.

Step 2: External event identification

External events are found by examining each state variable identified in Step 1, and deciding whether its value is determined by the system itself or the environment. External events are defined for each state variable directly affected by the environment.

Step 3: Sublaw identification

Every state variable not directly affected by the environment will be included in at least one sublaw. The analyst consults his or her knowledge of system dynamics to construct rules describing the relationships between state variables.

Step 4: Consistency and completeness testing

The specifications analysis tools are used to automatically test the model for local completeness and consistency. Operation of the system is simulated to ensure that each stable state, when acted on by an external event, can be transformed to one and only one stable state by the defined sublaws. Local completeness and consistency were formally defined in Chapter 2.

Step 5: Decomposition

All three forms of decomposition are automatically performed by the specifications analysis tools.
Parallel/sequential decomposition is used to find sets of deterministic subsystems and the time ordering of their activation. For example, parallel/sequential decomposition of the modified payroll system yielded the following:

3: {base,add_pay,total_pay}
2: {com,emp_t,over,add_pay}
1: {hours,pay_r,base} {emp_p,sales,com} {emp_p,hours,over}

This decomposition indicated, among other things, that calculations for base pay ("base"), sales commission ("com") and overtime pay ("over") may be performed in parallel, and that they must be performed before total pay may be determined.

Conditional decomposition provides additional flexibility in the time ordering of subsystem activations. For example, suppose that additional payments ("add_pay") were only calculated for office employees ("emp_t" = "office"). The tools would suggest the following conditional decomposition of the additional payments subsystem.

[emp_t = office]{com,over,add_pay}
[emp_t = sales]{add_pay}

This indicates that, in the case of sales employees, additional payments may be calculated immediately (it will be zero). There is no need to wait for sales commission and overtime pay to be determined.

All forms of decomposition will help the analyst to identify modelling errors should suggestions conflict with his or her understanding of the system.

The five steps of the SELMA methodology will now be applied to the IFIP Working Conference Problem.

6.2.1. State Variable Identification

The first stage in the process of information systems analysis and design involves building a model of the real world (see Figure 6). Naturally, no analyst would attempt to model everything in the real world.
He or she will only model those parts which are to be reflected in the implemented information system. To identify these parts, the analyst must determine the functionality of the system. That is: what is the information system supposed to provide? In SELMA, the portion of the real world to be modelled is delineated by the state variables chosen to represent those properties of the real system required to support the functions to be provided by the information system.

The IFIP Working Conference information system is required to support a number of activities. These were listed earlier in this chapter and are repeated in Appendix P. Consider the first activity of the Programme Committee.

Activity: Preparing a list to whom the call for papers is to be sent.

This activity suggests that one property of the real world, with which the information system will be concerned, should indicate whether a particular person is to be invited to submit a paper to the Conference. This property, or state variable, will be called "pap_inv" (for "invited paper").

Invitations to submit papers are always sent to National Representatives, Working Group members, and members of associated working groups. A state variable indicating whether a person is in any of these categories will be called "grp_mem" (for "group member"). Individuals in each category are treated the same with respect to all of the activities which the information system is to support. Therefore, to avoid unnecessary complexity only one state variable is used.

Individuals not in any of the above categories could also be invited to submit a paper. The state variable "ext_inv" (for "external invitation") will be used to indicate whether this is the case.
Each of these state variables will have two values, "y" and "n" (for "yes" and "no"), to indicate whether a person has been invited to submit a paper, is a group member, or will be invited to submit a paper regardless of group membership.

Notice that state variables describing the list itself are not properly a part of the system being modelled[100]. The list is an artifact of the implemented information system and need not be included in the model of the real world.

Many more state variables were identified by examining the other required system functions. These are described in Appendix Q. A list of the IFIP Working Conference state variables will be provided after the identification of an external event is illustrated.

6.2.2. External Event Identification

In SELMA, external events affect a system by altering the values of state variables. The values of other state variables may be changed by the system itself in response to an external event. Such secondary changes are called internal events. During this step, the analyst is primarily concerned with external events. Internal events are considered when system sublaws are defined.

Each of the above state variables must be examined to decide whether its value is set by an external event. One will be examined here. The others are considered in Appendix R.

The state variable "del_acc", as identified in Appendix Q, is used to represent whether a delegate has accepted an invitation to attend the Conference. Whether a person accepts an invitation is beyond the influence of the system. Therefore, acceptance of an invitation must be modelled using external events. However, the state variable "del_acc" was to be used to generate a list of conference attendees.
This implies that attendance at the Conference is entirely decided by factors external to the system. This is not the case. Mere acceptance of an invitation is not a sufficient condition for registration at the Conference. The delegate must also have been invited[101]. A state variable, in addition to those identified in Appendix Q, is required to indicate whether the delegate has actually been registered. This state variable will be called "del_reg" (for "registered delegate") and will have the values "y" and "n" (for "yes", the delegate is registered, and "no", he or she is not). The value of "del_reg" is not directly affected by external events, but is determined solely by the values of "inv" and "del_acc". Also note that the activity "generating a final list of attendees" will require the examination of the state variable "del_reg", instead of "del_acc" as suggested in Appendix Q.

[100] For example, state variables describing the list itself might include list currency or length. If some activities of the Committees required these properties, they would have to be included in the model.

[101] It is conceivable that some person might return an invitation which was not sent to him. Perhaps it was obtained from a colleague. Details like this one were not included in the "first draft" model of the IFIP Working Conference system. They were added in order to make the model complete and consistent. For clarity, not all errors made in the "first draft" are described here.

The state variables identified through examination of required system functions and determination of external events are listed below. The defined values have the following meanings.

y = yes
n = no
acc = accept
rej = reject
n/a = not applicable

State Variable  Values       Description
grp_mem         y,n          Whether a person is a member of the Working Group.
ext_inv         y,n          Whether an invitation to submit a paper should have been issued to a person by the Programme Committee regardless of Group membership.
pap_prom        y,n          Whether a person has promised to submit a paper to the Working Conference.
pap_sub         y,n          Whether a person has submitted a paper for review to the Working Conference.
ret_ref         y,n          Whether a paper has been returned to the Programme Committee by the referees.
suit            y,n          Whether a paper is suitable for inclusion in the Conference.
chair           y,n          Whether a chairman has been assigned to a session by the Programme Committee.
del_acc         y,n          Whether a person has accepted an invitation from the Organizing Committee to attend the Conference.
pap_inv         y,n          Whether a person has been invited to submit a paper to the Programme Committee for consideration.
sent_ref        y,n          Whether a paper has been sent to the referees by the Programme Committee.
ref_dec         acc,rej,n/a  The referees' decision as to the suitability of a paper for inclusion in the Conference.
pap_dec         acc,rej,n/a  The Programme Committee's decision as to the suitability of a paper for inclusion in the Conference.
sess_ass        y,n          Whether a paper has been assigned to a session by the Programme Committee.
inv             y,n          Whether a person has been invited to attend the Conference by the Organizing Committee.
del_reg         y,n          Whether the person has been registered to attend the Conference.

6.2.3. Sublaw Identification

The easiest way to define sublaws is to consider each state variable individually. The sublaws involving all of the state variables listed above are developed in Appendix S.
Only the sublaw governing the decision to invite a person to attend the Conference will be considered here. A person will be invited if one of the following conditions is met:

1. He or she is a Working Group member.
2. He or she has submitted a paper that has been accepted, rejected, or not yet returned by the referees[102].

Furthermore, no person should be invited twice and no invitation should be cancelled once issued. This last requirement implies that the stability conditions relevant to the state variable "inv" are not very restrictive. A person will not be invited if his or her paper is not considered by the Programme Committee (i.e. "sent_ref" is "n") and he or she is not a Group member. However, an invitation may be (or may have been) sent in any other situation.

This sublaw may be expressed in tabular form as shown below. State variables and values are as defined earlier, and "-" means "any value" or "don't care".

Sublaw: "Authors of processed papers and group members are invited"

Stability Conditions:
sent_ref  grp_mem  inv
                   y
n         n        n

Corrective Actions:
Conditions                             Actions
pap_dec  sent_ref  grp_mem  inv  -->   inv
- - y y acc n y rej - - n y y - n y y n y

6.2.4. Consistency and Completeness Testing

Appendix T contains a listing of the IFIP Working Conference system model in the format required by the specifications analysis tools. There are some differences between this model and the one developed above. The differences reflect changes to the system required to correct errors found during this step. The tools also note that some of the rules included in the model are not required to respond to the defined external events. Each of these rules must be examined to determine whether it is redundant or whether there is a deficiency in the model.
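As a rough illustration of how the tabular notation of Section 6.2.3 can be read, the following Python sketch treats each stability-condition row as a pattern in which "-" matches any value, and calls a state stable when at least one row matches. The two rows are an assumed encoding of the prose description of the "inv" sublaw (an issued invitation is never withdrawn; a person who is not a Group member and whose paper was not sent to the referees is not invited). The names WILDCARD, INV_STABILITY, matches, is_stable, and stable_states are hypothetical and are not part of the specifications analysis tools.

```python
from itertools import product

WILDCARD = "-"  # "any value" / "don't care", as in the sublaw tables

# Assumed reading of the "inv" stability conditions, taken from the prose:
#   - an invitation, once issued, is stable in every situation;
#   - "not invited" is stable only for a non-member whose paper was not refereed.
INV_STABILITY = [
    {"sent_ref": WILDCARD, "grp_mem": WILDCARD, "inv": "y"},
    {"sent_ref": "n", "grp_mem": "n", "inv": "n"},
]


def matches(row, state):
    """True if every non-wildcard cell of the row equals the state's value."""
    return all(v == WILDCARD or state[k] == v for k, v in row.items())


def is_stable(state, rows):
    """A state satisfies the sublaw if at least one stability row matches it."""
    return any(matches(row, state) for row in rows)


def stable_states(rows, variables=("sent_ref", "grp_mem", "inv")):
    """Enumerate every y/n combination of the variables and keep the stable ones."""
    states = (
        dict(zip(variables, values))
        for values in product("yn", repeat=len(variables))
    )
    return [s for s in states if is_stable(s, rows)]
```

Under these assumptions, is_stable({"sent_ref": "n", "grp_mem": "y", "inv": "n"}, INV_STABILITY) is false: a Group member who has not yet been invited is in an unstable state, which is the kind of situation a corrective action is expected to resolve by setting "inv" to "y".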
[102] Notice that mere submission of a paper does not guarantee a person an invitation to attend the Conference. This is an invited paper conference. No paper will be sent to the referees by the Programme Committee unless it was previously invited.

The time required for testing can be considerably reduced by noticing that two state variables appear to be unrelated to the rest of the system. The state variables "pap_prom" and "chair" are not mentioned together, or with any other state variables, in any sublaw. Therefore, they cannot affect the behaviour of any other state variable. While the values of these state variables are of interest to the Programme Committee, they can be handled by subsystems which are independent of the rest of the system.

The system as described above has 352 stable states. If "chair" is removed from the system, there will be 176 stable states, and if "pap_prom" is also removed, there will be only 88 stable states. The number of stable states is halved in each case because both state variables have two values, and they may assume either of these values regardless of the state of the rest of the system. In order to save time testing and decomposing the system, these two state variables will be dropped from the model. Subsystems to handle promised papers and the assignment of chairmen can be constructed independently of the other subsystems which will be suggested by the specifications analysis tools.

When the model is entered, the specifications analysis tools will find it to be inconsistent. If a person who was not a Working Group member and did not submit a paper to the Programme Committee becomes a member, the system can change to two different stable states. The relevant state variables and values are shown below.
State Variable  Initial Stable State  After Event  Final Stable States
grp_mem         n                     y            y      y
pap_sub         n                     n            n      n
ref_dec         n/a                   n/a          n/a    n/a
sent_ref        n                     n            n      n
pap_dec         n/a                   n/a          rej    n/a

There is an error in the sublaw which determines the final value of "pap_dec". In Appendix S, it was assumed that the decision to include a paper in the Conference is based solely on the validity of the referees' decision. The referees' decision will not be valid if the paper they judged was not sent to them by the Programme Committee or no paper was submitted. The original tabular form of this sublaw is as follows (as developed in Appendix S)[103]:

Original incorrect sublaw

Sublaw: "Papers are accepted or rejected"

Stability Conditions:
ref_dec  sent_ref  pap_sub  pap_dec
acc      y         -        acc
rej      y         -        rej
n/a      -         -        rej
                   n        n/a

Corrective Actions:
Conditions                       Actions
ref_dec  sent_ref  pap_sub  -->  pap_dec
acc      y         -             acc
rej      y         -             rej
n/a      -         -             rej
                   n             n/a

The third rule in both the stability conditions and corrective actions sections of the sublaw must be changed as shown below. The corrected sublaw reflects the fact that if the referees' decision is not applicable, the Programme Committee's decision will be neither accept nor reject.

Stability Conditions:
ref_dec  sent_ref  pap_sub  pap_dec
n/a      -         -        n/a

Corrective Actions:
Conditions                       Actions
ref_dec  sent_ref  pap_sub  -->  pap_dec
n/a      -         -             n/a

[103] Notice that the stability conditions and the corrective actions parts of this sublaw are nearly identical. A "pap_dec" conditions column containing initial values of "pap_dec" could have been added to the corrective actions, but since the final value of "pap_dec" is independent of its initial value, it is easier to simply leave it out. As is evident in Appendix S, the structures of the stability conditions and the corrective actions are often very similar. This is to be expected since the stability conditions specify the stable combinations of values for the state variables and the corrective actions specify how to attain those values. While this means the analyst must provide seemingly redundant information, the two parts of a sublaw are not always the same and neither may be left out. For example, the stability conditions and corrective actions of the sublaw for determining the value of "inv" (as described in the previous section) are quite different.

The model is incomplete with respect to an external event which sets the value of "grp_mem" to "y" (i.e. the person becomes a Working Group member). There is an error in the sublaw responsible for setting the value of "sess_ass". A paper may be submitted and neither accepted nor rejected by the Programme Committee, either because it was never sent to the referees or was never returned. The original tabular form of this sublaw is as follows (as developed in Appendix S):

Original incorrect sublaw

Sublaw: "Accepted papers are assigned to a session"

Stability Conditions:
pap_dec  pap_sub  sess_ass
acc      -        y
rej      -        n
         n        n

Corrective Actions:
Conditions              Actions
pap_dec  pap_sub  -->   sess_ass
acc      -              y
rej      -              n
         n              n

Rules specifying the behaviour of the system when the Programme Committee's decision on a paper is neither accept nor reject must be added as shown below. These new rules show that papers which are neither accepted nor rejected are not assigned to a session.

Stability Conditions:
pap_dec  pap_sub  sess_ass
n/a      -        n

Corrective Actions:
Conditions              Actions
pap_dec  pap_sub  -->   sess_ass
n/a      -              n

The model is also incomplete with respect to an external event which sets the value of "pap_sub" to "n" (i.e. no paper is submitted to the Programme Committee). The sublaw responsible for setting the value of "ref_dec" does not specify the action to be taken when a paper is not submitted (and the value of "suit" is therefore "n/a") but is somehow returned by the referees[104]. This sublaw, and the sublaw specifying the relationship between paper submission and suitability, are as follows:

[104] Perhaps a referee accidentally returned a paper destined for another conference.

Sublaw: "Referees either accept or reject"

Stability Conditions:
ret_ref  suit  ref_dec
y        y     acc
y        n     rej
n        -     n/a

Corrective Actions:
Conditions          Actions
ret_ref  suit  -->  ref_dec
y        y          acc
y        n          rej
n        -          n/a

Sublaw: "Papers may be suitable or unsuitable"

Stability Conditions:
pap_sub  suit
y        y
y        n
n        n/a

Rather than alter the referee decision sublaw, it was decided to drop the value of "n/a" for the state variable "suit". In reality, a paper will be either suitable or unsuitable regardless of whether it is actually submitted to the Programme Committee. The sublaw specifying the relationship between paper submission and suitability was also dropped from the model.

If the above corrections are made, the model will be both locally complete and locally consistent. Appendix T contains a listing of the locally complete and consistent model expressed in the syntax required by the specifications analysis tools.

Although this model is complete and consistent, the tools note that two rules are not needed to return the system to a stable state after the action of any external event. One of these rules deals with invitation to the Conference, the other with registration.
Unnecessary invitation rule:
Conditions                   Actions
grp_mem  sent_ref  inv  -->  inv
                   y         y

Unnecessary registration rule:
Conditions   Actions
inv  -->     del_reg
n            n

The analyst should confirm that these rules are indeed redundant, and that no semantic error has been made. The reason the rule which sets the value of "inv" is never activated is trivial. It is not capable of changing the state of the system. That is, the action state variable "inv" must have the same value before and after rule activation. This does not indicate an error in the system model. There is no theoretical reason why such a rule should not be allowed to fire. The specifications analysis tools simply avoid such rules to save time when determining system response paths to external events.

The rule which sets the value of "del_reg" is never activated because state variable "inv" will never be assigned a value of "n" during a system response to an external event. Invitations are never withdrawn. Because states are stable before the application of the external events, if "inv" has the value "n" then "del_reg" will also have the value "n". Therefore, this rule is never required to regain stability and is redundant.

6.2.5. Decomposition

6.2.5.1. Parallel/Sequential Decomposition

Parallel/sequential decomposition as performed by the specifications analysis tools leads to three different decompositions for the IFIP Working Conference system. None of these decompositions is exactly the same as the decomposition inherent in the sublaws. The differences between the suggested decompositions will be discussed first. As shall be shown, these differences can be attributed to a deficient system model.
One decomposition will be selected for further analysis and the differences between it and the decomposition inherent in the sublaws will be explained. These differences point to "inefficient" sublaw definitions.

The three decompositions suggested by the tools are listed below and shown in diagrammatic form in Figure 23. They differ only in the subsystems responsible for calculating the values of "pap_dec" and "sess_ass" (i.e. the subsystems which decide whether a paper is accepted by the Programme Committee for inclusion in the Conference, and whether a paper is assigned to a session).

Decomposition #1 (Complexity = 30.84)
4: {del_acc,inv,del_reg} {ret_ref,sent_ref,sess_ass,pap_dec}
3: {grp_mem,inv,sent_ref,inv} {ref_dec,sent_ref,sess_ass}
2: {pap_inv,pap_sub,sent_ref}
1: {ext_inv,grp_mem,pap_inv} {ret_ref,suit,ref_dec}

Decomposition #2 (Complexity = 31.49)
4: {del_acc,inv,del_reg}
3: {grp_mem,inv,sent_ref,inv} {ref_dec,sent_ref,pap_dec} {ref_dec,sent_ref,sess_ass}
2: {pap_inv,pap_sub,sent_ref}
1: {ext_inv,grp_mem,pap_inv} {ret_ref,suit,ref_dec}

Decomposition #3 (Complexity = 30.35)
4: {del_acc,inv,del_reg} {pap_dec,sess_ass}
3: {grp_mem,inv,sent_ref,inv} {ref_dec,sent_ref,pap_dec}
2: {pap_inv,pap_sub,sent_ref}
1: {ext_inv,grp_mem,pap_inv} {ret_ref,suit,ref_dec}

The complexities of all three decompositions are roughly the same. The first suggestion is somewhat surprising in that it shows that "pap_dec" can be determined as a function of "sess_ass". (The functional forms associated with this subsystem, and the other subsystems discussed below, are listed in Appendix U.)
While it would be possible to construct a system in which "pap_dec" is determined from "sess_ass" and which still fulfilled all the requirements, an analyst would probably reject any suggestion that papers be assigned to sessions before they are accepted by the Programme Committee. The decomposition would be rejected because there are probably other factors which affect the Programme Committee's acceptance and session assignment decisions in addition to those included in the model. For example, time and space considerations may dictate that an otherwise acceptable paper cannot be included in the conference. In other words, the model does not embody all of the analyst's knowledge pertaining to the dynamics of the system.

That there is some missing information is also apparent in the second decomposition. It shows that the values of "pap_dec" and "sess_ass" can be determined in parallel using the same input state variables. In the third decomposition, "sess_ass" is shown as a function of "pap_dec" only. This is closest to the intuitive sequence of actions within the system. The model could be enhanced to include more of the factors influencing the Programme Committee's decision to accept a paper and assign it to a session. The enhanced model could then be decomposed again to ensure that the suggested decompositions fully agree with the analyst's knowledge of the system dynamics. However, such a detailed analysis would not be appropriate in this chapter. The third decomposition will be selected for further analysis.

[Figure 23: Decomposition #1, Decomposition #2 and Decomposition #3 in diagrammatic form.]

In the third decomposition, the subsystems responsible for calculating the values of "pap_inv", "ref_dec", "sent_ref", and "del_reg" are described by the same state variables used to specify the relevant sublaws. However, there are some differences between the subsystems suggested by the tools and those intuitively obvious from the sublaw specifications. As discussed below, these differences point to redundant rules or "over-specification" in the model.

1. The sublaw specifying the calculation of "inv" mentions state variables "pap_dec", "grp_mem", "inv" and "sent_ref", but the subsystem is not described by "pap_dec". The tools recognized that the decision to invite a person to attend the Conference depends only on whether that person is a group member, whether he or she submitted a paper, and whether an invitation was previously issued. The decision to accept or reject the paper is irrelevant as authors of both accepted and rejected papers are invited.

2. The sublaw specifying the calculation of "pap_dec" mentions state variables "ref_dec", "sent_ref", and "pap_sub".
However, the tools recognized that the value of "pap_sub" is not required in order to determine the value of "pap_dec". That is, it is not necessary to know explicitly whether a paper was submitted, only whether it was sent to the referees and whether they found it to be suitable.

3. Similarly, the sublaw specifying the calculation of "sess_ass" mentions state variables "pap_dec" and "pap_sub". However, the tools recognized that the value of "pap_sub" is not required in order to determine the value of "sess_ass".

6.2.5.2. Conditional Decomposition

The selected decomposition shows the decision to invite a person to attend the Conference (represented by the value of the state variable "inv") being made at level 3. This results from the fact that in order to make the decision under all possible circumstances it must be known whether a person is a Working Group member (represented by "grp_mem") and whether the paper was accepted for further consideration (represented by "sent_ref"). Such a subsystem is likely to be unacceptable to the analyst as invitations to Working Group members should be sent long before papers are received from either Group members or external invitees. The subsystem (grp_mem.inv.sent_ref.inv) is a candidate for conditional decomposition. When conditional decomposition is applied to the subsystem by the specifications analysis tools, three suggestions are made.

1. [grp_mem = y](inv)
   [grp_mem = n](inv.sent_ref.inv)

This conditional decomposition shows that as long as the person is a Working Group member, there is only one possible value for "inv". An invitation should be sent.
However, if the person is not a Working Group member, the invitation decision must be delayed until a paper is submitted and it is decided whether to send the paper to the referees (i.e. it is accepted for consideration by the Programme Committee).

2. [inv = y](inv)
   [inv = n](grp_mem.sent_ref.inv)

Similarly, this conditional decomposition shows that if an invitation has been sent the final value of "inv" is known. Since invitations are not cancelled, the final value of "inv" will be "y". However, if no invitation is sent, the values of both "grp_mem" and "sent_ref" must be considered.

3. [sent_ref = y](inv)
   [sent_ref = n](grp_mem.inv.inv)

This conditional decomposition shows that if a paper is sent to the referees, there is only one possible value for "inv" (i.e. an invitation will be sent). However, if the paper is not sent to the referees, it is necessary to know whether the person is a Working Group member before deciding whether to send an invitation.

When choosing between the three conditional decompositions, an analyst would likely favour decompositions which allow decisions to be made at a lower level (i.e. earlier) than that indicated by the parallel/sequential decomposition. For example, the conditional decompositions for the invitation subsystem allow the parallel/sequential level structure to be modified as follows:

    4: (del_acc.inv.del_reg) (pap_dec.sess_ass)
    3: [grp_mem=n](inv.sent_ref.inv) (ref_dec.sent_ref.pap_dec)
    2: (pap_inv.pap_sub.sent_ref)
    1: (ext_inv.grp_mem.pap_inv) (ret_ref.suit.ref_dec) [grp_mem=y](inv)

or

    4: (del_acc.inv.del_reg) (pap_dec.sess_ass)
    3: [inv=n](grp_mem.sent_ref.inv) (ref_dec.sent_ref.pap_dec)
    2: (pap_inv.pap_sub.sent_ref)
    1: (ext_inv.grp_mem.pap_inv) (ret_ref.suit.ref_dec) [inv=y](inv)

or

    4: (del_acc.inv.del_reg) (pap_dec.sess_ass)
    3: [sent_ref=y](inv) [sent_ref=n](grp_mem.inv.inv) (ref_dec.sent_ref.pap_dec)
    2: (pap_inv.pap_sub.sent_ref)
    1: (ext_inv.grp_mem.pap_inv) (ret_ref.suit.ref_dec)

The first alternative shows that under some circumstances the decision to invite a person may be made at level 1 (i.e. if a person is a Group member, he or she should be invited). The second merely affirms the fact that invited people will always be invited (i.e. invitations are not revoked). This decomposition does not allow the original invitation decision to be made any earlier. The third does not allow any decision to be advanced in time since both conditional subsystems are still at level 3. Therefore, the first alternative will be adopted for purposes of comparison with JSD and ACM/PCM. This decomposition is shown in diagrammatic form in Figure 24.

The complexity of the subsystem (grp_mem.inv.sent_ref.inv) is 4.35. The total complexity of each conditional decomposition is 3.25. Therefore, in addition to adding flexibility to the timing of the decision to invite a person, there is significant complexity reduction through conditional decomposition.

6.3. Jackson System Development (JSD)

The results of analysis using SELMA can be compared to those of more established systems analysis techniques.
In this section and the next, the objects identified by Jackson System Development (JSD) and Active and Passive Component Modelling (ACM/PCM) for the IFIP Working Conference system will be compared to the hierarchy of subsystems discovered above. In JSD, "the real world is described in terms of entities, actions they perform or suffer, and the orderings of those actions" (Jackson, 1983, p. 23). The notion of "entity" in JSD is different from that used in most database modelling
Item Metadata
Title | Reasoning tools to support systems analysis and design |
Creator | Paulson, James Daniel |
Publisher | University of British Columbia |
Date Issued | 1989 |
Genre | Thesis/Dissertation |
Type | Text |
Language | eng |
Date Available | 2010-10-18 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0098358 |
URI | http://hdl.handle.net/2429/29259 |
Degree | Doctor of Philosophy - PhD |
Program | Business Administration |
Affiliation | Business, Sauder School of |
Degree Grantor | University of British Columbia |
Campus | UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |