SWITCHING CIRCUITS AS INFORMATION NETWORKS by W. S. MATHESON B . E . ( E l e c ) , U n i v e r s i t y of Melbourne, A u s t r a l i a , M.Eng.Sci., U n i v e r s i t y of Melbourne, A u s t r a l i a , A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS OF THE DEGREE OF DOCTOR OF PHILOSOPHY i n the Department of E l e c t r i c a l Engineering We accept t h i s thesis as conforming to the required standard Research Supervisor.... Members of the Committee Acting Head of the Department Members of the Department of E l e c t r i c a l Engineering THE UNIVERSITY OF BRITISH COLUMBIA June, 1970 In p r e s e n t i n g t h i s t h e s i s in p a r t i a l f u l f i l m e n t o f the r e q u i r e m e n t s f o r an advanced degree at the U n i v e r s i t y o f B r i t i s h Co lumb i a , I a g ree t h a t the L i b r a r y s h a l l make i t f r e e l y a v a i l a b l e f o r r e f e r e n c e and s tudy . I f u r t h e r agree t h a t p e r m i s s i o n f o r e x t e n s i v e c o p y i n g o f t h i s t h e s i s f o r s c h o l a r l y purposes may be g r a n t e d by the Head o f my Department o r by h i s r e p r e s e n t a t i v e s . It i s u n d e r s t o o d tha t c o p y i n g o r p u b l i c a t i o n o f t h i s t h e s i s f o r f i n a n c i a l g a i n s h a l l not be a l l o w e d w i t h o u t my w r i t t e n p e r m i s s i o n . Department o f £lE<ZTZ (q/t-L jZM<i f MrrV<$ The U n i v e r s i t y o f B r i t i s h Co lumbia Vancouver 8, Canada Date 2 f XT***-ABSTRACT A s i n g l e - o u t p u t combinational s w i t c h i n g network has a number of input t e r m i n a l s , each c a r r y i n g a s i g n a l v a r i a b l e which may take one of two v a l u e s , and an output t e r m i n a l , the s i g n a l v a r i a b l e of which has a value determined i d e a l l y by the input s i g n a l s only. In t h i s t h e s i s , we make an a r b i t r a r y assignment of p r o b a b i l i t i e s to each of the p o s s i b l e c o n f i g u r a t i o n s of input s i g n a l v a l u e s , (namely that each c o n f i g u r a t i o n i s e q u a l l y l i k e l y ) . This i s an i n t e r p r e t a t i o n of s w i t c h i n g v a r i a b l e s as random v a r i a b l e s w i t h known s t a t i s t i c s . We can t h e r e f o r e d e f i n e and compute the j o i n t source entropy of s e t s of v a r i a b l e s , i n c l u d i n g the output v a r i a b l e . We use these i n f o r m a t i o n q u a n t i t i e s , or e n t r o p i e s , . t o c l a s s i f y s w i t c h i n g f u n c t i o n s i n t o Equivalence Classes under Permutation and Complementation of input v a r i a b l e s , and Negation of the Function. The e n t r o p i e s can a l s o be used to p r e d i c t some of the u s e f u l p r o p e r t i e s of s w i t c h i n g f u n c t i o n s , i n some cases more simply than co n v e n t i o n a l methods which employ Boolean Algebra. the model a l s o suggests a s w i t c h i n g c i r c u i t design philosophy based on the i d e a of using c i r c u i t elements, or gates, to pass i n f o r m a t i o n i n the input s i g n a l s which i s r e l e v a n t to the output, w h i l e b l o c k i n g the i r r e l e v a n t i n f o r m a t i o n . S e v e r a l algorithms are d e s c r i b e d , and t h e i r performance on the design of c i r c u i t s w i t h s m a l l numbers of v a r i a b l e s i s encouraging. The design philosophy seems p a r t i c u l a r l y able to handle t o p o l o g i c a l c o n s t r a i n t s , of the type becoming s i g n i f i c a n t i n modern s w i t c h i n g c i r c u i t design. TABLE OF CONTENTS Page 1. INTRODUCTION 1 2. MODEL DESCRIPTION 5 2.1 Random Variable, P r o b a b i l i t i e s , and Weights.. 5 2.2 Information Content or Entropy 7 2.3 Manipulation of Entropies 8 3. PCN EQUIVALENCE CLASSES . 11 3.1 D e f i n i t i o n of Equivalence Classes 11 3.2 Source Entropies as Invariants of PCN Equivalence Classes 13 3.3 Representative Functions f o r PCN Equivalence Classes 19 4. ANALYSIS OF FUNCTIONS USING ENTROPIES 24 4.1 Symmetry 24 4.2 Unate Cascade R e a l i s a b i l i t y 25 4.3 Monotonic and Unate Functions 30 4.4 Threshold Functions 36 4.5 D i s j u n c t i v e Decomposition 45 5. SYNTHESIS OF SWITCHING CIRCUITS 51 5.1 A Synthesis Philosophy 51 5.2 Information Performance Functions 54 5.3 Comparison of Design Algorithms 55 5.4 Design of Multiple-Output Functions 63 6. CONCLUSIONS . 65 APPENDIX 1 67 REFERENCES 69 ACKNOWLEDGEMENTS The author of t h i s thesis i s deeply indebted to Dr. E.L. Sigurdson of the Department of E l e c t r i c a l Engineering, U n i v e r s i t y of B r i t i s h Columbia, fo r h i s advice and c r i t i c i s m . The support of the Commonwealth Scholarship and Fellowship Plan i s also g r a t e f u l l y acknowledged. 1. INTRODUCTION A (Completely-specified) switching function, y = f ( x ^ , . . . , x ^ ) , i s a mapping which assigns one of two p o s s i b l e values to y for each combination of values taken by (x ,...,x^), where each x^ can have only one of two po s s i b l e values. Each of the v a r i a b l e s i n the set {x^,...,x^,y} i s thus a binary v a r i a b l e . A single-output combinational switching c i r c u i t , ( i . e . , a s i n g l e -output switching c i r c u i t with no memory), i s some interconnection of more p r i m i -t i v e switching elements within a "black box" having N binary inputs (x^,...,^) and one binary output y, such that y i s a switching function of (x^,...,x ). A switching c i r c u i t i s sa i d to " r e a l i s e " a p a r t i c u l a r switching function. The switching c i r c u i t design problem i s : given a switching function and some allowed classes of p r i m i t i v e elements, produce a c i r c u i t r e a l i s a t i o n , possibly subject to s p e c i f i e d constraints. The t r a d i t i o n a l methods of analysing and designing switching c i r c u i t s use Boolean Algebra. This follows the observation by Shannon (1) that the relay/contact r e a l i s a t i o n of a switching function i s analogous to a l o g i c a l p r o p o s i t i o n or a Boolean Expression. Since then, switching functions and Boolean Expressions have become v i r t u a l l y synonymous, and t h i s algebraic model of switching c i r c u i t s has led to many a n a l y t i c a l r e s u l t s , and to several suc c e s s f u l c i r c u i t design methods. Two such d i r e c t methods i n general use are the Map Method (2), and the Table Method. Karnaugh Maps are a simple v i s u a l method of s i m p l i f y i n g Boolean Expressions, and are s u i t a b l e f o r small numbers of v a r i a b l e s . The Quine-McCluskey Table (3) i s more systematic, and can be programmed for a d i g i t a l computer. Both these methods and t h e i r variants generate b a s i c a l l y "two-level" designs, ( i . e . switching c i r c u i t s with two l e v e l s of gates, between input and output), when l o g i c gates are the elements rather than relay contacts. 2 Other methods of design seek to e x p l o i t some of the properties of c e r t a i n classes of switching functions. Two such methods are the design algorithms for symmetric functions (4), and Decomposition Theory (5), both of which generate m u l t i - l e v e l designs. Again, these methods are algebraic i n nature. However, the Boolean Expression representing the function of a switching c i r c u i t t e l l s us l i t t l e else about the r e a l i s a t i o n , except i n the s p e c i a l case of relay/contact c i r c u i t s . This did not matter before the advent of "gate-type" networks, but modern design now uses elements, (e.g. NAND-gates, which are not the usual Boolean Operators, (AND, OR, NOT). Further complica-tions are that modern design c r i t e r i a often concern such q u a n t i t i e s as "number of interconnections" and " p l a n a r i t y " , rather than the "minimum number of switching elements" c r i t e r i o n which i s more nearly r e l a t e d to the form of Boolean Expression. Generally, the desired switching c i r c u i t r e a l i s a t i o n of a function has become less and l e s s obvious from the algebraic properties of the function. Modifications to the basic Boolean Expression model have been made, such as introducing new operators which model NAND- and NOR-gates (6). Attempts have been made to develop expressions which r e f l e c t c i r c u i t topology (7,8), but since these expressions derive from Boolean Expressions, which contain i n s u f f i c i e n t t o p o l o g i c a l information, these expressions f a i l , i n general, f o r much the same reasons. Other methods, such as Tree synthesis (9), or Arrays (10), take some basic topology other than the two-level one implied most d i r e c t l y by Boolean Expressions, and design within t h i s framework. However, the problem of synthesis with unspecified topology, but with t o p o l o g i c a l c o n straints, has not been solved with design methods based on analysis of the Boolean Expression of the function. 3 The model of switching functions proposed i n t h i s thesis i s developed from another c o n t r i b u t i o n of Shannon, namely Information Theory (11). Rather than analyse the switching function y = f(x x^) i t s e l f , we analyse such qu a n t i t i e s as the Information Content of subsets of {x^ ,x^,y}. To do t h i s , we assume a p r o b a b i l i t y d i s t r i b u t i o n for the set of possible input s i g n a l con-f i g u r a t i o n s , and since y i s not an independent v a r i a b l e , we can p r e d i c t the p r o b a b i l i t i e s of s i g n a l configurations for sets of v a r i a b l e s which include the output. The information content, or entropy, of a set of v a r i a b l e s i s then derived d i r e c t l y from the p r o b a b i l i t y d i s t r i b u t i o n for signals taken by that set. I f we model a switching c i r c u i t i n t h i s way, we f i n d that we are looking at a " d e t e r m i n i s t i c channel", from the Information Theory point of view ( i . e . one i n which, given the input, we exactly know the output). We can then consider those information q u a n t i t i e s , such as equivocation and mutual information, which prove u s e f u l i n the study of channels. It i s important to note that we are not concerned with t r y i n g to p r e d i c t the actual s t a t i s t i c a l properties of a c i r c u i t i n s i t u . This problem arises from consideration of c i r c u i t r e l i a b i l i t y , and has been studied (12). Rather, we are modelling the switching function as an information manipulator, we are more concerned with information content than p r o b a b i l i t i e s , and the input p r o b a b i l i t y d i s t r i b u t i o n we use i s u n l i k e l y i n p r a c t i c e . The model i s i n t u i t i v e l y s a t i s f y i n g i n that many properties of a switching function are apparent from information q u a n t i t i e s associated with i t . The r e l a t i o n s h i p s are arithmetic rather than algebraic i n nature. For example, the function y = f(x^,....,x^) i s independent of x^ whenever the information content of (x2.x^,...>*N>y) i s equal to that of ( x 2 , x 3 , . . . , x ^ ) . The f i r s t part of the thesis i s concerned with de f i n i n g the model more r i g o r o u s l y , and p r e d i c t i n g some properties of switching functions from arithmetic r e l a t i o n s h i p s between information q u a n t i t i e s . I t i s contended that, i n many cases, these arithmetic tests are easier to apply than the algebraic tests using the Boolean model, p a r t i c u l a r l y when a computer i s used. The second part of the thesis i s concerned with a c i r c u i t design philosophy that uses information q u a n t i t i e s . Rather than manipulate algebraic e n t i t i e s such as Prime Implicants or P a r t i a l Functions, we s t a r t with the idea that some information embedded i n the input information i s required to be passed by the switching c i r c u i t (or channel). The remaining information i s to be blocked. We therefore seek to combine input v a r i a b l e s , using a n acceptable form of switching element (or gate), such that some performance function, computed from the information properties of the gate output, i s optimised. We may then add t h i s gate to the input l i s t , delete any input which becomes redundant, and repeat the process. I t i s found that, at l e a s t f o r small numbers of input v a r i a b l e s , that "good" r e a l i s a t i o n s of functions are generated quickly. Several algorithms using t h i s philosophy are given, with various information performance functions that suggest themselves i n t u i t i v e l y . Book-keeping requirements are also discussed. B u i l d i n g up the c i r c u i t from input to output allows us to make any type of r e s t r i c t i o n we l i k e on the topology of the r e a l i s a t i o n , thus overcoming one objection of the algebraic approach. I f a p a r t i c u l a r connection to a gate v i o l a t e s a con s t r a i n t , i t i s simply not considered, and some other v a l i d "good" connection i s made. There are drawbacks to t h i s method of synthesis. Perhaps the most serious i s that convergence to the desired function, giving a "good" r e a l i s a t i o n , cannot be guaranteed. This i s common to most design methods. Secondly, one cannot compute the information properties of a gate's output from those of i t s inputs, wihtout f i r s t f i n d i n g the gate's output function. I t i s f e l t , however, that the r e s u l t s presented i n the following Sections j u s t i f y the model proposed, and i t i s evident that further development i s possi b l e . 5 2. MODEL DESCRIPTION 2.1 Random Variables, P r o b a b i l i t i e s , and Weights A completely-specified switching function, y = f ( x ^ , . . . , x ^ ) , N can be described by means of an (N+1) x (2 ) truth table. An example i s Fi g . 2.1. Each row of the table gives a pos s i b l e set of sign a l s f o r the ordered set of binary v a r i a b l e s {x^, x^,y}. In the model being described, we consider each v a r i a b l e and assign p r o b a b i l i t i e s to the values i t may take. The most n a t u r a l way to assign p r o b a b i l i t i e s i s to allow each input v a r i a b l e x_^ to be s t a t i s t i c a l l y independent of the r e s t , and to l e t Prob(x_^=0) = Prob(x_^=l) = 1/2. I f t h i s assignment of input p r o b a b i l i t i e s i s made, then we observe that each input s i g n a l configuration i s equally l i k e l y , N with a p r o b a b i l i t y of 1/2 . The p r o b a b i l i t y d i s t r i b u t i o n f o r y i s then f u l l y determined from the truth table, namely Prob(y=0) = (no. of zeros i n the N y-column)/2 . S i m i l a r l y , the p r o b a b i l i t y d i s t r i b u t i o n f o r any random vector, who se v a r i a b l e s are i n the set {x ,....,x^,y} i s f u l l y determined, and can be 3 evaluated from the tr u t h table. In the example of F i g . 2.1, there are 2 = 8 possi b l e random vectors, of which those containing y are n o n - t r i v i a l . For example; Prob(y=0) = 3/2 2 = 3/4, Prob(x i y=10) = 2/2 2 = 1/2, etc. N+1 It i s convenient to l a b e l uniquely each of the 2 random vectors. Since.we are concerned with the information aspects of such a vector, i t w i l l be c a l l e d an information "source", S.. The subscript j i s a binary integer; th i f there i s a "one" i n the i p o s i t i o n , i t indicates the presence i n the source of the i ^ v a r i a b l e i n {x^,. . . ,x_^,. . . ,x^,y}. For example,.the vector (x^,y) for a 3-variable function y = f(x^,x^,x^) i s the source SQ^Q^. We say that the order of a source S. i s the number of "ones" i n i . J 6 x l X2 y 0 0 l 0 1 0 1 0 0 1 1 0 F i g . 2.1. - Example of a truth table f o r a function y = f ( x ,x ). For each random vector, or source, of order K, there are 2 possible s i g n a l configurations, and we can compute 2 corresponding p r o b a b i l i t i e s by inspection of the truth table. I t i s more convenient to use p r o b a b i l i t i e s m u l t i p l i e d by 2^, g i v i n g i n t e g r a l q u a n t i t i e s c a l l e d "weights". For a K ^ - o r d e r source S., then, we define a set P. of 2 weights. J J The notation adopted for the i n d i v i d u a l weights i s chosen so as to characterise them uniquely. The weight p^ has an (N+l)-digit subscript formed from the symbols: {0,1,-}. I f f p_^ i s a member of the set P^ , then a "0" i n j corresponds to a "-" i n i . The remaining symbols i n i , zeros and ones only, correspond to one of the s i g n a l values taken by the source S_. . The magnitude of the weight p^ i s the number of rows of the truth table which a r e the same as i , reading the "-" d i g i t s i n i as e i t h e r zeros or ones i n the truth table. The notation appears cumbersome, but i s e a s i l y readable i n p r a c t i c e . In F i g . 2.1, we have; P101 = { p0-0' P0-l> P l - 0 ' P l - 1 } = t 1 ' 1 ' 2 ' 0 ^ So f a r , then, we have modelled a switching function as a set of N+1 binary random v a r i a b l e s . Using the truth table representation, we have developed a notation for weight d i s t r i b u t i o n s , given a p a r t i c u l a r but n a t u r a l d i s t r i b u t i o n f o r the N input v a r i a b l e s . The reason for the d i s t r i b u t i o n i s discussed i n Section 3.1. 7 2.2 Information Content or Entropy In our model of y - f ( x ^ , x ^ ) , we have i d e n t i f i e d 2 random vectors or sources, and have obtained weight d i s t r i b u t i o n s f or them. Using these d i s t r i b u t i o n s , we can derive the j o i n t information content, or entropy, th of the corresponding sources. For a K -order source S^, with weight d i s t r i -bution P., we define the source entropy or information content, H., of S. as 3 3 3 follows: H. = N — ( p p . l o g 2 ( p ) ) / 2 N b i t s . (2.1) This d e f i n i t i o n i s equivalent to the standard d e f i n i t i o n s (see for example (13), p l 4 ) , except for the normalisation terms, (those i n N), due to the use of weights rather than p r o b a b i l i t i e s . In the example of F i g . 2.1, we 3 have the 2 = 8 possible source entropies: Hooo = 0.000 H001 = 0.811 Hoio = 1.000 H011 = 1.500 Hioo = 1.00 H101 = 1.500 H110 = 2.00 H l l l = 2.000. Of these q u a n t i t i e s , {HQQ,(), ^ 010' ^100' ^110^ a r e t v x v x a x • Since they only measure the information content of input v a r i a b l e s , whose p r o b a b i l i t i e s were assigned independently of the function, these values are independent of the function. In f a c t , i f the l a s t d i g i t of j i n i s zero, then the value of H_. equals the number of ones i n j . S i m i l a r l y , since whenever a l l the input v a r i a b l e s are s p e c i f i e d , the output function becomes f i x e d , there i s no d i f f e r e n c e i n the information content of and S^^, whatever the function. Hence ^110 = ^111' t* 1^ s c a s e a n c * i-n general. N The remaining 2 -1 entropies, (^QQ-^J Hoil ' ^101' ^ N T * 1 6 E X A M P - L E ^ > ^ ° depend on the function, and from them alone, many useful properties of the function may be derived. (See Sections 3 and 4). 8 2.3 Manipulations with Entropies An important reason f o r developing t h i s model i s that i t suggests analogies, which lead to theorems, v e r i f i a b l e a n a l y t i c a l l y . The source entropies i n p a r t i c u l a r are r i c h sources of conjecture, and th i s Subsection gives ways of manipulating and i n t e r p r e t i n g the entropies i n forms that have been found useful i n Information Theory and that w i l l be us e f u l i n Section 4. I t i s convenient f i r s t , however, to in d i c a t e an a l t e r n a t i v e to the s o u r c e - l a b e l l i n g notation. The one used so f a r i s rigorous but somewhat clumsy, and so, when no confusion a r i s e s , the subscript f o r S, P, and H may l i s t the source v a r i a b l e s d i r e c t l y . For example; Hioo ° » H o n = ' Hioon = ^ I V ' E T C > The source entropy may be thought of as the average uncertainty i n pr e d i c t i n g the values of the source v a r i a b l e s at any time (given only the input p r o b a b i l i t i e s , but not t h e i r values). The maximum value that Hy for any func-t i o n y can take i s 1.0 b i t . This i s when the p r o b a b i l i t y that y i s zero i s 1/2, or when we are most uncertain of the value of y. Thus we have an i n t u i t i v e i n t e r p r e t a t i o n f o r source entropy. Suppose now that we speci f y a value f o r the input x^. I n t u i t i v e l y , we expect that we should be able to p r e d i c t a value for y with l e s s uncertainty. We therefore define the quantity Hy/x^, known as the equivocation of y with respect to x^. I t i s defined as: Ey/x± = rh^y - tixv (2.2) and i t can be shown (13) that: H y / x ^ H y . (2.3) We therefore i n t e r p r e t Hy/x^ as the average uncertainty i n y, when we know the value of x^. We may c a l c u l a t e the equivocation of any source with respect to any d i s j o i n t source, ( i . e . when the two sources have no variables i n common), 9 and t h i s concept i s used i n Section 4. We may s i m i l a r l y define the dif f e r e n c e between Hy and Hy/x^ as the mutual information of y and x^. This i s the quantity of information common to both v a r i a b l e s (or sources, i n general). I t i s denoted by Ix^:y or Iy:x^, since i t i s symmetric i n y and x^. Its value, always >_0, i s given by: Iy:x 1 = I x ^ y = Hy - Hy/x^^ = Ex± - Hx^y (2.4) = Hx 1 + Hy - Hx y. We may s i m i l a r l y define q u a n t i t i e s such as the mutual information of y and x^, given x^. This i s : I y : x 1 / x 2 = Hy/x 2 - Hy/x x = Hx 2y + H X ; Lx 2 - Hx x y - Hx 2. (2.5) The mutual information of three or more d i s j o i n t sources may also be defined, by extension from equation 2.4. I t should be noted that, f o r more than two sources, the mutual information may be negative. This i s the case f o r the function of F i g . 2.1, where: I X ] L:x 2:y = Ho^ + Hx 2 + Hy - H x ^ - H x ^ - Hx 2y + H x ^ y (2.6) = 1.0 + 1.0 + 0.811 - 2.0 - 1.5 - 1.5 + 2.0 = -0.189 b i t , i n t h i s case. The extension of equations 2.2-2.6 to cases where m u l t i v a r i a b l e sources replace s i n g l e v a r i a b l e s i s straightforward; f o r two d i s j o i n t sources, A andB,HA/B = HAB - HB, etc.. A convenient way to v i s u a l i s e the arithmetic p a r t i t i o n i n g of the N b i t s of input information i s analogous to the Venn Diagram of conventional switching theory. For the example of F i g . 2.1, such an information p a r t i t i o n i s given i n F i g . 2.2. 10 Area Quantity Represented Value i n terms of Source Entropies. Value f o r function i n F i g . 2.1. 1 H x l / X 2 Y H x l X 2 Y ~ H x 2 Y 0.500 b i t 2 Hx 2/x 1y H X 1 X 2 Y " H X 1 Y 0.500 " 3 H y y / x l X 2 Hx 1x 2y - H x1 x 2 = 0 0.000 " 4 . Ix :x /y - H x ^ + Hx 2y - H x ^ y - Hy 0.189 " 5 I x 2 : y / x 1 Hx 1x 2 + H x ^ - H x ^ y - Hx.^ 0.500 " 6 l x i : y / x 2 Hx 1x 2 + Hx 2y - H x ^ y - Hx 2 0.500 " 7 I x 1 : x 2 : y Hx 1 + Hx2 + Hy - Hx.^ - Hx^y - Hx 2y + Hx^x 2y -0.189 " F i g . 2.2 - P a r t i t i o n of Input Information Source Entropy Value i s the sum of areas: Hx^ ^ 1, 4, 6, 7 Hx 2 3, 4, 5, 7 Hy 3, 5, 6, 7 H X 1 X 2 1, 2, 4, 5, 6, 7 Hx l Y 1, 3, 4, 5, 6, 7 Hx 2y 2, 3, 4, 5, 6, 7 H x i x 2 y 1, 2, 3, 4, 5, 6, 7 11 3. PCN EQUIVALENCE CLASSES 3.1 D e f i n i t i o n of Equivalence Classes 2 N The complete set of 2 switching functions of N v a r i a b l e s i s very large for even moderate values of N. We therefore seek to p a r t i t i o n the complete set i n t o smaller d i s j o i n t subsets, i n the hope that some statement made about one function i n a subset applies to each member of the subset, but to no function outside the subset. I f t h i s can be done, then the subsets are "equivalence classes". The statement r e f e r r e d to may be a property, such as t o t a l symmetry, which p a r t i t i o n s the complete set into a small subset (symmetric functions) and a large one (asymmetric fu n c t i o n s ) . Such a p a r t i t i o n i s u s e f u l only i n dealing with the symmetric functions, since asymmetry i s not a useful property. A l t e r n a t i v e l y , we may define a class of transformations to be applied to functions, any member of which transforms a function i n t o another member of the same equivalence class as a function. A p a r t i c u l a r l y u s e f u l such class i s the group of Permutation, Complementation, and Negation transformations (PCN Transformations). Permutation r e f e r s to a permutation of the l a b e l s on the input v a r i a b l e s , Complementation to complementing some of the input v a r i a b l e s , and Negation to complementing the output function. In the example of F i g . 2.2, Permutation of x^ and x^ leaves the function unchanged, but Complementation of some of ( x ^ , X 2 ) , with or without Negation of the function y, generates a class of eight d i f f e r e n t functions. This class i s therefore an equivalence class under PCN Transformations. This p a r t i t i o n into PCN equivalence classes i s p a r t i c u l a r l y u s e f u l for c i r c u i t design, since any r e a l i s a t i o n for a function y becomes a r e a l i s a t i o n for any function equivalent to y, when the inputs and outputs are r e l a b e l l e d according to a s u i t a b l e PCN Transformation. Also, since there are a t o t a l of N+1 (N!)*2 PCN Transformations, the equivalence classes are large i n many cases. 12 The number of equivalence classes i s compared to the s i z e of the complete set of functions i n F i g . 3.1. N 2N 2 No. of PCN Classes 0 2 1 1 4 2 2 16 4 3 256 14 4 65536 222 5 9 4xl(T 616126 F i g . 3.1 - Number of PCN Equivalence Classes. Given some function, we wish to know to which PCN equivalence class i t belongs. A set of q u a n t i t i e s , derived from the function, are c a l l e d charac-t e r i s i n g i n v a r i a n t s i f f the set has the same value f o r a l l functions i n any p a r t i c u l a r equivalence c l a s s , a value d i f f e r e n t from that of any function not i n the equivalence c l a s s . Several c h a r a c t e r i s i n g i n v a r i a n t sets are po s s i b l e . (14, 15, 16). N We wish to show now that the set of 2 n o n - t r i v i a l source entropies of a function are c h a r a c t e r i s i n g i n v a r i a n t s f o r the equivalence classes induced by PCN transformations. We use the binary subscripts {j} to l a b e l ^ ^ } , {P_.} etc., i n the subsequent Section (see Section 2.1). I t i s found that i f we adopt some other p r o b a b i l i t y d i s t r i b u t i o n f o r the input signals than the "maximum information" one we have been using, Theorem. 1 below i s no longer generally true. We can,for example, make a function's entropies the same as one of i t s p a r t i a l functions, by giving h a l f the input s i g n a l configurations a zero p r o b a b i l i t y . 13 3.2 Source Entropies as Invariants of PCN Equivalence Classes THEOREM 1 The set of 2 N source entropies {H ; j odd} i s a set of c h a r a c t e r i s i n g i n v a r i a n t s of equivalence classes induced by the Group of PCN Transformations. The proof of Theorem 1 i s rather long, and i s s p l i t i n t o three parts, as below. 3.2.1 To show that {11A i s i n v a r i a n t under any PCN Transformation Any H, i s computed as a symmetric function of the weights i n P., (see 3 -3 equation 2.1). I t i s c l e a r that complementing operations on any x_^ i n {x_^ ; i=l,N} or negation of the function y simply permute the elements i n P . For example, complementation of x^ i n F i g . 2.1 induces the permutation ((p_oo' p - i o ^ ' ^ p - o i ' p - n ^ l n t h e s e t p o i r S u c h P e r m u t a t i o n s d o n o t c h a n § e the value of H.. J I f the v a r i a b l e s {x_^ ; i=l,N} are permuted, i t i s also evident that t h i s corresponds to a permutation of the subscripts {j} i n the set tH }. Hence the set {H_. } remains i n v a r i a n t under any PCN Transformation, to within a permutation of the elements of {^j}-3.2.2 To show that {P?} characterises at most one Class 3 We define P V as an unordered set, whose elements are those of the 3 ordered set P . Since H_. contains no information about the ordering of weights i n P., we must consider {P?}, rather than {P.}. 3 3 3 Consider any permutation of the v a r i a b l e s {x_^ ; i=l,N}. This i s equivalent to a permutation of the second-order weight sets i n {PV}. I t i s c l e a r , also from the subscript notation f o r these sets, that the converse i s true; namely that any permutation of the second-order sets, together with a consistent permutation of the higher-order sets within i s equivalent to some permutation of the input v a r i a b l e s . 14 Consider now transformations i n v o l v i n g complementation of the input v a r i a b l e s and negation of the function, (CN Transformations). Define T as N+1 the set of a l l sources, {S^; j=0, 2 -1}. Both t r i v i a l and n o n - t r i v i a l sources are included i n T. For each source S^. i n T, we may define a " l i n e a r function weight", W., as follows; th i f i s the i binary d i g i t of j, then: fj = ^/V 9 ( j 2 A X 2 ) ® ® ( W ® ( j N + l A y ) * ( 3 , 1 ) f^ i s a " l i n e a r f u n c t i o n " and W i s i t s "weight", ( i . e . the number of "ones" i n the column for f . , when f. i s given, by a truth table, as a function of — N {x.; i=l,N}). We also define W. as equal to 2 - W.. ({A,v,©} are the Boolean i J J operators {AND, OR, EXCLUSIVE OR}, r e s p e c t i v e l y . ) We say that a source S eT i s a subsource of S.eT i f f every switching th K v a r i a b l e i n S i s also i n S.. I f S. i s of K -order, there are 2 subsources of m 3 3 S.. W , the l i n e a r function weight f o r a subsource S of S. may be expressed 3 m m 3 as the arithmetic sum of h a l f the weights i n P.; W i s then the sum of the 3 m remainder. For example, consider the subsource of source i n F i g . 2.1. We have; W001 = W e i § h t o f ((° A x!) ® (0/\x2) © (lAy)) = Weight of (y) = 1 - <P0-I + Pi-i>» f r o m p i o r a s s o o i c sioi> a n d ^001 = (po-o + p i - o } = 3 ' We see that a weight p. i s included i n the sum for W i f those binary l m d i g i t s of i corresponding to the "ones" i n m sum a r i t h m e t i c a l l y to an odd integer. I f p. i s not included i n the sum for W , then i t i s for W . Consider l m m the quantity: G - Z b j S C S. (W .W ). (3.2) m j m m Since W and W are written as sums of d i f f e r e n t weights, then G. w i l l be m m 3 K-'1 K expressible as some sum of weight product-pairs, of which there are (2 )•(2 -1) We w i l l count the number of times an a r b i t r a r y product p a i r (p »p ) x y appears i n the sum G^. Let x and y d i f f e r i n d d i g i t s . Then (p x'P^) appears i n the sum for (W -W ) i f there are an odd number of "ones" i n m corresponding m m to those d d i g i t p o s i t i o n s . For example, i f x = -000-001, y = -111-011, then (p .p ) i s i n the sum f o r (W -W ) when m =01000101, but not for m = 01100101 r x y m m etc. We note that the n u l l source S n n has Wn „ = 0, and contributes nothing to G.: i t i s included f o r n o t a t i o n a l convenience only. J Since (p «p ) does not appear i n the sum for (W -W ) i f there i s x y m m an even number of "ones" i n the d d i g i t p o s i t i o n s , and since each of the 2 possible q u a n t i t i e s (W -W ) are included i n G., i f we include (W_, „.W„ ..), m m j 0...00..0 then the number of "odd" cases i s the same as the number of "even" cases. K—1 Thus (p -p ) appears i n the (2 ) "odd" cases. Since x, y and d were x y a r b i t r a r y , the same applies to each product-pair, and we have that: G. = J. (W -W ) •j S C. S. m m m J = ( 2 K _ 1 ) - ( Z <P *P )) 5 P V>P V e V (3.3) x^y y y 2 We now observe that, for the t r i v i a l sources {S ; m even}, m — N-1 W = W = 2 , independent of the function, due to the truth table construction m m and input p r o b a b i l i t y assignment. The exception i s = 0. The sources, {S ; m odd}, have weights {W ; m odd} which are i d e n t i f i e d as the l i n e a r m m function weights used by Golomb (14), to f i n d a set of i n v a r i a n t s f o r PCN equivalence classes. For each of these, we may define: N-1 2 N-1 2 — a = W - 2 ; a = (2 ) - (W -W ) ; m odd. (3.4) m m m m m — N-1 2 — Putting (W «W ) = (2 ) for m even, (except W. A'W„ _ = 0) i n equation 3.3., m m U • . U u.. U we obtain: E — K-1 N-1 2 G. = _ (W .W ) + (2* -1).(2 N V 5 m odd. i b a. m m = (2 K- X) • E (p -p ) x^y x y . 2. K-1 _ N ,ON~1N2 , / 0K-lv , 0N-ls2 = - E , , (a ) + (2 -l)-(2 ) + (2 )-(2 ) m odd m H e n C e ; S J S. («2) - ( 2 K - 1 ) . ( 2 B - 1 ) 2 m j m m odd + (2 K 1 ) . E (p -p ) = 0; p ,p e P.. x^ y x y x y J (3.5) We now notice that E(p *p ) i s independent of the ordering within x y 2 u P.. Hence we may obtain (a ) from P.. By successive a p p l i c a t i o n s of equation J m J 3.5 to{ P U; S cS.}, we obtain values f o r each quantity a i n {a ; S c S.}. m m j - ^ m m m j xx 2 Therefore from the set {P}, we obtain a set of values f o r the set {cu}. 2 To show that knowledge of {a }fo r a function i s s u f f i c i e n t to determine j the equivalence class of the function under CN Transformations, we state and prove Theorem 2. 2 N THEOREM 2 {cu; j = 2.M-1; M = 1,2 } i s a set of cha r a c t e r i s i n g i n v a r i a n t s f o r the CN equivalence classes. (Proof: see Appendix 1.) Hence, from equation 3.5 and Theorem 2, we have that the set of unordered weight sets (Pj} characterise at most one PCN equivalence c l a s s . 3.2.3 To show that f o r a given set {H.}, there e x i s t s at most one {P?} th The weights i n each K -order set P a^e not independent of each other, and of lower-order weights. The constraints are as follows: p. . n . + P. . , . 1 m N+1 1 m N+1 p i , . . . i - . . . i M , , ' (3.6) 1 m N+1 1/ where p. . . i s a weight of lower order. 1 l ' , , 1 m ••,1N+1 Also: p . n + p 1 - 2 . (3.7) This l a t t e r r e s u l t i s true because each row of the truth table has the same p r o b a b i l i t y of occurrence. These constraints may be combined, so as to enable each of the weights i n any P to be written as a l i n e a r arithmetic function of one weight i n P and one weight from each of the lower-order weight sets P^, such that S C S., m^j. From the example of F i g . 2.1: P101 = { p0-0' P 0 - l ' P l - 0 ' P l - 1 } ' and using equations 3.6 and 3.7, 2 2 " 2 + 1 = 2 P0-0 + • po-i : p i - o + p l - l ! po-o + p i - o : p 0 - l + p l - l P-0 + p - l p i o i { po- -0' ( 2 p - l .2-1+1 = 4 . 'o-o/5 V F — o Fo-o" v" H—tVo-cr-" I t follows that {P^} i s determined when we know the f i r s t weight ( i . e . p. such that i contains no " o n e s " ) , from each P . We note further, from Section 3.2.2 above, that any ordering of weights within each P^ to give some Pj which i s consistent with the constraints of equations 3.6 and 3.7, produces some {P.} which represents a function i n the equivalence class characterised by {P U}. 3 Hence i t s u f f i c e s to show now that there i s only one P^ corresponding to H., given the ordered weight sets {P ; S c S.; m=H}. 3 mm j For any weight p i n P*f, we can write: 18 Pj = {(a.-p), ( 2 N K + 1 - a . + P ) ; i = l , 2 K 1 } , (3.8) where the set {a_^ } consists of l i n e a r arithmetic combinations of lower-order weights only. We can then write H. as a function of p. 2K-1 2K-1 H. = N - ( E (a.-p) log (a.-p) + E (2 N~ K + 1-a.+p). 3 i = l i = l l o g 2 ( 2 N " K + 1 - a i + p ) ) / 2 N . (3.9) P a r t i a l l y d i f f e r e n t i a t i n g H^. twice with respect to p, we obtain: 2K-1 H'.' = (1/2 N). (-1/log 2).( E ( 2 N " K + 1 / ( a . - p ) . ( 2 N _ K + 1 - a +p))). J 6 i = l 1 1 Since each weight (a^-p) or (2^ ^ +^-a^+p) i s a non-negative integer, we can write: H^ ' = (A). (-B). (C) , where A,B,C > 0. (3.10) Thus equation 3.10 ind i c a t e s that H^ ' < 0, for a l l p, i n turn i n d i c a t i n g that p versus H. i s at most a double-valued function. There can be at most 3 two switching functions, a and b say, which have the same R\. , and the same ordered weight sets of order less than K, but which d i f f e r i n the unordered weight sets, Pj(a) a n ^ P j ( b ) , that give t h i s R\.. Consider now the new switching function c of the N+1 va r i a b l e s {XQ,X^,...,x^}, such that c = (x^Aa) v (x^Ab). There w i l l be a (K+l) -order weight set (c) of c such that: P".(c) = {P U(a), P U ( b ) } . I j 3 3 N+1 Hence, H (c) = N+1 - ( E p.log 2p)/(2 ), by d e f i n i t i o n , N+1 or, H (c) = N+1 - ( E p.log„p)/(2 ) = H. + 1. 2 peP.(a) 1 J 3 or P.(b) 3 We may also construct the functions d = (x^Aa) V (x^Aa) and 19 e = (XQAD)v(x^Ab) and t h e i r corresponding weight sets: P?.(d) = {P U(a), P U(a)}; (e) = [P U(b), P U ( b ) } . We now observe that p" ( c ) , P^ _. (d) , P^ _. (e) are d i f f e r e n t from each other, since P U(a) 4 P U(b),but H. .(c) = H. . (d) = H, .(e) = H. + 1. We therefore have threi j j • l j l j l J 1 d i f f e r e n t weight sets g i v i n g the same source entropy, c o n t r a d i c t i n g equation 3.10. We must conclude that there can be at most one set P? for any H., J J given the ordered weight sets of lower order than P^. By i t e r a t i n g t h i s argument, we see that there i s only one set {P^} for any set {^}. Putting the three parts of the proof together, we see that Section 3.2.1 demonstrates the invariance. of {H^ . } under PCN Transformations, and Sections 3.2.2 and 3.2.3 together show that {H^ .} characterises one and only one equivalence c l a s s . Hence Theorem 1 i s proved. 3.3 Representative Functions f o r PCN Equivalence Classes The set of i n v a r i a n t s due to Golomb (14) and Harrison (15), which were used i n Section 3.2 f o r the proof of Theorem 1, have two p r i n c i p a l disadvantages. The f i r s t i s subjective, namely that p h y s i c a l s i g n i f i c a n c e cannot e a s i l y be attached to them. The second i s that a f a i r l y complicated algorithm i s required to compute them. The source entropy i n v a r i a n t s have p h y s i c a l s i g n i f i c a n c e and are more e a s i l y computed, but t h i s l a t t e r fact introduces a d i f f e r e n t problem. Having computed the source entropies and i d e n t i f i e d the equivalence class to which a function belongs, we do not know which PCN transformation to apply to the function to get the representative function which we i d e n t i f y with the equivalence c l a s s . This problem i s l e s s serious for the Golomb invari a n t s as the steps required to obtain the i n v a r i a n t s give the required PCN transformation as w e l l . A s o l u t i o n to t h i s problem, when i t cannot be solved by inspection, N uses a set of weights from {P.}. We form the vector P of 2 weights, one from 20 each n o n - t r i v i a l P., such that the subscript i of p. i n P contains no zeros. The vector P can e a s i l y be computed from the function y by a s i n g l e matrix m u l t i p l i c a t i o n . In the example of F i g . 2.1, we compute P by: p = P - l = 1 ---- 1 1 1 1 X 1 P-11 0 0 1 0 1 0 p l - l 0 0 0 1 1 0 . p l l l . 0 0 0 0 1 0 where the function y i s represented by i t s truth table column, i n t h i s case [1,0,0,0] T. The vector P characterises a function, since the matrix i s non-s i n g u l a r , and so i t remains to manipulate P i n t o the corresponding vector f o r the representative function from the equivalence c l a s s . I f we choose as the representative function the one which has the smallest value f o r the f i r s t element of P, and to break t i e s , the smallest value f o r the second element, etc., then we can use e s s e n t i a l l y the same algorithm as the one used by Harrison (15, p 164), to compute the Golomb i n v a r i a n t s . The representative function fo r the equivalence class of y above i s c l e a r l y y i t s e l f . ALGORITHM 1. (Harrison). To f i n d a PCN Transformation which c a r r i e s an a r b i t r a r y function i n t o the representative function for i t s equivalence c l a s s . (1) Compute the vector P f o r the function y. (2) I f the f i r s t element of P >2^ +\ replace each K ^ - o r d e r weight p_^ i n P by (2^ K+"^-p^) • This i s Negation of y. N-2 N-1 (3) I f any second-order weight p^ i s >2 , replace i t by (2 -p^) • Also replace a l l higher-order weights i n v o l v i n g the v a r i a b l e associated with th p_^ by t h e i r "complement" weights. (I.e. a K -order weight p i s replaced by (2^ ^ +^-p).) This i s Complementation of the input v a r i a b l e associated with p. (4) Permute the second-order weights so that they are i n ascending order. Apply t h i s permutation c o n s i s t e n t l y to those higher-order weights i n v o l v i n g the same v a r i a b l e s . This i s a Permutation transformation. (5) I f some of the second-order weights are the same, t r y a l l permutations of these weights, to obtain the smallest ordered sequence of weights i n each of the higher orders, considering the third-order weights f i r s t . N-2 (6) I f some of the second-order weights are 2 , try a l l complementations of these v a r i a b l e s to obtain the minimal sequences, as i n (5). (7) The vector P' r e s u l t i n g from these manipulations i s the weight vector • of the representative function, and t r a c i n g the Permutation, Complementation and Negation steps required above gives a s u i t a b l e PCN Transformation f o r y in t o the representative function. Since Algorithm 1 i s v i r t u a l l y i d e n t i c a l to the Harrison algorithm, whereas the computation of source entropies i s by formula rather than by t algorithm , i t would seem that the source entropy representation of equivalence classes has some advantages over that of Golomb and Harrison. The source entropy i n v a r i a n t s f o r the fourteen 3-variable functions, together with the representative functions i s given i n Tables 3.1 and 3.2. t ( I t i s r e l a t i v e l y simple to permute the source entropies i n t o the ordering used when l i s t i n g the i n v a r i a n t s , namely ascending order of second-order entropies.) Equ. Class No. x l x 2 x 3 1 2 3 4 5 6 7 8 9 10 11 12 13 14 0 0 0 Input 0 1 1 1 0 0 1 1 1 1 1 0 0 1 0 0 1 Variables 0 0 0 0 1 0 1 0 0 1 0 1 1 0 0 0 ( ) 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 1 1 Rep, Fn. 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 y = f ( x 1 _ 3 ) i 0 0 1 0 0 1 0 0 1 1 1 1 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 1 0 1 1 1 1 0 0 0 0 1 1 1 1 1 1 0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Elements of P 0 1 2 2 2 3 3 3 4 4 4 4 4 4 P — 1 C h a r a c t e r i s t i c 0 0 0 0 1 0 1 1 0 1 1 1 2 2 P-11 weight vector P 0 0 0 1 1 2 1 1 2 1 1 2 2 2 P-l-1 of y 1 0 0 1 1 1 2 1 2 2 1 2 2 2 2 P l - 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 P - l l l 0 0 0 0 0 0 0 1 0 0 1 0 1 1 P l - l l 0 0 0 1 1 1 1 1 1 0 0 1 1 1 P l l - 1 0 0 0 0 0 0 0 0 0 0 0 0 0 P l l l l Table 3.1 - Representative Functions and t h e i r C h a r a c t e r i s t i c Vectors for 3-Variable PCN Equivalence Classes. Equiv. Source Entropy Invariants Class H0001 H0011 H0101 H1001 H0111 H1011 H1101 H l l l l 1 0.000 1.000 1.000 1.000 2.000 2.000 2.000 3.000 2 0.544 1.406 1.406 1.406 . 2.250 2.250 2.250 3.000 3 0.811 1.500 1.500 1.811 2.000 2.500 2.500 3.000 4 0.811 1.500 1.811 1.811 2.500 2.500 2.500 3.000 5 0.811 1.811 1.811 1.811 2.500 2.500 2.500 3.000 6 0.954 1.406 1.906 1.906 2.250 2.250 2.750 3.000 7 0.954 1.906 1.906 1.906 2.750 2.750 2.250 3.000 8 0.954 1.906 1.906 1.906 2.750 2.750 2.750 3.000 9 1.000 1.000 2.000 2.000 2.000 2.000 3.000 3.000 10 1.000 1.811 1.811 1.811 2.500 2.500 2.500 3.000 11 1.000 1.811 1.811 2.000 2.500 2.500 2.500 3.000 12 1.000 1.811 2.000 2.000 2.500 2.500 3.000 3.000 13 1.000 2.000 2.000 2.000 2.000 3.000 3.000 3.000 14 1.000 2.000 2.000 2.000 3.000 3.000 3.000 3.000 Table 3.2 - Source Entropy Invariants f o r 3-Variable PCN Equivalence Classes. 4 . ANALYSIS OF FUNCTIONS USING ENTROPIES 4 . 1 Symmetry If x. denotes one of (x., x.), then y = f (x_ ,. . . ,x„) i s p a r t i a l l y 1 x i I N • • symmetric i n the i v a r i a b l e s (x^,...,x ) i f f any permutation of the v a r i a b l e s (x^,...,x ) leaves the function y unchanged.. P a r t i a l l y symmetric functions, p a r t i c u l a r l y t o t a l l y symmetric functions (for which i = N), have been extensively studied ( 1 7 ) , s ince, although they are sparse^, they often occur i n design problems. Algorithms e x i s t to i d e n t i f y the v a r i a b l e s of symmetry, and whether they are complemented or not ( 4 ) . P a r t i a l symmetry i s not obvious, i n many cases, from the Boolean Expression for a function. THEOREM 3 A necessary condition for a function y to be p a r t i a l l y symmetric • • • • i n (x^,.-. .,x ) i s that, f o r any permutation Q of (x^,...,x^) H = **Q(J)> f ° r aH i» w n e r e Q(j) i s the permutation Q as applied to the v a r i a b l e s i n j where applica b l e . For example, the representative function for Equivalence Class 3 i n Table 3.1 i s p a r t i a l l y symmetric i n ( x ^ j X ^ ) , and has H Q Q - I ^ = HQ-^QJ AN^ H ^ Q ^ ^ = H ^ Q ^ . The proof of Theorem 3 i s obvious, since i f a permutation of v a r i a b l e s leaves y unchanged, then the same permutation of subscript labels must leave the source entropies unchanged. We note that Theorem 3 i n only a necessary condition. In f a c t Equivalence Class 1 1 i n Table 3 . 1 contains t o t a l l y asymmetric functions, but has H Q Q ^ = H Q ^ Q ^ and ^ Q ^ I = ^ n o i " ^ 6 that for t h i s Equivalence Class, the representative function i s y = fix^yX^fX^) = f ( x ^ , x ^ , X 2 ) ; i . e . the permutation ( X2' X3^ i s equivalent to the complementation of x^. I t i s c l e a r that whenever the condition of Theorem 3 i s s a t i s f i e d f or a function, we can perform a permutation on the relevant v a r i a b l e s , and obtain a new function C' 11 ~) 2 t(There are 2 functions which are p a r t i a l l y symmetric i n i of the N v a r i a b l e s . ) d i f f e r i n g from the old by at most a CN Transformation. By the same token, i f a function i s p a r t i a l l y symmetric, the e q u a l i t i e s among source entropies i n d i c a t e only the v a r i a b l e s of symmetry, but not whether they are complemented. Theorem 3, then, i s a weak but simply-applied i n d i c a t o r of p a r t i a l symmetry as defined. But the condition of Theorem 3, which we may term "H-indistinctness i n (x ,...,x )", i s us e f u l even f or asymmetric functions, being a form of more generalised symmetry. The example used above, from Equivalence Class 11, i s H - i n d i s t i n c t i n x^ and x^, and has a r e a l i s a t i o n i n NOR-gates given by F i g . 4.1. The "pseudo-symmetry" i n x^ and x^ i s r e a d i l y apparent. F i g . 4.1 - R e a l i s a t i o n of an H-i n d i s t i n c t function, showing "pseudo-symmetry". a a NOR-GATE Function. 4.2 Unate Cascade R e a l i s a b i l i t y A unate cascade i s a switching c i r c u i t with the topology of F i g . 4.2. *N_ XN-1 1 *N-2 1 I1 feN-2 y N"3 1 y -y = f (x 1,. . . J X J ^ ) F i g . 4.2 - Unate Cascade Topology. The elements of such a cascade are 2-input, 1-output gates. The function that a gate may have, e^ = e ^ ( y i j L _ i > e i - l ^ ' "*"s r e s t r i c t e d to the unate functions of two v a r i a b l e s , namely any 2-variable function except e^ = x^ © x^ and e = x^ & x^. One reason why unate cascades are of i n t e r e s t i s that, with the advent of Large Scale Integration i n t o switching c i r c u i t design, r e a l i s a t i o n s that are to p o l o g i c a l l y regular (or i t e r a t i v e ) become very a t t r a c t i v e . By r e s t r i c t i n g the gates to being unate functions, we can r e a l i s e each gate from a s i n g l e NOR- or NAND-gate, by a PCN Transformation on the gate terminals, or by leaving some terminals disconnected. I t i s c l e a r that most functions are not Unate Cascade Realisable (UCR). N-1 In f a c t , there are (2 +1) PCN Equivalence Classes which contain UCR functions of N v a r i a b l e s . However, a l l switching functions can be r e a l i s e d as a 2-dimensional array, of which the Unate Cascade i s the "building-block" (18). An algebraic t e s t for UCR functions e x i s t s (18), and i s algorithmic i n nature, g i v i n g the ordering of input v a r i a b l e s i n the cascade, (to give the form of F i g . 4.2), and the unate function required for each gate. We now give a test based simply on the magnitude of some of the source entropies, which ind i c a t e s whether a function i s UCR, and which gives the ordering of the va r i a b l e s . THEOREM 4 A function y i s UCR, as i n F i g . 4.2, i f f : Ex± Xj^y <_ K + 1/2 K, f o r each K: 0 <_ K <_ N. We begin the proof with some simple cases. For K = 0, the condition i s t r i v i a l l y true, since Hy <_ 1.0 f o r any function. S i m i l a r l y , i f K = N, Hx^....x^y = Hx^....x^ = N, for any function. If y i s a proper function of the f i r s t M < N var i a b l e s only, i . e . y i s independent of {x^^,. .. . ,x^}, then Hx^. . . . x ^ ^ y = Hx^....x M +j = M + J , for J = 0,....,N-M, again s a t i s f y i n g the 27 con d i t i o n of Theorem 4. So we need only consider those functions which are proper functions of N v a r i a b l e s . I f such a proper function i s UCR, we say that i t i s Proper Unate Cascade Realisable (PUCR). We now need a Lemma. LEMMA 1 A function y i s PUCR as i n F i g . 4.2, i f f f o r (1 <_ K <_ N-1) , the K set of 2 p a r t i a l functions with respect to {x^; i = 1,K} are each i d e n t i c a l l y 0 or 1, except f o r one n o n - t r i v i a l p a r t i a l function. To prove Lemma 1, consider F i g . 4.2, where y i s PUCR. C l e a r l y • • • • . • — y = (x^Ae^) or y = (x^Ve^), (where x = x or x), since these are the only PUCR functions of two v a r i a b l e s . Expanding i n t o p a r t i a l functions with respect to X l : y = (x Ae ) V ( ^ A O ) or y = (X^ AI) V ( y ^ ) . Hence we see that the poss i b l e p a i r s of p a r t i a l functions with respect • • • to x^ are {0, e^} only, where 0 = 0 or 1. S i m i l a r l y , the only n o n - t r i v i a l p a r t i a l function of e^ with respect to x^ i s e^, giving the possible second-* * * * order p a r t i a l functions of y with respect to ( x ^ , x^) as {0,0,Oje^} only. th Continuing i n th i s way, we i d e n t i f y e as the only n o n - t r i v i a l K -order p a r t i a l function of y with respect to ( x ^ , . . . , x ^ ) . The converse i s e a s i l y established, since we can construct a Proper Unate Cascade, given the one n o n - t r i v i a l p a r t i a l function at each l e v e l . Hence Lemma 1 i s proved. Returning to the proof of Theorem 4, consider a PUCR function y, r e a l i s a b l e as i n F i g . 4.2. I t has an ordered weight set: K+1 Px 1 r^y = {p.; i = 0,2 x - l } , (where the "-" d i g i t s i n i of p_^ are deleted for convenience). From the subscript notation for p^, we can i d e n t i f y each p a i r i n the grouping: {(P., P i + 1 ) ; i = 0,2,..,2* -2}, 28 as the f i r s t - o r d e r weight set f o r one of the 2 p a r t i a l functions of y with respect to (x ,....,x ). Since y i s PUCR, then by Lemma 1, each p a r t i a l function, 1 K bar one, i s i d e n t i c a l l y 0 or 1. For such a p a r t i a l function, i t s f i r s t - o r d e r N-K N-K weight set i s {0, 2 } or {2 ,0}. Let the one n o n - t r i v i a l p a r t i a l function N-K N-K have a f i r s t - o r d e r weight set {p, 2 -p}, (0 < p < 2 ). Then: 2K+1 Hx, x^y = N - ( I p. .log„(p.))/2 N 1 K i = Q i 2 i = N - ((2 K - l ) . ( 0 . 1 o g 2 0 + 2 N _ K . l o g 2 ( 2 N " K ) ) + p.log 2(p) + (2 -p ) . l o g 2 ( 2 -p))/2 K N-K N-K N = K + (N-K)/2* - ( P . l o g 2 (p) + (2 i N - p ) . l o g 2 ( 2 N - p ) ) / 2 W . N-K N-K Now Min.(p.log 2(p) + (2 -p ) . l o g 2 ( 2 -p)) = 2 . ( ( 2 N - K - 1 ) . l o g 2 ( 2 N - K - 1 ) ) - ( 2 N - K ) . W - K - 1 ) . Hence Hx, x v < K + (N-K)/2 K - ( 2 N _ K ) • (N-K-l)/2 N, 1 K -K Hx .....x y < K + 1/2 , as required. Now consider the converse case. We have a proper function y of N va r i a b l e s , which i s such that: K < HXl...x..y < K + 1/2 K, f o r K = 1, N-1. (4.1) th We have to show that the K -order p a r t i a l functions implied by the weight set Px^...x^y are a l l , bar one, t r i v i a l functions, f o r a l l K. The proof i s by induction on N-K. As the basis for the induction, consider K = N-1. Each element of Px^...x^_^y must be one of {0,1,2}, f o r any function. Further, grouping the elements i n pa i r s such that each p a i r i s the f i r s t - o r d e r weight set for a p a r t i a l function, the possible (unordered) pa i r s are {1,1} or {0,2}. N-1 Let Px^...x^y consist of (a) pair s {1,1} and (2 -a) pa i r s {0,2}. Then: 29 H x l " " X N - l y = N " ( a ,(° + 0> + ^ - ( O + 2.1og 2(2)))/2 N N-1 N = N - (2.(2™ - a ) ) / 2 N = (N-1) + a / 2 ( N _ 1 ) . Since (N-1) < Hx^...xN_^y <_ (N-1) + 1/2 ^ N ^ \ from equation 4.1, then a = 1 i s th the only s o l u t i o n . Hence there i s only one n o n - t r i v i a l (N-1) -order p a r t i a l function. For the induction step, suppose equation 4.1 i s true, and l e t the th (K+l) -order p a r t i a l functions each be t r i v i a l bar one. In constructing Px ...x y = {p.}, each p. i s the sum of some d i f f e r e n t p a i r of weights i n I K i I Px ^ ...x^_^y, as i n equation 3.6. Since we have only one n o n - t r i v i a l weight p a i r i n Px . ..x y, (p, (2^ ^ ^-p)) say, the remainder each being 1 jtvrl (0, 2 N K " S , then Px,...x y must consist of: , , „N-K-1 N-K-l x t N-K (1) one of e i t h e r : (p + 2 ,2 - p) or (p, 2 - p), n . ,0N-K-1 .N-K-l. (2) (a) p a i r s : ( 2 , 2 ) , (3) ( 2 K - a - 1) p a i r s : (0, 2 N " K ) . Since (0 < p < 2^ ^ "*"), f o r n o n - t r i v i a l i t y , (1) can be written as (r, 2^ ^ - r ) , N-K with (0 < r < 2 ). Hence: H x - ^ . ^ y = N - (R + a - ( 2 N _ K - ( N - K - l ) ) + ( 2 K - a - 1 ) . ( 2 N _ K . ( N - K ) ) ) / 2 N , where R = r.log2(r) + (2 -r).log2(2 - r ) . N-K N-K From the range f o r r, the range for R i s (2 -(N-K) > R > 2 '(N-K-l)). Hence: N-K N-K H > N - (2 -(N-K) + a. (2 -(N-K-l)) K N-K N + (2 -a-1) (2 i N ^.(N-K)))/2 W, or H > N - (2 K-(N-K) - a ) / 2 K K K That i s : Hx^.-.x^y > K + a/2 . But Hx^...x Ry <_ K + 1/2 , by the induction hypothesis. Hence a = 0 i s the only s o l u t i o n and there i s only one n o n - t r i v i a l N-K p a r t i a l function, (that with weight set {r, 2 - r } ) , i n Px ...x y. Theorem 4 i s therefore proved. Theorem 4 i s e a s i l y applied to the recognition of UCR functions, together with the ordering of v a r i a b l e s i n the cascade. For example, the representative function f o r Equivalence Class 6 i n Tables 3.1 and 3.2 has Hx^y = 1.406; Hx^^y = 2.250, i d e n t i f y i n g i t as a UCR function with an ordering of v a r i a b l e s (among others possible) of (x ^ j X ^ . x ^ ) . To complete the r e a l i s a t i o n of the function, i t i s only necessary to f i n d (from a set of fo u r ) , which proper unate elements to use i n the cascade. Theorem 4 was suggested by i n t u i t i v e consideration of the entropies. It seems reasonable that x^ i n F i g . 4.2 should be highly "relevant" to the function, i n turn suggesting a low value for Hx^y, etc. Similar considera-tions lead to Theorem 5, stated without proof, since i t i s less useful than Theorem 4. THEOREM 5 For a Unate Cascade as i n F i g . 4.2, H x i + l y — H x : i / ' 1 = 1>'-->N~1' 4.3 Monotonic and Unate Functions A function y = f(x^,....,x ^) i s said to monotonic i n x^ i f i t can be written i n the form: y = (x 1Ag(x 2,. . . . X j ^ ) ) V h ( x 2 , This condition can be stated i n several other ways, in c l u d i n g the following: y i s monotonic i n x^ i f f f o r every row i n the truth table of the form (0v 2v^..•v^l) there i s a row of the form ( l v 2 v ^ . . .v^l) , where ( V2 V3* • • V J T ) 1 S a s e t °f values f o r ( x 2 x ^ . . . x N ) . I t follows that f o r every row (lv,.,.. .v^O) there i s a row (Ov^.-v^O). Monotonicity i n any other v a r i a b l e , i n c l u d i n g x^, i s s i m i l a r l y defined. 31 Since monotonic functions are often more e a s i l y r e a l i s e d when t h i s property i s recognised (15), i t i s desirable to be able to test f or monotonicity. I f monotonicity i s to be r e f l e c t e d i n the source entropies, i t seems reasonable, from the grouping of v a r i a b l e s i n the d e f i n i t i o n , that such a t e s t would involve the entropies {Hx^y, Hy, Hx 2...x Ny, H x ^ . ^ y } . The d e f i n i t i o n i t s e l f suggests that y i s strongly dependent on x , so that we might expect a r e l a t i v e l y small value for Hx^y. Such conjectures lead to the theorems below. THEOREM 6. I f a function y = f ( x 1 > ,x N) i s monotonic i n or then I x 1 : x 2 > . .x^/y <_ log 2(5/4) , where: I x 1 : x 2 . . x N / y = Hx i y + H x ^ . x ^ - H x ^ . ^ y - Hy. To prove Theorem 6, we consider the truth table for an a r b i t r a r y function y = f ( x ^ , . . . , x ^ ) . To bring out any monotonic stru c t u r e , the rows are grouped and reordered as i n F i g . 4.3. X l x2'*' XN y a rows of the form 0 V a 0 b " " " 0 Vb 0 c " " " " 0 V c l d " " " 0 V d l a " " " " 1 V a 0 b " " " 1 Vb I c " " " " 1 V c l d " " 11 1 V d 0 F i g . 4.3 - Truth Table S t r u c t u r e , where: (1) V a i s a set of (a) sets of (N-1) values assigned to (x 2 >...,x ), etc. (2) (a) i s the number of p a i r s of rows of the truth table of the form: ((Ov 2...v N0), ( l v 2 . . . v N 0 ) ) t (3) a + b + c + d = 2 N _ 1 = M, for convenience. t V a contains those sets of values for (x 2...x N) for which the p a r t i a l functions, y 0 and y-j_ of y with respect to x l s are both zero. S i m i l a r l y , V b contains those values for which y =0 and y,=l, etc. 32 In F i g . 4.3, every po s s i b l e type of row i s represented, and we can now write the relevant weight sets for the a r b i t r a r y function y: P x i y = {po-..-o> po-..-r pi-..-o> p i - . . - i } = {(a + b), (c + d), (a + d), (b + c)}, Py = { p _ _ 0 , p__ 1} = {(2.a + b + d), (2.c + b + d)}, (4.2) Px . . . j ^ y =' {(a + c) p a i r s {0,2} or {2,0}; (b + d) pairs' {1,1}} . I f y i s monotonic i n x^, we r e c a l l that f o r every row ( O v ^ ^ v ^ l ) , there i s a row ( l v 2 . . v ^ l ) . This means, i n terms of F i g . 4.3, that d = 0. From (3) above, we write b = M-(a+c). Equation 4.2, for a function monotonic i n x^, becomes: P x l Y = {(M - c ) , ( c ) , (a), (M - a)}, Py = {(M - c + a), (M + c - a)} , (4.3) Px 2...x Ny = {(a+c) p a i r s {0,2} or {2,0}; (M-a-c) p a i r s {1,1}} . Using these weight sets to derive source entropies, we obtain I x 1 : x 2 . . x N / y = H x ^ + Hx 2..x Ny - H x ^ . ^ y - Hy, whence: I = ((M-c).log 2((M-c+a)/(M-c)) + a.log 2((M-c+a)/a) + (m-a).log 2((M+c-a)/(M-a)) + c.log 2((M+c-a)/c) - 2. (c+a))/2 N. Setting the p a r t i a l d e r i v a t i v e s 91/ , 9 l / ~ , equal to zero, we obtain: 9c a' da (M-c+a).(M-a) = 4.a.(M+c-a), (4.4.1) (M+c-a).(M-c) = 4.c.(M-c+a), (4.4.2) from which we obtain the values: (c,a) = (0,M), (M,0), (M/5, M/5),at the t extreme values for Ix^:x 2 >.x^/y . Examination of second d e r i v a t i v e s i d e n t i f i e s the points as maxima or minima for I. Hence: t Eqn. 4.4.1 i s used to s u b s t i t u t e for c i n Eqn. 4.4.2. The r e s u l t a n t equation i s cubic i n a. Hence the three s o l u t i o n s . 33 Min .(Ix 1:x 2..x N/y) = 0, for (c,a) = (0,M) or (M,0), Max.(I X ; L:x 2..x N/y) = l o g 2 ( 5 / 4 ) , f o r (c,a) = (M/5.M/5). This l a t t e r r e s u l t proves Theorem 6, and any v a r i a b l e (including x^) could have been substituted for x^ i n the above proof, changing only the notation. As a c o r o l l a r y of Theorem 6, we can show that i f some y i s monotonic 2-N i n neither x^ nor x^, then Ix^;x 2..x^/y > 2 . This means that, f o r N = 3, the condition of Theorem 6 i s s u f f i c i e n t , as w e l l as necessary, f o r monotonicity. For N > 3, t h i s i s not so, and to obtain a u n i v e r s a l l y necessary and s u f f i c i e n t condition f o r monotonicity, we use Theorem 7 below. This theorem i s more useful than Theorem 6, but le s s elegant, since the test quantity used i s not simply an "equivocated" mutual information quantity, as i s that of Theorem 6. THEOREM 7 A function y = f ( x ^ , . . . . j X ^ ) i s monotonic i n x^ or x^ i f f Hx1/.y <_ Hr, where: Pr = {R, 2 N - R} and R = ( 2 N _ 1 ) . ( N - H x ^ . ^ y ) Using the same argument as was used i n the proof of Theorem 6, we derive the weight sets {Px^y, Py, Px 2...x Ny} as i n equation 4.2. From Px 2...x^y, we derive: N Hx 2 >..x Ny = N - ((a+c).2.1og 2(2))/2 , whence R = ( 2 N _ 1 ) . ( N - (N - ((a+c).2.1og 2(2))/2 N)) = (a+c). I f we make the s u b s t i t u t i o n s : T = Hr - Hx^/y, (c+d) = m, (a+d) = h, then the weight sets f o r an a r b i t r a r y function become: Px 1y = {(M-m), (m), (n), (M-n)}, Py = {(M-m+n), (M+m-n)}, Pr = {(M-d-b), (M+d+b)}, with T = Hr - Hx y + Hy. (4.5) We also have general constraints on {m,n,d,b}, derived from the f a c t that a l l weights are non-negative: 34 m,n > d , m,n < M - b, (4.6) m + n = M - b + d. Now suppose that y i s monotonic i n x^, ( i . e . d = 0). Then (M - d - b) = m + n, from equation 4.6. We can now write:-T = N - (PLP(m+n, 2.M-m-n, M+m-n, M-m+n) N - PLP(M-m,m,n,M-n))/2 , where s P L P ( P i ; i = l , s ) = Z p ± . l o g 2 ( p i ) . i = l S e t t i n g 9T/ I = gT/ I = 0, we obtain the so l u t i o n s : 6 3m|n gn|m (m + n) = M, or m = n. From the second d e r i v a t i v e s , both these solutions are minima f or T. We also notice that these two minimal curves cross the constraint boundaries of equation 4.6. Hence the minimum f o r T(d = 0) within the constrained region occurs when m = n, or m + n = M. In both cases we f i n d : Min.(T(d = 0)) = 0.0. If y i s monotonic i n x^, b = 0, and from symmetry, Min.(T(b=0)) = 0 once again. Hence, i f y i s monotonic i n x^ or x^, T >_ 0.0, or Hr >_ Hx^/y. This proves the "necessary" part of Theorem 7. Now suppose that y i s monotonic i n neither x^ nor x^, i . e . b,d > 1. Then we obtain: T = N -.(PLP(M-d-b, M+d+b, M-m+n, M+m-n) -PLP(M-m,m,n,M-n))/2 , with constraints b,d >_ 1 , m,n > d , m,n < M-b , m+n = M+d-b . Taking p a r t i a l d e r i v a t i v e s , we observe that gT/ gb , and gT/ , m,n,d gd m,n ,b (4.7.1) ( 4.7.2) (4.7.3) (4.7.4) equal zero only f o r (d + b) = 0. Since t h i s v i o l a t e s constraint 4.7.1, there i s no 35 l o c a l extremum f o r T within the constraint region; hence we must look along the constraint boundaries f o r (b,d) to f i n d Max.(T(b,d^O)). Since T and the constraints are symmetric i n (b,d), we may consider d > b. Then (d ,d . ), J — max mm given b, are (M-b, b). Thus, the constraint boundaries reduce to two c a s e s . Case 1 i s when (b,d) = (b,M-b); Case 2 i s when (b,d) = (b,b). For Case 1, constraints 4.7.2 and 4.7.3 force (m,n) = (M-b,M-b), when: T = N - (PLP(0,2.M,M,M) - PLP(b,M-b,b,M-b))/2 N. T i s maximised here when b = 1 or (M-l), given constraints 4.7.1, 4.7.3. Hence: T = N - (PLP(0,2.M,M,M) - PLP(1,M-1,1,M-1))/2N = ((M-l)/M).log 2(M-l) - (N-1). Now (M-l).log 2(M-l) < M.log 2(M). Hence T < (M.log 2(M))/M - (N-1) or T < 0. For Case 2, we may su b s t i t u t e f o r n by using constraint 4.7.4, and obtain: N T = N - (PLP(M-2.b,M+2.b,2.(M-m),2.m) - PLP(M-m,m,M-m,m))/2 • We observe that 3T/3m = 0 for any m; hence T i s a constant on t h i s boundary. We may then derive: T = N - (PLP(M-2.b,M+2.b) + 2.(M-m) + 2.m)/2N = (N-1) - ((M-2.b).log 2(M-2.b) + (M+2.b).log 2(M+2.b))/2 N. Now (M-A).log 2(M-A) + (M+A).log2(M+A) > 2.M.log 2(M), f o r any A > 0, M > A. Therefore: T < (N-1) - (2.M.log 2(M))/2 N = 0, or T < 0. Thus i n both Cases 1 and 2, we obtain T < 0 and we have proved the " s u f f i c i e n c y " part of Theorem 7. A function y i s said to be unate i f f , f o r every input v a r i a b l e x_^,y i s monotonic i n e i t h e r x. or x.. The set of unate functions contains, as a 1 x small subset, the UCR functions discussed i n Section 4.2. For example, f o r N = 5, there are 116 PCN Equivalence Classes (out of a t o t a l of 616,126) which contain unate functions. But of these 116, only 17 contain UCR functions. COROLLARY 7.1 A function y = f ( x , ...x^ i s unate i f f : Hx^/y f_Hr^, for i = 1,...,N, where: P r i = { r i ' 2 ^ ~ v ± } 5 r i = 2 N - 1 , ( N " H(x-x.)y); (X-x ±) = {x 1,. . . ,x N) - x_L. Both Theorems 6 and 7 are e a s i l y - a p p l i e d t e s t s for monotonicity, with the only remaining problem being the minor one of deciding whether the v a r i a b l e concerned i s complemented or not. 4.4 Threshold Functions Threshold functions, (or l i n e a r l y separable or 1 - r e a l i s a b l e f u n c t i o n s ) , may be defined as follows; a function y = f ( x ^ . . x N ) i s a threshold function i f f a set of r e a l numbers {a^, a 2,..,a^, T} can be found such that: N i f E a...x. > T, then y = 1: . , 1 x — i = l (4.8) N i f E a..x. < T, then y = 0.' The set of unate functions includes the set of threshold functions, which i n turn includes the set of UCR functions. Threshold functions have been extensively studied (19) and many properties of them are known. But the general problem of i d e n t i f y i n g a function as a threshold function, and determining a set of c o e f f i c i e n t s {a^a.^,. . . ,a^} has proven d i f f i c u l t . This i s p a r t i a l l y due to the non-"Boolean Alg e b r a i c " nature of equation 4'. 8, where we tr e a t the {x^} as r e a l v a r i a b l e s . Also, there i s an i n f i n i t y of p o s s i b l e c o e f f i c i e n t sets f o r any threshold function. Several necessary conditions f o r a function to be threshold-r e a l i s a b l e are known, and sometimes these conditions are also s u f f i c i e n t . For example, i f there are K (not n e c e s s a r i l y d i s t i n c t ) rows of the truth table { ( u l S u 2 , . . ,11^,0) . ; j = 1,K) and K rows { ( v x , v 2 , . . , , 1)^; j = 1,K} such that: 37 K K Z (u.) = E (v.) , for each i = 1,..,N, 3=1 3=1 then the function i s c a l l e d "K- summable". It i s easy to show that a threshold function i s K-asummable' for every K. The smallest number of va r i a b l e s f o r which there i s a known non-threshold, K-asummable function i s twelve (20). Also, a l l 2-asummable functions of up to eight v a r i a b l e s are threshold functions. Thus we see that K-asummability i s a powerful necessary condition for threshold functions. To show how asummability and monotonicity, as used i n Section 4.3, are r e l a t e d , we introduce the concept of "higher-order monotonicity'.' " F i r s t -order" monotonicity was defined i n Section'4.3 i n terms of occurrence of p a r t i c u l a r forms of rows i n the truth table. We may equivalently define a function y as monotonic i n x^ i f f y Q "implies" y^, where ( y ^ y ^ ) are the p a r t i a l functions of y obtained by s e t t i n g x^ to (0,1) r e s p e c t i v e l y . y Q " i m p l i e s " y^ means that whenever y^ = 1, then y^ = 1. To show that t h i s d e f i n i t i o n i s equivalent to others: y = ( y 0 A ^ ) v ( y 1 A x 1 ) . Now i f y Q implies y ^ then ^ A y ^ = y Q -Hence y = ( y ^ y ^ x ^ v ( y ^ x ^ = y 1 A ( ( y ( ) A x 1 ) v ~ y-jA ( y g v x i ^ = ^ 1 ^ 0 ^ V ^ i ^ x i ' w n i c h i s i n the form of a monotonic function. We now say that a function y i s .monotonic i n ( x ^ ^ ^ ) , or second-order monotonic, i f i t i s monotonic i n x^ and i n x^, and i n addition, f o r the second-order p a r t i a l functions with respect to ( x ^ x p , namely {y^Q, y l 0 ' y l l ^ ' we have that y ^ implies or i s implied by y^Q' From the above d e f i n i t i o n of f i r s t - o r d e r monotonicity, i n x^ and x^, we obtain the following " i m p l i c a t i o n s " (denoted by c . ) . . , i n the set of second-order p a r t i a l functions: t A function i s "K-asummable" i f f i t i s not "K-summable". 38 y o o c y o i ; y o o c y i o ; y o i c y n ; y i o c y n ' The d e f i n i t i o n of second-order monotonicity extends these i m p l i c a t i o n s , namely: y00 c y01 c Y10 c yll» o r y 0 o c y i o ^ o i c y i r We can now r e l a t e asummability and monotonicity. I f f a function y i s "completely monotonic", i . e . monotonic i n every subset of {x^,x 2,. . . >XT^}> then i t i s 2-asummable (21), which i s a strong necessary condition f o r the threshold property to hold. The reason f o r introducing higher-order monotonicity now becomes c l e a r e r ; we wish to extend the r e s u l t s of Section 4.3, and attempt to recognise higher-order monotonicity from the source entropies. In t h i s way we would hope to recognise threshold functions. F i r s t we have a theorem on second-order monotonicity. THEOREM 8 (1) I f a function y = f ( x ^ , . . , x N ) i s f i r s t - o r d e r monotonic i n x^ and x 2, and second-order monotonic i n (x^x^), then: Hry = Hx^ry + (3-A/2) • Hx 2ry + (2-3.N) + (N-1)-(A/2), where: r = ( x ^ , . . . , ^ ) , A = 3.1og 2(3), Hx^ry >_ H ^ 2 r y . (2) I f y i s f i r s t - o r d e r monotonic i n x^ and x 2 but not second-• • order monotonic i n ( x ^ x 2 ) , then Hry exceeds the above N-1 quantity by at l e a s t (A-4)/(2 ). To prove Theorem 8, we consider the truth table f o r any function y which i s monotonic i n x^, x^* and ( x ^ x 2 ) , with Y g i * " y i 0 " ^ & r e a r r a n S e t n e truth table so as to show (YQQ> ^QI' y i o ' y l l ^ s ^ e by side, and permute the values for r, or ( x y . - x ^ ) , to give the form of F i g . 4.4. 39 y y 00 y 01 y10 y l l M/2 0 0 0 0 permuted t # 0 values f o r 0 . 1 0 i . 1 r = (x 3..x N) 0 > 1 • ' 1 a* b* c" d' . 1 • • 1 • 1 • 1 F i g . 4.4 - P a r t i a l functions f o r second-order monotonicity. (1) a = No. of "ones" i n YQQJ etc. N-1 (2) M = 2 . (3) 0 < a < b < c < d < M/2. (4.9) From F i g . 4.4, we f i n d the following weight sets: t • Px^ry = {(a+b-c-d+M) p a i r s {0,2} ; the remainder {1,1}}, Px^ry = {(a+c-b-d+M) p a i r s {0,2}; the remainder {1,1}}, Pry = {(a-d+M/2) p a i r s {0,4}; (b+d-a-c)pairs {1,3}; (c-b)pairs {2,2}}. From the expressions f o r the corresponding source entropies, we obtain: (N - Hx ry)-M = a+b-c-d+M, (N - Hx 2ry)-M = a+c-b-d+M. Equations 4.9 are s u f f i c i e n t to allow s u b s t i t u t i o n f o r {a,b,c,d} i n the set Pry, namely: (a-d+M/2) = M-((N - (Hx^y + Hx 2ry)/2) - M/2, (b+d-a-c) = M-(Hx 2ry - N + 1 ) , (c-b) = M-(Hx^y - Hx 2 r y ) / 2 . Making these s u b s t i t u t i o n s , and using Pry to obtain Hry, we obtain the condition of Theorem 8 for Hry, when y i s second-order monotonic i n ( x ^ x 2 ) . Once again, t h i s r e s u l t i s e a s i l y generalised to other pa i r s of v a r i a b l e s , with any complementation being swamped when we consider source entropies. I f we consider the case where y i s f i r s t - o r d e r monotonic i n x^ and x 2 but not second-order monotonic i n ( x i L x 2 ^ » w e s e e that 1 S necessary t Here "{0,2}" represents the weights {0,2}, or {2,0}: e i t h e r ordering giving the same source entropy. that, i n addi t i o n to rows of the form given i n F i g . 4.4, we must have at l e a s t : for some value of r , C Y 0 O ' y 0 1 ' y 1 0 ' y l l ^ = ^°' 1>°> 1^ and f o r some other value, ( " , " . " > " ) = (0,0,1,1). With these two rows, we see that Y Q ^ ^ Y-^Q a n d y10 ^ y01' ^ U T T ' I A T T N E Y a r e consistent with yoo c y o r yoo c y i o ; y o i c y n ; y i o <= y i r a s r e c i u i r e d - A d d i n § t h e s e t w o rows to the truth table of F i g . 4.4, r e p l a c i n g rows of the form (0,0,0,0), we f i n d that Px^ry and Px^ry are unchanged, but that the set Pry i s augmented by two p a i r s of the form {2,2}, and diminished by two pairs of the form {1,3}. Thus we c a l c u l a t e that the new Hry exceeds the above value f o r Hry by N-1 (3.1og 2(3) - 4)/2 . This proves part (2) of Theorem 8. This l i n e a r arithmetic r e l a t i o n among source entropies f o r second-order monotonic functions i s rather s u r p r i s i n g , i n view of the f a c t that f i r s t -order monotonicity i s r e f l e c t e d i n i n e q u a l i t i e s rather than equations (see Theorems 6 and 7). I t i s pos s i b l e that a l t e r n a t i v e conditions using the source entropies may be found. Generalisation of Theorems 6, 7, and 8 to higher-order monotonicity has been attempted, but so f a r , only necessary but i n s u f f i c i e n t conditions have res u l t e d . We now turn to the second aspect of the threshold problem; namely that of f i n d i n g the c o e f f i c i e n t set^ f o r a threshold function. Many p a r t i a l r e s u l t s are known, and various estimation algorithms e x i s t (22). At the time of w r i t i n g , however, simple metho,ds of fi n d i n g a c o e f f i c i e n t set do not e x i s t . Many estimation procedures are based on the Chow parameters {c^5 i =l»N} (23), which are e a s i l y defined i n terms of our f i r s t - o r d e r weight sets, {Px_^y; i=l,N}: c. = (p , - p _ n_ . ) , where i -.. i-..± -.. u ..J. (4.10) (P_.._!_.._> P - . . - 0 - . . l ) e P0..010..1 = P x i y ' t The c o e f f i c i e n t set i s usually c a l l e d the "weight se t " i n the l i t e r a t u r e ; we avoid confusion by not using t h i s name. 41 It i s also easy to show that the magnitudes of the Chow parameters, {h^} = {Jc^j} are given by the N^-order source entropies: b = |c | = 2 N _ 1.(H(X-x )y - (N-1)), where (X-x^ = {x ,...x^} - x_^ . The Chow parameters themselves are not a very good estimate of the c o e f f i c i e n t set, but i t can be shown that there are at most two threshold functions having the same Chow parameters (24). This suggests that they are strongly r e l a t e d to the threshold property. Further, the sign of each c^ i s the same as that of any corresponding c o e f f i c i e n t a_^ . We now give an empirical r e s u l t , which enables us to f i n d v a l i d c o e f f i c i e n t sets for N <_ 5. The r e s u l t may be v a l i d f o r N> 5, but has not been examined as yet. th THEOREM 9 I f a function y = f(x ,. ' . j X ^ ^ ) i s a threshold function, with N -order source entropies {Hx^..x^_^y,...HX2...x^y }, then the magnitudes of a v a l i d c o e f f i c i e n t set {a^,a^,..., a^ } are given by a s o l u t i o n to the following, when N < 5: a_^ = A.b /(K+b ), where K i s the r e a l s o l u t i o n to N N I (b./(K+b.)) =» 1; E (a.) = A,' i = l 1 1 i = l 1 b = 2 N _ 1-(H(X-x )y - (N-1)), with ( X - x J = {x x,.. ,xN} - x ±. Theorem 9. was proved e m p i r i c a l l y for N = 5. A switching function was selected from each of the 63 PCN Equivalence Classes containing threshold functions, such that a l l the c o e f f i c i e n t s were non-negative. The magnitudes of the Chow parameters, seen as {b.} i n Theorem 9., and the c o e f f i c i e n t set {a.} 1 x was computed, using Algorithm 2. below. In each of the 63^cases, a threshold (T) was found, s a t i s f y i n g the threshold condition of equation 4.8. Although a rigorous analysis of the procedure i n Theorem 9. i s very d i f f i c u l t , we can demonstrate p l a u s i b i l i t y . Suppose we have a threshold function y with a non-negative set {a_^}. We wish to estimate the Chow parameter magnitudes from the c o e f f i c i e n t s . Such a function i s given i n F i g . 4.5, for which the Chow parameters are not an acceptable c o e f f i c i e n t set. y 0 0 0 0 0 0 0 1 0 1 1 1 1 1 1 1 x i 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 x 2 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 X 3 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 x4 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 F i g . 4.5 - Threshold Function with Chow Parameters {6, 2, 2, 2}. F i g . 4.6 - Real Functions F^CX-x.) = E (a .x.) + k.a. k 1 j - l J J For the p a r t i a l functions of y with respect t o x . , namely {y^CX-x^), y^(X-x^)}, we may construct r e a l functions: '•{Fj!j(X-x ), F^(X- X j.)} such that; N F.^X-x.) = Z (a..x.) + k.a . k x j = 1 J J i We p l o t two such r e a l functions i n each of the graphs of F i g . 4.6 f o r the switching function of F i g . 4.5, where the c o e f f i c i e n t set used i s {2, 1, 1, 1}. We see that the threshold condition of equation 4.8 may be restated as: F.^X-x.) < T i f yj^X-x.) = 0, k x k x Fj(X-x.) > T i f y^(X-x.) =1. rC X iC X In F i g . 4.6, a s u i t a b l e threshold value (T = 2.5) has been superimposed. We now make two observations about F i g . 4.6. F i r s t l y , the "distance" (d^-dy) on each graph i s nearly proportional to the magnitude of the relevant Chow parameter, due to the close r e l a t i o n between F^(a) and y^( a ) . Secondly, for each F(a), Max.(F(a)) = F ( l , l , l ) and Min.(F(a)) = F(0,0,0). I f we l i n e a r i s e each F(a) between i t s extrema, the gradient i s (A-a^)/K, where K i s the abscissa length over which F(a) i s defined. The v e r t i c a l separation between F^ and F^, ( l i n e a r i s e d ) , i s a^. Hence the h o r i z o n t a l separation W^-d^) i s estimated i n terms of (a^,A,K). I f we assume that t h i s remains a good estimate for c^, a f t e r l i n e a r i s a t i o n , then we obtain: 4 b. = Ic.l - K.a./(A-a.), with A = Z (a.). 4.11 x 1 x1 x x . n J If we assume that equation 4.11 i s an eq u a l i t y , f o r some choice of c o e f f i c i e n t s , rather than an approximation, then we may i n v e r t the equation and obtain the estimates for {a.} i n terms of {b.}, as i n Theorem 9. x x We note that the p r o p o r t i o n a l i t y constant K i n equation 4.11 i s constrained. This arises as follows: 44 a = A.(b./(K+b.)), from equation 4.11. 1 1 1 N N E (a.) = A. E (b./(K+b.)). i = l 1 i - l 1 1 N A = A. E (b./(K+b.)). i - l 1 N Hence E (b./(K+b.)) = 1 . (4.12) i - l 1 The a n a l y t i c s o l u t i o n to equation 4.12 i s d i f f i c u l t f o r N > 2, but a '. r a p i d l y convergent s o l u t i o n to the {a^} from {b^} i s given by Algorithm 2, below. This algorithm generates a normalised set {a_^ }, ( i . e . such that Ea^ = 1), by repl a c i n g b_^ successively by an approximation to b^/K. The error index e measures the algebraic sum of errors i n each a^. ALGORITHM 2 To obtain a normalised c o e f f i c i e n t set {a^} from the magnitudes of the Chow parameters {c.} of a threshold function y of at l l e a s t 5 v a r i a b l e s , to within an erro r of e. (1) Compute {b_^ ; i=l,N} = { | c_^ | ; i=l,N}, possibly from the N -order entropies. (2) I f every b^ equals zero, set every a_^ to (1/N) and go to (6). (3) Set a j [ = b ±/(l+b ), for each i . (4) Set A = ^ (a.). I f a = 1 + e, go to (6). i = l (5) Replace each b^ by (b^/K), and go to (3). (6) Attach the sign of c. to each a. i n {a.} to obtain a normalised c o e f f i c i e n t b l i i set for y, for N < 5, and small e. Algorithm 2 was implemented i n a twenty-statement FORTRAN subroutine, and produced a v a l i d c o e f f i c i e n t set f o r each 5-variable threshold function within twenty i t e r a t i o n s . 45 4 . 5 D i s j u n c t i v e Decompositions A switching function y = fCx^.-.x^) i s d i s j u n c t i v e l y decomposable (DD) i f f there e x i s t s a subset A of X = {.x x N>, such that y = g(h(A),X-A), with A containing at l e a s t two v a r i a b l e s . This property enables y to be w r i t t e n as a function (g) of l e s s than N v a r i a b l e s , by constructing the a u x i l i a r y function z = h(A). Since i t i s usually easier to r e a l i s e two functions of less than N v a r i a b l e s , rather than one of N v a r i a b l e s , recognition of the DD property i s valuable. The r e a l i s a t i o n structure of a DD function i s shown i n F i g . 4-. 7. X — L V D.D. K X-A - K+r F i g . 4 . 7 - D i s j u n c t i v e Decomposition y = g(h(A), X-A). Decomposition Theory, which includes more general decompositions than the Simple D i s j u n c t i v e Decompositions defined above, has b^en extensively developed ( 5 ) . However, the basic algorithm for f i n d i n g D i s j u n c t i v e Decompositions i s N-1 lengthy, i n v o l v i n g study of (2 - 1 ) P a r t i t i o n Matrices. The information model of switching c i r c u i t s we have been using does not enable recognition of DD functions from the source entropies, as was hoped. A d i f f e r e n t set of entropies, to be described, does have t h i s a b i l i t y , however. A P a r t i t i o n Matrix, mentioned above, i s a rearrangement of the N-K truth table of a function y as i n F i g . 4 . 4 . Each of the 2 columns i s a th (N-K) -order p a r t i a l function of y, namely the function obtained by s e t t i n g (N-K) v a r i a b l e s to constant values. Each p a r t i a l function i s then described by a truth table i n the remaining K input v a r i a b l e s , as i n F i g . 4 . 8 . A 3 A 4 Y 00 01 10 11 - X 1 X 2 00 0 1 0 1 01 0 0 1 1 10 0 0 1 1 11 0 0 1 1 F i g . 4.8 - P a r t i t i o n Matrix f o r DD function y = (g(h(A).X-A), A=(x 3x 4), X-A=(x 1x 2). The D i s j u n c t i v e Decomposition of F i g . 4.8 i s recognised by observing that there are only two d i s t i n c t rows, ((0101), (0011)), i n the P a r t i t i o n Matrix. This condition i s equivalent to the existence of columns ( p a r t i a l » • • • — functions) of only two types, {0, z}, where 0 = 0 or 1, and z = z or z. In Fi g . 4.8, we see two n o n - t r i v i a l p a r t i a l functions {YQ^J y10^' where YQ-^ = Y ^ Q * We may therefore define z = h( x 3 > x ^ ) = Y Q T l( x3> x^)» a n c * w e c a n r e a l i s e y i n the form of F i g . 4.7, with the p a r t i t i o n set A = {x^jX^} . N There are c l e a r l y 2 P a r t i t i o n Matrices, one f o r each subset of {x^,...,x^}. However, turning the matrix of F i g . 4.8 on i t s side gives the P a r t i t i o n Matrix for the p a r t i t i o n set {x^x^}. Further, the truth table f o r y i t s e l f i s a P a r t i t i o n Matrix for the t r i v i a l decomposition y = g(h(0),x^,..,x^) N-1 We are l e f t , then, with (2 -1) n o n - t r i v i a l P a r t i t i o n Matrices to examine for the "two d i s t i n c t row" condition. The "two d i s t i n c t rows" condition can be restated i n terms of the j o i n t p r o b a b i l i t y d i s t r i b u t i o n for values taken by the p a r t i a l functions. In F i g . 4.8, fo r example, the four p a r t i a l functions (columns) may be considered as a random vector, and we can count the number of occurrences of each possible value vector. We can then compute the j o i n t entropy for the p a r t i a l functions. C l e a r l y , i f there are only two d i s t i n c t rows i n the p a r t i t i o n Matrix, a l l but two of the value vectors w i l l have zero p r o b a b i l i t y . This i s r e f l e c t e d i n a value for the j o i n t p a r t i a l function entropy of < L . O b i t . If we denote by Hp^g^ the j o i n t entropy of the p a r t i a l functions with respect to the subset B of {x^,..,x^}, then from F i g . 4.8, we have: P f ( v v s = {0,0,0,3,0,1,0,0,0,0,0,0,0,0,0,0} , P „ f rV v ^ = {1,0,0,0,0,0,0,1,1,0,0,0,0,0,0,1}, pi^x^x^^ H p f ( x x ) = "(3/4).log 2(3/4) - ( l / 4 ) . l o g 2 ( l / 4 ) = 0.811 b i t , H f, v s = 4.(-(1/4).log-(1/4)) = 2.000 b i t . pr^x^x^; z We can state t h i s r e s u l t generally i n Theorem 10. THEOREM 10 A function y = f(X) i s D i s j u n c t i v e l y Decomposable with a p a r t i t i o n set A i f f H . .,, < 1.0 b i t , where X = {x.,..,x }. v pf(X-A) — ' I N N The 2 p a r t i a l function entropies (with two of them, namely H = H;H,.. .=1.0, being t r i v i a l ) can be l i s t e d for any function or pf(0) y pf(X) PCN Equivalence Class, and any Di s j u n c t i v e Decompositions are immediately apparent. We should note, however, that generating these entropies, while i t i s e a s i l y programmable, represents no r e a l saving of e f f o r t , compared to N the P a r t i t i o n Matrix method. Also, despite the fac t that there are 2 p a r t i a l function entropies, they do not uniquely characterise the PCN Equivalence Classes. I t would be s a t i s f y i n g i f a DD c r i t e r i o n could be found that uses the source entropies, but so far i t has not been discovered. The source entropies can be used, however, to f i n d the source entropies of the a u x i l i a r y function z = h(A) i n y = g(h(A),X-A), when we know that y i s DD with the p a r t i t i o n set A. This i s possible from the entropies alone, without reference to the truth table or P a r t i t i o n Matrices. The basis f o r Algorithm 3, K K below, i s the following r e s u l t . I f {y^; i=l,2 } are the 2 p a r t i a l functions K of y with respect to {x^; i=l,K}, with f i r s t - o r d e r entropies {Hy^; i=l,2 }, then the source entropy of y, H X l..x ..x y, can be written as: 48 Hx 1..x R. . x R + J y = (K+J-) + ( E H y i)/2 . (4.13) i = l ALGORITHM 3 Given a function y = g(h(A),X-A) with H p f ( X _ A ) <_ 1.0, and with source entropies t o f i n d the source entropies of the a u x i l i a r y function z = h(A). (1) If (X-A) contains K v a r i a b l e s , compute: L = 2 K.(1 - (H(X-A)y - K)/H ( X _ A ) ) , where L i s the number of t r i v i a l p a r t i a l functions of y with respect to (X-A), (derived from equation 4.13). (2) Let B by any set of M v a r i a b l e s such that B c A. Then a source entropy HBz of z = h(A) i s given by: HBz = 2 K.(H(X-A+B)y - K - (L.M)/2 K)/(2 K-L). N-K (3) Find each of the 2 subsets B of A, and use the values of K, M, L and the appropriate source entropy of y to f i n d the complete set of source entropies (Invariants) of z. As an example, consider the DD function of four v a r i a b l e s i n F i g . 4.8. We f i n d H , * = 0.811, with Hx x y = 2.406. Applying step (1) of Algorithm p i ^ XT.^2 3, we obtain: L = 4.(1 - (2.406 - 2)/0.811) = 2. Since A = {x^x^}, we l e t B be each of {0, x^, x^, {x^x^}} i n turn, and obtain the source entropies from steps (2) and (3) of the algorithm: Hz = 4.(H X lx 2y - 2 - 0)/2 = 4.(2.406 - 2)/2 = 0.811 b i t . Hx 3z = 4.(Hx ]x 2x 3y - 2 - (2/4))/2 = 1.500 b i t . Hx,z = 4.(Hx.xx.y - 2 - (2/4))/2 = 1.500 b i t . 4 1 2 4 H x ^ z = 4. ( H x ^ x x y ~ 2 ~ ( 4/4))/2 = 2.0°0 b i t . An a l t e r n a t i v e way of viewing DD functions i s to observe the analogy between the DD property and the property that an information channel has a " s u f f i c i e n t reduction" (13). We describe a channel by means of i t s channel matrix, which i s defined below. I f the channel has r input s i g n a l s and s output s i g n a l s , then the channel matrix Q has r rows and s columns, with the element q ^ of Q being the p r o b a b i l i t y of the j*"* 1 output s i g n a l , given the i ^ input s i g n a l . Suppose we l e t the values taken by the subset A of X = {x^,...,x^} be the output s i g n a l s , and the values taken by {X-A,y} be the input si g n a l s of-a channel. I f A contains K v a r i a b l e s , there are 2 input signals K N—K"T"1 ( i . e . r = 2 ), and 2 output s i g n a l s . The channel corresponding to the function of F i g . 4.8, with i t s channel matrix, i s given i n F i g . 4.9. To form a reduced channel, M columns of the channel matrix Q are added together, thus replacing M output signals with a s i n g l e new composite s i g n a l . (An example of such a reduction of a channel i s the pl a c i n g of a quantiser i n front of the receiver i n a communications system). In general, reducing a channel reduces the mutual information between input and output. I f , however, there i s no such information l o s s , the channel reduction i s c a l l e d a s u f f i c i e n t reduction. In terms of the channel matrix, combining output si g n a l s b^ and b^ i s a s u f f i c i e n t reduction i f f Prob(b^/a) and Prob(b2/a) are p r o p o r t i o n a l , f o r every input, a. In F i g . 4.9, we observe that the output si g n a l s {01, 10, 11} can be combined, since the corresponding elements i n each row are the same. We then have a new (reduced) channel with two columns (or output s i g n a l s ) . The two si g n a l s can be represented by the two values taken by a binary random v a r i a b l e , z. We know that the mutual information remains unchanged, since the reduction was a s u f f i c i e n t reduction, that i s : Ix^x^y.x^x^ = I x - ^ y . z. Observing that Hx^^z = ^ x l x 2 s l n c e z ^ s independent of {x^, x^}, w e obtain Hy/x^x^z = 0. In other words, knowledge of {x^, x^, z} i s s u f f i c i e n t 50 r=2K=4 output s i g n a l s . (values of { x 3x 4 }) 00 01 10 11 channel , -> »— Outputs. 000 (001) 010 O i l 100 101 (110) 111 „N-K+1 s = 2 = 8 input s i g n a l s . (values taken by {x^x^}) 00 01 10 11 000 1/4 1/4 1/4 1/4 (001) - - - -010 0 1/3 1/3 1/3 Inputs, O i l 1 0 0 0 100 1 0 0 0 101 0 1/3 1/3 1/3 (110) - - - -111 1/4 1/4 1/4 1/4 fProbs. not defined, since input s i g n a l does not occur. Channel Matrix Q, where q = Prob(x 3x 4=00/given x^x^OOO) , etc. F i g . 4.9 - Channel and Channel Matrix for y = f(x ,...x^) with p a r t i t i o n set A = (x 3,x 4>. to determine the output function y, which from F i g . 4.7, in d i c a t e s a D i s j u n c t i v e Decomposition. We state t h i s r e s u l t more generally i n Theorem 11. THEOREM 11 A function has a D i s j u n c t i v e Decomposition y = g(h(A),X-A) when the information channel with the values of {X-A,y} as inputs and those of A as outputs can be s u f f i c i e n t l y reduced to a channel with two outputs. 51 5. SYNTHESIS OF SWITCHING CIRCUITS 5.1 A Synthesis Philosophy In Section 1, there was some discussion of switching c i r c u i t design methods i n common use. In most cases, these methods centre around mani-pul a t i o n of Boolean q u a n t i t i e s associated with a function i n t o forms which can be tr a n s l a t e d into a r e a l i s a t i o n , using allowed forms of switching element. We i n d i c a t e d how these design methods, by themselves, were not s u i t a b l e when design c r i t e r i a other than "minimisation of elements" were used. We now present a design philosophy based on the information model of switching c i r c u i t s that we have been using as an a n a l y t i c a l t o o l . Consider the simple function y = (X^ AX^ ) ® (X^ AX^ ) to be r e a l i s e d with 2-input NOR-gates. (This i s a function from PCN Equivalence Class 4 i n Tables 3.1 and 3.2). The "minimum gate" r e a l i s a t i o n of t h i s function i s given i n F i g . 5.1, using the notation f o r NOR-gates of F i g . 4.1. x l X2 X 3 . y z l Z2 Z3 0 0 0 l 1 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 1 0 1 1 1 0 1 0 1 0 0 0 1 0 0 1 0 1 0 0 0 1 1 1 0 0 0 0 1 1 1 1 0 0 1 0 F i g . 5.1 - Truth Table and "minimum Gate" R e a l i s a t i o n of a Function of three v a r i a b l e s . We could have a r r i v e d at the design of F i g . 5.1 by, for example, Decomposition Theory, which would have revealed the decomposition y = g ( z 3 ( x 2 , x 3 ) ,x^) , with the r e a l i s a t i o n of y = g C z ^ . x ^ ) , z ^ C x ^ x ^ ) being t r i v i a l . However, we could also have looked at the source entropies of y and observed that the mutual information Ix^x^x^-.y = Hy = 0.811 b i t , 52 whereas the input information i s Hx^x x = 3 , <^ ^ i t . In other words, we have to f i n d an interconnection of gates which w i l l "pass" 0.811 and "block" the remaining 2.189 b i t of information. So we might begin our design by fi n d i n g some p a i r of input variables (for 2-input gates) to connect to the inputs of a NOR-gate, such that the output function of the gate i s "relevant", i n the information sense, to the output function. We can define an information performance function (to be discussed below) which w i l l measure the "goodness" of such an intermediate function, z, i n terms of the source entropies of some subsets of the v a r i a b l e s {x^,. . . , x ,y,z}. In the case of Fi g . 5.1, i f our performance function aimed at producing "minimum-gate" r e a l i s a t i o n s , we would expect that, of the possible choices of input v a r i a b l e s {(x^,x^),(x^,x^),(x^,x^)} and of the four ways of attaching input v a r i a b l e s ( i . e . with or without complementation), we would f i n d that z^, or z^, gave the best performance. Assume then that we have chosen z^ ( x 2 , x ^ ) . The next step i s to r e a l i s e that we can define y as a function of four v a r i a b l e s , y = g ( x ^ » X 2 ' X 3 > z i L ^ ' a n d w e c a n r e P e a t t n e process again. We would hope that the performance function would s e l e c t Z2^ X2' X3^ 3 3 t* i e best function, given that z^ had been constructed. This brings us to the concepts of "independence", " i r r e l e v a n c e " , + and "cover". We say that x and y are independent i f Hxy = Hx + Hy. I f y = f ( x 1 , . . , x N ) , and A i s some subset of X = {x^^.-.x^}, then the va r i a b l e s i n (X-A) are i r r e l e v a n t to y i f HA = HAy. (In other words, y can be. written as a function of A only.) We say that a set of va r i a b l e s A form a cover for y i f HA = HAy. In the example of F i g . 5.1, we see that, since Hx 2y = 1.811 = 1.000 + 0.811 = Hx 2 + Hy, y i s independent of x 2 > but x 2 t Usually c a l l e d "pairwise independent". i s not i r r e l e v a n t to y since Hx^x^y ^ *Ix;j_X3 = 2.000. Possib l e covers for y are {x^, x^, x^} or {x^, x^, x^, z^}. A minimal cover i s a cover which has no proper subset which i s a cover. Returning to the design of the r e a l i s a t i o n of F i g . 5.1, a f t e r the s e l e c t i o n of z^ and z^, the "augmented input v a r i a b l e l i s t " would be {x^ j X ^ , x^, z^ , z^ } • However, observation of Hx^z^z^Y = ^yi\z\Z2 ~ ^-500 defines {x^,z^,Z2} as a cover f o r y. This reduces the amount of t e s t i n g required i n the next step of the design from 20 to 3 p a i r s of v a r i a b l e s . We would continue to search the augmented v a r i a b l e l i s t , adding a "good" function at each step, and d e l e t i n g any i r r e l e v a n t v a r i a b l e s , u n t i l hopefully a new function i s generated which i s the desired target function y or i t s complement. This i s the ba s i c design philosophy we w i l l examine i n the subsequent sections. The basic idea i s to construct the r e a l i s a t i o n from input to output by " h i l l - c l i m b i n g " , rather than by attempting to manipulate algebraic properties of the function. Some advantages of such a procedure are c l e a r . F i r s t l y , the method i s very simple, and i s e a s i l y programmable since the entropies used i n the performance function are arithmetic rather than algebraic i n nature. Secondly, any t o p o l o g i c a l constraints on the design, such as p l a n a r i t y or "maximum depth", are e a s i l y included; any "good" function suggested by the performance function, which v i o l a t e s a constraint, i s rejected, and the next best one t r i e d . Also, we can use any type of gate equally w e l l . There may be problems i n the method. F i r s t l y , why should the method work at a l l ? Might not the augmented v a r i a b l e l i s t grow i n d e f i n i t e l y without generating the function? The answer to th i s i s empirical; we f i n d i n Sections 5.2, and 5.3, that the performance functions suggested by the i n t e r p r e t a t i o n of various information q u a n t i t i e s do provide convergence, f o r some algorithms at le a s t . 54 A more subtle problem i s concerned with the generation of functions which are not used i n l a t e r steps of the design. These "dead-ends" cannot be discarded during the design, since they are not recognised as such u n t i l the end. So the design algorithm has to consider them, since they are i n the augmented v a r i a b l e l i s t . But they might a f f e c t l a t e r performance function values adversely, being a p o s t e r i o r i "bad" functions. Problems of t h i s kind f a l l i n t o the general class of problems described as follows: I f one step i n a s e r i e s i s "non-optimal", i n hindsight, how "good" i s the series? This problem may be tackled by improving the i n d i v i d u a l steps of the s e r i e s , or by using some form of feedback. We can do the same i n our design procedure; we could attempt to use only those performance functions which do not generate unused intermediate v a r i a b l e s , or we could use a "sub-optimal" r e a l i s a t i o n to generate a better r e a l i s a t i o n , by returning to the step that produced the dead-end, and t r y i n g something els e . The ultimate j u s t i f i c a t i o n f o r t h i s design philosophy w i l l rest i n the q u a l i t y of r e a l i s a t i o n s produced, i n the ease with which the algorithms can be implemented and executed, and i n the i n s i g h t into the structure of r e a l i s a t i o n s which i t might provide. 5.2 Information Performance Functions The heart of any algorithm based on the design philosophy of Section 5.1 i s i t s Information Performance Function (IPF). There are two ways of fi n d i n g s u i t a b l e IPF 's. F i r s t l y , we can look at a number of "optimal" r e a l i s a t i o n s , compute entropies and use these to construct an IPF which would have produced these r e a l i s a t i o n s . Secondly, we might use the concepts of "relevance", "transmitted information", "mutual information" e t c . , suggested by the information model. 55 Realisations using 3-input NAND-gates of a l l 3-variable switching functions, which are optimal i n having the minimum t o t a l number of gates and i n v e r t e r s , have been l i s t e d (25). If we assume that the f i r s t gate i n the r e a l i s a t i o n ( i . e . one with input v a r i a b l e s only as inputs) has a function z, then we can define a source entropy f o r every subset of {x^yX^^^^y,z} . The simplest IPF suggested from t h i s data was simply (-Hzy). (I.e. the IPF would be used to s e l e c t the function z having the smallest value for Hzy). This would have given an "optimal" f i r s t step ( i n the above sense of optimal) i n 49% of cases. A b e t t e r IPF i s Iz:y = Hz + Hy - Hzy, which i s "optimal" i n 56% of cases. We can see i n t u i t i v e l y why t h i s might be more su c c e s s f u l , since Iz:y i s a c l o s e r measure of the "relevance" of z to y than i s Hzy, which could be small simply because Hz was small. Considerations such as these lead to a multitude of possible IPF's, based on the requirement of choosing the "optimal" f i r s t step. To reduce computation time, some algorithms used two IPF's; the f i r s t to s e l e c t the v a r i a b l e p a i r {x.,x.} to be used as inputs to a 1 3 gate and the second to s e l e c t one of the four functions { X . A X . , X.AX., X.AX., X.AX.. & i 3 1 3 1 3 1 3 r e a l i s a b l e with a NOR-gate. Table 5.1 summarises the IPF's used i n the algorithms discussed i n Section 5.3. They are i n increasing order of effectiveness i n generating "minimal-gate" r e a l i s a t i o n s , using 2-input NOR-gates, and i n most cases, the lower IPF's evolved from the upper ones. The most succ e s s f u l IPF found so f a r i s the l a s t one i n the table. 5.3 Comparison of Design Algorithms The ten IPF's i n Table 5.1 were incorporated i n eight design algorithms for the r e a l i s a t i o n of completely-specified single-output switching functions, with 2-input NOR-gates. Each algorithm was implemented i n FORTRAN, and run on e i t h e r an IBM 7044 or IBM 360/67 machine. In the following, we discuss the 56 implementation and performance of these algorithms, with some "book-keeping" considerations. ALGORITHM 4 (1) Construct a truth table with' columns for {x^; i=l,N} and y = f ( x ^ , . . , x ^ ) . (2) Define L as the "augmented input v a r i a b l e l i s t " . Define LR and LRT as l i s t s of rejected p a i r s of v a r i a b l e s , (permanent and temporary, r e s p e c t i v e l y ) . Define T as a threshold for the j o i n t entropy q u a n t i t i e s Hzy, where z i s i n L. (3) Put {x , x ; i=l,N} i n L. Put {(x,.,^); i=l,N} i n LR and LRT. (This means addresses of the appropriate columns i n the truth table, not the functions these v a r i a b l e s represent). Set T = Min.(Hx^y). (4) Find the p a i r (x., x.) such that: x., x. eL; (x.,x.)^LRT; and (Hx.y+Hx.y) i s a minimum. I f impossible, go to (6). (5) Construct the function (as a new column of the truth t a b l e ) ; z = N0R(x., x.) = (x.Ax.). i J i J (6) I f Hzy >_T, put (x.,, x ) i n LRT; go to (4). (7) I f there i s no (x^,:.x_.), such that Hzy < T, relax the threshold T by some amount (such as r e s e t t i n g T to a previous value), and go to (4). (8) I f ' z i s the same function as some other function l i s t e d i n L, put (x^, x^) i n LR and go to (4). (9) If the tests of ( 6 ) , ( 7 ) , and (8) are passed, put z and z i n L; put ( x ^ x j and (z, z) i n LR. Set T = Hzy. (10) I f z i s the function y or y, stop. Otherwise go to (.4). This algorithm uses a two-stage IPF. F i r s t l y , we seek p a i r s of * * * * v a r i a b l e s (x., x.) with low (Hx.y + Hx.y), and then accept the function i J i J * * z = N0R(x^, x^ .) only i f Hzy i s s u f f i c i e n t l y low, namely les s than T. The reason for the threshold c r i t e r i o n was to make th i s e a r l y algorithm analogous IPF Used f o r Choice of: Used i n Algorithm: I n t e r p r e t a t i o n : 1 -(Hx.y+Hx.y) i J {* ±,x j } 4 Looks f or v a r i a b l e s which have low j o i n t entropy with y. 2 -(Hzy) z, given { X i , X j } 4 Looks f o r function z with lowest j o i n t entropy with y. 3 (Ix_^:y+ I X j :y) {x.,x.} i J 5 Looks f o r v a r i a b l e s which are "relevant" to y. 4 Iz:y z, given {XJL,X.J } 5 Looks f or function z which i s most relevant to y. 5 Ix.x.:y-i 3 Ix.:y-Ix . :y i 3 { X . , X . } 6 Looks f or v a r i a b l e p a i r relevant to y, but with low i n d i v i d u a l relevance. 6 -Iy:C- {x.x.} i J { X . . X . } 7 Looks for v a r i a b l e p a i r i n the current cover C, such that the remaining variables i n C are of low relevance to y. 7 pf(x.x.) 1 3 {x.,x.} 8 Looks f or v a r i a b l e p a i r such that the j o i n t entropy of the p a r t i a l functions with respect to the p a i r i s minimised. 8 Hx.x./Azy , 1 3 (A=C-{x. X j}) z and { ^ . X j } 9 Looks f o r v a r i a b l e p a i r g i v i n g function z such that the amount of i r r e l e v a n t information i n the p a i r blocked by z i s maximised. 9 Iz:y/A v Ix.x.:y/A , 1 J (A=C-{x ix j}) z and {x.,x.} i 3 10 Looks f or function z of {x.,x.} such that i 3 the proportion of relevant information i n {x.,x.} passed by z i s maximised, i 3 10 -Hy/Az (A=C-{x.x.}) i 3 z and {x.,x.} 11 Looks for function z of {x.,x.} such that i J the uncertainty i n y, given z and A, i s minimised. Table 5.1 - Information Performance Functions. to the Fano Decoding Algorithm strategy ( 2 6 ) . A t y p i c a l r e s u l t f o r Algorithm 4 i s given i n F i g . 5.2 and 5 . 3 , for Test Function A. The appearance of the r e a l i s a t i o n i s "long and t h i n " ; there i s a tendency i n th i s algorithm to t r y to use the l a s t intermediate function to generate the next, due to the IPF used (namely Hx_y + Hx^y). This tendency l e d to a d i f f e r e n t IPF being used i n Algorithm 5. F i g . 5.3 - R e a l i s a t i o n of Test Function A by Algorithm 4. ALGORITHM 5 I d e n t i c a l to Algorithm 4 except f o r the following: ( 1 ) The IPF's become ( I x ^ y + I X J : Y ) a n d (Iz-y)> replacing - ( H x ^ + H X J Y ) and -(Hzy). '(2) The threshold T i s set to zero i n i t i a l l y , and i s reset to Iz:y when a new function, z, i s accepted. ( 3 ) A subroutine i s added which tests i f a new z i s the same function as some element of L. I f so, then the cheaper version of the function only i s retained. The p r i n c i p a l d i f f e r e n c e between Algorithms 4 and 5 i s i n the IPF's used. (The other change i s "book-keeping"). On Test Function A, the r e a l i s a -t i o n of F i g . 5.4 was generated i n 1/3 of the time ( 1 5 sees, on the IBM 7 0 4 4 ) 59 taken f o r the r e a l i s a t i o n of F i g . 5.3. However the a l g o r i t h m f a i l e d to converge f o r the more d i f f i c u l t Test Function B of F i g . 5.5. A n a l y s i s of the reasons f o r f a i l u r e of A l g o r i t h m 5 on Test F u n c t i o n B revealed that IPF's based on e n t r o p i e s d e r i v e d from the p a i r {x^,x^} are more s u c c e s s f u l than those d e r i v e d from i n d i v i d u a l v a r i a b l e s {x_^}. Hence Algorithms 6, 7 and 8. ALGORITHMS 6, 7, and 8 The s t r u c t u r e of these three algorithms i s the same, w i t h the IPF s e l e c t i n g the v a r i a b l e p a i r (x^, x ), given by Table 5.1. (1) Define a r e a l m a t r i x Q such that Q = IPF f o r the v a r i a b l e p a i r (x^,x^.) i n the current minimal cover C. The augmented v a r i a b l e l i s t L and cover C cont a i n only uncomplemented input v a r i a b l e s and int e r m e d i a t e f u n c t i o n s . (2) Find the maximum value Q.. i n the Q-Matrix. Construct four functions {z} = {X.AX,, X.AX,, X.AX,, X.AX.} of the v a r i a b l e s {x.,x.}. 1 j 1 j 1 j 1 3 i J (3) I f any z i s such that x. or x, (or both) would become i r r e l e v a n t , given 1 j z and the r e s t of the cover v a r i a b l e s , accept z and go to (5). (4) Accept that z with the greatest Iz:y.. (5) If z i s the same function as y or y, stop. (6) Construct the new minimal cover C, adjust the Q-Matrix values, and go to (2 A l l three algorithms produced the optimal r e a l i s a t i o n of Test Func-t i o n A i n l e s s than 15 sees. For Test Function B, Algorithm 6 did not converge, whereas Algorithm 7 produced a non-optimal r e a l i s a t i o n . Only Algorithm 8 produced optimal r e a l i s a t i o n s f or both Test Functions A and B. This r e s u l t was suspect, since the IPF i n t h i s algorithm (-H r . , i s the pf(x.x.) 1 3 c r i t e r i o n used f o r recognising DD functions (see Section 4.5), and Test Function A i s DD, with Test Function B being nearly so. Algorithm 8 was tested on a function from each of the PCN Equivalence Classes of 3 v a r i a b l e s , and pro-duced a "minimal-gate" r e a l i s a t i o n i n only 10 of 14 functions. So f a r , the IPF's used have been attempting to maximise the relevant information passed by a gate. Algorithm 9 was devised to t e s t the e f f e c t i v e -ness of the converse requirement; namely to attempt to maximise the i r r e l e v a n t information blocked by a gate. The structure of the algorithm was s i m i l a r to that of Algorithms5, 6, and 7 but used Hx.x./yz(C-{x.x.}). This quantity 1 3 1 3 measures the redundancy (or i r r e l e v a n t information) removed from the cover C by replacing {x.,x.} i n C by z = X.AX.. In t h i s algorithm, a Q-Matrix was 3 - 3 3 - 3 not e x p l i c i t l y constructed, since the s e l e c t i o n of ^ x^> Xj -" a n d z> previously separate, has been combined i n the one IPF. Algorithm 9 generated an optimal r e a l i s a t i o n for Test Function A, and a nearly optimal one for Test Function B. In 72 sees (on the IBM 7044) , i t constructed optimal r e a l i s a t i o n s f or 11 3-variable Classes, with the 61 remaining 3 Classes being suboptimally r e a l i s e d . It was discovered i n t h i s algorithm that attempting to separate the choice of v a r i a b l e p a i r from that of the function of those v a r i a b l e s (as i n Algorithms 4, 5, 6, 7 and 8) was l e s s s u c c e s s f u l than using a combined IPF, as i n Algorithm 9. In Algorithm 10, we return to the idea of maximising the information transmitted by a gate, but t h i s time we measure i t as a proportion of the information required to be transmitted from the input v a r i a b l e p a i r {x^,x_.}. This "gate e f f i c i e n c y " measure (Iz :y/C--{K/x }) / ( I x ^ ^ ty/C-fx^x^.}) was used as the p r i n c i p l e IPF of Algorithm 10. To "break t i e s " , the IPF of Algorithm 9 was used, and the remaining structure of Algorithm 10 was the same as that of Algorithm 9. The r e s u l t s of Algorithm 10 were a further improvement. Test Functions A and B were optimally r e a l i s e d , as were representatives of a l l 3-variable PCN Equivalence Classes except Class 4, (see F i g . 5.1), which was suboptimally r e a l i s e d . The experience of Algorithms 4-10 l e d to Algorithm 11, which i s the most succ e s s f u l of those so f a r developed. The f a i l u r e of Algorithm 10 on Class 4 was analysed, and the simplest IPF which would remedy t h i s was chosen. This IPF i s simply -Hy/(C-{x^x^})z; i n other words, we seek a function z = N0R(x^ ,Xj ) such that the uncertainty i n the target function y, given the replacement of {x^,x^.} i n the cover C by z, i s minimised. Once again, the IPF of Algorithm 9 was used to break t i e s . Algorithm 11 (1) Define C as the current minimal set of v a r i a b l e s forming a cover. Define LR as the current set of rejected t r i p l e s : (c^,c^.,k), where c^, c^ are names of v a r i a b l e s i n C, and k i s an integer d e f i n i n g one of the N four NOR-functions of {c., c.}. Define L as a truth table with 2 rows. i J (2) Put columns for the functions {x.; (i=l,N),y} i n L. Put the v a r i a b l e 62 names {x ; i=l,N} i n C. Empty LR. Set K = N+2. (3) Set T = -N. Set T £ = 0. Set 1 = 1 . (4) Generate the I*"*1 t r i p l e (c.,c.,k), where z, i s the function i n x j R {C.AC . , c u e , C.AC. , C.Ac.} i J i 3 i 3 i 3 (5) I f ( c ^ c ,k) i s i n LR, set I = I+l, and go to (4). (6) Generate the function z of {c.,c.} as the Kt1ri column of the truth table k x 2 L. Compute IPF(IO) = -Hy/Az for the t r i p l e (c.,c.,k), where i J A = C-{c.c.}. 1 J (7) I f IPF(IO) < T , set I = I+l, and go to (4). (8) If IPF(IO) > T , go to (11). (9) I f IPF(IO) = T , compute IPF(8) = Hc.c./z yA for the t r i p l e (c.,c.,k), -L i 3 k x 3 where A = C-{c.c.}. 1 J (10) I f IPF(8) < T , set I = I+l and go to (4). (11) I f z, i s the same function as some function i n C (or i t s complement), k put (c^,c.,k) i n LR, set I = I+l and go to (4). th (12) If the t r i p l e (c.,c.,k) has passed these t e s t s , store the K column of X J L (the function denoted by z ) i n the 2 N-vector LF. Set = IPF(10) and T 2 = IPF(8). (13) If the possible t r i p l e s (c^,c^.,k) generated from C have not been exhausted, set I = I+l, and go to (4). (14) LF now contains the best new function according to IPF(IO) (and IPF(8) th i f necessary). Put LF i n the K column of L, and set K = K+l. (15) I f the new function i s the function y or i t s complement, stop. (16) Otherwise, put z i n C, delete any redundant v a r i a b l e s from C, other than z^, and go to (3). Algorithm 11 has been given i n more d e t a i l than previous algorithms, not because i t i s more complex, but to show some book-keeping d e t a i l s . In f a c t , i t was implemented i n less than 200 FORTRAN statements (including 63 subroutines). Algorithm 11 generated "minimum-gate" r e a l i s a t i o n s f o r Test Function A and B, and for a function from each of the 14 3-variable PCN Equivalence Classes i n a t o t a l time, in c l u d i n g compilation, of 18 sees on the IBM 360/67. In addition, no "dead-ends" were produced. As regards functions of more than 4 v a r i a b l e s , i t i s l i k e l y that a more s e l e c t i v e search for t r i p l e s (c^,c^.,k) w i l l be necessary for economical design. Algorithm 11 was t r i e d on a complex function of 6 v a r i a b l e s (needing c. 35 2-input NOR-gates for i t s r e a l i s a t i o n ) . I t generated only seven new functions i n 30 sees. These were "good" functions, but i t was considered that techniques such as "tree pruning" and "look forward" to depths greater than one could be u s e f u l l y applied. 5.4 Design of Multiple-Output Functions So f a r , we have only discussed single-output functions y = f ( x ^ , . . . , x ^ ) . Many design problems concern the r e a l i s a t i o n of an output vector Y = { y^; i=l,M), where each y_^ = f ^ (x^,. . . ,x^). Although each y^ can be r e a l i s e d separately, i t i s usually d e s i r a b l e to r e a l i s e the whole vector at once. Adaptations of con-ventional design methods e x i s t (27). It i s a r e l a t i v e l y simple matter to adapt the algorithm of Section 5.4 to the multiple-output case, by r e p l a c i n g the v a r i a b l e y i n the IPF by the vector (or source) Y. I f , during an i t e r a t i o n of the algorithm, some y^ of Y i s generated, then we replace Y i n the IPF by Y' = Y-y , and proceed. Algorithms 9, 10 and 11 (the more succe s s f u l ones f o r single-output functions) were adapted i n t h i s way, to give Algorithms 12, 13 and 14 r e s p e c t i v e l y . They were tested on p a i r s formed by taking a representative function from two d i f f e r e n t 3-variable PCN Equivalence Classes. The r e s u l t s r e f l e c t e d the r e l a t i v e ° performance of Algorithms9, 10, and 11, namely that Algorithm 14 was better than Algorithm 13, which was better than Algorithm 12. A t y p i c a l set of r e s u l t s i s 64 given i n Table 5.2. I t i s c l e a r that these algorithms are le s s s u c c e s s f u l for multiple-output functions. This may be because the information requirements of i n d i v i d u a l functions i n Y may be very d i f f e r e n t , with the IPF t r y i n g to "average" these requirements and f a l l i n g between two stools i n the process. Probably a more e f f e c t i v e algorithm would use an IPF vector, with each element being the IPF for some y. i n Y. PCN Equiv-alence Class p a i r . // Gates for "min.-gate" r e a l i s a t i o n . # Gates for Algorithm 12. // Gates f o r Algorithm 14. (2,4) 5 5 5 (2,5) 5 5 5 (2,6) 3 3 3 (2,7) 5 6 6 (2,8) 6 6 6 (2,10) 5 5 5 (2,11) 4 5 5 (2,12) 4 4 4 (2,14) 7 9 9 (4,5) 6 DNF 6 (4,6) 5 5 6 (4,7) 5 5 5 (4,8) 8 9 9 (4,10) 6 DNF 8 (4,11) 6 9 9 (4,12) 7 7 10 (4,14) 6 13 6 Table 5.2 65 6. CONCLUSIONS In t h i s t h e s i s , we have presented a model f or switching functions based on Information Theoretic considerations. The model allowed us to p r e d i c t some of the us e f u l properties of switching functions; Unate Cascade R e a l i s a b i l i t y , Monotonicity, D i s j u n c t i v e Decomposability, and to some extent at l e a s t , Symmetry and Linear S e p a r a b i l i t y . P a r t i t i o n of functions i n t o PCN Equivalence Classes using the source entropies, which have some extra-mathematical s i g n i f i c a n c e , i s po s s i b l y the most u s e f u l of these r e s u l t s . The model l e d to a new switching c i r c u i t design philosophy, which does not use algebraic properties of the output function, but bu i l d s a r e a l i s a t i o n from the inputs to the output, using allowed forms of gates. The search f o r sui t a b l e interconnections between gates i s guided by the informational properties of the intermediate functions. Any t o p o l o g i c a l constraints can also be incorporated into the search procedure, a d i f f i c u l t y with conventional design methods. It i s considered that the r e s u l t s presented j u s t i f y the model proposed. But i t i s also c l e a r that much more work could be done. As regards the model i t s e l f , there are other aspects of Information Theory which might f i n d a p p l i c a t i o n i n d escribing switching functions. We have been using such concepts as j o i n t entropy and mutual information almost e x c l u s i v e l y , but i t i s pos s i b l e that, for example, Coding Theory might be ap p l i c a b l e . On the other hand, the sources might be described as z e r o ^ - o r d e r Markov Sources. Might not higher-order Markov Sources have some a p p l i c a t i o n to sequential c i r c u i t s ? I t should be possible to extend the a n a l y t i c a l r e s u l t s of Section 4. In p a r t i c u l a r , the p a r t i a l r e s u l t s on monotonic and threshold functions suggest that more general r e s u l t s might be awaiting discovery. One of the u s e f u l aspects of the model i s i t s being a source of conjectures and analogies. For example, some functions are more "complex" than others i n being more 66 expensive to r e a l i s e with most types of elements. There appears to be some c o r r e l a t i o n between the magnitudes of source entropies and the cost of r e a l i s a t i o n f o r a function. (Cf. PCN Equivalence Classes 2 and 8 i n Tables 3.1 and 3.2). Much more work can be done on the design algorithms, p a r t i c u l a r l y to incorporate more h e u r i s t i c s i n t o the search procedures. The algorithms described i n Section 5 consider every possible "next fun c t i o n " to a depth of one. For larger numbers of v a r i a b l e s , t h i s number of possible functions becomes excessively large. I t should be possible to apply "pruning" techniques to eliminate many dead-ends at an early stage of the design. The multiple-output algorithms also are s t i l l rather rudimentary. This design approach, however, w i l l only be j u s t i f i e d i f i t proves u s e f u l i n the design of p r a c t i c a l c i r c u i t s (with t o p o l o g i c a l constraints) more simply and economically than conventional methods. 67 APPENDIX 1 (Proof of Theorem 2 i n Section 3.2.1) If y = f(x^,..,x^) i s some switching function, we may define [Y] N N as the 2 *2 r e a l diagonal matrix such that Y.. = -1 i f y. = 1, and Y.. = +1 & X I 1 1 1 otherwise. S i m i l a r l y we may define Y_ as a column vector such that = -1 i f y. = 1, and Y. = +1 otherwise. (See (15)). l l We may uniquely represent y by i t s "coordinates" i n l i n e a r function space; by a column vector ay_ or a matrix [ay]. I f we represent y by [Y], then: [ay] = ( 1 / 2 ) . [ A N ] . [ Y ] . [ A N ] . (A.l) N N where [ay] i s a 2 x2 matrix such that the f i r s t column (row) i s the coordinate vector ay_. The remaining columns (rows) are permutations of the f i r s t column (row). The matrix [A^] i s given r e c u r s i v e l y by: , [A Q] = 1. (A.2) N/2 It can be shown that [A^]/(2 ) i s orthonormal, and from equation A.2, symmetric. Hence: [AJ-MAJ.] = 2 N. [I]. (A.3) Using the vector form, for y, the transformation i s given by: oy_ = (1/2). [A^.Y . (A.4) The inverse transformations e x i s t , from equation A.3, and are: [Y] = ( 2 " 1 2 - N ) - [ A N ] - [ a y ] - [ A N ] . (A.5) Y = ( 2 1 _ N ) . [A^-ay.. (A.6) 2 If we define a y to be the column vector such that each element i s the square of the corresponding element i n ay_ then we require to show that 2 knowledge of a y i s s u f f i c i e n t to i d e n t i f y the CN Equivalence Class to which y belongs. We are therefore i n t e r e s t e d i n those functions z which are such that 2 2 a y = a z. We can write t h i s condition as: 68 ay = [V]. az, (A.7) where [V] i s some switching function v i n matrix form. We can further say that any such function z can be written, not neces-s a r i l y uniquely, i n the form of: Z = [P]. Y, (A.8) 2 where [P] i s +1 times a permutation matrix. (Since the f i r s t elements of a y 2 and a z are the same, from equation A.7, then the "weight" or number of "ones" i n z i s the same as that of y or y. Hence a matrix of the form of [P] describes every z s a t i s f y i n g equation A.7. We wish to count the number of matrices [P] s a t i s f y i n g equations A.7 and A.8. From equations A.6 and A.7: ,1-NN r. . ,„1-Ns Z (2 X " ) . [ A N ] . oz = (2- " ) . [ A N ] . [V]. oy_ -N = (2 ). [A^]. [V]. [A^]. Y, from equation A.4. 1-N Hence Z_ = (2 ). [av] Y, from equation A . l . 1-N However, from equation A.8, we require (2 ) [av] to be i n the form of some [P]. But we know (a) that any [P] has only one non-zero element (+1) i n i t s f i r s t row; (b) that any [av] matrix i s determined when i t s f i r s t row i s known, since other rows are known permutations of the f i r s t . Hence the number of possible [P] matrices i s twice the number of possible N f i r s t rows of a permutation matrix. Since there are 2 ways of plac i n g the non-zero N+1 element, and two ways of choosing i t s value (+1), there are exactly 2 possible transformation matrices [P] s a t i s f y i n g equations A.7 and A.8. N+1 But the 2 CN transformations s a t i s f y equations A.7 and A.8, and hence 2 2 the only functions z which have a z = a y are those equivalent to y under some CN transformations. This completes the proof of Theorem 2. 69 REFERENCES 1. C.E. Shannon, "A Symbolic Analysis of Relay and Switching C i r c u i t s " , Trans. AIEE, Vol. 57, pp. 713-723, 1938. 2. M. Karnaugh, "The Map Method f o r Synthesis of Combinational Logic C i r c u i t s " , AIEE Trans., Part 1: Communications and E l e c t r o n i c s , Vol. 72, pp. 593-599, Nov., 1953. 3. E.J. McCluskey, "Introduction to the Theory of Switching C i r c u i t s " , McGraw-H i l l , New York, 1965. 4. M.P. Marcus, "Switching C i r c u i t s f o r Engineers", P r e n t i c e - H a l l I n t e r n a t i o n a l , London, 1962. 5. H.A. C u r t i s , "A New Approach to the Design of Switching C i r c u i t s " , D. Van Nostrand, Princeton, N.J., 1962. 6. J. Earle, "Synthesising Minimal Stroke and Dagger Functions", IRE Trans. on C i r c u i t Theory, Vol. CT-7, p. 144, Aug., 1960. 7. E.L. Lawler and G.A. Salton, "The Use of Parenthesis-Free Notation f o r Automatic Design of Switching C i r c u i t s " , IRE Trans. Ele c t r o n . Computers, Vol. EC-9, #3, p. 342, 1960. 8. J.P. Roth, "Minimisation over Boolean Trees", IBM J . Res. and Dev., Vol. 4, #5, pp. 543-558, Nov. 1960. 9. H.A. C u r t i s , "A Functional Canonical Form", J. ACM, Vol. 6, #2, A p r i l 1959. 10. R.C. Minnick, "A Survey of M i c r o - c e l l u l a r Research", J . ACM, Vol. 14, pp. 203-261, A p r i l , 1967. 11. C.E. Shannon and W. Weaver, "The Mathematical Theory of Communication", The U n i v e r s i t y of I l l i n o i s Press, Urbana, I l l i n o i s , 1949. 12. R.H. Wilcox and W.C. Mann, "Redundancy Techniques f o r Computing Systems", Spartan Books, Wash., D.C., 1962. 13. N. Abramson, "Information Theory and Coding", McGraw-Hill, New York, 1963. 14. S. Golomb, "On the C l a s s i f i c a t i o n of Boolean Functions", IRE Trans. Inform. Theory, Vol. IT-5, pp. 176-186, 1959. 15. M.A. Harrison, "Introduction to Switching and Automata Theory", McGraw-Hill, New York, 1965. 16. T. Kameda, "A New Descriptor f o r the Equivalence Class pf a Boolean Function", Tech. Note #2, D i g i t a l Systems Laboratory, Princeton U n i v e r s i t y , Jan., 1967. 17. R.F. Arnold and M.A. Harrison, "Algebraic Properties of Symmetric and P a r t i a l l y Symmetric Boolean Functions", IEEE Trans. Electron. Computers, Vol. EC-12, //3, p. 244', 1963. /u 18. A. Mukhopadhyay, "Unate C e l l u l a r Logic", IEEE Trans. Computers, Vol. C-18, #2, pp. 114-121, 1969. 19. M.L. Dertouzos, "Threshold Logic: A Synthesis Approach", MIT Press, Cambridge, Mass., 1965. 20. R.O. Winder, "Single Stage Threshold Logic", Switching C i r c u i t Theory and Logic Design, AIEE S p e c i a l P u b l i c a t i o n S-134, pp. 55-64, Sept. 1961. 21. S. Yajima and T. Ibaraki, "A Theory of Completely Monotonic Functions and i t s A p p l i c a t i o n to Threshold Logic", IEEE Trans. Computers, Vol. C-17, #3, pp. 214-229, 1968. 22. R.O. Winder, "Threshold Gate Approximations based on Chow Parameters", IEEE Trans. Computers, Vol. C-18, #4, p. 372, 1969. 23. C.K. Chow, "On the Characterisation of Threshold Functions", Switching Theory and L o g i c a l Design, AIEE Special P u b l i c a t i o n S-134, pp. 34-38, Sept., 1961. 24. E. Levine, "On the Characterising Parameters of a Threshold Function", IEEE Trans. Computers, Vol. C-17, #7, p. 696, 1968. 25. G.A. Maley and J . Earle, "The Logic Design of Tr a n s i s t o r D i g i t a l Computers", P r e n t i c e - H a l l I n t e r n a t i o n a l , London, 1963. 26. R.M. Fano, "A H e u r i s t i c Discussion of P r o b a b i l i s t i c Decoding", IEEE Trans. Inform. Theory, Vol. IT-9, p. 64, A p r i l , 1963. 27. T.C. Bartee, "Computer Design of Multiple-output L o g i c a l Networks", IRE Trans. Elec t r o n . Computers, Vol. EC-10, #1, pp. 21-30, 1961.
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Switching circuits as information networks
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Switching circuits as information networks Matheson, William Stephen 1970
pdf
Page Metadata
Item Metadata
Title | Switching circuits as information networks |
Creator |
Matheson, William Stephen |
Publisher | University of British Columbia |
Date Issued | 1970 |
Description | A single-output combinational switching network has a number of input terminals, each carrying a signal variable which may take one of two values, and an output terminal, the signal variable of which has a value determined ideally by the input signals only. In this thesis, we make an arbitrary assignment of probabilities to each of the possible configurations of input signal values, (namely that each configuration is equally likely). This is an interpretation of switching variables as random variables with known statistics. We can therefore define and compute the joint source entropy of sets of variables, including the output variable. We use these information quantities, or entropies, to classify switching functions into Equivalence Classes under Permutation and Complementation of input variables, and Negation of the Function. The entropies can also be used to predict some of the useful properties of switching functions, in some cases more simply than conventional methods which employ Boolean Algebra. The model also suggests a switching circuit design philosophy based on the idea of using circuit elements, or gates, to pass information in the input signals which is relevant to the output, while blocking the irrelevant information. Several algorithms are described, and their performance on the design of circuits with small numbers of variables is encouraging. The design philosophy seems particularly able to handle topological constraints, of the type becoming significant in modern switching circuit design. |
Subject |
Switching theory |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2011-05-24 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0102160 |
URI | http://hdl.handle.net/2429/34785 |
Degree |
Doctor of Philosophy - PhD |
Program |
Electrical and Computer Engineering |
Affiliation |
Applied Science, Faculty of Electrical and Computer Engineering, Department of |
Degree Grantor | University of British Columbia |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |
Download
- Media
- 831-UBC_1970_A1 M38_2.pdf [ 4.24MB ]
- Metadata
- JSON: 831-1.0102160.json
- JSON-LD: 831-1.0102160-ld.json
- RDF/XML (Pretty): 831-1.0102160-rdf.xml
- RDF/JSON: 831-1.0102160-rdf.json
- Turtle: 831-1.0102160-turtle.txt
- N-Triples: 831-1.0102160-rdf-ntriples.txt
- Original Record: 831-1.0102160-source.json
- Full Text
- 831-1.0102160-fulltext.txt
- Citation
- 831-1.0102160.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0102160/manifest