PERCEPTRON-LIKE MACHINES by KRIS LANCE KRAFT ' A.B. , University of C a l i f o r n i a (Berkeley), 19^5 A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE In the. Department of MATHEMATICS We accept this thesis as Conforming to the required standard The University of A p r i l B r i t i s h 1 9 6 9 Columbia In p r e s e n t i n g t h i s t h e s i s i n p a r t i a l f u l f i l m e n t o f t h e r e q u i r e m e n t s f o r an a d v a n c e d d e g r e e a t the U n i v e r s i t y o f B r i t i s h C o l u m b i a , I a g r e e t h a t t h e L i b r a r y . sha11 make i t f r e e l y a v a i l a b l e f o r r e f e r e n c e and S t u d y . I f u r t h e r a g r e e t h a t p e r m i s s i o n f o r e x t e n s i v e c o p y i n g o f t h i s t h e s i s f o r s c h o l a r l y p u r p o s e s may be g r a n t e d by the Head o f my Department o r by h i s r e p r e s e n t a t i v e s . It i s u n d e r s t o o d t h a t c o p y i n g o r p u b l i c a t i o n o f t h i s t h e s i s f o r f i n a n c i a l g a i n s h a l l n o t be a l l o w e d w i t h o u t my wr i t t e n permi ss i o n . D e p a r t m e n t o f The U n i v e r s i t y o f B r i t i s h C o l u m b i a V a n c o u v e r 8, Canada Date Supervisor: R. S. Rosenberg. i i . . ABSTRACT This paper investigates perceptron-like devices with an eye toward the recognition of simple predicates, such as p a r i t y and connectivity. F i r s t "multilayer" perceptrons are investigated to l i t t l e a v a i l . Then much more powerful perceptrons which have feedback are considered, y i e l d i n g better r e s u l t s . Examiners i i i . TABLE OP CONTENTS Page INTRODUCTION 1 TEXT 4 FEEDBACK PERCEPTRONS 22 • ORDER OF PARITY PREDICATE FOR FEEDBACK'PERCEPTRONS 26 CANNONICAL OPERATIONS FOR FEEDBACK PERCEPTRONS 27 ORDER OFvCONNECTIVITY PREDICATE FOR FEEDBACK PERCEPTRONS 33 BIBLIOGRAPHY 6 l LIST OF FIGURES FIGURE 1 - Simulation of a p a r i t y acceptor - accepting 2 - Continuation of Fig . 1. 3 - Simulation of a p a r i t y acceptor - r e j e c t i n g - Continuation of Fig . 3-5 -'Simulation' of Mc - accepting. r O - Simulation of Mc - re j e c t i n g 7 - A connected figure from Perceptrons 8 - A d i g i t a l version' of Fig . 7. 9 - Figure 8 at acceptance time 10 - A disconnected figure from Perceptrons 11 - A d i g i t a l version of F i g . 10. 12 - Figure 11 at re j e c t i o n time. INTRODUCTION The purpose of this thesis i s to further explore the c a p a b i l i t i e s of various kinds of "perceptron-like" devices. The main i n s p i r a t i o n , s p i r i t ^ approach and source of style w i l l be a monograph of Minsky and Papert e n t i t l e d Perceptrons and Pattern Recognition, (Sept. 1 9 6 7 ) . This monograph* i s remarkable, not so much for impressive results as f o r i t s straight-forward unassuming approach which removes much of the mystery (and pote n t i a l controversy) which has dogged e a r l i e r perceptron investigations. The Minsky paper explores the complexity of various l o g i c a l and geometric concepts with respect to a very simple kind of perceptron, a l i n e a r separation machine^ where there i s no "feed-back" or other communication among the "associator un i t s " . B r i e f l y these perceptrons operate i n the following manner: They are presented v/ith a rectangular array ( c a l l e d a retina) of squares which may be i n the active' or "inactive state. The so-c a l l e d "associator u n i t s " then look at small parts of the r e t i n a and decide on that basis to vote "yes" or "no". A l l votes are t a l l i e d and a p a r t i c u l a r weighted average taken of the votes. How the average i s weighted, of course, varies from machine to machine. If the weighted average i s above a cert a i n threshold * The monograph (plus some extended results) has just been pub I T ished i n book form. Perceptrons, M. Minsky and S. Papert, M.I.T, Press ( 1 9 6 9 ) . Some of the material i n this paper has been a n t i -cipated though not expanded upon i n Perceptrons on pages 2 2 8 - 2 3 2 , a section which did not appear i n 19^7, 2. the machine i s said to "accept" the p a r t i c u l a r activated subset of the r e t i n a , otherwise there i s rejection. The set of a l l accepted subsets i s said to define a predicate or a boolean function on the r e t i n a . The machine i s said to recognize the predicate so defined* Perceptrons or c l o s e l y related devices are often used i n pattern recognition programs for computers because they are so simple i n design (and easy to program). Rosenblatt did much work i n the pattern recognition area with such machines "generated at random". Other "learning" programs, such as the Samuel Checker Player, also c l o s e l y resemble perceptrons. Samuel, fo r example, has many so-called "board parameters" such as "number of pieces", and "number of possible moves". These parameters are his associators, for they grasp only a small part of the t o t a l inform-ation on the board. His parameters are combined i n wieghted averages to help determine what i s supposed to be the best checker ' move. Minsky on the other hand i s not concerned with the a p p l i -cations of perceptrons, but rather with the t h e o r e t i c a l l i m i t of their a b i l i t y . To explore this question he defined a very r e s t r i c t e d and s p e c i f i c kind of perceptron and investigates that. Unfortunately'these l i n e a r separation machines are not powerful enough to deal with many concepts ( i . e . , predicates) such as p a r i t y and connectivity which seem to the human at l e a s t , to be on the most elementary l e v e l . We are constantly surprised with 3 . the d i f f i c u l t y (high order) of some tasks as compared to the ease (low order) of others. This paper w i l l explore other designs f o r perceptrons along the l i n e s of Minsky with an eye toward the d i f f i c u l t y of , computation involved i n various tasks. B a s i c a l l y , the thesis w i l l explore perceptrons with many .associator " l e v e l s " , and then perceptrons which allow feed-back from machine to retina . The f i r s t alternative, w i l l not be espec i a l l y f r u i t f u l . However,.the second subject (feedback perceptrons) w i l l be quite i n t e r e s t i n g and deserving of-much more study than i t can get here. \ PERCEPTRON-LIKE MACHINES ¥e assume that we have a machine which i s presented with a rectangular array R of squares which may he occupied (blacked in) or not. We wish to say certain things about the array (often c a l l e d the r e t i n a ) . D e f i n i t i o n : A predicate function on R i s a function cp : P(R) - {0,1} , where P(R) i s the set of a l l subsets of R. A predicate i s written as [cp] where [ l ] = T and [0] = F . For any X c R we say [cp(X)j holds or i s true i f f cp(X) = 1 . • The two terms w i l l be used interchangeably henceforth.' 2 I * I Proposition: There are 2 predicates on R . IRI Proof: There are c l e a r l y 2 1 1 subsets X cz-R i n the domain of any predicate. Certain X's i n dom (cp) are "accepted by cp " ( i . e . cp(X) = 1 ) and ce r t a i n are "rejected". Thus cp i s uniquely described by the set of X's accepted i . e . by a subset of dom (cp) . Since there are 2 ^ ^ subsets of dom"(cp) there ?|R| are 2^ predicates cp . A p a r t i c u l a r l y simple and useful class of predicates are c a l l e d masks. i s a predicate such that either or there i s a set A c R such that D e f i n i t i o n : A mask cp^ cp(x) = 0 f o r any X c R cpA(X)' = 1 i f f X => A for a l l X c R . The zero mask w i l l be denoted as cpQ ; the one mask ( l o g i c a l l y ) as tp^ . D e f i n i t i o n : Suppose $ i s a.set of masks. We say that i|f i s r e a l i z a b l e from $ by a simple mask perceptron i f f f o r a l l X c R [\|i(X) = l ] E a cp.(X) _> 9- for some reals a ,6 . The simple mask perceptron which can be thought of as a separate piece of "concrete" hardware (consisting of the threshold 6 V. the c o e f f i c i e n t s a and the masks cp. ) i s said . c ? i i to compute cp from § . We write cp e S.M.P. ('$) i n t h i s case. Remark: We could have assumed 0 = 0 3 a to be r a t i o n a l or * i even i n t e g r a l i n R , and we could also have written > instead of >_ > obtaining an equivalent d e f i n i t i o n . Proof: " ( i ) I f we make the assumption (which we w i l l henceforth make) that cp^ e § then S a cp ±(X) > G i f f S a «P ±(X) + (a - 9 ) ^ ( X ) > 9 cp.^^ showing that 9 can always be taken to be zero. ( i i ) Only a f i n i t e number (<_ 2^R') of sums S(X) = £ a cp. (X) cp ±€6 «Pi 1 w i l l occur. I f h i s the minimum, nonzero pairwise difference 6. of the f i n i t e set of sums (S(X)} we then have S(X) _> 0 i f f h / S(X) > 0 - /2 f o r a l l X c R , showing that the >_ could be taken as >~ . ( i i i ) I f h i s as before then we know that i t i s possible to pick a' e Q close enough to a such that ^ i ^ i o o £ a cp, (X) + a' cp, (X) - S a cp.(X) . cp, x v cp, ' cp. 1 V ' '1 o ^ o < h 21 $' fo r a l l X c R . I f a l l such a ' s are so replaced we c l e a r l y cp. have: S a' cp.(X) > s a' cp. (Y) i f f E a cp.(X) > E a cp.(Y) cp. i v cp. i v cp. i v 7 cp. x v 7 that i s preservation of order f o r a l l of our f i n i t e number' of sums, showing that a 's can be taken to be r a t i o n a l . M u ltiplying through by the l e a s t commom denominator shows the same thing f o r integers. We now define a (formally) more complicated machine, which w i l l turn out to be equivalent i n computing power. D e f i n i t i o n : Suppose § i s a set of predicates. We say that cc i s r e a l i z a b l e from § by a l i n e a r separation perceptron i f f [cp(X) = 1] «-* E a cp.(X) > 6 for some reals a ,6 cp,e$ ^ i . ^ i 7 . and as before we write \jj e L.S.P. (§) R Remark: As before we could have assumed 6 = O.ct e Z and < v i to be < . We s h a l l henceforth r e f r a i n from making this kind of remark e x p l i c i t . Proposition: I f $ = the set of a l l masks then any predicate e" S.M.P. (*) . Proof: We write \|r in..disjunctive normal form ^ C 1 ( X ) v C 2 ( X ) v . . .vC n(X) where each C (X) = [z. ] A [ Z . ] A . . . A [ Z . ] and 1 1 1 • 2 m i each [ z ± . ] = [X^] or [X k] = [ l " x k ] for some X R e R We then have C. (X)*-» z. *z. • «z. = 1 1 2 >•— m i We now replace each z±. by X R or 1-Xk obtaining a polynomial 8. i n the X^'s only. No appear, though some terms w i l l be negative and a l l w i l l therefore be definable by a mask cp or i t s negative (-l)cp . Therefore C ± ( X ) ^ S cp(X) - E_ $(x) > i . cpe$^ c]5€$^-(Note that 0 and 1 are the only values attainable.) and U ( X ) ] « - » . S cp(X) - cp(X) > 0 since cpe? cpe$ i|i(X) w i l l hold i f f any one of the conductions C i s true. We now introduce the two concepts of support and order, which are to be some measure of how complicated a predicate i s supposed to be. • ... De f i n i t i o n : The support A of a predicate cp .is the smallest subset A c R upon which cp depends, i . e . f o r which cp(XnA) = cp(A) for a l l X c R . We write A = supp (cp) . De f i n i t i o n : We say that IJJ i s of order k i f f k i s the smallest number such that i K X ) * - > E a cp ± (X) > 0 where the cp^'s are a l l predicates with | supp cp^| <_ k .and the a 's are some r e a l numbers. Theorem: \i/ i s of order k i f f k i s the smallest number such m that cp(X)3->E a cp (X) > 0 where the cp^'s are a l l masks with p=l cpp p - p I supp (cp )| <_ k f o r a l l p and the a ' s are re a l s . Discussion: This theorem says i n ef f e c t that the L.S.P.'s and the S.M.P.'s are equivalent i n computing power. These machines are the center of study i n the Minsky paper. Unfortunately the complexity of predicates with respect to these machines (and the order measure of complexity) does not correspond to the human notion of what kind of things are "easy" and what kind of things are " d i f f i c u l t " . For example, we can say that the l o g i c a l operation of""f (negation) i s easy i n that i t does' not increase the order of "fiji over that of \i . On the other hand the oper-ations of v "or" and A "and" can ( i n a way which w i l l . be -.made more precise l a t e r ) not even be considered as being of f i n i t e order. In the geometric realm, the counting of points and even the recognition of integers i s "easy" as i s the a b i l i t y to recog-nize c e r t a i n topological invariants. On the other hand, recog-n i t i o n of p a r i t y , and connectivity i s not of f i n i t e order. These disappointing results could be viewed as a defect of the simple kind of machines used, or of the order measure, or both. We could, f o r example, consider complexity as a function of both the support size of the cp^'s and the number of masks or other types of predicates used. Such a theory could be quite i n t r a c t a b l e , so we consider instead various methods of strengthening 10. the machines. We must be careful though: a machine which i s too strong w i l l recognize a l l predicates i n a small order and be quite uninteresting. We w i l l , of course, be s a c r i f i c i n g one of the "seductive" attributes of the perceptron, namely that i t computes i n a very straight-forward (and e a s i l y simulated way) by f i r s t computing a l o t of easy to evaluate, simple information and then combining this p a r t i a l information by a simple algorithm. . Proof: Let ^k be the l e a s t number such that #(x)#-»£ a, i< (X) > 0 where the i|f 1 s are predicates with | supp ( s O I <. k for a l l p . We w i l l replace each it by a sum of the form E a cp where the cp' s are • p . cp * Cp€9 v masks such that max | supp cp | = | supp cp j the-theorem w i l l then cpes • -follow. Let p ^ m . As before we w i l l write ^ (X) = C ^ ( X ) v c | ( X ) v . . . v c P ( X ) . where C?(X) = [z? ] A [ Z ? ] A . . . A [ Z P ] and 1 x l x 2 mi [ z j j ] = [x] or [1-X] . As before we obtain C?(X)<?-5> S cp(X) - J cp(X) >_ 1 where cp€l^ cpe$^ $P and 5? a r e index sets of masks. The thing to notice i s the 11. identity. aX|3 = a(l-X)p = (a-aX)p = a£ - aX£ where a and p are conductions. This i d e n t i t y shows that the maximum support of the cp1 s and the cp's w i l l be supp C?(X) . And since at . lea s t one of the C? w i l l have maximal support = supp (X) we J. p can write U(X)3«-*S a ( I cp(X) - JZ cp(X)) > 0 p p cper cpe<r • . • • Remark: So fa r our predicates have depended upon the r e t i n a R . We wish to extend the d e f i n i t i o n of predicate to predicate schema, a term which w i l l then be used interchangably v/ith predicate. D e f i n i t i o n : A predicate schema i s a rule which generates a s" p a r t i c u l a r predicate f o r a re t i n a R of any given si z e . Examples:• ( l ) Denote the p a r i t y predicate by i!ppj{ which has the property that ilf(X) =1 i f f | x | i s even f o r every r e t i n a R and f o r a l l X c R . ( 2 ) Denote the connectivity predicate by ^QQNN W H I C ^ ^ A S the property that I'rjoNN = X i s c o n n e c i : e d f o r every r e t i n a R and for a l l X c R . We say that a figure X c R i s connected i f f every point x e X can be connected to any other point y e X by a r e c t i l i n e a r path i n R . We do not permit 11 diagonal connections" as they would permit two paths to "cross" without "toughing". Notation: A r e c t i l i n e a r path y in-- R 'going from x to y w i l l be denoted as y = '{.x = X Q , X 1 , X 2 , ... ,x n_ 1,x n=y} where x^ ^ and x. have a side i n common f o r i = 0,1,...,n-l. 12. De f i n i t i o n : A predicate (schema) i s of f i n i t e order i f f the predicates generated by are uniformly bounded. The minimum uniform bound M • i s c a l l e d the order of . We now begin our search for more complicated kinds of machines which can cope with typ^R a n * ^CONN * ^° ^ a r w e l i a v e dealt with machines of the following design: RETINA A S S O C I A T O R U N / I T S S U M M ^ o l J A U G - O R V m H The associator units have been masks or other predicates the machine has been equipped with. ' So f a r they have a l l been on ' one " l e v e l " and cannot communicate among each other. We now define a 2-level perceptron with the following'design. We w i l l make our d e f i n i t i o n i n a form which can e a s i l y be generalized by induction. 13. D e f i n i t i o n : A predicate i|i i s re a l i z a b l e from a p a i r of sets of predicates § and s o i f f for a l l X c R iif(X) =1 <$-> I, a (Xn ) _> 6 fo r some reals a ,9 where X-^ = {cp-^ ^ | cp-^X) — 1} . We write % e 2LP ($ 1,$ 2) . Remark: We could simplify our notation (and our thoughts) i f we thought of our f i r s t l e v e l of associator u n i t s , $^ , as a new r e t i n a , the output o f our 2-level p e r c e p t r o n Mt M p = < 8 > ^^llh^l > % 2 . c P 2 i 3 i = l > a t t i m e t = 0 a s X <= R } the output at time t = 1 as X]_ E % a n ^ output at t = 2 as X 2 = {cp2i | c p g i ^ l ^ = 1 3 — R2 a n d t h e ou'fcPu't t i r a e t = 3 as 0 or 1 . De f i n i t i o n : A predicate % i s re a l i z a b l e from an n-tuple of sets of predicates l - i - F i ^ i i f f f o r a I 1 X c R *(X) = 1*-* S a cp . (X ) > G a 9 € R cp n i£$ n ^ n i n i n 1 ^ n i where X s X c R o — Xk = ^ k j € §k I O f c j ^ - l ) = 1 3 f o r k - l , . . n . We write 4 e nL.P. U. I3? _ . Y L I i = l Ik. t h Remark: As before we w i l l t h i n k of the i l e v e l of a s s o c i a t o r s §^ as a new r e t i n a R^ f o r the next l e v e l i+ 1 . The output of the n l e v e l perceptron P at time t w i l l be Xfc f o r t = 0,...,n and 0 or 1 f o r t = n+1 . D e f i n i t i o n : The order of v/ith respect to an n - l e v e l perceptron P w i l l be the l e a s t number k such that' \ji e nL.P. and cpj_ e > f supp cp± | <_ k f o r a l l i . Theorem: I f \\j has order k w i t h respect to a 1 - l e v e l perceptron ( l i n e a r s e paration perceptron) then \ji has order <_ [jj k ] + 1 w i t h respect to a 2 - l e v e l perceptron. N.B. [ ] denotes.the " "greatest i n t e g e r f u n c t i o n " . • e.g. [5] = 5 > [5|-J = 5. , [J~2] = 1 . Proof: Let ty(X)*-»E a vA*) > 9 a , 8 £ R and ^ i p i . | supp cp^ | _< k f o r every i . By a previous theorem we might as w e l l assume that the cp±' s are masks. We might f u r t h e r assume that the a 's are e i t h e r + 1 f o r they may be assumed ^ i i n t e g r a l and hence + 1 i f we permit ourselves to copy a p a r t i c u l a r cp^ s e v e r a l times. We now have ijr(X)*-* 2 cp(X) - S cp(X) > e cpe§ cpe? v/ith | supp cp| _< k f o r any cp e $ U % where $ and % are sets of masks. We now design our 2 - l e v e l perceptron. For each cp e $ (or 1) d i v i d e supp cp up i n t o no more than L ^ T " ] + 1 d i s j o i n t subsets containing no more than 0 k ] + 1 elements each. Such a d i v i s i o n must be possible or we would have supp cp > (U?~k"] + 1 ) { [ $ T ] + 1) > ,J~TST = k . For each i define a mask (i^cp^) with (i!f1cp I)(X) = 1«-»X >_ S| f o r a l l X c R . Let Rx = ( U ^ ) I cp e $ i £ [fT }. + 1} . Define' (^cp) as (i!r2cp)(X1) = ^cpj.) € X± f o r a l l i < [^T ] + 1 . F i n a l l y consider the predicate p ( X ) ^ S U cp)(X ) - ^( 1lr 2cp)(X 1) > 9 , cpes • cpe$ where X.^ = {(^cp^ | (i|; 1^i)(X) = 1} . Clearly p i s r e a l i z e d from a p a i r of sets of predicates $^ = R i a n d $ 2 = [(v^cp) | cp e 1} . Clearly also by construction p i s of order <_. [^k~ ] + 1 . We wish to see p ( X ) i ] / ( X ) . But . •is(X) <?—£ £ cp(X) - Ejp(X) > 0 where ? U i contains only masks, cpes cpe§ . < And co(X) = l«-fr X 3 s j f o r a l l i ^ (t 1cp i)(X) = 1 f o r a l l (ij/^cp^ ) e X1 f o r a l l i^-$>( cp) (X.^ ) = 1 . So our sum E cp(x) - EJP(X) = s ( l / 2 c ? ) ( x 1 ) - r , ^ ( ^ 2 c P ) x 1 . cpe$ cpe<? cpei cpe§ . ' . * ( X)**p ( X ) . Theorem: perceptron If \j i s of order k with respect to a 1-level then . ijj has order <_ [$"TT ] + 1 with respect to an 1 6 . n - l e v e l perceptron. Proof J • • • • "a. Our proof proceeds as i n the previous theorem where we obtain \|r(X) <M> E cp(X) - E_cp(X) > 0 where $ and l,.:-.are cpe$ coe § sets of masks and | supp cp| < k for a l l cp e $ U § Pick any cp e § or I and divide supp cp int o d i s -j o i n t subsets Scf containing at most k ] + 1 points . i < k [JTT] + i On this " f i r s t l e v e l " define ( i^cp) (X) = 1*-* X3 s| . Now divide = { ( t ^ p ) ^ £ k + 1 } int o d i s j o i n t \ subsets S^ 0 containing at most [Vk~] + 1 points where n i o < J12 k 2 " ( [ ^ T T ] + D 2 Define U i 2cp) ( X ^ = 1 X ^ S i 2cp We now have Rgcp = Sj/j^) I i 2 — — n k ( [ T T ] + 1); 17 Proceed ind u c t i v e l y , d i v i d i n g ^ _ T _ into d i s j o i n t sub-sets s i mcp that contain at most [rxT^m ] + 1 points where im c :— — , then defining on the m^ *1 l e v e l - ( [ v H T ] + i ) m Rep = {(*, cp) | im < — k -} and ^ i m - ( [ n ^ j + ) ( 4 . cp)(X ) = l<5-?> X , D S. cp u n t i l m = n . At th i s time v v i r a y ' v m-1' m-1 — a.mv < -rr—^ rT < n R r, = ¥ = 1 • T h a t ± S a t l e V e l 1 1 • n - (^k + l ) n .- (^k ) n k For any cp e I (or %) there i s a unique (^j_NCP) H ^ N • Consider p(X) <5-* E . (i|r cp) ( X ) - S ~ ( y p ) ( X x ) > 6 where X Q = X . V - 1 = t( W < P ) € V l ' * e $ U * a n d < W ^ < V - 1 } f o r m = 0 , . . . , n - l . Clearly (?) p i s re a l i s e d by an nL.P. from t ^ i ^ i - i and i s of order <_ [rlTk' ] + 1 . But cp(X) = 1 <H> X supp cp k i f f X g . S ± cp 1-, < l" 1 ~ ( [ V T ] + 1) i f f (A. cp)(X) = 1 , 1 i f f (*. co) € X, x l • x i f f I^cp c X 1 18. i f f i f f i f f i f f X. 1 ± \ * (*i <P)(X ) (*i ^P)E X 2 R2cp c X 2 i 2 < k ( [ ^ T ] + 1)' = 1 i f f X - S> S, cp n - l — i f f c P ) ( x n _ 1 ) = 1 i . e, S (vii cp) (X 1 ) - £ (ij, cp)(X _ ) > 9 \ v n v ^ v n - l / ~^7 v vn^ / v n - l 7 cpes cpes i f f S cp(X) - cp(X) > 0 cp€$ cpe$ p ( X ) ^ f ( X ) Corollary I: I f cp i s any predicate then there i s N large ~ enough so that the order of cp <_ .2 with respect to an N-level perceptron. Proof: Let R be a ret i n a and cp a predicate on R . Cl e a r l y cp i s of order k <_ |R| with respect to a 1 - l e v e l perceptron. We conclude then that order < [^/k" ] + 1 with respect to an 1 9 . n l e v e l perceptron. Since l i m V^^L" ] + 1 = 2 and [^ /"k" ] + 1 Xi-KO • i s integer valued, we conclude that there i s an N such that N r [ ./k ] + 1 = 2 . Therefore order cp <_ 2 with respect to an N l e v e l perceptron. Most complicated predicates w i l l be of order 2 , however some may be of order 1 , f o r example, the order 1 predicates \tfith respect to a 1 - l e v e l perceptron. We might just as well mention that there i s a d i r e c t proof which may be sketched as follows. (Incidently this proof would be t y p i c a l of other machines endowed with too much power.) Let R be a r e t i n a . Form the set S of a l l pre-dicates ii • such that i s of order 2 with respect to an n-l e v e l perceptron f o r some n . We show that S = the set of a l l predicates on " R by showing that ( 1 ) S contains the masks of support 1 ( t r i v i a l ) . (2) e S —> ~ i \|f e S . This may be proved by. reversing i n e q u a l i t i e s or changing the sign of the c o e f f i c i e n t s . (3) ^ € S and ^ e S - ^ ^ v ^ e S . This f a c t may be proved considering the Ji^ and Ng-level perceptrons P^ and P, which recognize ^ and ty2 respectively and constructing a new perceptron v/ith = max {N-^Ng} + 1 i n a manner suggested by the picture: 20. C Z 3 2 . We w i l l not go into d e t a i l s . Corollary I I : , Any predicate schema ^7 i s of order ' <_ 2'with respect to the class of a l l f i n i t e l e v e l perceptrons. Proof: This statement i s obvious given the d e f i n i t i o n of predicate schema .(••arid the unstated d e f i n i t i o n of "order with respect to the class of a l l f i n i t e l e v e l perceptrons") since a l l t|i generated by are uniformly bounded i n order by 2 . We mention the " r e s u l t " only to clear up l i n g e r i n g ambiguities. Discussion: The proofs of the l a s t two theorems seem formidable, mostly because of our s t r i c t adherance (for'the time being) to the formalism and our l i b e r a l use of subscripts. A c t u a l l y the idea • behind the theorems i s simple (or t r i v i a l ) . We' merely convert our perceptrons into mask perceptrons and then break down the b i g masks by "pyramiding" smaller ones i n the manner of the suggestive 21. picture. which converts a mask "of order" [incorrect usage]. ^ j[ into, a "mask net" of order 3 and l e v e l 2 . Our theorem says with 2 l e v e l s the worst order we could have expected was [?/""TT* ] + 1 = [ 3 . 3 1 ] + 1 = 4 . To obtain a mask net of order 2 we must use 3 l e v e l s . I I The theorem says that f o r 3. l e v e l s our order w i l l be < [^/n~] + 1 = [ 1 . 8 2 ] + 1 = 2 . Theorem: I f i s of order k with respect to an n- l e v e l perceptron then i/ has order <_ k n with respect to a 1 - l e v e l perceptron, 22. Proof: (Sketch) Informally, for any cp e R n we look at "supp cp" i n each of Rn-1 5'*' 3 R 0 * W e c a n e a s i l y s e e t h a t 2 n these supports w i l l be <_ k,k , ...,k i n the various r e t i n a . To construct our 1-level perceptron we merely take f o r any cp e R ^ a nev/ cp which i s a predicate on R q and depends on only the (possibly) k n X i's i n R Q . Remark: Our theorems have made the order problem rather uninter-esting for the perceptrons with uniformly support r e s t r i c t e d higher order associator units as well as casting serious doubt on the order measure'"itself for this class of machines. S p e c i f i c a l l y , the complexity of the predicate seems measured more by the i n t e r -connections of the associators than the size of associator support. FEEDBACK PERCEPTRONS Discussion: We now continue our search for stronger perceptrons, this time obtaining a machine which looks more l i k e an automaton than the combinatorial nets we had before. We are motivated by a desire to minimize the amount of "wiring" i n the machine so that order w i l l play a larger: r o l e . Our idea i s to eliminate wiring between the many l e v e l s by using t h e . f i r s t l e v e l over and over again i n a "feedback" arrangement. So whereas before we had a s i t u a t i o n l i k e t h i s : 2 Mi U/ car o with p o t e n t i a l l y complicated wiring situations i n regions 1 through n as marked, we now wire region 1 once.and f o r a l l . We. might a l t e r n a t i v e l y think of this s i t u a t i o n as requiring a l l regions 1 through n to be wired i d e n t i c a l l y . We could think of the state of the machine at t = 0 as ST ( O ) = X c R 3 at t = 1 as ST(l) = (cp(X) | cp(X) = 1} , , at t = 2 ST(2) = {x e R | cp £ ST(l) and x i s at the end of a feedback arrow}. We have one problem l e f t : . We must t e l l the summation operator when to operate otherwise the machine w i l l cycle end-l e s s l y , never giving an output. For this purpose we allow a l l cp € R-j_ to communicate into a common' channel a "0" or a "1" .. 2 4 . A "1" into the channel from cpi means that cpi i s ready to proceed to the summation state; a zero means i t i s not. Only when there i s unanimous consent does the summation take place. The machine then shuts down. Diagramatically then we have the De f i n i t i o n : A feedback perceptron (F.P.)P i s a t r i p l e . <R,R_L,A> where the (rectangular) retina R = ( x - j J - ^ j i s a s e ^ of points,. indexed by a f i n i t e set I . R-^ i s the set of associator units {cp, } . T indexed by a f i n i t e set J , and j j e o A = {a.}. T U {9} i s the set of c o e f f i c i e n t s a. together with J , J e t J j the threshold 8 . The associator units cp-. are' functions from J P(R) to {X,0} x {0,1} x {0,1} where X-e R . (Think of this t r i p l e as i n d i c a t i n g the following. <YES;NO/reactivation of r e t i n a , YES;NO/proceed to summation, YES;NO subpredicate true> Remark: Any P.P. P i s a f i n i t e automaton. We define the i n i t i a l state of the machine as S < R . Inductively we define SI+1 = ^ x e R 3 q- j e R l a n d 9 i ( s i ) = <x 0 or 1, 0 or 1>} . 2 5 . Define the output at time n by OUT(n) = 1 i f f cp,(S ) = <x or 0,1,0 or 1> f o r a l l j e J A j e J J J l a s t co-ordinate. and £ a. cp.(Sn) >_ 8 where A = projection onto OUT(n) - 0 i f f cp.(S ) = <x or 0,1,. 0 or 1> f o r a l l j e J J , and £ a. $(S„) < 9 j e J J n ' OUT(N) = \ otherwise. D e f i n i t i o n : We say that X i s accepted by the F.P.P if-f OUT(n) = 1 where n i s the lea s t integer such that OUT(n) £ \ and the state of P ( i . e . the activated subset of R) remains the same after t = n . Remark: We might as well assume that our associators cp. can reactivate whole subsets of R since we may add as many "redundant "cp's" as needed, connecting each feedback arrow with a d i f f e r e n t x i on our subset.; 2 6 . ORDER OP PARITY PREDICATE FOR FEEDBACK PERCEPTRONS Theorem: Parity i s of order 2 v/ith respect to a F.P. Proof: See page 1 1 f o r a d e f i n i t i o n of > ^he p a r i t y predicate. Enumerate R as {x^} i = l , 2 , . . . , 2 n . I f I i s odd a si m i l a r argument works. Enumerate R. as cp . j = 1 , 2 , . . . and define cp .(X) = <a,b,c> where J a = j / 2 i f j even and x„ •. or x 0 e X but not both, = 0"/2 + ^ i f j odd and x Q or x e X but not both, • ^3 " . i - l = 0 otherv/ise. b = 0 i f f a = 0 unless j = 1 then set b = 0 i f f X^^LAX = c = 0 i f f only one of x~ and x e X c = 1 i f f both or neither x» and x„ e X . Let A = {a.} where a. = 1 i f j = 1 and a. = 0 otherwise. 3 3 3 then [X i s even] S a . cp . >_ 1 . <3 3 Clearly this machine operates by continually "trimming X down"-(preserving parity) u n t i l only one or zero points are l e f t If one point i s l e f t , | x | i s odd, i f none, [X| i s even. An example of this machine i s simulated and the sequence of steps shown i n Figs. 1 , 2 , 3 and 4 . To complete our proof we should also show that no "order 2 7 . 1 " P.P. can recognize p a r i t y . Namely, the two parts of the proof would show that' order (^P^R) O N F.P.s i s ( l ) <_ 2 and ( 2 ) > 1 . We show this fact a f t e r the next theorem i n a section e n t i t l e d "parity proof continued". CANNONICAL OPERATIONS FOR FEEDBACK PERCEPTRONS Remark: The l a s t machine operated by reducing a figure to a more tractable form' which i t could then deal v/ith. In mathematics v/e usually c a l l this kind of process normalization or reduction to cannonical form. We now show that a l l (F.P)s operate i n such a way. De f i n i t i o n : A (F.P) P computes -is by reducing figures to cannonical form i f f ( l ) ty(X) = 1 the machine accepts X and ( 2 ) the sequence of states SQ,S^,...,Sn before acceptance or rej e c t i o n of any X c R are such that [ty(S^) = 1 ] for a l l i or U ( S ± ) = 0 ] f o r a l l i ' . Theorem: A l l (F.P)'s operate by reducing figures to cannonical form. Proof: Assume P i s a (F.P) and does not so operate. Then there i s a predicate, tjj such .that P accepts ijr and there exists X c: R such that the sequence of states before acceptance So J*"" ;' Sn contains S, ^ k < n , with ty(sv) = 0 • I f w e l e t 28. •= R , a new i n i t i a l state, the sequence RQ,R^,..., w i l l coincide with SV,S., , ...,S with S accepted, which means \ji(Sk) = 1 contradiction. Discussion: Clearly any state of a (F.P) P on R can be described as leading to acceptance, reject i o n , or endless cycling. The i n i t i a l state merely starts the path of arrows induced by the next state function which ( i n the absence of additional input) would normally lead'to an endless cycle. The only event that can prevent such a cy c l i n g s i t u a t i o n i n this case i s to h i t on a state which activates the summation operation and tends to accept-ance or rejection. ; The problem then i s to p a r t i t i o n the set of states (=.p(R)) of the f i n i t e automata P into Qy and ft with X e i}/(X) - 1 and X £ ty(X) = 0 . Normally such a problem i s easy, but i n t h i s • s i t u a t i o n i t i s not since the" next state function (and hence the arrow paths) are- severly constrained i n S by the order of P . A f u l l and s a t i s f a c t o r y theory on. this subject would require r e l a t i n g order constraints to the state graph, of the auto-mata and then to the class of predicates, a weighty task indeed! We w i l l content ourselves with the probrem of "programming" (F.P)'s to handle various predicates such as connectivity. Par i t y Proof Continued: Consider any order 1 F.P. which recognizes *!>PAr> • A consequence of Minsky's group invariance 29-theorem i s that the only predicates of order 1 with respect to a 1 - l e v e l perceptron are of the form cp(X)<H? |X| <_ M or cp(X)«-e>|X| > M . We conclude that the blackened subsets of R ( i . e . the sequence of states of MP) must tend i n c a r d i n a l i t y to an odd number > M f o r some M _< |R| for r e j e c t i o n and an even number <_ M f o r acceptance i n any order 1 p a r i t y recognizer. Of course, the states could also tend to any odd number <_ M f o r r e j e c t i o n and any even > M f o r acceptance. We now consider the following-small example. Let R be a 1x2 r e t i n a . We examine the behaviour of a l l possible associator units to show the i m p o s s i b i l i t y of recognizing the p a r i t y predicate with an order 1 P.P. R = cp^ i s the associator on the l e f t square x^ of R . cp2 i s the associator on the f i g h t square x g of R . We break the proof down into cases where the cannonical form for "accept" i s form f o r "reject i s or both (Case c) . s down c X (Case I) or X •£ (Case II) (Case a) or ^ I (Case b) Case l a : i f II II i s read as "leads to-'the state" we can symbolize the required transformations i n this case i n the follow-ing way. (Remember: we must prevent endless c y c l i n g on non-30. (accept or reject) states, ( i ) accepted ( i i ) ( i i i ) (iv) X X rejected X X X X Now ( i ) shows that ^(X-^) = 1 or 0 , 1 or 0> . That i s upon being presented with a blank cp-^ reactivates nothing. Clearly cp2 ^ a s ^he s a m e property. Next (iv) shows cp-^ and' cp^ reactivate nothing upon "seeing" an active square. Therefore cp^_ and cpg • cannot reactivate anything so parts ( i i ) and ( i i i ) are impossible, contradicting the assumption that Case Ia<is possible. We b r i e f l y run through the other cases (omitting Ib and l i b for reasons of symmetry) showing that they are a l l impossible. ( i i ) ( i i i ) (iv) Case l c : The required transformations are: ( i ) accept reject _p X X X reject Xj X X As before ( i ) and (iv) show ' cp-^ and cp2 ' are i n e r t , 3 1 . showing (II) and ( i i i ) impossible. Case I l a : .The required transformations are: (i ) ( i i ) ( i i i ) (iv) X X reject X X X X X X accept —^> X X Item's ( i i ) or ( i i i ) show that neither cp-^ or cp2 w i l l reactivate both X^ and, X^ together. And ( i ) further shows that both must activate at le a s t one o f X^ or X^ , given a blank. Therefore one of the following p o s s i b i l i t i e s • o b t a i n s : (1) cp x (0) = <X1,-,-> cp 2 (0) = <X]_,-,-> (2) cp 1 (0) = <X±,-,-> cp 2 (0) = <X2,-,-> (3) cp 1 (0) = <X2,-,-> cp 2 (0) == <X1,-,-> (4) Cp 1 ( 0 ) = <X2,-,-> cp2(0) = <X2,-,-> Numbers ( l ) and (4) can be rejected out of hand as the run counter to ( i ) . Number (2) i s contrary to ( i i i ) and (3) i s contrary to ( i i ) . 32. Case l i e : (i ) ( i i ) ( i i i ) (iv) X X X j reject — ^ reject —> accept —p X X X X X X X Item ( i ) shows that cp1(0) = <-,0,-> or cp2 (0) = <-,0, Thus either ( i i ) or ( i i i ) are not reject states. Contradiction. Aside:' It i s i n t e r e s t i n g to note that a l t e r n a t i v e (3) i n Case 11a i s a consistant assignment f o r Case l i e . Also: I f the l a s t clause i n the d e f i n i t i o n of "accept" ( i . e . the state stays.-.'.the same) were not required we would NOT have "been able to derive a contradiction i n this 1x2 example. Furthermore even f o r larg e r examples, I do not know -whether any contradiction would exist. The number of p o s s i b i l i t i e s f o r associator action i s very large. Having eliminated a l l possible cases, we have shown that an order 1 p a r i t y recognizer i s not possible f o r F.P.'s 3 3 . ORDER OF CONNECTIVITY PREDICATE FOR FEEDBACK PERCEPTRONS Preview: We devote the next section to showing that 4'CONN I S of order <_'8 . That Is we f i n d F.P.., Mc which w i l l - recognize C^ONN w i t i l n o associator of MQ having support > 8 ." MQ i s simulated, as i s at the end of this paper. (See p. 11 f o r d e f i n i t i o n of ^CONN^' Description of Mc: Mc w i l l operate on an mxn retina" R and w i l l be.equipped with six kinds of associator units, the Right Dribbler, the^Left Dribbler, the Surrounder, the Sidewinder,•the Backslider and. the Scalawag as pictured below. ; Sidewinder Backslider Scalawag - (SW) ' (BS) (SC) S 34. The arrows show which way active squares tend to propogate (as shown "below) under the various kinds of units . At each square x^ e R , one of each kind of unit ( s ix i n a l l ) i s placed upright i n such a way that square 2 , the centre of the unit coincides with x^ on the r e t i n a . The assoc-iat o r s may overlap the edge of the r e t i n a i n which case such overlapped squares are always registered as "blank" or "inactive". We must now describe the output ( i . e . the t r i p l e ) of each unit as well as the summation operator £ . F i r s t Component: Each unit w i l l have only two outputs, the "growth" or "ac t i v a t i n g " output and the "standpat or "recopying" output. The standpat output merely copies the e x i s t i n g input with no additions or subtractions so that no information w i l l be l o s t during the time i n t e r v a l . The growth output copies the exis t i n g input and adds to i t , - i f possible, square 2 . Roughly speaking, most outputs w i l l tend to standpat; only a few are growth outputs. We give f o r each kind of unit the necessary and s u f f i c i e n t conditions that the output be a growth output. Both Dribblers: A l l four conditions must be met. (1) Square 1 i s active. Square 2 i s not. i . e . l + , 2 ° (2) Square 5 i s active implies Square 3 i s . i . e . 5 + - 3 + (3) 6 + - k+ (4) 7 + - ( 3 + and 5 + and 8 + ) , or more simply 7 + - ( 3 , 5 , 8 ) + 3 5 . Surrounder: ( l , 3 , 4 , 5 , 6 , 7 , 8 ) + 2 ° Sidewinder:. ( 3 , 1 , 6 , 7 , 8 ) + 2 ° Backslider: ( 3 , 4 , 5 , 6 , l ) + ( 2 , 7 , 8 ) ° Scalawag: Both conditions must be met: (1) ( 8 , 6 , 7 , 5 , 2 ) ° 1 + (2) 4+ 3 + Second Component: .For a l l units the second component (which says whether or not the unit i s ready to sum) i s a 0 i f f the unit's f i r s t component was a growth output and (of course) a 1 i f f the unit's f i r s t component was a standpat output. Therefore Mc proceeds to summation only a f t e r a l l units standpat, which,"of course, implies that the machine state has s t a b i l i z e d . Third Component: We define the t h i r d component (which,says whether or not the unit accepts the small b i t i t sees) i n a way which w i l l become clea r only a f t e r a proof. The right dribbler puts out a 0 i f f (1) 1 + 2 ° OR (2) 1 + 3° A l l other units always put out a one. . Summation Operator: M c accepts X i f f M i s ready to sum upon being given X and S (1 - cp(X))< 0 ' cp. an-associator unit of M c < "'• ' • 36. That i s , acceptance or r e j e c t i o n i s on a " b l a c k b a l l " system, X being rejected i f f any one of the right dribblers puts out an 0 i n i t s t h i r d component. D e f i n i t i o n : A square x e R i s said to be i n the scope of x e R (written x e scope (x-,.)). i f f x = x OR • x i s below x O v 1 O i i n x ' s column OR x Is to the l e f t of x i n • x 1s row' OR o o o x i s down and to the l e f t of x Q . Remark: Informally the way MQ operates i s to take any x e X of a connected figure and f i l l i n i t s scope, i f i t has not been done already. Thus a figure- s t a r t i n g out l i k e , t h i s : would end up looking l i k e this at the time of acceptance. Really then the t h i r d component of the RD i s merely checking to see i f a l l scopes have been f i l l e d i n on a processed and (supposedly connected) figure. Disconnected figures l i k e 37. w i l l end up l i k e this at the time of rejection. This fact i s very c l e a r l y i l l u s t r a t e d a f t e r the text i n the figures. Notation: Let a r e c t i l i n e a r path y i n R going from x to y be denoted.as y = {x = x Q,x 1,x ,...,xn_1,xn=y} where x ± and x^+-^ have a side i n common for i = 0 , 1 , . . . , n - l . Lemma 1: I f x e X and y e X are not connected to each o ^ o other then no single associator cp w i l l connect them- d i r e c t l y , ( i . e . i n one time step.) Proof: • Suppose X q and y Q are not connected. In order that cp connect xQ to y Q d i r e c t l y , y Q must be at one of the squares marked 1 , 2 , 3 , 4 , 5 , 6 , 7 or 8 i n r e l a t i o n to x o ¥e eliminate a l l p o s s i b i l i t i e s , f i r s t eliminating 5 , 6 , 7 and 8 by symmetry. ' (For example: I f y Q Is at 1 the picture looks the same as i f i t were at 3 , etc.) Case I: y Q i s at 4 and b must have been activated. We look at the six associators centered at b . A right or l e f t dribbler at b would Imply a previous connection from 4 to X q v i a [4, 12, 3, a, x } to activate b . A backslider would connect 4 to xQ v i a [4, 13, 5, c, X q } ' i f i t were active as would a sidewinder and a surrounder. A scalawag could NOT activate b under any circumstances. Case II: yQ i s at 3 • Either a or b could have been activated. Sub-case I l a : y i s at 3 and a was activated. . We examine the associators centered at a . A dribbler (right or l e f t ) a c t i v a t i n g a would imply a previous connection from x D to 3 v i a [3, b, x Q}' or {X q, d, 1, 10, 2, 11, 3} • A backslider at a implies a previous connection v i a [3, b, X q ] as does a side-winder and a surrounder. A scalawag cannot activate a . Sub-case l i b : y i s at 3 and b was activated. Either DR implies a connection v i a {3, a, X q } . A BS could not do the job nor could a SC. A SW implies a connection v i a (3, a, x Q} . A SR connects v i a {3, 12, 4, 13, 5, c, x<o) . Case I I I : y i s at 2 and a was activated. Active dribblers at a would imply a connection {2, 11, 3, b, x Q} as would a SR or {2, 10, 1, d, x o3 . Neither a BS or a SC would work. An active SW at a implies' a connection [2, 10, 1, d, X Q } . Case IV: y i s at 1 and either • a or d was activated." 39-Sub-case IVa: DR's give a path {x o, b, 3, 11, 2, 10, 1} or [xQ> d, 1} . A BS, SW, SR, or SC yiel d s the path {X q, d, 1} . Sub-case- IVd: A DR a c t i v a t i n g d connects v i a [ X q , a, 1} as does a SR". A SC or BS would not do the job. A SW connects v i a {x Q, 6, 7, 14, 8, 9, 1} • A l l p o s s i b i l i t i e s having being eliminated, the lemma i s 'shown. Lemma 2: Suppose x Q and y Q are not connected.' Then no two associators and cp2 w i l l connect them d i r e c t l y . Proof: Suppose x and y > are not connected. In order that c s r o o cp^ and cp2 connect XQ to y Q . d i r e c t l y , y Q must be at one of the squares marked 5,6,7,8,9,10,11,12,13,14,15 or 16, i n r e l a t i o n to x . For i f y were at 1,2,3, or 4 we would have a o o connection. If y were outside the assigned area two associators could not 40. p o s s i b l y connect them d i r e c t l y . I f y Q were at a,b,c,d,e,f,g, or h and v/ere connected v i a {x o 3x^,x 2,x^ = y Q ] to X Q either x-^ . or x^ would have to be at 1,2,3, or 4 , forming a' direct connection using only one associator's activated square, contrary to Lemma 1. As before we eliminate a l l p o s s i b i l i t e s , eliminating f i r s t 11,12,13,14,15 and 16 by symmetry. Case I: y Q i s at 5 • There are 3 dir e c t paths from 5 to x o • • ( i ) {5, a, 1, x Q} ( i i ) (5, a, 4, x 0} ( i i i ) {5, h, 4, x o} Case I I : y Q , i s at 6 . There are 3 direct paths from" 6 to x o ( i ) {6, b, 1, x o3 ( i i ) [6, a, 1, x Q}-( i i i ) {6, a, 4, x Q3 . -Case I I I : y Q i s at 7 There i s only one dir e c t path {73 b, 1, x Q3 from 7 to x Q . Case IV: y Q i s at 8 . There are 3 dir e c t paths from 8" to x Q. ( i ) {8, b , l , X Q ] (II) {8, c, 1, x o ] ( i i i ) {8, c, 2, x o) . Case V: y Q i s at 9 . There are 3 ' dir e c t paths from 9 to x o • 41. (i.) { 9 , c, 1, x o} ( i i ) { 9 , c, 2 , x o} ( i i i ) { 9 , d, 2 , x o ] . Case VI: y Q i s at 10. There i s only one direc t path {10, d, 2 , x Q3 from 10 to X Q . Note: We w i l l shorten this long, tedious.proof somewhat by the following observations and notational conveniences. ( i ) When we argue a case- we might as well assume that both squares i n the path i n question were i n i t i a l l y blank., and hence needed a c t i v a t i o n . Take, for example, Case I ( i i i ) . We argue i n this way: y Q i s at 5 and was connected i n one time step to x Q v i a the path { 5 , h, 4, X q } . Assume to the contrary that either h or ,4 was previously active. We derive contradictions i n both cases. Case h: h was previously active. But then h was not connected to X q since 5 wasn't. So h and X q (two disconnected points) were connected i n one time step by two assoc- :? i a t o r s v i a the square 4 alone. Therefore h and X q were connected i n one time step by one associator. . This statement viol a t e s Lemma 1. Case 4 uses the same idea. ( i i ) To avoid needless verbiage, our case arguments w i l l merely l i s t the type of associator alledged to have activated a path square, followed by a path o,r a statement of i m p o s s i b i l i t y . This path w i l l represent a pre-existing connection between X q and y Q whose existence i s implied by the active associator. For example, i n Case I ( i i i ) again we w i l l write (in part) 4 2 . BS at 4 - x Q, 3 , g, 1 5 , ft, 5 SC at 4 - DNW These crypt i c l i n e s mean that " i n order for a BS to activate square 4 a path must have existed before connecting x Q to 5 v i a {x Q, 3 , g, 15, h, 5) , contrary to assumption". Also we have "A SC Does Not Work, i . e . i t would not activate 4 i n this case F i r s t x i n h i b i t s the SC. Second h i s blank also o i n h i b i t i n g the SC. Note that we could also have written BS at 4 - DNW since h i s blank. We w i l l not be fussy about where our contradictions come from. 1 ( 1 ) : 7 8 1 b 1 * c 1 ' a J 5 X o _ [ h i h g LD, BS, SW, SR or SC at 1 -. x , 4 , a, 5 . So we assume 1 was activated by an RD. In this case BS, SW or SR at a -»-x , 4,h , 5 was active. DR at a -• x , 2 , c , 8 , b , 6 , * , 5 . SC at a - DNW since b Therefore 1 was previously active, contradiction to Lemma 1 l ( i i ) , ( i i i ) : SW, BS or SR at 4 - x , 3 , g, 1 5 , h,'5 4 - DNW . DR at 4 - x Q , l , a , 5 . SC at I I ( i ) ( i i ) : LD, BS, SC, SW or SR at" 1 -' x , 4 , a, 6 . RD at 1 - x Q, 2 , c, 8 , b, 6 . 43-I I ( i i i ) : SR, DR at 4 - x Q, 1, a, 6 . BS, SC at 4 - DNW . So a SW-at 4 i s the.only p o s s i b i l i t y . _ Assume a SW at 4 . DR, SW at a - X q , 3 , g , 15 , h, 5 , * , 6 . SC, BS at a - DNW . SR at a - x , 1 , b, 6 . III:. DR at 1 - x , 2 , c, 8 , b, 7 or x , 4 , - a , 6 , b, 7 . BS, SC, SR at 1 - DNW . So SW at 1 i s the only p o s s i b i l i t y . Assume an SW at 1 . SC, SW, BS, SR at b - DNW . DR at b -x Q, 4 , a, 6 , * , 7 • I I I ( i ) ( i i ) : DR at 1 -» 8,c , 2,x Q, or 8 , b , 6 , a , 4 , X Q . SR, BS at 1 - 8,c , 2,x-SC at 1 -* DNW . The only a l t e r n a t i c e i s a SW . Assume a SW at 1 . We must now examine the units at b and at c . DR at b - x o , 4,a , 6,4= , 7 , $ , 8 . BS, SW "at b - X q , 1 , C , 8 . SC at b - DNW . SR at b - x , l , c , 8 . Now f o r c . DR at c - 8,b,l,x or o . ' ' J o 8 , * , 9 , d , 2 , x Q . . BS, SC. or SR at c DNW . SW at c x Q , l , a , 6 , b , 8 . Contradiction i n both cases. 4 4 . I V ( i i i ) : SC, BS at 2 - DNW. SR at 2 - x , 3 , e , l l , d , 9 , c , 8 . DR at 2 - 8 , c , l , x ' . So SW at 2 i s the only p o s s i b i l i t y . Assume SW at 2 . DR, SW. at c - x Q , l , b , 8 . BS, SC, SR at c -DNW . V ( l i ) ( i i i ) : DR at 2 - 9 , c ; i , x o . SR, BS at 2 - X Q , 3 , e , l l , d , 9 . SC at 2 -DNW . So SW at 2 i s assumed. We examine d and c . DR at c - 9 , * , 8 , b , l , x Q . SW, BS at c - X Q , 2 , d , 9 .. SC SR at c - DNW . Now for d : DR at d - x , 3,e,ll,=j= , 1 0 , £ , 9 or x , 2,c,9,-.-SW at d - x Q , 2 , c , 9 • SR, BS, SC at d - DNW . V ( i ) : SR, BS, SW at c - 9,d , 2,x Q. SC at c - DNW . The only p o s s i b i l i t y then i s a DR at c , meaning that 8 and * are active. We examine square 1 . DR at 1 - x Q , 4 , a , 6 , b , 8 , * , 9 or X q , 2 , C , 9 • SC, SR at 1 - DNW . BS at 1 - X q , 2 , C , 9 . The only p o s s i b i l i t y i s an SW at 1 meaning that 2 was o r i g i n a l l y active when the DR'at c was applied. This y i e l d s x , 2 , d , 9 or x , l , b , 8 , * , 9 • * 1 9 l l 1 1 0 I d 1 2 ( T j l l J e " BS, SW at d - 10,£,ll,e ,2,x . SC,,,DR at d - DNW . The only 4 5 . p o s s i b i l i t y l e f t i s a DR ; hence * and 9 were previously active. We examine square 2 . SC, SR at 2 -» DNW . DR at 2 - X Q , 1 , C , 9 , * , 1 0 . BS at 2 - x , 3 , e , . l l , d , 1 0 . So an SW at 2 i s the only p o s s i b i l i t y . Hence 1 1 was previously active when the DR at d' was applied. This y i e l d s . x Q , 3 , e , l l , £ , 1 0 or x O , 2 , c , 9 , * , 1 0 Q.E.D. Remark: This proof was very long and boring! I t was put'in fo r completeness and to ease'nagging doubts. The proof might have been shortened somewhat by Ingenious assumptions, however i t i s very easy to get fooled! The brute force method though boring i s at lea s t clear. Lemma 3 : The, state t r a n s i t i o n s of Mc go only from connected states to connected states or from disconnected states to discon-nected states. They never switch back and forth. Proof: ( i ) Let X be a connected figure i n R - and l e t X1 P2 X be the figure formed from -X by an application of the state t r a n s i t i o n .function of MQ . We show that X"*" i s '•> connected. Let x1,y'1" e Xx . We show x 1 i s connected to y 1 . 1 1 1 1 -I f both x and y e X we are done. I f either x or y e X^~X i t must have been produced by one of the six -kinds of assoc-1 1 i a t o r units v/ith x or y In p o s i t i o n 2 . But,in such a case x^ or yx (or both) would be .connected to. associator p o s i t i o n 1 , denoted by x and y respectively, which i s i n X . A path from x^ and yx could then be of the form 4 6 . Y = {x = x ^ ^ j X g j X ^ , . . . i X n _ 1 , x n = y } where x 1^---. x r l_ 1 e X and either x-j_ = x or x ^ = y . ( i i ) Let X be a disconnected figure and l e t X^ " :=> X be the figure obtained from X by an application of the state trans-i t i o n of MQ . We show Xx i s disconnected by assuming i t i s connected and deriving a contradiction. Suppose X^ i s connected and X i s disconnected. Let x and y be i n separate components A and . B respectively of X . There i s a path Y = (x = x ,...,x n = y } connecting x to y i n X . Every x^ e 7 i s classed either as ( l ) an element of A or (2) i s derived from some element of A by an associator unit centered at x^ or (3) i s an element of some other component than A (say^G) or X or (4) is.derived only from elements of'other components by Associators centered at x^ . Let. the f i r s t two alternatives be of type I and the second two of type II . Since the path y starts i n type I and ends i n type I I there must be"an i Q such that x.. Is the l a s t x^ e y of type I. . I f x. i s o • • ' . o of class ( l ) x.. , cannot be of class (3) (C and A are dis-o + connected) or of class (4) without v i o l a t i n g Lemma 1. I f x. 1 o Is of class (2) x^ ^ cannot be of class (3) or v (4) without o v i o l a t i n g Lemmas 1 and 2 respectively. Lemma 4: I f R i s a re t i n a which contains a connected figure and i f M i s ready f o r summation then the following configuration cannot occur on R 47. X • X Proof- Assume X c R i s connected, that Mc i s ready f o r summation ( i . e . no associators w i l l he active) and that occurs on R where 1 and 4 are active, 2 and 3 Y = £l = X Q , X 1 , X 2 , ,x .. ,x = 4} 5 n - l * n J are blank. There i s a simple path y connecting 1'. to 4 . not passing through 2 or 3 Form the unquantized simple closed curve y"1" by connecting the centre of 1 to centre (2) and centre (x^) to centre (xj_+]_) f o r i = b,...,n-l v/ith straight l i n e segments. Our'picture looks l i k e t h i s : D e f i n i t i o n : We say y semi-encloses a square x-. i f f y i s of the form given above and encloses x i n the usual sense of the Jordan Curve Theorem. Claim: I f 2 and 3 are blank then either y semi-encloses 2 or y semi-encloses 3 but not both. Proof: Let z be a point (not a square) inside 3 Connect 48 . z to centre 3 . by a l i n e segment .and then draw a 45 ray from centre (3) upx/ard out to 3R . This new path 6 w i l l i n t e r s e c t y 1 either an odd or an'even number of"times. I f odd, 3 i s semi-enclosed by y ; i f even 3 i s not semi-enclosed by y . Clearly also since • y passes through 6 at the "centre" we have: y semi-encloses 3 -* Y does not semi-enclose -2 ^and y semi-encloses 2 -» y does not semi-enclose 3 • We resume the proof of Lemma 1. Pick a y connecting ' 1 and 4 semi-enclosing minimal area. Pick a highest row i n Y . There w i l l then be a l o c a l area represented as highest row . The path may then turn either to the l e f t or to the ri g h t . Case L: • y turned to the l e f t i n this p a r t i c u l a r spot. •49'. jar-n=ar 2 j -X l x . 1 Now 1 must"be blank since 1 i s semi-enclosed by y and y i s a minimal semi-encloser. I f 1 i s active, the path r\ obtained by substituting 1 f o r ^-±+i Y w i l l semi-enclose le s s area than y while s t i l l connecting our o r i g i n a l two squares. So 2 i s active, avoiding a dead end. And 3 i s active by the dribbler rule and the fact that 1 i s blank. We have We investigate the various p o s s i b i l i t i e s f o r a and b . a and b cannot both be blank because c would be. active and the DR rule would be v i o l a t e d . a and_ b cannot both be active. To have such a s i t u a t i o n would again v i o l a t e the DR rule. We are l e f t with two cases. F i r s t a i s blank and b i s active. Second b i s blank and a i s active. In the f i r s t case we . have: ' : where again the only possible assignments f o r a^ and b-j are (a-j_ blank, active) or (a, active, b, blank). Clearly 5 0 . this f i r s t a l t e r n a t i v e could not continue i n d e f i n i t e l y to the exclusion of the second, because 'X* i s not one of our end squares. The only p o s s i b i l i t y l e f t i s that at some time the second al t e r n a t i v e must occur y i e l d -ing: X X I X \r A. X ^ — ^ X X X X which gives us our o r i g i n a l configuration - only now we must be with a new single path y^, t o t a l l y able to connect X X semi-enclosed by y ( o r on y). In this way we obtain a sequence of paths y^,y^,...,ym connecting our figure and semi-enclosing a s t r i c t l y decreasing amount of area. Eventually we get to the point where the absolute minimum area i s semi-enclosed which means y looks l i k e or x 1 K X [ X Ix | x J x~~|~xl M r whence the blank i n the middle i s f i l l e d i n by a SR or a LD Contradiction. Case R: The path turned to the righ t . Here we have as before: "-.,iUJ.r.i ( X I X T X | a c X b where c blank y i e l d s a new path > t o t a l l y enclosed i n If c i s active .we get a contradiction v/ith the RD rule. Lemma 5 : I f R i s a re t i n a which contains a connected figure and i f Mc i s ready f o r summation, then the configuration X X cannot occur on R • Proof: For this lemma, the proof goes as before except that we end up with minimal area semi-enclosing configurations l i k e !x x l |x | |x fx - 1 or x 1 xj I X A. N v i o l a t i n g the SW and RD rules, Lemma 6: I f R i s a re t i n a which contains a connected figure and i f Mc i s ready f o r summation then X X cannot occur on R 52. Proof: X 6 2 h x 5 It i s easy to se~ that 1 and 2 must be b.. vale i n this case. We assume that 2 (for example} i s active as the argument f o r 1 i s symmetric. I f 2 i s active then we have either 6 blank or 5 blank or (3 blank and 1 active) a l l situations leading to the forbidden patterns of the l a s t lemmas. We argue now as before picking the highest row of a minimal area semi-enclosing path y connecting our o r i g i n a l .figure. (note: we have abused the d e f i n i t i o n s l i g h t l y . ) We obtain the figures X *1 • x 1 a j c top row and 1 x 1 x i U J •X c i i 2 top row depending upon whether y "turned right or l e f t " . In both cases there Is no consistent choice f o r c . Note that our f i r s t remark i n the proof was necessary to prevent the path from ending or beginning at x-^ or x^ .' Lemma 7: I f R i s a r e t i n a containing a connected figu r e and i f MQ i s ready f o r summation, then |)C_ || j X j cannot occur on R . 5 3 . Proof: (sketch) *b X X a ¥e can show by process of elimination that a, b and c must be blank. Then we plan to use the same type of argument as before. But we must be c a r e f u l . This time the path y may have as highest row only the end points. In such a case we cannot b u i l d such a strong\ configuration around our highest row as before. Case A: y extends above the end points: Here as before we get or a I X 1 b j X 3 leading to contradictions. Case B: y does not extend above the endpoints: Here we must look at the lowest row for the y semi-enclosing l e a s t area. moved right _ a 1 c 1 b 1 J L l X | x i l l moved c | a e l e f t x i X X By Lemma 6 a i s blank. By our comment at the beginning of the proof b i s blank, hence c i s too. I f y has moved to the 5 4 . l e f t we could further conclude that e i s blank and d i s active. (Using the SW and SC rules respectively.) But then the RD and LD at a would be active contradiction. If y had moved to the right , i t would eventually have to turn back up again to meet an end point, 5 p l ~ i H i X i X X X duplicating the s i t u a t i o n of the l e f t moving path. We conclude that there can be no path y of the required type connecting the o r i g i n a l f igure. Contradiction. Theorem: Upon receiving a connected figure X on an mxn reti n a - R , Mc w i l l continue to add retina c e l l s to X u n t i l U scope(x) xeX a " s o l i d " figure X° i s produced. This figure X° After X° i s attained, no state change w i l l take place. Proof: Suppose X i s a connected figure and has been inserted into Mc . It i s clear from the form of the six associator units that the state at time t' , x ^ - s a t i s f i e s . X'S c U scope (x) .. { Z ) xeX . . . It i s also c l e a r that i f there i s a time t with X,. \ = X° o ('C ) Our task then i s to no state change w i l l take place a f t e r t o 55-show that i f M sums at some time t, then we have X/, \ = X° = U scone (x) . We do so by contradiction. l V xeX Assume that at time t = t, M sums and that 1 c X/, \ ^ U scope (x) . Therefore there i s an x n e X such xeX 1 that y e scope ( X Q ) and that y / X ^ ^ f o r some y e R . Adopt the following scanning procedure i n scope X Q . Start at x and scan down the l a s t column. Then scan.the o second to l a s t column from top to bottom and so on u n t i l a l l columns have been scanned. Pick the f i r s t such y i n the scan and c a l l i t y . P i c t o r i a l l y then we have one of the following three pictures, depending upon where .y i s i n scope x Q i n r e l a t i o n to x . Remember y i s "blank" since y e X,, \ . . " 1-Case I: X o X X X X X X Case II: X o 1 y X X X X X X 56. Case I I I : X o o We show- that none of these cases are p o s s i b l e . That i s v e X/, \ c o n t r a d i c t i o n . Cases I and I I I : By choice of JQ we have one of the fo l l o w i n g f i g u r e s v/ith \y blank 1 J X || X 2 j j y 0 l l x CO L L j 5 2 I'o ! 4 l 3 In order that y be blank and not have been a c t i v a t e d o , - . by a DR we must have e i t h e r j5 a c t i v e or (2 a c t i v e and 1 blank) or (4 a c t i v e and 5 b l a n k ) . These cases l e a d to the forbidden c o n f i g u r a t i o n s . X X X or X or X Case I I : By choice of j Q we have i 57. 7 f * F I ' j10 | x J i 1 1 Lid In order to avoid a c t i v a t i o n of y Q by a SC we must have either 1 , 2, 3 , or active. Sub-case ( 1 ) : 1 active y i e l d s X X contradiction! Sub-case ( 2 ) : 2 active implies also that 6 i s active or ( 1 i s active and 3 blank) or 5 i s blank, a l l contradictions. Sub-case ( 4 ) : 4 i s active. We must avoid having 2 active by sub-case ( 2 ) . The only way. to avoid having a DR f i l l 2 i s by having y Q active (false) or ( 3 active and 7 blank) or ( 5 active and 8 blank) both contradictions. Sub-case (3)'- 3 i s active. We must avoid 1 being active. So we have 1 1 active or 1 0 active or ( 1 0 active and 9 blank) or y Q active. In no case could we avoid the forbidden patterns, o Therefore XfJ_ \ = X . / , A = (J scope (x) and t, = t xeX 1 o The theorem i s shown. Corollary 1 : MQ recognizes \Jj CONN.. Proof: Let X be connected. When Mc sums i t w i l l do so on 5 8 . the figure X° = U scope x , causing every associator to output xeX a 1 , causing acceptance. Let X be disconnected. When M C sums (which i t surely w i l l do i n a f i n i t e amount of time) i t w i l l do so on a figure Y . Certainly Y c u scope (y.) . (Just look at the yeY associators.) Since Y ^ U scope y , a- connected set, we must yeY have y Q e Y with some X q e scope y Q ~ Y . Pick any northeast-erlymost such xQ . Then either X X or Xo, with (x Q e Y , x e Y) w i l l occur leading to re j e c t i o n because of the t h i r d component RD rule. Therefore . M accepts X i f f X i s connected, c r Therefore M C recognizes ^ Q Q N N * Corollary 2: ^CONN i s o f ° rder <. 8 . Proof: Since M C recongizes ^'Q Q ^ N » ^ e P r o ° f consists only of noting that a l l associators units of M C have eight or les s squares i n them. Topics f o r further study: Of course^ one might always try to get better bounds on order ( ^ C Q N N ^ ' O R < ^ E R 1 o r 2 cases might be possible to rule out v i a brute force, but for order 3 , 4 , 5 , 6 or 7 the problem seems to amount to either a hopelessly 5 9 -large case,elimination problem or a hopelessly d i f f i c u l t P.P. programming problem. Other predicates might be investigated. vjt , L would * Tmod n . be easy. I have thought b r i e f l y about ^SIMPLY CONN o r (^X)<S-5>X has exactly n components. As Minsky notes, geometric properties involving straight l i n e s and c i r c l e s etc. involve tolerance" topologies which seem to be more of a problem than the P.P.'s. Other algebraic predicates might be investigated. Various t r a i n i n g sequences could be investigated on the P.P.' s using computers. Unfortunately I would expect "learning" to b,e ...slower f o r these machines than f o r regular perceptrons i f * only because there' are so many more modes of action for the F.P. The problem of formulating a general theory of F.P. 's seems almost hopeless. In f a c t many of the minimal state problems for f i n i t e automata and i t e r a t i v e arrays are very much l i k e our problems with connectivity. Only rough bounds are computed fo r s p e c i f i c tasks. No general theory i s even hoped f o r . We could vary the P.P.1 s i n such a way as to simplify tasks ( i . e . lower orders). For example, we might permit the r e t i n a to be activated by n colours rather than 1 . We might note i n passing that the design of feedback perceptrons i s quite s i m i l a r to that"" of the c e l l u l a r structures 6o'. treated i n von Neumann's Theory of Self Reproducing Automata. The main differences are i n the summation algorithm of the end and i n the fact that l o g i c a l operations can be broken down over many c e l l neighbourhoods i n feedback perceptrons. Perhaps a • closer study of the r e l a t i o n of feedback perceptrons to c e l l u l a r structures would lead to stronger methods fo r the perceptrons as well as giving them wider appeal. The problem that r e a l l y started us o f f was the comput-ational complexity problem. As previously mentioned, the unlimited number of associators we allow undercuts the order measure as a true measure of complexity. Maybe a theory which -counts t o t a l computation steps of any kind i s accessible. 61. •BIBLIOGRAPHY Ar b i b , A., Brains Machines and Mathematics., McGraw-Hill, N.Y., 19^5. Felgenbaum, Edward-A. and J u l i a n Feldman, Computers and Thought, McGraw-Hill, N.Y., 1963-' (Read e s p e c i a l l y the Samuel Checker P l a y e r ) . Hebb, D. 0., Organization of Behaviour, Wiley, N.Y., 1949. Minsky, Computation: F i n i t e and I n f i n i t e Machines, P r e n t i c e H a l l , Inc., Englewood C l i f f s , N.J. , 19o'7. \ Minsky, Marvin L. and Seymour Papert, Perceptrons and P a t t e r n Recognition, (a monograph), M.I.T. Press, Cambridge, Mass. 1967. Minsky, Marvin L. and Seymour Papert, Perceptrons, M . I . T . , Press, Cambridge, Mass., 19^9• N i l s s o n , N. J., Learning Machines: Foundations of T r a i n a b l e P a t t e r n C l a s s i f y i n g Systems, McGraw-Hill, N.Y., 1965. Rosenblatt, F., P r i n c i p l e s of Neurodynamics, Spartan Books, 1962-• Uhr, Leonard and James G. M i l l e r , P a t t e r n R e c o g n i t i o n , Wiley, N.Y., 1966. von Neumann and Burks, Theory.of Self-Reproducing Automata, U n i v e r s i t y of - I l l i n o i s P r e s s , 1966. 24 SQUARES FIGURE 1 ACCEPT (SAME STATE) FIGURE 2 25 SQUARES REJECT FIGURE .3 (SAME STATE) FIGURE * f X i , I x I I f M )cx * x x x X 'X X X X * X X X ACCEPT (SAME STATE) FIGURE 5 FIGURE .6 C O N N E C T E D - F I G U R E 7 D I S C O N N E C T E D F I G U R E 1 0 I I I ! i • ' i I ( ! I ! ! FIGURE" 11
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Perceptron-like machines
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Perceptron-like machines Kraft , Kris Lance 1969
pdf
Page Metadata
Item Metadata
Title | Perceptron-like machines |
Creator |
Kraft , Kris Lance |
Publisher | University of British Columbia |
Date Issued | 1969 |
Description | This paper investigates perceptron-like devices with an eye toward the recognition of simple predicates, such as parity and connectivity. First "multilayer" perceptrons are investigated to little avail. Then much more powerful perceptrons which have feedback are considered, yielding better results. |
Subject |
Perceptrons |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2011-06-02 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
IsShownAt | 10.14288/1.0080499 |
URI | http://hdl.handle.net/2429/35061 |
Degree |
Master of Science - MSc |
Program |
Mathematics |
Affiliation |
Science, Faculty of Mathematics, Department of |
Degree Grantor | University of British Columbia |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |
Download
- Media
- 831-UBC_1969_A6_7 K73.pdf [ 4.05MB ]
- Metadata
- JSON: 831-1.0080499.json
- JSON-LD: 831-1.0080499-ld.json
- RDF/XML (Pretty): 831-1.0080499-rdf.xml
- RDF/JSON: 831-1.0080499-rdf.json
- Turtle: 831-1.0080499-turtle.txt
- N-Triples: 831-1.0080499-rdf-ntriples.txt
- Original Record: 831-1.0080499-source.json
- Full Text
- 831-1.0080499-fulltext.txt
- Citation
- 831-1.0080499.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0080499/manifest