PERCEPTRON-LIKE MACHINES by KRIS LANCE KRAFT ' A.B. , University of California (Berkeley), 19^5 A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE In the. Department of MATHEMATICS We accept this thesis as Conforming to the required standard The University of April British 1969 Columbia In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library. sha11 make it freely available for reference and Study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of my Department or by his representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my wr i tten permi ss i on. Department of The University of British Columbia Vancouver 8, Canada Date Supervisor: R. S. Rosenberg. ii. . ABSTRACT This paper investigates perceptron-like devices with an eye toward the recognition of simple predicates, such as parity and connectivity. First "multilayer" perceptrons are investigated to little avail. Then much more powerful perceptrons which have feedback are considered, yielding better results. Examiners iii. TABLE OP CONTENTS Page INTRODUCTION 1 TEXT 4 FEEDBACK PERCEPTRONS 22 • ORDER OF PARITY PREDICATE FOR FEEDBACK'PERCEPTRONS 26 CANNONICAL OPERATIONS FOR FEEDBACK PERCEPTRONS 27 ORDER OFvCONNECTIVITY PREDICATE FOR FEEDBACK PERCEPTRONS 33 BIBLIOGRAPHY 6l LIST OF FIGURES FIGURE 1 - Simulation of a parity acceptor - accepting 2 - Continuation of Fig. 1. 3 - Simulation of a parity acceptor - rejecting - Continuation of Fig. 3-5 -'Simulation' of Mc - accepting. r O - Simulation of Mc - rejecting 7 - A connected figure from Perceptrons 8 - A digital version' of Fig. 7. 9 - Figure 8 at acceptance time 10 - A disconnected figure from Perceptrons 11 - A digital version of Fig. 10. 12 - Figure 11 at rejection time. INTRODUCTION The purpose of this thesis is to further explore the capabilities of various kinds of "perceptron-like" devices. The main inspiration, spirit^ approach and source of style will be a monograph of Minsky and Papert entitled Perceptrons and Pattern Recognition, (Sept. 1967). This monograph* is remarkable, not so much for impressive results as for its straight-forward unassuming approach which removes much of the mystery (and potential controversy) which has dogged earlier perceptron investigations. The Minsky paper explores the complexity of various logical and geometric concepts with respect to a very simple kind of perceptron, a linear separation machine^ where there is no "feed-back" or other communication among the "associator units". Briefly these perceptrons operate in the following manner: They are presented v/ith a rectangular array (called a retina) of squares which may be in the active' or "inactive state. The so-called "associator units" then look at small parts of the retina and decide on that basis to vote "yes" or "no". All votes are tallied and a particular weighted average taken of the votes. How the average is weighted, of course, varies from machine to machine. If the weighted average is above a certain threshold * The monograph (plus some extended results) has just been pubIT ished in book form. Perceptrons, M. Minsky and S. Papert, M.I.T, Press (1969). Some of the material in this paper has been anti cipated though not expanded upon in Perceptrons on pages 228-232, a section which did not appear in 19^7, 2. the machine is said to "accept" the particular activated subset of the retina, otherwise there is rejection. The set of all accepted subsets is said to define a predicate or a boolean function on the retina. The machine is said to recognize the predicate so defined* Perceptrons or closely related devices are often used in pattern recognition programs for computers because they are so simple in design (and easy to program). Rosenblatt did much work in the pattern recognition area with such machines "generated at random". Other "learning" programs, such as the Samuel Checker Player, also closely resemble perceptrons. Samuel, for example, has many so-called "board parameters" such as "number of pieces", and "number of possible moves". These parameters are his associators, for they grasp only a small part of the total inform ation on the board. His parameters are combined in wieghted averages to help determine what is supposed to be the best checker ' move. Minsky on the other hand is not concerned with the appli cations of perceptrons, but rather with the theoretical limit of their ability. To explore this question he defined a very restricted and specific kind of perceptron and investigates that. Unfortunately'these linear separation machines are not powerful enough to deal with many concepts (i.e., predicates) such as parity and connectivity which seem to the human at least, to be on the most elementary level. We are constantly surprised with 3. the difficulty (high order) of some tasks as compared to the ease (low order) of others. This paper will explore other designs for perceptrons along the lines of Minsky with an eye toward the difficulty of , computation involved in various tasks. Basically, the thesis will explore perceptrons with many .associator "levels", and then perceptrons which allow feed back from machine to retina. The first alternative, will not be especially fruitful. However,.the second subject (feedback perceptrons) will be quite interesting and deserving of-much more study than it can get here. \ PERCEPTRON-LIKE MACHINES ¥e assume that we have a machine which is presented with a rectangular array R of squares which may he occupied (blacked in) or not. We wish to say certain things about the array (often called the retina). Definition: A predicate function on R is a function cp : P(R) - {0,1} , where P(R) is the set of all subsets of R. A predicate is written as [cp] where [l] = T and [0] = F . For any X c R we say [cp(X)j holds or is true iff cp(X) = 1 . • The two terms will be used interchangeably henceforth.' 2I*I Proposition: There are 2 predicates on R . IRI Proof: There are clearly 21 1 subsets X cz-R in the domain of any predicate. Certain X's in dom (cp) are "accepted by cp " (i.e. cp(X) = 1 ) and certain are "rejected". Thus cp is uniquely described by the set of X's accepted i.e. by a subset of dom (cp) . Since there are 2^^ subsets of dom"(cp) there ?|R| are 2^ predicates cp . A particularly simple and useful class of predicates are called masks. is a predicate such that either or there is a set A c R such that Definition: A mask cp^ cp(x) = 0 for any X c R cpA(X)' = 1 iff X => A for all X c R . The zero mask will be denoted as cpQ ; the one mask (logically) as tp^ . Definition: Suppose $ is a.set of masks. We say that i|f is realizable from $ by a simple mask perceptron iff for all X c R [\|i(X) = l] E a cp.(X) _> 9- for some reals a ,6 . The simple mask perceptron which can be thought of as a separate piece of "concrete" hardware (consisting of the threshold 6 V. the coefficients a and the masks cp. ) is said . c?i i to compute cp from § . We write cp e S.M.P. ('$) in this case. Remark: We could have assumed 0 = 0 3 a to be rational or *i even integral in R , and we could also have written > instead of >_ > obtaining an equivalent definition. Proof: " (i) If we make the assumption (which we will henceforth make) that cp^ e § then Sa cp±(X) > G iff S a «P±(X) + (a - 9)^(X) > 9 cp.^^ showing that 9 can always be taken to be zero. (ii) Only a finite number (<_ 2^R') of sums S(X) = £ a cp. (X) cp±€6 «Pi 1 will occur. If h is the minimum, nonzero pairwise difference 6. of the finite set of sums (S(X)} we then have S(X) _> 0 iff h / S(X) > 0 - /2 for all X c R , showing that the >_ could be taken as >~ . (iii) If h is as before then we know that it is possible to pick a' e Q close enough to a such that ^i ^i o o £ a cp, (X) + a' cp, (X) - S a cp.(X) . cp, xv cp, ' cp. 1V ' '1 o ^o < h 21 $' for all X c R . If all such a ' s are so replaced we clearly cp. have: S a' cp.(X) > s a' cp. (Y) iff E a cp.(X) > E a cp.(Y) cp. iv cp. iv cp. iv 7 cp. xv 7 that is preservation of order for all of our finite number' of sums, showing that a 's can be taken to be rational. Multiplying through by the least commom denominator shows the same thing for integers. We now define a (formally) more complicated machine, which will turn out to be equivalent in computing power. Definition: Suppose § is a set of predicates. We say that cc is realizable from § by a linear separation perceptron iff [cp(X) = 1] «-* E a cp.(X) > 6 for some reals a ,6 cp,e$ ^i .. ^i 7. and as before we write \jj e L.S.P. (§) R Remark: As before we could have assumed 6 = O.ct e Z and < vi to be < . We shall henceforth refrain from making this kind of remark explicit. Proposition: If $ = the set of all masks then any predicate e" S.M.P. (*) . Proof: We write \|r in..disjunctive normal form ^C1(X)vC2(X)v.. .vCn(X) where each C (X) = [z. ]A[Z. ]A...A[Z. ] and 1 11 • 2 mi each [z±.] = [X^] or [Xk] = [l"xk] for some XR e R We then have C. (X)*-» z. *z. • «z. = 1 12 >•— mi We now replace each z±. by XR or 1-Xk obtaining a polynomial 8. in the X^'s only. No appear, though some terms will be negative and all will therefore be definable by a mask cp or its negative (-l)cp . Therefore C±(X) ^ S cp(X) - E_ $(x) > i . cpe$^ c]5€$^-(Note that 0 and 1 are the only values attainable.) and U(X)]«-».S cp(X) - cp(X) > 0 since cpe? cpe$ i|i(X) will hold iff any one of the conductions C is true. We now introduce the two concepts of support and order, which are to be some measure of how complicated a predicate is supposed to be. • ... Definition: The support A of a predicate cp .is the smallest subset A c R upon which cp depends, i.e. for which cp(XnA) = cp(A) for all X c R . We write A = supp (cp) . Definition: We say that IJJ is of order k iff k is the smallest number such that iKX)*->E a cp±(X) > 0 where the cp^'s are all predicates with | supp cp^| <_ k .and the a 's are some real numbers. Theorem: \i/ is of order k iff k is the smallest number such m that cp(X)3->E a cp (X) > 0 where the cp^'s are all masks with p=l cpp p - p I supp (cp )| <_ k for all p and the a ' s are reals. Discussion: This theorem says in effect that the L.S.P.'s and the S.M.P.'s are equivalent in computing power. These machines are the center of study in the Minsky paper. Unfortunately the complexity of predicates with respect to these machines (and the order measure of complexity) does not correspond to the human notion of what kind of things are "easy" and what kind of things are "difficult". For example, we can say that the logical operation of""f (negation) is easy in that it does' not increase the order of "fiji over that of \i . On the other hand the oper ations of v "or" and A "and" can (in a way which will. be -.made more precise later) not even be considered as being of finite order. In the geometric realm, the counting of points and even the recognition of integers is "easy" as is the ability to recog nize certain topological invariants. On the other hand, recog nition of parity, and connectivity is not of finite order. These disappointing results could be viewed as a defect of the simple kind of machines used, or of the order measure, or both. We could, for example, consider complexity as a function of both the support size of the cp^'s and the number of masks or other types of predicates used. Such a theory could be quite intractable, so we consider instead various methods of strengthening 10. the machines. We must be careful though: a machine which is too strong will recognize all predicates in a small order and be quite uninteresting. We will, of course, be sacrificing one of the "seductive" attributes of the perceptron, namely that it computes in a very straight-forward (and easily simulated way) by first computing a lot of easy to evaluate, simple information and then combining this partial information by a simple algorithm. . Proof: Let ^k be the least number such that #(x)#-»£ a, i< (X) > 0 where the i|f 1 s are predicates with | supp (sO I <. k for all p . We will replace each it by a sum of the form E a cp where the cp' s are • p . cp * Cp€9 v masks such that max | supp cp | = | supp cp j the-theorem will then cpes • -follow. Let p ^ m . As before we will write ^ (X) = C^(X)vc|(X)v...vcP(X) . where C?(X) = [z? ]A[Z? ]A...A[ZP ] and 1 xl x2 mi [zjj] = [x] or [1-X] . As before we obtain C?(X)<?-5> S cp(X) - J cp(X) >_ 1 where cp€l^ cpe$^ $P and 5? are index sets of masks. The thing to notice is the 11. identity. aX|3 = a(l-X)p = (a-aX)p = a£ - aX£ where a and p are conductions. This identity shows that the maximum support of the cp1 s and the cp's will be supp C?(X) . And since at . least one of the C? will have maximal support = supp (X) we J. p can write U(X)3«-*S a ( I cp(X) - JZ cp(X)) > 0 p p cper cpe<r • . • • Remark: So far our predicates have depended upon the retina R . We wish to extend the definition of predicate to predicate schema, a term which will then be used interchangably v/ith predicate. Definition: A predicate schema is a rule which generates a s" particular predicate for a retina R of any given size. Examples:• (l) Denote the parity predicate by i!ppj{ which has the property that ilf(X) =1 iff |x| is even for every retina R and for all X c R . (2) Denote the connectivity predicate by ^QQNN WHIC^ ^AS the property that I'rjoNN = X is conneci:ed for every retina R and for all X c R . We say that a figure X c R is connected iff every point x e X can be connected to any other point y e X by a rectilinear path in R . We do not permit 11 diagonal connections" as they would permit two paths to "cross" without "toughing". Notation: A rectilinear path y in-- R 'going from x to y will be denoted as y = '{.x = XQ,X1,X2, ... ,xn_1,xn=y} where x^^ and x. have a side in common for i = 0,1,...,n-l. 12. Definition: A predicate (schema) is of finite order iff the predicates generated by are uniformly bounded. The minimum uniform bound M • is called the order of . We now begin our search for more complicated kinds of machines which can cope with typ^R an* ^CONN * ^° ^ar we liave dealt with machines of the following design: RETINA ASSOCIATOR UN/ITS SUMM^olJ AUG-ORVmH The associator units have been masks or other predicates the machine has been equipped with. ' So far they have all been on ' one "level" and cannot communicate among each other. We now define a 2-level perceptron with the following'design. We will make our definition in a form which can easily be generalized by induction. 13. Definition: A predicate i|i is realizable from a pair of sets of predicates § and so iff for all X c R iif(X) =1 <$-> I, a (Xn ) _> 6 for some reals a ,9 where X-^ = {cp-^^ | cp-^X) — 1} . We write % e 2LP ($1,$2) . Remark: We could simplify our notation (and our thoughts) if we thought of our first level of associator units, $^ , as a new retina , the output of our 2-level perceptron Mt M p = < 8 > ^^llh^l > %2.cP2i3i=l > at time t = 0 as X <= R } the output at time t = 1 as X]_ E % an^ output at t = 2 as X2 = {cp2i | cpgi^l^ = 13 — R2 and the ou'fcPu't tirae t = 3 as 0 or 1 . Definition: A predicate % is realizable from an n-tuple of sets of predicates l-i-Fi^i iff for aI1 X c R *(X) = 1*-* S a cp . (X ) > G a 9 € R cpni£$n ^ni ni n 1 ^ni where X s X c R o — Xk = ^kj € §k I Ofcj^-l) = 13 for k - l,..n . We write 4 e nL.P. U. I3? _ . Y L I i=l Ik. th Remark: As before we will think of the i level of associators §^ as a new retina R^ for the next level i+1 . The output of the n level perceptron P at time t will be Xfc for t = 0,...,n and 0 or 1 for t = n+1 . Definition: The order of v/ith respect to an n-level perceptron P will be the least number k such that' \ji e nL.P. and cpj_ e > f supp cp± | <_ k for all i . Theorem: If \\j has order k with respect to a 1-level perceptron (linear separation perceptron) then \ji has order <_ [jj k ] + 1 with respect to a 2-level perceptron. N.B. [ ] denotes.the " "greatest integer function". • e.g. [5] = 5 > [5|-J = 5. , [J~2] = 1. Proof: Let ty(X)*-»E a vA*) > 9 a ,8 £ R and ^i pi . | supp cp^ | _< k for every i . By a previous theorem we might as well assume that the cp±' s are masks. We might further assume that the a 's are either + 1 for they may be assumed ^i integral and hence + 1 if we permit ourselves to copy a particular cp^ several times. We now have ijr(X)*-* 2 cp(X) - S cp(X) > e cpe§ cpe? v/ith | supp cp| _< k for any cp e $ U % where $ and % are sets of masks. We now design our 2-level perceptron. For each cp e $ (or 1) divide supp cp up into no more than L^T"] + 1 disjoint subsets containing no more than 0 k ] + 1 elements each. Such a division must be possible or we would have supp cp > (U?~k"] + 1){[$T] + 1) > ,J~TST = k . For each i define a mask (i^cp^) with (i!f1cpI)(X) = 1«-»X >_ S| for all X c R . Let Rx = (U^) I cp e $ i £ [fT }. + 1} . Define' (^cp) as (i!r2cp)(X1) = ^cpj.) € X± for all i< [^T ] + 1 . Finally consider the predicate p(X)^S U cp)(X ) - ^(1lr2cp)(X1) > 9 , cpes • cpe$ where X.^ = {(^cp^ | (i|;1^i)(X) = 1} . Clearly p is realized from a pair of sets of predicates $^ = Ri and $2 = [(v^cp) | cp e 1} . Clearly also by construction p is of order <_. [^k~ ] + 1 . We wish to see p(X)i]/(X) . But . •is(X) <?—£ £ cp(X) - Ejp(X) > 0 where ? U i contains only masks, cpes cpe§ . < And co(X) = l«-fr X3 sj for all i ^ (t1cpi)(X) = 1 for all (ij/^cp^ ) e X1 for all i^-$>( cp) (X.^ ) = 1 . So our sum E cp(x) - EJP(X) = s (l/2c?)(x1) - r,^(^2cP)x1. cpe$ cpe<? cpei cpe§ .'. *(X)**p(X) . Theorem: perceptron If \jj is of order k with respect to a 1-level then . ijj has order <_ [$"TT ] + 1 with respect to an 16. n-level perceptron. Proof J • • • • "a. Our proof proceeds as in the previous theorem where we obtain \|r(X) <M> E cp(X) - E_cp(X) > 0 where $ and l,.:-.are cpe$ coe § sets of masks and | supp cp| < k for all cp e $ U § Pick any cp e § or I and divide supp cp into dis joint subsets Scf containing at most k ] + 1 points . i < k [JTT] +i On this "first level" define (i^cp) (X) = 1*-* X3 s| . Now divide = {(t^p)^ £ k + 1 } into disjoint \ subsets S^0 containing at most [Vk~] + 1 points where n io < J12 k 2 " ([^TT] + D2 Define Ui2cp) (X^ = 1 X^ Si2cp We now have Rgcp = Sj/j^) I i2 — —n k ([TT] + 1); 17 Proceed inductively, dividing ^_T_ into disjoint sub sets simcp that contain at most [rxT^m ] + 1 points where im c :— — , then defining on the m^*1 level - ([vHT] + i)m Rep = {(*, cp) | im < — k -} and ^ im - ([n^j + ) (4. cp)(X ) = l<5-?> X ,D S. cp until m = n . At this time v viray'v m-1' m-1 — a.mv < -rr—^ rT < n R r, = ¥ = 1 • That ±S at leVel 11 • n - (^k + l)n .- (^k )n k For any cp e I (or %) there is a unique (^j_NCP) H ^N • Consider p(X) <5-* E . (i|r cp) (X ) - S~(yp)(X x) > 6 where XQ = X . V-1 = t( W<P) € Vl ' * e $ U * and < W^<V- 1} for m = 0,...,n-l . Clearly (?) p is realised by an nL.P. from t^i^i-i and is of order <_ [rlTk' ] + 1 . But cp(X) = 1 <H> X supp cp k iff Xg. S± cp 1-, < l" 1 ~ ([VT ] + 1) iff (A. cp)(X) = 1 , 1 iff (*. co) € X, xl • x iff I^cp c X1 18. iff iff iff iff X. 1 ± \* (*i <P)(X ) (*i ^P)E X2 R2cp c X2 i2< k ([^T] + 1)' = 1 iff X - S> S, cp n-l — iff cP)(xn_1) = 1 i. e, S (vii cp) (X 1 ) - £ (ij, cp)(X _ ) > 9 \vnv^v n-l/ ~^7vvn^/v n-l7 cpes cpes iff S cp(X) - cp(X) > 0 cp€$ cpe$ p(X)^f(X) Corollary I: If cp is any predicate then there is N large ~ enough so that the order of cp <_ .2 with respect to an N-level perceptron. Proof: Let R be a retina and cp a predicate on R . Clearly cp is of order k <_ |R| with respect to a 1-level perceptron. We conclude then that order < [^/k" ] + 1 with respect to an 19. n level perceptron. Since lim V^^L" ] + 1 = 2 and [^/"k" ] + 1 Xi-KO • is integer valued, we conclude that there is an N such that N r [ ./k ] + 1 = 2 . Therefore order cp <_ 2 with respect to an N level perceptron. Most complicated predicates will be of order 2 , however some may be of order 1, for example, the order 1 predicates \tfith respect to a 1-level perceptron. We might just as well mention that there is a direct proof which may be sketched as follows. (Incidently this proof would be typical of other machines endowed with too much power.) Let R be a retina. Form the set S of all pre dicates ii • such that is of order 2 with respect to an n-level perceptron for some n . We show that S = the set of all predicates on " R by showing that (1) S contains the masks of support 1 (trivial). (2) e S —> ~i \|f e S . This may be proved by. reversing inequalities or changing the sign of the coefficients. (3) ^ € S and ^ e S-^^v^ e S . This fact may be proved considering the Ji^ and Ng-level perceptrons P^ and P, which recognize ^ and ty2 respectively and constructing a new perceptron v/ith = max {N-^Ng} + 1 in a manner suggested by the picture: 20. CZ3 2. We will not go into details. Corollary II: , Any predicate schema ^7 is of order ' <_ 2'with respect to the class of all finite level perceptrons. Proof: This statement is obvious given the definition of predicate schema .(••arid the unstated definition of "order with respect to the class of all finite level perceptrons") since all t|i generated by are uniformly bounded in order by 2 . We mention the "result" only to clear up lingering ambiguities. Discussion: The proofs of the last two theorems seem formidable, mostly because of our strict adherance (for'the time being) to the formalism and our liberal use of subscripts. Actually the idea •• behind the theorems is simple (or trivial). We' merely convert our perceptrons into mask perceptrons and then break down the big masks by "pyramiding" smaller ones in the manner of the suggestive 21. picture. which converts a mask "of order" [incorrect usage]. ^ j[ into, a "mask net" of order 3 and level 2 . Our theorem says with 2 levels the worst order we could have expected was [?/""TT* ] + 1 = [3.31] +1=4. To obtain a mask net of order 2 we must use 3 levels. I I The theorem says that for 3. levels our order will be < [^/n~] + 1 = [1.82] +1=2. Theorem: If is of order k with respect to an n- level perceptron then i/ has order <_ kn with respect to a 1-level perceptron, 22. Proof: (Sketch) Informally, for any cp e Rn we look at "supp cp" in each of Rn-15'*'3 R0 * We can easily see that 2 n these supports will be <_ k,k , ...,k in the various retina. To construct our 1-level perceptron we merely take for any cp e R^ a nev/ cp which is a predicate on Rq and depends on only the (possibly) kn Xi's in RQ . Remark: Our theorems have made the order problem rather uninter esting for the perceptrons with uniformly support restricted higher order associator units as well as casting serious doubt on the order measure'"itself for this class of machines. Specifically, the complexity of the predicate seems measured more by the inter connections of the associators than the size of associator support. FEEDBACK PERCEPTRONS Discussion: We now continue our search for stronger perceptrons, this time obtaining a machine which looks more like an automaton than the combinatorial nets we had before. We are motivated by a desire to minimize the amount of "wiring" in the machine so that order will play a larger: role. Our idea is to eliminate wiring between the many levels by using the.first level over and over again in a "feedback" arrangement. So whereas before we had a situation like this: 2 Mi U/ car o with potentially complicated wiring situations in regions 1 through n as marked, we now wire region 1 once.and for all. We. might alternatively think of this situation as requiring all regions 1 through n to be wired identically. We could think of the state of the machine at t = 0 as ST(O) = X c R 3 at t = 1 as ST(l) = (cp(X) | cp(X) = 1} , , at t = 2 ST(2) = {x e R | cp £ ST(l) and x is at the end of a feedback arrow}. We have one problem left: . We must tell the summation operator when to operate otherwise the machine will cycle end lessly, never giving an output. For this purpose we allow all cp € R-j_ to communicate into a common' channel a "0" or a "1" .. 24. A "1" into the channel from cpi means that cpi is ready to proceed to the summation state; a zero means it is not. Only when there is unanimous consent does the summation take place. The machine then shuts down. Diagramatically then we have the Definition: A feedback perceptron (F.P.)P is a triple . <R,R_L,A> where the (rectangular) retina R = (x-jJ-^j is a se^ of points,. indexed by a finite set I . R-^ is the set of associator units {cp, } . T indexed by a finite set J , and j j e o A = {a.}. T U {9} is the set of coefficients a. together with J ,JetJ j the threshold 8 . The associator units cp-. are' functions from J P(R) to {X,0} x {0,1} x {0,1} where X-e R . (Think of this triple as indicating the following. <YES;NO/reactivation of retina, YES;NO/proceed to summation, YES;NO subpredicate true> Remark: Any P.P. P is a finite automaton. We define the initial state of the machine as S < R . Inductively we define SI+1 = ^x e R 3 q- j e Rl and 9i(si) = <x 0 or 1, 0 or 1>} . 25. Define the output at time n by OUT(n) = 1 iff cp,(S ) = <x or 0,1,0 or 1> for all j e J A jeJ J J last co-ordinate. and £ a. cp.(Sn) >_ 8 where A = projection onto OUT(n) - 0 iff cp.(S ) = <x or 0,1,. 0 or 1> for all j e J J , and £ a. $(S„) < 9 jeJ J n' OUT(N) = \ otherwise. Definition: We say that X is accepted by the F.P.P if-f OUT(n) = 1 where n is the least integer such that OUT(n) £ \ and the state of P (i.e. the activated subset of R) remains the same after t = n . Remark: We might as well assume that our associators cp. can reactivate whole subsets of R since we may add as many "redundant "cp's" as needed, connecting each feedback arrow with a different xi on our subset.; 26. ORDER OP PARITY PREDICATE FOR FEEDBACK PERCEPTRONS Theorem: Parity is of order 2 v/ith respect to a F.P. Proof: See page 11 for a definition of > ^he parity predicate. Enumerate R as {x^} i = l,2,...,2n. If I is odd a similar argument works. Enumerate R. as cp . j = 1,2,... and define cp .(X) = <a,b,c> where J a = j/2 if j even and x„ •. or x0 e X but not both, = 0"/2 + ^ if j odd and xQ or x e X but not both, • ^3 ".i-l = 0 otherv/ise. b = 0 iff a = 0 unless j = 1 then set b = 0 iff X^^LAX = c = 0 iff only one of x~ and x e X c = 1 iff both or neither x» and x„ e X . Let A = {a.} where a. = 1 if j = 1 and a. = 0 otherwise. 3 3 3 then [X is even] S a . cp . >_ 1 . <3 3 Clearly this machine operates by continually "trimming X down"-(preserving parity) until only one or zero points are left If one point is left, |x| is odd, if none, [X| is even. An example of this machine is simulated and the sequence of steps shown in Figs. 1, 2, 3 and 4. To complete our proof we should also show that no "order 27. 1" P.P. can recognize parity. Namely, the two parts of the proof would show that' order (^P^R) ON F.P.s is (l) <_ 2 and (2) > 1 . We show this fact after the next theorem in a section entitled "parity proof continued". CANNONICAL OPERATIONS FOR FEEDBACK PERCEPTRONS Remark: The last machine operated by reducing a figure to a more tractable form' which it could then deal v/ith. In mathematics v/e usually call this kind of process normalization or reduction to cannonical form. We now show that all (F.P)s operate in such a way. Definition: A (F.P) P computes -is by reducing figures to cannonical form iff (l) ty(X) = 1 the machine accepts X and (2) the sequence of states SQ,S^,...,Sn before acceptance or rejection of any X c R are such that [ty(S^) = 1] for all i or U(S±) =0] for all i'. Theorem: All (F.P)'s operate by reducing figures to cannonical form. Proof: Assume P is a (F.P) and does not so operate. Then there is a predicate, tjj such .that P accepts ijr and there exists X c: R such that the sequence of states before acceptance SoJ*"";'Sn contains S, ^ k < n , with ty(sv) = 0 • If we let 28. •= R , a new initial state, the sequence RQ,R^,..., will coincide with SV,S., , ...,S with S accepted, which means \ji(Sk) = 1 contradiction. Discussion: Clearly any state of a (F.P) P on R can be described as leading to acceptance, rejection, or endless cycling. The initial state merely starts the path of arrows induced by the next state function which (in the absence of additional input) would normally lead'to an endless cycle. The only event that can prevent such a cycling situation in this case is to hit on a state which activates the summation operation and tends to accept ance or rejection. ; The problem then is to partition the set of states (=.p(R)) of the finite automata P into Qy and ft with X e i}/(X) - 1 and X £ ty(X) = 0 . Normally such a problem is easy, but in this•situation it is not since the" next state function (and hence the arrow paths) are- severly constrained in S by the order of P . A full and satisfactory theory on. this subject would require relating order constraints to the state graph, of the auto mata and then to the class of predicates, a weighty task indeed! We will content ourselves with the probrem of "programming" (F.P)'s to handle various predicates such as connectivity. Parity Proof Continued: Consider any order 1 F.P. which recognizes *!>PAr> • A consequence of Minsky's group invariance 29-theorem is that the only predicates of order 1 with respect to a 1-level perceptron are of the form cp(X)<H? |X| <_ M or cp(X)«-e>|X| > M . We conclude that the blackened subsets of R (i.e. the sequence of states of MP) must tend in cardinality to an odd number > M for some M _< |R| for rejection and an even number <_ M for acceptance in any order 1 parity recognizer. Of course, the states could also tend to any odd number <_ M for rejection and any even > M for acceptance. We now consider the following-small example. Let R be a 1x2 retina. We examine the behaviour of all possible associator units to show the impossibility of recognizing the parity predicate with an order 1 P.P. R = cp^ is the associator on the left square x^ of R . cp2 is the associator on the fight square xg of R . We break the proof down into cases where the cannonical form for "accept" is form for "reject is or both (Case c) . s down c X (Case I) or X •£ (Case II) (Case a) or ^ I (Case b) Case la: if II II is read as "leads to-'the state" we can symbolize the required transformations in this case in the follow ing way. (Remember: we must prevent endless cycling on non-30. (accept or reject) states, (i) accepted (ii) (iii) (iv) X X rejected X X X X Now (i) shows that ^(X-^) = 1 or 0, 1 or 0> . That is upon being presented with a blank cp-^ reactivates nothing. Clearly cp2 ^as ^he same property. Next (iv) shows cp-^ and' cp^ reactivate nothing upon "seeing" an active square. Therefore cp^_ and cpg • cannot reactivate anything so parts (ii) and (iii) are impossible, contradicting the assumption that Case Ia<is possible. We briefly run through the other cases (omitting Ib and lib for reasons of symmetry) showing that they are all impossible. (ii) (iii) (iv) Case lc: The required transformations are: (i) accept reject _p X X X reject Xj X X As before (i) and (iv) show ' cp-^ and cp2 ' are inert, 31. showing (II) and (iii) impossible. Case Ila: .The required transformations are: (i) (ii) (iii) (iv) X X reject X X X X X X accept —^> X X Item's (ii) or (iii) show that neither cp-^ or cp2 will reactivate both X^ and, X^ together. And (i) further shows that both must activate at least one of X^ or X^ , given a blank. Therefore one of the following possibilities•obtains: (1) cpx(0) = <X1,-,-> cp2(0) = <X]_,-,-> (2) cp1(0) = <X±,-,-> cp2(0) = <X2,-,-> (3) cp1(0) = <X2,-,-> cp2(0) == <X1,-,-> (4) Cp1(0) = <X2,-,-> cp2(0) = <X2,-,-> Numbers (l) and (4) can be rejected out of hand as the run counter to (i) . Number (2) is contrary to (iii) and (3) is contrary to (ii). 32. Case lie: (i) (ii) (iii) (iv) X X X j reject —^ reject —> accept —p X X X X X X X Item (i) shows that cp1(0) = <-,0,-> or cp2 (0) = <-,0, Thus either (ii) or (iii) are not reject states. Contradiction. Aside:' It is interesting to note that alternative (3) in Case 11a is a consistant assignment for Case lie. Also: If the last clause in the definition of "accept" (i.e. the state stays.-.'.the same) were not required we would NOT have "been able to derive a contradiction in this 1x2 example. Furthermore even for larger examples, I do not know -whether any contradiction would exist. The number of possibilities for associator action is very large. Having eliminated all possible cases, we have shown that an order 1 parity recognizer is not possible for F.P.'s 33. ORDER OF CONNECTIVITY PREDICATE FOR FEEDBACK PERCEPTRONS Preview: We devote the next section to showing that 4'CONN IS of order <_'8 . That Is we find F.P.., Mc which will - recognize ^CONN witil no associator of MQ having support > 8 ." MQ is simulated, as is at the end of this paper. (See p. 11 for definition of ^CONN^' Description of Mc: Mc will operate on an mxn retina" R and will be.equipped with six kinds of associator units, the Right Dribbler, the^Left Dribbler, the Surrounder, the Sidewinder,•the Backslider and. the Scalawag as pictured below. ; Sidewinder Backslider Scalawag - (SW) ' (BS) (SC) S 34. The arrows show which way active squares tend to propogate (as shown "below) under the various kinds of units. At each square x^ e R , one of each kind of unit (six in all) is placed upright in such a way that square 2 , the centre of the unit coincides with x^ on the retina. The assoc-iators may overlap the edge of the retina in which case such overlapped squares are always registered as "blank" or "inactive". We must now describe the output (i.e. the triple) of each unit as well as the summation operator £ . First Component: Each unit will have only two outputs, the "growth" or "activating" output and the "standpat or "recopying" output. The standpat output merely copies the existing input with no additions or subtractions so that no information will be lost during the time interval. The growth output copies the existing input and adds to it,- if possible, square 2. Roughly speaking, most outputs will tend to standpat; only a few are growth outputs. We give for each kind of unit the necessary and sufficient conditions that the output be a growth output. Both Dribblers: All four conditions must be met. (1) Square 1 is active. Square 2 is not. i.e. l+,2° (2) Square 5 is active implies Square 3 is. i.e. 5+-3+ (3) 6+ - k+ (4) 7+ -(3+ and 5+ and 8+), or more simply 7+ - (3,5,8)+ 35. Surrounder: (l,3,4,5,6,7,8)+ 2° Sidewinder:. (3,1,6,7,8)+ 2° Backslider: (3,4,5,6,l)+ (2,7,8)° Scalawag: Both conditions must be met: (1) (8,6,7,5,2)° 1+ (2) 4+ 3+ Second Component: .For all units the second component (which says whether or not the unit is ready to sum) is a 0 iff the unit's first component was a growth output and (of course) a 1 iff the unit's first component was a standpat output. Therefore Mc proceeds to summation only after all units standpat, which,"of course, implies that the machine state has stabilized. Third Component: We define the third component (which,says whether or not the unit accepts the small bit it sees) in a way which will become clear only after a proof. The right dribbler puts out a 0 iff (1) 1+ 2° OR (2) 1+ 3° All other units always put out a one. . Summation Operator: Mc accepts X iff M is ready to sum upon being given X and S (1 - cp(X))< 0 ' cp. an-associator unit of Mc< "'• ' • 36. That is, acceptance or rejection is on a "blackball" system, X being rejected iff any one of the right dribblers puts out an 0 in its third component. Definition: A square x e R is said to be in the scope of x e R (written x e scope (x-,.)). iff x = x OR • x is below x O v 1 O i in x's column OR x Is to the left of x in • x 1s row' OR o o o x is down and to the left of xQ . Remark: Informally the way MQ operates is to take any x e X of a connected figure and fill in its scope, if it has not been done already. Thus a figure- starting out like, this: would end up looking like this at the time of acceptance. Really then the third component of the RD is merely checking to see if all scopes have been filled in on a processed and (supposedly connected) figure. Disconnected figures like 37. will end up like this at the time of rejection. This fact is very clearly illustrated after the text in the figures. Notation: Let a rectilinear path y in R going from x to y be denoted.as y = {x = xQ,x1,x ,...,xn_1,xn=y} where x± and x^+-^ have a side in common for i = 0,1,...,n-l . Lemma 1: If x e X and y e X are not connected to each o ^ o other then no single associator cp will connect them- directly, (i.e. in one time step.) Proof: • Suppose Xq and yQ are not connected. In order that cp connect xQ to yQ directly, yQ must be at one of the squares marked 1,2,3,4,5,6,7 or 8 in relation to x o ¥e eliminate all possibilities, first eliminating 5,6,7 and 8 by symmetry. ' (For example: If yQ Is at 1 the picture looks the same as if it were at 3 , etc.) Case I: yQ is at 4 and b must have been activated. We look at the six associators centered at b . A right or left dribbler at b would Imply a previous connection from 4 to Xq via [4, 12, 3, a, x } to activate b . A backslider would connect 4 to xQ via [4, 13, 5, c, Xq} 'if it were active as would a sidewinder and a surrounder. A scalawag could NOT activate b under any circumstances. Case II: yQ is at 3 • Either a or b could have been activated. Sub-case Ila: y is at 3 and a was activated. . We examine the associators centered at a . A dribbler (right or left) activating a would imply a previous connection from xD to 3 via [3, b, xQ}' or {Xq, d, 1, 10, 2, 11, 3} • A backslider at a implies a previous connection via [3, b, Xq] as does a side winder and a surrounder. A scalawag cannot activate a . Sub-case lib: y is at 3 and b was activated. Either DR implies a connection via {3, a, Xq} . A BS could not do the job nor could a SC. A SW implies a connection via (3, a, xQ} . A SR connects via {3, 12, 4, 13, 5, c, x<o) . Case III: y is at 2 and a was activated. Active dribblers at a would imply a connection {2, 11, 3, b, xQ} as would a SR or {2, 10, 1, d, xo3 . Neither a BS or a SC would work. An active SW at a implies' a connection [2, 10, 1, d, XQ} . Case IV: y is at 1 and either • a or d was activated." 39-Sub-case IVa: DR's give a path {xo, b, 3, 11, 2, 10, 1} or [xQ> d, 1} . A BS, SW, SR, or SC yields the path {Xq, d, 1} . Sub-case- IVd: A DR activating d connects via [Xq, a, 1} as does a SR". A SC or BS would not do the job. A SW connects via {xQ, 6, 7, 14, 8, 9, 1} • All possibilities having being eliminated, the lemma is 'shown. Lemma 2: Suppose xQ and yQ are not connected.' Then no two associators and cp2 will connect them directly. Proof: Suppose x and y > are not connected. In order that csr o o cp^ and cp2 connect XQ to yQ . directly, yQ must be at one of the squares marked 5,6,7,8,9,10,11,12,13,14,15 or 16, in relation to x . For if y were at 1,2,3, or 4 we would have a o o connection. If y were outside the assigned area two associators could not 40. possibly connect them directly. If yQ were at a,b,c,d,e,f,g, or h and v/ere connected via {xo3x^,x2,x^ = yQ] to XQ either x-^ . or x^ would have to be at 1,2,3, or 4 , forming a' direct connection using only one associator's activated square, contrary to Lemma 1. As before we eliminate all possibilites, eliminating first 11,12,13,14,15 and 16 by symmetry. Case I: yQ is at 5 • There are 3 direct paths from 5 to xo • • (i) {5, a, 1, xQ} (ii) (5, a, 4, x0} (iii) {5, h, 4, xo} Case II: yQ , is at 6 . There are 3 direct paths from" 6 to x o (i) {6, b, 1, xo3 (ii) [6, a, 1, xQ}-(iii) {6, a, 4, xQ3 . -Case III: yQ is at 7 There is only one direct path {73 b, 1, xQ3 from 7 to xQ . Case IV: yQ is at 8 . There are 3 direct paths from 8" to xQ. (i) {8, b,l, XQ] (II) {8, c, 1, xo] (iii) {8, c, 2, xo) . Case V: yQ is at 9 . There are 3 ' direct paths from 9 to xo • 41. (i.) {9, c, 1, xo} (ii) {9, c, 2, xo} (iii) {9, d, 2, xo] . Case VI: yQ is at 10. There is only one direct path {10, d, 2, xQ3 from 10 to XQ . Note: We will shorten this long, tedious.proof somewhat by the following observations and notational conveniences. (i) When we argue a case- we might as well assume that both squares in the path in question were initially blank., and hence needed activation. Take, for example, Case I(iii). We argue in this way: yQ is at 5 and was connected in one time step to xQ via the path {5, h, 4, Xq} . Assume to the contrary that either h or ,4 was previously active. We derive contradictions in both cases. Case h: h was previously active. But then h was not connected to Xq since 5 wasn't. So h and Xq (two disconnected points) were connected in one time step by two assoc- :? iators via the square 4 alone. Therefore h and Xq were connected in one time step by one associator. . This statement violates Lemma 1. Case 4 uses the same idea. (ii) To avoid needless verbiage, our case arguments will merely list the type of associator alledged to have activated a path square, followed by a path o,r a statement of impossibility. This path will represent a pre-existing connection between Xq and yQ whose existence is implied by the active associator. For example, in Case I(iii) again we will write (in part) 42. BS at 4 - xQ, 3, g, 15, ft, 5 SC at 4 - DNW These cryptic lines mean that "in order for a BS to activate square 4 a path must have existed before connecting xQ to 5 via {xQ, 3, g, 15, h, 5) , contrary to assumption". Also we have "A SC Does Not Work, i.e. it would not activate 4 in this case First x inhibits the SC. Second h is blank also o inhibiting the SC. Note that we could also have written BS at 4 - DNW since h is blank. We will not be fussy about where our contradictions come from. 1(1): 7 8 1 b 1 * c 1 ' a J 5 X o _ [hi h g LD, BS, SW, SR or SC at 1 -. x , 4, a, 5 . So we assume 1 was activated by an RD. In this case BS, SW or SR at a -»-x ,4,h,5 was active. DR at a -• x,2,c,8,b,6,*,5. SC at a - DNW since b Therefore 1 was previously active, contradiction to Lemma 1 l(ii),(iii): SW, BS or SR at 4 - x , 3, g, 15, h,'5 4 - DNW . DR at 4 - xQ,l,a,5. SC at II(i)(ii): LD, BS, SC, SW or SR at" 1 -' x , 4, a, 6 . RD at 1 - xQ, 2, c, 8, b, 6. 43-II(iii): SR, DR at 4 - xQ, 1, a, 6 . BS, SC at 4 - DNW . So a SW-at 4 is the.only possibility._ Assume a SW at 4 . DR, SW at a - Xq, 3, g, 15, h, 5, *, 6 . SC, BS at a - DNW . SR at a - x , 1, b, 6 . III:. DR at 1 - x ,2, c, 8, b, 7 or x , 4,-a, 6, b, 7 . BS, SC, SR at 1 - DNW . So SW at 1 is the only possibility. Assume an SW at 1 . SC, SW, BS, SR at b - DNW . DR at b -xQ, 4, a, 6, *, 7 • III(i)(ii): DR at 1 -» 8,c,2,xQ, or 8,b,6,a,4,XQ . SR, BS at 1 - 8,c,2,x-SC at 1 -* DNW . The only alternatice is a SW . Assume a SW at 1 . We must now examine the units at b and at c . DR at b - xo,4,a,6,4=,7,$,8. BS, SW "at b - Xq,1,C,8 . SC at b - DNW . SR at b - x ,l,c,8 . Now for c . DR at c - 8,b,l,x or o . ' ' J o 8,*,9,d,2,xQ . . BS, SC. or SR at c DNW . SW at c xQ,l,a,6,b,8. Contradiction in both cases. 44. IV(iii): SC, BS at 2 - DNW. SR at 2 - x ,3,e,ll,d,9,c,8 . DR at 2 - 8,c,l,x' . So SW at 2 is the only possibility. Assume SW at 2 . DR, SW. at c - xQ,l,b,8 . BS, SC, SR at c -DNW . V(li)(iii): DR at 2 - 9,c;i,xo . SR, BS at 2 - XQ,3,e,ll,d,9 . SC at 2 -DNW . So SW at 2 is assumed. We examine d and c . DR at c - 9,*,8,b,l,xQ . SW, BS at c - XQ,2,d,9 .. SC SR at c - DNW . Now for d : DR at d - x ,3,e,ll,=j=,10,£,9 or x , 2,c,9,-.-SW at d - xQ,2,c,9 • SR, BS, SC at d - DNW . V(i): SR, BS, SW at c - 9,d,2,xQ. SC at c - DNW . The only possibility then is a DR at c , meaning that 8 and * are active. We examine square 1 . DR at 1 - xQ,4,a,6,b,8,*,9 or Xq,2,C,9 • SC, SR at 1 - DNW . BS at 1 - Xq,2,C,9 . The only possibility is an SW at 1 meaning that 2 was originally active when the DR'at c was applied. This yields x ,2,d,9 or x ,l,b,8,*,9 • * 1 9ll 1 10 Id 1 2 (Tjll J e" BS, SW at d - 10,£,ll,e,2,x . SC,,,DR at d - DNW . The only 45. possibility left is a DR ; hence * and 9 were previously active. We examine square 2 . SC, SR at 2 -» DNW . DR at 2 - XQ,1,C,9,*,10. BS at 2 - x ,3,e,.ll,d,10 . So an SW at 2 is the only possibility. Hence 11 was previously active when the DR at d' was applied. This yields . xQ,3,e,ll,£,10 or xO,2,c,9,*,10 Q.E.D. Remark: This proof was very long and boring! It was put'in for completeness and to ease'nagging doubts. The proof might have been shortened somewhat by Ingenious assumptions, however it is very easy to get fooled! The brute force method though boring is at least clear. Lemma 3: The, state transitions of Mc go only from connected states to connected states or from disconnected states to discon nected states. They never switch back and forth. Proof: (i) Let X be a connected figure in R - and let X1 P2 X be the figure formed from -X by an application of the state transition .function of MQ . We show that X"*" is '•> connected. Let x1,y'1" e Xx . We show x1 is connected to y1 . 11 11-If both x and y e X we are done. If either x or y e X^~X it must have been produced by one of the six -kinds of assoc-11 iator units v/ith x or y In position 2 . But,in such a case x^ or yx (or both) would be .connected to. associator position 1, denoted by x and y respectively, which is in X . A path from x^ and yx could then be of the form 46. Y = {x = x^^jXgjX^,. . . iXn_1,xn = y } where x1^---.xrl_1 e X and either x-j_ = x or x^ = y . (ii) Let X be a disconnected figure and let X^" :=> X be the figure obtained from X by an application of the state trans ition of MQ . We show Xx is disconnected by assuming it is connected and deriving a contradiction. Suppose X^ is connected and X is disconnected. Let x and y be in separate components A and . B respectively of X . There is a path Y = (x = x ,...,xn = y } connecting x to y in X . Every x^ e 7 is classed either as (l) an element of A or (2) is derived from some element of A by an associator unit centered at x^ or (3) is an element of some other component than A (say^G) or X or (4) is.derived only from elements of'other components by Associators centered at x^ . Let. the first two alternatives be of type I and the second two of type II . Since the path y starts in type I and ends in type II there must be"an iQ such that x.. Is the last x^ e y of type I. . If x. is o • • ' . o of class (l) x.. , cannot be of class (3) (C and A are dis-o+ connected) or of class (4) without violating Lemma 1. If x. 1o Is of class (2) x^ ^ cannot be of class (3) orv (4) without o violating Lemmas 1 and 2 respectively. Lemma 4: If R is a retina which contains a connected figure and if M is ready for summation then the following configuration cannot occur on R 47. X • X Proof- Assume X c R is connected, that Mc is ready for summation (i.e. no associators will he active) and that occurs on R where 1 and 4 are active, 2 and 3 Y = £l = XQ,X1,X2, ,x .. ,x = 4} 5 n-l* n J are blank. There is a simple path y connecting 1'. to 4 . not passing through 2 or 3 Form the unquantized simple closed curve y"1" by connecting the centre of 1 to centre (2) and centre (x^) to centre (xj_+]_) for i = b,...,n-l v/ith straight line segments. Our'picture looks like this: Definition: We say y semi-encloses a square x-. iff y is of the form given above and encloses x in the usual sense of the Jordan Curve Theorem. Claim: If 2 and 3 are blank then either y semi-encloses 2 or y semi-encloses 3 but not both. Proof: Let z be a point (not a square) inside 3 Connect 48. z to centre 3 . by a line segment .and then draw a 45 ray from centre (3) upx/ard out to 3R . This new path 6 will intersect y1 either an odd or an'even number of"times. If odd, 3 is semi-enclosed by y ; if even 3 is not semi-enclosed by y . Clearly also since • y passes through 6 at the "centre" we have: y semi-encloses 3 -* Y does not semi-enclose -2 ^and y semi-encloses 2 -» y does not semi-enclose 3 • We resume the proof of Lemma 1. Pick a y connecting ' 1 and 4 semi-enclosing minimal area. Pick a highest row in Y . There will then be a local area represented as highest row . The path may then turn either to the left or to the right. Case L: • y turned to the left in this particular spot. •49'. jar-n=ar 2 j -X lx. 1 Now 1 must"be blank since 1 is semi-enclosed by y and y is a minimal semi-encloser. If 1 is active, the path r\ obtained by substituting 1 for ^-±+i Y will semi-enclose less area than y while still connecting our original two squares. So 2 is active, avoiding a dead end. And 3 is active by the dribbler rule and the fact that 1 is blank. We have We investigate the various possibilities for a and b . a and b cannot both be blank because c would be. active and the DR rule would be violated. a and_ b cannot both be active. To have such a situation would again violate the DR rule. We are left with two cases. First a is blank and b is active. Second b is blank and a is active. In the first case we . have: ' : where again the only possible assignments for a^ and b-j are (a-j_ blank, active) or (a, active, b, blank). Clearly 50. this first alternative could not continue indefinitely to the exclusion of the second, because 'X* is not one of our end squares. The only possibility left is that at some time the second alternative must occur yield ing: XXIX \r A. X ^—^ X X X X which gives us our original configuration - only now we must be with a new single path y^, totally able to connect X X semi-enclosed by y (or on y). In this way we obtain a sequence of paths y^,y^,...,ym connecting our figure and semi-enclosing a strictly decreasing amount of area. Eventually we get to the point where the absolute minimum area is semi-enclosed which means y looks like or x 1 K X [ X Ix | x J x~~|~xl Mr whence the blank in the middle is filled in by a SR or a LD Contradiction. Case R: The path turned to the right. Here we have as before: "-.,iUJ.r.i (X I X T X | a c X b where c blank yields a new path > totally enclosed in If c is active .we get a contradiction v/ith the RD rule. Lemma 5: If R is a retina which contains a connected figure and if Mc is ready for summation, then the configuration X X cannot occur on R • Proof: For this lemma, the proof goes as before except that we end up with minimal area semi-enclosing configurations like !x xl |x | |x fx-1 or x 1 xj I X A. N violating the SW and RD rules, Lemma 6: If R is a retina which contains a connected figure and if Mc is ready for summation then X X cannot occur on R 52. Proof: X 6 2 h x 5 It is easy to se~ that 1 and 2 must be b.. vale in this case. We assume that 2 (for example} is active as the argument for 1 is symmetric. If 2 is active then we have either 6 blank or 5 blank or (3 blank and 1 active) all situations leading to the forbidden patterns of the last lemmas. We argue now as before picking the highest row of a minimal area semi-enclosing path y connecting our original .figure. (note: we have abused the definition slightly.) We obtain the figures X *1 • x 1 aj c top row and 1 x 1 xi UJ •X c i i 2 top row depending upon whether y "turned right or left". In both cases there Is no consistent choice for c . Note that our first remark in the proof was necessary to prevent the path from ending or beginning at x-^ or x^ .' Lemma 7: If R is a retina containing a connected figure and if MQ is ready for summation, then |)C_ || j X j cannot occur on R . 53. Proof: (sketch) *b X X a ¥e can show by process of elimination that a, b and c must be blank. Then we plan to use the same type of argument as before. But we must be careful. This time the path y may have as highest row only the end points. In such a case we cannot build such a strong\ configuration around our highest row as before. Case A: y extends above the end points: Here as before we get or a I X 1 b j X 3 leading to contradictions. Case B: y does not extend above the endpoints: Here we must look at the lowest row for the y semi-enclosing least area. moved right _a 1c 1b 1 JLl X |x ill moved c | a e left xi X X By Lemma 6 a is blank. By our comment at the beginning of the proof b is blank, hence c is too. If y has moved to the 54. left we could further conclude that e is blank and d is active. (Using the SW and SC rules respectively.) But then the RD and LD at a would be active contradiction. If y had moved to the right, it would eventually have to turn back up again to meet an end point, 5 pl~i Hi X i X X X duplicating the situation of the left moving path. We conclude that there can be no path y of the required type connecting the original figure. Contradiction. Theorem: Upon receiving a connected figure X on an mxn retina - R , Mc will continue to add retina cells to X until U scope(x) xeX a "solid" figure X° is produced. This figure X° After X° is attained, no state change will take place. Proof: Suppose X is a connected figure and has been inserted into Mc . It is clear from the form of the six associator units that the state at time t' , x^ -satisfies . X'S c U scope (x) .. {Z) xeX ... It is also clear that if there is a time t with X,. \ = X° o ('C ) Our task then is to no state change will take place after t o 55-show that if M sums at some time t, then we have X/, \ = X° = U scone (x) . We do so by contradiction. lV xeX Assume that at time t = t, M sums and that 1 c X/, \ ^ U scope (x) . Therefore there is an xn e X such xeX 1 that y e scope (XQ) and that y / X^ ^ for some y e R . Adopt the following scanning procedure in scope XQ . Start at x and scan down the last column. Then scan.the o second to last column from top to bottom and so on until all columns have been scanned. Pick the first such y in the scan and call it y . Pictorially then we have one of the following three pictures, depending upon where .y is in scope xQ in relation to x . Remember y is "blank" since y e X,, \ . . " 1-Case I: X o X X X X X X Case II: X o 1 y X X XX XX 56. Case III: X o o We show- that none of these cases are possible. That is v e X/, \ contradiction. Cases I and III: By choice of JQ we have one of the following figures v/ith \y blank 1 J X || X 2 jjy0ll x CO LL j 5 2 I'o ! 4 l 3 In order that y be blank and not have been activated o , - . by a DR we must have either j5 active or (2 active and 1 blank) or (4 active and 5 blank). These cases lead to the forbidden configurations. X X X or X or X Case II: By choice of jQ we have i 57. 7 f * F I' j10 | x J i 11 Lid In order to avoid activation of yQ by a SC we must have either 1, 2, 3, or active. Sub-case (1): 1 active yields X X contradiction! Sub-case (2): 2 active implies also that 6 is active or (1 is active and 3 blank) or 5 is blank, all contradictions. Sub-case (4): 4 is active. We must avoid having 2 active by sub-case (2). The only way. to avoid having a DR fill 2 is by having yQ active (false) or (3 active and 7 blank) or (5 active and 8 blank) both contradictions. Sub-case (3)'- 3 is active. We must avoid 1 being active. So we have 11 active or 10 active or (10 active and 9 blank) or yQ active. In no case could we avoid the forbidden patterns, o Therefore XfJ_ \ = X ./, \ = A = (J scope (x) and t, = t xeX 1 o The theorem is shown. Corollary 1: MQ recognizes \Jj CONN.. Proof: Let X be connected. When Mc sums it will do so on 58. the figure X° = U scope x , causing every associator to output xeX a 1 , causing acceptance. Let X be disconnected. When MC sums (which it surely will do in a finite amount of time) it will do so on a figure Y . Certainly Y c u scope (y.) . (Just look at the yeY associators.) Since Y ^ U scope y , a- connected set, we must yeY have yQ e Y with some Xq e scope yQ ~ Y . Pick any northeast-erlymost such xQ . Then either X X or Xo, with (xQ e Y , x e Y) will occur leading to rejection because of the third component RD rule. Therefore .M accepts X iff X is connected, c r Therefore MC recognizes ^QQNN * Corollary 2: ^CONN is of °rder <. 8 . Proof: Since MC recongizes ^'QQ^N » ^e Pro°f consists only of noting that all associators units of MC have eight or less squares in them. Topics for further study: Of course^ one might always try to get better bounds on order (^CQNN^ ' OR<^ER 1 or 2 cases might be possible to rule out via brute force, but for order 3, 4, 5, 6 or 7 the problem seems to amount to either a hopelessly 59-large case,elimination problem or a hopelessly difficult P.P. programming problem. Other predicates might be investigated. vjt , L would * Tmod n . be easy. I have thought briefly about ^SIMPLY CONN or ^(X)<S-5>X has exactly n components. As Minsky notes, geometric properties involving straight lines and circles etc. involve tolerance" topologies which seem to be more of a problem than the P.P.'s. Other algebraic predicates might be investigated. Various training sequences could be investigated on the P.P.' s using computers. Unfortunately I would expect "learning" to b,e ...slower for these machines than for regular perceptrons if* only because there' are so many more modes of action for the F.P. The problem of formulating a general theory of F.P. 's seems almost hopeless. In fact many of the minimal state problems for finite automata and iterative arrays are very much like our problems with connectivity. Only rough bounds are computed for specific tasks. No general theory is even hoped for. We could vary the P.P.1 s in such a way as to simplify tasks (i.e. lower orders). For example, we might permit the retina to be activated by n colours rather than 1 . We might note in passing that the design of feedback perceptrons is quite similar to that"" of the cellular structures 6o'. treated in von Neumann's Theory of Self Reproducing Automata. The main differences are in the summation algorithm of the end and in the fact that logical operations can be broken down over many cell neighbourhoods in feedback perceptrons. Perhaps a • closer study of the relation of feedback perceptrons to cellular structures would lead to stronger methods for the perceptrons as well as giving them wider appeal. The problem that really started us off was the comput ational complexity problem. As previously mentioned, the unlimited number of associators we allow undercuts the order measure as a true measure of complexity. Maybe a theory which -counts total computation steps of any kind is accessible. 61. •BIBLIOGRAPHY Arbib, A., Brains Machines and Mathematics., McGraw-Hill, N.Y., 19^5. Felgenbaum, Edward-A. and Julian Feldman, Computers and Thought, McGraw-Hill, N.Y., 1963-' (Read especially the Samuel Checker Player). Hebb, D. 0., Organization of Behaviour, Wiley, N.Y., 1949. Minsky, Computation: Finite and Infinite Machines, Prentice Hall, Inc., Englewood Cliffs, N.J. , 19o'7. \ Minsky, Marvin L. and Seymour Papert, Perceptrons and Pattern Recognition, (a monograph), M.I.T. Press, Cambridge, Mass. 1967. Minsky, Marvin L. and Seymour Papert, Perceptrons, M.I.T., Press, Cambridge, Mass., 19^9• Nilsson, N. J., Learning Machines: Foundations of Trainable Pattern Classifying Systems, McGraw-Hill, N.Y., 1965. Rosenblatt, F., Principles of Neurodynamics, Spartan Books, 1962-• Uhr, Leonard and James G. Miller, Pattern Recognition, Wiley, N.Y., 1966. von Neumann and Burks, Theory.of Self-Reproducing Automata, University of -Illinois Press, 1966. 24 SQUARES FIGURE 1 ACCEPT (SAME STATE) FIGURE 2 25 SQUARES REJECT FIGURE .3 (SAME STATE) FIGURE * f Xi, I x I I f M )cx * x x x X 'X XXX*XX X ACCEPT (SAME STATE) FIGURE 5 FIGURE .6 CONNECTED -FIGURE 7 DISCONNECTED FIGURE 10 III! i • ' i I ( !! ! FIGURE" 11
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Perceptron-like machines
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Perceptron-like machines Kraft , Kris Lance 1969
pdf
Page Metadata
Item Metadata
Title | Perceptron-like machines |
Creator |
Kraft , Kris Lance |
Publisher | University of British Columbia |
Date | 1969 |
Date Issued | 2011-06-02T12:32:50Z |
Description | This paper investigates perceptron-like devices with an eye toward the recognition of simple predicates, such as parity and connectivity. First "multilayer" perceptrons are investigated to little avail. Then much more powerful perceptrons which have feedback are considered, yielding better results. |
Subject |
Perceptrons |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Collection |
Retrospective Theses and Dissertations, 1919-2007 |
Series | UBC Retrospective Theses Digitization Project |
Date Available | 2011-06-02 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0080499 |
URI | http://hdl.handle.net/2429/35061 |
Degree |
Master of Science - MSc |
Program |
Mathematics |
Affiliation |
Science, Faculty of Mathematics, Department of |
Degree Grantor | University of British Columbia |
Campus |
UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |
Download
- Media
- UBC_1969_A6_7 K73.pdf [ 4.05MB ]
- [if-you-see-this-DO-NOT-CLICK]
- Metadata
- JSON: 1.0080499.json
- JSON-LD: 1.0080499+ld.json
- RDF/XML (Pretty): 1.0080499.xml
- RDF/JSON: 1.0080499+rdf.json
- Turtle: 1.0080499+rdf-turtle.txt
- N-Triples: 1.0080499+rdf-ntriples.txt
- Original Record: 1.0080499 +original-record.json
- Full Text
- 1.0080499.txt
- Citation
- 1.0080499.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Country | Views | Downloads |
---|---|---|
United States | 11 | 1 |
China | 5 | 0 |
Hashemite Kingdom of Jordan | 4 | 0 |
Romania | 1 | 0 |
Italy | 1 | 0 |
Sweden | 1 | 0 |
Ukraine | 1 | 0 |
City | Views | Downloads |
---|---|---|
Ashburn | 6 | 0 |
Unknown | 4 | 6 |
Shenzhen | 4 | 0 |
Mountain View | 3 | 0 |
Skövde | 1 | 0 |
Rome | 1 | 0 |
Beijing | 1 | 0 |
Las Vegas | 1 | 1 |
Suceava | 1 | 0 |
Sunnyvale | 1 | 0 |
Amman | 1 | 0 |
{[{ mDataHeader[type] }]} | {[{ month[type] }]} | {[{ tData[type] }]} |
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0080499/manifest