UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Logic-per-track associative memory 1976

You don't seem to have a PDF reader installed, try download the pdf

Item Metadata

Download

Media
UBC_1976_A6_7 T35.pdf
UBC_1976_A6_7 T35.pdf [ 4.09MB ]
Metadata
JSON: 1.0051772.json
JSON-LD: 1.0051772+ld.json
RDF/XML (Pretty): 1.0051772.xml
RDF/JSON: 1.0051772+rdf.json
Turtle: 1.0051772+rdf-turtle.txt
N-Triples: 1.0051772+rdf-ntriples.txt
Citation
1.0051772.ris

Full Text

LOGIC-PER-TRACK ASSOCIATIVE MEMORY GEOK-SENG TANG B.Sc, Nanyang University, 1967 M.Sc., University of Ottawa, 1969 A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE i n the Department of Computer Science We accept t h i s thesis as conforming to the required standard THE UNIVERSITY OF BRITISH COLUMBIA A p r i l , I976 In p r e s e n t i n g t h i s t h e s i s i n p a r t i a l f u l f i l m e n t o f t h e r e q u i r e m e n t s f o r an a d v a n c e d d e g r e e a t t h e U n i v e r s i t y o f B r i t i s h C o l u m b i a , I a g r e e t h a t t h e L i b r a r y s h a l l make i t f r e e l y a v a i l a b l e f o r r e f e r e n c e a n d s t u d y . I f u r t h e r a g r e e t h a t p e r m i s s i o n f o r e x t e n s i v e c o p y i n g o f t h i s t h e s i s f o r s c h o l a r l y p u r p o s e s may be g r a n t e d by t h e H e a d o f my D e p a r t m e n t o r by h i s r e p r e s e n t a t i v e s . I t i s u n d e r s t o o d t h a t c o p y i n g o r p u b l i c a t i o n o f t h i s t h e s i s f o r f i n a n c i a l g a i n s h a l l n o t be a l l o w e d w i t h o u t my w r i t t e n p e r m i s s i o n . Geok-Seng Tang D e p a r t m e n t o f Computer Science The U n i v e r s i t y o f B r i t i s h C o l u m b i a 2075 Wesbrook P l a c e V a n c o u v e r , C a n a d a V6T 1WS D a t e A p r i l 6, 1976 i ABSTRACT An associative,mor content-addressable, memory, one i n which data may he r e t r i e v e d "by i t s value r a t h e r than by r e a l address, has always been an a t t r a c t i v e idea. Although such a memory has not yet proven p r a c t i c a l f o r f i l e s of respectable s i z e , much i n t e r e s t i n g work has been done on the subject, f o r example, Minsky (1972), S l o t n i c k (1970) and Parker (1970, 1 9 7 1 ) . This t h e s i s i s concerned w i t h the device proposed by S l o t n i c k and Parker, c a l l e d 'Logic Per Track Device'. A f t e r b r i e f l y r e v iewing the design and c a p a b i l i t i e s of t h e i r device, the t h e s i s proceeds to propose some m o d i f i c a t i o n s to the design which not only l e a d to g r e a t l y enhanced performance, but a l s o e s t a b l i s h i t s p r a c t i c a l a p p l i c a t i o n f o r f i l e s of respectable s i z e . I n the device of S l o t n i c k and Parker, there i s a f i a r l y s o p h i s t i c a t e d l o g i c chip attached d i r e c t l y to each non-movable read-write head. This allows a l l l o g i c heads to search simultaneously f o r informa- t i o n matching a given key, so that any d e s i r e d r e c o r d could be l o c a t e d w i t h i n one r e v o l u t i o n . However, reading and w r i t i n g w i l l r e q u i r e a second r e v o l u t i o n because p a r t of the record w i l l have passed the head before the match i s recognized. Moreover, i f more than one record matches the search key, the e x t r a bookkeeping w i l l be needed i f matching records on d i f f e r e n t t r a c k s should p a r t i a l l y overlap. These problems have been ignored i n the r e t r i e v a l system developed by Parker (1970, 1 9 7 1 ) . i i The following four additional features of the device have been proposed: 1. Two l o g i c heads on each track has been introduced. The leading head w i l l continue to have the primary r e s p o n s i b i l i t y fo r simultaneous searching. The additional second head, t r a i l i n g a f i x e d distance behind w i l l do the actual reading and wr i t i n g of records. 2. A delay r e g i s t e r whose length i s the distance between lo g i c heads on the same track, has been added to the read-write head. The function of the delay b i t i s to t e l l the read-write head partner where to s t a r t reading (or writing) a record whenever a match i s recognized so that r e t r i e v i n g (or writing) a single record can always be performed i n the same revolution. 3 . Another major design change w i l l give the new device the a b i l i t y to keep track of a l l records which may be retrieved within a single revolution by p a r a l l e l search. To t h i s end, the monitor, which synchronizes the a c t i v i t i e s of a l l l o g i c head couples, w i l l be provided with a record counter, and a mark ent i t y w i l l be prefixed to every record on the disk i t s e l f . k. A f i l e i d e n t i f i c a t i o n mechanism has been established for the associative memory. Fun'ctxorls of such a mechanism are (a) to manage f i l e names, and (b) to manipulate data on the storage device. Next step i s to explore the use of such a modified device for f i l e - o r i e n t e d problems. 'Hierarchical search' for records possessing a s p e c i f i e d combination of keys can be performed i i i d i r e c t l y on the key part of records without the intermediate step of transmitting records into the main computer memory. In an application requiring chain processing, the chain pointer can he a key of the record because each record i n the associative memory i s accessed by content rather than by r e a l address. The chain key can be generated from the key <3f the record i t points to by a simple and reversible procedure. Such a chain technique has a number of advantages: (a) any chain i s i n fac t a two-way chain, (b) each record i n the chain can be retr i e v e d by following the chain key, or d i r e c t l y by the key of the record i f i t i s known, and (c) the tangle of actual physical addresses i n the chain processing can be avoided. The storage organization f o r more complex data structure such as tree structures presents another unique feature of the modified memory. In a tree structure, indexes to the subordinate records may be kept with each parent record, or each subordiante recoEdsnjaays^tDreaaniaindsx to i t s parent record. Both data structures take the same amount of storage space. Comparison <3f i t s performance to the convent t i o n a l counterpart shows that s i g n i f i c a n t improvements i n access times can be achived. i v CONTENTS . . L O G I C - P E R - T R A C K A S S O C I A T I V E MEMORY 1 1 . 1 R e v i e w o f T h e H a r d w a r e D e v i c e a n d P h y s i c a l D a t a F o r m a t 1 1. 2 L o g i c - P e r - T r a c k R e t r i e v a l S y s t e m 7 I . SOME A C C E S S I N G PROBLEMS 1̂ '* 2 . 1 F i l e I d e n t i f i c a t i o n M e c h a n i s m 1 4 2 . 2 Some A c c e s s i n g P r o b l e m s 1 6 2 . 3 ,The A d d i t i o n a l M e c h a n i s m 1 9 3. THE M O D I F I E D MEMORY D E V I C E 2 2 3 . 1 T h e M o d i f i e d M e m o r y D e v i c e 2 2 3 . 2 G e n e r a l i z e d P h y s i c a l D a t a F o r m a t 2 6 3 . 3 T h e D y n a m i c B e h a v i o u r o f T h e M e m o r y / 2 7 3 . 4 S y n c h r o n i z a t i o n o f T h e P a r a l l e l A c t i v i t i e s o f '. T h e M e m o r y 3 5 3 . 5 P e r f o r m a n c e o f T h e M e m o r y 3 5 1 . R e t r i e v a l o f a s i n g l e r e c o r d 3 6 2 . I n s e r t a r e c o r d . . . . 3 6 3 . I n s e r t a s e t o f r e c o r d s . . . . . . 3 8 4. D e l e t i o n . . . 3 8 5. C r e a t e o r d e l e t e a f i l e 3 9 6. G a r b a g e c o l l e c t i o n 3 9 3 . 6 A l t e r n a t e t o T h e M o d i f i e d M e m o r y D e v i c e . . . 4 0 3 . 7 H i e r a r c h i c a l S e a r c h M e c h a n i s m 44 3 . 8 E v a l u a t i o n o f H i e r a r c h i c a l S e a r c h 47 4. STORAGE O R G A N I Z A T I O N AND A C C E S S METHODS 5 0 4.1 C h a i n S t o r a g e O r g a n i z a t i o n I n L o g i c - P e r - T f e a c k E n v i r o n m e n t 5 0 4.2 Some A s p e c t s o f C h a i n - S t r u c t u r e s 5 9 1 . S e q u e n t i a l F i l e A c c e s s 5 9 2 . . I n s e r t i o n s a n d D e l e t i o n s 5 9 4".3 T r e e . S t r u c t u r e R e p r e s e n t a t i o n s 6 l 4.4 E v a l u a t i o n o f C h a i n S t r u c t u r e S e a r c h . . . . .64 V kj§ An Inverted Data Base Model f o r Conventional Disk Device 65 ^.6 Conclusion 7 2 REFERENCES «\8l ; v i FIGURES Figure 1.1(a) P r i m i t i v e i n f o r m a t i o n item i n b i t stream.... 4 Figure 1.1(b) Sample flata t r a c k i n the l o g i c - p e r - t r a c k d i s k 4 Figure 1 . 2 C y c l i c key searching mechanism 8 Figure 3 « 1 System c o n f i g u r a t i o n w i t h request queue, read and w r i t e queuing b u f f e r s 2 3 Figure 3 . 2 The c y c l i c memory delay device of the a s s o c i a t i v e memory and i t s l i n e a r map to the d i s k t r a c k 24 Figure 3 . 3 C o n t r o l processes f o r the request: r e t r i e v e a set of records w i t h the given key 28-29 Figure 3 « ^ A l t e r n a t e system c o n f i g u r a t i o n . 42 Figure 4.1 Examples of the student r e c o r d stored i n the student f i l e 5 3 Figure 4.2 B a s i c format of record i n a chain 5 6 Figure 4.3 Sample chain of student personal records.... 5 6 Figure 4.5 Conceptual view of tre e s t r u c t u r e 6 2 Figure 4.6 Example of student r e c o r d represented by a t r e e s t r u c t u r e 6 2 Figure 4.7 A data' o r g a n i z a t i o n r e p r e s e n t i n g Figure 4.5 : 6 3 Figure 4.8 Another data o r g a n i z a t i o n r e p r e s e n t i n g Figure 4.5 • 6 3 Figure 4.9 Layout f o r the i n v e r t e d f i l e o r g a n i z a t i o n f o r the conventional d i s k device discussed by Cardenas, 1 9 7 5 . . . 6 8 v i i ACKNOWLEDGEMENT S Special thanks are due Dr. J.R. H. Dempster for his inspiring supervision and generous assistance throughout the duration of this thesis, expecially during hours of numerous weekends. Valuable discussions with Dr. Allan Ballard of Computer Centre are acknowledged with gratitude. My heartfelt thanks go to my wife Karen for her sustaining encouragement and her invaluable assistance i n the typing of the complete manuscript. CHAPTER 1 LOGIC-PER-TRACK ASSOCIATIVE MEMORY 1 . 1 . Review of The Hardware Device and P h y s i c a l Data Format The l o g i c - p e r - t r a c k device was proposed by S l o t n i c k ( 1 9 7 0 ) and f u r t h e r developed by Parker ( 1 9 7 0 ) . Although such a device has not been b u i l t , i t ~ i s worth t h i n k i n g about i n the presence of l e s s expensive, h i g h l y r e l i a b l e e l e c t r o n i c p a r t s , as was pointed out by S l o t n i c k during 1 9 7 0 - The device i s designed to achieve high performance on f i l e - o r i e n t e d problems. The s t r u c t u r e and the c i r c u i t r y of the l o g i c head was designed by Parker ( 1 9 7 0 ) ; a r e t r i e v a l system f o r such a device has a l s o been proposed by Parker ( 1 9 7 0 , 1 9 7 l ) • Since then, there has been no f u r t h e r development of r e t r i e v a l systems f o r such a device. I n t h i s t h e s i s , Chapters 1 and 2 contain a review of the work of S l o t n i c k and Parker; while improvements f o r such a device and the r e t r i e v a l system comprise Chapter J>. F i n a l l y , Chapter k demonstrates some a p p l i c a t i o n s of such an improved device and r e t r i e v a l system. Moreover, comparisons between the proposed device and conventional ones are drawn i n order to evaluate the performance of the device. A l o g i c - p e r - t r a c k d i s k i s a f i x e d head d i s k w i t h a 2 l o g i c chip attached d i r e c t l y to each read-write head. The l o g i c chip allows each head to operate as an independent device. Each head reads and writes, and searches f o r space to write...new records on^its-own".track. The device i s intended to have a thousand heads with approximately one m i l l i o n '• h i t s f o r Q each track. This gives a t o t a l h i t capacity of 1 0 7 h i t s (Parker, 1 9 7 0 ). A l l of the i n d i v i d u a l l o g i c heads are connected together and communicate with a central source. The disk device i s designed to operate as an e s s e n t i a l l y • independent entity, which records and ret r i e v e s information on the "basis of a set of keys associated with each group of information items. The purpose of the design i s to combine a new hardware device with a software system to provide a t o t a l information r e t r i e v a l system. Moreover, i t i s desirable to have a system such that random access by means of any key i n the system to any data item would be extremely 1 1 rapid, on the order of q^th of a second or l e s s , where ^ second i s the revolution time of the disk. Each track i s broken up into four equal segments or disk quadrants. There i s a single p a i r of clock tracks which governs a l l the heads. One clock track indicates the b i t time; the other clock track indicates the beginning of a quadrant, or a quarter of a disk. The disk device treats data and keys as s e r i a l b i t streams. Each head looks at i t s own stream of b i t s as information: keys, data, and holes (spaces 3 with no data). There i s no d i r e c t communication i n t h i s design between adjacent heads. Everything i n the system i s s e r i a l by b i t . In order for each head to know what i t ' i s reading at a given time, these information e n t i t i e s are coded i n the stream with the following format: Each information item i n the b i t stream contains two f i x e d length f i e l d s and one variable length f i e l d . The f i r s t f i e l d i s a two-bit indicator of the type of primitive information item that e x i s t s i n the t h i r d f i e l d . The second f i e l d s p e c i f i e s the length of the variable length t h i r d f i e l d , which contains an arb i t r a r y b i t pattern. Thus each primitive information item can be depicted by Figure 1.1(a) i n which, f i e l d 1: two indicator b i t s specifying the three types of primitive information items, f i e l d 2 : binary integer i n d i c a t i n g number of b i t s i n f i e l d 3, f i e l d 3: a r b i t r a r y b i t pattern representing l o g i c a l data e n t i t i e s of a record. The following describes the basic primitive information items. 1. Hole item: The b i t pattern contains no information i n f i e l d 3 and i s therefore c a l l e d garbage. I t indicates that t h i s s l o t on the track i s available f o r key or data items. This type of 0 1 2 3 . 0 . k k+1 . . 0 . . 0 . . . . . 0 . . .m f, f i e l d 2 f i e l d 3 f i e l d 1 F i g u r e 1.1(a) P r i m i t i v e i n f o r m a t i o n item i n b i t stream. V key e n t i t y — — r e c o r d : key e n t i t y d a t a e n t i t y 1 0 Length Key 1 0 Length Key 1 1 ; Length Data •<i r e c o r d > I key e n t i t y d a t a e n t i t y r 4 — r e c o r d ? hole e n t i t y 1 •0 Length Key 1 1 ; Length Data .0 1 Length Hole 1 ^ n.l1PIT-^ yr- ^ ^ ^ ^ ^ ^ ^ key ' e n t i t y data e n t i t y hole e n t i t y • • 1 0 Length Key 1 1 Length Data iO 1 ;Length]Hole F i g u r e 1.1(b) Sample Data Track i n the L o g i c p e r Track D i s k . primitive information item spans over unused space. 2. Key item: This type of primitive information item represents a key and i t s value, given "by the b i t pattern i n f i e l d 3» A key i s any l o g i c a l item used to i d e n t i f y a class of r e l a t e d primitive data items. 3 . Data item; This primitive information item represents a record which i s a c o l l e c t i o n of structured data items. The following table gives the d i s t r i b u t i o n of two in d i c a t o r b i t s and.their associated primitive items: Indicator b i t s Primitive information items 0 0 unused 01 hole 10 key 11 data The following patterns for the e n t i t i e s have been chosen i n (Parker, 1 9 7 0 ) as a basic storage organization: - a quadrant contains a number of records, - a record i s either (a) a hole, or (b) one key to several keys followed by a data item. 6 See Figure 1.1(b) f o r a sample data t r a c k . Since a re c o r d element i s composed of an a r b i t r a r y number of key items, there e x i s t s a m u l t i p l e keying p o t e n t i a l f o r r e t r i e v i n g records r e s i d i n g on the d i s k . I t i s described i n (Parker, 1 9 7 0 ) t h a t each l o g i c head contains four r e g i s t e r s needed to process records r e s i d i n g on di s k t r a c k s . These four r e g i s t e r s are the l e n g t h r e g i s t e r , b i t r e g i s t e r , a u x i l i a r y r e g i s t e r and operation r e g i s t e r . 1 . Length r e g i s t e r : This i s used to hold the le n g t h (given by f i e l d 2 ) of the'current p r i m i t i v e i n f o r m a t i o n item being searched f o r i n the b i t stream. 2 . B i t r e g i s t e r : This contains the current b i t address on the t r a c k . I t i s used t o , f i n d a record by l o c a t i o n and f o r r e p o r t i n g back r e c o r d l o c a t i o n s to the c o n t r o l l e r . 3. A u x i l i a r y r e g i s t e r : This holds t e m p o r a r i l y a b i t address while the head skips through an e n t i t y of no current value. 4 . Operation r e g i s t e r : T h i s r e g i s t e r contains the operation code passed down by the c o n t r o l l e r . I t also counts down the le n g t h of the current p r i m i t i v e item to f i n d the beginning of the next item. This r e g i s t e r has both s h i f t and add c a p a b i l i t i e s so th a t i t can compare the leng t h of a des i r e d item w i t h the leng t h of 7 the current item. The f o l l o w i n g are some b u i l t i n p r i m i t i v e i n s t r u c t i o n s on each head: load the operation r e g i s t e r , t r a n s f e r one r e g i s t e r to another i n the same l o g i c head, search f o r a hole of s p e c i f i c l e n g t h , search f o r a key of s p e c i f i c b i t p a t t e r n , read from a s p e c i f i c l o c a t i o n , w r i t e to a s p e c i f i c l o c a t i o n . The important c h a r a c t e r i s t i c of the l o g i c head i s that i t has the c a p a b i l i t y to i d e n t i f y every p r i m i t i v e e n t i t y i n the b i t stream; moreover, since each p h y s i c a l r e c ord i n the b i t stream ends e i t h e r w i t h the data e n t i t y or w i t h a hole e n t i t y , the l o g i c head i s a l s o s e n s i t i v e to the begin/end l o c a t i o n of a p h y s i c a l r e c o r d i n the b i t stream. I n other words, the l o g i c head i s designed i n such a way tha t i t i s s e n s i t i v e to a l l p r i m i t i v e e n t i t i e s , and p h y s i c a l records c o n t a i n i n g the s p e c i f i c p a t t e r n s of the e n t i t i e s described p r e v i o u s l y . The patterns of the p r i m i t i v e e n t i t i e s i n the b i t stream w i l l be g e n e r a l i s e d i n Chapter 3» 1 . 2 . Logic-Per-Track R e t r i e v a l System A record r e s i d i n g on any t r a c k i s r e t r i e v e d by searching f o r one of i t s key e n t i t i e s . During.a search, the key i s broadcast from the c e n t r a l source. Since a l l e n t i t i e s Quadrant boundary-* Quadrant t r a c k C y c l i c key p a t t e r n B i t stream 1 2 3 ^ !01234567890123456789012345678901234567890... c y c l e c y c l e [ c y c l e c y c l e c y c l e c y c l e . •,...[.« b i i i i d o i i i i o b i i i i o D i i i i p b i i i i o b i i i i o p i i i i q . . i lojoooooooooooooooiiogiiooi key l e n g t h key key match found data, entityf..» key i n d i c a t o r Quadrant t r a c k 1 2 2 8 0 1 .7890123456789012345678901234567890123456. c y c l e c y c l e c y c l e i c y c l e c y c l e ! c y c l e ) ..- C y c l i c key p a t t e r n " ' l o i l l i q o l l l l . o l o i l l l o b l l l l O O l l l l o p i l l l o b l l l l O O l . . . lOioOOOOOOOOOOOOOOllOtllOOlli another d a t a e n t i t y ! f key l e n g t h k e y | key match found key i n d i c a t o r F i g u r e . 1.2. C y c l i c key searching>mechanism 9 are of variable length, the keys may not start at the same bi t numbers. Therefore the key i s broadcast to each l o g i c head over and over again c y c l i c a l l y . Please see Figure 1 . 2 for the c y c l i c b i t pattern of the key ' 0 1 1 1 1 0 ' being broadcast from the central source. Each head compares whatever b i t i t reads from the key with the b i t being broadcast. This implies that sometimes the l a s t b i t s are compared f i r s t . On an equal-unequal compare t h i s does not matter. As long as the keys can be written offset by the record writing software, the offset i s always compensated by the broadcasting. I f the key i s to be written at a given b i t address i n the system that b i t address i s taken modulo the key length. I f the b i t address modulo the key length i s not zero then the f i r s t part of the key i s broken o f f . The number of b i t s corresponds to the b i t address modulo the key length. This i s s h i f t e d to the other end of the key and i s broadcast l a s t . I t follows that a key written at d i f f e r e n t locations on a given track may show di f f e r e n t information b i t patterns i n the t h i r d f i e l d of the key item. The following example demonstrates the c y c l i c key pattern. Example: Assume that the key value ( f i e l d 3 ) i s given by the 6 - b i t pattern 0 1 1 1 1 0 , that i s , b i t count: 0 1 2 3 ^ 5 1 0 key pattern: 0 1 1 1 1 0 key indicator: 1 0 key length: 0 0 . . . 0 1 1 0 i n 18-bit format. Thus, f o r k=20 two d i f f e r e n t c y c l i c key patterns written on two d i f f e r e n t locations on a track appear as follows: C y c l i c pattern 1 : Quadrant b i t address; 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 o i 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 B i t stream: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 1 1 0 0 1 In t h i s example, the f i r s t two b i t s are the indicator b i t s f o r key entity, and the next 18 b i t s i s the binary number specify- ing the length of the key r e s i d i n g i n the t h i r d f i e l d of t h i s key e n t i t y . The b i t address f o r the t h i r d f i e l d i s 2 0 , and t h i s b i t address modulo the key length i s 2 , and hence the key i s broken i n two parts depicted by the following f i g u r e : f i r s t l a s t part part 0 1 1 1 1 0 The l a s t part i s then sh i f t e d to the other end of the key, so that the physical pattern of the key at the b i t l o c a t i o n 2 0 appears as follows: 1 1 l a s t f i r s t part part 1 1 1 0 0 1 This gives the c y c l i c b i t pattern ' 1 1 1 0 0 1 * to be written i n the t h i r d f i e l d of the key entity. How t h i s physical key pattern i s matched to the given key broadcast c y c l i c l y from' the central source i s depicted i n Figure 1 . 2 . C y c l i c pattern 2 ; The following i s another c y c l i c key pattern f o r the same key but written at b i t location 2 0 7 . Please see Figure 1 . 2 for how the c y c l i c key pattern i s matched to the given key. 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 8 8 8 9 9 9 9 9 9 9 9 9 9 0 0 0 0 0 0 0 0 0 0 1 1 1 Quadrant b i t address: ...78 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 . . . 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 1 0 0 1 1 The above example gives a f a i r l y clear picture of the simultaneous c y c l i c "search f o r a - f i x e d key"'instruction. In the "search for a hole -"instruction, the length of the hole to be searched for i s loaded into the length r e g i s t e r which w i l l be used to compare to the length of every hole entity encountered by the l o g i c head. Thus each l o g i c head looks for a proper hole, large enough to contain the record that i s 12 to "be inserted. I f the length of the hole i s not exactly equal to the length of the record, i t must be at least k b i t s longer, i n order that the unused part can s t i l l be recorded as a hole. On the search operations a l l heads are searching simultaneously f o r the given key. Therefore the entire disk can be scanned f o r a match on a single revolution. This takes about 25 milliseconds with most current disks. As soon as the location of the record i s found, the record may be read from or written to the s p e c i f i e d location. However, with a single l o g i c head, t h i s has to be done i n the next revolution because the l o g i c head can perform only one primitive i n s t r u c t i o n at a time. We s h a l l return to t h i s point i n Chapter 3- The following common operations may be b u i l t up from the i n d i v i d u a l operations of the l o g i c heads: search for a f i x e d key, ins e r t or delete record or a new key to the record, frequency counting based on a f i x e d key, f i l e recopy and restructuring, and garbage c o l l e c t i o n — c o l l e c t i n g a l l hole e n t i t i e s . Simultaneous search for a f i x e d key comprises the basic idea of the l o g i c per track r e t r i e v a l system. Application of such a r e t r i e v a l system to single key records has been described i n d e t a i l by Parker (1970). However,, there i s no comprehensive description how such records can be 13 manipulated within a f i l e . To solve t h i s problem, a f i l e i d e n t i f i c a t i o n mechanism f o r such a r e t r i e v a l system i s introduced i n Chapter 2 , and some accessing problems are also discussed. CHAPTER 2 SOME ACCESSING PROBLEMS 2.1. F i l e I d e n t i f i c a t i o n Mechanism Records are grouped together to form a f i l e or f i l e s ^ r Each record i n a f i l e consists of the same type of information held i n an i d e n t i c a l format as every other record i n the f i l e . An example i s a personnel f i l e that consists of employee records. There i s one record for each employee and a l l records hold the following data: employee number, family name and i n i t i a l s , s o c i a l insurance;number, s k i l l code(s), starting- employment date, address etc. These data elements are present i n a l l employee records. However, to retrieve a record, we must know before hand the name of the f i l e to which i t belongs, and i n p a r t i c u l a r the physical l o c a t i o n of the f i l e i n the memory storage. It follows that a storage area i d e n t i f i c a t i o n mechanism for the device should be established. Functions of such a mechanism are (1) to manage f i l e names, and (2) to manipulate data on the storage device. The mechanism would esta b l i s h a f i l e name within a context by constructing a physical f i l e i d e n t i f i e r . A user refe r s to a s p e c i f i c f i l e by a symbolic name. For each symbolic name i n a given context there i s a uniquely defined 1 5 i n t e r n a l physical representation i n the form of a b i t s t r i n g . This b i t s t r i n g i s the unique i d e n t i f i e r associated with the symbolic name given by and referenced by the user. Moreover, the physical f i l e i d e n t i f i e r w i l l be the key item written on the very beginning of the quadrant boundary of each track where the f i l e resides. Supposing that no two f i l e s w i l l share the same track quadrant, then the physical f i l e i d e n t i f i e r w i l l i d e n t i f y the actual l o c a t i o n of the f i l e i n the memory. Those unused quadrants w i l l be i d e n t i f i e d by a uniquely designed key which i s also written on the very beginning of the quadrant boundaries. In t h i s manner, a l l quadrants are named, and the very f i r s t key i n each quadrant i s the physical f i l e i d e n t i f i e r . To manipulate data on the f i l e , a l l l o g i c heads w i l l be f i r s t positioned on a . quadrant - -boundary',: - then- the -' physical f i l e i d e n t i f i e r associated with the symbolic f i l e name w i l l be broadcast from the central source to a l l l o g i c heads so that the key for the f i l e name i s searched throughout the entire memory. I f the quadrant i d e n t i f i e r matches the f i l e name, then the corresponding l o g i c head remains active, otherwise i t i d l e s . The • pr.ocBjss'cof "searching.^for^the f i l e n a m e repeated for every quadrant. This scheme obviously makes use of the f a c i l i t y f or simultaneous search. Moreover, i t w i l l simplify the implementation of the I/O system, since most of the d e t a i l i n locating physical records i s performed "by hardware. 2.2. Some Accessing Problems Consider the problem of processing i n f o r m a t i o n on the di s k . Suppose that a set of records must be r e t r i e v e d from the d i s k storage. These records may or may not vary i n len g t h but are a r b i t r a r i l y d i s t r i b u t e d . The b i t stream which contains holes, keys and data items must be o r i g i n a t e d from a f i x e d l o c a t i o n on the t r a c k , as otherwise we cannot t e l l what each b i t represents. Thus, there would be an average 1 l a t e n c y of g r e v o l u t i o n to get to the s t a r t i n g point." The _ ; - lat e n c y i s reduced by using f o u r s t a r t i n g p o i n t s , t h a t i s quadrant boundaries, which d i v i d e each d i s k t r a c k i n t o four equal segments. With quadrant segments, the average l a t e n c y i s reduced to g r e v o l u t i o n ; a f t e r one has reached the quadrant boundary there i s the usual average r o t a t i o n a l delay of h a l f a r e v o l u t i o n to f i n d the record. I n what f o l l o w s are b r i e f d e s c r i p t i o n s of the three b a s i c operations (Parker, 1 9 7 2 ) : search, i n s e r t i o n and d e l e t i o n , and t h e i r execution time i n term of d i s k r e v o l u t i o n s . 1. Search operation: steps d e s c r i p t i o n s 1 P o s i t i o n l o g i c heads to the next quadrant boundary and search f o r the f i l e name (g r e v o l u t i o n on average). 2 Search f o r the f i x e d key (less than or equal to one revolution to f i n d the matching key plus one revolution hack to the record p o s i t i o n again). 3 Read the matching record from the disk ( f r a c t i o n T of a revolution i d e n t i c a l to the length of the record). I f there i s only one record to be searched, then the operation i s completed. Otherwise, two extra steps (4-, 5) are needed for more than one matching record. 4 Return to the p o s i t i o n of the l a s t matching point- (1-T revolution). 5 Repeat from step 2 to step 4 u n t i l the search (step 2) has . 'returned to; the quadrant boundary at which i t began- at- r. step 1. Insert operation: "steps descriptions 1 P o s i t i o n l o g i c heads to the next quadrant boundary and search for the f i l e name (g revolution on average) 2 Search for a proper hole, large enough to contain the record that i s to be inserted (less than or equal to one revolution to f i n d the l o c a t i o n of a proper hole entity plus one revolution back to the beginning of the hole entity again). 3 Write the new record and a new smaller hole entity - i f any i s l e f t - into the place of the o r i g i n a l 18 hole e n t i t y (a f r a c t i o n T of a r e v o l u t i o n which i s i d e n t i c a l to the l e n g t h of the o r i g i n a l hole i n di s k r e v o l u t i o n s ) . 3. Delete operation: steps d e s c r i p t i o n s 1 P o s i t i o n l o g i c heads to the next quadrant boundary and search f o r the f i l e name (g r e v o l u t i o n on average). 2 Search f o r the f i r s t r ecord ( l e s s than or equal to one r e v o l u t i o n to f i n d the f i r s t matching key p l u s one r e v o l u t i o n hack to the record p o s i t i o n again). 3 Write a hole i n t o the place of the re c o r d (a f r a c t i o n T of a r e v o l u t i o n i d e n t i c a l to the l e n g t h of the r e c o r d ) . I f more than one record i s to be deleted, then two more e x t r a steps (4 and 5) are needed). 4 Return to the p o s i t i o n of the l a s t matching p o i n t (1 - T r e v o l u t i o n ) . 5 Repeat from step 2 u n t i l the search (step 2) has r e t u r n i t o >• the quadrant boundary at which i t began ...•-in. step 1. I t f o l l o w s from the above d e s c r i p t i o n that even i f i t can be guaranteed t h a t there i s at most one match to the r e c o r d key, the b a s i c operations ( i . e . , search, i n s e r t i o n , d e l e t i o n ) cannot be performed on the same r e v o l u t i o n due to the' f a c t 19 t h a t the same p h y s i c a l head can carry out only one p r i m i t i v e i n s t r u c t i o n at at time. I f these operations are to he performed i n l e s s than one r e v o l u t i o n time, - _•' an a d d i t i o n a l mechanism i s r e q u i r e d . 2.3. The A d d i t i o n a l Mechanism I t i s the purpose of t h i s t h e s i s to develop the a d d i t i o n a l mechanism (Chapter 3) i n order to speed up the basi c operations. The f o l l o w i n g i s a b r i e f summary of such a d d i t i o n a l mechanism: To speed up the access time, we s h a l l introduce a second l o g i c head on each t r a c k . The second l o g i c head, l i k e the f i r s t l o g i c head, i s a l s o s e n s i t i v e to each p r i m i t i v e e n t i t y , the s t a r t i n g and f i n i s h i n g of each record i n the b i t stream. I t s f u n c t i o n i s to read a r e c o r d from the t r a c k or to w r i t e a record onto the t r a c k . Moreover the second l o g i c head, when coupled w i t h the o r i g i n a l l o g i c head over each t r a c k , w i l l enable the device tt> - perform some bas i c operations such as search, i n s e r t i o n and d e l e t i o n w i t h i n a s i n g l e r e v o l u t i o n . This means that the o r i g i n a l l o g i c head w i l l search f o r a r e c o r d key or a hole while performing these b a s i c operations, and then the second l o g i c h e a d . w i l l carry out the read/write i n s t r u c t i o n f o r the operation. Next, consider the simultaneous search i n s t r u c t i o n w i t h only one l o g i c head over each t r a c k . Suppose th a t two l o g i c heads f i n d matches at the same time; then these two 20 matched records must be read, simultaneously. However, t h i s may be impossible i f a great many simultaneous matches are found. To resolve such a c o n f l i c t some record(s) w i l l be processed f i r s t . For example, simultaneously matched records may be processed according to the sequence number assigned to each track. However, t h i s may require as many extra revolu- tions as the number of those records that are found. This i s of course another i n e f f i c i e n t r e t r i e v a l scheme. To give a general solution to the simultaneous matching problem, each record w i l l be prefixed by a mark enti t y , and a record counter w i l l be added to the device. The record counter i s a b u i l t i n device that saves the number of matched records to be processed. Thus, i n the simultaneous match, the number of matching records w i l l be added to the counter. Whenever a matching record i s processed, i t w i l l be marked, and the record counter decreased by one. A matching record w i l l not be processed twice since i t i s i d e n t i f i e d by the marking mechanism, and a l l matching records w i l l be retrieved as the completion i s governed by the content of the record counter. CHAPTER 3 THE MODIFIED MEMORY DEVICE' 3 . 1 . The Modified Memory Device The main feature of the modified memory device i s that there i s another l o g i c head associated with each ex i s t i n g l o g i c head on every track. These two l o g i c heads remain a constant distance of d h i t s apart. One w i l l he c a l l e the control head (CH) while the one following w i l l be c a l l e d the read/write head (RWH). The functions of a l l head couples are synchronized by a single monitor which we s h a l l describe i n d e t a i l l a t e r . The structure of the control and read/write heads i s given by Figure 3.1. The control head has the same structure as described i n (Parker, 1 9 7 0 ) , and the same functions for i t s re g i s t e r s : b i t , length, a u x i l i a r y and operation. The RWH contains a b i t r e g i s t e r , a length r e g i s t e r and an operation r e g i s t e r which have the same functions as i n the control head. I t also contains a c i r c u l a r read/write delay r e g i s t e r of d b i t s i n length where d i s the distance between l o g i c heads on the same track. The b i t s of the delay r e g i s t e r are scanned c y c l i c a l l y by the RWH at the same rate as b i t s on the track i t s e l f . Consequently, a delay b i t set when the CH sees 22 a certain point on the track w i l l he scanned again when the same point passes the RWH. The monitor also contains a record counter r e g i s t e r which i s common to a l l l o g i c heads. Whenever the record key i s matched successfully hy a control head, the l a t t e r n o t i f i e s i t s associated read/write head to set the delay and also n o t i f i e s the monitor unit to add one to the record counter. Figure 3«2 sketches the delay device and the physical locations of i t s b i t s . By i n i t i a l i z a t i o n of the delay device we mean a l l b i t s of the delay memory are set to ' 0 ' . To set/reset (or turn on/turn off) the delay device means a l/O b i t i s written at the indicated l o c a t i o n by the RWH. The b i t w i l l also be referred to as the delay b i t t We s h a l l return to t h i s topic l a t e r . The following common operations may be b u i l t up from the i n d i v i d u a l operations of control heads: search for quadrant name, search for a key of s p e c i f i c pattern, search for a hole of s p e c i f i c length, unmark a record, n o t i f y the con t r o l l e r when a match i s found, n o t i f y the associated RWH to read, to write or to delete the matching record. The read/write head which follows the control head over the same track has the following c a p a b i l i t i e s : read/write/delete the record from/to the 23 ^RWH 3 CH TRACK MONITOR CONTROL CHANNEL PROCESSOR DATA CHANNEL PROCESSOR RECORD COUNTER REQUEST • READ : WRITE • • • • MAIN MEMORY DATA CHANNEL CH 3 TRACK RWH DISK - REVOLUTION CONTROL HEAD LENGTH REGISTER AUXILIARY REGISTER BIT REGISTER OPERATION REGISTER READAlRITE HEAD LENGTH REGISTER BIT REGISTER OPERATION REGISTER READ WRITE DELAY MAIN MEMORY Fi g u r e 3»1» System c o n f i g u r a t i o n w i t h r e q u e s t queue, re a d and w r i t e queuing b u f f e r s read/write delay 0 1 2 delay "bit m d-n...d-2 d-1 0 0 0 0 0 1 0 0 0 I match record address match found 0 0 mark h i t d h i t s key 1 JRWH CH Figure 3.2. The c y c l i c memory delay device of the associative memory.and i t s l i n e a r map to the disk-, track. The delay h i t i n the delay device i s the h i t which corresponds to the address of the matching record. J 25 specific location as instructed "by i t s control head or by the controller, mark the matching record, read records into the buffer in the garbage collection instruction. As illustrated i n Figure 3.1, attached to one end of the control channel controller are the control heads. Similarly, attached to the two ends of the data channel controller are RWH's and read/write buffer queues. Communication between the data buffers and the main memory i s serviced by the main memory data channel. A l l outstanding requests are queued up i n the request queue. Some possible requests are the following^.: insert a record, delete a record, retrieve a record or a set of records, garbage collection, create a f i l e , delete a f i l e , retrieve a l l f i l e names. We shall c a l l these requests primitive operations with the property that only one at a time can be serviced by the monitor. The activities of the control heads, read/write heads, control channel, data channel and the main memory data channel are supervised by the monitor. The function of the 26 control channel i s to support the a c t i v i t i e s of the control heads. In p a r t i c u l a r , i t broadcasts the f i l e name and therffecord key -in the c y c l i c b i t pattern to each active l o g i c control head i n performing a search operation. In the meantime, i t should be able to n o t i f y the monitor when the record i s found. Thus i t i s a f a i r l y sophisticated multiplex channel processor. The function of the data channel i s to transmit data between read/write heads and data buffers, and to control the a c t i v i t i e s of the read/write heads. 3.2. Generalized Physical Data Format Primitive information e n t i t i e s to be stored on the l o g i c per track disk w i l l be generalized to include the 'mark' entity: indicator b i t s mark b i t 00 0 or 1 The mark entity i s a one b i t entity, and i s always the f i r s t e ntity of a record i n the b i t stream. A 'hole' record w i l l contain two e n t i t i e s : the mark entity followed by the hole entity. A data record w i l l contain the following generalized entity patterns: (mark, key key, data). The following example demonstrates an important concept on the key part of a data record. Consider a simple data record pattern: (mark, key, data) 27 The data record may contain personnel information, i n which the s o c i a l insurance number i s the key. I t may contain p a y r o l l information, i n which the employee number i s the key. These two d i s t i n c t records share the same record pattern i n a single f i l e . Thus, a search for the s o c i a l insurance number might end up with an employee number. In order to i d e n t i f y the record key properly, the key must be q u a l i f i e d by the object r e f l e c t e d by the key. The mark entity not only leads a physical record i n the b i t stream, but also indicates the end of the preceding physical record. With t h i s dual function of the mark entity, the l o g i c heads can i d e n t i f y each variable length record i n the b i t stream. Because of t h i s c a p a b i l i t y , the logic-per- track associative memory possesses a l l features of the conventional disk storage device, and i s also an e f f i c i e n t information r e t r i e v a l oriented device. 3«3 ' The Dynamic Behaviour of The Memory The dynamic behaviour of the memory (that i s , the complex of main memory, monitor, control channel, data channel, l o g i c heads) i s b r i e f l y i l l u s t r a t e d i n Figure 3.3. For s i m p l i c i t y , the memory i s represented by f i v e columns: the f i r s t column represents the monitor, the second column the control channel processor and the t h i r d column the control heads. Column four i d e n t i f i e s data channel and read/write head a c t i v i t i e s , and f i n a l l y , column f i v e the main memory 28 MONITOR CONTROL CHANNEL CONTROL MEAD DATA CHANNEL BAIN MEMORY READ/WRITE HEAD DATA CHANNEL l a : next request * ar r i v e s l b : a c t i v a t e s c o n t r o l channel 2a: i n i t i a l i z e s a l l control heads and position them to quadrant boundaries ONE REVOLUTION II .a Phase one: search for f i l e name Ic : i n i t i a t e s search for f i l e name 2b: broadcasts 3a: a l l control heads f i l e name to search f o r the a l l control f i l e name heads 3b: each co n t r o l head remains active i f i t finds the f i l e name matched Id : i f necessary, activates main main memory data channel II .b Phase two: search for the record key •3c: active c o n t r o l heads unmark record during the f i r s t r evolution l e : i n i t i a t e s search key i n s t r u c t i o n 2c: broadcasts key 3d: active c o n t r o l heads in c y c l i c search for the pattern to a l l record key active c o n t r o l heads 3e: i f a cont r o l head finds a match, then i t (1) n o t i f i e s the monitor which in turn adds 1 t o record counter during the 1st revolu- t i o n , (2) marks on the delay device of RWH If : activates data channel i f necessary I I've Phase three: read/write operation l g : i r react butter available and the data channel i s free, then broadcast the ready signal to a l l active RWH's ta: the RWH being triggered by the ready s i g n a l and the delay b i t does the following: (1) resets the delay b i t , (2) marks and r . reads the matching record; in the meantime, the monitor substracts 1 from the record counter. tb: other RWH's being wakened by t h e i r delay clocks w i l l just reset the delay, b i t .continued 2 9 MONITOR CONTROL CHANNEL CONTROL HEAD DATA CHANNEL MAIN MEMORY READ/WRITE HEAD DATA CHANNEL l i : i f necessary, activates main memory data channel transmits data to main memory TEST FOR COMPLETION l j : i f record counter^O then another revolution i s needed and BOX II repeats; otherwise requst i s completed. Figure 3.3. Control processes f o r the request: retr i e v e a set of records with the given key. 30 data channel. The a c t i v i t y of various boxes i n Figure 3-4 depends upon the s p e c i f i c request being served by the memory. By a request (or a p r i m i t i v e operation) we mean the quadruple: (1) the f i l e name, (2) the re c o r d key, (3) the request type, and (4) the data. A l l the b u i l t - i n operations of the l o g i c head couple, and a l l the p r i m i t i v e operations described i n the l a s t s e c t i o n form some basic request types f o r the device. I n some requests, such as read a record from or w r i t e a re c o r d to a s p e c i f i c l o c a t i o n , the rec o r d key may be a dummy key, which w i l l thus d i s a b l e the a c t i v i t i e s of the c o n t r o l heads. With such a dummy rec o r d key, the search f o r the record key w i l l not take place. A request to r e t r i e v e a r e c o r d or a set of records w i t h a given key i n a f i l e means a l l records w i t h the matching key w i l l be r e t r i e v e d . The s e r v i c e of each request i s d i v i d e d i n t o three major steps: ( 1 ) i n i t i a l i z a t i o n , box I , (2) operation, box I I , and (3) t e s t f o r completion, box I I I . We s h a l l describe the a c t i v i t i e s of the memory u s i n g a s p e c i f i c request: r e t r i e v e a set of records w i t h key i n a f i l e . At the moment, we s h a l l ignore most of the i s s u e s i n v o l v e d w i t h the need to synchronize the var i o u s p a r a l l e l a c t i v i t i e s of the memory. Suppose that the memory i s i n s t r u c t e d to r e t r i e v e a set of records w i t h the given key. The f o l l o w i n g steps must 31 be performed: a) P o s i t i o n a l l control heads at the quadrant boundaries. b) Search f o r the f i l e name which i s the f i r s t key i n the quadrant. c) Any control head which fi n d s i t h e f i l e name matched w i l l remain active, others w i l l i d l e to the next quadrant boundary. d) A l l active control heads search for the key. e) I f a control head finds a match, then the record w i l l be read by the associated RWH, but meanwhile, the delay b i t i s set. f) The outstanding matched record count w i l l be kept i n the record counter r e g i s t e r . g) Continue step d to step e thoughout the quadrant. h) At the end of one revolution, step b to step g w i l l be repeated i f there i s an outstanding matched record. The execution of t h i s request begins by the monitor i n s t r u c t i n g the control channel to i n i t i a l i z e a l l control heads and p o s i t i o n a l l control heads to the next quadrant boundary. These steps are represented by box I, and are executed once for each request. The next step i s to f i n d the quadrants where the f i l e resides, that i s , to execute box II.a. Note that we i d e n t i f i e d the f i l e name by the quadrant name, which i s the very f i r s t key on each quadrant. Accordingly, using 32 the simultanuous search feature of the device, the f i l e name i s broadcast from the control channel to a l l l o g i c control heads, searching f o r the f i l e i d e n t i f i e r key. F i r s t of a l l , the length of the f i l e name w i l l be loaded into the length r e g i s t e r , which i s used to compare with the length of quadrant name. I f the lengths are not equal, then the search i s unsuccessful and the corresponding control head becomes i d l e . Consequently, i t remains to compare those quadrant names with the same length. As soon as the l a s t b i t of the f i l e name i s broadcast, a l l matches w i l l be found. The successful heads w i l l remain active, whereas.others w i l l become in a c t i v e . Accordingly, the f i l e l o c a t i o n i n the p a r a l l e l quadrants i s well defined so that records w i l l be retrieved from the proper quadrants. This step i s represented by box II.a and w i l l be executed once for every quadrant. The next and p r i n c i p a l step (box II.b) i s to search f o r the key and read every record (box II.c) with a matching key. One of the primary functions of each active control head i s to unmark each record whenever the record comes under the control head i n the f i r s t revolution. The key to be searched w i l l be continuously and c y c l i c a l l y broadcast to a l l active control heads by the control channel c o n t r o l l e r throughout the p a r a l l e l quadrants. I f a control head finds a match to the record key, then the monitor w i l l be n o t i f i e d v i a the channel c o n t r o l l e r . The channel c o n t r o l l e r adds one to the record counter r e g i s t e r . 33 In the meantime, the control head w i l l n o t i f y i t s associated head to mark the delay device. The record counter i s used to ensure that a l l matching records are processed f o r the current request. I f a read buffer i s available and the data channel i s free, the monitor i n i t i a t e s a control signal, c a l l e d the ready signal, to a l l active RWH's. The signal w i l l be terminated i f either a buffer i s not available or the data channel i s not free. The RWH, when triggered by i t s delay clock, checks for the ready signal. I f there i s no ready signal, theftsdst justere^setsgthe,,delayjiclocfciYsOtfierwise, i t does the following: (1) resets the corresponding b i t i n the delay device, (2) reads the matching record into the s p e c i f i c l o c a t i o n i n the read buffer queue,, (3 ) n o t i f i e s the channel c o n t r o l l e r to subtract one from the record counter, and ( 4 ) marks the matching record. An acknowledge signal i s generated when the reading i s completed. The monitor, once i t receives the acknowledge signal, activates the main memory data channel and the record i s transmitted to the main memory. The above occurences w i l l be repeated during one complete revolution. At the end of the revolution, the monitor checks the record counter r e g i s t e r . I f the record counter i s greater than zero then another revolution i s required and 34 box II w i l l be repeated. A l l active control heads w i l l f i n d the same matches as i n the previous revolution. However, i n the second and subsequent revolutions, the matching control head w i l l n o t i f y i t s associated RWH head to mark the delay device only i f the matching record has not already been marked by the RWH. Thus the marking mechanism assures that each a a matching record w i l l be processed once only. Some remarks follow: 1. From the above description, at least one revolution i s necessary for r e t r i e v i n g a set of records of the given key. I f we know that there i s at most one match for the given key, then the request i s completed as soon as the matching record i s read by the RWH, regardless of the rest of the revolution. This w i l l save the unnecessary latency time for the next request. 2 . The record counter has i t s contents changed i f (a) the associated control head finds' a match, or (b) the RWH begins to read the matching record. In case (a), the record counter w i l l be increased by 1, and the corresponding b i t of the delay device w i l l be marked. In case (b), the record counter w i l l be decreased by 1, and the corresponding b i t of the delay device w i l l be turned off. These two events may, however, occur simultaneously. To resolve such a c o n f l i c t , the concurrent processes must be interlocked i n order to keep the outstanding matching count up to date. We should keep i n mind that the above description i s 35 only an example. Actually the memory must he able to carry out a variety of primitive i n s t r u c t i o n s , and the actual a c t i v i t y of most boxes i n Figure 3 i s therefore a function of the i n s t r u c t i o n being served. There should be a way to select the desirable algorithm each time. 3.4. Synchronization of The P a r a l l e l A c t i v i t i e s of The Memory In what follows, we s h a l l discuss the need to synchronize the various a c t i v i t i e s of the co n t r o l l e r with each other and with the rot a t i o n of the disk. Such synchronization imposes constraints on the sequential processes of the memory that may run concurrently. 1. The co n t r o l l e r cycle (Box II) i s the t o t a l time that the monitor takes to execute boxes l c to I f . Box l c must be executed i n a time very s i g n i f i c a n t l y l e s s than the time f o r the disk to move k b i t s , where k i s the length of the f i r s t two f i e l d s i n a * primitive i n f ormat ion item. 2. I f more than one active RWH head i s triggered by i t s delay clock simultaneously, then these heads check f o r the ready signal. I f there i s a ready signal, then the channel c o n t r o l l e r , which controls the a c t i v i t i e s of l o g i c heads, w i l l enable only one of these RWH heads. 3.5- Performance of The Memory We s h a l l now try to j u s t i f y the revised memory device by describing i t s performance i n carrying out the operations 36' which were defined as primitive instructions (that i s , requests to the memory). Basic to every request i s to p o s i t i o n 1 the control heads to a quadrant boundary, which take -g- revolution on the average. 1. R e t r i e v a l of a single record This i s , of course, based on the assumption that there i s at most one match for the given key. The average time i s g + 2 ^ g revolution. In the conventional f i x e d head f i l e , i t takes 2 revolution to p o s i t i o n the head to the s t a r t i n g point and since every track must be searched sequentially, t h i s takes another g revolution(s) to r e t r i e v e the record, where n i s the number of tracks allocated to the f i l e , so that the t o t a l time i s -|-(n+l) revolutions. However, some f i x e d head disks can detect segment boundaries (for example, IBM 2305 disk); then the average r e t r i e v a l time i s reduced to only revolution's). Thus, the conventional device may be more e f f i c i e n t i f the f i l e has only one track. In what follows, we s h a l l assume that the conventional device i s a f i x e d head disk whose heads are sensitive to segment boundaries so that the time to p o s i t i o n the heads to the next s t a r t i n g point i s n e g l i g i b l e . 2. Insert a record In the conventional f i x e d head disk, the i n s e r t operation has two steps: (1 ) getting to the point where the item has to be written: -| revolution. 3 7 ( 2 ) the actual w r i t i n g process. Since, i n general, the time required to perform step 2 i s s i g n i f i c a n t l y less than step 1, the service time i s |r revolution. To inse r t a record i n our memory has three steps: ( 1 ) p o s i t i o n control head to the quadrant boundary (g revolution), ( 2 ) search for an empty sl o t (hole) of appropriate size (g to 1 revolution), ( 3 ) write the record on the s l o t found, and ( 4 ) write the hole e n t i t y on the t a i l of the s l o t . Steps 3 and 4 need approximately s^ revolution time, \_ same as the conventional device, where i s the revolution time to write s b i t s of information on the device. Again, since t h i s i s s i g n i f i c a n t l y l e s s than the time to perform steps 1 and 2 , i t i s neglected i n the following discussion. Step 2 needs up to one revolution i f there i s only one quadrant allocated to the f i l e , or a l l quadrants allocated happen to be i n the same disk quadrant. Since t h i s w i l l be the worst case, i t should be avoided i f possible. I t follows that a f i l e should occupy at least four consecutive non-parallel quadrants, since i f the empty slot s are uniformly d i s t r i b u t e d over the quadrants, then we can write on the f i r s t appropriate empty s l o t of the f i r s t quadrant encountered. In t h i s case, step 2 requires on the average g revolution, which gives the t o t a l service time of \ revolution. Thus, i f the f i l e i s not 38 heavily loaded, we have a very good chance of improving the t o t a l service time. 3. Insert a set of records In the conventional device, t h i s takes (•§• + s^) revolution. Since i § T i s the same i n both devices, we s h a l l drop i t i n the following comparison. In our memory, t h i s depends heavily on the d i s t r i b u t i o n of empty s l o t s on the f i l e . The worst case w i l l be that these records have to be inserted into four nonparallel quadrants. This w i l l need i n general one revolution time. However, i f we can i n s e r t a l l records i n the f i r s t quadrant encountered, then the t o t a l service time i s reduced to \ revolution (excluding s^). Here i s the average time d i s t r i b u t i o n (excluding s^): Number of nonparallel quadrants encountered: 1 2 3 4 1 1 3 Service time, m revolutions: •JJ -g jfj; 1 4. Deletion This i n s t r u c t i o n has 3 steps: ( 1 ) p o s i t i o n control heads to the quadrant boundary, (2) search for the key (-g to 1 revolution), (3) write a hole entity on the matching record. I f there i s more than one match on the f i l e , then the service time i s -| revolutions. However, i f there i s a single match, then i t i s about -g revolution. In the conventional device, since every track should be searched sequentially, i t takes 39 revolution to search f o r the key, and. then erase the record, where n i s the number of tracks allocated to the f i l e . 5. Create or delete a f i l e This takes at most one revolution to write/delete the f i l e name, i . e . change the quadrant name(s). However, i n the conventional device, i t i s necessary only to create/delete a f i l e directory entry which, of course, requires \ revolution on the average. 6 . Garbage c o l l e c t i o n Garbage c o l l e c t i o n i s the most i n e f f i c i e n t service. I t w i l l be c a l l e d when: ( 1 ) the f i l e i s highly loaded, ( 2 ) a large number of small hole e n t i t i e s have been created as a r e s u l t of insert/delete i n s t r u c t i o n s , or (3) no empty s l o t of proper size i s found i n performing an i n s e r t i n s t r u c t i o n . Garbage c o l l e c t i o n i n the f i r s t two cases i s i n i t i a t e d by request, whereas i n the t h i r d case i t i s i n i t i a t e d automatically by the monitor. Garbage c o l l e c t i o n i s e s s e n t i a l l y equivalent to the act of f i l e reorganization which r e s u l t s i n improving the service time of other requests. However, the service time, as we have pointed out previously, depends on the loading factor and the d i s t r i b u t i o n of hole e n t i t i e s . Hence, additional quadrants may be required i f a f t e r garbage c o l l e c t i o n , each quadrant i s s t i l l highly loaded. 40 To improve the e f f i c i e n c y of garbage c o l l e c t i o n , we s h a l l assume tha t a free p a r a l l e l quadrant and f r e e b u f f e r of one quadrant i n l e n g t h are always a v a i l a b l e . The f u n c t i o n of the a c t i v e c o n t r o l head of the f i l e w i l l be read only, t h a t i s , t r a n s m i t t i n g non-hole e n t i t i e s , one b i t at a time to the f r e e b u f f e r v i a the c o n t r o l channel. The data channel w i l l then transmit the corresponding data to the RWH of the f r e e quadrant. I f a f t e r the garbage c o l l e c t i o n , the quadrant i s s t i l l h e a v i l y loaded, then an extra•quadrant i s r e q u i r e d . The f i r s t h a l f of the records w i l l be w r i t t e n on the immediate next f r e e quadrant, and the other h a l f w i l l be w r i t t e n on the subsequent quadrant. New quadrants a l l o c a t e d to the f i l e w i l l be marked so t h a t garbage c o l l e c t i o n w i l l not be a p p l i e d t o them. A f t e r the garbage c o l l e c t i o n , i t r e q u i r e s an,'extra r e v o l u t i o n to unmark a l l quadrants of the f i l e . 3.6. A l t e r n a t e to the M o d i f i e d Memory Device As we have seen so f a r , the fundamental mechanisms th a t makes the l o g i c head couple work are the delay c l o c k device and the record counter r e g i s t e r . I t may be expensive to implement the delay clock mechanism f o r each RWH. I n f a c t , w i t h a matching r e c o r d queuing b u f f e r , these can be replaced 41 "by a s i n g l e r e c o r d counter r e g i s t e r i n the channel c o n t r o l l e r ( f i g u r e 3.4). Each e n t i t y i n the matching r e c o r d queue contains a record address. Attached to one end of the c o n t r o l channel are the request queue and the matching r e c o r d queue. When the c o n t r o l head f i n d s a match to the r e c o r d key, i t performs the f o l l o w i n g : (1) adds one to the record counter r e g i s t e r during the f i r s t r e v o l u t i o n , (2) n o t i f i e s the monitor of the matching record. Whenever the monitor r e c e i v e s the matching' s i g n a l , i t puts the address of the matching record i n the matching r e c o r d queue i f there i s a fr e e entry i n the queuing b u f f e r . I n case there i s no f r e e entry, then the matching r e c o r d w i l l be picked up i n the next or subsequent r e v o l u t i o n . I f the read b u f f e r i s a v a i l a b l e and the data channel i s f r e e , the monitor, which c o n s t a n t l y keeps t r a c k of the read/write head l o c a t i o n , s e l e c t s from the matching record queue the r e c o r d which comes f i r s t under a RWH and i n i t i a t e s the ready s i g n a l to the corresponding RWH. The RWH then marks and reads the matching record. Whenever a read i n s t r u c t i o n i s completed, the counter w i l l be reduced by 1 and the RWH w i l l generate an acknowledge s i g n a l to the monitor so t h a t the ..next-head can- be. s e l e c t e d . At the end of the r e v o l u t i o n , the contents of the counter w i l l be checked. I f the contents of the counter i s 0, then a l l 42 MONITOR' c o n t r o l channel p r o c e s s o r , , match bequest r e c o r d d a t a channel p r o c e s s o r r e c o r d counter r e a d w r i t e main memory da t a channel J 1 TRACK TRACK disk. r e v o l u t i o n C o n t r o l Head l e n g t h r e g i s t e r a u x i l i a r y r e g i s t e r b i t r e g i s t e r o p e r a t i o n r e g i s t e r Read/Write Head l e n g t h r e g i s t e r b i t r e g i s t e r o p e r a t i o n r e g i s t e r main memory F i g u r e 3«^» A l t e r n a t e system c o n f i g u r a t i o n 4-3 matching records have been retrieved; otherwise another revolution i s needed. In-the next revolution, a l l control . heads search f o r unmarked records, but do not add 1 to the record counter for matching records, iT.he address of the matching record w i l l be added to the matching record queue and the process i s repeated as i n the previous revolution. In t h i s alternate scheme, there i s one cycle of the selection-read- acknowledge process f o r each matchihg record, and each matching record must be i n the queue p r i o r to being accessed. In order to read adjacent matching records of the same track or of p a r a l l e l tracks sequentially, the time to generate the acknowledge signal and to execute the se l e c t i o n of the matched record should be le s s than two b i t s time, which i s the time f o r the l o g i c head to read the in d i c a t o r b i t s of the mark entity. The RWH must receive the ready signal p r i o r to encountering the mark b i t of the mark entity. I f more than one match to the key occurs concurrently, then the matching signals w i l l be intercepted by the control channel processor. The control channel processor w i l l then i n s t r u c t the record counter to increase the content by the number of matching records; however, i t w i l l place only one of these matching records i n the matching record queue. Since the monitor constantly keeps track of a l l RWH locations, i t can always have the next selected matching record available, and i n the meantime free those matching records whose addresses have 4 4 passed the RWH's. This implies that (1) adjacent matching records can be processed sequentially since the ready signal can be generated within two b i t s time, and ( 2 ) a free entry i n the queue i s available f o r other matching records. We s h a l l see.'.later that t h i s alternate device l i m i t s h i e r a r c h i c a l search i n the simultaneous keying mechanism. 3.?• H i e r a r c h i c a l Search Mechanism Consider the modified memory device described i n section 3*1; that i s , each RWH l o g i c head i s equipped with a delay device, but without the matching record queue for the system (Figure 3 . 1 ) . As we have seen so far, the operational c h a r a c t e r i s t i c of the lo g i c per track disk memory device i s the simultaneous keying mechaism. This simultaneous keying mechanism enables us to e s t a b l i s h another mechanism f o r the device, that i s , an h i e r a r c h i c a l search mechanism on the key part of the record. This, i n fact, i s not yet available i n conventional disk devices. Let k, k^, kg, ... k be keys of a record. A f i n i t e set of primitive predicates may be defined over the key-part of the records. These predicates, i n turn define the hi e r a r c h i c a l search. The following are examples of predicates which are l i k e l y to be included i n the primitive predicates: 45 1 . For a given key k, re t r i e v e a l l records whose key-part has a key matching k, or has no key matching k. These predicates w i l l denoted symbolically by k and -k respectively. In the simultaneous search, the given key i s broadcast over and over again c y c l i c a l l y , and each head compares whatever b i t i s read from the key of the record with b i t being broadcast, This implies that sometimes the l a s t b i t s are compared f i r s t . As we have demonstrated previously, t h i s does not matter on an equal-unequal comparision. However, the equal-unequal comparision mechanism enables us to implement the above predicate. 2 . The next predicate i s the limited combination of 'and' and *or' l o g i c a l operations on the key-part of records. Given a set of keys, k^, kg, k say, i t i s spossible to'.retrieve records whose -key-part, . s a t i s f i e s rthe f o i l owing.-^predicates: k. and k 0 and ... and k , l c m k. or k 0 or ... or k , l c m -k^ and kg or k^ or k^ , k. and k 0 and ... and k. or k.,1 or k., 0 or 1 2 l — i-+l — 1 + 2 -ars ... or k — m etc. The introducation of the marking mechanism enables us to implement the above primitive predicates, and the introduction of the l o g i c head couple on each track w i l l speed up the sevice 46 time. The following example demonstrates a predicate search on the lggic-per-track associative memory. Suppose that we want to access a record whose keys s a t i s f y the following predicate: k^ and kg. Then i n the f i r s t revolution, a l l active control heads w i l l search for key k^. I f a key i s found, then theeoohtrol head w i l l n o t i f y the associated RWH to mark the delay device, as described previously (N.6te that the alternate associative memory described i n Section .'3V.6: "cohfexns^'aaraa^'cTixn'gi; neieaDndl. queue instead of the delay device. A" ̂pnob-lem: occurs ,if:„ a-, match i s found while the matching record queue i s f u l l ; '. " • * t h i s would explain why i t l i m i t s h i e r a r c h i c a l search). This i n f a ct, memorizes a l l matched records within d b i t s distance of each RWH. Then the RWH w i l l i n turn mark the matched record when the matched record passes under the RWH. In the next revolution, a l l active control heads w i l l search the marked records for^key kg. I f key kg i s matched to a key of a marked ne'.ccorLds then the control head w i l l again n o t i f y the associate RWH to turn on the delay b i t . In t h i s manner, when the RWH encounters the delay b i t , i t w i l l leave the record as i t i s marked, otherwise, the RWH w i l l just unmark the record'. I t follows that a l l marked records s a t i s f y the given predicate, 4 7 and a l l d e s i r e d records are ready f o r r e t r i e v i n g . 3 . 8 . E v a l u s t i o n of H i e r a r c h i c a l Search We s h a l l consider two types of h i e r a r c h i c a l search: OR search and AND search, a. 'OR' search: An *0R* search i s a d i s j u c t i o n of equal-unequal searchs, k^ or k 2 — •"" — km* F o r e x a m P l e » ^ i - s "̂ he key r e f l e c t i n g age i n the personal r e c o r d , and i s q u a l i f i e d by the p r e f i x : AGE. Thus a key of age 20 would be represented by 'AGE20". The procedure to r e t r i e v e records s a t i s f y i n g an •OR' c o n d i t i o n i s : STEP 1: P o s i t i o n a l l c o n t r o l heads to the quadrant 1 boundary, g r e v o l u t i o n on average. STEP 2: I n the f i r s t r e v o l u t i o n , the c o n t r o l heads unraark a l l records and search f o r key k^, I f a match i s found by a c o n t r o l head, i t w i l l be marked by the a s s o c i a t e d read w r i t e head, and the record counter w i l l be incr e a s e d by one. At the end of the f i r s t r e v o l u t i o n , a l l matching records are marked and the number of matching records i s contained i n the record counter. STEP 3« I n the next r e v o l u t i o n , the c o n t r o l heads search f o r key kg on the unmarked records. 48 I f a match ' i s found, then the record w i l l he marked hy the RWH and the record counter w i l l .._ . . ' he., increased by one. STEP 4: Repeat STEP 3 f o r keys k^, ... k m < STEP 5- The f i n a l step i s to read a l l marked records. This requires at le a s t one revolution, at most n revolutions, whe're n i s the number of p a r a l l e l quadrants allocated to the f i l e . Therefore, the t o t a l access time i s at least m + ^ revolutions and at most m+n+-g revolutions. b. 'AND' search An 'AND' search i s a conjunction of equal-unequal condition, k. and k 0 and ... and k 1 c m such that each k^ r e f l e c t s a d i s t i n c t object, for example: 'AGES 0' and 'SEXFEMALE'. The r e t r i e v a l procedure i s as follows: STEP 1: P o s i t i o n control heads to next quadrant 1 boundary, g revolution. STEP 2: In the f i r s t revolution, the control heads unmark and search f o r key k^. I f a match i s found, then the record counter i s increased by one, and the associated RWH marks the matched record. Thus at the end of the f i r s t revolution, the number of matched records i s = 4 9 contained i n the-record-counter,'-andrail matched records are marked. I f the record counter i s 0 then the.;,request i s completed:, otherwise proceed to STEP 3« This takes one revolution. STEP,3: In the next revolution, the control heads search on these marked records only. I f a marked record does not'.match key kg, then the record w i l l he unmarked by the associated RWH and the record counter decreased by one. At the end of the revolution, i f the record counter i s 0 then the request i s completed, otherwise repeat STEP 3 for keys k^, ... k m > This step takes at least one revolution and at most m-1 revolutions. STEP 4-: At t h i s point the number . of records s a t i s f y i n g the 'AND' condition i s held i n the record counter and a l l suchr'ecords are marked. I f the counter i s 0 then the request i s completed, otherwise read a l l marked records which takes at least one revolution, at most n revolutions, where n i s the number of p a r a l l e l quadrants allocated to the f i l e . Therefore, the t o t a l access time i s at most (m * n + -g) revolutions, assuming that there i s at least one record s a t i s f y i n g the AND condition and at l e a s t revolutions. CHAPTER 4 STORAGE ORGANIZATION AND ACCESS METHODS 4.1. Chain Storage O r g a n i z a t i o n I n Logic-Per-Track Environment The h a s i c storage o r g a n i z a t i o n and access methods described i n the previous chapter f a c i l i t a t e random f i l e o r g a n i z a t i o n w i t h d i r e c t access. Moreover, the marking mechanism provides an h i e r a r c h i c a l access f i l e o r g a n i z a t i o n . I t i s the purpose of t h i s chapter to develop a more general storage o r g a n i z a t i o n , t h a t i s a chained storage, o r g a n i z a t i o n , and to demonstrate some a p p l i c a t i o n s of such a f i l e o r g a n i z a t i o n i n a simultaneous search environment as the counterpart of the f i l e o r g a n i z a t i o n s and access methods i n conventional d i s k storage devices. The b a s i c method f o r s t r u c t u r i n g and processing records i n a f i l e i n the l o g i c - p e r - t r a c k environment i s tha t records are stored, updated and r e t r i e v e d by key or keys. The c h a r a c t e r i s t i c of the bas i c storage o r g a n i z a t i o n i s tha t each r e c o r d i n a f i l e c o n s i s t s of the same type of i n f o r m a t i o n held i n an i d e n t i c a l format as other records i n a f i l e ; t h i s form of record s t r u c t u r e w i l l be r e f e r r e d to as m u l t i p l e - r e c o r d s t r u c t u r e . An example i s a student f i l e t hat c o n s i s t s of personal i n f o r m a t i o n , semester i n f o r m a t i o n sand-course i n f o r m a t i o n . A t y p i c a l r e c o r d i n a student f i l e may hold data 5 1 given "by Table 4.1. With the logical components of information specified in this table, the student record may be stored as a whole as one physical record (the multiple record approach), or may be segmented into three physical records — personal, semesters and aojufesejs. In either case a record key -must be chosen, say the student number, Moreover, an additional key i s needed in order to differentiate the personal records, semester records and course records in the f i l e (Figure 4.1); the student record can be accessed by the student number, or by both the student number and the student record type. Consequently, records should be stored and retrieved in some way so that each record can be properly identified. One method for suchre'tfor^Cidentification i s given by Figure 4.1: each record i s identified by the student number. A l l these records can be retrieved in random ordering by the student number with a single request to the associative memory. If the records must be retrieved in a fixed order, then they should be chained ' together. A unary chain i s the simplest technique of linking logically related records; related records are linked by a pointer that contains a reference to the next related record in logical sequence so that a l l related records are chained together by successive pointers. Implementation of a chain in the logic-per-track environment i s described below. • ' -52 P e r s o n a l I n f o r m a t i o n : ' s t u d e n t n u m b e r name s o c i a l i n s u r a n c e n u m b e r . ',< s e x '. b i r t h d a t e « p l a c e o f b i r t h ' i n s t i t u t i o n l a s t a t t e n d e d ; , . . p r e v i o u s d e g r e e h e l d r e c o r d t o - d a t e y e a r o f f i r s t a d m i s s i o n , a d d r e s s . S e m e s t e r I n f o r m a t i o n : f a c u l t y , d e p t . a n d y e a r a t t e n d e d A c a d e m i c s e s s i o n t u i t i o n f e e s t a n d i n g _ • • ' e t c . C o u r s e I n f o r m a t i o n : , g r a d e s . c o u r s e n u m b e r c o u r s e name m e e t i n g s d e s c r i p t i o n T a b l e L o g i c a l ^ c o m p o n e n t s i n a t y p i c a l s t u d e n t r e c o r d 53 MARK KEY DATA student ho. personal . semester • courses (Student information i s 'stored as a whole by one record: the key entity holds the student number, and the data entity holds personal , semester and course, information as specified i n the logi c a l structure of the student record. MARK KEY KEY D&TA student no. record type personal; record MARK KEY KEY DATA j student no. record type semester, record i MARK KEY .. KEY DATA student no. record type course record ^Student information i s stored as three records i n the f i l e : person;al semester and courses. Moreover, each record i s identified by the student number (KEY) and the record type. Figure.4.1. Examples of the student record stored i n the otudent f i l e . 5 4 Depending on the software approach being used, the contents of the pointer can be either of the following: 1. Actual address: The pointer i s a key entity which holds an actual physical address. The track address of the lo g i c head, and i t s location, are controlled by the channel c o n t r o l l e r . The chain may be created as follows: Each record of the chain i s allocated a key entity, which w i l l hold the address of the preceding record or,--the succeeding record. I f the pointer holds the address of the preceding record, . then we have a backwards chain; when the current record i s created, i t s address i s saved to he used'as a key of the next record i n the chain. On the other hand, i f the pointer holds the address of i t s succeeding record, each record i n the chain w i l l be created i n two steps: (1) write the current record and save i t s address, (2) inse r t the address into the chain pointer of the preceding record. Because each record i n the chain depends on the physical p o s i t i o n of the next record, a l l pointers may have to be revised after each f i l e reorganization or garbage c o l l e c t i o n . The revised•pHy^sjjcaELrecord formats for the associative memory device introduced i n Section 3>2 can avoid the tangle of actual physical addresses i n the chain structure. A chain organization which u t i l i z e s such a revised physical record format i s described below. In what follows, the chain 55 organization (chain structure) simply refers to the chain u t i l i z i n g such a revised physical record format f o r the associative memory. In other words, records are not chained by actual physical addresses. 2. Key pointer: The pointer i s a key en t i t y of each record i n the chain. An element of the chain i s a structure and at least one f i e l d i n each record i s a pointer value that i s used to point to another record. Obviously, the purpose of the chained organization i s to t i e together the elements of a data structure and the chain key i s properly chosen by the application program. In the logic-per-track environment, each record i n a chain needs two key f i e l d s : one f o r the key of the record, and the other for the pointer f i e l d (Figure 4.2). The pointer f i e l d contains a chain key which i s generated from the key f i e l d of the record i t points to by a simple (and reversible) process such as appending a single s p e c i a l character. For example, i n a student record structure, the personal record has a record key '7500001'> and a pointer f i e l d which contains a chain pointer '$7500002* pointing to a succeeding personal record i n the chain. The chain pointer '$7500002' i s generated from the succeeding personal record by appending a single special character, '$' say, to the key of the succeeding record of the student number 7500002. By continuing t h i s process, the student personal records are chained together. This simple example i s adequate to 56 MARK KEY KEY DATA POINTER F i g u r e 4.2. B a s i c a l l y , two key f i e l d s are needed f o r each r e c o r d i n a c h a i n : one f o r the key of the r e c o r d and the other the c h a i n p o i n t e r . P e r s o n a l Records : MARK KEY POINTER ' DATA 7500001 $7500002 d a t a V'. ... .. J MARK 1 7500002 [$7500003 £ 1. d a t a iMARK 1 7 500003 l i J7500009 | d a t a MARK 7500009 c h a i n p o i n t e r to next r e c o r d d a t a n e x t ' p e r s o n a l _ r e c o r d J F i g u r e 4.3. Sample c h a i n of student p e r s o n a l r e c o r d s . Each r e c o r d in' the c h a i n has the format as shown i n F i g u r e 4.2. . ••'< 57 demonstrate the following important features of the logic-per- track r e t r i e v a l system: a. The chain key of each record i s r e v e r s i b l e . For example, with the chain key of any record, say '$7500003', by applying the reverse procedure, we can obtain the key of the succeeding record: 7500003. Thus the chain i s i n f a c t a binary chain with only one pointer f i e l d i n the chain, which i s impossible i n conventional computer storage. b. Each record i n the chain can be re t r i e v e d by following the chain key, or d i r e c t l y by the key of the record i f i t i s known. In general, a record i n the chain can include any number of pointer f i e l d s . In the simultaneous search environment, each record i n the storage i s uniquely i d e n t i f i e d by one of i t s key(s). I f more than one record possesses the same key, then a simultaneous search f o r the key would encounter a l l these records regardless of t h e i r physical positions i n the chain. I f the order of records i n the chain i s important, then elements i n the chain must be linked i n the desired order. We s h a l l c a l l such a chain a linked list:,; analogous to l i s t structures i n computer storage. From the remark above, no two d i s t i n c t records may contain i d e n t i c a l keys which serve as pointers i n a linked l i s t . The key i n the pointer f i e l d of a chain may be generated by a simple and reversible procedure. 58 The chain may he ended with an end-of-chain marker, or by discontinuing the pointing key. The end-of-chain marker i s a s p e c i f i e d key value. Each chain record once accessed w i l l be checked against t h i s key by software, i n order to test fo r the end-of-chain. I f the chain i s terminated by discontinuing the chain key, then an attempt to access the record with .the chain key of the record w i l l f a i l to match any record i n the f i l e , , and accordingly, w i l l signal the end of the chain. The f i r s t record of the chain may be acce'ssed by one of i t s keys. This key may be ( 1 ) the chain key of some, other record, ( 2 ) obtained from some other f i l e , or ( 3 ) supplied by the user, dependhg on the data structure of the f i l e . The following i s the procedure to ret r i e v e the entire forward chain: ( 1 ) Retrieve the f i r s t record of the chain (-g revolution on average). ( 2 ) Generate the key of the succeeding record from the chain key of the current record; r e t r i e v e the succeeding record by t h i s key (-g revolution on average). ( 3 ) Repeat step ( 2 ) u n t i l the end of the chain i s encountered. Thus, with forward l i n k i n g , we must f i n d and read the record with key K before we can locate i t s successor. Step ( 2 ) reads, with backward l i n k i n g : construct a chain key for the predecessor from the key K of the current record, the 'R' reprieve the preceding record by i t s chain key. 59 4 . 2 . Some Aspects of Chain S t r u c t u r e s Some of the most important aspects of p r o c e s s i n g a chained f i l e are the'.'following: 1. F i l e access -- s e q u e n t i a l method, 2 . F i l e maintenance -- making i n s e r t i o n s and d e l e t i o n s , 3. D e s c r i b i n g and b u i l d i n g more complex data s t r u c t u r e s . 1. S e q u e n t i a l F i l e Access Seq u e n t i a l access means that the data elements ( t h a t i s , l o g i c a l records) are referenced i n a manner dependent upon the sequence i n which, data, . elements are stored. The sequence of records i s . l i n k e d by a chain, and the f i l e records can be accessed s e q u e n t i a l l y by the procedure described i n S e c t i o n 4 . 2 . 2. I n s e r t i o n s and D e l e t i o n s I f the l o g i c a l sequence of records i n a chain i s of no s i g n i f i c a n c e , i n s e r t i o n s are simply arranged. I f a r e c o r d i s to be i n s e r t e d i n the chain, a p..inter i s created i n the new record to reference the f i r s t r e c o rd i n the chain. The key of the f i r s t r e c o rd i n the chain i s g e n e r a l l y held i n the master re c o r d ( f o r example, a r e c o r d w i t h a •dummy-key would work w e l l ) . The p o i n t e r held i n the master record i s then a l t e r e d to reference the newly i n s e r t e d record. Where the l o g i c a l sequence of records w i t h i n a chain i s important, then w i t h the conventional device, the chain must be f o l l o w e d from the beginning u n t i l the i n s e r t i o n p o i n t i s reached. However, 6o t h i s may not he necessary i n the logic-per-track environment.. Insertion need not requirelthat the chain he followed from the beginning i f the key of the record at the point of i n s e r t i o n i s known. In t h i s case, the record can be retr i e v e d d i r e c t l y by simultaneous search. The pointer of the preceding (or succeeding) record i s then altered to reference the new record. The o r i g i n a l pointing key i n the preceding (or.succeeding) record i s inserted i n the new record, which then re-establishes/ the l i n k to the next record i n the chain. The same i s true of the conventional device, i f the address of the record at the point of i n s e r t i o n i s known. However, i t requires that the address of the record i n a. chain be stored somehow separately whether i n the core, memory or i n another disk f i l e . The deletion of records i s an equally simple matter. Regardless of the l o g i c a l sequence of the records i n the ,chain, the record to be deleted can be located and can be re t r i e v e d d i r e c t l y by simultaneous search. Moreover, the successor (or predecessor) of the deleted record can always be found d i r e c t l y ..because-' the chain key of the successor (or predecessor) i s constructed by the simple and reversible procedure from the key of the deleted record. This record w i l l be deleted once i t i s accessed. The chain key of the deleted record w i l l be stored as the pointer key f i e l d of the succeeding or preceding record, and the deletion i s completed. 61 4.3. T r e e S t r u c t u r e .Representati o r i s One of the data s t r u c t u r e s that model many r e a l - l i f e s i t u a t i o n s i s the t r e e s t r u c t u r e . For example, tr e e s t r u c t u r e s can he used to describe: the o r g a n i z a t i o n s t r u c t u r e of a business; a student r e g i s t r a t i o n r e p o r t i n g system organized by department, courses, e t c . A tree i s defined r e c u r s i v e l y as a f i n i t e set T of nodes s t r u c t u r e d as f o l l o w s : ( 1 ) one node i s designated as the root of the tree (T;) ; and ( 2 ) the remaining nodes are p a r t i t i o n e d i n t o d i s j o i n t sets T̂ ., Tg, ... T , each of which i s a t r e e . The t r e e s T^, Tg, ... T , are known as subtrees of the t r e e T. A t e r m i n a l node i s a node which has no subtrees and a nonterminal node i s r e f e r r e d t o as a branch node. For example, i n the t r e e s t r u c t u r e depicted i n Figure 4 . 5 , node A^ i s the r o o t of the t r e e , node and Bg are branch nodes, and nodes B^, C^, Cg, Q,y G^, C^, are t e r m i n a l nodes. Figure 4 . 6 gives an . example of the student, semester, and course data represented as a t r e e s t r u c t u r e . There i s more than one way to implement tr e e s t r u c t u r e s i n the l o g i c - p e r - t r a c k environment. For example, i n Figure 4 . 7 indexes to the subordinate records are kept w i t h each parent record, 1(B) denotes the index (-that i s , the chain key) to the r e c o r d w i t h key B. From a record w i t h key A, we can compute the keys B 1 j. B ?, B^ from the index f i e l d s and then 62 Figure 4.5 Conceptual view of tree structure. semester record course record course record student record semester record semester record course record course record course record Figure 4.- 6 Example of student record represented by a tree structure. 63 A l KB 1) KB 2) I(B^) data B l K c 2 ) I(C^) data i ( c 5 ) data B 3 data C l data °2 data °3 C 4 data data Figure 4.7» °5 data A l data B l I(A 1) data B 2 I(A 1) data B3 I(A 1) data C l I(B 1) data Figure 4.8. C 2 I .( B l ) data C 3 I(B 1) data c 4 KB 2) data C 5 KB 2) data representing Figure 4.5« organization representing Figure 4.5« Note that hoth repre sentati ons have the same amount of storage space . 64 access d i r e c t l y the subordinate record w i t h each of these keys. A l s o , given record say, we can access the parent record A by computing the index I(B^) and using i t as a search key. A d i f f e r e n t data o r g a n i z a t i o n i s shown i n Figure 4.8. Here, each subordinate r e c o r d s t o r e s an index to i t s parent r e c o r d . Given any record w i t h key Bg say, we can r e t r i e v e i t s predecessor which i s the record w i t h key A^, or a l l i t s immediate successors which are the records w i t h key 1(^2^ because both keys I(A^) and iCBg) are formed by a simple and r e v e r s i b l e procedure from the keys A^ and Bg, r e s p e c t i v e l y . Both data o r g a n i z a t i o n s take the same t o t a l amount of storage space. 4.4' E v a l u a t i o n of Chain S t r u c t u r e Search I n the l o g i c - p e r - t r a c k environment, the chain s t r u c t u r e i s the major storage o r g a n i z a t i o n besides the b a s i c storage organiza- t i o n described i n Chapter 3- E v a l u a t i o n of the performance of such .a. . - l i storage o r g a n i z a t i o n i s necessary. Assume that chain records are d i s t r i b u t e d uniformly around the d i s k t r a c k , and the chain has n records. a. R e t r i e v a l Time f o r a Chain A r e c o r d i n the chain may be r e t r i e v e d i n two ways: (1) by d i r e c t l y accessing i t i f a key of the record i s known, and (2) by f o l l o w i n g the chain otherwise. I f the key of the re c o r d i s known, than the r e t r i e v a l time i s r e v o l u t i o n on the average,(Section 3«5)« Suppose on the other hand t h a t the key of the record to be r e t r i e v e d i s not known; then the chain must 6 5 be followed. This w i l l take -g- r e v o l u t i o n , b. I n s e r t Time I n s e r t i n g a new record R i n a forward chain can al s o be done i n two ways: I f the key of the r e c o r d P at the p o i n t of i n s e r t i o n i s known, then the procedure i s simply: 1 . R e t r i e v e the record P d i r e c t l y , ^ r e v o l u t i o n on average). The r e c o r d P w i l l be deleted i n the next r e v o l u t i o n (one r e v o l u t i o n ) . 2. Move the chain key of the r e t r i e v e d r e c o r d to the chain key f i e l d of the re c o r d R to be i n s e r t e d . Replace the chain key of the r e t r i e v e d r e c o r d P by the chain key derived from the key of the re c o r d R to be i n s e r t e d . 10 3. Write r e c o r d R and the record P (-g- r e v o l u t i o n ) . Therefore, 23 the t o t a l time i s r e v o l u t i o n s on average. Suppose on the other hand t h a t the key at the po i n t of i n s e r t i o n i s not known, then the chain must be fo l l o w e d . This would take at most ^ e x t r a r e v o l u t i o n s , where n i s the leng t h of the chain, c. D e l e t i o n Time D e l e t i o n of a re c o r d i s much simpler because the key of the record to be deleted i s known; the r e c o r d can be deleted d i r e c t l y , which takes 1-g r e v o l u t i o n s on average. 4 . 5 . An I n v e r t e d Data Base Model f o r Conventional Disk Device To draw a comparison w i t h conventional devices, an i n v e r t e d data base s t r u c t u r e (Cardenas, 1 9 7 5 ) w i l l be used. 66 In t h i s model of an inverted data base, each record i s accessa able v i a a h i e r a r c h i c a l structure of three directory f i l e s : a track index f i l e , a key-value index f i l e and an accession index f i l e (Figure 4 . 9 ) . The following i s the inverted f i l e organization description (Cardenas, 1 9 7 5 » page 2 5 5 ) • The inverted f i l e organization consists of four types of blocks, where the block or page i s the unit of data p h y s i c a l l y transferred between secondary storage and main memory as a r e s u l t of a single I/O operation: a. Data blocks, which contain the records, that i s the data base. The complete record i s stored, including the inverted key-values. b. Key-value index blocks, which contain the access keys. An access key consists of the pair key-name/key-value; a short representation of the inverted key-name i s . s t o r e d addition to the key-value. These are noted I and VALUE, respectively, i n the key-value format shown i n Figure 4 . 9 » The VALUE could be f i x e d length with blank f i l l , but w i l l be considered variable length and thus w i l l be preceded by a key-value length indicator, k l . Associated with each access key i s a pointer, PTR (the actual physical address), to the beginning of a l i s t of record addresses (often c a l l e d the accession l i s t ) and the length, LGTH, of t h i s l i s t . The access:; 6 7 keys are stored s e q u e n t i a l l y i n key-value "blocks and ordered by key name I , then by VALUE. c. Record address blocks or accession blocks which co n t a i n the l i s t s o f , : " a c c e s s i o n numbers", which are p o i n t e r s to records i n the data base. Each l i s t i s a s e r i e s of consecu- t i v e words. No i n f o r m a t i o n i s stored w i t h i n the accession block as to where a l i s t begins or ends, since t h i s i n f o r m a t i o n i s provided i n the key-value b l o c k s . Each l i s t i s ordered according to the value of the p o i n t e r . d. A t r a c k index block i s provided f o r f a s t access to key-value blocks. The format of each t r a c k index block entry i s i d e n t i c a l to the key-value format. The t r a c k index block contains each key-value appearing at the beginning of a key-value block and the p o i n t e r s to these blocks (as diagrammed i n Figure 4.9). The use of one t r a c k index block probably s u f f i c e s f o r most a p p l i c a t i o n s . I n order to compute'average access time, estimates of three b a s i c u n i t s of time are used:: ' 1. T r n , the average time to access a block or page (that i s , a t r a c k ) . 2. T Q, the average time to compare an access key or key-value. 3. T-j-, the average time to i n t e r s e c t or"1 to merge' two ' blp.cksL-.of iac.e'ession numbers. These estimates vary depending on device environments. 68 "Index" or "Directory" in Fast Random Access Storage Key-Value Accenion DATA BASE 1/ Key-Value Format \f | I kg Value | Ptr |lgTh • £ •> x: ' E S 3 S 5. > F i g u r e 4.9 Layout f o r the i n v e r t e d f i l e o r g a n i z a t i o n f o r the c o n v e n t i o n a l d i s k d e v i c e d i s c u s s e d by Cardenas, 1975- ( F i g u r e 2, page 254, CACM, May 1975), 6 9 In what follows, we describe b r i e f l y the average access times f o r Equal-Unequal search, *0R* search and 'AND' search i n the conventional device; d e t a i l s jean be found i n Cardenas • ( 1 9 7 5 ) • The average access time for each search i s derived step by step i n terms of the basic units of time as i f one were taking each step through the directory and into the data base. The track index block, the key-value blocks and the accession blocks are assumed to be i n ordered form and thus binary search i s used where possible. Moreover, a large block size i s used coinciding with.track\si :ze v, that i s , the pages accessed are whole tracks. I f N denotes the number of /access keys - that are stored sequentially i n each block, then the average "search^ time-for- a-key is""- ' ' T^loggN, which i s small comparing to T T and T-j-, and therefore, w i l l be ignored i n the following discussion, a. Equal-Unequal Search An equal-unequal search, 'greater than' and 'less than* searches have the same access time according to the analysis of Cardenas ( 1 9 7 5 ) • The following are the procedures and access times f o r these searches i n the conventional disk device: Step 1: Search track-index Read key-value block: T^ Search key-value block. Step 2 Step 3 Step k Step 5 Read accession pointer l i s t : T#X , where r T • a' X i s the average number of accession pointer 7 9 b l o c k s . Step 6: Read data b l o c k s : T^* X^, where X D i s the average number of data b l o c k s t h a t w i l l be accessed per query. According to the a n a l y s i s of Cardenas, f o r the storage o r g a n i z a t i o n i n the conventional device, the average number of blocks that must be read i s : X D = D ( 1 - ( l - l / D ) K ) data b l o c k s , where D i s the number of data blocks i n the data base, and K i s the average number of records s a t i s f y i n g the. query. I n the above procedure, Step 1 to 5 search f o r a l l l o c a t i o n s of records r e s i d i n g i n the data base s a t i s f y i n g the query, and the t o t a l average access time i s T *(2+X ). Because T m i s the o T a T average time to access a t r a c k , and X > 1 , the time t o searching EL the d i r e c t o r y w i l l be at l e a s t 3 r e v o l u t i o n s (3TT) on average comparing t o only r e v o l u t i o n s i n l o g i c - p e r - t r a c k environment. M' step 6, X D = D ( l - ( l - l / D ) K ) i s the average number of data b l o c k s c o n t a i n i n g K records s a t i s f y i n g the query. I f K<£.D, then X^^K. I n t h i s case, both devices would take about the same- number of r e v o l u t i o n s to read K records. I f K i s l a r g e , then X^< K, the l o g i c - p e r - t r a c k device may be slower, but t h i s always can be compensated by the search time. 71 b. 'OR' and 'AND' searches 'OR' search: k. or k, or ... or k 1 2 m 'AND' search: k. and k 0 and ... and k . 1 2 m Step 1 Step 2 Step 3 Step 4 Step 5 Step 6; Read track index: T T Search track index Read key-value block: Search key-value block Repeat Steps 3 to k f o r each of the equal-unequal searches: mT^ 'OR' search: read and merge accession l i s t s . 'AND' search: read, merge and int e r s e c t accession l i s t s . In both cases, the time i s (Tm+Ti-)*m*X T I a Step 7 : Read data blocks: T T*X D. Therefore, i n either *0R' or 'AND* search, the directory blocks access time i s T T+m(T T+T I)X a+mT T = '.(l+m+mXa)TT + mX aT T According to the analysis i n Section 3"«"i8, i t takes an average of i m+g revolutions f o r the logic-per-track device to search and mark each matching record, compared with l+m+mXo revolutions (ignoring mX T T) fo r the conventional device. I t i s now clear that s i g n i f i c a n t improvements i n access 72 times can be achieved by using the logic-per-track environment. Moreover, t h i s b r i e f comparison drawn between the logic-per-track associative memory and the conventional one has demonstracted the p o t e n t i a l advantage of t h i s new approach i n information systems. 4.6. Conclusion The basic method for structuring and processing records i n a f i l e has been discussed: records are stored, updated and retrieved by key or keys. I t i s important to note that our inte r e s t i n information systems i n the logic-per-track environ- ment i s considerably more than an i n t e l l e c t u a l excursion into new methods of organizing and accessing data. The logic-per- track. information system provides immediate and e f f i c i e n t re.al time query access to integrated informational resources. The ch a r a c t e r i s t i c demand i n such an informational resource i s that inputs to the system arelheterogeneous and occur at random i n t e r v a l s . Requests f o r information from the system are dynamic and time-dependent, so that quries and responses cannot be prepared before hand, F i l e s are large; thus, redundant data must be factored out to reduce the waste of storage space. Moreover, information must be current and up to date to meet the needs of a modern society. Therefore, an informational change must be r e f l e c t e d immediately i n a l l associated f i l e s i n the informational resource. With the logic-per-track associative memory a r e t r i e v a l system i s able to meet the requirement f o r 73 fast access to large, integrated f i l e s that are designed to he updated dynamically i n a real time operational environment. In an informational query system, the number of records satisfying the query i s usually small, and therefore they can be retrieved within one and one-eighth revolutions time which i s almost • impossible for a large f i l e i n a conventional device. The developement of the rotational associative memory for such an efficient informational query system should be credited to Slotnick (9170) and Parker (1970) for their earlier work. In the device of Slotnick and Parker, there i s a f a i r l y sophis- ticated logic chip attached directly to each read-write head. A l l individual logic heads are connected together and communicate with a central source. The logic chip allows each head to operate as an independent device, able to read and write i t s own track, to search simultaneously for information matching a given key, or to search simultane ously for space to write new records. The simple but unique cyclic key pattern (Figure 1.2) introduced by Parker (1970) makes the simultaneous search mechanism possible (Section 1.2). Because a l l heads could perform searching simultaneously, a desired key, or desired space could be located in no more than a single revolution of the device. However, reading and writing cannot take place on the same revolution once the position of the record has been discovered because the search key may not be f i r s t i n the record so part of record w i l l have passed the head before the match i s 74 recognized ( S e c t i o n 2.2). Moreover, i n the simultaneous key- matching operation, i f the search key i s not a unique r e c o r d i d e n t i f i e r , "but..father appears as a key i n a number of d i f f e r e n t records i n the f i l e , then simultaneously matched records '.(if any) must be r e t r i e v e d , i n an u n s p e c i f i e d order. I n the device of S l o t n i c k and Parker, one read-write head at a time can a c t u a l l y be reading a record f o r t r a n s m i s s i o n t o the main computer. E x t r a r e v o l u t i o n s , and e x t r a bookkeeping, w i l l c e r t a i n l y be needed i f matching records on d i f f e r e n t t r a c k s should p a r t i a l l y overlap. These problems have been ignored i n the r e t r i e v a l system developed by Parker (1970, 1972). I t has been the purpose of t h i s t h e s i s to modify the design of S l o t n i c k and Parker i n order to provide a means f o r re a d i n g and w r i t i n g a record i n the same r e v o l u t i o n a f t e r i t s p o s i t i o n has been discovered,., and to provide necessary bookkeeping f o r r e t r i e v i n g simultaneously matching overlapping records. The f o l l o w i n g are four a d d i t i o n a l f e a t u r e s of the device: 1. Two l o g i c head on each t r a c k , having s i m i l a r but. d i s t i n c t l o g i c a l c a p a b i l i t i e s , remain a constant d b i t s apart (Figure 3»1)» The l e a d i n g head, c a l l e d the C o n t r o l Head (CH) w i l l continue to have the primary r e s p o n s i b i l i t y f o r - simultaneous.'-/searching. The a d d i t i o n a l second head, c a l l e d the Read-Write Head (RWH) t r a i l i n g a f i x e d distance behind the c o n t r o l head as the t r a c k i s scanned, w i l l do the a c t u a l r e a d i n g and w r i t i n g of records ( S e c t i o n 3«l)» 75 2. A delay r e g i s t e r of d b i t s i n l e n g t h , where d i s the constant distance between l o g i c heads on the same t r a c k , has been added to the Read-Write Head. I n the simultaneous key search and read (or hole search and w r i t e ) operation, the c o n t r o l head, when i t recognizes a-, s u c c e s s f u l match, w i l l s i g n a l i t s RWH par t n e r t o begin r e a d i n g (or w r i t i n g ) at the appropriate p o s i t i o n on the t r a c k . Such an appropriate p o s i t i o n should always l i e somewhere between the two l o g i c heads as they scan along t h e i r own d i s k t r a c k . The b i t i n the delay r e g i s t e r corresponding to such an appropriate p o s i t i o n i s c a l l e d a delay b i t ( S e c t i o n 3.1)» and w i l l be marked by the RWH whenever a match i s recognized. The delay time i s the number of b i t s 1 counting from the p o s i t i o n of the RWH when a match i s reported, to the l o c a t i o n of the matching record. A f t e r such a delay, the RWH upon encountering the delay b i t j u s t set and the d e s i r e d record p o s i t i o n simultaneously (or almost s i m u l t a n e o u s l y ) , w i l l always r e s e t the delay b i t . Thus, the f u n c t i o n of the delay b i t i s to t e l l the RWH partner.where to s t a r t reading (or w r i t i n g ) a r e c o r d so that r e t r i e v i n g (or w r i t i n g ) a s i n g l e r e c o r d can always be performed i n the same r e v o l u t i o n . The f u n c t i o n s of a l l head couples are synchronized by a s i n g l e monitor ( S e c t i o n 3»3). 3. Another major de.sign change w i l l give the new device the a b i l i t y to keep t r a c k of a l l records which may be r e t r i e v e d simultaneously (or almost simultaneously) by a p a r a l l e l search. To t h i s end, the monitor w i l l be provided w i t h a r e c o r d counter 76 (Figure 3.1) i and a mark e n t i t y ( S e c t i o n 3*2) w i l l be p r e f i x e d to every record on the d i s k i t s e l f . The r e c o r d counter i s common to a l l l o g i c head couples. I n a simultaneous search, the number of matching record w i l l be~added to the record counter. When- ever a matching r e c o r d has been processed, i t w i l l be marked by the RWH, and the r e c o r d counter decreased by one. A c c o r d i n g l y , a matching r e c o r d w i l l not be processed twice since i t i s i d e n t i f i e d by the marking-.mechanism, and a l l matching records w i l l be read because the completion i s governed by the content of the r e c o r d counter. k. A f i l e i d e n t i f i c a t i o n mechanism has been e s t a b l i s h e d f o r the a s s o c i a t i v e memory ( S e c t i o n 2.1). Functions of such a mechanism are (a) to manage f i l e names, and (b) t o manipulate data on the storage device. I n order t o ensure t h a t any -./y simultaneous search operation confines i t s a t t e n t i o n t o the / d e s i r e d f i l e , the f i l e name w i l l be the key item w r i t t e n on the very beginning of quadrant boundary of each t r a c k where the f i l e r e s i d e s . Supposing t h a t no two f i l e s w i l l share the same t r a c k quadrant, then the f i r s t key of each quadrant w i l l i d e n t i f y ' the a c t u a l l o c a t i o n of the f i l e i n the memory device. Thus, a f i l e i s made up of a number of t r a c k quadrants r a t h e r than whole t r a c k ( s ) . A schematic r e p r e s e n t a t i o n of the complete a s s o c i a t i v e memory system i s given by Figure 3»ls d i s k , l o g i c head couples, and the monitor, The monitor i n c l u d e s the r e c o r d counter 77 r e g i s t e r , channel processors f o r the search and read-write heads, read and write buffers, and a queue for requests received from the main computer memory. The dynamic behaviour of the associative memory system depends upon the s p e c i f i c request being served by the memory. The control process f o r a request to r e trieve a set of records with a given key i n a f i l e has been described i n d e t a i l (Section 3 * 3 and Figure 3 . 3 ) . I t should be emphasised, that the control process f o r the request described i n the thesis i s just an example of one of a number of possible control processes. However, t h i s example i s s u f f i c i e n t to demonstrate the general a c t i v i t i e s of the modified associative memory. In order to explore the c a p a b i l i t i e s of the modified memory device we have just described i n outline, the procedures fo r i t s use i n a 'h i e r a r c h i c a l search' f o r records possessing a s p e c i f i e d combination of keys (Section 3 « 7 ) » i n an app l i c a t i o n requiring sequential access to a l l the records of a f i l e (Section 4 . 1 ) , and i n operation on a f i l e with a more complex tree structure (Section 4 . 3 ) * have been described i n d e t a i l . With the introduction of the marking mechanism, the search for records s a t i s f y i n g some basic query conditions can be performed d i r e c t l y on the key part of records without the intermediate step of transmitting records into the main computer memory, which i s always required with a l l conventional devices. By comparing i t s performance to the conventional counterpart (Section 4 .5) , , . 78 significant improvements i n access times can be demonstracted. A chain i s a simple but important technique of linking logically related records; related records are linked by a pointer that contains a reference to the next related record i n logical sequence so that a l l related records are chained together by successive pointers. As described i n Section 4 . 1 , an element of the chain i s a structure and at least one f i e l d i n each record i s a pointer value that i s used to point to another record. Because a record i n the associative memory i s accessed by one of i t s logical keys rather than by address, the pointer value can be a logical key entity, which i s usually generated from the key of the record i t points to by a simple and reversible process (for example, see Figure 4 . 3 ) • Such an organization has a number of advantages; ?(a) a chain i s i n fact a two-way chain, (b) each record i n the chain can be retrieved by following the chain key, or directly by the key of the record i f i t i s known, and (c) i t can avoid the tangle of actual physical addresses i n the chain structure. The storage organization for more complex data structures such as tree structures (Section 4 . 3 ) presents another unique feature of the associative memory. For example, Figures 4 . 7 and 4 . 8 demonstrate two distinct representations frequently used for tree structure data. In Figure 4 . 7 indexes to the subordinate records are kept with each parent record, whereas i n Figure 4 , 8 , each subordinate record stores an index to i t s parent record. Both data structures take the same amount of storage space. 79 An associative memory has always heen an at t r a c t i v e idea because (a) data i s accessed by content rather than by address, and (b) i t has po t e n t i a l a p p l i cation i n data bases f o r informa- t i o n query systems. However, so f a r only a few designs for an associative memory have been published. Usually, each design i s just a simple modification to an e x i s t i n g disk device i n order to meet requirements f o r a s p e c i f i c a p p l i c a t i o n (for example, Minsky, 1972). In t h i s thesis, a design f o r a general purpose associative memory using sophisticated l o g i c chips has been proposed. Readers may question the r e a l i t y and cost j u s t i f i c a - t i o n of such a design. I t has been about 15 years since the electronic industry made i t s f i r s t miniature el e c t r o n i c c i r c u i t on a s i l i c o n 'chip*,. Since then a steady advance i n c i r c u i t organization and complexity has led to the microprocessor, a device whose l o g i c and memory c i r c u i t s can be held on one's thumb. According to the survey of current mateket and ele c t r o n i c industries made by D. J . Theis 01974) s "Microprocessors bring us one step closer to having a whole computer on a single chip of s i l i c o n . No larger than •|-inch square, they contain a l l the e s s e n t i a l elements of a central processor, including the control l o g i c , i n s t r u c t i o n decoding, and arithmatic processing c i r c u i t r y . To be useful, the microprocessor chip or chips are combined with memory and I/O integrated c i r c u i t chips to form a 'microcomputer', a machine almost as powerful as a minicomputer which usually 80 f i l l s no more than a single printed circuit hoard and sells for less than $1,000." (Theis, page 90). In conclusion of his survey, he stated further that: "In the future, one chip w i l l include the APU, memory, and I/O interfaces, so we w i l l truly have a computer on a chip ... The prefix 'micro' denotes small size and connotes small cost; however, i t certainly does not imply small capabilities",, (Theis, page 100). Therefore, with the modern computer hardware technology, the associative memory that i s described throughout this thesis not only w i l l become r e a l i s t i c (although at some cost), but also w i l l have i t s impact on the development of another computer - generation. 81 REFERENCES Cardenas, A.F. (1975)» A n a l y s i s and Performance of Inv e r t e d Data Base S t r u c t u r e s , Communications of the ACM, May 1975, Volumn 18, Number 5. 253-263. Minsky, N. (1972), R o t a t i n g Storage Devices as P a r t i a l l y A s s o c i a t i v e Memories, Proc. AFIPS 1972 F a l l J o i n t Computer Conference, Volumn 4 l , P a r t I , AFIPS Press, Montvale, N.J., 587-592. Parker, J.L. (1970), A Logic Per Track Information R e t r i e v a l System, Ph.D. Thesis, U n i v e r s i t y of I l l i n o i s , Department of Computer Science. Parker, J.L. (1971), A Logic Per Track R e t r i e v a l System, Information P r o c e s s i n g 71, North-Holland P u b l i s h i n g Company, 711-716. S l o t n i c k , D.L. (1970), Logic Per Track Devices, Advances i n Computers, Academic Pr e s s , 29I-296. Theis, D.J. (197^)> Microprocessor and Microcomputer Survey, Datamation, December 74, 90-101.

Cite

Citation Scheme:

    

Usage Statistics

Country Views Downloads
China 4 0
United States 2 1
City Views Downloads
Beijing 4 0
San Mateo 1 0
Ashburn 1 0

{[{ mDataHeader[type] }]} {[{ month[type] }]} {[{ tData[type] }]}

Share

Share to:

Comment

Related Items