@prefix vivo: . @prefix edm: . @prefix ns0: . @prefix dcterms: . @prefix skos: . vivo:departmentOrSchool "Science, Faculty of"@en, "Computer Science, Department of"@en ; edm:dataProvider "DSpace"@en ; ns0:degreeCampus "UBCV"@en ; dcterms:creator "Shamper, Julie Anne"@en ; dcterms:issued "2010-03-22T23:51:53Z"@en, "1980"@en ; vivo:relatedDegree "Master of Science - MSc"@en ; ns0:degreeGrantor "University of British Columbia"@en ; dcterms:description "The correctness (or integrity) of a database may be destroyed when the database is accessed concurrently by a number of independent transactions. Part of the job of a database management system is to control concurrency so that correctness is guaranteed. This thesis examines concurrency controls in general, and in particular those provided by the Educational Data Base System (EDBS). It is concluded that EDBS facilities are not sufficient to guarantee correctness. Several alternatives are proposed whereby correctness can be guaranteed. An approach is selected based on ease of implementation rather than on the degree of concurrency provided. Guaranteeing correctness and at the same time providing a high degree of concurrency turns out to be a very difficult problem both in theory and in practice."@en ; edm:aggregatedCHO "https://circle.library.ubc.ca/rest/handle/2429/22303?expand=metadata"@en ; skos:note "c, / PRACTICAL CONCUfifiENCY CONSIDERATIONS FOR A STUDENT ORIENTED DATABASE MANAGEMENT SYSTEM JULIE ANNE SHAMEEH B.Sc., The U n i v e r s i t y of A l b e r t a , 1975 A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF THE FACULTY OF GRADUATE STUDIES (Department of Computer Science) We accept t h i s t h e s i s as conforming to the r e q u i r e d standard THE UNIVERSITY OF BRITISH COLUMBIA September 1980 by MASTER OF SCIENCE i n cj J u l i e Anne Shamper, 1980 In presenting th i s thes is in pa r t i a l fu l f i lment of the requirements for an advanced degree at the Univers i ty of B r i t i s h Columbia, I agree that the L ibrary shal l make it f ree ly ava i l ab le for reference and study. I further agree that permission for extensive copying of th is thesis for scho lar ly purposes may be granted by the Head of my Department or by his representat ives. It is understood that copying or pub l i ca t ion of th is thesis for f inanc ia l gain sha l l not be allowed without my written permission. Department of CJMH\\J^ IAW ScXpWfc The Univers i ty of B r i t i s h Columbia 2075 Wesbrook P l a c e Vancouver, Canada V6T 1W5 Date ^ j . . \\<\\/BQ i i i ABSTRACT The c o r r e c t n e s s (or i n t e g r i t y ) of a database may be des t r o y e d when the database i s accessed c o n c u r r e n t l y by a number of independent t r a n s a c t i o n s . Part of the job of a database management system i s to c o n t r o l concurrency so t h a t c o r r e c t n e s s i s guaranteed. This t h e s i s examines concurrency c o n t r o l s i n gen e r a l , and i n p a r t i c u l a r those provided by the E d u c a t i o n a l Data Base System (EDBS). I t i s concluded that EDBS f a c i l i t i e s are not s u f f i c i e n t to guarantee c o r r e c t n e s s . S e v e r a l a l t e r n a t i v e s are proposed whereby c o r r e c t n e s s can be guaranteed. An approach i s s e l e c t e d based on ease of implementation r a t h e r than on the degree of concurrency provided. Guaranteeing c o r r e c t n e s s and at the same time p r o v i d i n g a high degree of concurrency t u r n s out to be a very d i f f i c u l t problem both i n theory and i n p r a c t i c e . i v TABLE OF CONTENTS Chapter I: I n t r o d u c t i o n 1 Chapter I I : Framework . 6 A. . B a s i c Concepts 6 B. S e r i a l i z a b l e Schedules 12 C. Locking P r o t o c o l s 16 D. Other Systems 26 E. Current Hot Research Issues ......42 Chapter I I I : Problem .45 A. EDBS D e s c r i p t i o n 46 B. ED BS Implementation < 54 C. . Problem Formulation .75 Chapter IV: Recommendations .87 A. A l t e r n a t i v e s 87 B. . S e l e c t e d Approach 96 Chapter V: Co n c l u s i o n ........100 REFERENCES 10 3 APPENDICES . .. 106 A: EDBS Overview 106 B: EDBS Commands And U t i l i t i e s 107 C: EDBS F i l e Usage 103 D: I n t e r f a c e F i l e System D e s c r i p t i o n 109 E: I n t e r f a c e F i l e System Functions 114 V LIST OF FIGURES 1. S e r i a l Schedules ...11 2. I n t e r f a c e F i l e System 54 3.. EDBS Overview 106 4.. EDBS F i l e Usage ...108 v i ACKNOWLEDGEMENTS I s i n c e r e l y thank my s u p e r v i s o r s Paul Gilmore and A l Fowler f o r t h e i r c a r e f u l review and p e r c e p t i v e c r i t i c i s m of my work. I a l s o wish t o thank Bob G o l d s t e i n for h i s d i r e c t i o n during the forma t i v e stages of the t h e s i s , h i s continued support and c a r e f u l r e a d i n g of the f i n a l document. F i n a l l y , I wish to thank the computer c e n t e r s t a f f both a t UBC and at SFU f o r t h e i r help i n g e t t i n g the E d u c a t i o n a l Data Base System o p e r a t i o n a l . 1 ChjLEtSE I: I n t r o d u c t i o n T h i s t h e s i s has two major a s p e c t s : concurrency and a student o r i e n t e d database management system. The student o r i e n t e d database management system i s the E d u c a t i o n a l Data Base System (EDBS).. EDBS c o n s i s t s of a number of APL workspaces.. Users access the system from an APL t e r m i n a l to perform va r i o u s o p e r a t i o n s on databases.. U t i l i t i e s are provided t o CREATE, DESTROY, and MAINTAIN databases. Other commands permit a user to r e t r i e v e data from a database, update data c o n t a i n e d i n a database, i n s e r t new data i n t o a database, or d e l e t e data from a database. EDBS i s intended f o r use i n te a c h i n g database concepts. There are at l e a s t two p o s s i b l e approaches to using EDBS i n an e d u c a t i o n a l environment. F i r s t , each student c o u l d c r e a t e and manipulate h i s own databases.. Second, the i n s t r u c t o r c o u l d c r e a t e a common database to be accessed by a l l students. Cone urgency i s def i n e d as the simultaneous use of a system by more than one user. When a database system i s used c o n c u r r e n t l y , data i n a database may become i n c o r r e c t . The o b j e c t i v e of concurrency r e s e a r c h i s t o provide concurrency while e n s u r i n g database c o r r e c t n e s s . The usual approach i s to provide a concurrency component f o r a database management system which c o n t r o l s concurrency such that c o r r e c t n e s s i s guaranteed. 2 A« M o t i v a t i o n Concurrency became an i s s u e i n r e l a t i o n to EDBS du r i n g i n s t a l l a t i o n of the system a t UBC. The a v a i l a b l e v e r s i o n of EDBS was dependent on the CDC/APL f i l e system. . Before the system c o u l d be i n s t a l l e d , i t had t o be modified t o r e l y on the a v a i l a b l e MTS/APL f i l e system. The two f i l e systems d i f f e r p r i m a r i l y i n t h e i r data s h a r i n g f a c i l i t i e s ( i . e . , access a u t h o r i z a t i o n , concurrency c o n t r o l ) . The p a r t i c u l a r demands that EDBS makes of the f i l e system i n terms of access a u t h o r i z a t i o n c o u l d be met i n a s t r a i g h t forward manner v i a the a v a i l a b l e MTS APL f u n c t i o n s . I t appeared i m p o s s i b l e , however, t o simulate r e q u i r e d concurrency c o n t r o l c a p a b i l i t i e s . I t thus became necessary to c o n s i d e r very c a r e f u l l y the i s s u e of concurrency b e f o r e c o n t i n u i n g with the implementation. B» Proposed Work A major p o r t i o n of t h i s t h e s i s i s concerned with d e f i n i n g the problem. Two major c o n s t r a i n t s imposed on the problem are the f o l l o w i n g : F i r s t , we are working i n an environment of imp e r f e c t i n f o r m a t i o n . In p a r t i c u l a r , i t i s very d i f f i c u l t to obta i n r e q u i r e d i n f o r m a t i o n about the e x i s t i n g EDBS concurrency s o l u t i o n . . Second, our ba s i c approach r e q u i r e s t h a t EDBS should not be m o d i f i e d , thus l i m i t i n g the ar r a y of p r a c t i c a l s o l u t i o n s . A summary of proposed work f o l l o w s : (1) review the l i t e r a t u r e i n the area of concurrency t o c l a r i f y b a s i c o b j e c t i v e s of concurrency 3 c o n t r o l . (2) examine EDBS concurrency c o n t r o l f a c i l i t i e s to determine the extent t o which the e x i s t i n g system a c h i e v e s these o b j e c t i v e s . (3) examine other systems to determine usual s o l u t i o n s . (4) propose a s o l u t i o n which s a t i s f i e s the b a s i c o b j e c t i v e s of concurrency c o n t r o l . C. Dimensions T h i s t h e s i s has s e v e r a l dimensions which may be d e s c r i b e d as f o l l o w s : 1. . F i l e Systems EDBS has been p r e v i o u s l y m odified t o convert from a dependency on the APL*PLUS f i l e system t o a dependency on the CDC/APL f i l e system. Our undertaking then i s t o convert from a dependency on the CDC/APL f i l e system to a dependency on the a v a i l a b l e MTS/APL f i l e system. In t h i s t h e s i s , as part of our problem d e f i n i t i o n , we examine EDBS concurrency s o l u t i o n s as they developed through each f i l e system. T h i s approach i s motivated by a d e s i r e to understand f i r s t , what c o r r e c t n e s s guarantee was i n i t i a l l y intended f o r EDBS and second, the ext e n t to which the e x i s t i n g concurrency s o l u t i o n i s c o n s t r a i n e d by the CDC/APL f i l e system. Hence, one major theme of t h i s t h e s i s i s f i l e systems, i n p a r t i c u l a r t h e i r concurrency c o n t r o l c a p a b i l i t i e s . 4 2. Data Models EDBS supports c r e a t i o n of r e l a t i o n a l , h i e r a r c h i c a l and network databases. In t h i s t h e s i s we review e s s e n t i a l c h a r a c t e r i s t i c s of each data model and we examine at l e a s t one system o t h e r than EDBS which supports the data model.. Our a n a l y s i s i s aimed at g a i n i n g i n s i g h t i n t o the question o f whether the data model i m p l i e s a need f o r a p a r t i c u l a r type of concurrency s o l u t i o n . Hence, a second major theme of t h i s t h e s i s i s data models, i n p a r t i c u l a r t h e i r concurrency c o n t r o l requirements. 3. . C o r r e c t n e s s Concurrency i s not the only f a c t o r i n a database environment which j e o p a r d i z e s c o r r e c t n e s s . The i n t e g r i t y of a database may be destroyed due t o hardware f a i l u r e s , human e r r o r s , or software e r r o r s . Many systems provide i n t e g r i t y a s s e r t i o n s f o r checking the v a l i d i t y of database updates and most systems provide a set of support r o u t i n e s t o detect and r e c o v e r from e r r o r s i t u a t i o n s . In t h i s t h e s i s we attempt to deal with the i s s u e of concurrency within the broader framework of c o r r e c t n e s s i n g e n e r a l by reviewing a d d i t i o n a l procedures f o r ensuring c o r r e c t n e s s provided by each system and EDBS. 5 4. Trade o f f s In t h i s t h e s i s we are i n t e r e s t e d i n trade o f f s , two of which are p a r t i c u l a r l y r e l e v a n t : the t r a d e o f f between i n f o r m a t i o n and performance as pointed out by Kung and Pap a d i m i t r i o u (1979) i n r e l a t i o n t o concurrency c o n t r o l mechanisms and the trade o f f between concurrency and c o r r e c t n e s s as evidenced by comparing concurrency s o l u t i o n s provided by EDBS and other systems.. C. Reader^s Guide Chapter I I of t h i s t h e s i s reviews the l i t e r a t u r e and d e s c r i b e s concurrency c o n t r o l f a c i l i t i e s provided by a number of systems. Chapter I I I d e s c r i b e s EDBS and compares i t s concurrency c o n t r o l f a c i l i t i e s with those of other systems. Chapter I I I concludes with a statement of the EDBS concurrency problem.. Chapter IV d e s c r i b e s a l t e r n a t i v e s and s e l e c t s what we b e l i e v e to be an optimum approach. F i n a l l y , Chapter V p o i n t s d i r e c t i o n s f o r f u r t h e r work. The reader i s advised that t h i s t h e s i s examines many i s s u e s not n e c e s s a r i l y r e l a t e d t o the f i n a l s o l u t i o n . T h i s t h e s i s i s intended to r e f l e c t our experience with the EDBS undertaking, which was b a s i c a l l y a search to determine the problem and f i n d an implementable s o l u t i o n . 6 £h§£i£E LLz Framework T h i s chapter reviews r e s e a r c h t o date i n the area of database concurrency. The o b j e c t i v e of concurrency r e s e a r c h i s to p rovide concurrency while ensuring database c o r r e c t n e s s . . One approach i s t c provide a \" s e r i a l i z a b i l i t y \" component f o r a data base management system which guarantees t h a t concurrent t r a n s a c t i o n s w i l l have a s e r i a l view of the database. The performance of such a component i s measured i n terms of the amount of concurrency provided. V i r t u a l l y a l l s o l u t i o n s proposed thus f a r have been based on l o c k i n g techniques. I t appears t h a t l o c k i n g unduly r e s t r i c t s the amount of concurrency which can be provided. Current r e s e a r c h tends toward f i n d i n g a l t e r n a t i v e mechanisms t o guarantee s e r i a l i z a b i l i t y while p r o v i d i n g a g r e a t e r degree of concurrency. S e c t i o n 1 of t h i s chapter d e s c r i b e s b a s i c concepts. S e c t i o n 2 examines i n d e t a i l the requirements f o r s e r i a l i z a b i l i t y as presented by Pa p a d i m i t r i o u (1979). S e c t i o n 3 examines l o c k i n g p r o t o c o l s . F i n a l l y , s e c t i o n 4 reviews concurrency s o l u t i o n s as implemented i n a number of systems. A. B a s i c Concepts A database i s d e f i n e d as a s e t of e n t i t i e s where each e n t i t y has a name and a value. Users execute a c t i o n s on the database. For the purpose of our a n a l y s i s we make the f o l l o w i n g assumptions about a c t i o n s : 7 (1) A c t i o n s are executed i n d i v i s i b l y and one before the other ( i . e . , a c t i o n s are executed with zero concurrency on a s e r i a l machine). In t h i s sense we say that a c t i o n s are atomic. (2) The only types of a c t i o n s a r e : read, w r i t e , c r e a t e and d e s t r o y . A c t i o n s are performed on e n t i t i e s . We do not c o n s i d e r an a c t i o n t o be some computation i n v o l v i n g only temporary v a r i a b l e s . . Rather, we view these s o r t s of computations as o c c u r r i n g i n d i v i s i b l y with the a c t i o n . A database r e p r e s e n t s f a c t s about some p o r t i o n of the r e a l world. Hence, th e r e e x i s t s a c e r t a i n s et of a s s e r t i o n s (hereafter c a l l e d c o n s i s t e n c y c o n s t r a i n t s ) which n e c e s s a r i l y must be s a t i s f i e d f o r the database t o be c o r r e c t r e l a t i v e t o the r e a l world. For example, suppose t h a t our database c o n s i s t s of the e n t i t i e s E1, E2,..., En. Cons i s t e n c y c o n s t r a i n t s a p p l y i n g to the database might be the f o l l o w i n g : E1 i s equal t o E2 En i s equal t o n E2 i s the index of E3 A database i s c o n t i n u o u s l y undergoing changes as user a c t i o n s transform i t from one s t a t e t o another. A database i s s a i d to be c o n s i s t e n t i f i t s c u r r e n t s t a t e s a t i s f i e s the s e t of co n s i s t e n c y c o n s t r a i n t s a s s o c i a t e d with the database . Con s i s t e n c y i s necessary but not s u f f i c i e n t f o r c o r r e c t n e s s . . For example, i f e x a c t l y one c o n s i s t e n c y c o n s t r a i n t a p p l i e s t o the database that \"EKE2\", then i t i s p o s s i b l e f o r a 8 user to e r r o n e o u s l y update E2 without v i o l a t i n g any c o n s i s t e n c y c o n s t r a i n t s . The database w i l l be i n c o r r e c t , however, because i t does not a c c u r a t e l y r e f l e c t the c u r r e n t s t a t e of that p o r t i o n of the r e a l world which i t i s intended to r e p r e s e n t . A database must be at l e a s t t e m p o r a r i l y i n c o n s i s t e n t d u r i n g update o f e n t i t i e s which are r e l a t e d to each other by some c o n s i s t e n c y c o n s t r a i n t . For example, \" i n moving money from one bank account to another there w i l l be an i n s t a n t during which one account has been d e b i t e d and the other not yet c r e d i t e d . This v i o l a t e s a c o n s t r a i n t that the number of d o l l a r s i n the system i s c o n s t a n t \" . 1 To guarantee c o r r e c t n e s s a t some l e v e l , a c t i o n s by the same user are grouped i n t o u n i t s c a l l e d t r a n s a c t i o n s . A t r a n s a c t i o n , when executed alone, transforms a c o n s i s t e n t database s t a t e i n t o a new c o n s i s t e n t s t a t e . Hence, a t r a n s a c t i o n serves as a u n i t of c o n s i s t e n c y . In t h i s s e c t i o n , we w i l l f o l l o w a n o t a t i o n s i m i l a r to t h a t presented by Eswaran e t a l . (1976) f o r r e p r e s e n t i n g an a c t i o n on a given e n t i t y by a p a r t i c u l a r t r a n s a c t i o n . For example, the read of e n t i t y E1 by t r a n s a c t i o n T1 w i l l be rep r e s e n t e d as (T 1,read,E1). The t r a n s a c t i o n name (T1 i n t h i s case) w i l l be e l i m i n a t e d i f i t i s c l e a r from context which t r a n s a c t i o n i s performing the a c t i o n . . To c l e a r l y f i x the idea of a t r a n s a c t i o n c o n s i d e r the f o l l o w i n g groups of a c t i o n s , T1 and T2, and suppose t h a t only one c o n s t r a i n t a p p l i e s to the database: t h a t \"E1 i s equal t o 1 T h i s passage i s taken from page 624 of Eswaran et a l . (1976). 9 E2\". T l = { ( r e a d , E l ) , (read,E2) , (write, E1) , (write, E2) } T2 = {(read,E1), ( w r i t e r E l ) } i In t h i s example, 1\\ would be a t r a n s a c t i o n but T2 would not be a t r a n s a c t i o n . T1 i s c o n s i s t e n c y p r e s e r v i n g s i n c e i t updates t both E1 and E2 (assuming t h a t the e n t i t i e s are updated c o r r e c t l y ) . Even though the database may be t e m p o r a r i l y i n c o n s i s t e n t during execution of T1, i t w i l l be c o n s i s t e n t upon completion of T1.. In c o n t r a s t , T2 updates E1 but not E2. Hence, upon completion of 12 the c o n s t r a i n t t h a t E1 i s equal t o E2 i s v i o l a t e d . T2 i s not c o n s i s t e n c y p r e s e r v i n g and t h e r e f o r e not a t r a n s a c t i o n . Having grouped a c t i o n s i n t o t r a n s a c t i o n s we want to run t r a n s a c t i o n s c o n c u r r e n t l y by i n t e r l e a v i n g a c t i o n s from d i f f e r e n t t r a n s a c t i o n s . The r e s u l t i n g sequence of a c t i o n s i s c a l l e d a schedule. We assume here that we have a scheduler whose i n p u t i s a sequence of a r r i v i n g t r a n s a c t i o n s and whose output i s the order of e x e c u t i o n of a c t i o n s w i t h i n the t r a n s a c t i o n s . In t h i s environment i n t e r l e a v i n g of a c t i o n s must be r e s t r i c t e d such t h a t each t r a n s a c t i o n i s i s o l a t e d from temporary i n c o n s i s t e n c i e s induced in the database by other t r a n s a c t i o n s . To see the c o r r e c t n e s s problems a s s o c i a t e d with running t r a n s a c t i o n s c o n c u r r e n t l y , c o n s i d e r our p r e v i o u s database t o which the c o n s t r a i n t a p p l i e s t h a t \"E1 i s equal t o E2\".. Now, suppose t h a t two t r a n s a c t i o n s , T1 = { (T1, write,E1) , (T 1,write, E2) } T2 = {(T2,read,El) , (T2 ,read, E2) } are run c o n c u r r e n t l y according to the f o l l o w i n g schedule: 1 0 { ( T 1 , w r i t e r E 1 ) , ( T 2,read,E 1) , ( T 2 , r e a d , E 2 ) , (T 1, w r i t e , E 2 ) } Note t h a t t h i s schedule g i v e s T 2 an i n c o n s i s t e n t view of the database ( i . e . , T 2 reads E 1 not equal t o E 2 because T 1 has updated E 1 but not E 2 a t the time of r e a d i n g ) . Now, i f i t had been intended t h a t t r a n s a c t i o n T 2 should update the same (or d i f f e r e n t ) e n t i t i e s based on the i n c o n s i s t e n t values read, then the database would be i n c o n s i s t e n t upon completion of the schedule. Another c o r r e c t n e s s problem i s the c l a s s i c \" l o s t update problem\" which occurs when two concurrent t r a n s a c t i o n s read the same value f o r an e n t i t y and then t r y to write back an incremented va l u e . The r e s u l t i s t h a t the second update ov e r w r i t e s the f i r s t and one increment i s l o s t . A database i s c e r t a i n l y i n c o r r e c t i f updates get l o s t , and f u r t h e r i t may be i n c o n s i s t e n t i f say two e n t i t i e s are r e l a t e d by a c o n s i s t e n c y c o n s t r a i n t and an update gets l o s t f o r one e n t i t y but not the other. . I f t r a n s a c t i o n s are run s e q u e n t i a l l y ( i . e . , one a f t e r the other i n some order) then the above c o r r e c t n e s s problems do not a r i s e . The schedule r e s u l t i n g from s e q u e n t i a l execution of a set of t r a n s a c t i o n s i s c a l l e d a s e r i a l schedule. For example, given our previous t r a n s a c t i o n s , T 1 and T 2 , t h e r e are e x a c t l y two p o s s i b l e s e r i a l schedules: 11 SERIAL SCHEDULES { (T 1, write,E1) , (T1, write,E2) , (T2,read,E1) , (T2,read, E2) } i 1 i . i T1 T2 { (T2,read,E1) , (T2 ,read ,E2) , (T 1, w r i t e , E1) , (T 1, w r i t e , E2) } • — - i i i T2 T1 Figu r e 1 S e r i a l schedules have a number of d e s i r a b l e p r o p e r t i e s . F i r s t , execution of t r a n s a c t i o n s a c c o r d i n g t o a s e r i a l schedule w i l l g i v e each t r a n s a c t i o n a c o n s i s t e n t view of the database. This i s because only one t r a n s a c t i o n w i l l be a c t i v e i n the system a t any one time.. Second, the database w i l l be c o n s i s t e n t upon completion of a s e r i a l schedule because each t r a n s a c t i o n w i l l transform a c o n s i s t e n t database i n t o a new c o n s i s t e n t s t a t e . T h i r d , the l o s t update problem cannot occur with a s e r i a l schedule because any read and wr i t e of an e n t i t y by a p a r t i c u l a r t r a n s a c t i o n w i l l occur i n d i v i s i b l y . Of course, s e r i a l s c hedules do not provide concurrency between t r a n s a c t i o n s ( i . e . , i n t e r l e a v i n g of a c t i o n s from d i f f e r e n t t r a n s a c t i o n s ) . The usual approach (Eswaran e t a l . .1976, P a p a d i m i t r i o u 1979, Kedem and S i l b e r s c h a t z 1979) to p r o v i d i n g concurrency while en s u r i n g c o r r e c t n e s s i n v o l v e s the notion of s e r i a l i z a b i l i t y . A schedule i s s a i d to be s e r i a l i z a b l e i f i t i s e q u i v a l e n t to some s e r i a l s c h e d u l e ( e q u i v a l e n t i n the sense t h a t t h e i r outcomes are the same).. The next s e c t i o n examines the notion o f s e r i a l i z a b i l i t y as presented i n Pap a d i m i t r i o u (1979).. 12 B. . S e r i a l i z a b l e Schedules P a p a d i m i t r i o u j u s t i f i e s the appeal of s e r i a l i z a b i l i t y as a c o r r e c t n e s s c r i t e r i o n as f o l l o w s : \"Databases are supposed t o be f a i t h f u l models of p a r t s of the r e a l world, and user t r a n s a c t i o n s r e p r e s e n t i n s t a n t a n e o u s changes i n the world. Since such changes are t o t a l l y ordered by temporal p r i o r i t y , the only a c c e p t a b l e i n t e r l e a v i n g s of atomic steps of d i f f e r e n t t r a n s a c t i o n s are those that are e q u i v a l e n t to some s e q u e n t i a l e x e c u t i o n of these t r a n s a c t i o n s . 1 ! 1 In t h i s s e c t i o n , we d e s c r i b e P a p a d i m i t r i o u ' s model of t r a n s a c t i o n s and c h a r a c t e r i z a t i o n of schedule e g u i v a l e n c e . . The reader i s cautioned t h a t P a p a d i m i t r i o u c o n s i d e r s a very narrow c l a s s of t r a n s a c t i o n s . Namely, those t h a t c o n s i s t of e x a c t l y two a c t i o n s : a read followed by a w r i t e . This simple t r a n s a c t i o n model i s used i n P a p a d i m i t r i o u (1979) as a framework f o r understanding and comparing d i f f e r e n t p h i l o s o p h i e s of s e r i a l i z a b i l i t y . 2 We present i t here f o r s i m i l a r reasons. We w i l l see t h i s model again i n Chapter 3 of t h i s t h e s i s where i t w i l l serve as a means of d e s c r i b i n g e s s e n t i a l c h a r a c t e r i s t i c s of our EDBS concurrency problem. P a p a d i m i t r i o u r e f e r s to a schedule as a h i s t o r y and views database e n t i t i e s as v a r i a b l e s . T r a n s a c t i o n s are assumed to c o n s i s t of two a c t i o n s : the r e t r i e v a l of the values of a s e t of v a r i a b l e s f o l l o w e d by the update of the valu e s of a s e t of v a r i a b l e s . The read a c t i o n f o r t r a n s a c t i o n T i i s denoted by R i , 1 T h i s passage i s taken from page 632 of P a p a d i m i t r i o u (1979).. 2 The same t r a n s a c t i o n model i s used i n B e r n s t e i n et al..(1978) and Rothnie and Goodman (1977). 1 3 the write a c t i o n by Wi. The s e t of v a r i a b l e s read by a t r a n s a c t i o n i s c a l l e d the s e t of the t r a n s a c t i o n (denoted by S (Ei) f o r t r a n s a c t i o n Ti) . S i m i l a r i l y , the set of v a r i a b l e s w r i t t e n , S(Wi), i s c a l l e d the w r i t e set of T i . . Each t r a n s a c t i o n i s viewed as a s e t of u n i n t e r p r e t e d f u n c t i o n symbols. For example, suppose t h a t S(Wi) = £y} and S (Ei) = £x, z} . Then we can view y as some f u n c t i o n of x and z [ i . e . , y=fi(x,z) ]. C o n s i s t e n t with our previous usage of the term, a c t i o n s within t r a n s a c t i o n s are assumed to be atomic. That i s , we assume t h a t values f o r v a r i a b l e s i n S(Ri) are read i n s t a n t a n e o u s l y . S i m i l a r i l y , we assume t h a t values f o r v a r i a b l e s i n S(Wi) are w r i t t e n i n s t a n t a n e o u s l y . P a p a d i m i t r i o u r e p r e s e n t s a h i s t o r y as f o l l o w s : Suppose t h a t the t r a n s a c t i o n s i n v o l v e d i n h i s t o r y h are TI and T2 where S(E1) = £x}, S(E2) = {}, S (W1) =s (W2) = {x,y} . Suppose a l s o t h a t a c t i o n s from T1 and T2 are scheduled i n the order (R1,B2,W1,W2}. Then h would be given by h = R1[x]W1[x,y ]W2[x,y]i P a p a d i m i t r i o u ' s notion of equivalence of h i s t o r i e s i s the f o l l o w i n g (Let V r e p r e s e n t the s e t of v a r i a b l e s i n the database and l e t f ( i , j ) be the f u n c t i o n a s s o c i a t e d with the j t h v a r i a b l e i n t r a n s a c t i o n T i ) : \"Two h i s t o r i e s , h i and h2, are e q u i v a l e n t i f , given any set of J V J domains f o r the v a r i a b l e s , any set of 1 In c o n t r a s t with P a p a d i m i t r i o u ' s approach we do not show i n h read(write) a c t i o n s f o r which the corresponding read(write) s e t i s empty. 14 i n i t i a l values f o r the v a r i a b l e s from the corresponding domains, and furthermore any i n t e r p r e t a t i o n of the f u n c t i o n s f ( i , j ) , the values of the v a r i a b l e s are i d e n t i c a l a f t e r e x e c u t i o n of both h i s t o r i e s . n l To understand P a p a d i m i t r i o u ' s s y n t a c t i c c h a r a c t e r i z a t i o n of h i s t o r y e q u i v a l e n c e some terminology must f i r s t be i n t r o d u c e d . In p a r t i c u l a r , given a h i s t o r y h we are concerned with the \"augmented v e r s i o n of h\", the \"reads from\" r e l a t i o n i n h, and a \" l i v e \" t r a n s a c t i o n i n h. The augmented v e r s i o n of h i s the h i s t o r y h' d e r i v e d from h by: (1) appending onto the f r o n t of h, a t r a n s a c t i o n T which w r i t e s a l l v a r i a b l e s i n the database without r e a d i n g any of them (2) appending onto the end of h, a t r a n s a c t i o n T' which reads a l l v a r i a b l e s i n the database without updating any of them. For example, i f we l e t V be the set of v a r i a b l e s i n the database then h' would be represented as f o l l o w s : h' = W[V] ... { h } ... R'[V] Now suppose that x S ( R i ) . . A \"reads from\" r e l a t i o n i n h i s d e f i n e d as f o l l o w s : \"Ri reads x from Wj i n h i f Wj i s the l a t e s t occurrence of a w rite symbol before R i i n h 1 such that x £ S ( W j ) \" 2 For example, given the augmented v e r s i o n of our p r e v i o u s 1 T h i s passage i s taken from page 633 of P a p a d i m i t r i o u (1979). 2 T h i s passage i s taken from page 634 of P a p a d i m i t r i o u (1979). 15 h i s t o r y , h' = W[V]Rl£x]W1[x,y]W2[x,y]B'[ V] we say t h a t RI reads x from W i n h. The d e f i n i t i o n of a l i v e t r a n s a c t i o n i n h i s as f o l l o w s : (1) T' i s l i v e i n h. (2) I f f o r some l i v e t r a n s a c t i o n T j r Bj reads a v a r i a b l e from Wi i n h, then T i i s a l s o l i v e i n h. P a p a d i m i t r i o u showed t h a t : \"Two h i s t o r i e s , h i and h2, are e q u i v a l e n t i f and only i f they have the same s e t of l i v e t r a n s a c t i o n s and a l i v e E i reads x from Wj i n h i i f and only i f E i reads x from Wj i n h 2 . n l E e c a l l t h a t a s e r i a l i z a b l e h i s t o r y i s one which i s e q u i v a l e n t to some s e r i a l h i s t o r y . Various notions of s e r i a l i z a b i l i t y e x i s t i n the l i t e r a t u r e . A major r e s u l t presented i n P a p a d i m i t r i o u (1979) i s t h a t r e c o q n i z i n g t r a n s a c t i o n h i s t o r i e s t h a t are s e r i a l i z a b l e i s an NP-complete problem. In the same paper i t i s shown t h a t previous notions of s e r i a l i z a b i l i t y a c t u a l l y p rovide f o r v a r i o u s subsets of the s e t of s e r i a l i z a b l e h i s t o r i e s . The NP-complete r e s u l t suggests t h a t there i s no e f f i c i e n t s e r i a l i z e r of h i s t o r i e s ( i . e . , a l g o r i t h m which converts an i n p u t h i s t o r y t o i t ' s c l o s e s t s e r i a l i z a b l e h i s t o r y ) . However, Pa p a d i m i t r i o u a l s o shows that f o r a l l e f f i c i e n t l y r e c o g n i z a b l e s u b c l a s s e s of s e r i a l i z a b l e h i s t o r i e s there e x i s t s an e f f i c i e n t s cheduler f o r the c l a s s ( i . e . , an a l g o r i t h m which transforms an i n p u t h i s t o r y i n t o the c l o s e s t s e r i a l i z a b l e h i s t o r y within i t s 1 T h i s passage i s taken from page 635 of P a p a d i m i t r i o u (1979).. 16 c l a s s ) . In the next s e c t i o n , we w i l l be concerned with these e f f i c i e n t s c h e d u l e r s . C. L o c k i n q P r o t o c o l s There are c e r t a i n d i f f i c u l t i e s i n designing a scheduler i n the form e n v i s i o n e d by P a p a d i m i t r i o u (1979). For example, the scheduler must be provided with s y n t a c t i c d e s c r i p t i o n s of a l l t r a n s a c t i o n s to be scheduled, and i t i s unclear to date how t h i s might be accomplished. With the n otable exception of the SDD-1 system (Bernstein et. a l . 1978) a l l concurrency s o l u t i o n s proposed thus f a r have been based on l o c k i n g ( i . e . , semaphores c o n t r o l l i n g access to d a t a ) . T r a n s a c t i o n s l o c k and unlock database e n t i t i e s according to a p a r t i c u l a r l o c k i n g p r o t o c o l . A l o c k i n g p r o t o c o l guarantees t h a t i f a t r a n s a c t i o n i s run c o n c u r r e n t l y with any other s e t of t r a n s a c t i o n s where each t r a n s a c t i o n f o l l o w s the same l o c k i n g p r o t o c o l , any p o s s i b l e r e s u l t i n g schedule w i l l be s e r i a l i z a b l e . An important aspect of t h i s approach i s t h a t i t focuses on each t r a n s a c t i o n i n d i v i d u a l l y and hence the s y n t a c t i c d e s c r i p t i o n of a l l t r a n s a c t i o n s to occur i n the h i s t o r y need not be known to the scheduler a p r i o r i . In t h i s s e c t i o n , we d i s c u s s a number of i s s u e s r e l a t e d to l o c k i n g p r o t o c o l s . We then d e s c r i b e two l o c k i n g p r o t o c o l s i n d e t a i l : two phase l o c k i n g and the t r e e p r o t o c o l . We conclude the s e c t i o n by r e l a t i n g a few p r a c t i c a l c o n s i d e r a t i o n s r e g u i r e d i n implementing a l o c k i n g p r o t o c o l . . 17 The reader i s advised t h a t we are now c o n s i d e r i n g a broader set of t r a n s a c t i o n s than t h a t c o n s i d e r e d by P a p a d i m i t r i o u . . In t h i s s e c t i o n , t r a n s a c t i o n s may c o n s i s t of more than two a c t i o n s and except where otherwise i n d i c a t e d we i n c l u d e a c t i o n s which crea t e and destroy e n t i t i e s . 1• Modes of Locking Various modes of l o c k i n g have been proposed i n the l i t e r a t u r e . A d i s t i n c t i o n between shared and e x c l u s i v e access to an e n t i t y i s o f t e n made. A shared lock i s intended f o r reading an e n t i t y ; an e x c l u s i v e lock i s intended i f the e n t i t y i s to be modified [see Kedem and S i l b e r s c h a t z (1979) f o r a d e f i n i t i o n of these two l o c k i n g modes]. Gray e t a l . (1975) propose a t h i r d mode of l o c k i n g c a l l e d i n t e n t i o n l o c k i n g i n which a t r a n s a c t i o n l o c k s a given node i n shared or e x c l u s i v e mode by f i r s t l o c k i n g a l l a n c e s t o r s of the node i n i n t e n t i o n -shared or i n t e n t i o n - e x c l u s i v e mode. (Lockable resources i n . t h i s context are assumed to form a d i r e c t e d a c y c l i c graph.) Yannakakis et a l . (1979) g e n e r a l i z e l o c k i n g to d - l o c k i n g i n which up t o d-1 t r a n s a c t i o n s may share an e n t i t y ( i . e . , a l o c k v a r i a b l e may assume d values 0, 1, d-1 and the locked s t a t e i s d-1) . In t h i s s e c t i o n we w i l l be concerned with e x c l u s i v e l o c k s the p r o p e r t i e s of which may be s t a t e d as f o l l o w s : (a) an e n t i t y may be l o c k e d by only one t r a n s a c t i o n at a time (b) i f a t r a n s a c t i o n T i cannot lock an e n t i t y (because i t i s a l r e a d y locked by some other 18 t r a n s a c t i o n Tj) then T i i s suspended u n t i l T j unlocks the e n t i t y . 2. P r e d i c a t e Locks Our p r e v i o u s d i s c u s s i o n assumes t h a t e n t i t i e s are accessed by name. However, i n a database•system i t would be common f o r a t r a n s a c t i o n to want to access some l o g i c a l subset o f e n t i t i e s by s p e c i f y i n g a c o n d i t i o n on t h e i r values ( i . e . , key a d d r e s s i n g ) . The requirement f o r a c c e s s i n g l o g i c a l subsets of e n t i t i e s suggests the need f o r some form of l o g i c a l l o c k i n g , h e r e a f t e r c a l l e d p r e d i c a t e l o c k i n g . P r e d i c a t e l o c k s operate as f o l l o w s (Schlageter 1978): Assume we have a set of e n t i t i e s d e s c r i b e d by p r e d i c a t e C which i s t o be l o c k e d . Then p r e d i c a t e C i s noted down and each e n t i t y the value of which s a t i s f i e s C i s taken f o r l o c k e d . 1 P r e d i c a t e l o c k i n g has been i n v e s t i g a t e d by many r e s e a r c h e r s (Eswaran et a l . 1976, Hies and Stonebraker 1977, S c h l a g e t e r 1978).. F u r t h e r examination of t h i s t o p i c i s beyond the scope of t h i s t h e s i s . We note on l y an example of what has been termed the \"phantom problem\" i n the context of p r e d i c a t e l o c k i n g . 1 Schlageter (1978) a l s o , i n v e s t i g a t e s the p o s s i b i l i t y of i d e n t i f y i n g the e n t i t i e s i n the s e t de f i n e d by P r e d i c a t e C and e x p l i c i t l y l o c k i n g each such e n t i t y . The r e s u l t s are g e n e r a l l y n e g a t i v e . 19 A phantom i s an e n t i t y that r e a l l y should not e x i s t g i v e n the c u r r e n t database s t a t e . For example, i f a t r a n s a c t i o n l o c k s a set of e n t i t i e s as g i v e n by some p r e d i c a t e . t h e n no e n t i t i e s should be added to t h i s s e t by another t r a n s a c t i o n u n t i l the f i r s t t r a n s a c t i o n t e r m i n a t e s . Thus, the nonexistence of e n t i t i e s must a l s o be l o c k e d . These no n e x i s t e n t e n t i t i e s are c a l l e d phantoms. The m a t e r i a l i z a t i o n of phantoms during the l i f e t i m e of a t r a n s a c t i o n may r e s u l t i n the t r a n s a c t i o n r e c e i v i n g an i n c o n s i s t e n t view of the database. To i l l u s t r a t e t h i s , c o n s i d e r the f o l l o w i n g example as d e s c r i b e d by S c h l a g e t e r (1978). \"A process P searches the same o b j e c t set twice.. The o b j e c t s searched f o r i n the f i r s t pass s a t i s f y P r e d i c a t e A, i n the second pass P r e d i c a t e A & B. As a r e s u l t of phantom m a t e r i a l i z a t i o n the set found i n the second pass may not be a subset of the set found i n the f i r s t p a s s.\" 1 3. D e s i r a b l e P r o p e r t i e s of a Locking; P r o t o c o l The performance of a l o c k i n g p r o t o c o l i s e v a l u a t e d i n terms of the amount of concurrency t h a t i t p r o v i d e s . In P a p a d i m i t r i o u ' s terms, we could say that the l a r g e r the c l a s s of s e r i a l i z a b l e schedules t h a t are p o s s i b l e responses of a l o c k i n g p r o t o c o l , the g r e a t e r the amount of concurrency provided ( i . e . , i f we c o n s i d e r a l o c k i n g p r o t o c o l to be a scheduler then a l a r g e r c l a s s , of r e c o g n i z a b l e s e r i a l i z a b l e schedules would mean l e s s r e s h u f f l i n g of the i n p u t schedule and fewer d e l a y s ) . 1 T h i s passage i s taken from page 267 of Schlageter (1978).. In our terminology s u b s t i t u t e the terms t r a n s a c t i o n f o r process and e n t i t y f o r o b j e c t . 20 Another d e s i r a b l e f e a t u r e of a l o c k i n g p r o t o c o l , i n a d d i t i o n to p r o v i d i n g a l a r g e amount of concurrency, i s t h a t i t ensures freedom from deadlock. Deadlock a r i s e s when say each of two t r a n s a c t i o n s a c q u i r e s only some of the e n t i t i e s t h a t i t needs and each waits f o r the other to unlock e n t i t i e s . The i s s u e of deadlocks i s examined i n Yannakakis et a l . (1979) f o r a s u b c l a s s of l o c k i n g p o l i c i e s c a l l e d L - p o l i c i e s . I t i s shown t h a t the problem of d e c i d i n g whether a s e t of t r a n s a c t i o n s i s not deadlock f r e e i s NP complete, and t h a t deadlock depends onl y on the order i n which e n t i t i e s are l o c k e d (and not on where they are unlocked). In the next s u b s e c t i o n we examine a l o c k i n g p r o t o c o l which i s deadlock f r e e and another which i s not. 4. Two Phase Locking P r o t o c o l / T r e e P r o t o c o l In d e s c r i b i n g l o c k i n g p r o t o c o l s we make the f o l l o w i n g assumptions: (a) the l o c k i n g or unlocking of an e n t i t y i s an a c t i o n (b) a t r a n s a c t i o n l o c k s each e n t i t y which i s to be accessed p r i o r to a c c e s s i n g i t (c) a l l l o c k s are e x c l u s i v e l o c k s (d) a l l a c t i o n s (except l o c k and unlock) modify t h e i r e n t i t i e s ( i n c l u d i n g read) (e) e n t i t i e s are accessed by name 21 F u r t h e r , we use the f o l l o w i n g model of s e r i a l i z a b i l i t y (Eswaran et a l . 1976): Construct a d i r e c t e d graph D(S) (a) with a node corresponding t o each t r a n s a c t i o n T i i n Schedule S (b) with an a r c ( T i , Tj) whenever i n S the output of some e n t i t y by T i serves d i r e c t l y as i n p u t t o Tj To c l a r i f y (b) above, l e t A1 and A2 be a c t i o n s and suppose that S = {..., ( T i , A1, E) , ( T j , A2, E) , ...} Then the a r c ( T i , T j) would be found i n D (S) i f no other a c t i o n i n v o l v i n g e n t i t y E occurs between A1 and A2 i n S.. I t can be proved (Eswaran et a l . . 1976) t h a t i f D(S) i s a c y c l i c then S i s s e r i a l i z a b i l i t y . Given a h i s t o r y of the form used i n P a p a d i m i t r i o u (1979) l e t us c o n s t r u c t the d i r e c t e d graph D (h) . The nodes i n D(h) would not n e c e s s a r i l y correspond t o l i v e t r a n s a c t i o n s i n h. For example, t r a n s a c t i o n s T1 and T2 are dead i n h = E1[XJR2[X], but the arc (T1,T2) would appear i n D (h) . Since Eswaran et a l . assume that a l l a c t i o n s are update a c t i o n s , a r c (T1,T2) w i l l appear i n the d i r e c t e d graph f o r each of the f o l l o w i n g h i s t o r i e s : hi = R1[X]W2£X] h2 = W1[X]R2[X] h3 = W1£X]W2£X] 22 In both h i and h3, output of X by T1 i s s a i d to serve as i n p u t to T2 even though i n n e i t h e r h i s t o r y i s i t the case t h a t R2 reads X from W1. Note a l s o t h a t E2 reads X from W1 i n h2 even though T2 i s dead i n h2. The two phase lock (2PI) p r o t o c o l proposed by Eswaran e t a l . (1976) s t a t e s t h a t once a t r a n s a c t i o n has unlocked some e n t i t y , i t may not l o c k any more e n t i t i e s . Thus, a t r a n s a c t i o n has two phases: a l o c k i n g phase and an unlocking phase. , The f o l l o w i n g i s an example of a t r a n s a c t i o n T1 which i s 2PL and another T2 which i s not 2PL: T1 = {(lock, E1), (read, E 1 ) , ( l o c k , E2) (unlock, E1) , (read, E2) , (unlock, E2)} T2 = {(lock, E1) , (read, E1), (unlock, E1) (lock, E2) , (read, E2) , (unlock, E2)} T1 i s 2PL because i t l o c k s a l l r e q u i r e d e n t i t i e s b e f o r e u n l o c k i n g any. T2 i s not 2PL because i t requests a l o c k on e n t i t y E2 a f t e r r e l e a s i n g a l o c k on E1.. Eswaran et al..(1976) have shown t h a t 2PL i s s u f f i c i e n t f o r ensuring s e r i a l i z a b i l i t y and a l s o necessary i n the sense t h a t i f a t r a n s a c t i o n , say T i , does not f o l l o w 2PL, then t h e r e e x i s t s some T j such t h a t concurrent execution of the p a i r ( T i , Tj) may produce a non s e r i a l i z a b l e schedule. The two phase lock p r o t o c o l does not ensure freedom from deadlock. To see t h i s consider two t r a n s a c t i o n s T1 and T2 where T1 = {(lock, E l ) , (lock, E2) (unlock, E1) , (unlock, E2) } T2 = {(lock, E2) , (lock, E1) (unlock, E2) , (unlock, E1)}1 23 Note t h a t both T1 and T2 are 2PL. Now, i f T1 and T2 are run c o n c u r r e n t l y such t h a t T1 succeeds i n l o c k i n g E1 and T2 succeeds i n l o c k i n g E2, then each t r a n s a c t i o n w i l l wait i n d e f i n i t e l y f o r the other to unlock i t s e n t i t y . . Hence, T1 and T2 w i l l be deadlocked. How i s i t c l e a r that 2PL ensures s e r i a l i z a b i l i t y ? I g n o r i n g deadlock, i t i s c l e a r t h a t D(S) w i l l be a c y c l i c f o r every p o s s i b l e schedule r e s u l t i n g from co n c u r r e n t execution of a set of t r a n s a c t i o n s T, i f each t r a n s a c t i o n i n T f o l l o w s 2PL. T h i s i s because 2PL r e g u i r e s that a t r a n s a c t i o n must lock a l l i t s e n t i t i e s before r e l e a s i n g any. Thus, a s i t u a t i o n c o u l d not occur with say two . t r a n s a c t i o n s i n which T i updates an e n t i t y before T j does and T j updates an e n t i t y before T i does. Fur t h e r , the a r c s i n D (S) represent a b i n a r y r e l a t i o n on the set of t r a n s a c t i o n s i n S and i f t h i s r e l a t i o n i s a c y c l i c then the t r a n s a c t i o n s can be embedded i n a l i n e a r order. Although i t has been shown t h a t 2PL i s e s s e n t i a l l y necessary f o r ensuring s e r i a l i z a b i l i t y when only s y n t a c t i c i n f o r m a t i o n about the t r a n s a c t i o n system i s a v a i l a b l e , other l o c k p r o t o c o l s have been d e r i v e d which guarantee s e r i a l i z a b i l i t y by r e l y i n g on i n f o r m a t i o n about database o r g a n i z a t i o n . . Kedem and S i l b e r s c h a t z (1978, 1979) have proposed the Tree P r o t o c o l f o r databases organized as rooted t r e e s and a DAG p r o t o c o l f o r databases organized as d i r e c t e d a c y c l i c graphs. The s t r u c t u r e upon which these l o c k i n g p o l i c i e s depend c o u l d i n p r a c t i c e r e f e r t o some l o g i c a l or p h y s i c a l o r g a n i z a t i o n of e n t i t i e s . For i l l u s t r a t i o n 1 As i s u s u a l we show only the l o c k and unlock s t e p s . 24 purposes we focus here on the Tree p r o t o c o l which may be s t a t e d as f o l l o w s : (a) A t r a n s a c t i o n may s e l e c t the f i r s t e n t i t y i n the database t r e e to be locked. (b) Subsequently, the t r a n s a c t i o n may l o c k an e n t i t y o n l y i f i t i s c u r r e n t l y h o l d i n g a lock on the e n t i t y ' s f a t h e r . (c) A t r a n s a c t i o n may unlock an e n t i t y at any time, however a given e n t i t y may be l o c k e d at most once w i t h i n the same t r a n s a c t i o n . From (c) above i t i s c l e a r t h a t the t r e e p r o t o c o l may r e s u l t i n t r a n s a c t i o n s which are not two phase l o c k e d . S i l b e r s c h a t z and Kedem (1978) have shown t h a t t h i s p r o t o c o l guarantees both s e r i a l i z a b i l i t y and freedom from deadlock. 5«- P r a c t i c a l C o n s i d e r a t i o n s When i t i s necessary f o r a t r a n s a c t i o n to l o c k e n t i t i e s i t may i n p r a c t i c e be more a p p r o p r i a t e f o r the t r a n s a c t i o n to l o c k some l a r g e r granule of the database which c o n t a i n s the r e q u i r e d e n t i t i e s . A granule i n t h i s context might be a r e c o r d , a page, a f i l e , a r e l a t i o n , a column of a r e l a t i o n , a t u p l e , e t c . A small granule s i z e allows a g r e a t e r amount of concurrency at a greater c o s t i n managing l o c k s . . A l a r g e granule s i z e , on the other hand, i n h i b i t s concurrency but minimizes management of l o c k s . 2 5 Ries and Stonebraker (1977) examine by s i m u l a t i o n the overheads i n v o l v e d i n l o c k i n g . The r e s u l t s of t h i s study show that a f a i r l y l a r g e granule s i z e i s optimum i n terms of such parameters as system throughout, response time, CPU u t i l i z a t i o n and I/O u t i l i z a t i o n . T h eir c o n c l u s i o n based on these r e s u l t s was t h a t coarse g r a n u l a r i t y such as f i l e or area l o c k i n g i s p r e f e r a b l e to f i n e g r a n u l a r i t y such as page or r e c o r d l o c k i n g . In a l a t e r study (Ries and Stonebraker 1979) s e v e r a l a l t e r n a t i v e assumptions i n t h e i r model showed t h a t the d e s i r a b l e granule s i z e i s more a p p l i c a t i o n dependent. For example, i t was found t h a t i f a l l t r a n s a c t i o n s are randomly a c c e s s i n g s m a l l p a r t s of the database then f i n e g r a n u l a r i t y i s p r e f e r a b l e . However, i f s e v e r a l t r a n s a c t i o n s access l a r g e p o r t i o n s of the database then coarse g r a n u l a r i t y i s again to be p r e f e r r e d . . These s t u d i e s are of i n t e r e s t because they i n d i c a t e t h a t the c o s t of managing l o c k s i s high, and that any advantages due to g r e a t e r concurrency may be outweighed by t h i s c o s t . Another motivation f o r l o c k i n g , i n a d d i t i o n to e n s u r i n g s e r i a l i z a b i l i t y i s to f a c i l i t a t e t r a n s a c t i o n back out procedures. Back out may be necessary i f i t i s found upon completion of a t r a n s a c t i o n t h a t some c o n s i s t e n c y c o n s t r a i n t has been v i o l a t e d or i f deadlock i s d e t e c t e d . Most systems maintain a log of a l l changes made to the database by each t r a n s a c t i o n . Back out of a t r a n s a c t i o n then, i n v o l v e s undoing a l l changes made by the t r a n s a c t i o n as recorded i n the l o g . . I t i s p o i n t e d out i n Davies (1973) and Bjork (1973) t h a t t h i s procedure may not work c o r r e c t l y i f a t r a n s a c t i o n i s permitted to update uncommitted data. Uncommitted data i s data which has been 26 updated by a t r a n s a c t i o n s t i l l i n progress which may be l a t e r backed out. To see the problem suppose t h a t data updated by some ongoing t r a n s a c t i o n T1 i s updated by another t r a n s a c t i o n T2. Then back out of t r a n s a c t i o n T1 w i l l undo T2's update. Hence, T2 w i l l a l s o have to be backed out. A need f o r \" i s o l a t e d \" backout suggests that l o c k s f o r update should be held to the end of a t r a n s a c t i o n . T h i s i s e x a c t l y the technique used i n System R (Astrahan et al..1976). S c h l a g e t e r (1978) p o i n t s out t h a t the two phase lock p r o t o c o l \"prevents the propagation of r o l l b a c k due to deadlock\". R e c a l l t h a t two phase l o c k i n g does not r e q u i r e t h a t l o c k s be h e l d to the end of a t r a n s a c t i o n . Of course, propagation of r o l l b a c k due to other reasons ( i . e . , hardware or software e r r o r s ) i s p o s s i b l e with two phase l o c k i n g . D.. Other Systems This s e c t i o n d e s c r i b e s concurrency c o n t r o l f a c i l i t i e s provided by a number of systems. 1. System R System R i s an experimental prototype database management system developed at the IBM San Jose, Research Laboratory (Astrahan et a l . 1976). The system provides a high l e v e l r e l a t i o n a l data i n t e r f a c e which permits d e f i n i t i o n of a v a r i e t y of views on common un d e r l y i n g data. (The SEQUEL term f o r an e x t e r n a l r e l a t i o n i s a view.) 27 The p r i n c i p a l language f o r i n t e r a c t i n g with System R i s the SEQUEL data sublanguage d e s c r i b e d i n Chamberlin and Boyce (1974). System R accomplishes the i n t e r f a c e between SEQUEL and the host language by means of a concept c a l l e d a c u r s o r . A c u r s o r i s a name which i d e n t i f i e s an a c t i v e s e t of t u p l e s and marks a p a r t i c u l a r t u p l e w i t h i n t h i s s e t . . A curso r i s p o s i t i o n e d by means of the SEQUEL operator which takes as parameters a SEQUEL guery and a c u r s o r name. The t u p l e s w i t h i n an a c t i v e s et may be r e t r i e v e d one a t a time by a user program by means of a FETCH command which takes as an argument a c u r s o r name. An a l t e r n a t i v e form of the FETCH command i s provided. The FETCH-HOLD command operates i n e x a c t l y the same way as the FETCH command except t h a t i t a l s o a c q u i r e s a hold on the t u p l e r e t u r n e d . A hold prevents other users from updating or d e l e t i n g the t u p l e u n t i l i t i s e x p l i c i t l y r e l e a s e d ( by means of a RELEASE command) or u n t i l the holding t r a n s a c t i o n has terminated. System R employs very s o p h i s t i c a t e d f a c i l i t i e s f o r ensu r i n g database c o r r e c t n e s s i n c l u d i n g i n t e g r i t y a s s e r t i o n s , a l o g g i n g and r e c o v e r y subsystem, automatic l o c k i n g , deadlock d e t e c t i o n and t r a n s a c t i o n back out procedures. I n t e g r i t y a s s e r t i o n s are s p e c i f i e d by means of SEQUEL p r e d i c a t e s . A s s e r t i o n s may d e s c r i b e p e r m i s s i b l e s t a t e s of the database or p e r m i s s i b l e t r a n s i t i o n s i n the database. The system a u t o m a t i c a l l y r e j e c t s any data m o d i f i c a t i o n command which v i o l a t e s an a c t i v e i n t e g r i t y a s s e r t i o n . A s s e r t i o n s are normally checked at the end of each t r a n s a c t i o n , and the t r a n s a c t i o n i s backed out i f some a s s e r t i o n i s v i o l a t e d . I f an a s s e r t i o n i s 28 s p e c i f i e d as IMMEDIATE i t i s checked a f t e r each data m o d i f i c a t i o n command with i n each t r a n s a c t i o n . In the remainder of t h i s s u b s e c t i o n we focus on the f o l l o w i n g concurrency c o n t r o l f e a t u r e s of System R: (a) user d e f i n e d t r a n s a c t i o n s (b) m u l t i p l e l e v e l s of c o n s i s t e n c y (c) l o c k i n g a t v a r i o u s g r a n u l a r i t i e s A user d e f i n e s h i s own t r a n s a c t i o n s i n System R by means of the BEGIN-TRANS and END-TRANS o p e r a t o r s . Such a t r a n s a c t i o n i s b a s i c a l l y a PL/1 program c o n t a i n i n g SEQUEL data manipulation commands. A user may a l s o s p e c i f y save p o i n t s within h i s t r a n s a c t i o n . As long as a t r a n s a c t i o n i s a c t i v e the user may backup t o the beginning cr to any i n t e r m e d i a t e save p o i n t . . System R supports three d i s t i n c t l e v e l s of c o n s i s t e n c y . When a user d e f i n e s h i s t r a n s a c t i o n he must s p e c i f y the l e v e l at which he wishes i t t o execute. At a l l three c o n s i s t e n c y l e v e l s the system guarantees that data modified by a t r a n s a c t i o n i s not modified by any other u n t i l the f i r s t t r a n s a c t i o n terminates. Thus, the system can guarantee that backout of m o d i f i c a t i o n s by one t r a n s a c t i o n w i l l not undo m o d i f i c a t i o n s made by other t r a n s a c t i o n s . The d i f f e r e n c e s i n c o n s i s t e n c y l e v e l s occur during read o p e r a t i o n s . L e v e l 1 c o n s i s t e n c y o f f e r s the l e a s t i s o l a t i o n from other users. At t h i s l e v e l a t r a n s a c t i o n may read uncommitted data. Such a t r a n s a c t i o n i n c u r s the r i s k of readi n g i n c o n s i s t e n t data values or data values which never e x i s t e d i f the t r a n s a c t i o n which s e t the values i s l a t e r backed out. The p o s s i b l y i n a c c u r a t e data gathered by a l e v e l 1 t r a n s a c t i o n would be 29 u n s u i t a b l e as a b a s i s f o r updating the database or f o r making commitments to the ou t s i d e world. I t may be adequate, however, f o r s t a t i s t i c a l r e p o r t i n g . A l e v e l 1 t r a n s a c t i o n may e x p l i c i t l y employ the HOLD operator t o p r o t e c t i t s e l f from see i n g uncommitted m o d i f i c a t i o n s or to guard a g a i n s t l o s i n g updates. L e v e l 2 c o n s i s t e n c y guarantees that a t r a n s a c t i o n does not see uncommitted data. However, the t r a n s a c t i o n may not see the same value f o r an e n t i t y each time the e n t i t y i s accessed ( i . e . , read r e p r o d u c i b i l i t y i s not guaranteed). At t h i s c o n s i s t e n c y l e v e l i t i s p o s s i b l e f o r another t r a n s a c t i o n to modify an e n t i t y and commit the change i n the i n t e r v a l between two s u c c e s s i v e accesses of the e n t i t y by a given l e v e l 2 t r a n s a c t i o n . , R e c a l l that data becomes committed as soon as the t r a n s a c t i o n which modifies i t te r m i n a t e s . The HOLD operator may be used by a l e v e l 2 t r a n s a c t i o n to ensure read r e p r o d u c i b i l i t y and to guard a g a i n s t l o s i n g updates. A l e v e l 3 t r a n s a c t i o n sees the l o g i c a l e q u i v a l e n t of a s i n g l e user system. A l l data read i s committed and read r e p r o d u c i b i l i t y i s guaranteed {except, of course, f o r changes made by the t r a n s a c t i o n i t s e l f ) . This read r e p r o d u c i b i l i t y a p p l i e s not only to s i n g l e t u p l e s but a l s o to c o l l e c t i o n s of t u p l e s . For example, i f a l e v e l 3 t r a n s a c t i o n accesses a l l employees whose s a l a r y f a l l s w i t h i n a c e r t a i n range, the same answer w i l l occur every time w i t h i n the t r a n s a c t i o n . Concurrent t r a n s a c t i o n s w i l l be prevented not only from updating or d e l e t i n g an employee from t h i s s e t but a l s o from adding an employee to t h i s s e t . . L e v e l 3 c o n s i s t e n c y does not r e q u i r e e x p l i c i t HOLDs. The problem of l o s t updates i s e l i m i n a t e d . 30 To guarantee c o n s i s t e n c y a t these v a r i o u s l e v e l s , System R set s l o c k s a u t o m a t i c a l l y (with the e x c e p t i o n of e x p l i c i t HOLDs which may be set by l e v e l 1 and 2 t r a n s a c t i o n s ) . To reduce the overheads r e q u i r e d f o r lock management, lo c k s are a p p l i e d at vari o u s g r a n u l a r i t i e s . For example, i n d i v i d u a l t u p l e l o c k s are a p p l i e d f o r t r a n s a c t i o n s t h a t access only a few t u p l e s , whereas a c o a r s e r g r a n u l a r i t y of l o c k ( i . e . , a t the l e v e l of a whole r e l a t i o n ) may be chosen f o r t r a n s a c t i o n s which access many t u p l e s . The p r o t o c o l f o r r e g u e s t i n q l o c k s i s t h a t d e s c r i b e d i n Gray et a l . (1976). System R employs l o c k i n g both at the l o g i c a l l e v e l of r e l a t i o n s and t u p l e s and at the p h y s i c a l l e v e l of pages. At the p h y s i c a l l e v e l , l o c k i n g i s used to guard a g a i n s t occurrences such as the f o l l o w i n g : \"... a data page may c o n t a i n s e v e r a l t u p l e s with each t u p l e accessed through i t s t u p l e i d e n t i f i e r , which r e q u i r e s f o l l o w i n g a p o i n t e r w i t h i n the data page. Even i f no l o g i c a l c o n f l i c t occurs between two t r a n s a c t i o n s because each i s a c c e s s i n g a d i f f e r e n t r e l a t i o n or a d i f f e r e n t t u p l e i n the same r e l a t i o n , a problem c o u l d occur at the p h y s i c a l l e v e l i f one t r a n s a c t i o n f o l l o w s a p o i n t e r to a tu p l e on some page while the other t r a n s a c t i o n updates a second t u p l e on the same page and causes a data compaction r o u t i n e to r e a s s i g n t u p l e l o c a t i o n s . \" 1 A d i s t i n c t i o n between shared and e x c l u s i v e l o c k s i s made f o r l o g i c a l l o c k i n g . . For a l l three c o n s i s t e n c y l e v e l s i f a t u p l e i s t o be i n s e r t e d or updated by a t r a n s a c t i o n then an e x c l u s i v e l o c k i s held on the t u p l e (or some superset) u n t i l the t r a n s a c t i o n t e r m i n a t e s . Read r e p r o d u c i b i l i t y i s achieved f o r l e v e l 3 t r a n s a c t i o n s by maintaining shared l o c k s on a l l t u p l e s 1 T h i s passage i s taken from page 124 of Astrahan e t a l . (1976). 31 and index values t h a t are read f o r the d u r a t i o n of the t r a n s a c t i o n . \"For t r a n s a c t i o n s with l e v e l 2 c o n s i s t e n c y read accesses r e q u i r e a shared l o c k with immediate d u r a t i o n . Such a lock request i s enqueued behind e a r l i e r e x c l u s i v e l o c k requests so that the user i s assured of r e a d i n g committed data. The l o c k i s then r e l e a s e d as soon as the request has been granted, s i n c e reads do not have t o be r e p e a t a b l e . 1 1 1 For t r a n s a c t i o n s with l e v e l 1 c o n s i s t e n c y , no l o c k s are h e l d f o r read purposes except f o r p h y s i c a l l o c k s on pages as d e s c r i b e d above. 2- - INGEES ' INGRES (Stonebraker et a l . . 1976) i s another r e l a t i o n a l DBMS which, l i k e System E, supports the concept of a view, p r o v i d e s i n t e g r i t y a s s e r t i o n s f o r checking v a l i d i t y of updates, and provides a l o g g i n g and recovery scheme. The concurrency c o n t r o l f a c i l i t i e s provided by INGRES, however, are l i m i t e d i n comparison with those provided by System B, l a r g e l y due t o address space l i m i t a t i o n s of the PDP-11s f o r which INGEES was designed. INGEES prov i d e s an o p t i o n whereby concurrency c o n t r o l can be turned o f f . The primary guery language supported i s QUEL. Users may execute QUEL statements and other INGEES u t i l i t i e s d i r e c t l y from a t e r m i n a l , or INGEES may be invoked from w i t h i n a program w r i t t e n i n EQUEL. EQUEL i s a s p e c i a l language c o n s i s t i n g of QUEL embedded i n the g e n e r a l purpose programming language C. 1 This passage i s taken from page 126 of Astrahan e t a l . . ( 1 9 7 6 ) . 3 2 I n t e g r i t y a s s e r t i o n s are s p e c i f i e d by means of QUEL q u a l i f i c a t i o n c l a u s e s . A user's request i s modified b e f o r e execution by ANDinq a p p r o p r i a t e a s s e r t i o n s with e x i s t i n g q u a l i f i c a t i o n s . This has the e f f e c t that no reguest can p o s s i b l y v i o l a t e any i n t e g r i t y a s s e r t i o n . . INGRES does not support the System R t r a n s i t i o n a s s e r t i o n s or d e f e r r e d a s s e r t i o n s . The reader i s cautioned that INGEES w r i t e r s use the term t r a n s a c t i o n d i f f e r e n t l y than we have been using i t . INGRES t r a n s a c t i o n s are not n e c e s s a r i l y c o n s i s t e n c y p r e s e r v i n g . Throughout the remainder of t h i s s u b s e c t i o n , whenever INGEES w r i t e r s use the term t r a n s a c t i o n we use term INGRES t r a n s a c t i o n . An INGRES t r a n s a c t i o n i s d e f i n e d as one INGEES command ( i . e . , QUEL data manipulation command or INGEES u t i l i t y ) . . The INGEES desig n e r s c o n s i d e r e d v a r i o u s other a l t e r n a t i v e s f o r d e f i n i n g a t r a n s a c t i o n i n c l u d i n g the f o l l o w i n g : (a) a c o l l e c t i o n of INGRES commands with no i n t e r v e n i n g C code (b) a c o l l e c t i o n of INGRES commands with C code but no system c a l l s (c) an a r b i t r a r y EQUEL program Option (a) i s p e r c e i v e d by INGRES designers t o be a minor extension of the chosen a l t e r n a t i v e (one which they promised t o implement given s u f f i c i e n t user demand). Option (b) i s seen as a minor g e n e r a l i z a t i o n of o p t i o n (a) not worth the a d d i t i o n a l system complexity r e q u i r e d f o r i t s implementation.. D e f i n i n g an INGEES t r a n s a c t i o n as an EQUEL program [ o p t i o n (c) ] i s , i n the opini o n of the INGEES designers, impossible t o 33 support. Such a t r a n s a c t i o n c o u l d c o n t a i n system c a l l s to say cre a t e and destroy f i l e s . The overhead of backing out through i n t e r m e d i a t e system c a l l s to r e s o l v e deadlock i s seen by INGRES designers as p r o h i b i t i v e . I t i s noted t h a t deadlock cannot be avoided i n t h i s case because ex e c u t i o n of a QUEL statement may depend on r e s u l t s obtained from system c a l l s . Hence, there i s no way t o t e l l i n advance that two t r a n s a c t i o n s may deadlock. I t i s i n t e r e s t i n g to note that System R supports a t r a n s a c t i o n of t h i s type, presumably because System R designers could b e t t e r a f f o r d the overheads f o r r e s o l u t i o n of deadlock ( i . e . , the op e r a t i n g system environment f o r System R i s an extended v e r s i o n of VM/37 0) . INGRES desig n e r s have chosen t o av o i d ( r a t h e r than d e t e c t and r e s o l v e ) deadlock. The lock p r o t o c o l used by INGRES t r a n s a c t i o n s amounts to r e q u i r i n g t h a t a t r a n s a c t i o n l o c k s a l l i t s e n t i t i e s i n one step. T h i s i s accomplished v i a a LOCK r e l a t i o n which records lock r e g u e s t s . An i n t e r a c t i o n 1 i s not allowed t o proceed unless a l l lock requests can be g r a n t e d . ( i . e . . The l o c k r e l a t i o n i s p h y s i c a l l y locked and i n t e r r o g a t e d f o r c o n f l i c t i n g l o c k s . I f no lock s c o n f l i c t then a l l r e q u i r e d l o c k s are i n s e r t e d i n t o the LOCK r e l a t i o n . Otherwise, the i n t e r a c t i o n waits f o r a f i x e d i n t e r v a l and t r i e s again.) I t i s not c l e a r from a v a i l a b l e documentation whether both shared and e x c l u s i v e l o c k s are employed by the l o c k p r o t o c o l . Locks are r e l e a s e d at the end of each i n t e r a c t i o n . 1 An i n t e r a c t i o n may c o n s i s t of more than one INGRES t r a n s a c t i o n . 34 I t i s not c l e a r what s o r t of c o r r e c t n e s s guarantee i s o f f e r e d by the INGEES lock p r o t o c o l . I f we assume that a l l l o c k s are e x c l u s i v e l o c k s then i t appears t h a t the lock p r o t o c o l w i l l always output a s e r i a l i z a b l e schedule when used by a s e t of c o n s i s t e n c y p r e s e r v i n g t r a n s a c t i o n s . When used by a s e t of INGEES t r a n s a c t i o n s , the lock p r o t o c o l may r e s u l t i n a schedule which i s not c o n s i s t e n c y p r e s e r v i n g . C l e a r l y , any lock p r o t o c o l even one that always outputs a s e r i a l schedule may r e s u l t i n an i n c o n s i s t e n t schedule when used by non c o n s i s t e n c y p r e s e r v i n g t r a n s a c t i o n s . I t appears a l s o t h a t INGRES may l o s e updates. I f th e r e i s no c o n s i s t e n c y p r e s e r v i n g u n i t , then there i s no way to r e q u i r e that the read and update of an e n t i t y by the same user be executed i n d i v i s i b l y . Hence, an update of the same e n t i t y by a d i f f e r e n t user may get i n t e r l e a v e d such t h a t an update gets l o s t . We s p e c u l a t e t h a t some p r o t e c t i o n a g a i n s t l o s t updates i s o f f e r e d by INGEES. F i r s t , s i n c e l o c k s are held f o r the d u r a t i o n of an i n t e r a c t i o n , l o s t updates w i l l be prevented i f a user c a r e f u l l y s e l e c t s the INGEES commands which w i l l comprise h i s i n t e r a c t i o n . . Second, i t appears that INGEES a u t o m a t i c a l l y converts any update command i n t o a r e t r i e v a l and update command. This would mean that some l o s t update s i t u a t i o n s w i l l be avoided by not a l l o w i n g the update of a v a r i a b l e t o be based on a previous and erroneous value read. i t appears t h a t the f a c i l i t y t o process s e v e r a l commands as an i n t e r a c t i o n could be used i n gene r a l to d e f i n e t r a n s a c t i o n s . These statements are h i g h l y s p e c u l a t i v e , however, as they have not been pointed out by 35 INGEES w r i t e r s . The g r a n u l a r i t y of l o c k i s r e l a t i v e l y coarse (on domains of a r e l a t i o n ) r e f l e c t i n g an environment where core storage f o r a l a r g e l o c k t a b l e i s not a v a i l a b l e . A p r e d i c a t e l o c k i n g scheme i s proposed i n the hope that c o n s i d e r a b l e concurrency may be provided a t an a c c e p t a b l e overhead i n l o c k t a b l e space and CPU time. 3. IMS IMS i s a h i e r a r c h i c a l DBMS which provides f a c i l i t i e s f o r running p r i m a r i l y batch a p p l i c a t i o n s . A user's e x t e r n a l view c o n s i s t s of a c o l l e c t i o n of l o g i c a l databases, each d e f i n e d by means of a program c o n t r o l block (PCB) which a l s o s p e c i f i e s a mapping t o a corresponding p h y s i c a l database. When a user program i s o p e r a t i n g on a p a r t i c u l a r l o g i c a l database the a s s o c i a t e d PCB i s maintained i n storage as a means of communication between the program and IMS. Each p h y s i c a l database i s an ordered set of a l l occurrences of one type of p h y s i c a l database r e c o r d . The f o l l o w i n g d e s c r i b e s e s s e n t i a l f e a t u r e s of the IMS h i e r a r c h i c a l data model. . The reader i s r e f e r r e d to Date( 1977) f o r f u r t h e r d e t a i l s : (a) A database i s defined by a t r e e s t r u c t u r e ( i . e . , a d e f i n i t i o n t r e e ) . Each node of the d e f i n i t i o n t r e e r e p r e s e n t s a segment type i n the database. (b) The d e f i n i t i o n t r e e c o n t a i n s e x a c t l y one r o o t segment type. The r o o t may have any number of c h i l d segment types. Each c h i l d of the r o o t may a l s o have any number of c h i l d segment types and so 36 on to any number of l e v e l s . (c) A database record c o n s i s t s of the occurrence i n the database of a s i n g l e r o o t segment and a number of dependent segments. (d) A database r e c o r d f o l l o w s the template imposed by the d e f i n i t i o n t r e e with the exception that f o r one occurrence of any given segment type t h e r e may be zero or more occurrences of each of i t s c h i l d r e n . IMS p r o v i d e s a complete set of r o u t i n e s to a s s i s t i n m a i n t a i n i n g database c o r r e c t n e s s i n c l u d i n g checkpoint and r e s t a r t procedures, procedures f o r backing out changes made to a database by a given program, and procedures f o r maintaining the system l o g . IMS does not e x p l i c i t l y provide i n t e g r i t y a s s e r t i o n s . However, as pointed out i n Date (1977), t h e r e are two f e a t u r e s of IMS t h a t can be viewed as mechanisms f o r h a n d l i n g c o n s i s t e n c y c o n s t r a i n t s . , F i r s t , IMS enforces unigueness of seguence f i e l d values (for segments which have seguence f i e l d s and are not m u l t i p l e v a l u e d ) . Second, c e r t a i n c o n s i s t e n c y c o n s t r a i n t s are enforced by the very h i e r a r c h i c a l s t r u c t u r e of an IMS database. For example, suppose t h a t E1 and E2 are segment occurrences and that a c o n s i s t e n c y c o n s t r a i n t C a p p l i e s t o the database t h a t \"E1 cannot e x i s t without E2\". I f the database i s organized such that the segment type of E2 i s s u p e r i o r i n the d e f i n i t i o n t r e e to t hat of E1, then C would be guaranteed by the simple r u l e f o r h i e r a r c h i c a l databases that an occurrence of a dependent segment cannot e x i s t without i t s parent. 37 IMS provi d e s a f a i r l y complete set of concurrency c o n t r o l f a c i l i t i e s . The primary d i f f e r e n c e between IMS and' other systems we have seen i n t h i s regard i s that IMS l e a v e s most of the r e s p o n s i b i l i t y f o r lock p r o t o c o l s up to the user ( i . e . , no automatic l o c k i n g ) . The remainder of t h i s s e c t i o n d e s c r i b e s IMS concurrency c o n t r o l f a c i l i t i e s . F i r s t , a PCB a s s o c i a t e d with a user program may s p e c i f y an option whereby a l l occurrences of an e n t i r e segment type are locked i n e x c l u s i v e mode. IMS w i l l not load and i n i t i a t e a program i f i t s PCB en t r y c o n f l i c t s with that of any program which i s alre a d y executing. Two PCB e n t r i e s c o n f l i c t i f e i t h e r s p e c i f i e s the EXCLUSIVE option f o r the same segment type. Second, IMS provides a hold mechanism t o guard against l o s t updates. Two v e r s i o n s of each r e t r i e v a l command are provided: a GET v e r s i o n and a GET-HOLD v e r s i o n . Both forms r e t r i e v e a p a r t i c u l a r segment occurrence, however a GET-HOLD command a l s o places the r e t r i e v e d segment i n h o l d . When a segment i s h e l d by one user i t may be r e t r i e v e d by other users but not v i a a GET-HOLD r e t r i e v a l command. When the h o l d i n g user i s s u e s a modify command the segment occurrence i s locked i n e x c l u s i v e mode ( i . e . , other users may n e i t h e r GET nor GET-HOLD the segment). Note that t h i s procedure ensures that two users w i l l not read the same value f o r a segment with the i n t e n t i o n o f updating i t . Hence, segment updates cannot be o v e r w r i t t e n . T h i r d , IMS supports shared l o c k s on segment occurrences which may be e x p l i c i t l y set and r e l e a s e d by a user program. Although there i s a l i m i t to the number of such l o c k s t h a t a program may hold at one time, t h i s f a c i l i t y appears s u f f i c i e n t 38 to guarantee c o n s i s t e n c y (assuming j u d i c i o u s use of a l o c k i n g p r o t o c o l by a l l concurrent programs). I t i s i n t e r e s t i n g t o note t h a t IMS permits a user to delegate d i f f e r e n t segment occurrences to v a r i o u s l o c k c l a s s e s . A l l segments w i t h i n a given l o c k c l a s s may then be unlocked i n one o p e r a t i o n . . The f o l l o w i n g i s an example of such l o c k i n g and u n l o c k i n g : GU PRESIDENT*QB(NAME=1 JOHN') DEQ B The •*Q' i n the GET UNIQUE command s p e c i f i e s t h a t the r e t r i e v e d segment should be locked i n shared mode. The 'B' f o l l o w i n g the *Q i s the l o c k c l a s s . The dequeue command (DEQ) r e l e a s e s a l l shared locks on segments i n lock c l a s s B. Note that a hold on a segment and shared l o c k on a segment are f u n c t i o n a l l y i d e n t i c a l . E x c l u s i v e l o c k s are r e l e a s e d when a program terminates or when a checkpoint o p e r a t i o n w i t h i n the program i s executed. Shared l o c k s are a l s o r e l e a s e d at these times. Backout may be i n i t i a t e d by a program or by the system. The automatic unlocki n g s t r a t e g y appears s u f f i c i e n t to guarantee i s o l a t e d backout. F i n a l l y , IMS permits s p e c i f i c a t i o n of an o p t i o n i n the PCB ( c a l l e d the BEAD EXPRESS option) whereby a program i s allowed to see uncommitted changes made by other c o n c u r r e n t programs. A program wiiich uses the BEAD EXPRESS op t i o n would be known as a l e v e l 1 t r a n s a c t i o n i n System R. 39 4. DBTG In t h i s s e c t i o n we review proposals made by the Data Base Task Group (DBTG) f o r maintaining c o r r e c t n e s s i n a network based system (CODASYL, 1976). In p a r t i c u l a r , we are i n t e r e s t e d i n the p r o v i s i o n s f o r s p e c i f y i n g i n t e g r i t y c o n s t r a i n t s i n the schema DDL, and i n the DML statements f o r concurrency c o n t r o l . . We f i r s t b r i e f l y review the network data model and the fundamental notion of currency ( not to be confused with concurrency).. A network i s a mere g e n e r a l s t r u c t u r e than a h i e r a r c h y i n that a given segment occurrence may have any number of immediate s u p e r i o r s (as w e l l as any number of immediate dependents). The network approach uses the term r e c o r d f o r segment and i n c l u d e s the concept of a s e t . A set type c o n s i s t s of one owner r e c o r d type and one or more member r e c o r d types. An occurrence of a set i n the database c o n s i s t s of the occurrence of one owner re c o r d and zero or more member recor d s . The reader i s r e f e r r e d to Date (1977) f o r f u r t h e r d e t a i l s on the network data model. The DBTG data sublanguage r e s t s on the fundamental n o t i o n of c u r r e n c y . The b a s i c idea i s t h a t the DBMS maintains f o r each a c t i v e user program a t a b l e of database key values which s p e c i f i e s the most r e c e n t l y r e f e r e n c e d r e c o r d occurrence w i t h i n each a r e a , 1 w i t h i n each r e c o r d type, w i t h i n each set type and within a l l record types. The most r e c e n t l y r e f e r e n c e d r e c o r d occurrence w i t h i n a l l record types i s c a l l e d the c u r r e n t of run-u n i t . 1 The storage space f o r a DBTG database i s p a r t i t i o n e d i n t o a number of named areas. 40 The DBTG proposes s p e c i f i c a t i o n c f i n t e g r i t y c o n s t r a i n t s i n the form of procedure d e c l a r a t i o n s i n the schema DDL. Such a procedure i s a u t o m a t i c a l l y invoked b e f o r e , a f t e r or on e r r o r during a p a r t i c u l a r o p e r a t i o n a f f e c t i n g a p a r t i c u l a r o b j e c t . The o b j e c t may be the schema i t s e l f , an area, r e c o r d , data item or s e t . . E x a c t l y when the procedure should be invoked as w e l l as the o b j e c t and o p e r a t i o n that should t r i g g e r the i n v o k a t i o n i s a l s o s p e c i f i e d as part of the schema. A d d i t i o n a l c o n s t r a i n t s may be s p e c i f i e d which are a u t o m a t i c a l l y checked a f t e r e x e c u t i o n of a s t o r e or modify o p e r a t i o n i n v o l v i n g a s p e c i f i e d data item. T h i s f e a t u r e c o u l d be used, f o r example, t o s p e c i f y values or ranges of values allowed f o r data items. The DBTG f a c i l i t i e s f o r concurrency c o n t r o l are s i m i l a r t o those of IMS i n t h a t they are not a p p l i e d a u t o m a t i c a l l y . Rather, i t i s l e f t up to the user to develop h i s own concurrency c o n t r o l s t r a t e g y based on the a v a i l a b l e concurrency c o n t r o l p r i m i t i v e s . The remainder of t h i s s e c t i o n d e s c r i b e s two such concurrency c o n t r o l p r i m i t i v e s : usage-mode and monitored-mode. A user must s p e c i f y a usage-mode f o r any area which he wishes to use. A user i n d i c a t e s two t h i n g s when he s p e c i f i e s a usage-mode: F i r s t , how he himself wishes to use the area. Second, how he wishes concurrent users t o use the area. The a v a i l a b l e usage-modes are d e s c r i b e d below: (a) R e t r i e v a l - given user wishes to read from t h i s area - other users may read or write i n t h i s area (b) Update •- given user wishes t o write i n t h i s area - other users may read or write (c) P r o t e c t e d R e t r i e v a l - given user wishes t o read from t h i s area - other users may read but not write 41 (d) P r o t e c t e d Update - given user wishes to write i n t h i s area - other users may read but not write (e) E x c l u s i v e R e t r i e v a l - given user wishes t o read - other users may n e i t h e r read nor write (f) E x c l u s i v e Update - given.user wishes to write i n t h i s area - other users may n e i t h e r read nor write A user i s not allowed access to an area i f a c o n f l i c t i n g usage-mode has alre a d y been e s t a b l i s h e d f o r t h a t area by some other user. The f a c i l i t y f o r ens u r i n g p r o t e c t e d or e x c l u s i v e use of an area might be e f f e c t i v e l y used to ensure database c o r r e c t n e s s . However, t h i s mechanism s e v e r e l y r e s t r i c t s concurrency because the unit of l o c k i n g ( i . e . , an area) i s a l a r g e p o r t i o n of the database. The DBTG proposals provide f o r an a d d i t i o n a l mechanism which d i f f e r s from any t h a t we have seen so f a r i n t h a t , r a t h e r than i s o l a t i n g a user from the a c t i v i t i e s of other concurrent users, t h i s mechanism attempts to n o t i f y a user of the a c t i v i t i e s of other concurrent users. The b a s i c i d e a of the scheme i s as f o l l o w s : A user may e x p l i c i t l y place a rec o r d i n t o monitored-mode. I f a user. A, has a r e c o r d i n monitored-mode then an attempt by A to change the rec o r d w i l l f a i l i f some concurrent user has changed the r e c o r d s i n c e A requested monitored-mode. User A r e c e i v e s an e r r o r message and i t i s up to him t o decide what t o do next. Monitored-mode has r e c e i v e d c r i t i c i s m i n the l i t e r a t u r e (Engles 1971, 1977). The b a s i c problems are as f o l l o w s : (a) When a request t o modify a rec o r d i n monitored-mode f a i l s , t h ere i s no way of f o r c i n g the user to 4 2 handle the s i t u a t i o n c o r r e c t l y . {h) No p r o t e c t i o n i s o f f e r e d a g a i n s t seeing uncommitted changes. (c) No mechanism i s provided to prevent concurrent users from changing data t h a t a given user i s c u r r e n t l y working on. (d) One user may t r a n s f e r another user's c u r r e n t of r u n - u n i t from one r e c o r d occurrence to another. (A p o s s i b l e r e s u l t i s t h a t a user, unaware t h a t h i s c u r r e n t of r u n - u n i t has been t r a n s f e r r e d , may f i n d h i m s e l f t r a v e r s i n g the wrong set occurrence.). S e v e r a l proposals have been made t o the CODASYL Programming Language Committee t o r e p l a c e the DBTG n o t i f y p r o t o c o l with a l o c k i n g scheme. One such p r o p o s a l made by UNIVAC(1976) prov i d e s f o r a LOCK statement which permits a user to e x p l i c i t l y l o c k a record and keep i t locked even when the rec o r d ceases t o be the cu r r e n t of r u n - u n i t . . [Some s i m i l a r i d e a s were proposed e a r l i e r by Hawley et a l . (1S75).] A u n i f i e d s o l u t i o n t o s e v e r a l DBTG problems i n c l u d i n g concurrency has been proposed by Engles (1976) based on l o c k i n g and the concept of a c u r s o r . E« Current Hot Research Issues We conclude t h i s chapter with a l i s t of c u r r e n t hot r e s e a r c h i s s u e s : 1. Q u a n t i t a t i v e measures f o r e v a l u a t i n g the performance of a scheduler are needed. To date only q u a l i t a t i v e measures have been proposed such as the s e t of a l l output schedules. P a p a d i m i t r i o u (1979) suggests one promising d i r e c t i o n which 43 would be t o approximate the t o t a l number of delays imposed on r e q u e s t s by counting the number rof t r a n s a c t i o n s t e p s which cannot execute immediately upon;= a r r i v a l . 2. A b a s i c assumption i n P a p a d i m i t r i o u • s formalism i s t h a t a s y n t a c t i c d e s c r i p t i o n of a l l t r a n s a c t i o n s to occur i n a h i s t o r y i s known t o the scheduler a p r i o r i . In P a p a d i m i t r i o u ' s words \" i t i s not cl-ear how to remove t h i s assumption and s t i l l r e t a i n the wealth of a v a i l a b l e s o l u t i o n s \" . One approach wou.'.d be to have a number of prototype t r a n s a c t i o n s to which a r r i v i n g t r a n s a c t i o n s can be matched. This approach i s d i s c u s s e d i n P a p a d i m i t r i o u (1979) and B e r n s t e i n et a l . (1977). R e c a l l t h a t l o c k i n g p r o t o c o l s get around t h i s requirement by f o c u s i n g on each i n d i v i d u a l t r a n s a c t i o n . 3. Many of the r e s u l t s presented i n the l i t e r a t u r e r e l a t e d to l o c k i n g s t r a t e g i e s are not adeguately g e n e r a l i z e d t o d i s t i n g u i s h between read and write a c t i o n s . \" In general l o c k p r o t o c o l s are s t a t e d by assuming t h a t a l l a c t i o n s modify t h e i r e n t i t i e s . Some g e n e r a l i z a t i o n of the two phase l o c k p r o t o c o l can be found i n Gray et a l . . (1975) and Lien and Weinberger (1 978). Kedem and S i l b e r s c h a t z (1979), i n the context of t h e i r work on l o c k p r o t o c o l s f o r database systems modeled by d i r e c t e d graphs, note a requirement f o r continued i n v e s t i g a t i o n \"concerning c o r r e c t n e s s when t r a n s a c t i o n s are p e r m i t t e d to set shared and e x c l u s i v e l o c k s \" . 4. A major c u r r e n t i s s u e , not d i s c u s s e d anywhere e l s e i n t h i s t h e s i s except here, i s d i s t r i b u t e d data management. For d i s c u s s i o n purposes we choose the SDD-1 system - a prototype 44 d i s t r i b u t e d database system under development by Computer C o r p o r a t i o n of America. The SDD-1 system i n v o l v e s redundant data s t o r e d at a number of network s i t e s . Users i n t e r a c t with SDD-1 as i f i t were a n o n - d i s t r i b u t e d database system because SDD-1 handles a l l i s s u e s a r i s i n g from d i s t r i b u t i o n which, among many o t h e r s , i n c l u d e d i s t r i b u t e d concurrency c o n t r o l . The concurrency i s s u e here i s the same as we have been d e s c r i b i n g except that i t i s f u r t h e r complicated by the problems of data d i s t r i b u t i o n and data r e p l i c a t i o n . A s e r i e s of papers i n ACM T r a n s a c t i o n s on Database Systems {1980) d e s c r i b e s the SDD-1 system and i t s concurrency c o n t r o l mechanism. F u r t h e r , an a n a l y s i s of c o r r e c t n e s s of the concurrency c o n t r o l mechanism i s provided (see Rothnie et a l . 1980, B e r n s t e i n et a l . 1980, B e r s t e i n and Ship man 1980). Many r e s e a r c h e r s have d i s c u s s e d e x t e n s i o n s of t h e i r r e s u l t s to d i s t r i b u t e d d a t a b a s e s ( i . e . , P a p a d i m i t r i o u 1979, Kedem and S i l b e r s c h a t z 1979). 5. F i n a l l y , there i s the i s s u e of concurrency i n B-trees. A B-tree (Bayer and McCreight 1972) i s a data s t r u c t u r e f o r o r g a n i z i n g a database to guarantee e f f i c i e n t a c c e s s . Concurrent o p e r a t i o n of processes on a database stor e d as a B-tree (or some variant) i n v o l v e s the usual problems a s s o c i a t e d with any concurrent system. However, s p e c i f i c s o l u t i o n s are suggested by the s t r u c t u r e imposed by the B-tree and by a reguirement f o r a l i m i t e d s e t of o p e r a t i o n s . See Kwong and Wood (1979, 1980) f o r a survey of s o l u t i o n s t o the problem of concurrency i n B-trees and a d i s c u s s i o n of the many remaining problems. 45 Chapter I I I : Problem In g e n e r a l , our EDBS concurrency problem may be s t a t e d as f o l l o w s : to ensure c o r r e c t n e s s of data maintained by EDBS when the system i s used c o n c u r r e n t l y under MTS. This chapter provides a host of background m a t e r i a l r e q u i r e d t o understand the problem. F i r s t , we d e s c r i b e EDBS from the user's p o i n t of view and we compare EDBS both i n gen e r a l and i n terms of concurrency c o n t r o l to those systems which have been d e s c r i b e d i n Chapter I I . Second, we d i s c u s s EDBS implementation of concurrency c o n t r o l . T h i s has s e v e r a l aspects. EDBS r e l i e s on the f i l e system to do some of the l o c k i n g r e g u i r e d by i t s concurrency c o n t r o l mechanism. EDBS has now been implemented with three d i f f e r e n t f i l e systems {i.e., APL*PLUS, CDC/APL, MTS/APLJ . Both the APL*PLUS and CDC/APL implementations are di s c u s s e d t o provide i n s i g h t i n t o the concurrency s o l u t i o n which was i n i t i a l l y i ntended f o r EDBS. The e x i s t i n g s o l u t i o n may be c o n s t r a i n e d by the CDC/APL f i l e system. I t i s our hope to devise a s o l u t i o n which i s independent of previous f i l e systems. T h i r d , we d e s c r i b e EDBS concurrency c o n t r o l f a c i l i t i e s using P a p a d i m i t r i o u ' s t r a n s a c t i o n model as presented i n Chapter I I . F i n a l l y , we provide a l i s t of key problems r e l a t e d to running EDBS c o n c u r r e n t l y under MTS. The reader i s cautioned that i n t h i s chapter when we use the term EDBS we are r e f e r r i n g to a s e t of APL f u n c t i o n s which may be used to perform o p e r a t i o n s on databases. 46 Ao EDBS D e s c r i p t i o n EDBS r e c o g n i z e s three types of u s e r s : system a d m i n i s t r a t o r s , database a d m i n i s t r a t o r s , and common users.. A system a d m i n i s t r a t o r i s r e s p o n s i b l e f o r i n s t a l l i n g , m a i n t a i n i n g , and modifying EDBS. A database a d m i n i s t r a t o r (DBA) i s r e s p o n s i b l e f o r c r e a t i n g , maintaining and g r a n t i n g access t o one or more databases.. A common user i s one who has been given p e r m i s s i o n by a DBA t o access a database. A DBA may a l s o be a common user. EDBS i s comprised of a number of APL workspaces. One workspace c o n t a i n s a l l the u t i l i t i e s r e q u i r e d by the system a d m i n i s t r a t o r to c a r r y out h i s r o l e (such as those r e q u i r e d to i n s t a l l the system). Other workspaces are intended f o r use by DBAs and c o n t a i n u t i l i t i e s f o r c r e a t i n g , d e s t r o y i n g , g r a n t i n g access t o , and maintaining databases. A l l t hree data models — h i e r a r c h i c a l , network, and r e l a t i o n a l — are supported by EDBS. . A separate system (workspace) implements each data model. These workspaces are intended f o r common users. The reader i s r e f e r r e d to Appendix A f o r an overview of EDBS workspaces and u t i l i t i e s . Although there are many s i m i l a r i t i e s among the commands a v a i l a b l e f o r i n t e r a c t i n g with the d i f f e r e n t types of databases, each system has i t s own data manipulation language. There are four c a t e g o r i e s of commands a v a i l a b l e with each system: (1) R e t r i e v a l commands (2) M o d i f i c a t i o n commands (3) B u f f e r commands (4) C o n t r o l commands 47 Data i n a database i s not manipulated d i r e c t l y by a user. Rather, i t i s moved i n t o a b u f f e r v i a r e t r i e v a l commands where i t may be read or modified v i a b u f f e r commands. M o d i f i c a t i o n commands take new or updated data from the b u f f e r and place i t i n the database. Common to a l l t h r e e systems i s the concept of a HOLD, however the u n i t of h o l d i n g i s d i f f e r e n t f o r each system. As i n other systems (System R, IMS) EDBS supports two v e r s i o n s of each r e t r i e v a l command: GET and GET-HOLD. Both v e r s i o n s r e t r i e v e data. The GET-HOLD a l s o a c q u i r e s a held on the data r e t r i e v e d . I f a hol d i s not immediately p o s s i b l e EDBS r e t u r n s a s t a t u s code to the user t o advise him to t r y again l a t e r to hold the item. There are t h r e e c o n t r o l commands: OPEN and CLOSE f o r opening and c l o s i n g a database and RELEASE. . The BELEASE command provides a user with the c a p a b i l i t y of e x p l i c i t l y r e l e a s i n g a hold which he has a c g u i r e d by means of a pr e v i o u s GET-HOLD r e t r i e v a l command. EDBS does not provide i n t e g r i t y a s s e r t i o n s . Two mechanisms are provided t o recover from a system crash which may leave the database i n an i n c o n s i s t e n t s t a t e : The f i r s t i s the MAINTAIN u t i l i t y which massages the database i n t o a c o n s i s t e n t s t a t e through an i n t e r a c t i v e d i a l o g u e with the DBA. The second mechanism i s the l o g g i n g of a l l i n s e r t i o n s , d e l e t i o n s and updates. A DUMPLOG u t i l i t y i s provided f o r p r i n t i n g the logged i n f o r m a t i o n . Logged e n t r i e s may be erased v i a the EBASELOG 48 U t i l i t y . 1 The reader i s r e f e r r e d to Appendix B f o r a summary of a v a i l a b l e EDBS data manipulation commands and u t i l i t i e s . The o b j e c t i v e s of t h i s s e c t i o n are t w o - f o l d : (1) t o provide a f l a v o u r of how each system within EDBS d i f f e r s from those we have seen before (both i n g e n e r a l and i n terms of concurrency c o n t r o l ) . (2) to d e s c r i b e i n d e t a i l the EDBS concurrency c o n t r o l f a c i l i t i e s a v a i l a b l e at the user i n t e r f a c e f o r each system. As a means of comparison we choose System R f o r the EDBS r e l a t i o n a l system, IMS f o r the EDBS h i e r a r c h i c a l system and DBTG f o r the EDBS network system. 1• B e l a t i o n a 1 System The EDBS r e l a t i o n a l system p r o v i d e s simple r e t r i e v a l , g u a l i f i e d r e t r i e v a l , and r e t r i e v a l from more than one r e l a t i o n . In a d d i t i o n , a f u l l range of storage o p e r a t i o n s are provided f o r i n s e r t i n g , updating, and d e l e t i n g t u p l e s . Among other t h i n g s , the f o l l o w i n g c a p a b i l i t i e s of the SEQUEL data sublanguage are not supported by the EDBS r e l a t i o n a l DML: (a) o r d e r i n g on t u p l e s r e t r i e v e d (b) nested mapping (c) s p e c i f i c a t i o n of more than two r e l a t i o n s i n a r e t r i e v a l command (d) e x p l i c i t e l i m i n a t i o n of d u p l i c a t e s from a query i The MAINTAIN, DUMPLOG and ERASELOG u t i l i t i e s have not been implemented at UBC. 49 r e s u l t . Like SEQUEL, a l l EDBS storage o p e r a t i o n s are r e s t r i c t e d t o operate on a s i n g l e r e l a t i o n at a time. The EDBS approach to d e f i n i n g new r e l a t i o n s d i f f e r s from that of System R. With System R f a c i l i t i e s f o r data d e f i n i t i o n and data manipulation are provided i n a u n i f i e d manner by means of the SEQUEL operator. . A user may c r e a t e new base r e l a t i o n s and des t r o y them when they are no longer needed.. EDBS, on the other hand, d i s t i n g u i s h e s between data d e f i n i t i o n and data manipulation.. A separate u t i l i t y i s provided f o r d e f i n i n g new base r e l a t i o n s . Only a DBA (not a common user) may cre a t e and destroy base r e l a t i o n s . The method of i n t e r f a c i n g the data sublanguage and the host language i s s i m i l a r with EDBS and System R. R e c a l l t hat System R accomplishes the i n t e r f a c e by means of c u r s o r s which i d e n t i f y a c t i v e s e t s of t u p l e s . The t u p l e s may be read by a program by s p e c i f y i n g the a p p r o p r i a t e c u r s o r i n a FETCH command.. EDBS maintains a c t i v e s e t s of t u p l e s i n a b u f f e r . Tuples are a v a i l a b l e to an APL program by means of b u f f e r commands which r e q u i r e s p e c i f i c a t i o n of a r e l a t i o n name. The r e l a t i o n name serves the f u n c t i o n of a c u r s o r . The only concurrency c o n t r o l mechanism a v a i l a b l e with the EDBS r e l a t i o n a l system i s the HOLD o p t i o n . . The System R automatic l o c k i n g , m u l t i p l e l e v e l s of c o n s i s t e n c y and t r a n s a c t i o n d e f i n i t i o n f a c i l i t i e s are not supported. The HOLD option operates d i f f e r e n t l y between EDBS and System R: The u n i t of hold with EDBS i s one r e l a t i o n . The u n i t of hold with System R i s a s i n g l e t u p l e . 50 EDBS e n f o r c e s c e r t a i n r u l e s on use of the GET-HOLD which may be s t a t e d as f o l l o w s : (a) A user must hold a r e l a t i o n p r i o r t o updating i t . (b) A user may h o l d at most one r e l a t i o n a t one time, (c) Only b u f f e r commands (or the RELEASE command) may i n t e r v e n e between the GET-HOLD and UPDATE commands. P r i o r use of the GET-HOLD i s not r e q u i r e d f o r i n s e r t i o n or d e l e t i o n of t u p l e s . EDBS a u t o m a t i c a l l y holds the r e l a t i o n i n these i n s t a n c e s . 2- H i e r a r c h i c a l System The EDBS data d e s c r i p t i o n language f o r h i e r a r c h i c a l databases i s s i m i l a r to that of IMS. The most notable d i f f e r e n c e i s t h a t EDBS does not support sequencing of key values or key values c o n s i s t i n g of more than one f i e l d value. The EDBS data manipulation language f o r h i e r a r c h i c a l databases i s a l s o s i m i l a r to the IMS data sublanguage.. The GET UNIQUE, GET NEXT, GET NEXT WITHIN PARENT, INSERT, DELETE, and REPLACE ope r a t i o n s are a l l supported. These op e r a t i o n s may be g u a l i f i e d by means of comparison and Boolean o p e r a t o r s which comprise the e q u i v a l e n t of an IMS segment search argument. In terms of concurrency c o n t r o l , only the HOLD opt i o n i s supported by EDBS. The HOLD o p t i o n operates e x a c t l y as d e s c r i b e d f o r IMS (see Chapter I I ) . EDBS has no f a c i l i t y by which a user may hold mere than one segment occurrence at one time. R e c a l l t hat IMS provides t h i s f a c i l i t y by means of the PCB e x c l u s i v e option or by means of e x p l i c i t shared l o c k s . 51 EDBS enforces the f o l l o w i n g r u l e s on use of GET-HOLD r e t r i e v a l o p e r a t i o n s : (a) Except f o r a roo t segment, the parent-to-be of any segment which i s t o be i n s e r t e d must f i r s t be held v i a a GET-HOLD r e t r i e v a l command. (b) Any segment which i s to be modified or d e l e t e d must f i r s t be held v i a a GET-HOLD r e t r i e v a l command. (c) Only b u f f e r commands (or the RELEASE command) may i n t e r v e n e between a GET-HOLD r e t r i e v a l command and any m o d i f i c a t i o n ccmmand. 3. Network System The EDBS data d e s c r i p t i o n language f o r network databases i s a r a t h e r l i m i t e d v e r s i o n of the DBTG schema DDL.. Many of the DBTG f e a t u r e s concerned with the storage l e v e l of a database system are not supported by EDBS. Other DBTG o p t i o n s which may be s p e c i f i e d i n the schema are not supported by EDBS.. For example, the storage c l a s s f o r every member rec o r d type i n EDBS i s assumed to be MANUAL. That i s , when an occurrence of a member r e c o r d type i s f i r s t c r e a t e d and placed i n the database, i t i s not a u t o m a t i c a l l y i n s e r t e d as a member of any set occurrence. Furthermore, the removal c l a s s f o r every member reco r d type i n EDBS i s always OPTIONAL. T h i s means t h a t an occurrence of any member record type may e x i s t i n the database without being a member of any s e t occurrence. 52 EDBS does not maintain as many currency i n d i c a t o r s as DBTG. Since an EDBS database i s not d i v i d e d i n t o named a r e a s , t h e r e i s no need to maintain the cu r r e n t r e c o r d occurrence w i t h i n each area.. The most important DBTG currency of a l l -- the c u r r e n t of run - u n i t -- i s not maintained by EDBS, T h i s has obvious i m p l i c a t i o n s i n terms of the r e s p e c t i v e DMLs. . With DBTG v i r t u a l l y every o p e r a t i o n uses the c u r r e n t of r u n - u n i t which i s e s t a b l i s h e d v i a the FIND o p e r a t i o n . With EDBS o p e r a t i o n s are performed on segment occurrences e s t a b l i s h e d by means of other c u r r e n c i e s ( i . e . , r e c o r d ox set p o i n t e r s ) . The concurrency c o n t r o l f a c i l i t i e s provided by the EDBS network system bear no resemblance t o those proposed by the DBTG. R e c a l l t h a t the DBTG f i n a l r e p o r t (1971) does not provide f o r use of ho l d s . The remainder of t h i s s u b s e c t i o n d e s c r i b e s the use of holds i n the EDBS network system. F i v e p o s s i b l e m o d i f i c a t i o n a c t i o n s are a v a i l a b l e : (a) i n s e r t i o n of a new r e c o r d (b) d e l e t i o n of a r e c o r d (c) i n c l u s i o n of a rec o r d i n a set occurrence (d) removal of a r e c o r d i n a set occurrence (e) update of a r e c o r d c u r r e n t l y i n the database. The f o l l o w i n g r u l e s apply to use of holds i n the network system: (a) Before a r e c o r d may be updated i t must be placed i n h o l d . Upon completion of the update the hold i s a u t o m a t i c a l l y removed. (b) For INCLUDE and fiEMOVE the r e c o r d may o p t i o n a l l y be placed i n hold.. A RELEASE command must be 53 i s s u e d to r e l e a s e the hold , (c) For d e l e t i o n the r e c o r d may o p t i o n a l l y be placed i n h o l d . The hold i s a u t o m a t i c a l l y r e l e a s e d upon completion of the DELETE command, (dj A user may have at most one r e c o r d occurrence i n ho l d at one time. There are c e r t a i n d i f f i c u l t i e s i n h e r e n t i n the EDBS concurrency c o n t r o l scheme f o r network databases.. As with DBTG the user i s net completely i s o l a t e d from the a c t i v i t i e s of other concurrent u s e r s . The EDBS User's Guide a d v i s e s t h a t \"the e x i s t e n c e of concurrent users n e c e s s i t a t e s t h a t c a u t i o n must be e x e r c i s e d when d e a l i n g with record or set p o i n t e r s \" . . A user may be a f f e c t e d i n one of the f o l l o w i n g ways: (a) A r e c o r d or set p o i n t e r may mark the p o s i t i o n of a re c o r d that has been d e l e t e d by another user. I f the l o c a t i o n remains empty and the p o i n t e r i s r e f e r e n c e d an e r r o r c o n d i t i o n w i l l r e s u l t . I f a new re c o r d i s s t o r e d i n the vacant l o c a t i o n the p r i o r d e l e t i o n w i l l be undetectable. (b) A set p o i n t e r may mark the p o s i t i o n of a member r e c o r d that has been removed from a set occurrence by another user. I f the p o i n t e r i s re f e r e n c e d an e r r o r c o n d i t i o n w i l l r e s u l t . 54 B.. EDBS Implementation EDBS was designed by the Systems Research Group at U n i v e r s i t y of Toronto. The implementation was dependent on a v a i l a b i l i t y of the APL*PLUS f i l e system and used o v e r l a y techniques to f i t the system i n t o a s m a l l workspace.. The U n i v e r s i t y of Calgary i n s t a l l e d EDBS on t h e i r C o n t r o l Data Computer (CDC). Overlay f u n c t i o n s were e l i m i n a t e d and the system was converted to r e l y on the CDC/APL f i l e system. With the recent i n t r o d u c t i o n of the MTS/APL f i l e system i t became f e a s i b l e to i n s t a l l EDBS at UBC. The d e c i s i o n was t o use the CDC/APL v e r s i o n of EDBS. The major requirement of the implementation was c o n v e r s i o n of the f i l e system. The major c o n s t r a i n t was that the system i t s e l f should not be modified. The g e n e r a l approach (suggested by Y a i r Wand, F a c u l t y of Management, U n i v e r s i t y of Calgary) was to provide an i n t e r f a c e f i l e system between EDBS and the MTS/APL f i l e system. The i n t e r f a c e f i l e system i s intended to simulate the CDC/APL f i l e system using the a v a i l a b l e MTS/AFI f u n c t i o n s . The f o l l o w i n g diagram r e p r e s e n t s the r e l a t i o n s h i p between EDBS, the i n t e r f a c e f i l e system and the MTS/APL f i l e system. INTEBFACE FILE SYSTEM I n t e r f a c e F i l e System >] MTS/APL I < j F i l e System F i g u r e 2 55 The f u n c t i o n s a v a i l a b l e from the i n t e r f a c e f i l e system are e x a c t l y those provided by the CDC/APL f i l e system. . When EDBS c a l l s a p a r t i c u l a r CDC/APL f u n c t i o n i t a c t u a l l y c a l l s an i n t e r f a c e f u n c t i o n which performs the task t h a t would have been performed by the CDC f u n c t i o n and furthermore r e t u r n s to EDBS e x a c t l y what i t expected when i t c a l l e d the CDC f u n c t i o n . The CDC and MTS APL f i l e systems d i f f e r p r i m a r i l y i n t h e i r f i l e s h a r i n g f a c i l i t i e s ( i . e . , access a u t h o r i z a t i o n and concurrency c o n t r o l ) . With these e x c e p t i o n s , s i m u l a t i o n of the CDC/APL f i l e system using the a v a i l a b l e MTS APL f i l e system was a s t r a i g h t forward t a s k . The EDBS approach to concurrency c o n t r o l r e q u i r e s s e t t i n g l o g i c a l l o c k s on such o b j e c t s as r e l a t i o n s , segments and re c o r d s and p h y s i c a l l o c k s on f i l e s . EDBS manages i t s own l o g i c a l l o c k s and r e l i e s on the f i l e system t c set p h y s i c a l l o c k s . One exception t o t h i s i s the r e l a t i o n a l system of the APL*PLUS v e r s i o n of EDBS which does not r e l y on the f i l e system f o r l o c k i n g . The c o n v e r s i o n at U n i v e r s i t y o f Calgary from APL*PLUS to CDC/APL i n v o l v e d e xtensive m o d i f i c a t i o n of the l o c k i n g mechanism used by the r e l a t i o n a l system. M o d i f i c a t i o n of the h i e r a r c h i c a l and network systems was f a r l e s s e x t e n s i v e i n v o l v i n g only the p h y s i c a l l e v e l of l o c k i n g . A r e s u l t of .modifications made at U n i v e r s i t y of Calgary was t h a t the l o c k i n g s t r a t e g i e s used by a l l three systems are now s i m i l a r ( i . e . , the r e l a t i o n a l system was modified t o r e l y on the CDC/APL f i l e system j u s t as the h i e r a r c h i c a l and network systems do) . 5 6 In t h i s s e c t i o n we d e s c r i b e the e x i s t i n g EDBS implementation along a number of dimensions. F i r s t , we examine data usage i n gen e r a l by reviewing'EBBS f i l e s and a u t h o r i z a t i o n at the f i l e system l e v e l . Second, we examine implementation of concurrency c o n t r o l i n the APL*PLUS v e r s i o n of EDBS. T h i r d , we look at major d i f f e r e n c e s i n the CDC/APL v e r s i o n . F i n a l l y , we consider EDBS under MTS. i 1.. EDBS F i l e s and A u t h o r i z a t i o n This s e c t i o n d e s c r i b e s EDBL. f i l e s and access reguirements at the f i l e system l e v e l . Access r e s t r i c t i o n s are a l s o imposed by EDBS i t s e l f . A l l of these f i l e s .ire created by v a r i o u s DBA u t i l i t i e s except f o r the message f . l l e , DBA l o g f i l e and b u f f e r f i l e s . (See Appendix C f o r a g r a p h i c a l r e p r e s e n t a t i o n of EDBS f i l e usage.) (a) Message F i l e : The message f i l e c o n t a i n s e r r o r messages used by v a r i o u s EDBS u t i l i t i e s . There ; i s one message f i l e per EDBS i n s t a l l a t i o n . A l l users have read access t o t h i s f i l e . Only the system a d m i n i s t r a t o r i s permitted t o modify t h i s f i l e . . The message f i l e i s taken from the EDBS d i s t r i b u t i o n tape. (b) T r a n s l a t i o n Table F i l e : The primary purpose of the t r a n s l a t i o n t a b l e i s to provide a d i r e c t o r y of a l l the databases administered by one DBA.. There i s one t r a n s l a t i o n t a b l e per DBA. A l l users have read 57 access to t h i s f i l e . Only the DBA has complete access to t h i s f i l e . (c) Record Table F i l e : The record t a b l e s t o r e s i n f o r m a t i o n about the p h y s i c a l aspects of r e l a t i o n s , segment ty p e s , and r e c o r d types. There i s one r e c o r d t a b l e per DBA. A l l users have read access t o t h i s f i l e . Only the DBA has complete access to t h i s f i l e . (d) F i e l d Table F i l e : The f i e l d t a b l e c o n t a i n s i n f o r m a t i o n on the p h y s i c a l c h a r a c t e r i s t i c s of domains, f i e l d s , and data items. There i s one f i e l d t a b l e per DBA. A l l users have read access to t h i s f i l e . Only the DBA has complete access to t h i s f i l e . (e) Log F i l e : The l o g f i l e i s used t o keep t r a c k of who has the database open and to keep a log of database m o d i f i c a t i o n s . There i s one l o g f i l e per database. A l l users have read/write access to t h i s f i l e . (f) Record F i l e : The r e c o r d f i l e i s used to s t o r e record occurrences (segment occurrences or t u p l e s ) , i n v e r t e d l i s t s , and l o c k s used to synchronize s h a r i n g of the r e c o r d type, segment type or r e l a t i o n . There i s one r e c o r d f i l e per record type, segment type, or r e l a t i o n . A l l users have read/w r i t e access to t h i s f i l e . 58 (gj B u f f e r F i l e : The b u f f e r f i l e i s used to hold data r e t r i e v e d from the database. The b u f f e r f i l e i s c r e a t e d i n an OPEN command and destroyed i n a CLOSE command. This f i l e can only be accessed by the user who owns i t . (h) DBA Log F i l e : The DBA l o g f i l e i s used t o keep t r a c k o f who the DBAs are. I t c o n s i s t s of e x a c t l y one component, a v e c t o r of EEA account numbers. There i s one DBA l o g f i l e per EDBS i n s t a l l a t i o n . A l l users have read/write access to t h i s f i l e . The DBA l o g f i l e i s created by the system a d m i n i s t r a t o r . A database c o n s i s t s of n + 1 f i l e s where n i s the number of d i f f e r e n t r e l a t i o n s , segment types, or record types i n the database ( i . e . , n record f i l e s and one l o g f i l e ) . Access t o reco r d f i l e s i s r e s t r i c t e d v i a the GRANT ACCESS u t i l i t y at the database l e v e l , r e c o r d type l e v e l or r e c o r d occurrence l e v e l . With the e x c e p t i o n of these r e s t r i c t i o n s any user has rea d / w r i t e access t o a l l database f i l e s . A l l users have re a d / w r i t e access to the DBA l o g f i l e because any common user i s p o t e n t i a l l y a DBA and the procedure f o r c r e a t i n g DBAs r e g u i r e s t h a t the vector of DBA account numbers be read and updated from the new DBA account. C o n t r o l over a u t h o r i z a t i o n of DBAs i s maintained as f o l l o w s : U t i l i t i e s f o r c r e a t i n g DBA schema t a b l e s and updating the DBA log are st o r e d in the system a d m i n i s t r a t o r ' s l i b r a r y . . Only i f a common user has the account of the SA w i l l he be .able to loa d the 59 procedure r e q u i r e d t o i n s t a l l h i m s e l f as a DBA. F u r t h e r , the SA must permit the a p p r o p r i a t e workspace f o r read to those users who are to become DBAs. 2.. Concurrency c o n t r o l with APL*PL0S The APL*PLUS v e r s i o n of EDBS makes use of three f a c i l i t i e s to implement the h i e r a r c h i c a l and network l o c k i n g mechanism: a f i l e i n t e r l o c k f u n c t i o n , a hold l i s t and a time out f e a t u r e . The f i l e i n t e r l o c k f u n c t i o n i s provided by the APL*PLUS f i l e system and permits a user to e x p l i c i t l y l o c k and unlock f i l e s f o r e x c l u s i v e use. S e v e r a l users r e q u e s t i n g an i n t e r l o c k on a f i l e are queued i n the order of t h e i r requests. I f a user does not p l a c e an i n t e r l o c k on a f i l e then he w i l l be able to access the f i l e even though another user has an i n t e r l o c k on the f i l e . In order t h a t a user i s guaranteed e x c l u s i v e use of a f i l e , a l l users must place an i n t e r l o c k on the f i l e p r i o r t o a c c e s s i n g i t . A hold l i s t i s a l i s t of a l l users c u r r e n t l y having a h o l d on a segment occurrence of a given type. A l l segment occurrences of a given type are s t o r e d i n one f i l e ( i . e . , the r e c o r d f i l e f o r the segment t y p e ) . A hold l i s t i s s t o r e d as the f i r s t component of each r e c o r d f i l e . The h o l d . l i s t c o n t a i n s zero or more t r i p l e s of the form (user ID, l o g i c a l index, time stamp). The l o g i c a l index i d e n t i f i e s the h e l d segment, the user ID i d e n t i f i e s the h o l d i n g user, and the time stamp i s the time at which the hold was s e t . . 60 When a user requests a hold on a segment, the hold l i s t f o r the segment type i s checked. I f the segment i s not c u r r e n t l y h e l d , then the hold i s granted by adding a new t r i p l e to the hold l i s t . I f the segment i s alre a d y i n hold a s t a t u s code i s re t u r n e d t o i n d i c a t e t h a t a hold i s not p o s s i b l e . The time-out f e a t u r e i s used t o prevent a segment occurrence from being u n a v a i l a b l e f o r an e x c e s s i v e p e r i o d of time. A hold on a segment i s guaranteed to be i n e f f e c t f o r only f i v e minutes. I f another user requests a hold on the segment a f t e r the time l i m i t has e x p i r e d then the p r i o r hold i s i removed. \" i During the time a segment i s i n hold , no other user may place a hold i n the same segment and thus 'no m o d i f i c a t i o n command a f f e c t i n g the segment may be executed. However, other users may s t i l l r e t r i e v e the h e l d segment at any time. During a c t u a l m o d i f i c a t i o n of a segment, access to the segment type i s r e s t r i c t e d to the user performing the update. This r e s t r i c t i o n i s accomplished by p l a c i n g an i n t e r l o c k on the segment type f i l e . . ; Since a l l users must place an i n t e r l o c k on a f i l e p r i o r t o a c c e s s i n g i t f o r the i n t e r l o c k to be e f f e c t i v e , both r e t r i e v a l s and m o d i f i c a t i o n s take p l a c e under the p r o t e c t i o n of an i n t e r l o c k . To avoid unnecessary d e l a y s the i n t e r l o c k i s pl a c e d on a f i l e only a f t e r a l l pr o c e s s i n g not r e g u i r i n g access to the f i l e has been completed and the i n t e r l o c k i s removed as soon as access to the f i l e i s complete. The exact, lock p r o t o c o l f o l l o w e d by EDBS i s not c l e a r from a v a i l a b l e documentation f o r the APL*PLUS v e r s i o n . 6 1 As an i n t e g r i t y measure EDBS a l s o s e t s i t s own lock on a f i l e d u r i n g m o d i f i c a t i o n s . T h i s lock occupies the second component of each r e c o r d type f i l e and i s a two element v e c t o r . The f i r s t element of the l o c k v e c t o r i s 0 i f the r e c o r d type i s not locked and 1 i f i t i s . The second element of the l o c k i vector i s 0 when no l o c k i s i n e f f e c t and the time the lock was set when the record type i s l o c k e d . The l o c k v e c t o r i s checked on every r e t r i e v a l and every m o d i f i c a t i o n . O r d i n a r i l y , i f a user does not i n t e r r u p t a m o d i f i c a t i o n the f i r s t element w i l l always be r e s e t t o 0 when the m o d i f i c a t i o n completes. However, i f a m o d i f i c a t i o n i s i n t e r r u p t e d and i s not resumed then the f i r s t element of the lock w i l l remain set to 1 to i n d i c a t e t h a t the f i l e i s i n an i n c o n s i s t e n t s t a t e . In t h i s event, f u r t h e r m o d i f i c a t i o n s t o the r e c o r d type are prevented, but r e t r i e v a l s are s t i l l a llowed. In e i t h e r case an 'INTEGRITY CHECK' message i s p r i n t e d each time the f i l e i s accessed u n t i l the DBA c o r r e c t s the s i t u a t i o n . * The APL*PLUS v e r s i o n of EDBS does not r e l y on the f i l e system t o implement l o c k i n g i n the r e l a t i o n a l system. Rather, the system i t s e l f maintains f o r each r e l a t i o n a reader l i s t and a l o c k v e c t o r to implement the l o c k i n g mechanism., 1 T h i s paragraph i s taken from documentation compiled by the Computer Systems Research Group, U n i v e r s i t y of Toronto. i 62 A l l t u p l e s from the same r e l a t i o n are s t o r e d i n one f i l e (the r e c o r d f i l e ) . The reader l i s t and l o c k vector f o r a r e l a t i o n are s t o r e d i n the f i r s t two components of the r e c o r d f i l e . A reader l i s t i s a matrix c o n t a i n i n g user IDs f o r those users c u r r e n t l y r e a d i n g the r e l a t i o n along with the time t h a t each read was s t a r t e d . A lock v e c t o r i s a two element numeric v e c t o r . The f i r s t element of the lock v e c t o r i s 2 i f the r e l a t i o n i s being modified, 1 i f the r e l a t i o n i s being h e l d and 0 i f the r e l a t i o n i s n e i t h e r being h e l d nor modified. The second element of the l o c k vector i s the time t h a t the hold or m o d i f i c a t i o n was s t a r t e d . The r u l e s which must be enforced are the f o l l o w i n g : (a) at most one user may hold a r e l a t i o n a t one time. (b) a r e l a t i o n may not be modified when another user has i t i n h o l d {c) no other user may access a r e l a t i o n while one user i s modifying i t . These r u l e s are enforced v i a the reader l i s t and l o c k vector as f o l l o w s : (a) For reading ( i . e . , the GET command) EDBS waits f o r the r e l a t i o n to be unlocked f o r m o d i f i c a t i o n or i n hold and then e n t e r s the user number and time onto the reader l i s t . (b) For holds { i - e . , the GET-HOLD command) EDBS waits f o r the r e l a t i o n t o be unlocked f o r m o d i f i c a t i o n or unheld and then e n t e r s a 1 i n t o the lock v e c t o r . (c) For m o d i f i c a t i o n s EDBS waits f o r the r e l a t i o n to 63 be unlocked and unheld, w r i t e s a 2 i n t o the lock vector and waits f o r the reader l i s t to be empty. To ensure that one user w i l l not t i e up a r e l a t i o n by never r e l e a s i n g a hold or by i n t e r r u p t i n g a read, both holds and reads are timed out a f t e r f i v e minutes. Thus, i f a user i s recorded as having h e l d a r e l a t i o n f o r more . than f i v e minutes h i s h o l d w i l l be dropped i f another user requests t o h o l d or modify the r e l a t i o n . S i m i l a r i l y , any row of the reader l i s t w i l l be ignored i f i t i s more than f i v e minutes o l d . 1 I f a r e l a t i o n m o d i f i c a t i o n command i s i n t e r r u p t e d then the r e l a t i o n may be t o t a l l y l ocked f o r any l e n g t h of time. T h i s lock i s not timed out because the r e l a t i o n f i l e may be i n an i n c o n s i s t e n t s t a t e . Instead, an 'INTEGRITY CHECK' e r r o r message i s p r i n t e d and n e i t h e r holds nor m o d i f i c a t i o n s are a l l o w e d . 2 i 3. Concurrency c o n t r o l with CDC/APL The CDC/APL implementation of l o c k i n g f o r h i e r a r c h i c a l and I network databases d i f f e r s from the APL*PLUS implementation mainly i n use of the i n t e r l o c k f u n c t i o n (which ' i s u n a v a i l a b l e with the CDC/APL f i l e system). Two f e a t u r e s of the CDC/APL system are used t o accomplish the i n t e r l o c k : a f i l e t i e f u n c t i o n and an e r r o r t r a p p i n g f u n c t i o n . i i 1 and 2 These two paragraphs are taken from documentation compiled by the Systems Research Group at the U n i v e r s i t y of Toronto. f i 64 The f i l e t i e f u n c t i o n (FTIE) takes as arguments a v e c t o r of i n t e g e r s and a matrix of f i l e names. A user must t i e a f i l e p r i o r to a c c e s s i n g i t . When a f i l e i s t i e d , i t i s made known to the user workspace and the i n t e g e r provided i s a s s o c i a t e d with the f i l e . T h e r e a f t e r , the user r e f e r e n c e s the f i l e by i t s number r a t h e r than by i t s name. A f i l e may be t i e d i n read/modify mode cr i n write mode.1 When a user t i e s a f i l e i n read/modify mode he i s i n d i c a t i n g that he wishes only to read the f i l e and t h a t he does not mind i f another user modifies the f i l e at the same time. When a user t i e s a f i l e i n w r i t e mode he i s i n d i c a t i n g t h a t he i i wishes to modify the f i l e and t h a t he does not want other users to modify the f i l e at the same time. Any number 1of users may be t i e d to a f i l e i n read/modify mode at the same time. Only one user a t a time may have a f i l e t i e d i n w r i t e mode., I f a second user attempts to t i e a f i l e f o r write he r e c e i v e s an e r r o r message. The f i l e system a l s o p r o v i d e s a f u n c t i o n (FUNTIE) f o r u n t i e i n g f i l e s . The CDC e r r o r t r a p p i n g f u n c t i o n takes as an argument a l o c a t i o n i n the program to which a branch w i l l ' occur i n the event of a system e r r o r . An u n s u c c e s s f u l t i e f o r write may be detected (and a p p r o p r i a t e a c t i o n taken) by means of the e r r o r t r a p p i n g f u n c t i o n . The f u n c t i o n of an i n t e r l o c k i s achieved v i a the f i l e t i e and e r r o r t r a p p i n g f u n c t i o n s i n the f o l l o w i n g manner: 1 The CDC/APL f i l e system a l s o permits a f i l e to be t i e d i n read mode which al l o w s other users t o read (but not modify) the f i l e at the same time. T h i s o p t i o n i s not used by EDBS. 65 (a) A f i l e i s normally t i e d i n read/modify'mode. In f a c t , the OPEN command t i e s a l l database f i l e s f o r read/modify. (b) Immediately p r i o r t o any wr i t e o p e r a t i o n the f i l e i s u n t i e d and an attempt i s made to t i e the f i l e i n write mode. i (c) I f the t i e f o r write succeeds the write o p e r a t i o n i s performed. Otherwise, EDBS waits f o r one minute and t r i e s again to t i e the f i l e f o r wr i t e . (d) I f a s u c c e s s f u l t i e f o r write i s not p o s s i b l e w i t h i n f i v e minutes EDBS give s up and r e t u r n s a s t a t u s code to i n d i c a t e an u n s u c c e s s f u l l o c k . . (e) Immediately a f t e r performing a write o p e r a t i o n the f i l e i s unti e d and t i e d again i n read/modify mode. The CDC/APL v e r s i o n of EDBS ( i n c o n t r a s t with the APL*PLUS version) does r e l y on the f i l e system to implement l o c k i n g i n the r e l a t i o n a l system. The approach i s s i m i l a r t o t h a t f o r the h i e r a r c h i c a l and network systems. The f i l e t i e and e r r o r t r a p p i n g f u n c t i o n s (not a l o c k vector) are used to e f f e c t l o c k i n g f o r m o d i f i c a t i o n . Component one of each r e c o r d f i l e c o n t a i n s a hold l i s t (not a reader l i s t ) . Since a user holds the whcle r e l a t i o n , a hold v e c t o r c o n s i s t s of e x a c t l y two elements. Element 1 i s the user number of the user c u r r e n t l y h o l d i n g the r e l a t i o n . Element 2 i s the time a t which the hold was s t a r t e d . . I f the r e l a t i o n i s not i n hold the hold vector i s ( 0 , 0 ) . . i 66 The f o l l o w i n g procedure i s used by a l l three systems t o avoid l o s t updates.. (Suppose t h a t an EDBS data manipulation command reads and w r i t e s v a r i a b l e X from f i l e F) : Loop: Read X from F T i e F f o r WRITE I f t i e f a i l s go to Wait Write X t o F T i e F f o r READ/MCDIFY Stop ; Wait: Delay 1 minute T i e f o r READ/MODIFY Go to Loop This procedure amounts to back out of previous o p e r a t i o n s . The read i s done and then l a t e r i f i t i s determined t h a t another user has the f i l e t i e d f o r w r i t e , the read i s ignored and redone. The procedure prevents l o s t updates because a gi v e n user's update i s made only i f another user was not updating the f i l e between the given user's read and update. | There are occasions when a user must lock more than one f i l e at a time. As an example, c o n s i d e r the DELETE command i n the h i e r a r c h i c a l system which t r i g g e r s d e l e t i o n from s e v e r a l r e c o r d f i l e s . The CDC/APL v e r s i o n of EDBS handles t h i s s i t u a t i o n by attempting t o t i e f o r write a l l r e q u i r e d r e c o r d f i l e s b e fore updating any of them. I f any of the f i l e s cannot immediately be t i e d f o r write EDBS u n t i e s any t h a t i t has s u c c e s s f u l l y t i e d , waits and then t r i e s again t o t i e them a l l f o r w r i t e . . Examination of EDBS APL code i n d i c a t e s t h a t whenever an EDBS o p e r a t i o n ( i . e . , GET, UPDATE, INSERT, DELETE, CREATEDBA, DESTROYDBA, GRANTACCESS) must modify data from more than one i f i l e , a l l f i l e s are t i e d f o r wr i t e before any of them are i m o d i f i e d . I 67 We conclude t h i s s u b s e c t i o n with a b r i e f d e s c r i p t i o n of the CDC f i l e access a u t h o r i z a t i o n f a c i l i t i e s . A user c o n t r o l s access to h i s f i l e s by means of a f i l e c ategory, password, and mode which may be s p e c i f i e d when the f i l e i s c r e a t e d from APL v i a the FCREATE f u n c t i o n . The f o l l o w i n g e x c e r p t s from the CDC APL r e f e r e n c e manual (1978) d e s c r i b e these f a c i l i t i e s . (a) The f i l e category i s o r d i n a r i l y p r i v a t e . P r i v a t e f i l e s cannot be accessed by other users. A f i l e may, a l t e r n a t i v e l y , be assigned a category of se m i p r i v a t e or p u b l i c . E i t h e r of these c a t e g o r i e s allows other users t o access the f i l e i f they know the password, the name of the f i l e , and the user number under which i t was s t o r e d . (b) The f i l e can be given a password. Only users who know the password can use the f i l e . (c) The f i l e mode i s used to c o n t r o l the type of o p e r a t i o n another user can perform. For f i l e s c r e a t e d by APL ( i n c l u d i n g workspaces) other users are o r d i n a r i l y allowed to read the f i l e (assuming the password and category do not exclude them) but.are not allowed to a l t e r or destroy the f i l e . Other users can be given permission t o a l t e r the f i l e by s p e c i f y i n g the WR o p t i o n ( f o r write) when the f i l e i s c r e a t e d . 68 A f i l e c a t egory, password, and mode may a l s o be r e a s s i g n e d or changed o u t s i d e the APL environment.. EDBS uses the f i l e category and mode- to c o n t r o l access to f i l e s . The password o p t i o n i s not used f o r t h i s purpose. 4.. EDBS Under MTS In t h i s s e c t i o n we examine the MTS f a c i l i t i e s f o r access a u t h o r i z a t i o n and concurrency c o n t r o l and we look i n t o the d e t a i l s of s i m u l a t i n g the corresponding CDC/APL f u n c t i o n s v i a these f a c i l i t i e s . We conclude t h i s s e c t i o n with a d e s c r i p t i o n of the EDBS s t a t e at which our problem s t a r t s — a s t a t e i n which a l l CDC f a c i l i t i e s are a v a i l a b l e v i a MTS f u n c t i o n s with the e x c e p t i o n of concurrency c o n t r o l which i s provided v i a the MTS automatic l o c k i n g mechanism. We say that our problem s t a r t s here because t h i s s t a t e provides very l i t t l e concurrency, i n c u r s the r i s k of deadlock and j e o p a r d i z e s database c o r r e c t n e s s . The MTS/APL f i l e system a l l o w s users to perform o p e r a t i o n s on f i l e s s t o r e d o u t s i d e the APL environment. The f i l e system c o n s i s t s o f a set of f u n c t i o n s which may be executed from APL or from w i t h i n an APL f u n c t i o n . To use the f i l e system a user takes a copy of i t (from a p u b l i c l i b r a r y or the SA's account) i n t o h i s a c t i v e workspace. Among o t h e r s , the f o l l o w i n g MTS op e r a t i o n s are p o s s i b l e from w i t h i n APL: CREATE, DESTROY, READ, WRITE, EMPTY, LOCK, UN1K, RENAME, and RENUMBER., 69 A subset of the c a p a i s i l i t i e s provided by the MTS PERMIT command i s a v a i l a b l e v i a the MTS/APL SHARE command. T h i s command allows a user t o e x p l i c i t l y permit other u s e r s to read from or write i n t o h i s f i l e s . The concurrency c o n t r o l f a c i l i t i e s a v a i l a b l e are b a s i c a l l y those p r o v i d e d by MTS which i n c l u d e three l e v e l s of l o c k i n g , automatic l o c k i n g , and e x p l i c i t l o c k i n g . The f o l l o w i n g e x c e r p t s from UBC F i l e s and Devices (1976) and UBC Commands (1977) d e s c r i b e these f a c i l i t i e s : Three c l a s s e s of l o c k i n g are provided f o r maintaining i n t e g r i t y of a shared f i l e : l o c k f o r reading,' lock f o r m o d i f i c a t i o n , and lock f o r d e s t r u c t i o n . The three l o c k i n g c l a s s e s are i n c l u s i v e i n the sense t h a t l o c k i n g a f i l e f o r m o d i f i c a t i o n a l s o l o c k s i t f o r r e a d i n g and l o c k i n g a f i l e f o r d e s t r u c t i o n a l s o l o c k s i t f o r r e a d i n g and m o d i f i c a t i o n . The r u l e s f o r concurrent use of a f i l e among separate t a s k s are the f o l l o w i n g : (a) Any number of tasks can have a f i l e locked f o r rea d i n g at the same time as long as no other task has the f i l e l o c k e d f o r m o d i f i c a t i o n or d e s t r u c t i o n . . I (b) Only one task can have a f i l e l o cked f o r m o d i f i c a t i o n at one time, and then only i f no other task has the f i l e locked f o r re a d i n g or d e s t r u c t i o n . > (c) Only one task can have a f i l e l o c k e d f o r t d e s t r u c t i o n at one time, and then o n l y i f no other task has the f i l e open or locked f o r 70 reading or m o d i f i c a t i o n . P i l e s are i m p l i c i t l y ( a u t o m a t i c a l l y ) locked and unlocked by MTS whenever a user reguests something of MTS which r e q u i r e s l o c k i n g . . For example, i f a user i s o p e r a t i n g from the subroutine l e v e l , 1 on the f i r s t c a l l to a subroutine to read or write a l i n e from a f i l e , MTS w i l l i m p l i c i t l y l o c k the f i l e f o r reading (or m o d i f i c a t i o n ) and l e a v e the f i l e locked i n t h a t manner u n t i l use of that f i l e i s complete.. In g e n e r a l , the l o c k i n g i s a s s o c i a t e d with the FDUB ( f i l e or device usage block) a s s o c i a t e d with the f i l e and automatic unlock i n g i s done when the FDUB i s r e l e a s e d . When MTS i m p l i c i t l y l o c k s a f i l e through a p a r t i c u l a r FDUB, i t may r a i s e 2 the l o c k i n g c l a s s but i t w i l l never lower i t . . For example, i f a user o p e r a t i n g from the subroutine l e v e l f i r s t w r i t e s a l i n e t o a f i l e and then reads a l i n e from the same f i l e , MTS w i l l i m p l i c i t l y l o c k the f i l e f o r m o d i f i c a t i o n before the f i r s t write and le a v e i t locked f o r m o d i f i c a t i o n t h e r e a f t e r . . I f , while attempting t o lock a f i l e MTS determines t h a t a c c o r d i n g t o the concurrent use r u l e s i t cannot loc k the f i l e as reguested, MTS i m p l i c i t l y and a u t o m a t i c a l l y attempts t o gueue the task to wait u n t i l the f i l e can be locked. . Before doing so, however, MTS checks to ensure t h a t gueuing the job to wait on 1 We d i s t i n g u i s h between the MTS command l e v e l and the MTS subroutine l e v e l . The f a c i l i t i e s a v a i l a b l e from w i t h i n APL are those at the subroutine l e v e l . 2 Lock f o r d e s t r u c t i o n i s s a i d to be.at a higher l e v e l than l o c k f o r m o d i f i c a t i o n which i s , i n t u r n , at a higher l e v e l than l o c k f o r r e a d i n g . 7 1 the f i l e w i l l not r e s u l t i n a s i t u a t i o n wherein the c u r r e n t task as w e l l as others w i l l be deadlocked i n d e f i n i t e l y i n t h e i r r e s p e c t i v e queues. I f such i s the case, MTS w i l l not allow the task to wait but i n s t e a d w i l l r e t u r n an e r r o r i n d i c a t i o n . F i l e s can a l s o be e x p l i c i t l y locked and unlocked from wit h i n APL ( v i a the LOCK and UNLK commands). In a d d i t i o n , an option may be s p e c i f i e d t o request t h a t no waiting be done i f l o c k i n g cannot be accomplished. I m p l i c i t l o c k s may be removed from w i t h i n APL v i a the FRELEASE command which a l s o c l o s e s the f i l e . . Otherwise, i m p l i c i t l o c k s w i l l remain i n e f f e c t u n t i l the user leaves APL. . E x p l i c i t l o c k s may be r e l e a s e d e x p l i c i t l y v i a the UNLK command (or the FRELEASE command which a l s o c l o s e s the f i l e ) . . A s p e c i a l f u n c t i o n a v a i l a b l e t o EDBS (and not part of the APL f i l e system) a l s o permits i m p l i c i t l o c k s t o be e x p l i c i t l y removed without c l o s i n g the f i l e . T h i s concludes our d e s c r i p t i o n of the MTS/APL f i l e system. We now t u r n our a t t e n t i o n to d e t a i l s of the s i m u l a t i o n . . A d e s c r i p t i o n of the simulated f u n c t i o n s i s a v a i l a b l e i n Appendix D.. We fo c u s here on the important ones: MTSCREATE, MTSTIE, MTSUNTIE, and TRAP 1. The MTSCREATE f u n c t i o n s i m u l a t e s the CDC FCREATE f u n c t i o n . The syntax of the FCREATE f u n c t i o n i s as f o l l o w s : •file-name[:password ] [ / o p t i o n s ] ' FCREATE fnum j 72 The l i s t of options can i n c l u d e any of the f o l l o w i n g : DA, C, WE, S, or PU to s p e c i f y d i r e c t access, coded, write mode, se m i p r i v a t e or p u b l i c . Since a l l EDBS f i l e s are d i r e c t a ccess f i l e s and no EDBS f i l e s are coded f i l e s , these o p t i o n s were not simulated ( i . e . , f i l e s are a u t o m a t i c a l l y d i r e c t access and s t r u c t u r e d ) . Since EDBS does not use the password o p t i o n , i t was not simulated. MTSCEEATE accomplishes ithe f o l l o w i n g remaining f u n c t i o n s : | (a) An APL i n t e r n a l format f i l e i s created whose name i s given by file-name. (b) I f the S or WE options are s p e c i f i e d the f i l e i s permitted f o r read or write r e s p e c t i v e l y v i a the MTS/APL SHAEE f u n c t i o n . (c) The f i l e t i e number (fnum) i s i n s e r t e d i n t o a t a b l e of f i l e t i e numbers and the f i l e name i s i n s e r t e d i n the same r e l a t i v e p o s i t i o n i n a t a b l e of f i l e names. The MTSFTIE f u n c t i o n s i m u l a t e s the CDC FTIE f u n c t i o n whose syntax i s the f o l l o w i n g : •[•account] file-name [:password ] [ / o p t i o n s ] ' FTIE fnum The password o p t i o n i s not used by EDBS and not simulated. The l i s t of options -- ED f o r read, EM f o r read/modify, and no opt i o n f o r write — are not easy to simulate. We leave t h i s task. f o r f u r t h e r examination. MTSFTIE accomplishes the f o l l o w i n g remaining t a s k s : (a) The account number i f provided i s concatenated to the f i l e name to produce the MTS name account:file-name. 73 (b) The f i l e i s t i e d . That i s , the f i l e number and filename t a b l e s are updated. 1 The MTS UNTIE f u n c t i o n s i m u l a t e s the CDC FUN TIE f u n c t i o n which takes one argument, a vector of f i l e t i e numbers.. The f u n c t i o n of MTSUNTIE i s to d e l e t e the f i l e t i e numbers and corresponding f i l e names from the f i l e number and f i l e name t a b l e s r e s p e c t i v e l y . The MTS TRAP 1 f u n c t i o n i s used i n place of the CDC e r r o r t r a p p i n g f u n c t i o n . The CDC e r r o r t r a p p i n g f u n c t i o n does.not appear p o s s i b l e t o si m u l a t e . The MTS TRAP 1 f u n c t i o n gets c a l l e d from EDBS but does no t h i n g . Given that a l l f a c i l i t i e s of the CDC/APL f i l e system are a v a i l a b l e v i a the i n t e r f a c e f i l e system with the exception of the CDC l o c k i n g and e r r o r t r a p p i n g c a p a b i l i t i e s , 1 the f o l l o w i n g problems can be expected when EDBS i s run under MTS i n a concurrent environment: (a) very l i t t l e concurrency (h) l o s s of database c o r r e c t n e s s (c) deadlock Very l i t t l e concurrency i s p o s s i b l e because MTS automatic l o c k s are not a u t c m a t i c a l l y r e l e a s e d u n t i l the user h o l d i n g the l o c k s s i g n s o f f from APL, The f o l l o w i n g examples i l l u s t r a t e t h i s p o i n t f o r the h i e r a r c h i c a l system: (a) When a user GET-HOLDs a segment, the hold vector 1 Each user has h i s own f i l e number t a b l e and f i l e name t a b l e which r e s i d e i n h i s a c t i v e workspace t o record f i l e s c u r r e n t l y t i e d and which are destroyed when he lea v e s APL. ' i 74 i n the r e c o r d f i l e must be checked and updated. When the hold vector i s updated the r e c o r d f i l e i s a u t o m a t i c a l l y locked by MTS f o r m o d i f i c a t i o n . T h e r e a f t e r , other concurrent users w i l l be prevented from r e t r i e v i n g segments from the rec o r d f i l e v i a e i t h e r the GET or GET-HOLD commands. (b) I f user A reads a segment from the r e c o r d f i l e , the f i l e i s a u t o m a t i c a l l y locked f o r read. Concurrent users are prevented from h o l d i n g any segment of the same type u n t i l user A l e a v e s APL because the record f i l e cannot be locked f o r m o d i f i c a t i o n to update the hold vector.. (c) A DBA may not c r e a t e a new database u n t i l a l l concurrent users who have read from the schema t a b l e s have signed o f f from APL. This i s because the schema t a b l e s may not be locked f o r m o d i f i c a t i o n by the DBA i f a common user has them lock e d f o r re a d i n g . Within the MTS APL environment, i f two users read from a f i l e and then both attempt to wr i t e i n t o the f i l e they w i l l be deadlocked each w a i t i n g f o r the other to r e l e a s e h i s shared l o c k before l o c k f o r m o d i f i c a t i o n can be accomplished. I f deadlock i s d etected by MTS an e r r o r message w i l l be passed back t o EDBS but EDBS w i l l not be able t o c o r r e c t l y i n t e r p r e t i t . Rather, EDBS w i l l continue operation as i f both w r i t e s had been s u c c e s s f u l . C o r r e c t n e s s of the database i s not maintained because an intended update has not been made. . 75 Co Problem Formulation EDBS has a l o g i c a l l e v e l of l o c k i n g and a p h y s i c a l l e v e l of l o c k i n g . The l o g i c a l l e v e l of l o c k i n g i s the GET-HOLD mechanism. P h y s i c a l l e v e l l o c k i n g i s t h a t done by the f i l e system. With the CDC/APL implementation, p h y s i c a l l o c k i n g would i n c l u d e the waiting and checking done by EDBS to determine i f a f i l e i s t i e d f o r writ e . At the l o g i c a l l e v e l , an e n t i t y i s a segment occurrence, r e c o r d occurrence or t u p l e . The granule f o r read i s a segment occurrence, r e c o r d occurrence or r e l a t i o n . The granule f o r m o d i f i c a t i o n i s a segment type, r e c o r d type or r e l a t i o n . L o g i c a l l e v e l l o c k i n g r e f e r s only to databases c r e a t e d by EDBS. Other data maintained by EDBS such as those contained i n the schema t a b l e s are never locked at a l o g i c a l l e v e l . A c t i o n s read, w r i t e , c r e a t e or destroy e n t i t i e s . In the r e l a t i o n a l system, f o r example, the only a c t i o n s are GET, PUT, UPDATE and DELETE. At the p h y s i c a l l e v e l e n t i t i e s are records and the granule i s a f i l e . Locking a p p l i e s to a l l data maintained by EDBS. An a c t i o n i s a f i l e system o p e r a t i o n f o r r e a d i n g , w r i t i n g , i n s e r t i n g or d e l e t i n g r e c o r d s . T h i s s e c t i o n d e s c r i b e s EDBS concurrency c o n t r o l f a c i l i t i e s i n terms of the b a s i c concepts presented i n chapter I I . F i r s t , we d e f i n e an EDBS t r a n s a c t i o n as our b a s i c unit of c o n s i s t e n c y . Second, we examine the EDBS lock p r o t o c o l both a t the l o g i c a l l e v e l and at the p h y s i c a l l e v e l . T h i r d , we compare the v a r i o u s EDBS implementations i n terms of l e v e l of c o n s i s t e n c y and amount of concurrency provided. F i n a l l y a l i s t of key problems 76 a s s o c i a t e d with running EDBS i n a concurrent MTS environment i s provided. The reader i s advised that the MTS/APL v e r s i o n of EDBS r e f e r r e d t o i n t h i s chapter i s the one which i s o p e r a t i o n a l i n an MTS environment as a s i n g l e user system. That i s , we assume that a l l f i l e system requirements have been simulated except f o r concurrency c o n t r o l . 1. I r a n s a c t i o n D e f i n i t i o n R e c a l l t hat a t r a n s a c t i o n i s d e f i n e d as a group of a c t i o n s by the same user which when executed alone transforms a c o n s i s t e n t database i n t o a new c o n s i s t e n t s t a t e . The f o l l o w i n g o p tions were c o n s i d e r e d f o r choosing an EDBS t r a n s a c t i o n : (a) APL t e r m i n a l s e s s i o n (b) APL program c o n t a i n i n g EDBS f u n c t i o n c a l l s (c) group of EDBS commands (d) s i n g l e EDBS ccmmand Before one of the above options can be s e l e c t e d , the terms database, e n t i t y , a c t i o n and user must be d e f i n e d i n the con t e x t of EDBS. EDBS i s used to c r e a t e and manipulate databases which comprise the r e c o r d f i l e s . In a d d i t i o n , EDBS maintains other data such as the schema t a b l e s and DBA l o g . Our concurrency problem i s concerned with maintaining c o r r e c t n e s s of a l l EDBS f i l e s ( not j u s t the r e c o r d f i l e s ) . Hence, we must d e f i n e a database as a l l . d a t a maintained by EDBS. 77 There are three users of EDBS: SAs r DBAs and common users. Only common users or DBAs who have become common users are database users ( i . e . , users of databases c r e a t e d by EDBS). Since our concurrency problem i s concerned with a l l EDBS data, we must d e f i n e a user as an EDBS user. There are s e v e r a l a l t e r n a t i v e s f o r d e f i n i n g an e n t i t y . For example, an e n t i t y c o u l d be a l o g i c a l item such as a segment, r e c o r d or t u p l e or an e n t i t y could be a p h y s i c a l item such as a f i l e or f i l e component. In a d d i t i o n , an e n t i t y c o u l d be something s m a l l e r than e i t h e r of these items such as an index value, f i e l d value or data-item. We s e l e c t a record i n a f i l e as an e n t i t y because at our l e v e l of concern (the f i l e system l e v e l ) these are the e n t i t i e s . A s i n g l e record i s used t o s t o r e a l l domains, f i e l d s or data-items w i t h i n a t u p l e , segment occurrence or r e c o r d occurrence. Thus, i n c e r t a i n c o n t e x t s we w i l l use the term e n t i t y to r e f e r to a t u p l e , segment or r e c o r d occurrence (and v i c e v e r s a ) . F i n a l l y , we must d e f i n e an a c t i o n . A c t i o n s read, w r i t e , c r e a t e or destroy e n t i t i e s . Any other computations are assumed to occur i n d i v i s i b l y with some a c t i o n . We choose as a c t i o n s the i n d i v i d u a l f i l e o p e r a t i o n s f o r r e a d i n g , w r i t i n g , i n s e r t i n g , d e l e t i n g r e c o r d s . There are two broad types of o p e r a t i o n s supported by EDBS: Data manipulation commands such as GET, PUT, UPDATE, DELETE and other u t i l i t i e s such as those f o r c r e a t i n g and d e s t r o y i n g DBAs, c r e a t i n g and d e s t r o y i n g databases and g r a n t i n g access t o databases.. Execution of each EDBS o p e r a t i o n generates a c e r t a i n 78 sequence of a c t i o n s . For example, c o n s i d e r the DELETE command i n the h i e r a r c h i c a l system which performs the f o l l o w i n g t a s k s : (a) Check schema t a b l e s to ensure t h a t the user has the necessary access r i g h t s . (b) Check b u f f e r f i l e t o ensure t h a t the segment i s being hel d . (c) Lock re c o r d f i l e s f o r segment to be d e l e t e d and f o r a l l descendant segments. {d) D e l e t e segment and i t s descendants, (e) Unlock record f i l e s . The a c t i o n s a s s o c i a t e d with the DELETE command i n c l u d e reads from the schema t a b l e s and d e l e t i o n s from the r e c o r d f i l e s . We do not c o n s i d e r read/write o p e r a t i o n s on the b u f f e r to be a c t i o n s because the b u f f e r f i l e s t o r e s only temporary i n f o r m a t i o n r e q u i r e d on a per user b a s i s . Locking done by the f i l e system i s a l s o not con s i d e r e d to be an a c t i o n . Locking done by EDBS ( i . e . , the hold mechanism) does generate an a c t i o n . An e n t i t y i n the re c o r d f i l e (the hold vector) i s updated. As another example, the EDBS u t i l i t y r e q u i r e d to c r e a t e a DBA c o n s i s t s of o p e r a t i o n s t o c r e a t e the schema t a b l e s and update the DBA log f i l e . A c t i o n s a s s o c i a t e d with CREATEDBA i n c l u d e c r e a t i o n of a number of e n t i t i e s i n the schema t a b l e s and update of an e n t i t y i n the DBA log ( i . e . , the vector of DBA account numbers). EDBS does not provide a mechanism whereby a c t i o n s by the same user may be grouped i n t o c o n s i s t e n c y p r e s e r v i n g u n i t s c a l l e d t r a n s a c t i o n s . ( R e c a l l the System R BEGIN-TRANS and END-TRANS o p e r a t o r s f o r t h i s purpose.) 79 We d e f i n e an EDBS t r a n s a c t i o n as a group ofiEDBS o p e r a t i o n s which have been c a r e f u l l y chosen by a user such t h a t the r e s u l t i n g seguence of a c t i o n s i s c o n s i s t e n c y p r e s e r v i n g . T h i s i s option (c) above. As noted, a problem with EDBS i n i t s c u r r e n t s t a t e i s that the concurrency c o n t r o l mechanism has no way of knowing where one t r a n s a c t i o n ends and another begins. Given t h a t i t i s p o s s i b l e t o support o p t i o n (c) f o r d e f i n i n g a t r a n s a c t i o n , i t would be a minor g e n e r a l i z a t i o n t o support option (b), an APL program c o n t a i n i n g EDBS commands. The d i f f i c u l t i e s which concerned INGEES designers when c o n s i d e r i n g t h i s option are not a concern here because EDBS does not support t r a n s a c t i o n backout. Hence, a d d i t i o n a l overheads f o r backout through system c a l l s i s not expected. I An APL t e r m i n a l s e s s i o n [ o p t i o n (a) ] i s not a s u i t a b l e u n i t f o r d e f i n i n g a t r a n s a c t i o n because i t would be ,too r e s t r i c t i v e of concurrency. For example, i f i t were deemed necessary t o support c o n s i s t e n c y l e v e l 3 of System E, the requirement f o r read r e p r o d u c i b i l i t y would mean t h a t another user could not modify an e n t i t y which has been read by a given user from the time the e n t i t y was read u n t i l the given user s i g n s o f f from APL. 1 An EDBS u t i l i t y could serve by i t s e l f as a t r a n s a c t i o n because i t i s c o n s i s t e n c y p r e s e r v i n g . The remainder of t h i s s u b s e c t i o n d i s c u s s e s why o p t i o n (d) , a s i n g l e EDBS data manipulation command, cou l d not serve as a t r a n s a c t i o n . . 80 EDBS data manipulation commands are not i n g e n e r a l c o n s i s t e n c y p r e s e r v i n g . Consider the REPLACE command i n the h i e r a r c h i c a l system which updates the segment obtained by the preceding GET-HOLD command. The REPLACE command cannot be considered to be a t r a n s a c t i o n because i t updates only one segment occurrence and i f a c o n s i s t e n c y c o n s t r a i n t r e l a t e s two segment occurrences, t h i s command when executed alone w i l l not be c o n s i s t e n c y p r e s e r v i n g . 2 . Lock ing P r o t o c o l Given t h a t we are able t o group a c t i o n s i n t o t r a n s a c t i o n s the next s t e p i s to provide a lock p r o t o c o l which guarantees that c o n c u r r e n t execution of t r a n s a c t i o n s w i l l r e s u l t i n s e r i a l i z a b l e schedules. The EDBS lock p r o t o c o l ( i . e . , the GET-HOLD mechanism) i s not s u f f i c i e n t to guarantee s e r i a l i z a b l e schedules. The major problem i s th a t the GET-HOLD command i s r e s t r i c t e d to operate on at most one r e l a t i o n , segment or re c o r d occurrence at one time. To see the problems a s s o c i a t e d with t h i s approach consi d e r the f o l l o w i n g example: Suppose that e x a c t l y one c o n s i s t e n c y c o n s t r a i n t a p p l i e s t o the database t h a t \"Segment S1 i s equal to Segment S 2 \" . Suppose a l s o t h a t two EDBS t r a n s a c t i o n s , T 1 and T 2 , update segments S1 and S2 a c c o r d i n g to the f o l l o w i n g schedule: { ( 1 1 , GET-HOLD, S 1 ) , ( T 1 , REPLACE, S 1 ) , ( T 2 , GET-HOLD, S 1 ) , ( T 2 , REPLACE, S 1 ) , ( 1 2 , GET-HOLD, S 2 ) , ( T 2 , REPLACE, S 2 ) , ( T 1 , GET-HOLD, S 2 ) , ( T 1 , REPLACE, S 2 ) } 8 1 As usual, we do not show b u f f e r commands. M o d i f i c a t i o n of the segment i n the b u f f e r i s assumed t o occur i n d i v i s i b l y with the REPLACE command. The above schedule i s not s e r i a l i z a b l e a c c ording t o Eswaran et a l . (1976) because T1 updates S1 before T2 does and T2 updates S2 before T1 does.. Thus the EDBS lock p r o t o c o l at the l o g i c a l l e v e l i s u n s u i t a b l e as a mechanism f o r c o n t r o l l i n g concurrency to maintain database c o r r e c t n e s s . EDBS p h y s i c a l l e v e l l o c k i n g appears t o serve two purposes: (a) to guarantee c o r r e c t n e s s of a s i n g l e EDBS op e r a t i o n (b) t o implement p a r t o f the l o g i c a l l e v e l l o c k i n g T h i s i s not an uncommon approach. R e c a l l the System R l o c k mechanism i n which p h y s i c a l l o c k s on pages are used t o ensure that o p e r a t i o n s at the R e l a t i o n a l Storage I n t e r f a c e give c o r r e c t r e s u l t s . A l s o , an o p t i o n i s a v a i l a b l e whereby p h y s i c a l l o c k i n g may be used to accomplish l o g i c a l l o c k i n g . With regard t o (a) above, the only guarantee made by the CDC/APL implementation i s that l o s t updates w i l l be prevented. The mechanism f o r doing so i s i l l u s t r a t e d i n the f o l l o w i n g example: (Suppose we have two t r a n s a c t i o n s , T1=£R1,W1} and T2=£R2,W2}, such that S(R1) =S(W1)=S (R2)=S (W2) = £x) .) An update w i l l be l o s t with e i t h e r of the f o l l o w i n g schedules: S1= {S1[X ]R2£X ]W1[ X ]W2£X ]} S2= {R1£X]R2£X]W2£X]W1£X]} R e c a l l t h a t these schedules cannot be s e r i a l i z a b l e a c c o r d i n g t o P a p a d i m i t r i o u (1979) because i n S1, T1 i s dead and 82 i n S2, T2 i s dead and there i s no s e r i a l schedule i n which e i t h e r t r a n s a c t i o n would be dead. CDC/EDBS does not allow e i t h e r of the above schedules t o occur. We assume that t i e f o r wr i t e occurs i n d i v i s i b l y with the wr i t e . In e i t h e r schedule a c c o r d i n g to the lock p r o t o c o l the second t r a n s a c t i o n t o attempt to write w i l l redo h i s read. The s e r i a l i z a b i l i t y component of EDBS w i l l transform the input s c h e d u l e s , S1 and S2, i n t o the r e s p e c t i v e output schedules 51 (output) = {B1£X]B2[X]W1£X ]B2[ X ]W2£X ]} 52 (output) = (E1[X ]E2£X ]W2[X ]B1[X ]W1[X ]} Since the reads which are redone are ignored the e f f e c t i v e output schedules are the s e r i a l schedules 51 ( e f f e c t i v e ) = {E1£ X ]W1[ X ]R2£X ]W2£X ]} 52 ( e f f e c t i v e ) = (B2£ X ] W2£ X ]B 1[ X ]W 1 [ X ]} With regard t o (b) above, the CDC/APL implementation does not p r o v i d e the l e v e l of i s o l a t i o n from other users o r i g i n a l l y intended f o r the system. B e c a l l t h a t the APL*PLDS implementation guarantees t h a t during execution of any m o d i f i c a t i o n command i n v o l v i n g a r e c o r d (segment, r e l a t i o n ) the user w i l l have e x c l u s i v e use of the rec o r d type (segment type, r e l a t i o n ) . E x c l u s i v e use of a f i l e , r e c o r d type or otherwise, i s never guaranteed with the CDC/APL implementation. B e c a l l t h a t the CDC/APL lock p r o t o c o l i s such t h a t when one user i s w r i t i n g to a f i l e other users may read from the f i l e . . The CDC/APL implementation does ensure that during any m o d i f i c a t i o n command other users w i l l be prevented from modifying the record type 83 (segment type,. r e l a t i o n ) . 3- L e v e l of Consistency The System R m u l t i p l e l e v e l s o f c o n s i s t e n c y i s concerned with the view that a t r a n s a c t i o n has of the database. Even though a c e r t a i n seguence of a c t i o n s i s not c o n s i s t e n c y p r e s e r v i n g , i t may have a c o n s i s t e n t view of the database. In t h i s s u b s e c t i o n , the System R l e v e l s of c o n s i s t e n c y w i l l be a p p l i e d to i n d i v i d u a l EDBS op e r a t i o n s as a means of comparing the APL*PLUS and CDC/APL v e r s i o n s of EDBS and to de s c r i b e c o n s i s t e n c y a v a i l a b l e when EBBS r e l i e s s o l e l y on MTS automatic l o c k i n g f o r concurrency c o n t r o l . I t i s f e l t that the System R notion of c o n s i s t e n c y i s useful, as an i n d i c a t o r of EDBS c o n s i s t e n c y even though we are a p p l y i n g t h i s n o t i o n , not at the l e v e l of a t r a n s a c t i o n as i n System R, but at the l e v e l of one EDBS o p e r a t i o n . R e c a l l the System R l e v e l s of c o n s i s t e n c y : L e v e l 1 - l e a s t i s o l a t i o n from other users - a t r a n s a c t i o n may read uncommitted data L e v e l 2 - a t r a n s a c t i o n reads only committed data - read r e p r o d u c i b i l i t y i s not guaranteed L e v e l 3 - complete i s o l a t i o n frcm other users - a t r a n s a c t i o n reads o n l y committed data - read r e p r o d u c i b i l i t y i s guaranteed The APL*PLUS v e r s i o n of EDBS appears to provide EDBS ope r a t i o n s with something between c o n s i s t e n c y l e v e l s 2 and 3,. Read r e p r o d u c i b i l i t y i s guaranteed, assuming t h a t a . given f i l e i s i n t e r l o c k e d at most once wit h i n the same o p e r a t i o n . However, an o p e r a t i o n may read uncommitted data because f i l e i n t e r l o c k s 84 do not appear t o be held t o the end of an o p e r a t i o n . The EDBS op e r a t i o n s under CDC/APL achieve l e v e l 1 c o n s i s t e n c y . Since the read modify t i e permits other users to write i n t o the f i l e at the same time, read r e p r o d u c i b i l i t y cannot be guaranteed. F u r t h e r , an o p e r a t i o n may read uncommitted data, again because a t i e f o r write does not prevent other users from reading ( r e g a r d l e s s of when the t i e f o r wr i t e i s released) . What l e v e l of c o n s i s t e n c y can we expect with MTS automatic l o c k i n g ? Every e n t i t y read by an op e r a t i o n w i l l be share l o c k e d . Every e n t i t y w r i t t e n w i l l be locked f o r e x c l u s i v e use. A l l l o c k s w i l l be held to the end of each o p e r a t i o n . Read r e p r o d u c i b i l i t y w i l l be guaranteed because no other o p e r a t i o n can modify an e n t i t y which i s locked f o r read, and an op e r a t i o n w i l l read only committed data because read i s not p o s s i b l e once an e n t i t y i s modified by some other user u n t i l that user r e l e a s e s the f i l e c o n t a i n i n g the e n t i t y . T h i s i s l e v e l 3 c o n s i s t e n c y . Of course, the p r i c e paid f o r t h i s l e v e l of c o n s i s t e n c y i s that very l i t t l e concurrency w i l l be provided. 4.. Amount of Concurrency Amount of concurrency r e f e r s t o the amount of i n t e r l e a v i n g of a c t i o n s from d i f f e r e n t t r a n s a c t i o n s . 85 With EDBS, i n t e r l e a v i n g of a c t i o n s from d i f f e r e n t users i s r e s t r i c t e d at a l o g i c a l l e v e l v i a the GET-HOLD mechanism. I n t e r l e a v i n g i s f u r t h e r r e s t r i c t e d by p h y s i c a l l e v e l l o c k i n g . In t h i s s u b s e c t i o n , we compare the r e l a t i v e amounts of concurrency o f f e r e d by the APL*PLUS, CDC/APL and MTS APL v e r s i o n s of EDBS at the l e v e l of i n d i v i d u a l EDBS o p e r a t i o n s ( i . e . , i n t e r l e a v i n g of a c t i o n s from operat The APL*PL0S v e r s i o n supports l e s s concurrency that the CDC/APL v e r s i o n because the CDC/APL f i l e system permits i n t e r l e a v i n g of reads and wri t e s by d i f f e r e n t users to the same f i l e , whereas the APL*PIUS implementation does not permit i n t e r l e a v i n g of any a c t i o n s by d i f f e r e n t users i n v o l v i n g the same f i l e . To c l a r i f y , suppose e n t i t i e s X, Y, and Z are s t o r e d i n the same f i l e and c o n s i d e r the f o l l o w i n g seguences of a c t i o n s w i t h i n schedules S1, S2 and S3: S1={. .. E1[X]E2[Y]E1[Z] ...} S2= {. . . . W1[ X ]R2[ Y ]W1[ Z ] ...} S3={... W1[ X ]W2[ Y ]W1£.Z ] ...} The CDC/APL l o c k p r o t o c o l would permit the sequence of a c t i o n s shown i n schedules S1 and S2 but not S3. The APL*PLUS lock p r o t o c o l would not allow the sequence of a c t i o n s shown i n any of these schedules. 86 The amount of concurrency expected with EDBS under MTS l i e s somewhere between that of the APL*PLUS and CDC/APL v e r s i o n s . MTS supports i n t e r l e a v i n g of reads ( but not reads and writes) by diff-erent users i n v o l v i n g the same f i l e . That i s , schedule S1 but not schedules S2 and S3 would be p o s s i b l e responses of the MTS/APL lock p r o t o c o l . 5. Key Problems This s u b s e c t i o n i d e n t i f i e s key problems which must be s o l v e d before EDBS i s o p e r a t i o n a l i n a concurrent MTS environment: (a) EDBS does not support a b a s i c u n i t of c o n s i s t e n c y . (b) The lock p r o t o c o l at the l o g i c a l l e v e l i s not s u f f i c i e n t t o guarantee s e r i a l i z a b l e schedules. (c) Very l i t t l e concurrency i s provided due to MTS automatic l o c k i n g . (d) C o r r e c t n e s s may be destroyed i f a deadlock s i t u a t i o n should a r i s e at the p h y s i c a l l e v e l . 87 Chapter IV: Recommendations In t h i s chapter we present s e v e r a l a l t e r n a t i v e approaches to p e r m i t t i n g concurrent use of the E d u c a t i o n a l Data Base System. The o b j e c t i v e of each approach i s to guarantee c o r r e c t n e s s of data maintained by EDBS. We assume that BEGIN-TRANS and END-TRANS operato r s are a v a i l a b l e as a means of d e f i n i n g t r a n s a c t i o n s . We conclude t h i s chapter by recommending an approach and s p e c i f y i n g s u p p o r t i n g r o u t i n e s . . A. A l t e r n a t i v e s In t h i s s e c t i o n we i n v e s t i g a t e three a l t e r n a t i v e s . A l t e r n a t i v e s 1 and 2 are s i m i l a r i n that they both recommend r e l y i n g on the f i l e system to do l o g i c a l l e v e l l o c k i n g . The two a l t e r n a t i v e s d i f f e r i n t h e i r proposed l o c k p r o t o c o l s . T h i s has i m p l i c a t i o n s i n terms of freedom from deadlock and the need f o r t r a n s a c t i o n backout. A l t e r n a t i v e 3 proposes not to r e l y on the f i l e system to do l o g i c a l l e v e l l o c k i n g . T h i s approach would re q u i r e e x t e n s i v e m o d i f i c a t i o n of EDES and i s not discussed here i n d e t a i l . A l t e r n a t i v e : _1 The l o c k p r o t o c o l i s d e s c r i b e d as f o l l o w s : (a) A t r a n s a c t i o n has a l o c k i n g phase and an unlocking phase. (b) The l o c k i n g phase i s provided by the MTS automatic l o c k i n g f a c i l i t i e s ( i . e . , Every e n t i t y read by a 88 t r a n s a c t i o n w i l l be locked f o r read immediately p r i o r to the read a c t i o n . Every e n t i t y w r i t t e n w i l l be locked f o r m o d i f i c a t i o n immediately p r i o r to the write a c t i o n . The l o c k c l a s s f o r an e n t i t y may be r a i s e d w i t h i n a t r a n s a c t i o n but i t w i l l never be lowered). (c) A l l l o c k s are a u t o m a t i c a l l y r e l e a s e d as the l a s t s tep i n the t r a n s a c t i o n . This c o n s t i t u t e s the unl o c k i n g phase of the t r a n s a c t i o n . T h i s lock p r o t o c o l i s not deadlock f r e e . T r a n s a c t i o n backout procedures are proposed f o r deadlock r e s o l u t i o n . This approach assumes t h a t , except f o r deadlock, the MTS automatic, lock p r o t o c o l ensures s e r i a l i z a b l e schedules. The proposed l o c k p r o t o c o l appears t o be a s i m p l i f i e d v e r s i o n of that used by System R l e v e l 3 t r a n s a c t i o n s ( i . e . , without i n t e n t i o n l o c k i n g ) . The proposal to r e l y on the f i l e system t o do l o c k i n g means t h a t the a c t u a l granule locked w i l l be one f i l e ( i . e . , a segment type, record type, r e l a t i o n ) . Hence, when we say t h a t a t r a n s a c t i o n l o c k s an e n t i t y we mean that the t r a n s a c t i o n l o c k s the f i l e c o n t a i n i n g the e n t i t y . The f o l l o w i n g example i l l u s t r a t e s how a l t e r n a t i v e 1 might work: Suppose t h a t each of two t r a n s a c t i o n s attempt t o update segments S1 and S2. Suppose a l s o t h a t segments S1 and S2 are of d i f f e r e n t types. The f o l l o w i n g seguence of oper a t i o n s w i l l r e s u l t i n deadlock: {(T1, GET-HOLD, S1), ( T i , REPLACE, S 1) , (T2, GET-HOLD, S2) , (T2, REPLACE, S2), (TI, GET-HOLD, S2), (T2, GET-HOLD, S1)} 89 T1 w i l l be waiting t o lock the record f i l e to read the hold v e c t o r of segment type S2. S i m i l a r l y , T2 w i l l be w a i t i n g to l o c k the re c o r d f i l e to read the hold vector f o r segment type S1. N e i t h e r t r a n s a c t i o n w i l l be able to lock i t s r e s p e c t i v e record f i l e because the other t r a n s a c t i o n i s holding an e x c l u s i v e l o c k on the f i l e , obtained during the previous GET-HOLD o p e r a t i o n . MTS w i l l d e t e c t t h i s deadlock s i t u a t i o n and pass a message back to one of the t r a n s a c t i o n s i n d i c a t i n g t h at the read (of the hold vector) has been u n s u c c e s s f u l . T h i s deadlock message could be a s i g n a l f o r t r a n s a c t i o n backout. In our example, i f T1 r e c e i v e s the deadlock message then the p r i o r update of segment S1 by TI should be undone. The remainder of t h i s s u b s e c t i o n d i s c u s s e s implementation c o n s i d e r a t i o n s f o r t r a n s a c t i o n backout: EDBS maintains a database l o g c o n t a i n i n g a l i s t of a l l database m o d i f i c a t i o n s . A rec o r d of the user who performed the m o d i f i c a t i o n i s not c u r r e n t l y maintained. The e x i s t i n g l o g g i n g procedure could be e a s i l y modified t o r e c o r d t h i s i n f o r m a t i o n . A l s o , a rec o r d of the time at which the m o d i f i c a t i o n was s t a r t e d would be u s e f u l . T r a n s a c t i o n backout would i n v o l v e undoing database m o d i f i c a t i o n s as recorded i n the l o g . The BEGIN-TEANS operator could r e c o r d the time at which the t r a n s a c t i o n was s t a r t e d as a g l o b a l v a r i a b l e s t o r e d i n the user's a c t i v e work space. I f i t becomes necessary to back the t r a n s a c t i o n out, database m o d i f i c a t i o n s recorded i n the log belonging t o t h i s user c o u l d be undone so long as the time at which the m o d i f i c a t i o n was made 90 i s l a t e r than the time at which the t r a n s a c t i o n was s t a r t e d . The System R p r i n c i p l e of i s o l a t e d backout w i l l be necessary. Hence, the lock p r o t o c o l must be such that data modified by a given t r a n s a c t i o n i s not modified by any other u n t i l the given t r a n s a c t i o n t e r m i n a t e s . I f l o c k s are r e l e a s e d as p a r t of the END-TRANS operator, i s o l a t e d backout w i l l be guaranteed. A l o g of m o d i f i c a t i o n s made to EDBS data, other than t h a t c o n t a i n e d i n the re c o r d f i l e s , i s not c u r r e n t l y maintained by EDBS ( i . e . , m o d i f i c a t i o n s to schema t a b l e s and the DBA l o g ) . These m o d i f i c a t i o n s c o u l d be logged and DBA u t i l i t i e s c ould then be handled l i k e any other t r a n s a c t i o n s (bracketed by BEGIN-TRANS and END-TRANS o p e r a t o r s ) . A l t e r n a t i v e : 2 — — — — — — — The l o c k p r o t o c o l i s d e s c r i b e d as f o l l o w s : (a) A l l e n t i t i e s r e g u i r e d by a t r a n s a c t i o n are a u t o m a t i c a l l y locked i n the r e g u i r e d mode i n one step at the beginning of the t r a n s a c t i o n . (b) I f the l o c k i n g mode i s to be r a i s e d w i t h i n the t r a n s a c t i o n ( i . e . , from read mode to write mode) then the highest l e v e l l o c k i n g mode i s requested. (c) A l l l o c k s are a u t o m a t i c a l l y r e l e a s e d at the end of the t r a n s a c t i o n . 91 T h i s a l t e r n a t i v e has the advantage t h a t t r a n s a c t i o n backout procedures f o r deadlock r e s o l u t i o n are not r e g u i r e d . The l o c k p r o t o c o l i s v i r t u a l l y i d e n t i c a l to that proposed by INGEES d i f f e r i n g only i n i t s implementation d e t a i l s . R e c a l l that INGRES avoids deadlock by r e g u i r i n g t h a t an i n t e r a c t i o n l o c k s a l l r e g u i r e d e n t i t i e s before proceeding. The l o c k i n g phase of the lock p r o t o c o l could be implemented v i a MTS/APL e x p l i c i t l o c k f a c i l i t i e s as part of the BEGIN-TRANS op e r a t o r . S i m i l a r l y , the END-TRANS operator could a u t o m a t i c a l l y r e l e a s e a l l l o c k s held by the t r a n s a c t i o n . As with A l t e r n a t i v e 1 the pro p o s a l i s to r e l y on the f i l e system to do a l l l o c k i n g . Hence, the a c t u a l granule locked w i l l be one f i l e . The event i n which a t r a n s a c t i o n cannot loc k a l l r e g u i r e d e n t i t i e s w i l l be r e a l i z e d i n the form of deadlock. I f lock requests deadlock, then the t r a n s a c t i o n which r e c e i v e s the MTS deadlock messaqe c o u l d r e t u r n a s t a t u s code to the user to i n d i c a t e t h a t he should attempt to run h i s t r a n s a c t i o n at a l a t e r time and a l l l o c k s c u r r e n t l y h e l d by the t r a n s a c t i o n c o u l d be r e l e a s e d . Since the t r a n s a c t i o n has not yet executed any a c t i o n s , t r a n s a c t i o n backout i s not r e q u i r e d . A l t e r n a t i v e 2 would be p r a c t i c a l l y impossible t o implement i n an EDBS context because i t would be d i f f i c u l t to determine l o c k i n g requirements at the beqinning of a t r a n s a c t i o n . Consider the f o l l o w i n g problems: (a) The whole t r a n s a c t i o n would have to be analyzed p r i o r to execution t o determine which f i l e s w i l l be r e q u i r e d . 92 (b) Within a s i n g l e EDBS data manipulation command or u t i l i t y , more than one f i l e may be r e g u i r e d . There i s no way to determine the f i l e s r e g u i r e d by examining the name of the command or u t i l i t y . (c) S i m i l a r l y , the mode of lock cannot be determined from knowledge of the command or u t i l i t y . (d) I t i s o f t e n i m p o s s i b l e to determine e i t h e r the d e s i r e d mode of lock or the f i l e s r e g u i r e d because these depend on the r e s u l t s of previous a c t i o n s . A s o l u t i o n t o these problems might be to assume the worst. Hence, f o r a l l t r a n s a c t i o n s composed by common users we cou l d assume that read access to the schema t a b l e s i s r e g u i r e d . F u r t h e r , we could assume t h a t every t r a n s a c t i o n r e g u i r e s w r i t e access t o a l l database f i l e s . Hence, the n record f i l e s and the database l o g f i l e c o u l d be loc k e d f o r m o d i f i c a t i o n and the schema t a b l e s c o u l d be locked f o r read at the beginning of each t r a n s a c t i o n . DBA u t i l i t i e s do not r e q u i r e access to database f i l e s . . A BEGIN-TRANS operator s p e c i f i c f o r DBAs could l o c k the DBA l o g f i l e and the schema t a b l e s f o r e x c l u s i v e use. Both a l t e r n a t i v e s 1 and 2 are r e s t r i c t i v e of concurrency because l o c k s are h e l d on f i l e s r a t h e r than on i n d i v i d u a l e n t i t i e s . A l t e r n a t i v e 2 i s more r e s t r i c t i v e than A l t e r n a t i v e 1 due to the requirement that a l l l o c k s must be requested at the beginning of each t r a n s a c t i o n . The amount of concurrency provided by A l t e r n a t i v e 2 w i l l be f u r t h e r r e s t r i c t e d by implementation requirements i f we choose t o make worst case 93 assumptions about the f i l e s r e g u i r e d by a t r a n s a c t i o n . A major advantage of a l t e r n a t i v e 2 over a l t e r n a t i v e 1 i s t h a t t r a n s a c t i o n backout procedures f o r deadlock r e s o l u t i o n are not r e g u i r e d . A l t e r n a t i v e : 3 A l t e r n a t i v e 3 d i f f e r s from pr e v i o u s approaches i n t h a t we do not r e l y on the f i l e system t o do l o g i c a l l e v e l l o c k i n g . F u r t h e r , we provide the user with l o c k i n g f a c i l i t i e s which he w i l l use as part of a lock p r o t o c o l to ensure database c o r r e c t n e s s . We do not d i s c u s s here the s p e c i a l f a c i l i t i e s which a DBA may need. We focus only on l o c k i n g reguirements f o r a common user. A l t e r n a t i v e 3 proposes t h a t the user be provided with the c a p a b i l i t y of h o l d i n g more than one l o g i c a l item at one time. For s i m p l i c i t y , a l o c k p r o t o c o l c o u l d be enforced which s t a t e s that a user must hold a l l l o g i c a l items which he i n t e n d s to modify a t the beginning of h i s t r a n s a c t i o n . C o n s i s t e n t with the e x i s t i n g approach, a s t a t u s code c o u l d be returned to the user to i n d i c a t e an u n s u c c e s s f u l h o l d i f any of the items requested cannot be immediately placed i n h o l d . T h i s very simple l o c k p r o t o c o l avoids deadlock and hence deadlock d e t e c t i o n and t r a n s a c t i o n backout procedures are not r e g u i r e d . 94 The f a c i l i t i e s proposed are b a s i c a l l y those of the IMS GET-HOLD mechanism.. E e c a l l that IMS does not enforce the r e s t r i c t i o n (as does EDES) th a t a user may have at most one l o g i c a l e n t i t y i n h o l d a t one time. Since IMS does not enforce the simple loc k p r o t o c o l which we have proposed, user programs may deadlock, and backout procedures are provided. When a l o g i c a l item ( i . e . , segment, r e c o r d , r e l a t i o n ) i s a c t u a l l y being modified i n the database i t must be locked f o r m o d i f i c a t i o n . With the e x i s t i n g approach, EDBS manages i t s own l o c k s f o r read ( i . e . , holds) and r e l i e s on the f i l e system t o lock f o r m o d i f i c a t i o n . A l t e r n a t i v e 3 delegates a d d i t i o n a l r e s p o n s i b i l i t y to EDBS f o r lock management. A lock f o r each segment, rec o r d or r e l a t i o n i n the database i s already maintained i n the form of a hold l i s t . A code c o u l d be added to each (user ID, l o g i c a l index, time stamp) t r i p l e i n the hold l i s t t o d i s t i n g u i s h between a read lock and a write l o c k . A d d i t i o n a l procedures w i l l be r e g u i r e d t o s e t and r e l e a s e l o c k s f o r m o d i f i c a t i o n . A l t e r n a t i v e 3 w i l l r e q u i r e that MTS automatic l o c k s be r e l e a s e d a t some a p p r o p r i a t e time. MTS l o c k s could be r e l e a s e d immediately a f t e r the a c t i o n r e s p o n s i b l e f o r s e t t i n g them or, a l t e r n a t i v e l y , automatic l o c k i n g c o u l d remain i n e f f e c t f o r the dur a t i o n of a s i n g l e EDBS o p e r a t i o n . . C e r t a i n EDBS storage o p e r a t i o n s may i n i t i a t e procedures t o reo r g a n i z e i n v e r t e d l i s t b l o c k s . T h i s provides at l e a s t one motivation f o r maintaining f i l e l o c k s f o r the d u r a t i o n of each i n d i v i d u a l EDBS o p e r a t i o n . R e c a l l t hat System E p h y s i c a l l e v e l l o c k i n g was motivated by s i m i l a r requirements. 95 A l t e r n a t i v e 3 o f f e r s a g r e a t e r amount of concurrency than a l t e r n a t i v e s 1 and 2 f o r the f o l l o w i n g reasons: (1) l o g i c a l items ( i . e . , a segment) may be locked r a t h e r than r e q u i r i n g that an e n t i r e f i l e ( i . e . , the segment type) be l o c k e d . (2) The user has i n f o r m a t i o n about h i s t r a n s a c t i o n which he can i n c o r p o r a t e i n t o h i s l o c k p r o t o c o l to i n c r e a s e concurrency. For example, a user knows when he i s f i n i s h e d with an e n t i t y and he may e x p l i c i t l y RELEASE i t f o r use by other users. A major disadvantage of a l t e r n a t i v e 3 i s t h a t database c o r r e c t n e s s i s not guaranteed unless a l l users employ an a p p r o p r i a t e l o c k p r o t o c o l . In t h i s s e c t i o n we have proposed three a l t e r n a t i v e s which may be summarized as f o l l o w s : A l t e r n a t i v e s 1 and 2 make two proposals i n common: F i r s t , to r e l y on the f i l e system to do both l o g i c a l and p h y s i c a l l e v e l l o c k i n g . Second, to a u t o m a t i c a l l y lock a l l e n t i t i e s r e g u i r e d by a t r a n s a c t i o n thus f r e e i n g the user of the r e s p o n s i b i l i t y f o r any p a r t i c u l a r l o c k i n g p r o t o c o l . A l t e r n a t i v e 1 recommends r e l y i n g on MTS automatic l o c k i n g f o r the lock phase of i t s lock p r o t o c o l . The l o c k p r o t o c o l admits deadlock and t r a n s a c t i o n backout procedures are proposed to r e s o l v e i t . A l t e r n a t i v e 2 recommends t h a t e n t i t i e s r e q u i r e d by a t r a n s a c t i o n be a u t o m a t i c a l l y locked v i a MTS e x p l i c i t l o c k i n g at the beginning of a t r a n s a c t i o n . The lock p r o t o c o l i s such t h a t i f deadlock occurs, any t r a n s a c t i o n i n v o l v e d i n the deadlock has not yet executed any a c t i o n s and hence t r a n s a c t i o n backout procedures are not r e g u i r e d . 96 A l t e r n a t i v e 3 proposes to pr o v i d e a d d i t i o n a l f a c i l i t i e s f o r l o c k i n g l o g i c a l items and to leave most of the r e s p o n s i b i l i t y f o r l o c k p r o t o c o l s up to the user. Dependence upon the f i l e system t o do l o g i c a l l e v e l l o c k i n g i s c o n t r a - i n d i c a t e d . B. S e l e c t e d Approach A l t e r n a t i v e 2 ( i . e . , automatic l o c k i n g of a l l e n t i t i e s at the beginning of a t r a n s a c t i o n and r e l e a s e of a l l l o c k s at the end of a t r a n s a c t i o n ) i s chosen l a r g e l y due to i t s implementation s i m p l i c i t y . Two types of BEGIN-TRANS and END-TRANS operators are proposed. These opera t o r s w i l l not be s t o r e d as part of the i n t e r f a c e f i l e system. Rather, the work spaces intended f o r common users w i l l c o n t a i n BEGIN-TRANS and END-TRANS o p e r a t o r s f o r common users. S i m i l a r l y , work spaces intended f o r DBAs w i l l c o n t a i n BEGIN-TRANS and END-TRANS o p e r a t o r s f o r DBAs. The SA i s not provided with t r a n s a c t i o n d e f i n i t i o n f a c i l i t i e s because h i s f u n c t i o n i s to set up the system i n i t i a l l y and to modify the system i f r e g u i r e d . We do not expect him to be using EDBS c o n c u r r e n t l y with EBAs and common users. The proposed s o l u t i o n makes use of the f o l l o w i n g i n f o r m a t i o n about EDBS: (1) Common users do not modify i n f o r m a t i o n i n the schema t a b l e s . (2) EDBS data manipulation commands do not r e q u i r e access t o the DBA log f i l e . (3) DBA u t i l i t i e s do not r e q u i r e access to database f i l e s . 97 (4) Database f i l e s {i.e., r e c o r d f i l e s and database log) remain t i e d when.EDBS r e t u r n s c o n t r o l to the user so long as the database i s open. The remainder of t h i s s e c t i o n provides s p e c i f i c a t i o n s f o r the BEGIN-TBANS and END-TRANS ope r a t o r s and o u t l i n e s d e t a i l s o f the implementation. (1) DBA Operators BEGIN-TR AN S - lock the schema t a b l e s f o r d e s t r u c t i o n - l o c k the DEA l o g f i l e f o r m o d i f i c a t i o n END-TRANS - r e l e a s e a l l l o c k s held on schema t a b l e s and DBA l o g f i l e {2) Operators f o r Common Users BEGIN-TRANS - lock a l l r e c o r d f i l e s and the log f i l e (belonging to the c u r r e n t l y open database) f o r m o d i f i c a t i o n - l o c k the schema t a b l e s f o r read END-TEANS - r e l e a s e a l l l o c k s held on database f i l e s and t a b l e s (3) Implementation D e t a i l s (a) The BEGIN-TEANS operator f o r DBAs must lock the schema t a b l e s (ZTTABLE, ZETABLE, ZFTABLE) f o r d e s t r u c t i o n . T h i s lock c l a s s i s chosen rather than m o d i f i c a t i o n because at l e a s t one of the DBA u t i l i t i e s (DESTBOYDBA) r e q u i r e s t h a t the schema t a b l e s be locked f o r d e s t r u c t i o n . The higher l o c k mode i s not expected to i n h i b i t concurrency because the DBA i s the only user who would r e q u i r e write access to the schema t a b l e s and 98 lo c k f o r d e s t r u c t i o n by a DBA would not exclude him from a l s o w r i t i n g . Common users who r e g u i r e read access to the schema t a b l e s w i l l be excluded by e i t h e r l o c k i n g c l a s s . (b) The BEGIN-TEANS operator f o r DBAs must lock the DBA l o g f i l e f c r m o d i f i c a t i o n . . The SA's account number must be known to accomplish the l o c k . We recommend s t o r i n g the system dependent v a r i a b l e SYSACCNT i n a l l DBA work spaces. (c) The BEGIN-TRANS operator f o r common users must lock the schema t a b l e s f o r read and database f i l e s f o r w r i t e . Names of database f i l e s are a v a i l a b l e from the i n t e r f a c e f i l e system v a r i a b l e FNAMES. Only filenames from FNAMES whose corresponding t i e number from FNUMS i s greater than 15 should be s e l e c t e d as database f i l e s . Locking the schema t a b l e s w i l l r e g u i r e the DBA account number. (d) The END-TRANS operator f o r common users must unlock the schema t a b l e s and database f i l e s . The procedure f o r l o c a t i n g f i l e names o u t l i n e d f o r the BEGIN-TRANS operator c o u l d a l s o be used here. (e) We recommend that f i l e s be locked v i a the FMTS LOCK f u n c t i o n , and t h a t they be unlocked v i a the FRELEASE f u n c t i o n . R e c a l l that FRELEASE a l s o c l o s e s the f i l e . I n v e s t i g a t i o n s i n t o advantages of unlocking f i l e s without c l o s i n g them has l e d us to conclude t h a t there are none ( i . e . , 99 r e d u c t i o n s i n overhead by not c l o s i n g f i l e s would be i n s i g n i f i c a n t ) . In t h i s chapter we have d e s c r i b e d three a l t e r n a t i v e s f o r p r o v i d i n g concurrent use of the E d u c a t i o n a l Data Base System, and we have s e l e c t e d an approach based on f e a s i b i l i t y of implementation. We conclude t h i s chapter with a summary of our recommendations: (1) BEGIN-TRANS and END-TRANS operato r s should be provided by means of which a user may d e f i n e h i s own t r a n s a c t i o n s . {2) E n t i t i e s r e g u i r e d by a t r a n s a c t i o n should be a u t o m a t i c a l l y locked as part of the BEGIN-TRANS operator and a u t o m a t i c a l l y unlocked as part of the END-TRANS operator. (3) EDBS should r e l y on the f i l e system t o do a l l l o c k i n g r e g u i r e d by i t s concurrency c o n t r o l mechanism. (4) In terms of implementing (2) above, worst case assumptions should be made concerning e n t i t i e s r e g u i r e d by a t r a n s a c t i o n . The amount of concurrency provided by our recommended approach can be summed up as f o l l o w s : Any number of t r a n s a c t i o n s may read from the schema t a b l e s at the same time. T r a n s a c t i o n s may execute c o n c u r r e n t l y on d i f f e r e n t databases c r e a t e d and maintained by EDBS. Otherwise, t r a n s a c t i o n s are executed i n a s e r i a l f a s h i o n . 100 Chapter V: Conclusion This t h e s i s has d e s c r i b e d our experience with the EDBS undertaking and proposed a s o l u t i o n f o r running EDBS i n a concurrent EDBS environment. In chapter 2, we c l a r i f i e d the b a s i c o b j e c t i v e of concurrency c o n t r o l which i s to ensure c o r r e c t n e s s . We then examined one approach f o r o b t a i n i n g c o r r e c t n e s s which* i s t o ensure t h a t a c t i o n s from t r a n s a c t i o n s are i n t e r m i n g l e d i n a s e r i a l i z a b l e f a s h i o n . Next, we examined two lock p r o t o c o l s which have been shown to guarantee s e r i a l i z a b i l i t y . F i n a l l y , we reviewed concurrency c o n t r o l f a c i l i t i e s provided by a number of systems. In chapter 3, we examined EDBS concurrency c o n t r o l f a c i l i t i e s and compared them to those of other systems.. We provided v a r i o u s analyses of the e x i s t i n g l o c k p r o t o c o l i n c l u d i n g an i n d i c a t i o n of the amount of concurrency and l e v e l of c o n s i s t e n c y provided. F i n a l l y , we compiled a l i s t of key problems r e g u i r i n g r e s o l u t i o n before EDBS i s o p e r a t i o n a l i n a concurrent MTS environment. In chapter 4, we proposed v a r i o u s a l t e r n a t i v e s o l u t i o n s which s o l v e d the key problems and we recommended an approach based on p r a c t i c a l c o n s i d e r a t i o n s . The remainder of t h i s chapter i d e n t i f i e s areas r e g u i r i n g f u r t h e r work. 1. In t h i s t h e s i s we have r e s t r i c t e d our a t t e n t i o n to read and write a c t i o n s . Those that c r e a t e and destroy t h e i r e n t i t i e s have not been considered i n d e t a i l . The notion of 10 1 s e r i a l i z a b i l i t y must be extended t o account f o r a c t i o n s which- c r e a t e and destroy e n t i t i e s . The a n a l y s e s presented i n chapter 2 and the a l t e r n a t i v e s proposed i n chapter 3 should be examined i n terms of t h i s broader n o t i o n of s e r i a l i z a b i l i t y . 2. We have i d e n t i f i e d two types of EDBS e n t i t i e s : those at a l o g i c a l l e v e l such as segments and those at a p h y s i c a l l e v e l such as r e c o r d s . However, some EDBS t r a n s a c t i o n s access e n t i t i e s which may not belong to e i t h e r of these two c a t e g o r i e s . For example, when a DBA c r e a t e s a database he c r e a t e s the r e c o r d f i l e s and the database l o g f i l e . I f he does not a l s o write i n t o these f i l e s , then he has c r e a t e d no e n t i t i e s . This d i f f i c u l t y i n d i c a t e s that a d d i t i o n a l s y n c h r o n i z a t i o n may be r e g u i r e d f o r t r a n s a c t i o n s which c r e a t e and destroy such items as the schema t a b l e s and databases. For example, i t i s not c l e a r how the proposed s o l u t i o n would handle the s i t u a t i o n i n which a DBA attempts to d e s t r o y a database while some user has i t open. 3. The problem of phantoms has not been thoroughly i n v e s t i g a t e d i n the context of EDBS. If a t r a n s a c t i o n l o c k s a set of e n t i t i e s as given by some p r e d i c a t e then other t r a n s a c t i o n s must be prevented from adding an e n t i t y to t h i s set u n t i l the given t r a n s a c t i o n t e r m i n a t e s . Otherwise, i f the given t r a n s a c t i o n makes a second r e t r i e v a l based on the same p r e d i c a t e , a d i f f e r e n t set of e n t i t i e s w i l l be r e t r i e v e d . Thus, the non-existence of an e n t i i t y must be locked. T h i s problem r e f e r s to the r e l a t i o n a l system where s e t s of e n t i t i e s ( i . e . , t u p les) are r e t r i e v e d and i t a l s o r e f e r s to 102 the h i e r a r c h i c a l and network systems where only one e n t i i t y at a time i s r e t r i e v e d . For example, i n the h i e r a r c h i c a l system, the non-existence of a p a r t i c u l a r segment must be locked u n t i l the end of a t r a n s a c t i o n so t h a t the t r a n s a c t i o n does not f i n d t h a t the segment d i d not e x i s t upon a f i r s t r e t r i e v a l and d i d e x i s t upon a second r e t r i e v a l . The a l t e r n a t i v e s presented i n chapter 3 should be e v a l u a t e d i n terms of how they r e s o l v e phantom problems. 4. . EDBS might provide a u s e f u l environment i n which t o i n v e s t i g a t e the r e l a t i o n s h i p between the l o s t update problem and the problem of ensuring c o n s i s t e n c y . P a p a d i m i t r i o u ' s t r a n s a c t i o n model assumes t h a t values f o r v a r i a b l e s i n a t r a n s a c t i o n ' s read s e t are read i n s t a n t a n e o u s l y and s i m i l a r i l y t h a t values f o r v a r i a b l e s i n the wr i t e set are w r i t t e n i n s t a n t a n e o u s l y . Given these assumptions, the only way that a h i s t o r y cannot be s e r i a l i z a b l e i s i f updates get l o s t . EDBS avoids l o s t updates at a l o g i c a l l e v e l . Given that EDBS supports the concept of a t r a n s a c t i o n , i t might be p o s s i b l e t o vary the lock p r o t o c o l at a p h y s i c a l l e v e l to simulate various c o n d i t i o n s f o r c o n s i s t e n c y , a l l the while being assured that the l o s t update problem i s under c o n t r o l . 103 BEFERENCES Astrahan, M.M., D.D. Blasgen, D.D. Chamberlin, K.P. Eswaran, J.N. Gray, P.P. G r i f f i t h s , S.F. King, R.A. L o r i e , P.R. McJones, J.W. Mehl, G. R. P u t z o l u , I. L. T r a i g e r , B.W. Wade, and V. Watson. 1976. System R: R e l a t i o n a l approach to database management. ACM T r a n s a c t i o n s on Database Systems 1 (2) : 97-137. Bayer, R . a n d E. McCreight. 1972. O r g a n i z a t i o n and maintenance of l a r g e ordered i n d i c e s . Acta I n f o r m a t i c a 1: 173-189. B e r n s t e i n , P.A., D.W..Shipman, J.B. Rothnie, and N. Goodman. 1977. The concurrency c o n t r o l mechanism of SDD-1: A system f o r d i s t r i b u t e d databases {the g e n e r a l c a s e ) . Computer C o r p o r a t i o n of America TR CCA-77-09. Cambridge, Mass. B e r n s t e i n , P.A., D.W. Shipman, and J.B. Rothnie.. 1980. Concurrency c o n t r o l i n a system f o r d i s t r i b u t e d database (SDD-1)..ACM T r a n s a c t i o n s on Database Systems 5(1): 18-51. B e r n s t e i n , P.A. and D.W. Shipman.. 1980. The c o r r e c t n e s s of concurrency c o n t r o l mechanisms i n a system f o r d i s t r i b u t e d databases (SDD-1)..ACM T r a n s a c t i o n s on Database Systems 5 (1) : 52-68. Bjork, L.A. 1973. Recovery s c e n a r i o f o r a DB/DC system. Proc. ACM Annual Conf. A t l a n t a , Ga: 142-146. Chamberlin, D.D. and R.F..Boyce.. 1974.. SEQUEL: A s t r u c t u r e d E n g l i s h guery language. Proc. ACM SIGFIDET Workshop. Ann Arbor, Michigan. May: 249-264. CODASYL.. 1971. Data Base Task Group Report. CODASYL, DBTG, ACM, New York. A p r i l . CODASYL Programming Language Committee. 1976. COBOL J o u r n a l of Development. C o n t r o l Data C o r p o r a t i o n . 1978 APL V e r s i o n 2 Reference Manual 60454000. . Date, C.J. 1977. An i n t r o d u c t i o n t o database systems. Second edition..Addison-Wesley P u b l i s h i n g Company. ISBN 0-201-14456-5. Davies, C.T. 1973. Recovery semantics f o r a DB/DC system. Proc..ACM Annual C o n f . . A t l a n t a , Ga: 136-141. Engles, R.W. 1971. An a n a l y s i s of the A p r i l 1971 DBTG Report -a p o s i t i o n paper presented to the Programming Language Committee by the IBM Representative to the Data Base Task 104 Group. ( A v a i l a b l e from B r i t i s h Computer Society.) Engles, R.W. 1976. Currency and concurrency i n the COBOL Data Base F a c i l i t y . Proc..IFIP Working Conference on Modelling i n Data Base Management Systems, January. Eswaran, K.D., J.N. Gray, R.A. L o r i e , and I.L. T r a i g e r . 1976. The notions of c o n s i s t e n c y and p r e d i c a t e locks i n a database system. Communications of the ACM 19(11): 624-633. Gray, J.N., R.A. L o r i e , G.R. P u t z o l u , and I.L. T r a i g e r . 1976. G r a n u l a r i t y of l o c k s and degrees of c o n s i s t e n c y i n a shared database. Pp. 695-723 i n Proc. IFIP Working Conference on M o d e l l i n g i n Data Base Management Systems. January. Freudenstadt, Germany. Hawley, D.A., J.S. Knowles, and E.E. Tozer. . 1975. .Database c o n s i s t e n c y and the CODASYL DBTG proposa l s . Computer J o u r n a l . 18 (1): 206-212. IBM C o r p o r a t i o n . .Information management s y s t e m / v i r t u a l storage g e n e r a l i n f o r m a t i o n manual..GH20-1260. Kedem, Z. and A. S i l b e r s c h a t z . 1979. . C o n t r o l l i n g concurrency using l o c k i n g p r o t o c o l s ( p r e l i m i n a r y r e p o r t ) . Proc. IEEE February: 274-285. Kwong, Y.S. and D. Wood. 1979. Concurrency i n B-trees, S-trees and T - t r e e s . Computer Science T e c h n i c a l Report 79-CS-17, McMaster U n i v e r s i t y . Kwong, Y.S. and D. Wood. 1979. Approaches to concurrency i n B-t r e e s . (To appear i n Proc. Mathematical Foundation of Computer S c i e n c e s , Rydzyna, Poland.) Kung, H.T. and C.H. P a p a d i m i t r i o u . 1979. An o p t i m a l i t y theory of concurrency c o n t r o l f o r databases. Proc. 1979 SIGMOD Conf. Boston, Mass. May. L i e n , Y.E. and P.J. .Weinberger. . 1978. C o n s i s t e n c y , concurrency, and crash recovery. ACM-SIGMOD Conference. P a p a d i m i t r i o u , C.H. 1977.. The s e r i a l i z a b i l i t y of concurrent database updates. J o u r n a l of the ACM 26(4): 631-653. R i e s , D.R. and M. Stonebraker. 1977. E f f e c t s of l o c k i n g g r a n u l a r i t y i n a database management system. ACM T r a n s a c t i o n s on Database Systems 2 (3): 233-246. Ries , D.R. and M. Stonebraker. 1979. Locking g r a n u l a r i t y r e v i s i t e d . . ACM T r a n s a c t i o n s on Database Systems 4(2): 210-227. . Rothnie, J.B. and N. Goodman. . 1 977. An overview of the p r e l i m i n a r y design of SDD-1: a system of d i s t r i b u t e d 105 databases. Proc. .1977 Berkeley Workshop on D i s t r i b u t e d Data Management and Computer Networks. Berkeley, C a l i f o r n i a . May. Eothnie, J.B., P.A. B e r n s t e i n , S. Fox, N. Goodman, M. Hammer, T. A. , Landers, C. . Eeeeve, D. W. Shipman, and E. Wong.. 1980.. I n t r o d u c t i o n t o a system f o r d i s t r i b u t e d databases (SDD-1).. ACM T r a n s a c t i o n s on Database Systems 5(1): 1-17. Sc h l a g e t e r , G. 1978. Process s y n c h r o n i z a t i o n i n database systems. ACM T r a n s a c t i o n s on Database Systems 3(3): 248-271. S i l b e r s c h a t z , A. And Z. Kedem. 1978. Consistency i n h i e r a r c h i c a l database systems. ( A v a i l a b l e i n JACM.) Simon F r a s e r U n i v e r s i t y . 1978.. MTS/APL f i l e system user's guide. Computing Center. Stonebraker, M., E. Wong, P. Kreps, and G. Held. 1976. The design and implementation of INGEES. ACM Tr a n s a c t i o n s on Database Systems 1 (3): 189-222. U n i v e r s i t y of B r i t i s h Columbia 1976. UBC f i l e s and devices. Computing Centre. U n i v e r s i t y of B r i t i s h Columbia 1977. UBC Commands, Computing Centre. U n i v e r s i t y of Ca l g a r y . E d u c a t i o n a l data base system data base a d m i n i s t r a t o r ' s manual. Department of Computer S e r v i c e s CSEM-009. U n i v e r s i t y of Toronto. 1975. E d u c a t i o n a l data base system data manipulation f a c i l i t y user's manual. Computer Systems Eesearch Group. October. U n i v e r s i t y of Toronto. 1975. E d u c a t i o n a l data base system data base a d m i n i s t r a t o r ' s manual. Computer Systems Eesearch Group. December. Yannakakis, M., C.H. P a p a d i m i t r i o u , and H.T. Kung. . 1979. Locking p o l i c i e s : s a f e t y and freedom from deadlock. Proc. IEEE. February: 286-297.. APPENDIX A: EDBS OVERVIEW 106 Grant Access Create • SA I Create DBA Create Database L i s t Databases L i s t Database Descr. Relational Hierarchical J Network QUERY AND UPDATE FACILITIES Maintain I Destroy Database Destroy DBA * Figure 3 107 APPENDIX B: EDES Commands and U t i l i t i e s R e l a t i o n a l System H i e r a r c h i c a l System Network System GET GET HOLD PUT UPDATE DELETE GET UNIQUE GET NEXT GET NEXT WITHIN PABENT GET HOLD INSEBT REPLACE DELETE GET RECORD GET SET GET HOLD STORE DELETE INCLUDE REMOVE MODIFY CHANGE TO CURRENT DBA U t i l i t i e s CREATEDBA DUMPLOG DESTEOYDBA ERASELOG LISTDATABASES MAINTAIN CREATE MAKEAVAILABLE DESTROY MAKUNAVAIL ABLE GRANTACCESS LISTACTIVEUSERS 108 APPENDIXC : EDBS FILE USAGE Figure 4 109 APPENDIX D: I n t e r f a c e F i l e System D e s c r i p t i o n This appendix provides a d e s c r i p t i o n of the major f u n c t i o n s which comprise our i n t e r f a c e f i l e system. There are b a s i c a l l y two c a t e g o r i e s of f u n c t i o n s which had to be s i m u l a t e d . F i r s t , the CDC/APL f i l e system f u n c t i o n s . Second, a number of system f u n c t i o n s . Of p a r t i c u l a r importance i n the l a t e r category i s the #FI f u n c t i o n . In the f o l l o w i n g , each simulated f i l e system f u n c t i o n i s d e s c r i b e d i n d e t a i l . The simulated system f u n c t i o n s are only b r i e f l y d e s c r i b e d . The reader i s r e f e r r e d to the CDC APL Version 2 Reference Manual f o r f u r t h e r d e t a i l s . Our d e s c r i p t i o n of the simulated f i l e system f u n c t i o n s i s i n part a restatement of pages 10.8 t o 10.10 of the CDC Manual. MTSCREATE: 'file-name [/options ]'MTSCREATE fnum The MTSCREATE f u n c t i o n s i m u l a t e s the CDC f i l e c r e a t e f u n c t i o n (FCREATE). MTSCREATE may be used to c r e a t e a f i l e and s p e c i f y o ptions about the type of f i l e . When the f i l e i s creat e d i t i s t i e d t o the f i l e number fnum. The l i s t of op t i o n s may i n c l u d e S or WR to permit the f i l e f o r read or r e a d / w r i t e , r e s p e c t i v e l y , t o other users. Any other options s p e c i f i e d i n the l i s t of op t i o n s are i g n o r e d . . F i l e s c r e a t e d by the MTSCREATE f u n c t i o n are always i n t e r n a l I/O format MTS l i n e f i l e s . Examples of f i l e c r e a t i o n f o l l o w : 'FILE 1 ' MTSCREATE 11 (A f i l e named FILE 1 with 11 as i t s number) •FILE2/DA S WR* MTSCREATE 2 (A f i l e named FILE 2 permitted f o r RW t o others) 110 MTSDEL: MTSDEL fnum [ , rnum] The MTSDEL f u n c t i o n simulates the CDC f i l e r e c o r d d e l e t e f u n c t i o n (FRDEL). MTSDEL d e l e t e s the record rnum from f i l e fnum. I f the r e c o r d was already absent, nothing i s done (except that the f i l e p o s i t i o n changes) and nc e r r o r s r e s u l t . MTSERASE: MTSERASE fnums The a c t i o n of the CDC f i l e erase f u n c t i o n (FERASE) when a p p l i e d to CDC d i r e c t access f i l e s i s simulated by the MTSERASE f u n c t i o n . A l l f i l e s s p e c i f i e d by the r i g h t argument are destroyed. MTSFMMES: r e s u l t < MTSFNAMES The MTSFNAMES f u n c t i o n r e t u r n s a matrix of names (and user I.D.s) o f f i l e s c u r r e n t l y t i e d . MTSFNAMES si m u l a t e s the CDC FNAMES f u n c t i o n . The number of columns i n the matrix r e t u r n e d i s always 13 ( i . e . , three l e s s than with CDC because MTS user I.D.s are t h r e e c h a r a c t e r s s h o r t e r than CDC user I.D.s). An example f o l l o w s : MTSFNAMES SAMPLE 1 ALGEBRA *XXAR FILE1 MTSFNUM_S_: r e s u l t < MTSFNUMS The MTSFNUMS f u n c t i o n r e t u r n s a vector of numbers i n use f o r t i e d f i l e s . The order i s the same as the order of f i l e names i n the r e s u l t from MTSFNAMES.. MTSNUMS si m u l a t e s the CDC FNUMS f u n c t i o n . 111 MTSFTIE: '[*account] f i l e - n a m e [ / o p t i o n s ]' FTIE fnum The MTSFTIE f u n c t i o n s i m u l a t e s the CDC f i l e t i e f u n c t i o n (FTIE). MTSFTIE g i v e s the number fnum t o the p r e v i o u s l y s t o r e d f i l e having the i n d i c a t e d name. I f a user account (I.D.) i s given, the s t o r e d f i l e i s sought under than I.D. r a t h e r than the one used f o r s i g n i n g onto the system. The l i s t of options i f provided i s i g n o r e d by the MTSFTIE f u n c t i o n . Examples using MTSFTIE f o l l o w : *FILE5' MTSFTIE 7 (A user t i e s one of h i s own f i l e s ) •*XXAR FILE 1• MTSFTIE 8 (A user t i e s a f i l e b elonging to another user) MTSFUNTIE: MTSFUNTIE fnums MTSFUNTIE simulates the CDC FUNTIE. A l l f i l e s f o r which t h e i r f i l e numbers appear i n the v e c t o r or s c a l e r r i g h t argument are u n t i e d . . To unt i e a l l t i e d f i l e s , use MTSUNTIE MTSFNUMS. . MTSREAD: r e s u l t < FREAD fnum [, rnum] The MTSREAD f u n c t i o n reads from the f i l e having fnum as i t s f i l e number t h a t r e c o r d having rnum as i t s r e c o r d number. I f rnum i s not provi d e d , the c u r r e n t record number i s used. I f that r e c o r d does not e x i s t , an empty numeric v e c t o r i s r e t u r n e d . MTSREAD si m u l a t e s the CDC FREAD f u n c t i o n . 112 MTSSTAT; r e s u l t < FSTATUS fnum MTSSTAT simulates a very s m a l l subset of the c a p a b i l i t i e s provided by the CDC f i l e s t a t u s f u n c t i o n (FSTATUS). MTSSTAT r e t u r n s the l a r g e s t record number c u r r e n t l y i n use by the f i l e having fnum as i t s f i l e number. I f the f i l e i s empty -1 i s ret u r n e d . MTSWEITE: ar r a y MTSWEITE fnum [, rnum] The MTSWEITE f u n c t i o n w r i t e s i t s l e f t argument on the f i l e having fnum as i t s number as the r e c o r d having rnum as i t s r e c o r d number. I f rnum i s not provided, the c u r r e n t record number i s used. MTSWEITE simulates the CDC FWEITE f u n c t i o n . The $\"FI f u n c t i o n s ($\"FI1, $\"FI2, $\"FI3 and $\"FI4) accomplish the i n t e r f a c e between EDBS and the f u n c t i o n s d e s c r i b e d above. The f o l l o w i n g d e s c r i b e s how they work: The CDC f i l e system f u n c t i o n s use the system f u n c t i o n #FI to perform a l l f i l e o p e r a t i o n s . For example, FCEEATE i s d e f i n e d as f o l l o w s : \"A FCEEATE B <1> A #FI 1,B\" The number (1) f o l l o w i n g the #FI i n d i c a t e s t h a t #FI should perform a f i l e c r e a t e . . Each f i l e system f u n c t i o n c a l l s #FI with a unique number as the f i r s t element i n the r i g h t argument.. The #FI f u n c t i o n can a c t u a l l y be used d i r e c t l y and i s used d i r e c t l y by EDBS. The $\"FI f u n c t i o n s simulate the #FI f u n c t i o n . A l l r e f e r e n c e s to #FI made by EDBS have been modified t o r e f e r to the a p p r o p r i a t e $\"FI f u n c t i o n . 113 Our i n t e r f a c e f i l e system i n c l u d e s a number of other f u n c t i o n s . . $\"TRAP1 r e p l a c e s the CDC e r r o r t r a p p i n g f u n c t i o n , #TEAP.. $\"LIB1 si m u l a t e s the CDC system f u n c t i o n #LIB. The MTSID and re v e r s e MTSID (REVMTSID) f u n c t i o n s r e p l a c e EDBS f a c i l i t i e s f o r decoding and encoding a user account number. APPENDIX E: I n t e r f a c e F i l e System F u n c t i o n s FUNCTION NAMES MTSCREATE MTSDEL MTSFTIE MTSFUNTIE MTSUNTIE MTSWRITE $\"FI4 $\"LIBDELETE $\"REVMTSID $\"TRAP1 MTSERASE MTSID $\"FI1 $\"LIBUPDATE MTSFNAMES MTSREAD $\"FI2 MTSFNUMS MTSSTAT $\"FI3 $\"LIB1 VARIABLE NAMES FNAMES FNOMS \" A MTSCREATE B;FNAME;MSG;OPTIONS <1> FNAME= (1+A$. •%•) $TAA <2> $*a)a)a> CREATE FILE CALLED FNAME 5)5)5) <3> $> (0$EQ$EX3$TAMSG=FMTS 'CREATE •,FNAME)%CREATED <4> MSG <5> $>0 <6> CREATED: OPTIONS= ( {$,FNAME) +1) $DRA <7> $*5)5)5> UPDATE FILENAME LIST 25)5) <8> $'* LIBUPDATE FN AM E <9> $*a)5)5) IF SEMIPRIVATE PERMIT FOR READ TO OTHERS 5)5)5) <10> $> ( ($, OPTIONS) $LTOPTIONS$. !S') %UPDATE <11> $>(0$EQ$EX3$TAMSG=FMTS 'SHARE • , FNAME)#SHARED <12> MSG <13> $>0 <14> $*5)S)5) IF WRITE OPTION PERMIT FOB WRITE TO OTHERS 5)5)5) <15> SHARED:$>( {$rOPTIONS)$LTOPTIONS$.'W')%UPDATE <16> $> (0$EQ$EX3$TAMSG=FMTS 'SHARE ',FNAME,' RW')%UPDATE <17> MSG <18> $>0 <19> $*5)5)5) UPDATE FNAMES AND FNUMS 5)5)5) <20> UPDATE: $> { ($ , FNAMES) $EQ0) %FIEST <21> FNAMES=FNAMES,<1> FNAME,( 1 6-$, FNAME) $ , ' ' <22> $>TIE <23> FIRST:FNAMES= 1 16 $,FNAME,(16-$,FNAME)$,• • <24> TIE:FNUMS=FNUMS,B II \" MTSDEL B;FNUM;RNUM;MSG;FNAME;M;DATA <1> FNUM=1$TAB <2> FN AME=,FNAMES<(FNUMS$EQFNUM)%$,$,FNUMS;> <3> $*333 SET APL EXTERNAL FORMAT 335) <4> MSG=* EXTERNAL1 FSETTYPE FNAME <5> $*333 SET DIRECT ACCESS MODIFIER 335) <6> #IO=0 <7> M=32$,0 <8> M<30>=1 <9> MSG=M FSETMODS FNAME <10> #10=1 <11> $>(($ r RNUM=1$DRB)$EQO)%SEQ <12> $*333 FIND LINE NO..FOR DIRECT ACCESS WRITE 333 <13> MSG=(RNUM*1000) FSETLINE FNAME <14> SEQ:$> (0$EQ$EX3$TAMSG=' • FWRITE FNAME) %RESET <15> MSG <16> $>0 <17> RESET: <18> $*333 INCREMENT LINE POINTER 333 <19> MSG= ,RNUM« FGETLINE FNAME <20> MSG = (RNUM+1000) FSETLINE FNAME II \" MTSERASE A;MSG;FNUM;FNAME;I <1> I=,.1 <2> $> (I$GT$ / , A) %0 <3> FNUMf_1$TAI$TAA <4> FNAME=,FNAMES<(FNUMS$EQFNUM) %$.$,FNUMS;> <5> $*333 UNTIE FILES GIVEN BY A 333 <6> MTSUNTIE FNUM <7> $*333 DESTROY FILES GIVEN BY A 333 <8> $>(0$EQ$EX3$TAMSG=FMTS 'DESTROY *,F NAME)%DONE <9> MSG <10> $>0 <11> DONE:I=I+1 <12> $*5)33 UPDATE FILENAME LIST 333 <13> $\"LIBDELETE FNAME <14> $>2 II \" R=MTSFNAMES; I <1> I=,1 <2> E= 0 13 $,• • <3> NEXT: $> (I$GT$ ,FNUMS) %Q <4> $>(FNAMES$EQ':')%ACC <5> $*333 COMPOSE FNAME IF NO ACCOUNT NO. GIVEN 333 <6> R=R,<1> • •,4$DR,FNAMES <7> $>OUT <8> $*333 COMPOSE FNAME IF ACCOUNT NO. GIVEN 333 <9> ACC:R=R,<1> «3',(,FNAMES)•,7$,5$DR,FNAMES <10> O0T:I=I+1 <11> $>NEXT II 116 <1> \" BESULT=MTSFNUMS EE SULT=FNUM S ii 11 A MTSFTIE B;MSG;FNAME;STATUS <1> $*a)33> COMPOSE FNAME IF ACCOUNT NO. GIVEN a)a)a) <2> $> (A<1>$NE« a)') %NOACC <3> FN AM E=A< 2 3 4 5> , 1 : • , ( ( (6$DEA) $NE • •) $. 1) $DB5$DKA <4> FNAME= (1+FNAME$. *%•) $TAFNAME <5> $*a)a)d) COMPOSE FNAME IF NO ACCOUNT NO. GIVEN a)a)a> <6> $>EEECHK <7> NOACC:FNAME= (1 + A$.'%')$TAA <8> EBBCHK:$>(B$EPFNUMS)%E1 <9> $*a)a)3 CHECK THAT FILE PEEMITTED 3 3 3 <10> $> (0$EQ$EX3$TAMSG=FMTS ('CHKACC • , FNAME) INTO * STATUS ' ) %OK <11> MSG <12> $>0 <13> OK:$>(STATUS$EQO)%E2 <14> $*a>33 UPDATE FNUMS AND FNAMES 333 <15> $> {($,FNAMES)$EQO) %FIEST <16> FNAMES=FNAMES,<1> FNAME,(16-$,FNAME)$ , ' • <17> $>TIE <18> FIRST:FNAMES= 1 16 $,FNAME, (16-$,FNAME)$,* • <19> TIE:FNUMS=FNUMS,B <20> $>0 <21> E1:«TIE NUMBEE IN USE' <22> $>0 <23> E2:'FILE NOT PEEMITTED' MTSFUNTIE A;I <1> $*333 UNTIE FILES GIVEN BY A 333 <2> I=,1 <3> $> (I$GT$, , A) JtO <4> MTSUNTIE 1$TAI$TAA <5> 1 = 1+1 <6> $>3 II II <1> <2> <3> Z = MTSID $*333 THIS FUNCTION FINDS THE USEE ID 333 $*333 FOE THE CALLING ACCOUNT a)a)a) Z=,#AV<#IO+ (4$,256)$EN1$TA#AI> II \" RESULT-MTSRE&D E;FNUM;BNUM;MSG;FNAME;M;DATA <1> FN UM=1$TAE <2> FN AME = , FNAMES< (FNUMS $EQFNUM) %$.$,FNUMS;> <3> $*333 SET APL INTEENAL FOEMAT 333 <4> MSG='INTEENAL1 FSETTYPE FNAME <5> $*333 SET DIEECT ACCESS MODIFIES 2)33 <6> #10=0 <7> M=32$,0 <8> M<30>=1 <9> MSG=M FSETMODS FNAME <10> #10=1 <11> $> ( ($,BNUM=1$DBB)$EQ0)%SEQ <12> $*333 FIND LINE NO. FOB DIBECT ACCESS BEAD 333 <13> MSG=(BNUM*1000) FSETLINE FNAME <14> SEQ:$> (0$EQ$EX3$TAMSG='DAT A• FEEA D FNAME) %EESET <15> $>(30$EQ$EX3$TAMSG) %NULL <16> MSG <17> $>0 <18> NULL:DATA=$.0 <19> $>UNLOCK <20> BESET: <21> $*333 INCEEMENT LINE POINTEE 333 <22> MSG=1BNUM' FGETLINE FNAME <23> MSG= (BNUM+1000) FSETLINE FNAME <24> $*333 UNLOCK IMPLICIT MTS LOCKING 333 <25> EESULT=DATA II \" BESULT=MTSSTAT A;MSG;FNAME;LINE;FNUM <1> $*333 FIND LAST LINE IN USE IN FILE 333 <2> FNAME=fFNAMES<(FNUMS$EQA)%$.$,FNUMS;> <3> MSG=FMTS •GETLST •,FNA ME INTO * LINE• <4> $> (12$EQ$EX3$TAMSG) %EOF <5> EESULT=,LINE/1000 <6> $>0 <7> EOF:RESULT-,1 it » MTSUNTIE A;Y;FNAME;MSG <1> $*333 UNTIE FILE GIVEN BY A 333 <2> $> ( (FNUMS$.A)$GT$,FNUMS)%EREOE <3> FNAME=,FNAMES< (FNUMS$EQA) %$.$,FNUMS;> <4> $*333 UPDATE FNUMS AND FNAMES 333 <5> FNUMS=(Y= FNUMS$EPA)%FNUMS <6> FN AMES=Y$C1FNAMES <7> $>0 <8> EBEOR:•ATTEMPTING TO UNTIE AN UNTIED FILE* II 118 \" A MTSWEITE B;FNUM;RNUM;MSG;FNAME;M;DATA <1> FN UM = 1 $ TAB <2> FN AME = ,FNAM ES<(FNUMS$EQFNUM)%$.$,FNUMS;> <3> $*333 SET APL INTERNAL FORMAT 333 <4> MSG=«INTERNAL' FSETTYPE FNAME <5> $*333 SET DIRECT ACCESS MODIFIEE 333 <6> #10=0 <7> M=32$r0 <8> M<30>=1 <9> MSG=M FSETMODS FNAME <10> #10=1 <11> $>({$,ENUM=1$DRB)$EQ0)%SEQ <12> $*333 FIND LINE NO. FOR DIRECT ACCESS WRITE 333 <13> MSG=(RNUM*1000) FSETLINE FNAME <14> SEQ:$> (0$EQ$EX3$TAMSG=A FWRITE FNAME)%RESET <15> MSG <16> $>0 <17> RESET: <18> $*333 INCREMENT LINE POINTER 333 <19> MSG= * RNUM 1 FGETLINE FNAME <20> MSG=(RNUM+1000) FSETLINE FNAME it \" A $\"FI1 B;X <1> X=1$TAB <2> $*333 BRANCH TO FUNCTION GIVEN BY B 333 <3> $> ( (X$EQ1) , (X$EQ2) , (X$EQ4) , (X$EQ9) , (X$EQ10) ) %L 1 rL2 , L4 , L9,L 1 <4> • NOT VALID $\"FI1 COMMAND' <5> L1:A MTSCREATE 1$DRB <6> $>0 <7> L2:A MTSWRITE 1$DBB <8> $>0 <9> L4:MTSERASE A <10> $>0 <1 1> L9:MTSFUNTIE A <12> $>0 <13> L10:A MTSFTIE 1$DRB II '» Z=$\"FI2 B;X <1> X=1$TAB <2> $*333 EfiANCH TO FUNCTION GIVEN BY B 333 <3> $> ( (X$EQ3) , (X$EQ7) , (X$EQ8) ) 5513^7^8 <4> 'NOT VALID $\"FI2 COMMAND' <5> L3:Z = MT SREAD 1$DEB <6> $>0 <7> L7: Z=MTSFNAMES <8> $>0 <9> L8:Z=MTSFNUMS II 119 \" $\"FI3 B <1> $*333 CALL FUNCTION GIVEN BY B 333 <2> MTSDEL 1$DRE II \" Z=A $\"FI4 B <1> $*333 CALL FUNCTION GIVEN BY B 333 <2> Z = MTSSTAT A II \" $\"LIBDELETE FNAME;MSG;X <1> $*333 DELETE FILE NAME FROM LIBNAMES 333 <2> MSG='INTERNAL* FSETTYPE 'LIBNAMES' <3> $>(0$EQ$EX3$TAMSG='X• FREAD 'LIBNAMES')%OK <4> MSG <5> $>OUT <6> OK:X={ (FNAME, (17-$,FNAME) $, • • ) 8 . $EQ$TEX) $C1 X <7> $*333 RESET RECORD NUMBER TO ONE 333 <8> MSG=FRELEASE •LIENAMES' <9> MSG='INTERNAL' FSETTYPE 'LIBNAMES' <10> $>(0$EQ$EX3$TAMSG=X FWRITE 'LIBNAMES')%OUT <11> MSG <12> OUT:MSG=FRELEASE 'LIBNAMES' II \" $\"LIBUPDATE FNAME;STATUS; MSG ; X <1> $*333 CHECK IF LIENAMES ALREADY CREATED 333 <2> $> (0$EQ$EX3$TAMSG=FMTS ('CHKACC LIBNAMES') INTO 'STATUS')%RE <3> $*333 CREATE LIBNAMES AND PERMIT FOR READ 333 <4> MSG=FMTS 'CREATE LIENAMES1 <5> MSG=FMTS 'SHAEE LIBNAMES' <6> X= 0 17 $,' • <7> $>WEITE <8> BEAD:MSG='INTEEN AL' FSETTYPE 'LIENAMES' <9> MSG='X» FEE AD 'LIENAMES' <10> MSG=FEELEASE 'LIENAMES• <11> $*333 UPDATE LIBNAMES 333 <12> WEITE:X=X,<1> FNAME, (17-$,FNAME)$, • • <13> MSG='INTEENAL' FSETTYPE 'LIBNAMES' <14> MSG=X FWBITE 'LIENAMES' <15> MSG=FBELEASE 'LIENAMES' II 120 \" Z=$\"LIB1 LIBNO;STATUS;MSG;FNAME <1> $*333 THIS FUNCTION EETUENS A LIST OF FILENAMES 333 <2> $*333 CONTAINED IN THE ACCOUNT GIVEN BY LIBNO 933 <3> $> (0$EQ$,LIBNO) %NOACC <4> FNAME=(1$DB5$TALIBNO),•:LIBNAMES' <5> $>NEXT < 6 > NOACC:FN AME='LIBNAMES• <7> NEXT:$>(0$EQ$EX3$TAMSG=FMTS{'CHKACC FNAME) INTO •STATUS ,)% <8> Z= 0 17 $,• « <9> $>0 <10> READ:MSG='INTERNAL1 FSETTYPE FNAME <11> MSG= ,Z« FREAD FNAME <12> MSG=FRELEASE FNAME it z=$\"MTSID LIST <1> $*333 THIS FUNCTION FINDS THE USER ID 333 <2> $*a)33 FOR A LIST OF USERS 333 <3> Z=#AV<#IO + $TE(4$,256)$EN, LIST> E=$\"EEVMTSID USEE $*333 THIS FUNCTION FINDS THE USEE ID 333 $*333 IN ENCODED FOEM FEOM CHAR REPN 333 E=256$DE (#AV$. USER) -#IO $\"TRAP1 LABEL $*333 THIS FUNCTION DOES NOTHING..IT IS USED TO REMOVE 333 $*333 THE EEEOE TEAPPING FACILITIES PEOVIDED BY #TEAP 333 # FNAMES C $,$,$EQ 2; $,$EQ 0 16 # FNUMS N $,$,$EQ 1 ; $,$EQ 0 T e a n s l i t e r a t i oji_T_a b l e ALPHA a $AL - LESS OR EQUAL < $LE ALPHABET A--z A-Z LESS THAN < $LT A-•z a-z A- Z LINE FEED $-. AND A & LOCKED FUNCTION $L\" ASSIGN -<- LOGARITHM © $@ $L0 $LN BRACKETS [ < MAXIMUM r $MA $CE ] > MEMBERSHIP € $EP $ME BRANCH -> $> $G0 MINIMUM L $MI $FL . MINUS - -CAP n $CA MODULUS 1 $1 CIRCULAR FUNCTIONS o $$ $C I $PI MULTIPLY X * COLON j : COMMA(CATENATION) » 9 NAND $N& COMMENT A $* $co NEGATION - $-COMPRESSION / % NOR V TN l $NR COlfPRESSIONll~\\ / $C1 NOT ~ $N0 CUP u $cu NOT EQUAL $NE NULL o $: DECODE X $DE $BA DEFINITION{DEL) V II OMEGA 6J $0M DELTA A $\" OR V I $0R DELTAU A $U\" DIERESIS .. $DS PARENTHESES • ( ( DIGITS 0 -•9 0-9 ) ) DIMENSION P $, $RH PERIOD DIVIDE 4- / PLUS + + DROP + $DR $D0 QUAD • # ENCODE T $EN $RP QUAD - DIVIDE $/ EQUAL $EQ QUOTE - QUAD m $# ESCAPE w $0UT QUOTE i i EXPANSION \\ $% EXPANSION(1) \\ $X1 • RANDOM 9 ? $RV EXPONENTIATION * 0 ROTATION $R0 EXECUTE $EX ROTATIONS © $R1 RSUB $RS FACTORIAL t $FA FORMAT SEMICOLON V • >. y • SUB ' cr:\"' $SU GRADE DOWN Y $GD I SYSTEM $IB $SY GRADE UP 4 $GU $TA $UP GREATER OR EQUAL > $GE TAKE + GREATER THAN > $GT TRANSPOSITION $TR INDEX $. $10 $IN UNDERSCORE - $UN From UBC APL "@en ; edm:hasType "Thesis/Dissertation"@en ; edm:isShownAt "10.14288/1.0051814"@en ; dcterms:language "eng"@en ; ns0:degreeDiscipline "Computer Science"@en ; edm:provider "Vancouver : University of British Columbia Library"@en ; dcterms:publisher "University of British Columbia"@en ; dcterms:rights "For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use."@en ; ns0:scholarLevel "Graduate"@en ; dcterms:title "Practical concurrency considerations for a student oriented database management system"@en ; dcterms:type "Text"@en ; ns0:identifierURI "http://hdl.handle.net/2429/22303"@en .