Open Collections

UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Design of relational database schemas : the traditional dependencies are not enough 1982

You don't seem to have a PDF reader installed, try download the pdf

Item Metadata

Download

Media
UBC_1982_A6_7 O42.pdf [ 3.81MB ]
UBC_1982_A6_7 O42.pdf
Metadata
JSON: 1.0051825.json
JSON-LD: 1.0051825+ld.json
RDF/XML (Pretty): 1.0051825.xml
RDF/JSON: 1.0051825+rdf.json
Turtle: 1.0051825+rdf-turtle.txt
N-Triples: 1.0051825+rdf-ntriples.txt
Citation
1.0051825.ris

Full Text

DESIGN OF RELATIONAL DATABASE SCHEMAS: THE TRADITIONAL DEPENDENCIES ARE NOT ENOUGH by ADEGBEMIGA OLA B . S c , The U n i v e r s i t y o f I b a d a n , 1977 A THESIS SUBMITTED IN PARTIAL FULFILMENT OF MASTER OF SCIENCE i n THE FACULTY OF GRADUATE STUDIES (Department of Computer S c i e n c e ) We a c c e p t t h i s t h e s i s a s c o n f o r m i n g t o t h e r e q u i r e d s t a n d a r d THE UNIVERSITY OF BRITISH COLUMBIA A p r i l 1982 © Adegbemiga O l a , 1982 In presenting t h i s thesis i n p a r t i a l f u l f i l m e n t of the requirements for an advanced degree at the University of B r i t i s h Columbia, I agree that the Library s h a l l make i t f r e e l y available for reference and study. I further agree that permission for extensive copying of t h i s thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. I t i s understood that copying or publication of t h i s thesis for f i n a n c i a l gain s h a l l not be allowed without my written permission. Department of CpnnP\>Tef-i? - S c i e ^ C f The University of B r i t i s h Columbia 1956 Main Mall Vancouver, Canada V6T 1Y3 Date DE-6 (.3/81) i i A b s t r a c t H i t h e r t o , most r e l a t i o n a l d a t a b a s e d e s i g n methods a r e b a s e d on f u n c t i o n a l d e p e n d e n c i e s (FDs) and m u l t i v a l u e d d e p e n d e n c i e s (MVDs). F u l l mappings a r e p r o p o s e d a s an a l t e r n a t i v e t o FDs a n d MVDs. A mapping between any two s e t s , a p a r t from b e i n g one-one, many-one, o r many-many, i s e i t h e r t o t a l o r p a r t i a l on t h e s o u r c e and t a r g e t s e t s . An ' i n t o ' mapping on a s e t , e x p r e s s e s t h e f a c t t h a t an e l e m e n t i n t h e s e t may n o t be i n v o l v e d i n t h e mapping. An 'o n t o ' mapping on a s e t i s t o t a l on t h e s e t . A many-many ( i n t o , o n t o ) mapping from s e t A t o s e t B i s w r i t t e n a s i o A m n B . The mappings i n c o r p o r a t e more s e m a n t i c i n f o r m a t i o n i n t o d a t a d e p e ndency s p e c i f i c a t i o n . I t i s shown, i n f o r m a l l y , t h a t t h e f u l l mappings a r e more e x p r e s s i v e t h a n FDs and MVDs. T r a n s f o r m a t i o n r u l e s , t o g e n e r a t e B oyce-Codd n o r m a l form and p r o j e c t i o n - j o i n n o r m a l form schemas f r o m t h e f u l l m a p p i n g s , a r e d e f i n e d . The f u l l m a p p i n g / t r a n s f o r m a t i o n r u l e s p r o v i d e a d i s c i p l i n e f o r m o d e l i n g n o n f u n c t i o n a l r e l a t i o n s h i p s , w i t h i n a s y n t h e t i c a p p r o a c h . T a b l e of C o n t e n t s A b s t r a c t i i L i s t o f F i g u r e s v A c knowledgements v i C h a p t e r I . I n t r o d u c t i o n 1 1 . 1 G e n e r a l A r e a Of R e s e a r c h 1 1.2 P r o p o s e d Work ' 3 1.3 R e a d i n g G u i d e 4 C h a p t e r I I . R e l a t i o n a l D a t a b a s e D e s i g n : Framework 6 2.1 B a s i c C o n c e p t s And D e f i n i t i o n s 6 2.1.1 I n f o r m a t i o n A n a l y s i s 6 2.1.2 C o n c e p t s In R e l a t i o n a l D a t a b a s e T h e o r y 8 2.2 D e s i g n A p p r o a c h e s 14 2.2.1 The S y n t h e t i c A p p r o a c h 14 2.2.2 D e c o m p o s i t i o n 17 2.2.3 D e c o m p o s i t i o n / C o m p l e t e D a t a R e l a t a b i l i t y 19 2.2.4 E n t i t y - R e l a t i o n s h i p M o d e l / T r a n s f o r m a t i o n R u l e s 23 2.3 A p p r a i s a l Of The D e s i g n A p p r o a c h e s 27 2.3.1 C o n s e q u e n c e s Of U n i v e r s a l R e l a t i o n A s s u m p t i o n 27 2.3.2 D a t a D e p e n d e n c i e s 29 2.3.3 D e c o m p o s i t i o n . V e r s u s The S y n t h e t i c A p p r o a c h ..30 C h a p t e r I I I . The F u l l M apping A p p r o a c h 31 3.1 F u l l Mapping S p e c i f i c a t i o n 31 3.1.1 Mapping T y p e s 31 3.2 D e s i g n Of R e l a t i o n Schemas From F u l l M a p p i n g s 34 3.2.1 T r a n s f o r m a t i o n R u l e s F o r BCNF Schemas 35 i v 3.2.2 T r a n s f o r m a t i o n R u l e s F o r PJ/NF Schemas 37 3.3 B a s i s F o r The T r a n s f o r m a t i o n R u l e s 38 3.3.1 N e c e s s a r y And S u f f i c i e n t C o n d i t i o n F o r A L o s s l e s s J o i n 38 3.3.2 The C a n d i d a t e Keys 40 3.3.3 The BCNF R u l e s : V e r i f i c a t i o n 42 3.3.4 The PJ/NF R u l e s : V e r i f i c a t i o n 45 C h a p t e r I V . E v a l u a t i o n Of The F u l l M apping A p p r o a c h 47 4.1 The N o r m a l i z a t i o n P r o c e s s 47 4.1.1 The Second Normal Form P r o b l e m 47 4.1.2 The T h i r d Normal Form P r o b l e m 50 4.1.3 The Boyce-Codd Normal Form P r o b l e m 50 4.1.4 The F o u r t h Normal Form P r o b l e m 52 4.1.5 The P r o j e c t i o n - j o i n Normal Form P r o b l e m 54 4.2 F u l l M apping V e r s u s FDs And MVDs 55 4.3 D e s i g n Example 58 4.4 F u l l M a pping A p p r o a c h V e r s u s O t h e r D e s i g n Methods .60 4.4.1 S y n t h e t i c Methods And The F u l l M a pping A p p r o a c h 61 4.4.2 D e c o m p o s i t i o n And The F u l l M apping A p p r o a c h ..62 4.4.3 E n t i t y - R e l a t i o n s h i p A p p r o a c h And The F u l l M a pping R u l e s 64 C h a p t e r V. C o n c l u s i o n s 67 5.1 A c h i e v e m e n t s 67 5.2 F u r t h e r R e s e a r c h . 70 B i b l i o g r a p h y 72 A p p e n d i x I • 76 L i s t o f F i g u r e s F i g u r e 1. M o d e l i n g Continuum 2 F i g u r e 2. Schema D e s i g n 3 F i g u r e 3. The D e s i g n P r o c e s s 3 F i g u r e 4. A Mapping D i a g r a m F o r F u n c t i o n a l D e p e n d e n c i e s ..10 F i g u r e 5. Sample R e l a t i o n F o r STOCK 11 F i g u r e 6. Mapping D i a g r a m Of The MVD In STOCK R e l a t i o n ...11 F i g u r e 7. Sample R e l a t i o n F o r S P J 13 F i g u r e 8. Mapping D i a g r a m F o r The JD In S P J 13 F i g u r e 9. Summary Of E x i s t i n g D e s i g n A p p r o a c h e s 14 F i g u r e 10. One-one Mapping D i a g r a m 32 F i g u r e 11. Many-one Mapping D i a g r a m 32 F i g u r e 12. Many-many Mapping D i a g r a m 33 F i g u r e 13. The FDs In R e l a t i o n Schema FIRST 48 F i g u r e 14. The FDs In R e l a t i o n Schema SCHOOL 51 F i g u r e 15. Sample R e l a t i o n F o r Schema SCHOOL 51 F i g u r e 16. Sample R e l a t i o n F o r CTX .53 v i A cknowledgements I am g r a t e f u l t o my s u p e r v i s o r P a u l G i l m o r e f o r t h e i n i t i a l i d e a s t h a t l e a d t o t h i s t h e s i s . H i s c a r e f u l r e v i e w o f t h e r e s e a r c h r e p o r t s was o f g r e a t v a l u e . I s i n c e r e l y t h ank Bob G o l d s t e i n f o r h i s t h o r o u g h r e v i e w of t h e t h e s i s . I a l s o w i s h t o t h a n k J u l i e Shamper f o r r e a d i n g t h e f i n a l document. 1 CHAPTER I_ . I n t r o d u c t i o n T h i s t h e s i s d e a l s m a i n l y w i t h a u t o m a t i c d e s i g n o f r e l a t i o n a l d a t a b a s e schemas. Most d e s i g n a p p r o a c h e s h i t h e r t o have been b a s e d on t h e f u n c t i o n a l and m u l t i v a l u e d , d e p e n d e n c i e s . The d e p e n d e n c i e s s e r v e as a means o f e x p r e s s i n g r e l a t i o n s h i p s and c o n s t r a i n t s t o be o b s e r v e d w i t h i n t h e d a t a b a s e . A new method f o r s p e c i f y i n g c o n s t r a i n t s and r e l a t i o n s h i p s , f r o m w h i c h r e l a t i o n schemas can be d e r i v e d , i s p r o p o s e d . 1.1 G e n e r a l A r e a of R e s e a r c h D a t a b a s e d e s i g n c a n be s u b d i v i d e d i n t o t h r e e major l e v e l s . 1 1. ) t h e c h o i c e of d a t a b a s e m o d e l s . 2. ) t h e d e s i g n of l o g i c a l d a t a b a s e schemas. 3. ) t h e p h y s i c a l d a t a b a s e d e s i g n and l o a d i n g . The l o g i c a l d e s i g n c a n be b r o k e n down f u r t h e r i n t o 1. ) R e q u i r e m e n t a n a l y s i s and c o n c e p t u a l schema d e s i g n . 2. ) D a t a m o d e l - s p e c i f i c schema d e s i g n . D a t a b a s e M o d e l s p r o v i d e ways i n w h i c h d a t a a r e a r r a n g e d and m a n i p u l a t e d . 1(Haseman and So 1977) i d e n t i f i e d t h e s e t h r e e l e v e l s . i ) c h o i c e of d a t a b a s e m o d e l s . i i ) t h e d e s i g n of d a t a b a s e schemas. i i i ) t h e l o a d i n g of t h e p h y s i c a l d a t a b a s e . 2 (Haseman and So 1977) i d e n t i f i e d t h r e e d i s t i n c t l e v e l s of d a t a b a s e r e f l e c t i o n o f t h e r e a l i t y . The r e a l w o r l d , t h e c o n c e p t u a l ( o r i n f o r m a t i o n ) model and t h e d a t a model l e v e l s . We c a n c o n c e i v e of a m o d e l i n g c o n t i n u u m as i n f i g u r e 1. R e a l C o n c e p t u a l D a t a P h y s i c a l w o r l d model model s t o r a g e c o n c e p t s l e v e l l e v e l l e v e l I : 1 1 r F i g u r e 1. M o d e l i n g C o n t i n u u m 2 O t h e r v i e w s .on l e v e l s o f a b s t r a c t i o n i n d a t a b a s e d e s i g n have been p r e s e n t e d ( U l l m a n 1980 a n d D a t e 1980). W h i l e t h e d i s t i n c t i o n between c o n c e p t u a l model and d a t a model i s e x p l i c i t i n (Haseman and So 1977), i t i s not t h e c a s e i n ( U l l m a n 1980 and Date 1980). T h i s t h e s i s d e a l s w i t h t h e d e s i g n o f r e l a t i o n a l d a t a b a s e schemas. F o l l o w i n g t h e i n t r o d u c t i o n o f t h e R e l a t i o n a l D a t a Model (Codd 1970), a l o t of work has been done on d e s i g n and a n a l y s i s of r e l a t i o n a l d a t a b a s e s . F o r some t i m e , two c o m p e t i n g a p p r o a c h e s have been t h e t h i r d n o r m a l form d e c o m p o s i t i o n (Codd 1971) and t h e s y n t h e s i s o f B e r n s t e i n and o t h e r s ( B e r n s t e i n e t a l 1975 and B e r n s t e i n 1976). In ( F a g i n 1977a), t h e f o u r t h n o r m a l form d e c o m p o s i t i o n was i n t r o d u c e d . I t t a k e s as i n p u t a t t r i b u t e s and s e m a n t i c i n f o r m a t i o n i n t h e form o f f u n c t i o n a l and m u l t i v a l u e d d e p e n d e n c i e s . R e c e n t l y , a number o f improvements and o t h e r a p p r o a c h e s have been p r o p o s e d ( L i n g e t a l 1981, Wong e t a l 1980 and Z a n i o l o & M e l k a n o f f 1981). 2 I t o r i g i n a l l y a p p e a r e d i n (Haseman and So 1977) 3 1.2 P r o p o s e d work On t h e m o d e l i n g c o n t i n u u m ( f i g u r e 1 ) , our d e s i g n c o n s i d e r a t i o n s l i e between t h e c o n c e p t u a l model and t h e p h y s i c a l d a t a b a s e . The i n p u t t o t h e d e s i g n p r o c e s s c o n s i s t s o f S e m a n t i c and Use i n f o r m a t i o n , as d e p i c t e d i n f i g u r e 2. Semant i c i n f o r m a t i o n A p p l i c a t i o n a n a l y s i s Use i n f o r m a t i o n D e s i g n D a t a b a s e p r o c e s s schema F i g u r e 2. Schema D e s i g n 3 Our d e s i g n p r o c e s s c o n s i s t s of mapping s p e c i f i c a t i o n s and a method f o r t r a n s f o r m i n g t h e mappings i n t o r e l a t i o n schemas. A mapping from one s e t t o a n o t h e r r e p r e s e n t s a r e l a t i o n s h i p between any two s e t s of o b j e c t s (be i t p h y s i c a l , a b s t r a c t o r s t r u c t u r e d ) . W h i l e t h e s e m a n t i c i n f o r m a t i o n m o d e ls t h e " r e a l w o r l d " , t h e use i n f o r m a t i o n i s p r o v i d e d m a i n l y t o g u i d e i n t h e mapping s p e c i f i c a t i o n . The s e m a n t i c i n f o r m a t i o n r e v e a l s t h e i n h e r e n t f a c t s a b o u t t h e a p p l i c a t i o n e n v i r o n m e n t . The use i n f o r m a t i o n , on t h e o t h e r hand, p r o v i d e s t h e use c h a r a c t e r i s t i c s of t h e d a t a b a s e , s u c h as q u e r y t y p e s and s t a t i s t i c s . The use i n f o r m a t i o n h e l p s d e t e r m i n e some r e l a t i o n s h i p s t h a t may n o t be e x p l i c i t f r o m t h e i n h e r e n t f a c t s . In t h i s t h e s i s , b o t h c o n s t r a i n t s p e c i f i c a t i o n and schema d e s i g n a r e e m p h a s i z e d . 3 T h e d i a g r a m o r i g i n a l l y a p p e a r e d i n (Haseman and So 1977) 4 A p p l y T r a n s f o r m a t i o n R u l e s F i g u r e 3. The D e s i g n P r o c e s s C o n s t r a i n t s and r e l a t i o n s h i p s a r e r e p r e s e n t e d by mapping t y p e s as an a l t e r n a t i v e t o f u n c t i o n a l and m u l t i v a l u e d d e p e n d e n c i e s . S u b s e q u e n t l y , t h e t r a n s f o r m a t i o n r u l e s w i l l be p r o p o s e d . The r u l e s g e n e r a t e r e l a t i o n schemas t h a t c o n f o r m t o t h e Boyce-Codd and P r o j e c t i o n - J o i n n o r m a l f o r m s . 1.3 R e a d i n g G u i d e The t h e s i s i s o r g a n i z e d as f o l l o w s . In c h a p t e r ( 2 ) , a r e v i e w of some c o n c e p t s i n r e l a t i o n a l d a t a b a s e t h e o r y , c o n c e p t s i n i n f o r m a t i o n a n a l y s i s , and a summary o f r e l a t i o n a l d a t a b a s e d e s i g n methods a r e p r e s e n t e d . The F u l l mapping a p p r o a c h t o r e l a t i o n a l d a t a b a s e d e s i g n i s p r e s e n t e d i n c h a p t e r ( 3 ) . F u l l m a p p ings, as a means o f s p e c i f y i n g r e l a t i o n s h i p s between d a t a i t e m s , a r e p r o p o s e d as an a l t e r n a t i v e t o f u n c t i o n a l and m u l t i v a l u e d d e p e n d e n c i e s . T r a n s f o r m a t i o n r u l e s a r e f o r m u l a t e d t o g e n e r a t e r e l a t i o n schemas from f u l l mapping s p e c i f i c a t i o n s . C h a p t e r (4) i s an e v a l u a t i o n o f our t h e o r y . The e f f e c t of f u l l m appings on t h e n o r m a l i z a t i o n p r o c e s s i s e x a m i n e d . The f u l l mapping a p p r o a c h i s a l s o compared w i t h o t h e r d e s i g n methods. A d e s i g n example i s p r e s e n t e d i n s e c t i o n 4.3. C h a p t e r (5) c o n c l u d e s t h e t h e s i s . I t summarizes our work, as Mapping spec i f i c a t i o n 5 w e l l a s h i g h l i g h t s a r e a s t h a t need f u r t h e r r e s e a r c h . A r e a d e r who i s c o n v e r s a n t w i t h t h e r e l a t i o n a l d a t a b a s e t h e o r y may s k i p most p a r t s of c h a p t e r ( 2 ) . S e c t i o n 2.1 i s , however, r e q u i r e d t o u n d e r s t a n d t h e n o t a t i o n s and some c o n c e p t s u s e d i n t h e t h e s i s . A r e a d e r w i t h o u t much knowledge of the n o r m a l i z a t i o n t h e o r y may f i n d i t u s e f u l t o r e a d t h e s e c t i o n on " N o r m a l i z a t i o n P r o c e s s " ( s e c t i o n 4.2) b e f o r e c h a p t e r s (2) and (3) . 6 CHAPTER 11 . R e l a t i o n a l D a t a b a s e D e s i g n : Framework The g e n e r a l framework f o r r e l a t i o n a l d a t a b a s e d e s i g n i s p r e s e n t e d i n t h i s c h a p t e r . In s e c t i o n 2.1 a r e v i e w o f some b a s i c c o n c e p t s i n r e l a t i o n a l d a t a b a s e t h e o r y and i n f o r m a t i o n a n a l y s i s a r e p r e s e n t e d , i n s e c t i o n 2.2, a summary o f c u r r e n t d e s i g n m e t h o d o l o g i e s ; and i n s e c t i o n 2.3 a p p r a i s a l s of t h e d e s i g n a p p r o a c h e s a r e g i v e n . 2.1 B a s i c C o n c e p t s and D e f i n i t i o n s T h e r e a r e v a r i o u s n o t a t i o n s u s e d i n r e l a t i o n a l d a t a b a s e l i t e r a t u r e . In t h i s t h e s i s t h e f o l l o w i n g w i l l be u s e d . L e t t e r s A,B,C r... d e n o t e s i n g l e a t t r i b u t e s , and W,X,Y,... s e t s of a t t r i b u t e s . I f X and Y a r e s e t s of a t t r i b u t e s ( n o t n e c e s s a r i l y d i s j o i n t ) , t h e u n i o n o f t h e two s e t s i s w r i t t e n as XY. The p r o j e c t i o n of a r e l a t i o n R o v e r t h e s e t of a t t r i b u t e s X i s r e p r e s e n t e d by R [ X ] , and t h e n a t u r a l j o i n of two r e l a t i o n s R(X) and R(Y) i s w r i t t e n as R ( X ) . R ( Y ) . E x p l a n a t o r y d e f i n i t i o n s of r e l e v a n t t e r m s and c o n c e p t s a r e p r e s e n t e d i n t h e f o l l o w i n g s u b s e c t i o n s . Keywords a r e u n d e r l i n e d . 2.1.1 I n f o r m a t i o n A n a l y s i s I n f o r m a t i o n a n a l y s i s s e r v e s as a p r e l u d e t o t h e d e s i g n p r o c e s s . However, most works on r e l a t i o n a l d a t a b a s e d e s i g n o v e r l o o k t h i s s t e p . The main method of a n a l y s i s i n t h i s t h e s i s i s t o i d e n t i f y t h e s e m a n t i c e l e m e n t s s i m i l a r t o t h o s e i n Chen's E n t i t y - R e l a t i o n s h i p model (Chen 1976). The e n t i t y s e t s t h a t a r e o f i n t e r e s t i n t h e a p p l i c a t i o n e n v i r o n m e n t a r e i d e n t i f i e d . An 7 e n t i t y s e t i s a c o l l e c t i o n of o b j e c t s o f t h e same " t y p e " t h a t we w i s h t o r e c o r d i n f o r m a t i o n a b o u t i n t h e d a t a b a s e . An o b j e c t c a n be p h y s i c a l o r a b s t r a c t . The t y p e o f an o b j e c t i s not a b s o l u t e , i n t h e s e n s e t h a t i t can b e l o n g t o more t h a n one s e t a t d i f f e r e n t l e v e l s o f a b s t r a c t i o n . In ( S m i t h and S m i t h 1977), t h e n o t i o n o f g e n e r a l i z a t i o n was i n t r o d u c e d a s an a b s t r a c t i o n w h i c h e n a b l e s a c l a s s o f o b j e c t s t o be t h o u g h t of g e n e r i c a l l y a s a s i n g l e named o b j e c t . A g e n e r i c h i e r a r c h y c an a l s o be d e f i n e d on a s e t , s u c h t h a t e a c h l e v e l c o n s i s t s of o b j e c t s t h a t s h a r e common p r o p e r t i e s . F o r example, an EMPLOYEE s e t has common p r o p e r t i e s s u c h as (employee-number, name, age and e m p l o y e e - t y p e ) . I t s low e r g e n e r i c s e t s c o u l d be TRUCKERS, SECRETARIES and ENGINEERS, e a c h o f w h i c h we w i s h t o r e c o r d d i f f e r e n t a d d i t i o n a l i n f o r m a t i o n a b o u t . (Shipman 1980) a l s o e x p r e s s e s s i m i l a r i d e a s a b o u t e n t i t y t y p e s . S u b t y p e s a r e d e f i n e d b a s e d on t h e r o l e s w h i c h c e r t a i n members o f a p a r e n t e n t i t y s e t p e r f o r m . O t h e r s e m a n t i c e l e m e n t s t h a t a r e o f i n t e r e s t a r e t h e p r o p e r t y - v a l u e s e t s and t h e s t r u c t u r e d e n t i t y s e t s . A s t r u c t u r e d e n t i t y s e t r e p r e s e n t s an a s s o c i a t i o n among two o r more o t h e r e n t i t y s e t s . We r e f e r t o t h e r e l a t i o n s h i p s among e n t i t y s e t s a s e n t i t y s e t a s s o c i a t i o n s . A s e t d e s c r i b i n g an a s s o c i a t i o n among n e n t i t y s e t s i s a s u b s e t of t h e c a r t e s i a n p r o d u c t o f t h e s e t s . T h i s i s an a b s t r a c t i o n i n w h i c h a s s o c i a t i o n s among e n t i t y s e t s a r e r e g a r d e d as h i g h e r l e v e l o b j e c t s , w h i c h c a n have p r o p e r t i e s d e f i n e d on them ( S m i t h and S m i t h 1977). P r o p e r t i e s o f e n t i t y s e t s a r e e x p r e s s e d as s e t s o f a t t r i b u t e - v a l u e p a i r s . The c o l l e c t i o n of v a l u e s ( t h a t a r e s e m a n t i c a l l y p o s s i b l e ) of an 8 a t t r i b u t e f o r m a p r o p e r t y - v a l u e s e t . Thus an a t t r i b u t e i s e s s e n t i a l l y a f u n c t i o n w h i c h maps an e n t i t y s e t t o a p r o p e r t y - v a l u e s e t . We r e f e r t o s u c h f u n c t i o n s a s v a l u e a s s o c i a t i o n s . Hence, a t d i f f e r e n t l e v e l s of a b s t r a c t i o n , e n t i t y s e t s w i l l be d e f i n e d , named and t h e i r i n t e n s i o n s c l e a r l y s t a t e d . An i n t e n s i o n i s s u p p o s e d t o g i v e t h e meaning of a d e f i n e d s e t o r a s s o c i a t i o n . I t h e l p s t o d i f f e r e n t i a t e between a b s t r a c t i o n l e v e l s and t o r e s o l v e a m b i g u i t i e s t h a t ' m i g h t a r i s e from naming. S e t s i n a g e n e r i c h i e r a r c h y n e c e s s a r i l y have t h e same u n d e r l y i n g domains; t h i s s h o u l d be o b v i o u s from t h e i r i n t e n s i o n s . The s e t s , t h e a s s o c i a t i o n s and t h e s t a t e m e n t o f i n t e n s i o n s , c an be m a i n t a i n e d as a d a t a b a s e d i c t i o n a r y . Such a d i c t i o n a r y w i l l n ot o n l y s e r v e as a b a s i s f o r d a t a b a s e d e s i g n , but c an be u s e d i n q u e r y p r o c e s s i n g . But d i c t i o n a r y i m p l e m e n t a t i o n s i n t h e form of a s u p p o r t i n g s y s t e m w i l l n o t be s u i t a b l e . H i t h e r t o , i n r e l a t i o n a l d a t a b a s e s y s t e m s , t h e r e h as been no p r o v i s i o n f o r s t a t i n g t h e i n t e n s i o n s o f a t t r i b u t e s and th e d e p e n d e n c i e s among them. We s u g g e s t an a p p r o a c h whereby t h e d i c t i o n a r y i n f o r m a t i o n i s m a i n t a i n e d by t h e D a t a b a s e Management System i n u s e , but t h e i d e a i s n o t p u r s u e d f u r t h e r i n t h i s t h e s i s . D a t a b a s e k e r n e l and s e l f - r e f e r e n t i a l d a t a b a s e i s . a wide a r e a o f r e s e a r c h on i t s own. 2.1.2 C o n c e p t s i n R e l a t i o n a l D a t a b a s e T h e o r y A r e l a t i o n on t h e s e t of a t t r i b u t e s {A1,A2,...,An} i s a s u b s e t o f t h e c a r t e s i a n p r o d u c t Dom(A1) x Dom(A2) x...x Dom(An) where D o m ( A i ) ' s a r e t h e r e s p e c t i v e domains of t h e a t t r i b u t e s . The e l e m e n t s o f t h e r e l a t i o n a r e c a l l e d t u p l e s . In a d a t a b a s e , a 9 number o f r e s t r i c t i o n s c a n be p l a c e d on a r e l a t i o n . T h e s e r e s t r i c t i o n s a r e e x p r e s s e d i n a r e l a t i o n schema. A r e l a t i o n schema i s a s e t of a t t r i b u t e s a l o n g w i t h a s e t of c o n t r a i n t s ( C a d i o u 1975 and F a g i n 1981). A r e l a t i o n R i s s a i d t o be a v a l i d i n s t a n c e o f a schema R* i f i t has t h e same a t t r i b u t e s as t h e schema and o b e ys e v e r y c o n s t r a i n t o f t h e schema. A p r o p e r t y h o l d s f o r a r e l a t i o n schema i f i t h o l d s f o r a l l i n s t a n c e s o f i t . A c o n s t r a i n t i n r e l a t i o n schemas can be t h e s p e c i f i c a t i o n of a key f o r t h e r e l a t i o n , a f u n c t i o n a l d e pendency, a m u l t i v a l u e d dependency o r a j o i n d e p e n d e n c y . F o l l o w i n g ( B e r n s t e i n 1976), keys and s u p e r k e y s a r e d e f i n e d as f o l l o w s . L e t R be a r e l a t i o n and l e t X be a s u b s e t of a t t r i b u t e s of R. Then, X i s a key o f R i f e v e r y a t t r i b u t e o f R t h a t i s not i n X i s f u n c t i o n a l l y d e p e n d e n t upon X and no s u b s e t o f X has t h i s p r o p e r t y . The X - v a l u e s of R d e t e r m i n e t u p l e s o f R u n i q u e l y . A s u p e r k e y o f R i s any s e t o f a t t r i b u t e s o f R t h a t c o n t a i n s a key of R. Thus e v e r y key i s a l s o a s u p e r k e y . A r e l a t i o n may have s e v e r a l k e y s , r e f e r r e d t o as c a n d i d a t e k e y s . One of t h e k e y s i s u s u a l l y c h o s e n as t h e p r i m a r y key and by c o n v e n t i o n , u n d e r l i n e d i n t h e r e l a t i o n schema. The s e t of c a n d i d a t e keys of R w i l l be r e p r e s e n t e d by K e y ( R ) . F u n c t i o n a l Dependency e x p r e s s e s a c o n s t r a i n t t h a t h o l d s between two s e t s o f a t t r i b u t e s o f a r e l a t i o n . G i v e n a r e l a t i o n R, a c o m b i n a t i o n of a t t r i b u t e s Y o f R i s f u n c t i o n a l l y d e p e n d e n t on a t t r i b u t e s X o f R i f and o n l y i f e a c h X - v a l u e i n R has a s s o c i a t e d w i t h i t p r e c i s e l y one Y - v a l u e i n R a t any d a t a b a s e i n s t a n c e . The f u n c t i o n a l d e p e n d e n c y o f Y on X, u s u a l l y w r i t t e n as X >Y, c a n be d e p i c t e d by t h e f o l l o w i n g mapping d i a g r a m from 1 0 a s e t o f X - v a l u e e l e m e n t s o f a r e l a t i o n t o t h e s e t of Y - v a l u e e l e m e n t s of t h e r e l a t i o n . An X - v a l u e c a n a p p e a r i n more t h a n one X Y F i g u r e 4. A M apping D i a g r a m f o r F u n c t i o n a l D e p e n d e n c i e s t u p l e , but whenever two t u p l e s a g r e e on t h e i r X - v a l u e s , t h e i r Y- v a l u e s must a l s o be t h e same. L e t X and Y be c o m b i n a t i o n s o f a t t r i b u t e s of R. Y i s s a i d t o be f u l l y f u n c t i o n a l l y d e p e n d e n t on X i f i t i s f u n c t i o n a l l y d e p e n d e n t on X and n o t f u n c t i o n a l l y d e p e n d e n t on any s t r i c t s u b s e t o f X. The above d e f i n i t i o n o f f u n c t i o n a l l y d e p e ndency due t o ( D a t e 1980), i s v a l i d w i t h i n t h e c o n t e x t of a r e l a t i o n . ( B e r n s t e i n 1976) d e f i n e s f u n c t i o n a l d e p e ndency between two a t t r i b u t e s A and B a s a t i m e - v a r y i n g f u n c t i o n f:Dom(A) >Dom(B). I f f i s t h o u g h t of as a s e t o f o r d e r e d p a i r s { ( a , b ) : a e Dom(A) and b e Dom(B)}, t h e n a t any p o i n t i n t i m e , f o r a g i v e n v a l u e a e Dom(A) t h e r e w i l l be a t most one v a l u e b e Dom(B). F u n c t i o n a l d e p endency and m u l t i v a l u e d d e p e n d e n c y a r e u s u a l l y a b b r e v i a t e d as FD and MVD r e s p e c t i v e l y . M u l t i v a l u e d Dependency i s d e f i n e d as f o l l o w s . G i v e n a r e l a t i o n schema R ( X Y Z ) , t h e m u l t i v a l u e d d e p e n d e n c y of Y on X, u s u a l l y w r i t t e n as X-->-->Y, h o l d s i n R i f and o n l y i f t h e s e t of Y - v a l u e s m a t c h i n g a g i v e n ( X - v a l u e , Z - v a l u e ) p a i r i n R depends on t h e X - v a l u e and i s i n d e p e n d e n t of t h e Z - v a l u e . MVD can be 11 i l l u s t r a t e d by t h e f o l l o w i n g r e l a t i o n w h i c h o r i g i n a l l y a p p e a r e d i n ( Z a n i o l o and M e l k a n o f f 1981). SUPPLIER ITEM COLOR WOODMAN TABLE BROWN WOODMAN SOFA BLACK HOUSEMAN CARPET RED HOUSEMAN CARPET YELLOW HOUSEMAN CARPET BLUE HOUSEMAN SOFA BLACK BLAND CARPET RED BLAND CARPET YELLOW BLAND CARPET BLUE F i g u r e 5. Sample R e l a t i o n f o r STOCK( SUPPLIER, ITEM, COLOR ) In t h e sample r e l a t i o n STOCK, t h e r e i s a m u l t i v a l u e d d e p e ndency of COLOR on ITEM. E v e r y p a i r o f (SUPPLIER,ITEM) where i t e m i s CARPET i m p l i e s t h a t t h e p a r t i c u l a r s u p p l i e r s u p p l i e s t h e t h r e e c o l o r s (RED,YELLOW,BLUE). The dependency does n ot h o l d i f i t i s p o s s i b l e t o have a sample d a t a b a s e a s i n f i g u r e 5, b u t w i t h t h e l a s t t u p l e d e l e t e d . In s u c h a c a s e , BLAND a s u p p l i e r o f c a r p e t w i l l s u p p l y o n l y RED and YELLOW. From t h e mapping d i a g r a m s i n f i g u r e 6, i t i s c l e a r t h a t m u l t i v a l u e d dependency o f COLOR on ITEM i s a n o t h e r way of s t a t i n g t h e f a c t t h a t t h e r e l a t i o n schema STOCK i s e q u a l t o t h e j o i n . o f i t s p r o j e c t i o n s on (SUPPLIER,ITEM) and (ITEM,COLOR). A MVD X-->-->Y i n a r e l a t i o n R(W) i s s a i d t o 1 2 ITEM COLOR TABLE . BROWN SOFA . '. . BLACK CARPET ' r e d > v YELLOW BLUE F i g u r e 6. Mapping D i a g r a m o f t h e MVD i n STOCK r e l a t i o n be t r i v i a l i f W = X u Y, t h a t i s when X and Y a r e d i c h o t o m i e s o f R o r when Y i s a s u b s e t o f X. A J o i n Dependency (JD) i s a l s o a t y p e o f c o n s t r a i n t t h a t c a n be s p e c i f i e d i n a r e l a t i o n . A r e l a t i o n R(W) s a t i s f i e s t h e JD*(X,Y,...,Z) i f and o n l y i f i t i s t h e j o i n of i t s p r o j e c t i o n s on X,Y,...,Z, where X,Y,...,Z a r e s u b s e t s o f a t t r i b u t e s of R and W = X u Y u...u Z. From t h e d e f i n i t i o n of MVD, i t can be o b s e r v e d t h a t j o i n d ependency i s a g e n e r a l i z a t i o n o f m u l t i v a l u e d d e p e n d e n c y . The r e l a t i o n i n f i g u r e 7 i l l u s t r a t e s j o i n d e p e n d e n c y . The o r i g i n a l v e r s i o n a p p e a r e d i n (Date 1980). SUPPLIER ITEM WOODMAN HOUSEMAN BLAND TABLE SOFA CARPET 13 s P J S1 P i J2 S1 p2 J1 s 2 P1 J1 S1 P1 J1 F i g u r e 7. Sample R e l a t i o n f o r S P J ( S , P , J ) The J D * ( S P , P J , J S ) h o l d s i n r e l a t i o n S P J . I f t h e p a i r s ( S 1 , P 1 ) , (P1,J1) and ( J 1 , S 1 ) a p p e a r i n S P r P J and JS columns r e s p e c t i v e l y , t h e n t h e t u p l e ( S 1 , P 1 , J 1 ) must a p p e a r i n S P J . Thus a t an i n s t a n c e when the r e l a t i o n c o n t a i n s t h e f i r s t two t u p l e s , i f a t u p l e (S2,P1,J1) i s i n s e r t e d t h e n (S1,P1,J1) must a l s o be i n s e r t e d f o r t h e JD c o n s t r a i n t t o be s a t i s f i e d . The JD i s i l l u s t r a t e d by t h e mapping d i a g r a m i n f i g u r e 8. S P P J J S F i g u r e 8. Mapping D i a g r a m f o r t h e JD i n SPJ r e l a t i o n The r e l a t i o n a l d a t a b a s e n o r m a l forms (Codd 1972, F a g i n 1977a and o t h e r s ) a r e aimed m a i n l y a t e l i m i n a t i n g c e r t a i n a n o m a l i e s i n r e l a t i o n s . The o u t p u t s from t h e d e s i g n 1 4 m e t h o d o l o g i e s , p r e s e n t e d i n t h e n e x t s e c t i o n , a r e e v a l u a t e d i n t erms o f t h e n o r m a l forms t o w h i c h t h e y c o n f o r m . The n o r m a l forms a r e d e f i n e d i n s e c t i o n 4.1. 2.2 D e s i g n A p p r o a c h e s A l o t of work has been done on d e s i g n m e t h o d o l o g i e s f o r r e l a t i o n a l d a t a b a s e s . Two o f t h e e a r l i e r c o m p e t i n g a p p r o a c h e s a r e t h e t h i r d n o r m a l form d e c o m p o s i t i o n of (Codd 1971) and t h e s y n t h e t i c a p p r o a c h o f B e r n s t e i n and o t h e r s ( B e r n s t e i n e t a l 1975 and B e r n s t e i n 1976). T h e r e i s a l s o t h e f o u r t h n ormal form d e c o m p o s i t i o n o f ( F a g i n 1977a). R e c e n t l y , ( Z a n i o l o and M e l k a n o f f 1981) p r o p o s e d a d e c o m p o s i t i o n a p p r o a c h not b a s e d on e l i m i n a t i n g a n o m a l i e s , b u t on c o m p l e t e d a t a r e l a t a b i l i t y . O t h e r works on d e c o m p o s i t i o n i n c l u d e t h o s e of ( D e l o b e l and C a s e y 1973) and ( D e l o b e l and L e o n a r d 1974). A n o t h e r a p p r o a c h i s b a s e d on t h e E n t i t y - R e l a t i o n s h i p m odel. R u l e s a r e p r e s e n t e d i n (Wong and K a t z 1980) t o t r a n s f o r m a v e r s i o n of t h e E n t i t y - R e l a t i o n s h i p model i n t o r e l a t i o n a l d a t a b a s e schemas. A summary o f t h e d e s i g n a p p r o a c h e s i s g i v e n i n f i g u r e 9. In t h e f o l l o w i n g p a r a g r a p h s , a r e v i e w of t h e l i t e r a t u r e on e a c h of t h e m e t h o d o l o g i e s i s p r e s e n t e d . 2.2.1 The S y n t h e t i c A p p r o a c h A major work on t h e s y n t h e t i c a p p r o a c h t o r e l a t i o n a l d a t a b a s e d e s i g n i s t h a t of ( B e r n s t e i n 1976). T h i r d n o r m a l form (3NF) r e l a t i o n schemas a r e s y n t h e s i z e d from a g i v e n s e t o f a t t r i b u t e s and t h e f u n c t i o n a l d e p e n d e n c i e s among them. S i n c e an i n i t i a l r e l a t i o n i s not assumed, t h e f u n c t i o n a l d e p e n d e n c i e s a r e 1 5 A p p r o a c h " I n p u t s O u t p u t s P r o p o n e n t s 5 S y n t h e s i s FDs, A t t r i b u t e s 3NF r e l a t i o n schemas. B e r n s t e i n , Swenson, T s i c h r i t z i s D e c o m p o s i t i o n FDs MVDs. 3NF 4NF. C a s e y , C o d d , D e l o b e l , F a g i n . D e c o m p o s i t i o n / C o m p l e t e D a t a R e l a t a b i l i t y E l e m e n t a r y FDs, mu l t i p l e e l e m e n t a r y MVDs 3NF. » M e l k a n o f f Z a n i o l o E n t i t y - R e l a t i o n s h i p / T r a n s f o r m a t i o n R u l e s E n t i t i e s , R e l a t i o n s h i p s and t h e i r P r o p e r t i e s 4NF 4NF. K a t z , Wong. F i g u r e 9. Summary o f E x i s t i n g D e s i g n A p p r o a c h e s d e f i n e d a s t i m e - v a r y i n g f u n c t i o n s f r o m one domain t o a n o t h e r as s t a t e d i n s e c t i o n 2.1. The s y n t h e s i s a l g o r i t h m p r o d u c e s a n o n r e d u n d a n t c o v e r i n g of th e f u n c t i o n a l d e p e n d e n c i e s as f o l l o w s . L e t G be t h e s e t o f FDs. The c l o s u r e o f G, d e n o t e d G + , i s t h e s m a l l e s t s u p e r s e t of G t h a t i s c l o s e d under t h e f o l l o w i n g r u l e s . 1. ) R e f l e x i v i t y (X >X). 2. ) A u g m e n t a t i o n ( i f X >Z t h e n X u Y >Z) 3. ) P s e u d o t r a n s i t i v i t y ( i f X >Y and Y u Z >W t h e n X u Z >W) . An FD g e G i s s a i d t o be r e d u n d a n t i n G i f G + = ( G - { g } ) + . H i s "Oth e r a p p r o a c h e s a r e not r a d i c a l l y d i f f e r e n t and s h o u l d f i t i n t o one o r more o f t h e c a t e g o r i e s . 5 T h e l i s t s of t h e p r o p o n e n t s a r e not e x h a u s t i v e . 16 a n o n r e d u n d a n t c o v e r i n g of a g i v e n s e t of FDs G i f G* = H + and H c o n t a i n s no r e d u n d a n t FDs. The a l g o r i t h m p r o c e e d s by r e m o v i n g e x t r a n e o u s a t t r i b u t e s f r o m t h e l e f t s i d e s o f FDs i n t h e n o n r e d u n d a n t c o v e r i n g . An a t t r i b u t e X i i s e x t r a n e o u s i n an FD g e G, g:X1,...,Xp >Y, i f X I , . . . , X i - 1 , X i + 1 , ,Xp >Y. Removing e x t r a n e o u s a t t r i b u t e s h e l p s t o a v o i d p a r t i a l d e p e n d e n c i e s and s u p e r k e y s t h a t a r e not k e y s . I f two r e l a t i o n s have keys t h a t a r e f u n c t i o n a l l y d e p e n d e n t on e a c h o t h e r , t h a t i s i f t h e y a r e e q u i v a l e n t , t h e n t h e two r e l a t i o n s c a n be merged t o g e t h e r . The s y n t h e s i s p r o c e d u r e a c c o m p l i s h e s t h i s f a c t by m e r g i n g t o g e t h e r g r o u p s of FDs i f t h e i r l e f t s i d e s a r e f u n c t i o n a l l y e q u i v a l e n t . The n o n f u n c t i o n a l d e p e ndency between any two s e t s o f a t t r i b u t e s X and Y i s r e p r e s e n t e d by XY >0, where 0 i s a dummy a t t r i b u t e . The a l g o r i t h m e s s e n t i a l l y c r e a t e s one r e l a t i o n p e r n o n f u n c t i o n a l r e l a t i o n s h i p . T h i r d n o r m a l form r e l a t i o n schemas a r e p r o d u c e d . The u n i q u e n e s s of FDs between any two s e t s of a t t r i b u t e s have t o be assumed b e c a u s e t h e t r e a t m e n t o f t h e FDs i s s t r i c t l y s y n t a c t i c . I f t h e r e a r e two FDs on t h e same s e t s o f a t t r i b u t e s , t h e n t h e y a r e t h e same FD. Some p r o b l e m s r e s u l t i n g from t h e u n i q u e n e s s a s s u m p t i o n a r e i l l u s t r a t e d by t h e f o l l o w i n g example. L e t f1:DEPT# >MGR# and f2:MGR#,FLOOR >NO~OF-EMP be FDs s u c h t h a t f1 d e t e r m i n e s t h e manager of e a c h d e p a r t m e n t and f2 d e t e r m i n e s t h e number o f employees w o r k i n g f o r a p a r t i c u l a r manager on a f l o o r . By a p p l y i n g t h e p s e u d o t r a n s i t i v i t y r u l e t o f i and f 2 , we o b t a i n f3:DEPT#,FLOOR >NO-OF-EMP, w h i c h d e t e r m i n e s t h e number o f e m p l o y e e s o f t h e manager on a p a r t i c u l a r f l o o r . I f we have a n o t h e r FD g:DEPT#,FLOOR >NO-OF- 1 7 EMP w h i c h d e t e r m i n e s t h e number o f employees o f a d e p a r t m e n t on a p a r t i c u l a r f l o o r , t h e n t h e FDs g and f 3 a r e not t h e same i f a manager c a n manage two d e p a r t m e n t s . The a t t r i b u t e NO-OF-EMP of g may have t o be renamed t o make i t d i s t i n c t f r o m t h e c o m p o s i t i o n of f1 and f 2 . The p r o b l e m s a s s o c i a t e d w i t h t h e u n i q u e n e s s a s s u m p t i o n i s due m a i n l y t o t h e l a c k of e x p r e s s i v e n e s s o f FDs. I t i s n o t p o s s i b l e t o t e l l w hether a manager can manage more t h a n one d e p a r t m e n t from, f1:DEPT# >MGR#. T h e s e p r o b l e m s a r e f u r t h e r d i s c u s s e d i n s e c t i o n 2.3. 2.2.2 D e c o m p o s i t i o n Codd's t h i r d n o r m a l form d e c o m p o s i t i o n t a k e s as i n p u t an i n i t i a l s e t o f r e l a t i o n s a l o n g w i t h a s e t o f FDs (Codd 1971). U s i n g t h e d e p endency i n f o r m a t i o n , t h e i n i t i a l s e t of r e l a t i o n schemas i s c o n v e r t e d i n t o 3NF schemas. F o r a r e l a t i o n t o be i n 3NF, t h e r e must n o t be a t r a n s i t i v e d e p e ndency or p a r t i a l d e p e n dency on a key. T h e r e f o r e d e c o m p o s i t i o n i s c a r r i e d out t o e l i m i n a t e t r a n s i t i v e , a n d / o r p a r t i a l d e p e n d e n c y . In a r e l a t i o n schema R(A,B,C,D), i f A >C h o l d s t h e n R i s decomposed i n t o R,(A,C) and R 2(A,B,D) t o e l i m i n a t e t h e p a r t i a l d ependency o f C on t h e key AB. In c a s e s where t h e r e e x i s t t r a n s i t i v e d e p e n d e n c i e s ( s a y A >B and B >C) i n R(A,B,C,D), R i s decomposed i n t o R,(B,C) and R 2 ( A , B , D ) . When b o t h t y p e s of d e p e n d e n c i e s o c c u r i n R, a c h o i c e has t o be made a s t o w h i c h d e c o m p o s i t i o n s h o u l d t a k e p l a c e . In ( F a g i n 1977a), m u l t i v a l u e d d e p endency and f o u r t h n o r m a l form d e c o m p o s i t i o n were p r o p o s e d . The 4NF d e c o m p o s i t i o n i s a g e n e r a l i z a t i o n of t h e 3NF d e c o m p o s i t i o n . The MVD models 18 n o n f u n c t i o n a l r e l a t i o n s h i p s between a t t r i b u t e s . The d e s i g n p r o c e s s t a k e s a s e t o f a t t r i b u t e s a l o n g w i t h a s e t o f FDs and MVDs a s i n p u t . A s i n g l e r e l a t i o n schema c o n s i s t i n g of a l l t h e a t t r i b u t e s i s fo r m e d . The b a s i c r u l e i s t h a t i f a FD X >Y o r a MVD X — > — > Y h o l d s f o r a r e l a t i o n R ( X , Y , Z ) , where Z i s t h e s e t of a t t r i b u t e s n ot i n X or Y, t h e n t h e r e l a t i o n c a n be decomposed i n t o R,(X,Y) and R 2 ( X , Z ) w i t h o u t l o s s of i n f o r m a t i o n . In g e n e r a l , X i — > — > Y 1 | . . . | Y k h o l d s f o r R ( X , Y Y k ) i f and o n l y i f R i s t h e j o i n of i t s p r o j e c t i o n s R , ( X , Y , ) , R 2 ( X , Y 2 p r o v i d e s a n e c e s s a r y and s u f f i c i e n t c o n d i t i o n f o r a r e l a t i o n t o be dec o m p o s a b l e i n t o some p r o j e c t i o n s w i t h o u t l o s s of i n f o r m a t i o n . The d e c o m p o s i t i o n p r o c e s s p r o c e e d s u n t i l no r e l a t i o n schema has n o n t r i v i a l MVDs t h a t a r e n o t f u n c t i o n a l d e p e n d e n c i e s . T h a t i s , e v e r y n o n t r i v i a l MVD i s i m p l i e d by a key. The 4NF d e c o m p o s i t i o n a p p r o a c h p r o v i d e s a d i s c i p l i n e f o r h a n d l i n g p r o b l e m s r e l a t e d t o B e r n s t e i n ' s u n i q u e n e s s a s s u m p t i o n f o r f u n c t i o n a l d e p e n d e n c i e s . New a t t r i b u t e s c a n be i n t r o d u c e d i n th e i n i t i a l r e l a t i o n schema and renamed a f t e r a d e c o m p o s i t i o n . The u n i q u e n e s s a s s u m p t i o n need o n l y h o l d w i t h i n e a c h r e l a t i o n i n t h e n e t d e s i g n . However, t h e 4NF d e c o m p o s i t i o n i s a l s o f a c e d w i t h a number of p r o b l e m s . The MVDs a r e p a r t o f t h e i n p u t t o t h e d e s i g n p r o c e s s , but t h e y a r e not e a s i l y r e c o g n i z a b l e w i t h i n a r e l a t i o n . The o r d e r i n w h i c h d e c o m p o s i t i o n i s c a r r i e d o ut a l s o a f f e c t s t h e d e s i g n . But c h o i c e of d e c o m p o s i t i o n i s o n l y b a s e d on h e u r i s t i c s . I t i s not c l e a r how t o r e l a t e o r d e r o f d e c o m p o s i t i o n t o o p t i m a l d e s i g n or what c o n s t i t u t e s an o p t i m a l design.. ( R i s s a n e n 1977) p r o p o s e d t h e n o t i o n o f i n d e p e n d e n t components of r e l a t i o n s i n d e c i d i n g when a r e p r e s e n t a t i o n i s "good". 19 P r o j e c t i o n s of a r e l a t i o n R p r o v i d e a good r e p r e s e n t a t i o n o f R i f a l l t h e d e p e n d e n c i e s i n R a r e " n i c e l y " embedded i n t h e p r o j e c t i o n s . ( Z a n i o l o and M e l k a n o f f 1981) a l s o d e a l s w i t h t h e s e p r o b l e m s by d e c o m p o s i n g f o r c o m p l e t e d a t a r e l a t a b i l i t y . 2.2.3 D e c o m p o s i t i o n / C o m p l e t e D a t a R e l a t a b i l i t y In ( Z a n i o l o and M e l k a n o f f 1981), a new a p p r o a c h t o t h e d e s i g n o f r e l a t i o n a l d a t a b a s e s b a s e d on Complete R e l a t a b i l i t y o f D a t a was p r o p o s e d . S i n c e d i f f e r e n t d e c o m p o s i t i o n p a t h s can p r o d u c e d i f f e r e n t q u a l i t y o f r e l a t i o n s even when t h e y a l l c o n f o r m t o t h e same n o r m a l f o r m , e l i m i n a t i n g a n o m a l i e s seems not t o be an a d e q u a t e c r i t e r i o n f o r d a t a b a s e d e s i g n . The f o l l o w i n g example i l l u s t r a t e s t h e p o i n t . A r e l a t i o n schema R(AC#,EM,TX) r e l a t e s e m ployees and t h e i r t e l e p h o n e e x t e n s i o n s t o t h e a c c o u n t s w h i c h t h e y manage. Assum i n g an a c c o u n t has o n l y one manager and an employee has o n l y one t e l e p h o n e e x t e n t i o n , t h e n t h e f u n c t i o n a l d e p e n d e n c i e s AC# >EM and EM >TX h o l d i n R. The FD AC# >TX can be i n f e r r e d by t h e t r a n s i t i v i t y r u l e . Hence t h e r e l a t i o n c an be decomposed i n t o 1. ) R,(AC#,EM) and R 2(EM,TX) b a s e d on t h e FD EM >TX 2. ) R,(AC#,EM) and R 2(AC#,TX) b a s e d on t h e FD AC# >EM. The r e s u l t i n g schemas i n b o t h c a s e s a r e i n Boyce-Codd n o r m a l form, but (1) i s a b e t t e r r e p r e s e n t a t i o n b e c a u s e t h e t r a n s i t i v e FD AC# >TX c a n be i n f e r r e d from (1) but i t i s c o n c e a l e d i n ( 2 ) . The schemas i n (1) a r e s a i d t o e n s u r e compl The d e s i g n a p p r o a c h assumes an i n i t i a l s e t o f d a t a b a s e r e l a t i o n schemas w i t h sample r e l a t i o n s f r o m w h i c h t h e d e p e n d e n c i e s a r e d e t e c t e d . E l e m e n t a r y FDs and m u l t i p l e 20 e l e m e n t a r y MVDs a r e i n t r o d u c e d t o s i m p l i f y t h e t a s k o f d e t e c t i n g t h e d e p e n d e n c i e s i n r e l a t i o n s . They c o n s t i t u t e a s m a l l s u b s e t o f a l l FDs and MVDs i n a r e l a t i o n and t h e y have nondecomposable s t r u c t u r e s . A l l o t h e r FDS and MVDs c a n be i n f e r r e d f r o m them. E l e m e n t a r y FDs and MVDs a r e g e n e r a t e d as f o l l o w s : F o r a g i v e n r e l a t i o n R(W), a p a r t i a l o r d e r i s d e f i n e d among o r d e r e d p a i r s o f s u b s e t s o f W s u c h t h a t ( X 1 f Y , ) < ( X 2 , Y 2 ) i f X, c X 2 and Y, c Y 2 . L e t G be t h e s e t o f MVDs. The minimum members o f G a r e t h e e l e m e n t a r y MVDs of R, and a r e d e n o t e d G*. Thus X—>-->Y i s e l e m e n t a r y i f and o n l y i f Y n X = 0 and t h e r e e x i s t s no d i s t i n c t MVD X ' — > — > Y ' where X' c X and Y' c Y. F*, t h e e l e m e n t a r y FDs f o r a s e t F o f FDs, i s o b t a i n e d s i m i l a r l y . The e l e m e n t a r y MVDs a r e f u r t h e r s u b d i v i d e d i n t o s i n g l e and m u l t i p l e e l e m e n t a r y MVDs. S i n g l e e l e m e n t a r y MVDs a r e t h o s e t h a t do n o t have t h e same l e f t s i d e w i t h any o t h e r e l e m e n t a r y MVD, w h i l e m u l t i p l e e l e m e n t a r y MVDs have one o r more o t h e r MVDs w i t h t h e same l e f t s i d e . The c o n c e p t o f e l e m e n t a r y MVDs h e l p s t h e d e s i g n e r i n c h a r a c t e r i z i n g t h e dependency s t r u c t u r e i n a g i v e n r e l a t i o n . T h e r e i s an a l g o r i t h m f o r g e n e r a t i n g a l l m u l t i p l e e l e m e n t a r y MVDs w i t h l e f t s i d e X i f an e l e m e n t a r y MVD w i t h a l e f t s i d e Y i s known s u c h t h a t Y c X. T h e r e i s a l s o an a l g o r i t h m f o r g e n e r a t i n g t h e m u l t i p l e e l e m e n t a r y MVDs w h i c h form t h e minimum c o v e r f o r t h e MVDs w i t h r e s p e c t t o t h e r e f l e x i v i t y , a u g m e n t a t i o n , a d d i t i v i t y and c o m p l e m e n t a t i o n r u l e s f o r f u n c t i o n a l and m u l t i v a l u e d d e p e n d e n c i e s . The r e f e r e n c e r u l e s f o r f u n c t i o n a l and m u l t i v a l u e d d e p e n d e n c i e s a r e d i s c u s s e d i n ( B e e r i e t a l 1977). 21 The d e c o m p o s i t i o n a l g o r i t h m r e c u r s i v e l y decomposes r e l a t i o n s i n t o a p a i r o f s u b p r o j e c t i o n s a c c o r d i n g t o m u l t i p l e e l e m e n t a r y MVDs, u n t i l a s e t of a t o m i c r e l a t i o n s i s o b t a i n e d . An a t o m i c r e l a t i o n c o n t a i n s o n l y t r i v i a l MVDs. A t c o m p l e t i o n , t h e a l g o r i t h m p r o d u c e s a s e t o f a t o m i c s u b r e l a t i o n s and a s e t o f FDs r e f e r r e d t o a s ACOVER a n d ZCOVER r e s p e c t i v e l y . The two s e t s a r e l a t e r u s e d i n c o n s t r u c t i n g 3NF r e l a t i o n s . The i n i t i a l r e l a t i o n i s r e c o n s t r u c t a b l e a s t h e n a t u r a l j o i n of t h e p r o j e c t i o n s , t h u s t h e d e c o m p o s i t i o n i s c o n t e n t p r e s e r v i n g . Complete d a t a r e l a t a b i l i t y a l s o demands t h a t t h e s t r u c t u r e ( i . e t h e a t t r i b u t e s e t , t h e FDs and t h e MVDs) of t h e o r i g i n a l r e l a t i o n s be p r e s e r v e d . The a l g o r i t h m s e l e c t s t h o s e e l e m e n t a r y MVDs t h a t e n s u r e p r e s e r v a t i o n of t h e s t r u c t u r a l i n f o r m a t i o n . In t e s t i n g t h e d a t a r e l a t a b i l i t y c o n d i t i o n , t h e n o t i o n s o f a d m i s s i b i l i t y of FD c o v e r s and s c o p e o f e l e m e n t a r y FD a r e i n t r o d u c e d . The s c o p e o f . an FD X >A i s t h e s e t X u {A}. A ZCOVER g e n e r a t e d from a r e l a t i o n R i s s a i d t o be a d m i s s i b l e w i t h r e s p e c t t o ACOVER 1. ) I f t h e ZCOVER c o n t a i n s an e l e m e n t a r y FD w i t h s c o p e X, i t must c o n t a i n e v e r y o t h e r e l e m e n t a r y FD o f R w i t h s c o p e X. M o r e o v e r , i f R[X] f o r s u c h X i s a t o m i c , t h e n t h e ACOVER must c o n t a i n i t as a member. 2. ) I f ACOVER c o n t a i n s an a t o m i c p r o j e c t i o n R [ X ] , t h e n t h e ZCOVER must c o n t a i n e v e r y e l e m e n t a r y FD of R h a v i n g s c o p e X. In a d e c o m p o s i t i o n o f R(Z) i n t o R T (Y ) and R 2 ( X ) , l e t F1 and F2 d e n o t e t h e e l e m e n t a r y FDs i n R, and R 2 r e s p e c t i v e l y . The FDs w i t h s c o p e Z i n R c a n n o t be i n f e r r e d by FI a n d F2; t h e y a r e e x p l i c i t l y e n t e r e d i n t o ZCOVER. The r e m a i n i n g FDs i n R ( Z ) , F*, 22 a r e p r e s e r v e d by s e l e c t i n g a d e c o m p o s i t i o n s u c h t h a t F* c a n be i n f e r r e d f r o m t h e e l e m e n t a r y FDs i n t h e p r o j e c t i o n s ; t h a t i s F* c (F1 u F 2 ) + . T h i s i s t h e c o m p l e t e r e l a t a b i l i t y c o n d i t i o n (CRC) f o r e l e m e n t a r y FDs. To a v o i d r e d u n d a n c y , d e c o m p o s i t i o n i s b a s e d on t h e m u l t i p l e e l e m e n t a r y MVDs. Hence, t h e CRC f o r MVDs must a l s o be s a t i s f i e d . However, t h e r e v e r s e p r o j e c t a b i l i t y r u l e d oes n o t h o l d f o r MVDs. T h a t i s , X-->—>Y i n a p r o j e c t i o n o f R does n o t i m p l y X-->-->Y i n R. A weaker p r o p e r t y known a s j o i n a b i l i t y i s u s e d t o g e n e r a t e t h e s e t of MVDs i n f e r a b l e f r o m t h e e l e m e n t a r y MVDs i n t h e p r o j e c t i o n s . The two r e s u l t i n g s e t s , ACOVER and ZCOVER, c a n be u s e d t o improve 3NF r e l a t i o n s u s i n g B e r n s t e i n ' s a l g o r i t h m . The ZCOVER c o n s t i t u t e s a m i n i m a l c o v e r f o r FDs i n t h e o r i g i n a l r e l a t i o n R and c an be u s e d as minimum FD c o v e r i n B e r n s t e i n ' s a l g o r i t h m . The a l g o r i t h m w i l l now p r o d u c e 3NF r e l a t i o n s w h i c h c o m p l e t e l y c h a r a c t e r i z e t h e f u n c t i o n a l r e l a t i o n s h i p i n t h e i n i t i a l r e l a t i o n . The n o n f u n c t i o n a l r e l a t i o n s h i p s a r e r e p r e s e n t e d by e l e m e n t s of ACOVER w i t h o u t c o r r e s p o n d i n g FDs i n t h e ZCOVER. E a c h of them form a s e p a r a t e r e l a t i o n . The D e c o m p o s i t i o n / C o m p l e t e d a t a r e l a t a b i l i t y have been a b l e t o d e a l w i t h some of t h e p r o b l e m s i n no r m a l d e c o m p o s i t i o n . The c o m p l e t e r e l a t a b i l i t y c o n d i t i o n i s a b l e t o g u i d e t h e o r d e r i n wh i c h d e c o m p o s i t i o n i s c a r r i e d o ut t o p r o d u c e s u b r e l a t i o n s t h a t p r e s e r v e t h e d e p e n d e n c i e s i n t h e o r i g i n a l r e l a t i o n . C h a r a c t e r i z i n g t h e MVDs have been made e a s i e r , but d e t e c t i n g t h e i n i t i a l MVDs from w h i c h o t h e r s c an be g e n e r a t e d i s not t r i v i a l . L i k e most d e c o m p o s i t i o n methods, a s e t of i n i t i a l r e l a t i o n s i s assumed. The u n i v e r s a l r e l a t i o n a s s u m p t i o n i s known t o have some 23 u n d e s i r a b l e c o n s e q u e n c e s . In s e c t i o n 2.3, some o f t h e s e c o n s e q u e n c e s , as i n (Kent 1981), a r e d i s c u s s e d . 2.2.4 E n t i t y - R e l a t i o n s h i p M o d e l / T r a n s f o r m a t i o n R u l e s (Wong and K a t z 1980) p r o p o s e d a v a r i a n t o f t h e E n t i t y - R e l a t i o n s h i p model as an i n t e r m e d i a t e d e s i g n model w h i c h i s i n t u r n t r a n s f o r m e d i n t o r e l a t i o n schemas. The f o l l o w i n g s e m a n t i c o b j e c t s a r e r e c o g n i z e d w i t h i n t h e i n t e r m e d i a t e m o d e l . 1. ) E n t i t y s e t s ( E ( t ) ) : a one p a r a m e t e r f a m i l y o f s e t s w h i c h c h a n g e s as members a r e i n s e r t e d and d e l e t e d . 2. ) A P r o p e r t y o f an e n t i t y s e t E ( t ) i s a f u n c t i o n f mapping t E ( t ) t o some s e t V of v a l u e s a t t i m e i n s t a n c e t . The f u n c t i o n i s d e f i n e d on a l l of E ( t ) , and f o r e v e r y e e E ( t ) , f (e) i s t u n i q u e . 3. ) R e l a t i o n s h i p s : A r e l a t i o n s h i p R among e n t i t y s e t s t E 1 ( t ) , . . . , E n ( t ) i s a s u b s e t o f t h e c a r t e s i a n p r o d u c t E , ( t ) x E 2 ( t ) x...x E n ( t ) . No r e l a t i o n s h i p i s d e r i v a b l e f r o m o t h e r r e l a t i o n s h i p s ; t h e y a r e i n d e p e n d e n t and n o n d e c o m p o s a b l e . 4. ) P r o p e r t i e s of r e l a t i o n s h i p s : In a s i m i l a r f a s h i o n t o e n t i t y s e t s , p r o p e r t i e s c an be d e f i n e d on r e l a t i o n s h i p s . 5. ) S i n g l e - v a l u e d b i n a r y r e l a t i o n s h i p s : A b i n a r y r e l a t i o n s h i p R on e n t i t y s e t s E , ( t ) and E 2 ( t ) i s s i n g l e - v a l u e d i f e a c h t e n t i t y i n E , ( t ) o c c u r s i n a t most one p a i r i n R . t 6. ) A s s o c i a t i o n s : An a s s o c i a t i o n i s a b i n a r y r e l a t i o n s h i p i n wh i c h E , ( t ) e n t i t i e s o c c u r s i n e x a c t l y one i n s t a n c e . No 24 p r o p e r t i e s a r e a l l o w e d on a s s o c i a t i o n s o r s i n g l e - v a l u e d r e l a t i o n s h i p s . The d e s i g n g o a l i s t o p r e v e n t "update a n o m a l i e s " . An u p d a t e anomaly i s d e f i n e d as e i t h e r a f r a g m e n t a t i o n of t h e a t o m i c o p e r a t i o n s or u n c o n t r o l l e d s i d e e f f e c t s . An a t o m i c o p e r a t i o n i s one o f t h e f o l l o w i n g : 1. ) I n s e r t i n g o r d e l e t i n g an e n t i t y . 2. ) I n s e r t i n g o r d e l e t i n g an i n s t a n c e of a r e l a t i o n s h i p . 3. ) C h a n g i n g t h e v a l u e o f a f u n c t i o n ( p r o p e r t y o r a s s o c i a t i o n of an e n t i t y ) . D e l e t i o n o r i n s e r t i o n of an e n t i t y do have s i d e e f f e c t s . The d e l e t i o n o f an e n t i t y c a u s e s a d e l e t i o n of any i n s t a n c e o f a r e l a t i o n s h i p i n w h i c h i t p a r t i c i p a t e s . Any e n t i t y t h a t has t h e d e l e t e d e n t i t y as i t s r ange v a l u e i n an a s s o c i a t i o n i s a l s o d e l e t e d . The i n s e r t i o n of an e n t i t y "e" r e q u i r e s t h e e n t i t y t h a t i s t h e v a l u e o f any a s s o c i a t i o n o f e t o a l r e a d y e x i s t . The o r d e r of i n s e r t i o n of an e n t i t y may be c o n s t r a i n e d by a s s o c i a t i o n s . Thus, a c y c l e of a s s o c i a t i o n s i s n o t a l l o w e d . The i n t e r m e d i a t e model i s t r a n s f o r m e d i n t o r e l a t i o n schemas by t h e f o l l o w i n g r u l e s . 1. ) E a c h e n t i t y s e t has an e x p l i c i t i d e n t i f i e r w h i c h r e p r e s e n t s i t g l o b a l l y i n t h e m o d e l . An i d e n t i f i e r i s a o n e - t o - one p r o p e r t y of an e n t i t y s e t . 2. ) The i d e n t i f i e r of a p r i m i t i v e o b j e c t t o g e t h e r w i t h a l l i t s p r i m a r y f u n c t i o n s a r e g r o u p e d i n t h e same r e l a t i o n . A p r i m a r y f u n c t i o n i s a p r o p e r t y or an a s s o c i a t i o n , and a p r i m i t i v e o b j e c t i s e i t h e r a r e l a t i o n s h i p o r an e n t i t y s e t i n i t s r o l e as t h e domain of a p r i m a r y f u n c t i o n . 25 3.) T h e r e i s one and o n l y one p r i m i t i v e o b j e c t p e r r e l a t i o n . The t r a n s f o r m a t i o n r u l e s a r e s u c h t h a t t h e y p r e s e r v e t h e a t o m i c i t y o f u p d a t e s and c o n t r o l t h e s i d e e f f e c t s . R u l e 2 g r o u p s e n t i t i e s t o g e t h e r w i t h " t h e i r c o r r e s p o n d i n g p r o p e r t i e s and a s s o c i a t i o n s i n t h e same r e l a t i o n . I t a l l o w s d e l e t i o n and i n s e r t i o n of an e n t i t y t o be made a l o n g w i t h i t s a s s o c i a t i o n s and p r o p e r t i e s i n a s i n g l e r e l a t i o n t u p l e . A v i o l a t i o n o f one of th e n o r m a l forms (1NF,2NF,3NF,4NF) can be i n t e r p r e t e d a s a v i o l a t i o n o f one of t h e r u l e s (Wong e t a l 1980). 1. ) The g r o u p i n g t o g e t h e r of two p r i m i t i v e o b j e c t s w i t h no e n t i t y i n common o r a f u n c t i o n o f an e n t i t y and a r e l a t i o n s h i p i n v o l v i n g i t , b o t h r e s u l t i n a r e l a t i o n t h a t i s not i n 2NF. 2. ) P u t t i n g two f u n c t i o n s f1 and f2 w i t h d i f f e r e n t e n t i t y s e t s as t h e i r domains i n t h e same r e l a t i o n c a n v i o l a t e t h e 2NF. I f t h e f u n c t i o n s a r e o f t h e form E, ->E2 >S, 3NF i s v i o l a t e d . 3. ) The g r o u p i n g of two r e l a t i o n s h i p s w i t h a common e n t i t y s e t t o g e t h e r i n t h e same r e l a t i o n c a n p r o d u c e a r e s u l t t h a t i s n o t i n 4NF. The schemas r e s u l t i n g from t h e r u l e s , t h e r e f o r e , c o n f o r m t o t h e 4NF. The f o l l o w i n g example i l l u s t r a t e s t h e d e s i g n p r o c e s s . E n t i t y s e t s P r o p e r t i e s EMP ENAME, BIRTHYR DEPT DNAME, LOCATION JOB T I T L E , SALARY 26 A s s o c l a t i o n s W o r k s - i n ( E M P , DEPT) A s s i g n m e n t ( E M P , JOB) R e l a t i o n s h i p s mgr(DEPT, EMP) q u a l i f i e d ( E M P , JOB) a l l o c a t i o n ( D E P T , JOB) S t a t u s s i n g l e - v a l u e d g e n e r a l g e n e r a l P r o p e r t i e s n i l n i l number The i n t e r m e d i a t e model c o n s i s t s o f t h e f o l l o w i n g p r i m i t i v e o b j e c t s and f u n c t i o n s . F u n c t i o n s P r i m i t i v e O b j e c t s EMP DEPT JOB mgr a l l o c a t i o n q u a l i f i e d ENAME, BIRTHYR, w o r k s ~ i n , a s s i g n m e n t DNAME, LOCATION T I T L E , SALARY NUMBER The model i s t r a n s f o r m e d i n t o f i v e r e l a t i o n s a s f o l l o w s . EMP(ENO,ENAME,BlRTHYR,ASSIGNMENT,EDEPT) DEPT(DNO,LOCATION,MGR) J O B ( J I D , T l T L E , S A L A R Y ) ALLOC(DNO,JID,NUMBER) QUAL(JID,ENO) The E n t i t y - R e l a t i o n s h i p / T r a n s f o r m a t o n r u l e s p r o v i d e a p r a c t i c a l a p p r o a c h t o r e l a t i o n a l d a t a b a s e d e s i g n . But a s . n o t e d by t h e a u t h o r s , e v e r y 4NF r e l a t i o n schema i s not n e c e s s a r i l y g e n e r a t e d f r o m t h e i n t e r m e d i a t e model v i a t h e mapping r u l e s . The 27 r e s u l t i n g schema i s r e s t r i c t e d by t h e i n t e r m e d i a t e model. We b e l i e v e t h a t t h e e x p l i c i t s p e c i f i c a t i o n o f whether t h e r e l a t i o n s h i p s and f u n t i o n s a r e p a r t i a l o r t o t a l on t h e s o u r c e and t a r g e t s e t s , p r o v i d e s more meaning t o t h e i n t e r m e d i a t e m o d e l . The a d d i t i o n a l meaning a l l o w s more r e l a x e d r u l e s t o be d e f i n e d . These p o i n t s w i l l become more e v i d e n t i n our a p p r o a c h , p r e s e n t e d i n c h a p t e r 3. 2.3 A p p r a i s a l o f t h e D e s i g n A p p r o a c h e s The d e c o m p o s i t i o n a p p r o a c h e s t o r e l a t i o n a l d a t a b a s e d e s i g n s h a r e a l o t i n common. In p a r t i c u l a r , t h e y t a k e a s i n p u t an i n i t i a l r e l a t i o n d e s i g n . R e c e n t l y , t h e c o n s e q u e n c e s o f t h e U n i v e r s a l r e l a t i o n and o t h e r a s s u m p t i o n s were d i s c u s s e d i n (Kent 1981). In a d d i t i o n t o t h e s p e c i f i c q u e s t i o n s r a i s e d i n e a c h o f t h e a p p r o a c h e s , t h e p a p e r p r o v i d e s a common g r o u n d f o r a p p r a i s i n g b o t h t h e s y n t h e t i c and d e c o m p o s i t i o n methods. A number o f q u e s t i o n s c o n c e r n i n g t h e E n t i t y - R e l a t i o n s h i p / T r a n s f o r m a t i o n r u l e s a r e r a i s e d i n s e c t i o n 2.2.4. The d e s i g n method i s r a d i c a l l y d i f f e r e n t f r o m t h e o t h e r s . 2.3.1 C o n s e q u e n c e s of U n i v e r s a l R e l a t i o n A s s u m p t i o n ' B o t h t h e d e c o m p o s i t i o n and s y n t h e t i c methods make c e r t a i n i m p l i c i t a s s u m p t i o n s : 1. ) T h e r e a r e no d o m a i n s : Columns of r e l a t i o n s a r e d i s t i n c t l y named w i t h no f a c i l i t y f o r s t a t i n g t h e u n d e r l y i n g domains t h a t m i g h t be common t o s e v e r a l c o l u m n s . 2. ) A j o i n compares c o l u m n s i f and o n l y i f t h e y have t h e same name. 28 An e x p l i c i t a s s u m p t i o n common t o most d e c o m p o s i t i o n methods i s th e U n i v e r s a l r e l a t i o n a s s u m p t i o n . F o r a g i v e n s e t of r e l a t i o n s , S = { R , ( X ! ) , . . . , R n ( X n ) } , a u n i v e r s a l r e l a t i o n U(T) e x i s t s s u c h t h a t 1. ) The column names o f U c o n s i s t s o f a l l t h e column names o f t h e r e l a t i o n s i n R, t h a t i s T = X, u X 2 u...u Xn. 2. ) E a c h r e l a t i o n i n S i s a p r o j e c t i o n of U. However, t h e s e a s s u m p t i o n s have i m p l i c a t i o n s t h a t a r e n o t c o m p a t i b l e w i t h p r a c t i c a l d a t a b a s e d e s i g n . The u n i v e r s a l r e l a t i o n a s s u m p t i o n i m p l i e s t h a t c olumns have t h e same meaning i n e v e r y r e l a t i o n , b e c a u s e t h e y a r e p r o j e c t e d from t h e same s o u r c e . T h e r e f o r e , w h e r e v e r an a t t r i b u t e o c c u r s , i t must n e c e s s a r i l y have t h e same e x t e n s i o n s . Hence, u p d a t e s t o r e l a t i o n s o f t h e form R,(X,Y) and R 2 ( X , Z ) must p r e s e r v e e q u a l i t y o f t h e p r o j e c t i o n s R,[X] and R 2 [ X ] . In e s s e n c e , we c a n n o t use th e same a t t r i b u t e w i t h d i f f e r e n t i n t e n s i o n s i n v a r i o u s r e l a t i o n s h i p s . I t i s a l s o n o t m e a n i n g f u l t o have two r e l a t i o n s w i t h i d e n t i c a l column names. In B e r n t e i n ' s s y n t h e s i s , a t t r i b u t e s a r e a l l o w e d t o be renamed i n o r d e r t h a t t h e u n i q u e n e s s a s s u m p t i o n f o r FDs be p r e s e r v e d . D e c o m p o s i t i o n methods do n o t e x p l i c i t l y r e q u i r e t h e u n i q u e n e s s o f FDs, but a t t r i b u t e s c a n a l s o be renamed a f t e r t h e d e c o m p o s i t i o n of a r e l a t i o n . One c o n s e q u e n c e of re n a m i n g a t t r i b u t e s i s t h a t i t i s n o t p o s s i b l e t o make n a t u r a l j o i n s o v e r s u c h d i s t i n c t a t t r i b u t e s even t h o u g h t h e y s h a r e t h e same u n d e r l y i n g domain. Though i n p r a c t i c e some s y s t e m s a l l o w j o i n s o v e r d i f f e r e n t column names, t h i s i s o n l y u s e f u l i f t h e r e i s a p r o v i s i o n f o r c h e c k i n g t h a t t h e column names have t h e same 29 u n d e r l y i n g domain. A c l o s e l y r e l a t e d p r o b l e m i s t h a t o f e x p r e s s i n g r e l a t i o n s h i p s e x i s t i n g among a t t r i b u t e s o f a r e l a t i o n . 2.3.2 D a t a D e p e n d e n c i e s E x p r e s s i n g t h e d e p e n d e n c i e s between d a t a o b j e c t s i s v e r y c r u c i a l t o d a t a b a s e d e s i g n . The s y n t h e t i c a p p r o a c h t a k e s as i n p u t a s e t of FDs, b u t n o n f u n c t i o n a l r e l a t i o n s h i p s c an n o t be r e p r e s e n t e d d i r e c t l y . The 4NF d e c o m p o s i t i o n a l l o w s n o n f u n c t i o n a l r e l a t i o n s h i p s t o be e x p r e s s e d as MVDs. However, a MVD i s d e f i n e d s u c h t h a t i t i s o n l y r e c o g n i z a b l e when i t c o e x i s t w i t h a n o t h e r one i n t h e same r e l a t i o n . The t a s k o f d e t e c t i n g MVDs w i t h i n a r e l a t i o n i s a l s o n o t t r i v i a l . The MVDs a r e n o t o n l y u n i n t u i t i v e , t h e i r p r o p e r t i e s a r e n o t w e l l u n d e r s t o o d . Some MVDs h o l d i n a p r o j e c t i o n of a r e l a t i o n b u t not i n t h e o r i g i n a l r e l a t i o n . T h e s e a r e r e f e r r e d t o as embedded MVDs. An i s s u e y e t u n r e s o l v e d i s whether t h e r e e x i s t i n f e r e n c e r u l e s ( f r o m p r o j e c t i o n s t o t h e j o i n ) s t r o n g e r t h a n t h e j o i n a b i l i t y ( Z a n i o l o and M e l k a n o f f 1981). Thus, m u l t i v a l u e d dependency i s not v e r y s u i t a b l e a s a means of r e p r e s e n t i n g many-many r e l a t i o n s h i p s . I n g e n e r a l , two many-many r e l a t i o n s h i p s E S and E D, w i l l n o t a p p e a r as m u l t i v a l u e d d e p e n d e n c i e s i f t h e r e a l s o e x i s t s some r e l a t i o n s h i p s i n v o l v i n g S and D (Kent 1981). A c t u a l l y , i n a r e l a t i o n schema R(ESD) c o n t a i n i n g a t t r i b u t e s w i t h t h e two many-many r e l a t i o n s h i p s , i f S and D have some r e l a t i o n s i p s , we have a j o i n d e p e n d e n c y c o n s t r a i n t . C l e a r l y , t h e d e p e n d e n c i e s a r e not enough t o s p e c i f y t h e r e l a t i o n s h i p s t h a t do e x i s t among d a t a i t e m s o f a d a t a b a s e . 30 2.3.3 D e c o m p o s i t i o n v e r s u s t h e S y n t h e t i c A p p r o a c h The d e c o m p o s i t i o n and s y n t h e t i c methods d i f f e r m a i n l y i n t h e t y p e o f i n p u t t h e y t a k e . In g e n e r a l , d e c o m p o s i t i o n t a k e s as i n p u t an i n i t i a l s e t o f r e l a t i o n s , FDs and MVDs. F a g i n ' s 4NF d e c o m p o s i t i o n a c c e p t s s e t s o f a t t r i b u t e s , FDs and MVDs, but t h e f i r s t s t e p c o n v e r t s a l l t h e a t t r i b u t e s i n t o a s i n g l e r e l a t i o n . O n l y f u n c t i o n a l d e p e n d e n c i e s c a n be s p e c i f i e d d i r e c t l y i n s y n t h e t i c methods, b e c a u s e MVDs ca n o n l y be d e f i n e d w i t h i n t h e c o n t e x t o f a r e l a t i o n . The MVDs wou l d have t o be s p e c i f i a b l e i n a c o n t e x t - i n d e p e n d e n t form, i f t h e s y n t h e t i c methods a r e t o a c c e p t them d i r e c t l y as i n p u t . S i n c e s y n t h e t i c methods do n o t t a k e i n i t i a l r e l a t i o n s as i n p u t , column names a r e n e c e s s a r i l y u n i q u e . T h i s i s a l s o t h e c a s e f o r d e c o m p o s i t i o n when t h e u n i v e r s a l r e l a t i o n i s assumed. W i t h t h e p r e s e n t s t a t e o f d e p endency t h e o r y , none o f t h e a p p r o a c h e s i s c l e a r l y s u p e r i o r t o t h e o t h e r . S y n t h e s i s a p p e a r s more d e s i r a b l e i n p r a c t i c e , e s p e c i a l l y f o r l a r g e d a t a b a s e s . D e c o m p o s i t i o n t e n d s t o l e a v e r e s i d u e r e l a t i o n s w h i c h sometimes model r e l a t i o n s h i p s t h a t can n o t be e x p r e s s e d a s FDs or MVDs. But sometimes t h e a t t r i b u t e s of s u c h r e l a t i o n s do n o t b e a r any d i r e c t r e l a t i o n s h i p . Kent i s of t h e o p i n i o n t h a t a more e x t e n s i v e d e p e n d e n c y t h e o r y , i n w h i c h a l l d e p e n d e n c i e s c o u l d be f o r m a l l y e x p r e s s e d , i s needed. W i t h s u c h a t h e o r y , t h e s y n t h e t i c a p p r o a c h m i g h t be p r e f e r r e d . R e l a t i o n s c a p t u r i n g a l l t h e r e l a t i o n s h i p s would be g e n e r a t e d , w h i l e d e c o m p o s i t i o n would c o n t i n u e t o l e a v e u n r e l a t e d e l e m e n t s a g g r e g a t e d i n r e s i d u e r e l a t i o n s (Kent 1981). 31 CHAPTER I I I . The F u l l M apping A p p r o a c h T h i s c h a p t e r d e a l s w i t h F u l l M a ppings and t h e t r a n s f o r m a t i o n r u l e s . The mappings a r e p r o p o s e d a s an a l t e r n a t i v e t o f u n c t i o n a l and m u t i v a l u e d d e p e n d e n c i e s . The t r a n s f o r m a t i o n r u l e s g e n e r a t e r e l a t i o n schemas from t h e f u l l mapping s p e c i f i c a t i o n s . 3.1 F u l l M apping S p e c i f i c a t i o n A f o r m a l i n f o r m a t i o n a n a l y s i s of an a p p l i c a t i o n e n v i r o n m e n t r e v e a l s r e l e v a n t e n t i t y s e t s , v a l u e s e t s and a s s o c i a t i o n s ( s e c t i o n 2 . 1 . 1 ) . We p l a c e e m p h a s i s on t h e t y p e s o f mapping t h a t e x i s t among t h e s e t s . By r e p r e s e n t i n g p r o p e r t i e s w i t h a p p r o p r i a t e a t t r i b u t e - n a m e s and t h e e n t i t y s e t s w i t h a p r i m a r y a t t r i b u t e , t h e a s s o c i a t i o n s c a n a l l be e x p r e s s e d as mappings between a t t r i b u t e s . A p r i m a r y a t t r i b u t e has t o be a p r o p e r t y t h a t p r o v i d e s a o n e - t o - o n e c o r r e s p o n d e n c e between t h e e n t i t y s e t and t h e p r o p e r t y - v a l u e s e t . H i t h e r t o , i n r e l a t i o n a l d a t a b a s e t h e o r y , d a t a r e l a t i o n s h i p s a r e e x p r e s s e d as f u n c t i o n a l , m u l t i v a l u e d o r j o i n d e p e n d e n c i e s . F u l l M a pping i s p r o p o s e d as a means o f s p e c i f y i n g r e l a t i o n s h i p s between d a t a i t e m s . .3.1.1 Mapping T y p e s L e t A and B be s e t s a c t i n g a s s o u r c e and t a r g e t of a mapping r e s p e c t i v e l y . The f o l l o w i n g mapping t y p e s c a n be de f i n e d . 1.) One-one mapping (A 1 1 B ) : T h e r e i s a o n e - t o - o n e 32 c o r r e s p o n d e n c e between t h e s o u r c e and t h e t a r g e t e l e m e n t s B F i g u r e 10. One-to-one Mapping D i a g r a m 2.) Many-one mapping (A m 1 B ) : D i s j o i n t s e t s o f A - e l e m e n t s a r e mapped t o u n i q u e e l e m e n t s i n B. B F i g u r e 11. Many-one M a p p i n g D i a g r a m A one-many mapping from A t o B c a n a l w a y s be t r e a t e d a s a many- one mapping from B t o A. Hence we do not have t o d i s t i n g u i s h between many-one and one-many mapping t y p e s . 3.) Many-many mapping (A m n B ) : E l e m e n t s i n A a r e mapped 33 t o s e t s i n B, but t h e B - s e t s a r e n o t n e c e s s a r i l y d i s j o i n t . A B F i g u r e 12. Many-many Mapping D i a g r a m A mapping, a p a r t f r o m b e i n g one-one, many-one o r many-many, i s e i t h e r t o t a l o r p a r t i a l on t h e s o u r c e and t a r g e t s e t s . An i n t o - m a p p i n g on a s e t e x p r e s s e s t h e f a c t t h a t an e l e m e n t i n t h e s e t may not be i n v o l v e d i n t h e mapping. T h e r e i s an e l e m e n t of r e l a t i v i t y i n d e c i d i n g whether a mapping i s ' i n t o ' or ' onto' a s e t . L e t us c o n s i d e r an a p p l i c a t i o n e n v i r o n m e n t where a s u p p l i e r (SNO) o f an i t e m s t a y s i n a p a r t i c u l a r CITY and t h e c i t i e s have STATUS a t t a c h e d t o them. E v e r y s u p p l i e r s t a y s i n a c i t y and e v e r y c i t y has a s t a t u s . The mappings SNO CITY and CITY STATUS a p p e a r t o be b o t h t o t a l on t h e i r s o u r c e and t a r g e t s e t s . However, i f we w i s h i t p o s s i b l e t o e n t e r t h e i n f o r m a t i o n t h a t a p a r t i c u l a r c i t y has a p a r t i c u l a r s t a t u s even when t h e r e a r e no s u p p l i e r s l o c a t e d i n t h a t c i t y , t h e n t h e mapping SNO CITY i s ' i n t o ' CITY. T h a t i s , -at a d a t a b a s e i n s t a n c e , i f we match t h e e x t e n s i o n o f CITY t o t h a t o f SNO t h e r e may be c i t i e s not mapped t o any s u p p l i e r . An i n t o - m a p p i n g on a s e t i s a g e n e r a l i z e d f o r m of t h e c o n d i t i o n a l a s s o c i a t i o n i n ( R a v e r and Hubbard 1977). The c o n d i t i o n a l a s s o c i a t i o n e x p r e s s e s 34 t h e c a s e where an e l e m e n t i n t h e s o u r c e has e x a c t l y one c o r r e s p o n d i n g t a r g e t e l e m e n t o r none a t a l l . An o n t o - m a p p i n g on a s e t i s a t o t a l mapping on t h e s e t . A ' o n t o ' a s e t i f e v e r y e l e m e n t o f t h e s e t a l w a y s p a r t i c i p a t e i n t h e mapping. By t h e d e f i n i t i o n of i n t o and o n t o mappings, i t i s n o t p o s s i b l e t o have two o n t o - m a p p i n g s on t h e same s e t w i t h d i f f e r e n t domain e x t e n s i o n s a t any d a t a b a s e i n s t a n c e . T h e r e i s i n h e r e n t s e m a n t i c i n f o r m a t i o n e x p r e s s e d when t h e i n t o / o n t o s t a t u s o f a mapping i s s t a t e d . T h i s s h o u l d be combined w i t h t h e mapping t y p e s t o p r o v i d e more meaning t o d a t a d e p e n d e n c i e s . Hence, t h e r e a r e a t o t a l o f t w e l v e mapping t y p e s w h i c h we r e f e r t o as f u l l m a p p i ngs. An example of a f u l l y d e f i n e d mapping between two s e t s A and B o i i s many-many ( o n t o , i n t o ) w r i t t e n as A m n B . 3.2 D e s i g n o f R e l a t i o n Schemas from F u l l M a ppings A mapping between two s e t s X and Y i s a b i n a r y r e l a t i o n between t h e s e t s . I t can be r e p r e s e n t e d by a r e l a t i o n schema R ( X , Y ) . Such a schema i s a t o m i c s i n c e t h e mappings a r e n o n d e c o m p o s a b l e and a r e n o t d e r i v a b l e f r o m o t h e r m a p p i n g s . I f t h e p r i n c i p a l schemas a r e e x p r e s s e d e x c l u s i v e l y as a t o m i c r e l a t i o n s , t h e n t h e r e w i l l be t h e need t o a p p l y n - a r y j o i n s t o o b t a i n h i g h e r d e g r e e r e l a t i o n s i n o r d e r t o d e f i n e v i e w s and t o r e p r e s e n t a b r o a d c l a s s o f q u e r i e s . T h e r e f o r e , t h e r u l e s a r e d e f i n e d t o d e t e c t t h e mapping t y p e s t h a t c a n be c o m b i n e d . The r e l a t i o n a l schema d e s i g n p r o b l e m i s t o a v o i d r e p e a t i n g a t t r i b u t e s i n a l a r g e number o f low d e g r e e r e l a t i o n s , a s w e l l as 35 a v o i d t h e p r o b l e m s t h a t c o u l d a r i s e f r o m j o i n s . 3.2.1 T r a n s f o r m a t i o n R u l e s f o r BCNF schemas A r e l a t i o n schema R i s i n Boy c e - C o d d n o r m a l form (BCNF) i f f o r a l l d i s j o i n t and nonempty s e t s o f a t t r i b u t e s X and Y i n R, i f X >Y t h e n X i s a s u p e r k e y of R ( B e e r i and B e r n s t e i n 1979). Hence, t h e r u l e s a r e s u c h t h a t e v e r y d e t e r m i n a n t i s a c a n d i d a t e key. E v e r y d e t e r m i n a n t i s r e l e v a n t i n d e t e r m i n i n g t h e BCNF schemas. T h e r e f o r e , we s h a l l i n c l u d e s u p e r k e y s i n t h e s e t o f c a n d i d a t e k e y s . S i n c e p r o p e r keys a r e s u p e r k e y s , any c l a i m made f o r t h e s e t o f c a n d i d a t e keys of R i s a l s o t r u e f o r t h e p r o p e r keys of R. 1. ) E x c l u s i v e mappings a r e t h o s e t h a t have u n i q u e a t t r i b u t e s ; t h e y r e m a i n uncombihed. The c o r r e s p o n d i n g a t o m i c r e l a t i o n schemas a r e i n t h e i r f i n a l f o r m . L e t X and Y be s e t s of a t t r i b u t e s . A mapping f r o m X t o Y can be t r a n s f o r m e d i n t o a r e l a t i o n schema R ( X , Y ) , r e g a r d l e s s of whether t h e mapping i s ' i n t o ' o r 'o n t o ' X and Y. The c a n d i d a t e keys a r e d e t e r m i n e d a s f o l l o w s : The c o n v e n t i o n a d o p t e d i s t o l e a v e t h e i n t o / o n t o s t a t u s o f a mapping u n s p e c i f i e d i f t h e r u l e i s v a l i d f o r e i t h e r c a s e . a) X 1 1 Y ke y ( R ) = {X,Y} b) X m 1 Y key(R) = {X} c) X m n Y key(R) = {XY} 2. ) Common A t t r i b u t e G r o u p s : The n o n e x c l u s i v e mappings are. a r r a n g e d i n t o g r o u p s , s u c h t h a t e v e r y mapping i n a g r o u p has a common a t t r i b u t e w i t h a t l e a s t one o t h e r mapping i n t h e group* T h e r e a r e no two g r o u p s h a v i n g an a t t r i b u t e i n common. W i t h i n a 36 common a t t r i b u t e g r o u p , two mappings from A t o B and from B t o C ca n be combi n e d i n t o a r e l a t i o n schema R(A,B,C) a c c o r d i n g t o t h e f o l l o w i n g r u l e s . o o a) A 1 1 B + B 1 1 C ke y ( R ) = {A,B,C} o o b) A 1 1 B + B m 1 C key(R) = {A, B} o . o c ) A m n B + B m n C key(R) = {ABC} o o d) A 1 m B + B m 1 C key(R) = {B} 3.) The r e s u l t i n g schemas from r u l e (2) c a n be c o m b i n e d s u c c e s s i v e l y w i t h o t h e r mappings i n t h e g r o u p a s f o l l o w s : L e t R,(X,A) be a r e schema w i t h s e t s of a t t r i b u t e X and an a t t r i b u t e A. R, can be combined w i t h a mapping from A t o B i n t o R 2 ( X , A , B ) . T h e r e a r e t h r e e r e l e v a n t c a s e s d e p e n d i n g on whether B i s c o n t a i n e d i n X. I f B i s c o n t a i n e d i n X and X = Y u B, t h e r e s u l t i n g schema i s R 2 ( Y , A , B ) . I f keyCR,) i s t h e s e t of c a n d i d a t e keys of R,, t h e c a n d i d a t e k e y s , k e y ( R 2 ) , of R 2 i d e t e r m i n e d a c c o r d i n g t o t h e f o l l o w i n g r u l e s : C a s e ( i ) A e key(R,) and B / X s o o a) R,(X,A ) + A 1 1 B; k e y ( R 2 ) = key(R,) u {A,B} o o b) R,(X,A ) + A m 1 B; k e y ( R 2 ) = key(R,) u {A} Case ( i i ) AB e key(R,) and B c X. T h i s i s e q u i v a l e n t t o j o i n i n g o v e r s t r u c t u r e d e n t i t i e s . o o o o a) R,(Y,A ,B ) + A 1 1 B ; k e y ( R 2 ) = key ( R , ) u {A,B} o o o o b) R,(Y,A ,B ) + A m 1 B ; k e y ( R 2 ) = key ( R , ) u {A} o o o o c) R,(Y,A ,B ) + A m--—n B ; k e y ( R 2 ) = key ( R , ) 37 Case ( i i i ) AB / key(R,) and B c X. o o o o a) R,(Y,A ,B ) + A m n B ; k e y ( R 2 ) = k e y ( R , ) 4.) E x c e p t i o n t o t h e r u l e s . A l l t h e r u l e s r e q u i r e t h a t t h e common a t t r i b u t e ( s ) i n a j o i n be mapped 'onto' i n t h e i r c o r r e s p o n d i n g m a p p i n g s . However, t h e r u l e s a r e v a l i d i f a l l t h e n e c e s s a r y ' o n t o s ' a r e r e p l a c e d by ' i n t o s ' , as l o n g a s we can g u a r a n t e e s e m a n t i c a l l y t h a t t h e e x t e n s i o n s o f t h e a t t r i b u t e s i n v o l v e d w i l l be e q u a l a t any d a t a b a s e i n s t a n c e . T hose a t t r i b u t e s a r e s a i d t o have e q u i v a l e n t domain e x t e n s i o n s . In e s s e n c e , o n t o - m a p p i n g s g u a r a n t e e e q u i v a l e n t domain e x t e n s i o n s of j o i n a t t r i b u t e s . 3.2.2 T r a n s f o r m a t i o n R u l e s f o r PJ/NF Schemas 1. ) The e x c l u s i v e mappings a r e t r a n s f o r m e d i n t o r e l a t i o n schemas as i n t h e BCNF r u l e s . 2. ) Common A t t r i b u t e G r o u p s : W i t h i n t h e -common a t t r i b u t e g r o u p s , t h e f o l l o w i n g r u l e s h o l d . o o a) A 1 1 B + B 1 1 C key(R) = {A,B,C} o o b) A 1 1 B + B m 1 C key(R) = {A,B} o o c ) A 1 m B + B m 1 C key(R) = {B} 3. ) The r e s u l t i n g schemas f r o m r u l e (2) c a n be c o m b i n e d s u c c e s s i v e l y w i t h o t h e r mappings i n t h e g r o u p as f o l l o w s : 38 Case ( i ) A e key(R,) and B ^ X o o a) R,(X,A ) + A 1 1 B; k e y ( R 2 ) = k e y ( R , ) u {A,B} o o b) R,(X,A ) + A m 1 B; k e y ( R 2 r ) = key(R,) u {A} Case ( i i ) AB e key(R,) and B c X. "Let Y = X - B. ' o o o o a) R,(Y,A ,B ) + A 1 1 B ; k e y ( R 2 ) = key(R,) u {A,B} o o o o b) R,(Y,A ,B ) + A m 1 B ; k e y ( R 2 ) = k e y ( R , ) u {A} o o o o c ) R,(Y,A ,B ) + A m n B ; k e y ( R 2 ) = k e y ( R , ) 4.) The e x c e p t i o n t o t h e r u l e s i n s e c t i o n 3.3.1 a l s o h o l d s f o r p r o j e c t i o n - j o i n n o r m a l form schemas. 3.3 B a s i s f o r t h e T r a n s f o r m a t i o n R u l e s C o m b i n i n g two o r more mappings i n t o a r e l a t i o n schema i s e q u i v a l e n t t o j o i n i n g t h e c o r r e s p o n d i n g a t o m i c r e l a t i o n s o f t h e m a p p i n g s . Thus, t h e r u l e s must a t l e a s t e n s u r e t h a t t h e j o i n s a r e l o s s l e s s . 3.3.1 N e c e s s a r y and S u f f i c i e n t C o n d i t i o n f o r a L o s s l e s s J o i n A j o i n i s l o s s l e s s ( i n t h e s y n t h e s i s c o n t e x t ) i f t h e r e s u l t i n g r e l a t i o n c a n be p r o j e c t e d back t o t h e o r i g i n a l r e l a t i o n s b e f o r e t h e j o i n . T h a t i s , i f r e l a t i o n schemas R(X,Y) and R(Y,Z) a r e j o i n e d o v e r Y i n t o R ( X , Y , Z ) , t h e j o i n i s l o s s l e s s i f t h e p r o j e c t i o n s o f R ( X , Y , Z ) , R[X,Y] and R [ Y , Z ] , a r e e q u a l t o R(X,Y) and R(Y,Z) r e s p e c t i v e l y . I t h as been o b s e r v e d i n (Codd 1979) t h a t j o i n s l o s e i n f o r m a t i o n when t h e r e l a t i o n s i n v o l v e d do not have e q u a l p r o j e c t i o n s on 39 t h e i r common a t t r i b u t e ( s ) . The o b s e r v a t i o n i s not j u s t an e x t e n s i o n a l concept; i t r e v e a l s an important semantic n o t i o n . Claim: A j o i n of r e l a t i o n schemas i s l o s s l e s s i f and onl y i f the common a t t r i b u t e s are mapped 'onto' i n t h e i r c orresponding mappings or the a t t r i b u t e s always have i d e n t i c a l domain e x t e n s i o n s . Proof ( S u f f i c i e n c y ) : Let R,(X,Y) and R 2(Y,Z) be r e l a t i o n schemas denoting two mappings M1(X Y) and M2(Y Z) r e s p e c t i v e l y . The j o i n of R,(X,Y) and R 2(Y,Z) over Y i s R(X,Y,Z) = {( x , y , z ) : (x,y) e R,(X,Y) and (y,z) e R 2(Y,Z)}. If the extensions of Y i n R, and R 2 are always equal, then at any database i n s t a n c e , f o r each "y" i n the ( x , y ) - t u p l e s of R,(X,Y) there e x i s t s at l e a s t one i d e n t i c a l "y" i n the ( y , z ) - t u p l e s of R 2 ( Y , Z ) . I f a p a r t i c u l a r "y" occurs n and m times i n the Y- columns of R,(X,Y) and R 2(Y,Z) r e s p e c t i v e l y , then the "y" w i l l occur n x m times i n the Y-column of R(X,Y,Z). Hence, every (x,y) and (y,z) p a i r s of R,(X,Y) and R 2(Y,Z) w i l l appear i n the XY-column and YZ-column of R(X,Y,Z), r e s p e c t i v e l y , at l e a s t once. T h e r e f o r e , a p r o j e c t i o n of R(X,Y,Z) over XY and YZ w i l l reproduce the o r i g i n a l r e l a t i o n s R,(X,Y) and R 2 ( Y , Z ) . Repeated t u p l e s are merged s i n c e p r o j e c t i o n i s a set o p e r a t i o n . Proof (Necessary C o n d i t i o n ) : Let R(X,Y,Z) be the j o i n of R,(X,Y) and R 2(Y,Z) as i n the s u f f i c i e n c y proof above. If at' any database i n s t a n c e the domain extension of Y i n R, i s not equal to that of R 2, there w i l l 40 e i t h e r be a y i n t h e Y-column of R, not i n t h e Y-column of R 2 i o r a y i n t h e Y-column o f R 2 n o t i n t h a t of R,. Hence, a t u p l e J (x ,y ) o r (y ,z ) w i l l n ot a p p e a r i n t h e XY-column o r YZ-column i i J J of R(X,Y,Z) r e s p e c t i v e l y . T h e r e f o r e , R[X,Y] and R [ Y , Z ] , t h e p r o j e c t i o n s o f R(X,Y,Z) o v e r XY and YZ w i l l n o t be e q u a l . t o R,(X,Y) and R 2 ( Y , Z ) . Thus, t h e j o i n w i l l n o t s a t i s f y t h e l o s s l e s s p r o p e r t y . 3.3.2 The C a n d i d a t e Keys The c a n d i d a t e keys i n t h e t r a n s f o r m a t i o n r u l e s i n s e c t i o n 3.3 a r e d e r i v e d a c c o r d i n g t o t h e f o l l o w i n g c l a i m : C l a i m : L e t t h e s e t of c a n d i d a t e keys of t h e r e l a t i o n schemas R,(X,Y) and R 2 ( X , Z ) be k e y ^ ) and k e y ( R 2 ) r e s p e c t i v e l y . The new s e t o f c a n d i d a t e keys k e y ( R 3 ) , a f t e r a l o s s l e s s j o i n of R, and R 2 o v e r t h e s e t o f a t t r i b u t e X, depends on whether X i s a c a n d i d a t e key i n R, or R 2. T h e r e a r e f o u r c a s e s . 1. ) I f X e k e y ( R , ) and X / k e y ( R 2 ) t h e n k e y ( R 3 ) = k e y ( R 2 ) . 2. ) I f X / k e y ( R , ) and X e k e y ( R 2 ) t h e n k e y ( R 3 ) = k e y ( R , ) . 3. ) I f X € key(R,) and X e k e y ( R 2 ) t h e n k e y ( R 3 ) = ( k e y ( R , ) u k e y ( R 2 ) } 4. ) I f X / key(R,) and X / k e y ( R 2 ) t h e n k e y ( R 3 ) = {key(R,) x k e y ( R 2 ) } p r o o f C a se (1) X e k e y ( R , ) and X / k e y ( R 2 ) . In a l o s s l e s s j o i n of R,(X,Y) and R 2 ( X , Z ) i n t o R 3 ( X , Y , Z ) , e a c h t u p l e of R 2 w i l l a p p e a r once i n t h e XZ-column of R 3. T h i s 41 i s t h e c a s e b e c a u s e f o r e v e r y x - v a l u e i n R 2 ( n o t n e c e s s a r i l y u n i q u e s i n c e X jL k e y ( R 2 ) ) r t h e r e i s a u n i q u e x - v a l u e i n R, . Hence, t h e number of e n t r i e s i n R 3 i s d e t e r m i n e d by t u p l e s o f R 2. T h e r e f o r e , t u p l e s of R 3 a r e d e t e r m i n e d u n i q u e l y by t h e k e y s o f R 2. Case (2) S i m i l a r l y , when X £ key(R,) and X e. k e y ( R 2 ) , t h e keys o f R, d e t e r m i n e t u p l e s o f R 3 u n i q u e l y . C ase (3) X e key(R,) arid X e k e y ( R 2 ) . T h e r e i s a o n e - t o - o n e c o r r e s p o n d e n c e between t u p l e s of R, and R 2. Hence t u p l e s of R 3 a r e d i r e c t c o n c a t e n a t i o n of t u p l e s o f R, and R 2 o v e r e q u a l x - v a l u e s . T h e r e f o r e , t h e t u p l e s of R 3 a r e u n i q u e l y d e t e r m i n e d by t h e k e y s i n key(R,) or k e y ( R 2 ) . C ase (4) X / key(R,) and X / k e y ( R 2 ) . In g e n e r a l , e v e r y x - v a l u e i n R, o r R 2 can a p p e a r more t h a n o n c e . F o r any x - v a l u e w i t h n and m e n t r i e s i n R, and R 2, r e s p e c t i v e l y , t u p l e s c o n t a i n i n g t h e p a r t i c u l a r x - v a l u e w i l l a p p e a r n x m t i m e s i n R 3. S i n c e X i s not a key i n R, o r R 2, t h e r e i s a t l e a s t an x - v a l u e i n e a c h of t h e X-columns o f R, and R 2 t h a t a p p e a r s more t h a n o n c e . The c a r d i n a l i t y o f R 3 i s a l w a y s g r e a t e r t h a n e i t h e r o f R, o r R 2. T h e r e f o r e , none o f t h e keys i n K ey(R,) o r k e y ( R 2 ) c a n u n i q u e l y d e t e r m i n e t u p l e s of R 3. I t i s o n l y a c o m b i n a t i o n of a key i n k e y f R ^ and one i n k e y ( R 2 ) t h a t d e t e r m i n e s t u p l e s of R 3 u n i q u e l y . The new s e t of c a n d i d a t e keys i s t h e c a r t e s i a n p r o d u c t of keyCR,) and k e y ( R 2 ) . 42 3.3.3 The BCNF R u l e s : V e r i f i c a t i o n The schemas r e s u l t i n g f r o m t h e t r a n s f o r m a t i o n r u l e s c a n be g r o u p e d i n t o two c a t e g o r i e s ; t h e a t o m i c r e l a t i o n schemas and t h o s e from r u l e s (2) and ( 3 ) . The c a n d i d a t e keys of an a t o m i c schema R(X,Y) f o r X 1 1 Y, X m 1 Y and X m n Y a r e {X,Y}, {X} and {XY} r e s p e c t i v e l y . In e a c h c a s e , t h e d e t e r m i n a n t s a r e a l s o c a n d i d a t e k e y s . The a t o m i c schemas a r e t r i v i a l l y i n BCNF. A c o m b i n a t i o n o f two o r more mappings i n r u l e s (2) and (3) i s e q u i v a l e n t t o a j o i n of t h e i r c o r r e s p o n d i n g a t o m i c r e l a t i o n s . The r u l e s a r e f o r m u l a t e d , s u c h t h a t t h e j o i n i s l o s s l e s s and e v e r y d e t e r m i n a n t i s a s u p e r k e y . We e n s u r e l o s s l e s s j o i n by c o m b i n i n g o v e r a t t r i b u t e s w h i c h have e q u i v a l e n t domain e x t e n s i o n s i n t h e i r c o r r e s p o n d i n g m a p p i n g s . The j o i n a t t r i b u t e s a r e e i t h e r mapped 'onto' or t h e y a r e i n v o l v e d i n ' i n t o ' mappings t h a t a l w a y s have t h e same e x t e n s i o n s . In R u l e ( 2 ) , t h e r e a r e e l e v e n d i s t i n c t c o m b i n a t i o n s . O n l y f o u r o f them s a t i s f y t h e BCNF c o n d i t i o n . L e t A B and B C be two mappings t o be comb i n e d i n t o a r e l a t i o n schema R ( A , B , C ) . The s e t of d e t e r m i n a n t s o f R, D e t ( R ) , and key(R) f o r t h e v a r i o u s c o m b i n a t i o n s a r e as f o l l o w s . D e t ( R ) Key(R) o o 1. ) * A 1 1 B + B 1 1 C {A,B,C} {A,B,C} O o 2. )* A 1 1 B + B m 1 C {A, B} {A,B} O o 3. ) A 1 1 B + B m n C {A, B} {BC} o o 4. ) * A 1 m B + B m- 1 C {B} {B} 43 o o 5 . ) A m 1 B + B 1 1 C {A,B,C} {A} O o 6 . ) A m 1 B + B m 1 C {A, B} {A} O o 7. ) A m 1 B + B 1 m C {A,C} {AC} o o 8 . ) A m 1 B + B m n C {A} {ABC} o o 9 . ) A m n B + B 1 1 C {B,C} {AB} o o 1 0 . ) A m n B + B m 1 C {B} {AB} o o 11. )* A m n B + B m n C {0} {ABC} The c o m b i n a t i o n s where D e t ( R ) c Key(R) p r o d u c e schemas t h a t c o n f o r m t o t h e Boyce-Codd n o r m a l f o r m . The c o m b i n a t i o n s w i t h a s t e r i s k s s a t i s f y t h i s c o n d i t i o n . They a r e t h e o n l y ones a l l o w e d i n r u l e ( 2 ) . R u l e ( 3 ) a l l o w s s u c c e s s i v e c o m b i n a t i o n of mappings w i t h schemas g e n e r a t e d by r u l e ( 2 ) . I f R,(X,A) i s t o be combined w i t h a mapping A B i n t o R 2 ( X , A , B ) , t h e r e a r e f o u r c a s e s t o be c o n s i d e r e d . Case ( 1 ) B ^ X and A e key(R,) Case ( 2 ) B X and A / key(R,) Case ( 3 ) B e X and AB e key ( R , ) Case ( 4 ) B c X and AB £ keyCR,) L e t D e t ( R , ) and Key ( R i ) be t h e s e t o f d e t e r m i n a n t s and t h e s e t of c a n d i d a t e keys of R, r e s p e c t i v e l y . The s e t of d e t e r m i n a n t s and t h e s e t of c a n d i d a t e keys of R 2 f o r t h e v a r i o u s c o m b i n a t i o n s a r e a s f o l l o w s : 44 = X - B. D e t ( R 2 ) D e t ( R , ) u {B} D e t ( R , ) D e t ( R , ) D e t ( R , ) u {B} D e t ( R 2 ) D e t ( R , ) u {A, D e U R , ) u {A} D e t ( R , ) D e t ( R , ) u {B} D e t ( R 2 ) K e y ( R 2 ) Key(R,) u {B} K e y ( R 1 ) { AB} {B} K e y ( R 2 ) K e y ( R 1 ) Key(R,) Key(R,) x {AB} Key(R,) x {B} Key(R,) F o r c a s e s (3) and ( 4 ) , l e t Y Case ( 1 ) o o a) * R, ( X , A ) + A 1 1 B o o b) * R T ( X , A ) + A m 1 B O O c) R, ( X , A ) + A m n B o o d) R , ( X , A ) + A 1 m B Case (2) o o a) R,(X,A ) + A 1 1 B o o b) R, (X,A ) + A m 1 B o o c) R, (X,A ) + A m n B o o d) R , ( X , A ) + A 1 m B Case (3) o o o o a ) * R,(Y,A ,B ) + A 1 1 B o o o o b) * R, (Y, A , B ) + A m 1 B o o o o c) * R, (Y,A ,B ) + A m n B D e t ( R , ) u {A,B} Key(R,) u {A,B} D e t ( R , ) u {A} Key(R,) u {A} D e t ( R , ) Key(R!) 45 Case (4) D e t ( R 2 ) K e y ( R 2 ) o o o o a) R T ( Y , A ,B ) + A 1- 1 B D e t ( R , ) u {A,B} Key(R1 ) o o o o b) R T ( Y , A ,B ) + A m- 1 B D e t ( R 1 ) u {A} Key(R,) o o o o c ) * R,(Y,A ,B ) + A m -n B D e t ( R , ) Key(R,) G i v e n t h a t D e ^ R , ) c KeyCR,), t h e c o m b i n a t i o n s w i t h a s t e r i s k s a r e s u c h t h a t D e t ( R 2 ) £ K e y ( R 2 ) . The c o m b i n a t i o n s form r u l e ( 3 ) . J o i n s a r e b o t h commutative and a s s o c i a t i v e (Aho e t a l 1979), t h e r e f o r e t h e o r d e r of c o m b i n a t i o n w i t h i n t h e g r o u p s i s i m m a t e r i a l . 3.3.4 The PJ/NF R u l e s : V e r i f i c a t i o n A j o i n dependency c o n s t r a i n t JD*(X,Y,...,Z) i n a r e l a t i o n schema R, where X,Y,...,Z a r e c o m b i n a t i o n s o f a t t r i b u t e s of R, s t a t e s t h a t R i s t h e j o i n of i t s p r o j e c t i o n s o v e r X,Y,...,Z. A r e l a t i o n R i s i n p r o j e c t i o n - j o i n n o r m a l form (PJ/NF) i f and o n l y i f e v e r y j o i n dependency i s i m p l i e d by a c a n d i d a t e key o f R. A JD*(X,Y,...,Z) i n R i s i m p l i e d by c a n d i d a t e k e y s of R i f t h e j o i n a t t r i b u t e s i n X,Y,...,Z u n i q u e l y d e t e r m i n e t u p l e s of R. The a t o m i c r e l a t i o n s a r e t r i v i a l l y i n PJ/NF s i n c e t h e f u l l m appings a r e nondecomposable and no mapping can be d e r i v e d f r o m some o t h e r mappings. The r u l e s i n (2) and (3) e n s u r e t h a t t h e j o i n s a r e l o s s l e s s and t h a t t h e j o i n a t t r i b u t e s o f a l l t h e JDs i n a r e s u l t i n g schema a r e c a n d i d a t e keys i n t h e schema. 46 L e t R ( X 1 u...u Xn) be a r e l a t i o n schema r e s u l t i n g from a c o m b i n a t i o n o f n mappings r e p r e s e n t e d by t h e a t o m i c schemas R!(X!),...,Rn(Xn) r e s p e c t i v e l y . L e t { Y Y n - 1 } be t h e j o i n a t t r i b u t e s of R,(X,) and R 2 ( X 2 ) , R 2 ( X 2 ) and R 3 ( X 3 ) , , R n - l ( X n - l ) and Rn(Xn) r e s p e c t i v e l y . The JD* ( X , , X 2 , . . . , X n ) h o l d s i n R. And, s i n c e l o s s l e s s j o i n s a r e b o t h a s s o c i a t i v e and com m u t a t i v e (Aho e t a l 1979), e v e r y (2,3, . . . ,n-1 ) c o m b i n a t i o n s of R , ( X , ) , . . . , R n ( X n ) a r e JDs i n R. The number o f s u c h JDs i s n-2 n g i v e n by 1 + E C ( A p p e n d i x I ) . T h e r e a r e no o t h e r JDs i n R r= 1 r a p a r t from t h o s e on t h e j o i n a t t r i b u t e s {Y,......,Yn-1}. T h i s i s t h e 'case b e c a u s e of t h e n e c e s s a r y and s u f f i c i e n t c o n d i t i o n f o r a l o s s l e s s j o i n . F o r any two of t h e mappings r e p r e s e n t e d by R (X ) i i and R (X ), i f t h e r e i s an a t t r i b u t e A common t o b o t h X and 3 3 i i X , A c Y , o r R(X&i not a n a t u r a l j o i n of R and R . Hence, D i i i J th e r u l e s i n (2) and (3) need o n l y e n s u r e t h a t t h e common a t t r i b u t e s i n t h e s u c c e s s i v e j o i n s a r e c a n d i d a t e k e y s . T h e r e a r e e l e v e n d i s t i n c t c o m b i n a t i o n s o f two mappings o f t h e f o r m A B and B C i n t o a schema R(A,B,C) as i n s e c t i o n 3.4.3. O n l y t h r e e of t h e c o m b i n a t i o n s have t h e j o i n a t t r i b u t e B as a member of t h e s e t o f c a n d i d a t e k e y s . R u l e (2) c o n s i s t s of t h o s e t h r e e c o m b i n a t i o n s . R u l e (3) a l l o w s s u c c e s s i v e c o m b i n a t i o n of mappings w i t h r e l a t i o n schemas g e n e r a t e d by r u l e ( 2 ) . O n l y f i v e o f t h e d i s t i n c t c o m b i n a t i o n s have t h e i r j o i n a t t r i b u t e s as c a n d i d a t e k e y s ; t h e r u l e c o n s i s t s o f t h o s e c o m b i n a t i o n s . 47 CHAPTER IV . E v a l u a t i o n of t h e F u l l Mapping A p p r o a c h T h e r e a r e two a s p e c t s t o t h i s t h e s i s . The f u l l mapping as a means of e x p r e s s i n g r e l a t i o n s h i p s between d a t a i t e m s o f a d a t a b a s e i s p r o p o s e d . T r a n s f o r m a t i o n r u l e s t o g e n e r a t e r e l a t i o n schemas from t h e mappings a r e a l s o p r e s e n t e d . An a s s e s s m e n t o f t h e f u l l mappings as an a l t e r n a t i v e t o f u n c t i o n a l and m u l t i v a l u e d d e p e n d e n c i e s , and a c o m p a r i s o n of schema d e s i g n v i a t h e t r a n s f o r m a t i o n r u l e s w i t h o t h e r d e s i g n methods a r e g i v e n i n t h i s c h a p t e r . In s e c t i o n 4.1, we run t h r o u g h a s e r i e s of examples t h a t show how a n o m a l i e s a r e e l i m i n a t e d i n t h e n o r m a l i z a t i o n p r o c e s s . The e f f e c t of f u l l mapping and t h e t r a n s f o r m a t i o n r u l e s a r e d i s c u s s e d w i t h t h e e x a m p l e s . A d e s i g n example i s g i v e n i n s e c t i o n 4.3. 4.1 The N o r m a l i z a t i o n P r o c e s s A r e l a t i o n i s s a i d t o be i n a p a r t i c u l a r n o r m a l f o r m i f i t s a t i s f i e s some c o n s t r a i n t s w h i c h a r e known t o p r e v e n t c e r t a i n u p d a t e p r o b l e m s . The f o l l o w i n g examples from (Date 1980) i l l u s t r a t e t h e n o r m a l i z a t i o n p r o c e s s . The examples a r e b a s e d on a r e l a t i o n c o n t a i n i n g i n f o r m a t i o n a b o u t s u p p l i e r s o f machine components, t h e p a r t s / q u a n t i t y s u p p l i e d and c i t i e s where s u p p l i e r s a r e l o c a t e d . 4.1.1 The s e c o n d Normal Form P r o b l e m The i n f o r m a t i o n i n t h e s u p p l i e r - p a r t e n v i r o n m e n t can be r e p r e s e n t e d as a t a b l e w i t h no a t t r i b u t e - v a l u e s r e p e a t e d i n t h e rows. Such a t a b l e c an be d e s c r i b e d by a r e l a t i o n schema 48 FIRST(SNO, STATUS, CITY, PNO, QTY). The r e l a t i o n i s s a i d t o be i n f i r s t n o r m a l f o r m . The f u n t i o n a l d e p e n d e n c i e s i n FIRST a r e shown i n f i g u r e 13. F i g u r e 13. The FDs i n R e l a t i o n schema FIRST The r e l a t i o n schema FIRST s u f f e r s f r o m c e r t a i n a n o m a l i e s . 1. ) I n s e r t i o n : I t i s n o t p o s s i b l e t o e n t e r t h e f a c t t h a t a p a r t i c u l a r s u p p l i e r i s l o c a t e d i n a c i t y u n t i l t h a t s u p p l i e r s u p p l i e s a t l e a s t one p a r t . T h i s i s t h e c a s e b e c a u s e no component of a p r i m a r y key may be n u l l . 2. ) D e l e t i o n : A t u p l e c o n t a i n i n g a s u p p l i e d p a r t i s d e l e t e d when t h e c o r r e s p o n d i n g s u p p l i e r no l o n g e r s u p p l i e s t h a t p a r t . I f t h e o n l y t u p l e f o r a p a r t i c u l a r s u p p l i e r i s d e l e t e d , t h e i n f o r m a t i o n t h a t t h e s u p p l i e r r e s i d e s i n a c i t y i s d e s t r o y e d . 3. ) Redundancy: The c i t y - v a l u e f o r a s u p p l i e r a p p e a r s i n a s many t u p l e s as t h e r e a r e p a r t s s u p p l i e d by t h e s u p p l i e r . The r e d u n d a n c y c a u s e s u p d a t e s e a r c h p r o b l e m s and g i v e s room f o r p o t e n t i a l i n c o n s i s t e n c i e s . A p o s s i b l e n o r m a l i z a t i o n s o l u t i o n i s t o decompose t h e r e l a t i o n schema FIRST i n t o SECOND(SNO, STATUS, CITY) and SP(SNO, PNO, QTY). The s o l u t i o n e l i m i n a t e s t h e n o n f u l l f u n c t i o n a l d e p e n d e n c i e s o f STATUS and CITY on t h e key. The n o n f u l l f u n c t i o n a l d ependency p r o b l e m i s s u f f e r e d by e v e r y 49 r e l a t i o n t h a t i s n o t i n s e c o n d n o r m a l f o r m (2NF). A r e l a t i o n i s 2NF i f and o n l y i f i t i s i n f i r s t n o r m a l form and e v e r y nonkey a t t r i b u t e i s f u l l y d e p e n d e n t on t h e p r i m a r y key. A f o r m a l a n a l y s i s of t h e s u p p l i e r - p a r t e n v i r o n m e n t w i l l r e v e a l t h e f o l l o w i n g f a c t s from w h i c h f u l l mappings can be d e r i v e d . 1. ) The q u a n t i t y of a p a r t (QTY) i s o n l y m e a n i n g f u l when a s s o c i a t e d w i t h a p a r t and i t s s u p p l i e r . T h e r e f o r e , t h e a s s o c i a t i o n (PNO SNO) i s i n d i v i s i b l e and s h o u l d be t r e a t e d a s an e n t i t y . 2. ) A s u p p l i e r s u p p l i e s many p a r t s and a p a r t may be s u p p l i e d by many s u p p l i e r s . 3. ) A s u p p l i e r i s l o c a t e d i n a c i t y even when he c u r r e n t l y s u p p l i e s no p a r t s . T h e r e may be some c i t i e s w i t h o u t s u p p l i e r s . The f u l l mappings i n t h e a p p l i c a t i o n e n v i r o n m e n t a r e a s f o l l o w s . SNO CITY (many-one, o n t o - i n t o ) CITY STATUS (one-one, o n t o - o n t o ) SNO PNO (many-many, i n t o - o n t o ) (SNO, PNO) QTY (many-one, o n t o - o n t o ) Some f a c t s t h a t a r e n o t r e v e a l e d by t h e f u n c t i o n a l d e p e n d e n c i e s c a n now be e x p r e s s e d . The f u n c t i o n a l d e p e n d e n c y o f CITY on SNO i s r e p r e s e n t e d by t h e many-one mapping between SNO and CITY. But t h e o n t o / i n t o s t a t u s of t h e f u l l mapping f u r t h e r s t a t e s t h a t t h e r e c an be some c i t i e s w i t h i n t h e d a t a b a s e w i t h no s u p p l i e r s r e s i d i n g i n them. The u p d a t e p r o b l e m s i n FIRST a r i s e a s a r e s u l t o f c o m b i n i n g r e l a t i o n s o v e r an a t t r i b u t e i n v o l v e d i n ' i n t o ' and 'on t o ' m a p pings. The schema SP(SNO, PNO, QTY) s h o u l d n e v e r have been combined w i t h (SNO C I T Y ) . 50 4.1.2 The T h i r d Normal Form p r o b l e m The r e l a t i o n schema SECOND(SNO, STATUS, CITY) s t i l l s u f f e r s f r o m c e r t a i n u p d a t e p r o b l e m s . 1. ) I n s e r t i o n : I t i s n o t p o s s i b l e t o e n t e r (CITY, STATUS) v a l u e u n t i l t h e r e a r e some s u p p l i e r s l o c a t e d i n t h a t c i t y . 2. ) D e l e t i o n : S i m i l a r l y , i f t h e o n l y s u p p l i e r i n a c i t y i s d e l e t e d , t h e c i t y / s t a t u s i n f o r m a t i o n i s l o s t . 3. ) Redundancy: T h e r e i s s t i l l some r e d u n d a n c y due t o c i t y / s t a t u s v a l u e t h a t i s b e i n g r e p e a t e d f o r a s many s u p p l i e r s i n a c i t y . A n o r m a l i z a t i o n s o l u t i o n r e p l a c e s t h e r e l a t i o n schema SECOND by SC(SNO, CITY) and CS( CITY , STATUS). T h e r e i s a t r a n s i t i v e d ependence of STATUS on SNO. SECOND i s n o t i n t h i r d n o r m a l f o r m . A r e l a t i o n i s i n t h i r d n o r m a l form (3NF) i f and o n l y i f e v e r y nonkey a t t r i b u t e i s n o n t r a n s i t i v e l y d e p e n d e n t on th e p r i m a r y key. From t h e f u l l mapping v i e w p o i n t , (SNO CITY) and (CITY STATUS) a r e not c o m b i n a b l e . The j o i n a t t r i b u t e CITY i s mapped 'o n t o ' i n CITY STATUS and ' i n t o ' i n SNO CITY. T h i s i n f a c t , e x p l a i n s t h e i n s e r t i o n / d e l e t i o n p r o b l e m s i n SECOND more t h a n t r a n s i t i v e d e p e n d e n c y . The i n s e r t i o n and d e l e t i o n a n o m a l i e s w i l l n ot o c c u r i f CITY i s mapped 'onto' i n b o t h m a p p i n g s . 4.1.3 The Boyce-Codd Normal Form P r o b l e m The r e l a t i o n schema SCHOOL( STUDENT, SUBJECT, TEACHER) o r i g i n a l l y a p p e a r e d i n (Da t e 1980). The f u n c t i o n a l d e p e n d e n c y d i a g r a m and a sample r e l a t i o n a r e g i v e n i n f i g u r e 14 and 15 51 r e s p e c t i v e l y . The f o l l o w i n g f a c t s a r e t r u e i n t h e a p p l i c a t i o n STUDENT SUBJECT F i g u r e 14. FDs i n R e l a t i o n Schema SCHOOL e n v i r o n m e n t . 1. ) F o r e v e r y s u b j e c t , a s t u d e n t i s t a u g h t by o n l y one t e a c h e r . 2. ) A t e a c h e r t e a c h e s o n l y one s u b j e c t , b u t e a c h s u b j e c t i s t a u g h t by s e v e r a l t e a c h e r s . STUDENT SUBJECT TEACHER SMITH MATH P r o f WHITE SMITH PHYSICS P r o f GREEN JONES MATH P r o f WHITE JONES PHYSICS P r o f BROWN F i g u r e 15. Sample R e l a t i o n f o r schema SCHOOL The r e l a t i o n SCHOOL i s i n 3NF, but i t s u f f e r s from c e r t a i n u p d a t e p r o b l e m s . We can n o t d e l e t e s u c h i n f o r m a t i o n a s " J o n e s i s s t u d y i n g p h y s i c s " w i t h o u t l o s i n g t h e i n f o r m a t i o n t h a t p r o f . Brown t e a c h e s p h y s i c s . The p r o b l e m a r i s e s from TEACHER b e i n g a d e t e r m i n a n t , b ut not a c a n d i d a t e key i n t h e r e l a t i o n . T h i s i s 5 2 t h e B o y c e - C o d d normal form p r o b l e m . We r e c a l l t h a t a r e l a t i o n R i s i n B o yce-Codd normal form i f and o n l y i f e v e r y d e t e r m i n a n t i s a c a n d i d a t e key i n R. A n o r m a l i z a t i o n s o l u t i o n decomposes t h e schema i n t o ST( STUDENT, TEACHER ) and TS( TEACHER, SUBJECT). B o t h ST and TS a r e i n BCNF and t h e u p d a t e p r o b l e m i s t a k e n c a r e o f . But d i f f e r e n t p r o b l e m s have been i n t r o d u c e d . The ST r e l a t i o n does not p r o v i d e much u s e f u l i n f o r m a t i o n . The r e l a t i o n s h i p between a s t u d e n t and a t e a c h e r i s o n l y m e a n i n g f u l w i t h r e s p e c t t o s u b j e c t . The f u l l mappings f o r t h e d a t a b a s e a r e as f o l l o w s : o o SUBJECT m n STUDENT O O (SUBJECT, STUDENT) 1 1 TEACHER o o TEACHER m 1 STUDENT From t h e BCNF t r a n s f o r m a t i o n r u l e s , TS( TEACHER, SUBJECT) and SST( SUBJECT, STUDENT, TEACHER) a r e g e n e r a t e d . The r e l a t i o n schema SST i s a t o m i c and can not be decomposed. I t i s a l s o not p o s s i b l e t o combine TS and SST. T h i s c a s e a c t u a l l y t u r n s o u t t o be an example where no amount of d e c o m p o s i t i o n w i l l p r o d u c e t h e d e s i r e d r e l a t i o n s . 4.1.4 The f o u r t h Normal Form P r o b l e m A r e l a t i o n schema CTX( COURSE, TEACHER, TEXT ) d e s c r i b e s a s i t u a t i o n where, f o r any g i v e n c o u r s e , t h e r e may e x i s t any number o f c o r r e s p o n d i n g t e a c h e r s and t e x t s . TEACHER and TEXT a r e assumed t o be i n d e p e n d e n t . T h a t i s , ' t h e r e a r e m u l t i v a l u e d d e p e n d e n c i e s COURSE—>-->TEXT and COURSE—>—>TEACHER i n CTX. 53 T h e r e a r e no f u n c t i o n a l d e p e n d e n c i e s . A sample r e l a t i o n i s g i v e n i n f i g u r e 16. COURSE TEACHER TEXT PHYSICS P r o f . GREEN BASIC MECHANICS PHYSICS P r o f . GREEN PRINCIPLES OF OPTICS PHYSICS P r o f . BROWN BASIC MECHANICS PHYSICS P r o f . BROWN PRINCIPLES OF OPTICS MATH P r o f . WHITE MODERN ALGEBRA MATH P r o f . WHITE PROJECTIVE GEOMETRY F i g u r e 16 Sample R e l a t i o n f o r CTX The r e l a t i o n CTX c o n t a i n s a l o t o f r e d u n d a n c y . A new t e x t f o r a c o u r s e w i l l r e q u i r e e n t r i e s f o r e v e r y t e a c h e r t h a t t e a c h e s t h e c o u r s e . A s o l u t i o n i s t o decompose CTX i n t o CT(COURSE, TEACHER ) and CX(COURSE, TEXT ) b a s e d on t h e m u l t i v a l u e d d e p e n d e n c i e s of TEXT and TEACHER on COURSE. The p r o b l e m i s t h a t CTX i s n o t i n f o u r t h n o r m a l f o r m . A r e l a t i o n i s i n f o u r t h n o r m a l form (4NF) i f and o n l y i f whenever t h e r e e x i s t s an MVD i n R, s a y A-->-->B, t h e n a l l a t t r i b u t e s of R a r e a l s o f u n c t i o n a l l y d e p e n d e n t on A. o o The f u l l mappings i n CTX a r e TEACHER m 1 COURSE and o o TEXT m 1 COURSE . I f t h e two mappings a r e co m b i n e d o v e r COURSE, t h e n a g i v e n 54 c o u r s e has t o be r e p e a t e d f o r a l l t h e t e a c h e r / t e x t c o m b i n a t i o n s . T h i s i s p r e c i s e l y what f o u r t h n o r m a l form i s t o e l i m i n a t e . A l t h o u g h we d i d n o t d e f i n e r u l e s f o r g e n e r a t i n g 4NF schemas, t h e PJ/NF r u l e s w i l l not a l l o w a j o i n of (TEACHER COURSE) and (TEXT COURSE). P r o j e c t i o n - j o i n n o r m a l form r e l a t i o n s a l s o c o n f o r m t o t h e f o u r t h n o r m a l form ( F a g i n 1979). 4.1.5 The P r o j e c t i o n - j o i n n o r m a l form p r o b l e m We r e c a l l t h a t a j o i n d e p e ndency c o n s t r a i n t JD*(X,Y,...,Z) h o l d s i n a r e l a t i o n R, i f R i s e q u i v a l e n t t o t h e j o i n of i t s p r o j e c t i o n s o v e r X,Y,...,Z where X,Y,...,Z a r e c o m b i n a t i o n s of a t t r i b u t e s of R. However, as i l l u s t r a t e d i n s e c t i o n 2.1, JD c o n s t r a i n t s a r e n o t e a s y t o m a i n t a i n . The r e l a t i o n SPJ i n f i g u r e 7 s u f f e r s f r o m a number of u p d a t e p r o b l e m s due t o i t s JD c o n s t r a i n t . An i n s e r t i o n of a t u p l e may c a l l f o r o t h e r t u p l e s t o be i n s e r t e d . S i m i l a r l y , a d e l e t i o n may r e q u i r e t h a t some o t h e r t u p l e s be d e l e t e d . However, not a l l JD c o n s t r a i n t s have t h e u p d a t e m a i n t e n a n c e p r o b l e m s . R e l a t i o n s w i t h s u c h p r o b l e m - f r e e JDs a r e s a i d t o be i n p r o j e c t i o n - j o i n n o r m a l f o r m . A r e l a t i o n R i s i n p r o j e c t i o n - j o i n n o r m a l form (PJ/NF) i f and o n l y i f e v e r y j o i n d e p e ndency i n R i s i m p l i e d by a c a n d i d a t e key o f R. A j o i n d ependency JD* (X, Y, ... . , Z) i n R i s i m p l i e d by a c a n d i d a t e key o f R i f t h e j o i n a t t r i b u t e s i n X,Y,...,Z u n i q u e l y d e t e r m i n e t u p l e s o f R. The p r o b l e m s i n t h e r e l a t i o n SPJ a r i s e b e c a u s e t h e j o i n a t t r i b u t e s S, P, and J a r e n o t k e y s . The f u l l mappings f o r t h e o o o o r e l a t i o n s h i p s i n SPJ a r e S m n P , P m n J and 55 o o S m n J . A c c o r d i n g t o t h e PJ/NF t r a n s f o r m a t i o n r u l e s , t h e r e l a t i o n s R(S, P ) , R(P, J ) and R(S, J ) a r e i n t h e i r f i n a l form; t h e y c a n not be combined i n any way. 4.2 F u l l M a pping v e r s u s FDs and MVDs F u l l mapping, as a means o f e x p r e s s i n g r e l a t i o n s h i p s between d a t a i t e m s of a d a t a b a s e , compares f a v o r a b l y w i t h f u n c t i o n a l and m u l t i v a l u e d d e p e n d e n c i e s . Two o f t h e t h r e e b a s i c mapping t y p e s c an e x p r e s s f u n c t i o n a l d e p e n d e n c y . L e t X and Y be a t t r i b u t e s r e p r e s e n t i n g some e n t i t y or p r o p e r t y - v a l u e s e t s . 1. ) One-one mapping X 1 1 Y e x p r e s s e s t h e f u n c t i o n a l d e p e n d e n c y of Y on X and v i c e v e r s a . T h a t i s , X >Y and Y >X. The c o r r e s p o n d i n g a t o m i c r e l a t i o n schema i s e i t h e r R(X, Y) o r R(X, Y ) . 2. ) Many-one mapping X m 1 Y e x p r e s s e s t h e f u n c t i o n a l d e p e n d e n c y of Y on X as w e l l as t h e f a c t t h a t X i s n o t f u n c t i o n a l l y d e p e n d e n t on Y. T h a t i s , X >Y and Y-/->X. The c o r r e s p o n d i n g a t o m i c r e l a t i o n schema i s R(X, Y ) . The mappings e x p l i c i t l y s p e c i f y f u n c t i o n a l r e l a t i o n s h i p s i n b o t h d i r e c t i o n s . The same amount o f i n f o r m a t i o n c a n o n l y be i n f e r r e d from two or more f u n c t i o n a l d e p e n d e n c i e s . I n a d d i t i o n , t h e i n t o / o n t o s t a t u s o f t h e mappings p r o v i d e s some i n f o r m a t i o n t h a t c a n n o t be e x p r e s s e d i n f u n c t i o n a l d e p e ndency i o s p e c i f i c a t i o n s . A mapping. NAME m 1 PHONE i n a company d a t a b a s e e x p r e s s e s t h e f u n c t i o n a l d ependency o f PHONE on NAME. But i n a d d i t i o n , i t s p e c i f i e s t h e f a c t t h a t , a t a d a t a b a s e i n s t a n c e , a name may have no phone a s s o c i a t e d w i t h i t . 56 N o n f u n c t i o n a l r e l a t i o n s h i p s c a n be e x p r e s s e d a s many-many o r many-one mapping. The many-one mapping, a s shown a b o v e , s p e c i f i e s a f u n c t i o n a l d e p e n d e n c y i n one d i r e c t i o n and n o n f u n c t i o n a l r e l a t i o n s h i p i n t h e o t h e r . The many-many mapping s p e c i f i e s n o n f u n c t i o n a l r e l a t i o n s h i p i n b o t h d i r e c t i o n s : X-/->Y and Y-/->X. The c o r r e s p o n d i n g r e l a t i o n schema i s R(X, Y ) . The m u l t i v a l u e d d e pendency, a s a means o f e x p r e s s i n g n o n f u n c t i o n a l r e l a t i o n s h i p s , i s s u c h t h a t i t i s o n l y r e c o g n i z a b l e when i t c o e x i s t s w i t h a n o t h e r one i n t h e same r e l a t i o n . W h i l e an FD X >Y i s d e f i n e d o n l y i n terms o f t h e s e t s X and Y, t h e v a l i d i t y of an MVD X-->-->Y i n a r e l a t i o n R(U) depends on t h e v a l u e s of a l l t h e a t t r i b u t e s i n U. The MVD c a n not be d e r i v e d f r o m R [ X Y ] . L e t X and Y be s u b s e t s o f U, and l e t Z be t h e complement ( i n U) of t h e u n i o n XY. F o r any r e l a t i o n R ( U ) , t h e m u l t i v a l u e d d ependency x-->-->Y h o l d s i n R i f and o n l y i f R i s t h e n a t u r a l j o i n o f i t s p r o j e c t i o n s R[XY] and R [ X Z ] . The MVD X — > — > Z a l s o h o l d s i n R ( B e e r i e t a l 1977). By d e f i n i t i o n , MVDs a r e not o n l y u n i n t u i t i v e , b u t t h e i r p r o p e r t i e s a r e n o t w e l l u n d e r s t o o d . An MVD may h o l d i n a p r o j e c t i o n , b u t not i n t h e p a r e n t r e l a t i o n . Such MVDs a r e s a i d t o be embedded . Some embedded MVDs c a n be o b t a i n e d by p r o j e c t a b i 1 i t y from t h e MVDs i n t h e p a r e n t r e l a t i o n . The p r o j e c t a b i l i t y r u l e s t a t e s t h a t i f X-->-->Y h o l d s i n R(U) and X c Z c W, t h e n X-->—>(Y n Z) h o l d s i n R [ Z ] , The MVDs t h a t c a n not be d e r i v e d a r e s a i d t o be l a t e n t i n t h e r e l a t i o n ( Z a n i o l o and M e l k a n o f f 1981). 57 The many-many and many-one mappings a r e e q u i v a l e n t t o t r i v i a l MVDs. An MVD X — > — > Y wh i c h h o l d s i n R(U) i s t r i v i a l i f U = X u Y o r Y c X . The r e l a t i o n s h i p s between t h e mappings and MVDs c a n be examined i n r e l a t i o n s d e r i v e d from two o r more m a p p i n g s . But t h e o n l y i n f e r e n c e r u l e f r o m p r o j e c t i o n s t o a j o i n i s t h e j o i n a b i l i t y r u l e w h i c h s t a t e s t h a t : i f 1.) R(W u Z) = S(W).P(Z) 2. ) X — > — >Y h o l d s i n ' S ( W ) , and 3. ) Y n Z = 0 t h e n X—>-->Y h o l d s i n R(W u Z ) - ( Z a n i o l o and M e l k a n o f f 1981). The j o i n a b i l i t y r u l e , a s d e f i n e d a b o v e , o n l y d e a l s w i t h c a s e s where t h e a t t r i b u t e s Y and Z a r e d i s j o i n t . And, as s t a t e d i n s e c t i o n 2.3.2, two many-many mappings E m n S and E m n D w i l l , i n g e n e r a l , n ot a p p e a r as an MVD i f t h e r e e x i s t s some r e l a t i o n s h i p s i n v o l v i n g S and D (Kent 1981)-. T h u s , b e c a u s e of the n a t u r e o f m u l t i v a l u e d d e p e n d e n c i e s , t h e r e l a t i o n s h i p between them and mapping t y p e s i s n o t q u i t e c l e a r . However b o t h MVDs and f u l l mappings model t h e many-many r e l a t i o n s h i p s , but some MVDs may be h i d d e n i n d a t a b a s e r e l a t i o n s . The mappings a r e s u c h t h a t t h e y c an be r e c o v e r e d by p r o j e c t i o n o v e r t h e j o i n a t t r i b u t e s . The c o m b i n a t i o n o f two mappings X Y and X Z r e s u l t s i n MVDs X — > — > Y | Z , i f t h e r e a r e no o t h e r mappings between Y and Z i n t h e same r e l a t i o n . The i n t o / o n t o s t a t u s o f f u l l mappings d i c t a t e s w h i c h mappings c a n be c o m b i n e d . Hence, g i v e n a l l t h e mappings i n an a p p l i c a t i o n e n v i r o n m e n t , t h e y c a n be e x p r e s s e d a s t r i v i a l and n o n t r i v i a l MVDs d e p e n d i n g on w h i c h mappings a r e com b i n e d . The f u l l mappings p r o v i d e a t l e a s t as much i n f o r m a t i o n as t h e m u l t i v a l u e d d e p e n d e n c i e s . 58 4.3 D e s i g n Example The d e s i g n example i n (Wong e t a l 1980) w i l l be u s e d t o i l l u s t r a t e t h e t r a n s f o r m a t i o n r u l e s f o r g e n e r a t i n g r e l a t i o n schemas from f u l l m a p p i n g s . From t h e example, t h e f o l l o w i n g e n t i t y and p r o p e r t y - v a l u e s e t s c a n be i d e n t i f i e d . E n t i t y S e t s P r o p e r t y - v a l u e S e t s EMPNO ENAME, BIRTHYR DEPTNO DNAME, LOCATION JOBNO T I T L E , SALARY A p r o p e r t y has a one-one o r one-many c o r r e s p o n d e n c e w i t h an e n t i t y s e t . E v e r y e n t i t y has e x a c t l y one p r o p e r t y and e v e r y e l e m e n t i n a p r o p e r t y - v a l u e s e t i s a s s o c i a t e d w i t h some e n t i t i e s . The v a l u e a s s o c i a t i o n s a r e as f o l l o w s . o o 1 . ) EMPNO m 1 ENAME o o 2.) EMPNO m 1 BIRTHYR 3 . ) DEPTNO 1 1 DNAME 4.) DEPTNO m 1 LOCATION 5.) JOBNO m 1 T I T L E 6.) JOBNO m 1 SALARY T h e r e a r e s i x e n t i t y s e t a s s o c i a t i o n s . o o 1.) EMPNO m 1 DEPTNO d e r i v e d from t h e a s s o c i a t i o n 59 w o r k s - i n ( E M P , D E P T ) . o i 2. ) EMPNO 1 1 JOBNO d e r i v e d f r o m a s s i g n m e n t ( E M P , J O B ) a s s o c i a t i o n . I t i s assumed t h a t a j o b may not be f i l l e d . The a s s u m p t i o n i s c o n s i s t e n t w i t h t h e example and t h e d e f i n i t i o n o f a s s o c i a t i o n i n (Wong e t a l 1980). o i 3. ) DEPTNO 1 1 EMPNO d e r i v e d from t h e mgr(DEPT,EMP) r e l a t i o n s h i p . A d e p a r t m e n t may have o n l y one manager b e c a u s e t h e "mgr" r e l a t i o n s h i p i s s i n g l e - v a l u e d . o o 4. ) EMPNO m n JOBNO d e r i v e d f r o m t h e g e n e r a l r e l a t i o n s h i p q u a l i f i e d ( E M P , J O B ) . L e t us assume t h a t e v e r y j o b has some q u a l i f i e d employee and t h a t an employee i s q u a l i f i e d f o r a t l e a s t one j o b . o o 5. ) DEPTNO m n JOBNO d e r i v e d from t h e r e l a t i o n s h i p a l l o c a t i o n ( D E P T , J O B ) . A j o b may be a l l o c a t e d t o more t h a n one d e p a r t m e n t and a d e p a r t m e n t may have many j o b s . o o o o 6. ) (DEPTNO m n JOBNO ) 1 1 NUMBER d e f i n e s a p r o p e r t y "number" on t h e a l l o c a t i o n r e l a t i o n s h i p . U s i n g t h e t r a n s f o r m a t i o n r u l e s , BCNF r e l a t i o n schemas c a n be g e n e r a t e d from t h e ma p p i n g s . A l l t h e mappings a r e i n one common a t t r i b u t e g r o u p . They c an be combined a s f o l l o w s . o o o o 1.) EMPNO m 1 ENAME + EMPNO m 1 BIRTHYR O i O O + EMPNO 1 1 JOBNO + EMPNO m 1 DEPTNO 60 o o o o 2. ) DEPTNO 1 1 DNAME + DEPTNO m 1 LOCATION o i + DEPTNO 1 1 EMPNO o o o o 3. ) JOBNO 1 1 T I T L E + JOBNO m -1 SALARY o o 4. ) EMPNO m n JOBNO O O O 5. ) (DEPTNO m n JOBNO ) 1 1 NUMBER o o + DEPTNO m n JOBNO The c o r r e s p o n d i n g r e l a t i o n schemas and c a n d i d a t e k e y s a r e as f o l l o w s . 1. ) EMP(EMPNO, ENAME, BIRTHYR, DEPTNO, JOBNO) {EMPNO, JOBNO} 2. ) DEPT(DEPTNO, DNAME, LOCATION, EMPNO) {DEPTNO, DNAME, EMPNO} 3. ) JOB(JOBNO, T I T L E , SALARY) {JOBNO, TITLE} 4. ) QUALIFIED(EMPNO, JOBNO) {EMPNO-JOBNO} 5. ) ALLOCATION(DEPTNO, JOBNO, NUMBER) {DEPTNO-JOBNO} 4.4 F u l l M a pping A p p r o a c h v e r s u s o t h e r D e s i g n Methods T h i s s e c t i o n compares t h e f u l l m a p p i n g / t r a n s f o r m a t i o n r u l e s , a s a d e s i g n a p p r o a c h , w i t h o t h e r m e t h o d o l o g i e s d i s c u s s e d i n c h a p t e r ( 2 ) . The c o m p a r i s o n has two s i d e s t o i t . A c o m p a r i s o n i s made between f u l l mappings and t h e i n p u t s i n o t h e r methods, as w e l l as between t h e n a t u r e o f t h e t r a n s f o r m a t i o n r u l e s and o t h e r d e s i g n p r o c e s s e s . 61 4.4.1 S y n t h e t i c Methods and t h e F u l l Mapping A p p r o a c h The f u l l mapping a p p r o a c h t o r e l a t i o n a l d a t a b a s e d e s i g n i s a l s o s y n t h e t i c i n t h e s e n s e t h a t t h e d e s i g n p r o c e s s s t a r t s w i t h a s e t of a t t r i b u t e s and a s t a t e m e n t of t h e r e l a t i o n s h i p s among them. An e a r l i e r s y n t h e s i s a l g o r i t h m ( B e r n s t e i n 1976) u s e s t h e minimum c o v e r t e c h n i q u e . T h i s i n v o l v e s a p u r e l y s y n t a c t i c t r e a t m e n t of f u n c t i o n a l d e p e n d e n c i e s . The t e c h n i q u e demands t h a t t h e f u n c t i o n a l dependency between any two a t t r i b u t e s be u n i q u e . But t h i s i s n o t a l w a y s t h e c a s e i n p r a c t i c e , hence a t t r i b u t e s may have t o be renamed t o m a i n t a i n t h e u n i q u e n e s s a s s u m p t i o n . The d e s i g n r e s u l t s , i n t u r n , have t o be v a l i d a t e d s e m a n t i c a l l y . In t h e f u l l mapping a p p r o a c h , e v e r y r e l a t i o n s h i p i s s p e c i f i e d i n d e p e n d e n t of o t h e r s . E a c h mapping has an i n t e n s i o n and c a n n o t be d e r i v e d f r o m o t h e r m a p p ings. C o n s e q u e n t l y , two mappings w i t h e x a c t l y t h e same s p e c i f i c a t i o n s a r e not n e c e s s a r i l y t h e same. The mapping X 1 1 Y and X m 1 Y t o g e t h e r w i t h t h e i r i n t o / o n t o s t a t u s p r o v i d e s s i x d i f f e r e n t ways o f e x p r e s s i n g a f u n c t i o n a l d e p e n d e n c y between a t t r i b u t e s X and Y. Thus, f u l l mapping p r o v i d e s a d i s c i p l i n e f o r d e a l i n g w i t h t h e u n i q u e n e s s a s s u m p t i o n a n d t h e need f o r s e m a n t i c v a l i d a t i o n of d e s i g n r e s u l t s . The p r o b l e m s a s s o c i a t e d w i t h u n c o n t r o l l e d r e n a m i n g of a t t r i b u t e s have been d i s c u s s e d i n c h a p t e r ( 2 ) . The i n t o - m a p p i n g a l l o w s a r e l a t i o n s h i p t o be d e f i n e d on a s u b s e t of a domain w i t h o u t h a v i n g t o rename t h e a t t r i b u t e . The i n p u t t o t h e s y n t h e s i s a l g o r i t h m i s a l s o l i m i t e d t o f u n c t i o n a l d e p e n d e n c i e s . N o n f u n c t i o n a l r e l a t i o n s h i p s a r e e n t e r e d i n d i r e c t l y . The a l g o r i t h m e s s e n t i a l l y g e n e r a t e s a r e l a t i o n p e r e a c h n o n f u n c t i o n a l r e l a t i o n s h i p . And b e c a u s e of t h e n a t u r e of 62 m u l t i v a l u e d d e p e n d e n c y , t h e s y n t h e s i s a l g o r i t h m s have n o t been e x t e n d e d t o 4NF and PJ/NF d e s i g n s . The many-many and one-many mapping t y p e s d e s c r i b e n o n f u n c t i o n a l r e l a t i o n s h i p s between d a t a i t e m s . Thus, i t i s now p o s s i b l e t o model n o n f u n c t i o n a l r e l a t i o n s h i p s i n a c o n t e x t - i n d e p e n d e n t form, and t o g e n e r a t e PJ/NF schemas v i a t h e t r a n s f o r m a t i o n r u l e s . From a p r a c t i c a l v i e w p o i n t , t h e r u l e s a p p e a r more d e s i r a b l e t h a n t h e minimum c o v e r t e c h n i q u e . D e s i g n r e s u l t s need n o t be s u b j e c t e d t o s e m a n t i c v a l i d a t i o n . A l l s e m a n t i c c o n s i d e r a t i o n s t a k e p l a c e a t t h e mapping s p e c i f i c a t i o n s t a g e . 4.4.2 D e c o m p o s i t i o n and t h e F u l l M a p p i n g A p p r o a c h The d e c o m p o s i t i o n methods, i n g e n e r a l , t a k e as i n p u t f u n c t i o n a l and m u l t i v a l u e d d e p e n d e n c i e s and an i n i t i a l r e l a t i o n d e s i g n . The MVD m o d els n o n f u n c t i o n a l r e l a t i o n s h i p s . However, as d i s c u s s e d i n s e c t i o n 4.2, t h e p r o p e r t i e s o f MVDs a r e n o t w e l l u n d e r s t o o d . M u l t i v a l u e d d e p e n d e n c i e s a r e o n l y v a l i d w i t h i n t h e c o n t e x t of a r e l a t i o n and can o n l y be d e t e c t e d i n r e l a t i o n s . However, t h e y c a n n o t be e a s i l y d e t e c t e d . T h e r e a r e o t h e r p r o b l e m s r e l a t i n g t o embedded and l a t e n t d e p e n d e n c i e s . D i f f e r e n t d e c o m p o s i t i o n p a t h s can l e a d t o d i f f e r e n t d e s i g n s o f v a r y i n g q u a l i t i e s . C h o i c e of d e c o m p o s i t i o n i s m a i n l y b a s e d on h e u r i s t i c s . Some of t h e s e p r o b l e m s a r e d e a l t w i t h e x t e n s i v e l y i n ( Z a n i o l o and M e l k a n o f f 1981). I t was n o t e d t h a t some d e p e n d e n c i e s a r e l o s t i n a d e c o m p o s i t i o n p r o c e s s . Hence, d e c o m p o s i t i o n i s b a s e d on c o m p l e t e d a t a r e l a t a b i l i t y t o e n s u r e 63 t h a t a l l t h e d e p e n d e n c i e s a r e p r e s e r v e d . The d a t a r e l a t a b i l i t y c o n d i t i o n i s s u c h t h a t t h e i n i t i a l d e p e n d e n c i e s c a n be i n f e r r e d f r o m t h e p r o j e c t i o n s . O n l y t h e p a t h s t h a t p r e s e r v e t h e d e p e n d e n c i e s a r e us e d i n d e c o m p o s i t i o n . A l i m i t a t i o n t o t h i s a p p r o a c h i s b r o u g h t a b o u t by t h e p r o p e r t i e s of MVDs. An MVD c a n o n l y be i n f e r r e d ( f r o m a p r o j e c t i o n t o a j o i n ) u s i n g t h e j o i n a b i l i t y r u l e . I t i s not c l e a r whether some o t h e r MVDs a r e i n f e r a b l e by s t r o n g e r r u l e s . Most d e c o m p o s i t i o n methods a l s o assume an i n i t i a l r e l a t i o n . T h i s i s r e f e r r e d t o as t h e u n i v e r s a l r e l a t i o n a s s u m p t i o n (Kent 1981). The c o n s e q u e n c e s o f t h e u n i v e r s a l r e l a t i o n a s s u m p t i o n , a s d i s c u s s e d i n c h a p t e r ( 2 ) , a r e not c o m p a t i b l e w i t h p r a c t i c a l d a t a b a s e d e s i g n . The u n i v e r s a l meaning f o r column names has a s s o c i a t e d w i t h i t an i n t e r - r e l a t i o n a l c o n s t r a i n t w h i c h i s n o t e a s y t o m a i n t a i n . D e c o m p o s i t i o n methods d e a l w i t h t h e column name p r o b l e m by renaming a t t r i b u t e s . B u t , as m e n t i o n e d i n c h a p t e r ( 2 ) , u n c o n t r o l l e d r e n a m i n g of a t t r i b u t e s c a n c r e a t e j o i n p r o b l e m s i n q u e r y p r o c e s s i n g . The p r o b l e m s r e l a t i n g t o MVDs c a n be a v o i d e d by u s i n g t h e f u l l mappings t o model n o n f u n c t i o n a l r e l a t i o n s h i p s . F u l l m appings a r e e a s y t o comprehend and t h e y c an e x p r e s s a t l e a s t a s much i n f o r m a t i o n a s m u l t i v a l u e d d e p e n d e n c i e s . And s i n c e t h e f u l l m apping a p p r o a c h i s s y n t h e t i c i n n a t u r e , a s s u m i n g an i n i t i a l r e l a t i o n and i t s c o n s e q u e n c e s have been a v o i d e d . A c o m p a r i s o n c an a l s o be made from t h e p o i n t of view o f s y n t h e s i s v e r s u s d e c o m p o s i t i o n . D e c o m p o s i t i o n a p p e a r s s u p e r i o r t o e a r l i e r s y n t h e t i c methods b e c a u s e n o n f u n c t i o n a l r e l a t i o n s h i p s c an be modeled d i r e c t l y . The f u l l mappings now p r o v i d e a means 64 of m o d e l i n g n o n f u n c t i o n a l r e l a t i o n s h i p s w i t h i n a s y n t h e t i c a p p r o a c h . The t r a n s f o r m a t i o n r u l e s a r e s u c h t h a t e v e r y r e l a t i o n schema g e n e r a t e d can be p r o j e c t e d back t o t h e i n i t i a l m a p p i n g s . I f i t c a n be shown t h a t t h e f u l l mappings c a p t u r e a l l t h e r e l a t i o n s h i p s we w i s h t o e x p r e s s , t h e s y n t h e t i c a p p r o a c h w i l l c l e a r l y be s u p e r i o r t o d e c o m p o s i t i o n . 4.4.3 E n t i t y - R e l a t i o n s h i p A p p r o a c h and t h e F u l l M a p p i n g R u l e s The E n t i t y - R e l a t i o n s h i p a p p r o a c h o f f e r s a p r a c t i c a l method f o r d a t a b a s e d e s i g n . I t t a k e s as i n p u t e n t i t y s e t s , a s s o c i a t i o n s , r e l a t i o n s h i p s and p r o p e r t i e s (Wong e t a l 1980). The F u l l mapping a p p r o a c h c a n be g r o u p e d i n t h e same c a t e g o r y as t h e E n t i t y - r e l a t i o n s h i p method i n t h e d e s i g n method summary ( f i g u r e 9 ) ; t h e y s h a r e a l o t i n common.- The i n t e r m e d i a t e model i n t h e e n t i t y - r e l a t i o n s h i p a p p r o a c h can be s p e c i f i e d as f u l l m a p p i n g s . But f u l l m a p pings, t h r o u g h t h e i n t o / o n t o s t a t u s , a l l o w more s e m a n t i c i n f o r m a t i o n t o be s p e c i f i e d . W i t h i n t h e E-R a p p r o a c h , t h e s i n g l e - v a l u e d b i n a r y r e l a t i o n s h i p s and t h e a s s o c i a t i o n s a l s o s p e c i f y i m p l i c i t l y some i n t o / o n t o s t a t u s i n f o r m a t i o n . A b i n a r y r e l a t i o n s h i p R between t e n t i t y s e t s E , ( t ) and E 2 ( t ) i s s i n g l e - v a l u e d i f e a c h e n t i t y o c c u r s i n a t most one i n s t a n c e o f R . T h a t i s , some e n t i t i e s i n t E , ( t ) may n o t p a r t i c i p a t e i n t h e r e l a t i o n s h i p . An a s s o c i a t i o n i s a b i n a r y r e l a t i o n s h i p i n w h i c h E , ( t ) e n t i t i e s o c c u r i n e x a c t l y one i n s t a n c e ; an 'onto' mapping i s s p e c i f i e d on E , ( t ) . W i t h i n t h e framework o f f u l l m appings, r e l a t i o n s h i p s and a s s o c i a t i o n s a r e t r e a t e d u n i f o r m l y . The i n t o / o n t o s t a t u s i n f o r m a t i o n i s 65 s p e c i f i e d f o r a l l t h e s e t s i n v o l v e d . The p r o p e r t i e s have many- one o r one-one a s s o c i a t i o n w i t h t h e e n t i t y o r r e l a t i o n s h i p s e t s , and a r e a l w a y s ' o n t o ' . Thus, more s e m a n t i c i n f o r m a t i o n c a n be e x p r e s s e d i n t h e f u l l mappings t h a n t h e i n t e r m e d i a t e model o f t h e E-R a p p r o a c h . T h e r e a r e t h r e e r u l e s f o r t r a n s f o r m i n g t h e E-R i n t e r m e d i a t e d e s i g n model i n t o r e l a t i o n schemas. R u l e (1) a s s i g n s an e x p l i c i t i d e n t i f i e r f o r e a c h e n t i t y s e t . R u l e (2) g r o u p s t h e i d e n t i f i e r of a p r i m i t i v e o b j e c t (an e n t i t y s e t o r r e l a t i o n s h i p ) w i t h a l l i t s p r o p e r t i e s or a s s o c i a t i o n s i n t h e same r e l a t i o n . However, i t i s p o s s i b l e t o have two a s s o c i a t i o n s ( E , ( t ) , E 2 ( t ) ) and ( E , ( t ) , E 3 ( t ) ) s u c h t h a t a t t i m e t t h e e x t e n s i o n s of E , ( t ) i n t h e a s s o c i a t i o n s a r e not e q u a l . The g r o u p i n g of s u c h a s s o c i a t i o n s t o g e t h e r i n a r e l a t i o n w i l l p r e v e n t i n s e r t i o n of a t u p l e when t h e e x t e n s i o n s a r e n o t e q u a l . Hence r u l e (2) may g e n e r a t e r e l a t i o n s w i t h t h i s t y p e o f i n s e r t i o n anomaly. R u l e (3) a l l o w s one and o n l y one p r i m i t i v e o b j e c t p e r r e l a t i o n . The r u l e i s t o o r e s t r i c t i v e . Two r e l a t i o n s h i p s c a n be g r o u p e d t o g e t h e r i n a r e l a t i o n as l o n g as t h e y have a s e t i n common and t h e s e t a p p e a r i n g i n t h e two r e l a t i o n s h i p s have same domain e x t e n s i o n s . T h u s , a v i o l a t i o n o f r u l e (3) does n o t n e c e s s a r i l y r e s u l t i n an 'update anomaly' as d e f i n e d i n (Wong e t a l 1981). B o t h t h e E n t i t y - R e l a t i o n s h i p method and t h e F u l l m apping a p p r o a c h o f f e r what seem l i k e a p r a c t i c a l d e s i g n m e t h o d o l o g y . The b a s i c d i f f e r e n c e between t h e two a p p r o a c h e s i s i n t h e f u l l mapping s p e c i f i c a t i o n w h i c h a l l o w s more s e m a n t i c i n f o r m a t i o n t o be e x p r e s s e d . The a d d i t i o n a l i n f o r m a t i o n a l l o w s a c e r t a i n k i n d of i n s e r t i o n / d e l e t i o n anomaly t o be a v o i d e d . I t a l s o a l l o w s more r e l a x e d r u l e s . The f u l l mapping r u l e s a l s o t a k e a d v a n t a g e p r e v i o u s work on r e l a t i o n a l d a t a b a s e t h e o r y . 67 CHAPTER V . C o n c l u s i o n s 5.1 A c h i e v e m e n t s The f u l l mapping a p p r o a c h t o r e l a t i o n a l d a t a b a s e d e s i g n c an be v i e w e d i n two ways: t h e f u l l mappings as a means of s p e c i f y i n g d a t a dependency c o n s t r a i n t s , and t h e n a t u r e o f t h e t r a n s f o r m a t i o n r u l e s . We have been a b l e t o i n c o r p o r a t e more s e m a n t i c s i n t o d a t a d e p e ndency s p e c i f i c a t i o n . The i n t o / o n t o s t a t u s i n f o r m a t i o n of a mapping, as d i s c u s s e d i n e a r l i e r c h a p t e r s , c a n s p e c i f y c e r t a i n i n f o r m a t i o n t h a t c a n n o t be e x p r e s s e d i n a f u n c t i o n a l d e p e n d e n c y . The f u l l m appings a l s o compare f a v o r a b l y w i t h m u l t i v a l u e d d e p e n d e n c y . Two n o n f u n c t i o n a l r e l a t i o n s h i p s X Y and X Z a r e e x p r e s s e d a s a MVD i n a r e l a t i o n schema R ( X Y Z ) , i f R(XYZ) = R [ X Y ] . R [ X Z ] . I m p l i c i t l y , t h e MVD s t a t e s t h a t t h e e x t e n s i o n of X i n R(XY) i s a l w a y s e q u a l t o i t s e x t e n s i o n i n R ( X Z ) . W i t h i n t h e f u l l mapping a p p r o a c h , n o n f u n c t i o n a l r e l a t i o n s h i p s a r e s p e c i f i e d a s mappings between two s e t s . The mappings can be r e p r e s e n t e d by a t o m i c r e l a t i o n schemas. The c o n d i t i o n f o r a l o s s l e s s j o i n o f t h e schemas, t h e e q u i v a l e n c e o f domain e x t e n s i o n s of t h e j o i n a t t r i b u t e s , i s d e r i v e d from t h e i n t o / o n t o s t a t u s i n f o r m a t i o n . Thus t h e many- many r e l a t i o n s h i p ; b e t w e e n any two d a t a i t e m s c a n be e x p r e s s e d out o f c o n t e x t of a r e l a t i o n . The f u l l m appings, i n c o m p a r i s o n t o MVDs, a r e s i m p l e and i n t u i t i v e . The e a r l i e r s y n t h e s i s a l g o r i t h m s have n o t been e x t e n d e d t o f o u r t h n o r m a l form schemas b e c a u s e n o n f u n c t i o n a l r e l a t i o n s h i p s can n o t be r e p r e s e n t e d d i r e c t l y . S p e c i f y i n g n o n f u n c t i o n a l 68 r e l a t i o n s h i p s a s MVDs n e c e s s a r i l y r e q u i r e s an i n i t i a l r e l a t i o n . The g r a v e c o n s e q u e n c e s o f t h e U n i v e r s a l R e l a t i o n a s s u m p t i o n a r e d i s c u s s e d i n (Kent 1981). B u t , i t i s now p o s s i b l e t o model n o n f u n c t i o n a l r e l a t i o n s h i p s o u t o f c o n t e x t o f a r e l a t i o n , and hence w i t h i n a s y n t h e t i c a p p r o a c h . T r a n s f o r m a t i o n r u l e s a r e d e f i n e d t o g e n e r a t e BCNF and PJ/NF schemas from f u l l m a p p i n g s . U s i n g a d e c o m p o s i t i o n a p p r o a c h t o d e s i g n PJ/NF schemas w i l l e n t a i l d e t e c t i n g t h e j o i n d e p e n d e n c i e s i n a s e t of i n i t i a l r e l a t i o n s . T h i s i s a t e d i o u s t a s k and may not be p r a c t i c a b l e even i n s m a l l d a t a b a s e s . As n o t e d i n (Date 1980), t h e p r o c e s s o f d e t e r m i n i n g when a g i v e n r e l a t i o n i s i n 4NF but n o t PJ/NF (and hence c an be decomposed) i s s t i l l u n c l e a r . F u l l mappings would p r o v i d e a p r a c t i c a l a p p r o a c h t o a u t o m a t i c d e s i g n of r e l a t i o n a l d a t a b a s e schemas. As n o t e d i n ( T s i c h r i t z i s and L o c h o v s k y 1982), d a t a b a s e t h e o r y i s more of a schema a n a l y s i s t h a n schema d e s i g n . The t h e o r y p r o v i d e s d e e p e r u n d e r s t a n d i n g of t h e d a t a m o d e l s , t h e d a t a b a s e schemas, and t h e i r p r o p e r t i e s . But i t i s not r e a d i l y a p p l i c a b l e . I t s h o u l d be t r e a t e d as a t o o l f o r u n d e r s t a n d i n g , and n o t n e c e s s a r i l y a s a t o o l f o r d e s i g n . We b e l i e v e t h a t t h e f u l l mapping a p p r o a c h l e n d s i t s e l f t o p r a c t i c a l d a t a b a s e d e s i g n . From an i n f o r m a t i o n a n a l y s i s of an a p p l i c a t i o n e n v i r o n m e n t , t h e r e l a t i o n s h i p s among t h e d a t a i t e m s c a n be r e p r e s e n t e d a s f u l l m a p p i ngs. R e l a t i o n schemas a r e g e n e r a t e d v i a t h e t r a n s f o r m a t i o n r u l e s . The a r e no r u l e s d e f i n e d f o r 4NF schemas, b e c a u s e we have not f o r m a l l y s t a t e d t h e r e l a t i o n s h i p s between t h e mappings and m u l t i v a l u e d d e p e n d e n c y . A l t h o u g h some i n s i g h t s i n t o u n d e r s t a n d i n g MVDs have been g a i n e d , t h e r e l a t i o n s h i p between 69 t h e two i s not q u i t e c l e a r . Two many-many r e l a t i o n s h i p s E m n S and E m n D i n a schema R(ESD) a r e e x p r e s s e d as a JD*(ED, ES, SD), r a t h e r t h a n a s an MVD, i f t h e r e i s a many-many r e l a t i o n s h i p between S and D. The d e m a r c a t i o n between MVDs and JDs i s n o t v e r y c l e a r . However, t r a n s f o r m a t i o n r u l e s a r e d e f i n e d f o r PJ/NF schemas. P r o j e c t i o n - j o i n n o r m a l form i m p l i e s f o u r t h n o r m a l f o r m ( F a g i n 1979). The s e t s o f t r a n s f o r m a t i o n r u l e s n o t o n l y p r o d u c e r e l a t i o n schemas t h a t c o n f o r m t o t h e i r r e s p e c t i v e n o r m a l f o r m s , but t h e y a l s o e l i m i n a t e c e r t a i n a n o m a l i e s t h a t may e x i s t i n n o r m a l form schemas. F o r example, l e t R(A,B,C) be a r e l a t i o n schema. I f t h e o n l y d e p e n d e n c i e s i n R a r e A >B and A >C, t h e n i t i s i n B o yce-Codd n o r m a l f o r m . However, t h e r e l a t i o n s h i p s among A, B i o and C may be s u c h t h a t A m 1 B and A m 1 C. The mappings e x p r e s s t h e f u n c t i o n a l d e p e n d e n c i e s , but i n a d d i t i o n A i s mapped ' i n t o ' i n one o f t h e m a p pings. T h e r e f o r e , a t a d a t a b a s e i n s t a n c e when t h e e x t e n s i o n s of A i n t h e two mappings a r e n o t e q u a l , i t w i l l n o t be p o s s i b l e t o e n t e r a c e r t a i n t u p l e w i t h o u t n u l l v a l u e s . T h i s t y p e o f i n s e r t i o n / d e l e t i o n a n o m a l i e s c a n s t i l l be p r e s e n t i n BCNF r e l a t i o n s . The BCNF t r a n s f o r m a t i o n r u l e s e l i m i n a t e s u c h a n o m a l i e s by c o m b i n i n g o n l y o v e r a t t r i b u t e s w i t h e q u i v a l e n t domain e x t e n s i o n s . The f u l l mapping a p p r o a c h a l s o p r o v i d e s a d i s c i p l i n e f o r d e a l i n g w i t h some of t h e p r o b l e m s r e l a t e d t o r e n a m i n g o f a t t r i b u t e s and t h e u n i v e r s a l meaning o f column names. The domains of. a t t r i b u t e s a r e a l s o g i v e n a c o n s i d e r a t i o n w i t h i n t h e d e s i g n a p p r o a c h . The i n t o - m a p p i n g a l l o w s an a s s o c i a t i o n t o be 70 d e f i n e d on a s u b s e t of a domain w i t h o u t h a v i n g t o rename t h e c o r r e s p o n d i n g column name. Thus, a t t r i b u t e s o r column names do not n e c e s s a r i l y have a u n i v e r s a l m e aning. The d e f i n e d s e t s and mappings do have i n t e n s i o n s . From t h e i n t e n s i o n s , i t s h o u l d be c l e a r w h i c h domains have e q u i v a l e n t e x t e n s i o n s . Hence, by r e f e r r i n g t o t h e s t a t e m e n t o f i n t e n t i o n s d u r i n g q u e r y p r o c e s s i n g , j o i n s would be c a r r i e d o u t o n l y o v e r t h o s e domains t h a t e n s u r e t h e l o s s l e s s p r o p e r t y . However, i t i s i m p o r t a n t t h a t we a r e a b l e t o keep t r a c k o f t h e s e t s , t h e mappings and t h e i r i n t e n s i o n s . The t o p i c i s n o t c o n s i d e r e d i n t h i s t h e s i s . More work i s needed i n t h i s a r e a . I t i s hoped t h a t t h e f u l l mappings would s e r v e as a b a s i s f o r t h e k i n d o f g e n e r a l i z e d i n t e r d e p e n d e n c y c o n s t r a i n t s p e c i f i c a t i o n e n v i s a g e d i n (Kent 1981). 5.2 F u r t h e r R e s e a r c h As n o t e d above, f u r t h e r work i s r e q u i r e d on t h e i n t e g r a t e d d a t a d i c t i o n a r y . T h i s would i n v o l v e o r g a n i z i n g t h e s e t s , t h e mappings and t h e s t a t e m e n t of i n t e n s i o n s i n t o a s t r u c t u r e t h a t can be managed by t h e D a t a b a s e Management System i n u s e . The d i c t i o n a r y would s e r v e as a k e r n e l , f r o m w h i c h t h e d a t a b a s e c a n be d e s i g n e d . A r e l a t e d p r o b l e m i s t h e a u t o m a t i o n o f d a t a b a s e d e s i g n . From a d a t a b a s e k e r n e l , i t s h o u l d be p o s s i b l e t o d e r i v e d a t a b a s e schemas a u t o m a t i c a l l y . The c o m p u t a t i o n a l p r o b l e m s r e l a t i n g t o t h e t r a n s f o r m a t i o n r u l e s may have t o be s t u d i e d . A n o t h e r p o s s i b l e r e s e a r c h i s t o r e f o r m a l i z e t h e r e l a t i o n a l d a t a b a s e t h e o r i e s u s i n g t h e f u l l mapping a p p r o a c h . I f t h e r e l a t i o n s h i p s between f u l l mappings and m u l t i v a l u e d d e p e n d e n c i e s 71 can be f o r m a l l y s t a t e d , schema a n a l y s i s m i g h t be b e t t e r u n d e r s t o o d f r o m t h e f u l l mapping v i e w p o i n t . N a t u r a l l y , t h i s work s h o u l d be e x t e n d e d t o t h e Network and H i e r a r c h i c a l d a t a b a s e d e s i g n . T r a n s f o r m a t i o n r u l e s c o u l d be d e f i n e d t o g e n e r a t e Network and H i e r a schemas from f u l l m a p p i n g s . The work may l e a d t o a c o m p r e h e n s i v e and a u t o m a t e d d e s i g n s y s t e m . I f t h e s y s t e m i s a b l e t o g e n e r a t e schemas f o r t h e t h r e e c o n v e n t i o n a l m o d e l s , i t might s e r v e as a b a s i s f o r t e s t i n g t h e s u i t a b i l i t y of t h e d a t a models f o r d i f f e r e n t a p p l i c a t i o n e n v i r o n m e n t s . 72 B i b l i o g r a p h y Aho, A. V., B e e r i , C , and U l l m a n , J . D. "The T h e o r y of J o i n s i n R e l a t i o n a l D a t a b a s e s , " ACM T r a n s a c t i o n s on D a t a b a s e S y s t e m s , V o l . 4, No. 3, S e p t . 1979. A r m s t r o n g , W. W. "Dependency S t r u c t u r e s o f D a t a b a s e R e l a t i o n s h i p s , " P r o c . I n t ' 1 F e d e r a t i o n f o r I n f o r m a t i o n p r o c e s s i n g C o n g r e s s , N o r t h H o l l a n d , 1974. B e e r i , C , and B e r n s t e i n , P. A. " C o m p u t a t i o n a l P r o b l e m s R e l a t e d t o t h e D e s i g n of Normal Form R e l a t i o n a l Schemas," ACM T r a n s a c t i o n s on D a t a b a s e Systems , V o l . 4, No. 1, M a r c h . 1979. B e e r i , C , B e r n s t e i n , P. A., and Goodman. N. "A S o p h i s t i c a t e ' s I n t r o d u c t i o n t o D a t a b a s e N o r m a l i z a t i o n T h e o r y , " . P r o c . 4 t h I n t ' 1 C o n f e r e n c e on V e r y L a r g e D a t a b a s e s , B e r l i n , 1978. B e e r i , C , F a g i n , R. , and Howard, J . H. "A Co m p l e t e A x i o m a t i z a t i o n f o r F u n c t i o n a l and M u l t i v a l u e d D e p e n d e n c i e s i n D a t a b a s e r e l a t i o n s , " P r o c . ACM SIGMOD C o n f e r e n c e , T o r o n t o , 1977. B e r n s t e i n , P. A. " S y n t h e s i z i n g T h i r d Normal Form R e l a t i o n s f r o m F u n c t i o n a l D e p e n d e n c i e s , " ACM T r a n s a c t i o n s on D a t a b a s e Systems , V o l . 1, No. 4, Dec. 1976. B e r n s t e i n , P. A., Swenson. J . R., and T s i c h r i t z i s , D. C. "A U n i f i e d A p p r o a c h t o F u n c t i o n a l D e p e n d e n c i e s and R e l a t i o n s , " P r o c • ACM SIGMOD C o n f e r e n c e , San J o s e , 1975. B i s k u p , J . , D a y a l , U., and B e r n s t e i n , P. A. " S y t h e s i z i n g I n d e p e n d e n t D a t a b a s e Schemas," P r o c . ACM SIGMOD C o n f e r e n c e , May 1979. C a d i o u , J . M. "On S e m a n t i c I s s u e s i n t h e R e l a t i o n a l M odel o f D a t a , " P r o c . I n t ' 1 Symposium o f M a t h e m a t i c a l F o u n d a t i o n s o f Computer S c i e n c e , P o l a n d , S e p t 1975. Codd, E. F. "A R e l a t i o n a l Model f o r L a r g e S h a r e d D a t a B a s e s , " C o m m u n i c a t i o n s of t h e ACM , V o l . 13, No. 6, June 1970. Codd, E. F. " F u r t h e r N o r m a l i z a t i o n of t h e D a t a Base R e l a t i o n a l M o d e l , " C o u r a n t Computer S c i e n c e Symposium 6, D a t a Base Systems , P r e n t i c e - H a l l , May 1971. Codd, E. F. " E x t e n d i n g t h e D a t a b a s e R e l a t i o n a l M odel t o C a p t u r e More M e a n i n g , " ACM T r a n s a c t i o n s on D a t a b a s e Systems , V o l . 4, No. 4, Dec. 1979. Chen, P. P. "The E n t i t y - R e l a t i o n s h i p M o d e l : Toward a U n i f i e d View of D a t a , " ACM T r a n s a c t i o n s on D a t a b a s e Systems , V o l . 1, No. 1, March 1976. 73 D a t e , C. J . , I n t r o d u c t i o n t o D a t a b a s e S ystems, 3 r d e d . , A d d i s o n - W e s l e y , R e a d i n g , Mass, 1980. D a y a l , U., and B e r n s t e i n , P. A. "The U p d a t a b i l i t y of R e l a t i o n a l V i e w s , " P r o c . 4 t h I n t ' 1 C o n f e r e n c e on V e r y L a r g e D a t a b a s e s , B e r l i n , 1978. D e l o b e l , C. " N o r m a l i z a t i o n and H i e r a r c h i c a l D e p e n d e n c i e s i n R e l a t i o n a l D a t a M o d e l , " ACM T r a n s a c t i o n s on D a t a b a s e Systems , V o l . 3, No. 3, S e p t . 1978. D e l o b e l , C , and C a s e y , R. G. " D e c o m p o s i t i o n o f a D a t a b a s e and t h e T h e o r y o f B o o l e a n S w i t c h i n g F u n c t i o n s , " IBM J o u r n a l o f R e s e a r c h and Development , V o l . 17, No. 5, S e p t . 1972. D e l o b e l , C , and L e o n a r d , M. "The D e c o m p o s i t i o n P r o c e s s i n a R e l a t i o n a l M o d e l , " P r o c . I n t ' 1 Workshop on D a t a S t r u c t u r e M o d e l s f o r I n f o r m a t i o n Systems , B e l g i u m , May 1974. D e l o b e l , C , and P a r k e r , D. S. " F u n c t i o n a l a n d M u l t i v a l u e d D e p e n d e n c i e s i n R e l a t i o n a l D a t a b a s e and t h e T h e o r y o f B o o l e a n S w i t c h i n g F u n c t i o n s , " T e c h . R e p o r t No. 142, D e p t . Maths A p p l . e t I n f o r m a t i q u e , U n i v . de G r e n o b l e , F r a n c e . Nov. 1978. E h r i g , H., K r e o w s k i , H. J . , and Weber, H. " A l g e b r a i c S p e c i f i c a t i o n Schemas f o r D a t a b a s e S y s t e m s , " P r o c . 4 t h I n t ' 1 C o n f e r e n c e on V e r y L a r g e D a t a b a s e s , B e r l i n , 1978. F a g i n , R. " M u l t i v a l u e d D e p e n d e n c i e s and a New Normal Form f o r R e l a t i o n a l D a t a b a s e s , " ACM T r a n s a c t i o n s on D a t a b a s e Systems , V o l . 2, No. 3, S e p t . 1977. F a g i n , R. "The D e c o m p o s i t i o n V e r s u s t h e S y n t h e t i c A p p r o a c h t o R e l a t i o n a l D a t a b a s e D e s i g n , " P r o c . I n t ' 1 C o n f e r e n c e on V e r y L a r g e D a t a b a s e s , O c t . 1977. F a g i n , R. "Normal Forms and R e l a t i o n a l D a t a b a s e O p e r a t o r s , " P r o c . ACM SIGMOD C o n f e r e n c e , May 1979. F a g i n , R. "Horn C l a u s e s and D a t a b a s e D e p e n d e n c i e s , " P r o c . 12th A n n u a l ACM Symposium on T h e o r y of Computing , L o s A n g e l e s , A p r i l 1980. F a g i n , R. "A Normal Form f o r R e l a t i o n a l D a t a b a s e s T h a t i s Based on Domains and k e y s , " ACM T r a n s a c t i o n s on D a t a b a s e Systems , V o l . 6, No. 3, S e p t . 1981. G e r r i t s e n , R. " T o o l f o r t h e A u t o m a t i o n of D a t a b a s e D e s i g n , " NYU Symposium on D a t a b a s e D e s i g n , May 1978. Haseman, W. D., and So, Y. H. "An I n t e g r a t i v e A p p r o a c h t o D a t a b a s e D e s i g n , " Work i n g P a p e r , C a r n e g i e M e l l o n U n i v e r s i t y , Dec. 1977. 74 H o u s e l , B.C., Waddle, V., and Yao, S. B. "The F u n c t i o n a l Dependency M o d e l f o r L o g i c a l D a t a b a s e D e s i g n , " P r o c . 5 t h I n t ' 1 C o n f e r e n c e on V e r y L a r g e D a t a b a s e s , 1979. Kahn, B. K. "A S t r u c t u r e d L o g i c a l D e s i g n M e t h o d o l o g y , " NYU Symposium on D a t a b a s e D e s i g n , May 1978. K e n t , W. " C o n s e q u e n c e s o f A s s u m i n g a U n i v e r s a l R e l a t i o n , " ACM T r a n s a c t i o n s on D a t a b a s e Systems , V o l . 6, No. 4, Dec. 1981. L i e n , Y. E . " M u l t i v a l u e d D e p e n d e n c i e s W i t h N u l l V a l u e s i n R e l a t i o n a l D a t a b a s e s , " P r o c . 5 t h I n t ' 1 C o n f e r e n c e on V e r y L a r g e D a t a b a s e s , 1979. L i n g , T., Tompa, F. W., and Kameda, T. "An Improved T h i r d Normal Form f o r R e l a t i o n a l D a t a b a s e s , " ACM T r a n s a c t i o n s on D a t a b a s e Systems , V o l . 6, No. 2, June 1981. N i c o l a s , J . M. " M u l t i v a l u e d D e p e n d e n c i e s and Some R e s u l t s on Undecomposable R e l a t i o n s , " P r o c . 4 t h I n t ' 1 C o n f e r e n c e on V e r y L a r g e D a t a b a s e s , B e r l i n , 1978. P a r k e r , D. S., and D e l o b e l , C. " A l g o r i t h m i c A p p l i c a t i o n s f o r a New R e s u l t on M u l t i v a l u e d D e p e n d e n c i e s , " P r o c . 5 t h I n t ' 1 C o n f e r e n c e on V e r y L a r g e D a t a b a s e s , 1979. R a v e r , N. , and Hubbard, G. U. "Automated L o g i c a l D a t a b a s e D e s i g n : C o n c e p t s and A p p l i c a t i o n s , " IBM J o u r n a l o f R e s e a r c h and Development , V o l . 16, No. 3, 1977. R i s s a n e n , J . " I n d e p e n d e n t Components of R e l a t i o n s , " ACM T r a n s a c t i o n s on D a t a b a s e Systems , V o l . 2, No. 4, Dec. 1977. Shipman, D. "The F u n c t i o n a l D a t a Model and t h e D a t a Language DAPLEX," ACM T r a n s a c t i o n s on D a t a b a s e Systems , V o l . 6, No. 1, M a r c h 1981. S m i t h , J . M., and S m i t h D. C. P. " D a t a b a s e A b s t r a c t i o n s : A g g r e g a t i o n and G e n e r a l i z a t i o n , " ACM T r a n s a c t i o n s on D a t a b a s e Systems , V o l . 2, No. 2, June 1977. T s i c h r i t z i s , D. C , and L o c h o v s k y , F. H., D a t a M o d e l s , P r e n t i c e - H a l l , Englewood C l i f f s , 1982. Van de R i e t , R. P. "On M u l t i v a l u e d D e p e n d e n c i e s and I n d e p e n d e n c i e s , " IBM R e s e a r c h R e p o r t RJ2380 , IBM San J o s e R e s e a r c h Lab., 1978, Wong, E. and K a t z , R. H. " L o g i c a l D e s i g n and Schema C o n v e r s i o n f o r R e l a t i o n a l and DBTG D a t a b a s e s , " E n t i t y - R e l a t i o n s h i p A p p r o a c h t o Systems A n a l y s i s and D e s i g n , (Chen, P~. P~. ed.) , N o r t h H o l l a n d , 1980. Z a n i o l o , C , and M e l k a n o f f , M. A. "On t h e D e s i g n o f R e l a t i o n a l D a t a b a s e Schemata," ACM T r a n s a c t i o n s on D a t a b a s e Systems , V o l . 6, No. 1, March 1981. 76 A p p e n d i x I L e t R(X, u X 2 u...u Xn) be a r e l a t i o n schema r e s u l t i n g f r o m some n mappings a c c o r d i n g t o t h e t r a n s f o r m a t i o n r u l e s . I f t h e c o r r e s p o n d i n g a t o m i c schemas r e p r e s e n t i n g t h e mappings a r e R ( X , ) , . . . , R ( X n ) , t h e JD*(X,,X 2,...,Xn) h o l d s i n R. The t o t a l number of j o i n d e p e n d e n c i e s b a s e d on t h e j o i n a t t r i b u t e s i n n-2 n JD*(X!,X 2,...,Xn) i s g i v e n by 1 + E C r= 1 r The c o m b i n a t i o n o f n mappings, u s i n g t h e t r a n s f o r m a t i o n r u l e s , i s s u c h t h a t t h e r e s u l t i n g r e l a t i o n schema i s a l o s s l e s s j o i n o f t h e a t o m i c schemas d e n o t i n g t h e m a p p i n g s . T h a t i s , R = R ( X , ) , R ( X 2 ) R ( X n - 1 ) . R ( X n ) . L e t {Y,,Y 2,...,Yn-1} be t h e j o i n a t t r i b u t e s o f R(X,) and R ( X 2 ) , R ( X 2 ) and R ( X 3 ) , . . . , R(Xn-1) and R(Xn) r e s p e c t i v e l y . S i n c e l o s s l e s s j o i n s a r e b o t h a s s o c i a t i v e and commutative (Aho e t a l 1979), e v e r y (2, 3 , . . . , n - l ) c o m b i n a t i o n s of R ( X , ) , R ( X 2 ) , R(Xn) p r o d u c e s a JD i n R, o v e r t h e Y i ' s . L e t us r e f e r t o a J D * ( Z , , Z 2 , . . . , Z n ) a s an n-component j o i n d e p e n d e n c y . The t o t a l number of J D s , b a s e d on t h e j o i n a t t r i b u t e s Y i ' s , i s g i v e n by t h e sum o f t h e number of 2- component, 3-component, and n-component d e p e n d e n c i e s . The 2-component d e p e n d e n c i e s a r e d e r i v e d by f a c t o r i n g ' o u t t h e R ( X i ) ' s f r o m ( R ( X , ) . R ( X 2 ) R ( X n ) ) . F o r e v e r y R ( X i ) , t h e c o r r e s p o n d i n g j o i n i s R ( X i ) . [ R ( X , ) R ( X i - 1 ) , R ( X i + 1 ) . . . . . R ( X n ) ] . I f X' = X, u...u X i - 1 u X i + 1 u...u Xn, t h e j o i n p r o d u c e s a J D * ( X i , X ' ) . The number o f s u c h JDs i s t h e same as th e number of ways an i t e m c a n be c h o s e n from n i t e m s ; i t i s 77 n g i v e n by C . S i m i l a r l y , t h e number o f r-component JDs i s t h e 1 same as t h e number o f ways o f c h o o s i n g ( r - 1 ) i t e m s from n i t e m s n where o r d e r i n g i s i m m a t e r i a l . The number i s g i v e n by C ( r = r - i 3 , . . . , n - l ) . The d i f f e r e n t ways of f a c t o r i n g o u t (n-1) R i ' s p r o d u c e s t h e same j o i n d e p e n d e n c y . JD*(X,,X 2,...,Xn) i s t h e o n l y n-component d e p e n d e n c y . O t h e r c o m b i n a t i o n s do not p r o d u c e JDs t h a t a r e d i s t i n c t from J D * ( X , , X 2 , . . . , X n ) , b e c a u s e of t h e a s s o c i a t i v e and commutative p r o p e r t i e s o f l o s s l e s s j o i n s . Hence, n-2 n t h e t o t a l number o f JDs o v e r t h e Y i ' s i s g i v e n by 1 + Z C r= 1 r

Cite

Citation Scheme:

    

Usage Statistics

Country Views Downloads
China 2 20
Japan 2 0
Malaysia 1 0
United States 1 4
City Views Downloads
Beijing 2 3
Tokyo 2 0
Unknown 1 2
Ashburn 1 0

{[{ mDataHeader[type] }]} {[{ month[type] }]} {[{ tData[type] }]}

Share

Share to:

Comment

Related Items